Trigger more frequent autovacuums
HI Nathan Bossart Melanie Plageman
Firstly, congratulations on the submission of this path:
https://commitfest.postgresql.org/patch/5320/
vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
vacinsthresh = (float4) vac_ins_base_thresh + vac_ins_scale_factor *
reltuples;
These three calculations have already been optimised for two of them, and
with this patch, we have the key data pcnt_unfrozen, I think we can also
consider applying it to the vacthresh and anlthresh calculations, and I've
added a new pcnt_unrelallvisible parameter with reference to pcnt_unfrozen,
so I'm not sure if it's a good idea for me to use it. I'd like to hear your
opinions on this.
#Here's a simple test I did
test=# select count(*) from join1;
count
---------
2289001
(1 row)
test=# update join1 set name=md5(now()::text) where id<=20000;
UPDATE 70001
test=#
2025-03-07 14:03:33.968 +08,,,607191,,67ca8c35.943d7,2,,2025-03-07 14:03:33
+08,2005/2,0,DEBUG,00000,"vacthresh: 222674.750000,anlthresh: 11371.118164,
the j
oin1 has 2291275.000000 reltuples, pcnt_unfrozen: 0.485810,
pcnt_unrelallvisible: 0.049410 ",,,,,,,,,"","autovacuum worker",,0
Attachments:
Optimising-the-vacuum-algorithm.diffapplication/octet-stream; name=Optimising-the-vacuum-algorithm.diffDownload
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 800815dfbc..5583563d08 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3052,9 +3052,11 @@ relation_needs_vacanalyze(Oid relid,
if (PointerIsValid(tabentry) && AutoVacuumingActive())
{
float4 pcnt_unfrozen = 1;
+ float4 pcnt_unrelallvisible = 1;
float4 reltuples = classForm->reltuples;
int32 relpages = classForm->relpages;
int32 relallfrozen = classForm->relallfrozen;
+ int32 relallvisible = classForm->relallvisible;
vactuples = tabentry->dead_tuples;
instuples = tabentry->ins_since_vacuum;
@@ -3080,15 +3082,30 @@ relation_needs_vacanalyze(Oid relid,
relallfrozen = Min(relallfrozen, relpages);
pcnt_unfrozen = 1 - ((float4) relallfrozen / relpages);
}
+ /* If we have data for relallvisible, calculate the unrelallvisible percentage
+ * of the table to modify vac_scale_factor. This helps us decide
+ * whether or not to vacuum an update-heavy table based on the number
+ * of vac_scale_factor to the more "relallvisible" part of the table.
+ */
+ if (relpages > 0 && relallvisible > 0)
+ {
+ /*
+ * It could be the stats were updated manually and relallvisible >
+ * relpages. Clamp relallvisible to relpages to avoid nonsensical
+ * calculations.
+ */
+ relallvisible = Min(relallvisible, relpages);
+ pcnt_unrelallvisible = 1 - ((float4) relallvisible / relpages);
+ }
- vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
+ vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples * pcnt_unfrozen;
if (vac_max_thresh >= 0 && vacthresh > (float4) vac_max_thresh)
vacthresh = (float4) vac_max_thresh;
vacinsthresh = (float4) vac_ins_base_thresh +
vac_ins_scale_factor * reltuples * pcnt_unfrozen;
- anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
-
+ anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples * pcnt_unrelallvisible;
+ elog(DEBUG2, "vacthresh: %f,anlthresh: %f, the %s has %f reltuples, pcnt_unfrozen: %f, pcnt_unrelallvisible: %f ", vacthresh, anlthresh,NameStr(classForm->relname), reltuples, pcnt_unfrozen, pcnt_unrelallvisible);
/*
* Note that we don't need to take special consideration for stat
* reset, because if that happens, the last vacuum and analyze counts
Hi
The more accurate data I've found is tabentry->live_tuples; provides the
second version
#Here's a simple test I did
test=# select count(*) from join1;
count
---------
2289001
(1 row)
test=# update join1 set name=md5(now()::text) where id<1000000;
UPDATE 1938700
test=# select 1938700/2289001;
?column?
----------
0
(1 row)
test=# select 1938700/2289001::float;
?column?
--------------------
0.8469633696097119
(1 row)
test=#
test=# select count(*) from join1;
count
---------
2289001
(1 row)
test=# update join1 set name=md5(now()::text) where id<=80000;
UPDATE 159901
test=# select 159901/2289001::float;
?column?
---------------------
0.06985623859491542
(1 row)
test=# select * from pg_stat_all_tables where relname='join1';
-[ RECORD 1 ]----------+------------------------------
relid | 16385
schemaname | public
relname | join1
seq_scan | 17
last_seq_scan | 2025-03-07 15:34:02.793659+08
seq_tup_read | 14994306
idx_scan | 7
last_idx_scan | 2025-03-07 15:34:23.404788+08
idx_tup_fetch | 500281
n_tup_ins | 2289001
n_tup_upd | 2268604
n_tup_del | 0
n_tup_hot_upd | 399
n_tup_newpage_upd | 2268205
n_live_tup | 2286701
n_dead_tup | 159901
n_mod_since_analyze | 159901
n_ins_since_vacuum | 0
last_vacuum | 2025-03-06 18:18:11.318419+08
last_autovacuum | 2025-03-07 15:25:53.055576+08
last_analyze | 2025-03-06 18:18:11.424253+08
last_autoanalyze | 2025-03-07 15:25:53.456656+08
vacuum_count | 3
autovacuum_count | 3
analyze_count | 2
autoanalyze_count | 4
total_vacuum_time | 205
total_autovacuum_time | 2535
total_analyze_time | 203
total_autoanalyze_time | 1398
test=#
test=# update join1 set name=md5(now()::text) where id<=80000;
UPDATE 159901
test=# \x
Expanded display is on.
test=# select (n_live_tup)/(n_live_tup+n_dead_tup)::float from
pg_stat_all_tables where relname='join1';
-[ RECORD 1 ]----------------
?column? | 0.8774142777358045
test=# select * from pg_stat_all_tables where relname='join1';
-[ RECORD 1 ]----------+------------------------------
relid | 16385
schemaname | public
relname | join1
seq_scan | 17
last_seq_scan | 2025-03-07 15:34:02.793659+08
seq_tup_read | 14994306
idx_scan | 8
last_idx_scan | 2025-03-07 15:46:38.331795+08
idx_tup_fetch | 660182
n_tup_ins | 2289001
n_tup_upd | 2428505
n_tup_del | 0
n_tup_hot_upd | 424
n_tup_newpage_upd | 2428081
n_live_tup | 2289001
n_dead_tup | 319802
n_mod_since_analyze | 0
n_ins_since_vacuum | 0
last_vacuum | 2025-03-06 18:18:11.318419+08
last_autovacuum | 2025-03-07 15:25:53.055576+08
last_analyze | 2025-03-06 18:18:11.424253+08
last_autoanalyze | 2025-03-07 15:47:35.950932+08
vacuum_count | 3
autovacuum_count | 3
analyze_count | 2
autoanalyze_count | 5
total_vacuum_time | 205
total_autovacuum_time | 2535
total_analyze_time | 203
total_autoanalyze_time | 1770
test=#
tail -n 1000 postgresql-Fri_17.csv |grep join1
2025-03-07 17:30:12.782 +08,,,755739,,67cabca4.b881b,3,,2025-03-07 17:30:12
+08,2017/2,0,DEBUG,00000,"vacthresh: 457850.218750,anlthresh:
228950.109375, the join1 has 2289001.000000 reltuples, pcnt_unfrozen:
1.000000, pcnt_visibletuples: 0.877414 ",,,,,,,,,"","autovacuum worker",,0
2025-03-07 17:31:12.803 +08,,,756028,,67cabce0.b893c,3,,2025-03-07 17:31:12
+08,2003/4,0,DEBUG,00000,"vacthresh: 457850.218750,anlthresh:
228950.109375, the join1 has 2289001.000000 reltuples, pcnt_unfrozen:
1.000000, pcnt_visibletuples: 0.877414 ",,,,,,,,,"","autovacuum worker",,0
2025-03-07 17:32:12.822 +08,,,756405,,67cabd1c.b8ab5,3,,2025-03-07 17:32:12
+08,2006/4,0,DEBUG,00000,"vacthresh: 457850.218750,anlthresh:
228950.109375, the join1 has 2289001.000000 reltuples, pcnt_unfrozen:
1.000000, pcnt_visibletuples: 0.877414 ",,,,,,,,,"","autovacuum worker",,0
2025-03-07 17:33:12.842 +08,,,757026,,67cabd58.b8d22,3,,2025-03-07 17:33:12
+08,2009/4,0,DEBUG,00000,"vacthresh: 457850.218750,anlthresh:
228950.109375, the join1 has 2289001.000000 reltuples, pcnt_unfrozen:
1.000000, pcnt_visibletuples: 0.877414 ",,,,,,,,,"","autovacuum worker",,0
On Fri, Mar 7, 2025 at 2:22 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
Show quoted text
HI Nathan Bossart Melanie Plageman
Firstly, congratulations on the submission of this path:
https://commitfest.postgresql.org/patch/5320/vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
vacinsthresh = (float4) vac_ins_base_thresh + vac_ins_scale_factor *
reltuples;
These three calculations have already been optimised for two of them, and
with this patch, we have the key data pcnt_unfrozen, I think we can also
consider applying it to the vacthresh and anlthresh calculations, and I've
added a new pcnt_unrelallvisible parameter with reference to pcnt_unfrozen,
so I'm not sure if it's a good idea for me to use it. I'd like to hear your
opinions on this.
#Here's a simple test I did
test=# select count(*) from join1;
count
---------
2289001
(1 row)test=# update join1 set name=md5(now()::text) where id<=20000;
UPDATE 70001
test=#2025-03-07 14:03:33.968 +08,,,607191,,67ca8c35.943d7,2,,2025-03-07
14:03:33 +08,2005/2,0,DEBUG,00000,"vacthresh: 222674.750000,anlthresh:
11371.118164, the j
oin1 has 2291275.000000 reltuples, pcnt_unfrozen: 0.485810,
pcnt_unrelallvisible: 0.049410 ",,,,,,,,,"","autovacuum worker",,0
Attachments:
Optimising-the-vacuum-algorithm-v2.diffapplication/octet-stream; name=Optimising-the-vacuum-algorithm-v2.diffDownload
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 800815dfbc..eea66a11c4 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2956,7 +2956,8 @@ relation_needs_vacanalyze(Oid relid,
/* number of vacuum (resp. analyze) tuples at this time */
float4 vactuples,
instuples,
- anltuples;
+ anltuples,
+ livetuples;
/* freeze parameters */
int freeze_max_age;
@@ -3052,6 +3053,7 @@ relation_needs_vacanalyze(Oid relid,
if (PointerIsValid(tabentry) && AutoVacuumingActive())
{
float4 pcnt_unfrozen = 1;
+ float4 pcnt_visibletuples = 1;
float4 reltuples = classForm->reltuples;
int32 relpages = classForm->relpages;
int32 relallfrozen = classForm->relallfrozen;
@@ -3059,6 +3061,7 @@ relation_needs_vacanalyze(Oid relid,
vactuples = tabentry->dead_tuples;
instuples = tabentry->ins_since_vacuum;
anltuples = tabentry->mod_since_analyze;
+ livetuples = tabentry->live_tuples;
/* If the table hasn't yet been vacuumed, take reltuples as zero */
if (reltuples < 0)
@@ -3081,6 +3084,18 @@ relation_needs_vacanalyze(Oid relid,
pcnt_unfrozen = 1 - ((float4) relallfrozen / relpages);
}
+ /*
+ * If we have data for visibletuples, calculate the visibletuples
+ * of the table to modify vac_scale_factor. This helps us decide
+ * whether or not to vacuum an update-heavy table based on the number
+ * of updates to the more "active" part of the table.
+ */
+ if (livetuples > 0 && vactuples > 0)
+ {
+ /* pcnt_visibletuples calculations.*/
+ pcnt_visibletuples = (float4) (livetuples / (livetuples + vactuples));
+ }
+
vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
if (vac_max_thresh >= 0 && vacthresh > (float4) vac_max_thresh)
vacthresh = (float4) vac_max_thresh;
@@ -3088,7 +3103,7 @@ relation_needs_vacanalyze(Oid relid,
vacinsthresh = (float4) vac_ins_base_thresh +
vac_ins_scale_factor * reltuples * pcnt_unfrozen;
anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
-
+ elog(DEBUG2, "vacthresh: %f,anlthresh: %f, the %s has %f reltuples, pcnt_unfrozen: %f, pcnt_visibletuples: %f ", vacthresh, anlthresh,NameStr(classForm->relname), reltuples, pcnt_unfrozen, pcnt_visibletuples);
/*
* Note that we don't need to take special consideration for stat
* reset, because if that happens, the last vacuum and analyze counts
Sorry ,A wrong version of debug pcnt_visibletuples ,kindly please check the
v3 attachment
On Fri, Mar 7, 2025 at 5:37 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
Show quoted text
Hi
The more accurate data I've found is tabentry->live_tuples; provides
the second version#Here's a simple test I did
test=# select count(*) from join1;
count
---------
2289001
(1 row)test=# update join1 set name=md5(now()::text) where id<1000000;
UPDATE 1938700
test=# select 1938700/2289001;
?column?
----------
0
(1 row)test=# select 1938700/2289001::float;
?column?
--------------------
0.8469633696097119
(1 row)test=#
test=# select count(*) from join1;
count
---------
2289001
(1 row)
test=# update join1 set name=md5(now()::text) where id<=80000;
UPDATE 159901
test=# select 159901/2289001::float;
?column?
---------------------
0.06985623859491542
(1 row)test=# select * from pg_stat_all_tables where relname='join1';
-[ RECORD 1 ]----------+------------------------------
relid | 16385
schemaname | public
relname | join1
seq_scan | 17
last_seq_scan | 2025-03-07 15:34:02.793659+08
seq_tup_read | 14994306
idx_scan | 7
last_idx_scan | 2025-03-07 15:34:23.404788+08
idx_tup_fetch | 500281
n_tup_ins | 2289001
n_tup_upd | 2268604
n_tup_del | 0
n_tup_hot_upd | 399
n_tup_newpage_upd | 2268205
n_live_tup | 2286701
n_dead_tup | 159901
n_mod_since_analyze | 159901
n_ins_since_vacuum | 0
last_vacuum | 2025-03-06 18:18:11.318419+08
last_autovacuum | 2025-03-07 15:25:53.055576+08
last_analyze | 2025-03-06 18:18:11.424253+08
last_autoanalyze | 2025-03-07 15:25:53.456656+08
vacuum_count | 3
autovacuum_count | 3
analyze_count | 2
autoanalyze_count | 4
total_vacuum_time | 205
total_autovacuum_time | 2535
total_analyze_time | 203
total_autoanalyze_time | 1398test=#
test=# update join1 set name=md5(now()::text) where id<=80000;
UPDATE 159901test=# \x
Expanded display is on.
test=# select (n_live_tup)/(n_live_tup+n_dead_tup)::float from
pg_stat_all_tables where relname='join1';
-[ RECORD 1 ]----------------
?column? | 0.8774142777358045test=# select * from pg_stat_all_tables where relname='join1';
-[ RECORD 1 ]----------+------------------------------
relid | 16385
schemaname | public
relname | join1
seq_scan | 17
last_seq_scan | 2025-03-07 15:34:02.793659+08
seq_tup_read | 14994306
idx_scan | 8
last_idx_scan | 2025-03-07 15:46:38.331795+08
idx_tup_fetch | 660182
n_tup_ins | 2289001
n_tup_upd | 2428505
n_tup_del | 0
n_tup_hot_upd | 424
n_tup_newpage_upd | 2428081
n_live_tup | 2289001
n_dead_tup | 319802
n_mod_since_analyze | 0
n_ins_since_vacuum | 0
last_vacuum | 2025-03-06 18:18:11.318419+08
last_autovacuum | 2025-03-07 15:25:53.055576+08
last_analyze | 2025-03-06 18:18:11.424253+08
last_autoanalyze | 2025-03-07 15:47:35.950932+08
vacuum_count | 3
autovacuum_count | 3
analyze_count | 2
autoanalyze_count | 5
total_vacuum_time | 205
total_autovacuum_time | 2535
total_analyze_time | 203
total_autoanalyze_time | 1770test=#
tail -n 1000 postgresql-Fri_17.csv |grep join1
2025-03-07 17:30:12.782 +08,,,755739,,67cabca4.b881b,3,,2025-03-07
17:30:12 +08,2017/2,0,DEBUG,00000,"vacthresh: 457850.218750,anlthresh:
228950.109375, the join1 has 2289001.000000 reltuples, pcnt_unfrozen:
1.000000, pcnt_visibletuples: 0.877414 ",,,,,,,,,"","autovacuum worker",,0
2025-03-07 17:31:12.803 +08,,,756028,,67cabce0.b893c,3,,2025-03-07
17:31:12 +08,2003/4,0,DEBUG,00000,"vacthresh: 457850.218750,anlthresh:
228950.109375, the join1 has 2289001.000000 reltuples, pcnt_unfrozen:
1.000000, pcnt_visibletuples: 0.877414 ",,,,,,,,,"","autovacuum worker",,0
2025-03-07 17:32:12.822 +08,,,756405,,67cabd1c.b8ab5,3,,2025-03-07
17:32:12 +08,2006/4,0,DEBUG,00000,"vacthresh: 457850.218750,anlthresh:
228950.109375, the join1 has 2289001.000000 reltuples, pcnt_unfrozen:
1.000000, pcnt_visibletuples: 0.877414 ",,,,,,,,,"","autovacuum worker",,0
2025-03-07 17:33:12.842 +08,,,757026,,67cabd58.b8d22,3,,2025-03-07
17:33:12 +08,2009/4,0,DEBUG,00000,"vacthresh: 457850.218750,anlthresh:
228950.109375, the join1 has 2289001.000000 reltuples, pcnt_unfrozen:
1.000000, pcnt_visibletuples: 0.877414 ",,,,,,,,,"","autovacuum worker",,0On Fri, Mar 7, 2025 at 2:22 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
HI Nathan Bossart Melanie Plageman
Firstly, congratulations on the submission of this path:
https://commitfest.postgresql.org/patch/5320/vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
vacinsthresh = (float4) vac_ins_base_thresh + vac_ins_scale_factor *
reltuples;
These three calculations have already been optimised for two of them, and
with this patch, we have the key data pcnt_unfrozen, I think we can also
consider applying it to the vacthresh and anlthresh calculations, and I've
added a new pcnt_unrelallvisible parameter with reference to pcnt_unfrozen,
so I'm not sure if it's a good idea for me to use it. I'd like to hear your
opinions on this.
#Here's a simple test I did
test=# select count(*) from join1;
count
---------
2289001
(1 row)test=# update join1 set name=md5(now()::text) where id<=20000;
UPDATE 70001
test=#2025-03-07 14:03:33.968 +08,,,607191,,67ca8c35.943d7,2,,2025-03-07
14:03:33 +08,2005/2,0,DEBUG,00000,"vacthresh: 222674.750000,anlthresh:
11371.118164, the j
oin1 has 2291275.000000 reltuples, pcnt_unfrozen: 0.485810,
pcnt_unrelallvisible: 0.049410 ",,,,,,,,,"","autovacuum worker",,0
Attachments:
Optimising-the-vacuum-algorithm-v3.diffapplication/octet-stream; name=Optimising-the-vacuum-algorithm-v3.diffDownload
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 800815dfbc..eeaf12f524 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2956,7 +2956,8 @@ relation_needs_vacanalyze(Oid relid,
/* number of vacuum (resp. analyze) tuples at this time */
float4 vactuples,
instuples,
- anltuples;
+ anltuples,
+ livetuples;
/* freeze parameters */
int freeze_max_age;
@@ -3052,6 +3053,7 @@ relation_needs_vacanalyze(Oid relid,
if (PointerIsValid(tabentry) && AutoVacuumingActive())
{
float4 pcnt_unfrozen = 1;
+ float4 pcnt_visibletuples = 1;
float4 reltuples = classForm->reltuples;
int32 relpages = classForm->relpages;
int32 relallfrozen = classForm->relallfrozen;
@@ -3059,6 +3061,7 @@ relation_needs_vacanalyze(Oid relid,
vactuples = tabentry->dead_tuples;
instuples = tabentry->ins_since_vacuum;
anltuples = tabentry->mod_since_analyze;
+ livetuples = tabentry->live_tuples;
/* If the table hasn't yet been vacuumed, take reltuples as zero */
if (reltuples < 0)
@@ -3081,14 +3084,26 @@ relation_needs_vacanalyze(Oid relid,
pcnt_unfrozen = 1 - ((float4) relallfrozen / relpages);
}
- vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
+ /*
+ * If we have data for visibletuples, calculate the visibletuples
+ * of the table to modify vac_scale_factor. This helps us decide
+ * whether or not to vacuum an update-heavy table based on the number
+ * of updates to the more "active" part of the table.
+ */
+ if (livetuples > 0 && vactuples > 0)
+ {
+ /* pcnt_visibletuples calculations.*/
+ pcnt_visibletuples = (float4) (livetuples / (livetuples + vactuples));
+ }
+
+ vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples * pcnt_visibletuples;
if (vac_max_thresh >= 0 && vacthresh > (float4) vac_max_thresh)
vacthresh = (float4) vac_max_thresh;
vacinsthresh = (float4) vac_ins_base_thresh +
vac_ins_scale_factor * reltuples * pcnt_unfrozen;
- anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
-
+ anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples * pcnt_visibletuples;
+ elog(DEBUG2, "vacthresh: %f,anlthresh: %f, the %s has %f reltuples, pcnt_unfrozen: %f, pcnt_visibletuples: %f ", vacthresh, anlthresh,NameStr(classForm->relname), reltuples, pcnt_unfrozen, pcnt_visibletuples);
/*
* Note that we don't need to take special consideration for stat
* reset, because if that happens, the last vacuum and analyze counts
On Fri, Mar 7, 2025 at 6:19 AM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
Sorry ,A wrong version of debug pcnt_visibletuples ,kindly please check the v3 attachment
I looked at v3. I think I need more than the logging message to
understand your goal here. Could you explain the algorithm and why you
think it makes sense and what scenarios it is meant to handle better?
Thinking about it conceptually, I don't think this makes sense:
pcnt_visibletuples = (float4) (livetuples / (livetuples + vactuples));
vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples *
pcnt_visibletuples
do_vacuum = vactuples > vacthresh
livetuples + deadtuples is approx reltuples (little more complicated
than this, but), so this is basically
livetuples/reltuples*reltuples -> livetuples
So vactuples > vacthresh is basically just deadtuples > livetuples
Maybe you think that we should be comparing the portion of the table
that is dead to the portion of the table that is live, but that
doesn't seem to be what you mean the algorithm to do based on the one
comment you have.
The anlthresh calculation is a different discussion, since
mod_since_analyze is calculated in a different way (tuples updated +
tuples inserted + tuples_deleted). But I am also skeptical of this
one.
I think you need to explain more conceptually about why you think
these ways of calculating the thresholds makes sense.
- Melanie
Hi Melanie Plageman
Thank you for your reply. My calculation logic is to calculate the
proportion of active tuples. What I really want to know is whether this
algorithm is correct and acceptable. The way I wrote it is mainly to
express that I want to calculate the percentage of active tuples. When this
proportion is relatively low, it is more likely to be triggered.for
example,A million rows of tables. it updated 199,000.
50+1000000 * 0.2 = 200050 ,
Use of new calculation methods approximately equal to 50+1000000 * 0.2 *
0.8= 160050 ,
If this algorithm is accepted ,I follow your suggestion or you provide a
patch for a better algorithm,I actually just want to promote these
calculation formulas. In fact, I highly admire the solution provided by SQL
Server.
On Fri, Mar 7, 2025 at 11:48 PM Melanie Plageman <melanieplageman@gmail.com>
wrote:
Show quoted text
On Fri, Mar 7, 2025 at 6:19 AM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
Sorry ,A wrong version of debug pcnt_visibletuples ,kindly please check
the v3 attachment
I looked at v3. I think I need more than the logging message to
understand your goal here. Could you explain the algorithm and why you
think it makes sense and what scenarios it is meant to handle better?Thinking about it conceptually, I don't think this makes sense:
pcnt_visibletuples = (float4) (livetuples / (livetuples + vactuples));
vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples *
pcnt_visibletuples
do_vacuum = vactuples > vacthreshlivetuples + deadtuples is approx reltuples (little more complicated
than this, but), so this is basically
livetuples/reltuples*reltuples -> livetuplesSo vactuples > vacthresh is basically just deadtuples > livetuples
Maybe you think that we should be comparing the portion of the table
that is dead to the portion of the table that is live, but that
doesn't seem to be what you mean the algorithm to do based on the one
comment you have.The anlthresh calculation is a different discussion, since
mod_since_analyze is calculated in a different way (tuples updated +
tuples inserted + tuples_deleted). But I am also skeptical of this
one.I think you need to explain more conceptually about why you think
these ways of calculating the thresholds makes sense.- Melanie