Introduce some randomness to autovacuum

Started by Junwang Zhao9 months ago14 messages
#1Junwang Zhao
zhjwpku@gmail.com

Hi hackers,

After watching Robert's talk[1]How Autovacuum Goes Wrong: And Can We Please Make It Stop Doing That? https://www.youtube.com/watch?v=RfTD-Twpvac on autovacuum and participating in the related
workshop yesterday, it appears that people are inclined to use prioritization
to address the issues highlighted in Robert's presentation. Here I list two
of the failure modes that were discussed.

- Spinning. Running repeatedly on the same table but not accomplishing
anything useful.
- Starvation. autovacuum can't vacuum everything that needs vacuuming.
- ...

The prioritization way needs some basic stuff that postgres doesn't have now.

I had a random thought that introducing some randomness might help
mitigate some of the issues mentioned above. Before performing vacuum
on the collected tables, we could rotate the table_oids list by a random
number within the range [0, list_length(table_oids)]. This way, every table
would have an equal chance of being vacuumed first, thus no spinning and
starvation.

Even if there is a broken table that repeatedly gets stuck, this random
approach would still provide opportunities for other tables to be vacuumed.
Eventually, the system would converge.

The change is something like the following, I haven't tested the code,
just posted it here for discussion, let me know your thoughts.

diff --git a/src/backend/postmaster/autovacuum.c
b/src/backend/postmaster/autovacuum.c
index 16756152b71..6dddd273d22 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -79,6 +79,7 @@
 #include "catalog/pg_namespace.h"
 #include "commands/dbcommands.h"
 #include "commands/vacuum.h"
+#include "common/pg_prng.h"
 #include "common/int.h"
 #include "lib/ilist.h"
 #include "libpq/pqsignal.h"
@@ -2267,6 +2268,25 @@ do_autovacuum(void)

"Autovacuum Portal",

ALLOCSET_DEFAULT_SIZES);

+       /*
+        * Randomly rotate the list of tables to vacuum.  This is to avoid
+        * always vacuuming the same table first, which could lead to spinning
+        * on the same table or vacuuming starvation.
+        */
+       if (list_length(table_oids) > 2)
+       {
+               int rand = 0;
+               static pg_prng_state prng_state;
+               List       *tmp_oids = NIL;
+
+               pg_prng_seed(&prng_state, (uint64) (getpid() ^ time(NULL)));
+               rand = (int) pg_prng_uint64_range(&prng_state, 0,
list_length(table_oids) - 1);
+               if (rand != 0) {
+                       tmp_oids = list_copy_tail(table_oids, rand);
+                       table_oids = list_copy_head(table_oids,
list_length(table_oids) - rand);
+                       table_oids = list_concat(table_oids, tmp_oids);
+               }
+       }
        /*
         * Perform operations on collected tables.
         */

[1]: How Autovacuum Goes Wrong: And Can We Please Make It Stop Doing That? https://www.youtube.com/watch?v=RfTD-Twpvac
That? https://www.youtube.com/watch?v=RfTD-Twpvac

--
Regards
Junwang Zhao

#2wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Junwang Zhao (#1)
Re: Introduce some randomness to autovacuum

Hi,I like your idea,It would be even better if the weights could be taken
according to the larger tables

On Fri, 25 Apr 2025 at 22:03, Junwang Zhao <zhjwpku@gmail.com> wrote:

Show quoted text

Hi hackers,

After watching Robert's talk[1] on autovacuum and participating in the
related
workshop yesterday, it appears that people are inclined to use
prioritization
to address the issues highlighted in Robert's presentation. Here I list two
of the failure modes that were discussed.

- Spinning. Running repeatedly on the same table but not accomplishing
anything useful.
- Starvation. autovacuum can't vacuum everything that needs vacuuming.
- ...

The prioritization way needs some basic stuff that postgres doesn't have
now.

I had a random thought that introducing some randomness might help
mitigate some of the issues mentioned above. Before performing vacuum
on the collected tables, we could rotate the table_oids list by a random
number within the range [0, list_length(table_oids)]. This way, every table
would have an equal chance of being vacuumed first, thus no spinning and
starvation.

Even if there is a broken table that repeatedly gets stuck, this random
approach would still provide opportunities for other tables to be vacuumed.
Eventually, the system would converge.

The change is something like the following, I haven't tested the code,
just posted it here for discussion, let me know your thoughts.

diff --git a/src/backend/postmaster/autovacuum.c
b/src/backend/postmaster/autovacuum.c
index 16756152b71..6dddd273d22 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -79,6 +79,7 @@
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/vacuum.h"
+#include "common/pg_prng.h"
#include "common/int.h"
#include "lib/ilist.h"
#include "libpq/pqsignal.h"
@@ -2267,6 +2268,25 @@ do_autovacuum(void)

"Autovacuum Portal",

ALLOCSET_DEFAULT_SIZES);

+       /*
+        * Randomly rotate the list of tables to vacuum.  This is to avoid
+        * always vacuuming the same table first, which could lead to
spinning
+        * on the same table or vacuuming starvation.
+        */
+       if (list_length(table_oids) > 2)
+       {
+               int rand = 0;
+               static pg_prng_state prng_state;
+               List       *tmp_oids = NIL;
+
+               pg_prng_seed(&prng_state, (uint64) (getpid() ^
time(NULL)));
+               rand = (int) pg_prng_uint64_range(&prng_state, 0,
list_length(table_oids) - 1);
+               if (rand != 0) {
+                       tmp_oids = list_copy_tail(table_oids, rand);
+                       table_oids = list_copy_head(table_oids,
list_length(table_oids) - rand);
+                       table_oids = list_concat(table_oids, tmp_oids);
+               }
+       }
/*
* Perform operations on collected tables.
*/

[1] How Autovacuum Goes Wrong: And Can We Please Make It Stop Doing
That? https://www.youtube.com/watch?v=RfTD-Twpvac

--
Regards
Junwang Zhao

#3Nikita Malakhov
hukutoc@gmail.com
In reply to: wenhui qiu (#2)
Re: Introduce some randomness to autovacuum

Hi!

I agree it is a good idea to shift the table list. Although vacuuming
larger tables first
is a questionable approach because smaller ones could wait a long time to
be vacuumed.
It looks like the most obvious and simple way is that the first table to be
vacuumed
should not be the first one from the previous iteration.

On Fri, Apr 25, 2025 at 6:04 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:

Hi,I like your idea,It would be even better if the weights could be taken
according to the larger tables

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/

#4Junwang Zhao
zhjwpku@gmail.com
In reply to: Nikita Malakhov (#3)
1 attachment(s)
Re: Introduce some randomness to autovacuum

Hi Nikita, wenhui,

On Fri, Apr 25, 2025 at 11:16 PM Nikita Malakhov <hukutoc@gmail.com> wrote:

Hi!

I agree it is a good idea to shift the table list. Although vacuuming larger tables first
is a questionable approach because smaller ones could wait a long time to be vacuumed.
It looks like the most obvious and simple way is that the first table to be vacuumed
should not be the first one from the previous iteration.

On Fri, Apr 25, 2025 at 6:04 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:

Hi,I like your idea,It would be even better if the weights could be taken according to the larger tables

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/

Thanks for your feedback.

I ended up with adding a guc configuration that may support different vacuum
strategies. I name it as `autovacuum_vacuum_strategy` but you might have
a better one. For now it support only two strategies:

1. Sequential: Tables are vacuumed in the order they are collected.
2. Random: The list of tables is rotated around a randomly chosen
pivot before vacuuming to avoid always starting with the same
table, which prevents vacuuming starvation for some tables.

We can extend this strategy like prioritization and whatever algorithms
in the future.

--
Regards
Junwang Zhao

Attachments:

v1-0001-Introduce-autovacuum-vacuum-strategy.patchapplication/octet-stream; name=v1-0001-Introduce-autovacuum-vacuum-strategy.patchDownload
From 3763c4813593af7267c37203a09e3d04037cf304 Mon Sep 17 00:00:00 2001
From: Junwang Zhao <zhjwpku@gmail.com>
Date: Wed, 30 Apr 2025 10:20:16 +0000
Subject: [PATCH v1] Introduce autovacuum vacuum strategy

Add a new configuration option `autovacuum_vacuum_strategy` to
control the order in which tables are vacuumed by the autovacuum
process. Two strategies are supported for now:

1. Sequential: Tables are vacuumed in the order they are collected.
2. Random: The list of tables is rotated around a randomly chosen
   pivot before vacuuming to avoid always starting with the same
   table, which prevents vacuuming starvation for some tables.

The default strategy is `sequential`.

Signed-off-by: Junwang Zhao <zhjwpku@gmail.com>
---
 src/backend/postmaster/autovacuum.c           | 33 +++++++++++++++++++
 src/backend/utils/misc/guc_tables.c           | 10 ++++++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/postmaster/autovacuum.h           |  7 ++++
 src/include/utils/guc.h                       |  1 +
 5 files changed, 52 insertions(+)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 16756152b71..2452d893533 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -79,6 +79,7 @@
 #include "catalog/pg_namespace.h"
 #include "commands/dbcommands.h"
 #include "commands/vacuum.h"
+#include "common/pg_prng.h"
 #include "common/int.h"
 #include "lib/ilist.h"
 #include "libpq/pqsignal.h"
@@ -132,9 +133,16 @@ int			autovacuum_multixact_freeze_max_age;
 
 double		autovacuum_vac_cost_delay;
 int			autovacuum_vac_cost_limit;
+int			autovacuum_vac_strategy;
 
 int			Log_autovacuum_min_duration = 600000;
 
+const struct config_enum_entry autovacuum_vac_strategy_options[] = {
+	{"sequential", AUTOVACUUM_VAC_STRATEGY_SEQUENTIAL, false},
+	{"random", AUTOVACUUM_VAC_STRATEGY_RANDOM, false},
+	{NULL, 0, false}
+};
+
 /* the minimum allowed time between two awakenings of the launcher */
 #define MIN_AUTOVAC_SLEEPTIME 100.0 /* milliseconds */
 #define MAX_AUTOVAC_SLEEPTIME 300	/* seconds */
@@ -2267,6 +2275,30 @@ do_autovacuum(void)
 										  "Autovacuum Portal",
 										  ALLOCSET_DEFAULT_SIZES);
 
+	/*
+	 * Randomly rotate the list of tables to vacuum.  This is to avoid always
+	 * vacuuming the same table first, which could lead to spinning on the
+	 * same table or vacuuming starvation.
+	 */
+	if (list_length(table_oids) > 1 && autovacuum_vac_strategy == AUTOVACUUM_VAC_STRATEGY_RANDOM)
+	{
+
+		int			rand = 0;
+		static pg_prng_state prng_state;
+		List	   *tmp_oids = NIL;
+
+		pg_prng_seed(&prng_state, (uint64) (getpid() ^ time(NULL)));
+		rand = (int) pg_prng_uint64_range(&prng_state, 0,
+										  list_length(table_oids) - 1);
+		if (rand != 0)
+		{
+			tmp_oids = list_copy_tail(table_oids, rand);
+			table_oids = list_copy_head(table_oids,
+										list_length(table_oids) - rand);
+			table_oids = list_concat(table_oids, tmp_oids);
+		}
+	}
+
 	/*
 	 * Perform operations on collected tables.
 	 */
@@ -3282,6 +3314,7 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
 		workitem->avw_used = true;
 		workitem->avw_active = false;
 		workitem->avw_type = type;
+
 		workitem->avw_database = MyDatabaseId;
 		workitem->avw_relation = relationId;
 		workitem->avw_blockNumber = blkno;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..c928106fd50 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -5418,6 +5418,16 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, assign_io_method, NULL
 	},
 
+	{
+		{"autovacuum_vacuum_strategy", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+			gettext_noop("Vacuum strategy for autovacuum."),
+			NULL
+		},
+		&autovacuum_vac_strategy,
+		AUTOVACUUM_VAC_STRATEGY_SEQUENTIAL, autovacuum_vac_strategy_options,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..5c44944008f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,6 +713,7 @@ autovacuum_worker_slots = 16	# autovacuum worker slots to allocate
 #autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for
 					# autovacuum, -1 means use
 					# vacuum_cost_limit
+#autovacuum_vacuum_strategy = sequential
 
 # - Cost-Based Vacuum Delay -
 
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..218ea4ee1c8 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -25,6 +25,12 @@ typedef enum
 	AVW_BRINSummarizeRange,
 } AutoVacuumWorkItemType;
 
+/* Autovacuum vacuum strategies */
+enum AutovacuumVacStrategy
+{
+	AUTOVACUUM_VAC_STRATEGY_SEQUENTIAL = 0,
+	AUTOVACUUM_VAC_STRATEGY_RANDOM,
+};
 
 /* GUC variables */
 extern PGDLLIMPORT bool autovacuum_start_daemon;
@@ -43,6 +49,7 @@ extern PGDLLIMPORT int autovacuum_freeze_max_age;
 extern PGDLLIMPORT int autovacuum_multixact_freeze_max_age;
 extern PGDLLIMPORT double autovacuum_vac_cost_delay;
 extern PGDLLIMPORT int autovacuum_vac_cost_limit;
+extern PGDLLIMPORT int autovacuum_vac_strategy;
 
 /* autovacuum launcher PID, only valid when worker is shutting down */
 extern PGDLLIMPORT int AutovacuumLauncherPid;
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index f619100467d..2f5c5363b03 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -322,6 +322,7 @@ extern PGDLLIMPORT const struct config_enum_entry io_method_options[];
 extern PGDLLIMPORT const struct config_enum_entry recovery_target_action_options[];
 extern PGDLLIMPORT const struct config_enum_entry wal_level_options[];
 extern PGDLLIMPORT const struct config_enum_entry wal_sync_method_options[];
+extern PGDLLIMPORT const struct config_enum_entry autovacuum_vac_strategy_options[];
 
 /*
  * Functions exported by guc.c
-- 
2.39.5

#5Nathan Bossart
nathandbossart@gmail.com
In reply to: Junwang Zhao (#1)
Re: Introduce some randomness to autovacuum

On Fri, Apr 25, 2025 at 10:02:49PM +0800, Junwang Zhao wrote:

After watching Robert's talk[1] on autovacuum and participating in the related
workshop yesterday, it appears that people are inclined to use prioritization
to address the issues highlighted in Robert's presentation. Here I list two
of the failure modes that were discussed.

- Spinning. Running repeatedly on the same table but not accomplishing
anything useful.
- Starvation. autovacuum can't vacuum everything that needs vacuuming.
- ...

The prioritization way needs some basic stuff that postgres doesn't have now.

I had a random thought that introducing some randomness might help
mitigate some of the issues mentioned above. Before performing vacuum
on the collected tables, we could rotate the table_oids list by a random
number within the range [0, list_length(table_oids)]. This way, every table
would have an equal chance of being vacuumed first, thus no spinning and
starvation.

Even if there is a broken table that repeatedly gets stuck, this random
approach would still provide opportunities for other tables to be vacuumed.
Eventually, the system would converge.

First off, thank you for thinking about this problem and for sharing your
thoughts. Adding randomness to solve this is a creative idea.

That being said, I am -1 for this proposal. Autovacuum parameters and
scheduling are already quite complicated, and making it nondeterministic
would add an additional layer of complexity (and may introduce its own
problems). But more importantly, IMHO it masks the problems instead of
solving them more directly, and it could mask future problems, too. It'd
probably behoove us to think about the known problems more deeply and to
craft more targeted solutions.

--
nathan

#6Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#5)
Re: Introduce some randomness to autovacuum

- Spinning. Running repeatedly on the same table but not accomplishing
anything useful.

But more importantly, IMHO it masks the problems instead of
solving them more directly, and it could mask future problems, too

To add more to Nathan's comment about masking future problems,
this will not solve the "spinning" problem because if the most common
reason for this is a long-running transaction, etc., all your tables will
eventually end up with wasted vacuum cycles because the xmin
horizon is not advancing.

--
Sami Imseih

#7Greg Sabino Mullane
htamfids@gmail.com
In reply to: Junwang Zhao (#4)
Re: Introduce some randomness to autovacuum

On Wed, Apr 30, 2025 at 10:07 AM Junwang Zhao <zhjwpku@gmail.com> wrote:

I ended up with adding a guc configuration that may support different
vacuum
strategies.

+1 to this: it's a good solution to a tricky problem. I would be a -1 if
this were not a GUC.

Yes, it is masking the problem, but maybe a better way to think about it is
that it is delaying the performance impact, allowing more time for a manual
intervention of the problematic table(s).

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support

#8Sami Imseih
samimseih@gmail.com
In reply to: Greg Sabino Mullane (#7)
Re: Introduce some randomness to autovacuum

Yes, it is masking the problem, but maybe a better way to think about it is that it is delaying the
performance impact, allowing more time for a manual intervention of the problematic table(s).

I question how the user will gauge the success of setting the strategy
to "random"? They may make
it random by default, but fall into the same issues and revert it back
to the default strategy.

But also, the key as you mention is "manual intervention" which
requires proper monitoring. I will
argue that for the two cases that this proposal is seeking to correct,
we already have good
solutions that could be implemented by a user.

Let's take the "spinning" case again. If a table has some sort of
problem causing
vacuum to error out, one can just disable autovacuum on a per-table
level and correct
the issue. Also, the xmin horizon being held back ( which is claimed
to be the most common cause,
and I agree with that ), well that one is just going to cause all your
autovacuums to become
useless.

Also, I do think the starvation problem has a good answer now that
autovacuum_max_workers
can be modified online. Maybe something can be done for autovacuum to
auto-tune this
setting to give more workers at times when it's needed. Not sure what
that looks like,
but it is more possible now that this setting does not require a restart.

--
Sami Imseih
Amazon Web Services (AWS)

#9Robert Treat
rob@xzilla.net
In reply to: Sami Imseih (#8)
Re: Introduce some randomness to autovacuum

On Wed, Apr 30, 2025 at 1:56 PM Sami Imseih <samimseih@gmail.com> wrote:

Yes, it is masking the problem, but maybe a better way to think about it is that it is delaying the
performance impact, allowing more time for a manual intervention of the problematic table(s).

I question how the user will gauge the success of setting the strategy
to "random"? They may make
it random by default, but fall into the same issues and revert it back
to the default strategy.

But also, the key as you mention is "manual intervention" which
requires proper monitoring. I will
argue that for the two cases that this proposal is seeking to correct,
we already have good
solutions that could be implemented by a user.

I would have a lot more faith in this discussion if there was any kind
of external solution that had gained popularity as a general solution,
but this doesn't seem to be the case (and trying to wedge something in
the server will likely hurt that kind of research.

As an example, the first fallacy of autovacuum management is the idea
that a single strategy will always work. Having implemented a number
of crude vacuum management systems in user space already, I know that
I have run into multiple cases where I had to essentially build two
different "queues" of vacuums (one for xids, one for bloat) to be fed
into Postgres so as not to be blocked (in either direction) by
conflicting priorities no matter how wonky things got. I can imagine a
set of gucs that we could put in to try to mimic such types of
behavior, but I imagine it would take quite a few rounds before we got
the behavior correct.

Robert Treat
https://xzilla.net

#10David Rowley
dgrowleyml@gmail.com
In reply to: Nathan Bossart (#5)
Re: Introduce some randomness to autovacuum

On Thu, 1 May 2025 at 03:29, Nathan Bossart <nathandbossart@gmail.com> wrote:

That being said, I am -1 for this proposal. Autovacuum parameters and
scheduling are already quite complicated, and making it nondeterministic
would add an additional layer of complexity (and may introduce its own
problems). But more importantly, IMHO it masks the problems instead of
solving them more directly, and it could mask future problems, too. It'd
probably behoove us to think about the known problems more deeply and to
craft more targeted solutions.

-1 from me too.

It sounds like the aim is to fix the problem with autovacuum vacuuming
the same table over and over and being unable to remove enough dead
tuples due to something holding back the oldest xmin horizon. Why
can't we just fix that by remembering the value that
VacuumCutoffs.OldestXmin and only coming back to that table once
that's moved forward some amount?

David

#11Junwang Zhao
zhjwpku@gmail.com
In reply to: Sami Imseih (#8)
Re: Introduce some randomness to autovacuum

Hi Sami,

On Thu, May 1, 2025 at 1:56 AM Sami Imseih <samimseih@gmail.com> wrote:

Yes, it is masking the problem, but maybe a better way to think about it is that it is delaying the
performance impact, allowing more time for a manual intervention of the problematic table(s).

I question how the user will gauge the success of setting the strategy
to "random"? They may make
it random by default, but fall into the same issues and revert it back
to the default strategy.

But also, the key as you mention is "manual intervention" which
requires proper monitoring. I will
argue that for the two cases that this proposal is seeking to correct,
we already have good
solutions that could be implemented by a user.

Let's take the "spinning" case again. If a table has some sort of
problem causing
vacuum to error out, one can just disable autovacuum on a per-table
level and correct
the issue. Also, the xmin horizon being held back ( which is claimed
to be the most common cause,
and I agree with that ), well that one is just going to cause all your
autovacuums to become
useless.

Yeah, I tend to agree with you that the xmin horizon hold back will
make autovacuums to become useless for all tables.

But I have a question, let me quote Andres' comment on slack first:

```quote begin
It seems a bit silly to not just do some basic prioritization instead,
but perhaps we just need to reach for some basic stuff, given that
we seem unable to progress on prioritization.
```quote end

If randomness is not working, ISTM that the prioritization will not benefit
the "spinning" case too, am I right?

Also, I do think the starvation problem has a good answer now that
autovacuum_max_workers
can be modified online. Maybe something can be done for autovacuum to
auto-tune this
setting to give more workers at times when it's needed. Not sure what
that looks like,
but it is more possible now that this setting does not require a restart.

Good to know, thanks.

One case I didn't mention is that some corruption due to vacuuming the
same table might starve other tables two, randomness gives other tables
some chances to be vacuumed. I do admit that multi vacuum workers
can eliminate this issue a little bit if the corrupted table's vacuum progress
lasts for some time, but I think randomness is much better.

--
Sami Imseih
Amazon Web Services (AWS)

--
Regards
Junwang Zhao

#12Junwang Zhao
zhjwpku@gmail.com
In reply to: David Rowley (#10)
Re: Introduce some randomness to autovacuum

On Thu, May 1, 2025 at 8:12 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 1 May 2025 at 03:29, Nathan Bossart <nathandbossart@gmail.com> wrote:

That being said, I am -1 for this proposal. Autovacuum parameters and
scheduling are already quite complicated, and making it nondeterministic
would add an additional layer of complexity (and may introduce its own
problems). But more importantly, IMHO it masks the problems instead of
solving them more directly, and it could mask future problems, too. It'd
probably behoove us to think about the known problems more deeply and to
craft more targeted solutions.

-1 from me too.

It sounds like the aim is to fix the problem with autovacuum vacuuming
the same table over and over and being unable to remove enough dead
tuples due to something holding back the oldest xmin horizon. Why
can't we just fix that by remembering the value that
VacuumCutoffs.OldestXmin and only coming back to that table once
that's moved forward some amount?

Users expect the tables to be auto vacuumed when:
*dead_tuples > vac_base_thresh + vac_scale_factor * reltuples*
If we depend on xid moving forward to do autovacuum, I think
there are chances some bloated tables won't be vacuumed?

David

--
Regards
Junwang Zhao

#13David Rowley
dgrowleyml@gmail.com
In reply to: Junwang Zhao (#12)
Re: Introduce some randomness to autovacuum

On Thu, 1 May 2025 at 17:35, Junwang Zhao <zhjwpku@gmail.com> wrote:

On Thu, May 1, 2025 at 8:12 AM David Rowley <dgrowleyml@gmail.com> wrote:

It sounds like the aim is to fix the problem with autovacuum vacuuming
the same table over and over and being unable to remove enough dead
tuples due to something holding back the oldest xmin horizon. Why
can't we just fix that by remembering the value that
VacuumCutoffs.OldestXmin and only coming back to that table once
that's moved forward some amount?

Users expect the tables to be auto vacuumed when:
*dead_tuples > vac_base_thresh + vac_scale_factor * reltuples*
If we depend on xid moving forward to do autovacuum, I think
there are chances some bloated tables won't be vacuumed?

Can you explain why you think that? The idea is to start vacuum other
tables that perhaps can have dead tuples removed instead of repeating
vacuums on the same table over and over without any chance of being
able to remove any more dead tuples than we could during the last
vacuum.

David

#14Junwang Zhao
zhjwpku@gmail.com
In reply to: David Rowley (#13)
2 attachment(s)
Re: Introduce some randomness to autovacuum

Hi David,

On Thu, May 1, 2025 at 2:15 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 1 May 2025 at 17:35, Junwang Zhao <zhjwpku@gmail.com> wrote:

On Thu, May 1, 2025 at 8:12 AM David Rowley <dgrowleyml@gmail.com> wrote:

It sounds like the aim is to fix the problem with autovacuum vacuuming
the same table over and over and being unable to remove enough dead
tuples due to something holding back the oldest xmin horizon. Why
can't we just fix that by remembering the value that
VacuumCutoffs.OldestXmin and only coming back to that table once
that's moved forward some amount?

Users expect the tables to be auto vacuumed when:
*dead_tuples > vac_base_thresh + vac_scale_factor * reltuples*
If we depend on xid moving forward to do autovacuum, I think
there are chances some bloated tables won't be vacuumed?

Can you explain why you think that? The idea is to start vacuum other
tables that perhaps can have dead tuples removed instead of repeating
vacuums on the same table over and over without any chance of being
able to remove any more dead tuples than we could during the last
vacuum.

Sorry, I must have misinterpreted this sentence
**by remembering the value that VacuumCutoffs.OldestXmin**.

How about reporting the number of dead tuples removed by
the last autovacuum and using that statistic to determine whether
to vacuum the table in the next round?

I can think of a simple algorithm like this:

If the last autovacuum removed fewer than 10% of the dead tuples,
then skip vacuuming this table for a while. Numbers used are chosen
arbitrarily, and this is patch v2-0002.

v2-0001 is the same as the last email.

David

--
Regards
Junwang Zhao

Attachments:

v2-0002-skip-vacuuming-the-table-if-last-time-vacuum-perc.patchapplication/octet-stream; name=v2-0002-skip-vacuuming-the-table-if-last-time-vacuum-perc.patchDownload
From 54196b001ca8df9e621811d5204d956851784683 Mon Sep 17 00:00:00 2001
From: Junwang Zhao <zhjwpku@gmail.com>
Date: Thu, 1 May 2025 10:14:08 +0000
Subject: [PATCH v2 2/2] skip vacuuming the table if last time vacuum percent
 less than 10

---
 src/backend/postmaster/autovacuum.c          | 16 ++++++++++++++++
 src/backend/utils/activity/pgstat_relation.c |  7 +++++++
 src/include/pgstat.h                         |  1 +
 3 files changed, 24 insertions(+)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 2452d893533..06b91f31d9e 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3093,6 +3093,22 @@ relation_needs_vacanalyze(Oid relid,
 		float4		reltuples = classForm->reltuples;
 		int32		relpages = classForm->relpages;
 		int32		relallfrozen = classForm->relallfrozen;
+		float4		last_remove_tuples_percent = tabentry->last_autovacuum_removed_tuples_percent;
+		TimestampTz last_autovacuum_time = tabentry->last_autovacuum_time;
+
+		/*
+		 * If the last autovacuum removed tuples is less than 10% of the
+		 * current dead tuples, then skip vacuuming this table for some time.
+		 */
+		if (!TimestampDifferenceExceedsSeconds(last_autovacuum_time,
+											   GetCurrentTimestamp(),
+											   autovacuum_naptime) &&
+			last_remove_tuples_percent < 10.0)
+		{
+			*doanalyze = false;
+			*dovacuum = false;
+			return;
+		}
 
 		vactuples = tabentry->dead_tuples;
 		instuples = tabentry->ins_since_vacuum;
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index eeb2d43cb10..d712ec18199 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -217,6 +217,7 @@ pgstat_report_vacuum(Oid tableoid, bool shared,
 	Oid			dboid = (shared ? InvalidOid : MyDatabaseId);
 	TimestampTz ts;
 	PgStat_Counter elapsedtime;
+	PgStat_Counter old_dead_tuples;
 
 	if (!pgstat_track_counts)
 		return;
@@ -232,6 +233,7 @@ pgstat_report_vacuum(Oid tableoid, bool shared,
 	shtabentry = (PgStatShared_Relation *) entry_ref->shared_stats;
 	tabentry = &shtabentry->stats;
 
+	old_dead_tuples = tabentry->dead_tuples;
 	tabentry->live_tuples = livetuples;
 	tabentry->dead_tuples = deadtuples;
 
@@ -250,6 +252,11 @@ pgstat_report_vacuum(Oid tableoid, bool shared,
 	if (AmAutoVacuumWorkerProcess())
 	{
 		tabentry->last_autovacuum_time = ts;
+		if (old_dead_tuples > deadtuples)
+			tabentry->last_autovacuum_removed_tuples_percent =
+				100 * (old_dead_tuples - deadtuples) / old_dead_tuples;
+		else
+			tabentry->last_autovacuum_removed_tuples_percent = 0;
 		tabentry->autovacuum_count++;
 		tabentry->total_autovacuum_time += elapsedtime;
 	}
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 378f2f2c2ba..417699e1fc2 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -443,6 +443,7 @@ typedef struct PgStat_StatTabEntry
 	TimestampTz last_vacuum_time;	/* user initiated vacuum */
 	PgStat_Counter vacuum_count;
 	TimestampTz last_autovacuum_time;	/* autovacuum initiated */
+	PgStat_Counter last_autovacuum_removed_tuples_percent;
 	PgStat_Counter autovacuum_count;
 	TimestampTz last_analyze_time;	/* user initiated */
 	PgStat_Counter analyze_count;
-- 
2.39.5

v2-0001-Introduce-autovacuum-vacuum-strategy.patchapplication/octet-stream; name=v2-0001-Introduce-autovacuum-vacuum-strategy.patchDownload
From 3763c4813593af7267c37203a09e3d04037cf304 Mon Sep 17 00:00:00 2001
From: Junwang Zhao <zhjwpku@gmail.com>
Date: Wed, 30 Apr 2025 10:20:16 +0000
Subject: [PATCH v2 1/2] Introduce autovacuum vacuum strategy

Add a new configuration option `autovacuum_vacuum_strategy` to
control the order in which tables are vacuumed by the autovacuum
process. Two strategies are supported for now:

1. Sequential: Tables are vacuumed in the order they are collected.
2. Random: The list of tables is rotated around a randomly chosen
   pivot before vacuuming to avoid always starting with the same
   table, which prevents vacuuming starvation for some tables.

The default strategy is `sequential`.

Signed-off-by: Junwang Zhao <zhjwpku@gmail.com>
---
 src/backend/postmaster/autovacuum.c           | 33 +++++++++++++++++++
 src/backend/utils/misc/guc_tables.c           | 10 ++++++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/postmaster/autovacuum.h           |  7 ++++
 src/include/utils/guc.h                       |  1 +
 5 files changed, 52 insertions(+)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 16756152b71..2452d893533 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -79,6 +79,7 @@
 #include "catalog/pg_namespace.h"
 #include "commands/dbcommands.h"
 #include "commands/vacuum.h"
+#include "common/pg_prng.h"
 #include "common/int.h"
 #include "lib/ilist.h"
 #include "libpq/pqsignal.h"
@@ -132,9 +133,16 @@ int			autovacuum_multixact_freeze_max_age;
 
 double		autovacuum_vac_cost_delay;
 int			autovacuum_vac_cost_limit;
+int			autovacuum_vac_strategy;
 
 int			Log_autovacuum_min_duration = 600000;
 
+const struct config_enum_entry autovacuum_vac_strategy_options[] = {
+	{"sequential", AUTOVACUUM_VAC_STRATEGY_SEQUENTIAL, false},
+	{"random", AUTOVACUUM_VAC_STRATEGY_RANDOM, false},
+	{NULL, 0, false}
+};
+
 /* the minimum allowed time between two awakenings of the launcher */
 #define MIN_AUTOVAC_SLEEPTIME 100.0 /* milliseconds */
 #define MAX_AUTOVAC_SLEEPTIME 300	/* seconds */
@@ -2267,6 +2275,30 @@ do_autovacuum(void)
 										  "Autovacuum Portal",
 										  ALLOCSET_DEFAULT_SIZES);
 
+	/*
+	 * Randomly rotate the list of tables to vacuum.  This is to avoid always
+	 * vacuuming the same table first, which could lead to spinning on the
+	 * same table or vacuuming starvation.
+	 */
+	if (list_length(table_oids) > 1 && autovacuum_vac_strategy == AUTOVACUUM_VAC_STRATEGY_RANDOM)
+	{
+
+		int			rand = 0;
+		static pg_prng_state prng_state;
+		List	   *tmp_oids = NIL;
+
+		pg_prng_seed(&prng_state, (uint64) (getpid() ^ time(NULL)));
+		rand = (int) pg_prng_uint64_range(&prng_state, 0,
+										  list_length(table_oids) - 1);
+		if (rand != 0)
+		{
+			tmp_oids = list_copy_tail(table_oids, rand);
+			table_oids = list_copy_head(table_oids,
+										list_length(table_oids) - rand);
+			table_oids = list_concat(table_oids, tmp_oids);
+		}
+	}
+
 	/*
 	 * Perform operations on collected tables.
 	 */
@@ -3282,6 +3314,7 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
 		workitem->avw_used = true;
 		workitem->avw_active = false;
 		workitem->avw_type = type;
+
 		workitem->avw_database = MyDatabaseId;
 		workitem->avw_relation = relationId;
 		workitem->avw_blockNumber = blkno;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..c928106fd50 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -5418,6 +5418,16 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, assign_io_method, NULL
 	},
 
+	{
+		{"autovacuum_vacuum_strategy", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+			gettext_noop("Vacuum strategy for autovacuum."),
+			NULL
+		},
+		&autovacuum_vac_strategy,
+		AUTOVACUUM_VAC_STRATEGY_SEQUENTIAL, autovacuum_vac_strategy_options,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..5c44944008f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -713,6 +713,7 @@ autovacuum_worker_slots = 16	# autovacuum worker slots to allocate
 #autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for
 					# autovacuum, -1 means use
 					# vacuum_cost_limit
+#autovacuum_vacuum_strategy = sequential
 
 # - Cost-Based Vacuum Delay -
 
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..218ea4ee1c8 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -25,6 +25,12 @@ typedef enum
 	AVW_BRINSummarizeRange,
 } AutoVacuumWorkItemType;
 
+/* Autovacuum vacuum strategies */
+enum AutovacuumVacStrategy
+{
+	AUTOVACUUM_VAC_STRATEGY_SEQUENTIAL = 0,
+	AUTOVACUUM_VAC_STRATEGY_RANDOM,
+};
 
 /* GUC variables */
 extern PGDLLIMPORT bool autovacuum_start_daemon;
@@ -43,6 +49,7 @@ extern PGDLLIMPORT int autovacuum_freeze_max_age;
 extern PGDLLIMPORT int autovacuum_multixact_freeze_max_age;
 extern PGDLLIMPORT double autovacuum_vac_cost_delay;
 extern PGDLLIMPORT int autovacuum_vac_cost_limit;
+extern PGDLLIMPORT int autovacuum_vac_strategy;
 
 /* autovacuum launcher PID, only valid when worker is shutting down */
 extern PGDLLIMPORT int AutovacuumLauncherPid;
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index f619100467d..2f5c5363b03 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -322,6 +322,7 @@ extern PGDLLIMPORT const struct config_enum_entry io_method_options[];
 extern PGDLLIMPORT const struct config_enum_entry recovery_target_action_options[];
 extern PGDLLIMPORT const struct config_enum_entry wal_level_options[];
 extern PGDLLIMPORT const struct config_enum_entry wal_sync_method_options[];
+extern PGDLLIMPORT const struct config_enum_entry autovacuum_vac_strategy_options[];
 
 /*
  * Functions exported by guc.c
-- 
2.39.5