do only critical work during single-user vacuum?

Started by John Naylorabout 4 years ago97 messages
#1John Naylor
john.naylor@enterprisedb.com

When a user must shut down and restart in single-user mode to run
vacuum on an entire database, that does a lot of work that's
unnecessary for getting the system online again, even without
index_cleanup. We had a recent case where a single-user vacuum took
around 3 days to complete.

Now that we have a concept of a fail-safe vacuum, maybe it would be
beneficial to skip a vacuum in single-user mode if the fail-safe
criteria were not met at the beginning of vacuuming a relation. This
is not without risk, of course, but it should be much faster than
today and once up and running the admin would have a chance to get a
handle on things. Thoughts?

--
John Naylor
EDB: http://www.enterprisedb.com

#2Bossart, Nathan
bossartn@amazon.com
In reply to: John Naylor (#1)
Re: do only critical work during single-user vacuum?

On 12/9/21, 11:34 AM, "John Naylor" <john.naylor@enterprisedb.com> wrote:

When a user must shut down and restart in single-user mode to run
vacuum on an entire database, that does a lot of work that's
unnecessary for getting the system online again, even without
index_cleanup. We had a recent case where a single-user vacuum took
around 3 days to complete.

Now that we have a concept of a fail-safe vacuum, maybe it would be
beneficial to skip a vacuum in single-user mode if the fail-safe
criteria were not met at the beginning of vacuuming a relation. This
is not without risk, of course, but it should be much faster than
today and once up and running the admin would have a chance to get a
handle on things. Thoughts?

Would the --min-xid-age and --no-index-cleanup vacuumdb options help
with this?

Nathan

In reply to: John Naylor (#1)
Re: do only critical work during single-user vacuum?

On Thu, Dec 9, 2021 at 11:28 AM John Naylor
<john.naylor@enterprisedb.com> wrote:

Now that we have a concept of a fail-safe vacuum, maybe it would be
beneficial to skip a vacuum in single-user mode if the fail-safe
criteria were not met at the beginning of vacuuming a relation.

Obviously the main goal of the failsafe is to not get into this
situation in the first place. But it's still very reasonable to ask
"what happens when the failsafe even fails at that?". This was
something that we considered directly when working on the feature.

There is a precheck that takes place before any other work, which
ensures that we won't even start off any of the nonessential tasks the
failsafe skips (e.g., index vacuuming). The precheck works like any
other check -- it checks if relfrozenxid is dangerously old. (We won't
even bother trying to launch parallel workers when this precheck
triggers, which is another reason to have it that Mashahiko pointed
out during development.)

Presumably there is no need to specifically check if we're running in
single user mode when considering if we need to trigger the failsafe
-- which, as you say, we won't do. It shouldn't matter, because
anybody running single-user mode just to VACUUM must already be unable
to allocate new XIDs outside of single user mode. That condition alone
will trigger the failsafe.

That said, it would be very easy to add a check for single user mode.
It didn't happen because we weren't aware of any specific need for it.
Perhaps there is an argument for it.

--
Peter Geoghegan

In reply to: Peter Geoghegan (#3)
Re: do only critical work during single-user vacuum?

On Thu, Dec 9, 2021 at 1:04 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Thu, Dec 9, 2021 at 11:28 AM John Naylor
<john.naylor@enterprisedb.com> wrote:

Now that we have a concept of a fail-safe vacuum, maybe it would be
beneficial to skip a vacuum in single-user mode if the fail-safe
criteria were not met at the beginning of vacuuming a relation.

Obviously the main goal of the failsafe is to not get into this
situation in the first place. But it's still very reasonable to ask
"what happens when the failsafe even fails at that?". This was
something that we considered directly when working on the feature.

Oh, I think I misunderstood. Your concern is for the case where the
DBA runs a simple "VACUUM" in single-user mode; you want to skip over
tables that don't really need to advance relfrozenxid, automatically.

I can see an argument for something like that, but I think that it
should be a variant of VACUUM. Or maybe it could be addressed with a
better user interface; single-user mode should prompt the user about
what exact VACUUM command they ought to run to get things going.

--
Peter Geoghegan

#5Andres Freund
andres@anarazel.de
In reply to: John Naylor (#1)
Re: do only critical work during single-user vacuum?

Hi,

On 2021-12-09 15:28:18 -0400, John Naylor wrote:

When a user must shut down and restart in single-user mode to run
vacuum on an entire database, that does a lot of work that's
unnecessary for getting the system online again, even without
index_cleanup. We had a recent case where a single-user vacuum took
around 3 days to complete.

Now that we have a concept of a fail-safe vacuum, maybe it would be
beneficial to skip a vacuum in single-user mode if the fail-safe
criteria were not met at the beginning of vacuuming a relation. This
is not without risk, of course, but it should be much faster than
today and once up and running the admin would have a chance to get a
handle on things. Thoughts?

What if the user tried to reclaim space by vacuuming (via truncation)? Or is
working around some corruption or such? I think this is too much magic.

That said, having a VACUUM "selector" that selects the oldest tables could be
quite useful. And address this usecase both for single-user and normal
operation.

Another thing that might be worth doing is to update relfrozenxid earlier. We
definitely should update it before doing truncation (that can be quite
expensive). But we probably should do it even before the final
lazy_cleanup_all_indexes() pass - often that'll be the only pass, and there's
really no reason to delay relfrozenxid advancement till after that.

Greetings,

Andres Freund

#6John Naylor
john.naylor@enterprisedb.com
In reply to: Peter Geoghegan (#4)
Re: do only critical work during single-user vacuum?

On Thu, Dec 9, 2021 at 5:13 PM Peter Geoghegan <pg@bowt.ie> wrote:

Oh, I think I misunderstood. Your concern is for the case where the
DBA runs a simple "VACUUM" in single-user mode; you want to skip over
tables that don't really need to advance relfrozenxid, automatically.

Right.

I can see an argument for something like that, but I think that it
should be a variant of VACUUM. Or maybe it could be addressed with a
better user interface;

On Thu, Dec 9, 2021 at 6:08 PM Andres Freund <andres@anarazel.de> wrote:

What if the user tried to reclaim space by vacuuming (via truncation)? Or is
working around some corruption or such? I think this is too much magic.

That said, having a VACUUM "selector" that selects the oldest tables could be
quite useful. And address this usecase both for single-user and normal
operation.

All good points.

[Peter again]

single-user mode should prompt the user about
what exact VACUUM command they ought to run to get things going.

The current message is particularly bad in its vagueness because some
users immediately reach for VACUUM FULL, which quite logically seems
like the most complete thing to do.

--
John Naylor
EDB: http://www.enterprisedb.com

In reply to: John Naylor (#6)
Re: do only critical work during single-user vacuum?

On Thu, Dec 9, 2021 at 3:53 PM John Naylor <john.naylor@enterprisedb.com> wrote:

single-user mode should prompt the user about
what exact VACUUM command they ought to run to get things going.

The current message is particularly bad in its vagueness because some
users immediately reach for VACUUM FULL, which quite logically seems
like the most complete thing to do.

You mean the GetNewTransactionId() error, about single-user mode? Why
do we need to use single-user mode at all? I'm pretty sure that the
reason is "as an escape hatch", but I wonder what that really means.

***Thinks***

I suppose that it might be a good idea to make sure that autovacuum
cannot run, because in general autovacuum might need to allocate an
XID (for autoanalyze), and locking all that down in exactly the right
way might not be a very good use of our time.

But even still, why not have some variant of single-user mode just for
this task? Something that's easy to use when the DBA is rudely
awakened at 4am -- something a little bit like a big red button that
fixes the exact problem of XID exhaustion, in a reasonably targeted
way? I don't think that this needs to involve the VACUUM command
itself.

The current recommendation to do a whole-database VACUUM doesn't take
a position on how old the oldest datfrozenxid has to be in order to
become safe again, preferring to "make a conservative recommendation"
-- which is what a database-level VACUUM really is. But that doesn't
seem helpful at all. In fact, it's not even conservative. We could
easily come up with a reasonable definition of "datfrozenxid that's
sufficiently new to make it safe to come back online and allocate XIDs
again". Perhaps something based on the current
autovacuum_freeze_max_age (and autovacuum_multixact_freeze_max_age)
settings, with sanity checks.

We could then apply this criteria in new code that implements this
"big red button" (maybe this is a new option for the postgres
executable, a little like --single?). Something that's reasonably
targeted, and dead simple to use.

--
Peter Geoghegan

#8Bossart, Nathan
bossartn@amazon.com
In reply to: Peter Geoghegan (#7)
Re: do only critical work during single-user vacuum?

On 12/9/21, 12:33 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:

On 12/9/21, 11:34 AM, "John Naylor" <john.naylor@enterprisedb.com> wrote:

Now that we have a concept of a fail-safe vacuum, maybe it would be
beneficial to skip a vacuum in single-user mode if the fail-safe
criteria were not met at the beginning of vacuuming a relation. This
is not without risk, of course, but it should be much faster than
today and once up and running the admin would have a chance to get a
handle on things. Thoughts?

Would the --min-xid-age and --no-index-cleanup vacuumdb options help
with this?

Sorry, I'm not sure what I was thinking. Of coure you cannot use
vacuumdb in single-user mode. But I think something like
--min-xid-age in VACUUM is what you are looking for.

Nathan

#9Bossart, Nathan
bossartn@amazon.com
In reply to: Peter Geoghegan (#7)
Re: do only critical work during single-user vacuum?

On 12/9/21, 4:36 PM, "Peter Geoghegan" <pg@bowt.ie> wrote:

We could then apply this criteria in new code that implements this
"big red button" (maybe this is a new option for the postgres
executable, a little like --single?). Something that's reasonably
targeted, and dead simple to use.

+1

Nathan

#10Bossart, Nathan
bossartn@amazon.com
In reply to: Bossart, Nathan (#9)
Re: do only critical work during single-user vacuum?

On 12/9/21, 5:06 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:

On 12/9/21, 4:36 PM, "Peter Geoghegan" <pg@bowt.ie> wrote:

We could then apply this criteria in new code that implements this
"big red button" (maybe this is a new option for the postgres
executable, a little like --single?). Something that's reasonably
targeted, and dead simple to use.

+1

As Andres noted, such a feature might be useful during normal
operation, too. Perhaps the vacuumdb --min-xid-age stuff should be
moved to a new VACUUM option.

Nathan

In reply to: Bossart, Nathan (#10)
Re: do only critical work during single-user vacuum?

On Thu, Dec 9, 2021 at 5:12 PM Bossart, Nathan <bossartn@amazon.com> wrote:

As Andres noted, such a feature might be useful during normal
operation, too. Perhaps the vacuumdb --min-xid-age stuff should be
moved to a new VACUUM option.

I was thinking of something like pg_import_system_collations() for
this: a function that's built-in, and can be called in single user
mode, that nevertheless doesn't make any assumptions about how it may
be called. Nothing stops a superuser from calling
pg_import_system_collations() themselves, outside of initdb. That
isn't particularly common, but it works in the way you'd expect it to
work. It's easy to test.

I imagine that this new function (to handle maintenance tasks in the
event of a wraparound emergency) would output information about its
progress. For example, it would make an up-front decision about which
tables needed to be vacuumed in order for the current DB's
datfrozenxid to be sufficiently new, before it started anything (with
handling for edge-cases with many tables, perhaps). It might also show
the size of each table, and show another line for each table that has
been processed so far, as a rudimentary progress indicator.

We could still have a separate option for the postgres executable,
just to invoke single-user mode and call this function. It would
mostly just be window dressing, of course, but that still seems
useful.

--
Peter Geoghegan

#12Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#7)
Re: do only critical work during single-user vacuum?

Hi,

On 2021-12-09 16:34:53 -0800, Peter Geoghegan wrote:

But even still, why not have some variant of single-user mode just for
this task?

Something that's easy to use when the DBA is rudely
awakened at 4am -- something a little bit like a big red button that
fixes the exact problem of XID exhaustion, in a reasonably targeted
way? I don't think that this needs to involve the VACUUM command
itself.

I think we should move *away* from single user mode, rather than the
opposite. It's a substantial code burden and it's hard to use.

I don't think single user mode is a good fit for this anyway - it's inherently
focussed on connecting to a single database. But wraparound issues often
involve more than one database (often just because of shared catalogs).

Also, requiring a restart will often exascerbate the problem - the cache will
be cold, there's no walwriter, etc, making the vacuum slower. Making vacuum
not consume an xid seems like a lot more promising - and quite doable. Then
there's no need to restart at all.

Greetings,

Andres Freund

In reply to: Andres Freund (#12)
Re: do only critical work during single-user vacuum?

On Thu, Dec 9, 2021 at 5:56 PM Andres Freund <andres@anarazel.de> wrote:

I think we should move *away* from single user mode, rather than the
opposite. It's a substantial code burden and it's hard to use.

I wouldn't say that this is moving closer to single user mode.

I don't think single user mode is a good fit for this anyway - it's inherently
focussed on connecting to a single database. But wraparound issues often
involve more than one database (often just because of shared catalogs).

I don't disagree with any of that. My suggestions were based on the
assumption that it might be unrealistic to expect somebody to spend a
huge amount of time on this, given that (in a certain sense) it's
never really supposed to be used. Even a very simple approach would be
a big improvement.

Also, requiring a restart will often exascerbate the problem - the cache will
be cold, there's no walwriter, etc, making the vacuum slower. Making vacuum
not consume an xid seems like a lot more promising - and quite doable. Then
there's no need to restart at all.

I didn't give too much consideration to what it would take to keep the
system partially online, without introducing excessive complexity.
Maybe it wouldn't be that hard to teach the system to stop allocating
XIDs, while still allowing autovacuum workers to continue to get the
system functioning again. With the av workers taking a particular
emphasis on doing whatever work is required for the system to be able
to allocate XIDs again -- but not too much more (not until things are
back to normal). Now the plan is starting to get ambitious relative to
how often it'll be seen by users, though.

--
Peter Geoghegan

#14Bossart, Nathan
bossartn@amazon.com
In reply to: Peter Geoghegan (#11)
Re: do only critical work during single-user vacuum?

On 12/9/21, 5:27 PM, "Peter Geoghegan" <pg@bowt.ie> wrote:

I imagine that this new function (to handle maintenance tasks in the
event of a wraparound emergency) would output information about its
progress. For example, it would make an up-front decision about which
tables needed to be vacuumed in order for the current DB's
datfrozenxid to be sufficiently new, before it started anything (with
handling for edge-cases with many tables, perhaps). It might also show
the size of each table, and show another line for each table that has
been processed so far, as a rudimentary progress indicator.

I like the idea of having a built-in function that does the bare
minimum to resolve wraparound emergencies, and I think providing some
sort of simple progress indicator (even if rudimentary) would be very
useful. I imagine the decision logic could be pretty simple. If
we're only interested in getting the cluster out of a wraparound
emergency, we can probably just look for all tables with an age over
~2B.

Nathan

In reply to: Bossart, Nathan (#14)
1 attachment(s)
Re: do only critical work during single-user vacuum?

On Thu, Dec 9, 2021 at 8:41 PM Bossart, Nathan <bossartn@amazon.com> wrote:

I like the idea of having a built-in function that does the bare
minimum to resolve wraparound emergencies, and I think providing some
sort of simple progress indicator (even if rudimentary) would be very
useful.

If John doesn't have time to work on this during the Postgres 15
cycle, and if nobody else picks it up, then we should at least do the
bare minimum here: force the use of the failsafe in single user mode
(regardless of the age of relfrozenxid/relminmxid, which in general
might not be that old in tables where VACUUM might need to do a lot of
work). Attached quick and dirty patch shows what this would take. If
nothing else, it seems natural to define running any VACUUM in single
user mode as an emergency.

This is really the least we could do -- it's much better than nothing,
but still really lazy^M^M^M^M conservative. I haven't revised the
assumption that the user should do a top-level "VACUUM" in databases
that can no longer allocate XIDs due to wraparound, despite the fact
that we could do far better with moderate effort. Although it might
make sense to commit something like the attached as part of a more
worked out solution (assuming it didn't fully remove single user mode
from the equation, which would be better still).

--
Peter Geoghegan

Attachments:

v1-0001-Always-trigger-failsafe-in-single-user-mode.patchapplication/octet-stream; name=v1-0001-Always-trigger-failsafe-in-single-user-mode.patchDownload
From b7a3ded18d1303e64951439437bcd5925d5c6f44 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Mon, 20 Dec 2021 16:58:40 -0800
Subject: [PATCH v1] Always trigger failsafe in single user mode.

---
 src/backend/access/heap/vacuumlazy.c | 12 ++++++++----
 src/backend/commands/vacuum.c        |  6 +++++-
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index db6becfed..2ba882951 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2605,6 +2605,14 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 		vacrel->do_index_cleanup = false;
 		vacrel->do_rel_truncate = false;
 
+		/* Stop applying cost limits from this point on */
+		VacuumCostActive = false;
+		VacuumCostBalance = 0;
+
+		/* Single user mode uses failsafe as standard, so no WARNING needed */
+		if (!IsUnderPostmaster)
+			return true;
+
 		ereport(WARNING,
 				(errmsg("bypassing nonessential maintenance of table \"%s.%s.%s\" as a failsafe after %d index scans",
 						get_database_name(MyDatabaseId),
@@ -2615,10 +2623,6 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 				 errhint("Consider increasing configuration parameter \"maintenance_work_mem\" or \"autovacuum_work_mem\".\n"
 						 "You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.")));
 
-		/* Stop applying cost limits from this point on */
-		VacuumCostActive = false;
-		VacuumCostBalance = 0;
-
 		return true;
 	}
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 5c4bc15b4..217bb6965 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1153,7 +1153,8 @@ vacuum_set_xid_limits(Relation rel,
 /*
  * vacuum_xid_failsafe_check() -- Used by VACUUM's wraparound failsafe
  * mechanism to determine if its table's relfrozenxid and relminmxid are now
- * dangerously far in the past.
+ * dangerously far in the past.  This is assumed to always be the case when
+ * the backend is run in single user mode.
  *
  * Input parameters are the target relation's relfrozenxid and relminmxid.
  *
@@ -1203,6 +1204,9 @@ vacuum_xid_failsafe_check(TransactionId relfrozenxid, MultiXactId relminmxid)
 		return true;
 	}
 
+	if (!IsUnderPostmaster)
+		return true;
+
 	return false;
 }
 
-- 
2.30.2

#16Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#15)
Re: do only critical work during single-user vacuum?

Hi,

On 2021-12-20 17:17:26 -0800, Peter Geoghegan wrote:

On Thu, Dec 9, 2021 at 8:41 PM Bossart, Nathan <bossartn@amazon.com> wrote:

I like the idea of having a built-in function that does the bare
minimum to resolve wraparound emergencies, and I think providing some
sort of simple progress indicator (even if rudimentary) would be very
useful.

If John doesn't have time to work on this during the Postgres 15
cycle, and if nobody else picks it up, then we should at least do the
bare minimum here: force the use of the failsafe in single user mode
(regardless of the age of relfrozenxid/relminmxid, which in general
might not be that old in tables where VACUUM might need to do a lot of
work). Attached quick and dirty patch shows what this would take. If
nothing else, it seems natural to define running any VACUUM in single
user mode as an emergency.

As I said before I think this is a bad idea. I'm fine with adding a vacuum
parameter forcing failsafe mode. And perhaps a hint to suggest it in single
user mode. But forcing it is a bad idea - single user isn't just used for
emergencies (c.f. initdb, which this patch would regress) and not every
emergency making single user mode useful is related to wraparound.

Greetings,

Andres Freund

#17Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andres Freund (#16)
Re: do only critical work during single-user vacuum?

On Tue, Dec 21, 2021 at 12:46 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2021-12-20 17:17:26 -0800, Peter Geoghegan wrote:

On Thu, Dec 9, 2021 at 8:41 PM Bossart, Nathan <bossartn@amazon.com> wrote:

I like the idea of having a built-in function that does the bare
minimum to resolve wraparound emergencies, and I think providing some
sort of simple progress indicator (even if rudimentary) would be very
useful.

If John doesn't have time to work on this during the Postgres 15
cycle, and if nobody else picks it up, then we should at least do the
bare minimum here: force the use of the failsafe in single user mode
(regardless of the age of relfrozenxid/relminmxid, which in general
might not be that old in tables where VACUUM might need to do a lot of
work). Attached quick and dirty patch shows what this would take. If
nothing else, it seems natural to define running any VACUUM in single
user mode as an emergency.

As I said before I think this is a bad idea. I'm fine with adding a vacuum
parameter forcing failsafe mode. And perhaps a hint to suggest it in single
user mode. But forcing it is a bad idea - single user isn't just used for
emergencies (c.f. initdb, which this patch would regress) and not every
emergency making single user mode useful is related to wraparound.

+1

BTW a vacuum automatically enters failsafe mode under the situation
where the user has to run a vacuum in the single-user mode, right?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In reply to: Masahiko Sawada (#17)
Re: do only critical work during single-user vacuum?

On Mon, Dec 20, 2021 at 8:40 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

BTW a vacuum automatically enters failsafe mode under the situation
where the user has to run a vacuum in the single-user mode, right?

Only for the table that had the problem. Maybe there are no other
tables that a database level "VACUUM" will need to spend much time on,
or maybe there are, and they will make it take much much longer (it
all depends).

The goal of the patch is to make sure that when we're in single user
mode, we'll consistently trigger the failsafe, for every VACUUM
against every table -- not just the table (or tables) whose
relfrozenxid is very old. That's still naive, but much less naive than
simply telling users to VACUUM the whole database in single user mode
while vacuuming indexes, etc.

--
Peter Geoghegan

#19Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Geoghegan (#18)
Re: do only critical work during single-user vacuum?

On Tue, Dec 21, 2021 at 1:53 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Mon, Dec 20, 2021 at 8:40 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

BTW a vacuum automatically enters failsafe mode under the situation
where the user has to run a vacuum in the single-user mode, right?

Only for the table that had the problem. Maybe there are no other
tables that a database level "VACUUM" will need to spend much time on,
or maybe there are, and they will make it take much much longer (it
all depends).

The goal of the patch is to make sure that when we're in single user
mode, we'll consistently trigger the failsafe, for every VACUUM
against every table -- not just the table (or tables) whose
relfrozenxid is very old. That's still naive, but much less naive than
simply telling users to VACUUM the whole database in single user mode
while vacuuming indexes, etc.

I understand the patch, thank you for the explanation!

I remember Simon proposed a VACUUM command option[1]https://commitfest.postgresql.org/32/2908/, called
FAST_FREEZE, to turn off index cleanup and heap truncation. Now that
we have failsafe mechanism probably we can have a VACUUM command
option to turn on failsafe mode instead.

Regards,

[1]: https://commitfest.postgresql.org/32/2908/

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#20John Naylor
john.naylor@enterprisedb.com
In reply to: Masahiko Sawada (#19)
Re: do only critical work during single-user vacuum?

On Tue, Dec 21, 2021 at 3:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Dec 21, 2021 at 1:53 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Mon, Dec 20, 2021 at 8:40 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

BTW a vacuum automatically enters failsafe mode under the situation
where the user has to run a vacuum in the single-user mode, right?

Only for the table that had the problem. Maybe there are no other
tables that a database level "VACUUM" will need to spend much time on,
or maybe there are, and they will make it take much much longer (it
all depends).

The goal of the patch is to make sure that when we're in single user
mode, we'll consistently trigger the failsafe, for every VACUUM
against every table -- not just the table (or tables) whose
relfrozenxid is very old. That's still naive, but much less naive than
simply telling users to VACUUM the whole database in single user mode
while vacuuming indexes, etc.

I understand the patch, thank you for the explanation!

I remember Simon proposed a VACUUM command option[1], called
FAST_FREEZE, to turn off index cleanup and heap truncation. Now that
we have failsafe mechanism probably we can have a VACUUM command
option to turn on failsafe mode instead.

I've been thinking a bit more about this, and I see two desirable
goals of anti-wraparound vacuum in single-user mode:

1. Get out of single-user mode as quickly as possible.

2. Minimize the catch-up work we have to do once we're out.

Currently, a naive vacuum does as much work as possible and leaves a
bunch of WAL streaming and archiving work for later, so that much is
easy to improve upon and we don't have to be terribly sophisticated.
Keeping in mind Andres' point that we don't want to force possibly
unwanted behavior just because we're in single-user mode, it makes
sense to have some kind of option that has the above two goals.
Instead of a boolean, it seems like the new option should specify some
age below which VACUUM will skip the table entirely, and above which
will enter fail-safe mode. As mentioned earlier, the shutdown hint
could spell out the exact command. With this design, it would specify
the fail-safe default, or something else, to use with the option. That
seems doable for v15 -- any thoughts on that approach?

In standard operation, the above goals could be restated as "advance
xmin as quickly as possible" and "generate as little future
'work/debt' as possible, whether dirty pages or WAL". There are some
more sophisticated things we can do in this regard, but something like
the above could also be useful in normal operation. In fact, that
"normal" could be just after we restarted after doing the bare-minimum
in single-user mode, and want to continue freezing and keep things
under control.

--
John Naylor
EDB: http://www.enterprisedb.com

#21John Naylor
john.naylor@enterprisedb.com
In reply to: John Naylor (#20)
Re: do only critical work during single-user vacuum?

I wrote:

Instead of a boolean, it seems like the new option should specify some
age below which VACUUM will skip the table entirely, and above which
will enter fail-safe mode. As mentioned earlier, the shutdown hint
could spell out the exact command. With this design, it would specify
the fail-safe default, or something else, to use with the option.

On second thought, we don't really need another number here. We could
simply go by the existing failsafe parameter, and if the admin wants a
different value, it's already possible to specify
vacuum_failsafe_age/vacuum_multixact_failsafe_age in a session,
including in single-user mode. Perhaps a new boolean called
FAILSAFE_ONLY. If no table is specified, then when generating the list
of tables, include only those with relfrozenxid/relminmxid greater
than their failsafe thresholds.

--
John Naylor
EDB: http://www.enterprisedb.com

In reply to: John Naylor (#21)
Re: do only critical work during single-user vacuum?

On Tue, Dec 21, 2021 at 1:31 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

On second thought, we don't really need another number here. We could
simply go by the existing failsafe parameter, and if the admin wants a
different value, it's already possible to specify
vacuum_failsafe_age/vacuum_multixact_failsafe_age in a session,
including in single-user mode. Perhaps a new boolean called
FAILSAFE_ONLY. If no table is specified, then when generating the list
of tables, include only those with relfrozenxid/relminmxid greater
than their failsafe thresholds.

That's equivalent to the quick and dirty patch I wrote (assuming that
the user actually uses this new FAILSAFE_ONLY thing).

But if we're going to add a new option to the VACUUM command (or
something of similar scope), then we might as well add a new behavior
that is reasonably exact -- something that (say) only *starts* a
VACUUM for those tables whose relfrozenxid age currently exceeds half
the autovacuum_freeze_max_age for the table (usually taken from the
GUC, sometimes taken from the reloption), which also forces the
failsafe. And with similar handling for
relminmxid/autovacuum_multixact_freeze_max_age.

In other words, while triggering the failsafe is important, simply *not
starting* VACUUM for relations where there is really no need for it is
at least as important. We shouldn't even think about pruning or
freezing with these tables. (ISTM that the only thing that might be a
bit controversial about any of this is my definition of "safe", which
seems like about the right trade-off to me.)

This new command/facility should probably not be a new flag to the
VACUUM command, as such. Rather, I think that it should either be an
SQL-callable function, or a dedicated top-level command (that doesn't
accept any tables). The only reason to have this is for scenarios
where the user is already in a tough spot with wraparound failure,
like that client of yours. Nobody wants to force the failsafe for one
specific table. It's not general purpose, at all, and shouldn't claim
to be.

--
Peter Geoghegan

#23John Naylor
john.naylor@enterprisedb.com
In reply to: Peter Geoghegan (#22)
Re: do only critical work during single-user vacuum?

On Tue, Dec 21, 2021 at 5:56 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Dec 21, 2021 at 1:31 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

On second thought, we don't really need another number here. We could
simply go by the existing failsafe parameter, and if the admin wants a
different value, it's already possible to specify
vacuum_failsafe_age/vacuum_multixact_failsafe_age in a session,
including in single-user mode. Perhaps a new boolean called
FAILSAFE_ONLY. If no table is specified, then when generating the list
of tables, include only those with relfrozenxid/relminmxid greater
than their failsafe thresholds.

That's equivalent to the quick and dirty patch I wrote (assuming that
the user actually uses this new FAILSAFE_ONLY thing).

Equivalent but optional.

But if we're going to add a new option to the VACUUM command (or
something of similar scope), then we might as well add a new behavior
that is reasonably exact -- something that (say) only *starts* a
VACUUM for those tables whose relfrozenxid age currently exceeds half
the autovacuum_freeze_max_age for the table (usually taken from the
GUC, sometimes taken from the reloption), which also forces the
failsafe. And with similar handling for
relminmxid/autovacuum_multixact_freeze_max_age.

In other words, while triggering the failsafe is important, simply *not
starting* VACUUM for relations where there is really no need for it is
at least as important. We shouldn't even think about pruning or
freezing with these tables.

Right, not starting where not necessary is crucial for getting out of
single-user mode as quickly as possible. We're in agreement there.

(ISTM that the only thing that might be a
bit controversial about any of this is my definition of "safe", which
seems like about the right trade-off to me.)

It seems reasonable to want to start back up and not immediately have
anti-wraparound vacuums kick in. On the other hand, it's not good to
do work while unable to monitor progress, and while more WAL is piling
up. I'm not sure where the right trade off is.

This new command/facility should probably not be a new flag to the
VACUUM command, as such. Rather, I think that it should either be an
SQL-callable function, or a dedicated top-level command (that doesn't
accept any tables). The only reason to have this is for scenarios
where the user is already in a tough spot with wraparound failure,
like that client of yours. Nobody wants to force the failsafe for one
specific table. It's not general purpose, at all, and shouldn't claim
to be.

Makes sense, I'll have a think about what that would look like.

--
John Naylor
EDB: http://www.enterprisedb.com

#24Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Geoghegan (#22)
Re: do only critical work during single-user vacuum?

On Wed, Dec 22, 2021 at 6:56 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Dec 21, 2021 at 1:31 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

On second thought, we don't really need another number here. We could
simply go by the existing failsafe parameter, and if the admin wants a
different value, it's already possible to specify
vacuum_failsafe_age/vacuum_multixact_failsafe_age in a session,
including in single-user mode. Perhaps a new boolean called
FAILSAFE_ONLY. If no table is specified, then when generating the list
of tables, include only those with relfrozenxid/relminmxid greater
than their failsafe thresholds.

That's equivalent to the quick and dirty patch I wrote (assuming that
the user actually uses this new FAILSAFE_ONLY thing).

But if we're going to add a new option to the VACUUM command (or
something of similar scope), then we might as well add a new behavior
that is reasonably exact -- something that (say) only *starts* a
VACUUM for those tables whose relfrozenxid age currently exceeds half
the autovacuum_freeze_max_age for the table (usually taken from the
GUC, sometimes taken from the reloption), which also forces the
failsafe. And with similar handling for
relminmxid/autovacuum_multixact_freeze_max_age.

In other words, while triggering the failsafe is important, simply *not
starting* VACUUM for relations where there is really no need for it is
at least as important. We shouldn't even think about pruning or
freezing with these tables. (ISTM that the only thing that might be a
bit controversial about any of this is my definition of "safe", which
seems like about the right trade-off to me.)

+1

This new command/facility should probably not be a new flag to the
VACUUM command, as such. Rather, I think that it should either be an
SQL-callable function, or a dedicated top-level command (that doesn't
accept any tables). The only reason to have this is for scenarios
where the user is already in a tough spot with wraparound failure,
like that client of yours. Nobody wants to force the failsafe for one
specific table. It's not general purpose, at all, and shouldn't claim
to be.

Even not in the situation where the database has to run as the
single-user mode to freeze tuples, I think there would be some use
cases where users want to run vacuum (in failsafe mode) on tables with
relfrozenxid/relminmxid greater than their failsafe thresholds before
falling into that situation. I think it’s common that users are
somehow monitoring relfrozenxid/relminmxid and want to manually run
vacuum on them rather than relying on autovacuums. --min-xid-age
option and --min-mxid-age option of vacuumdb command would be good
examples. So I think this new command/facility might not necessarily
need to be specific to single-user mode.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#25John Naylor
john.naylor@enterprisedb.com
In reply to: Masahiko Sawada (#24)
Re: do only critical work during single-user vacuum?

On Tue, Dec 21, 2021 at 10:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Dec 22, 2021 at 6:56 AM Peter Geoghegan <pg@bowt.ie> wrote:

This new command/facility should probably not be a new flag to the
VACUUM command, as such. Rather, I think that it should either be an
SQL-callable function, or a dedicated top-level command (that doesn't
accept any tables). The only reason to have this is for scenarios
where the user is already in a tough spot with wraparound failure,
like that client of yours. Nobody wants to force the failsafe for one
specific table. It's not general purpose, at all, and shouldn't claim
to be.

Even not in the situation where the database has to run as the
single-user mode to freeze tuples, I think there would be some use
cases where users want to run vacuum (in failsafe mode) on tables with
relfrozenxid/relminmxid greater than their failsafe thresholds before
falling into that situation. I think it’s common that users are
somehow monitoring relfrozenxid/relminmxid and want to manually run
vacuum on them rather than relying on autovacuums. --min-xid-age
option and --min-mxid-age option of vacuumdb command would be good
examples. So I think this new command/facility might not necessarily
need to be specific to single-user mode.

If we want to leave open the possibility to specify these parameters,
a SQL-callable function seems like the way to go. And even if we
don't, a function is fine.

--
John Naylor
EDB: http://www.enterprisedb.com

In reply to: Masahiko Sawada (#24)
Re: do only critical work during single-user vacuum?

On Tue, Dec 21, 2021 at 6:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Even not in the situation where the database has to run as the
single-user mode to freeze tuples, I think there would be some use
cases where users want to run vacuum (in failsafe mode) on tables with
relfrozenxid/relminmxid greater than their failsafe thresholds before
falling into that situation. I think it’s common that users are
somehow monitoring relfrozenxid/relminmxid and want to manually run
vacuum on them rather than relying on autovacuums. --min-xid-age
option and --min-mxid-age option of vacuumdb command would be good
examples. So I think this new command/facility might not necessarily
need to be specific to single-user mode.

It wouldn't be specific to single-user mode, since that is not really
special. It's only special in a way that's quite artificial (it can
continue to allocate XIDs past the point where we usually deem it
unsafe).

So, I think we agree; this new emergency vacuuming feature shouldn't
be restricted to single-user mode in any way, and shouldn't care about
whether we're in single user mode or not when run. OTOH, it probably
will be presented as something that is typically run in single user
mode, in situations like the one John's customer found themselves in
-- disastrous, unpleasant situations. It's not just a good policy
(that makes testing easy). The same kind of problem can easily be
caught a little earlier, before the system actually becomes unable to
allocate new XIDs (when not in single-user mode) -- that's quite
likely, and almost as scary.

As I said before, ISTM that the important thing is to have something
dead simple -- something that is easy to use when woken at 4am, when
the DBA is tired and stressed. Something that makes generic choices,
that are not way too conservative, but also don't risk making the
problem worse instead of better.

--
Peter Geoghegan

#27John Naylor
john.naylor@enterprisedb.com
In reply to: Peter Geoghegan (#22)
1 attachment(s)
Re: do only critical work during single-user vacuum?

On Tue, Dec 21, 2021 at 4:56 PM Peter Geoghegan <pg@bowt.ie> wrote:

But if we're going to add a new option to the VACUUM command (or
something of similar scope), then we might as well add a new behavior
that is reasonably exact -- something that (say) only *starts* a
VACUUM for those tables whose relfrozenxid age currently exceeds half
the autovacuum_freeze_max_age for the table (usually taken from the
GUC, sometimes taken from the reloption), which also forces the
failsafe. And with similar handling for
relminmxid/autovacuum_multixact_freeze_max_age.

This new command/facility should probably not be a new flag to the
VACUUM command, as such. Rather, I think that it should either be an
SQL-callable function, or a dedicated top-level command (that doesn't
accept any tables). The only reason to have this is for scenarios
where the user is already in a tough spot with wraparound failure,
like that client of yours. Nobody wants to force the failsafe for one
specific table. It's not general purpose, at all, and shouldn't claim
to be.

I've attached a PoC *untested* patch to show what it would look like
as a top-level statement. If the "shape" is uncontroversial, I'll put
work into testing it and fleshing it out.

For the PoC I wanted to try re-using existing keywords. I went with
"VACUUM LIMIT" since LIMIT is already a keyword that cannot be used as
a table name. It also brings "wraparound limit" to mind. We could add
a single-use unreserved keyword (such as VACUUM_MINIMAL or
VACUUM_FAST), but that doesn't seem great.

In other words, while triggering the failsafe is important, simply *not
starting* VACUUM for relations where there is really no need for it is
at least as important. We shouldn't even think about pruning or
freezing with these tables. (ISTM that the only thing that might be a
bit controversial about any of this is my definition of "safe", which
seems like about the right trade-off to me.)

I'm not sure what the right trade-off is, but as written I used 95% of
max age. It might be undesirable to end up so close to kicking off
uninterruptible vacuums, but the point is to get out of single-user
mode and back to streaming WAL as quickly as possible. It might also
be worth overriding the min ages as well, but haven't done so here.

It can be executed in normal mode (although it's not expected to be),
which makes testing easier and allows for a future possibility of not
requiring shutdown at all, by e.g. terminating non-superuser
connections.

--
John Naylor
EDB: http://www.enterprisedb.com

Attachments:

v2-vacuum-select-tables-closest-to-wraparound.patchtext/x-patch; charset=US-ASCII; name=v2-vacuum-select-tables-closest-to-wraparound.patchDownload
 src/backend/commands/vacuum.c  | 133 +++++++++++++++++++++++++++++++++++++----
 src/backend/nodes/copyfuncs.c  |   3 +
 src/backend/nodes/equalfuncs.c |   3 +
 src/backend/parser/gram.y      |  10 +++-
 src/backend/tcop/utility.c     |  13 ++++
 src/include/commands/vacuum.h  |   1 +
 src/include/nodes/nodes.h      |   1 +
 src/include/nodes/parsenodes.h |  12 ++++
 8 files changed, 165 insertions(+), 11 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 287098e4d0..d1c59a78e9 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -52,6 +52,7 @@
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "utils/acl.h"
+#include "utils/builtins.h"
 #include "utils/fmgroids.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -271,10 +272,132 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* user-invoked vacuum never uses this parameter */
 	params.log_min_duration = -1;
 
+	/*
+	 * Create special memory context for cross-transaction storage.
+	 *
+	 * Since it is a child of PortalContext, it will go away eventually even
+	 * if we suffer an error; there's no need for special abort cleanup logic.
+	 */
+	vac_context = AllocSetContextCreate(PortalContext,
+										"Vacuum",
+										ALLOCSET_DEFAULT_SIZES);
+
 	/* Now go through the common routine */
 	vacuum(vacstmt->rels, &params, NULL, isTopLevel);
 }
 
+/*
+ * Like ExecVacuum, but specialized for recovering quickly from anti-wraparound
+ * shutdown.
+ */
+void
+ExecVacuumMinimal(VacuumMinimalStmt *fmstmt, bool isTopLevel)
+{
+	VacuumParams params;
+	List	   *vacrels = NIL;
+	Relation	pgclass;
+	TableScanDesc scan;
+	HeapTuple	tuple;
+	int32 table_xid_age;
+	int32 table_mxid_age;
+	int32 save_VacuumCostDelay;
+
+	/* use defaults */
+	// WIP: It might be worth trying to do less work here
+	params.freeze_min_age = -1;
+	params.multixact_freeze_min_age = -1;
+
+	/* it's unlikely any table we choose will not be eligible for aggressive vacuum, but make sure */
+	params.freeze_table_age = 0;
+	params.multixact_freeze_table_age = 0;
+
+	/* skip unnecessary work, as in failsafe mode */
+	params.index_cleanup = VACOPTVALUE_DISABLED;
+	params.truncate = VACOPTVALUE_DISABLED;
+
+	/* user-invoked vacuum is never "for wraparound" */
+	params.is_wraparound = false;
+
+	/* user-invoked vacuum never uses this parameter */
+	params.log_min_duration = -1;
+
+	/* we only expect this to run in single-user mode anyway */
+	params.nworkers = -1;
+
+	/* we don't need the toast relation since we select them separately */
+	params.options = VACOPT_VACUUM;
+
+	vac_context = AllocSetContextCreate(PortalContext,
+										"Vacuum",
+										ALLOCSET_DEFAULT_SIZES);
+
+	/* select relations closest to the wraparound limit */
+
+	pgclass = table_open(RelationRelationId, AccessShareLock);
+
+	scan = table_beginscan_catalog(pgclass, 0, NULL);
+
+	while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+	{
+		Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);
+		MemoryContext oldcontext;
+		Oid			relid = classForm->oid;
+
+		/* check permissions of relation */
+		if (!vacuum_is_relation_owner(relid, classForm, params.options))
+		{
+			Assert(IsUnderPostmaster);
+			continue;
+		}
+
+		/*
+		 * Only consider relations able to hold unfrozen XIDs (anything else
+		 * should have InvalidTransactionId in relfrozenxid anyway).
+		 */
+		if (classForm->relkind != RELKIND_RELATION &&
+			classForm->relkind != RELKIND_MATVIEW &&
+			classForm->relkind != RELKIND_TOASTVALUE)
+		{
+			Assert(!TransactionIdIsValid(classForm->relfrozenxid));
+			Assert(!MultiXactIdIsValid(classForm->relminmxid));
+			continue;
+		}
+
+		table_xid_age = DirectFunctionCall1(xid_age, classForm->relfrozenxid);
+		table_mxid_age = DirectFunctionCall1(mxid_age, classForm->relminmxid);
+
+		// FIXME: also check reloption
+		// WIP: 95% is a starting point for discussion
+		if ((table_xid_age < autovacuum_freeze_max_age * 0.95) ||
+			(table_mxid_age < autovacuum_multixact_freeze_max_age * 0.95))
+			continue;
+
+		/*
+		 * Build VacuumRelation(s) specifying the table OIDs to be processed.
+		 * We omit a RangeVar since it wouldn't be appropriate to complain
+		 * about failure to open one of these relations later.
+		 */
+		oldcontext = MemoryContextSwitchTo(vac_context);
+		vacrels = lappend(vacrels, makeVacuumRelation(NULL,
+													  relid,
+													  NIL));
+		MemoryContextSwitchTo(oldcontext);
+	}
+
+	table_endscan(scan);
+	table_close(pgclass, AccessShareLock);
+
+	/* turn off cost delay */
+	save_VacuumCostDelay = VacuumCostDelay;
+	VacuumCostDelay = 0;
+
+	/* Now go through the common routine */
+	vacuum(vacrels, &params, NULL, isTopLevel);
+
+	/* restore cost delay, just in case we're not in single-user mode */
+	VacuumCostDelay = save_VacuumCostDelay;
+}
+
 /*
  * Internal entry point for VACUUM and ANALYZE commands.
  *
@@ -358,16 +481,6 @@ vacuum(List *relations, VacuumParams *params,
 	if ((params->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
 		pgstat_vacuum_stat();
 
-	/*
-	 * Create special memory context for cross-transaction storage.
-	 *
-	 * Since it is a child of PortalContext, it will go away eventually even
-	 * if we suffer an error; there's no need for special abort cleanup logic.
-	 */
-	vac_context = AllocSetContextCreate(PortalContext,
-										"Vacuum",
-										ALLOCSET_DEFAULT_SIZES);
-
 	/*
 	 * If caller didn't give us a buffer strategy object, make one in the
 	 * cross-transaction memory context.
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 456d563f34..1b2a5fc640 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -5578,6 +5578,9 @@ copyObjectImpl(const void *from)
 		case T_VacuumStmt:
 			retval = _copyVacuumStmt(from);
 			break;
+		case T_VacuumMinimalStmt:
+			retval = (void *) makeNode(VacuumMinimalStmt);
+			break;
 		case T_VacuumRelation:
 			retval = _copyVacuumRelation(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 53beef1488..9ba9601058 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -3580,6 +3580,9 @@ equal(const void *a, const void *b)
 		case T_VacuumStmt:
 			retval = _equalVacuumStmt(a, b);
 			break;
+		case T_VacuumMinimalStmt:
+			retval = true;
+			break;
 		case T_VacuumRelation:
 			retval = _equalVacuumRelation(a, b);
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 879018377b..3ec299cd19 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -312,7 +312,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 		RemoveFuncStmt RemoveOperStmt RenameStmt ReturnStmt RevokeStmt RevokeRoleStmt
 		RuleActionStmt RuleActionStmtOrEmpty RuleStmt
 		SecLabelStmt SelectStmt TransactionStmt TransactionStmtLegacy TruncateStmt
-		UnlistenStmt UpdateStmt VacuumStmt
+		UnlistenStmt UpdateStmt VacuumStmt VacuumMinimalStmt
 		VariableResetStmt VariableSetStmt VariableShowStmt
 		ViewStmt CheckPointStmt CreateConversionStmt
 		DeallocateStmt PrepareStmt ExecuteStmt
@@ -1043,6 +1043,7 @@ stmt:
 			| UnlistenStmt
 			| UpdateStmt
 			| VacuumStmt
+			| VacuumMinimalStmt
 			| VariableResetStmt
 			| VariableSetStmt
 			| VariableShowStmt
@@ -10866,6 +10867,13 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose opt_analyze opt_vacuum_relati
 				}
 		;
 
+VacuumMinimalStmt: VACUUM LIMIT
+				{
+					VacuumMinimalStmt *n = makeNode(VacuumMinimalStmt);
+					$$ = (Node *) n;
+				}
+		;
+
 AnalyzeStmt: analyze_keyword opt_verbose opt_vacuum_relation_list
 				{
 					VacuumStmt *n = makeNode(VacuumStmt);
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 83e4e37c78..5907b29110 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -285,6 +285,7 @@ ClassifyUtilityCommandAsReadOnly(Node *parsetree)
 		case T_ClusterStmt:
 		case T_ReindexStmt:
 		case T_VacuumStmt:
+		case T_VacuumMinimalStmt:
 			{
 				/*
 				 * These commands write WAL, so they're not strictly
@@ -859,6 +860,10 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			ExecVacuum(pstate, (VacuumStmt *) parsetree, isTopLevel);
 			break;
 
+		case T_VacuumMinimalStmt:
+			ExecVacuumMinimal((VacuumMinimalStmt *) parsetree, isTopLevel);
+			break;
+
 		case T_ExplainStmt:
 			ExplainQuery(pstate, (ExplainStmt *) parsetree, params, dest);
 			break;
@@ -2843,6 +2848,10 @@ CreateCommandTag(Node *parsetree)
 				tag = CMDTAG_ANALYZE;
 			break;
 
+		case T_VacuumMinimalStmt:
+			tag = CMDTAG_VACUUM;
+			break;
+
 		case T_ExplainStmt:
 			tag = CMDTAG_EXPLAIN;
 			break;
@@ -3483,6 +3492,10 @@ GetCommandLogLevel(Node *parsetree)
 			lev = LOGSTMT_ALL;
 			break;
 
+		case T_VacuumMinimalStmt:
+			lev = LOGSTMT_ALL;
+			break;
+
 		case T_ExplainStmt:
 			{
 				ExplainStmt *stmt = (ExplainStmt *) parsetree;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index f8a7b3664a..637afad45c 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -267,6 +267,7 @@ extern int	VacuumCostBalanceLocal;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
+extern void ExecVacuumMinimal(VacuumMinimalStmt *fmstmt, bool isTopLevel);
 extern void vacuum(List *relations, VacuumParams *params,
 				   BufferAccessStrategy bstrategy, bool isTopLevel);
 extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 28cf5aefca..75074f6de0 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -351,6 +351,7 @@ typedef enum NodeTag
 	T_CreatedbStmt,
 	T_DropdbStmt,
 	T_VacuumStmt,
+	T_VacuumMinimalStmt,
 	T_ExplainStmt,
 	T_CreateTableAsStmt,
 	T_CreateSeqStmt,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 413e7c85a1..e6cad1baa4 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3361,6 +3361,18 @@ typedef struct VacuumStmt
 	bool		is_vacuumcmd;	/* true for VACUUM, false for ANALYZE */
 } VacuumStmt;
 
+/* ----------------------
+ *		VacuumMinimal Statement
+ *
+ * Although this statement executes a type of VACUUM, it takes no options
+ * or relations.
+ * ----------------------
+ */
+typedef struct VacuumMinimalStmt
+{
+	NodeTag		type;
+} VacuumMinimalStmt;
+
 /*
  * Info about a single target table of VACUUM/ANALYZE.
  *
In reply to: John Naylor (#27)
Re: do only critical work during single-user vacuum?

On Tue, Jan 11, 2022 at 4:59 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

I've attached a PoC *untested* patch to show what it would look like
as a top-level statement. If the "shape" is uncontroversial, I'll put
work into testing it and fleshing it out.

Great!

For the PoC I wanted to try re-using existing keywords. I went with
"VACUUM LIMIT" since LIMIT is already a keyword that cannot be used as
a table name. It also brings "wraparound limit" to mind. We could add
a single-use unreserved keyword (such as VACUUM_MINIMAL or
VACUUM_FAST), but that doesn't seem great.

This seems reasonable, but you could add a new option instead, without
much downside. While INDEX_CLEANUP kind of looks like a keyword, it
isn't really a keyword. (Perhaps you knew this already.)

Making this a new option is a little awkward, admittedly. It's not
clear what it means to "VACUUM (LIMIT) my_table" -- do you just throw
an error for stuff like that? So perhaps your approach of adding
VacuumMinimalStmt (a minimal variant of the VACUUM command) is better.

I'm not sure what the right trade-off is, but as written I used 95% of
max age. It might be undesirable to end up so close to kicking off
uninterruptible vacuums, but the point is to get out of single-user
mode and back to streaming WAL as quickly as possible. It might also
be worth overriding the min ages as well, but haven't done so here.

I wonder if we should keep autovacuum_freeze_max_age out of it -- its
default is too conservative in general. I'm concerned that applying
this autovacuum_freeze_max_age test during VACUUM LIMIT doesn't go far
enough -- it may require VACUUM LIMIT to do significantly more work
than is needed to get the system back online (while leaving a sensible
amount of headroom). Also seems like it might be a good idea to avoid
relying on the user configuration, given that VACUUM LIMIT is only run
when everything is already in disarray. (Besides, it's not clear that
it's okay to use the autovacuum_freeze_max_age GUC without also using
the reloption of the same name.)

What do you think of applying a similar test using a generic 1 billion
XID (and 1 billion MXID) age cutoff? When VACUUM LIMIT is run, we've
already learned that the *entire* XID space wasn't sufficient for the
user workload, so we're not really in a position to promise much.
Often the real problem will be something like a leaked replication
slot, or application code that's seriously misbehaving. It's really
the DBA's job to *keep* the system up. VACUUM LIMIT is just a tool
that allows the DBA to do this without excessive downtime.

The GetNewTransactionId() WARNINGs ought to be changed to reference
VACUUM LIMIT. (You probably just didn't get around to that in this
POC, but couldn't hurt to remind you.)

--
Peter Geoghegan

#29Justin Pryzby
pryzby@telsasoft.com
In reply to: John Naylor (#27)
Re: do only critical work during single-user vacuum?

On Tue, Jan 11, 2022 at 07:58:56PM -0500, John Naylor wrote:

+		// FIXME: also check reloption
+		// WIP: 95% is a starting point for discussion
+		if ((table_xid_age < autovacuum_freeze_max_age * 0.95) ||
+			(table_mxid_age < autovacuum_multixact_freeze_max_age * 0.95))
+			continue;

Should be &&

Should this emergency vacuum "order by age() DESC" ?

--
Justin

#30Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Geoghegan (#28)
Re: do only critical work during single-user vacuum?

On Wed, Jan 12, 2022 at 10:57 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Jan 11, 2022 at 4:59 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

I've attached a PoC *untested* patch to show what it would look like
as a top-level statement. If the "shape" is uncontroversial, I'll put
work into testing it and fleshing it out.

Great!

+1

For the PoC I wanted to try re-using existing keywords. I went with
"VACUUM LIMIT" since LIMIT is already a keyword that cannot be used as
a table name. It also brings "wraparound limit" to mind. We could add
a single-use unreserved keyword (such as VACUUM_MINIMAL or
VACUUM_FAST), but that doesn't seem great.

This seems reasonable, but you could add a new option instead, without
much downside. While INDEX_CLEANUP kind of looks like a keyword, it
isn't really a keyword. (Perhaps you knew this already.)

Making this a new option is a little awkward, admittedly. It's not
clear what it means to "VACUUM (LIMIT) my_table" -- do you just throw
an error for stuff like that? So perhaps your approach of adding
VacuumMinimalStmt (a minimal variant of the VACUUM command) is better.

It seems to me that adding new syntax instead of a new option is less
flexible. In the future, for instance, when we support parallel heap
scan for VACUUM, we may want to add a parallel-related option to both
VACUUM statement and VACUUM LIMIT statement. VACUUM LIMIT statement
would end up becoming like VACUUM statement?

As another idea, we might be able to add a new option that takes an
optional integer value, like VACUUM (MIN_XID), VACUUM (MIN_MXID), and
VACUUM (MIN_XID 500000). We vacuum only tables whose age is older than
the given value. If the value is omitted, we vacuum only tables whose
age exceeds a threshold (say autovacuum_freeze_max_age * 0.95), which
can be used in an emergency case and output in GetNewTransactionID()
WARNINGs output. vacuumdb’s --min-xid-age and --min-mxid-age can use
this option instead of fetching the list of tables from the server.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#31John Naylor
john.naylor@enterprisedb.com
In reply to: Peter Geoghegan (#28)
Re: do only critical work during single-user vacuum?

On Tue, Jan 11, 2022 at 8:57 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Jan 11, 2022 at 4:59 PM John Naylor

For the PoC I wanted to try re-using existing keywords. I went with
"VACUUM LIMIT" since LIMIT is already a keyword that cannot be used as
a table name. It also brings "wraparound limit" to mind. We could add
a single-use unreserved keyword (such as VACUUM_MINIMAL or
VACUUM_FAST), but that doesn't seem great.

This seems reasonable, but you could add a new option instead, without
much downside. While INDEX_CLEANUP kind of looks like a keyword, it
isn't really a keyword. (Perhaps you knew this already.)

Making this a new option is a little awkward, admittedly. It's not
clear what it means to "VACUUM (LIMIT) my_table" -- do you just throw
an error for stuff like that? So perhaps your approach of adding
VacuumMinimalStmt (a minimal variant of the VACUUM command) is better.

We'd also have to do some checks to either ignore other options or
throw an error, which seems undesirable for code maintenance. For that
reason, I prefer the separate top-level statement, but I'm open to
bike-shedding on the actual syntax. I also briefly looked into a SQL
function, but the transaction management would make that more
difficult.

I'm not sure what the right trade-off is, but as written I used 95% of
max age. It might be undesirable to end up so close to kicking off
uninterruptible vacuums, but the point is to get out of single-user
mode and back to streaming WAL as quickly as possible. It might also
be worth overriding the min ages as well, but haven't done so here.

I wonder if we should keep autovacuum_freeze_max_age out of it -- its
default is too conservative in general. I'm concerned that applying
this autovacuum_freeze_max_age test during VACUUM LIMIT doesn't go far
enough -- it may require VACUUM LIMIT to do significantly more work
than is needed to get the system back online (while leaving a sensible
amount of headroom). Also seems like it might be a good idea to avoid
relying on the user configuration, given that VACUUM LIMIT is only run
when everything is already in disarray. (Besides, it's not clear that
it's okay to use the autovacuum_freeze_max_age GUC without also using
the reloption of the same name.)

What do you think of applying a similar test using a generic 1 billion
XID (and 1 billion MXID) age cutoff?

I like that a lot, actually. It's simple and insulates us from
wondering about corner cases in configuration.

The GetNewTransactionId() WARNINGs ought to be changed to reference
VACUUM LIMIT. (You probably just didn't get around to that in this
POC, but couldn't hurt to remind you.)

I'll do that as well as documentation after we have agreement (or at
least lack of objection) on the syntax.

--
John Naylor
EDB: http://www.enterprisedb.com

#32John Naylor
john.naylor@enterprisedb.com
In reply to: Justin Pryzby (#29)
Re: do only critical work during single-user vacuum?

On Tue, Jan 11, 2022 at 9:20 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Jan 11, 2022 at 07:58:56PM -0500, John Naylor wrote:

+             // FIXME: also check reloption
+             // WIP: 95% is a starting point for discussion
+             if ((table_xid_age < autovacuum_freeze_max_age * 0.95) ||
+                     (table_mxid_age < autovacuum_multixact_freeze_max_age * 0.95))
+                     continue;

Should be &&

Thanks! Will fix.

Should this emergency vacuum "order by age() DESC" ?

That would add complexity and only save a marginal amount of time.

--
John Naylor
EDB: http://www.enterprisedb.com

#33John Naylor
john.naylor@enterprisedb.com
In reply to: Masahiko Sawada (#30)
Re: do only critical work during single-user vacuum?

On Wed, Jan 12, 2022 at 1:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

It seems to me that adding new syntax instead of a new option is less
flexible. In the future, for instance, when we support parallel heap
scan for VACUUM, we may want to add a parallel-related option to both
VACUUM statement and VACUUM LIMIT statement. VACUUM LIMIT statement
would end up becoming like VACUUM statement?

This is intended for single-user mode, so parallelism is not relevant.

As another idea, we might be able to add a new option that takes an
optional integer value, like VACUUM (MIN_XID), VACUUM (MIN_MXID), and
VACUUM (MIN_XID 500000). We vacuum only tables whose age is older than
the given value. If the value is omitted, we vacuum only tables whose
age exceeds a threshold (say autovacuum_freeze_max_age * 0.95), which
can be used in an emergency case and output in GetNewTransactionID()
WARNINGs output. vacuumdb’s --min-xid-age and --min-mxid-age can use
this option instead of fetching the list of tables from the server.

That could work, and maybe also have general purpose, but I see two
problems if I understand you correctly:

- If we have a default threshold when the values are omitted, that
implies we need to special-case single-user mode with non-obvious
behavior, which is not ideal, as Andres mentioned upthread. (Or, now
manual VACUUM by default would not do anything, except in extreme
cases, which is worse.)

- In the single-user case, the admin would still need to add
INDEX_CLEANUP = off for minimum downtime, and it should be really
simple.

- For the general case, we would now have the ability to vacuum a
table, and possibly have no effect at all. That seems out of place
with the other options.

--
John Naylor
EDB: http://www.enterprisedb.com

#34Bossart, Nathan
bossartn@amazon.com
In reply to: John Naylor (#33)
Re: do only critical work during single-user vacuum?

On 1/12/22, 7:43 AM, "John Naylor" <john.naylor@enterprisedb.com> wrote:

On Wed, Jan 12, 2022 at 1:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

As another idea, we might be able to add a new option that takes an
optional integer value, like VACUUM (MIN_XID), VACUUM (MIN_MXID), and
VACUUM (MIN_XID 500000). We vacuum only tables whose age is older than
the given value. If the value is omitted, we vacuum only tables whose
age exceeds a threshold (say autovacuum_freeze_max_age * 0.95), which
can be used in an emergency case and output in GetNewTransactionID()
WARNINGs output. vacuumdb’s --min-xid-age and --min-mxid-age can use
this option instead of fetching the list of tables from the server.

That could work, and maybe also have general purpose, but I see two
problems if I understand you correctly:

- If we have a default threshold when the values are omitted, that
implies we need to special-case single-user mode with non-obvious
behavior, which is not ideal, as Andres mentioned upthread. (Or, now
manual VACUUM by default would not do anything, except in extreme
cases, which is worse.)

I agree, I don't think such options should have a default value.

- In the single-user case, the admin would still need to add
INDEX_CLEANUP = off for minimum downtime, and it should be really
simple.

- For the general case, we would now have the ability to vacuum a
table, and possibly have no effect at all. That seems out of place
with the other options.

Perhaps a message would be emitted when tables are specified but
skipped due to the min-xid-age option.

As I've stated upthread, Sawada-san's suggested approach was my
initial reaction to this thread. I'm not wedded to the idea of adding
new options, but I think there are a couple of advantages. For both
single-user mode and normal operation (which may be in imminent
wraparound danger), you could use the same command:

VACUUM (MIN_XID_AGE 1600000000, ...);

(As an aside, we'd need to figure out how XID and MXID options would
work together. Presumably most users would want to OR them.)

This doesn't really tie in super nicely with the failsafe mechanism,
but adding something like a FAILSAFE option doesn't seem right to me,
as it's basically just an alias for a bunch of other options. In my
mind, even a new top-level command would just be an alias for the
aforementioned command. Of course, providing a new option is not
quite as simple as opening up single-user mode and typing "BAIL OUT,"
but I don't know if it is prohibitively complicated for end users.
They'll already have had to figure out how to start single-user mode
in the first place, and we can have nice ERROR/WARNING messages that
provide a suggested VACUUM command.

The other advantage I see with age-related options is that it can be
useful for non-imminent-wraparound situations as well. For example,
maybe a user just wants to manually vacuum everything (including
indexes) with an age above 500M on the weekends.

Another idea is to do both. We could add age-related options, and we
could also add a "BAIL OUT" command that is just an alias for a
special VACUUM command that we feel will help get things under control
as quickly as possible.

Nathan

#35John Naylor
john.naylor@enterprisedb.com
In reply to: Bossart, Nathan (#34)
Re: do only critical work during single-user vacuum?

On Wed, Jan 12, 2022 at 12:26 PM Bossart, Nathan <bossartn@amazon.com> wrote:

- For the general case, we would now have the ability to vacuum a
table, and possibly have no effect at all. That seems out of place
with the other options.

Perhaps a message would be emitted when tables are specified but
skipped due to the min-xid-age option.

As I've stated upthread, Sawada-san's suggested approach was my
initial reaction to this thread. I'm not wedded to the idea of adding
new options, but I think there are a couple of advantages. For both
single-user mode and normal operation (which may be in imminent
wraparound danger), you could use the same command:

VACUUM (MIN_XID_AGE 1600000000, ...);

My proposed top-level statement can also be used in normal operation,
so the only possible advantage is configurability. But I don't really
see any advantage in that -- I don't think we should be moving in the
direction of adding more-intricate ways to paper over the deficiencies
in autovacuum scheduling. (It could be argued that I'm doing exactly
that in this whole thread, but [imminent] shutdown situations have
other causes besides deficient scheduling.)

(As an aside, we'd need to figure out how XID and MXID options would
work together. Presumably most users would want to OR them.)

This doesn't really tie in super nicely with the failsafe mechanism,
but adding something like a FAILSAFE option doesn't seem right to me,

I agree -- it would be awkward and messy as an option. However, I see
the same problem with xid/mxid -- I would actually argue they are not
even proper options; they are "selectors". Your comments above about
1) needing to OR them and 2) emitting a message when a VACUUM command
doesn't actually do anything are evidence of that fact.

The other advantage I see with age-related options is that it can be
useful for non-imminent-wraparound situations as well. For example,
maybe a user just wants to manually vacuum everything (including
indexes) with an age above 500M on the weekends.

There is already vaccumdb for that, and I think it's method of
selecting tables is sound -- I'm not convinced that pushing table
selection to the server command as "options" is an improvement.

--
John Naylor
EDB: http://www.enterprisedb.com

#36Bossart, Nathan
bossartn@amazon.com
In reply to: John Naylor (#35)
Re: do only critical work during single-user vacuum?

On 1/13/22, 4:58 AM, "John Naylor" <john.naylor@enterprisedb.com> wrote:

On Wed, Jan 12, 2022 at 12:26 PM Bossart, Nathan <bossartn@amazon.com> wrote:

As I've stated upthread, Sawada-san's suggested approach was my
initial reaction to this thread. I'm not wedded to the idea of adding
new options, but I think there are a couple of advantages. For both
single-user mode and normal operation (which may be in imminent
wraparound danger), you could use the same command:

VACUUM (MIN_XID_AGE 1600000000, ...);

My proposed top-level statement can also be used in normal operation,
so the only possible advantage is configurability. But I don't really
see any advantage in that -- I don't think we should be moving in the
direction of adding more-intricate ways to paper over the deficiencies
in autovacuum scheduling. (It could be argued that I'm doing exactly
that in this whole thread, but [imminent] shutdown situations have
other causes besides deficient scheduling.)

The new top-level command would be configurable, right? Your patch
uses autovacuum_freeze_max_age/autovacuum_multixact_freeze_max_age, so
the behavior of this new command now depends on the values of
parameters that won't obviously be related to it. If these parameters
are set very low (e.g., the default values), then this command will
end up doing far more work than is probably necessary.

If we did go the route of using a parameter to determine which tables
to vacuum, I think vacuum_failsafe_age is a much better candidate, as
it defaults to a much higher value that is more likely to prevent
doing extra work. That being said, I don't know if overloading
parameters is the right way to go.

(As an aside, we'd need to figure out how XID and MXID options would
work together. Presumably most users would want to OR them.)

This doesn't really tie in super nicely with the failsafe mechanism,
but adding something like a FAILSAFE option doesn't seem right to me,

I agree -- it would be awkward and messy as an option. However, I see
the same problem with xid/mxid -- I would actually argue they are not
even proper options; they are "selectors". Your comments above about
1) needing to OR them and 2) emitting a message when a VACUUM command
doesn't actually do anything are evidence of that fact.

That's a fair point. But I don't think these problems are totally
intractable. We already emit "skipping" messages from VACUUM
sometimes, and interactions between VACUUM options exist today, too.
For example, FREEZE is redundant when FULL is specified, and
INDEX_CLEANUP is totally ignored when FULL is used.

The other advantage I see with age-related options is that it can be
useful for non-imminent-wraparound situations as well. For example,
maybe a user just wants to manually vacuum everything (including
indexes) with an age above 500M on the weekends.

There is already vaccumdb for that, and I think it's method of
selecting tables is sound -- I'm not convinced that pushing table
selection to the server command as "options" is an improvement.

I guess I'm ultimately imagining the new options as replacing the
vacuumdb implementation. IOW vacuumdb would just use MIN_(M)XID_AGE
behind the scenes (as would a new top-level command).

Nathan

#37John Naylor
john.naylor@enterprisedb.com
In reply to: Bossart, Nathan (#36)
Re: do only critical work during single-user vacuum?

I see a CF entry has been created already, and the cfbot doesn't like
my PoC. To prevent confusion, I've taken the liberty of switching the
author to myself and set to Waiting on Author. FWIW, my local build
passed make check-world after applying Justin's fix and changing a
couple other things.

--
John Naylor
EDB: http://www.enterprisedb.com

#38Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bossart, Nathan (#36)
Re: do only critical work during single-user vacuum?

On Fri, Jan 14, 2022 at 7:04 AM Bossart, Nathan <bossartn@amazon.com> wrote:

On 1/13/22, 4:58 AM, "John Naylor" <john.naylor@enterprisedb.com> wrote:

On Wed, Jan 12, 2022 at 12:26 PM Bossart, Nathan <bossartn@amazon.com> wrote:

As I've stated upthread, Sawada-san's suggested approach was my
initial reaction to this thread. I'm not wedded to the idea of adding
new options, but I think there are a couple of advantages. For both
single-user mode and normal operation (which may be in imminent
wraparound danger), you could use the same command:

VACUUM (MIN_XID_AGE 1600000000, ...);

My proposed top-level statement can also be used in normal operation,
so the only possible advantage is configurability. But I don't really
see any advantage in that -- I don't think we should be moving in the
direction of adding more-intricate ways to paper over the deficiencies
in autovacuum scheduling. (It could be argued that I'm doing exactly
that in this whole thread, but [imminent] shutdown situations have
other causes besides deficient scheduling.)

The new top-level command would be configurable, right? Your patch
uses autovacuum_freeze_max_age/autovacuum_multixact_freeze_max_age, so
the behavior of this new command now depends on the values of
parameters that won't obviously be related to it. If these parameters
are set very low (e.g., the default values), then this command will
end up doing far more work than is probably necessary.

If we did go the route of using a parameter to determine which tables
to vacuum, I think vacuum_failsafe_age is a much better candidate, as
it defaults to a much higher value that is more likely to prevent
doing extra work. That being said, I don't know if overloading
parameters is the right way to go.

(As an aside, we'd need to figure out how XID and MXID options would
work together. Presumably most users would want to OR them.)

This doesn't really tie in super nicely with the failsafe mechanism,
but adding something like a FAILSAFE option doesn't seem right to me,

I agree -- it would be awkward and messy as an option. However, I see
the same problem with xid/mxid -- I would actually argue they are not
even proper options; they are "selectors". Your comments above about
1) needing to OR them and 2) emitting a message when a VACUUM command
doesn't actually do anything are evidence of that fact.

That's a fair point. But I don't think these problems are totally
intractable. We already emit "skipping" messages from VACUUM
sometimes, and interactions between VACUUM options exist today, too.
For example, FREEZE is redundant when FULL is specified, and
INDEX_CLEANUP is totally ignored when FULL is used.

The other advantage I see with age-related options is that it can be
useful for non-imminent-wraparound situations as well. For example,
maybe a user just wants to manually vacuum everything (including
indexes) with an age above 500M on the weekends.

I also think there is a use case where a user just wants to manually
vacuum tables that are older than a certain threshold. In this case,
they might want to specify VACUUM command options such as the parallel
option while selecting tables.

There is already vaccumdb for that, and I think it's method of
selecting tables is sound -- I'm not convinced that pushing table
selection to the server command as "options" is an improvement.

I think that having the user not rely on vacuumdb by implementing it
on the server side would be an improvement.

I guess I'm ultimately imagining the new options as replacing the
vacuumdb implementation. IOW vacuumdb would just use MIN_(M)XID_AGE
behind the scenes (as would a new top-level command).

I had the same idea.

That having been said, I agree that xid/mxid options are different
things from the existing VACUUM command options; whereas the existing
VACUUM options control its behavior, xid/mxid options are selectors
for tables to vacuum (PROCESS_TOAST option could be a selector but I
think it’s slightly different from xid/mxid options).

IIUC what we want to do here are two things: (1) select only old
tables and (2) set INDEX_CLEANUP = off, TRUNCATE = off, and FREEZE =
on. VACUUM LIMIT statement does both things at the same time. Although
I’m concerned a bit about its flexibility, it’s a reasonable solution.

On the other hand, it’s probably also useful to do either one thing in
some cases. For instance, having a selector for (1) would be useful,
and having a new option like FAST_FREEZE for (2) would also be useful.
Given there is already a way for (2) (it does not default though), I
think it might also be a good start inventing something for (1). For
instance, a selector for VACUUM statement I came up with is:

VACUUM (verbose on) TABLES WITH (min_xid_age = 1600000000);
or
VACUUM (verbose on) TABLES WITH (min_age = failsafe_limit);

We can expand it in the future to select tables by, for example, dead
tuple ratio, size, etc.

It's a random thought but maybe worth considering.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#39John Naylor
john.naylor@enterprisedb.com
In reply to: Masahiko Sawada (#38)
Re: do only critical work during single-user vacuum?

On Wed, Jan 19, 2022 at 12:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Jan 14, 2022 at 7:04 AM Bossart, Nathan <bossartn@amazon.com> wrote:

I guess I'm ultimately imagining the new options as replacing the
vacuumdb implementation. IOW vacuumdb would just use MIN_(M)XID_AGE
behind the scenes (as would a new top-level command).

I had the same idea.

This seems to be the motivating reason for wanting new configurability
on the server side. In any case, new knobs are out of scope for this
thread. If the use case is compelling enough, may I suggest starting a
new thread?

Regarding the thread subject, I've been playing with the grammar, and
found it's quite easy to have

VACUUM FOR WRAPAROUND
or
VACUUM FOR EMERGENCY

since FOR is a reserved word (and following that can be an IDENT plus
a strcmp check) and cannot conflict with table names. This sounds a
bit more natural than VACUUM LIMIT. Opinions?

--
John Naylor
EDB: http://www.enterprisedb.com

#40Bossart, Nathan
bossartn@amazon.com
In reply to: Masahiko Sawada (#38)
Re: do only critical work during single-user vacuum?

On 1/18/22, 9:47 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:

IIUC what we want to do here are two things: (1) select only old
tables and (2) set INDEX_CLEANUP = off, TRUNCATE = off, and FREEZE =
on. VACUUM LIMIT statement does both things at the same time. Although
I’m concerned a bit about its flexibility, it’s a reasonable solution.

On the other hand, it’s probably also useful to do either one thing in
some cases. For instance, having a selector for (1) would be useful,
and having a new option like FAST_FREEZE for (2) would also be useful.
Given there is already a way for (2) (it does not default though), I
think it might also be a good start inventing something for (1). For
instance, a selector for VACUUM statement I came up with is:

VACUUM (verbose on) TABLES WITH (min_xid_age = 1600000000);
or
VACUUM (verbose on) TABLES WITH (min_age = failsafe_limit);

We can expand it in the future to select tables by, for example, dead
tuple ratio, size, etc.

It's a random thought but maybe worth considering.

That's an interesting idea. A separate selector clause could also
allow users to choose how they interacted (e.g., should the options be
OR'd or AND'd).

Nathan

#41Bossart, Nathan
bossartn@amazon.com
In reply to: John Naylor (#39)
Re: do only critical work during single-user vacuum?

On 1/19/22, 11:15 AM, "John Naylor" <john.naylor@enterprisedb.com> wrote:

This seems to be the motivating reason for wanting new configurability
on the server side. In any case, new knobs are out of scope for this
thread. If the use case is compelling enough, may I suggest starting a
new thread?

Sure. Perhaps the new top-level command will use these new options
someday.

Regarding the thread subject, I've been playing with the grammar, and
found it's quite easy to have

VACUUM FOR WRAPAROUND
or
VACUUM FOR EMERGENCY

since FOR is a reserved word (and following that can be an IDENT plus
a strcmp check) and cannot conflict with table names. This sounds a
bit more natural than VACUUM LIMIT. Opinions?

I personally think VACUUM FOR WRAPAROUND is the best of the options
provided thus far.

Nathan

#42Michael Paquier
michael@paquier.xyz
In reply to: Bossart, Nathan (#41)
Re: do only critical work during single-user vacuum?

On Wed, Jan 19, 2022 at 09:11:48PM +0000, Bossart, Nathan wrote:

I personally think VACUUM FOR WRAPAROUND is the best of the options
provided thus far.

Could you avoid introducing a new grammar pattern in VACUUM? Any new
option had better be within the parenthesized part as it is extensible
at will with its set of DefElems.
--
Michael

#43Masahiko Sawada
sawada.mshk@gmail.com
In reply to: John Naylor (#39)
Re: do only critical work during single-user vacuum?

On Thu, Jan 20, 2022 at 4:14 AM John Naylor
<john.naylor@enterprisedb.com> wrote:

On Wed, Jan 19, 2022 at 12:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Jan 14, 2022 at 7:04 AM Bossart, Nathan <bossartn@amazon.com> wrote:

I guess I'm ultimately imagining the new options as replacing the
vacuumdb implementation. IOW vacuumdb would just use MIN_(M)XID_AGE
behind the scenes (as would a new top-level command).

I had the same idea.

This seems to be the motivating reason for wanting new configurability
on the server side. In any case, new knobs are out of scope for this
thread. If the use case is compelling enough, may I suggest starting a
new thread?

The purpose of this thread is to provide a way for users to run vacuum
only very old tables (while skipping index cleanup, etc.), and the way
is not limited to introducing a new top-level VACUUM statement yet,
right? A new top-level VACUUM statement you proposed seems a good idea
but trying to achieve it by extending the current VACUUM statement is
also a good idea. So I think the ideas like MIN_XID_AGE option and new
table selector in VACUUM statement are relevant to this thread.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#44John Naylor
john.naylor@enterprisedb.com
In reply to: Masahiko Sawada (#43)
2 attachment(s)
Re: do only critical work during single-user vacuum?

On Wed, Jan 19, 2022 at 5:26 PM Michael Paquier <michael@paquier.xyz> wrote:

Could you avoid introducing a new grammar pattern in VACUUM? Any new
option had better be within the parenthesized part as it is extensible
at will with its set of DefElems.

This new behavior is not an option that one can sensibly mix with
other options as the user sees fit, but rather hard-codes the
parameters for its single purpose. That said, I do understand your
objection.

[*thinks*]

How about the attached patch (and test script)? It still needs polish,
but it could work. It allows "verbose" to coexist, although that's
really only for testing normal mode. While testing in single-user
mode, I was sad to find out that it not only doesn't emit messages
(not a client), but also doesn't log. That would have been a decent
way to monitor progress...

In this form, I'm no longer a fan of calling the option "wraparound",
because it's too close to the "is_wraparound" param member.
Internally, at least, we can use "emergency" or "minimal". (In fact
the bit symbol is VACOPT_MINIMAL for this draft). That can be worked
out later.

On Fri, Jan 21, 2022 at 12:59 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The purpose of this thread is to provide a way for users to run vacuum
only very old tables (while skipping index cleanup, etc.),

Ah, thank you Sawada-san, now I understand why we have been talking
past each other. The purpose is actually:

- to have a simple, easy to type, command
- intended for single-user mode, but not limited to it (so it's easy to test)
- to get out of single user mode as quickly as possible

--
John Naylor
EDB: http://www.enterprisedb.com

Attachments:

v3-vacuum-select-tables-closest-to-wraparound.patchtext/x-patch; charset=US-ASCII; name=v3-vacuum-select-tables-closest-to-wraparound.patchDownload
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 283ffaea77..6183f412d3 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -52,6 +52,7 @@
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "utils/acl.h"
+#include "utils/builtins.h"
 #include "utils/fmgroids.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -114,6 +115,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		full = false;
 	bool		disable_page_skipping = false;
 	bool		process_toast = true;
+	bool		wraparound = false;
 	ListCell   *lc;
 
 	/* index_cleanup and truncate values unspecified for now */
@@ -200,6 +202,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 					params.nworkers = nworkers;
 			}
 		}
+		else if (strcmp(opt->defname, "wraparound") == 0)
+			wraparound = defGetBoolean(opt);
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -246,17 +250,51 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if (wraparound)
+	{
+		/* exclude incompatible options */
+		foreach(lc, vacstmt->options)
+		{
+			DefElem    *opt = (DefElem *) lfirst(lc);
+
+			// WIP is there a better way?
+			if (strcmp(opt->defname, "wraparound") != 0 &&
+				strcmp(opt->defname, "verbose") != 0 &&
+				defGetBoolean(opt))
+
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+								errmsg("option \"%s\" is incompatible with WRAPAROUND", opt->defname),
+								parser_errposition(pstate, opt->location)));
+		}
+
+		/* skip unnecessary work, as in failsafe mode */
+		params.index_cleanup = VACOPTVALUE_DISABLED;
+		params.truncate = VACOPTVALUE_DISABLED;
+	}
+
 	/*
-	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
-	 * them as -1 which means to use the default values.
+	 * Set freeze ages to zero where appropriate; otherwise pass
+	 * them as -1 which means to use the configured values.
 	 */
 	if (params.options & VACOPT_FREEZE)
 	{
+		/* All freeze ages are zero if the FREEZE option is given */
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
 		params.multixact_freeze_min_age = 0;
 		params.multixact_freeze_table_age = 0;
 	}
+	else if (params.options & VACOPT_MINIMAL)
+	{
+		/* it's unlikely any selected table will not be eligible for aggressive vacuum, but make sure */
+		params.freeze_table_age = 0;
+		params.multixact_freeze_table_age = 0;
+
+		// WIP: It might be worth trying to do less work here, or at least hard-coding the default values
+		params.freeze_min_age = -1;
+		params.multixact_freeze_min_age = -1;
+	}
 	else
 	{
 		params.freeze_min_age = -1;
@@ -894,6 +932,8 @@ get_all_vacuum_rels(int options)
 	Relation	pgclass;
 	TableScanDesc scan;
 	HeapTuple	tuple;
+	int32 		table_xid_age,
+				table_mxid_age;
 
 	pgclass = table_open(RelationRelationId, AccessShareLock);
 
@@ -909,12 +949,42 @@ get_all_vacuum_rels(int options)
 		if (!vacuum_is_relation_owner(relid, classForm, options))
 			continue;
 
+		if (options | VACOPT_MINIMAL)
+		{
+			/*
+			* Only consider relations able to hold unfrozen XIDs (anything else
+			* should have InvalidTransactionId in relfrozenxid anyway).
+			*/
+			if (classForm->relkind != RELKIND_RELATION &&
+				classForm->relkind != RELKIND_MATVIEW &&
+				classForm->relkind != RELKIND_TOASTVALUE)
+			{
+				Assert(!TransactionIdIsValid(classForm->relfrozenxid));
+				Assert(!MultiXactIdIsValid(classForm->relminmxid));
+				continue;
+			}
+
+			table_xid_age = DirectFunctionCall1(xid_age, classForm->relfrozenxid);
+			table_mxid_age = DirectFunctionCall1(mxid_age, classForm->relminmxid);
+
+			/* Hard-code 1 billion for the thresholds to avoid making assumptions
+			 * about the configuration. This leaves some headroom for when the user
+			 * returns to normal mode while also minimizing work.
+			 * WIP: consider passing these constants via the params struct
+			 */
+			// FIXME to speed up testing
+			// if ((table_xid_age < 1000 * 1000 * 1000) &&
+			// 	(table_mxid_age < 1000 * 1000 * 1000))
+			if ((table_xid_age < 1000 * 1000) &&
+				(table_mxid_age < 1000 * 1000))
+				continue;
+		}
 		/*
 		 * We include partitioned tables here; depending on which operation is
 		 * to be performed, caller will decide whether to process or ignore
 		 * them.
 		 */
-		if (classForm->relkind != RELKIND_RELATION &&
+		else if (classForm->relkind != RELKIND_RELATION &&
 			classForm->relkind != RELKIND_MATVIEW &&
 			classForm->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5d0bdfa427..3d8b8fbbb1 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -188,6 +188,7 @@ typedef struct VacAttrStats
 #define VACOPT_SKIP_LOCKED 0x20 /* skip if cannot get lock */
 #define VACOPT_PROCESS_TOAST 0x40	/* process the TOAST table, if any */
 #define VACOPT_DISABLE_PAGE_SKIPPING 0x80	/* don't skip any pages */
+#define VACOPT_MINIMAL 0x100	/* do minimal freezing work to prevent or get out of shutdown */
 
 /*
  * Values used by index_cleanup and truncate params.
emergency-test.sqlapplication/sql; name=emergency-test.sqlDownload
#45Bossart, Nathan
bossartn@amazon.com
In reply to: John Naylor (#44)
Re: do only critical work during single-user vacuum?

On 1/21/22, 2:43 PM, "John Naylor" <john.naylor@enterprisedb.com> wrote:

- to have a simple, easy to type, command

AFAICT the disagreement is really just about the grammar.
Sawada-san's idea would look something like

VACUUM (FREEZE, INDEX_CLEANUP OFF, MIN_XID_AGE 1600000000, MIN_MXID_AGE 1600000000);

while your proposal looks more like

VACUUM (WRAPAROUND);

The former is highly configurable, but it is probably annoying to type
at 3 AM, and the interaction between the two *_AGE options is not
exactly intuitive (although I expect MIN_XID_AGE to be sufficient in
most cases). The latter is not as configurable, but it is much easier
to type at 3 AM.

I think simplicity is a good goal, but I don't know if the difference
between the two approaches outweighs the benefits of configurability.
If you are in an emergency situation, you already will have to take
down the server, connect in single-user mode to the database(s) that
need vacuuming, and actually do the vacuuming. The wraparound
WARNING/ERROR already has a HINT that describes the next steps
required. Perhaps it would be enough to also emit an example VACUUM
command to use.

I think folks will find the configurability useful, too. With
MIN_XID_AGE, it's super easy to have pg_cron vacuum everything over
500M on the weekend (and also do index cleanup), which may allow you
to use more relaxed autovacuum settings during the week. The docs
already have suggestions for manually vacuuming when the load is low
[0]: https://www.postgresql.org/docs/devel/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY
use-case.

Nathan

[0]: https://www.postgresql.org/docs/devel/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY

#46Justin Pryzby
pryzby@telsasoft.com
In reply to: John Naylor (#44)
1 attachment(s)
Re: do only critical work during single-user vacuum?

On Fri, Jan 21, 2022 at 05:41:58PM -0500, John Naylor wrote:

On Wed, Jan 19, 2022 at 5:26 PM Michael Paquier <michael@paquier.xyz> wrote:

Could you avoid introducing a new grammar pattern in VACUUM? Any new
option had better be within the parenthesized part as it is extensible
at will with its set of DefElems.

This new behavior is not an option that one can sensibly mix with
other options as the user sees fit, but rather hard-codes the
parameters for its single purpose. That said, I do understand your
objection.

This seems better, and it's shorter too.

I'm sure you meant "&" here (fixed in attached patch to appease the cfbot):
+ if (options | VACOPT_MINIMAL)

It should either refuse to run if a list of tables is specified with MINIMAL,
or it should filter that list by XID condition.

As for the name, it could be MINIMAL or FAILSAFE or EMERGENCY or ??
I think the name should actually be a bit more descriptive, and maybe say XID,
like MINIMAL_XID or XID_EMERGENCY...

Normally, options are independent, but VACUUM (MINIMAL) is a "shortcut" to a
hardcoded set of options: freeze on, truncate off, cleanup off. So it refuses
to be combined with other options - good.

This is effectively a shortcut to hypothetical parameters for selecting tables
by XID/MXID age. In the future, someone could debate adding user-facing knobs
for table selection by age.

I still wonder if the relations should be processed in order of decreasing age.
An admin might have increased autovacuum_freeze_max_age up to 2e9, and your
query might return thousands of tables, with a wide range of sizes and ages.

Processing them in order of decreasing age would allow the admin to quickly
vacuum the oldest tables, and optionally interrupt vacuum to get out of single
user mode ASAP - even if their just want to run VACUUM(MINIMAL) in a normal
backend when services aren't offline. Processing them out of order might be
pretty surprising - they might run vacuum for an hour (or overnight), cancel
it, attempt to start the DB in normal mode, and conclude that it made no
visible progress.

Show quoted text

On Fri, Jan 21, 2022 at 12:59 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The purpose of this thread is to provide a way for users to run vacuum
only very old tables (while skipping index cleanup, etc.),

Ah, thank you Sawada-san, now I understand why we have been talking
past each other. The purpose is actually:

- to have a simple, easy to type, command
- intended for single-user mode, but not limited to it (so it's easy to test)
- to get out of single user mode as quickly as possible

Attachments:

0001-do-only-critical-work-during-single-user-vacuum.patchtext/x-diff; charset=us-asciiDownload
From 03c567bb534219acdd76b0acc40e544c76f938e5 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@enterprisedb.com>
Date: Fri, 21 Jan 2022 17:41:58 -0500
Subject: [PATCH] do only critical work during single-user vacuum?

Jan 21 John Naylor
---
 src/backend/commands/vacuum.c | 76 +++++++++++++++++++++++++++++++++--
 src/include/commands/vacuum.h |  1 +
 2 files changed, 74 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index d1dadc54e47..c7bc97d3f76 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -52,6 +52,7 @@
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "utils/acl.h"
+#include "utils/builtins.h"
 #include "utils/fmgroids.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -114,6 +115,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		full = false;
 	bool		disable_page_skipping = false;
 	bool		process_toast = true;
+	bool		wraparound = false;
 	ListCell   *lc;
 
 	/* index_cleanup and truncate values unspecified for now */
@@ -200,6 +202,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 					params.nworkers = nworkers;
 			}
 		}
+		else if (strcmp(opt->defname, "wraparound") == 0)
+			wraparound = defGetBoolean(opt);
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -246,17 +250,51 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if (wraparound)
+	{
+		/* exclude incompatible options */
+		foreach(lc, vacstmt->options)
+		{
+			DefElem    *opt = (DefElem *) lfirst(lc);
+
+			// WIP is there a better way?
+			if (strcmp(opt->defname, "wraparound") != 0 &&
+				strcmp(opt->defname, "verbose") != 0 &&
+				defGetBoolean(opt))
+
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+								errmsg("option \"%s\" is incompatible with WRAPAROUND", opt->defname),
+								parser_errposition(pstate, opt->location)));
+		}
+
+		/* skip unnecessary work, as in failsafe mode */
+		params.index_cleanup = VACOPTVALUE_DISABLED;
+		params.truncate = VACOPTVALUE_DISABLED;
+	}
+
 	/*
-	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
-	 * them as -1 which means to use the default values.
+	 * Set freeze ages to zero where appropriate; otherwise pass
+	 * them as -1 which means to use the configured values.
 	 */
 	if (params.options & VACOPT_FREEZE)
 	{
+		/* All freeze ages are zero if the FREEZE option is given */
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
 		params.multixact_freeze_min_age = 0;
 		params.multixact_freeze_table_age = 0;
 	}
+	else if (params.options & VACOPT_MINIMAL)
+	{
+		/* it's unlikely any selected table will not be eligible for aggressive vacuum, but make sure */
+		params.freeze_table_age = 0;
+		params.multixact_freeze_table_age = 0;
+
+		// WIP: It might be worth trying to do less work here, or at least hard-coding the default values
+		params.freeze_min_age = -1;
+		params.multixact_freeze_min_age = -1;
+	}
 	else
 	{
 		params.freeze_min_age = -1;
@@ -894,6 +932,8 @@ get_all_vacuum_rels(int options)
 	Relation	pgclass;
 	TableScanDesc scan;
 	HeapTuple	tuple;
+	int32 		table_xid_age,
+				table_mxid_age;
 
 	pgclass = table_open(RelationRelationId, AccessShareLock);
 
@@ -909,12 +949,42 @@ get_all_vacuum_rels(int options)
 		if (!vacuum_is_relation_owner(relid, classForm, options))
 			continue;
 
+		if (options & VACOPT_MINIMAL)
+		{
+			/*
+			* Only consider relations able to hold unfrozen XIDs (anything else
+			* should have InvalidTransactionId in relfrozenxid anyway).
+			*/
+			if (classForm->relkind != RELKIND_RELATION &&
+				classForm->relkind != RELKIND_MATVIEW &&
+				classForm->relkind != RELKIND_TOASTVALUE)
+			{
+				Assert(!TransactionIdIsValid(classForm->relfrozenxid));
+				Assert(!MultiXactIdIsValid(classForm->relminmxid));
+				continue;
+			}
+
+			table_xid_age = DirectFunctionCall1(xid_age, classForm->relfrozenxid);
+			table_mxid_age = DirectFunctionCall1(mxid_age, classForm->relminmxid);
+
+			/* Hard-code 1 billion for the thresholds to avoid making assumptions
+			 * about the configuration. This leaves some headroom for when the user
+			 * returns to normal mode while also minimizing work.
+			 * WIP: consider passing these constants via the params struct
+			 */
+			// FIXME to speed up testing
+			// if ((table_xid_age < 1000 * 1000 * 1000) &&
+			// 	(table_mxid_age < 1000 * 1000 * 1000))
+			if ((table_xid_age < 1000 * 1000) &&
+				(table_mxid_age < 1000 * 1000))
+				continue;
+		}
 		/*
 		 * We include partitioned tables here; depending on which operation is
 		 * to be performed, caller will decide whether to process or ignore
 		 * them.
 		 */
-		if (classForm->relkind != RELKIND_RELATION &&
+		else if (classForm->relkind != RELKIND_RELATION &&
 			classForm->relkind != RELKIND_MATVIEW &&
 			classForm->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5d0bdfa4279..3d8b8fbbb1a 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -188,6 +188,7 @@ typedef struct VacAttrStats
 #define VACOPT_SKIP_LOCKED 0x20 /* skip if cannot get lock */
 #define VACOPT_PROCESS_TOAST 0x40	/* process the TOAST table, if any */
 #define VACOPT_DISABLE_PAGE_SKIPPING 0x80	/* don't skip any pages */
+#define VACOPT_MINIMAL 0x100	/* do minimal freezing work to prevent or get out of shutdown */
 
 /*
  * Values used by index_cleanup and truncate params.
-- 
2.17.1

#47John Naylor
john.naylor@enterprisedb.com
In reply to: Justin Pryzby (#46)
1 attachment(s)
Re: do only critical work during single-user vacuum?

On Thu, Jan 27, 2022 at 8:28 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

I'm sure you meant "&" here (fixed in attached patch to appease the cfbot):
+ if (options | VACOPT_MINIMAL)

Thanks for catching that! That copy-pasto was also masking my failure
to process the option properly -- fixed in the attached as v5.

It should either refuse to run if a list of tables is specified with MINIMAL,
or it should filter that list by XID condition.

I went with the former for simplicity. As a single-purpose option, it
makes sense.

As for the name, it could be MINIMAL or FAILSAFE or EMERGENCY or ??
I think the name should actually be a bit more descriptive, and maybe say XID,
like MINIMAL_XID or XID_EMERGENCY...

I went with EMERGENCY in this version to reinforce its purpose in the
mind of the user (and reader of this code).

Normally, options are independent, but VACUUM (MINIMAL) is a "shortcut" to a
hardcoded set of options: freeze on, truncate off, cleanup off. So it refuses
to be combined with other options - good.

This is effectively a shortcut to hypothetical parameters for selecting tables
by XID/MXID age. In the future, someone could debate adding user-facing knobs
for table selection by age.

I used the params struct in v5 for the emergency cutoff ages. Even
with the values hard-coded, it seems cleaner to keep them here.

I still wonder if the relations should be processed in order of decreasing age.
An admin might have increased autovacuum_freeze_max_age up to 2e9, and your
query might return thousands of tables, with a wide range of sizes and ages.

Processing them in order of decreasing age would allow the admin to quickly
vacuum the oldest tables, and optionally interrupt vacuum to get out of single
user mode ASAP - even if their just want to run VACUUM(MINIMAL) in a normal
backend when services aren't offline. Processing them out of order might be
pretty surprising - they might run vacuum for an hour (or overnight), cancel
it, attempt to start the DB in normal mode, and conclude that it made no
visible progress.

While that seems like a nice property to have, it does complicate
things, so can be left for follow-on work.

Also in v5:

- It mentions the new command in the error hint in
GetNewTransactionId(). I'm not sure if multi-word commands should be
quoted like this.
- A first draft of documentation

--
John Naylor
EDB: http://www.enterprisedb.com

Attachments:

v5-do-only-critical-work-during-single-user-vacuum.patchtext/x-patch; charset=US-ASCII; name=v5-do-only-critical-work-during-single-user-vacuum.patchDownload
 doc/src/sgml/maintenance.sgml       |  12 ++--
 doc/src/sgml/ref/vacuum.sgml        |  22 ++++++++
 src/backend/access/transam/varsup.c |   4 +-
 src/backend/commands/vacuum.c       | 107 +++++++++++++++++++++++++++++++++---
 src/include/commands/vacuum.h       |   5 ++
 5 files changed, 134 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 36f975b1e5..5c36049950 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -629,17 +629,19 @@ HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that data
 
 <programlisting>
 ERROR:  database is not accepting commands to avoid wraparound data loss in database "mydb"
-HINT:  Stop the postmaster and vacuum that database in single-user mode.
+HINT:  Stop the postmaster and run "VACUUM (EMERGENCY)" in that database in single-user mode.
 </programlisting>
 
     The three-million-transaction safety margin exists to let the
     administrator recover without data loss, by manually executing the
-    required <command>VACUUM</command> commands.  However, since the system will not
+    required <command>VACUUM</command> command.  However, since the system will not
     execute commands once it has gone into the safety shutdown mode,
     the only way to do this is to stop the server and start the server in single-user
-    mode to execute <command>VACUUM</command>.  The shutdown mode is not enforced
-    in single-user mode.  See the <xref linkend="app-postgres"/> reference
-    page for details about using single-user mode.
+    mode to execute <command>VACUUM (EMERGENCY)</command>. The <literal>EMERGENCY</literal> option
+    is recommended, since it enables the vacuum to complete as quickly as possible
+    while still leaving a safety margin for when the system comes back online again.
+    The shutdown mode is not enforced in single-user mode.
+    See the <xref linkend="app-postgres"/> reference page for details about using single-user mode.
    </para>
 
    <sect3 id="vacuum-for-multixact-wraparound">
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 3df32b58ee..2dab01ff37 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -295,6 +295,28 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>EMERGENCY</literal></term>
+    <listitem>
+     <para>
+      Special vacuum mode intended for when a forced shutdown has happened to avoid transaction ID wraparound.
+      It performs a database-wide vacuum on tables, toast tables, and materialized views whose
+      xid age or mxid age is older than 1 billion. To complete as quickly as possible, an emergency
+      vacuum will skip truncation and index cleanup, and will skip toast tables whose age has not
+      exceeded the cutoff.
+     </para>
+
+     <para>
+      While this option could be used while the postmaster is running, it is expected that the wraparound
+      failsafe mechanism will automatically work in the same way to prevent imminent shutdown.
+      When <literal>EMERGENCY</literal> is specified no tables may be listed, since it is designed to
+      select candidate relations from the entire database.
+      The only other option that may be combined with <literal>VERBOSE</literal>, although in single-user mode no client messages are
+      output.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a012..ee9d33dba3 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -128,14 +128,14 @@ GetNewTransactionId(bool isSubXact)
 						(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
 						 errmsg("database is not accepting commands to avoid wraparound data loss in database \"%s\"",
 								oldest_datname),
-						 errhint("Stop the postmaster and vacuum that database in single-user mode.\n"
+						 errhint("Stop the postmaster and run \"VACUUM (EMERGENCY)\" in that database in single-user mode.\n"
 								 "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
 			else
 				ereport(ERROR,
 						(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
 						 errmsg("database is not accepting commands to avoid wraparound data loss in database with OID %u",
 								oldest_datoid),
-						 errhint("Stop the postmaster and vacuum that database in single-user mode.\n"
+						 errhint("Stop the postmaster and run \"VACUUM (EMERGENCY)\" in that database in single-user mode.\n"
 								 "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
 		}
 		else if (TransactionIdFollowsOrEquals(xid, xidWarnLimit))
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index d1dadc54e4..e36159e827 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -52,6 +52,7 @@
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "utils/acl.h"
+#include "utils/builtins.h"
 #include "utils/fmgroids.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -86,7 +87,7 @@ int			VacuumCostBalanceLocal = 0;
 
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *get_all_vacuum_rels(VacuumParams *params);
 static void vac_truncate_clog(TransactionId frozenXID,
 							  MultiXactId minMulti,
 							  TransactionId lastSaneFrozenXid,
@@ -114,6 +115,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		full = false;
 	bool		disable_page_skipping = false;
 	bool		process_toast = true;
+	bool		emergency = false;
 	ListCell   *lc;
 
 	/* index_cleanup and truncate values unspecified for now */
@@ -123,6 +125,10 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* By default parallel vacuum is enabled */
 	params.nworkers = 0;
 
+	/* By default don't skip any tables */
+	params.min_xid_age = 0;
+	params.min_mxid_age = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -200,6 +206,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 					params.nworkers = nworkers;
 			}
 		}
+		else if (strcmp(opt->defname, "emergency") == 0)
+			emergency = defGetBoolean(opt);
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -216,7 +224,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
 		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
-		(process_toast ? VACOPT_PROCESS_TOAST : 0);
+		(process_toast ? VACOPT_PROCESS_TOAST : 0) |
+		(emergency ? VACOPT_EMERGENCY : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -246,17 +255,70 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if (emergency)
+	{
+		/* exclude incompatible options */
+		foreach(lc, vacstmt->options)
+		{
+			DefElem    *opt = (DefElem *) lfirst(lc);
+
+			if (strcmp(opt->defname, "emergency") != 0 &&
+				strcmp(opt->defname, "verbose") != 0 &&
+				defGetBoolean(opt))
+
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+								errmsg("option \"%s\" is incompatible with EMERGENCY", opt->defname),
+								parser_errposition(pstate, opt->location)));
+		}
+
+		/* prevent specifying a list of tables, to keep it simple */
+		if (vacstmt->rels != NULL)
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						errmsg("a relation list is not supported with the EMERGENCY option")));
+
+		/* skip unnecessary work, similar to failsafe mode */
+
+		params.index_cleanup = VACOPTVALUE_DISABLED;
+		params.truncate = VACOPTVALUE_DISABLED;
+
+		/* Hard-code 1 billion for the thresholds to avoid making assumptions
+		* about the configuration. This leaves some headroom for when the user
+		* returns to normal mode while also minimizing work.
+		* WIP: consider passing these constants via the params struct
+		*/
+		// params.min_xid_age = 1000 * 1000 * 1000;
+		// params.min_mxid_age = 1000 * 1000 * 1000;
+		// FIXME to speed up testing
+		params.min_xid_age = 1000 * 1000;
+		params.min_mxid_age = 1000 * 1000;
+	}
+
 	/*
-	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
-	 * them as -1 which means to use the default values.
+	 * Set freeze ages to zero where appropriate; otherwise pass
+	 * them as -1 which means to use the configured values.
 	 */
 	if (params.options & VACOPT_FREEZE)
 	{
+		/* All freeze ages are zero if the FREEZE option is given */
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
 		params.multixact_freeze_min_age = 0;
 		params.multixact_freeze_table_age = 0;
 	}
+	else if (params.options & VACOPT_EMERGENCY)
+	{
+		/* It's highly likely any table selected will be eligible for aggressive vacuum, but make sure */
+		params.freeze_table_age = 0;
+		params.multixact_freeze_table_age = 0;
+
+		// WIP: It might be worth trying to do less work here, such as max age / 2 :
+		// params.freeze_min_age = 100 * 1000 * 1000;
+		// params.multixact_freeze_min_age = 200 * 1000 * 1000;
+		params.freeze_min_age = -1;
+		params.multixact_freeze_min_age = -1;
+	}
 	else
 	{
 		params.freeze_min_age = -1;
@@ -404,7 +466,7 @@ vacuum(List *relations, VacuumParams *params,
 		relations = newrels;
 	}
 	else
-		relations = get_all_vacuum_rels(params->options);
+		relations = get_all_vacuum_rels(params);
 
 	/*
 	 * Decide whether we need to start/commit our own transactions.
@@ -461,7 +523,10 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		if (params->VACOPT_EMERGENCY)
+			VacuumCostActive = false;
+		else
+			VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -888,12 +953,14 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumParams *params)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
 	TableScanDesc scan;
 	HeapTuple	tuple;
+	int32 		table_xid_age,
+				table_mxid_age;
 
 	pgclass = table_open(RelationRelationId, AccessShareLock);
 
@@ -906,15 +973,37 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, params->options))
 			continue;
 
+		if (params->options & VACOPT_EMERGENCY)
+		{
+			/*
+			* Only consider relations able to hold unfrozen XIDs (anything else
+			* should have InvalidTransactionId in relfrozenxid anyway).
+			*/
+			if (classForm->relkind != RELKIND_RELATION &&
+				classForm->relkind != RELKIND_MATVIEW &&
+				classForm->relkind != RELKIND_TOASTVALUE)
+			{
+				Assert(!TransactionIdIsValid(classForm->relfrozenxid));
+				Assert(!MultiXactIdIsValid(classForm->relminmxid));
+				continue;
+			}
+
+			table_xid_age = DirectFunctionCall1(xid_age, classForm->relfrozenxid);
+			table_mxid_age = DirectFunctionCall1(mxid_age, classForm->relminmxid);
+
+			if ((table_xid_age < params->min_xid_age) &&
+				(table_mxid_age < params->min_mxid_age))
+				continue;
+		}
 		/*
 		 * We include partitioned tables here; depending on which operation is
 		 * to be performed, caller will decide whether to process or ignore
 		 * them.
 		 */
-		if (classForm->relkind != RELKIND_RELATION &&
+		else if (classForm->relkind != RELKIND_RELATION &&
 			classForm->relkind != RELKIND_MATVIEW &&
 			classForm->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5d0bdfa427..8aab664162 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -188,6 +188,7 @@ typedef struct VacAttrStats
 #define VACOPT_SKIP_LOCKED 0x20 /* skip if cannot get lock */
 #define VACOPT_PROCESS_TOAST 0x40	/* process the TOAST table, if any */
 #define VACOPT_DISABLE_PAGE_SKIPPING 0x80	/* don't skip any pages */
+#define VACOPT_EMERGENCY 0x100	/* do minimal freezing work to prevent or get out of shutdown */
 
 /*
  * Values used by index_cleanup and truncate params.
@@ -233,6 +234,10 @@ typedef struct VacuumParams
 	 * disabled.
 	 */
 	int			nworkers;
+
+	/* cutoff ages for selecting tables to vacuum, 0 is the default */
+	int 		min_xid_age;
+	int 		min_mxid_age;
 } VacuumParams;
 
 /*
#48Masahiko Sawada
sawada.mshk@gmail.com
In reply to: John Naylor (#47)
Re: do only critical work during single-user vacuum?

On Wed, Feb 2, 2022 at 6:50 AM John Naylor <john.naylor@enterprisedb.com> wrote:

On Thu, Jan 27, 2022 at 8:28 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

I'm sure you meant "&" here (fixed in attached patch to appease the cfbot):
+ if (options | VACOPT_MINIMAL)

Thanks for catching that! That copy-pasto was also masking my failure
to process the option properly -- fixed in the attached as v5.

It should either refuse to run if a list of tables is specified with MINIMAL,
or it should filter that list by XID condition.

I went with the former for simplicity. As a single-purpose option, it
makes sense.

As for the name, it could be MINIMAL or FAILSAFE or EMERGENCY or ??
I think the name should actually be a bit more descriptive, and maybe say XID,
like MINIMAL_XID or XID_EMERGENCY...

I went with EMERGENCY in this version to reinforce its purpose in the
mind of the user (and reader of this code).

Normally, options are independent, but VACUUM (MINIMAL) is a "shortcut" to a
hardcoded set of options: freeze on, truncate off, cleanup off. So it refuses
to be combined with other options - good.

This is effectively a shortcut to hypothetical parameters for selecting tables
by XID/MXID age. In the future, someone could debate adding user-facing knobs
for table selection by age.

I used the params struct in v5 for the emergency cutoff ages. Even
with the values hard-coded, it seems cleaner to keep them here.

I still wonder if the relations should be processed in order of decreasing age.
An admin might have increased autovacuum_freeze_max_age up to 2e9, and your
query might return thousands of tables, with a wide range of sizes and ages.

Processing them in order of decreasing age would allow the admin to quickly
vacuum the oldest tables, and optionally interrupt vacuum to get out of single
user mode ASAP - even if their just want to run VACUUM(MINIMAL) in a normal
backend when services aren't offline. Processing them out of order might be
pretty surprising - they might run vacuum for an hour (or overnight), cancel
it, attempt to start the DB in normal mode, and conclude that it made no
visible progress.

While that seems like a nice property to have, it does complicate
things, so can be left for follow-on work.

Also in v5:

- It mentions the new command in the error hint in
GetNewTransactionId(). I'm not sure if multi-word commands should be
quoted like this.
- A first draft of documentation

Thank you for updating the patch.

I have a few questions and comments:

+      The only other option that may be combined with
<literal>VERBOSE</literal>, although in single-user mode no client
messages are
+      output.

Given VERBOSE with EMERGENCY can work only in multi-user mode, why
only VERBOSE can be specified with EMERGENCY? I think the same is true
for other options like PARALLEL; PARALLEL can work only in multi-user
mode.

---
+      It performs a database-wide vacuum on tables, toast tables, and
materialized views whose
+      xid age or mxid age is older than 1 billion.

Do we need to allow the user to specify the threshold or need a higher
value (at least larger than 1.6 billion, default value of
vacuum_failsafe_age)? I imagined a case where there are a few very-old
tables (say 2 billion old) and many tables that are older than 1
billion. In this case, VACUUM (EMERGENCY) would take a long time to
complete. But to minimize the downtime, we might want to run VACUUM
(EMERGENCY) on only the very-old tables, start the cluster in
multi-user mode, and run vacuum on multiple tables in parallel.

---
+       if (params->options & VACOPT_EMERGENCY)
+       {
+           /*
+           * Only consider relations able to hold unfrozen XIDs (anything else
+           * should have InvalidTransactionId in relfrozenxid anyway).
+           */
+           if (classForm->relkind != RELKIND_RELATION &&
+               classForm->relkind != RELKIND_MATVIEW &&
+               classForm->relkind != RELKIND_TOASTVALUE)
+           {
+               Assert(!TransactionIdIsValid(classForm->relfrozenxid));
+               Assert(!MultiXactIdIsValid(classForm->relminmxid));
+               continue;
+           }
+
+           table_xid_age = DirectFunctionCall1(xid_age,
classForm->relfrozenxid);
+           table_mxid_age = DirectFunctionCall1(mxid_age,
classForm->relminmxid);
+

I think that instead of calling xid_age and mxid_age for each
relation, we can compute the thresholds for xid and mxid once, and
then compare them to relation's relfrozenxid and relminmxid.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#49John Naylor
john.naylor@enterprisedb.com
In reply to: Masahiko Sawada (#48)
Re: do only critical work during single-user vacuum?

On Thu, Feb 3, 2022 at 3:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

+      The only other option that may be combined with
<literal>VERBOSE</literal>, although in single-user mode no client
messages are
+      output.

Given VERBOSE with EMERGENCY can work only in multi-user mode, why
only VERBOSE can be specified with EMERGENCY? I think the same is true
for other options like PARALLEL; PARALLEL can work only in multi-user
mode.

You are right; it makes sense to allow options that would be turned
off automatically in single-user mode. Even if we don't expect it to
be used in normal mode, the restrictions should make sense. Also,
maybe documenting the allowed combinations is a distraction in the
main entry and should be put in the notes at the bottom.

+      It performs a database-wide vacuum on tables, toast tables, and
materialized views whose
+      xid age or mxid age is older than 1 billion.

Do we need to allow the user to specify the threshold or need a higher
value (at least larger than 1.6 billion, default value of
vacuum_failsafe_age)? I imagined a case where there are a few very-old
tables (say 2 billion old) and many tables that are older than 1
billion. In this case, VACUUM (EMERGENCY) would take a long time to
complete.

I still don't think fine-tuning is helpful here. Shutdown vacuum
should be just as trivial to run as it is now, but with better
behavior. I believe a user knowledgeable enough to come up with the
best number is unlikely to get in this situation in the first place.
I'm also not sure a production support engineer would (or should)
immediately figure out a better number than a good default. That said,
the 1 billion figure was a suggestion from Peter G. upthread, and a
higher number could be argued.

But to minimize the downtime, we might want to run VACUUM
(EMERGENCY) on only the very-old tables, start the cluster in
multi-user mode, and run vacuum on multiple tables in parallel.

That's exactly the idea. Also, back in normal mode, we can start
streaming WAL again. However, we don't want to go back online so close
to the limit that we risk shutdown again. People have a reasonable
expectation that if you fix an emergency, it's now fixed and the
application can go back online. Falling down repeatedly, or worrying
if it's possible, is very bad.

+       if (params->options & VACOPT_EMERGENCY)
+       {
+           /*
+           * Only consider relations able to hold unfrozen XIDs (anything else
+           * should have InvalidTransactionId in relfrozenxid anyway).
+           */
+           if (classForm->relkind != RELKIND_RELATION &&
+               classForm->relkind != RELKIND_MATVIEW &&
+               classForm->relkind != RELKIND_TOASTVALUE)
+           {
+               Assert(!TransactionIdIsValid(classForm->relfrozenxid));
+               Assert(!MultiXactIdIsValid(classForm->relminmxid));
+               continue;
+           }
+
+           table_xid_age = DirectFunctionCall1(xid_age,
classForm->relfrozenxid);
+           table_mxid_age = DirectFunctionCall1(mxid_age,
classForm->relminmxid);
+

I think that instead of calling xid_age and mxid_age for each
relation, we can compute the thresholds for xid and mxid once, and
then compare them to relation's relfrozenxid and relminmxid.

That sounds like a good idea if it's simple to implement, so I will
try it. If it adds complexity, I don't think it's worth it. Scanning a
few thousand rows in pg_class along with the function calls is tiny
compared to the actual vacuum work.

--
John Naylor
EDB: http://www.enterprisedb.com

#50John Naylor
john.naylor@enterprisedb.com
In reply to: John Naylor (#49)
Re: do only critical work during single-user vacuum?

Thinking further about the use of emergency mode, we have this:

"If for some reason autovacuum fails to clear old XIDs from a table,
the system will begin to emit warning messages like this when the
database's oldest XIDs reach forty million transactions from the
wraparound point:

WARNING: database "mydb" must be vacuumed within 39985967 transactions
HINT: To avoid a database shutdown, execute a database-wide VACUUM in
that database.
"

It seems people tend not to see these warnings if they didn't already
have some kind of monitoring which would prevent them from getting
here in the first place. But if they do, the hint should mention the
emergency option here, too. This puts Justin's idea upthread in a new
light -- if the admin does notice this warning, then emergency mode
should indeed vacuum the oldest tables first, since autovacuum is not
(yet) smart enough to do that. I'll pursue that as a follow-up.

--
John Naylor
EDB: http://www.enterprisedb.com

#51Justin Pryzby
pryzby@telsasoft.com
In reply to: John Naylor (#47)
3 attachment(s)
Re: do only critical work during single-user vacuum?

On Tue, Feb 01, 2022 at 04:50:31PM -0500, John Naylor wrote:

On Thu, Jan 27, 2022 at 8:28 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

I'm sure you meant "&" here (fixed in attached patch to appease the cfbot):
+ if (options | VACOPT_MINIMAL)

Thanks for catching that! That copy-pasto was also masking my failure
to process the option properly -- fixed in the attached as v5.

Thank the cfbot ;)

Actually, your most recent patch failed again without this:

-               if (params->VACOPT_EMERGENCY)
+               if (params->options & VACOPT_EMERGENCY)

- It mentions the new command in the error hint in
GetNewTransactionId(). I'm not sure if multi-word commands should be
quoted like this.

Use <literal> ?

+      xid age or mxid age is older than 1 billion. To complete as quickly as possible, an emergency
+      vacuum will skip truncation and index cleanup, and will skip toast tables whose age has not
+      exceeded the cutoff.

Why does this specially mention toast tables ?

+      While this option could be used while the postmaster is running, it is expected that the wraparound
+      failsafe mechanism will automatically work in the same way to prevent imminent shutdown.
+      When <literal>EMERGENCY</literal> is specified no tables may be listed, since it is designed to

specified comma

+      select candidate relations from the entire database.
+      The only other option that may be combined with <literal>VERBOSE</literal>, although in single-user mode no client messages are

this is missing a word?
Maybe say: May not be combined with any other option, other than VERBOSE.

Should the docs describe that the vacuum is done with cost based delay disabled
and with vacuum_freeze_table_age=0 (and other settings).

+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+								errmsg("option \"%s\" is incompatible with EMERGENCY", opt->defname),
+								parser_errposition(pstate, opt->location)));

IMO new code should avoid using the outer parenthesis around errcode().

Maybe the errmsg should say: .. may not be specified with EMERGENCY.
EMERGENCY probably shouldn't be part of the translatable string.

+                       if (strcmp(opt->defname, "emergency") != 0 &&
+                               strcmp(opt->defname, "verbose") != 0 &&
+                               defGetBoolean(opt))

It's wrong to call getBoolean(), since the options may not be boolean.
postgres=# VACUUM(EMERGENCY, INDEX_CLEANUP auto);
ERROR: index_cleanup requires a Boolean value

I think EMERGENCY mode should disable process_toast. It already processes
toast tables separately. See 003.

Should probably exercise (EMERGENCY) in vacuum.sql. See 003.

I still wonder if the relations should be processed in order of decreasing age.
An admin might have increased autovacuum_freeze_max_age up to 2e9, and your
query might return thousands of tables, with a wide range of sizes and ages.

While that seems like a nice property to have, it does complicate
things, so can be left for follow-on work.

I added that in the attached 003.

--
Justin

Attachments:

0001-do-only-critical-work-during-single-user-vacuum.patchtext/x-diff; charset=us-asciiDownload
From a870303f4bd62b7c653a4bef53ed6d2748268bc0 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@enterprisedb.com>
Date: Tue, 1 Feb 2022 16:50:31 -0500
Subject: [PATCH 1/3] do only critical work during single-user vacuum?

Feb 01 John Naylor
---
 doc/src/sgml/maintenance.sgml       |  12 ++--
 doc/src/sgml/ref/vacuum.sgml        |  22 ++++++
 src/backend/access/transam/varsup.c |   4 +-
 src/backend/commands/vacuum.c       | 107 +++++++++++++++++++++++++---
 src/include/commands/vacuum.h       |   5 ++
 5 files changed, 134 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 36f975b1e5b..5c360499504 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -629,17 +629,19 @@ HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that data
 
 <programlisting>
 ERROR:  database is not accepting commands to avoid wraparound data loss in database "mydb"
-HINT:  Stop the postmaster and vacuum that database in single-user mode.
+HINT:  Stop the postmaster and run "VACUUM (EMERGENCY)" in that database in single-user mode.
 </programlisting>
 
     The three-million-transaction safety margin exists to let the
     administrator recover without data loss, by manually executing the
-    required <command>VACUUM</command> commands.  However, since the system will not
+    required <command>VACUUM</command> command.  However, since the system will not
     execute commands once it has gone into the safety shutdown mode,
     the only way to do this is to stop the server and start the server in single-user
-    mode to execute <command>VACUUM</command>.  The shutdown mode is not enforced
-    in single-user mode.  See the <xref linkend="app-postgres"/> reference
-    page for details about using single-user mode.
+    mode to execute <command>VACUUM (EMERGENCY)</command>. The <literal>EMERGENCY</literal> option
+    is recommended, since it enables the vacuum to complete as quickly as possible
+    while still leaving a safety margin for when the system comes back online again.
+    The shutdown mode is not enforced in single-user mode.
+    See the <xref linkend="app-postgres"/> reference page for details about using single-user mode.
    </para>
 
    <sect3 id="vacuum-for-multixact-wraparound">
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 3df32b58ee6..2dab01ff376 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -295,6 +295,28 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>EMERGENCY</literal></term>
+    <listitem>
+     <para>
+      Special vacuum mode intended for when a forced shutdown has happened to avoid transaction ID wraparound.
+      It performs a database-wide vacuum on tables, toast tables, and materialized views whose
+      xid age or mxid age is older than 1 billion. To complete as quickly as possible, an emergency
+      vacuum will skip truncation and index cleanup, and will skip toast tables whose age has not
+      exceeded the cutoff.
+     </para>
+
+     <para>
+      While this option could be used while the postmaster is running, it is expected that the wraparound
+      failsafe mechanism will automatically work in the same way to prevent imminent shutdown.
+      When <literal>EMERGENCY</literal> is specified no tables may be listed, since it is designed to
+      select candidate relations from the entire database.
+      The only other option that may be combined with <literal>VERBOSE</literal>, although in single-user mode no client messages are
+      output.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 748120a0125..ee9d33dba37 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -128,14 +128,14 @@ GetNewTransactionId(bool isSubXact)
 						(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
 						 errmsg("database is not accepting commands to avoid wraparound data loss in database \"%s\"",
 								oldest_datname),
-						 errhint("Stop the postmaster and vacuum that database in single-user mode.\n"
+						 errhint("Stop the postmaster and run \"VACUUM (EMERGENCY)\" in that database in single-user mode.\n"
 								 "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
 			else
 				ereport(ERROR,
 						(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
 						 errmsg("database is not accepting commands to avoid wraparound data loss in database with OID %u",
 								oldest_datoid),
-						 errhint("Stop the postmaster and vacuum that database in single-user mode.\n"
+						 errhint("Stop the postmaster and run \"VACUUM (EMERGENCY)\" in that database in single-user mode.\n"
 								 "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
 		}
 		else if (TransactionIdFollowsOrEquals(xid, xidWarnLimit))
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index d1dadc54e47..e36159e827f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -52,6 +52,7 @@
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "utils/acl.h"
+#include "utils/builtins.h"
 #include "utils/fmgroids.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -86,7 +87,7 @@ int			VacuumCostBalanceLocal = 0;
 
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
-static List *get_all_vacuum_rels(int options);
+static List *get_all_vacuum_rels(VacuumParams *params);
 static void vac_truncate_clog(TransactionId frozenXID,
 							  MultiXactId minMulti,
 							  TransactionId lastSaneFrozenXid,
@@ -114,6 +115,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		full = false;
 	bool		disable_page_skipping = false;
 	bool		process_toast = true;
+	bool		emergency = false;
 	ListCell   *lc;
 
 	/* index_cleanup and truncate values unspecified for now */
@@ -123,6 +125,10 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* By default parallel vacuum is enabled */
 	params.nworkers = 0;
 
+	/* By default don't skip any tables */
+	params.min_xid_age = 0;
+	params.min_mxid_age = 0;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -200,6 +206,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 					params.nworkers = nworkers;
 			}
 		}
+		else if (strcmp(opt->defname, "emergency") == 0)
+			emergency = defGetBoolean(opt);
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -216,7 +224,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
 		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
-		(process_toast ? VACOPT_PROCESS_TOAST : 0);
+		(process_toast ? VACOPT_PROCESS_TOAST : 0) |
+		(emergency ? VACOPT_EMERGENCY : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -246,17 +255,70 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if (emergency)
+	{
+		/* exclude incompatible options */
+		foreach(lc, vacstmt->options)
+		{
+			DefElem    *opt = (DefElem *) lfirst(lc);
+
+			if (strcmp(opt->defname, "emergency") != 0 &&
+				strcmp(opt->defname, "verbose") != 0 &&
+				defGetBoolean(opt))
+
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+								errmsg("option \"%s\" is incompatible with EMERGENCY", opt->defname),
+								parser_errposition(pstate, opt->location)));
+		}
+
+		/* prevent specifying a list of tables, to keep it simple */
+		if (vacstmt->rels != NULL)
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						errmsg("a relation list is not supported with the EMERGENCY option")));
+
+		/* skip unnecessary work, similar to failsafe mode */
+
+		params.index_cleanup = VACOPTVALUE_DISABLED;
+		params.truncate = VACOPTVALUE_DISABLED;
+
+		/* Hard-code 1 billion for the thresholds to avoid making assumptions
+		* about the configuration. This leaves some headroom for when the user
+		* returns to normal mode while also minimizing work.
+		* WIP: consider passing these constants via the params struct
+		*/
+		// params.min_xid_age = 1000 * 1000 * 1000;
+		// params.min_mxid_age = 1000 * 1000 * 1000;
+		// FIXME to speed up testing
+		params.min_xid_age = 1000 * 1000;
+		params.min_mxid_age = 1000 * 1000;
+	}
+
 	/*
-	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
-	 * them as -1 which means to use the default values.
+	 * Set freeze ages to zero where appropriate; otherwise pass
+	 * them as -1 which means to use the configured values.
 	 */
 	if (params.options & VACOPT_FREEZE)
 	{
+		/* All freeze ages are zero if the FREEZE option is given */
 		params.freeze_min_age = 0;
 		params.freeze_table_age = 0;
 		params.multixact_freeze_min_age = 0;
 		params.multixact_freeze_table_age = 0;
 	}
+	else if (params.options & VACOPT_EMERGENCY)
+	{
+		/* It's highly likely any table selected will be eligible for aggressive vacuum, but make sure */
+		params.freeze_table_age = 0;
+		params.multixact_freeze_table_age = 0;
+
+		// WIP: It might be worth trying to do less work here, such as max age / 2 :
+		// params.freeze_min_age = 100 * 1000 * 1000;
+		// params.multixact_freeze_min_age = 200 * 1000 * 1000;
+		params.freeze_min_age = -1;
+		params.multixact_freeze_min_age = -1;
+	}
 	else
 	{
 		params.freeze_min_age = -1;
@@ -404,7 +466,7 @@ vacuum(List *relations, VacuumParams *params,
 		relations = newrels;
 	}
 	else
-		relations = get_all_vacuum_rels(params->options);
+		relations = get_all_vacuum_rels(params);
 
 	/*
 	 * Decide whether we need to start/commit our own transactions.
@@ -461,7 +523,10 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		if (params->VACOPT_EMERGENCY)
+			VacuumCostActive = false;
+		else
+			VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -888,12 +953,14 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
  * the current database.  The list is built in vac_context.
  */
 static List *
-get_all_vacuum_rels(int options)
+get_all_vacuum_rels(VacuumParams *params)
 {
 	List	   *vacrels = NIL;
 	Relation	pgclass;
 	TableScanDesc scan;
 	HeapTuple	tuple;
+	int32 		table_xid_age,
+				table_mxid_age;
 
 	pgclass = table_open(RelationRelationId, AccessShareLock);
 
@@ -906,15 +973,37 @@ get_all_vacuum_rels(int options)
 		Oid			relid = classForm->oid;
 
 		/* check permissions of relation */
-		if (!vacuum_is_relation_owner(relid, classForm, options))
+		if (!vacuum_is_relation_owner(relid, classForm, params->options))
 			continue;
 
+		if (params->options & VACOPT_EMERGENCY)
+		{
+			/*
+			* Only consider relations able to hold unfrozen XIDs (anything else
+			* should have InvalidTransactionId in relfrozenxid anyway).
+			*/
+			if (classForm->relkind != RELKIND_RELATION &&
+				classForm->relkind != RELKIND_MATVIEW &&
+				classForm->relkind != RELKIND_TOASTVALUE)
+			{
+				Assert(!TransactionIdIsValid(classForm->relfrozenxid));
+				Assert(!MultiXactIdIsValid(classForm->relminmxid));
+				continue;
+			}
+
+			table_xid_age = DirectFunctionCall1(xid_age, classForm->relfrozenxid);
+			table_mxid_age = DirectFunctionCall1(mxid_age, classForm->relminmxid);
+
+			if ((table_xid_age < params->min_xid_age) &&
+				(table_mxid_age < params->min_mxid_age))
+				continue;
+		}
 		/*
 		 * We include partitioned tables here; depending on which operation is
 		 * to be performed, caller will decide whether to process or ignore
 		 * them.
 		 */
-		if (classForm->relkind != RELKIND_RELATION &&
+		else if (classForm->relkind != RELKIND_RELATION &&
 			classForm->relkind != RELKIND_MATVIEW &&
 			classForm->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5d0bdfa4279..8aab6641628 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -188,6 +188,7 @@ typedef struct VacAttrStats
 #define VACOPT_SKIP_LOCKED 0x20 /* skip if cannot get lock */
 #define VACOPT_PROCESS_TOAST 0x40	/* process the TOAST table, if any */
 #define VACOPT_DISABLE_PAGE_SKIPPING 0x80	/* don't skip any pages */
+#define VACOPT_EMERGENCY 0x100	/* do minimal freezing work to prevent or get out of shutdown */
 
 /*
  * Values used by index_cleanup and truncate params.
@@ -233,6 +234,10 @@ typedef struct VacuumParams
 	 * disabled.
 	 */
 	int			nworkers;
+
+	/* cutoff ages for selecting tables to vacuum, 0 is the default */
+	int 		min_xid_age;
+	int 		min_mxid_age;
 } VacuumParams;
 
 /*
-- 
2.17.1

0002-fix-compile-error.patchtext/x-diff; charset=us-asciiDownload
From 10bccddb53af5bee869174aecf9eb3c0d2855f6e Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 1 Feb 2022 18:34:33 -0600
Subject: [PATCH 2/3] fix compile error

---
 src/backend/commands/vacuum.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e36159e827f..6bdce7d1373 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -523,7 +523,7 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		if (params->VACOPT_EMERGENCY)
+		if (params->options & VACOPT_EMERGENCY)
 			VacuumCostActive = false;
 		else
 			VacuumCostActive = (VacuumCostDelay > 0);
-- 
2.17.1

0003-VACUUM-EMERGENCY-sort-tables-by-age-m-xid.patchtext/x-diff; charset=us-asciiDownload
From 742f23c611d896e386a865fb1a80b151471351fa Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 1 Feb 2022 22:28:56 -0600
Subject: [PATCH 3/3] VACUUM(EMERGENCY): sort tables by age(m/xid)

---
 src/backend/commands/vacuum.c        | 38 +++++++++++++++++++++++++---
 src/backend/nodes/copyfuncs.c        |  1 +
 src/backend/nodes/equalfuncs.c       |  1 +
 src/backend/nodes/makefuncs.c        |  3 ++-
 src/backend/parser/gram.y            |  2 +-
 src/backend/postmaster/autovacuum.c  |  2 +-
 src/include/nodes/makefuncs.h        |  3 ++-
 src/include/nodes/parsenodes.h       |  1 +
 src/test/regress/expected/vacuum.out |  7 +++++
 src/test/regress/sql/vacuum.sql      |  4 +++
 10 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 6bdce7d1373..2fa1911417c 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -224,7 +224,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(freeze ? VACOPT_FREEZE : 0) |
 		(full ? VACOPT_FULL : 0) |
 		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
-		(process_toast ? VACOPT_PROCESS_TOAST : 0) |
+		(process_toast && !emergency ? VACOPT_PROCESS_TOAST : 0) |
 		(emergency ? VACOPT_EMERGENCY : 0);
 
 	/* sanity checks on options */
@@ -291,8 +291,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		// params.min_xid_age = 1000 * 1000 * 1000;
 		// params.min_mxid_age = 1000 * 1000 * 1000;
 		// FIXME to speed up testing
-		params.min_xid_age = 1000 * 1000;
-		params.min_mxid_age = 1000 * 1000;
+		params.min_xid_age = 1000 ;
+		params.min_mxid_age = 1000 ;
 	}
 
 	/*
@@ -889,6 +889,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 			oldcontext = MemoryContextSwitchTo(vac_context);
 			vacrels = lappend(vacrels, makeVacuumRelation(vrel->relation,
 														  relid,
+														  0,
 														  vrel->va_cols));
 			MemoryContextSwitchTo(oldcontext);
 		}
@@ -926,6 +927,7 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 				oldcontext = MemoryContextSwitchTo(vac_context);
 				vacrels = lappend(vacrels, makeVacuumRelation(NULL,
 															  part_oid,
+															  0,
 															  vrel->va_cols));
 				MemoryContextSwitchTo(oldcontext);
 			}
@@ -948,6 +950,26 @@ expand_vacuum_rel(VacuumRelation *vrel, int options)
 	return vacrels;
 }
 
+/*
+ * Helper function for qsort()
+ *
+ * The result is in order of decreasing age.
+ */
+static int
+compare_xidage(const void *a, const void *b)
+{
+	const ListCell *alc = a;
+	const ListCell *blc = b;
+	VacuumRelation *aa = (VacuumRelation *) lfirst(alc);
+	VacuumRelation *bb = (VacuumRelation *) lfirst(blc);
+
+	if (aa->age < bb->age)
+		return +1;
+	if (aa->age > bb->age)
+		return -1;
+	return 0;
+}
+
 /*
  * Construct a list of VacuumRelations for all vacuumable rels in
  * the current database.  The list is built in vac_context.
@@ -971,6 +993,7 @@ get_all_vacuum_rels(VacuumParams *params)
 		Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);
 		MemoryContext oldcontext;
 		Oid			relid = classForm->oid;
+		TransactionId	table_age = 0;
 
 		/* check permissions of relation */
 		if (!vacuum_is_relation_owner(relid, classForm, params->options))
@@ -993,6 +1016,7 @@ get_all_vacuum_rels(VacuumParams *params)
 
 			table_xid_age = DirectFunctionCall1(xid_age, classForm->relfrozenxid);
 			table_mxid_age = DirectFunctionCall1(mxid_age, classForm->relminmxid);
+			table_age = Max(table_xid_age, table_mxid_age);
 
 			if ((table_xid_age < params->min_xid_age) &&
 				(table_mxid_age < params->min_mxid_age))
@@ -1016,6 +1040,7 @@ get_all_vacuum_rels(VacuumParams *params)
 		oldcontext = MemoryContextSwitchTo(vac_context);
 		vacrels = lappend(vacrels, makeVacuumRelation(NULL,
 													  relid,
+													  table_age,
 													  NIL));
 		MemoryContextSwitchTo(oldcontext);
 	}
@@ -1023,6 +1048,13 @@ get_all_vacuum_rels(VacuumParams *params)
 	table_endscan(scan);
 	table_close(pgclass, AccessShareLock);
 
+	if ((params->options & VACOPT_EMERGENCY) && vacrels != NIL)
+	{
+		/* Sort the list in order of deceasing XID age */
+		qsort(vacrels->elements, list_length(vacrels), sizeof(ListCell),
+				compare_xidage);
+	}
+
 	return vacrels;
 }
 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 90b5da51c95..50dab1bb21b 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4100,6 +4100,7 @@ _copyVacuumRelation(const VacuumRelation *from)
 
 	COPY_NODE_FIELD(relation);
 	COPY_SCALAR_FIELD(oid);
+	COPY_SCALAR_FIELD(age);
 	COPY_NODE_FIELD(va_cols);
 
 	return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 06345da3ba8..3c1adcaf292 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1768,6 +1768,7 @@ _equalVacuumRelation(const VacuumRelation *a, const VacuumRelation *b)
 {
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_SCALAR_FIELD(oid);
+	COMPARE_SCALAR_FIELD(age);
 	COMPARE_NODE_FIELD(va_cols);
 
 	return true;
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index 822395625b6..e234a501e93 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -808,12 +808,13 @@ makeGroupingSet(GroupingSetKind kind, List *content, int location)
  *	  create a VacuumRelation node
  */
 VacuumRelation *
-makeVacuumRelation(RangeVar *relation, Oid oid, List *va_cols)
+makeVacuumRelation(RangeVar *relation, Oid oid, TransactionId age, List *va_cols)
 {
 	VacuumRelation *v = makeNode(VacuumRelation);
 
 	v->relation = relation;
 	v->oid = oid;
+	v->age = age;
 	v->va_cols = va_cols;
 	return v;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b5966712ce1..8e6541ea165 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10947,7 +10947,7 @@ opt_name_list:
 vacuum_relation:
 			qualified_name opt_name_list
 				{
-					$$ = (Node *) makeVacuumRelation($1, InvalidOid, $2);
+					$$ = (Node *) makeVacuumRelation($1, InvalidOid, 0, $2);
 				}
 		;
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 681ef91b81e..e0f2eefd769 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3241,7 +3241,7 @@ autovacuum_do_vac_analyze(autovac_table *tab, BufferAccessStrategy bstrategy)
 
 	/* Set up one VacuumRelation target, identified by OID, for vacuum() */
 	rangevar = makeRangeVar(tab->at_nspname, tab->at_relname, -1);
-	rel = makeVacuumRelation(rangevar, tab->at_relid, NIL);
+	rel = makeVacuumRelation(rangevar, tab->at_relid, 0, NIL);
 	rel_list = list_make1(rel);
 
 	vacuum(rel_list, &tab->at_params, bstrategy, true);
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index fe173101d12..e8038bcc5d1 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -104,6 +104,7 @@ extern DefElem *makeDefElemExtended(char *nameSpace, char *name, Node *arg,
 
 extern GroupingSet *makeGroupingSet(GroupingSetKind kind, List *content, int location);
 
-extern VacuumRelation *makeVacuumRelation(RangeVar *relation, Oid oid, List *va_cols);
+extern VacuumRelation *makeVacuumRelation(RangeVar *relation, Oid oid,
+										  TransactionId age, List *va_cols);
 
 #endif							/* MAKEFUNC_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3e9bdc781f9..c25ee7389a3 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3374,6 +3374,7 @@ typedef struct VacuumRelation
 	NodeTag		type;
 	RangeVar   *relation;		/* table name to process, or NULL */
 	Oid			oid;			/* table's OID; InvalidOid if not looked up */
+	TransactionId		age;			/* table's age or 0 if unknown*/
 	List	   *va_cols;		/* list of column names, or NIL for all */
 } VacuumRelation;
 
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 3e70e4c788e..c5b91871ca1 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -413,3 +413,10 @@ RESET ROLE;
 DROP TABLE vacowned;
 DROP TABLE vacowned_parted;
 DROP ROLE regress_vacuum;
+VACUUM(EMERGENCY, PROCESS_TOAST); -- fails
+ERROR:  option "process_toast" is incompatible with EMERGENCY
+LINE 1: VACUUM(EMERGENCY, PROCESS_TOAST);
+                          ^
+VACUUM(EMERGENCY) pg_class; -- fails
+ERROR:  a relation list is not supported with the EMERGENCY option
+VACUUM(EMERGENCY);
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 18cb7fd08ac..e41faa93590 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -318,3 +318,7 @@ RESET ROLE;
 DROP TABLE vacowned;
 DROP TABLE vacowned_parted;
 DROP ROLE regress_vacuum;
+
+VACUUM(EMERGENCY, PROCESS_TOAST); -- fails
+VACUUM(EMERGENCY) pg_class; -- fails
+VACUUM(EMERGENCY);
-- 
2.17.1

#52Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#12)
Re: do only critical work during single-user vacuum?

On Thu, Dec 9, 2021 at 8:56 PM Andres Freund <andres@anarazel.de> wrote:

I think we should move *away* from single user mode, rather than the
opposite. It's a substantial code burden and it's hard to use.

Yes. This thread seems to be largely devoted to the topic of making
single-user vacuum work better, but I don't see anyone asking the
question "why do we have a message that tells people to vacuum in
single user mode in the first place?". It's basically bad advice, with
one small exception that I'll talk about in a minute. Suppose we had a
message in the tree that said "HINT: Consider angering a live anaconda
to fix this problem." If that were so, the correct thing to do
wouldn't be to add a section to our documentation explaining how to
deal with angry anacondas. The correct thing to do would be to remove
the hint as bad advice that we never should have offered in the first
place. And so here. We should not try to make vacuum in single
user-mode work better or differently, or at least that shouldn't be
our primary objective. We should just stop telling people to do it. We
should probably add messages and documentation *discouraging* the use
of single user mode for recovering from wraparound trouble, exactly
the opposite of what we do now. There's nothing we can do in
single-user mode that we can't do equally well in multi-user mode. If
people try to fix wraparound problems in multi-user mode, they still
have read-only access to their database, they can use parallelism,
they can use command line utilities like vacuumdb, and they can use
psql which has line editing and allows remote access and is a way
nicer user experience than running postgres --single. We need a really
compelling reason to tell people to give up all those advantages, and
there is no such reason. It makes just as much sense as telling people
to deal with wraparound problems by angering a live anaconda.

I did say there was an exception, and it's this: the last time I
studied this issue back in 2019,[1]/messages/by-id/CA+Tgmob1QCMJrHwRBK8HZtGsr+6cJANRQw2mEgJ9e=D+z7cOsw@mail.gmail.com vacuum insisted on trying to
truncate tables even when the system is in wraparound danger. Then it
would fail, because truncating the table required allocating an XID,
which would fail if we were short on XIDs. By putting the system in
single user mode, you could continue to allocate XIDs and thus VACUUM
would work. However, if you think about this for even 10 seconds, you
can see that it's terrible. If we're so short of XIDs that we are
scared to allocate them for fear of causing an actual wraparound,
putting the system into a mode where that protection is bypassed is a
super-terrible idea. People will be able to run vacuum, yes, but if
they have too many tables, they will actually experience wraparound
and thus data loss before they process all the tables they have. What
we ought to do to solve this problem is NOT TRUNCATE when the number
of remaining XIDs is small, so that we don't consume any of the
remaining XIDs until we get the system out of wraparound danger. I
think the "failsafe" stuff Peter added in v14 fixes that, though. If
not, we should adjust it so it does. And then we should KILL WITH FIRE
the message telling people to use single user mode -- and once we do
that, the question of what the behavior ought to be when someone does
run VACUUM in single user mode becomes a lot less important.

This problem is basically self-inflicted. We have given people bad
advice (use single user mode) and then they suffer when they take it.
Ameliorating the suffering isn't the worst idea ever, but it's
basically fixing the wrong problem.

--
Robert Haas
EDB: http://www.enterprisedb.com

[1]: /messages/by-id/CA+Tgmob1QCMJrHwRBK8HZtGsr+6cJANRQw2mEgJ9e=D+z7cOsw@mail.gmail.com

#53John Naylor
john.naylor@enterprisedb.com
In reply to: Robert Haas (#52)
Re: do only critical work during single-user vacuum?

On Thu, Feb 3, 2022 at 1:06 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Dec 9, 2021 at 8:56 PM Andres Freund <andres@anarazel.de> wrote:

I think we should move *away* from single user mode, rather than the
opposite. It's a substantial code burden and it's hard to use.

Yes. This thread seems to be largely devoted to the topic of making
single-user vacuum work better, but I don't see anyone asking the
question "why do we have a message that tells people to vacuum in
single user mode in the first place?". It's basically bad advice, with
one small exception that I'll talk about in a minute.

The word "advice" sounds like people have a choice, rather than the
system not accepting commands anymore. It would be much less painful
if the system closed connections and forbade all but superusers to
connect, but that sounds like a lot of work. (happy to be proven
otherwise)

--
John Naylor
EDB: http://www.enterprisedb.com

#54Robert Haas
robertmhaas@gmail.com
In reply to: John Naylor (#53)
Re: do only critical work during single-user vacuum?

On Thu, Feb 3, 2022 at 1:34 PM John Naylor <john.naylor@enterprisedb.com> wrote:

The word "advice" sounds like people have a choice, rather than the
system not accepting commands anymore. It would be much less painful
if the system closed connections and forbade all but superusers to
connect, but that sounds like a lot of work. (happy to be proven
otherwise)

They *do* have a choice. They can continue to operate the system in
multi-user mode, they can have read access to their data, and they can
run VACUUM and other non-XID-allocating commands to fix the issue.
Sure, their application can't run commands that allocate XIDs, but
it's not going to be able to do that if they go to single-user mode
either.

I don't understand why we would want the system to stop accepting
connections other than superuser connections. That would provide
strictly less functionality and I don't understand what it would gain.
But it would still be better than going into single-user mode, which
provides even less functionality and has basically no advantages of
any kind.

Why are you convinced that the user HAS to go to single-user mode? I
don't think they have to do that, and I don't think they should want
to do that.

--
Robert Haas
EDB: http://www.enterprisedb.com

#55John Naylor
john.naylor@enterprisedb.com
In reply to: Robert Haas (#54)
Re: do only critical work during single-user vacuum?

On Thu, Feb 3, 2022 at 1:42 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Feb 3, 2022 at 1:34 PM John Naylor <john.naylor@enterprisedb.com> wrote:

The word "advice" sounds like people have a choice, rather than the
system not accepting commands anymore. It would be much less painful
if the system closed connections and forbade all but superusers to
connect, but that sounds like a lot of work. (happy to be proven
otherwise)

They *do* have a choice. They can continue to operate the system in
multi-user mode, they can have read access to their data, and they can
run VACUUM and other non-XID-allocating commands to fix the issue.
Sure, their application can't run commands that allocate XIDs, but
it's not going to be able to do that if they go to single-user mode
either.

I just checked some client case notes where they tried just that
before getting outside help, and both SELECT and VACUUM FREEZE
commands were rejected. The failure is clearly indicated in the log.
--
John Naylor
EDB: http://www.enterprisedb.com

#56Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#54)
Re: do only critical work during single-user vacuum?

Hi,

On 2022-02-03 13:42:20 -0500, Robert Haas wrote:

They *do* have a choice. They can continue to operate the system in
multi-user mode, they can have read access to their data, and they can
run VACUUM and other non-XID-allocating commands to fix the issue.
Sure, their application can't run commands that allocate XIDs, but
it's not going to be able to do that if they go to single-user mode
either.

I wonder if we shouldn't add some exceptions to the xid allocation
prevention. It makes sense that we don't allow random DML. But it's e.g. often
more realistic to drop / truncate a few tables with unimportant content,
rather than spend the time vacuuming those. We could e.g. allow xid
consumption within VACUUM, TRUNCATE, DROP TABLE / INDEX when run at the top
level for longer than we allow it for anything else.

But it would still be better than going into single-user mode, which
provides even less functionality and has basically no advantages of
any kind.

Indeed. Single user is the worst response to this (and just about anything
else, really). Even just getting into the single user mode takes a while
(shutdown checkpoint). The user interface is completely different (and
awful). The buffer cache is completely cold. The system is slower because
there's no wal writer / checkpointer running. Which basically is a list of
things one absolutely do not wants when confronted with a wraparound
situation.

Greetings,

Andres Freund

#57Robert Haas
robertmhaas@gmail.com
In reply to: John Naylor (#55)
Re: do only critical work during single-user vacuum?

On Thu, Feb 3, 2022 at 4:18 PM John Naylor <john.naylor@enterprisedb.com> wrote:

I just checked some client case notes where they tried just that
before getting outside help, and both SELECT and VACUUM FREEZE
commands were rejected. The failure is clearly indicated in the log.

It would be helpful to know how it failed - what was the error? And
then I think we should just fix whatever the problem is. As I said
before, I know TRUNCATE has been an issue in the past, and if that's
not already fixed in v14, we should. If there's other stuff, we should
fix that too. The point I'm making here, which I still believe to be
valid, is that there's nothing intrinsically better about being in
single user mode. In fact, it's clearly worse. And I don't think it's
hard to fix it so that we avoid people needing to do that in the first
place.

--
Robert Haas
EDB: http://www.enterprisedb.com

#58Andres Freund
andres@anarazel.de
In reply to: John Naylor (#55)
Re: do only critical work during single-user vacuum?

Hi,

On 2022-02-03 16:18:27 -0500, John Naylor wrote:

I just checked some client case notes where they tried just that
before getting outside help, and both SELECT and VACUUM FREEZE
commands were rejected.

What kind of SELECT was that? Any chance it caused a write via functions, a
view, whatnot? And what version? What was the exact error message?

VACUUM FREEZE is a *terrible* idea to run when encountering anti-wraparound
issues. I understand why people thing key might need it, but basically all it
achieves is to make VACUUM do a lot more, none of it helpful to get out of the
wraparound-can't-write situation (those rows will already get frozen).

I'd plus one the addition of a HINT that tells users that FREEZE likely is a
bad idea when in wraparound land. We should allow it, because there are
situation where it might make sense, but the people that can make that
judgement know they can ignore the HINT.

Greetings,

Andres Freund

#59Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#56)
Re: do only critical work during single-user vacuum?

On Thu, Feb 3, 2022 at 4:50 PM Andres Freund <andres@anarazel.de> wrote:

I wonder if we shouldn't add some exceptions to the xid allocation
prevention. It makes sense that we don't allow random DML. But it's e.g. often
more realistic to drop / truncate a few tables with unimportant content,
rather than spend the time vacuuming those. We could e.g. allow xid
consumption within VACUUM, TRUNCATE, DROP TABLE / INDEX when run at the top
level for longer than we allow it for anything else.

True, although we currently don't start refusing XID allocation
altogether until only 1 million remain, IIRC. And that's cutting it
really close if we need to start consuming 1 XID per table we need to
drop. We might need to push out some of the thresholds a bit.

For the most part, I think that there's no reason why autovacuum
shouldn't be able to recover from this situation automatically, as
long as old replication slots and prepared transactions are cleaned up
and any old transactions are killed off. I don't think we're very far
from that Just Working, but we are not all there yet either. Manual
intervention to drop tables etc. is reasonable to allow a bit more
than we do now, but the big problem IMO is that the behavior when we
run short of XIDs has had very little testing and bug fixing, so
things that don't really need to break just do anyway.

Indeed. Single user is the worst response to this (and just about anything
else, really). Even just getting into the single user mode takes a while
(shutdown checkpoint). The user interface is completely different (and
awful). The buffer cache is completely cold. The system is slower because
there's no wal writer / checkpointer running. Which basically is a list of
things one absolutely do not wants when confronted with a wraparound
situation.

+1.

--
Robert Haas
EDB: http://www.enterprisedb.com

#60John Naylor
john.naylor@enterprisedb.com
In reply to: Andres Freund (#58)
Re: do only critical work during single-user vacuum?

On Thu, Feb 3, 2022 at 4:58 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-02-03 16:18:27 -0500, John Naylor wrote:

I just checked some client case notes where they tried just that
before getting outside help, and both SELECT and VACUUM FREEZE
commands were rejected.

What kind of SELECT was that? Any chance it caused a write via functions, a
view, whatnot? And what version? What was the exact error message?

Looking closer, there is a function defined by an extension. I'd have
to dig further to see if writes happen. The error is exactly what
we've been talking about:

2022-01-03 22:03:23 PST ERROR: database is not accepting commands to
avoid wraparound data loss in database "<redacted>"
2022-01-03 22:03:23 PST HINT: Stop the postmaster and vacuum that
database in single-user mode. You might also need to commit or roll
back old prepared transactions.

--
John Naylor
EDB: http://www.enterprisedb.com

#61Robert Haas
robertmhaas@gmail.com
In reply to: John Naylor (#60)
Re: do only critical work during single-user vacuum?

On Thu, Feb 3, 2022 at 5:08 PM John Naylor <john.naylor@enterprisedb.com> wrote:

Looking closer, there is a function defined by an extension. I'd have
to dig further to see if writes happen. The error is exactly what
we've been talking about:

2022-01-03 22:03:23 PST ERROR: database is not accepting commands to
avoid wraparound data loss in database "<redacted>"
2022-01-03 22:03:23 PST HINT: Stop the postmaster and vacuum that
database in single-user mode. You might also need to commit or roll
back old prepared transactions.

That error comes from GetNewTransactionId(), so that function must
either try to execute DML or do something else which causes an XID to
be assigned. I think a plain SELECT should work just fine.

--
Robert Haas
EDB: http://www.enterprisedb.com

#62Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#59)
Re: do only critical work during single-user vacuum?

Hi,

On 2022-02-03 17:02:15 -0500, Robert Haas wrote:

On Thu, Feb 3, 2022 at 4:50 PM Andres Freund <andres@anarazel.de> wrote:

I wonder if we shouldn't add some exceptions to the xid allocation
prevention. It makes sense that we don't allow random DML. But it's e.g. often
more realistic to drop / truncate a few tables with unimportant content,
rather than spend the time vacuuming those. We could e.g. allow xid
consumption within VACUUM, TRUNCATE, DROP TABLE / INDEX when run at the top
level for longer than we allow it for anything else.

True, although we currently don't start refusing XID allocation
altogether until only 1 million remain, IIRC. And that's cutting it
really close if we need to start consuming 1 XID per table we need to
drop. We might need to push out some of the thresholds a bit.

Yea, I'd have no problem leaving the "hard" limit somewhere closer to 1
million (although 100k should be just as well), but introduce a softer "only
vacuum/drop/truncate" limit a good bit before that.

For the most part, I think that there's no reason why autovacuum
shouldn't be able to recover from this situation automatically, as
long as old replication slots and prepared transactions are cleaned up
and any old transactions are killed off.

To address the "as long as" part: I think that describing better what is
holding back the horizon would be a significant usability improvement.

Imagine that instead of the generic hints in these messages:
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("database is not accepting commands to avoid wraparound data loss in database \"%s\"",
oldest_datname),
errhint("Stop the postmaster and vacuum that database in single-user mode.\n"
"You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
and
ereport(WARNING,
(errmsg("oldest xmin is far in the past"),
errhint("Close open transactions soon to avoid wraparound problems.\n"
"You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));

we'd actually tell the user a bit more what about what is causing the
problem.

We can compute the:
1) oldest slot by xmin, with name
2) oldest walsender by xmin, with pid
3) oldest prepared transaction id by xid / xmin, with name
4) oldest in-progress transaction id by xid / xmin, with name
5) oldest database datfrozenxid, with database name

If 1-4) are close to 5), there's no point in trying to vacuum aggressively, it
won't help. So we instead can say that the xmin horizon (with a better name)
is held back by the oldest of these, with enough identifying information for
the user to actually know where to look.

In contrast, if 5) is older than 1-4), then we can tell the user which
database is the problem, as we do right now, but we can stop mentioning the
"You might also need to commit ..." bit.

Also, adding an SRF providing the above in a useful format would be great for
monitoring and for "remote debugging" of problems.

Greetings,

Andres Freund

#63Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#62)
Re: do only critical work during single-user vacuum?

On Thu, Feb 3, 2022 at 8:35 PM Andres Freund <andres@anarazel.de> wrote:

Yea, I'd have no problem leaving the "hard" limit somewhere closer to 1
million (although 100k should be just as well), but introduce a softer "only
vacuum/drop/truncate" limit a good bit before that.

+1.

To address the "as long as" part: I think that describing better what is
holding back the horizon would be a significant usability improvement.

Imagine that instead of the generic hints in these messages:
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("database is not accepting commands to avoid wraparound data loss in database \"%s\"",
oldest_datname),
errhint("Stop the postmaster and vacuum that database in single-user mode.\n"
"You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
and
ereport(WARNING,
(errmsg("oldest xmin is far in the past"),
errhint("Close open transactions soon to avoid wraparound problems.\n"
"You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));

we'd actually tell the user a bit more what about what is causing the
problem.

We can compute the:
1) oldest slot by xmin, with name
2) oldest walsender by xmin, with pid
3) oldest prepared transaction id by xid / xmin, with name
4) oldest in-progress transaction id by xid / xmin, with name
5) oldest database datfrozenxid, with database name

If 1-4) are close to 5), there's no point in trying to vacuum aggressively, it
won't help. So we instead can say that the xmin horizon (with a better name)
is held back by the oldest of these, with enough identifying information for
the user to actually know where to look.

Yes. This kind of thing strikes me as potentially a huge help. To
rephrase that in other terms, we could tell the user what the actual
problem is instead of suggesting to them that they shut down their
database just for fun. It's "just for fun" because (a) it typically
won't fix the real problem, which is most often (1) or (3) from your
list, and even if it's (2) or (4) they could just kill the session
instead of shutting down the whole database, and (b) no matter what
needs to be done, whether it's VACUUM or ROLLBACK PREPARED or
something else, they may as well do that thing in multi-user mode
rather than single-user mode, unless we as PostgreSQL developers
forgot to make that actually work.

--
Robert Haas
EDB: http://www.enterprisedb.com

#64Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#63)
Re: do only critical work during single-user vacuum?

Hi,

On 2022-02-03 21:08:03 -0500, Robert Haas wrote:

On Thu, Feb 3, 2022 at 8:35 PM Andres Freund <andres@anarazel.de> wrote:

We can compute the:
1) oldest slot by xmin, with name
2) oldest walsender by xmin, with pid
3) oldest prepared transaction id by xid / xmin, with name
4) oldest in-progress transaction id by xid / xmin, with name
5) oldest database datfrozenxid, with database name

If 1-4) are close to 5), there's no point in trying to vacuum aggressively, it
won't help. So we instead can say that the xmin horizon (with a better name)
is held back by the oldest of these, with enough identifying information for
the user to actually know where to look.

Yes. This kind of thing strikes me as potentially a huge help. To
rephrase that in other terms, we could tell the user what the actual
problem is instead of suggesting to them that they shut down their
database just for fun. It's "just for fun" because (a) it typically
won't fix the real problem, which is most often (1) or (3) from your
list, and even if it's (2) or (4) they could just kill the session
instead of shutting down the whole database

Not that it matters, but IME the leading cause is 5). Often due to autovacuum
configuration. Which reminded me of the one thing that single user mode
is actually helpful for: Being able to start a manual VACUUM.

Once autovacuum is churning along in anti-wrap mode, with multiple workers, it
can be hard to manually VACUUM without waiting for autovacuum to do it's
throttled thing. The only way is to start the manual VACUUM and kill
autovacuum workers whenever they're blocking the manual vacuum(s).

Which reminds me: Perhaps we ought to hint about reducing / removing
autovacuum cost limits in this situation? And perhaps make autovacuum absorb
config changes while running? It's annoying that an autovac halfway into a
huge table doesn't absorb changed cost limits for example.

(b) no matter what needs to be done, whether it's VACUUM or ROLLBACK
PREPARED or something else, they may as well do that thing in multi-user
mode rather than single-user mode, unless we as PostgreSQL developers forgot
to make that actually work.

One thing that we made quite hard is to rollback prepared transactions,
because we require to be in the same database (a lot of fun in single user
mode with a lot of databases). We can't commit in the same database, but I
wonder if it's doable to allow rollbacks?

Greetings,

Andres Freund

#65Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#64)
Re: do only critical work during single-user vacuum?

On Thu, Feb 03, 2022 at 07:26:01PM -0800, Andres Freund wrote:

Which reminds me: Perhaps we ought to hint about reducing / removing
autovacuum cost limits in this situation? And perhaps make autovacuum absorb
config changes while running? It's annoying that an autovac halfway into a
huge table doesn't absorb changed cost limits for example.

I remembered this thread:

https://commitfest.postgresql.org/32/2983/
| Running autovacuum dynamic update to cost_limit and delay

/messages/by-id/13A6B954-5C21-4E60-BC06-751C8EA469A0@amazon.com
/messages/by-id/0A3F8A3C-4328-4A4B-80CF-14CEBE0B695D@amazon.com

--
Justin

#66John Naylor
john.naylor@enterprisedb.com
In reply to: Robert Haas (#61)
Re: do only critical work during single-user vacuum?

On Thu, Feb 3, 2022 at 7:30 PM Robert Haas <robertmhaas@gmail.com> wrote:

That error comes from GetNewTransactionId(), so that function must
either try to execute DML or do something else which causes an XID to
be assigned. I think a plain SELECT should work just fine.

It was indeed doing writes, so that much is not a surprise anymore.

On Thu, Feb 3, 2022 at 9:08 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Feb 3, 2022 at 8:35 PM Andres Freund <andres@anarazel.de> wrote:

Yea, I'd have no problem leaving the "hard" limit somewhere closer to 1
million (although 100k should be just as well), but introduce a softer "only
vacuum/drop/truncate" limit a good bit before that.

+1.

Since there seems to be agreement on this, I can attempt a stab at it,
but it'll be another week before I can do so.

--
John Naylor
EDB: http://www.enterprisedb.com

#67John Naylor
john.naylor@enterprisedb.com
In reply to: Andres Freund (#56)
Re: do only critical work during single-user vacuum?

On Fri, Feb 4, 2022 at 4:58 AM Robert Haas <robertmhaas@gmail.com> wrote:

As I said
before, I know TRUNCATE has been an issue in the past, and if that's
not already fixed in v14, we should. If there's other stuff, we should
fix that too.

The failsafe mode does disable truncation as of v14:

commit 60f1f09ff44308667ef6c72fbafd68235e55ae27
Author: Peter Geoghegan <pg@bowt.ie>
Date: Tue Apr 13 12:58:31 2021 -0700

Don't truncate heap when VACUUM's failsafe is in effect.
--

To demonstrate to myself, I tried a few vacuums in a debugger session
with a breakpoint at GetNewTransactionId(). I've only seen it reach
here when heap truncation happens (or the not relevant for wraparound
situations FULL and ANALYZE).

With the maximum allowable setting of autovacuum_freeze_max_age of 2
billion, the highest allowable vacuum_failsafe_age is 2.1 billion, so
heap truncation will be shut off before the warnings start.

And then we should KILL WITH FIRE
the message telling people to use single user mode -- and once we do
that, the question of what the behavior ought to be when someone does
run VACUUM in single user mode becomes a lot less important.

Okay, so it sounds like changing the message is enough for v15? The
other two things mentioned are nice-to-haves, but wouldn't need to
hold back this minimal change, it seems:

On Fri, Feb 4, 2022 at 4:50 AM Andres Freund <andres@anarazel.de> wrote:

I wonder if we shouldn't add some exceptions to the xid allocation
prevention. It makes sense that we don't allow random DML. But it's e.g. often
more realistic to drop / truncate a few tables with unimportant content,
rather than spend the time vacuuming those. We could e.g. allow xid
consumption within VACUUM, TRUNCATE, DROP TABLE / INDEX when run at the top
level for longer than we allow it for anything else.

It seems like this would require having access to "nodetag(parsetree)"
of the statement available in GetNewTransactionId. I don't immediately
see an easy way to do that...is a global var within the realm of
acceptability?

On Fri, Feb 4, 2022 at 8:35 AM Andres Freund <andres@anarazel.de> wrote:

we'd actually tell the user a bit more what about what is causing the
problem.

We can compute the:
1) oldest slot by xmin, with name
2) oldest walsender by xmin, with pid
3) oldest prepared transaction id by xid / xmin, with name
4) oldest in-progress transaction id by xid / xmin, with name
5) oldest database datfrozenxid, with database name

[...]

Also, adding an SRF providing the above in a useful format would be great for
monitoring and for "remote debugging" of problems.

I concur it sounds very useful, and not terribly hard, but probably a
v16 project.

--
John Naylor
EDB: http://www.enterprisedb.com

In reply to: John Naylor (#67)
Re: do only critical work during single-user vacuum?

On Mon, Feb 14, 2022 at 8:04 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

The failsafe mode does disable truncation as of v14:

commit 60f1f09ff44308667ef6c72fbafd68235e55ae27
Author: Peter Geoghegan <pg@bowt.ie>
Date: Tue Apr 13 12:58:31 2021 -0700

Don't truncate heap when VACUUM's failsafe is in effect.

That's true, but bear in mind that it only does so when the specific
table being vacuumed actually triggers the failsafe. I believe that
VACUUM(EMERGENCY) doesn't just limit itself to vacuuming tables where
this is guaranteed (or even likely). If I'm not mistaken, it's
possible (even likely) that there will be a table whose
age(relfrozenxid) is high enough for VACUUM(EMERGENCY) to target the
table, and yet not so high that the failsafe will kick in at the
earliest opportunity.

To demonstrate to myself, I tried a few vacuums in a debugger session
with a breakpoint at GetNewTransactionId(). I've only seen it reach
here when heap truncation happens (or the not relevant for wraparound
situations FULL and ANALYZE).

It's possible for a manually issued VACUUM to directly disable
truncation (same with index_cleanup). Without getting into the
question of what the ideal behavior might be right now, I can say for
sure that it wouldn't be difficult to teach VACUUM(EMERGENCY) to pass
down the same options.

The failsafe is essentially a mechanism that dynamically changes these
options for an ongoing vacuum, once age(relfrozenxid) crosses a
certain threshold. There is nothing fundamentally special about that.

--
Peter Geoghegan

#69John Naylor
john.naylor@enterprisedb.com
In reply to: Peter Geoghegan (#68)
Re: do only critical work during single-user vacuum?

On Tue, Feb 15, 2022 at 11:22 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Mon, Feb 14, 2022 at 8:04 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

The failsafe mode does disable truncation as of v14:

commit 60f1f09ff44308667ef6c72fbafd68235e55ae27
Author: Peter Geoghegan <pg@bowt.ie>
Date: Tue Apr 13 12:58:31 2021 -0700

Don't truncate heap when VACUUM's failsafe is in effect.

That's true, but bear in mind that it only does so when the specific
table being vacuumed actually triggers the failsafe. I believe that
VACUUM(EMERGENCY) doesn't just limit itself to vacuuming tables where
this is guaranteed (or even likely). If I'm not mistaken, it's
possible (even likely) that there will be a table whose
age(relfrozenxid) is high enough for VACUUM(EMERGENCY) to target the
table, and yet not so high that the failsafe will kick in at the
earliest opportunity.

Well, the point of inventing this new vacuum mode was because I
thought that upon reaching xidStopLimit, we couldn't issue commands,
period, under the postmaster. If it was easier to get a test instance
to xidStopLimit, I certainly would have discovered this sooner. When
Andres wondered about getting away from single user mode, I assumed
that would involve getting into areas too deep to tackle for v15. As
Robert pointed out, lazy_truncate_heap is the only thing that can't
happen for vacuum at this point, and fully explains why in versions <
14 our client's attempts to vacuum resulted in error. Since the
failsafe mode turns off truncation, vacuum should now *just work* near
wraparound. If there is any doubt, we can tighten the check for
entering failsafe.

Now, it's certainly possible that autovacuum is either not working at
all because of something broken, or is not working on the oldest
tables at the moment, so one thing we could do is to make VACUUM [with
no tables listed] get the tables from pg_class in reverse order of
max(xid age, mxid age). That way, the horizon will eventually pull
back over time and the admin can optionally cancel the vacuum at some
point. Since the order is harmless when it's not needed, we can do
that unconditionally.
--
John Naylor
EDB: http://www.enterprisedb.com

#70Robert Haas
robertmhaas@gmail.com
In reply to: John Naylor (#69)
Re: do only critical work during single-user vacuum?

On Tue, Feb 15, 2022 at 1:04 AM John Naylor
<john.naylor@enterprisedb.com> wrote:

Well, the point of inventing this new vacuum mode was because I
thought that upon reaching xidStopLimit, we couldn't issue commands,
period, under the postmaster. If it was easier to get a test instance
to xidStopLimit, I certainly would have discovered this sooner. When
Andres wondered about getting away from single user mode, I assumed
that would involve getting into areas too deep to tackle for v15. As
Robert pointed out, lazy_truncate_heap is the only thing that can't
happen for vacuum at this point, and fully explains why in versions <
14 our client's attempts to vacuum resulted in error. Since the
failsafe mode turns off truncation, vacuum should now *just work* near
wraparound. If there is any doubt, we can tighten the check for
entering failsafe.

+1 to all of that.

--
Robert Haas
EDB: http://www.enterprisedb.com

In reply to: John Naylor (#69)
Re: do only critical work during single-user vacuum?

On Mon, Feb 14, 2022 at 10:04 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

Well, the point of inventing this new vacuum mode was because I
thought that upon reaching xidStopLimit, we couldn't issue commands,
period, under the postmaster. If it was easier to get a test instance
to xidStopLimit, I certainly would have discovered this sooner.

I did notice from my own testing of the failsafe (by artificially
inducing wraparound failure using an XID burning C function) that
autovacuum seemed to totally correct the problem, even when the system
had already crossed xidStopLimit - it came back on its own. I wasn't
completely sure of how robust this effect was, though.

When
Andres wondered about getting away from single user mode, I assumed
that would involve getting into areas too deep to tackle for v15. As
Robert pointed out, lazy_truncate_heap is the only thing that can't
happen for vacuum at this point, and fully explains why in versions <
14 our client's attempts to vacuum resulted in error. Since the
failsafe mode turns off truncation, vacuum should now *just work* near
wraparound. If there is any doubt, we can tighten the check for
entering failsafe.

Obviously having to enter single user mode is horrid. If we can
reasonably update the advice to something more reasonable now, then
that would help users that find themselves in this situation a great
deal.

Now, it's certainly possible that autovacuum is either not working at
all because of something broken, or is not working on the oldest
tables at the moment, so one thing we could do is to make VACUUM [with
no tables listed] get the tables from pg_class in reverse order of
max(xid age, mxid age). That way, the horizon will eventually pull
back over time and the admin can optionally cancel the vacuum at some
point. Since the order is harmless when it's not needed, we can do
that unconditionally.

My ongoing work on freezing/relfrozenxid tends to make the age of
relfrozenxid much more indicative of the amount of work that VACUUM
would have to do when run -- not limited to freezing. You could
probably do this anyway, but it's nice that that'll be true.

--
Peter Geoghegan

In reply to: Peter Geoghegan (#71)
Re: do only critical work during single-user vacuum?

On Tue, Feb 15, 2022 at 9:28 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Mon, Feb 14, 2022 at 10:04 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

Well, the point of inventing this new vacuum mode was because I
thought that upon reaching xidStopLimit, we couldn't issue commands,
period, under the postmaster. If it was easier to get a test instance
to xidStopLimit, I certainly would have discovered this sooner.

I did notice from my own testing of the failsafe (by artificially
inducing wraparound failure using an XID burning C function) that
autovacuum seemed to totally correct the problem, even when the system
had already crossed xidStopLimit - it came back on its own. I wasn't
completely sure of how robust this effect was, though.

It seemed worth noting this in comments above
should_attempt_truncation(). Pushed a commit to do that just now.

Thanks
--
Peter Geoghegan

#73John Naylor
john.naylor@enterprisedb.com
In reply to: Peter Geoghegan (#72)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 6:17 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Feb 15, 2022 at 9:28 AM Peter Geoghegan <pg@bowt.ie> wrote:

I did notice from my own testing of the failsafe (by artificially
inducing wraparound failure using an XID burning C function) that
autovacuum seemed to totally correct the problem, even when the system
had already crossed xidStopLimit - it came back on its own. I wasn't
completely sure of how robust this effect was, though.

I'll put some effort in finding any way that it might not be robust.
After that, changing the message and docs is trivial.

It seemed worth noting this in comments above
should_attempt_truncation(). Pushed a commit to do that just now.

Thanks for that.

--
John Naylor
EDB: http://www.enterprisedb.com

In reply to: John Naylor (#73)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 12:43 AM John Naylor
<john.naylor@enterprisedb.com> wrote:

I'll put some effort in finding any way that it might not be robust.
After that, changing the message and docs is trivial.

Great, thanks John.

--
Peter Geoghegan

#75Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Geoghegan (#71)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 2:29 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Mon, Feb 14, 2022 at 10:04 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

Well, the point of inventing this new vacuum mode was because I
thought that upon reaching xidStopLimit, we couldn't issue commands,
period, under the postmaster. If it was easier to get a test instance
to xidStopLimit, I certainly would have discovered this sooner.

I did notice from my own testing of the failsafe (by artificially
inducing wraparound failure using an XID burning C function) that
autovacuum seemed to totally correct the problem, even when the system
had already crossed xidStopLimit - it came back on its own. I wasn't
completely sure of how robust this effect was, though.

FYI, I've tested the situation that I assumed autovacuum can not
correct the problem; when the system had already crossed xidStopLimit,
it keeps failing to vacuum on tables that appear in the front of the
list and have sufficient garbage to trigger the truncation but are not
older than the failsafe limit. But contrary to my assumption, it did
correct the problem since autovacuum continues to the next table in
the list even after an error. This probably means that autovacuum
eventually succeeds to process all tables that trigger the failsafe
mode, ensuring advancing datfrozenxid, which is great.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In reply to: Masahiko Sawada (#75)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 8:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

FYI, I've tested the situation that I assumed autovacuum can not
correct the problem; when the system had already crossed xidStopLimit,
it keeps failing to vacuum on tables that appear in the front of the
list and have sufficient garbage to trigger the truncation but are not
older than the failsafe limit. But contrary to my assumption, it did
correct the problem since autovacuum continues to the next table in
the list even after an error. This probably means that autovacuum
eventually succeeds to process all tables that trigger the failsafe
mode, ensuring advancing datfrozenxid, which is great.

Right; it seems as if the situation is much improved, even when the
failsafe didn't prevent the system from going over xidStopLimit. If
autovacuum alone can bring the system back to a normal state as soon
as possible, without a human needing to do anything special, then
clearly the general risk is much smaller. Even this worst case
scenario where "the failsafe has failed" is not so bad anymore, in
practice. I don't think that it really matters if some concurrent
non-emergency VACUUMs fail when attempting to truncate the table (it's
no worse than ANALYZE failing, for example).

Good news!

--
Peter Geoghegan

#77Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#76)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 12:51 PM Peter Geoghegan <pg@bowt.ie> wrote:

Good news!

+1. But I think we might want to try to write documentation around
this. We should explicitly tell people NOT to use single-user mode,
because that stupid message has been there for a long time and a lot
of people have probably internalized it by now. And we should also
tell them that they SHOULD check for prepared transactions, old
replication slots, etc.

--
Robert Haas
EDB: http://www.enterprisedb.com

In reply to: Robert Haas (#77)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 9:56 AM Robert Haas <robertmhaas@gmail.com> wrote:

+1. But I think we might want to try to write documentation around
this. We should explicitly tell people NOT to use single-user mode,
because that stupid message has been there for a long time and a lot
of people have probably internalized it by now. And we should also
tell them that they SHOULD check for prepared transactions, old
replication slots, etc.

Absolutely -- couldn't agree more. Do you think it's worth targeting
14 here, or just HEAD?

I'm pretty sure that some people believe that wraparound can cause
actual data corruption, in part because of the way the docs present
the information. The system won't do that, of course (precisely
because of this xidStopLimit behavior). The docs make it all sound
absolutely terrifying, which doesn't seem proportionate to me (at
least not with this stuff in place, maybe not ever).

--
Peter Geoghegan

#79Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#78)
Re: do only critical work during single-user vacuum?

Hi,

On 2022-02-16 10:14:19 -0800, Peter Geoghegan wrote:

Absolutely -- couldn't agree more. Do you think it's worth targeting
14 here, or just HEAD?

I'd go for HEAD first, but wouldn't protest against 14.

I'm pretty sure that some people believe that wraparound can cause
actual data corruption

Well, historically they're not wrong. And we've enough things stored in 32bit
counters that I'd be surprised if we didn't have more wraparound issues. Of
course that's not related to anti-wrap vacuums...

Greetings,

Andres Freund

In reply to: Andres Freund (#79)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 10:18 AM Andres Freund <andres@anarazel.de> wrote:

I'm pretty sure that some people believe that wraparound can cause
actual data corruption

Well, historically they're not wrong.

True, but the most recent version where that's actually possible is
PostgreSQL 8.0, which was released in early 2005. That was a very
different time for the project. I don't think that people believe that
wraparound can cause data corruption because they remember a time when
it really could. It seems like general confusion to me (which could
have been avoided).

At a minimum, we ought to be very clear on the fact that Postgres
isn't going to just let your database become corrupt in some more or
less predictable way. The xidStopLimit thing is pretty bad, but it's
still much better than that.

--
Peter Geoghegan

In reply to: Peter Geoghegan (#80)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 10:27 AM Peter Geoghegan <pg@bowt.ie> wrote:

True, but the most recent version where that's actually possible is
PostgreSQL 8.0, which was released in early 2005.

It just occurred to me that the main historic reason for the single
user mode advice was the lack of virtual XIDs. The commit that added
the xidStopLimit behavior (commit 60b2444cc3) came a couple of years
before the introduction of virtual transaction IDs (in commit
295e63983d). AFAICT, the advice about single-user mode was added at a
time where exceeding xidStopLimit caused the system to grind to a halt
completely -- even trivial SELECTs would have failed once Postgres 8.1
crossed the xidStopLimit limit.

It seems as if the advice about single user mode persisted for no
great reason at all. Technically there were some remaining reasons to
keep it around (like the truncation thing), but overall these
secondary reasons could have been addressed much sooner if somebody
had thought about it.

--
Peter Geoghegan

#82Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#80)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 1:28 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Wed, Feb 16, 2022 at 10:18 AM Andres Freund <andres@anarazel.de> wrote:

I'm pretty sure that some people believe that wraparound can cause
actual data corruption

Well, historically they're not wrong.

True, but the most recent version where that's actually possible is
PostgreSQL 8.0, which was released in early 2005. That was a very
different time for the project. I don't think that people believe that
wraparound can cause data corruption because they remember a time when
it really could. It seems like general confusion to me (which could
have been avoided).

No, I think it's PostgreSQL 13, because before the vacuum failsafe
thing you could end up truncating enough tables during vacuum
operations to actually wrap around.

And even in 14+, you can still do that, if you use single user mode.

--
Robert Haas
EDB: http://www.enterprisedb.com

#83Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#81)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 2:18 PM Peter Geoghegan <pg@bowt.ie> wrote:

It seems as if the advice about single user mode persisted for no
great reason at all. Technically there were some remaining reasons to
keep it around (like the truncation thing), but overall these
secondary reasons could have been addressed much sooner if somebody
had thought about it.

I raised it on the list a couple of years ago, actually. I think I had
a bit of difficulty convincing people that it wasn't something we had
to keep recommending.

--
Robert Haas
EDB: http://www.enterprisedb.com

In reply to: Robert Haas (#82)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 12:11 PM Robert Haas <robertmhaas@gmail.com> wrote:

No, I think it's PostgreSQL 13, because before the vacuum failsafe
thing you could end up truncating enough tables during vacuum
operations to actually wrap around.

Why wouldn't the xidStopLimit thing prevent actual incorrect answers
to queries, even on Postgres 13? Why wouldn't that be enough, even if
we make the most pessimistic possible assumptions?

To me it looks like it's physically impossible to advance an XID past
xidStopLimit, unless you're in single user mode. Does your concern
have something to do with the actual xidStopLimit value in shared
memory not being sufficiently protective in practice?

And even in 14+, you can still do that, if you use single user mode.

So what you're saying is that there is *some* reason for vacuuming in
single user mode after all, and so we should keep the advice about
that in place? :-)

--
Peter Geoghegan

#85Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#84)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 3:21 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Wed, Feb 16, 2022 at 12:11 PM Robert Haas <robertmhaas@gmail.com> wrote:

No, I think it's PostgreSQL 13, because before the vacuum failsafe
thing you could end up truncating enough tables during vacuum
operations to actually wrap around.

Why wouldn't the xidStopLimit thing prevent actual incorrect answers
to queries, even on Postgres 13? Why wouldn't that be enough, even if
we make the most pessimistic possible assumptions?

To me it looks like it's physically impossible to advance an XID past
xidStopLimit, unless you're in single user mode. Does your concern
have something to do with the actual xidStopLimit value in shared
memory not being sufficiently protective in practice?

No, what I'm saying is that people running older versions routinely
run VACUUM in single-user mode because otherwise it fails due to the
truncation issue. But once they go into single-user mode they lose
protection.

And even in 14+, you can still do that, if you use single user mode.

So what you're saying is that there is *some* reason for vacuuming in
single user mode after all, and so we should keep the advice about
that in place? :-)

We could perhaps amend the text slightly, e.g. "This is a great idea
if you like pain."

--
Robert Haas
EDB: http://www.enterprisedb.com

In reply to: Robert Haas (#85)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 1:04 PM Robert Haas <robertmhaas@gmail.com> wrote:

No, what I'm saying is that people running older versions routinely
run VACUUM in single-user mode because otherwise it fails due to the
truncation issue. But once they go into single-user mode they lose
protection.

Seems logically consistent, but absurd. A Catch-22 situation if ever
there was one.

There might well be an element of survivorship bias here. Most VACUUM
operations won't ever attempt truncation (speaking very generally).
How many times might (say) the customer that John mentioned have
accidentally gone over xidStopLimit for just a little while, before
the situation corrected itself without anybody noticing? A lot of
applications are very read-heavy, or aren't very well monitored.

Eventually (maybe after several years of this), some laggard
anti-wraparound vacuum needs to truncate the relation, due to random
happenstance. Once that happens, the situation is bound to come to a
head. The user is bound to finally notice that the system has gone
over xidStopLimit, because there is no longer any way for the problem
to go away on its own.

--
Peter Geoghegan

#87Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#86)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 4:48 PM Peter Geoghegan <pg@bowt.ie> wrote:

There might well be an element of survivorship bias here. Most VACUUM
operations won't ever attempt truncation (speaking very generally).
How many times might (say) the customer that John mentioned have
accidentally gone over xidStopLimit for just a little while, before
the situation corrected itself without anybody noticing? A lot of
applications are very read-heavy, or aren't very well monitored.

Eventually (maybe after several years of this), some laggard
anti-wraparound vacuum needs to truncate the relation, due to random
happenstance. Once that happens, the situation is bound to come to a
head. The user is bound to finally notice that the system has gone
over xidStopLimit, because there is no longer any way for the problem
to go away on its own.

I think that's not really what is happening, at least not in the cases
that typically are brought to my attention. In those cases, the
typical pattern is:

1. Everything is fine.

2. Then the user forgets about a prepared transaction or a replication
slot, or leaves a transaction open forever, or has some kind of
corruption that causes VACUUM to fall over and die every time it tries
to run.

3. The user has no idea that VACUUM is no longer advanced
relfrozenxid. Time passes.

4. Eventually the system stops being willing to allocate new XIDs. It
tells the user to go to single user mode. So they do.

5. None of the tables in the database have been vacuumed in a long
time. There are a million XIDs left. How many of the tables in the
database are going to be truncate when they are vacuumed and burn one
of the remaining XIDs? Anybody's guess, could be all or none.

6. Sometimes the user decides to run VACUUM FULL instead of plain
VACUUM because it sounds better.

--
Robert Haas
EDB: http://www.enterprisedb.com

In reply to: Robert Haas (#87)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 6:56 PM Robert Haas <robertmhaas@gmail.com> wrote:

I think that's not really what is happening, at least not in the cases
that typically are brought to my attention. In those cases, the
typical pattern is:

5. None of the tables in the database have been vacuumed in a long
time. There are a million XIDs left. How many of the tables in the
database are going to be truncate when they are vacuumed and burn one
of the remaining XIDs? Anybody's guess, could be all or none.

I have to admit that this sounds way more plausible than my
speculative scenario. I haven't been involved in any kind of support
case with a customer in a *long* time, though (not by choice, mind
you).

6. Sometimes the user decides to run VACUUM FULL instead of plain
VACUUM because it sounds better.

It's a pity that the name suggests otherwise. If only we'd named it
something that suggests "option of last resort". Oh well.

--
Peter Geoghegan

#89Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#88)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 10:08 PM Peter Geoghegan <pg@bowt.ie> wrote:

6. Sometimes the user decides to run VACUUM FULL instead of plain
VACUUM because it sounds better.

It's a pity that the name suggests otherwise. If only we'd named it
something that suggests "option of last resort". Oh well.

Unfortunately, such a name would also be misleading, just in a
different way. It is really not at all difficult to have a workload
that demands routine use of VACUUM FULL. I suppose technically it is a
last resort in such situations, because what would you resort to after
trying VF? But it's not like some kind of in-emergency-break-glass
kind of thing, it's just the right tool for the job.

Some years ago I worked with a customer who had a table that was being
used as an update-heavy queue. I don't remember all the details any
more, but I think the general pattern was that they would insert rows,
update them A TON, and then eventually delete them. And they got
really bad table bloat, because vacuum just wasn't running often
enough to keep up. Reducing autovacuum_naptime to 15s fixed the issue,
fortunately, but I was initially thinking that it might be completely
unfixable, because what if they'd also been running a series
4-minute-long reporting queries in a loop on some other table? More
frequent vacuuming wouldn't have helped then, because xmin would not
have been able to advance until the current instance of the reporting
query finished, and then vacuuming more often would have done nothing
useful. I think, anyway.

That's just one example that comes to mind. I think there are lots of
workloads where it's simply not possible to make VACUUM keep up.

--
Robert Haas
EDB: http://www.enterprisedb.com

#90Noah Misch
noah@leadboat.com
In reply to: John Naylor (#73)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 03:43:12PM +0700, John Naylor wrote:

On Wed, Feb 16, 2022 at 6:17 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Feb 15, 2022 at 9:28 AM Peter Geoghegan <pg@bowt.ie> wrote:

I did notice from my own testing of the failsafe (by artificially
inducing wraparound failure using an XID burning C function) that
autovacuum seemed to totally correct the problem, even when the system
had already crossed xidStopLimit - it came back on its own. I wasn't
completely sure of how robust this effect was, though.

I'll put some effort in finding any way that it might not be robust.

A VACUUM may create a not-trivially-bounded number of multixacts via
FreezeMultiXactId(). In a cluster at multiStopLimit, completing VACUUM
without error needs preparation something like:

1. Kill each XID that might appear in a multixact.
2. Resolve each prepared transaction that might appear in a multixact.
3. Run VACUUM. At this point, multiStopLimit is blocking new multixacts from
other commands, and the lack of running multixact members removes the need
for FreezeMultiXactId() to create multixacts.

Adding to the badness of single-user mode so well described upthread, one can
enter it without doing (2) and then wrap the nextMXact counter.

#91Andres Freund
andres@anarazel.de
In reply to: Noah Misch (#90)
Re: do only critical work during single-user vacuum?

Hi,

On 2022-02-19 20:57:57 -0800, Noah Misch wrote:

On Wed, Feb 16, 2022 at 03:43:12PM +0700, John Naylor wrote:

On Wed, Feb 16, 2022 at 6:17 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Feb 15, 2022 at 9:28 AM Peter Geoghegan <pg@bowt.ie> wrote:

I did notice from my own testing of the failsafe (by artificially
inducing wraparound failure using an XID burning C function) that
autovacuum seemed to totally correct the problem, even when the system
had already crossed xidStopLimit - it came back on its own. I wasn't
completely sure of how robust this effect was, though.

I'll put some effort in finding any way that it might not be robust.

A VACUUM may create a not-trivially-bounded number of multixacts via
FreezeMultiXactId(). In a cluster at multiStopLimit, completing VACUUM
without error needs preparation something like:

1. Kill each XID that might appear in a multixact.
2. Resolve each prepared transaction that might appear in a multixact.
3. Run VACUUM. At this point, multiStopLimit is blocking new multixacts from
other commands, and the lack of running multixact members removes the need
for FreezeMultiXactId() to create multixacts.

Adding to the badness of single-user mode so well described upthread, one can
enter it without doing (2) and then wrap the nextMXact counter.

If we collected the information along the lines of I proposed in the second half of
/messages/by-id/20220204013539.qdegpqzvayq3d4y2@alap3.anarazel.de
we should be able to handle such cases more intelligently, I think?

We could e.g. add an error if FreezeMultiXactId() needs to create a new
multixact for a far-in-the-past xid. That's not great, of course, but if we
include the precise cause (pid of backend / prepared xact name / slot name /
...) necessitating creating a new multi, it'd still be a significant
improvement over the status quo.

Greetings,

Andres Freund

#92Noah Misch
noah@leadboat.com
In reply to: Andres Freund (#91)
Re: do only critical work during single-user vacuum?

On Sun, Feb 20, 2022 at 02:15:37PM -0800, Andres Freund wrote:

On 2022-02-19 20:57:57 -0800, Noah Misch wrote:

On Wed, Feb 16, 2022 at 03:43:12PM +0700, John Naylor wrote:

On Wed, Feb 16, 2022 at 6:17 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Feb 15, 2022 at 9:28 AM Peter Geoghegan <pg@bowt.ie> wrote:

I did notice from my own testing of the failsafe (by artificially
inducing wraparound failure using an XID burning C function) that
autovacuum seemed to totally correct the problem, even when the system
had already crossed xidStopLimit - it came back on its own. I wasn't
completely sure of how robust this effect was, though.

I'll put some effort in finding any way that it might not be robust.

A VACUUM may create a not-trivially-bounded number of multixacts via
FreezeMultiXactId(). In a cluster at multiStopLimit, completing VACUUM
without error needs preparation something like:

1. Kill each XID that might appear in a multixact.
2. Resolve each prepared transaction that might appear in a multixact.
3. Run VACUUM. At this point, multiStopLimit is blocking new multixacts from
other commands, and the lack of running multixact members removes the need
for FreezeMultiXactId() to create multixacts.

Adding to the badness of single-user mode so well described upthread, one can
enter it without doing (2) and then wrap the nextMXact counter.

If we collected the information along the lines of I proposed in the second half of
/messages/by-id/20220204013539.qdegpqzvayq3d4y2@alap3.anarazel.de
we should be able to handle such cases more intelligently, I think?

We could e.g. add an error if FreezeMultiXactId() needs to create a new
multixact for a far-in-the-past xid. That's not great, of course, but if we
include the precise cause (pid of backend / prepared xact name / slot name /
...) necessitating creating a new multi, it'd still be a significant
improvement over the status quo.

Yes, exactly.

In reply to: Andres Freund (#91)
Re: do only critical work during single-user vacuum?

On Sun, Feb 20, 2022 at 2:15 PM Andres Freund <andres@anarazel.de> wrote:

We could e.g. add an error if FreezeMultiXactId() needs to create a new
multixact for a far-in-the-past xid. That's not great, of course, but if we
include the precise cause (pid of backend / prepared xact name / slot name /
...) necessitating creating a new multi, it'd still be a significant
improvement over the status quo.

There are databases that have large tables (that grow and grow), and
also have tables that need to allocate many MultiXacts (or lots of
member space, at least). I strongly suspect that these are seldom the
same table, though.

The current inability of the system to recognize this difference seems
like it might be a real problem. Why should big tables that contain no
actual MultiXactIds at all (and never contained even one) get early
anti-wraparound VACUUMs, specifically focussed on averting MultiXact
wraparound? I'm hoping that the patch that adds smarter tracking of
final relfrozenxid/relminmxid values during VACUUM makes this less of
a problem automatically.

--
Peter Geoghegan

In reply to: John Naylor (#73)
1 attachment(s)
Re: do only critical work during single-user vacuum?

On Wed, Feb 16, 2022 at 12:43 AM John Naylor
<john.naylor@enterprisedb.com> wrote:

I'll put some effort in finding any way that it might not be robust.
After that, changing the message and docs is trivial.

It would be great to be able to totally drop the idea of using
single-user mode before Postgres 15 feature freeze. How's that going?

I suggest that we apply the following patch as part of that work. It
adds one last final failsafe check at the point that VACUUM makes a
final decision on rel truncation.

It seems unlikely that the patch will ever make the crucial difference
in a wraparound scenario -- in practice it's very likely that we'd
have triggered the wraparound at that point if we run into trouble
with the target rel's relfrozenxid age. And even if it does get to
that point, it would still be possible for the autovacuum launcher to
launch another autovacuum -- this time around we will avoid rel
truncation, restoring the system to normal operation (i.e. no more
xidStopLimit state).

On the other hand it's possible that lazy_cleanup_all_indexes() will
take a very long time to run, and it runs after the current final
failsafe check. An index AM's amvacuumcleanup() routine can take a
long time to run sometimes, especially with GIN indexes. And so it's
just about possible that we won't have triggered the failsafe by the
time lazy_cleanup_all_indexes() is called, which then spends a long
time doing index cleanup -- long enough for the system to reach
xidStopLimit due to the target rel's relfrozenxid age crossing the
crucial xidStopLimit crossover point.

This patch makes this problem scenario virtually impossible. Right now
I'm only prepared to say it's very unlikely. I don't see a reason to
take any chances, though.

--
Peter Geoghegan

Attachments:

v1-0001-Perform-final-failsafe-check-before-truncation.patchapplication/octet-stream; name=v1-0001-Perform-final-failsafe-check-before-truncation.patchDownload
From 72aebc7854d24fb8466b45536e41cc3d768cd445 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Tue, 15 Mar 2022 14:24:12 -0700
Subject: [PATCH v1] Perform final failsafe check before truncation.

---
 src/backend/access/heap/vacuumlazy.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 87ab7775a..73336560b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2822,7 +2822,13 @@ should_attempt_truncation(LVRelState *vacrel)
 	if (possibly_freeable > 0 &&
 		(possibly_freeable >= REL_TRUNCATE_MINIMUM ||
 		 possibly_freeable >= vacrel->rel_pages / REL_TRUNCATE_FRACTION))
+	{
+		/* Perform a final failsafe check out of an abundance of caution */
+		if (lazy_check_wraparound_failsafe(vacrel))
+			return false;
+
 		return true;
+	}
 
 	return false;
 }
-- 
2.30.2

#95John Naylor
john.naylor@enterprisedb.com
In reply to: Peter Geoghegan (#94)
Re: do only critical work during single-user vacuum?

On Wed, Mar 16, 2022 at 4:48 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Wed, Feb 16, 2022 at 12:43 AM John Naylor
<john.naylor@enterprisedb.com> wrote:

I'll put some effort in finding any way that it might not be robust.
After that, changing the message and docs is trivial.

It would be great to be able to totally drop the idea of using
single-user mode before Postgres 15 feature freeze. How's that going?

Unfortunately, I was distracted from this work for a time, and just as
I had intended to focus on it during March, I was out sick for 2-3
weeks. I gather from subsequent discussion that a full solution goes
beyond just a new warning message and documentation. Either way I'm
not quite prepared to address this in time for v15.

I suggest that we apply the following patch as part of that work. It
adds one last final failsafe check at the point that VACUUM makes a
final decision on rel truncation.

That is one thing that was in the back of my mind, and it seems
reasonable to me.

--
John Naylor
EDB: http://www.enterprisedb.com

#96Justin Pryzby
pryzby@telsasoft.com
In reply to: Robert Haas (#52)
Re: do only critical work during single-user vacuum?

On Thu, Feb 03, 2022 at 01:05:50PM -0500, Robert Haas wrote:

On Thu, Dec 9, 2021 at 8:56 PM Andres Freund <andres@anarazel.de> wrote:

I think we should move *away* from single user mode, rather than the
opposite. It's a substantial code burden and it's hard to use.

Yes. This thread seems to be largely devoted to the topic of making
single-user vacuum work better, but I don't see anyone asking the
question "why do we have a message that tells people to vacuum in
single user mode in the first place?". It's basically bad advice,

The correct thing to do would be to remove
the hint as bad advice that we never should have offered in the first
place. And so here. We should not try to make vacuum in single
user-mode work better or differently, or at least that shouldn't be
our primary objective. We should just stop telling people to do it. We
should probably add messages and documentation *discouraging* the use
of single user mode for recovering from wraparound trouble, exactly
the opposite of what we do now. There's nothing we can do in
single-user mode that we can't do equally well in multi-user mode. If
people try to fix wraparound problems in multi-user mode, they still
have read-only access to their database, they can use parallelism,
they can use command line utilities like vacuumdb, and they can use
psql which has line editing and allows remote access and is a way
nicer user experience than running postgres --single. We need a really
compelling reason to tell people to give up all those advantages, and
there is no such reason. It makes just as much sense as telling people
to deal with wraparound problems by angering a live anaconda.

By chance, I came across this prior thread which advocated the same thing in a
initially (rather than indirectly as in this year's thread).

/messages/by-id/CAMT0RQTmRj_Egtmre6fbiMA9E2hM3BsLULiV8W00stwa3URvzA@mail.gmail.com
|We should stop telling users to "vacuum that database in single-user mode"

In reply to: Justin Pryzby (#96)
Re: do only critical work during single-user vacuum?

On Mon, Jun 27, 2022 at 12:36 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

By chance, I came across this prior thread which advocated the same thing in a
initially (rather than indirectly as in this year's thread).

Revisiting this topic reminded me that PostgreSQL 14 (the first
version that had the wraparound failsafe mechanism controlled by
vacuum_failsafe_age) has been a stable release for 9 months now. As of
today I am still not aware of even one user that ran into the failsafe
mechanism in production. It might well have happened by now, of
course, but I am not aware of any specific case. Perhaps this will
change soon enough -- maybe somebody else will read this and enlighten
me.

To me the fact that the failsafe seems to seldom kick-in in practice
suggests something about workload characteristics in general: that it
isn't all that common for users to try to get away with putting off
freezing until a table attains an age that is significantly above 1
billion XIDs.

When people talk about things like 64-bit XIDs, I tend to wonder: if 2
billion XIDs wasn't enough, why should 4 billion or 8 billion be
enough? *Maybe* the system can do better by getting even further into
debt than it can today, but you can't expect to avoid freezing
altogether (without significant work elsewhere). My general sense is
that freezing isn't a particularly good thing to try to do lazily --
even if we ignore the risk of an eventual wraparound failure.

--
Peter Geoghegan