Testing autovacuum wraparound (including failsafe)

Started by Andres Freund over 4 years ago, 59 messages
#1Andres Freund
andres@anarazel.de

Hi,

I started to write a test for $Subject, which I think we sorely need.

Currently my approach is to:
- start a cluster, create a few tables with test data
- acquire SHARE UPDATE EXCLUSIVE in a prepared transaction, to prevent
autovacuum from doing anything
- cause dead tuples to exist
- restart
- run pg_resetwal -x 2000027648
- do things like acquiring pins on pages that block vacuum from progressing
- commit prepared transaction
- wait for template0, template1 datfrozenxid to increase
- wait for relfrozenxid for most relations in postgres to increase
- release buffer pin
- wait for postgres datfrozenxid to increase

So far so good. But I've encountered a few things that stand in the way of
enabling such a test by default:

1) During startup StartupSUBTRANS() zeroes out all pages between
oldestActiveXID and nextXid. That takes 8s on my workstation, but only
because I have plenty of memory - pg_subtrans ends up 14GB as I currently do
the test. Clearly not something we could do on the BF.

2) FAILSAFE_MIN_PAGES is 4GB - which seems to make it infeasible to test the
failsafe mode, we can't really create 4GB relations on the BF. While
writing the tests I've lowered this to 4MB...

3) pg_resetwal -x requires carefully choosing an xid: It needs to be the
first xid on a clog page. It's not hard to determine which xids are but it
depends on BLCKSZ and a few constants in clog.c. I've for now hardcoded a
value appropriate for 8KB, but ...
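A minimal sketch of the arithmetic behind item 3), assuming the clog layout of two status bits per transaction (four transactions per byte) from clog.c. The helper names are hypothetical, not PostgreSQL functions:

```c
#include <assert.h>
#include <stdint.h>

/* As in clog.c: two status bits per transaction, so four per byte. */
#define CLOG_XACTS_PER_BYTE 4

/* Number of xids whose status fits on one clog page of blcksz bytes. */
uint32_t
clog_xacts_per_page(uint32_t blcksz)
{
    assert(blcksz > 0);
    return blcksz * CLOG_XACTS_PER_BYTE;
}

/* Round xid down to the first xid stored on its clog page. */
uint32_t
first_xid_on_clog_page(uint32_t xid, uint32_t blcksz)
{
    return xid - (xid % clog_xacts_per_page(blcksz));
}
```

For a BLCKSZ of 8192 a clog page holds 32768 xids, and 2000027648 is 61036 * 32768, so the value passed to pg_resetwal -x above sits exactly on a page boundary.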

I have 2 1/2 ideas about addressing 1):

- We could expose functionality to advance nextXid to a future value at
runtime, without filling in clog/subtrans pages. Would probably have to live
in varsup.c and be exposed via regress.so or such?

- The only reason StartupSUBTRANS() does that work is because of the prepared
transaction holding back oldestActiveXID. That transaction in turn exists to
prevent autovacuum from doing anything before we do test setup
steps.

Perhaps it'd be sufficient to set autovacuum_naptime really high initially,
perform the test setup, set naptime to something lower, reload config. But
I'm worried that might not be reliable: If something ends up allocating an
xid we'd potentially reach the path in GetNewTransactionId() that wakes up the
launcher? But probably there wouldn't be anything doing so?

Another aspect that might not make this a good choice is that it actually
seems relevant to be able to test cases where there are very old still
running transactions...

- As a variant of the previous idea: If that turns out to be unreliable, we
could instead set nextxid, start in single user mode, create a blocking 2PC
transaction, start normally. Because there's no old active xid we'd not run
into the StartupSUBTRANS problem.

For 2), I don't really have a better idea than making that configurable
somehow?

3) is probably tolerable for now, we could skip the test if BLCKSZ isn't 8KB,
or we could hardcode the calculation for different block sizes.

I noticed one minor bug that's likely new:

2021-04-23 13:32:30.899 PDT [2027738] LOG: automatic aggressive vacuum to prevent wraparound of table "postgres.public.small_trunc": index scans: 1
pages: 400 removed, 28 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 14000 removed, 1000 remain, 0 are dead but not yet removable, oldest xmin: 2000027651
buffer usage: 735 hits, 1262 misses, 874 dirtied
index scan needed: 401 pages from table (1432.14% of total) had 14000 dead item identifiers removed
index "small_trunc_pkey": pages: 43 in total, 37 newly deleted, 37 currently deleted, 0 reusable
avg read rate: 559.048 MB/s, avg write rate: 387.170 MB/s
system usage: CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s
WAL usage: 1809 records, 474 full page images, 3977538 bytes

'1432.14% of total' - looks like removed pages need to be added before the
percentage calculation?
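To make the suspected cause concrete, here is a small sketch using the numbers from the log above. It assumes the reported figure divides the pages with dead items by the page count after truncation; the function names are hypothetical and this is not the vacuumlazy.c code:

```c
#include <assert.h>

/* Percentage as apparently reported: relative to the pages that remain. */
double
percent_of_remaining(unsigned lpdead_pages, unsigned remaining_pages)
{
    assert(remaining_pages > 0);
    return 100.0 * lpdead_pages / remaining_pages;
}

/* Percentage relative to the table size before pages were removed. */
double
percent_of_original(unsigned lpdead_pages, unsigned remaining_pages,
                    unsigned removed_pages)
{
    unsigned    original_pages = remaining_pages + removed_pages;

    assert(original_pages > 0);
    return 100.0 * lpdead_pages / original_pages;
}
```

With 401 pages holding dead items, 28 pages remaining and 400 removed, the first function gives roughly 1432.14 and the second roughly 93.69.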

Greetings,

Andres Freund

#2Justin Pryzby
pryzby@telsasoft.com
In reply to: Andres Freund (#1)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Apr 23, 2021 at 01:43:06PM -0700, Andres Freund wrote:

2) FAILSAFE_MIN_PAGES is 4GB - which seems to make it infeasible to test the
failsafe mode, we can't really create 4GB relations on the BF. While
writing the tests I've lowered this to 4MB...

For 2), I don't really have a better idea than making that configurable
somehow?

Does it work to shut down the cluster and create the .0,.1,.2,.3 segments of a
new, empty relation with zero blocks using something like truncate -s 1G ?

--
Justin

#3Peter Geoghegan
pg@bowt.ie
In reply to: Andres Freund (#1)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Apr 23, 2021 at 1:43 PM Andres Freund <andres@anarazel.de> wrote:

I started to write a test for $Subject, which I think we sorely need.

+1

Currently my approach is to:
- start a cluster, create a few tables with test data
- acquire SHARE UPDATE EXCLUSIVE in a prepared transaction, to prevent
autovacuum from doing anything
- cause dead tuples to exist
- restart
- run pg_resetwal -x 2000027648
- do things like acquiring pins on pages that block vacuum from progressing
- commit prepared transaction
- wait for template0, template1 datfrozenxid to increase
- wait for relfrozenxid for most relations in postgres to increase
- release buffer pin
- wait for postgres datfrozenxid to increase

Just having a standard-ish way to do stress testing like this would
add something.

2) FAILSAFE_MIN_PAGES is 4GB - which seems to make it infeasible to test the
failsafe mode, we can't really create 4GB relations on the BF. While
writing the tests I've lowered this to 4MB...

The only reason that I chose 4GB for FAILSAFE_MIN_PAGES is because the
related VACUUM_FSM_EVERY_PAGES constant was 8GB -- the latter limits
how often we'll consider the failsafe in the single-pass/no-indexes
case.
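For scale, a minimal sketch of how these byte thresholds translate into page counts. The helper name is hypothetical; the division mirrors the way such constants are defined in terms of BLCKSZ:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Convert a size threshold in bytes to a page count for a block size. */
BlockNumber
threshold_in_pages(uint64_t threshold_bytes, uint32_t blcksz)
{
    assert(blcksz > 0);
    return (BlockNumber) (threshold_bytes / blcksz);
}
```

At the default 8KB block size, 4GB corresponds to 524288 pages and 8GB to 1048576 pages, while the 4MB value used during test development is only 512 pages.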

I see no reason why it cannot be changed now. VACUUM_FSM_EVERY_PAGES
also frustrates FSM testing in the single-pass case in about the same
way, so maybe that should be considered as well? Note that the FSM
handling for the single pass case is actually a bit different to the
two pass/has-indexes case, since the single pass case calls
lazy_vacuum_heap_page() directly in its first and only pass over the
heap (that's the whole point of having it of course).

3) pg_resetwal -x requires carefully choosing an xid: It needs to be the
first xid on a clog page. It's not hard to determine which xids are but it
depends on BLCKSZ and a few constants in clog.c. I've for now hardcoded a
value appropriate for 8KB, but ...

Ugh.

For 2), I don't really have a better idea than making that configurable
somehow?

That could make sense as a developer/testing option, I suppose. I just
doubt that it makes sense as anything else.

2021-04-23 13:32:30.899 PDT [2027738] LOG: automatic aggressive vacuum to prevent wraparound of table "postgres.public.small_trunc": index scans: 1
pages: 400 removed, 28 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 14000 removed, 1000 remain, 0 are dead but not yet removable, oldest xmin: 2000027651
buffer usage: 735 hits, 1262 misses, 874 dirtied
index scan needed: 401 pages from table (1432.14% of total) had 14000 dead item identifiers removed
index "small_trunc_pkey": pages: 43 in total, 37 newly deleted, 37 currently deleted, 0 reusable
avg read rate: 559.048 MB/s, avg write rate: 387.170 MB/s
system usage: CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s
WAL usage: 1809 records, 474 full page images, 3977538 bytes

'1432.14% of total' - looks like removed pages need to be added before the
percentage calculation?

Clearly this needs to account for removed heap pages in order to
consistently express the percentage of pages with LP_DEAD items in
terms of a percentage of the original table size. I can fix this
shortly.

--
Peter Geoghegan

#4Andres Freund
andres@anarazel.de
In reply to: Justin Pryzby (#2)
Re: Testing autovacuum wraparound (including failsafe)

Hi,

On 2021-04-23 18:08:12 -0500, Justin Pryzby wrote:

On Fri, Apr 23, 2021 at 01:43:06PM -0700, Andres Freund wrote:

2) FAILSAFE_MIN_PAGES is 4GB - which seems to make it infeasible to test the
failsafe mode, we can't really create 4GB relations on the BF. While
writing the tests I've lowered this to 4MB...

For 2), I don't really have a better idea than making that configurable
somehow?

Does it work to shut down the cluster and create the .0,.1,.2,.3 segments of a
new, empty relation with zero blocks using something like truncate -s 1G ?

I'd like this to be portable to at least windows - I don't know how well
that deals with sparse files. But the bigger issue is that that IIRC
will trigger vacuum to try to initialize all those pages, which will
then force all that space to be allocated anyway...

Greetings,

Andres Freund

#5Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#3)
Re: Testing autovacuum wraparound (including failsafe)

Hi,

On 2021-04-23 16:12:33 -0700, Peter Geoghegan wrote:

The only reason that I chose 4GB for FAILSAFE_MIN_PAGES is because the
related VACUUM_FSM_EVERY_PAGES constant was 8GB -- the latter limits
how often we'll consider the failsafe in the single-pass/no-indexes
case.

I don't really understand why it makes sense to tie FAILSAFE_MIN_PAGES
and VACUUM_FSM_EVERY_PAGES together? They seem pretty independent to me?

I see no reason why it cannot be changed now. VACUUM_FSM_EVERY_PAGES
also frustrates FSM testing in the single-pass case in about the same
way, so maybe that should be considered as well? Note that the FSM
handling for the single pass case is actually a bit different to the
two pass/has-indexes case, since the single pass case calls
lazy_vacuum_heap_page() directly in its first and only pass over the
heap (that's the whole point of having it of course).

I'm not opposed to lowering VACUUM_FSM_EVERY_PAGES (the costs don't seem
all that high compared to vacuuming?), but I don't think there's as
clear a need for testing around that as there is around wraparound.

The failsafe mode affects the table scan itself by disabling cost
limiting. As far as I can see the ways it triggers for the table scan (vs
truncation or index processing) are:

1) Before vacuuming starts, for heap phases and indexes, if already
necessary at that point
2) For a table with indexes, before/after each index vacuum, if now
necessary
3) On a table without indexes, every 8GB, iff there are dead tuples, if now necessary

Why would we want to trigger the failsafe mode during a scan of a table
with dead tuples and no indexes, but not on a table without dead tuples
or with indexes but fewer than m_w_m dead tuples? That makes little
sense to me.

It seems that for the no-index case the warning message is quite off?

ereport(WARNING,
(errmsg("abandoned index vacuuming of table \"%s.%s.%s\" as a failsafe after %d index scans",

Doesn't exactly make one understand that vacuum cost limiting now is
disabled? And is confusing because there would never be index vacuuming?

And even in cases where indexes exist, it's odd to talk about abandoning
index vacuuming that hasn't even started yet?

For 2), I don't really have a better idea than making that configurable
somehow?

That could make sense as a developer/testing option, I suppose. I just
doubt that it makes sense as anything else.

Yea, I only was thinking of making it configurable to be able to test
it. If we change the limit to something considerably lower I wouldn't
see a need for that anymore.

Greetings,

Andres Freund

#6Peter Geoghegan
pg@bowt.ie
In reply to: Andres Freund (#5)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Apr 23, 2021 at 5:29 PM Andres Freund <andres@anarazel.de> wrote:

On 2021-04-23 16:12:33 -0700, Peter Geoghegan wrote:

The only reason that I chose 4GB for FAILSAFE_MIN_PAGES is because the
related VACUUM_FSM_EVERY_PAGES constant was 8GB -- the latter limits
how often we'll consider the failsafe in the single-pass/no-indexes
case.

I don't really understand why it makes sense to tie FAILSAFE_MIN_PAGES
and VACUUM_FSM_EVERY_PAGES together? They seem pretty independent to me?

VACUUM_FSM_EVERY_PAGES controls how often VACUUM does work that
usually takes place right after the two pass case finishes a round of
index and heap vacuuming. This is work that we certainly don't want to
do every time we process a single heap page in the one-pass/no-indexes
case. Initially this just meant FSM vacuuming, but it now includes a
failsafe check.

Of course all of the precise details here are fairly arbitrary
(including VACUUM_FSM_EVERY_PAGES, which has been around for a couple
of releases now). The overall goal that I had in mind was to make the
one-pass case's use of the failsafe have analogous behavior to the
two-pass/has-indexes case -- a goal which was itself somewhat
arbitrary.

The failsafe mode affects the table scan itself by disabling cost
limiting. As far as I can see the ways it triggers for the table scan (vs
truncation or index processing) are:

1) Before vacuuming starts, for heap phases and indexes, if already
necessary at that point
2) For a table with indexes, before/after each index vacuum, if now
necessary
3) On a table without indexes, every 8GB, iff there are dead tuples, if now necessary

Why would we want to trigger the failsafe mode during a scan of a table
with dead tuples and no indexes, but not on a table without dead tuples
or with indexes but fewer than m_w_m dead tuples? That makes little
sense to me.

What alternative does make sense to you?

It seemed important to put the failsafe check at points where we do
other analogous work in all cases. We made a pragmatic trade-off. In
theory almost any scheme might not check often enough, and/or might
check too frequently.

It seems that for the no-index case the warning message is quite off?

I'll fix that up at some point soon. FWIW this happened because the
support for one-pass VACUUM was added quite late, at Robert's request.

Another issue with the failsafe commit is that we haven't considered
the autovacuum_multixact_freeze_max_age table reloption -- we only
check the GUC. That might have accidentally been the right thing to
do, though, since the reloption is interpreted as lower than the GUC
in all cases anyway -- arguably the
autovacuum_multixact_freeze_max_age GUC should be all we care about
anyway. I will need to think about this question some more, though.

For 2), I don't really have a better idea than making that configurable
somehow?

That could make sense as a developer/testing option, I suppose. I just
doubt that it makes sense as anything else.

Yea, I only was thinking of making it configurable to be able to test
it. If we change the limit to something considerably lower I wouldn't
see a need for that anymore.

It would probably be okay to just lower it significantly. Not sure if
that's the best approach, though. Will pick it up next week.

--
Peter Geoghegan

#7Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#6)
Re: Testing autovacuum wraparound (including failsafe)

Hi,

On 2021-04-23 19:15:43 -0700, Peter Geoghegan wrote:

The failsafe mode affects the table scan itself by disabling cost
limiting. As far as I can see the ways it triggers for the table scan (vs
truncation or index processing) are:

1) Before vacuuming starts, for heap phases and indexes, if already
necessary at that point
2) For a table with indexes, before/after each index vacuum, if now
necessary
3) On a table without indexes, every 8GB, iff there are dead tuples, if now necessary

Why would we want to trigger the failsafe mode during a scan of a table
with dead tuples and no indexes, but not on a table without dead tuples
or with indexes but fewer than m_w_m dead tuples? That makes little
sense to me.

What alternative does make sense to you?

Check it every so often, independent of whether there are indexes or
dead tuples? Or just check it at the boundaries.

I'd make it dependent on the number of pages scanned, rather than the
block distance to the last check - otherwise we might end up doing it
way too often when there's only a few individual pages not in the freeze
map.
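The difference between the two triggers can be sketched with a toy model of the heap scan. Everything here is an assumption for illustration: the `scanned` array stands in for pages not skipped via the visibility map, and the function names are made up:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Count failsafe checks when the trigger is the block distance since the
 * last check, evaluated only on pages that are actually scanned.
 */
unsigned
checks_by_block_distance(const bool *scanned, size_t nblocks, size_t every)
{
    size_t      last_check_block = 0;
    unsigned    checks = 0;

    assert(every > 0);
    for (size_t blkno = 0; blkno < nblocks; blkno++)
    {
        if (!scanned[blkno])
            continue;           /* page skipped, nothing to do */
        if (blkno - last_check_block >= every)
        {
            checks++;
            last_check_block = blkno;
        }
    }
    return checks;
}

/* Count failsafe checks when the trigger is the number of pages scanned. */
unsigned
checks_by_scanned_pages(const bool *scanned, size_t nblocks, size_t every)
{
    size_t      scanned_since_check = 0;
    unsigned    checks = 0;

    assert(every > 0);
    for (size_t blkno = 0; blkno < nblocks; blkno++)
    {
        if (!scanned[blkno])
            continue;
        if (++scanned_since_check >= every)
        {
            checks++;
            scanned_since_check = 0;
        }
    }
    return checks;
}
```

In a 100-block table where only blocks 0, 25, 50 and 75 are scanned and the interval is 10, the block-distance variant checks 3 times (once for almost every scanned page), while the scanned-pages variant does not check at all.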

Greetings,

Andres Freund

#8Peter Geoghegan
pg@bowt.ie
In reply to: Andres Freund (#7)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Apr 23, 2021 at 7:33 PM Andres Freund <andres@anarazel.de> wrote:

Check it every so often, independent of whether there are indexes or
dead tuples? Or just check it at the boundaries.

I think that the former suggestion might be better -- I actually
thought about doing it that way myself.

The latter suggestion sounds like you're suggesting that we just check
it at the beginning and the end in all cases (we do the beginning in
all cases already, but now we'd also do the end outside of the loop in
all cases). Is that right? If that is what you meant, then you should
note that there'd hardly be any check in the one-pass case with that
scheme (apart from the initial check that we do already). The only
work we'd be skipping at the end (in the event of that check
triggering the failsafe) would be heap truncation, which (as you've
pointed out yourself) doesn't seem particularly likely to matter.

--
Peter Geoghegan

#9Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#8)
Re: Testing autovacuum wraparound (including failsafe)

Hi,

On 2021-04-23 19:42:30 -0700, Peter Geoghegan wrote:

On Fri, Apr 23, 2021 at 7:33 PM Andres Freund <andres@anarazel.de> wrote:

Check it every so often, independent of whether there are indexes or
dead tuples? Or just check it at the boundaries.

I think that the former suggestion might be better -- I actually
thought about doing it that way myself.

Cool.

The latter suggestion sounds like you're suggesting that we just check
it at the beginning and the end in all cases (we do the beginning in
all cases already, but now we'd also do the end outside of the loop in
all cases). Is that right?

Yes.

If that is what you meant, then you should note that there'd hardly be
any check in the one-pass case with that scheme (apart from the
initial check that we do already). The only work we'd be skipping at
the end (in the event of that check triggering the failsafe) would be
heap truncation, which (as you've pointed out yourself) doesn't seem
particularly likely to matter.

I mainly suggested it because to me the current behavior seems hard to
understand. I do think it'd be better to check more often. But checking
depending on the amount of dead tuples at the right time doesn't strike
me as a good idea - a lot of anti-wraparound vacuums will mainly be
freezing tuples, rather than removing a lot of dead rows. Which makes it
hard to understand when the failsafe kicks in.

Greetings,

Andres Freund

#10Peter Geoghegan
pg@bowt.ie
In reply to: Andres Freund (#9)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Apr 23, 2021 at 7:53 PM Andres Freund <andres@anarazel.de> wrote:

I mainly suggested it because to me the current behavior seems hard to
understand. I do think it'd be better to check more often. But checking
depending on the amount of dead tuples at the right time doesn't strike
me as a good idea - a lot of anti-wraparound vacuums will mainly be
freezing tuples, rather than removing a lot of dead rows. Which makes it
hard to understand when the failsafe kicks in.

I'm convinced -- decoupling the logic from the one-pass-not-two pass
case seems likely to be simpler and more useful. For both the one pass
and two pass/has indexes case.

--
Peter Geoghegan

#11Peter Geoghegan
pg@bowt.ie
In reply to: Peter Geoghegan (#10)
1 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Apr 23, 2021 at 7:56 PM Peter Geoghegan <pg@bowt.ie> wrote:

I'm convinced -- decoupling the logic from the one-pass-not-two pass
case seems likely to be simpler and more useful. For both the one pass
and two pass/has indexes case.

Attached draft patch does it that way.

--
Peter Geoghegan

Attachments:

v1-0001-Consider-triggering-failsafe-during-first-scan.patch (application/octet-stream)
From 2a67208c7f660f23eb302288b0b74cbb0e839011 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Thu, 13 May 2021 17:53:10 -0700
Subject: [PATCH v1] Consider triggering failsafe during first scan.

---
 src/backend/access/heap/vacuumlazy.c | 34 ++++++++++++----------------
 1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9f1f8e340d..2dd3fbe07a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -110,10 +110,9 @@
 #define BYPASS_THRESHOLD_PAGES	0.02	/* i.e. 2% of rel_pages */
 
 /*
- * When a table is small (i.e. smaller than this), save cycles by avoiding
- * repeated failsafe checks
+ * Perform failsafe checks every 4GB, approximately
  */
-#define FAILSAFE_MIN_PAGES \
+#define FAILSAFE_EVERY_PAGES \
 	((BlockNumber) (((uint64) 4 * 1024 * 1024 * 1024) / BLCKSZ))
 
 /*
@@ -890,6 +889,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
 	BlockNumber nblocks,
 				blkno,
 				next_unskippable_block,
+				next_failsafe_block,
 				next_fsm_block_to_vacuum;
 	PGRUsage	ru0;
 	Buffer		vmbuffer = InvalidBuffer;
@@ -919,6 +919,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
 
 	nblocks = RelationGetNumberOfBlocks(vacrel->rel);
 	next_unskippable_block = 0;
+	next_failsafe_block = 0;
 	next_fsm_block_to_vacuum = 0;
 	vacrel->rel_pages = nblocks;
 	vacrel->scanned_pages = 0;
@@ -1168,6 +1169,15 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
+		/*
+		 * Regularly consider if wraparound failsafe should trigger
+		 */
+		if (blkno - next_failsafe_block >= FAILSAFE_EVERY_PAGES)
+		{
+			lazy_check_wraparound_failsafe(vacrel);
+			next_failsafe_block  = blkno;
+		}
+
 		/*
 		 * Set up visibility map page as needed.
 		 *
@@ -1375,17 +1385,12 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
 				 * Periodically perform FSM vacuuming to make newly-freed
 				 * space visible on upper FSM pages.  Note we have not yet
 				 * performed FSM processing for blkno.
-				 *
-				 * Call lazy_check_wraparound_failsafe() here, too, since we
-				 * also don't want to do that too frequently, or too
-				 * infrequently.
 				 */
 				if (blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
 				{
 					FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
 											blkno);
 					next_fsm_block_to_vacuum = blkno;
-					lazy_check_wraparound_failsafe(vacrel);
 				}
 
 				/*
@@ -2567,22 +2572,13 @@ lazy_check_needs_freeze(Buffer buf, bool *hastup, LVRelState *vacrel)
  * that it started out with.
  *
  * Returns true when failsafe has been triggered.
- *
- * Caller is expected to call here before and after vacuuming each index in
- * the case of two-pass VACUUM, or every VACUUM_FSM_EVERY_PAGES blocks in the
- * case of no-indexes/one-pass VACUUM.
- *
- * There is also a precheck before the first pass over the heap begins, which
- * is helpful when the failsafe initially triggers during a non-aggressive
- * VACUUM -- the automatic aggressive vacuum to prevent wraparound that
- * follows can independently trigger the failsafe right away.
  */
 static bool
 lazy_check_wraparound_failsafe(LVRelState *vacrel)
 {
 	/* Avoid calling vacuum_xid_failsafe_check() very frequently */
 	if (vacrel->num_index_scans == 0 &&
-		vacrel->rel_pages <= FAILSAFE_MIN_PAGES)
+		vacrel->rel_pages <= FAILSAFE_EVERY_PAGES)
 		return false;
 
 	/* Don't warn more than once per VACUUM */
@@ -2600,7 +2596,7 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 		vacrel->do_failsafe = true;
 
 		ereport(WARNING,
-				(errmsg("abandoned index vacuuming of table \"%s.%s.%s\" as a failsafe after %d index scans",
+				(errmsg("bypassing nonessential maintenance of table \"%s.%s.%s\" as a failsafe after %d index scans",
 						get_database_name(MyDatabaseId),
 						vacrel->relnamespace,
 						vacrel->relname,
-- 
2.27.0

#12Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Geoghegan (#6)
Re: Testing autovacuum wraparound (including failsafe)

On Sat, Apr 24, 2021 at 11:16 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Fri, Apr 23, 2021 at 5:29 PM Andres Freund <andres@anarazel.de> wrote:

On 2021-04-23 16:12:33 -0700, Peter Geoghegan wrote:

The only reason that I chose 4GB for FAILSAFE_MIN_PAGES is because the
related VACUUM_FSM_EVERY_PAGES constant was 8GB -- the latter limits
how often we'll consider the failsafe in the single-pass/no-indexes
case.

I don't really understand why it makes sense to tie FAILSAFE_MIN_PAGES
and VACUUM_FSM_EVERY_PAGES together? They seem pretty independent to me?

VACUUM_FSM_EVERY_PAGES controls how often VACUUM does work that
usually takes place right after the two pass case finishes a round of
index and heap vacuuming. This is work that we certainly don't want to
do every time we process a single heap page in the one-pass/no-indexes
case. Initially this just meant FSM vacuuming, but it now includes a
failsafe check.

Of course all of the precise details here are fairly arbitrary
(including VACUUM_FSM_EVERY_PAGES, which has been around for a couple
of releases now). The overall goal that I had in mind was to make the
one-pass case's use of the failsafe have analogous behavior to the
two-pass/has-indexes case -- a goal which was itself somewhat
arbitrary.

The failsafe mode affects the table scan itself by disabling cost
limiting. As far as I can see the ways it triggers for the table scan (vs
truncation or index processing) are:

1) Before vacuuming starts, for heap phases and indexes, if already
necessary at that point
2) For a table with indexes, before/after each index vacuum, if now
necessary
3) On a table without indexes, every 8GB, iff there are dead tuples, if now necessary

Why would we want to trigger the failsafe mode during a scan of a table
with dead tuples and no indexes, but not on a table without dead tuples
or with indexes but fewer than m_w_m dead tuples? That makes little
sense to me.

What alternative does make sense to you?

It seemed important to put the failsafe check at points where we do
other analogous work in all cases. We made a pragmatic trade-off. In
theory almost any scheme might not check often enough, and/or might
check too frequently.

It seems that for the no-index case the warning message is quite off?

I'll fix that up at some point soon. FWIW this happened because the
support for one-pass VACUUM was added quite late, at Robert's request.

+1 to fix this. Are you already working on fixing this? If not, I'll
post a patch.

Another issue with the failsafe commit is that we haven't considered
the autovacuum_multixact_freeze_max_age table reloption -- we only
check the GUC. That might have accidentally been the right thing to
do, though, since the reloption is interpreted as lower than the GUC
in all cases anyway -- arguably the
autovacuum_multixact_freeze_max_age GUC should be all we care about
anyway. I will need to think about this question some more, though.

FWIW, I intentionally ignored the reloption there since they're
interpreted as lower than the GUC as you mentioned and the situation
where we need to enter the failsafe mode is not the table-specific
problem but a system-wide problem.
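A small sketch of what "interpreted as lower than the GUC" means in practice: the per-table value can only lower the effective limit, never raise it. The function name and the use of a negative value for "not set" are assumptions for illustration, not the actual autovacuum code:

```c
#include <assert.h>

/*
 * Effective limit when a per-table setting may only lower the global one.
 * A negative reloption means the table has no setting of its own.
 */
int
effective_freeze_max_age(int reloption, int guc)
{
    assert(guc >= 0);
    if (reloption < 0)
        return guc;
    return reloption < guc ? reloption : guc;
}
```

A reloption above the GUC has no effect under this rule, which is why checking only the GUC gives an upper bound that holds for every table.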

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#13Peter Geoghegan
pg@bowt.ie
In reply to: Masahiko Sawada (#12)
Re: Testing autovacuum wraparound (including failsafe)

On Mon, May 17, 2021 at 10:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

+1 to fix this. Are you already working on fixing this? If not, I'll
post a patch.

I posted a patch recently (last Thursday my time). Perhaps you can review it?

--
Peter Geoghegan

#14Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Geoghegan (#13)
Re: Testing autovacuum wraparound (including failsafe)

On Tue, May 18, 2021 at 2:42 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Mon, May 17, 2021 at 10:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

+1 to fix this. Are you already working on fixing this? If not, I'll
post a patch.

I posted a patch recently (last Thursday my time). Perhaps you can review it?

Oh, I missed that the patch includes that fix. I'll review the patch.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#15Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#14)
Re: Testing autovacuum wraparound (including failsafe)

On Tue, May 18, 2021 at 2:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, May 18, 2021 at 2:42 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Mon, May 17, 2021 at 10:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

+1 to fix this. Are you already working on fixing this? If not, I'll
post a patch.

I posted a patch recently (last Thursday my time). Perhaps you can review it?

Oh, I missed that the patch includes that fix. I'll review the patch.

I've reviewed the patch. Here is one comment:

    if (vacrel->num_index_scans == 0 &&
-       vacrel->rel_pages <= FAILSAFE_MIN_PAGES)
+       vacrel->rel_pages <= FAILSAFE_EVERY_PAGES)
        return false;

Because of the condition "vacrel->num_index_scans == 0", we can
enter the failsafe mode even if the table is smaller than 4GB, as long as
lazy_check_wraparound_failsafe() is called after at least one index scan
has been executed. By contrast, a vacuum on a table that is smaller than
4GB and has no indexes never enters the failsafe mode. I think we can
remove this condition, since I don't see why such tables should be kept
from entering the failsafe mode only before the first index scan.
What do you think?
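The asymmetry can be illustrated by pulling the early-return test from the v1 patch out into a standalone function. This is only a sketch, assuming the default 8KB BLCKSZ; the function name is made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

#define BLCKSZ 8192             /* assumption: default block size */
#define FAILSAFE_EVERY_PAGES \
    ((BlockNumber) (((uint64_t) 4 * 1024 * 1024 * 1024) / BLCKSZ))

/* True when the v1 patch returns early without checking the failsafe. */
bool
v1_skips_failsafe_check(int num_index_scans, BlockNumber rel_pages)
{
    assert(num_index_scans >= 0);
    return num_index_scans == 0 && rel_pages <= FAILSAFE_EVERY_PAGES;
}
```

For a 1000-page table the check is skipped while num_index_scans is 0 but performed once it reaches 1. A table without indexes never gets past 0, so at that size it is always skipped.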

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#16Peter Geoghegan
pg@bowt.ie
In reply to: Masahiko Sawada (#15)
Re: Testing autovacuum wraparound (including failsafe)

On Tue, May 18, 2021 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Because of the condition "vacrel->num_index_scans == 0", we can
enter the failsafe mode even if the table is smaller than 4GB, as long as
lazy_check_wraparound_failsafe() is called after at least one index scan
has been executed. By contrast, a vacuum on a table that is smaller than
4GB and has no indexes never enters the failsafe mode. I think we can
remove this condition, since I don't see why such tables should be kept
from entering the failsafe mode only before the first index scan.
What do you think?

I'm convinced -- this does seem like premature optimization now.

I pushed a version of the patch that removes that code just now.

Thanks
--
Peter Geoghegan

#17Anastasia Lubennikova
lubennikovaav@gmail.com
In reply to: Andres Freund (#1)
Re: Testing autovacuum wraparound (including failsafe)

On Thu, Jun 10, 2021 at 10:52 AM Andres Freund <andres@anarazel.de> wrote:

I started to write a test for $Subject, which I think we sorely need.

Currently my approach is to:
- start a cluster, create a few tables with test data
- acquire SHARE UPDATE EXCLUSIVE in a prepared transaction, to prevent
autovacuum from doing anything
- cause dead tuples to exist
- restart
- run pg_resetwal -x 2000027648
- do things like acquiring pins on pages that block vacuum from progressing
- commit prepared transaction
- wait for template0, template1 datfrozenxid to increase
- wait for relfrozenxid for most relations in postgres to increase
- release buffer pin
- wait for postgres datfrozenxid to increase

Cool. Thank you for working on that!
Could you please share a WIP patch for the $subj? I'd be happy to help with
it.

So far so good. But I've encountered a few things that stand in the way of

enabling such a test by default:

1) During startup StartupSUBTRANS() zeroes out all pages between
oldestActiveXID and nextXid. That takes 8s on my workstation, but only
because I have plenty memory - pg_subtrans ends up 14GB as I currently
do
the test. Clearly not something we could do on the BF.
....

3) pg_resetwal -x requires to carefully choose an xid: It needs to be the
first xid on a clog page. It's not hard to determine which xids are but it
depends on BLCKSZ and a few constants in clog.c. I've for now hardcoded a
value appropriate for 8KB, but ...

Maybe we can add a new pg_resetwal option? Something like pg_resetwal
--xid-near-wraparound, which would ask pg_resetwal to calculate the exact xid
value using values from pg_control and the clog macros?
I think it might come in handy for manual testing too.
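
As a rough illustration of the calculation such an option would have to perform, here is a minimal standalone sketch. The function name and the SKETCH_* macro are hypothetical; the only facts assumed are that clog stores four transaction statuses per byte, so a page holds BLCKSZ * 4 XIDs, and that the target must be the first XID on a page. XID wraparound near 2^32 is deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Two status bits per transaction means four transactions per clog byte. */
#define SKETCH_CLOG_XACTS_PER_BYTE 4u

/*
 * Hypothetical helper: return the first XID on the clog page that follows
 * the page containing 'xid', for the given block size.
 */
uint32_t
first_xid_on_next_clog_page(uint32_t xid, uint32_t blcksz)
{
	uint32_t	xacts_per_page;

	assert(blcksz > 0);
	xacts_per_page = blcksz * SKETCH_CLOG_XACTS_PER_BYTE;

	return (xid / xacts_per_page + 1) * xacts_per_page;
}
```

Calling first_xid_on_next_clog_page(2000000000, 8192) gives 2000027648, which is the value hardcoded for pg_resetwal -x in the test.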

I have 2 1/2 ideas about addressing 1);

- We could expose functionality to advance nextXid to a future value at
runtime, without filling in clog/subtrans pages. Would probably have to live
in varsup.c and be exposed via regress.so or such?

This option looks scary to me. Several functions rely on the fact that
StartupSUBTRANS() has zeroed pages.
And if we make it conditional just for tests, it means that we won't
test the real code path.

- The only reason StartupSUBTRANS() does that work is because of the prepared
transaction holding back oldestActiveXID. That transaction in turn exists to
prevent autovacuum from doing anything before we do test setup
steps.

Perhaps it'd be sufficient to set autovacuum_naptime really high initially,
perform the test setup, set naptime to something lower, reload config. But
I'm worried that might not be reliable: If something ends up allocating an
xid we'd potentially reach the path in GetNewTransactionId() that wakes up the
launcher? But probably there wouldn't be anything doing so?

Another aspect that might not make this a good choice is that it actually
seems relevant to be able to test cases where there are very old still
running transactions...

Maybe this exact scenario can be covered with a separate long-running
test, not included in the buildfarm test suite?

--
Best regards,
Lubennikova Anastasia

#18Andres Freund
andres@anarazel.de
In reply to: Anastasia Lubennikova (#17)
1 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

Hi,

On 2021-06-10 16:42:01 +0300, Anastasia Lubennikova wrote:

Cool. Thank you for working on that!
Could you please share a WIP patch for the $subj? I'd be happy to help with
it.

I've attached the current WIP state, which hasn't evolved much since
this message... I put the test in src/backend/access/heap/t/001_emergency_vacuum.pl
but I'm not sure that's the best place. But I didn't think
src/test/recovery is great either.

Regards,

Andres

Attachments:

001_emergency_vacuum.pl (text/x-perl; charset=us-ascii)
#19Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andres Freund (#18)
1 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Jun 11, 2021 at 10:19 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2021-06-10 16:42:01 +0300, Anastasia Lubennikova wrote:

Cool. Thank you for working on that!
Could you please share a WIP patch for the $subj? I'd be happy to help with
it.

I've attached the current WIP state, which hasn't evolved much since
this message... I put the test in src/backend/access/heap/t/001_emergency_vacuum.pl
but I'm not sure that's the best place. But I didn't think
src/test/recovery is great either.

Thank you for sharing the WIP patch.

Regarding point (1) you mentioned (StartupSUBTRANS() takes a long time
zeroing out all pages), how about using single-user mode instead
of preparing the transaction? That is, after pg_resetwal we check the
ages of datfrozenxid by executing a query in single-user mode. That
way, we don’t need to worry about autovacuum running concurrently
while checking the ages of frozenxids. I’ve attached a PoC patch that
implements the following scenario:

1. start cluster with autovacuum=off, create tables with a small amount
of data, and generate garbage in them
2. stop cluster and do pg_resetwal
3. start cluster in single-user mode
4. check age(datfrozenxid)
5. stop cluster
6. start cluster and wait for autovacuums to increase template0,
template1, and postgres datfrozenxids

I put the new tests in src/test/modules/heap since we already have tests
for brin in src/test/modules/brin.

I think that a TAP test facility to run queries in single-user mode will
also be helpful for testing a new vacuum option/command that is
intended for use in emergency cases and proposed here [1].

Regards,

[1]: /messages/by-id/20220128012842.GZ23027@telsasoft.com

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachments:

tap_tests_for_wraparound_emregency.patch (application/x-patch)
diff --git a/src/test/modules/heap/.gitignore b/src/test/modules/heap/.gitignore
new file mode 100644
index 0000000000..716e17f5a2
--- /dev/null
+++ b/src/test/modules/heap/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/heap/Makefile b/src/test/modules/heap/Makefile
new file mode 100644
index 0000000000..d3c08a04b7
--- /dev/null
+++ b/src/test/modules/heap/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/heap/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/heap
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/heap/t/001_emergency_vacuum.pl b/src/test/modules/heap/t/001_emergency_vacuum.pl
new file mode 100644
index 0000000000..3229f99921
--- /dev/null
+++ b/src/test/modules/heap/t/001_emergency_vacuum.pl
@@ -0,0 +1,131 @@
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test for wraparound emergency situation
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 8;
+use IPC::Run qw(pump finish timer);
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+
+$node_primary->init(allows_streaming => 1);
+$node_primary->append_conf('postgresql.conf', qq/
+autovacuum = off # run autovacuum only to prevent wraparound
+max_prepared_transactions=10
+autovacuum_naptime = 1s
+# So it's easier to verify the order of operations
+autovacuum_max_workers=1
+autovacuum_vacuum_cost_delay=0
+log_autovacuum_min_duration=0
+/);
+$node_primary->start;
+
+#
+# Create tables for a few different test scenarios
+#
+
+$node_primary->safe_psql('postgres', qq/
+CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large(data) SELECT generate_series(1,30000);
+
+CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large_trunc(data) SELECT generate_series(1,30000);
+
+CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small(data) SELECT generate_series(1,15000);
+
+CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small_trunc(data) SELECT generate_series(1,15000);
+
+CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
+INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
+/);
+
+# Delete a few rows to ensure that vacuum has work to do.
+$node_primary->safe_psql('postgres', qq/
+DELETE FROM large WHERE id % 2 = 0;
+DELETE FROM large_trunc WHERE id > 10000;
+DELETE FROM small WHERE id % 2 = 0;
+DELETE FROM small_trunc WHERE id > 1000;
+DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
+/);
+
+
+# Stop the server and temporarily disable log_statement while running in single-user mode
+$node_primary->stop;
+$node_primary->append_conf('postgresql.conf', qq/
+log_statement = 'none'
+/);
+
+# Need to reset to a clog page boundary, otherwise we'll get errors
+# about the file not existing. With default compilation settings
+# CLOG_XACTS_PER_PAGE is 32768. The value below is 32768 *
+# (2000000000/32768 + 1), with 2000000000 being the max value for
+# autovacuum_freeze_max_age.
+
+command_like([ 'pg_resetwal', '-x2000027648', $node_primary->data_dir ],
+	     qr/Write-ahead log reset/, 'pg_resetwal -x to');
+
+my $in  = '';
+my $out = '';
+my $timer = timer(5);
+
+# Start the server in single-user mode.  That allows us to test interactions
+# without autovacuums.
+my $h = $node_primary->start_single_user_mode('postgres', \$in, \$out, $timer);
+
+$out = "";
+# Must be a single line with a new line at the end.
+$in .=
+    "SELECT datname, " .
+    "age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int as old ".
+    "FROM pg_database ORDER BY 1;\n";
+
+# Pump until we got the result.
+pump $h until ($out ne "" || $timer->is_expired);
+
+# Check that all databases are old enough.
+like($out, qr/1: datname = "postgres"[^\r\n]+\r\n\t 2: old = "t"/,
+     "postgres database is old enough");
+like($out, qr/1: datname = "template0"[^\r\n]+\r\n\t 2: old = "t"/,
+     "template0 database is old enough");
+like($out, qr/1: datname = "template1"[^\r\n]+\r\n\t 2: old = "t"/,
+     "template1 database is old enough");
+
+# Terminate single user mode.
+$in .= "\cD";
+finish $h or die "postgres --single returned $?";
+
+# Revert back the logging setting.
+$node_primary->append_conf('postgresql.conf', qq/
+log_statement = 'all'
+/);
+
+# Now test autovacuum behaviour.
+$node_primary->start;
+
+ok($node_primary->poll_query_until('postgres', qq/
+    SELECT NOT EXISTS (
+        SELECT *
+        FROM pg_database
+        WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
+/),
+   "xid horizon increased");
+
+my $ret = $node_primary->safe_psql('postgres', qq/
+SELECT relname, age(relfrozenxid) > current_setting('autovacuum_freeze_max_age')::int
+FROM pg_class
+WHERE oid = ANY(ARRAY['large'::regclass, 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled'])
+ORDER BY 1
+/);
+is($ret, "autovacuum_disabled|f
+large|f
+large_trunc|f
+small|f
+small_trunc|f", "all tables are vacuumed");
+
+$node_primary->stop;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 265f3ae657..2d35978bac 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -858,6 +858,40 @@ sub start
 	return 1;
 }
 
+sub start_single_user_mode
+{
+    my ($self, $dbname, $stdin, $stdout, $timer) = @_;
+    my $name = $self->name;
+
+    BAIL_OUT("node \"$name\" is already running") if defined $self->{_pid};
+
+    print("### Starting node \"$name\" in single-user mode\n");
+
+    local %ENV = $self->_get_env();
+
+    my @postgres_params = (
+	$self->installed_command('postgres'),
+	'--single', '-D', $self->data_dir, $dbname);
+
+    # Ensure there is no data waiting to be sent:
+    $$stdin = "" if ref($stdin);
+    # IPC::Run would otherwise append to existing contents:
+    $$stdout = "" if ref($stdout);
+
+    my $harness = IPC::Run::start \@postgres_params,
+	'<pty<', $stdin, '>pty>', $stdout, $timer;
+
+    # Pump until we see the startup banner.  This ensures that callers won't
+    # write anything to the pty before it's ready, avoiding an
+    # implementation issue in IPC::Run.
+    pump $harness
+	until $$stdout =~ /PostgreSQL stand-alone backend/ || $timer->is_expired;
+
+    die "postgres --single startup timed out" if $timer->is_expired;
+
+    return $harness;
+}
+
 =pod
 
 =item $node->kill9()
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index a310bcb28c..f94c5ea8cb 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -50,7 +50,8 @@ my @contrib_excludes = (
 	'sepgsql',
 	'brin',             'test_extensions',
 	'test_misc',        'test_pg_dump',
-	'snapshot_too_old', 'unsafe_tests');
+	'snapshot_too_old', 'unsafe_tests',
+	'heap');
 
 # Set of variables for frontend modules
 my $frontend_defines = { 'initdb' => 'FRONTEND' };
#20Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#19)
1 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

Hi,

On Tue, Feb 1, 2022 at 11:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Jun 11, 2021 at 10:19 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2021-06-10 16:42:01 +0300, Anastasia Lubennikova wrote:

Cool. Thank you for working on that!
Could you please share a WIP patch for the $subj? I'd be happy to help with
it.

I've attached the current WIP state, which hasn't evolved much since
this message... I put the test in src/backend/access/heap/t/001_emergency_vacuum.pl
but I'm not sure that's the best place. But I didn't think
src/test/recovery is great either.

Thank you for sharing the WIP patch.

Regarding point (1) you mentioned (StartupSUBTRANS() takes a long time
zeroing out all pages), how about using single-user mode instead
of preparing the transaction? That is, after pg_resetwal we check the
ages of datfrozenxid by executing a query in single-user mode. That
way, we don’t need to worry about autovacuum running concurrently
while checking the ages of frozenxids. I’ve attached a PoC patch that
implements the following scenario:

1. start cluster with autovacuum=off, create tables with a small amount
of data, and generate garbage in them
2. stop cluster and do pg_resetwal
3. start cluster in single-user mode
4. check age(datfrozenxid)
5. stop cluster
6. start cluster and wait for autovacuums to increase template0,
template1, and postgres datfrozenxids

The above steps are wrong.

I think we can expose a function in an extension used only by this
test in order to set nextXid to a future value while zeroing out
clog/subtrans pages. We don't need to fill all clog/subtrans pages
between oldestActiveXID and nextXid. I've attached a PoC patch for
adding this regression test and am going to register it for the next
CF.

BTW, while testing the emergency situation, I found there is a race
condition where anti-wraparound vacuum isn't invoked with the settings
autovacuum = off, autovacuum_max_workers = 1. An autovacuum worker
sends a signal to the postmaster after advancing datfrozenxid in
SetTransactionIdLimit(). But with these settings, if the autovacuum
launcher attempts to launch a worker before the autovacuum worker that
signaled the postmaster has finished, the launcher exits without
launching a worker because there are no free workers. A new launcher
won’t be launched until a new XID is generated (and only when new XID %
65536 == 0). Although autovacuum_max_workers = 1 is not mandatory for
this test, it makes it easier to verify the order of operations.
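
To give a feel for how long the launcher can stay down in that case, here is a minimal sketch of the "XID % 65536 == 0" arithmetic. It is an illustration only: the function name and the SKETCH_* macro are hypothetical, and it ignores XID wraparound and the xidVacLimit check that also has to pass before a launcher is requested.

```c
#include <assert.h>
#include <stdint.h>

/* A launcher is requested only for XIDs that are multiples of this. */
#define SKETCH_AUTOVAC_SIGNAL_INTERVAL 65536u

/*
 * Hypothetical helper: number of XIDs that must be assigned, counting
 * 'next_xid' itself, until one of them is a multiple of 'interval'.
 * Returns 1 when next_xid already qualifies.
 */
uint32_t
xids_until_next_launcher_request(uint32_t next_xid, uint32_t interval)
{
	uint32_t	rem;

	assert(interval > 0);
	rem = next_xid % interval;

	return (rem == 0) ? 1 : interval - rem + 1;
}
```

In the worst case, right after a multiple of 65536 has been handed out, a full 65536 further XIDs have to be consumed before the next request is sent.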

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachments:

v1-0001-Add-regression-tests-for-emergency-vacuums.patch (application/octet-stream)
From 9f686cb3d7edfc5b214c2eddbc20f0ccd6bcda7f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 27 Jun 2022 16:44:41 +0900
Subject: [PATCH v1 1/2] Add regression tests for emergency vacuums.

---
 src/test/modules/Makefile                     |   1 +
 src/test/modules/heap/.gitignore              |   4 +
 src/test/modules/heap/Makefile                |  20 +++
 .../modules/heap/t/001_emergency_vacuum.pl    | 116 ++++++++++++++++++
 src/test/modules/heap/test_heap--1.0.sql      |   9 ++
 src/test/modules/heap/test_heap.c             |  72 +++++++++++
 src/test/modules/heap/test_heap.control       |   4 +
 src/tools/msvc/Mkvcbuild.pm                   |   2 +-
 8 files changed, 227 insertions(+), 1 deletion(-)
 create mode 100644 src/test/modules/heap/.gitignore
 create mode 100644 src/test/modules/heap/Makefile
 create mode 100644 src/test/modules/heap/t/001_emergency_vacuum.pl
 create mode 100644 src/test/modules/heap/test_heap--1.0.sql
 create mode 100644 src/test/modules/heap/test_heap.c
 create mode 100644 src/test/modules/heap/test_heap.control

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 9090226daa..3d53edc1d2 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -10,6 +10,7 @@ SUBDIRS = \
 		  delay_execution \
 		  dummy_index_am \
 		  dummy_seclabel \
+		  heap \
 		  libpq_pipeline \
 		  plsample \
 		  snapshot_too_old \
diff --git a/src/test/modules/heap/.gitignore b/src/test/modules/heap/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/src/test/modules/heap/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/heap/Makefile b/src/test/modules/heap/Makefile
new file mode 100644
index 0000000000..aeae3f938e
--- /dev/null
+++ b/src/test/modules/heap/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/heap/Makefile
+
+MODULES = test_heap
+PGFILEDESC = "test_heap - regression test for heap"
+
+EXTENSION = test_heap
+DATA = test_heap--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/heap
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/heap/t/001_emergency_vacuum.pl b/src/test/modules/heap/t/001_emergency_vacuum.pl
new file mode 100644
index 0000000000..791b97f217
--- /dev/null
+++ b/src/test/modules/heap/t/001_emergency_vacuum.pl
@@ -0,0 +1,116 @@
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test for wraparound emergency situation
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('main');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+max_prepared_transactions = 10
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+
+$node->safe_psql('postgres', 'CREATE EXTENSION test_heap');
+
+# Create tables for a few different test scenarios
+$node->safe_psql('postgres', qq[
+CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large(data) SELECT generate_series(1,30000);
+
+CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large_trunc(data) SELECT generate_series(1,30000);
+
+CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small(data) SELECT generate_series(1,15000);
+
+CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small_trunc(data) SELECT generate_series(1,15000);
+
+CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
+INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
+]);
+
+# To prevent autovacuum from handling the tables immediately after
+# restart, acquire locks in a 2PC transaction. That allows us to test
+# interactions with running commands.
+$node->safe_psql('postgres', qq[
+BEGIN;
+LOCK TABLE large IN SHARE UPDATE EXCLUSIVE MODE;
+LOCK TABLE large_trunc IN SHARE UPDATE EXCLUSIVE MODE;
+LOCK TABLE small IN SHARE UPDATE EXCLUSIVE MODE;
+LOCK TABLE small_trunc IN SHARE UPDATE EXCLUSIVE MODE;
+LOCK TABLE autovacuum_disabled IN SHARE UPDATE EXCLUSIVE MODE;
+PREPARE TRANSACTION 'prevent-vacuum';
+]);
+
+# Delete a few rows to ensure that vacuum has work to do.
+$node->safe_psql('postgres', qq[
+DELETE FROM large WHERE id % 2 = 0;
+DELETE FROM large_trunc WHERE id > 10000;
+DELETE FROM small WHERE id % 2 = 0;
+DELETE FROM small_trunc WHERE id > 1000;
+DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
+]);
+
+# New XID needs to be a clog page boundary, otherwise we'll get errors about
+# the file not existing. With default compilation settings
+# CLOG_XACTS_PER_PAGE is 32768. The value below is 32768 *
+# (2000000000/32768 + 1), with 2000000000 being the max value for
+# autovacuum_freeze_max_age.  Since the prepared transaction keeps holding the
+# locks on the tables above, autovacuum won't run.
+$node->safe_psql('postgres', qq[SELECT set_next_xid('2000027648'::xid)]);
+
+# Make sure the latest completed XID is updated with the advanced XID.
+$node->safe_psql('postgres', qq[INSERT INTO small(data) SELECT 1]);
+
+# Check if all databases became old now.
+my $ret = $node->safe_psql('postgres',
+			   qq[
+SELECT datname,
+       age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int as old
+FROM pg_database ORDER BY 1
+]);
+is($ret, "postgres|t
+template0|t
+template1|t", "all databases became old");
+
+# Allow autovacuum to start working on these tables.
+$node->safe_psql('postgres', qq[COMMIT PREPARED 'prevent-vacuum']);
+
+$node->poll_query_until('postgres',
+			qq[
+SELECT NOT EXISTS (
+	SELECT *
+	FROM pg_database
+	WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
+]) or die "timed out waiting for all databases to be vacuumed";
+
+# Check if these tables are vacuumed.
+$ret = $node->safe_psql('postgres', qq[
+SELECT relname, age(relfrozenxid) > current_setting('autovacuum_freeze_max_age')::int
+FROM pg_class
+WHERE oid = ANY(ARRAY['large'::regclass, 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled'])
+ORDER BY 1
+]);
+
+is($ret, "autovacuum_disabled|f
+large|f
+large_trunc|f
+small|f
+small_trunc|f", "all tables are vacuumed");
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/modules/heap/test_heap--1.0.sql b/src/test/modules/heap/test_heap--1.0.sql
new file mode 100644
index 0000000000..b7be733bfe
--- /dev/null
+++ b/src/test/modules/heap/test_heap--1.0.sql
@@ -0,0 +1,9 @@
+/* src/test/modules/heap/test_heap--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_heap" to load this file. \quit
+
+CREATE FUNCTION set_next_xid(xid)
+    RETURNS void
+    AS 'MODULE_PATHNAME'
+    LANGUAGE C STRICT VOLATILE;
diff --git a/src/test/modules/heap/test_heap.c b/src/test/modules/heap/test_heap.c
new file mode 100644
index 0000000000..66f73edb76
--- /dev/null
+++ b/src/test/modules/heap/test_heap.c
@@ -0,0 +1,72 @@
+/*----------------------------------------------------------------------
+ * test_heap.c
+ *		Support test functions for the heap
+ *
+ * Copyright (c) 2014-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/test/modules/heap/test_heap.c
+ *----------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/clog.h"
+#include "access/commit_ts.h"
+#include "access/subtrans.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/pmsignal.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * Set the next XID to the given XID in the current epoch
+ */
+PG_FUNCTION_INFO_V1(set_next_xid);
+Datum
+set_next_xid(PG_FUNCTION_ARGS)
+{
+	TransactionId next_xid = PG_GETARG_TRANSACTIONID(0);
+	uint32 epoch;
+
+	if (!TransactionIdIsNormal(next_xid))
+		elog(ERROR, "cannot set invalid transaction id");
+
+	LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+
+	if (TransactionIdPrecedes(next_xid,
+							  XidFromFullTransactionId(ShmemVariableCache->nextXid)))
+	{
+		LWLockRelease(XidGenLock);
+		elog(ERROR, "cannot set transaction id older than the current transaction id");
+	}
+
+	/*
+	 * If the new XID is past xidVacLimit, start trying to force autovacuum
+	 * cycles.
+	 */
+	if (TransactionIdFollowsOrEquals(next_xid, ShmemVariableCache->xidVacLimit))
+	{
+		/* For safety, we release XidGenLock while sending signal */
+		LWLockRelease(XidGenLock);
+		SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
+		LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+	}
+
+	/* Construct the new XID in the current epoch */
+	epoch = EpochFromFullTransactionId(ShmemVariableCache->nextXid);
+	ShmemVariableCache->nextXid =
+		FullTransactionIdFromEpochAndXid(epoch, next_xid);
+
+	ExtendCLOG(next_xid);
+	ExtendCommitTs(next_xid);
+	ExtendSUBTRANS(next_xid);
+
+	LWLockRelease(XidGenLock);
+
+	PG_RETURN_VOID();
+}
+
diff --git a/src/test/modules/heap/test_heap.control b/src/test/modules/heap/test_heap.control
new file mode 100644
index 0000000000..7d089bb6d1
--- /dev/null
+++ b/src/test/modules/heap/test_heap.control
@@ -0,0 +1,4 @@
+comment = 'Test code for heap'
+default_version = '1.0'
+module_pathname = '$libdir/test_heap'
+relocatable = true
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index e4feda10fd..022a2fa5f7 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -50,7 +50,7 @@ my @contrib_excludes       = (
 	'sepgsql',         'brin',
 	'test_extensions', 'test_misc',
 	'test_pg_dump',    'snapshot_too_old',
-	'unsafe_tests');
+	'unsafe_tests',     'heap');
 
 # Set of variables for frontend modules
 my $frontend_defines = { 'initdb' => 'FRONTEND' };
-- 
2.24.3 (Apple Git-128)

#21Ian Lawrence Barwick
barwick@gmail.com
In reply to: Masahiko Sawada (#20)
Re: Testing autovacuum wraparound (including failsafe)

2022年6月30日(木) 10:40 Masahiko Sawada <sawada.mshk@gmail.com>:

Hi,

On Tue, Feb 1, 2022 at 11:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Jun 11, 2021 at 10:19 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2021-06-10 16:42:01 +0300, Anastasia Lubennikova wrote:

Cool. Thank you for working on that!
Could you please share a WIP patch for the $subj? I'd be happy to help with
it.

I've attached the current WIP state, which hasn't evolved much since
this message... I put the test in src/backend/access/heap/t/001_emergency_vacuum.pl
but I'm not sure that's the best place. But I didn't think
src/test/recovery is great either.

Thank you for sharing the WIP patch.

Regarding point (1) you mentioned (StartupSUBTRANS() takes a long time
zeroing out all pages), how about using single-user mode instead
of preparing the transaction? That is, after pg_resetwal we check the
ages of datfrozenxid by executing a query in single-user mode. That
way, we don’t need to worry about autovacuum running concurrently
while checking the ages of frozenxids. I’ve attached a PoC patch that
implements the following scenario:

1. start cluster with autovacuum=off, create tables with a small amount
of data, and generate garbage in them
2. stop cluster and do pg_resetwal
3. start cluster in single-user mode
4. check age(datfrozenxid)
5. stop cluster
6. start cluster and wait for autovacuums to increase template0,
template1, and postgres datfrozenxids

The above steps are wrong.

I think we can expose a function in an extension used only by this
test in order to set nextXid to a future value while zeroing out
clog/subtrans pages. We don't need to fill all clog/subtrans pages
between oldestActiveXID and nextXid. I've attached a PoC patch for
adding this regression test and am going to register it for the next
CF.

BTW, while testing the emergency situation, I found there is a race
condition where anti-wraparound vacuum isn't invoked with the settings
autovacuum = off, autovacuum_max_workers = 1. An autovacuum worker
sends a signal to the postmaster after advancing datfrozenxid in
SetTransactionIdLimit(). But with these settings, if the autovacuum
launcher attempts to launch a worker before the autovacuum worker that
signaled the postmaster has finished, the launcher exits without
launching a worker because there are no free workers. A new launcher
won’t be launched until a new XID is generated (and only when new XID %
65536 == 0). Although autovacuum_max_workers = 1 is not mandatory for
this test, it makes it easier to verify the order of operations.

Hi

Thanks for the patch. While reviewing the patch backlog, we have determined that
the latest version of this patch was submitted before meson support was
implemented, so it should have a "meson.build" file added for consideration for
inclusion in PostgreSQL 16.

Regards

Ian Barwick

#22Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Ian Lawrence Barwick (#21)
Re: Testing autovacuum wraparound (including failsafe)

On 16/11/2022 06:38, Ian Lawrence Barwick wrote:

Thanks for the patch. While reviewing the patch backlog, we have determined that
the latest version of this patch was submitted before meson support was
implemented, so it should have a "meson.build" file added for consideration for
inclusion in PostgreSQL 16.

I wanted to do some XID wraparound testing again, to test the 64-bit
SLRUs patches [1], and revived this.

I took a different approach to consuming the XIDs. Instead of setting
nextXID directly, bypassing GetNewTransactionId(), this patch introduces
a helper function to call GetNewTransactionId() repeatedly. But because
that's slow, it does include a shortcut to skip over "uninteresting"
XIDs. Whenever nextXid is close to an SLRU page boundary or XID
wraparound, it calls GetNewTransactionId(), and otherwise it bumps up
nextXid close to the next "interesting" value. That's still a lot slower
than just setting nextXid, but exercises the code more realistically.
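
The skipping idea can be sketched roughly as follows. This is a simplified, standalone illustration and not the helper from the attached patch: the function name, the SKETCH_* constants, and the margin of 5 XIDs are assumptions, and only one page size is considered, whereas real code has to account for every SLRU and for XID wraparound.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: 8KB blocks with four transactions per clog byte. */
#define SKETCH_XACTS_PER_PAGE 32768u
/* Assumed margin around each page boundary that counts as interesting. */
#define SKETCH_MARGIN 5u

/*
 * Hypothetical helper: how many XIDs may be skipped from 'next_xid' without
 * coming within SKETCH_MARGIN of a page boundary.  Returns 0 when the caller
 * should consume the XID the normal way, via GetNewTransactionId().
 */
uint32_t
skippable_xids(uint64_t next_xid)
{
	uint32_t	low = (uint32_t) (next_xid % SKETCH_XACTS_PER_PAGE);
	uint32_t	rest = SKETCH_XACTS_PER_PAGE - low;

	assert(rest >= 1 && rest <= SKETCH_XACTS_PER_PAGE);

	/* Close to the start or end of a page: take the slow path. */
	if (low < SKETCH_MARGIN || rest <= SKETCH_MARGIN)
		return 0;

	/* Otherwise jump ahead, stopping just short of the margin. */
	return rest - SKETCH_MARGIN;
}
```

A caller would loop: when the helper returns 0 it consumes one XID the normal way, otherwise it bumps nextXid by the returned amount, so every page boundary is still crossed through the regular code path.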

I've written some variant of this helper function many times over the
years, for ad hoc testing. I'd love to have it permanently in the git tree.

In addition to Masahiko's test for emergency vacuum, this includes two
other tests. 002_limits.pl tests the "warn limit" and "stop limit" in
GetNewTransactionId(), and 003_wraparounds.pl burns through 10 billion
transactions in total, exercising XID wraparound in general.
Unfortunately these tests are pretty slow; the tests run for about 4
minutes on my laptop in total, and use about 20 GB of disk space. So
perhaps these need to be put in a special test suite that's not run as
part of "check-world". Or perhaps leave out the 003_wraparounds.pl test,
that's the slowest of the tests. But I'd love to have these in the git
tree in some form.

[1]: /messages/by-id/CAJ7c6TPKf0W3MfpP2vr=kq7-NM5G12vTBhi7miu_5m8AG3Cw-w@mail.gmail.com

- Heikki

#23Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Heikki Linnakangas (#22)
1 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

On 03/03/2023 13:34, Heikki Linnakangas wrote:

On 16/11/2022 06:38, Ian Lawrence Barwick wrote:

Thanks for the patch. While reviewing the patch backlog, we have determined that
the latest version of this patch was submitted before meson support was
implemented, so it should have a "meson.build" file added for consideration for
inclusion in PostgreSQL 16.

I wanted to do some XID wraparound testing again, to test the 64-bit
SLRUs patches [1], and revived this.

Forgot attachment.

- Heikki

Attachments:

0001-Add-tests-for-XID-wraparound.patch (text/x-patch; charset=UTF-8)
From 34b6b37d7f8d75943924a49e2205033db6e8293c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 3 Mar 2023 12:01:28 +0200
Subject: [PATCH 1/1] Add tests for XID wraparound.

The test module includes helper functions to quickly burn through lots
of XIDs. They are used in the tests, and are also handy for manually
testing XID wraparound.

Author: Masahiko Sawada, Heikki Linnakangas
Discussion: https://www.postgresql.org/message-id/CAD21AoDVhkXp8HjpFO-gp3TgL6tCKcZQNxn04m01VAtcSi-5sA%40mail.gmail.com
---
 src/test/modules/Makefile                     |   1 +
 src/test/modules/meson.build                  |   1 +
 src/test/modules/xid_wraparound/.gitignore    |   4 +
 src/test/modules/xid_wraparound/Makefile      |  23 ++
 src/test/modules/xid_wraparound/README        |   3 +
 src/test/modules/xid_wraparound/meson.build   |  36 +++
 .../xid_wraparound/t/001_emergency_vacuum.pl  | 110 +++++++++
 .../modules/xid_wraparound/t/002_limits.pl    | 114 +++++++++
 .../xid_wraparound/t/003_wraparounds.pl       |  46 ++++
 .../xid_wraparound/xid_wraparound--1.0.sql    |  12 +
 .../modules/xid_wraparound/xid_wraparound.c   | 221 ++++++++++++++++++
 .../xid_wraparound/xid_wraparound.control     |   4 +
 12 files changed, 575 insertions(+)
 create mode 100644 src/test/modules/xid_wraparound/.gitignore
 create mode 100644 src/test/modules/xid_wraparound/Makefile
 create mode 100644 src/test/modules/xid_wraparound/README
 create mode 100644 src/test/modules/xid_wraparound/meson.build
 create mode 100644 src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
 create mode 100644 src/test/modules/xid_wraparound/t/002_limits.pl
 create mode 100644 src/test/modules/xid_wraparound/t/003_wraparounds.pl
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.c
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.control

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index c629cbe3830..99f5fa23f16 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
 		  plsample \
 		  snapshot_too_old \
 		  spgist_name_ops \
+		  xid_wraparound \
 		  test_bloomfilter \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 1baa6b558d1..39de43dee33 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -10,6 +10,7 @@ subdir('plsample')
 subdir('snapshot_too_old')
 subdir('spgist_name_ops')
 subdir('ssl_passphrase_callback')
+subdir('xid_wraparound')
 subdir('test_bloomfilter')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
diff --git a/src/test/modules/xid_wraparound/.gitignore b/src/test/modules/xid_wraparound/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/xid_wraparound/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/xid_wraparound/Makefile b/src/test/modules/xid_wraparound/Makefile
new file mode 100644
index 00000000000..7a6e0f66762
--- /dev/null
+++ b/src/test/modules/xid_wraparound/Makefile
@@ -0,0 +1,23 @@
+# src/test/modules/xid_wraparound/Makefile
+
+MODULE_big = xid_wraparound
+OBJS = \
+	$(WIN32RES) \
+	xid_wraparound.o
+PGFILEDESC = "xid_wraparound - tests for XID wraparound"
+
+EXTENSION = xid_wraparound
+DATA = xid_wraparound--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/xid_wraparound
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/xid_wraparound/README b/src/test/modules/xid_wraparound/README
new file mode 100644
index 00000000000..3aab464dec7
--- /dev/null
+++ b/src/test/modules/xid_wraparound/README
@@ -0,0 +1,3 @@
+This module contains tests for XID wraparound. The tests use two
+helper functions to quickly consume lots of XIDs, to reach XID
+wraparound faster.
diff --git a/src/test/modules/xid_wraparound/meson.build b/src/test/modules/xid_wraparound/meson.build
new file mode 100644
index 00000000000..ea513e0536c
--- /dev/null
+++ b/src/test/modules/xid_wraparound/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# FIXME: prevent install during main install, but not during test :/
+
+xid_wraparound_sources = files(
+  'xid_wraparound.c',
+)
+
+if host_system == 'windows'
+  xid_wraparound_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'xid_wraparound',
+    '--FILEDESC', 'xid_wraparound - tests for XID wraparound',])
+endif
+
+xid_wraparound = shared_module('xid_wraparound',
+  xid_wraparound_sources,
+  kwargs: pg_mod_args,
+)
+testprep_targets += xid_wraparound
+
+install_data(
+  'xid_wraparound.control',
+  'xid_wraparound--1.0.sql',
+  kwargs: contrib_data_args,
+)
+
+tests += {
+  'name': 'xid_wraparound',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'regress': {
+    'sql': [
+      'xid_wraparound',
+    ],
+  },
+}
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
new file mode 100644
index 00000000000..b0198bb0c41
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -0,0 +1,110 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# Test wraparound emergency autovacuum
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('main');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+max_prepared_transactions = 10
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create tables for a few different test scenarios
+$node->safe_psql('postgres', qq[
+CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large(data) SELECT generate_series(1,30000);
+
+CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large_trunc(data) SELECT generate_series(1,30000);
+
+CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small(data) SELECT generate_series(1,15000);
+
+CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small_trunc(data) SELECT generate_series(1,15000);
+
+CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
+INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
+]);
+
+# To prevent autovacuum from handling the tables immediately after
+# restart, acquire locks in a 2PC transaction. That allows us to test
+# interactions with running commands.
+$node->safe_psql('postgres', qq[
+BEGIN;
+LOCK TABLE large IN SHARE UPDATE EXCLUSIVE MODE;
+LOCK TABLE large_trunc IN SHARE UPDATE EXCLUSIVE MODE;
+LOCK TABLE small IN SHARE UPDATE EXCLUSIVE MODE;
+LOCK TABLE small_trunc IN SHARE UPDATE EXCLUSIVE MODE;
+LOCK TABLE autovacuum_disabled IN SHARE UPDATE EXCLUSIVE MODE;
+PREPARE TRANSACTION 'prevent-vacuum';
+]);
+
+# Delete a few rows to ensure that vacuum has work to do.
+$node->safe_psql('postgres', qq[
+DELETE FROM large WHERE id % 2 = 0;
+DELETE FROM large_trunc WHERE id > 10000;
+DELETE FROM small WHERE id % 2 = 0;
+DELETE FROM small_trunc WHERE id > 1000;
+DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
+]);
+
+# Consume 2 billion XIDs, to get us very close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids_until('2000000000'::bigint)]);
+
+# Make sure the latest completed XID is advanced
+$node->safe_psql('postgres', qq[INSERT INTO small(data) SELECT 1]);
+
+# Check that all databases became old.
+my $ret = $node->safe_psql('postgres',
+			   qq[
+SELECT datname,
+       age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int as old
+FROM pg_database ORDER BY 1
+]);
+is($ret, "postgres|t
+template0|t
+template1|t", "all databases became old");
+
+# Allow autovacuum to start working on these tables.
+$node->safe_psql('postgres', qq[COMMIT PREPARED 'prevent-vacuum']);
+
+$node->poll_query_until('postgres',
+			qq[
+SELECT NOT EXISTS (
+	SELECT *
+	FROM pg_database
+	WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
+]) or die "timeout waiting for all databases to be vacuumed";
+
+# Check if these tables are vacuumed.
+$ret = $node->safe_psql('postgres', qq[
+SELECT relname, age(relfrozenxid) > current_setting('autovacuum_freeze_max_age')::int
+FROM pg_class
+WHERE relname IN ('large', 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled')
+ORDER BY 1
+]);
+
+is($ret, "autovacuum_disabled|f
+large|f
+large_trunc|f
+small|f
+small_trunc|f", "all tables are vacuumed");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/002_limits.pl b/src/test/modules/xid_wraparound/t/002_limits.pl
new file mode 100644
index 00000000000..ff2743746b2
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/002_limits.pl
@@ -0,0 +1,114 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Test XID wraparound limits.
+#
+# When you get close to XID wraparound, you start to get warnings, and
+# when you get even closer, the system refuses to assign any more XIDs
+# until the oldest databases have been vacuumed and datfrozenxid has
+# been advanced.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+my $ret;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql('postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('start');
+]);
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $in  = '';
+my $out = '';
+my $timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+my $background_psql = $node->background_psql('postgres', \$in, \$out, $timeout);
+$in .= q{
+	BEGIN;
+	INSERT INTO wraparoundtest VALUES ('oldxact');
+};
+$background_psql->pump_nb;
+
+# Consume 2 billion transactions, to get close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after 1 billion')]);
+
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after 2 billion')]);
+
+# We are now just under 150 million XIDs away from wraparound.
+# Continue consuming XIDs, in batches of 10 million, until we get
+# the warning:
+#
+#  WARNING:  database "postgres" must be vacuumed within 3000024 transactions
+#  HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
+#  You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
+my $stderr;
+my $warn_limit = 0;
+for my $i (1 .. 15)
+{
+	$node->psql('postgres', qq[SELECT consume_xids(10000000)], stderr => \$stderr, on_error_die => 1);
+
+	if ($stderr =~ /WARNING:  database "postgres" must be vacuumed within [0-9]+ transactions/)
+	{
+		# Reached the warn-limit
+		$warn_limit = 1;
+		last;
+	}
+}
+ok($warn_limit == 1, "warn-limit reached");
+
+# We can still INSERT, despite the warnings.
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('reached warn-limit')]);
+
+# Keep going. We'll hit the hard "stop" limit.
+$ret = $node->psql('postgres', qq[SELECT consume_xids(100000000)], stderr => \$stderr);
+like($stderr, qr/ERROR:  database is not accepting commands to avoid wraparound data loss/, "stop-limit");
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$in .= q{
+COMMIT;
+\q
+};
+$background_psql->finish;
+
+# VACUUM, to freeze the tables and advance datfrozenxid.
+#
+# Autovacuum does this for the other databases, and would do it for
+# 'postgres' too, but let's test manual VACUUM.
+#
+$node->safe_psql('postgres', 'VACUUM');
+
+# Wait until autovacuum has processed the other databases and advanced
+# the system-wide oldest-XID.
+$ret = $node->poll_query_until('postgres', qq[INSERT INTO wraparoundtest VALUES ('after VACUUM')], 'INSERT 0 1');
+
+# Check the table contents
+$ret = $node->safe_psql('postgres', qq[SELECT * from wraparoundtest]);
+is($ret, "start
+oldxact
+after 1 billion
+after 2 billion
+reached warn-limit
+after VACUUM");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
new file mode 100644
index 00000000000..e79adbd12db
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -0,0 +1,46 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Consume a lot of XIDs, wrapping around a few times.
+#
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql('postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('beginning');
+]);
+
+# Burn through 10 billion transactions in total, in batches of 100 million.
+my $ret;
+for my $i (1 .. 100)
+{
+	$ret = $node->safe_psql('postgres', qq[SELECT consume_xids(100000000)]);
+	$ret = $node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after $i batches')]);
+}
+
+$ret = $node->safe_psql('postgres', qq[SELECT COUNT(*) FROM wraparoundtest]);
+is($ret, "101");
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
new file mode 100644
index 00000000000..f5577adfdbd
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/xid_wraparound/xid_wraparound--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION xid_wraparound" to load this file. \quit
+
+CREATE FUNCTION consume_xids(bigint)
+RETURNS bigint VOLATILE PARALLEL UNSAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION consume_xids_until(bigint)
+RETURNS bigint VOLATILE PARALLEL UNSAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.c b/src/test/modules/xid_wraparound/xid_wraparound.c
new file mode 100644
index 00000000000..c9d6034b557
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.c
@@ -0,0 +1,221 @@
+/*--------------------------------------------------------------------------
+ *
+ * xid_wraparound.c
+ *		Utilities for testing XID wraparound
+ *
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *		src/test/modules/xid_wraparound/xid_wraparound.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "access/xact.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/proc.h"
+
+PG_MODULE_MAGIC;
+
+static int64 consume_xids_shortcut(void);
+static FullTransactionId consume_xids_common(FullTransactionId untilxid, uint64 nxids);
+
+/*
+ * Consume the specified number of XIDs.
+ */
+PG_FUNCTION_INFO_V1(consume_xids);
+Datum
+consume_xids(PG_FUNCTION_ARGS)
+{
+	int64		nxids = PG_GETARG_INT64(0);
+	FullTransactionId lastxid;
+
+	if (nxids < 0)
+		elog(ERROR, "invalid nxids argument: %lld", (long long) nxids);
+
+	if (nxids == 0)
+		lastxid = ReadNextFullTransactionId();
+	else
+		lastxid = consume_xids_common(InvalidFullTransactionId, (uint64) nxids);
+
+	PG_RETURN_INT64((int64) U64FromFullTransactionId(lastxid));
+}
+
+/*
+ * Consume XIDs, up to the given XID.
+ */
+PG_FUNCTION_INFO_V1(consume_xids_until);
+Datum
+consume_xids_until(PG_FUNCTION_ARGS)
+{
+	FullTransactionId targetxid = FullTransactionIdFromU64((uint64) PG_GETARG_INT64(0));
+	FullTransactionId lastxid;
+
+	if (!FullTransactionIdIsNormal(targetxid))
+		elog(ERROR, "targetxid %llu is not normal", (unsigned long long) U64FromFullTransactionId(targetxid));
+
+	lastxid = consume_xids_common(targetxid, 0);
+
+	PG_RETURN_INT64((int64) U64FromFullTransactionId(lastxid));
+}
+
+/*
+ * Common functionality between the two public functions.
+ */
+static FullTransactionId
+consume_xids_common(FullTransactionId untilxid, uint64 nxids)
+{
+	FullTransactionId lastxid;
+	uint64		last_reported_at = 0;
+	uint64		consumed = 0;
+
+	/* Print a NOTICE every REPORT_INTERVAL xids */
+#define REPORT_INTERVAL (10*1000000)
+
+	/* initialize 'lastxid' with the system's current next XID */
+	lastxid = ReadNextFullTransactionId();
+
+	/*
+	 * We consume XIDs by calling GetNewTransactionId(true), which marks the
+	 * consumed XIDs as subtransactions of the current top-level transaction.
+	 * For that to work, this transaction must have a top-level XID.
+	 *
+	 * GetNewTransactionId registers them in the subxid cache in PGPROC, until
+	 * the cache overflows, but beyond that, we don't keep track of the
+	 * consumed XIDs.
+	 */
+	(void) GetTopTransactionId();
+
+	for (;;)
+	{
+		uint64		xids_left;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* How many XIDs do we have left to consume? */
+		if (nxids > 0)
+		{
+			if (consumed >= nxids)
+				break;
+			xids_left = nxids - consumed;
+		}
+		else
+		{
+			if (FullTransactionIdFollowsOrEquals(lastxid, untilxid))
+				break;
+			xids_left = U64FromFullTransactionId(untilxid) - U64FromFullTransactionId(lastxid);
+		}
+
+		/*
+		 * If we still have plenty of XIDs to consume, try to take a shortcut
+		 * and bump up the nextXid counter directly.
+		 */
+		if (xids_left > 2000 &&
+			consumed - last_reported_at < REPORT_INTERVAL &&
+			MyProc->subxidStatus.overflowed)
+		{
+			int64		consumed_by_shortcut = consume_xids_shortcut();
+
+			if (consumed_by_shortcut > 0)
+			{
+				consumed += consumed_by_shortcut;
+				continue;
+			}
+		}
+
+		/* Slow path: Call GetNewTransactionId to allocate a new XID. */
+		lastxid = GetNewTransactionId(true);
+		consumed++;
+
+		/* Report progress */
+		if (consumed - last_reported_at >= REPORT_INTERVAL)
+		{
+			if (nxids > 0)
+				elog(NOTICE, "consumed %llu / %llu XIDs, latest %u:%u",
+					 (unsigned long long) consumed, (unsigned long long) nxids,
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid));
+			else
+				elog(NOTICE, "consumed up to %u:%u / %u:%u",
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid),
+					 EpochFromFullTransactionId(untilxid),
+					 XidFromFullTransactionId(untilxid));
+			last_reported_at = consumed;
+		}
+	}
+
+	return lastxid;
+}
+
+
+/*
+ * These constants are copied from .c files, because they're private.
+ */
+#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10)
+#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / sizeof(TransactionId))
+#define CLOG_XACTS_PER_BYTE 4
+#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)
+
+/*
+ * All the interesting action in GetNewTransactionId happens when we extend
+ * the SLRUs, or at the uint32 wraparound. If the nextXid counter is not close
+ * to any of those interesting values, take a shortcut and bump nextXID
+ * directly, close to the next "interesting" value.
+ */
+static inline uint32
+XidSkip(FullTransactionId fullxid)
+{
+	uint32		low = XidFromFullTransactionId(fullxid);
+	uint32		rem;
+	uint32		distance;
+
+	if (low < 5 || low >= UINT32_MAX - 5)
+		return 0;
+	distance = UINT32_MAX - 5 - low;
+
+	rem = low % COMMIT_TS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, COMMIT_TS_XACTS_PER_PAGE - rem);
+
+	rem = low % SUBTRANS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, SUBTRANS_XACTS_PER_PAGE - rem);
+
+	rem = low % CLOG_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, CLOG_XACTS_PER_PAGE - rem);
+
+	return distance;
+}
+
+static int64
+consume_xids_shortcut(void)
+{
+	FullTransactionId nextXid;
+	uint32		consumed;
+
+	LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+	nextXid = ShmemVariableCache->nextXid;
+
+	/*
+	 * Go slow near the "interesting values". The interesting zones include 5
+	 * transactions before and after SLRU page switches.
+	 */
+	consumed = XidSkip(nextXid);
+	if (consumed > 0)
+		ShmemVariableCache->nextXid.value += (uint64) consumed;
+
+	LWLockRelease(XidGenLock);
+
+	return consumed;
+}
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.control b/src/test/modules/xid_wraparound/xid_wraparound.control
new file mode 100644
index 00000000000..6c6964ed3d3
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.control
@@ -0,0 +1,4 @@
+comment = 'Tests for XID wraparound'
+default_version = '1.0'
+module_pathname = '$libdir/xid_wraparound'
+relocatable = true
-- 
2.30.2

#24Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Heikki Linnakangas (#22)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Mar 3, 2023 at 8:34 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 16/11/2022 06:38, Ian Lawrence Barwick wrote:

Thanks for the patch. While reviewing the patch backlog, we have determined that
the latest version of this patch was submitted before meson support was
implemented, so it should have a "meson.build" file added for consideration for
inclusion in PostgreSQL 16.

I wanted to do some XID wraparound testing again, to test the 64-bit
SLRUs patches [1], and revived this.

Thank you for reviving this thread!

I took a different approach to consuming the XIDs. Instead of setting
nextXID directly, bypassing GetNewTransactionId(), this patch introduces
a helper function to call GetNewTransactionId() repeatedly. But because
that's slow, it does include a shortcut to skip over "uninteresting"
XIDs. Whenever nextXid is close to an SLRU page boundary or XID
wraparound, it calls GetNewTransactionId(), and otherwise it bumps up
nextXid close to the next "interesting" value. That's still a lot slower
than just setting nextXid, but exercises the code more realistically.
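
To make the shortcut concrete, here is a minimal standalone sketch of the skip computation, modeled on XidSkip() in the attached patch. It assumes the default 8 kB BLCKSZ; the names xid_skip and min_u32 are hypothetical, and the per-page constants are copied from the patch rather than taken from the server headers, so treat this as an illustration and not as the module's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default 8 kB block size; constants mirror those in the patch. */
#define BLCKSZ 8192u
#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10u)	/* 819 */
#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / 4u)		/* 2048 */
#define CLOG_XACTS_PER_PAGE (BLCKSZ * 4u)			/* 32768 */

static uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Given the low 32 bits of nextXid, return how many XIDs can be skipped
 * before reaching the next "interesting" value (an SLRU page boundary or
 * the uint32 wraparound). A result of 0 means: do not skip, allocate the
 * XID the slow way.
 */
static uint32_t
xid_skip(uint32_t low)
{
	uint32_t	distance;
	uint32_t	rem;

	/* Go slow right around the 32-bit wraparound point. */
	if (low < 5 || low >= UINT32_MAX - 5)
		return 0;
	distance = UINT32_MAX - 5 - low;

	/* Never jump past the next commit_ts, subtrans or clog page boundary. */
	rem = low % COMMIT_TS_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, COMMIT_TS_XACTS_PER_PAGE - rem);

	rem = low % SUBTRANS_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, SUBTRANS_XACTS_PER_PAGE - rem);

	rem = low % CLOG_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, CLOG_XACTS_PER_PAGE - rem);

	/* Reaching this point implies at least one XID can be skipped. */
	assert(distance > 0);
	return distance;
}
```

With these constants, xid_skip(1000) yields 638, which lands nextXid on 1638, the next commit_ts page boundary (2 * 819). xid_skip(2048) yields 0 because 2048 is itself a subtrans page boundary, so that XID would be allocated through the slow path.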

I've written some variant of this helper function many times over the
years, for ad hoc testing. I'd love to have it permanently in the git tree.

These functions seem to be better than mine.

In addition to Masahiko's test for emergency vacuum, this includes two
other tests. 002_limits.pl tests the "warn limit" and "stop limit" in
GetNewTransactionId(), and 003_wraparounds.pl burns through 10 billion
transactions in total, exercising XID wraparound in general.
Unfortunately these tests are pretty slow; the tests run for about 4
minutes on my laptop in total, and use about 20 GB of disk space. So
perhaps these need to be put in a special test suite that's not run as
part of "check-world". Or perhaps leave out the 003_wraparounds.pl test,
that's the slowest of the tests. But I'd love to have these in the git
tree in some form.
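
For readers following along, the "warn limit" and "stop limit" arithmetic can be sketched roughly as below. This is a simplified illustration, not the server's code: the margins of 3 million and 40 million XIDs are assumptions based on recent PostgreSQL releases, the adjustment for the special XIDs 0-2 is left out, and the names compute_limits, classify and xid_precedes are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed margins before the wrap point; not taken from this thread. */
#define STOP_MARGIN 3000000u
#define WARN_MARGIN 40000000u

typedef struct
{
	uint32_t	wrap;			/* XID at which data loss would start */
	uint32_t	stop;			/* new XIDs are refused from here on */
	uint32_t	warn;			/* warnings are emitted from here on */
} XidLimits;

typedef enum
{
	XID_OK,
	XID_WARN,
	XID_STOP
} XidState;

/*
 * Circular comparison: does a logically precede b? This relies on the usual
 * two's complement conversion of the unsigned difference to int32_t.
 */
static int
xid_precedes(uint32_t a, uint32_t b)
{
	return (int32_t) (a - b) < 0;
}

/* Derive the limits from the oldest datfrozenxid; arithmetic wraps mod 2^32. */
static XidLimits
compute_limits(uint32_t oldest_frozen)
{
	XidLimits	l;

	assert(WARN_MARGIN > STOP_MARGIN);
	l.wrap = oldest_frozen + (UINT32_MAX >> 1);
	l.stop = l.wrap - STOP_MARGIN;
	l.warn = l.wrap - WARN_MARGIN;
	return l;
}

/* Classify the next XID to be assigned against the limits. */
static XidState
classify(uint32_t next_xid, XidLimits l)
{
	if (!xid_precedes(next_xid, l.stop))
		return XID_STOP;
	if (!xid_precedes(next_xid, l.warn))
		return XID_WARN;
	return XID_OK;
}
```

Under these assumptions, with an oldest datfrozenxid of 700 (a made-up value), a next XID of 2,000,000,000 is still fine, 2,110,000,000 is in the warning zone, and 2,145,000,000 is past the stop limit. That matches the shape of 002_limits.pl: consume 2 billion XIDs, continue in 10 million batches until the warning appears, then keep going until the error.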

cfbot reports some failures. The main reason seems to be that meson.build
in the xid_wraparound directory adds regression tests, but the .sql and
.out files are missing from the patch. Perhaps the patch intends to add
only TAP tests, since the Makefile doesn't define REGRESS?

Even after fixing this issue, the CI tests (Cirrus CI) are not happy and
report failures due to a full disk. The size of the xid_wraparound test
directory is 105MB out of 262MB:

% du -sh testrun
262M testrun
% du -sh testrun/xid_wraparound/
105M testrun/xid_wraparound/
% du -sh testrun/xid_wraparound/*
460K testrun/xid_wraparound/001_emergency_vacuum
93M testrun/xid_wraparound/002_limits
12M testrun/xid_wraparound/003_wraparounds
% ls -lh testrun/xid_wraparound/002_limits/log*
total 93M
-rw-------. 1 masahiko masahiko 93M Mar 7 17:34 002_limits_wraparound.log
-rw-rw-r--. 1 masahiko masahiko 20K Mar 7 17:34 regress_log_002_limits

The biggest file is the server log, since an autovacuum worker writes
autovacuum log entries for every table every second (autovacuum_naptime
is 1s). Maybe we can set the log_autovacuum_min_duration reloption for
the test tables instead of enabling it globally.

The 001 test uses a 2PC transaction that holds locks on the tables, but
since we can consume XIDs while the server is running, we don't need
that. Instead, I think we can keep a transaction open in the background,
like the 002 test does.

I'll try these ideas.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#25Peter Geoghegan
pg@bowt.ie
In reply to: Heikki Linnakangas (#22)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Mar 3, 2023 at 3:34 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

I took a different approach to consuming the XIDs. Instead of setting
nextXID directly, bypassing GetNewTransactionId(), this patch introduces
a helper function to call GetNewTransactionId() repeatedly. But because
that's slow, it does include a shortcut to skip over "uninteresting"
XIDs. Whenever nextXid is close to an SLRU page boundary or XID
wraparound, it calls GetNewTransactionId(), and otherwise it bumps up
nextXid close to the next "interesting" value. That's still a lot slower
than just setting nextXid, but exercises the code more realistically.

Surely your tap test should be using single user mode? Perhaps you
missed the obnoxious HINT, that's part of the WARNING that the test
parses? ;-)

This is a very useful patch. I certainly don't want to make life
harder by (say) connecting it to the single user mode problem.
But...the single user mode thing really needs to go away. It's just
terrible advice, and actively harms users.

--
Peter Geoghegan

#26Michael Paquier
michael@paquier.xyz
In reply to: Peter Geoghegan (#25)
Re: Testing autovacuum wraparound (including failsafe)

On Tue, Mar 07, 2023 at 09:21:00PM -0800, Peter Geoghegan wrote:

Surely your tap test should be using single user mode? Perhaps you
missed the obnoxious HINT, that's part of the WARNING that the test
parses? ;-)

I may be missing something, but you cannot directly use a "postgres"
command in a TAP test, can you? See 1a9d802, which fixed a problem
when TAP tests run with a privileged account on Windows.
--
Michael

#27Peter Geoghegan
pg@bowt.ie
In reply to: Michael Paquier (#26)
Re: Testing autovacuum wraparound (including failsafe)

On Wed, Mar 8, 2023 at 10:47 PM Michael Paquier <michael@paquier.xyz> wrote:

I may be missing something, but you cannot directly use a "postgres"
command in a TAP test, can you? See 1a9d802, which fixed a problem
when TAP tests run with a privileged account on Windows.

I was joking. But I did have a real point: once we have tests for the
xidStopLimit mechanism, why not take the opportunity to correct the
long standing issue with the documentation advising the use of single
user mode?

--
Peter Geoghegan

#28Jacob Champion
jchampion@timescale.com
In reply to: Peter Geoghegan (#27)
Re: Testing autovacuum wraparound (including failsafe)

On Sat, Mar 11, 2023 at 8:47 PM Peter Geoghegan <pg@bowt.ie> wrote:

I was joking. But I did have a real point: once we have tests for the
xidStopLimit mechanism, why not take the opportunity to correct the
long standing issue with the documentation advising the use of single
user mode?

Does https://commitfest.postgresql.org/42/4128/ address that
independently enough?

--Jacob

#29Peter Geoghegan
pg@bowt.ie
In reply to: Jacob Champion (#28)
Re: Testing autovacuum wraparound (including failsafe)

On Mon, Mar 13, 2023 at 3:25 PM Jacob Champion <jchampion@timescale.com> wrote:

Does https://commitfest.postgresql.org/42/4128/ address that
independently enough?

I wasn't aware of that patch. It looks like it does exactly what I was
arguing in favor of. So yes.

--
Peter Geoghegan

#30Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#24)
1 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

On Wed, Mar 8, 2023 at 1:52 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Mar 3, 2023 at 8:34 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 16/11/2022 06:38, Ian Lawrence Barwick wrote:

Thanks for the patch. While reviewing the patch backlog, we have determined that
the latest version of this patch was submitted before meson support was
implemented, so it should have a "meson.build" file added for consideration for
inclusion in PostgreSQL 16.

I wanted to do some XID wraparound testing again, to test the 64-bit
SLRUs patches [1], and revived this.

Thank you for reviving this thread!

I took a different approach to consuming the XIDs. Instead of setting
nextXID directly, bypassing GetNewTransactionId(), this patch introduces
a helper function to call GetNewTransactionId() repeatedly. But because
that's slow, it does include a shortcut to skip over "uninteresting"
XIDs. Whenever nextXid is close to an SLRU page boundary or XID
wraparound, it calls GetNewTransactionId(), and otherwise it bumps up
nextXid close to the next "interesting" value. That's still a lot slower
than just setting nextXid, but exercises the code more realistically.

I've written some variant of this helper function many times over the
years, for ad hoc testing. I'd love to have it permanently in the git tree.

These functions seem to be better than mine.

In addition to Masahiko's test for emergency vacuum, this includes two
other tests. 002_limits.pl tests the "warn limit" and "stop limit" in
GetNewTransactionId(), and 003_wraparounds.pl burns through 10 billion
transactions in total, exercising XID wraparound in general.
Unfortunately these tests are pretty slow; the tests run for about 4
minutes on my laptop in total, and use about 20 GB of disk space. So
perhaps these need to be put in a special test suite that's not run as
part of "check-world". Or perhaps leave out the 003_wraparounds.pl test,
that's the slowest of the tests. But I'd love to have these in the git
tree in some form.
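
As a side note on what "10 billion transactions" means for the counter: the patch reports progress as epoch:xid pairs taken from the 64-bit FullTransactionId. A rough sketch of that split follows. The helper names epoch_of, xid_of and advance are hypothetical, and the sketch ignores that the server skips the special XIDs 0-2 on every wrap, so real values differ slightly.

```c
#include <assert.h>
#include <stdint.h>

/* High 32 bits: how many times the 32-bit XID counter has wrapped. */
static uint32_t
epoch_of(uint64_t full_xid)
{
	return (uint32_t) (full_xid >> 32);
}

/* Low 32 bits: the XID as stored in tuple headers. */
static uint32_t
xid_of(uint64_t full_xid)
{
	return (uint32_t) (full_xid & 0xFFFFFFFFu);
}

/* Advance the 64-bit counter by n XIDs, refusing 64-bit overflow. */
static uint64_t
advance(uint64_t full_xid, uint64_t n)
{
	uint64_t	next = full_xid + n;

	assert(next >= full_xid);
	return next;
}
```

Starting from zero and advancing by 10,000,000,000 gives epoch 2 and a 32-bit XID of 1,410,065,408, so a run like 003_wraparounds.pl crosses the 32-bit boundary twice.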

cfbot reports some failures. The main reason seems to be that meson.build
in the xid_wraparound directory adds regression tests, but the .sql and
.out files are missing from the patch. Perhaps the patch intends to add
only TAP tests, since the Makefile doesn't define REGRESS?

Even after fixing this issue, the CI tests (Cirrus CI) are not happy and
report failures due to a full disk. The size of the xid_wraparound test
directory is 105MB out of 262MB:

% du -sh testrun
262M testrun
% du -sh testrun/xid_wraparound/
105M testrun/xid_wraparound/
% du -sh testrun/xid_wraparound/*
460K testrun/xid_wraparound/001_emergency_vacuum
93M testrun/xid_wraparound/002_limits
12M testrun/xid_wraparound/003_wraparounds
% ls -lh testrun/xid_wraparound/002_limits/log*
total 93M
-rw-------. 1 masahiko masahiko 93M Mar 7 17:34 002_limits_wraparound.log
-rw-rw-r--. 1 masahiko masahiko 20K Mar 7 17:34 regress_log_002_limits

The biggest file is the server log, since an autovacuum worker writes
autovacuum log entries for every table every second (autovacuum_naptime
is 1s). Maybe we can set the log_autovacuum_min_duration reloption for
the test tables instead of enabling it globally.

I think that could be acceptable, since the 002 and 003 tests are executed
only when required. The 001 test seems to pass on cfbot, but it takes more
than 30 seconds. In the attached patch, I made these tests optional; they
are enabled only if the environment variable ENABLE_XID_WRAPAROUND_TESTS
is defined (supported only for autoconf so far).

The 001 test uses a 2PC transaction that holds locks on the tables, but
since we can consume XIDs while the server is running, we don't need
that. Instead, I think we can keep a transaction open in the background,
like the 002 test does.

Updated in the new patch. I also added a check that the failsafe mode
is triggered.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v2-0001-Add-tests-for-XID-wraparound.patch (application/octet-stream)
From 851b856feb829c6a1bed041aec7febdc3928fc04 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 3 Mar 2023 12:01:28 +0200
Subject: [PATCH v2] Add tests for XID wraparound.

The test module includes helper functions to quickly burn through lots
of XIDs. They are used in the tests, and are also handy for manually
testing XID wraparound.

Author: Masahiko Sawada, Heikki Linnakangas
Discussion: https://www.postgresql.org/message-id/CAD21AoDVhkXp8HjpFO-gp3TgL6tCKcZQNxn04m01VAtcSi-5sA%40mail.gmail.com
---
 src/test/modules/Makefile                     |   1 +
 src/test/modules/meson.build                  |   1 +
 src/test/modules/xid_wraparound/.gitignore    |   4 +
 src/test/modules/xid_wraparound/Makefile      |  28 +++
 src/test/modules/xid_wraparound/README        |   3 +
 src/test/modules/xid_wraparound/meson.build   |  43 ++++
 .../xid_wraparound/t/001_emergency_vacuum.pl  | 120 ++++++++++
 .../modules/xid_wraparound/t/002_limits.pl    | 114 +++++++++
 .../xid_wraparound/t/003_wraparounds.pl       |  46 ++++
 .../xid_wraparound/xid_wraparound--1.0.sql    |  12 +
 .../modules/xid_wraparound/xid_wraparound.c   | 221 ++++++++++++++++++
 .../xid_wraparound/xid_wraparound.control     |   4 +
 12 files changed, 597 insertions(+)
 create mode 100644 src/test/modules/xid_wraparound/.gitignore
 create mode 100644 src/test/modules/xid_wraparound/Makefile
 create mode 100644 src/test/modules/xid_wraparound/README
 create mode 100644 src/test/modules/xid_wraparound/meson.build
 create mode 100644 src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
 create mode 100644 src/test/modules/xid_wraparound/t/002_limits.pl
 create mode 100644 src/test/modules/xid_wraparound/t/003_wraparounds.pl
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.c
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.control

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index c629cbe383..99f5fa23f1 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
 		  plsample \
 		  snapshot_too_old \
 		  spgist_name_ops \
+		  xid_wraparound \
 		  test_bloomfilter \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 1baa6b558d..39de43dee3 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -10,6 +10,7 @@ subdir('plsample')
 subdir('snapshot_too_old')
 subdir('spgist_name_ops')
 subdir('ssl_passphrase_callback')
+subdir('xid_wraparound')
 subdir('test_bloomfilter')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
diff --git a/src/test/modules/xid_wraparound/.gitignore b/src/test/modules/xid_wraparound/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/src/test/modules/xid_wraparound/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/xid_wraparound/Makefile b/src/test/modules/xid_wraparound/Makefile
new file mode 100644
index 0000000000..fc5ead6cc5
--- /dev/null
+++ b/src/test/modules/xid_wraparound/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/xid_wraparound/Makefile
+
+MODULE_big = xid_wraparound
+OBJS = \
+	$(WIN32RES) \
+	xid_wraparound.o
+PGFILEDESC = "xid_wraparound - tests for XID wraparound"
+
+EXTENSION = xid_wraparound
+DATA = xid_wraparound--1.0.sql
+
+# Disabled by default because these tests could take a long time,
+# which typical installcheck users cannot tolerate (e.g. buildfarm
+# clients).
+ifdef ENABLE_XID_WRAPAROUND_TESTS
+TAP_TESTS = 1
+endif
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/xid_wraparound
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/xid_wraparound/README b/src/test/modules/xid_wraparound/README
new file mode 100644
index 0000000000..3aab464dec
--- /dev/null
+++ b/src/test/modules/xid_wraparound/README
@@ -0,0 +1,3 @@
+This module contains tests for XID wraparound. The tests use two
+helper functions to quickly consume lots of XIDs, to reach XID
+wraparound faster.
diff --git a/src/test/modules/xid_wraparound/meson.build b/src/test/modules/xid_wraparound/meson.build
new file mode 100644
index 0000000000..bdd55f22c4
--- /dev/null
+++ b/src/test/modules/xid_wraparound/meson.build
@@ -0,0 +1,43 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# FIXME: prevent install during main install, but not during test :/
+
+xid_wraparound_sources = files(
+  'xid_wraparound.c',
+)
+
+if host_system == 'windows'
+  xid_wraparound_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'xid_wraparound',
+    '--FILEDESC', 'xid_wraparound - tests for XID wraparound',])
+endif
+
+xid_wraparound = shared_module('xid_wraparound',
+  xid_wraparound_sources,
+  kwargs: pg_mod_args,
+)
+testprep_targets += xid_wraparound
+
+install_data(
+  'xid_wraparound.control',
+  'xid_wraparound--1.0.sql',
+  kwargs: contrib_data_args,
+)
+
+# Disabled by default because these test could take a long time,
+# which typical installcheck users cannot tolerate (e.g. buildfarm
+# clients).
+if false
+  tests += {
+    'name': 'xid_wraparound',
+    'sd': meson.current_source_dir(),
+    'bd': meson.current_build_dir(),
+    'tap': {
+      'tests': [
+        't/001_emergency_vacuum.pl',
+        't/002_limits.pl',
+        't/003_wraparounds.pl',
+      ],
+    },
+  }
+endif
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
new file mode 100644
index 0000000000..bd9bb4b98c
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -0,0 +1,120 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test wraparound emergency autovacuum.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('main');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only when to anti wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create tables for a few different test scenarios
+$node->safe_psql('postgres', qq[
+CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large(data) SELECT generate_series(1,30000);
+
+CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large_trunc(data) SELECT generate_series(1,30000);
+
+CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small(data) SELECT generate_series(1,15000);
+
+CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small_trunc(data) SELECT generate_series(1,15000);
+
+CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
+INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
+]);
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $in  = '';
+my $out = '';
+my $timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+my $background_psql = $node->background_psql('postgres', \$in, \$out, $timeout);
+$in .= q{
+	BEGIN;
+	DELETE FROM large WHERE id % 2 = 0;
+	DELETE FROM large_trunc WHERE id > 10000;
+	DELETE FROM small WHERE id % 2 = 0;
+	DELETE FROM small_trunc WHERE id > 1000;
+	DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
+};
+$background_psql->pump_nb;
+
+# Consume 2 billion XIDs, to get us very close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids_until('2000000000'::bigint)]);
+
+# Make sure the latest completed XID is advanced
+$node->safe_psql('postgres', qq[INSERT INTO small(data) SELECT 1]);
+
+# Check that all databases became old enough to trigger failsafe.
+my $ret = $node->safe_psql('postgres',
+			   qq[
+SELECT datname,
+       age(datfrozenxid) > current_setting('vacuum_failsafe_age')::int as old
+FROM pg_database ORDER BY 1
+]);
+is($ret, "postgres|t
+template0|t
+template1|t", "all tables became old");
+
+my $log_offset = -s $node->logfile;
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$in .= q{
+COMMIT;
+\q
+};
+$background_psql->finish;
+
+# Wait until autovacuum processed all tables and advanced the
+# system-wide oldest-XID.
+$node->poll_query_until('postgres',
+			qq[
+SELECT NOT EXISTS (
+	SELECT *
+	FROM pg_database
+	WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
+]) or die "timeout waiting for all databases to be vacuumed";
+
+# Check if these tables are vacuumed.
+$ret = $node->safe_psql('postgres', qq[
+SELECT relname, age(relfrozenxid) > current_setting('autovacuum_freeze_max_age')::int
+FROM pg_class
+WHERE relname IN ('large', 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled')
+ORDER BY 1
+]);
+
+is($ret, "autovacuum_disabled|f
+large|f
+large_trunc|f
+small|f
+small_trunc|f", "all tables are vacuumed");
+
+# Check if vacuum failsafe was triggered for each table.
+my $log_contents = slurp_file($node->logfile, $log_offset);
+foreach my $tablename ('large', 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled')
+{
+    like(
+	$log_contents,
+	qr/bypassing nonessential maintenance of table "postgres.public.$tablename" as a failsafe after \d+ index scans/,
+	"failsafe vacuum triggered for $tablename");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/002_limits.pl b/src/test/modules/xid_wraparound/t/002_limits.pl
new file mode 100644
index 0000000000..ff2743746b
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/002_limits.pl
@@ -0,0 +1,114 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Test XID wraparound limits.
+#
+# When you get close to XID wraparound, you start to get warnings, and
+# when you get even closer, the system refuses to assign any more XIDs
+# until the oldest databases have been vacuumed and datfrozenxid has
+# been advanced.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+my $ret;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql('postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('start');
+]);
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $in  = '';
+my $out = '';
+my $timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+my $background_psql = $node->background_psql('postgres', \$in, \$out, $timeout);
+$in .= q{
+	BEGIN;
+	INSERT INTO wraparoundtest VALUES ('oldxact');
+};
+$background_psql->pump_nb;
+
+# Consume 2 billion transactions, to get close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after 1 billion')]);
+
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after 2 billion')]);
+
+# We are now just under 150 million XIDs away from wraparound.
+# Continue consuming XIDs, in batches of 10 million, until we get
+# the warning:
+#
+#  WARNING:  database "postgres" must be vacuumed within 3000024 transactions
+#  HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
+#  You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
+my $stderr;
+my $warn_limit = 0;
+for my $i (1 .. 15)
+{
+	$node->psql('postgres', qq[SELECT consume_xids(10000000)], stderr => \$stderr, on_error_die => 1);
+
+	if ($stderr =~ /WARNING:  database "postgres" must be vacuumed within [0-9]+ transactions/)
+	{
+		# Reached the warn-limit
+		$warn_limit = 1;
+		last;
+	}
+}
+ok($warn_limit == 1, "warn-limit reached");
+
+# We can still INSERT, despite the warnings.
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('reached warn-limit')]);
+
+# Keep going. We'll hit the hard "stop" limit.
+$ret = $node->psql('postgres', qq[SELECT consume_xids(100000000)], stderr => \$stderr);
+like($stderr, qr/ERROR:  database is not accepting commands to avoid wraparound data loss/, "stop-limit");
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$in .= q{
+COMMIT;
+\q
+};
+$background_psql->finish;
+
+# VACUUM, to freeze the tables and advance datfrozenxid.
+#
+# Autovacuum does this for the other databases, and would do it for
+# 'postgres' too, but let's test manual VACUUM.
+#
+$node->safe_psql('postgres', 'VACUUM');
+
+# Wait until autovacuum has processed the other databases and advanced
+# the system-wide oldest-XID.
+$ret = $node->poll_query_until('postgres', qq[INSERT INTO wraparoundtest VALUES ('after VACUUM')], 'INSERT 0 1');
+
+# Check the table contents
+$ret = $node->safe_psql('postgres', qq[SELECT * from wraparoundtest]);
+is($ret, "start
+oldxact
+after 1 billion
+after 2 billion
+reached warn-limit
+after VACUUM");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
new file mode 100644
index 0000000000..e79adbd12d
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -0,0 +1,46 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Consume a lot of XIDs, wrapping around a few times.
+#
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only when to anti wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql('postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('beginning');
+]);
+
+# Burn through 10 billion transactions in total, in batches of 100 million.
+my $ret;
+for my $i (1 .. 100)
+{
+	$ret = $node->safe_psql('postgres', qq[SELECT consume_xids(100000000)]);
+	$ret = $node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after $i batches')]);
+}
+
+$ret = $node->safe_psql('postgres', qq[SELECT COUNT(*) FROM wraparoundtest]);
+is($ret, "101");
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
new file mode 100644
index 0000000000..f5577adfdb
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/xid_wraparound/xid_wraparound--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION xid_wraparound" to load this file. \quit
+
+CREATE FUNCTION consume_xids(bigint)
+RETURNS bigint IMMUTABLE PARALLEL SAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION consume_xids_until(bigint)
+RETURNS bigint IMMUTABLE PARALLEL SAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.c b/src/test/modules/xid_wraparound/xid_wraparound.c
new file mode 100644
index 0000000000..c9d6034b55
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.c
@@ -0,0 +1,221 @@
+/*--------------------------------------------------------------------------
+ *
+ * xid_wraparound.c
+ *		Utilities for testing XID wraparound
+ *
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *		src/test/modules/xid_wraparound/xid_wraparound.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "access/xact.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/proc.h"
+
+PG_MODULE_MAGIC;
+
+static int64 consume_xids_shortcut(void);
+static FullTransactionId consume_xids_common(FullTransactionId untilxid, uint64 nxids);
+
+/*
+ * Consume the specified number of XIDs.
+ */
+PG_FUNCTION_INFO_V1(consume_xids);
+Datum
+consume_xids(PG_FUNCTION_ARGS)
+{
+	int64		nxids = PG_GETARG_INT64(0);
+	FullTransactionId lastxid;
+
+	if (nxids < 0)
+		elog(ERROR, "invalid nxids argument: %lld", (long long) nxids);
+
+	if (nxids == 0)
+		lastxid = ReadNextFullTransactionId();
+	else
+		lastxid = consume_xids_common(InvalidFullTransactionId, (uint64) nxids);
+
+	PG_RETURN_INT64((int64) U64FromFullTransactionId(lastxid));
+}
+
+/*
+ * Consume XIDs, up to the given XID.
+ */
+PG_FUNCTION_INFO_V1(consume_xids_until);
+Datum
+consume_xids_until(PG_FUNCTION_ARGS)
+{
+	FullTransactionId targetxid = FullTransactionIdFromU64((uint64) PG_GETARG_INT64(0));
+	FullTransactionId lastxid;
+
+	if (!FullTransactionIdIsNormal(targetxid))
+		elog(ERROR, "targetxid %llu is not normal", (unsigned long long) U64FromFullTransactionId(targetxid));
+
+	lastxid = consume_xids_common(targetxid, 0);
+
+	PG_RETURN_INT64((int64) U64FromFullTransactionId(lastxid));
+}
+
+/*
+ * Common functionality between the two public functions.
+ */
+static FullTransactionId
+consume_xids_common(FullTransactionId untilxid, uint64 nxids)
+{
+	FullTransactionId lastxid;
+	uint64		last_reported_at = 0;
+	uint64		consumed = 0;
+
+	/* Print a NOTICE every REPORT_INTERVAL xids */
+#define REPORT_INTERVAL (10*1000000)
+
+	/* initialize 'lastxid' with the system's current next XID */
+	lastxid = ReadNextFullTransactionId();
+
+	/*
+	 * We consume XIDs by calling GetNewTransactionId(true), which marks the
+	 * consumed XIDs as subtransactions of the current top-level transaction.
+	 * For that to work, this transaction must have a top-level XID.
+	 *
+	 * GetNewTransactionId registers them in the subxid cache in PGPROC, until
+	 * the cache overflows, but beyond that, we don't keep track of the
+	 * consumed XIDs.
+	 */
+	(void) GetTopTransactionId();
+
+	for (;;)
+	{
+		uint64		xids_left;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* How many XIDs do we have left to consume? */
+		if (nxids > 0)
+		{
+			if (consumed >= nxids)
+				break;
+			xids_left = nxids - consumed;
+		}
+		else
+		{
+			if (FullTransactionIdFollowsOrEquals(lastxid, untilxid))
+				break;
+			xids_left = U64FromFullTransactionId(untilxid) - U64FromFullTransactionId(lastxid);
+		}
+
+		/*
+		 * If we still have plenty of XIDs to consume, try to take a shortcut
+		 * and bump up the nextXid counter directly.
+		 */
+		if (xids_left > 2000 &&
+			consumed - last_reported_at < REPORT_INTERVAL &&
+			MyProc->subxidStatus.overflowed)
+		{
+			int64		consumed_by_shortcut = consume_xids_shortcut();
+
+			if (consumed_by_shortcut > 0)
+			{
+				consumed += consumed_by_shortcut;
+				continue;
+			}
+		}
+
+		/* Slow path: Call GetNewTransactionId to allocate a new XID. */
+		lastxid = GetNewTransactionId(true);
+		consumed++;
+
+		/* Report progress */
+		if (consumed - last_reported_at >= REPORT_INTERVAL)
+		{
+			if (nxids > 0)
+				elog(NOTICE, "consumed %llu / %llu XIDs, latest %u:%u",
+					 (unsigned long long) consumed, (unsigned long long) nxids,
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid));
+			else
+				elog(NOTICE, "consumed up to %u:%u / %u:%u",
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid),
+					 EpochFromFullTransactionId(untilxid),
+					 XidFromFullTransactionId(untilxid));
+			last_reported_at = consumed;
+		}
+	}
+
+	return lastxid;
+}
+
+
+/*
+ * These constants copied from .c files, because they're private.
+ */
+#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10)
+#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / sizeof(TransactionId))
+#define CLOG_XACTS_PER_BYTE 4
+#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)
+
+/*
+ * All the interesting action in GetNewTransactionId happens when we extend
+ * the SLRUs, or at the uint32 wraparound. If the nextXid counter is not close
+ * to any of those interesting values, take a shortcut and bump nextXID
+ * directly, close to the next "interesting" value.
+ */
+static inline uint32
+XidSkip(FullTransactionId fullxid)
+{
+	uint32		low = XidFromFullTransactionId(fullxid);
+	uint32		rem;
+	uint32		distance;
+
+	if (low < 5 || low >= UINT32_MAX - 5)
+		return 0;
+	distance = UINT32_MAX - 5 - low;
+
+	rem = low % COMMIT_TS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, COMMIT_TS_XACTS_PER_PAGE - rem);
+
+	rem = low % SUBTRANS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, SUBTRANS_XACTS_PER_PAGE - rem);
+
+	rem = low % CLOG_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, CLOG_XACTS_PER_PAGE - rem);
+
+	return distance;
+}
+
+static int64
+consume_xids_shortcut(void)
+{
+	FullTransactionId nextXid;
+	uint32		consumed;
+
+	LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+	nextXid = ShmemVariableCache->nextXid;
+
+	/*
+	 * Go slow near the "interesting values". The interesting zones include 5
+	 * transactions before and after SLRU page switches.
+	 */
+	consumed = XidSkip(nextXid);
+	if (consumed > 0)
+		ShmemVariableCache->nextXid.value += (uint64) consumed;
+
+	LWLockRelease(XidGenLock);
+
+	return consumed;
+}
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.control b/src/test/modules/xid_wraparound/xid_wraparound.control
new file mode 100644
index 0000000000..6c6964ed3d
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.control
@@ -0,0 +1,4 @@
+comment = 'Tests for XID wraparound'
+default_version = '1.0'
+module_pathname = '$libdir/xid_wraparound'
+relocatable = true
-- 
2.31.1

#31John Naylor
john.naylor@enterprisedb.com
In reply to: Masahiko Sawada (#30)
Re: Testing autovacuum wraparound (including failsafe)

I agree having the new functions in the tree is useful. I also tried
running the TAP tests in v2, but 001 and 002 fail to run:

Odd number of elements in hash assignment at [...]/Cluster.pm line 2010.
Can't locate object method "pump_nb" via package
"PostgreSQL::Test::BackgroundPsql" at [...]

It seems to be complaining about

+my $in  = '';
+my $out = '';
+my $timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+my $background_psql = $node->background_psql('postgres', \$in, \$out, $timeout);

...that call to background_psql doesn't look like other ones that have "key
=> value". Is there something I'm missing?

--
John Naylor
EDB: http://www.enterprisedb.com

#32Masahiko Sawada
sawada.mshk@gmail.com
In reply to: John Naylor (#31)
1 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Apr 21, 2023 at 12:02 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

I agree having the new functions in the tree is useful. I also tried running the TAP tests in v2, but 001 and 002 fail to run:

Odd number of elements in hash assignment at [...]/Cluster.pm line 2010.
Can't locate object method "pump_nb" via package "PostgreSQL::Test::BackgroundPsql" at [...]

It seems to be complaining about

+my $in  = '';
+my $out = '';
+my $timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+my $background_psql = $node->background_psql('postgres', \$in, \$out, $timeout);

...that call to background_psql doesn't look like other ones that have "key => value". Is there something I'm missing?

Thanks for reporting. I think that the patch needs to be updated since
commit 664d757531e1 changed the background psql TAP functions. I've
attached the updated patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Add-tests-for-XID-wraparound.patch (application/octet-stream)
From 3cc02492129e56ba719d11c75a55732b22c97258 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 3 Mar 2023 12:01:28 +0200
Subject: [PATCH v3] Add tests for XID wraparound.

The test module includes helper functions to quickly burn through lots
of XIDs. They are used in the tests, and are also handy for manually
testing XID wraparound.

Author: Masahiko Sawada, Heikki Linnakangas
Discussion: https://www.postgresql.org/message-id/CAD21AoDVhkXp8HjpFO-gp3TgL6tCKcZQNxn04m01VAtcSi-5sA%40mail.gmail.com
---
 src/test/modules/Makefile                     |   1 +
 src/test/modules/meson.build                  |   1 +
 src/test/modules/xid_wraparound/.gitignore    |   4 +
 src/test/modules/xid_wraparound/Makefile      |  28 +++
 src/test/modules/xid_wraparound/README        |   3 +
 src/test/modules/xid_wraparound/meson.build   |  43 ++++
 .../xid_wraparound/t/001_emergency_vacuum.pl  | 115 +++++++++
 .../modules/xid_wraparound/t/002_limits.pl    | 109 +++++++++
 .../xid_wraparound/t/003_wraparounds.pl       |  46 ++++
 .../xid_wraparound/xid_wraparound--1.0.sql    |  12 +
 .../modules/xid_wraparound/xid_wraparound.c   | 221 ++++++++++++++++++
 .../xid_wraparound/xid_wraparound.control     |   4 +
 12 files changed, 587 insertions(+)
 create mode 100644 src/test/modules/xid_wraparound/.gitignore
 create mode 100644 src/test/modules/xid_wraparound/Makefile
 create mode 100644 src/test/modules/xid_wraparound/README
 create mode 100644 src/test/modules/xid_wraparound/meson.build
 create mode 100644 src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
 create mode 100644 src/test/modules/xid_wraparound/t/002_limits.pl
 create mode 100644 src/test/modules/xid_wraparound/t/003_wraparounds.pl
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.c
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.control

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 79e3033ec2..2e782519e0 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
 		  plsample \
 		  snapshot_too_old \
 		  spgist_name_ops \
+		  xid_wraparound \
 		  test_bloomfilter \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index dcb82ed68f..30f989b3ce 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -11,6 +11,7 @@ subdir('plsample')
 subdir('snapshot_too_old')
 subdir('spgist_name_ops')
 subdir('ssl_passphrase_callback')
+subdir('xid_wraparound')
 subdir('test_bloomfilter')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
diff --git a/src/test/modules/xid_wraparound/.gitignore b/src/test/modules/xid_wraparound/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/src/test/modules/xid_wraparound/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/xid_wraparound/Makefile b/src/test/modules/xid_wraparound/Makefile
new file mode 100644
index 0000000000..fc5ead6cc5
--- /dev/null
+++ b/src/test/modules/xid_wraparound/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/xid_wraparound/Makefile
+
+MODULE_big = xid_wraparound
+OBJS = \
+	$(WIN32RES) \
+	xid_wraparound.o
+PGFILEDESC = "xid_wraparound - tests for XID wraparound"
+
+EXTENSION = xid_wraparound
+DATA = xid_wraparound--1.0.sql
+
+# Disabled by default because these tests could take a long time,
+# which typical installcheck users cannot tolerate (e.g. buildfarm
+# clients).
+ifdef ENABLE_XID_WRAPAROUND_TESTS
+TAP_TESTS = 1
+endif
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/xid_wraparound
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/xid_wraparound/README b/src/test/modules/xid_wraparound/README
new file mode 100644
index 0000000000..3aab464dec
--- /dev/null
+++ b/src/test/modules/xid_wraparound/README
@@ -0,0 +1,3 @@
+This module contains tests for XID wraparound. The tests use two
+helper functions to quickly consume lots of XIDs, to reach XID
+wraparound faster.
diff --git a/src/test/modules/xid_wraparound/meson.build b/src/test/modules/xid_wraparound/meson.build
new file mode 100644
index 0000000000..bdd55f22c4
--- /dev/null
+++ b/src/test/modules/xid_wraparound/meson.build
@@ -0,0 +1,43 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# FIXME: prevent install during main install, but not during test :/
+
+xid_wraparound_sources = files(
+  'xid_wraparound.c',
+)
+
+if host_system == 'windows'
+  xid_wraparound_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'xid_wraparound',
+    '--FILEDESC', 'xid_wraparound - tests for XID wraparound',])
+endif
+
+xid_wraparound = shared_module('xid_wraparound',
+  xid_wraparound_sources,
+  kwargs: pg_mod_args,
+)
+testprep_targets += xid_wraparound
+
+install_data(
+  'xid_wraparound.control',
+  'xid_wraparound--1.0.sql',
+  kwargs: contrib_data_args,
+)
+
+# Disabled by default because these test could take a long time,
+# which typical installcheck users cannot tolerate (e.g. buildfarm
+# clients).
+if false
+  tests += {
+    'name': 'xid_wraparound',
+    'sd': meson.current_source_dir(),
+    'bd': meson.current_build_dir(),
+    'tap': {
+      'tests': [
+        't/001_emergency_vacuum.pl',
+        't/002_limits.pl',
+        't/003_wraparounds.pl',
+      ],
+    },
+  }
+endif
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
new file mode 100644
index 0000000000..f443c0ab86
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -0,0 +1,115 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test wraparound emergency autovacuum.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('main');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only when to anti wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create tables for a few different test scenarios
+$node->safe_psql('postgres', qq[
+CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large(data) SELECT generate_series(1,30000);
+
+CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large_trunc(data) SELECT generate_series(1,30000);
+
+CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small(data) SELECT generate_series(1,15000);
+
+CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small_trunc(data) SELECT generate_series(1,15000);
+
+CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
+INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
+]);
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+my $background_psql = $node->background_psql('postgres',
+					     on_error_stop => 0);
+$background_psql->query_safe(q{
+	BEGIN;
+	DELETE FROM large WHERE id % 2 = 0;
+	DELETE FROM large_trunc WHERE id > 10000;
+	DELETE FROM small WHERE id % 2 = 0;
+	DELETE FROM small_trunc WHERE id > 1000;
+	DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
+});
+
+# Consume 2 billion XIDs, to get us very close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids_until('2000000000'::bigint)]);
+
+# Make sure the latest completed XID is advanced
+$node->safe_psql('postgres', qq[INSERT INTO small(data) SELECT 1]);
+
+# Check that all databases became old enough to trigger failsafe.
+my $ret = $node->safe_psql('postgres',
+			   qq[
+SELECT datname,
+       age(datfrozenxid) > current_setting('vacuum_failsafe_age')::int as old
+FROM pg_database ORDER BY 1
+]);
+is($ret, "postgres|t
+template0|t
+template1|t", "all tables became old");
+
+my $log_offset = -s $node->logfile;
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$background_psql->query_safe(q{COMMIT;});
+$background_psql->quit;
+
+# Wait until autovacuum processed all tables and advanced the
+# system-wide oldest-XID.
+$node->poll_query_until('postgres',
+			qq[
+SELECT NOT EXISTS (
+	SELECT *
+	FROM pg_database
+	WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
+]) or die "timeout waiting for all databases to be vacuumed";
+
+# Check if these tables are vacuumed.
+$ret = $node->safe_psql('postgres', qq[
+SELECT relname, age(relfrozenxid) > current_setting('autovacuum_freeze_max_age')::int
+FROM pg_class
+WHERE relname IN ('large', 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled')
+ORDER BY 1
+]);
+
+is($ret, "autovacuum_disabled|f
+large|f
+large_trunc|f
+small|f
+small_trunc|f", "all tables are vacuumed");
+
+# Check if vacuum failsafe was triggered for each table.
+my $log_contents = slurp_file($node->logfile, $log_offset);
+foreach my $tablename ('large', 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled')
+{
+    like(
+	$log_contents,
+	qr/bypassing nonessential maintenance of table "postgres.public.$tablename" as a failsafe after \d+ index scans/,
+	"failsafe vacuum triggered for $tablename");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/002_limits.pl b/src/test/modules/xid_wraparound/t/002_limits.pl
new file mode 100644
index 0000000000..6c41fe7919
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/002_limits.pl
@@ -0,0 +1,109 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Test XID wraparound limits.
+#
+# When you get close to XID wraparound, you start to get warnings, and
+# when you get even closer, the system refuses to assign any more XIDs
+# until the oldest databases have been vacuumed and datfrozenxid has
+# been advanced.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+my $ret;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql('postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('start');
+]);
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+my $background_psql = $node->background_psql('postgres',
+					     on_error_stop => 0);
+$background_psql->query_safe(q{
+	BEGIN;
+	INSERT INTO wraparoundtest VALUES ('oldxact');
+});
+
+# Consume 2 billion transactions, to get close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after 1 billion')]);
+
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after 2 billion')]);
+
+# We are now just under 150 million XIDs away from wraparound.
+# Continue consuming XIDs, in batches of 10 million, until we get
+# the warning:
+#
+#  WARNING:  database "postgres" must be vacuumed within 3000024 transactions
+#  HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
+#  You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
+my $stderr;
+my $warn_limit = 0;
+for my $i (1 .. 15)
+{
+	$node->psql('postgres', qq[SELECT consume_xids(10000000)], stderr => \$stderr, on_error_die => 1);
+
+	if ($stderr =~ /WARNING:  database "postgres" must be vacuumed within [0-9]+ transactions/)
+	{
+		# Reached the warn-limit
+		$warn_limit = 1;
+		last;
+	}
+}
+ok($warn_limit == 1, "warn-limit reached");
+
+# We can still INSERT, despite the warnings.
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('reached warn-limit')]);
+
+# Keep going. We'll hit the hard "stop" limit.
+$ret = $node->psql('postgres', qq[SELECT consume_xids(100000000)], stderr => \$stderr);
+like($stderr, qr/ERROR:  database is not accepting commands to avoid wraparound data loss/, "stop-limit");
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$background_psql->query_safe(q{COMMIT;});
+$background_psql->quit;
+
+# VACUUM, to freeze the tables and advance datfrozenxid.
+#
+# Autovacuum does this for the other databases, and would do it for
+# 'postgres' too, but let's test manual VACUUM.
+#
+$node->safe_psql('postgres', 'VACUUM');
+
+# Wait until autovacuum has processed the other databases and advanced
+# the system-wide oldest-XID.
+$ret = $node->poll_query_until('postgres', qq[INSERT INTO wraparoundtest VALUES ('after VACUUM')], 'INSERT 0 1');
+
+# Check the table contents
+$ret = $node->safe_psql('postgres', qq[SELECT * from wraparoundtest]);
+is($ret, "start
+oldxact
+after 1 billion
+after 2 billion
+reached warn-limit
+after VACUUM");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
new file mode 100644
index 0000000000..e79adbd12d
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -0,0 +1,46 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Consume a lot of XIDs, wrapping around a few times.
+#
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql('postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('beginning');
+]);
+
+# Burn through 10 billion transactions in total, in batches of 100 million.
+my $ret;
+for my $i (1 .. 100)
+{
+	$ret = $node->safe_psql('postgres', qq[SELECT consume_xids(100000000)]);
+	$ret = $node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after $i batches')]);
+}
+
+$ret = $node->safe_psql('postgres', qq[SELECT COUNT(*) FROM wraparoundtest]);
+is($ret, "101");
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
new file mode 100644
index 0000000000..f5577adfdb
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/xid_wraparound/xid_wraparound--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION xid_wraparound" to load this file. \quit
+
+CREATE FUNCTION consume_xids(bigint)
+RETURNS bigint IMMUTABLE PARALLEL SAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION consume_xids_until(bigint)
+RETURNS bigint IMMUTABLE PARALLEL SAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.c b/src/test/modules/xid_wraparound/xid_wraparound.c
new file mode 100644
index 0000000000..c9d6034b55
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.c
@@ -0,0 +1,221 @@
+/*--------------------------------------------------------------------------
+ *
+ * xid_wraparound.c
+ *		Utilities for testing XID wraparound
+ *
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *		src/test/modules/xid_wraparound/xid_wraparound.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "access/xact.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/proc.h"
+
+PG_MODULE_MAGIC;
+
+static int64 consume_xids_shortcut(void);
+static FullTransactionId consume_xids_common(FullTransactionId untilxid, uint64 nxids);
+
+/*
+ * Consume the specified number of XIDs.
+ */
+PG_FUNCTION_INFO_V1(consume_xids);
+Datum
+consume_xids(PG_FUNCTION_ARGS)
+{
+	int64		nxids = PG_GETARG_INT64(0);
+	FullTransactionId lastxid;
+
+	if (nxids < 0)
+		elog(ERROR, "invalid nxids argument: %lld", (long long) nxids);
+
+	if (nxids == 0)
+		lastxid = ReadNextFullTransactionId();
+	else
+		lastxid = consume_xids_common(InvalidFullTransactionId, (uint64) nxids);
+
+	PG_RETURN_INT64((int64) U64FromFullTransactionId(lastxid));
+}
+
+/*
+ * Consume XIDs, up to the given XID.
+ */
+PG_FUNCTION_INFO_V1(consume_xids_until);
+Datum
+consume_xids_until(PG_FUNCTION_ARGS)
+{
+	FullTransactionId targetxid = FullTransactionIdFromU64((uint64) PG_GETARG_INT64(0));
+	FullTransactionId lastxid;
+
+	if (!FullTransactionIdIsNormal(targetxid))
+		elog(ERROR, "targetxid %llu is not normal", (unsigned long long) U64FromFullTransactionId(targetxid));
+
+	lastxid = consume_xids_common(targetxid, 0);
+
+	PG_RETURN_INT64((int64) U64FromFullTransactionId(lastxid));
+}
+
+/*
+ * Common functionality between the two public functions.
+ */
+static FullTransactionId
+consume_xids_common(FullTransactionId untilxid, uint64 nxids)
+{
+	FullTransactionId lastxid;
+	uint64		last_reported_at = 0;
+	uint64		consumed = 0;
+
+	/* Print a NOTICE every REPORT_INTERVAL xids */
+#define REPORT_INTERVAL (10*1000000)
+
+	/* initialize 'lastxid' with the system's current next XID */
+	lastxid = ReadNextFullTransactionId();
+
+	/*
+	 * We consume XIDs by calling GetNewTransactionId(true), which marks the
+	 * consumed XIDs as subtransactions of the current top-level transaction.
+	 * For that to work, this transaction must have a top-level XID.
+	 *
+	 * GetNewTransactionId registers them in the subxid cache in PGPROC, until
+	 * the cache overflows, but beyond that, we don't keep track of the
+	 * consumed XIDs.
+	 */
+	(void) GetTopTransactionId();
+
+	for (;;)
+	{
+		uint64		xids_left;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* How many XIDs do we have left to consume? */
+		if (nxids > 0)
+		{
+			if (consumed >= nxids)
+				break;
+			xids_left = nxids - consumed;
+		}
+		else
+		{
+			if (FullTransactionIdFollowsOrEquals(lastxid, untilxid))
+				break;
+			xids_left = U64FromFullTransactionId(untilxid) - U64FromFullTransactionId(lastxid);
+		}
+
+		/*
+		 * If we still have plenty of XIDs to consume, try to take a shortcut
+		 * and bump up the nextXid counter directly.
+		 */
+		if (xids_left > 2000 &&
+			consumed - last_reported_at < REPORT_INTERVAL &&
+			MyProc->subxidStatus.overflowed)
+		{
+			int64		consumed_by_shortcut = consume_xids_shortcut();
+
+			if (consumed_by_shortcut > 0)
+			{
+				consumed += consumed_by_shortcut;
+				continue;
+			}
+		}
+
+		/* Slow path: Call GetNewTransactionId to allocate a new XID. */
+		lastxid = GetNewTransactionId(true);
+		consumed++;
+
+		/* Report progress */
+		if (consumed - last_reported_at >= REPORT_INTERVAL)
+		{
+			if (nxids > 0)
+				elog(NOTICE, "consumed %llu / %llu XIDs, latest %u:%u",
+					 (unsigned long long) consumed, (unsigned long long) nxids,
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid));
+			else
+				elog(NOTICE, "consumed up to %u:%u / %u:%u",
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid),
+					 EpochFromFullTransactionId(untilxid),
+					 XidFromFullTransactionId(untilxid));
+			last_reported_at = consumed;
+		}
+	}
+
+	return lastxid;
+}
+
+
+/*
+ * These constants are copied from .c files, because they're private.
+ */
+#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10)
+#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / sizeof(TransactionId))
+#define CLOG_XACTS_PER_BYTE 4
+#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)
+
+/*
+ * All the interesting action in GetNewTransactionId happens when we extend
+ * the SLRUs, or at the uint32 wraparound. If the nextXid counter is not close
+ * to any of those interesting values, take a shortcut and bump nextXID
+ * directly, close to the next "interesting" value.
+ */
+static inline uint32
+XidSkip(FullTransactionId fullxid)
+{
+	uint32		low = XidFromFullTransactionId(fullxid);
+	uint32		rem;
+	uint32		distance;
+
+	if (low < 5 || low >= UINT32_MAX - 5)
+		return 0;
+	distance = UINT32_MAX - 5 - low;
+
+	rem = low % COMMIT_TS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, COMMIT_TS_XACTS_PER_PAGE - rem);
+
+	rem = low % SUBTRANS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, SUBTRANS_XACTS_PER_PAGE - rem);
+
+	rem = low % CLOG_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, CLOG_XACTS_PER_PAGE - rem);
+
+	return distance;
+}
+
+static int64
+consume_xids_shortcut(void)
+{
+	FullTransactionId nextXid;
+	uint32		consumed;
+
+	LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+	nextXid = ShmemVariableCache->nextXid;
+
+	/*
+	 * Go slow near the "interesting values". The interesting zones include 5
+	 * transactions before and after SLRU page switches.
+	 */
+	consumed = XidSkip(nextXid);
+	if (consumed > 0)
+		ShmemVariableCache->nextXid.value += (uint64) consumed;
+
+	LWLockRelease(XidGenLock);
+
+	return consumed;
+}
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.control b/src/test/modules/xid_wraparound/xid_wraparound.control
new file mode 100644
index 0000000000..6c6964ed3d
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.control
@@ -0,0 +1,4 @@
+comment = 'Tests for XID wraparound'
+default_version = '1.0'
+module_pathname = '$libdir/xid_wraparound'
+relocatable = true
-- 
2.31.1
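
The shortcut in consume_xids_shortcut() only jumps as far as the next
"interesting" XID. As a rough standalone sketch (not part of the patch), the
XidSkip() arithmetic can be reproduced outside the server. It assumes the
default BLCKSZ of 8192; the name xid_skip, the helper min_u32 and the plain
uint32_t argument are stand-ins for the patch's XidSkip(FullTransactionId)
and Min().

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default 8KB block size; the patch takes BLCKSZ from the build. */
#define BLCKSZ 8192

/* Same per-page constants that the patch copies into xid_wraparound.c. */
#define COMMIT_TS_XACTS_PER_PAGE ((uint32_t) (BLCKSZ / 10))
#define SUBTRANS_XACTS_PER_PAGE ((uint32_t) (BLCKSZ / sizeof(uint32_t)))
#define CLOG_XACTS_PER_BYTE 4
#define CLOG_XACTS_PER_PAGE ((uint32_t) (BLCKSZ * CLOG_XACTS_PER_BYTE))

/* Hypothetical stand-in for the server's Min() macro. */
static uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Given the low 32 bits of nextXid, return how many XIDs can be skipped
 * before reaching the next SLRU page boundary or the uint32 wraparound zone.
 * Returns 0 when the XID is already at an "interesting" value.
 */
uint32_t
xid_skip(uint32_t low)
{
	uint32_t	rem;
	uint32_t	distance;

	/* Stay on the slow path close to the uint32 wraparound. */
	if (low < 5 || low >= UINT32_MAX - 5)
		return 0;
	distance = UINT32_MAX - 5 - low;

	/* Stop at the next commit_ts page boundary. */
	rem = low % COMMIT_TS_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, COMMIT_TS_XACTS_PER_PAGE - rem);

	/* Stop at the next subtrans page boundary. */
	rem = low % SUBTRANS_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, SUBTRANS_XACTS_PER_PAGE - rem);

	/* Stop at the next clog page boundary. */
	rem = low % CLOG_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, CLOG_XACTS_PER_PAGE - rem);

	/* Away from every boundary there is always at least one XID to skip. */
	assert(distance > 0);
	return distance;
}
```

With these constants, xid_skip(820) yields 818, so nextXid would land on
1638, the start of the next commit_ts page, and xid_skip(2000) yields 48,
landing on the subtrans page boundary at 2048. At a boundary itself (for
example 819 or 2048) the result is 0 and the caller falls back to
GetNewTransactionId().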

#33Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#32)
Re: Testing autovacuum wraparound (including failsafe)

On 27 Apr 2023, at 16:06, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Apr 21, 2023 at 12:02 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

...that call to background_psql doesn't look like other ones that have "key => value". Is there something I'm missing?

Thanks for reporting. I think that the patch needs to be updated since
commit 664d757531e1 changed background psql TAP functions. I've
attached the updated patch.

Is there a risk that the background psql will time out on slow systems during
the consumption of 2B xid's? Since you mainly want to hold it open for the
duration of testing you might want to bump it to avoid false negatives on slow
test systems.

--
Daniel Gustafsson

#34John Naylor
john.naylor@enterprisedb.com
In reply to: Daniel Gustafsson (#33)
Re: Testing autovacuum wraparound (including failsafe)

On Thu, Apr 27, 2023 at 9:12 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Apr 2023, at 16:06, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Apr 21, 2023 at 12:02 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

...that call to background_psql doesn't look like other ones that have
"key => value". Is there something I'm missing?

Thanks for reporting. I think that the patch needs to be updated since
commit 664d757531e1 changed background psql TAP functions. I've
attached the updated patch.

Thanks, it passes for me now.

Is there a risk that the background psql will time out on slow systems
during the consumption of 2B xid's? Since you mainly want to hold it open
for the duration of testing you might want to bump it to avoid false
negatives on slow test systems.

If they're that slow, I'd worry more about generating 20GB of xact status
data. That's why the tests are disabled by default.

--
John Naylor
EDB: http://www.enterprisedb.com
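
For a sense of scale, here is a back-of-the-envelope sketch (not from the
patch) of how much SLRU space a given number of XIDs would cover, using the
per-page constants that xid_wraparound.c copies. It assumes BLCKSZ is 8192
and that nothing is truncated along the way, so it is an upper bound rather
than a measurement; the helper names slru_pages and slru_bytes are made up
for illustration. As far as I understand, commit_ts pages are only written
when track_commit_timestamp is enabled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default 8KB block size. */
#define BLCKSZ 8192ULL

/* XIDs covered by one page of each SLRU, as in xid_wraparound.c. */
#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10)	/* 10 bytes per XID */
#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / 4)	/* 4 bytes per XID */
#define CLOG_XACTS_PER_PAGE (BLCKSZ * 4)	/* 2 bits per XID */

/* Number of pages needed to cover nxids XIDs, rounding up. */
uint64_t
slru_pages(uint64_t nxids, uint64_t xacts_per_page)
{
	assert(xacts_per_page > 0);
	return (nxids + xacts_per_page - 1) / xacts_per_page;
}

/* Bytes occupied by those pages if none of them is ever truncated. */
uint64_t
slru_bytes(uint64_t nxids, uint64_t xacts_per_page)
{
	return slru_pages(nxids, xacts_per_page) * BLCKSZ;
}
```

For 2 billion XIDs this works out to roughly 0.5 GB for clog, 8 GB for
subtrans and 20 GB for commit_ts under those assumptions.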

#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Naylor (#34)
Re: Testing autovacuum wraparound (including failsafe)

John Naylor <john.naylor@enterprisedb.com> writes:

On Thu, Apr 27, 2023 at 9:12 PM Daniel Gustafsson <daniel@yesql.se> wrote:

Is there a risk that the background psql will time out on slow systems during
the consumption of 2B xid's? Since you mainly want to hold it open for the
duration of testing you might want to bump it to avoid false negatives on
slow test systems.

If they're that slow, I'd worry more about generating 20GB of xact status
data. That's why the tests are disabled by default.

There is exactly zero chance that anyone will accept the introduction
of such an expensive test into either check-world or the buildfarm
sequence.

regards, tom lane

#36Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#35)
Re: Testing autovacuum wraparound (including failsafe)

On 28 Apr 2023, at 06:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:
John Naylor <john.naylor@enterprisedb.com> writes:

If they're that slow, I'd worry more about generating 20GB of xact status
data. That's why the tests are disabled by default.

There is exactly zero chance that anyone will accept the introduction
of such an expensive test into either check-world or the buildfarm
sequence.

Even though the entire suite is disabled by default, shouldn't it also require
PG_TEST_EXTRA to be consistent with other off-by-default suites like for example
src/test/kerberos?

--
Daniel Gustafsson

#37Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Daniel Gustafsson (#33)
1 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

On Thu, Apr 27, 2023 at 11:12 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Apr 2023, at 16:06, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Apr 21, 2023 at 12:02 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

...that call to background_psql doesn't look like other ones that have "key => value". Is there something I'm missing?

Thanks for reporting. I think that the patch needs to be updated since
commit 664d757531e1 changed background psql TAP functions. I've
attached the updated patch.

Is there a risk that the background psql will time out on slow systems during
the consumption of 2B xid's? Since you mainly want to hold it open for the
duration of testing you might want to bump it to avoid false negatives on slow
test systems.

Agreed. The timeout can be set manually via PG_TEST_TIMEOUT_DEFAULT,
but I've bumped it to 10 min by default. The tests now also require
setting PG_TEST_EXTRA to run.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v4-0001-Add-tests-for-XID-wraparound.patch (application/octet-stream)
From 10920f76b2a77f48e495f2541c749fa8bb970e78 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 3 Mar 2023 12:01:28 +0200
Subject: [PATCH v4] Add tests for XID wraparound.

The test module includes helper functions to quickly burn through lots
of XIDs. They are used in the tests, and are also handy for manually
testing XID wraparound.

Since these tests are very expensive, the entire suite is disabled by
default. PG_TEST_EXTRA must include xid_wraparound to run it.

Author: Masahiko Sawada, Heikki Linnakangas
Discussion: https://www.postgresql.org/message-id/CAD21AoDVhkXp8HjpFO-gp3TgL6tCKcZQNxn04m01VAtcSi-5sA%40mail.gmail.com
---
 src/test/modules/Makefile                     |   1 +
 src/test/modules/meson.build                  |   1 +
 src/test/modules/xid_wraparound/.gitignore    |   4 +
 src/test/modules/xid_wraparound/Makefile      |  23 ++
 src/test/modules/xid_wraparound/README        |   3 +
 src/test/modules/xid_wraparound/meson.build   |  43 ++++
 .../xid_wraparound/t/001_emergency_vacuum.pl  | 125 ++++++++++
 .../modules/xid_wraparound/t/002_limits.pl    | 118 ++++++++++
 .../xid_wraparound/t/003_wraparounds.pl       |  55 +++++
 .../xid_wraparound/xid_wraparound--1.0.sql    |  12 +
 .../modules/xid_wraparound/xid_wraparound.c   | 221 ++++++++++++++++++
 .../xid_wraparound/xid_wraparound.control     |   4 +
 12 files changed, 610 insertions(+)
 create mode 100644 src/test/modules/xid_wraparound/.gitignore
 create mode 100644 src/test/modules/xid_wraparound/Makefile
 create mode 100644 src/test/modules/xid_wraparound/README
 create mode 100644 src/test/modules/xid_wraparound/meson.build
 create mode 100644 src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
 create mode 100644 src/test/modules/xid_wraparound/t/002_limits.pl
 create mode 100644 src/test/modules/xid_wraparound/t/003_wraparounds.pl
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.c
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.control

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 6331c976dc..3c6eb33b38 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
 		  plsample \
 		  snapshot_too_old \
 		  spgist_name_ops \
+		  xid_wraparound \
 		  test_bloomfilter \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 17d369e378..cb63901e32 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -11,6 +11,7 @@ subdir('plsample')
 subdir('snapshot_too_old')
 subdir('spgist_name_ops')
 subdir('ssl_passphrase_callback')
+subdir('xid_wraparound')
 subdir('test_bloomfilter')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
diff --git a/src/test/modules/xid_wraparound/.gitignore b/src/test/modules/xid_wraparound/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/src/test/modules/xid_wraparound/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/xid_wraparound/Makefile b/src/test/modules/xid_wraparound/Makefile
new file mode 100644
index 0000000000..7a6e0f6676
--- /dev/null
+++ b/src/test/modules/xid_wraparound/Makefile
@@ -0,0 +1,23 @@
+# src/test/modules/xid_wraparound/Makefile
+
+MODULE_big = xid_wraparound
+OBJS = \
+	$(WIN32RES) \
+	xid_wraparound.o
+PGFILEDESC = "xid_wraparound - tests for XID wraparound"
+
+EXTENSION = xid_wraparound
+DATA = xid_wraparound--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/xid_wraparound
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/xid_wraparound/README b/src/test/modules/xid_wraparound/README
new file mode 100644
index 0000000000..3aab464dec
--- /dev/null
+++ b/src/test/modules/xid_wraparound/README
@@ -0,0 +1,3 @@
+This module contains tests for XID wraparound. The tests use two
+helper functions to quickly consume lots of XIDs, to reach XID
+wraparound faster.
diff --git a/src/test/modules/xid_wraparound/meson.build b/src/test/modules/xid_wraparound/meson.build
new file mode 100644
index 0000000000..bdd55f22c4
--- /dev/null
+++ b/src/test/modules/xid_wraparound/meson.build
@@ -0,0 +1,43 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# FIXME: prevent install during main install, but not during test :/
+
+xid_wraparound_sources = files(
+  'xid_wraparound.c',
+)
+
+if host_system == 'windows'
+  xid_wraparound_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'xid_wraparound',
+    '--FILEDESC', 'xid_wraparound - tests for XID wraparound',])
+endif
+
+xid_wraparound = shared_module('xid_wraparound',
+  xid_wraparound_sources,
+  kwargs: pg_mod_args,
+)
+testprep_targets += xid_wraparound
+
+install_data(
+  'xid_wraparound.control',
+  'xid_wraparound--1.0.sql',
+  kwargs: contrib_data_args,
+)
+
+# Disabled by default because these tests could take a long time,
+# which typical installcheck users cannot tolerate (e.g. buildfarm
+# clients).
+if false
+  tests += {
+    'name': 'xid_wraparound',
+    'sd': meson.current_source_dir(),
+    'bd': meson.current_build_dir(),
+    'tap': {
+      'tests': [
+        't/001_emergency_vacuum.pl',
+        't/002_limits.pl',
+        't/003_wraparounds.pl',
+      ],
+    },
+  }
+endif
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
new file mode 100644
index 0000000000..ed3e722f67
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -0,0 +1,125 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test wraparound emergency autovacuum.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+    plan skip_all =>
+	"test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+$ENV{PG_TEST_TIMEOUT_DEFAULT} = 600;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('main');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create tables for a few different test scenarios
+$node->safe_psql('postgres', qq[
+CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large(data) SELECT generate_series(1,30000);
+
+CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large_trunc(data) SELECT generate_series(1,30000);
+
+CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small(data) SELECT generate_series(1,15000);
+
+CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small_trunc(data) SELECT generate_series(1,15000);
+
+CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
+INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
+]);
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+my $background_psql = $node->background_psql('postgres',
+					     on_error_stop => 0);
+$background_psql->set_query_timer_restart();
+$background_psql->query_safe(q{
+	BEGIN;
+	DELETE FROM large WHERE id % 2 = 0;
+	DELETE FROM large_trunc WHERE id > 10000;
+	DELETE FROM small WHERE id % 2 = 0;
+	DELETE FROM small_trunc WHERE id > 1000;
+	DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
+});
+
+# Consume 2 billion XIDs, to get us very close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids_until('2000000000'::bigint)]);
+
+# Make sure the latest completed XID is advanced
+$node->safe_psql('postgres', qq[INSERT INTO small(data) SELECT 1]);
+
+# Check that all databases became old enough to trigger failsafe.
+my $ret = $node->safe_psql('postgres',
+			   qq[
+SELECT datname,
+       age(datfrozenxid) > current_setting('vacuum_failsafe_age')::int as old
+FROM pg_database ORDER BY 1
+]);
+is($ret, "postgres|t
+template0|t
+template1|t", "all tables became old");
+
+my $log_offset = -s $node->logfile;
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$background_psql->query_safe(q{COMMIT;});
+$background_psql->quit;
+
+# Wait until autovacuum processed all tables and advanced the
+# system-wide oldest-XID.
+$node->poll_query_until('postgres',
+			qq[
+SELECT NOT EXISTS (
+	SELECT *
+	FROM pg_database
+	WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
+]) or die "timeout waiting for all databases to be vacuumed";
+
+# Check if these tables are vacuumed.
+$ret = $node->safe_psql('postgres', qq[
+SELECT relname, age(relfrozenxid) > current_setting('autovacuum_freeze_max_age')::int
+FROM pg_class
+WHERE relname IN ('large', 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled')
+ORDER BY 1
+]);
+
+is($ret, "autovacuum_disabled|f
+large|f
+large_trunc|f
+small|f
+small_trunc|f", "all tables are vacuumed");
+
+# Check if vacuum failsafe was triggered for each table.
+my $log_contents = slurp_file($node->logfile, $log_offset);
+foreach my $tablename ('large', 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled')
+{
+    like(
+	$log_contents,
+	qr/bypassing nonessential maintenance of table "postgres.public.$tablename" as a failsafe after \d+ index scans/,
+	"failsafe vacuum triggered for $tablename");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/002_limits.pl b/src/test/modules/xid_wraparound/t/002_limits.pl
new file mode 100644
index 0000000000..24cc5f272d
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/002_limits.pl
@@ -0,0 +1,118 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Test XID wraparound limits.
+#
+# When you get close to XID wraparound, you start to get warnings, and
+# when you get even closer, the system refuses to assign any more XIDs
+# until the oldest databases have been vacuumed and datfrozenxid has
+# been advanced.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+    plan skip_all =>
+	"test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+# bump the query timeout to avoid false negatives on slow test syetems.
+$ENV{PG_TEST_TIMEOUT_DEFAULT} = 600;
+
+my $ret;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql('postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('start');
+]);
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+my $background_psql = $node->background_psql('postgres',
+					     on_error_stop => 0);
+$background_psql->query_safe(q{
+	BEGIN;
+	INSERT INTO wraparoundtest VALUES ('oldxact');
+});
+
+# Consume 2 billion transactions, to get close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after 1 billion')]);
+
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after 2 billion')]);
+
+# We are now just under 150 million XIDs away from wraparound.
+# Continue consuming XIDs, in batches of 10 million, until we get
+# the warning:
+#
+#  WARNING:  database "postgres" must be vacuumed within 3000024 transactions
+#  HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
+#  You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
+my $stderr;
+my $warn_limit = 0;
+for my $i (1 .. 15)
+{
+	$node->psql('postgres', qq[SELECT consume_xids(10000000)], stderr => \$stderr, on_error_die => 1);
+
+	if ($stderr =~ /WARNING:  database "postgres" must be vacuumed within [0-9]+ transactions/)
+	{
+		# Reached the warn-limit
+		$warn_limit = 1;
+		last;
+	}
+}
+ok($warn_limit == 1, "warn-limit reached");
+
+# We can still INSERT, despite the warnings.
+$node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('reached warn-limit')]);
+
+# Keep going. We'll hit the hard "stop" limit.
+$ret = $node->psql('postgres', qq[SELECT consume_xids(100000000)], stderr => \$stderr);
+like($stderr, qr/ERROR:  database is not accepting commands to avoid wraparound data loss/, "stop-limit");
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$background_psql->query_safe(q{COMMIT;});
+$background_psql->quit;
+
+# VACUUM, to freeze the tables and advance datfrozenxid.
+#
+# Autovacuum does this for the other databases, and would do it for
+# 'postgres' too, but let's test manual VACUUM.
+#
+$node->safe_psql('postgres', 'VACUUM');
+
+# Wait until autovacuum has processed the other databases and advanced
+# the system-wide oldest-XID.
+$ret = $node->poll_query_until('postgres', qq[INSERT INTO wraparoundtest VALUES ('after VACUUM')], 'INSERT 0 1');
+
+# Check the table contents
+$ret = $node->safe_psql('postgres', qq[SELECT * from wraparoundtest]);
+is($ret, "start
+oldxact
+after 1 billion
+after 2 billion
+reached warn-limit
+after VACUUM");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
new file mode 100644
index 0000000000..e018e6a433
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -0,0 +1,55 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Consume a lot of XIDs, wrapping around a few times.
+#
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+    plan skip_all =>
+	"test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+# bump the query timeout to avoid false negatives on slow test syetems.
+$ENV{PG_TEST_TIMEOUT_DEFAULT} = 600;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql('postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('beginning');
+]);
+
+# Burn through 10 billion transactions in total, in batches of 100 million.
+my $ret;
+for my $i (1 .. 100)
+{
+	$ret = $node->safe_psql('postgres', qq[SELECT consume_xids(100000000)]);
+	$ret = $node->safe_psql('postgres', qq[INSERT INTO wraparoundtest VALUES ('after $i batches')]);
+}
+
+$ret = $node->safe_psql('postgres', qq[SELECT COUNT(*) FROM wraparoundtest]);
+is($ret, "101");
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
new file mode 100644
index 0000000000..f5577adfdb
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/xid_wraparound/xid_wraparound--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION xid_wraparound" to load this file. \quit
+
+CREATE FUNCTION consume_xids(bigint)
+RETURNS bigint IMMUTABLE PARALLEL SAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION consume_xids_until(bigint)
+RETURNS bigint IMMUTABLE PARALLEL SAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.c b/src/test/modules/xid_wraparound/xid_wraparound.c
new file mode 100644
index 0000000000..c9d6034b55
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.c
@@ -0,0 +1,221 @@
+/*--------------------------------------------------------------------------
+ *
+ * xid_wraparound.c
+ *		Utilities for testing XID wraparound
+ *
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *		src/test/modules/xid_wraparound/xid_wraparound.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "access/xact.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/proc.h"
+
+PG_MODULE_MAGIC;
+
+static int64 consume_xids_shortcut(void);
+static FullTransactionId consume_xids_common(FullTransactionId untilxid, uint64 nxids);
+
+/*
+ * Consume the specified number of XIDs.
+ */
+PG_FUNCTION_INFO_V1(consume_xids);
+Datum
+consume_xids(PG_FUNCTION_ARGS)
+{
+	int64		nxids = PG_GETARG_INT64(0);
+	FullTransactionId lastxid;
+
+	if (nxids < 0)
+		elog(ERROR, "invalid nxids argument: %lld", (long long) nxids);
+
+	if (nxids == 0)
+		lastxid = ReadNextFullTransactionId();
+	else
+		lastxid = consume_xids_common(InvalidFullTransactionId, (uint64) nxids);
+
+	PG_RETURN_INT64((int64) U64FromFullTransactionId(lastxid));
+}
+
+/*
+ * Consume XIDs, up to the given XID.
+ */
+PG_FUNCTION_INFO_V1(consume_xids_until);
+Datum
+consume_xids_until(PG_FUNCTION_ARGS)
+{
+	FullTransactionId targetxid = FullTransactionIdFromU64((uint64) PG_GETARG_INT64(0));
+	FullTransactionId lastxid;
+
+	if (!FullTransactionIdIsNormal(targetxid))
+		elog(ERROR, "targetxid %llu is not normal", (unsigned long long) U64FromFullTransactionId(targetxid));
+
+	lastxid = consume_xids_common(targetxid, 0);
+
+	PG_RETURN_INT64((int64) U64FromFullTransactionId(lastxid));
+}
+
+/*
+ * Common functionality between the two public functions.
+ */
+static FullTransactionId
+consume_xids_common(FullTransactionId untilxid, uint64 nxids)
+{
+	FullTransactionId lastxid;
+	uint64		last_reported_at = 0;
+	uint64		consumed = 0;
+
+	/* Print a NOTICE every REPORT_INTERVAL xids */
+#define REPORT_INTERVAL (10*1000000)
+
+	/* initialize 'lastxid' with the system's current next XID */
+	lastxid = ReadNextFullTransactionId();
+
+	/*
+	 * We consume XIDs by calling GetNewTransactionId(true), which marks the
+	 * consumed XIDs as subtransactions of the current top-level transaction.
+	 * For that to work, this transaction must have a top-level XID.
+	 *
+	 * GetNewTransactionId registers them in the subxid cache in PGPROC, until
+	 * the cache overflows, but beyond that, we don't keep track of the
+	 * consumed XIDs.
+	 */
+	(void) GetTopTransactionId();
+
+	for (;;)
+	{
+		uint64		xids_left;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* How many XIDs do we have left to consume? */
+		if (nxids > 0)
+		{
+			if (consumed >= nxids)
+				break;
+			xids_left = nxids - consumed;
+		}
+		else
+		{
+			if (FullTransactionIdFollowsOrEquals(lastxid, untilxid))
+				break;
+			xids_left = U64FromFullTransactionId(untilxid) - U64FromFullTransactionId(lastxid);
+		}
+
+		/*
+		 * If we still have plenty of XIDs to consume, try to take a shortcut
+		 * and bump up the nextXid counter directly.
+		 */
+		if (xids_left > 2000 &&
+			consumed - last_reported_at < REPORT_INTERVAL &&
+			MyProc->subxidStatus.overflowed)
+		{
+			int64		consumed_by_shortcut = consume_xids_shortcut();
+
+			if (consumed_by_shortcut > 0)
+			{
+				consumed += consumed_by_shortcut;
+				continue;
+			}
+		}
+
+		/* Slow path: Call GetNewTransactionId to allocate a new XID. */
+		lastxid = GetNewTransactionId(true);
+		consumed++;
+
+		/* Report progress */
+		if (consumed - last_reported_at >= REPORT_INTERVAL)
+		{
+			if (nxids > 0)
+				elog(NOTICE, "consumed %llu / %llu XIDs, latest %u:%u",
+					 (unsigned long long) consumed, (unsigned long long) nxids,
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid));
+			else
+				elog(NOTICE, "consumed up to %u:%u / %u:%u",
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid),
+					 EpochFromFullTransactionId(untilxid),
+					 XidFromFullTransactionId(untilxid));
+			last_reported_at = consumed;
+		}
+	}
+
+	return lastxid;
+}
+
+
+/*
+ * These constants are copied from .c files, because they're private.
+ */
+#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10)
+#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / sizeof(TransactionId))
+#define CLOG_XACTS_PER_BYTE 4
+#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)
+
+/*
+ * All the interesting action in GetNewTransactionId happens when we extend
+ * the SLRUs, or at the uint32 wraparound. If the nextXid counter is not close
+ * to any of those interesting values, take a shortcut and bump nextXID
+ * directly, close to the next "interesting" value.
+ */
+static inline uint32
+XidSkip(FullTransactionId fullxid)
+{
+	uint32		low = XidFromFullTransactionId(fullxid);
+	uint32		rem;
+	uint32		distance;
+
+	if (low < 5 || low >= UINT32_MAX - 5)
+		return 0;
+	distance = UINT32_MAX - 5 - low;
+
+	rem = low % COMMIT_TS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, COMMIT_TS_XACTS_PER_PAGE - rem);
+
+	rem = low % SUBTRANS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, SUBTRANS_XACTS_PER_PAGE - rem);
+
+	rem = low % CLOG_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, CLOG_XACTS_PER_PAGE - rem);
+
+	return distance;
+}
+
+static int64
+consume_xids_shortcut(void)
+{
+	FullTransactionId nextXid;
+	uint32		consumed;
+
+	LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+	nextXid = ShmemVariableCache->nextXid;
+
+	/*
+	 * Go slow near the "interesting values". The interesting zones include 5
+	 * transactions before and after SLRU page switches.
+	 */
+	consumed = XidSkip(nextXid);
+	if (consumed > 0)
+		ShmemVariableCache->nextXid.value += (uint64) consumed;
+
+	LWLockRelease(XidGenLock);
+
+	return consumed;
+}
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.control b/src/test/modules/xid_wraparound/xid_wraparound.control
new file mode 100644
index 0000000000..6c6964ed3d
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.control
@@ -0,0 +1,4 @@
+comment = 'Tests for XID wraparound'
+default_version = '1.0'
+module_pathname = '$libdir/xid_wraparound'
+relocatable = true
-- 
2.31.1
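
For readers who want to check the shortcut arithmetic in XidSkip() from xid_wraparound.c above without building the module, here is a minimal standalone sketch. It assumes the default 8 kB BLCKSZ (a build-time option in PostgreSQL), and the names xid_skip and min_u32 are hypothetical stand-ins, not part of the patch. It only mirrors the distance computation; it does not touch shared memory or take XidGenLock.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default 8 kB block size; the real BLCKSZ is a build-time option. */
#define BLCKSZ 8192u

/* Per-page XID counts, mirroring the constants copied in the patch. */
#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10u)                      /* 819 */
#define SUBTRANS_XACTS_PER_PAGE  (BLCKSZ / (uint32_t) sizeof(uint32_t)) /* 2048 */
#define CLOG_XACTS_PER_BYTE      4u
#define CLOG_XACTS_PER_PAGE      (BLCKSZ * CLOG_XACTS_PER_BYTE)      /* 32768 */

/* Hypothetical helper standing in for PostgreSQL's Min() macro. */
static uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Given the low 32 bits of the next XID, return how many XIDs can be
 * skipped before reaching the next "interesting" value: an SLRU page
 * boundary (commit_ts, subtrans, clog) or the uint32 wraparound.
 * Returns 0 when the XID is already at such a value.
 */
uint32_t
xid_skip(uint32_t low)
{
	uint32_t	rem;
	uint32_t	distance;

	/* Go slow right around the uint32 wraparound. */
	if (low < 5 || low >= UINT32_MAX - 5)
		return 0;
	distance = UINT32_MAX - 5 - low;

	rem = low % COMMIT_TS_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, COMMIT_TS_XACTS_PER_PAGE - rem);

	rem = low % SUBTRANS_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, SUBTRANS_XACTS_PER_PAGE - rem);

	rem = low % CLOG_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, CLOG_XACTS_PER_PAGE - rem);

	/* Every term above is at least 1, so the skip is never empty here. */
	assert(distance > 0);
	return distance;
}
```

With these assumed constants, xid_skip(1000) lands on 1638 (twice 819), the next commit_ts page boundary, while an XID that already sits on a subtrans page boundary such as 2048 yields no skip and falls back to the slow path.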

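The comment in 002_limits.pl above ("just under 150 million XIDs away from wraparound") and the loop of 15 batches of 10 million can be checked with a little arithmetic. The sketch below is only that arithmetic, assuming the wraparound horizon is 2^31 XIDs past the oldest unfrozen XID and ignoring the small number of XIDs consumed by initdb and test setup. The function names are hypothetical and do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdint.h>

/*
 * XIDs left before the 2^31 horizon, given how many XIDs have been
 * consumed since the oldest unfrozen XID.  Hypothetical helper.
 */
uint64_t
xids_until_wraparound(uint64_t consumed)
{
	const uint64_t horizon = UINT64_C(1) << 31;	/* 2147483648 */

	return consumed >= horizon ? 0 : horizon - consumed;
}

/*
 * Number of fixed-size batches needed to cover 'remaining' XIDs,
 * rounding up.  Hypothetical helper.
 */
uint64_t
batches_needed(uint64_t remaining, uint64_t batch_size)
{
	assert(batch_size > 0);
	return (remaining + batch_size - 1) / batch_size;
}
```

After consuming 2,000,000,000 XIDs this leaves 147,483,648, and covering that in batches of 10,000,000 takes 15 batches, which matches the upper bound of the loop in the test. The warn limit is expected to be hit before the loop runs out.
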
#38Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#37)
Re: Testing autovacuum wraparound (including failsafe)

On 12 Jul 2023, at 09:52, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Agreed. The timeout can be set by manually setting
PG_TEST_TIMEOUT_DEFAULT, but I bump it to 10 min by default. And it
now requires setting PG_TEST_EXTRA to run it.

+# bump the query timeout to avoid false negatives on slow test syetems.
typo: s/syetems/systems/

+# bump the query timeout to avoid false negatives on slow test syetems.
+$ENV{PG_TEST_TIMEOUT_DEFAULT} = 600;
Does this actually work?  Utils.pm reads the environment variable at compile
time in the BEGIN block so this setting won't be seen?  A quick test program
seems to confirm this but I might be missing something.

--
Daniel Gustafsson

#39Michael Paquier
michael@paquier.xyz
In reply to: Daniel Gustafsson (#38)
Re: Testing autovacuum wraparound (including failsafe)

On Wed, Jul 12, 2023 at 01:47:51PM +0200, Daniel Gustafsson wrote:

+# bump the query timeout to avoid false negatives on slow test syetems.
+$ENV{PG_TEST_TIMEOUT_DEFAULT} = 600;
Does this actually work?  Utils.pm reads the environment variable at compile
time in the BEGIN block so this setting won't be seen?  A quick test program
seems to confirm this but I might be missing something.

I wish that this test were cheaper, without a need to depend on
PG_TEST_EXTRA. Actually, note that you are forgetting to update the
documentation of PG_TEST_EXTRA with this new value of xid_wraparound.
--
Michael

#40Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Paquier (#39)
Re: Testing autovacuum wraparound (including failsafe)

On 22 Aug 2023, at 07:49, Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Jul 12, 2023 at 01:47:51PM +0200, Daniel Gustafsson wrote:

+# bump the query timeout to avoid false negatives on slow test syetems.
+$ENV{PG_TEST_TIMEOUT_DEFAULT} = 600;
Does this actually work?  Utils.pm reads the environment variable at compile
time in the BEGIN block so this setting won't be seen?  A quick test program
seems to confirm this but I might be missing something.

I wish that this test were cheaper, without a need to depend on
PG_TEST_EXTRA. Actually, note that you are forgetting to update the
documentation of PG_TEST_EXTRA with this new value of xid_wraparound.

Agreed, it would be nice, but I don't see any way to achieve that. I still
think the test is worthwhile to add, once the issues mentioned upthread are
resolved.

--
Daniel Gustafsson

#41Noah Misch
noah@leadboat.com
In reply to: Daniel Gustafsson (#38)
Re: Testing autovacuum wraparound (including failsafe)

On Wed, Jul 12, 2023 at 01:47:51PM +0200, Daniel Gustafsson wrote:

On 12 Jul 2023, at 09:52, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Agreed. The timeout can be set by manually setting
PG_TEST_TIMEOUT_DEFAULT, but I bump it to 10 min by default. And it
now requires setting PG_TEST_EXTRA to run it.

+# bump the query timeout to avoid false negatives on slow test syetems.
typo: s/syetems/systems/

+# bump the query timeout to avoid false negatives on slow test syetems.
+$ENV{PG_TEST_TIMEOUT_DEFAULT} = 600;
Does this actually work?  Utils.pm reads the environment variable at compile
time in the BEGIN block so this setting won't be seen?  A quick test program
seems to confirm this but I might be missing something.

The correct way to get a longer timeout is "IPC::Run::timer(4 *
$PostgreSQL::Test::Utils::timeout_default);". Even if changing env worked,
that would be removing the ability for even-slower systems to set timeouts
greater than 10min.

#42Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Noah Misch (#41)
2 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

Sorry for the late reply.

On Sun, Sep 3, 2023 at 2:48 PM Noah Misch <noah@leadboat.com> wrote:

On Wed, Jul 12, 2023 at 01:47:51PM +0200, Daniel Gustafsson wrote:

On 12 Jul 2023, at 09:52, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Agreed. The timeout can be set by manually setting
PG_TEST_TIMEOUT_DEFAULT, but I bump it to 10 min by default. And it
now requires setting PG_TEST_EXTRA to run it.

+# bump the query timeout to avoid false negatives on slow test syetems.
typo: s/syetems/systems/

+# bump the query timeout to avoid false negatives on slow test syetems.
+$ENV{PG_TEST_TIMEOUT_DEFAULT} = 600;
Does this actually work?  Utils.pm reads the environment variable at compile
time in the BEGIN block so this setting won't be seen?  A quick test program
seems to confirm this but I might be missing something.

The correct way to get a longer timeout is "IPC::Run::timer(4 *
$PostgreSQL::Test::Utils::timeout_default);". Even if changing env worked,
that would be removing the ability for even-slower systems to set timeouts
greater than 10min.

Agreed.

I've attached new versions of the patches. The 0001 patch adds an option
to background_psql to specify the timeout in seconds, and the 0002 patch
is the main regression test patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v5-0001-Add-option-to-specify-timeout-seconds-to-Backgrou.patch (application/octet-stream)
From 10240ff94fa62383eeedb8fc395254f980c0e8f0 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 30 Aug 2023 23:10:59 +0900
Subject: [PATCH v5 1/2] Add option to specify timeout seconds to
 BackgroundPsql.pm.

Author: Masahiko Sawada
Reviewed-by:
Discussion: https://postgr.es/m/
---
 src/test/perl/PostgreSQL/Test/BackgroundPsql.pm | 10 ++++++----
 src/test/perl/PostgreSQL/Test/Cluster.pm        |  4 +++-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm b/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
index 924b57ab21..725c3555f9 100644
--- a/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
+++ b/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
@@ -68,7 +68,7 @@ use Test::More;
 
 =over
 
-=item PostgreSQL::Test::BackgroundPsql->new(interactive, @params)
+=item PostgreSQL::Test::BackgroundPsql->new(interactive, @params, timeout)
 
 Builds a new object of class C<PostgreSQL::Test::BackgroundPsql> for either
 an interactive or background session and starts it. If C<interactive> is
@@ -81,7 +81,7 @@ string. For C<interactive> sessions, IO::Pty is required.
 sub new
 {
 	my $class = shift;
-	my ($interactive, $psql_params) = @_;
+	my ($interactive, $psql_params, $timeout) = @_;
 	my $psql = {
 		'stdin' => '',
 		'stdout' => '',
@@ -96,8 +96,10 @@ sub new
 	  "Forbidden caller of constructor: package: $package, file: $file:$line"
 	  unless $package->isa('PostgreSQL::Test::Cluster');
 
-	$psql->{timeout} =
-	  IPC::Run::timeout($PostgreSQL::Test::Utils::timeout_default);
+	$psql->{timeout} = IPC::Run::timeout(
+		defined($timeout)
+		? $timeout
+		: $PostgreSQL::Test::Utils::timeout_default);
 
 	if ($interactive)
 	{
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 2a478ba6ed..c34c3b8627 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -2054,6 +2054,7 @@ sub background_psql
 	local %ENV = $self->_get_env();
 
 	my $replication = $params{replication};
+	my $timeout = undef;
 
 	my @psql_params = (
 		$self->installed_command('psql'),
@@ -2065,12 +2066,13 @@ sub background_psql
 		'-');
 
 	$params{on_error_stop} = 1 unless defined $params{on_error_stop};
+	$timeout = $params{timeout} if defined $params{timeout};
 
 	push @psql_params, '-v', 'ON_ERROR_STOP=1' if $params{on_error_stop};
 	push @psql_params, @{ $params{extra_params} }
 	  if defined $params{extra_params};
 
-	return PostgreSQL::Test::BackgroundPsql->new(0, \@psql_params);
+	return PostgreSQL::Test::BackgroundPsql->new(0, \@psql_params, $timeout);
 }
 
 =pod
-- 
2.31.1

v5-0002-Add-tests-for-XID-wraparound.patch (application/octet-stream)
From ffbe0c42ffcf90b288e9fabffc3a6599381bb608 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 3 Mar 2023 12:01:28 +0200
Subject: [PATCH v5 2/2] Add tests for XID wraparound.

The test module includes helper functions to quickly burn through lots
of XIDs. They are used in the tests, and are also handy for manually
testing XID wraparound.

Since these tests are very expensive, the entire suite is disabled by
default. PG_TEST_EXTRA must be set to run it.

Author: Heikki Linnakangas, Masahiko Sawada, Andres Freund
Discussion: https://www.postgresql.org/message-id/CAD21AoDVhkXp8HjpFO-gp3TgL6tCKcZQNxn04m01VAtcSi-5sA%40mail.gmail.com
---
 doc/src/sgml/regress.sgml                     |  10 +
 src/test/modules/Makefile                     |   1 +
 src/test/modules/meson.build                  |   1 +
 src/test/modules/xid_wraparound/.gitignore    |   4 +
 src/test/modules/xid_wraparound/Makefile      |  23 ++
 src/test/modules/xid_wraparound/README        |   3 +
 src/test/modules/xid_wraparound/meson.build   |  36 +++
 .../xid_wraparound/t/001_emergency_vacuum.pl  | 133 +++++++++++
 .../modules/xid_wraparound/t/002_limits.pl    | 138 +++++++++++
 .../xid_wraparound/t/003_wraparounds.pl       |  60 +++++
 .../xid_wraparound/xid_wraparound--1.0.sql    |  12 +
 .../modules/xid_wraparound/xid_wraparound.c   | 221 ++++++++++++++++++
 .../xid_wraparound/xid_wraparound.control     |   4 +
 13 files changed, 646 insertions(+)
 create mode 100644 src/test/modules/xid_wraparound/.gitignore
 create mode 100644 src/test/modules/xid_wraparound/Makefile
 create mode 100644 src/test/modules/xid_wraparound/README
 create mode 100644 src/test/modules/xid_wraparound/meson.build
 create mode 100644 src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
 create mode 100644 src/test/modules/xid_wraparound/t/002_limits.pl
 create mode 100644 src/test/modules/xid_wraparound/t/003_wraparounds.pl
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.c
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.control

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 675db86e4d..9aeef9ec1e 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -313,6 +313,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance'
       </para>
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><literal>xid_wraparound</literal></term>
+     <listitem>
+      <para>
+       Runs the test suite under <filename>src/test/modules/xid_wraparound</filename>.
+       Not enabled by default because it is resource intensive.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
 
    Tests for features that are not supported by the current build
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 6331c976dc..3c6eb33b38 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
 		  plsample \
 		  snapshot_too_old \
 		  spgist_name_ops \
+		  xid_wraparound \
 		  test_bloomfilter \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 17d369e378..cb63901e32 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -11,6 +11,7 @@ subdir('plsample')
 subdir('snapshot_too_old')
 subdir('spgist_name_ops')
 subdir('ssl_passphrase_callback')
+subdir('xid_wraparound')
 subdir('test_bloomfilter')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
diff --git a/src/test/modules/xid_wraparound/.gitignore b/src/test/modules/xid_wraparound/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/src/test/modules/xid_wraparound/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/xid_wraparound/Makefile b/src/test/modules/xid_wraparound/Makefile
new file mode 100644
index 0000000000..7a6e0f6676
--- /dev/null
+++ b/src/test/modules/xid_wraparound/Makefile
@@ -0,0 +1,23 @@
+# src/test/modules/xid_wraparound/Makefile
+
+MODULE_big = xid_wraparound
+OBJS = \
+	$(WIN32RES) \
+	xid_wraparound.o
+PGFILEDESC = "xid_wraparound - tests for XID wraparound"
+
+EXTENSION = xid_wraparound
+DATA = xid_wraparound--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/xid_wraparound
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/xid_wraparound/README b/src/test/modules/xid_wraparound/README
new file mode 100644
index 0000000000..3aab464dec
--- /dev/null
+++ b/src/test/modules/xid_wraparound/README
@@ -0,0 +1,3 @@
+This module contains tests for XID wraparound. The tests use two
+helper functions to quickly consume lots of XIDs, to reach XID
+wraparound faster.
diff --git a/src/test/modules/xid_wraparound/meson.build b/src/test/modules/xid_wraparound/meson.build
new file mode 100644
index 0000000000..42f933525e
--- /dev/null
+++ b/src/test/modules/xid_wraparound/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+xid_wraparound_sources = files(
+  'xid_wraparound.c',
+)
+
+if host_system == 'windows'
+  xid_wraparound_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'xid_wraparound',
+    '--FILEDESC', 'xid_wraparound - tests for XID wraparound',])
+endif
+
+xid_wraparound = shared_module('xid_wraparound',
+  xid_wraparound_sources,
+  kwargs: pg_mod_args,
+)
+testprep_targets += xid_wraparound
+
+install_data(
+  'xid_wraparound.control',
+  'xid_wraparound--1.0.sql',
+  kwargs: contrib_data_args,
+)
+
+tests += {
+  'name': 'xid_wraparound',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_emergency_vacuum.pl',
+      't/002_limits.pl',
+      't/003_wraparounds.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
new file mode 100644
index 0000000000..dd75faaa91
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -0,0 +1,133 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test wraparound emergency autovacuum.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+	plan skip_all => "test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('main');
+
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create tables for a few different test scenarios
+$node->safe_psql(
+	'postgres', qq[
+CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large(data) SELECT generate_series(1,30000);
+
+CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large_trunc(data) SELECT generate_series(1,30000);
+
+CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small(data) SELECT generate_series(1,15000);
+
+CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small_trunc(data) SELECT generate_series(1,15000);
+
+CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
+INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
+]);
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $background_psql = $node->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+$background_psql->set_query_timer_restart();
+$background_psql->query_safe(
+	q{
+	BEGIN;
+	DELETE FROM large WHERE id % 2 = 0;
+	DELETE FROM large_trunc WHERE id > 10000;
+	DELETE FROM small WHERE id % 2 = 0;
+	DELETE FROM small_trunc WHERE id > 1000;
+	DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
+});
+
+# Consume 2 billion XIDs, to get us very close to wraparound
+$node->safe_psql('postgres',
+	qq[SELECT consume_xids_until('2000000000'::bigint)]);
+
+# Make sure the latest completed XID is advanced
+$node->safe_psql('postgres', qq[INSERT INTO small(data) SELECT 1]);
+
+# Check that all databases became old enough to trigger failsafe.
+my $ret = $node->safe_psql(
+	'postgres',
+	qq[
+SELECT datname,
+       age(datfrozenxid) > current_setting('vacuum_failsafe_age')::int as old
+FROM pg_database ORDER BY 1
+]);
+is( $ret, "postgres|t
+template0|t
+template1|t", "all databases became old");
+
+my $log_offset = -s $node->logfile;
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$background_psql->query_safe(q{COMMIT;});
+$background_psql->quit;
+
+# Wait until autovacuum processed all tables and advanced the
+# system-wide oldest-XID.
+$node->poll_query_until(
+	'postgres',
+	qq[
+SELECT NOT EXISTS (
+	SELECT *
+	FROM pg_database
+	WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
+]) or die "timeout waiting for all databases to be vacuumed";
+
+# Check if these tables are vacuumed.
+$ret = $node->safe_psql(
+	'postgres', qq[
+SELECT relname, age(relfrozenxid) > current_setting('autovacuum_freeze_max_age')::int
+FROM pg_class
+WHERE relname IN ('large', 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled')
+ORDER BY 1
+]);
+
+is( $ret, "autovacuum_disabled|f
+large|f
+large_trunc|f
+small|f
+small_trunc|f", "all tables are vacuumed");
+
+# Check if vacuum failsafe was triggered for each table.
+my $log_contents = slurp_file($node->logfile, $log_offset);
+foreach my $tablename ('large', 'large_trunc', 'small', 'small_trunc',
+	'autovacuum_disabled')
+{
+	like(
+		$log_contents,
+		qr/bypassing nonessential maintenance of table "postgres.public.$tablename" as a failsafe after \d+ index scans/,
+		"failsafe vacuum triggered for $tablename");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/002_limits.pl b/src/test/modules/xid_wraparound/t/002_limits.pl
new file mode 100644
index 0000000000..7e99e72613
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/002_limits.pl
@@ -0,0 +1,138 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Test XID wraparound limits.
+#
+# When you get close to XID wraparound, you start to get warnings, and
+# when you get even closer, the system refuses to assign any more XIDs
+# until the oldest databases have been vacuumed and datfrozenxid has
+# been advanced.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+	plan skip_all => "test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+my $ret;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql(
+	'postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('start');
+]);
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $background_psql = $node->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+$background_psql->query_safe(
+	q{
+	BEGIN;
+	INSERT INTO wraparoundtest VALUES ('oldxact');
+});
+
+# Consume 2 billion transactions, to get close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('after 1 billion')]);
+
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('after 2 billion')]);
+
+# We are now just under 150 million XIDs away from wraparound.
+# Continue consuming XIDs, in batches of 10 million, until we get
+# the warning:
+#
+#  WARNING:  database "postgres" must be vacuumed within 3000024 transactions
+#  HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
+#  You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
+my $stderr;
+my $warn_limit = 0;
+for my $i (1 .. 15)
+{
+	$node->psql(
+		'postgres', qq[SELECT consume_xids(10000000)],
+		stderr => \$stderr,
+		on_error_die => 1);
+
+	if ($stderr =~
+		/WARNING:  database "postgres" must be vacuumed within [0-9]+ transactions/
+	  )
+	{
+		# Reached the warn-limit
+		$warn_limit = 1;
+		last;
+	}
+}
+ok($warn_limit == 1, "warn-limit reached");
+
+# We can still INSERT, despite the warnings.
+$node->safe_psql('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('reached warn-limit')]);
+
+# Keep going. We'll hit the hard "stop" limit.
+$ret = $node->psql(
+	'postgres',
+	qq[SELECT consume_xids(100000000)],
+	stderr => \$stderr);
+like(
+	$stderr,
+	qr/ERROR:  database is not accepting commands to avoid wraparound data loss/,
+	"stop-limit");
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$background_psql->query_safe(q{COMMIT;});
+$background_psql->quit;
+
+# VACUUM, to freeze the tables and advance datfrozenxid.
+#
+# Autovacuum does this for the other databases, and would do it for
+# 'postgres' too, but let's test manual VACUUM.
+#
+$node->safe_psql('postgres', 'VACUUM');
+
+# Wait until autovacuum has processed the other databases and advanced
+# the system-wide oldest-XID.
+$ret =
+  $node->poll_query_until('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('after VACUUM')],
+	'INSERT 0 1');
+
+# Check the table contents
+$ret = $node->safe_psql('postgres', qq[SELECT * from wraparoundtest]);
+is( $ret, "start
+oldxact
+after 1 billion
+after 2 billion
+reached warn-limit
+after VACUUM");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
new file mode 100644
index 0000000000..be71b00a17
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -0,0 +1,60 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Consume a lot of XIDs, wrapping around a few times.
+#
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+	plan skip_all => "test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql(
+	'postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('beginning');
+]);
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Burn through 10 billion transactions in total, in batches of 100 million.
+my $ret;
+for my $i (1 .. 100)
+{
+	$ret = $node->safe_psql(
+		'postgres',
+		qq[SELECT consume_xids(100000000)],
+		timeout => $psql_timeout_secs);
+	$ret = $node->safe_psql('postgres',
+		qq[INSERT INTO wraparoundtest VALUES ('after $i batches')]);
+}
+
+$ret = $node->safe_psql('postgres', qq[SELECT COUNT(*) FROM wraparoundtest]);
+is($ret, "101");
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
new file mode 100644
index 0000000000..f5577adfdb
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/xid_wraparound/xid_wraparound--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION xid_wraparound" to load this file. \quit
+
+CREATE FUNCTION consume_xids(bigint)
+RETURNS bigint IMMUTABLE PARALLEL SAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION consume_xids_until(bigint)
+RETURNS bigint IMMUTABLE PARALLEL SAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.c b/src/test/modules/xid_wraparound/xid_wraparound.c
new file mode 100644
index 0000000000..c9d6034b55
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.c
@@ -0,0 +1,221 @@
+/*--------------------------------------------------------------------------
+ *
+ * xid_wraparound.c
+ *		Utilities for testing XID wraparound
+ *
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *		src/test/modules/xid_wraparound/xid_wraparound.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "access/xact.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/proc.h"
+
+PG_MODULE_MAGIC;
+
+static int64 consume_xids_shortcut(void);
+static FullTransactionId consume_xids_common(FullTransactionId untilxid, uint64 nxids);
+
+/*
+ * Consume the specified number of XIDs.
+ */
+PG_FUNCTION_INFO_V1(consume_xids);
+Datum
+consume_xids(PG_FUNCTION_ARGS)
+{
+	int64		nxids = PG_GETARG_INT64(0);
+	FullTransactionId lastxid;
+
+	if (nxids < 0)
+		elog(ERROR, "invalid nxids argument: %lld", (long long) nxids);
+
+	if (nxids == 0)
+		lastxid = ReadNextFullTransactionId();
+	else
+		lastxid = consume_xids_common(InvalidFullTransactionId, (uint64) nxids);
+
+	PG_RETURN_INT64((int64) U64FromFullTransactionId(lastxid));
+}
+
+/*
+ * Consume XIDs, up to the given XID.
+ */
+PG_FUNCTION_INFO_V1(consume_xids_until);
+Datum
+consume_xids_until(PG_FUNCTION_ARGS)
+{
+	FullTransactionId targetxid = FullTransactionIdFromU64((uint64) PG_GETARG_INT64(0));
+	FullTransactionId lastxid;
+
+	if (!FullTransactionIdIsNormal(targetxid))
+		elog(ERROR, "targetxid %llu is not normal", (unsigned long long) U64FromFullTransactionId(targetxid));
+
+	lastxid = consume_xids_common(targetxid, 0);
+
+	PG_RETURN_INT64((int64) U64FromFullTransactionId(lastxid));
+}
+
+/*
+ * Common functionality between the two public functions.
+ */
+static FullTransactionId
+consume_xids_common(FullTransactionId untilxid, uint64 nxids)
+{
+	FullTransactionId lastxid;
+	uint64		last_reported_at = 0;
+	uint64		consumed = 0;
+
+	/* Print a NOTICE every REPORT_INTERVAL xids */
+#define REPORT_INTERVAL (10*1000000)
+
+	/* initialize 'lastxid' with the system's current next XID */
+	lastxid = ReadNextFullTransactionId();
+
+	/*
+	 * We consume XIDs by calling GetNewTransactionId(true), which marks the
+	 * consumed XIDs as subtransactions of the current top-level transaction.
+	 * For that to work, this transaction must have a top-level XID.
+	 *
+	 * GetNewTransactionId registers them in the subxid cache in PGPROC, until
+	 * the cache overflows, but beyond that, we don't keep track of the
+	 * consumed XIDs.
+	 */
+	(void) GetTopTransactionId();
+
+	for (;;)
+	{
+		uint64		xids_left;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* How many XIDs do we have left to consume? */
+		if (nxids > 0)
+		{
+			if (consumed >= nxids)
+				break;
+			xids_left = nxids - consumed;
+		}
+		else
+		{
+			if (FullTransactionIdFollowsOrEquals(lastxid, untilxid))
+				break;
+			xids_left = U64FromFullTransactionId(untilxid) - U64FromFullTransactionId(lastxid);
+		}
+
+		/*
+		 * If we still have plenty of XIDs to consume, try to take a shortcut
+		 * and bump up the nextXid counter directly.
+		 */
+		if (xids_left > 2000 &&
+			consumed - last_reported_at < REPORT_INTERVAL &&
+			MyProc->subxidStatus.overflowed)
+		{
+			int64		consumed_by_shortcut = consume_xids_shortcut();
+
+			if (consumed_by_shortcut > 0)
+			{
+				consumed += consumed_by_shortcut;
+				continue;
+			}
+		}
+
+		/* Slow path: Call GetNewTransactionId to allocate a new XID. */
+		lastxid = GetNewTransactionId(true);
+		consumed++;
+
+		/* Report progress */
+		if (consumed - last_reported_at >= REPORT_INTERVAL)
+		{
+			if (nxids > 0)
+				elog(NOTICE, "consumed %llu / %llu XIDs, latest %u:%u",
+					 (unsigned long long) consumed, (unsigned long long) nxids,
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid));
+			else
+				elog(NOTICE, "consumed up to %u:%u / %u:%u",
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid),
+					 EpochFromFullTransactionId(untilxid),
+					 XidFromFullTransactionId(untilxid));
+			last_reported_at = consumed;
+		}
+	}
+
+	return lastxid;
+}
+
+
+/*
+ * These constants copied from .c files, because they're private.
+ */
+#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10)
+#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / sizeof(TransactionId))
+#define CLOG_XACTS_PER_BYTE 4
+#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)
+
+/*
+ * All the interesting action in GetNewTransactionId happens when we extend
+ * the SLRUs, or at the uint32 wraparound. If the nextXid counter is not close
+ * to any of those interesting values, take a shortcut and bump nextXID
+ * directly, close to the next "interesting" value.
+ */
+static inline uint32
+XidSkip(FullTransactionId fullxid)
+{
+	uint32		low = XidFromFullTransactionId(fullxid);
+	uint32		rem;
+	uint32		distance;
+
+	if (low < 5 || low >= UINT32_MAX - 5)
+		return 0;
+	distance = UINT32_MAX - 5 - low;
+
+	rem = low % COMMIT_TS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, COMMIT_TS_XACTS_PER_PAGE - rem);
+
+	rem = low % SUBTRANS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, SUBTRANS_XACTS_PER_PAGE - rem);
+
+	rem = low % CLOG_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, CLOG_XACTS_PER_PAGE - rem);
+
+	return distance;
+}
+
+static int64
+consume_xids_shortcut(void)
+{
+	FullTransactionId nextXid;
+	uint32		consumed;
+
+	LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+	nextXid = ShmemVariableCache->nextXid;
+
+	/*
+	 * Go slow near the "interesting values". The interesting zones include 5
+	 * transactions before and after SLRU page switches.
+	 */
+	consumed = XidSkip(nextXid);
+	if (consumed > 0)
+		ShmemVariableCache->nextXid.value += (uint64) consumed;
+
+	LWLockRelease(XidGenLock);
+
+	return consumed;
+}
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.control b/src/test/modules/xid_wraparound/xid_wraparound.control
new file mode 100644
index 0000000000..6c6964ed3d
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.control
@@ -0,0 +1,4 @@
+comment = 'Tests for XID wraparound'
+default_version = '1.0'
+module_pathname = '$libdir/xid_wraparound'
+relocatable = true
-- 
2.31.1
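
A note on the shortcut path in xid_wraparound.c above: the skip distance can be checked outside the server. What follows is only a standalone sketch, not part of the patch. It assumes the default 8 kB BLCKSZ and a 4-byte TransactionId, and every sketch_ name is made up for illustration. It mirrors the arithmetic of XidSkip() on a plain 32-bit XID.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: default 8 kB block size; the patch itself uses BLCKSZ. */
#define SKETCH_BLCKSZ 8192u

/* XIDs per SLRU page, mirroring the constants copied into the patch. */
#define SKETCH_COMMIT_TS_XACTS_PER_PAGE (SKETCH_BLCKSZ / 10u)
#define SKETCH_SUBTRANS_XACTS_PER_PAGE  (SKETCH_BLCKSZ / 4u)  /* assumes sizeof(TransactionId) == 4 */
#define SKETCH_CLOG_XACTS_PER_PAGE      (SKETCH_BLCKSZ * 4u)

/*
 * Hypothetical stand-in for XidSkip(): given the low 32 bits of nextXid,
 * return how many XIDs may be skipped, or 0 if the caller must go slow.
 */
uint32_t
sketch_xid_skip(uint32_t low)
{
    static const uint32_t per_page[] = {
        SKETCH_COMMIT_TS_XACTS_PER_PAGE,
        SKETCH_SUBTRANS_XACTS_PER_PAGE,
        SKETCH_CLOG_XACTS_PER_PAGE
    };
    uint32_t    distance;
    size_t      i;

    /* Go slow within 5 XIDs of the uint32 wraparound point. */
    if (low < 5 || low >= UINT32_MAX - 5)
        return 0;
    distance = UINT32_MAX - 5 - low;

    for (i = 0; i < sizeof(per_page) / sizeof(per_page[0]); i++)
    {
        uint32_t    rem = low % per_page[i];

        /* Exactly on a page boundary: this XID extends an SLRU, go slow. */
        if (rem == 0)
            return 0;

        /* Otherwise never jump past the next page boundary. */
        if (per_page[i] - rem < distance)
            distance = per_page[i] - rem;
    }

    return distance;
}
```

With these assumptions, sketch_xid_skip(100) yields 719, which moves nextXid to 819, the first XID of the second commit_ts page. At that value the function returns 0, so the caller would fall back to the slow GetNewTransactionId() path for the XID that extends the SLRU.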

#43Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#42)
Re: Testing autovacuum wraparound (including failsafe)

On 27 Sep 2023, at 14:39, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I've attached new versions of the patches. The 0001 patch adds an option to
background_psql to specify the timeout in seconds, and the 0002 patch is the
main regression test patch.

-=item PostgreSQL::Test::BackgroundPsql->new(interactive, @params)
+=item PostgreSQL::Test::BackgroundPsql->new(interactive, @params, timeout)

Looking at this I notice that I made a typo in 664d757531e, the =item line
should have "@psql_params" and not "@params". Perhaps you can fix that minor
thing while in there?

+ $timeout = $params{timeout} if defined $params{timeout};

I think this should be documented in the background_psql POD docs.

+ Not enabled by default it is resource intensive.

This sentence is missing a "because", should read: "..by default *because* it
is.."

+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;

Should we bump the timeout like this for all systems? I interpreted Noah's
comment such that it should be possible for slower systems to override, not
that it should be extended everywhere, but I might have missed something.

--
Daniel Gustafsson

#44vignesh C
vignesh21@gmail.com
In reply to: Masahiko Sawada (#42)
Re: Testing autovacuum wraparound (including failsafe)

On Thu, 28 Sept 2023 at 03:55, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Sorry for the late reply.

On Sun, Sep 3, 2023 at 2:48 PM Noah Misch <noah@leadboat.com> wrote:

On Wed, Jul 12, 2023 at 01:47:51PM +0200, Daniel Gustafsson wrote:

On 12 Jul 2023, at 09:52, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Agreed. The timeout can be set by manually setting
PG_TEST_TIMEOUT_DEFAULT, but I bump it to 10 min by default. And it
now requires setting PG_TEST_EXTRA to run it.

+# bump the query timeout to avoid false negatives on slow test syetems.
typo: s/syetems/systems/

+# bump the query timeout to avoid false negatives on slow test syetems.
+$ENV{PG_TEST_TIMEOUT_DEFAULT} = 600;
Does this actually work?  Utils.pm reads the environment variable at compile
time in the BEGIN block, so this setting won't be seen?  A quick test program
seems to confirm this but I might be missing something.

The correct way to get a longer timeout is "IPC::Run::timer(4 *
$PostgreSQL::Test::Utils::timeout_default);". Even if changing env worked,
that would be removing the ability for even-slower systems to set timeouts
greater than 10min.

Agreed.

I've attached new versions of the patches. The 0001 patch adds an option to
background_psql to specify the timeout in seconds, and the 0002 patch is the
main regression test patch.

Few comments:
1) Should we have some validation for the inputs given:
+PG_FUNCTION_INFO_V1(consume_xids_until);
+Datum
+consume_xids_until(PG_FUNCTION_ARGS)
+{
+       FullTransactionId targetxid =
FullTransactionIdFromU64((uint64) PG_GETARG_INT64(0));
+       FullTransactionId lastxid;
+
+       if (!FullTransactionIdIsNormal(targetxid))
+               elog(ERROR, "targetxid %llu is not normal", (unsigned
long long) U64FromFullTransactionId(targetxid));

If not, it will accept inputs like -1 and 999999999999999.
Also, the notice messages might be confusing for such values, since they
show an untilxid value that looks different from the input, as below:
postgres=# SELECT consume_xids_until(999999999999999);
NOTICE: consumed up to 0:10000809 / 232830:2764472319

2) Should this be added after worker_spi, since we generally keep these in
alphabetical order:
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index fcd643f6f1..4054bde84c 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -10,6 +10,7 @@ subdir('libpq_pipeline')
 subdir('plsample')
 subdir('spgist_name_ops')
 subdir('ssl_passphrase_callback')
+subdir('xid_wraparound')
 subdir('test_bloomfilter')
3) Similarly here too:
index e81873cb5a..a4c845ab4a 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -13,6 +13,7 @@ SUBDIRS = \
                  libpq_pipeline \
                  plsample \
                  spgist_name_ops \
+                 xid_wraparound \
                  test_bloomfilter \
4) The following includes are not required: transam.h, fmgr.h, lwlock.h
+ *             src/test/modules/xid_wraparound/xid_wraparound.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "access/xact.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/proc.h"

Regards,
Vignesh

#45Noah Misch
noah@leadboat.com
In reply to: Daniel Gustafsson (#43)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Sep 29, 2023 at 12:17:04PM +0200, Daniel Gustafsson wrote:

+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;

Should we bump the timeout like this for all systems? I interpreted Noah's
comment such that it should be possible for slower systems to override, not
that it should be extended everywhere, but I might have missed something.

This is the conventional way to do it. For an operation far slower than a
typical timeout_default situation, the patch can and should dilate the default
timeout like this. The patch version as of my last comment was extending the
timeout but also blocking users from extending it further via
PG_TEST_TIMEOUT_DEFAULT. The latest version restores PG_TEST_TIMEOUT_DEFAULT
reactivity, resolving my comment.
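
To make the arithmetic concrete, here is a small sketch in C (the language of the test module; the real logic lives in the Perl test utilities). It is an illustration only. The sketch_ function names are hypothetical, the 180-second fallback is an assumption about the default used when PG_TEST_TIMEOUT_DEFAULT is unset, and the input validation is a choice made for this sketch. The point is that multiplying the default keeps the environment variable effective, whereas assigning a fixed value would not.

```c
#include <stdlib.h>

/* Assumption: fallback used when PG_TEST_TIMEOUT_DEFAULT is unset or empty. */
#define SKETCH_FALLBACK_TIMEOUT_SECS 180L

/*
 * Hypothetical helper: derive the base timeout from the value of the
 * environment variable (pass the result of getenv(), which may be NULL).
 */
long
sketch_timeout_default(const char *env_value)
{
    char       *end;
    long        secs;

    if (env_value == NULL || env_value[0] == '\0')
        return SKETCH_FALLBACK_TIMEOUT_SECS;

    secs = strtol(env_value, &end, 10);

    /* Sketch-only validation: unparsable or non-positive values use the fallback. */
    if (*end != '\0' || secs <= 0)
        return SKETCH_FALLBACK_TIMEOUT_SECS;

    return secs;
}

/*
 * Dilate the base timeout by a factor instead of replacing it, so a user
 * override of the base value still scales the result.
 */
long
sketch_dilated_timeout(const char *env_value, long factor)
{
    return factor * sketch_timeout_default(env_value);
}
```

Called as sketch_dilated_timeout(getenv("PG_TEST_TIMEOUT_DEFAULT"), 4), this gives 4 * 180 = 720 seconds when the variable is unset and 2400 seconds when it is set to 600, so a slower machine can still stretch the limit.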

#46Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Daniel Gustafsson (#43)
Re: Testing autovacuum wraparound (including failsafe)

On Fri, Sep 29, 2023 at 7:17 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Sep 2023, at 14:39, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I've attached new versions of the patches. The 0001 patch adds an option to
background_psql to specify the timeout in seconds, and the 0002 patch is the
main regression test patch.

-=item PostgreSQL::Test::BackgroundPsql->new(interactive, @params)
+=item PostgreSQL::Test::BackgroundPsql->new(interactive, @params, timeout)

Looking at this I notice that I made a typo in 664d757531e, the =item line
should have "@psql_params" and not "@params". Perhaps you can fix that minor
thing while in there?

+ $timeout = $params{timeout} if defined $params{timeout};

I think this should be documented in the background_psql POD docs.

While updating the documentation, I found the following description:

=item $node->background_psql($dbname, %params) =>
PostgreSQL::Test::BackgroundPsql inst$
Invoke B<psql> on B<$dbname> and return a BackgroundPsql object.

A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up,
which can be modified later.

Is it true that we can modify the timeout after creating a
BackgroundPsql object? If so, it seems we don't need to introduce the
new timeout argument. But how?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#47Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#46)
Re: Testing autovacuum wraparound (including failsafe)

On 27 Nov 2023, at 14:06, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Is it true that we can modify the timeout after creating
BackgroundPsql object? If so, it seems we don't need to introduce the
new timeout argument. But how?

I can't remember if that's a leftover that incorrectly remains from an earlier
version of the BackgroundPsql work, or if it's a very bad explanation of
->set_query_timer_restart(). The timeout uses the timeout_default value,
which cannot be overridden; it can only be reset per query.

With your patch the timeout still cannot be changed later, but it can at least
be set at start, which seems good enough until there are tests warranting more
complexity. The docs should be corrected to reflect this in your patch.

--
Daniel Gustafsson

#48Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Daniel Gustafsson (#47)
3 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

On Mon, Nov 27, 2023 at 10:40 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Nov 2023, at 14:06, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Is it true that we can modify the timeout after creating
BackgroundPsql object? If so, it seems we don't need to introduce the
new timeout argument. But how?

I can't remember if that's a leftover that incorrectly remains from an earlier
version of the BackgroundPsql work, or if it's a very bad explanation of
->set_query_timer_restart(). The timeout uses the timeout_default value,
which cannot be overridden; it can only be reset per query.

Thank you for confirming this. I see the same problem in
interactive_psql() as well, so I've attached the 0001 patch to fix these
documentation issues. It could be backpatched.

With your patch the timeout still cannot be changed later, but it can at least
be set at start, which seems good enough until there are tests warranting more
complexity. The docs should be corrected to reflect this in your patch.

I've incorporated the comments except for the following one, and
attached updated versions of the remaining patches:

On Fri, Sep 29, 2023 at 7:20 PM vignesh C <vignesh21@gmail.com> wrote:

Few comments:
1) Should we have some validation for the inputs given:
+PG_FUNCTION_INFO_V1(consume_xids_until);
+Datum
+consume_xids_until(PG_FUNCTION_ARGS)
+{
+       FullTransactionId targetxid =
FullTransactionIdFromU64((uint64) PG_GETARG_INT64(0));
+       FullTransactionId lastxid;
+
+       if (!FullTransactionIdIsNormal(targetxid))
+               elog(ERROR, "targetxid %llu is not normal", (unsigned
long long) U64FromFullTransactionId(targetxid));

If not, it will accept inputs like -1 and 999999999999999.
Also, the notice messages might be confusing for such values, since they
show an untilxid value that looks different from the input, as below:
postgres=# SELECT consume_xids_until(999999999999999);
NOTICE: consumed up to 0:10000809 / 232830:2764472319

The full transaction ids shown in the notice messages are split into
epoch and xid, so the value is not actually different. This epoch-and-xid
style is also used in pg_controldata output, and it makes sense to me as
a way to show the progress of xid consumption.
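
As a quick check of the split, here is a standalone C sketch (not part of the patch; the sketch_ names are hypothetical). It assumes the usual layout of a 64-bit full transaction id, with the epoch in the upper 32 bits and the xid in the lower 32 bits, and it assumes that the "normal" check amounts to comparing the 64-bit value against 3, the first normal transaction id.

```c
#include <stdint.h>

/* Hypothetical pair type holding the two halves of a full transaction id. */
typedef struct SketchEpochXid
{
    uint32_t    epoch;          /* upper 32 bits */
    uint32_t    xid;            /* lower 32 bits */
} SketchEpochXid;

/* Split a 64-bit full transaction id into its epoch and xid parts. */
SketchEpochXid
sketch_split_full_xid(uint64_t value)
{
    SketchEpochXid result;

    result.epoch = (uint32_t) (value >> 32);
    result.xid = (uint32_t) (value & UINT64_C(0xFFFFFFFF));
    return result;
}

/*
 * Assumption: a full transaction id counts as "normal" when its 64-bit
 * value is at least 3. Returns 1 if normal, 0 otherwise.
 */
int
sketch_full_xid_is_normal(uint64_t value)
{
    return value >= UINT64_C(3);
}
```

For 999999999999999 this gives epoch 232830 and xid 2764472319, matching the NOTICE quoted above. An argument of -1, once cast to uint64, has all bits set, so under the stated assumption it also passes the normality check.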

Once the new test gets committed, I'll prepare a new buildfarm animal
for that if possible.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v6-0001-fix-wrong-description-of-BackgroundPsql-s-timeout.patch (application/octet-stream)
From ad61cb7fdfd0cb484314d7fd3db844669a583da6 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 28 Nov 2023 10:34:48 +0900
Subject: [PATCH v6 1/3] fix wrong description of BackgroundPsql's timeout.

---
 src/test/perl/PostgreSQL/Test/Cluster.pm | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index c3d46c7c70..4b7baa908f 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -2028,8 +2028,7 @@ sub psql
 
 Invoke B<psql> on B<$dbname> and return a BackgroundPsql object.
 
-A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up,
-which can be modified later.
+A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up.
 
 psql is invoked in tuples-only unaligned mode with reading of B<.psqlrc>
 disabled.  That may be overridden by passing extra psql parameters.
@@ -2095,8 +2094,7 @@ sub background_psql
 Invoke B<psql> on B<$dbname> and return a BackgroundPsql object, which the
 caller may use to send interactive input to B<psql>.
 
-A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up,
-which can be modified later.
+A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up.
 
 psql is invoked in tuples-only unaligned mode with reading of B<.psqlrc>
 disabled.  That may be overridden by passing extra psql parameters.
-- 
2.31.1

v6-0003-Add-tests-for-XID-wraparound.patch (application/octet-stream)
From aff49fdf5715c28f593d3b7c63d7332e7f3e840a Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 3 Mar 2023 12:01:28 +0200
Subject: [PATCH v6 3/3] Add tests for XID wraparound.

The test module includes helper functions to quickly burn through lots
of XIDs. They are used in the tests, and are also handy for manually
testing XID wraparound.

Since these tests are very expensive, the entire suite is disabled by
default. It requires setting PG_TEST_EXTRA to run it.

Reviewed-by: Daniel Gustafsson, John Naylor, Michael Paquier,
Reviewed-by: vignesh C
Author: Heikki Linnakangas, Masahiko Sawada, Andres Freund
Discussion: https://www.postgresql.org/message-id/CAD21AoDVhkXp8HjpFO-gp3TgL6tCKcZQNxn04m01VAtcSi-5sA%40mail.gmail.com
---
 doc/src/sgml/regress.sgml                     |  10 +
 src/test/modules/Makefile                     |   3 +-
 src/test/modules/meson.build                  |   1 +
 src/test/modules/xid_wraparound/.gitignore    |   4 +
 src/test/modules/xid_wraparound/Makefile      |  23 ++
 src/test/modules/xid_wraparound/README        |   3 +
 src/test/modules/xid_wraparound/meson.build   |  36 +++
 .../xid_wraparound/t/001_emergency_vacuum.pl  | 133 +++++++++++
 .../modules/xid_wraparound/t/002_limits.pl    | 138 +++++++++++
 .../xid_wraparound/t/003_wraparounds.pl       |  60 +++++
 .../xid_wraparound/xid_wraparound--1.0.sql    |  12 +
 .../modules/xid_wraparound/xid_wraparound.c   | 219 ++++++++++++++++++
 .../xid_wraparound/xid_wraparound.control     |   4 +
 13 files changed, 645 insertions(+), 1 deletion(-)
 create mode 100644 src/test/modules/xid_wraparound/.gitignore
 create mode 100644 src/test/modules/xid_wraparound/Makefile
 create mode 100644 src/test/modules/xid_wraparound/README
 create mode 100644 src/test/modules/xid_wraparound/meson.build
 create mode 100644 src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
 create mode 100644 src/test/modules/xid_wraparound/t/002_limits.pl
 create mode 100644 src/test/modules/xid_wraparound/t/003_wraparounds.pl
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.c
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.control

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 69f627d7f4..70d9bdefe1 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -314,6 +314,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance'
       </para>
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><literal>xid_wraparound</literal></term>
+     <listitem>
+      <para>
+       Runs the test suite under <filename>src/test/modules/xid_wraparound</filename>.
+       Not enabled by default because it is resource intensive.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
 
    Tests for features that are not supported by the current build
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index a18e4d28a0..5d33fa6a9a 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -34,7 +34,8 @@ SUBDIRS = \
 		  test_shm_mq \
 		  test_slru \
 		  unsafe_tests \
-		  worker_spi
+		  worker_spi \
+		  xid_wraparound
 
 ifeq ($(with_ssl),openssl)
 SUBDIRS += ssl_passphrase_callback
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 4e83c0f8d7..b76f588559 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -32,3 +32,4 @@ subdir('test_shm_mq')
 subdir('test_slru')
 subdir('unsafe_tests')
 subdir('worker_spi')
+subdir('xid_wraparound')
diff --git a/src/test/modules/xid_wraparound/.gitignore b/src/test/modules/xid_wraparound/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/src/test/modules/xid_wraparound/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/xid_wraparound/Makefile b/src/test/modules/xid_wraparound/Makefile
new file mode 100644
index 0000000000..7a6e0f6676
--- /dev/null
+++ b/src/test/modules/xid_wraparound/Makefile
@@ -0,0 +1,23 @@
+# src/test/modules/xid_wraparound/Makefile
+
+MODULE_big = xid_wraparound
+OBJS = \
+	$(WIN32RES) \
+	xid_wraparound.o
+PGFILEDESC = "xid_wraparound - tests for XID wraparound"
+
+EXTENSION = xid_wraparound
+DATA = xid_wraparound--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/xid_wraparound
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/xid_wraparound/README b/src/test/modules/xid_wraparound/README
new file mode 100644
index 0000000000..3aab464dec
--- /dev/null
+++ b/src/test/modules/xid_wraparound/README
@@ -0,0 +1,3 @@
+This module contains tests for XID wraparound. The tests use two
+helper functions to quickly consume lots of XIDs, to reach XID
+wraparound faster.
diff --git a/src/test/modules/xid_wraparound/meson.build b/src/test/modules/xid_wraparound/meson.build
new file mode 100644
index 0000000000..42f933525e
--- /dev/null
+++ b/src/test/modules/xid_wraparound/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+xid_wraparound_sources = files(
+  'xid_wraparound.c',
+)
+
+if host_system == 'windows'
+  xid_wraparound_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'xid_wraparound',
+    '--FILEDESC', 'xid_wraparound - tests for XID wraparound',])
+endif
+
+xid_wraparound = shared_module('xid_wraparound',
+  xid_wraparound_sources,
+  kwargs: pg_mod_args,
+)
+testprep_targets += xid_wraparound
+
+install_data(
+  'xid_wraparound.control',
+  'xid_wraparound--1.0.sql',
+  kwargs: contrib_data_args,
+)
+
+tests += {
+  'name': 'xid_wraparound',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_emergency_vacuum.pl',
+      't/002_limits.pl',
+      't/003_wraparounds.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
new file mode 100644
index 0000000000..2ae1667d58
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -0,0 +1,133 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test wraparound emergency autovacuum.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+	plan skip_all => "test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('main');
+
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create tables for a few different test scenarios
+$node->safe_psql(
+	'postgres', qq[
+CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large(data) SELECT generate_series(1,30000);
+
+CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large_trunc(data) SELECT generate_series(1,30000);
+
+CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small(data) SELECT generate_series(1,15000);
+
+CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small_trunc(data) SELECT generate_series(1,15000);
+
+CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
+INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
+]);
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $background_psql = $node->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+$background_psql->set_query_timer_restart();
+$background_psql->query_safe(
+	q{
+	BEGIN;
+	DELETE FROM large WHERE id % 2 = 0;
+	DELETE FROM large_trunc WHERE id > 10000;
+	DELETE FROM small WHERE id % 2 = 0;
+	DELETE FROM small_trunc WHERE id > 1000;
+	DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
+});
+
+# Consume 2 billion XIDs, to get us very close to wraparound
+$node->safe_psql('postgres',
+	qq[SELECT consume_xids_until('2000000000'::xid8)]);
+
+# Make sure the latest completed XID is advanced
+$node->safe_psql('postgres', qq[INSERT INTO small(data) SELECT 1]);
+
+# Check that all databases became old enough to trigger failsafe.
+my $ret = $node->safe_psql(
+	'postgres',
+	qq[
+SELECT datname,
+       age(datfrozenxid) > current_setting('vacuum_failsafe_age')::int as old
+FROM pg_database ORDER BY 1
+]);
+is( $ret, "postgres|t
+template0|t
+template1|t", "all databases became old");
+
+my $log_offset = -s $node->logfile;
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$background_psql->query_safe(q{COMMIT;});
+$background_psql->quit;
+
+# Wait until autovacuum processed all tables and advanced the
+# system-wide oldest-XID.
+$node->poll_query_until(
+	'postgres',
+	qq[
+SELECT NOT EXISTS (
+	SELECT *
+	FROM pg_database
+	WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
+]) or die "timeout waiting for all databases to be vacuumed";
+
+# Check if these tables are vacuumed.
+$ret = $node->safe_psql(
+	'postgres', qq[
+SELECT relname, age(relfrozenxid) > current_setting('autovacuum_freeze_max_age')::int
+FROM pg_class
+WHERE relname IN ('large', 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled')
+ORDER BY 1
+]);
+
+is( $ret, "autovacuum_disabled|f
+large|f
+large_trunc|f
+small|f
+small_trunc|f", "all tables are vacuumed");
+
+# Check if vacuum failsafe was triggered for each table.
+my $log_contents = slurp_file($node->logfile, $log_offset);
+foreach my $tablename ('large', 'large_trunc', 'small', 'small_trunc',
+	'autovacuum_disabled')
+{
+	like(
+		$log_contents,
+		qr/bypassing nonessential maintenance of table "postgres.public.$tablename" as a failsafe after \d+ index scans/,
+		"failsafe vacuum triggered for $tablename");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/002_limits.pl b/src/test/modules/xid_wraparound/t/002_limits.pl
new file mode 100644
index 0000000000..b35352f25b
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/002_limits.pl
@@ -0,0 +1,138 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Test XID wraparound limits.
+#
+# When you get close to XID wraparound, you start to get warnings, and
+# when you get even closer, the system refuses to assign any more XIDs
+# until the oldest databases have been vacuumed and datfrozenxid has
+# been advanced.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+	plan skip_all => "test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+my $ret;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql(
+	'postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('start');
+]);
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $background_psql = $node->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+$background_psql->query_safe(
+	q{
+	BEGIN;
+	INSERT INTO wraparoundtest VALUES ('oldxact');
+});
+
+# Consume 2 billion transactions, to get close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('after 1 billion')]);
+
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('after 2 billion')]);
+
+# We are now just under 150 million XIDs away from wraparound.
+# Continue consuming XIDs, in batches of 10 million, until we get
+# the warning:
+#
+#  WARNING:  database "postgres" must be vacuumed within 3000024 transactions
+#  HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
+#  You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
+my $stderr;
+my $warn_limit = 0;
+for my $i (1 .. 15)
+{
+	$node->psql(
+		'postgres', qq[SELECT consume_xids(10000000)],
+		stderr => \$stderr,
+		on_error_die => 1);
+
+	if ($stderr =~
+		/WARNING:  database "postgres" must be vacuumed within [0-9]+ transactions/
+	  )
+	{
+		# Reached the warn-limit
+		$warn_limit = 1;
+		last;
+	}
+}
+ok($warn_limit == 1, "warn-limit reached");
+
+# We can still INSERT, despite the warnings.
+$node->safe_psql('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('reached warn-limit')]);
+
+# Keep going. We'll hit the hard "stop" limit.
+$ret = $node->psql(
+	'postgres',
+	qq[SELECT consume_xids(100000000)],
+	stderr => \$stderr);
+like(
+	$stderr,
+	qr/ERROR:  database is not accepting commands that assign new XIDs to avoid wraparound data loss in database "postgres"/,
+	"stop-limit");
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$background_psql->query_safe(q{COMMIT;});
+$background_psql->quit;
+
+# VACUUM, to freeze the tables and advance datfrozenxid.
+#
+# Autovacuum does this for the other databases, and would do it for
+# 'postgres' too, but let's test manual VACUUM.
+#
+$node->safe_psql('postgres', 'VACUUM');
+
+# Wait until autovacuum has processed the other databases and advanced
+# the system-wide oldest-XID.
+$ret =
+  $node->poll_query_until('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('after VACUUM')],
+	'INSERT 0 1');
+
+# Check the table contents
+$ret = $node->safe_psql('postgres', qq[SELECT * from wraparoundtest]);
+is( $ret, "start
+oldxact
+after 1 billion
+after 2 billion
+reached warn-limit
+after VACUUM");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
new file mode 100644
index 0000000000..be71b00a17
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -0,0 +1,60 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Consume a lot of XIDs, wrapping around a few times.
+#
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+	plan skip_all => "test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql(
+	'postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('beginning');
+]);
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Burn through 10 billion transactions in total, in batches of 100 million.
+my $ret;
+for my $i (1 .. 100)
+{
+	$ret = $node->safe_psql(
+		'postgres',
+		qq[SELECT consume_xids(100000000)],
+		timeout => $psql_timeout_secs);
+	$ret = $node->safe_psql('postgres',
+		qq[INSERT INTO wraparoundtest VALUES ('after $i batches')]);
+}
+
+$ret = $node->safe_psql('postgres', qq[SELECT COUNT(*) FROM wraparoundtest]);
+is($ret, "101");
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
new file mode 100644
index 0000000000..51d25fc4c6
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/xid_wraparound/xid_wraparound--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION xid_wraparound" to load this file. \quit
+
+CREATE FUNCTION consume_xids(nxids bigint)
+RETURNS xid8 VOLATILE PARALLEL UNSAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION consume_xids_until(targetxid xid8)
+RETURNS xid8 VOLATILE PARALLEL UNSAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.c b/src/test/modules/xid_wraparound/xid_wraparound.c
new file mode 100644
index 0000000000..312eebbbc8
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.c
@@ -0,0 +1,219 @@
+/*--------------------------------------------------------------------------
+ *
+ * xid_wraparound.c
+ *		Utilities for testing XID wraparound
+ *
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *		src/test/modules/xid_wraparound/xid_wraparound.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/xact.h"
+#include "miscadmin.h"
+#include "storage/proc.h"
+#include "utils/xid8.h"
+
+PG_MODULE_MAGIC;
+
+static int64 consume_xids_shortcut(void);
+static FullTransactionId consume_xids_common(FullTransactionId untilxid, uint64 nxids);
+
+/*
+ * Consume the specified number of XIDs.
+ */
+PG_FUNCTION_INFO_V1(consume_xids);
+Datum
+consume_xids(PG_FUNCTION_ARGS)
+{
+	int64		nxids = PG_GETARG_INT64(0);
+	FullTransactionId lastxid;
+
+	if (nxids < 0)
+		elog(ERROR, "invalid nxids argument: %lld", (long long) nxids);
+
+	if (nxids == 0)
+		lastxid = ReadNextFullTransactionId();
+	else
+		lastxid = consume_xids_common(InvalidFullTransactionId, (uint64) nxids);
+
+	PG_RETURN_FULLTRANSACTIONID(lastxid);
+}
+
+/*
+ * Consume XIDs, up to the given XID.
+ */
+PG_FUNCTION_INFO_V1(consume_xids_until);
+Datum
+consume_xids_until(PG_FUNCTION_ARGS)
+{
+	FullTransactionId targetxid = PG_GETARG_FULLTRANSACTIONID(0);
+	FullTransactionId lastxid;
+
+	if (!FullTransactionIdIsNormal(targetxid))
+		elog(ERROR, "targetxid %llu is not normal",
+			 (unsigned long long) U64FromFullTransactionId(targetxid));
+
+	lastxid = consume_xids_common(targetxid, 0);
+
+	PG_RETURN_FULLTRANSACTIONID(lastxid);
+}
+
+/*
+ * Common functionality between the two public functions.
+ */
+static FullTransactionId
+consume_xids_common(FullTransactionId untilxid, uint64 nxids)
+{
+	FullTransactionId lastxid;
+	uint64		last_reported_at = 0;
+	uint64		consumed = 0;
+
+	/* Print a NOTICE every REPORT_INTERVAL xids */
+#define REPORT_INTERVAL (10 * 1000000)
+
+	/* initialize 'lastxid' with the system's current next XID */
+	lastxid = ReadNextFullTransactionId();
+
+	/*
+	 * We consume XIDs by calling GetNewTransactionId(true), which marks the
+	 * consumed XIDs as subtransactions of the current top-level transaction.
+	 * For that to work, this transaction must have a top-level XID.
+	 *
+	 * GetNewTransactionId registers them in the subxid cache in PGPROC, until
+	 * the cache overflows, but beyond that, we don't keep track of the
+	 * consumed XIDs.
+	 */
+	(void) GetTopTransactionId();
+
+	for (;;)
+	{
+		uint64		xids_left;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* How many XIDs do we have left to consume? */
+		if (nxids > 0)
+		{
+			if (consumed >= nxids)
+				break;
+			xids_left = nxids - consumed;
+		}
+		else
+		{
+			if (FullTransactionIdFollowsOrEquals(lastxid, untilxid))
+				break;
+			xids_left = U64FromFullTransactionId(untilxid) - U64FromFullTransactionId(lastxid);
+		}
+
+		/*
+		 * If we still have plenty of XIDs to consume, try to take a shortcut
+		 * and bump up the nextXid counter directly.
+		 */
+		if (xids_left > 2000 &&
+			consumed - last_reported_at < REPORT_INTERVAL &&
+			MyProc->subxidStatus.overflowed)
+		{
+			int64		consumed_by_shortcut = consume_xids_shortcut();
+
+			if (consumed_by_shortcut > 0)
+			{
+				consumed += consumed_by_shortcut;
+				continue;
+			}
+		}
+
+		/* Slow path: Call GetNewTransactionId to allocate a new XID. */
+		lastxid = GetNewTransactionId(true);
+		consumed++;
+
+		/* Report progress */
+		if (consumed - last_reported_at >= REPORT_INTERVAL)
+		{
+			if (nxids > 0)
+				elog(NOTICE, "consumed %llu / %llu XIDs, latest %u:%u",
+					 (unsigned long long) consumed, (unsigned long long) nxids,
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid));
+			else
+				elog(NOTICE, "consumed up to %u:%u / %u:%u",
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid),
+					 EpochFromFullTransactionId(untilxid),
+					 XidFromFullTransactionId(untilxid));
+			last_reported_at = consumed;
+		}
+	}
+
+	return lastxid;
+}
+
+/*
+ * These constants copied from .c files, because they're private.
+ */
+#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10)
+#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / sizeof(TransactionId))
+#define CLOG_XACTS_PER_BYTE 4
+#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)
+
+/*
+ * All the interesting action in GetNewTransactionId happens when we extend
+ * the SLRUs, or at the uint32 wraparound. If the nextXid counter is not close
+ * to any of those interesting values, take a shortcut and bump nextXID
+ * directly, close to the next "interesting" value.
+ */
+static inline uint32
+XidSkip(FullTransactionId fullxid)
+{
+	uint32		low = XidFromFullTransactionId(fullxid);
+	uint32		rem;
+	uint32		distance;
+
+	if (low < 5 || low >= UINT32_MAX - 5)
+		return 0;
+	distance = UINT32_MAX - 5 - low;
+
+	rem = low % COMMIT_TS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, COMMIT_TS_XACTS_PER_PAGE - rem);
+
+	rem = low % SUBTRANS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, SUBTRANS_XACTS_PER_PAGE - rem);
+
+	rem = low % CLOG_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, CLOG_XACTS_PER_PAGE - rem);
+
+	return distance;
+}
+
+static int64
+consume_xids_shortcut(void)
+{
+	FullTransactionId nextXid;
+	uint32		consumed;
+
+	LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+	nextXid = ShmemVariableCache->nextXid;
+
+	/*
+	 * Go slow near the "interesting values". The interesting zones include 5
+	 * transactions before and after SLRU page switches.
+	 */
+	consumed = XidSkip(nextXid);
+	if (consumed > 0)
+		ShmemVariableCache->nextXid.value += (uint64) consumed;
+
+	LWLockRelease(XidGenLock);
+
+	return consumed;
+}
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.control b/src/test/modules/xid_wraparound/xid_wraparound.control
new file mode 100644
index 0000000000..6c6964ed3d
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.control
@@ -0,0 +1,4 @@
+comment = 'Tests for XID wraparound'
+default_version = '1.0'
+module_pathname = '$libdir/xid_wraparound'
+relocatable = true
-- 
2.31.1
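
The shortcut taken by consume_xids_shortcut() in the patch above is easier to follow in isolation. Below is a minimal standalone sketch of the XidSkip() arithmetic, assuming the default BLCKSZ of 8192 and a 4-byte TransactionId. The name xid_skip and the use of a plain uint64_t in place of FullTransactionId are illustrative choices, not part of the patch.

```c
#include <stdint.h>

/* Assumption: default 8 kB block size; the real values depend on BLCKSZ. */
#define BLCKSZ 8192u
/* Same per-page constants as in the patch, with sizeof(TransactionId) == 4. */
#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10u)
#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / 4u)
#define CLOG_XACTS_PER_PAGE (BLCKSZ * 4u)

static uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Standalone mirror of XidSkip(): return how many XIDs could be skipped
 * before reaching the next SLRU page boundary or the 32-bit wraparound
 * point, or 0 if the low 32 bits are already at such a value.
 */
static uint32_t
xid_skip(uint64_t fullxid)
{
	uint32_t	low = (uint32_t) fullxid;	/* drop the epoch */
	uint32_t	rem;
	uint32_t	distance;

	/* Go slow right around the uint32 wraparound. */
	if (low < 5 || low >= UINT32_MAX - 5)
		return 0;
	distance = UINT32_MAX - 5 - low;

	rem = low % COMMIT_TS_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, COMMIT_TS_XACTS_PER_PAGE - rem);

	rem = low % SUBTRANS_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, SUBTRANS_XACTS_PER_PAGE - rem);

	rem = low % CLOG_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, CLOG_XACTS_PER_PAGE - rem);

	return distance;
}
```

With these assumed sizes, commit_ts has the smallest page (819 XIDs), so it usually bounds the jump: from XID 1000 the sketch skips to 1638, the next multiple of 819. The epoch bits play no part in the decision, which is why only the low 32 bits are examined.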

v6-0002-Add-option-to-specify-timeout-seconds-to-Backgrou.patch
From b4dbb1cf1ba74e14a3c7383e9fa000593127dd97 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 30 Aug 2023 23:10:59 +0900
Subject: [PATCH v6 2/3] Add option to specify timeout seconds to
 BackgroundPsql.pm.

Author: Masahiko Sawada
Reviewed-by: Daniel Gustafsson, Noah Misch
Discussion: https://postgr.es/m/C9CF2F76-0D81-4C9D-9832-202BE8517056%40yesql.se
---
 src/test/perl/PostgreSQL/Test/BackgroundPsql.pm | 10 ++++++----
 src/test/perl/PostgreSQL/Test/Cluster.pm        | 11 ++++++++---
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm b/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
index 924b57ab21..58d393f5b8 100644
--- a/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
+++ b/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
@@ -68,7 +68,7 @@ use Test::More;
 
 =over
 
-=item PostgreSQL::Test::BackgroundPsql->new(interactive, @params)
+=item PostgreSQL::Test::BackgroundPsql->new(interactive, @psql_params, timeout)
 
 Builds a new object of class C<PostgreSQL::Test::BackgroundPsql> for either
 an interactive or background session and starts it. If C<interactive> is
@@ -81,7 +81,7 @@ string. For C<interactive> sessions, IO::Pty is required.
 sub new
 {
 	my $class = shift;
-	my ($interactive, $psql_params) = @_;
+	my ($interactive, $psql_params, $timeout) = @_;
 	my $psql = {
 		'stdin' => '',
 		'stdout' => '',
@@ -96,8 +96,10 @@ sub new
 	  "Forbidden caller of constructor: package: $package, file: $file:$line"
 	  unless $package->isa('PostgreSQL::Test::Cluster');
 
-	$psql->{timeout} =
-	  IPC::Run::timeout($PostgreSQL::Test::Utils::timeout_default);
+	$psql->{timeout} = IPC::Run::timeout(
+		defined($timeout)
+		? $timeout
+		: $PostgreSQL::Test::Utils::timeout_default);
 
 	if ($interactive)
 	{
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 4b7baa908f..f37907bcf7 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -2028,8 +2028,6 @@ sub psql
 
 Invoke B<psql> on B<$dbname> and return a BackgroundPsql object.
 
-A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up.
-
 psql is invoked in tuples-only unaligned mode with reading of B<.psqlrc>
 disabled.  That may be overridden by passing extra psql parameters.
 
@@ -2047,6 +2045,11 @@ By default, the B<psql> method invokes the B<psql> program with ON_ERROR_STOP=1
 set, so SQL execution is stopped at the first error and exit code 3 is
 returned.  Set B<on_error_stop> to 0 to ignore errors instead.
 
+=item timeout => 'interval'
+
+Set a timeout for a background psql session. By default, a timeout of
+$PostgreSQL::Test::Utils::timeout_default is set up.
+
 =item replication => B<value>
 
 If set, add B<replication=value> to the conninfo string.
@@ -2068,6 +2071,7 @@ sub background_psql
 	local %ENV = $self->_get_env();
 
 	my $replication = $params{replication};
+	my $timeout = undef;
 
 	my @psql_params = (
 		$self->installed_command('psql'),
@@ -2079,12 +2083,13 @@ sub background_psql
 		'-');
 
 	$params{on_error_stop} = 1 unless defined $params{on_error_stop};
+	$timeout = $params{timeout} if defined $params{timeout};
 
 	push @psql_params, '-v', 'ON_ERROR_STOP=1' if $params{on_error_stop};
 	push @psql_params, @{ $params{extra_params} }
 	  if defined $params{extra_params};
 
-	return PostgreSQL::Test::BackgroundPsql->new(0, \@psql_params);
+	return PostgreSQL::Test::BackgroundPsql->new(0, \@psql_params, $timeout);
 }
 
 =pod
-- 
2.31.1
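
Going back to 002_limits.pl in the 0003 patch: the warn-limit and stop-limit it waits for are derived from the oldest datfrozenxid. The following is a rough sketch of that arithmetic, not the server code itself. The margins of 40 million and 3 million XIDs are an assumption based on SetTransactionIdLimit() in varsup.c in recent releases and are not stated anywhere in this thread, so please check them against the branch you are testing. The struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Values as defined in PostgreSQL's access/transam.h. */
#define FIRST_NORMAL_XID 3u
#define MAX_XID 0xFFFFFFFFu

/* Assumed margins; verify against varsup.c for your branch. */
#define STOP_MARGIN 3000000u
#define WARN_MARGIN 40000000u

/* Hypothetical container for the three derived limits. */
typedef struct XidLimits
{
	uint32_t	wrap;	/* XID at which wraparound data loss would occur */
	uint32_t	stop;	/* XID at which new XID assignment is refused */
	uint32_t	warn;	/* XID at which WARNINGs start */
} XidLimits;

static XidLimits
compute_xid_limits(uint32_t oldest_datfrozenxid)
{
	XidLimits	lim;

	/* Half of the 32-bit XID space ahead of the oldest datfrozenxid. */
	lim.wrap = oldest_datfrozenxid + (MAX_XID >> 1);
	if (lim.wrap < FIRST_NORMAL_XID)
		lim.wrap += FIRST_NORMAL_XID;

	/* Unsigned arithmetic wraps modulo 2^32; skip the special XIDs. */
	lim.stop = lim.wrap - STOP_MARGIN;
	if (lim.stop < FIRST_NORMAL_XID)
		lim.stop -= FIRST_NORMAL_XID;

	lim.warn = lim.wrap - WARN_MARGIN;
	if (lim.warn < FIRST_NORMAL_XID)
		lim.warn -= FIRST_NORMAL_XID;

	return lim;
}
```

Under these assumptions, with the old transaction holding datfrozenxid near the start of the XID space, the wrap limit sits a little above 2^31. That matches the test's comment that about 150 million XIDs remain after consuming 2 billion, and it explains why a handful of 10-million batches is enough to reach the warning.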

#49Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#48)
Re: Testing autovacuum wraparound (including failsafe)

On 28 Nov 2023, at 03:00, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Nov 27, 2023 at 10:40 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Nov 2023, at 14:06, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Is it true that we can modify the timeout after creating
BackgroundPsql object? If so, it seems we don't need to introduce the
new timeout argument. But how?

I can't remember if that's leftovers that incorrectly remains from an earlier
version of the BackgroundPsql work, or if it's a very bad explanation of
->set_query_timer_restart(). The timeout will use the timeout_default value
and that cannot be overridden, it can only be reset per query.

Thank you for confirming this. I see there is the same problem also in
interactive_psql(). So I've attached the 0001 patch to fix these
documentation issues.

-A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up,
-which can be modified later.
+A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up.

Since it cannot be modified, I think we should just say "A timeout of .." and
not call it a default timeout. This obviously only matters for the backpatch since
the sentence is removed in 0002.

Which could be backpatched.

+1

With your patch the timeout still cannot be changed, but at least set during
start which seems good enough until there are tests warranting more complexity.
The docs should be corrected to reflect this in your patch.

I've incorporated the comments except for the following one and
attached updated version of the rest patches:

LGTM.

--
Daniel Gustafsson

#50Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Daniel Gustafsson (#49)
3 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

On Tue, Nov 28, 2023 at 7:16 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 28 Nov 2023, at 03:00, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Nov 27, 2023 at 10:40 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Nov 2023, at 14:06, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Is it true that we can modify the timeout after creating
BackgroundPsql object? If so, it seems we don't need to introduce the
new timeout argument. But how?

I can't remember if that's leftovers that incorrectly remains from an earlier
version of the BackgroundPsql work, or if it's a very bad explanation of
->set_query_timer_restart(). The timeout will use the timeout_default value
and that cannot be overridden, it can only be reset per query.

Thank you for confirming this. I see there is the same problem also in
interactive_psql(). So I've attached the 0001 patch to fix these
documentation issues.

-A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up,
-which can be modified later.
+A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up.

Since it cannot be modified, I think we should just say "A timeout of .." and
not call it a default timeout. This obviously only matters for the backpatch since
the sentence is removed in 0002.

Agreed.

I've attached new version patches (0002 and 0003 are unchanged except
for the commit message). I'll push them, barring any objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v7-0001-Fix-wrong-description-of-BackgroundPsql-s-timeout.patch
From 00973920b19d741e4733793880c68f6d2c08351b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 28 Nov 2023 10:34:48 +0900
Subject: [PATCH v7 1/3] Fix wrong description of BackgroundPsql's timeout.

Backpatch through 16 where this was introduced.

Reviewed-by: Daniel Gustafsson
Backpatch-through: 16
Discussion: http://postgr.es/m/CAD21AoBXMEqDBLoDuAWVWoTLYB4aNsxx4oYNmyJJbhfq_vGQBQ@mail.gmail.com
---
 src/test/perl/PostgreSQL/Test/Cluster.pm | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index c3d46c7c70..321b77d7ed 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -2028,8 +2028,7 @@ sub psql
 
 Invoke B<psql> on B<$dbname> and return a BackgroundPsql object.
 
-A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up,
-which can be modified later.
+A timeout of $PostgreSQL::Test::Utils::timeout_default is set up.
 
 psql is invoked in tuples-only unaligned mode with reading of B<.psqlrc>
 disabled.  That may be overridden by passing extra psql parameters.
@@ -2095,8 +2094,7 @@ sub background_psql
 Invoke B<psql> on B<$dbname> and return a BackgroundPsql object, which the
 caller may use to send interactive input to B<psql>.
 
-A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up,
-which can be modified later.
+A timeout of $PostgreSQL::Test::Utils::timeout_default is set up.
 
 psql is invoked in tuples-only unaligned mode with reading of B<.psqlrc>
 disabled.  That may be overridden by passing extra psql parameters.
-- 
2.31.1

v7-0003-Add-tests-for-XID-wraparound.patch
From 8dbc82d5e867d085fac11fbef112ef1fc3401b54 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 3 Mar 2023 12:01:28 +0200
Subject: [PATCH v7 3/3] Add tests for XID wraparound.

The test module includes helper functions to quickly burn through lots
of XIDs. They are used in the tests, and are also handy for manually
testing XID wraparound.

Since these tests are very expensive, the entire suite is disabled by
default. PG_TEST_EXTRA must include xid_wraparound to run it.

Reviewed-by: Daniel Gustafsson, John Naylor, Michael Paquier
Reviewed-by: vignesh C
Author: Heikki Linnakangas, Masahiko Sawada, Andres Freund
Discussion: https://www.postgresql.org/message-id/CAD21AoDVhkXp8HjpFO-gp3TgL6tCKcZQNxn04m01VAtcSi-5sA%40mail.gmail.com
---
 doc/src/sgml/regress.sgml                     |  10 +
 src/test/modules/Makefile                     |   3 +-
 src/test/modules/meson.build                  |   1 +
 src/test/modules/xid_wraparound/.gitignore    |   4 +
 src/test/modules/xid_wraparound/Makefile      |  23 ++
 src/test/modules/xid_wraparound/README        |   3 +
 src/test/modules/xid_wraparound/meson.build   |  36 +++
 .../xid_wraparound/t/001_emergency_vacuum.pl  | 133 +++++++++++
 .../modules/xid_wraparound/t/002_limits.pl    | 138 +++++++++++
 .../xid_wraparound/t/003_wraparounds.pl       |  60 +++++
 .../xid_wraparound/xid_wraparound--1.0.sql    |  12 +
 .../modules/xid_wraparound/xid_wraparound.c   | 219 ++++++++++++++++++
 .../xid_wraparound/xid_wraparound.control     |   4 +
 13 files changed, 645 insertions(+), 1 deletion(-)
 create mode 100644 src/test/modules/xid_wraparound/.gitignore
 create mode 100644 src/test/modules/xid_wraparound/Makefile
 create mode 100644 src/test/modules/xid_wraparound/README
 create mode 100644 src/test/modules/xid_wraparound/meson.build
 create mode 100644 src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
 create mode 100644 src/test/modules/xid_wraparound/t/002_limits.pl
 create mode 100644 src/test/modules/xid_wraparound/t/003_wraparounds.pl
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.c
 create mode 100644 src/test/modules/xid_wraparound/xid_wraparound.control

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 69f627d7f4..70d9bdefe1 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -314,6 +314,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance'
       </para>
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><literal>xid_wraparound</literal></term>
+     <listitem>
+      <para>
+       Runs the test suite under <filename>src/test/modules/xid_wraparound</filename>.
+       Not enabled by default because it is resource intensive.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
 
    Tests for features that are not supported by the current build
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index a18e4d28a0..5d33fa6a9a 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -34,7 +34,8 @@ SUBDIRS = \
 		  test_shm_mq \
 		  test_slru \
 		  unsafe_tests \
-		  worker_spi
+		  worker_spi \
+		  xid_wraparound
 
 ifeq ($(with_ssl),openssl)
 SUBDIRS += ssl_passphrase_callback
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 4e83c0f8d7..b76f588559 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -32,3 +32,4 @@ subdir('test_shm_mq')
 subdir('test_slru')
 subdir('unsafe_tests')
 subdir('worker_spi')
+subdir('xid_wraparound')
diff --git a/src/test/modules/xid_wraparound/.gitignore b/src/test/modules/xid_wraparound/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/src/test/modules/xid_wraparound/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/xid_wraparound/Makefile b/src/test/modules/xid_wraparound/Makefile
new file mode 100644
index 0000000000..7a6e0f6676
--- /dev/null
+++ b/src/test/modules/xid_wraparound/Makefile
@@ -0,0 +1,23 @@
+# src/test/modules/xid_wraparound/Makefile
+
+MODULE_big = xid_wraparound
+OBJS = \
+	$(WIN32RES) \
+	xid_wraparound.o
+PGFILEDESC = "xid_wraparound - tests for XID wraparound"
+
+EXTENSION = xid_wraparound
+DATA = xid_wraparound--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/xid_wraparound
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/xid_wraparound/README b/src/test/modules/xid_wraparound/README
new file mode 100644
index 0000000000..3aab464dec
--- /dev/null
+++ b/src/test/modules/xid_wraparound/README
@@ -0,0 +1,3 @@
+This module contains tests for XID wraparound. The tests use two
+helper functions to quickly consume lots of XIDs, to reach XID
+wraparound faster.
diff --git a/src/test/modules/xid_wraparound/meson.build b/src/test/modules/xid_wraparound/meson.build
new file mode 100644
index 0000000000..42f933525e
--- /dev/null
+++ b/src/test/modules/xid_wraparound/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+xid_wraparound_sources = files(
+  'xid_wraparound.c',
+)
+
+if host_system == 'windows'
+  xid_wraparound_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'xid_wraparound',
+    '--FILEDESC', 'xid_wraparound - tests for XID wraparound',])
+endif
+
+xid_wraparound = shared_module('xid_wraparound',
+  xid_wraparound_sources,
+  kwargs: pg_mod_args,
+)
+testprep_targets += xid_wraparound
+
+install_data(
+  'xid_wraparound.control',
+  'xid_wraparound--1.0.sql',
+  kwargs: contrib_data_args,
+)
+
+tests += {
+  'name': 'xid_wraparound',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_emergency_vacuum.pl',
+      't/002_limits.pl',
+      't/003_wraparounds.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
new file mode 100644
index 0000000000..2ae1667d58
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -0,0 +1,133 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test wraparound emergency autovacuum.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+	plan skip_all => "test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('main');
+
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create tables for a few different test scenarios
+$node->safe_psql(
+	'postgres', qq[
+CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large(data) SELECT generate_series(1,30000);
+
+CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO large_trunc(data) SELECT generate_series(1,30000);
+
+CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small(data) SELECT generate_series(1,15000);
+
+CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+INSERT INTO small_trunc(data) SELECT generate_series(1,15000);
+
+CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
+INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
+]);
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $background_psql = $node->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+$background_psql->set_query_timer_restart();
+$background_psql->query_safe(
+	q{
+	BEGIN;
+	DELETE FROM large WHERE id % 2 = 0;
+	DELETE FROM large_trunc WHERE id > 10000;
+	DELETE FROM small WHERE id % 2 = 0;
+	DELETE FROM small_trunc WHERE id > 1000;
+	DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
+});
+
+# Consume 2 billion XIDs, to get us very close to wraparound
+$node->safe_psql('postgres',
+	qq[SELECT consume_xids_until('2000000000'::xid8)]);
+
+# Make sure the latest completed XID is advanced
+$node->safe_psql('postgres', qq[INSERT INTO small(data) SELECT 1]);
+
+# Check that all databases became old enough to trigger failsafe.
+my $ret = $node->safe_psql(
+	'postgres',
+	qq[
+SELECT datname,
+       age(datfrozenxid) > current_setting('vacuum_failsafe_age')::int as old
+FROM pg_database ORDER BY 1
+]);
+is( $ret, "postgres|t
+template0|t
+template1|t", "all databases became old");
+
+my $log_offset = -s $node->logfile;
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$background_psql->query_safe(q{COMMIT;});
+$background_psql->quit;
+
+# Wait until autovacuum processed all tables and advanced the
+# system-wide oldest-XID.
+$node->poll_query_until(
+	'postgres',
+	qq[
+SELECT NOT EXISTS (
+	SELECT *
+	FROM pg_database
+	WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
+]) or die "timeout waiting for all databases to be vacuumed";
+
+# Check if these tables are vacuumed.
+$ret = $node->safe_psql(
+	'postgres', qq[
+SELECT relname, age(relfrozenxid) > current_setting('autovacuum_freeze_max_age')::int
+FROM pg_class
+WHERE relname IN ('large', 'large_trunc', 'small', 'small_trunc', 'autovacuum_disabled')
+ORDER BY 1
+]);
+
+is( $ret, "autovacuum_disabled|f
+large|f
+large_trunc|f
+small|f
+small_trunc|f", "all tables are vacuumed");
+
+# Check if vacuum failsafe was triggered for each table.
+my $log_contents = slurp_file($node->logfile, $log_offset);
+foreach my $tablename ('large', 'large_trunc', 'small', 'small_trunc',
+	'autovacuum_disabled')
+{
+	like(
+		$log_contents,
+		qr/bypassing nonessential maintenance of table "postgres.public.$tablename" as a failsafe after \d+ index scans/,
+		"failsafe vacuum triggered for $tablename");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/002_limits.pl b/src/test/modules/xid_wraparound/t/002_limits.pl
new file mode 100644
index 0000000000..b35352f25b
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/002_limits.pl
@@ -0,0 +1,138 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Test XID wraparound limits.
+#
+# When you get close to XID wraparound, you start to get warnings, and
+# when you get even closer, the system refuses to assign any more XIDs
+# until the oldest databases have been vacuumed and datfrozenxid has
+# been advanced.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+	plan skip_all => "test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+my $ret;
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql(
+	'postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('start');
+]);
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Start a background session, which holds a transaction open, preventing
+# autovacuum from advancing relfrozenxid and datfrozenxid.
+my $background_psql = $node->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+$background_psql->query_safe(
+	q{
+	BEGIN;
+	INSERT INTO wraparoundtest VALUES ('oldxact');
+});
+
+# Consume 2 billion transactions, to get close to wraparound
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('after 1 billion')]);
+
+$node->safe_psql('postgres', qq[SELECT consume_xids(1000000000)]);
+$node->safe_psql('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('after 2 billion')]);
+
+# We are now just under 150 million XIDs away from wraparound.
+# Continue consuming XIDs, in batches of 10 million, until we get
+# the warning:
+#
+#  WARNING:  database "postgres" must be vacuumed within 3000024 transactions
+#  HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
+#  You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
+my $stderr;
+my $warn_limit = 0;
+for my $i (1 .. 15)
+{
+	$node->psql(
+		'postgres', qq[SELECT consume_xids(10000000)],
+		stderr => \$stderr,
+		on_error_die => 1);
+
+	if ($stderr =~
+		/WARNING:  database "postgres" must be vacuumed within [0-9]+ transactions/
+	  )
+	{
+		# Reached the warn-limit
+		$warn_limit = 1;
+		last;
+	}
+}
+ok($warn_limit == 1, "warn-limit reached");
+
+# We can still INSERT, despite the warnings.
+$node->safe_psql('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('reached warn-limit')]);
+
+# Keep going. We'll hit the hard "stop" limit.
+$ret = $node->psql(
+	'postgres',
+	qq[SELECT consume_xids(100000000)],
+	stderr => \$stderr);
+like(
+	$stderr,
+	qr/ERROR:  database is not accepting commands that assign new XIDs to avoid wraparound data loss in database "postgres"/,
+	"stop-limit");
+
+# Finish the old transaction, to allow vacuum freezing to advance
+# relfrozenxid and datfrozenxid again.
+$background_psql->query_safe(q{COMMIT;});
+$background_psql->quit;
+
+# VACUUM, to freeze the tables and advance datfrozenxid.
+#
+# Autovacuum does this for the other databases, and would do it for
+# 'postgres' too, but let's test manual VACUUM.
+#
+$node->safe_psql('postgres', 'VACUUM');
+
+# Wait until autovacuum has processed the other databases and advanced
+# the system-wide oldest-XID.
+$ret =
+  $node->poll_query_until('postgres',
+	qq[INSERT INTO wraparoundtest VALUES ('after VACUUM')],
+	'INSERT 0 1');
+
+# Check the table contents
+$ret = $node->safe_psql('postgres', qq[SELECT * from wraparoundtest]);
+is( $ret, "start
+oldxact
+after 1 billion
+after 2 billion
+reached warn-limit
+after VACUUM");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
new file mode 100644
index 0000000000..be71b00a17
--- /dev/null
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -0,0 +1,60 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+#
+# Consume a lot of XIDs, wrapping around a few times.
+#
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bxid_wraparound\b/)
+{
+	plan skip_all => "test xid_wraparound not enabled in PG_TEST_EXTRA";
+}
+
+# Initialize node
+my $node = PostgreSQL::Test::Cluster->new('wraparound');
+
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq[
+autovacuum = off # run autovacuum only to prevent wraparound
+autovacuum_naptime = 1s
+# so it's easier to verify the order of operations
+autovacuum_max_workers = 1
+log_autovacuum_min_duration = 0
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
+
+# Create a test table
+$node->safe_psql(
+	'postgres', qq[
+CREATE TABLE wraparoundtest(t text);
+INSERT INTO wraparoundtest VALUES ('beginning');
+]);
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Burn through 10 billion transactions in total, in batches of 100 million.
+my $ret;
+for my $i (1 .. 100)
+{
+	$ret = $node->safe_psql(
+		'postgres',
+		qq[SELECT consume_xids(100000000)],
+		timeout => $psql_timeout_secs);
+	$ret = $node->safe_psql('postgres',
+		qq[INSERT INTO wraparoundtest VALUES ('after $i batches')]);
+}
+
+$ret = $node->safe_psql('postgres', qq[SELECT COUNT(*) FROM wraparoundtest]);
+is($ret, "101");
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
new file mode 100644
index 0000000000..51d25fc4c6
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound--1.0.sql
@@ -0,0 +1,12 @@
+/* src/test/modules/xid_wraparound/xid_wraparound--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION xid_wraparound" to load this file. \quit
+
+CREATE FUNCTION consume_xids(nxids bigint)
+RETURNS xid8 VOLATILE PARALLEL UNSAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION consume_xids_until(targetxid xid8)
+RETURNS xid8 VOLATILE PARALLEL UNSAFE STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.c b/src/test/modules/xid_wraparound/xid_wraparound.c
new file mode 100644
index 0000000000..312eebbbc8
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.c
@@ -0,0 +1,219 @@
+/*--------------------------------------------------------------------------
+ *
+ * xid_wraparound.c
+ *		Utilities for testing XID wraparound
+ *
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *		src/test/modules/xid_wraparound/xid_wraparound.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/xact.h"
+#include "miscadmin.h"
+#include "storage/proc.h"
+#include "utils/xid8.h"
+
+PG_MODULE_MAGIC;
+
+static int64 consume_xids_shortcut(void);
+static FullTransactionId consume_xids_common(FullTransactionId untilxid, uint64 nxids);
+
+/*
+ * Consume the specified number of XIDs.
+ */
+PG_FUNCTION_INFO_V1(consume_xids);
+Datum
+consume_xids(PG_FUNCTION_ARGS)
+{
+	int64		nxids = PG_GETARG_INT64(0);
+	FullTransactionId lastxid;
+
+	if (nxids < 0)
+		elog(ERROR, "invalid nxids argument: %lld", (long long) nxids);
+
+	if (nxids == 0)
+		lastxid = ReadNextFullTransactionId();
+	else
+		lastxid = consume_xids_common(InvalidFullTransactionId, (uint64) nxids);
+
+	PG_RETURN_FULLTRANSACTIONID(lastxid);
+}
+
+/*
+ * Consume XIDs, up to the given XID.
+ */
+PG_FUNCTION_INFO_V1(consume_xids_until);
+Datum
+consume_xids_until(PG_FUNCTION_ARGS)
+{
+	FullTransactionId targetxid = PG_GETARG_FULLTRANSACTIONID(0);
+	FullTransactionId lastxid;
+
+	if (!FullTransactionIdIsNormal(targetxid))
+		elog(ERROR, "targetxid %llu is not normal",
+			 (unsigned long long) U64FromFullTransactionId(targetxid));
+
+	lastxid = consume_xids_common(targetxid, 0);
+
+	PG_RETURN_FULLTRANSACTIONID(lastxid);
+}
+
+/*
+ * Common functionality between the two public functions.
+ */
+static FullTransactionId
+consume_xids_common(FullTransactionId untilxid, uint64 nxids)
+{
+	FullTransactionId lastxid;
+	uint64		last_reported_at = 0;
+	uint64		consumed = 0;
+
+	/* Print a NOTICE every REPORT_INTERVAL xids */
+#define REPORT_INTERVAL (10 * 1000000)
+
+	/* initialize 'lastxid' with the system's current next XID */
+	lastxid = ReadNextFullTransactionId();
+
+	/*
+	 * We consume XIDs by calling GetNewTransactionId(true), which marks the
+	 * consumed XIDs as subtransactions of the current top-level transaction.
+	 * For that to work, this transaction must have a top-level XID.
+	 *
+	 * GetNewTransactionId registers them in the subxid cache in PGPROC, until
+	 * the cache overflows, but beyond that, we don't keep track of the
+	 * consumed XIDs.
+	 */
+	(void) GetTopTransactionId();
+
+	for (;;)
+	{
+		uint64		xids_left;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* How many XIDs do we have left to consume? */
+		if (nxids > 0)
+		{
+			if (consumed >= nxids)
+				break;
+			xids_left = nxids - consumed;
+		}
+		else
+		{
+			if (FullTransactionIdFollowsOrEquals(lastxid, untilxid))
+				break;
+			xids_left = U64FromFullTransactionId(untilxid) - U64FromFullTransactionId(lastxid);
+		}
+
+		/*
+		 * If we still have plenty of XIDs to consume, try to take a shortcut
+		 * and bump up the nextXid counter directly.
+		 */
+		if (xids_left > 2000 &&
+			consumed - last_reported_at < REPORT_INTERVAL &&
+			MyProc->subxidStatus.overflowed)
+		{
+			int64		consumed_by_shortcut = consume_xids_shortcut();
+
+			if (consumed_by_shortcut > 0)
+			{
+				consumed += consumed_by_shortcut;
+				continue;
+			}
+		}
+
+		/* Slow path: Call GetNewTransactionId to allocate a new XID. */
+		lastxid = GetNewTransactionId(true);
+		consumed++;
+
+		/* Report progress */
+		if (consumed - last_reported_at >= REPORT_INTERVAL)
+		{
+			if (nxids > 0)
+				elog(NOTICE, "consumed %llu / %llu XIDs, latest %u:%u",
+					 (unsigned long long) consumed, (unsigned long long) nxids,
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid));
+			else
+				elog(NOTICE, "consumed up to %u:%u / %u:%u",
+					 EpochFromFullTransactionId(lastxid),
+					 XidFromFullTransactionId(lastxid),
+					 EpochFromFullTransactionId(untilxid),
+					 XidFromFullTransactionId(untilxid));
+			last_reported_at = consumed;
+		}
+	}
+
+	return lastxid;
+}
+
+/*
+ * These constants are copied from .c files, because they're private.
+ */
+#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10)
+#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / sizeof(TransactionId))
+#define CLOG_XACTS_PER_BYTE 4
+#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)
+
+/*
+ * All the interesting action in GetNewTransactionId happens when we extend
+ * the SLRUs, or at the uint32 wraparound. If the nextXid counter is not close
+ * to any of those interesting values, take a shortcut and bump nextXID
+ * directly, close to the next "interesting" value.
+ */
+static inline uint32
+XidSkip(FullTransactionId fullxid)
+{
+	uint32		low = XidFromFullTransactionId(fullxid);
+	uint32		rem;
+	uint32		distance;
+
+	if (low < 5 || low >= UINT32_MAX - 5)
+		return 0;
+	distance = UINT32_MAX - 5 - low;
+
+	rem = low % COMMIT_TS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, COMMIT_TS_XACTS_PER_PAGE - rem);
+
+	rem = low % SUBTRANS_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, SUBTRANS_XACTS_PER_PAGE - rem);
+
+	rem = low % CLOG_XACTS_PER_PAGE;
+	if (rem == 0)
+		return 0;
+	distance = Min(distance, CLOG_XACTS_PER_PAGE - rem);
+
+	return distance;
+}
+
+static int64
+consume_xids_shortcut(void)
+{
+	FullTransactionId nextXid;
+	uint32		consumed;
+
+	LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
+	nextXid = ShmemVariableCache->nextXid;
+
+	/*
+	 * Go slow near the "interesting values". The interesting zones include 5
+	 * transactions before and after SLRU page switches.
+	 */
+	consumed = XidSkip(nextXid);
+	if (consumed > 0)
+		ShmemVariableCache->nextXid.value += (uint64) consumed;
+
+	LWLockRelease(XidGenLock);
+
+	return consumed;
+}
diff --git a/src/test/modules/xid_wraparound/xid_wraparound.control b/src/test/modules/xid_wraparound/xid_wraparound.control
new file mode 100644
index 0000000000..6c6964ed3d
--- /dev/null
+++ b/src/test/modules/xid_wraparound/xid_wraparound.control
@@ -0,0 +1,4 @@
+comment = 'Tests for XID wraparound'
+default_version = '1.0'
+module_pathname = '$libdir/xid_wraparound'
+relocatable = true
-- 
2.31.1
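
For readers who want to try the shortcut arithmetic from xid_wraparound.c outside the server, here is a minimal standalone sketch of the XidSkip() logic. It assumes the default BLCKSZ of 8192 and works only on the low 32 bits of the XID; the names xid_skip(), min_u32() and xid_skip_examples() are made up for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default block size; the per-page constants follow the patch. */
#define BLCKSZ 8192u
#define COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10u)	/* 819 */
#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / 4u)	/* 2048 */
#define CLOG_XACTS_PER_PAGE (BLCKSZ * 4u)		/* 32768 */

/* Hypothetical helper, standing in for the Min() macro from c.h */
static uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Return how many XIDs can be skipped from 'low' without crossing the next
 * commit_ts, subtrans or clog page boundary, or 0 if 'low' sits on such a
 * boundary or within 5 XIDs of the uint32 wraparound point.
 */
uint32_t
xid_skip(uint32_t low)
{
	uint32_t	rem;
	uint32_t	distance;

	if (low < 5 || low >= UINT32_MAX - 5)
		return 0;
	distance = UINT32_MAX - 5 - low;

	rem = low % COMMIT_TS_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, COMMIT_TS_XACTS_PER_PAGE - rem);

	rem = low % SUBTRANS_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, SUBTRANS_XACTS_PER_PAGE - rem);

	rem = low % CLOG_XACTS_PER_PAGE;
	if (rem == 0)
		return 0;
	distance = min_u32(distance, CLOG_XACTS_PER_PAGE - rem);

	return distance;
}

/* A few hand-checked cases for the 8192-byte block size assumed above. */
void
xid_skip_examples(void)
{
	assert(xid_skip(3) == 0);		/* too close to the start of the epoch */
	assert(xid_skip(1000) == 638);	/* jumps to 1638 = 2 * 819 */
	assert(xid_skip(1638) == 0);	/* on a commit_ts page boundary */
	assert(xid_skip(1639) == 409);	/* jumps to 2048, a subtrans boundary */
	assert(xid_skip(2048) == 0);	/* on a subtrans page boundary */
}
```

With these constants the skip from 1000 lands exactly on the next commit_ts page boundary (1638), where the function returns 0, so the caller would fall back to allocating that XID through the slow path.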

v7-0002-Add-option-to-specify-timeout-seconds-to-Backgrou.patch (application/octet-stream)
From 508cc515b6ac04487544d005b77d20b709fa11b2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 30 Aug 2023 23:10:59 +0900
Subject: [PATCH v7 2/3] Add option to specify timeout seconds to
 BackgroundPsql.pm.

Previously, a background psql session always used the default timeout,
which could not be overridden. This change adds a new option to set the
timeout when the session is started.

Reviewed-by: Daniel Gustafsson, Noah Misch
Discussion: https://postgr.es/m/C9CF2F76-0D81-4C9D-9832-202BE8517056%40yesql.se
---
 src/test/perl/PostgreSQL/Test/BackgroundPsql.pm | 10 ++++++----
 src/test/perl/PostgreSQL/Test/Cluster.pm        | 11 ++++++++---
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm b/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
index 924b57ab21..58d393f5b8 100644
--- a/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
+++ b/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
@@ -68,7 +68,7 @@ use Test::More;
 
 =over
 
-=item PostgreSQL::Test::BackgroundPsql->new(interactive, @params)
+=item PostgreSQL::Test::BackgroundPsql->new(interactive, @psql_params, timeout)
 
 Builds a new object of class C<PostgreSQL::Test::BackgroundPsql> for either
 an interactive or background session and starts it. If C<interactive> is
@@ -81,7 +81,7 @@ string. For C<interactive> sessions, IO::Pty is required.
 sub new
 {
 	my $class = shift;
-	my ($interactive, $psql_params) = @_;
+	my ($interactive, $psql_params, $timeout) = @_;
 	my $psql = {
 		'stdin' => '',
 		'stdout' => '',
@@ -96,8 +96,10 @@ sub new
 	  "Forbidden caller of constructor: package: $package, file: $file:$line"
 	  unless $package->isa('PostgreSQL::Test::Cluster');
 
-	$psql->{timeout} =
-	  IPC::Run::timeout($PostgreSQL::Test::Utils::timeout_default);
+	$psql->{timeout} = IPC::Run::timeout(
+		defined($timeout)
+		? $timeout
+		: $PostgreSQL::Test::Utils::timeout_default);
 
 	if ($interactive)
 	{
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 321b77d7ed..a020377761 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -2028,8 +2028,6 @@ sub psql
 
 Invoke B<psql> on B<$dbname> and return a BackgroundPsql object.
 
-A timeout of $PostgreSQL::Test::Utils::timeout_default is set up.
-
 psql is invoked in tuples-only unaligned mode with reading of B<.psqlrc>
 disabled.  That may be overridden by passing extra psql parameters.
 
@@ -2047,6 +2045,11 @@ By default, the B<psql> method invokes the B<psql> program with ON_ERROR_STOP=1
 set, so SQL execution is stopped at the first error and exit code 3 is
 returned.  Set B<on_error_stop> to 0 to ignore errors instead.
 
+=item timeout => 'interval'
+
+Set a timeout for a background psql session. By default, a timeout of
+$PostgreSQL::Test::Utils::timeout_default is set up.
+
 =item replication => B<value>
 
 If set, add B<replication=value> to the conninfo string.
@@ -2068,6 +2071,7 @@ sub background_psql
 	local %ENV = $self->_get_env();
 
 	my $replication = $params{replication};
+	my $timeout = undef;
 
 	my @psql_params = (
 		$self->installed_command('psql'),
@@ -2079,12 +2083,13 @@ sub background_psql
 		'-');
 
 	$params{on_error_stop} = 1 unless defined $params{on_error_stop};
+	$timeout = $params{timeout} if defined $params{timeout};
 
 	push @psql_params, '-v', 'ON_ERROR_STOP=1' if $params{on_error_stop};
 	push @psql_params, @{ $params{extra_params} }
 	  if defined $params{extra_params};
 
-	return PostgreSQL::Test::BackgroundPsql->new(0, \@psql_params);
+	return PostgreSQL::Test::BackgroundPsql->new(0, \@psql_params, $timeout);
 }
 
 =pod
-- 
2.31.1

#51Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#50)
Re: Testing autovacuum wraparound (including failsafe)

On Wed, Nov 29, 2023 at 5:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Nov 28, 2023 at 7:16 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 28 Nov 2023, at 03:00, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Nov 27, 2023 at 10:40 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Nov 2023, at 14:06, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Is it true that we can modify the timeout after creating
BackgroundPsql object? If so, it seems we don't need to introduce the
new timeout argument. But how?

I can't remember if those are leftovers that incorrectly remain from an earlier
version of the BackgroundPsql work, or if it's a very bad explanation of
->set_query_timer_restart(). The timeout will use the timeout_default value
and that cannot be overridden, it can only be reset per query.

Thank you for confirming this. I see there is the same problem also in
interactive_psql(). So I've attached the 0001 patch to fix these
documentation issues.

-A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up,
-which can be modified later.
+A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up.

Since it cannot be modified, I think we should just say "A timeout of .." and
call it a default timeout. This obviously only matters for the backpatch since
the sentence is removed in 0002.

Agreed.

I've attached new version patches (0002 and 0003 are unchanged except
for the commit message). I'll push them, barring any objections.

Pushed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#52Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#51)
Re: Testing autovacuum wraparound (including failsafe)

On Thu, Nov 30, 2023 at 4:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Nov 29, 2023 at 5:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Nov 28, 2023 at 7:16 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 28 Nov 2023, at 03:00, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Nov 27, 2023 at 10:40 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Nov 2023, at 14:06, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Is it true that we can modify the timeout after creating
BackgroundPsql object? If so, it seems we don't need to introduce the
new timeout argument. But how?

I can't remember if those are leftovers that incorrectly remain from an earlier
version of the BackgroundPsql work, or if it's a very bad explanation of
->set_query_timer_restart(). The timeout will use the timeout_default value
and that cannot be overridden, it can only be reset per query.

Thank you for confirming this. I see there is the same problem also in
interactive_psql(). So I've attached the 0001 patch to fix these
documentation issues.

-A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up,
-which can be modified later.
+A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up.

Since it cannot be modified, I think we should just say "A timeout of .." and
call it a default timeout. This obviously only matters for the backpatch since
the sentence is removed in 0002.

Agreed.

I've attached new version patches (0002 and 0003 are unchanged except
for the commit message). I'll push them, barring any objections.

Pushed.

FYI I've configured the buildfarm animal perentie to run regression
tests including xid_wraparound:

https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=perentie&br=HEAD

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#53Peter Eisentraut
peter@eisentraut.org
In reply to: Masahiko Sawada (#51)
Re: Testing autovacuum wraparound (including failsafe)

The way src/test/modules/xid_wraparound/meson.build is written, it
installs the xid_wraparound.so module into production installations.
For test modules, a different installation code needs to be used. See
neighboring test modules such as
src/test/modules/test_rbtree/meson.build for examples.

#54Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Eisentraut (#53)
1 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

On Thu, Feb 8, 2024 at 3:11 AM Peter Eisentraut <peter@eisentraut.org> wrote:

The way src/test/modules/xid_wraparound/meson.build is written, it
installs the xid_wraparound.so module into production installations.
For test modules, a different installation code needs to be used. See
neighboring test modules such as
src/test/modules/test_rbtree/meson.build for examples.

Good catch, thanks.

I've attached the patch to fix it. Does it make sense?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

0001-Prevent-installation-of-xid_wraparound-test-during-m.patch (application/octet-stream)
From 188fa6e4e60106f655e45b4370c976a92ef5f3b6 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 8 Feb 2024 13:01:36 +0900
Subject: [PATCH] Prevent installation of xid_wraparound test during main
 install.

Oversight in e255b646a.

Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/84cd416a-0e37-4019-8380-1c8a3cdd8c5c%40eisentraut.org
---
 src/test/modules/xid_wraparound/meson.build | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/test/modules/xid_wraparound/meson.build b/src/test/modules/xid_wraparound/meson.build
index 602172754a..2e5248131b 100644
--- a/src/test/modules/xid_wraparound/meson.build
+++ b/src/test/modules/xid_wraparound/meson.build
@@ -12,14 +12,13 @@ endif
 
 xid_wraparound = shared_module('xid_wraparound',
   xid_wraparound_sources,
-  kwargs: pg_mod_args,
+  kwargs: pg_test_mod_args,
 )
-testprep_targets += xid_wraparound
+test_install_libs += xid_wraparound
 
-install_data(
+test_install_data += files(
   'xid_wraparound.control',
   'xid_wraparound--1.0.sql',
-  kwargs: contrib_data_args,
 )
 
 tests += {
-- 
2.39.3

#55Peter Eisentraut
peter@eisentraut.org
In reply to: Masahiko Sawada (#54)
Re: Testing autovacuum wraparound (including failsafe)

On 08.02.24 05:05, Masahiko Sawada wrote:

On Thu, Feb 8, 2024 at 3:11 AM Peter Eisentraut <peter@eisentraut.org> wrote:

The way src/test/modules/xid_wraparound/meson.build is written, it
installs the xid_wraparound.so module into production installations.
For test modules, a different installation code needs to be used. See
neighboring test modules such as
src/test/modules/test_rbtree/meson.build for examples.

Good catch, thanks.

I've attached the patch to fix it. Does it make sense?

Yes, that looks correct to me and produces the expected behavior.

#56Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Eisentraut (#55)
Re: Testing autovacuum wraparound (including failsafe)

On Thu, Feb 8, 2024 at 4:06 PM Peter Eisentraut <peter@eisentraut.org> wrote:

On 08.02.24 05:05, Masahiko Sawada wrote:

On Thu, Feb 8, 2024 at 3:11 AM Peter Eisentraut <peter@eisentraut.org> wrote:

The way src/test/modules/xid_wraparound/meson.build is written, it
installs the xid_wraparound.so module into production installations.
For test modules, a different installation code needs to be used. See
neighboring test modules such as
src/test/modules/test_rbtree/meson.build for examples.

Good catch, thanks.

I've attached the patch to fix it. Does it make sense?

Yes, that looks correct to me and produces the expected behavior.

Thank you for the check. Pushed at 1aa67a5ea687.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#57Alexander Lakhin
exclusion@gmail.com
In reply to: Masahiko Sawada (#51)
Re: Testing autovacuum wraparound (including failsafe)

Hello,

30.11.2023 10:35, Masahiko Sawada wrote:

I've attached new version patches (0002 and 0003 are unchanged except
for the commit message). I'll push them, barring any objections.

Pushed.

I've discovered that the test 001_emergency_vacuum.pl can fail due to a
race condition. I can't see the server log at [1], but I reproduced the
failure locally, and with additional logging and log_min_messages = DEBUG3
the log shows:

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dodo&dt=2024-05-19%2006%3A33%3A34

...
2024-05-22 11:46:28.125 UTC [21256:2853] DEBUG:  SlruScanDirectory invoking callback on pg_xact/0690
2024-05-22 11:46:28.125 UTC [21256:2854] DEBUG:  transaction ID wrap limit is 2147484396, limited by database with OID 5
2024-05-22 11:46:28.126 UTC [21256:2855] LOG: !!!SendPostmasterSignal| PMSIGNAL_START_AUTOVAC_LAUNCHER
2024-05-22 11:46:28.135 UTC [14871:20077] DEBUG:  postmaster received pmsignal signal
2024-05-22 11:46:28.137 UTC [21257:1] DEBUG:  autovacuum launcher started
2024-05-22 11:46:28.137 UTC [21257:2] DEBUG:  InitPostgres
2024-05-22 11:46:28.138 UTC [21257:3] LOG:  !!!AutoVacLauncherMain| !AutoVacuumingActive() && !ShutdownRequestPending;
before do_start_worker()
2024-05-22 11:46:28.138 UTC [21257:4] LOG:  !!!do_start_worker| return quickly when there are no free workers
2024-05-22 11:46:28.138 UTC [21257:5] DEBUG:  shmem_exit(0): 4 before_shmem_exit callbacks to make
2024-05-22 11:46:28.138 UTC [21257:6] DEBUG:  shmem_exit(0): 6 on_shmem_exit callbacks to make
2024-05-22 11:46:28.138 UTC [21257:7] DEBUG:  proc_exit(0): 1 callbacks to make
2024-05-22 11:46:28.138 UTC [21257:8] DEBUG:  exit(0)
2024-05-22 11:46:28.138 UTC [21257:9] DEBUG:  shmem_exit(-1): 0 before_shmem_exit callbacks to make
2024-05-22 11:46:28.138 UTC [21257:10] DEBUG:  shmem_exit(-1): 0 on_shmem_exit callbacks to make
2024-05-22 11:46:28.138 UTC [21257:11] DEBUG:  proc_exit(-1): 0 callbacks to make
2024-05-22 11:46:28.146 UTC [21256:2856] DEBUG:  MultiXactId wrap limit is 2147483648, limited by database with OID 5
2024-05-22 11:46:28.146 UTC [21256:2857] DEBUG:  MultiXact member stop limit is now 4294914944 based on MultiXact 1
2024-05-22 11:46:28.146 UTC [21256:2858] DEBUG:  shmem_exit(0): 4 before_shmem_exit callbacks to make
2024-05-22 11:46:28.147 UTC [21256:2859] DEBUG:  shmem_exit(0): 7 on_shmem_exit callbacks to make
2024-05-22 11:46:28.147 UTC [21256:2860] DEBUG:  proc_exit(0): 1 callbacks to make
2024-05-22 11:46:28.147 UTC [21256:2861] DEBUG:  exit(0)
2024-05-22 11:46:28.147 UTC [21256:2862] DEBUG:  shmem_exit(-1): 0 before_shmem_exit callbacks to make
2024-05-22 11:46:28.147 UTC [21256:2863] DEBUG:  shmem_exit(-1): 0 on_shmem_exit callbacks to make
2024-05-22 11:46:28.147 UTC [21256:2864] DEBUG:  proc_exit(-1): 0 callbacks to make
2024-05-22 11:46:28.151 UTC [14871:20078] DEBUG:  forked new backend, pid=21258 socket=8
2024-05-22 11:46:28.171 UTC [14871:20079] DEBUG:  server process (PID 21256) exited with exit code 0
2024-05-22 11:46:28.152 UTC [21258:1] [unknown] LOG:  connection received: host=[local]
2024-05-22 11:46:28.171 UTC [21258:2] [unknown] DEBUG:  InitPostgres
2024-05-22 11:46:28.172 UTC [21258:3] [unknown] LOG:  connection authenticated: user="vagrant" method=trust
(/pgtest/postgresql.git/src/test/modules/xid_wraparound/tmp_check/t_001_emergency_vacuum_main_data/pgdata/pg_hba.conf:117)
2024-05-22 11:46:28.172 UTC [21258:4] [unknown] LOG:  connection authorized: user=vagrant database=postgres
application_name=001_emergency_vacuum.pl
2024-05-22 11:46:28.175 UTC [21258:5] 001_emergency_vacuum.pl LOG: statement: INSERT INTO small(data) SELECT 1

That is, the autovacuum worker (21256) sent PMSIGNAL_START_AUTOVAC_LAUNCHER
and the postmaster started the autovacuum launcher, which could not start a
new autovacuum worker because process 21256 had not exited yet.

The failure can be reproduced easily with the sleep added inside
SetTransactionIdLimit():
         if (TransactionIdFollowsOrEquals(curXid, xidVacLimit) &&
                 IsUnderPostmaster && !InRecovery)
                 SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
+        pg_usleep(10000L);

By the way, I also discovered that the rather resource-intensive
xid_wraparound tests are executed twice during the buildfarm "make" run (on
the animals dodo and perentie, see [2]): at stage
module-xid_wraparound-check and then at stage testmodules-install-check-C.

[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dodo&dt=2024-05-19%2006%3A33%3A34
[2]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=perentie&dt=2024-05-22%2000%3A02%3A19

Best regards,
Alexander

#58Alexander Lakhin
exclusion@gmail.com
In reply to: Masahiko Sawada (#52)
2 attachment(s)
Re: Testing autovacuum wraparound (including failsafe)

Hello Masahiko-san,

01.12.2023 05:14, Masahiko Sawada wrote:

FYI I've configured the buildfarm animal perentie to run regression
tests including xid_wraparound:

https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=perentie&br=HEAD

Please look at a failure produced by perentie recently: [1].

I've analyzed all the available detailed perentie logs (starting from
2024-04-04) and extracted the durations of the pg_ctl stop operation
at the end of 001_emergency_vacuum.pl (from the
module-xid_wraparound-check stage): see perentie-timings.txt and
perentie-timings.png attached. So it looks like perentie needs a larger
PGCTLTIMEOUT for the test (maybe 180 seconds would work?).

Though it's not clear to me why this test takes so long on that animal,
even when it succeeds. For example, [2] shows:
[09:28:23] t/001_emergency_vacuum.pl .. ok   225894 ms ( 0.00 usr 0.00 sys +  0.31 cusr  0.43 csys =  0.74 CPU)
[09:30:53] t/002_limits.pl ............ ok   150291 ms ( 0.00 usr 0.00 sys +  1.85 cusr  1.50 csys =  3.35 CPU)
[09:49:33] t/003_wraparounds.pl ....... ok  1119766 ms ( 0.00 usr 0.00 sys +  1.68 cusr  2.39 csys =  4.07 CPU)

While what I'm seeing locally on my Fedora 40 VM is:
PG_TEST_EXTRA="xid_wraparound" make -s check -C src/test/modules/xid_wraparound/ PROVE_FLAGS="--timer"
# +++ tap check in src/test/modules/xid_wraparound +++
[04:41:56] t/001_emergency_vacuum.pl .. ok    18852 ms ( 0.01 usr 0.00 sys +  0.14 cusr  0.28 csys =  0.43 CPU)
[04:42:15] t/002_limits.pl ............ ok    18539 ms ( 0.00 usr 0.00 sys +  0.72 cusr  0.88 csys =  1.60 CPU)
[04:42:34] t/003_wraparounds.pl ....... ok    74368 ms ( 0.00 usr 0.00 sys +  0.82 cusr  2.57 csys =  3.39 CPU)

Also, maybe it would make sense to run this test for REL_17_STABLE too, as
dodo has not been with us since 2024-09-04, and I don't know if there are
other animals running these tests (having xid_wraparound in PG_TEST_EXTRA).

[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=perentie&dt=2024-10-05%2000%3A00%3A14
[2]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=perentie&dt=2024-10-06%2000%3A00%3A13

Best regards,
Alexander

Attachments:

perentie-timings.txt (text/plain; charset=UTF-8)
perentie-timings.tar.gz (application/gzip)

#59Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Alexander Lakhin (#58)
Re: Testing autovacuum wraparound (including failsafe)

Hi,

On Tue, Oct 8, 2024 at 10:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:

Hello Masahiko-san,

01.12.2023 05:14, Masahiko Sawada wrote:

FYI I've configured the buildfarm animal perentie to run regression
tests including xid_wraparound:

https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=perentie&br=HEAD

Please look at a failure produced by perentie recently: [1].

I've analyzed all the available detailed perentie logs (starting from
2024-04-04) and extracted the durations of the pg_ctl stop operation
at the end of 001_emergency_vacuum.pl (from the
module-xid_wraparound-check stage): see perentie-timings.txt and
perentie-timings.png attached. So it looks like perentie needs a larger
PGCTLTIMEOUT for the test (maybe 180 seconds would work?).

Though it's not clear to me why this test takes so long on that animal,
even when it succeeds. For example, [2] shows:
[09:28:23] t/001_emergency_vacuum.pl .. ok 225894 ms ( 0.00 usr 0.00 sys + 0.31 cusr 0.43 csys = 0.74 CPU)
[09:30:53] t/002_limits.pl ............ ok 150291 ms ( 0.00 usr 0.00 sys + 1.85 cusr 1.50 csys = 3.35 CPU)
[09:49:33] t/003_wraparounds.pl ....... ok 1119766 ms ( 0.00 usr 0.00 sys + 1.68 cusr 2.39 csys = 4.07 CPU)

While what I'm seeing locally on my Fedora 40 VM is:
PG_TEST_EXTRA="xid_wraparound" make -s check -C src/test/modules/xid_wraparound/ PROVE_FLAGS="--timer"
# +++ tap check in src/test/modules/xid_wraparound +++
[04:41:56] t/001_emergency_vacuum.pl .. ok 18852 ms ( 0.01 usr 0.00 sys + 0.14 cusr 0.28 csys = 0.43 CPU)
[04:42:15] t/002_limits.pl ............ ok 18539 ms ( 0.00 usr 0.00 sys + 0.72 cusr 0.88 csys = 1.60 CPU)
[04:42:34] t/003_wraparounds.pl ....... ok 74368 ms ( 0.00 usr 0.00 sys + 0.82 cusr 2.57 csys = 3.39 CPU)

Thank you for reporting it.

I've investigated the logs you shared and figured out that some other
(personal) tests were running at the same time (perentie is my
machine). So the failure can be ignored; sorry for the noise.
I'll make sure other tests don't overlap again while perentie is
executing the regression tests.

Also, maybe it would make sense to run this test for REL_17_STABLE too, as
dodo has not been with us since 2024-09-04, and I don't know if there are
other animals running these tests (having xid_wraparound in PG_TEST_EXTRA).

Good idea. I've configured perentie to do that.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com