pgsql: snapshot scalability: cache snapshots using a xact completion counter.
snapshot scalability: cache snapshots using a xact completion counter.
Previous commits made it faster/more scalable to compute snapshots. But not
building a snapshot at all is faster still. Now that GetSnapshotData() does not
maintain RecentGlobal* anymore, that is actually not too hard:
This commit introduces xactCompletionCount, which tracks the number of
top-level transactions with xids (i.e. which may have modified the database)
that completed in some form since the start of the server.
We can avoid rebuilding the snapshot's contents whenever the current
xactCompletionCount is the same as it was when the snapshot was
originally built. Currently this check happens while holding
ProcArrayLock. While it's likely possible to perform the check without
acquiring ProcArrayLock, it seems better to do that separately /
later, as some careful analysis is required. Even with the lock this is a
significant win on its own.
On a smaller two-socket machine this gains another ~1.03x; on a larger
machine the effect is roughly double (earlier patch version tested
though). If we were able to safely avoid the lock there'd be another
significant gain on top of that.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: /messages/by-id/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/623a9ba79bbdd11c5eccb30b8bd5c446130e521c
Modified Files
--------------
src/backend/replication/logical/snapbuild.c | 1 +
src/backend/storage/ipc/procarray.c | 125 +++++++++++++++++++++++-----
src/backend/utils/time/snapmgr.c | 4 +
src/include/access/transam.h | 9 ++
src/include/utils/snapshot.h | 7 ++
5 files changed, 126 insertions(+), 20 deletions(-)
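
To make the caching scheme concrete, here is a minimal, self-contained C
sketch of what the commit message describes. It is illustrative only, not the
committed PostgreSQL code: apart from xactCompletionCount, GetSnapshotData()
and ProcArrayLock, which the commit message mentions, all names are invented
for the example, and locking is reduced to a comment.

/*
 * Illustrative sketch of snapshot caching via a transaction completion
 * counter.  Simplified stand-in types, not PostgreSQL's actual structs.
 */
#include <stdbool.h>
#include <stdint.h>

typedef struct SnapshotSketch
{
    uint64_t    xmin;
    uint64_t    xmax;
    uint64_t   *xip;                /* in-progress top-level xids */
    int         xcnt;
    uint64_t    completion_count;   /* shared counter value when built */
    bool        valid;
} SnapshotSketch;

/*
 * Shared state: bumped whenever a top-level transaction that has an xid
 * (i.e. may have modified the database) commits or aborts.  In PostgreSQL
 * the counter lives in shared memory and is protected by ProcArrayLock.
 */
static uint64_t xact_completion_count;

/* Called at commit/abort of every xid-bearing top-level transaction. */
static void
note_xact_completion(void)
{
    xact_completion_count++;
}

/*
 * Reuse check a GetSnapshotData() equivalent can perform: if no xid-bearing
 * top-level transaction completed since the cached snapshot was built, its
 * contents still describe the set of running transactions and can be
 * returned as-is, skipping the expensive rebuild.
 */
static bool
snapshot_still_current(const SnapshotSketch *snap)
{
    /* the real check is performed while ProcArrayLock is held */
    return snap->valid && snap->completion_count == xact_completion_count;
}
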
On Tue, Aug 18, 2020 at 04:30:21AM +0000, Andres Freund wrote:
snapshot scalability: cache snapshots using a xact completion counter.
Previous commits made it faster/more scalable to compute snapshots. But not
building a snapshot at all is faster still. Now that GetSnapshotData() does not
maintain RecentGlobal* anymore, that is actually not too hard:

This commit introduces xactCompletionCount, which tracks the number of
top-level transactions with xids (i.e. which may have modified the database)
that completed in some form since the start of the server.

We can avoid rebuilding the snapshot's contents whenever the current
xactCompletionCount is the same as it was when the snapshot was
originally built. Currently this check happens while holding
ProcArrayLock. While it's likely possible to perform the check without
acquiring ProcArrayLock, it seems better to do that separately /
later, as some careful analysis is required. Even with the lock this is a
significant win on its own.

On a smaller two-socket machine this gains another ~1.03x; on a larger
machine the effect is roughly double (earlier patch version tested
though). If we were able to safely avoid the lock there'd be another
significant gain on top of that.
spurfowl and more animals are telling us that this commit has broken
2PC:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=spurfowl&dt=2020-08-18%2004%3A31%3A11
--
Michael
Andres Freund <andres@anarazel.de> writes:
snapshot scalability: cache snapshots using a xact completion counter.
buildfarm doesn't like this a bit ...
regards, tom lane
Hi,
On 2020-08-18 00:55:22 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
snapshot scalability: cache snapshots using a xact completion counter.
buildfarm doesn't like this a bit ...
Yea, looking already. Unless that turns out to be incredibly bad luck
and only the first three animals failed (there's a few passes after), or
unless I find the issue in the next 30min or so, I'll revert.
Greetings,
Andres Freund
On 2020-08-18 13:52:46 +0900, Michael Paquier wrote:
On Tue, Aug 18, 2020 at 04:30:21AM +0000, Andres Freund wrote:
spurfowl and more animals are telling us that this commit has broken
2PC:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=spurfowl&dt=2020-08-18%2004%3A31%3A11
It looks like it's a bit more subtle than outright breaking 2PC. We're
now at 3 out of 18 BF members having failed. I also ran quite a few
loops of the normal regression tests locally, without finding an issue.
I'd written to Tom that I was planning to revert unless the number of
failures turned out to be lower than initially indicated. That actually
seems to have come to pass (the failures are quicker to report because
they don't run the subsequent tests, of course). I'd like to let the
failures accumulate a bit longer, say until tomorrow midday if I haven't
figured it out by then, in the hope of finding some detail to help
pinpoint the issue.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
I'd written to Tom that I was planning to revert unless the number of
failures were lower than initially indicated. But that actually seems to
have come to pass (the failures are quicker to report because they don't
run the subsequent tests, of course). I'd like to let the failures
accumulate a bit longer, say until tomorrow Midday if I haven't figured
it out by then. With the hope of finding some detail to help pinpoint
the issue.
There's certainly no obvious pattern here, so I agree with waiting for
more data.
regards, tom lane
Hi,
On 2020-08-18 01:21:17 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
I'd written to Tom that I was planning to revert unless the number of
failures were lower than initially indicated. But that actually seems to
have come to pass (the failures are quicker to report because they don't
run the subsequent tests, of course). I'd like to let the failures
accumulate a bit longer, say until tomorrow Midday if I haven't figured
it out by then. With the hope of finding some detail to help pinpoint
the issue.

There's certainly no obvious pattern here, so I agree with waiting for
more data.
FWIW, I think I have found the bug, but I'm still working to reproduce
the issue reliably enough that I can verify that the fix actually works.
The issue is basically that 2PC PREPARE is weird, WRT procarray. The
last snapshot built with GetSnapshotData() before the PREPARE doesn't
include its own transaction in ->xip[], as normal. PrepareTransaction()
removes the "normal" entry with ProcArrayClearTransaction(), which so
far doesn't increase the xact completion count. Because the xact
completion count is not increased, snapshots can be reused as long as
they're taken before the 2PC transaction is finished. That's fine for
other backends, but not for the backend doing the PrepareTransaction(),
because that snapshot's ->xip does not include the backend's own xid.
It's a bit tricky to reproduce exactly the issue the BF is occasionally
hitting, because the way ->xmax is computed *limits* the
damage. Combined with the use of SERIALIZABLE (preventing recomputation
of the data snapshot) that makes it somewhat hard to hit.
Greetings,
Andres Freund
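
The sketch below continues the simplified example from earlier, reusing its
xact_completion_count. It shows the direction of a fix, not the code that was
actually pushed: counting the PREPARE-time procarray clearing as a completion
event forces cached snapshots, including the preparing backend's own, to be
rebuilt on next use.

/*
 * Sketch of the direction of a fix, reusing xact_completion_count from the
 * earlier example.  Function name and structure are invented; this is not
 * the fix that was actually committed.
 */
static void
clear_proc_entry_at_prepare(void)
{
    /* the real code runs with ProcArrayLock held exclusively */

    /* ... remove the backend's "normal" procarray entry here ... */

    /*
     * Count the PREPARE as a completion event.  Any snapshot built before
     * this point, including the preparing backend's own cached one whose
     * ->xip does not list the now-prepared xid, stops being considered
     * current and is rebuilt on next use, so the prepared transaction is
     * seen as in progress rather than as aborted.
     */
    xact_completion_count++;
}

The isolation test added later in this thread exercises exactly this
scenario: a second session commits a later xid, and the preparing session
then reads the table again after PREPARE.
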
On 2020-08-18 13:28:05 -0700, Andres Freund wrote:
Hi,
On 2020-08-18 01:21:17 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
I'd written to Tom that I was planning to revert unless the number of
failures were lower than initially indicated. But that actually seems to
have come to pass (the failures are quicker to report because they don't
run the subsequent tests, of course). I'd like to let the failures
accumulate a bit longer, say until tomorrow Midday if I haven't figured
it out by then. With the hope of finding some detail to help pinpoint
the issue.

There's certainly no obvious pattern here, so I agree with waiting for
more data.

FWIW, I think I have found the bug, but I'm still working to reproduce
the issue reliably enough that I can verify that the fix actually works.

The issue is basically that 2PC PREPARE is weird, WRT procarray. The
last snapshot built with GetSnapshotData() before the PREPARE doesn't
include its own transaction in ->xip[], as normal. PrepareTransaction()
removes the "normal" entry with ProcArrayClearTransaction(), which so
far doesn't increase the xact completion count. Because the xact
completion count is not increased, snapshots can be reused as long as
they're taken before the 2PC transaction is finished. That's fine for
other backends, but not for the backend doing the PrepareTransaction(),
because that snapshot's ->xip does not include the backend's own xid.

It's a bit tricky to reproduce exactly the issue the BF is occasionally
hitting, because the way ->xmax is computed *limits* the
damage. Combined with the use of SERIALIZABLE (preventing recomputation
of the data snapshot) that makes it somewhat hard to hit.
I pushed a fix. After a while I figured out that it's not actually that
hard to test reliably. But it does require multiple sessions
interacting; in particular, another session needs to acquire and commit a
transaction id that's later than the prepared transaction's.
I think it's worth adding an isolation test. But it doesn't seem like
extending prepared-transactions.spec makes too much sense, it doesn't
fit in well. It's a lot easier to reproduce the issue without
SERIALIZABLE, for example. Generally the file seems more about
serializable than 2PC...
So unless somebody disagrees I'm gonna add a new
prepared-transactions-2.spec.
Greetings,
Andres Freund
Hi,
This thread started on committers, at
/messages/by-id/20200818234532.uiafo5br5lo6zhya@alap3.anarazel.de
In it I wanted to add an isolation test around prepared transactions:
On 2020-08-18 16:45:32 -0700, Andres Freund wrote:
I think it's worth adding an isolation test. But it doesn't seem like
extending prepared-transactions.spec makes too much sense, it doesn't
fit in well. It's a lot easier to reproduce the issue without
SERIALIZABLE, for example. Generally the file seems more about
serializable than 2PC...

So unless somebody disagrees I'm gonna add a new
prepared-transactions-2.spec.
But I noticed that the already existing prepared transactions test
wasn't in the normal schedule, since:
commit ae55d9fbe3871a5e6309d9b91629f1b0ff2b8cba
Author: Andrew Dunstan <andrew@dunslane.net>
Date: 2012-07-20 15:51:40 -0400
Remove prepared transactions from main isolation test schedule.
There is no point in running this test when prepared transactions are disabled,
which is the default. New make targets that include the test are provided. This
will save some useless waste of cycles on buildfarm machines.
Backpatch to 9.1 where these tests were introduced.
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 669c0f220c4..2184975dcb1 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -9,7 +9,6 @@ test: ri-trigger
test: partial-index
test: two-ids
test: multiple-row-versions
-test: prepared-transactions
test: fk-contention
test: fk-deadlock
test: fk-deadlock2
The commit confuses me, cause I thought we explicitly enabled prepared
transactions during tests well before that? See
commit 8d4f2ecd41312e57422901952cbad234d293060b
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: 2009-04-23 00:23:46 +0000
Change the default value of max_prepared_transactions to zero, and add
documentation warnings against setting it nonzero unless active use of
prepared transactions is intended and a suitable transaction manager has been
installed. This should help to prevent the type of scenario we've seen
several times now where a prepared transaction is forgotten and eventually
causes severe maintenance problems (or even anti-wraparound shutdown).
The only real reason we had the default be nonzero in the first place was to
support regression testing of the feature. To still be able to do that,
tweak pg_regress to force a nonzero value during "make check". Since we
cannot force a nonzero value in "make installcheck", add a variant regression
test "expected" file that shows the results that will be obtained when
max_prepared_transactions is zero.
Also, extend the HINT messages for transaction wraparound warnings to mention
the possibility that old prepared transactions are causing the problem.
All per today's discussion.
And indeed, including the test in the schedule works for make check, not
just an installcheck with explicitly enabled prepared xacts.
ISTM we should just add an alternative output for disabled prepared
xacts, and re-add the test?
My new test, without the alternative output for now, is attached.
Greetings,
Andres Freund
Attachments:
test.diff (text/x-diff; charset=us-ascii)
diff --git c/src/test/isolation/expected/prepared-transactions-2.out i/src/test/isolation/expected/prepared-transactions-2.out
new file mode 100644
index 00000000000..f7fd2e7f989
--- /dev/null
+++ i/src/test/isolation/expected/prepared-transactions-2.out
@@ -0,0 +1,119 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_ins_before s1_begin s1_txid s2_txid s1_update_in s1_ins_in s1_show s1_prepare s1_show s1_commit s1_show
+step s1_ins_before: INSERT INTO preptest VALUES ('before');
+step s1_begin: BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
+step s1_txid: SELECT txid_current() IS NULL;
+?column?
+
+f
+step s2_txid: SELECT txid_current() IS NULL;
+?column?
+
+f
+step s1_update_in: UPDATE preptest SET data = 'in; '|| data;
+step s1_ins_in: INSERT INTO preptest VALUES ('in');
+step s1_show: SELECT * FROM preptest
+data
+
+in; before
+in
+step s1_prepare: PREPARE TRANSACTION 'prep';
+step s1_show: SELECT * FROM preptest
+data
+
+before
+step s1_commit: COMMIT PREPARED 'prep';
+step s1_show: SELECT * FROM preptest
+data
+
+in; before
+in
+
+starting permutation: s1_ins_before s1_begin s1_txid s2_txid s1_update_in s1_ins_in s1_show s1_prepare s2_show s1_commit s2_show
+step s1_ins_before: INSERT INTO preptest VALUES ('before');
+step s1_begin: BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
+step s1_txid: SELECT txid_current() IS NULL;
+?column?
+
+f
+step s2_txid: SELECT txid_current() IS NULL;
+?column?
+
+f
+step s1_update_in: UPDATE preptest SET data = 'in; '|| data;
+step s1_ins_in: INSERT INTO preptest VALUES ('in');
+step s1_show: SELECT * FROM preptest
+data
+
+in; before
+in
+step s1_prepare: PREPARE TRANSACTION 'prep';
+step s2_show: SELECT * FROM preptest
+data
+
+before
+step s1_commit: COMMIT PREPARED 'prep';
+step s2_show: SELECT * FROM preptest
+data
+
+in; before
+in
+
+starting permutation: s1_ins_before s1_begin s1_txid s2_txid s1_update_in s1_ins_in s1_show s1_prepare s1_show s1_rollback s1_show
+step s1_ins_before: INSERT INTO preptest VALUES ('before');
+step s1_begin: BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
+step s1_txid: SELECT txid_current() IS NULL;
+?column?
+
+f
+step s2_txid: SELECT txid_current() IS NULL;
+?column?
+
+f
+step s1_update_in: UPDATE preptest SET data = 'in; '|| data;
+step s1_ins_in: INSERT INTO preptest VALUES ('in');
+step s1_show: SELECT * FROM preptest
+data
+
+in; before
+in
+step s1_prepare: PREPARE TRANSACTION 'prep';
+step s1_show: SELECT * FROM preptest
+data
+
+before
+step s1_rollback: ROLLBACK PREPARED 'prep';
+step s1_show: SELECT * FROM preptest
+data
+
+before
+
+starting permutation: s1_ins_before s1_begin s1_txid s2_txid s1_update_in s1_ins_in s1_show s1_prepare s2_show s1_rollback s2_show
+step s1_ins_before: INSERT INTO preptest VALUES ('before');
+step s1_begin: BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
+step s1_txid: SELECT txid_current() IS NULL;
+?column?
+
+f
+step s2_txid: SELECT txid_current() IS NULL;
+?column?
+
+f
+step s1_update_in: UPDATE preptest SET data = 'in; '|| data;
+step s1_ins_in: INSERT INTO preptest VALUES ('in');
+step s1_show: SELECT * FROM preptest
+data
+
+in; before
+in
+step s1_prepare: PREPARE TRANSACTION 'prep';
+step s2_show: SELECT * FROM preptest
+data
+
+before
+step s1_rollback: ROLLBACK PREPARED 'prep';
+step s2_show: SELECT * FROM preptest
+data
+
+before
diff --git c/src/test/isolation/isolation_schedule i/src/test/isolation/isolation_schedule
index 218c87b24bf..ec15ce784db 100644
--- c/src/test/isolation/isolation_schedule
+++ i/src/test/isolation/isolation_schedule
@@ -1,3 +1,5 @@
+test: prepared-transactions
+test: prepared-transactions-2
test: read-only-anomaly
test: read-only-anomaly-2
test: read-only-anomaly-3
diff --git c/src/test/isolation/specs/prepared-transactions-2.spec i/src/test/isolation/specs/prepared-transactions-2.spec
new file mode 100644
index 00000000000..d5ed58e4cfc
--- /dev/null
+++ i/src/test/isolation/specs/prepared-transactions-2.spec
@@ -0,0 +1,60 @@
+# This test checks against a bug found in the initial support for
+# caching snapshots. The bug caused the snapshot from within the
+# to-be-prepared transaction to be used after preparing, erroneously
+# not including the prepared transaction's xid, thereby effectively
+# treating it as aborted.
+#
+# See also https://postgr.es/m/E1k7tGP-0005V0-5k%40gemulon.postgresql.org
+
+setup
+{
+ CREATE TABLE preptest(data text);
+}
+
+teardown
+{
+ DROP TABLE preptest;
+}
+
+session "s1"
+step "s1_begin" { BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED; }
+step "s1_txid" { SELECT txid_current() IS NULL; }
+step "s1_ins_before" { INSERT INTO preptest VALUES ('before'); }
+step "s1_update_in" { UPDATE preptest SET data = 'in; '|| data; }
+step "s1_ins_in" { INSERT INTO preptest VALUES ('in'); }
+step "s1_prepare" { PREPARE TRANSACTION 'prep'; }
+step "s1_commit" { COMMIT PREPARED 'prep'; }
+step "s1_rollback" { ROLLBACK PREPARED 'prep'; }
+step "s1_show" { SELECT * FROM preptest }
+
+session "s2"
+step "s2_show" { SELECT * FROM preptest }
+step "s2_txid" { SELECT txid_current() IS NULL; }
+
+# This permutation shows the bug
+permutation "s1_ins_before" "s1_begin" "s1_txid"
+ # Without acquiring an xid here, s1's xid would be the snapshot's
+ # xmax, hiding the bug (as only xids < xmax are looked up in ->xip).
+ "s2_txid"
+ "s1_update_in" "s1_ins_in" "s1_show" "s1_prepare"
+ # With the bug the next select would find xmax neither committed nor in progress,
+ # and therefore mark xmax as invalid. Obviously breaking visibility.
+ "s1_show" "s1_commit" "s1_show"
+
+# The following are just variations of the theme that seem good to
+# also test, even though they weren't affected by the bug.
+
+permutation "s1_ins_before" "s1_begin" "s1_txid"
+ "s2_txid"
+ "s1_update_in" "s1_ins_in" "s1_show" "s1_prepare"
+ "s2_show" "s1_commit" "s2_show"
+
+permutation "s1_ins_before" "s1_begin" "s1_txid"
+ "s2_txid"
+ "s1_update_in" "s1_ins_in" "s1_show" "s1_prepare"
+ "s1_show" "s1_rollback" "s1_show"
+
+permutation "s1_ins_before" "s1_begin" "s1_txid"
+ "s2_txid"
+ "s1_update_in" "s1_ins_in" "s1_show" "s1_prepare"
+ "s2_show" "s1_rollback" "s2_show"
Andres Freund <andres@anarazel.de> writes:
ISTM we should just add an alternative output for disabled prepared
xacts, and re-add the test?
I believe the buildfarm runs the isolation step with "make installcheck",
so if you're hoping to get buildfarm coverage that way, you're mistaken.
Having said that, it'd probably be good if "make check" did run this test.
regards, tom lane
Hi,
On 2020-08-18 22:24:20 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
ISTM we should just add an alternative output for disabled prepared
xacts, and re-add the test?

I believe the buildfarm runs the isolation step with "make installcheck",
so if you're hoping to get buildfarm coverage that way, you're mistaken.
It seems like the buildfarm ought to configure the started server with a
bunch of prepared transactions, in that case? At least going forward?
Having said that, it'd probably be good if "make check" did run this test.
Yea. It'd at least be run when we do check-world - which at least I do
before nearly every commit.
Greetings,
Andres Freund
On Tue, Aug 18, 2020 at 07:34:00PM -0700, Andres Freund wrote:
It seems like the buildfarm ought to configure the started server with a
bunch of prepared transactions, in that case? At least going forward?
Agreed. Testing with max_prepared_transactions > 0 has much more
value than not, for sure. So I think that it could be a good thing,
particularly if we begin to add more isolation tests.
--
Michael
On 8/18/20 9:22 PM, Andres Freund wrote:
Hi,
This thread started on committers, at
/messages/by-id/20200818234532.uiafo5br5lo6zhya@alap3.anarazel.de

In it I wanted to add an isolation test around prepared transactions:

On 2020-08-18 16:45:32 -0700, Andres Freund wrote:
I think it's worth adding an isolation test. But it doesn't seem like
extending prepared-transactions.spec makes too much sense, it doesn't
fit in well. It's a lot easier to reproduce the issue without
SERIALIZABLE, for example. Generally the file seems more about
serializable than 2PC...

So unless somebody disagrees I'm gonna add a new
prepared-transactions-2.spec.

But I noticed that the already existing prepared transactions test
wasn't in the normal schedule, since:

commit ae55d9fbe3871a5e6309d9b91629f1b0ff2b8cba
Author: Andrew Dunstan <andrew@dunslane.net>
Date: 2012-07-20 15:51:40 -0400

Remove prepared transactions from main isolation test schedule.

There is no point in running this test when prepared transactions are disabled,
which is the default. New make targets that include the test are provided. This
will save some useless waste of cycles on buildfarm machines.

Backpatch to 9.1 where these tests were introduced.

diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 669c0f220c4..2184975dcb1 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -9,7 +9,6 @@ test: ri-trigger
test: partial-index
test: two-ids
test: multiple-row-versions
-test: prepared-transactions
test: fk-contention
test: fk-deadlock
test: fk-deadlock2

The commit confuses me, cause I thought we explicitly enabled prepared
transactions during tests well before that? See

commit 8d4f2ecd41312e57422901952cbad234d293060b
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: 2009-04-23 00:23:46 +0000

Change the default value of max_prepared_transactions to zero, and add
documentation warnings against setting it nonzero unless active use of
prepared transactions is intended and a suitable transaction manager has been
installed. This should help to prevent the type of scenario we've seen
several times now where a prepared transaction is forgotten and eventually
causes severe maintenance problems (or even anti-wraparound shutdown).

The only real reason we had the default be nonzero in the first place was to
support regression testing of the feature. To still be able to do that,
tweak pg_regress to force a nonzero value during "make check". Since we
cannot force a nonzero value in "make installcheck", add a variant regression
test "expected" file that shows the results that will be obtained when
max_prepared_transactions is zero.

Also, extend the HINT messages for transaction wraparound warnings to mention
the possibility that old prepared transactions are causing the problem.

All per today's discussion.

And indeed, including the test in the schedule works for make check, not
just an installcheck with explicitly enabled prepared xacts.

ISTM we should just add an alternative output for disabled prepared
xacts, and re-add the test?
Here's the context for the 2012 commit:
/messages/by-id/50099220.2060005@dunslane.net
So I hope any changes that are made will not result in a major slowdown
of buildfarm animals.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020-Aug-18, Andres Freund wrote:
So unless somebody disagrees I'm gonna add a new
prepared-transactions-2.spec.
I think keeping things separate if they're not really related is
sensible.
I think it might be a good idea to add that test to older branches too,
even if it's just 13 -- at least temporarily.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services