011_crash_recovery.pl intermittently fails

Started by Kyotaro Horiguchi, almost 5 years ago. 32 messages.
#1 Kyotaro Horiguchi
horikyota.ntt@gmail.com
1 attachment(s)

Hello.

I noticed that 011_crash_recovery.pl intermittently fails in the
second test (roughly one time in three in my environment).

t/011_crash_recovery.pl .. 2/3
# Failed test 'new xid after restart is greater'
# at t/011_crash_recovery.pl line 56.
# '539'
# >
# '539'

# Failed test 'xid is aborted after crash'
# at t/011_crash_recovery.pl line 60.
# got: 'committed'
# expected: 'aborted'
# Looks like you failed 2 tests of 3.
t/011_crash_recovery.pl .. Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/3 subtests

Test Summary Report
-------------------
t/011_crash_recovery.pl (Wstat: 512 Tests: 3 Failed: 2)
Failed tests: 2-3
Non-zero exit status: 2
Files=1, Tests=3, 3 wallclock secs ( 0.03 usr 0.01 sys + 1.90 cusr 0.39 csys = 2.33 CPU)
Result: FAIL

If the server crashes before the WAL records for the just-started
transaction reach disk, the restarted server cannot know that the xid
was ever started. I'm not sure whether that is the intention of the
test, but we must make sure the WAL is flushed before crashing.
CHECKPOINT ensures that.
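
For context, CHECKPOINT guarantees the flush because the checkpointer
inserts a checkpoint record and then flushes WAL up to it, and flushing to
an LSN forces out all earlier WAL as well. A sketch from memory of the
relevant part of xlog.c's CreateCheckPoint() (paraphrased, not exact
source):

    XLogBeginInsert();
    XLogRegisterData((char *) &checkPoint, sizeof(checkPoint));
    recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKPOINT_ONLINE);
    XLogFlush(recptr);  /* forces out all WAL <= recptr, including any
                         * records that carry our xid */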

Thoughts? The attached seems to stabilize the test for me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

stabilize_011_crash_recovery_pl.patch (text/x-patch; charset=us-ascii)
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kyotaro Horiguchi (#1)
Re: 011_crash_recovery.pl intermittently fails

Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:

I noticed that 011_crash_recovery.pl intermittently fails in the
second test (roughly one time in three in my environment).

Hmmm ... what environment is that? This test script hasn't changed
meaningfully in several years, and we have not seen any real issues
with it up to now.

If the server crashes before the WAL records for the just-started
transaction reach disk, the restarted server cannot know that the xid
was ever started. I'm not sure whether that is the intention of the
test, but we must make sure the WAL is flushed before crashing.
CHECKPOINT ensures that.

The original commit for this test says

----
commit 857ee8e391ff6654ef9dcc5dd8b658d7709d0a3c
Author: Robert Haas <rhaas@postgresql.org>
Date: Fri Mar 24 12:00:53 2017 -0400

Add a txid_status function.

If your connection to the database server is lost while a COMMIT is
in progress, it may be difficult to figure out whether the COMMIT was
successful or not. This function will tell you, provided that you
don't wait too long to ask. It may be useful in other situations,
too.

Craig Ringer, reviewed by Simon Riggs and by me

Discussion: /messages/by-id/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com
----

If the function needs a CHECKPOINT to give a reliable answer,
is it actually good for the claimed purpose?

Independently of that, I doubt that adding a checkpoint call
after the pg_current_xact_id() call is going to help. The
Perl script is able to move on as soon as it's read the
function result. If we need this hack, it has to be put
before that SELECT, AFAICS.

regards, tom lane

#3 Peter Geoghegan
pg@bowt.ie
In reply to: Tom Lane (#2)
Re: 011_crash_recovery.pl intermittently fails

On Thu, Mar 4, 2021 at 7:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmmm ... what environment is that? This test script hasn't changed
meaningfully in several years, and we have not seen any real issues
with it up to now.

Did you see this recent thread?

/messages/by-id/20210208215206.mqmrnpkaqrdtm7fj@alap3.anarazel.de

--
Peter Geoghegan

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#3)
Re: 011_crash_recovery.pl intermittently fails

Peter Geoghegan <pg@bowt.ie> writes:

On Thu, Mar 4, 2021 at 7:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmmm ... what environment is that? This test script hasn't changed
meaningfully in several years, and we have not seen any real issues
with it up to now.

Did you see this recent thread?
/messages/by-id/20210208215206.mqmrnpkaqrdtm7fj@alap3.anarazel.de

Hadn't paid much attention at the time, but yeah, it looks like Andres
tripped over some variant of this.

I'd be kind of inclined to remove this test script altogether, on the
grounds that it's wasting cycles on a function that doesn't really
do what is claimed (and we should remove the documentation claim, too).

Having said that, it's still true that this test has been stable in
the buildfarm. Andres explained what he had to perturb to make it
fail, so I'm still interested in what Horiguchi-san did to break it.

regards, tom lane

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#4)
Re: 011_crash_recovery.pl intermittently fails

I wrote:

I'd be kind of inclined to remove this test script altogether, on the
grounds that it's wasting cycles on a function that doesn't really
do what is claimed (and we should remove the documentation claim, too).

Alternatively, maybe we can salvage the function's usefulness by making it
flush WAL before returning?

If we go that route, then we have the opposite problem with respect
to the test script: rather than trying to make it paper over the
function's problems, we ought to try to make it reliably fail with
the code as it stands.

regards, tom lane

#6 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Tom Lane (#4)
Re: 011_crash_recovery.pl intermittently fails

At Thu, 04 Mar 2021 23:02:09 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Peter Geoghegan <pg@bowt.ie> writes:

On Thu, Mar 4, 2021 at 7:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmmm ... what environment is that? This test script hasn't changed
meaningfully in several years, and we have not seen any real issues
with it up to now.

Did you see this recent thread?
/messages/by-id/20210208215206.mqmrnpkaqrdtm7fj@alap3.anarazel.de

Hadn't paid much attention at the time, but yeah, it looks like Andres
tripped over some variant of this.

I'd be kind of inclined to remove this test script altogether, on the
grounds that it's wasting cycles on a function that doesn't really
do what is claimed (and we should remove the documentation claim, too).

Having said that, it's still true that this test has been stable in
the buildfarm. Andres explained what he had to perturb to make it
fail, so I'm still interested in what Horiguchi-san did to break it.

CONFIGURE = '--enable-debug' '--enable-cassert' '--enable-tap-tests' '--enable-depend' '--enable-nls' '--with-icu' '--with-openssl' '--with-libxml' '--with-uuid=e2fs' '--with-tcl' '--with-llvm' '--prefix=/home/horiguti/bin/pgsql_work' 'LLVM_CONFIG=/usr/bin/llvm-config-64' 'CC=/usr/lib64/ccache/gcc' 'CLANG=/usr/lib64/ccache/clang' 'CFLAGS=-O0' '--with-wal-blocksize=16'

the WAL block size might have had an effect. I'll recheck without it.

FWIW xfs/CentOS8/VirtualBox/Windows10

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#7 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#6)
Re: 011_crash_recovery.pl intermittently fails

At Fri, 05 Mar 2021 13:13:04 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Thu, 04 Mar 2021 23:02:09 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Peter Geoghegan <pg@bowt.ie> writes:

On Thu, Mar 4, 2021 at 7:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmmm ... what environment is that? This test script hasn't changed
meaningfully in several years, and we have not seen any real issues
with it up to now.

Did you see this recent thread?
/messages/by-id/20210208215206.mqmrnpkaqrdtm7fj@alap3.anarazel.de

Hadn't paid much attention at the time, but yeah, it looks like Andres
tripped over some variant of this.

I'd be kind of inclined to remove this test script altogether, on the
grounds that it's wasting cycles on a function that doesn't really
do what is claimed (and we should remove the documentation claim, too).

Having said that, it's still true that this test has been stable in
the buildfarm. Andres explained what he had to perturb to make it
fail, so I'm still interested in what Horiguchi-san did to break it.

CONFIGURE = '--enable-debug' '--enable-cassert' '--enable-tap-tests' '--enable-depend' '--enable-nls' '--with-icu' '--with-openssl' '--with-libxml' '--with-uuid=e2fs' '--with-tcl' '--with-llvm' '--prefix=/home/horiguti/bin/pgsql_work' 'LLVM_CONFIG=/usr/bin/llvm-config-64' 'CC=/usr/lib64/ccache/gcc' 'CLANG=/usr/lib64/ccache/clang' 'CFLAGS=-O0' '--with-wal-blocksize=16'

the WAL block size might have had an effect. I'll recheck without it.

OK, without it I don't see the failure. I guess that the WAL records
for the last transaction cross a block boundary with 8kB WAL blocks,
but not with 16kB blocks.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#8 Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#5)
Re: 011_crash_recovery.pl intermittently fails

On Fri, Mar 5, 2021 at 5:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

I'd be kind of inclined to remove this test script altogether, on the
grounds that it's wasting cycles on a function that doesn't really
do what is claimed (and we should remove the documentation claim, too).

Alternatively, maybe we can salvage the function's usefulness by making it
flush WAL before returning?

To make pg_xact_status()'s result reliable, don't you need to make
pg_current_xact_id() flush? In other words, isn't it at the point
that you *observe* the transaction that you have to make sure that
this transaction ID won't be reused after crash recovery? Before
that, it's simultaneously allocated and unallocated, like the cat.

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#8)
Re: 011_crash_recovery.pl intermittently fails

Thomas Munro <thomas.munro@gmail.com> writes:

On Fri, Mar 5, 2021 at 5:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alternatively, maybe we can salvage the function's usefulness by making it
flush WAL before returning?

To make pg_xact_status()'s result reliable, don't you need to make
pg_current_xact_id() flush? In other words, isn't it at the point
that you *observe* the transaction that you have to make sure that
this transaction ID won't be reused after crash recovery? Before
that, it's simultaneously allocated and unallocated, like the cat.

We need to be sure that the XID is written out to WAL before we
let the client see it, yeah. I've not looked to see exactly
where in the code would be the best place.

BTW, I tried simply removing the "allows_streaming" option from
the test, and it failed ten times out of ten tries for me.
So Andres is right that that makes it pretty reproducible in
a stock build.

regards, tom lane

#10 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#7)
Re: 011_crash_recovery.pl intermittently fails

At Fri, 05 Mar 2021 13:21:48 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Fri, 05 Mar 2021 13:13:04 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Thu, 04 Mar 2021 23:02:09 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Having said that, it's still true that this test has been stable in
the buildfarm. Andres explained what he had to perturb to make it
fail, so I'm still interested in what Horiguchi-san did to break it.

CONFIGURE = '--enable-debug' '--enable-cassert' '--enable-tap-tests' '--enable-depend' '--enable-nls' '--with-icu' '--with-openssl' '--with-libxml' '--with-uuid=e2fs' '--with-tcl' '--with-llvm' '--prefix=/home/horiguti/bin/pgsql_work' 'LLVM_CONFIG=/usr/bin/llvm-config-64' 'CC=/usr/lib64/ccache/gcc' 'CLANG=/usr/lib64/ccache/clang' 'CFLAGS=-O0' '--with-wal-blocksize=16'

the WAL block size might have had an effect. I'll recheck without it.

OK, without it I don't see the failure. I guess that the WAL records
for the last transaction cross a block boundary with 8kB WAL blocks,
but not with 16kB blocks.

In the failure case with 16kB WAL blocks, tx538 ends with a commit
record at 0/01648B98 and nothing follows (other than the records added
after restart).

In the successful case, tx538 ends at the same LSN and is followed by
INSERT@tx539 at 0/1648CE0 up to INSERT_LEAF at 0/0165BFD8-1. So tx539
just fits inside the block (0x1648000-0x164BFFF). That amount of WAL
must cross an 8kB boundary.

Actually, with 8kB blocks, tx538 ends at 0/0164B1A8 and tx539 starts
at 0/0164B2A8 and ends at 0/0165E7C8, crossing boundaries at
0/0164C000 and 0/016E000.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#11 Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Tom Lane (#2)
Re: 011_crash_recovery.pl intermittently fails

On 5 March 2021, at 08:32, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:

I noticed that 011_crash_recovery.pl intermittently fails in the
second test (roughly one time in three in my environment).

Hmmm ... what environment is that? This test script hasn't changed
meaningfully in several years, and we have not seen any real issues
with it up to now.

Maybe it's off-topic here, but anyway...
While working on "lz4 for FPIs" I noticed that this test fails with wal_compression = on.
I did not investigate the case at that moment, but I think it would be good to run the recovery regression tests with compression too.

Best regards, Andrey Borodin.

#12 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Tom Lane (#9)
Re: 011_crash_recovery.pl intermittently fails

At Thu, 04 Mar 2021 23:40:34 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

BTW, I tried simply removing the "allows_streaming" option from
the test, and it failed ten times out of ten tries for me.
So Andres is right that that makes it pretty reproducible in
a stock build.

The difference comes from shared_buffers. In the "allows_streaming"
case, PostgresNode::init() *reduces* it down to '1MB' (128 blocks),
which leads to only 8 XLOG buffers; those wrap around very soon, which
forces XLogWrite.

In the allows_streaming=0 case, shared_buffers keeps its default of
128MB (16384 blocks). The 512 WAL buffers don't wrap around during the
test, so no WAL buffer is written out in that case.
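
For reference, with the default wal_buffers = -1 the buffer count is
auto-tuned from shared_buffers; the sizing logic in xlog.c looks roughly
like this (paraphrased from memory, so treat it as a sketch):

    /* Sketch of xlog.c's XLOGChooseNumBuffers(), paraphrased from memory. */
    static int
    XLOGChooseNumBuffers(void)
    {
        int         xbuffers;

        xbuffers = NBuffers / 32;       /* 1/32 of shared_buffers... */
        if (xbuffers > (wal_segment_size / XLOG_BLCKSZ))
            xbuffers = (wal_segment_size / XLOG_BLCKSZ);  /* ...capped at one segment */
        if (xbuffers < 8)
            xbuffers = 8;               /* ...with a floor of 8 pages */
        return xbuffers;
    }

With shared_buffers = 1MB, NBuffers is 128 and 128/32 = 4, which is clamped
up to the floor of 8; with the 128MB default, 16384/32 gives the 512
buffers mentioned above.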

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#13 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Andrey Borodin (#11)
Re: 011_crash_recovery.pl intermittently fails

At Fri, 5 Mar 2021 10:07:06 +0500, Andrey Borodin <x4mmm@yandex-team.ru> wrote in

Maybe it's off-topic here, but anyway...
While working on "lz4 for FPIs" I noticed that this test fails with wal_compression = on.
I did not investigate the case at that moment, but I think it would be good to run the recovery regression tests with compression too.

The problem records have 15 pages of FPIs. Reducing their size may
prevent WAL-buffer wraparound and WAL writes. If no WAL is written,
the test fails.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#14 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Andrey Borodin (#11)
Re: 011_crash_recovery.pl intermittently fails

(Sorry for my slippery fingers.)

At Fri, 5 Mar 2021 10:07:06 +0500, Andrey Borodin <x4mmm@yandex-team.ru> wrote in

Maybe it's off-topic here, but anyway...
While working on "lz4 for FPIs" I noticed that this test fails with wal_compression = on.
I did not investigate the case at that moment, but I think it would be good to run the recovery regression tests with compression too.

Best regards, Andrey Borodin.

The problem records have 15 pages of FPIs. Reducing their size may
prevent WAL-buffer wraparound and WAL writes. If no WAL is written,
the test fails.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#15 Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Kyotaro Horiguchi (#14)
Re: 011_crash_recovery.pl intermittently fails

On 5 March 2021, at 13:00, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

The problem records have 15 pages of FPIs. Reducing their size may
prevent WAL-buffer wraparound and WAL writes. If no WAL is written,
the test fails.

Thanks, I've finally understood the root cause.
So, the test verifies a guarantee that is not provided (durability of aborted transactions)?
Maybe flip it to test that the transaction's effects are not committed/visible?

Best regards, Andrey Borodin.

#16 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Andrey Borodin (#15)
Re: 011_crash_recovery.pl intermittently fails

At Fri, 5 Mar 2021 13:20:53 +0500, Andrey Borodin <x4mmm@yandex-team.ru> wrote in

On 5 March 2021, at 13:00, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

The problem records have 15 pages of FPIs. Reducing their size may
prevent WAL-buffer wraparound and WAL writes. If no WAL is written,
the test fails.

Thanks, I've finally understood the root cause.
So, the test verifies a guarantee that is not provided (durability of aborted transactions)?

I think that's right.

Maybe flip it to test that the transaction's effects are not committed/visible?

Maybe not. The objective of the test is, I think, to check whether a
transaction that was maybe-committed at the crash ends up committed or
aborted, without directly examining the resulting data. And that
feature turns out not to be working as expected.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#17 Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#9)
Re: 011_crash_recovery.pl intermittently fails

On Fri, Mar 5, 2021 at 5:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

On Fri, Mar 5, 2021 at 5:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alternatively, maybe we can salvage the function's usefulness by making it
flush WAL before returning?

To make pg_xact_status()'s result reliable, don't you need to make
pg_current_xact_id() flush? In other words, isn't it at the point
that you *observe* the transaction that you have to make sure that
this transaction ID won't be reused after crash recovery? Before
that, it's simultaneously allocated and unallocated, like the cat.

We need to be sure that the XID is written out to WAL before we
let the client see it, yeah. I've not looked to see exactly
where in the code would be the best place.

One idea would be for TransactionStateData's bool member didLogXid to
become an LSN, initially invalid, that points one past the first
record logged for this transaction, maintained by
MarkCurrentTransactionIdLoggedIfAny(). Then, pg_current_xact_id()
(and any similar xid-reporting function that we deem to be an
officially sanctioned way for a client to "observe" xids) could check
if it's valid yet; if not, it could go ahead and log something
containing the xid to make it valid. Then it could flush the log up
to that LSN.
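
In code form, the idea might look something like this (a hypothetical
sketch only; firstLoggedLsn, EnsureXidIsDurable() and
LogXidAssignmentRecord() are illustrative names, not anything that exists
in the tree):

    /* Hypothetical sketch of the proposal above; names are illustrative. */
    typedef struct TransactionStateData
    {
        /* ... */
        XLogRecPtr  firstLoggedLsn; /* just past the first WAL record
                                     * carrying our xid; replaces the bool
                                     * didLogXid, maintained by
                                     * MarkCurrentTransactionIdLoggedIfAny() */
        /* ... */
    } TransactionStateData;

    /* Called by pg_current_xact_id() before handing the xid to the client. */
    static void
    EnsureXidIsDurable(void)
    {
        TransactionState s = CurrentTransactionState;

        if (XLogRecPtrIsInvalid(s->firstLoggedLsn))
        {
            /* Nothing logged yet: emit a small record that carries the xid. */
            s->firstLoggedLsn = LogXidAssignmentRecord();   /* hypothetical */
        }

        /* The record must be on disk before the client can observe the xid. */
        XLogFlush(s->firstLoggedLsn);
    }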

#18 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#12)
Re: 011_crash_recovery.pl intermittently fails

At Fri, 05 Mar 2021 16:51:17 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

The difference comes from shared_buffers. In the "allows_streaming"
case, PostgresNode::init() *reduces* it down to '1MB' (128 blocks),
which leads to only 8 XLOG buffers; those wrap around very soon, which
forces XLogWrite.

In the allows_streaming=0 case, shared_buffers keeps its default of
128MB (16384 blocks). The 512 WAL buffers don't wrap around during the
test, so no WAL buffer is written out in that case.

So I think we need to remove the shared_buffers setting for the
allows_streaming case in PostgresNode.pm.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#19 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kyotaro Horiguchi (#18)
Re: 011_crash_recovery.pl intermittently fails

Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:

So I think we need to remove the shared_buffers setting for the
allows_streaming case in PostgresNode.pm.

That would have uncertain effects on other TAP tests, so I'm disinclined
to do it that way. 011_crash_recovery.pl doesn't actually use a standby
server, so just removing its use of the allows_streaming option seems
sufficient.

But, of course, first we need a fix for the bug we now know exists.
Was anyone volunteering to make the patch?

regards, tom lane

#20 Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#19)
Re: 011_crash_recovery.pl intermittently fails

On Fri, Mar 05, 2021 at 11:16:55AM -0500, Tom Lane wrote:

That would have uncertain effects on other TAP tests, so I'm disinclined
to do it that way.

+1.  There may be out-of-core tests that rely on this value as the
default.
--
Michael

#21 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#20)
Re: 011_crash_recovery.pl intermittently fails

At Sat, 6 Mar 2021 10:25:46 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Fri, Mar 05, 2021 at 11:16:55AM -0500, Tom Lane wrote:

That would have uncertain effects on other TAP tests, so I'm disinclined
to do it that way.

+1. There may be out-of-core tests that rely on this value as the
default.

On second thought, that difference can be said to have revealed the
problem. Agreed.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#22 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Tom Lane (#19)
Re: 011_crash_recovery.pl intermittently fails

At Fri, 05 Mar 2021 11:16:55 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:

So I think we need to remove the shared_buffers setting for the
allows_streaming case in PostgresNode.pm.

That would have uncertain effects on other TAP tests, so I'm disinclined
to do it that way. 011_crash_recovery.pl doesn't actually use a standby
server, so just removing its use of the allows_streaming option seems
sufficient.

But, of course, first we need a fix for the bug we now know exists.
Was anyone volunteering to make the patch?

Thomas' proposal sounds reasonable. If he is not going to do it for
now, I'm willing to work on it.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#23 Thomas Munro
thomas.munro@gmail.com
In reply to: Kyotaro Horiguchi (#22)
Re: 011_crash_recovery.pl intermittently fails

On Mon, Mar 8, 2021 at 1:39 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Fri, 05 Mar 2021 11:16:55 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
But, of course, first we need a fix for the bug we now know exists.
Was anyone volunteering to make the patch?

Thomas' proposal sounds reasonable. If he is not going to do it for
now, I'm willing to work on it.

Thanks! I'm afraid I wouldn't get around to it for a few weeks, so if
you have time, please do. (I'm not sure if it's strictly necessary to
log *this* xid, if a higher xid has already been logged, considering
that the goal is just to avoid getting confused about an xid that is
recycled after crash recovery, but coordinating that might be more
complicated; I don't know.)

#24 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#23)
Re: 011_crash_recovery.pl intermittently fails

Thomas Munro <thomas.munro@gmail.com> writes:

Thanks! I'm afraid I wouldn't get around to it for a few weeks, so if
you have time, please do. (I'm not sure if it's strictly necessary to
log *this* xid, if a higher xid has already been logged, considering
that the goal is just to avoid getting confused about an xid that is
recycled after crash recovery, but coordinating that might be more
complicated; I don't know.)

Yeah, ideally the patch wouldn't add any unnecessary WAL flush,
if there's some cheap way to determine that our XID must already
have been written out. But I'm not sure that it's worth adding
any great amount of complexity to avoid that. For sure I would
not advocate adding any new bookkeeping overhead in the mainline
code paths to support it.

regards, tom lane

#25 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Tom Lane (#24)
2 attachment(s)
Re: 011_crash_recovery.pl intermittently fails

At Sun, 07 Mar 2021 20:09:33 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Thomas Munro <thomas.munro@gmail.com> writes:

Thanks! I'm afraid I wouldn't get around to it for a few weeks, so if
you have time, please do. (I'm not sure if it's strictly necessary to
log *this* xid, if a higher xid has already been logged, considering
that the goal is just to avoid getting confused about an xid that is
recycled after crash recovery, but coordinating that might be more
complicated; I don't know.)

Yeah, ideally the patch wouldn't add any unnecessary WAL flush,
if there's some cheap way to determine that our XID must already
have been written out. But I'm not sure that it's worth adding
any great amount of complexity to avoid that. For sure I would
not advocate adding any new bookkeeping overhead in the mainline
code paths to support it.

We need to *write* an additional record if the current transaction
hasn't yet written one (EnsureTopTransactionIdLogged()). One annoyance
is the probably most common usage, calling pg_current_xact_id() at the
beginning of a transaction, which leads to an additional 8-byte
XLOG_XACT_ASSIGNMENT record. We could also avoid that by detecting
that some larger xid has already been flushed out.
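
In outline, that optimization might look like this (purely illustrative;
maxFlushedXid is hypothetical shared state that the attached patches do
not implement):

    /*
     * Purely illustrative. If a WAL record carrying some xid >= ours has
     * already been flushed, crash recovery will restore the xid counter
     * past our xid, so it cannot be handed out again, and no extra record
     * or flush is needed. maxFlushedXid is hypothetical; tracking it
     * cheaply is the hard part described below.
     */
    if (TransactionIdIsValid(maxFlushedXid) &&
        TransactionIdPrecedesOrEquals(GetCurrentTransactionId(), maxFlushedXid))
        return;                         /* xid already durable */

    EnsureTopTransactionIdLogged();     /* otherwise write and flush */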

I haven't found a simple and clean way of tracking the maximum
flushed-out XID. The new cooperation between xlog.c and xact.c,
relating XIDs and LSNs through shared variables, makes things
complex...

So the attached doesn't contain the max-flushed-xid tracking feature.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v1-0001-Run-011_crash_recovery.pl-with-wal_level-minimal.patch (text/x-patch; charset=us-ascii)
v1-0002-Make-sure-published-XIDs-are-persistent.patch (text/x-patch; charset=us-ascii)
#26 Thomas Munro
thomas.munro@gmail.com
In reply to: Kyotaro Horiguchi (#25)
Re: 011_crash_recovery.pl intermittently fails

On Mon, Mar 8, 2021 at 9:32 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Sun, 07 Mar 2021 20:09:33 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Thomas Munro <thomas.munro@gmail.com> writes:

Thanks! I'm afraid I wouldn't get around to it for a few weeks, so if
you have time, please do. (I'm not sure if it's strictly necessary to
log *this* xid, if a higher xid has already been logged, considering
that the goal is just to avoid getting confused about an xid that is
recycled after crash recovery, but coordinating that might be more
complicated; I don't know.)

Yeah, ideally the patch wouldn't add any unnecessary WAL flush,
if there's some cheap way to determine that our XID must already
have been written out. But I'm not sure that it's worth adding
any great amount of complexity to avoid that. For sure I would
not advocate adding any new bookkeeping overhead in the mainline
code paths to support it.

We need to *write* an additional record if the current transaction
hasn't yet written one (EnsureTopTransactionIdLogged()). One annoyance
is the probably most common usage, calling pg_current_xact_id() at the
beginning of a transaction, which leads to an additional 8-byte
XLOG_XACT_ASSIGNMENT record. We could also avoid that by detecting
that some larger xid has already been flushed out.

Yeah, that would be very expensive for users doing that.

I haven't found a simple and clean way of tracking the maximum
flushed-out XID. The new cooperation between xlog.c and xact.c,
relating XIDs and LSNs through shared variables, makes things
complex...

So the attached doesn't contain the max-flushed-xid tracking feature.

I guess that would be just as expensive if the user does that
sequentially with small transactions (i.e., allocating xids one by one).

I remembered this thread after seeing the failure of Michael's new
build farm animal "tanager". I think we need to solve this somehow...
according to our documentation "Applications might use this function,
for example, to determine whether their transaction committed or
aborted after the application and database server become disconnected
while a COMMIT is in progress.", but it's currently useless or
dangerous for that purpose.

#27 Michael Paquier
michael@paquier.xyz
In reply to: Thomas Munro (#26)
Re: 011_crash_recovery.pl intermittently fails

On Wed, Jan 25, 2023 at 12:40:02PM +1300, Thomas Munro wrote:

I remembered this thread after seeing the failure of Michael's new
build farm animal "tanager". I think we need to solve this somehow...

Well, this host has a problem with what looks like a kernel issue, I
guess... This is repeatable across all the branches, randomly, with
various errors from the POSIX DSM implementation:
# [63cf68b7.5e5a:1] ERROR: could not open shared memory segment "/PostgreSQL.543738922": No such file or directory
# [63cf68b7.5e58:6] ERROR: could not open shared memory segment "/PostgreSQL.543738922": No such file or directory

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tanager&dt=2023-01-24%2004%3A23%3A53
dynamic_shared_memory_type should be using posix in this case.
Switching to mmap may help, perhaps... I need to test it.

Anyway, sorry for the noise on this one.
--
Michael

#28 Thomas Munro
thomas.munro@gmail.com
In reply to: Michael Paquier (#27)
Re: 011_crash_recovery.pl intermittently fails

On Wed, Jan 25, 2023 at 1:02 PM Michael Paquier <michael@paquier.xyz> wrote:

Well, this host has a problem with what looks like a kernel issue, I
guess... This is repeatable across all the branches, randomly, with
various errors from the POSIX DSM implementation:
# [63cf68b7.5e5a:1] ERROR: could not open shared memory segment "/PostgreSQL.543738922": No such file or directory
# [63cf68b7.5e58:6] ERROR: could not open shared memory segment "/PostgreSQL.543738922": No such file or directory

Something to do with
https://www.postgresql.org/docs/current/kernel-resources.html#SYSTEMD-REMOVEIPC
?

The failure I saw looked to me like a straight-up case of the bug
reported in this thread.

#29 Michael Paquier
michael@paquier.xyz
In reply to: Thomas Munro (#28)
Re: 011_crash_recovery.pl intermittently fails

On Wed, Jan 25, 2023 at 01:20:39PM +1300, Thomas Munro wrote:

Something to do with
https://www.postgresql.org/docs/current/kernel-resources.html#SYSTEMD-REMOVEIPC
?

Still, isn't this unrelated? This is a buildfarm instance, so the
backend does not run with systemd.

The failure I saw looked to me like a straight-up case of the bug
reported in this thread.

Okay...
--
Michael

#30 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#29)
Re: 011_crash_recovery.pl intermittently fails

Michael Paquier <michael@paquier.xyz> writes:

On Wed, Jan 25, 2023 at 01:20:39PM +1300, Thomas Munro wrote:

Something to do with
https://www.postgresql.org/docs/current/kernel-resources.html#SYSTEMD-REMOVEIPC
?

Still, isn't this unrelated? This is a buildfarm instance, so the
backend does not run with systemd.

That systemd behavior affects IPC resources regardless of what process
created them.

regards, tom lane

#31 Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#30)
Re: 011_crash_recovery.pl intermittently fails

On Tue, Jan 24, 2023 at 07:42:06PM -0500, Tom Lane wrote:

That systemd behavior affects IPC resources regardless of what process
created them.

Thanks, my memory was fuzzy regarding that. I am curious whether the
error in the recovery tests will persist with that set up. The next
run will be in a few hours, so let's see...
--
Michael

#32 Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#31)
Re: 011_crash_recovery.pl intermittently fails

On Wed, Jan 25, 2023 at 10:32:10AM +0900, Michael Paquier wrote:

Thanks, my memory was fuzzy regarding that. I am curious whether the
error in the recovery tests will persist with that set up. The next
run will be in a few hours, so let's see...

So it looks like tanager is able to reproduce the failure of this
thread much more frequently than the other animals:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tanager&dt=2023-01-25%2003%3A05%3A05

That's interesting. FWIW, this environment is just a Raspberry Pi 4
with 8GB of memory, using clang.
--
Michael