Failures in constraints regression test, "read only 0 of 8192 bytes"
These two animals seem to have got mixed up about the size of
this relation in the same place:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=avocet&dt=2024-02-28%2017%3A34%3A30
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2024-03-01%2006%3A47%3A53
+++ /home/buildfarm/trilobite/buildroot/HEAD/pgsql.build/src/test/regress/results/constraints.out
2024-03-01 08:22:11.624897033 +0100
@@ -573,42 +573,38 @@
UNIQUE (i) DEFERRABLE INITIALLY DEFERRED;
BEGIN;
INSERT INTO unique_tbl VALUES (1, 'five');
+ERROR: could not read blocks 0..0 in file "base/16384/21437": read
only 0 of 8192 bytes
That error message changed slightly in my smgrreadv() commit a couple
of months ago (it would have been "block 0" and now it's "blocks 0..0"
because now we can read more than one block at a time) but I don't
immediately see how anything at that low level could be responsible
for this.
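For readers not familiar with that message: it is raised when the kernel returns
fewer bytes than a full block for a block range the server expected to exist,
i.e. the file on disk is shorter than the relation is supposed to be. A minimal
standalone sketch of the condition (not PostgreSQL code; block size and wording
only approximate the real report):

/*
 * Standalone illustration (not PostgreSQL source) of the condition behind
 * "could not read blocks N..N ...: read only 0 of 8192 bytes": the file is
 * shorter than the caller believes, so pread() comes back short.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192                 /* PostgreSQL's default block size */

int
main(void)
{
    char        buf[BLCKSZ];
    unsigned    blocknum = 0;
    int         fd = open("empty_relation_file", O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return 1;

    /* The file is zero bytes long, but we ask for block 0 anyway. */
    ssize_t     nread = pread(fd, buf, BLCKSZ, (off_t) blocknum * BLCKSZ);

    if (nread >= 0 && nread < BLCKSZ)
        fprintf(stderr,
                "could not read blocks %u..%u: read only %zd of %d bytes\n",
                blocknum, blocknum, nread, BLCKSZ);

    close(fd);
    return 0;
}

In the failing runs the file really was shorter than the relation should have
been, which is what the rest of the thread digs into.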
These are "my" animals (running at a local university). There's a couple
interesting details:
1) the animals run on the same machine (one with gcc, one with clang)
2) I did upgrade the OS and restarted the machine on 2024/02/26, i.e.
right before the failures started
These might be just coincidences, but maybe something got broken by the
upgrade ... OTOH it's weird it'd affect just HEAD and none of the other
branches, and with two different compilers.
Just to be sure I removed the buildroot, in case there's something wrong
with ccache. It's a wild guess, but I don't have any other idea.
regards
On 3/2/24 22:39, Thomas Munro wrote:
These two animals seem to have got mixed up about the size of
this relation in the same place:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=avocet&dt=2024-02-28%2017%3A34%3A30
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2024-03-01%2006%3A47%3A53
+++ /home/buildfarm/trilobite/buildroot/HEAD/pgsql.build/src/test/regress/results/constraints.out
2024-03-01 08:22:11.624897033 +0100
@@ -573,42 +573,38 @@
UNIQUE (i) DEFERRABLE INITIALLY DEFERRED;
BEGIN;
INSERT INTO unique_tbl VALUES (1, 'five');
+ERROR: could not read blocks 0..0 in file "base/16384/21437": read
only 0 of 8192 bytes

That error message changed slightly in my smgrreadv() commit a couple
of months ago (it would have been "block 0" and now it's "blocks 0..0"
because now we can read more than one block at a time) but I don't
immediately see how anything at that low level could be responsible
for this.
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Saturday, March 2, 2024 at 23:29:52 CET, Tomas Vondra wrote:
These are "my" animals (running at a local university). There are a couple
of interesting details:
Hi Tomas,
do you still have the failing cluster data?
Noah pointed me to this thread, and it looks a bit similar to the FSM
corruption issue I'm facing:
https://www.postgresql.org/message-id/1925490.taCxCBeP46%40aivenlaptop
So if you still have the data, it would be nice to see if you indeed have a
corrupted FSM, and if you have any indication of when it happened.
Best regards,
--
Ronan Dunklau
On 3/4/24 14:16, Ronan Dunklau wrote:
On Saturday, March 2, 2024 at 23:29:52 CET, Tomas Vondra wrote:
These are "my" animals (running at a local university). There are a couple
of interesting details:

Hi Tomas,

do you still have the failing cluster data?

Noah pointed me to this thread, and it looks a bit similar to the FSM
corruption issue I'm facing:
https://www.postgresql.org/message-id/1925490.taCxCBeP46%40aivenlaptop

So if you still have the data, it would be nice to see if you indeed have a
corrupted FSM, and if you have any indication of when it happened.
Sorry, I nuked the buildroot so I don't have the data anymore. Let's see
if it fails again.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Happened again. I see this is OpenSUSE. Does that mean the file
system is Btrfs?
On 3/8/24 09:33, Thomas Munro wrote:
Happened again. I see this is OpenSUSE. Does that mean the file
system is Btrfs?
It is, but I don't think that matters - I've been able to reproduce this
locally on my laptop using ext4 filesystem. I'd bet the important piece
here is -DCLOBBER_CACHE_ALWAYS (and it seems avocet/trilobite are the
only animals running with this).
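For context on why that define matters here: with -DCLOBBER_CACHE_ALWAYS, every
point at which a backend would merely accept pending invalidations instead
throws the caches away wholesale, so anything holding a stale relcache or smgr
pointer across such a point gets caught almost immediately. Roughly (heavily
simplified, not the exact inval.c source, and the guard shown is an assumption
about the build option's spelling):

/*
 * Sketch of what a CLOBBER_CACHE_ALWAYS build effectively does: besides
 * applying whatever sinval messages are queued, every call behaves as if
 * the sinval queue had overflowed, discarding the caches entirely.
 */
void
AcceptInvalidationMessages(void)
{
    /* Normal path: apply queued invalidations; reset everything on overflow. */
    ReceiveSharedInvalidMessages(LocalExecuteInvalidationMessage,
                                 InvalidateSystemCaches);

#ifdef CLOBBER_CACHE_ALWAYS
    /* Debug path: pretend the queue overflowed every single time. */
    InvalidateSystemCachesExtended(true);
#endif
}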
Also, if this really is a filesystem (or environment) issue, it seems
very strange it'd only affect HEAD and not the other branches. So it
seems quite likely this is actually triggered by a commit.
Looking at the commits from the good/bad runs, I see this:
avocet: good=4c2369a bad=f5a465f
trilobite: good=d13ff82 bad=def0ce3
That means the commit would have to be somewhere here:
f5a465f1a07 Promote assertion about !ReindexIsProcessingIndex to ...
57b28c08305 Doc: fix minor typos in two ECPG function descriptions.
28e858c0f95 Improve documentation and GUC description for ...
a661bf7b0f5 Remove flaky isolation tests for timeouts
874d817baa1 Multiple revisions to the GROUP BY reordering tests
466979ef031 Replace lateral references to removed rels in subqueries
a6b2a51e16d Avoid dangling-pointer problem with partitionwise ...
d360e3cc60e Fix compiler warning on typedef redeclaration
8af25652489 Introduce a new smgr bulk loading facility.
e612384fc78 Fix mistake in SQL features list
d13ff82319c Fix BF failure in commit 93db6cbda0.
My guess would be 8af25652489, as it's the only storage-related commit.
I'm currently running tests to verify this.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 3/8/24 13:21, Tomas Vondra wrote:
On 3/8/24 09:33, Thomas Munro wrote:
Happened again. I see this is OpenSUSE. Does that mean the file
system is Btrfs?

It is, but I don't think that matters - I've been able to reproduce this
locally on my laptop using ext4 filesystem. I'd bet the important piece
here is -DCLOBBER_CACHE_ALWAYS (and it seems avocet/trilobite are the
only animals running with this).

Also, if this really is a filesystem (or environment) issue, it seems
very strange it'd only affect HEAD and not the other branches. So it
seems quite likely this is actually triggered by a commit.

Looking at the commits from the good/bad runs, I see this:

avocet: good=4c2369a bad=f5a465f
trilobite: good=d13ff82 bad=def0ce3

That means the commit would have to be somewhere here:

f5a465f1a07 Promote assertion about !ReindexIsProcessingIndex to ...
57b28c08305 Doc: fix minor typos in two ECPG function descriptions.
28e858c0f95 Improve documentation and GUC description for ...
a661bf7b0f5 Remove flaky isolation tests for timeouts
874d817baa1 Multiple revisions to the GROUP BY reordering tests
466979ef031 Replace lateral references to removed rels in subqueries
a6b2a51e16d Avoid dangling-pointer problem with partitionwise ...
d360e3cc60e Fix compiler warning on typedef redeclaration
8af25652489 Introduce a new smgr bulk loading facility.
e612384fc78 Fix mistake in SQL features list
d13ff82319c Fix BF failure in commit 93db6cbda0.

My guess would be 8af25652489, as it's the only storage-related commit.
I'm currently running tests to verify this.
Yup, the breakage starts with this commit. I haven't looked into the
root cause, or whether the commit maybe just made some pre-existing
issue easier to hit. Also, I haven't followed the discussion on the
pgsql-bugs thread [1], maybe there are some interesting findings.

[1]: /messages/by-id/1878547.tdWV9SEqCh@aivenlaptop
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Friday, March 8, 2024 at 14:36:48 CET, Tomas Vondra wrote:
My guess would be 8af25652489, as it's the only storage-related commit.
I'm currently running tests to verify this.
Yup, the breakage starts with this commit. I haven't looked into the
root cause, or whether the commit maybe just made some pre-existing
issue easier to hit. Also, I haven't followed the discussion on the
pgsql-bugs thread [1], maybe there are some interesting findings.
If that happens only on HEAD and not on 16, and doesn't involve WAL replay,
then it's not the same bug.
--
Ronan Dunklau
On Sat, Mar 9, 2024 at 2:36 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 3/8/24 13:21, Tomas Vondra wrote:
My guess would be 8af25652489, as it's the only storage-related commit.
I'm currently running tests to verify this.
Yup, the breakage starts with this commit. I haven't looked into the
root cause, or whether the commit maybe just made some pre-existing
issue easier to hit. Also, I haven't followed the discussion on the
pgsql-bugs thread [1], maybe there are some interesting findings.
Adding Heikki.
On 3/8/24 21:29, Thomas Munro wrote:
On Sat, Mar 9, 2024 at 2:36 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 3/8/24 13:21, Tomas Vondra wrote:
My guess would be 8af25652489, as it's the only storage-related commit.
I'm currently running tests to verify this.
Yup, the breakage starts with this commit. I haven't looked into the
root cause, or whether the commit maybe just made some pre-existing
issue easier to hit. Also, I haven't followed the discussion on the
pgsql-bugs thread [1], maybe there are some interesting findings.

Adding Heikki.
I spent a bit of time investigating this today, but haven't made much
progress due to (a) my unfamiliarity with the smgr code in general and
the patch in particular, and (b) CLOBBER_CACHE_ALWAYS making it quite
time consuming to iterate and experiment.
However, the smallest schedule that still reproduces the issue is:
-------------------
test: test_setup
test: create_aggregate create_function_sql create_cast constraints
triggers select inherit typed_table vacuum drop_if_exists
updatable_views roleattributes create_am hash_func errors infinite_recurse
-------------------
I tried to reduce the second step to just "constraints", but that does
not fail. Clearly there's some concurrency at play, and having just a
single backend makes that go away.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Mar 9, 2024 at 9:48 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
I spent a bit of time investigating this today, but haven't made much
progress due to (a) my unfamiliarity with the smgr code in general and
the patch in particular, and (b) CLOBBER_CACHE_ALWAYS making it quite
time consuming to iterate and experiment.

However, the smallest schedule that still reproduces the issue is:
-------------------
test: test_setup
test: create_aggregate create_function_sql create_cast constraints
triggers select inherit typed_table vacuum drop_if_exists
updatable_views roleattributes create_am hash_func errors infinite_recurse
-------------------
Thanks, reproduced here (painfully slowly). Looking...
Huh, also look at these extra problems later in the logs of the latest
trilobite and avocet runs, further down after the short read errors:
TRAP: failed Assert("j > attnum"), File: "heaptuple.c", Line: 640, PID: 15753
postgres: autovacuum worker regression(ExceptionalCondition+0x67)[0x9c5f37]
postgres: autovacuum worker regression[0x4b60c8]
postgres: autovacuum worker regression[0x5ff735]
postgres: autovacuum worker regression[0x5fe468]
postgres: autovacuum worker regression(analyze_rel+0x133)[0x5fd5e3]
postgres: autovacuum worker regression(vacuum+0x6b6)[0x683926]
postgres: autovacuum worker regression[0x7ce5e3]
postgres: autovacuum worker regression[0x7cc4f0]
postgres: autovacuum worker regression(StartAutoVacWorker+0x22)[0x7cc152]
postgres: autovacuum worker regression[0x7d57d1]
postgres: autovacuum worker regression[0x7d37bf]
postgres: autovacuum worker regression[0x6f5a4f]
Then crash recovery fails, in one case with:
2024-03-07 20:28:18.632 CET [15860:4] WARNING: will not overwrite a used ItemId
2024-03-07 20:28:18.632 CET [15860:5] CONTEXT: WAL redo at 0/FB07A48
for Heap/INSERT: off: 9, flags: 0x00; blkref #0: rel 1663/16384/2610,
blk 12
2024-03-07 20:28:18.632 CET [15860:6] PANIC: failed to add tuple
2024-03-07 20:28:18.632 CET [15860:7] CONTEXT: WAL redo at 0/FB07A48
for Heap/INSERT: off: 9, flags: 0x00; blkref #0: rel 1663/16384/2610,
blk 12
... and in another with:
2024-03-05 11:51:27.992 CET [25510:4] PANIC: invalid lp
2024-03-05 11:51:27.992 CET [25510:5] CONTEXT: WAL redo at 0/F87A8D0
for Heap/INPLACE: off: 29; blkref #0: rel 1663/16384/20581, blk 0
On Sun, Mar 10, 2024 at 5:02 PM Thomas Munro <thomas.munro@gmail.com> wrote:
Thanks, reproduced here (painfully slowly). Looking...
I changed that ERROR to a PANIC and now I can see that
_bt_metaversion() is failing to read a meta page (block 0), and the
file is indeed of size 0 in my filesystem. Which is not cool, for a
btree. Looking at btbuildempty(), we have this sequence:
bulkstate = smgr_bulk_start_rel(index, INIT_FORKNUM);
/* Construct metapage. */
metabuf = smgr_bulk_get_buf(bulkstate);
_bt_initmetapage((Page) metabuf, P_NONE, 0, allequalimage);
smgr_bulk_write(bulkstate, BTREE_METAPAGE, metabuf, true);
smgr_bulk_finish(bulkstate);
Ooh. One idea would be that the smgr lifetime stuff is b0rked,
introducing corruption. Bulk write itself isn't pinning the smgr
relation, it's relying purely on the object not being invalidated,
which the theory of 21d9c3ee's commit message allowed for but ... here
it's destroyed (HASH_REMOVE'd) sooner under CLOBBER_CACHE_ALWAYS,
which I think we failed to grok. If that's it, I'm surprised that
things don't implode more spectacularly. Perhaps HASH_REMOVE should
clobber objects in debug builds, similar to pfree?
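The clobber-on-remove idea, as a standalone toy rather than a dynahash patch
(names invented for the example): fill the removed entry with a recognizable
pattern, the way CLOBBER_FREED_MEMORY does for pfree, so any later dereference
of a stale pointer trips over obvious garbage instead of silently reading
plausible-looking data.

/*
 * Toy illustration of "clobber on remove": overwrite a removed cache entry
 * with a sentinel byte pattern so that code still holding a pointer to it
 * fails loudly.  All names here are invented for the example.
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct ToyEntry
{
    uint32_t    key;
    int         fd;             /* pretend this is a cached file descriptor */
} ToyEntry;

static ToyEntry table[4];       /* stand-in for a hash table's entry storage */

static void
toy_remove(ToyEntry *entry)
{
#ifdef USE_ASSERT_CHECKING      /* debug builds only, like CLOBBER_FREED_MEMORY */
    memset(entry, 0x7F, sizeof(*entry));
#endif
    /* (a real hash table would also unlink the entry here) */
}

int
main(void)
{
    ToyEntry   *stale = &table[0];

    stale->key = 42;
    stale->fd = 3;

    toy_remove(stale);          /* entry is gone, but we still hold "stale" */

    /* With clobbering enabled, the bogus value makes the bug obvious at once. */
    printf("fd after remove: %d\n", stale->fd);
    assert(stale->fd == 3);     /* fails in a clobbering (assert-enabled) build */
    return 0;
}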
For that hypothesis, the corruption might not be happening in the
above-quoted code itself, because it doesn't seem to have an
invalidation acceptance point (unless I'm missing it). Some other
bulk write got mixed up? Not sure yet.
I won't be surprised if the answer is: if you're holding a reference,
you have to get a pin (referring to bulk_write.c).
On Sun, Mar 10, 2024 at 6:48 PM Thomas Munro <thomas.munro@gmail.com> wrote:
I won't be surprised if the answer is: if you're holding a reference,
you have to get a pin (referring to bulk_write.c).
Ahhh, on second thoughts, I take that back, I think the original
theory still actually works just fine. It's just that somewhere in
our refactoring of that commit, when we were vacillating between
different semantics for 'destroy' and 'release', I think we made a
mistake: in RelationCacheInvalidate() I think we should now call
smgrreleaseall(), not smgrdestroyall(). That satisfies the
requirements for sinval queue overflow: we close md.c segments (and
most importantly virtual file descriptors), so our lack of sinval
records can't hurt us, we'll reopen all files as required. That's
what CLOBBER_CACHE_ALWAYS is effectively testing (and more). But the
smgr pointer remains valid, and retains only its "identity", eg hash
table key, and that's also fine. It won't be destroyed until after
the end of the transaction. Which was the point, and it allows things
like bulk_write.c (and streaming_read.c) to hold an smgr reference.
Right?
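The distinction being relied on there can be shown with a standalone toy (not
the real smgr code): "release" closes the kernel resources but keeps the struct
and its identity alive, so a holder of a bare pointer, like bulk_write.c, just
causes a transparent reopen on next access; "destroy" frees the struct and
leaves that holder dangling.

/*
 * Toy contrast of the two operations (not the real smgr code).
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

typedef struct ToyRel
{
    char        path[64];       /* the handle's identity, like the hash key */
    int         fd;             /* cached kernel resource, -1 if closed */
} ToyRel;

static ToyRel *
toy_open(const char *path)
{
    ToyRel     *rel = malloc(sizeof(ToyRel));

    snprintf(rel->path, sizeof(rel->path), "%s", path);
    rel->fd = -1;               /* opened lazily on first access */
    return rel;
}

static void
toy_release(ToyRel *rel)
{
    if (rel->fd >= 0)
        close(rel->fd);
    rel->fd = -1;               /* identity survives; next access reopens */
}

static void
toy_destroy(ToyRel *rel)
{
    toy_release(rel);
    free(rel);                  /* anyone still holding "rel" now dangles */
}

static int
toy_access_ok(ToyRel *rel)
{
    if (rel->fd < 0)
        rel->fd = open(rel->path, O_RDONLY);    /* lazy reopen */
    return rel->fd >= 0;
}

int
main(void)
{
    ToyRel     *held = toy_open("/etc/hostname");   /* a long-lived reference */

    toy_release(held);          /* fine: the handle reopens transparently */
    printf("after release: %s\n", toy_access_ok(held) ? "ok" : "failed");

    toy_destroy(held);          /* from here on, any use of "held" is a bug */
    return 0;
}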
Thanks for diagnosing this!
On 10/03/2024 08:23, Thomas Munro wrote:
On Sun, Mar 10, 2024 at 6:48 PM Thomas Munro <thomas.munro@gmail.com> wrote:
I won't be surprised if the answer is: if you're holding a reference,
you have to get a pin (referring to bulk_write.c).

Ahhh, on second thoughts, I take that back, I think the original
theory still actually works just fine. It's just that somewhere in
our refactoring of that commit, when we were vacillating between
different semantics for 'destroy' and 'release', I think we made a
mistake: in RelationCacheInvalidate() I think we should now call
smgrreleaseall(), not smgrdestroyall().
Yes, I ran the reproducer with 'rr', and came to the same conclusion.
That smgrdestroyall() call destroys the SmgrRelation, breaking
the new assumption that an unpinned SmgrRelation is valid until end of
transaction.
That satisfies the
requirements for sinval queue overflow: we close md.c segments (and
most importantly virtual file descriptors), so our lack of sinval
records can't hurt us, we'll reopen all files as required. That's
what CLOBBER_CACHE_ALWAYS is effectively testing (and more). But the
smgr pointer remains valid, and retains only its "identity", eg hash
table key, and that's also fine. It won't be destroyed until after
the end of the transaction. Which was the point, and it allows things
like bulk_write.c (and streaming_read.c) to hold an smgr reference.
Right?
+1.
I wonder if we can now simplify RelationCacheInvalidate() further. In
the loop above that smgrdestroyall():
while ((idhentry = (RelIdCacheEnt *) hash_seq_search(&status)) != NULL)
{
    relation = idhentry->reldesc;

    /* Must close all smgr references to avoid leaving dangling ptrs */
    RelationCloseSmgr(relation);

    /*
     * Ignore new relations; no other backend will manipulate them before
     * we commit. Likewise, before replacing a relation's relfilelocator,
     * we shall have acquired AccessExclusiveLock and drained any
     * applicable pending invalidations.
     */
    if (relation->rd_createSubid != InvalidSubTransactionId ||
        relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
        continue;
I don't think we need to call RelationCloseSmgr() for relations created
in the same transaction anymore.
--
Heikki Linnakangas
Neon (https://neon.tech)
On 10/03/2024 11:20, Heikki Linnakangas wrote:
On 10/03/2024 08:23, Thomas Munro wrote:
On Sun, Mar 10, 2024 at 6:48 PM Thomas Munro <thomas.munro@gmail.com> wrote:
I won't be surprised if the answer is: if you're holding a reference,
you have to get a pin (referring to bulk_write.c).

Ahhh, on second thoughts, I take that back, I think the original
theory still actually works just fine. It's just that somewhere in
our refactoring of that commit, when we were vacillating between
different semantics for 'destroy' and 'release', I think we made a
mistake: in RelationCacheInvalidate() I think we should now call
smgrreleaseall(), not smgrdestroyall().

Yes, I ran the reproducer with 'rr', and came to the same conclusion.
That smgrdestroyall() call destroys the SmgrRelation, breaking
the new assumption that an unpinned SmgrRelation is valid until end of
transaction.

That satisfies the
requirements for sinval queue overflow: we close md.c segments (and
most importantly virtual file descriptors), so our lack of sinval
records can't hurt us, we'll reopen all files as required. That's
what CLOBBER_CACHE_ALWAYS is effectively testing (and more). But the
smgr pointer remains valid, and retains only its "identity", eg hash
table key, and that's also fine. It won't be destroyed until after
the end of the transaction. Which was the point, and it allows things
like bulk_write.c (and streaming_read.c) to hold an smgr reference.
Right?

+1.

I wonder if we can now simplify RelationCacheInvalidate() further. In
the loop above that smgrdestroyall():

while ((idhentry = (RelIdCacheEnt *) hash_seq_search(&status)) != NULL)
{
    relation = idhentry->reldesc;

    /* Must close all smgr references to avoid leaving dangling ptrs */
    RelationCloseSmgr(relation);

    /*
     * Ignore new relations; no other backend will manipulate them before
     * we commit. Likewise, before replacing a relation's relfilelocator,
     * we shall have acquired AccessExclusiveLock and drained any
     * applicable pending invalidations.
     */
    if (relation->rd_createSubid != InvalidSubTransactionId ||
        relation->rd_firstRelfilelocatorSubid != InvalidSubTransactionId)
        continue;

I don't think we need to call RelationCloseSmgr() for relations created
in the same transaction anymore.
Barring objections, I'll commit the attached.
Hmm, I'm not sure if we need even smgrreleaseall() here anymore. It's
not required for correctness AFAICS. We don't do it in single-rel
invalidation in RelationCacheInvalidateEntry() either.
--
Heikki Linnakangas
Neon (https://neon.tech)
Attachments:
0001-Don-t-destroy-SMgrRelations-at-relcache-invalidation.patch (text/x-patch)
From be03591a4c69224de5b96335aa4a29b3462e2bc9 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Sun, 10 Mar 2024 22:18:59 +0200
Subject: [PATCH 1/1] Don't destroy SMgrRelations at relcache invalidation
With commit 21d9c3ee4e, SMgrRelations remain valid until end of
transaction (or longer if they're "pinned"). Relcache invalidation can
happen in the middle of a transaction, so we must not destroy them at
relcache invalidation anymore.
This was revealed by failures in the 'constraints' test in buildfarm
animals using -DCLOBBER_CACHE_ALWAYS. That started failing with commit
8af2565248, which was the first commit that started to rely on an
SMgrRelation living until end of transaction.
Diagnosed-by: Tomas Vondra, Thomas Munro
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGK%2B5DOmLaBp3Z7C4S-Yv6yoROvr1UncjH2S1ZbPT8D%2BZg%40mail.gmail.com
---
src/backend/utils/cache/relcache.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 7ad6a35a50e..2cd19d603fb 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2985,9 +2985,6 @@ RelationCacheInvalidate(bool debug_discard)
     {
         relation = idhentry->reldesc;
 
-        /* Must close all smgr references to avoid leaving dangling ptrs */
-        RelationCloseSmgr(relation);
-
         /*
          * Ignore new relations; no other backend will manipulate them before
          * we commit. Likewise, before replacing a relation's relfilelocator,
@@ -3039,11 +3036,10 @@ RelationCacheInvalidate(bool debug_discard)
     }
 
     /*
-     * Now zap any remaining smgr cache entries. This must happen before we
-     * start to rebuild entries, since that may involve catalog fetches which
-     * will re-open catalog files.
+     * We cannot destroy the SMgrRelations as there might still be references
+     * to them, but close the underlying file descriptors.
      */
-    smgrdestroyall();
+    smgrreleaseall();
 
     /* Phase 2: rebuild the items found to need rebuild in phase 1 */
     foreach(l, rebuildFirstList)
--
2.39.2
On Mon, Mar 11, 2024 at 9:30 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Barring objections, I'll commit the attached.
+1
I guess the comment for smgrreleaseall() could also be updated. It
mentions only PROCSIGNAL_BARRIER_SMGRRELEASE, but I think sinval
overflow (InvalidateSystemCaches()) should also be mentioned?
Hmm, I'm not sure if we need even smgrreleaseall() here anymore. It's
not required for correctness AFAICS. We don't do it in single-rel
invalidation in RelationCacheInvalidateEntry() either.
I think we do, because we have missed sinval messages. It's unlikely
but a relfilenode might have been recycled, and we might have file
descriptors that point to the unlinked files. That is, there are new
files with the same names and we need to open those ones.
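The hazard described there is easy to demonstrate outside the server: a cached
descriptor keeps reading the old, unlinked file after the name has been reused,
until it is closed and reopened. A standalone sketch (not PostgreSQL code):

/*
 * Why cached file descriptors must be dropped after invalidations may have
 * been missed: if the file name was recycled, the old descriptor still
 * reads the unlinked previous file, while a fresh open() sees the new one.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void
write_file(const char *path, const char *contents)
{
    int         fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd >= 0)
    {
        (void) write(fd, contents, strlen(contents));
        close(fd);
    }
}

int
main(void)
{
    char        buf[64] = {0};

    write_file("relfile", "old contents\n");

    /* Cache a descriptor, the way md.c keeps open segments around. */
    int         cached_fd = open("relfile", O_RDONLY);

    /* Meanwhile, "someone else" drops and recreates the same name. */
    unlink("relfile");
    write_file("relfile", "new contents\n");

    /* The cached descriptor still points at the old, unlinked file. */
    (void) pread(cached_fd, buf, sizeof(buf) - 1, 0);
    printf("via cached fd: %s", buf);

    /* Close and reopen -- what releasing the smgr handles enables. */
    close(cached_fd);
    cached_fd = open("relfile", O_RDONLY);
    memset(buf, 0, sizeof(buf));
    (void) pread(cached_fd, buf, sizeof(buf) - 1, 0);
    printf("after reopen:  %s", buf);

    close(cached_fd);
    unlink("relfile");
    return 0;
}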
On Mon, Mar 11, 2024 at 9:59 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Mon, Mar 11, 2024 at 9:30 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Hmm, I'm not sure if we need even smgrreleaseall() here anymore. It's
not required for correctness AFAICS. We don't do it in single-rel
invalidation in RelationCacheInvalidateEntry() either.

I think we do, because we have missed sinval messages. It's unlikely
but a relfilenode might have been recycled, and we might have file
descriptors that point to the unlinked files. That is, there are new
files with the same names and we need to open those ones.
... though I think you would be right if Dilip and Robert had
succeeded in their quest to introduce 56-bit non-cycling relfilenodes.
And for the record, we can also shoot ourselves in the foot in another
known case without sinval [1], so more work is needed here, but that
doesn't mean this sinval code path should also aim footwards.
[1]: /messages/by-id/CA+hUKGLs554tQFCUjv_vn7ft9Xv5LNjPoAd--3Df+JJKJ7A8kw@mail.gmail.com
On 10/03/2024 22:59, Thomas Munro wrote:
On Mon, Mar 11, 2024 at 9:30 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Barring objections, I'll commit the attached.
+1
Pushed, thanks!
I guess the comment for smgrreleaseall() could also be updated. It
mentions only PROCSIGNAL_BARRIER_SMGRRELEASE, but I think sinval
overflow (InvalidateSystemCaches()) should also be mentioned?
I removed that comment; people can grep to find the callers.
Hmm, I'm not sure if we need even smgrreleaseall() here anymore. It's
not required for correctness AFAICS. We don't do it in single-rel
invalidation in RelationCacheInvalidateEntry() either.

I think we do, because we have missed sinval messages. It's unlikely
but a relfilenode might have been recycled, and we might have file
descriptors that point to the unlinked files. That is, there are new
files with the same names and we need to open those ones.
Gotcha.
--
Heikki Linnakangas
Neon (https://neon.tech)
Hello Heikki,
11.03.2024 10:09, Heikki Linnakangas wrote:
On 10/03/2024 22:59, Thomas Munro wrote:
On Mon, Mar 11, 2024 at 9:30 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Barring objections, I'll commit the attached.
+1
Pushed, thanks!
Please look at a new anomaly that I've discovered in master.
Starting from af0e7deb4, the following script:
numjobs=80
for ((i=1;i<=50;i++)); do
echo "I $i"
for ((j=1;j<=numjobs;j++)); do createdb db$j; done
for ((j=1;j<=numjobs;j++)); do
echo "
VACUUM FULL pg_class;
REINDEX INDEX pg_class_oid_index;
" | psql -d db$j >/dev/null 2>&1 &
echo "
CREATE TABLE tbl1 (t text);
DROP TABLE tbl1;
" | psql -d db$j >/dev/null 2>&1 &
done
wait
grep 'was terminated' server.log && break;
for ((j=1;j<=numjobs;j++)); do dropdb db$j; done
done
triggers a segfault:
2024-06-19 19:22:49.009 UTC [1607210:6] LOG: server process (PID 1607671) was terminated by signal 11: Segmentation fault
with the following stack trace:
Core was generated by `postgres: law db50 [local] CREATE TABLE '.
Program terminated with signal SIGSEGV, Segmentation fault.
warning: Section `.reg-xstate/1607671' in core file too small.
#0 0x000055d04cb8232e in RelationReloadNailed (relation=0x7f7d0a1b1fd8) at relcache.c:2415
2415 relp = (Form_pg_class) GETSTRUCT(pg_class_tuple);
(gdb) bt
#0 0x000055d04cb8232e in RelationReloadNailed (relation=0x7f7d0a1b1fd8) at relcache.c:2415
#1 0x000055d04cb8278c in RelationClearRelation (relation=0x7f7d0a1b1fd8, rebuild=true) at relcache.c:2560
#2 0x000055d04cb834e9 in RelationCacheInvalidate (debug_discard=false) at relcache.c:3048
#3 0x000055d04cb72e3c in InvalidateSystemCachesExtended (debug_discard=false) at inval.c:680
#4 0x000055d04cb73190 in InvalidateSystemCaches () at inval.c:794
#5 0x000055d04c9754ad in ReceiveSharedInvalidMessages (
invalFunction=0x55d04cb72eee <LocalExecuteInvalidationMessage>,
resetFunction=0x55d04cb7317e <InvalidateSystemCaches>) at sinval.c:105
#6 0x000055d04cb731b4 in AcceptInvalidationMessages () at inval.c:808
#7 0x000055d04c97d2ed in LockRelationOid (relid=2662, lockmode=1) at lmgr.c:136
#8 0x000055d04c404b1f in relation_open (relationId=2662, lockmode=1) at relation.c:55
#9 0x000055d04c47941e in index_open (relationId=2662, lockmode=1) at indexam.c:137
#10 0x000055d04c4787c2 in systable_beginscan (heapRelation=0x7f7d0a1b1fd8, indexId=2662, indexOK=true, snapshot=0x0,
nkeys=1, key=0x7ffd456a8570) at genam.c:396
#11 0x000055d04cb7e93d in ScanPgRelation (targetRelId=3466, indexOK=true, force_non_historic=false) at relcache.c:381
#12 0x000055d04cb7fe15 in RelationBuildDesc (targetRelId=3466, insertIt=true) at relcache.c:1093
#13 0x000055d04cb81c93 in RelationIdGetRelation (relationId=3466) at relcache.c:2108
#14 0x000055d04c404b29 in relation_open (relationId=3466, lockmode=1) at relation.c:58
#15 0x000055d04cb720a6 in BuildEventTriggerCache () at evtcache.c:129
#16 0x000055d04cb71f6a in EventCacheLookup (event=EVT_SQLDrop) at evtcache.c:68
#17 0x000055d04c61c037 in trackDroppedObjectsNeeded () at event_trigger.c:1342
#18 0x000055d04c61bf02 in EventTriggerBeginCompleteQuery () at event_trigger.c:1284
#19 0x000055d04c9ac744 in ProcessUtilitySlow (pstate=0x55d04d04aca0, pstmt=0x55d04d021540,
queryString=0x55d04d020830 "CREATE TABLE tbl1 (t text);", context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
queryEnv=0x0, dest=0x55d04d021800, qc=0x7ffd456a8fb0) at utility.c:1107
#20 0x000055d04c9ac64d in standard_ProcessUtility (pstmt=0x55d04d021540,
queryString=0x55d04d020830 "CREATE TABLE tbl1 (t text);", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL,
params=0x0, queryEnv=0x0, dest=0x55d04d021800, qc=0x7ffd456a8fb0) at utility.c:1067
#21 0x000055d04c9ab54d in ProcessUtility (pstmt=0x55d04d021540,
queryString=0x55d04d020830 "CREATE TABLE tbl1 (t text);", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL,
params=0x0, queryEnv=0x0, dest=0x55d04d021800, qc=0x7ffd456a8fb0) at utility.c:523
#22 0x000055d04c9a9dc6 in PortalRunUtility (portal=0x55d04d0c1020, pstmt=0x55d04d021540, isTopLevel=true,
setHoldSnapshot=false, dest=0x55d04d021800, qc=0x7ffd456a8fb0) at pquery.c:1158
#23 0x000055d04c9aa03d in PortalRunMulti (portal=0x55d04d0c1020, isTopLevel=true, setHoldSnapshot=false,
dest=0x55d04d021800, altdest=0x55d04d021800, qc=0x7ffd456a8fb0) at pquery.c:1315
#24 0x000055d04c9a9487 in PortalRun (portal=0x55d04d0c1020, count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x55d04d021800, altdest=0x55d04d021800, qc=0x7ffd456a8fb0) at pquery.c:791
#25 0x000055d04c9a2150 in exec_simple_query (query_string=0x55d04d020830 "CREATE TABLE tbl1 (t text);")
at postgres.c:1273
#26 0x000055d04c9a71f5 in PostgresMain (dbname=0x55d04d05dbf0 "db50", username=0x55d04d05dbd8 "law") at postgres.c:4675
#27 0x000055d04c8bd8c8 in BackendRun (port=0x55d04d0540c0) at postmaster.c:4475
#28 0x000055d04c8bcedd in BackendStartup (port=0x55d04d0540c0) at postmaster.c:4151
#29 0x000055d04c8b93a9 in ServerLoop () at postmaster.c:1769
#30 0x000055d04c8b8ca4 in PostmasterMain (argc=3, argv=0x55d04d01a770) at postmaster.c:1468
#31 0x000055d04c76c577 in main (argc=3, argv=0x55d04d01a770) at main.c:197
(gdb) p pg_class_tuple
$1 = (HeapTuple) 0x0
server.log might also contain:
2024-06-19 19:25:38.060 UTC [1618682:5] psql ERROR: could not read blocks 3..3 in file "base/28531/28840": read only 0
of 8192 bytes
Best regards,
Alexander
On Wed, Jun 19, 2024 at 11:00:00PM +0300, Alexander Lakhin wrote:
Starting from af0e7deb4, the following script:
[...]
triggers a segfault:
2024-06-19 19:22:49.009 UTC [1607210:6] LOG: server process (PID
1607671) was terminated by signal 11: Segmentation fault
Open item added for this issue.
--
Michael
On 19/06/2024 23:00, Alexander Lakhin wrote:
Please look at a new anomaly that I've discovered in master.
...
triggers a segfault:
2024-06-19 19:22:49.009 UTC [1607210:6] LOG: server process (PID 1607671) was terminated by signal 11: Segmentation fault
...
server.log might also contain:
2024-06-19 19:25:38.060 UTC [1618682:5] psql ERROR: could not read blocks 3..3 in file "base/28531/28840": read only 0
of 8192 bytes
Thanks for the report! I was not able to reproduce the segfault, but I
do see the "could not read blocks" error very quickly with the script.
In commit af0e7deb4a, I removed the call to RelationCloseSmgr() from
RelationCacheInvalidate(). I thought it was no longer needed, because we
no longer free the underlying SmgrRelation.
However, it meant that if the relfilenode of the relation was changed,
the relation keeps pointing to the SMgrRelation of the old relfilenode.
So we still need the RelationCloseSmgr() call, in case the relfilenode
has changed.
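The failure mode can be sketched at the relcache level too (simplified, not the
verbatim rel.h inline, and renamed to make that clear): the entry caches an smgr
handle lazily and then trusts it blindly, so nothing notices that rd_locator now
names a different relfilenumber until the old handle is closed.

/*
 * Simplified sketch of how a relcache entry caches its smgr handle (not the
 * verbatim source).  If rd_locator has been updated to a new relfilenumber
 * but the old handle was never closed, every access keeps going to the old
 * file -- which is why RelationCloseSmgr() is needed when the relfilelocator
 * may have changed.
 */
static inline SMgrRelation
RelationGetSmgrSketch(Relation rel)
{
    if (rel->rd_smgr == NULL)
        rel->rd_smgr = smgropen(rel->rd_locator, rel->rd_backend);

    /* Note what is *not* here: no check that rd_smgr matches rd_locator. */
    return rel->rd_smgr;
}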
Per attached patch.
--
Heikki Linnakangas
Neon (https://neon.tech)
Attachments:
0001-Fix-relcache-invalidation-when-relfilelocator-is-upd.patch (text/x-patch)
From 17a1e68a636edd87a34acf34dace2a696af81a2d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 21 Jun 2024 01:45:53 +0300
Subject: [PATCH 1/1] Fix relcache invalidation when relfilelocator is updated
In commit af0e7deb4a, I removed this call to RelationCloseSmgr(),
because the dangling SMgrRelation was no longer an issue. However, we
still need to call it in case the relation's relfilelocator has changed,
so that we open the new relfile on the next access.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/987b1c8c-8c91-4847-ca0e-879f421680ff%40gmail.com
---
src/backend/utils/cache/relcache.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 35dbb87ae3..f45880d96d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3032,6 +3032,9 @@ RelationCacheInvalidate(bool debug_discard)
     {
         relation = idhentry->reldesc;
 
+        /* Close all smgr references, in case the relfilelocator has changed */
+        RelationCloseSmgr(relation);
+
         /*
          * Ignore new relations; no other backend will manipulate them before
          * we commit. Likewise, before replacing a relation's relfilelocator,
--
2.39.2
Heikki Linnakangas <hlinnaka@iki.fi> writes:
In commit af0e7deb4a, I removed the call to RelationCloseSmgr() from
RelationCacheInvalidate(). I thought it was no longer needed, because we
no longer free the underlying SmgrRelation.
However, it meant that if the relfilenode of the relation was changed,
the relation keeps pointing to the SMgrRelation of the old relfilenode.
So we still need the RelationCloseSmgr() call, in case the relfilenode
has changed.
Ouch. How come we did not see this immediately in testing? I'd have
thought surely such a bug would be exposed by any command that
rewrites a heap.
regards, tom lane
On 21/06/2024 02:12, Tom Lane wrote:
Heikki Linnakangas <hlinnaka@iki.fi> writes:
In commit af0e7deb4a, I removed the call to RelationCloseSmgr() from
RelationCacheInvalidate(). I thought it was no longer needed, because we
no longer free the underlying SmgrRelation.

However, it meant that if the relfilenode of the relation was changed,
the relation keeps pointing to the SMgrRelation of the old relfilenode.
So we still need the RelationCloseSmgr() call, in case the relfilenode
has changed.

Ouch. How come we did not see this immediately in testing? I'd have
thought surely such a bug would be exposed by any command that
rewrites a heap.
There is a RelationCloseSmgr() call in RelationClearRelation(), which
covers the common cases. This only occurs during
RelationCacheInvalidate(), when pg_class's relfilenumber was changed.
Hmm, looking closer, I think this might be a more appropriate place for
the RelationCloseSmgr() call:
    /*
     * If it's a mapped relation, immediately update its rd_locator in
     * case its relfilenumber changed. We must do this during phase 1
     * in case the relation is consulted during rebuild of other
     * relcache entries in phase 2. It's safe since consulting the
     * map doesn't involve any access to relcache entries.
     */
    if (RelationIsMapped(relation))
        RelationInitPhysicalAddr(relation);
That's where we change the relfilenumber, before the
RelationClearRelation() call.
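Spelled out, the placement being suggested would look roughly like this (a
sketch only, not the committed hunk):

    /*
     * When a mapped relation's relfilenumber may have changed, close the
     * old smgr handle right where rd_locator is refreshed, so that the
     * next access opens the new file.
     */
    if (RelationIsMapped(relation))
    {
        RelationCloseSmgr(relation);
        RelationInitPhysicalAddr(relation);
    }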
--
Heikki Linnakangas
Neon (https://neon.tech)
On 21/06/2024 02:25, Heikki Linnakangas wrote:
Hmm, looking closer, I think this might be a more appropriate place for
the RelationCloseSmgr() call:

    /*
     * If it's a mapped relation, immediately update its rd_locator in
     * case its relfilenumber changed. We must do this during phase 1
     * in case the relation is consulted during rebuild of other
     * relcache entries in phase 2. It's safe since consulting the
     * map doesn't involve any access to relcache entries.
     */
    if (RelationIsMapped(relation))
        RelationInitPhysicalAddr(relation);

That's where we change the relfilenumber, before the
RelationClearRelation() call.
Pushed a fix that way.
--
Heikki Linnakangas
Neon (https://neon.tech)