pg_dump crash on identity sequence with not loaded attributes

Started by Artur Zakirovover 1 year ago13 messagesbugs
Jump to latest
#1Artur Zakirov
zaartur@gmail.com

Hi all,

pg_dump crashes when a table is added into an extension and its
identity sequence isn't. It can be reproduced by pgmq by the following
way:

git clone git@github.com:tembo-io/pgmq.git
cd pgmq && git checkout v1.4.5
make clean && make install
psql -U postgres -c "create extension if not exists pgmq; SELECT
pgmq.create('my_queue'); select pgmq.send('my_queue', '\"asdf\"');"
pg_dump -d postgres -U postgres -Fc -f backup

pg_dump will crash. The backtrace:

* thread #1, queue = 'com.apple.main-thread', stop reason =
EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x000000010001ce24 pg_dump`dumpTable [inlined]
dumpSequence(fout=0x00000001438045a0, tbinfo=0x00000001280205b0) at
pg_dump.c:17832:15 [opt]
17829
fmtQualifiedDumpable(owning_tab));
17830 appendPQExpBuffer(query,
17831
"ALTER COLUMN %s ADD GENERATED ",
-> 17832
fmtId(owning_tab->attnames[tbinfo->owning_col - 1]));
17833 if
(owning_tab->attidentity[tbinfo->owning_col - 1] ==
ATTRIBUTE_IDENTITY_ALWAYS)
17834 appendPQExpBufferStr(query, "ALWAYS");
17835 else if
(owning_tab->attidentity[tbinfo->owning_col - 1] ==
ATTRIBUTE_IDENTITY_BY_DEFAULT)
warning: pg_dump was compiled with optimization - stepping may
behave oddly; variables may not be available.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason =
EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x000000010001ce24 pg_dump`dumpTable [inlined]
dumpSequence(fout=0x00000001438045a0, tbinfo=0x00000001280205b0) at
pg_dump.c:17832:15 [opt]
frame #1: 0x000000010001cc94
pg_dump`dumpTable(fout=0x00000001438045a0, tbinfo=0x00000001280205b0)
at pg_dump.c:15720:4 [opt]
frame #2: 0x0000000100004f60 pg_dump`main [inlined]
dumpDumpableObject(fout=0x00000001438045a0, dobj=0x00000001280205b0)
at pg_dump.c:10616:4 [opt]
frame #3: 0x0000000100004eac pg_dump`main(argc=<unavailable>,
argv=<unavailable>) at pg_dump.c:1085:3 [opt]
frame #4: 0x000000019429c274 dyld`start + 2840

I created a PR for pgmq to fix the issue on their side:
https://github.com/tembo-io/pgmq/pull/352

While the client should probably include the sequence together with
the table and the issue is partially on their side, I think pg_dump
shouldn't crash in such cases. I could reproduce it at least since
Postgres 15 to master.

The attached patch fixes the issue on pg_dump side by ignoring
identity sequences if their table attributes are not loaded. I'm not
sure if this is the best fix. It might be better to raise an error
instead to let a user know that they should add the sequence to the
extension too.

--
Kind regards,
Artur

Attachments:

pg_dump_ignore_notinteresting_sequence.patchapplication/octet-stream; name=pg_dump_ignore_notinteresting_sequence.patchDownload+2-1
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Artur Zakirov (#1)
Re: pg_dump crash on identity sequence with not loaded attributes

Artur Zakirov <zaartur@gmail.com> writes:

pg_dump crashes when a table is added into an extension and its
identity sequence isn't.

Yup, reproduced here.

The attached patch fixes the issue on pg_dump side by ignoring
identity sequences if their table attributes are not loaded. I'm not
sure if this is the best fix.

I don't like it much. That test is trying to avoid exactly this
situation; why is it failing to do so? I traced through it with
gdb and found that when we get here,

(gdb) s
7259 if (owning_tab->dobj.dump == DUMP_COMPONENT_NONE &&
(gdb) p owning_tab->dobj.dump
$7 = 16
(gdb) p seqinfo->is_identity_sequence
$8 = true
(gdb) p owning_tab->interesting
$9 = false

That is, the owning table is marked to dump ACLs and nothing else.
Per getTables(), we only mark a table interesting if we need to
dump its DEFINITION or DATA component. So kaboom.

I didn't check the git history, but I suspect that this code was
correct when written and got broken by init_privs changes, which
allowed the DUMP_COMPONENT_ACL flag to get set along with nothing
else (cf checkExtensionMembership). I'm inclined to fix it
like this:

        /*
         * Only dump identity sequences if we're going to dump the table that
         * it belongs to.
         */
-       if (owning_tab->dobj.dump == DUMP_COMPONENT_NONE &&
+       if ((owning_tab->dobj.dump & DUMP_COMPONENT_DEFINITION) &&
            seqinfo->is_identity_sequence)
        {
            seqinfo->dobj.dump = DUMP_COMPONENT_NONE;

(The comment could use a bit of adjustment too.) The idea here is
that so far as the user is concerned, an identity sequence is part
of the DDL that defines the table, so it should be dumped if and
only if we're dumping the table's definitional DDL. Ancillary
stuff like ACLs shouldn't change this conclusion.

A different approach that we could take is to decide that by golly,
we're supposed to dump this sequence and we should do so. That
could be done by forcing owning_tab->interesting to true just below
here. However, in that case the entire bit quoted above seems wrong,
because we'd no longer be using the policy stated by the comment.
Certainly it still shouldn't matter whether the table is by chance
marked to dump ACLs.

I poked around for other places that might have a similar disease.
The only other direct test like "foo == DUMP_COMPONENT_NONE" is in
getPublicationNamespaces:

/*
* We always dump publication namespaces unless the corresponding
* namespace is excluded from the dump.
*/
if (nspinfo->dobj.dump == DUMP_COMPONENT_NONE)
continue;

I didn't try to break that but I suspect it is equally wrong, ie
it'll behave surprisingly differently for schemas that belong to
extensions vs. those that don't. I'm less sure what to do here.
It seems like publishing a namespace that belongs to an extension
might not be an unreasonable thing to do, in which case users
would be sad if we suppressed it from the publication's dump.
Perhaps this should be like

if (!((nspinfo->dobj.dump & DUMP_COMPONENT_DEFINITION) ||
(nspinfo->ext_member && extension-is-being-dumped)))
continue;

regards, tom lane

#3Artur Zakirov
zaartur@gmail.com
In reply to: Tom Lane (#2)
Re: pg_dump crash on identity sequence with not loaded attributes

On Mon, 9 Dec 2024 at 22:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Artur Zakirov <zaartur@gmail.com> writes:

pg_dump crashes when a table is added into an extension and its
identity sequence isn't.

Yup, reproduced here.

Thank you for looking into this. It seems that the original code was
introduced by the commit which fixed a similar crash:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b965f2617184032687037547204e1db1c1e1a56c

/*
* Only dump identity sequences if we're going to dump the table that
* it belongs to.
*/
-       if (owning_tab->dobj.dump == DUMP_COMPONENT_NONE &&
+       if ((owning_tab->dobj.dump & DUMP_COMPONENT_DEFINITION) &&
seqinfo->is_identity_sequence)
{
seqinfo->dobj.dump = DUMP_COMPONENT_NONE;

I think it is necessary to use negation in this condition. The patch
is attached. I checked the new patch and it works as expected.

--
Kind regards,
Artur

Attachments:

v2_pg_dump_ignore_notinteresting_sequence.patchapplication/octet-stream; name=v2_pg_dump_ignore_notinteresting_sequence.patchDownload+4-3
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Artur Zakirov (#3)
Re: pg_dump crash on identity sequence with not loaded attributes

Artur Zakirov <zaartur@gmail.com> writes:

On Mon, 9 Dec 2024 at 22:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

-       if (owning_tab->dobj.dump == DUMP_COMPONENT_NONE &&
+       if ((owning_tab->dobj.dump & DUMP_COMPONENT_DEFINITION) &&

I think it is necessary to use negation in this condition.

D'oh, of course. But what's your thoughts on the other points?
Is this what we want to do at all?

regards, tom lane

#5Peter Eisentraut
peter_e@gmx.net
In reply to: Artur Zakirov (#1)
Re: pg_dump crash on identity sequence with not loaded attributes

On 09.12.24 16:54, Artur Zakirov wrote:

pg_dump crashes when a table is added into an extension and its
identity sequence isn't.

Maybe the bug is that this scenario is possible in the first place? The
identity sequence is an internal dependency. It would be like putting a
view into an extension but not the underlying rule?

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#5)
Re: pg_dump crash on identity sequence with not loaded attributes

Peter Eisentraut <peter@eisentraut.org> writes:

On 09.12.24 16:54, Artur Zakirov wrote:

pg_dump crashes when a table is added into an extension and its
identity sequence isn't.

Maybe the bug is that this scenario is possible in the first place? The
identity sequence is an internal dependency. It would be like putting a
view into an extension but not the underlying rule?

As far as I can tell, views' rules are not marked as direct members of
the extension, whether they are made within the extension's script or
later on. The same for indexes. (I think this was intentional to
avoid bloating pg_depend more than necessary.) So those cases might
not be great analogies. Rules and indexes can't exist except as
auxiliary objects of a table, so a sequence isn't quite the same
situation anyway.

Even if we hacked things up so that the sequence got added to the
table's extension, a user could still get into this state with
ALTER EXTENSION ... DROP SEQUENCE. So between that and having
existing databases with the issue, I think pg_dump has to be made
to cope with it. I'm also talking myself into the idea that we
have to do our best to actually reproduce the situation, not
just ignore the sequence, which is what the currently-proposed
patch would do. Maybe forcing owning_tab->interesting to true
is the right fix? But the rule about "don't dump sequences of
not-to-be-dumped-tables" is still problematic from this standpoint.

regards, tom lane

#7Artur Zakirov
zaartur@gmail.com
In reply to: Tom Lane (#4)
Re: pg_dump crash on identity sequence with not loaded attributes

On Tue, 10 Dec 2024 at 16:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think it is necessary to use negation in this condition.

D'oh, of course. But what's your thoughts on the other points?
Is this what we want to do at all?

Alternatively I was thinking about this change:

continue;
}

+               if (!(owning_tab->dobj.dump & DUMP_COMPONENT_DEFINITION) &&
+                       seqinfo->is_identity_sequence)
+                       seqinfo->dobj.dump &= ~DUMP_COMPONENT_DEFINITION;
+
                /*
                 * Otherwise we need to dump the components that are
being dumped for
                 * the table and any components which the sequence is explicitly

This way we wouldn't ignore ACLs for example. Currently a user might
not have permissions to a sequence but still be able to read/write to
its table if they have permissions. Although the user cannot execute
nextval/setval explicitly.
If an admin grants permissions to the sequence explicitly (for some
reason) the user will be able to execute nextval/setval. And that will
be lost during dump/restore. Currently this doesn't work at all and
ignoring sequences won't break anything.

Ideally I would like to have the ability to dump ACL of those
sequences too, even though this looks like a quite narrow use case.

Alternatively, instead of forcing owning_tab->interesting to true, I
think we could always initialize owning_tab's attributes (i.e. arrays
like owning_tab->attnames, owning_tab->attidenity), which are used by
dumpSequence() and which causes the crash. Are there any downsides of
it?

--
Kind regards,
Artur

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Artur Zakirov (#7)
Re: pg_dump crash on identity sequence with not loaded attributes

Artur Zakirov <zaartur@gmail.com> writes:

Alternatively, instead of forcing owning_tab->interesting to true, I
think we could always initialize owning_tab's attributes (i.e. arrays
like owning_tab->attnames, owning_tab->attidenity), which are used by
dumpSequence() and which causes the crash. Are there any downsides of
it?

Lots. The entire point of the ->interesting flag is to avoid fetching
additional details about tables that we don't really care about.
Unless I misunderstand, you're proposing throwing away that whole
optimization, which has got to be an overall loss.

regards, tom lane

#9Artur Zakirov
zaartur@gmail.com
In reply to: Tom Lane (#8)
Re: pg_dump crash on identity sequence with not loaded attributes

On Tue, 10 Dec 2024 at 19:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Artur Zakirov <zaartur@gmail.com> writes:

Alternatively, instead of forcing owning_tab->interesting to true, I
think we could always initialize owning_tab's attributes (i.e. arrays
like owning_tab->attnames, owning_tab->attidenity), which are used by
dumpSequence() and which causes the crash. Are there any downsides of
it?

Lots. The entire point of the ->interesting flag is to avoid fetching
additional details about tables that we don't really care about.
Unless I misunderstand, you're proposing throwing away that whole
optimization, which has got to be an overall loss.

Yeah, I rechecked the code and it seems getTableAttrs() is called
later than getOwnedSeqs(). And we can set owning_tab->interesting to
true to load data only of needed tables.

I think it will still be necessary to negate DUMP_COMPONENT_DEFINITION
from seqinfo->dobj.dump because we shouldn't dump statements like ...
ALTER COLUMN ... ADD GENERATED ..., if the table's definition isn't
dumped.

--
Kind regards,
Artur

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Artur Zakirov (#9)
Re: pg_dump crash on identity sequence with not loaded attributes

Artur Zakirov <zaartur@gmail.com> writes:

I think it will still be necessary to negate DUMP_COMPONENT_DEFINITION
from seqinfo->dobj.dump because we shouldn't dump statements like ...
ALTER COLUMN ... ADD GENERATED ..., if the table's definition isn't
dumped.

Yeah. After chewing on this for awhile, I think the cleanest solution
is say that b965f2617 was just wrong, and we should revert it in
favor of adopting this logic:

if (seqinfo->is_identity_sequence)
seqinfo->dobj.dump = owning_tab->dobj.dump;
else
seqinfo->dobj.dump |= owning_tab->dobj.dump;

That is, for an identity sequence we should dump the exact same
properties as for its owning table. This basically rejects the
notion that a table can be inside an extension while its identity
sequence is not (or vice versa), which is the same position we
take for indexes, rules, etc. However, for non-identity owned
sequences (i.e. serials) keep the same behavior as before.
The large comment block just below here (added by f9e439b1c)
carefully describes the behavior that was wanted for non-identity
sequences, and I don't think we need to break that.

The attached patch also gets rid of the dubious coding in
getPublicationNamespaces. We might get push-back on that ignoring
schemas belonging to extensions, but if so I'd prefer to see the
behavior coded in a more transparent fashion.

regards, tom lane

Attachments:

fix-pg_dump-identity-sequence.patchtext/x-diff; charset=us-ascii; name=fix-pg_dump-identity-sequence.patchDownload+28-18
#11Artur Zakirov
zaartur@gmail.com
In reply to: Tom Lane (#10)
Re: pg_dump crash on identity sequence with not loaded attributes

On Tue, 10 Dec 2024 at 22:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah. After chewing on this for awhile, I think the cleanest solution
is say that b965f2617 was just wrong, and we should revert it in
favor of adopting this logic:

if (seqinfo->is_identity_sequence)
seqinfo->dobj.dump = owning_tab->dobj.dump;
else
seqinfo->dobj.dump |= owning_tab->dobj.dump;

Thank you for the new version of the patch. It looks good to me and it
works as expected after I tested it.

The attached patch also gets rid of the dubious coding in
getPublicationNamespaces. We might get push-back on that ignoring
schemas belonging to extensions, but if so I'd prefer to see the
behavior coded in a more transparent fashion.

I don't have a strong opinion about this part and I'm not sure that we
need to ignore mapping between a namespace and a publication if the
namespace belongs to the extension. Maybe we could always dump the
mapping if the publication itself is dumped.

But other than that it seems the patch breaks dumps of mapping of a
publication with the public namespace. The issue with the public
namespace is mentioned in the original thread of the feature of adding
schema mapping [1]. I couldn't find in the thread if it was addressed
in the end. With the new patch the following mapping won't be mapped:

create publication pub_test for tables in schema public;

pg_dump won't generate corresponding "ALTER PUBLICATION pub_test ADD
TABLES IN SCHEMA public".

1. /messages/by-id/OS0PR01MB61137F0B8498E9241F2D960EFB149@OS0PR01MB6113.jpnprd01.prod.outlook.com

--
Kind regards,
Artur

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Artur Zakirov (#11)
Re: pg_dump crash on identity sequence with not loaded attributes

Artur Zakirov <zaartur@gmail.com> writes:

On Tue, 10 Dec 2024 at 22:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The attached patch also gets rid of the dubious coding in
getPublicationNamespaces. We might get push-back on that ignoring
schemas belonging to extensions, but if so I'd prefer to see the
behavior coded in a more transparent fashion.

But other than that it seems the patch breaks dumps of mapping of a
publication with the public namespace. The issue with the public
namespace is mentioned in the original thread of the feature of adding
schema mapping [1]. I couldn't find in the thread if it was addressed
in the end. With the new patch the following mapping won't be mapped:
create publication pub_test for tables in schema public;

Ouch. Okay, I agree that's probably bad, but then it raises the
question of why this filter exists at all. It certainly seems
100% magic and undocumented that it does the "right" thing for
these cases.

In any case, that's mostly orthogonal to the problem with identity
sequences. I now propose leaving out that change and just pushing
the change in getOwnedSeqs(). The publication-namespace business
should be discussed on its own thread.

regards, tom lane

#13Artur Zakirov
zaartur@gmail.com
In reply to: Tom Lane (#12)
Re: pg_dump crash on identity sequence with not loaded attributes

On Wed, 11 Dec 2024 at 16:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

In any case, that's mostly orthogonal to the problem with identity
sequences. I now propose leaving out that change and just pushing
the change in getOwnedSeqs(). The publication-namespace business
should be discussed on its own thread.

+1, agree it looks like a bit different issue and for now we can fix
only the issue with sequences.

I found another inconsistency with publishing mapping and just mention
it here. According to the code of selectDumpableNamespace() namespaces
"pg_*" and "information_schema" are ignored by pg_dump. But it is
possible to create mapping for publications with "information_schema"
and those mappings will be lost by dump/restore. But I agree that
publication-namespace should have its own thread.

--
Kind regards,
Artur