Command order bug in pg_dump
tested against 706cbed351037fb5e886815506515d1281e62d40
Execute this in first db (say, db1):
```
create table tfk (i int unique) partition by range (i );
create table tfk_po partition of tfk for values from ( 0 ) to (1);
create table tt (i int) partition by range (i );
create table tt_po partition of tt for values from ( 0 ) to (1);
ALTER TABLE public.tt
ADD CONSTRAINT tt_i_fkey FOREIGN KEY (i) REFERENCES public.tfk(i);
ALTER TABLE public.tt
ADD CONSTRAINT tt_a_fkey FOREIGN KEY (i) REFERENCES public.tfk(i);
```
create new database and dump-restore
./pgbin/bin/pg_dump -d db1 --schema-only > dump-p.sql
./pgbin/bin/createdb db2
restore fails
db2=# \i dump-p.sql
SET
SET
SET
SET
SET
SET
set_config
------------
(1 row)
SET
SET
SET
SET
SET
CREATE TABLE
ALTER TABLE
SET
CREATE TABLE
ALTER TABLE
CREATE TABLE
ALTER TABLE
CREATE TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER INDEX
ALTER TABLE
psql:dump-p.sql:120: ERROR: constraint "tt_i_fkey" for relation "tt"
already exists
db2=#
This bug is about now postgresql chooses generated name for inherited contains
I think this is a problem, when pg_dump creates sql which is
non-applicable for restore. Bug discovered when digging out pg_upgarde
failure reasons.
--
Best regards,
Kirill Reshke
Kirill Reshke <reshkekirill@gmail.com> writes:
psql:dump-p.sql:120: ERROR: constraint "tt_i_fkey" for relation "tt"
already exists
I don't think this is pg_dump's fault: there is no such constraint
when the ALTER TABLE starts. Something inside the ALTER TABLE
recursion seems to be messing up if there is already another
similar FK constraint. This trace is pretty interesting:
regression=# create database db1;
CREATE DATABASE
regression=# \c db1
You are now connected to database "db1" as user "postgres".
db1=# create table tfk (i int unique) partition by range (i );
CREATE TABLE
db1=# create table tfk_po partition of tfk for values from ( 0 ) to (1);
CREATE TABLE
db1=# create table tt (i int) partition by range (i );
CREATE TABLE
db1=# create table tt_po partition of tt for values from ( 0 ) to (1);
CREATE TABLE
db1=# ALTER TABLE public.tt
db1-# ADD CONSTRAINT tt_a_fkey FOREIGN KEY (i) REFERENCES public.tfk(i);
ALTER TABLE
db1=# ALTER TABLE public.tt
db1-# ADD CONSTRAINT tt_i_fkey FOREIGN KEY (i) REFERENCES public.tfk(i);
ERROR: constraint "tt_i_fkey" for relation "tt" already exists
db1=# ALTER TABLE public.tt drop CONSTRAINT tt_a_fkey;
ALTER TABLE
db1=# ALTER TABLE public.tt
ADD CONSTRAINT tt_i_fkey FOREIGN KEY (i) REFERENCES public.tfk(i);
ALTER TABLE
Doesn't seem to be a new problem, either ... this trace is against
v13.
regards, tom lane
I wrote:
I don't think this is pg_dump's fault: there is no such constraint
when the ALTER TABLE starts. Something inside the ALTER TABLE
recursion seems to be messing up if there is already another
similar FK constraint.
Oh, I see what's happening. We create both the specified constraint
and an inherited child constraint on the named partitioned table.
If tt_a_fkey is added first, the name chosen for its child is
"tt_i_fkey", breaking the later attempt by the user to use that name.
Apparently there is some hack in psql to not show that child
constraint entry, because you don't see it in \d output, which
is what confused me before (and would confuse most people
hitting this).
We could fool around with the generation rule for the child
constraint's name, but fundamentally we're infringing on user
namespace here, so I think that's likely to just move the problem
cases around. Why do we need this child pg_constraint entry at all?
If we can't avoid having it, probably choosing a name like
"tt_a_fkey1" would be marginally less likely to cause trouble
... but only marginally.
In any case, it's pretty awful that we make these entries but
\d does not show them.
regards, tom lane
On Mon, 21 Apr 2025 at 19:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Doesn't seem to be a new problem, either ... this trace is against
v13.
Sure, repro was on 12=>13 pg_upgrade.
We could fool around with the generation rule for the child
constraint's name, but fundamentally we're infringing on user
namespace here, so I think that's likely to just move the problem
cases around.
My view of this problem is that pg_dump fails its purpose (to produce
restorable dump) because... Lack of control? What if we can force
inherited child constraint names?
So, along with AT ADD CONSTRAINT, we can provide a list of names and
say: instead of using a constraint name generation rule, the server
should choose these names in order. I understand this is too much code
for this minor matter, and the fix will be probably just to change
generation rule.
Why do we need this child pg_constraint entry at all?
Currently no idea.
In any case, it's pretty awful that we make these entries but
\d does not show them.
Okay... Perhaps, but since the user did not specifically request this,
perhaps we shouldn't display it.
--
Best regards,
Kirill Reshke
On Mon, 21 Apr 2025 at 22:30, Kirill Reshke <reshkekirill@gmail.com> wrote:
My view of this problem is that pg_dump fails its purpose (to produce
restorable dump) because... Lack of control? What if we can force
inherited child constraint names?
So, along with AT ADD CONSTRAINT, we can provide a list of names and
say: instead of using a constraint name generation rule, the server
should choose these names in order.
Forget this nonsense, this is a bad idea.
--
Best regards,
Kirill Reshke
On 2025-Apr-21, Tom Lane wrote:
I don't think this is pg_dump's fault: there is no such constraint
when the ALTER TABLE starts. Something inside the ALTER TABLE
recursion seems to be messing up if there is already another
similar FK constraint. This trace is pretty interesting:regression=# create database db1;
CREATE DATABASE
regression=# \c db1
You are now connected to database "db1" as user "postgres".
db1=# create table tfk (i int unique) partition by range (i );
CREATE TABLE
db1=# create table tfk_po partition of tfk for values from ( 0 ) to (1);
CREATE TABLE
db1=# create table tt (i int) partition by range (i );
CREATE TABLE
db1=# create table tt_po partition of tt for values from ( 0 ) to (1);
CREATE TABLE
db1=# ALTER TABLE public.tt
db1-# ADD CONSTRAINT tt_a_fkey FOREIGN KEY (i) REFERENCES public.tfk(i);
ALTER TABLE
db1=# ALTER TABLE public.tt
db1-# ADD CONSTRAINT tt_i_fkey FOREIGN KEY (i) REFERENCES public.tfk(i);
ERROR: constraint "tt_i_fkey" for relation "tt" already exists
Oh hah. So constraint tt_a_fkey has chosen the name tt_i_fkey for its
internal sub-constraint object to point to the partitions, so when the
user wants to created tt_i_fkey, the name available anymore.
Maybe we can change the naming policy so that these internal constraint
objects have names that are unlikely to be chosen by users, maybe by
suffixing "fkey_int" instead of "fkey", or something like that. (We
could even do "$1" and so on for this kind of constraint). In
hindsight, it isn't such a great idea to let the system choose the best
name for an internal implementation object.
I'd probably not change this in versions 13 and 14 at all in any case,
because the code is too different. I'm unsure whether this is enough of
a bug to consider backpatching to 15-17; maybe we should just change 18
at this point, since I haven't heard of a user complaining about this.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Estoy de acuerdo contigo en que la verdad absoluta no existe...
El problema es que la mentira sí existe y tu estás mintiendo" (G. Lama)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
Maybe we can change the naming policy so that these internal constraint
objects have names that are unlikely to be chosen by users, maybe by
suffixing "fkey_int" instead of "fkey", or something like that. (We
could even do "$1" and so on for this kind of constraint). In
hindsight, it isn't such a great idea to let the system choose the best
name for an internal implementation object.
I experimented with the attached, which approximates "add some digits
to the name used for the parent constraint". (We could refactor
ChooseConstraintName if we wanted a less approximate version of that
rule, but I'm not sure it's worth the trouble.)
The extent to which these derived names leak out to the user, as
illustrated by the regression test changes, makes me even less happy
about the fact that \d doesn't show them. I think we really ought
to try to find a way to not need these entries. But that is clearly
not v18 material at this point.
I'd probably not change this in versions 13 and 14 at all in any case,
because the code is too different. I'm unsure whether this is enough of
a bug to consider backpatching to 15-17; maybe we should just change 18
at this point, since I haven't heard of a user complaining about this.
Kirill's complaint isn't that? But I agree that changing this rule in
stable branches would probably be a net negative user experience,
seeing that the names are plenty user-visible.
regards, tom lane
Attachments:
wip-rename-child-foreign-key-constraints.patchtext/x-diff; charset=us-ascii; name=wip-rename-child-foreign-key-constraints.patchDownload+20-10
On 2025-Apr-21, Tom Lane wrote:
I experimented with the attached, which approximates "add some digits
to the name used for the parent constraint". (We could refactor
ChooseConstraintName if we wanted a less approximate version of that
rule, but I'm not sure it's worth the trouble.)
This seems a better implementation of the idea than what I had in mind.
For my part, please feel free to go ahead with this. (I can focus on
the remaining problems with self-referencing FKs ... sigh)
The extent to which these derived names leak out to the user, as
illustrated by the regression test changes, makes me even less happy
about the fact that \d doesn't show them. I think we really ought
to try to find a way to not need these entries. But that is clearly
not v18 material at this point.
Well, for starters we could change psql to show the hidden constraint
names. Since this appears to be a problem for users, maybe even change
\d+ to show them in backbranches. For the longer term (19) I can try to
work on reimplementing things so that these names don't appear anywhere
and are more collision-resistant (though frankly I would prefer to spend
my time on REPACK and column reordering)
I'd probably not change this in versions 13 and 14 at all in any case,
because the code is too different. I'm unsure whether this is enough of
a bug to consider backpatching to 15-17; maybe we should just change 18
at this point, since I haven't heard of a user complaining about this.Kirill's complaint isn't that?
I didn't see it as one as first, but I reread his OP and now I agree
that it is.
FWIW I've been migrating my main email address to a new domain. Finally
got sick of noip.com.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"La vida es para el que se aventura"
Alvaro Herrera <alvherre@kurilemu.de> writes:
On 2025-Apr-21, Tom Lane wrote:
I experimented with the attached, which approximates "add some digits
to the name used for the parent constraint". (We could refactor
ChooseConstraintName if we wanted a less approximate version of that
rule, but I'm not sure it's worth the trouble.)
This seems a better implementation of the idea than what I had in mind.
For my part, please feel free to go ahead with this.
OK. I'll take a look first at whether the aforesaid refactoring
is easy enough to be worth doing.
FWIW I've been migrating my main email address to a new domain. Finally
got sick of noip.com.
Got it, thanks for the heads-up.
regards, tom lane
I wrote:
Alvaro Herrera <alvherre@kurilemu.de> writes:
On 2025-Apr-21, Tom Lane wrote:
I experimented with the attached, which approximates "add some digits
to the name used for the parent constraint". (We could refactor
ChooseConstraintName if we wanted a less approximate version of that
rule, but I'm not sure it's worth the trouble.)
This seems a better implementation of the idea than what I had in mind.
For my part, please feel free to go ahead with this.
OK. I'll take a look first at whether the aforesaid refactoring
is easy enough to be worth doing.
After poking at that, it's easy to get ChooseConstraintName to do
something just slightly different from what I said above: the rule is
now "add an underscore and some digits to the name used for the parent
constraint". I like this even better than the previous idea, because
I think it makes it more obvious that the name is derived from the
parent constraint. However, this changes the chosen name in more
cases than my previous hack did. So I'm reposting the patch to see
if anyone feels this is too much churn. I think it's okay as a
v18-only patch, though I wouldn't propose it for back-patch.
regards, tom lane
Attachments:
v2-rename-child-foreign-key-constraints.patchtext/x-diff; charset=us-ascii; name=v2-rename-child-foreign-key-constraints.patchDownload+74-63
On Wed, 23 Apr 2025 at 01:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
So I'm reposting the patch to see
if anyone feels this is too much churn.
WFM
--
Best regards,
Kirill Reshke
On 2025-Apr-22, Tom Lane wrote:
After poking at that, it's easy to get ChooseConstraintName to do
something just slightly different from what I said above: the rule is
now "add an underscore and some digits to the name used for the parent
constraint". I like this even better than the previous idea, because
I think it makes it more obvious that the name is derived from the
parent constraint. However, this changes the chosen name in more
cases than my previous hack did. So I'm reposting the patch to see
if anyone feels this is too much churn. I think it's okay as a
v18-only patch, though I wouldn't propose it for back-patch.
I think the new constraint names are better, so +1 for this version of
the patch for 18. I agree that we don't necessarily want to backpatch
this to stable branches though.
I wonder if it would make pg_upgrade users' lives easier if we had
pg_upgrade --check notify them about possible collisions on these
constraints (for the older branches). I don't have good ideas on how to
implement that though other than a trial dump/restore, which is perhaps
unreasonable.
(My position on pg_upgrade is that if pg_upgrade --check passes, then
you shouldn't need any additional tests for assurance that running
pg_upgrade for real is going to work. So I would be happier if we could
detect this problem.)
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"La primera ley de las demostraciones en vivo es: no trate de usar el sistema.
Escriba un guión que no toque nada para no causar daños." (Jakob Nielsen)
Alvaro Herrera <alvherre@kurilemu.de> writes:
I think the new constraint names are better, so +1 for this version of
the patch for 18. I agree that we don't necessarily want to backpatch
this to stable branches though.
Cool, I'll push that patch in a bit then.
I wonder if it would make pg_upgrade users' lives easier if we had
pg_upgrade --check notify them about possible collisions on these
constraints (for the older branches). I don't have good ideas on how to
implement that though other than a trial dump/restore, which is perhaps
unreasonable.
Yeah, I'm not seeing a simple way to do that either. I'm hesitant to
put a ton of work into it, because in the end the situation seems like
self-inflicted damage. It's not quite true that you need duplicate
foreign key constraints to hit this, but it's close: the default
table-and-column-name-based name for one constraint has to match the
actual name of a second constraint. If the second constraint isn't
a duplicate of the first then it has a very misleadingly chosen name.
Either way it's pretty poor DBA practice.
Between that thought and the fact that the problem has gone unnoticed
since v12, I'm content to make this change in the default names and
stop here.
regards, tom lane