BUG #17720: pg_dump creates a dump with primary key that cannot be restored, when specifying 'using index ...'
The following bug has been logged on the website:
Bug reference: 17720
Logged by: reiner peterke
Email address: zedaardv@drizzle.com
PostgreSQL version: 15.1
Operating system: openSUSE Leap 15.4
Description:
I have a table
create table hamster(under integer, over text);
I create a unique index on the under column with nulls not distinct
create unique index uq_not_distinct on hamster (under) nulls not distinct;
i now create a primary key using the unique index
alter table hamster add constraint pk_hamster primary key using index
uq_not_distinct;
NOTICE: ALTER TABLE / ADD CONSTRAINT USING INDEX will rename index
"uq_not_distinct" to "pk_hamster"
table looks good
\d hamster
Table "moon.hamster"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
under | integer | | not null |
over | text | | |
Indexes:
"pk_hamster" PRIMARY KEY, btree (under) NULLS NOT DISTINCT
Do a pg_dump of the database.
the dump creates the code for a primary key that cannot be restored
pg_dump -p 5632 -Of tranquility.sql -d tranquility
on restore, I get the following error
psql:tranquility.sql:90: ERROR: syntax error at or near "NULLS"
LINE 2: ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT...
in the dump itself the create constraint command is
ALTER TABLE ONLY moon.hamster
ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT (under);
which does not work with the NULLS NOT DISTINCT string
On Wednesday, December 14, 2022, PG Bug reporting form <
noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 17720
Logged by: reiner peterke
Email address: zedaardv@drizzle.com
PostgreSQL version: 15.1
Operating system: openSUSE Leap 15.4
Description:Do a pg_dump of the database.
the dump creates the code for a primary key that cannot be restored
pg_dump -p 5632 -Of tranquility.sql -d tranquility
on restore, I get the following error
psql:tranquility.sql:90: ERROR: syntax error at or near "NULLS"
LINE 2: ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT...
in the dump itself the create constraint command is
ALTER TABLE ONLY moon.hamster
ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT (under);
which does not work with the NULLS NOT DISTINCT string
There is a decent chance that the fix here is to prohibit doing what you
did here - a PK cannot contain nulls in any of its columns so indeed
choosing an index that specifies how nulls behave is non-sensical. That
said, it also doesn’t hurt so long as the column itself is indeed not
null. But extending the syntax doesn’t seem that appealing.
David J.
On 14 Dec 2022, at 13:54, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wednesday, December 14, 2022, PG Bug reporting form <noreply@postgresql.org <mailto:noreply@postgresql.org>> wrote:
The following bug has been logged on the website:Bug reference: 17720
Logged by: reiner peterke
Email address: zedaardv@drizzle.com <mailto:zedaardv@drizzle.com>
PostgreSQL version: 15.1
Operating system: openSUSE Leap 15.4
Description:Do a pg_dump of the database.
the dump creates the code for a primary key that cannot be restored
pg_dump -p 5632 -Of tranquility.sql -d tranquility
on restore, I get the following error
psql:tranquility.sql:90: ERROR: syntax error at or near "NULLS"
LINE 2: ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT...
in the dump itself the create constraint command is
ALTER TABLE ONLY moon.hamster
ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT (under);
which does not work with the NULLS NOT DISTINCT stringThere is a decent chance that the fix here is to prohibit doing what you did here - a PK cannot contain nulls in any of its columns so indeed choosing an index that specifies how nulls behave is non-sensical. That said, it also doesn’t hurt so long as the column itself is indeed not null. But extending the syntax doesn’t seem that appealing.
Even if we prohibit this, there is still the case of all existing systems which
can't be dumped. I wonder if the solution is to teach pg_dump to not create
NULLS NOT DISTINCT primary key constraints? The simple attached fix creates a
valid PK constraint on the above schema.
--
Daniel Gustafsson https://vmware.com/
Attachments:
pg_dump_pk_nulls.diffapplication/octet-stream; name=pg_dump_pk_nulls.diff; x-unix-mode=0644Download+1-1
Daniel Gustafsson <daniel@yesql.se> writes:
On 14 Dec 2022, at 13:54, David G. Johnston <david.g.johnston@gmail.com> wrote:
There is a decent chance that the fix here is to prohibit doing what you did here - a PK cannot contain nulls in any of its columns so indeed choosing an index that specifies how nulls behave is non-sensical. That said, it also doesn’t hurt so long as the column itself is indeed not null. But extending the syntax doesn’t seem that appealing.
Even if we prohibit this, there is still the case of all existing systems which
can't be dumped. I wonder if the solution is to teach pg_dump to not create
NULLS NOT DISTINCT primary key constraints? The simple attached fix creates a
valid PK constraint on the above schema.
It doesn't make sense for pg_dump to editorialize on a schema that
we otherwise consider valid; people would rightfully complain that
dump/restore changed things. So we need to do both things: prohibit
adopting such an index as a PK constraint (but I guess it's okay
for plain unique constraints?), and adjust pg_dump to compensate
for the legacy case where it was already done.
regards, tom lane
On 14 Dec 2022, at 16:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
On 14 Dec 2022, at 13:54, David G. Johnston <david.g.johnston@gmail.com> wrote:
There is a decent chance that the fix here is to prohibit doing what you did here - a PK cannot contain nulls in any of its columns so indeed choosing an index that specifies how nulls behave is non-sensical. That said, it also doesn’t hurt so long as the column itself is indeed not null. But extending the syntax doesn’t seem that appealing.
Even if we prohibit this, there is still the case of all existing systems which
can't be dumped. I wonder if the solution is to teach pg_dump to not create
NULLS NOT DISTINCT primary key constraints? The simple attached fix creates a
valid PK constraint on the above schema.It doesn't make sense for pg_dump to editorialize on a schema that
we otherwise consider valid; people would rightfully complain that
dump/restore changed things. So we need to do both things: prohibit
adopting such an index as a PK constraint (but I guess it's okay
for plain unique constraints?), and adjust pg_dump to compensate
for the legacy case where it was already done.
Agreed, I'll expand the patch.
--
Daniel Gustafsson https://vmware.com/
On Wed, Dec 14, 2022 at 8:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
On 14 Dec 2022, at 13:54, David G. Johnston <david.g.johnston@gmail.com>
wrote:
There is a decent chance that the fix here is to prohibit doing what
you did here - a PK cannot contain nulls in any of its columns so indeed
choosing an index that specifies how nulls behave is non-sensical. That
said, it also doesn’t hurt so long as the column itself is indeed not
null. But extending the syntax doesn’t seem that appealing.Even if we prohibit this, there is still the case of all existing
systems which
can't be dumped. I wonder if the solution is to teach pg_dump to not
create
NULLS NOT DISTINCT primary key constraints? The simple attached fix
creates a
valid PK constraint on the above schema.
It doesn't make sense for pg_dump to editorialize on a schema that
we otherwise consider valid; people would rightfully complain that
dump/restore changed things. So we need to do both things: prohibit
adopting such an index as a PK constraint (but I guess it's okay
for plain unique constraints?), and adjust pg_dump to compensate
for the legacy case where it was already done.
The WHERE clause of CREATE INDEX doesn't pose an issue as we report "Cannot
create a primary key or unique constraint using such an index".
It is also not possible to specify an opclass in PRIMARY KEY, but since we
document that unique indexes are only currently implemented by B-tree, and
that is what you get from PRIMARY KEY, that is also not a problem
(presently - unless extension authors were to bypass this).
The remaining elements: INCLUDE, WITH, and TABLESPACE, all exist in both
formulations.
I am thinking now that the failure to include NULLS [NOT[ DISTINCT in the
CREATE TABLE syntax is an oversight that needs to be fixed. It just
doesn't make sense to have the two commands expose different features.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
I am thinking now that the failure to include NULLS [NOT] DISTINCT in the
CREATE TABLE syntax is an oversight that needs to be fixed. It just
doesn't make sense to have the two commands expose different features.
It looks to me like it was pretty intentional, because both CREATE
and ALTER TABLE let you write UNIQUE NULLS NOT DISTINCT but not
PRIMARY KEY NULLS NOT DISTINCT. That doesn't seem like an oversight.
regards, tom lane
On Wed, Dec 14, 2022 at 9:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
I am thinking now that the failure to include NULLS [NOT] DISTINCT in the
CREATE TABLE syntax is an oversight that needs to be fixed. It just
doesn't make sense to have the two commands expose different features.It looks to me like it was pretty intentional, because both CREATE
and ALTER TABLE let you write UNIQUE NULLS NOT DISTINCT but not
PRIMARY KEY NULLS NOT DISTINCT. That doesn't seem like an oversight.
OK, that was me tunnel-visioned on the "index_parameters" syntax section
where INCLUDE, etc... are listed and not seeing it there.
So, don't document that PRIMARY KEY accepts the NULLS [NOT] DISTINCT but
make it do so anyway?
David J.
On 14 Dec 2022, at 16:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
On 14 Dec 2022, at 13:54, David G. Johnston <david.g.johnston@gmail.com> wrote:
There is a decent chance that the fix here is to prohibit doing what you did here - a PK cannot contain nulls in any of its columns so indeed choosing an index that specifies how nulls behave is non-sensical. That said, it also doesn’t hurt so long as the column itself is indeed not null. But extending the syntax doesn’t seem that appealing.
Even if we prohibit this, there is still the case of all existing systems which
can't be dumped. I wonder if the solution is to teach pg_dump to not create
NULLS NOT DISTINCT primary key constraints? The simple attached fix creates a
valid PK constraint on the above schema.It doesn't make sense for pg_dump to editorialize on a schema that
we otherwise consider valid; people would rightfully complain that
dump/restore changed things. So we need to do both things: prohibit
adopting such an index as a PK constraint (but I guess it's okay
for plain unique constraints?), and adjust pg_dump to compensate
for the legacy case where it was already done.
The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
constraints but allow them for unique constraints. Is this along the lines of
what you had in mind?
--
Daniel Gustafsson https://vmware.com/
Attachments:
v1-0001-Disallow-NULLS-NOT-DISTINCT-indexes-for-primary-k.patchapplication/octet-stream; name=v1-0001-Disallow-NULLS-NOT-DISTINCT-indexes-for-primary-k.patch; x-unix-mode=0644Download+20-2
Daniel Gustafsson <daniel@yesql.se> writes:
The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
constraints but allow them for unique constraints. Is this along the lines of
what you had in mind?
Needs more than zero comments in the code, and why bother testing
is_alter_table in index_check_primary_key? We're disallowing
this case across-the-board, no matter how you get to it.
I'll defer to Peter on whether this is in fact the right way to go,
or we should relax the syntax restriction as David suggests.
regards, tom lane
On 15 Dec 2022, at 00:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
constraints but allow them for unique constraints. Is this along the lines of
what you had in mind?Needs more than zero comments in the code, and why bother testing
is_alter_table in index_check_primary_key? We're disallowing
this case across-the-board, no matter how you get to it.
That was an, admittedly poor, attempt at self-documenting code. Removed and
comments added in the attached.
--
Daniel Gustafsson https://vmware.com/
Attachments:
v2-0001-Disallow-NULLS-NOT-DISTINCT-indexes-for-primary-k.patchapplication/octet-stream; name=v2-0001-Disallow-NULLS-NOT-DISTINCT-indexes-for-primary-k.patch; x-unix-mode=0644Download+31-2
On 12/15/22 14:18, Daniel Gustafsson wrote:
On 15 Dec 2022, at 00:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
constraints but allow them for unique constraints. Is this along the lines of
what you had in mind?Needs more than zero comments in the code, and why bother testing
is_alter_table in index_check_primary_key? We're disallowing
this case across-the-board, no matter how you get to it.That was an, admittedly poor, attempt at self-documenting code. Removed and
comments added in the attached.
[reviewing patch...]
If the point in adding a primary key USING INDEX is to avoid building an
index, then this restriction defeats that purpose. We have no ALTER
INDEX command to switch or drop the <unique null treatment>.
Since the primary key can't have any nulls anyway, I think we should
just flip this, perhaps with a NOTICE like the rename does.
--
Vik Fearing
Vik Fearing <vik@postgresfriends.org> writes:
If the point in adding a primary key USING INDEX is to avoid building an
index, then this restriction defeats that purpose. We have no ALTER
INDEX command to switch or drop the <unique null treatment>.
Well, if you want to build an index to then use as a primary key,
it's incumbent on you to make the index with the correct properties.
Do you expect the system to silently fix it for you if the index is
on the wrong columns?
I might have more sympathy for this argument if the correct <unique null
treatment> were non-default, but it is not. You'd have had to go out
of your way to make the index incompatible. That in turn suggests that
there's a mistake somewhere, so having the system automatically fix
it for you might just be masking something that would be better dealt
with manually.
regards, tom lane
On 12/15/22 16:54, Tom Lane wrote:
Vik Fearing <vik@postgresfriends.org> writes:
If the point in adding a primary key USING INDEX is to avoid building an
index, then this restriction defeats that purpose. We have no ALTER
INDEX command to switch or drop the <unique null treatment>.Well, if you want to build an index to then use as a primary key,
it's incumbent on you to make the index with the correct properties.
This is assuming you initially created the index for the purpose of
making it the primary key, perhaps for concurrency.
Do you expect the system to silently fix it for you if the index is
on the wrong columns?
Of course not.
I might have more sympathy for this argument if the correct <unique null
treatment> were non-default, but it is not. You'd have had to go out
of your way to make the index incompatible. That in turn suggests that
there's a mistake somewhere, so having the system automatically fix
it for you might just be masking something that would be better dealt
with manually.
The index may have been originally created for something else. The only
way to deal with this manually is to drop and recreate the index, just
to remove something that has no effect anyway because the index contains
no nulls.
I am not going to die on this hill, but I think it is a shame to make
users go through that. A NOTICE saying "btw, we did that" should be
sufficient.
--
Vik Fearing
On Thu, Dec 15, 2022 at 8:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Vik Fearing <vik@postgresfriends.org> writes:
If the point in adding a primary key USING INDEX is to avoid building an
index, then this restriction defeats that purpose. We have no ALTER
INDEX command to switch or drop the <unique null treatment>.Well, if you want to build an index to then use as a primary key,
it's incumbent on you to make the index with the correct properties.
Do you expect the system to silently fix it for you if the index is
on the wrong columns?I might have more sympathy for this argument if the correct <unique null
treatment> were non-default, but it is not. You'd have had to go out
of your way to make the index incompatible. That in turn suggests that
there's a mistake somewhere, so having the system automatically fix
it for you might just be masking something that would be better dealt
with manually.
No matter what we do we are either masking the situation that a user has a
primary key backing index defined NULLS NOT DISTINCT which is a pointless,
but not invalid, definition, or causing an error to happen because we've
decided that we shouldn't be masking that inconsistent configuration.
Green-field, maybe we should have made it a "NULLS { NOT ALLOWED | [NOT]
DISTINCT }" option and been better at informing/prohibiting the user from
choosing an index that is not compatible with the constraint they want it
to support. But that isn't really something we make an effort to do
generally and it seems too late to get on that horse now.
Sleeping on this, I feel even more strongly that touching pg_dump now is
wrong. The configuration, while nonsensical, is technically valid. Which
means that the argument for causing existing pg_dumps to become broken is
basically zero. We need to simply synchronize PRIMARY KEY to accept this
configuration in the interest of dump/restore preservation.
David J.
On Thu, Dec 15, 2022 at 4:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Vik Fearing <vik@postgresfriends.org> writes:
If the point in adding a primary key USING INDEX is to avoid building an
index, then this restriction defeats that purpose. We have no ALTER
INDEX command to switch or drop the <unique null treatment>.Well, if you want to build an index to then use as a primary key,
it's incumbent on you to make the index with the correct properties.
Do you expect the system to silently fix it for you if the index is
on the wrong columns?I might have more sympathy for this argument if the correct <unique null
treatment> were non-default, but it is not. You'd have had to go out
of your way to make the index incompatible. That in turn suggests that
there's a mistake somewhere, so having the system automatically fix
it for you might just be masking something that would be better dealt
with manually.regards, tom lane
Can you explain the distinction?
If i build an index nulls distinct/nulls not distinct on a column that will
later be used in an
alter index add constraint ... using index
both indexes can have at least one null.
from that point using an index in the creation of a constraint should not
matter if the index could potentially have one or many nulls.
If the pre version 15 syntax is correct, then the post version 15 syntax
should also be correct.
If one is not valid, then both should not be valid.
Or am i missing something here?
Reiner
On 15.12.22 00:27, Tom Lane wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
constraints but allow them for unique constraints. Is this along the lines of
what you had in mind?Needs more than zero comments in the code, and why bother testing
is_alter_table in index_check_primary_key? We're disallowing
this case across-the-board, no matter how you get to it.I'll defer to Peter on whether this is in fact the right way to go,
or we should relax the syntax restriction as David suggests.
My first instinct was to just fix pg_dump to not dump syntax that can't
be loaded in.
It shouldn't matter what null treatment the underlying unique index has,
since the primary key can't have nulls anyway, so either type of index
should be acceptable. But then we'd need to think through a bunch of
possible ALTER behaviors. For example, if we just change pg_dump and
leave the index as is, a subsequent dump and restore would lose the
original null treatment flag. What if someone then wants to re-detach
the constraint from the index? (Does that exist now? Maybe not, but it
could.) What should happen then? This could all be worked out, I
think, but it would need more thought.
In short, I think preventing the ALTER command, as proposed in this
patch, seems like a good solution for the moment. Additional work to
enable some of this could follow later independently.
On 16 Dec 2022, at 09:07, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
On 15.12.22 00:27, Tom Lane wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
constraints but allow them for unique constraints. Is this along the lines of
what you had in mind?Needs more than zero comments in the code, and why bother testing
is_alter_table in index_check_primary_key? We're disallowing
this case across-the-board, no matter how you get to it.
I'll defer to Peter on whether this is in fact the right way to go,
or we should relax the syntax restriction as David suggests.My first instinct was to just fix pg_dump to not dump syntax that can't be loaded in.
It shouldn't matter what null treatment the underlying unique index has, since the primary key can't have nulls anyway, so either type of index should be acceptable. But then we'd need to think through a bunch of possible ALTER behaviors. For example, if we just change pg_dump and leave the index as is, a subsequent dump and restore would lose the original null treatment flag. What if someone then wants to re-detach the constraint from the index? (Does that exist now? Maybe not, but it could.) What should happen then? This could all be worked out, I think, but it would need more thought.
In short, I think preventing the ALTER command, as proposed in this patch, seems like a good solution for the moment. Additional work to enable some of this could follow later independently.
The attached removes the change from pg_dump and only prohibits the ALTER TABLE
command for attaching the index. Since it will render dumps unable to be
restored I also added a check to pg_upgrade to cover the case.
--
Daniel Gustafsson https://vmware.com/
Attachments:
v3-0001-Disallow-NULLS-NOT-DISTINCT-indexes-for-primary-k.patchapplication/octet-stream; name=v3-0001-Disallow-NULLS-NOT-DISTINCT-indexes-for-primary-k.patch; x-unix-mode=0644Download+118-1
Daniel Gustafsson <daniel@yesql.se> writes:
The attached removes the change from pg_dump and only prohibits the ALTER TABLE
command for attaching the index. Since it will render dumps unable to be
restored I also added a check to pg_upgrade to cover the case.
That doesn't seem like a great answer. I understood Peter to be
favoring both aspects of the previous patch.
regards, tom lane
On 16 Dec 2022, at 17:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
The attached removes the change from pg_dump and only prohibits the ALTER TABLE
command for attaching the index. Since it will render dumps unable to be
restored I also added a check to pg_upgrade to cover the case.That doesn't seem like a great answer. I understood Peter to be
favoring both aspects of the previous patch.
Oh, ok, if so then I misunderstood. No worries, v2 is then the one to consider.
--
Daniel Gustafsson https://vmware.com/