ALTER TABLE with ADD COLUMN and ADD PRIMARY KEY USING INDEX throws spurious "column contains null values"

Started by Zhang, Jieabout 7 years ago5 messageshackers
Jump to latest
#1Zhang, Jie
zhangjie2@cn.fujitsu.com

Hi all,

When I do the following:
postgres=# create table t1 (a int);
postgres=# insert into t1 values(1);
postgres=# create unique index uniq_idx on t1(a);
postgres=# alter table t1 add column b float8 not null default random(), add primary key using index uniq_idx;
ERROR: column "b" contains null values

PostgreSQL throws error "column b contains null values".

#########################################
alter table t1 add column b float8 not null default 0, add primary key using index uniq_idx;

alter table success.
#########################################

The reasons for the error are as follows.

ATController provides top level control over the phases.
Phase 1: preliminary examination of commands, create work queue
Phase 2: update system catalogs
Phase 3: scan/rewrite tables as needed

In Phase 2, when dealing with "add column b float8 not null default random()", the table is marked rewrite.
When dealing with "add primary key using index uniq_idx", ATExecAddIndexConstraint calls index_check_primary_key.

The calling order is as follows.
index_check_primary_key()

AlterTableInternal()

ATController()

ATRewriteTables()

ATRewriteTable()

ATRewriteTable check all not-null constraints. Column a and column b need to check NOT NULL.
Unfortunately, at this time, Phase 3 hasn't been done yet.
The table is not rewrited, just marked rewrite. So, throws error "column b contains null values".

In Phase 2, if table is marked rewrite, we can do not check whether columns are NOT NULL.
Because phase 3 will do it.

Here's a patch to fix this bug.

Best Regards!

Attachments:

alter_table.patchapplication/octet-stream; name=alter_table.patchDownload+82-31
In reply to: Zhang, Jie (#1)
Re: ALTER TABLE with ADD COLUMN and ADD PRIMARY KEY USING INDEX throws spurious "column contains null values"

Hi

I did not review patch, just want add link to related bug 15580 and one another -hackers thread:

/messages/by-id/15580-d1a6de5a3d65da51@postgresql.org
/messages/by-id/CAOVr7+3C9u_ZApjxpccrorvt0aw=k8Ct5FhHRVZA-YO36V3=rg@mail.gmail.com

regards, Sergei

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zhang, Jie (#1)
Re: ALTER TABLE with ADD COLUMN and ADD PRIMARY KEY USING INDEX throws spurious "column contains null values"

"Zhang, Jie" <zhangjie2@cn.fujitsu.com> writes:

Here's a patch to fix this bug.

I took a look at this patch, but I really dislike it: it adds a mighty
ad-hoc parameter to a whole bunch of functions that shouldn't really
have anything to do with the problem. Moreover, I have little confidence
that it's really solving the problem and not introducing any new problems
(such as failure to apply the not-null check when we need to).

I think the real problem is exactly that we've got index_check_primary_key
doing its own mini ALTER TABLE in ignorance of what might be happening
in the outer ALTER TABLE. That's just ripe for order-of-operations
problems, seeing that ALTER TABLE has a lot of very careful decisions
about which steps have to happen before which other ones. Moreover,
as this old comment notes, it's a horridly inefficient approach if
the table is large:

/*
* XXX: possible future improvement: when being called from ALTER TABLE,
* it would be more efficient to merge this with the outer ALTER TABLE, so
* as to avoid two scans. But that seems to complicate DefineIndex's API
* unduly.
*/

So I thought a bit about how to fix that, and realized that we could
easily adjust the parser to emit AT_SetNotNull subcommands as part of the
outer ALTER TABLE that has the ADD PRIMARY KEY subcommand. Then,
index_check_primary_key doesn't need to do anything at all about adding
NOT NULL, although it seems like a good safety feature for it to check
that somebody else already added that.

So, attached is a WIP patch that fixes it that way. Some notes
for review:

* I initially thought that index_check_primary_key could be simplified
to the point where it *only* throws an error for missing NOT NULL.
This soon proved to be wrong, because the comments for the function
are lies, or at least obsolete: there are multiple scenarios in which
a CREATE TABLE with a PRIMARY KEY option does need this function to
perform ALTER TABLE SET NOT NULL. Fortunately, that's not so expensive
in that case, since the table must be empty. So as coded, it throws
an error if is_alter_table, and otherwise does what it did before.

* We need to fix the order of operations in ALTER TABLE phase 2 so that
any AT_SetNotNull subcommands happen after the AT_PASS_ADD_COL pass
(else the column might not be there yet) and before AT_PASS_ADD_INDEX
(else index_check_primary_key will complain). I did this by putting
AT_SetNotNull into the AT_PASS_COL_ATTRS pass and moving that to after
AT_PASS_ADD_COL; it had been before AT_PASS_ADD_COL, but that seems at
best random and at worst broken. (AT_SetIdentity is the only existing
subcommand using AT_PASS_COL_ATTRS, and it sure seems like it'd make more
sense to run it after AT_PASS_ADD_COL, so that it can work on a column
being added in the same ALTER. Am I missing something?)

* Some existing regression tests for "ALTER TABLE ONLY partitioned_table
ADD PRIMARY KEY" failed. That apparently is supposed to work if all
partitions already have a suitable unique index and NOT NULL constraint,
but it was failing because ATPrepSetNotNull wasn't expecting to be used
this way. I thought that function was a pretty terrible kluge anyway,
so what I did was to refactor things so that in this scenario we just
apply checks to see if all the partitions already have suitable NOT NULL.
Note that this represents looser semantics than what was there before,
because previously you couldn't say "ALTER TABLE ONLY partitioned_table
SET NOT NULL" if there were any partitions; now you can, if the partitions
all have suitable NOT NULL already. We probably ought to change the error
message to reflect that, but I didn't yet.

* A couple of existing test cases change error messages, like so:

-ERROR:  column "test1" named in key does not exist
+ERROR:  column "test1" of relation "atacc1" does not exist

This is because the added AT_SetNotNull subcommand runs before
AT_AddIndex, so it's the one that notices that there's not really
any such column, and historically it's spelled its error message
differently. This change seems all to the good to me, so I didn't
try to avoid it.

* I haven't yet added any test case(s) reflecting the bug fix nor
the looser semantics for adding NOT NULL to partitioned tables.
It does pass check-world as presented.

* I'm not sure whether we want to try to back-patch this, or how
far it should go. The misbehavior has been there a long time
(at least back to 8.4, I didn't check further); so the lack of
previous reports means people seldom try to do it. That may
indicate that it's not worth taking any risks of new bugs to
squash this one. (Also, I suspect that it might take a lot of
work to port this to before v10, because there are comments
suggesting that this code worked a good bit differently before.)
I do think we should shove this into v12 though.

Comments?

regards, tom lane

Attachments:

fix-alter-table-add-primary-key-vs-not-null-1.patchtext/x-diff; charset=us-ascii; name=fix-alter-table-add-primary-key-vs-not-null-1.patchDownload+172-42
#4Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#3)
Re: ALTER TABLE with ADD COLUMN and ADD PRIMARY KEY USING INDEX throws spurious "column contains null values"

On Wed, Apr 17, 2019 at 11:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

* I'm not sure whether we want to try to back-patch this, or how
far it should go. The misbehavior has been there a long time
(at least back to 8.4, I didn't check further); so the lack of
previous reports means people seldom try to do it. That may
indicate that it's not worth taking any risks of new bugs to
squash this one. (Also, I suspect that it might take a lot of
work to port this to before v10, because there are comments
suggesting that this code worked a good bit differently before.)
I do think we should shove this into v12 though.

Shoving it into v12 but not back-patching seems like a reasonable
compromise, although I have not reviewed the patch or tried to figure
out how risky it is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#4)
Re: ALTER TABLE with ADD COLUMN and ADD PRIMARY KEY USING INDEX throws spurious "column contains null values"

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Apr 17, 2019 at 11:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

* I'm not sure whether we want to try to back-patch this, or how
far it should go. The misbehavior has been there a long time
(at least back to 8.4, I didn't check further); so the lack of
previous reports means people seldom try to do it. That may
indicate that it's not worth taking any risks of new bugs to
squash this one. (Also, I suspect that it might take a lot of
work to port this to before v10, because there are comments
suggesting that this code worked a good bit differently before.)
I do think we should shove this into v12 though.

Shoving it into v12 but not back-patching seems like a reasonable
compromise, although I have not reviewed the patch or tried to figure
out how risky it is.

Here's a less-WIP patch for that. I fixed up some more stuff:

* I initially thought that index_check_primary_key could be simplified
to the point where it *only* throws an error for missing NOT NULL.
This soon proved to be wrong, because the comments for the function
are lies, or at least obsolete: there are multiple scenarios in which
a CREATE TABLE with a PRIMARY KEY option does need this function to
perform ALTER TABLE SET NOT NULL.

I decided that a cleaner way to handle this was to make the parser
generate required ALTER TABLE SET NOT NULL operations in all cases,
not just the ALTER TABLE case. This gets rid of the former confused
situation wherein transformIndexConstraint forced primary-key columns
NOT NULL in some situations and abdicated responsibility in others.

* We need to fix the order of operations in ALTER TABLE phase 2 so that
any AT_SetNotNull subcommands happen after the AT_PASS_ADD_COL pass
(else the column might not be there yet) and before AT_PASS_ADD_INDEX
(else index_check_primary_key will complain). I did this by putting
AT_SetNotNull into the AT_PASS_COL_ATTRS pass and moving that to after
AT_PASS_ADD_COL; it had been before AT_PASS_ADD_COL, but that seems at
best random and at worst broken. (AT_SetIdentity is the only existing
subcommand using AT_PASS_COL_ATTRS, and it sure seems like it'd make more
sense to run it after AT_PASS_ADD_COL, so that it can work on a column
being added in the same ALTER. Am I missing something?)

Sure enough, AT_SetIdentity is broken for the case where the column was
just created in the same ALTER command, as per test case added below.
Admittedly, that's a fairly unlikely thing to do, but it should work;
so the current ordering of these passes is wrong.

BTW, now that we have an AT_PASS_COL_ATTRS pass, it's a bit tempting to
shove other stuff that's in the nature of change-a-column-attribute into
it; there are several AT_ subcommands of that sort that are currently in
AT_PASS_MISC. I didn't do that here though.

Also, this fixes the issue complained of in
/messages/by-id/16115.1555874162@sss.pgh.pa.us

Barring objections I'll commit this tomorrow or so.

regards, tom lane

Attachments:

fix-alter-table-add-primary-key-vs-not-null-2.patchtext/x-diff; charset=us-ascii; name=fix-alter-table-add-primary-key-vs-not-null-2.patchDownload+320-109