Identity column behavior discrepancies when inserting one or many rows

Started by Jean Prulièreover 5 years ago3 messagesbugs
Jump to latest
#1Jean Prulière
jean@oclock.io

Hi there !

I am teaching PostgreSQL at an online school and we recently stumbled upon
a rather strange behavior while experimenting with the identity column
feature. I know it is not something any skilled person would do but it is
still, imho, different from what the documentation states. Our students
love testing and it is quite frustrating if no one can explain what they
found.

Here is a very simple script to reproduce what we encountered :

CREATE TABLE test (

id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
sometext text NOT NULL
);

-- that one works : INSERT 0 1
-- I think it should not and throw a 428C9
INSERT INTO test (id, sometext)

VALUES (DEFAULT, 'test');

-- this one does not : SQL state : 428C9
INSERT INTO test (id, sometext)

VALUES (DEFAULT, 'test2'),

(DEFAULT, 'test3');

Though GENERATED ALWAYS implies the absence of the identity column in the
column names list of any INSERT statement, listing it there is OK, without
the OVERRIDING SYSTEM VALUE flag, as long as only one row is inserted (and
DEFAULT is used as value, of course). But starting at 2 rows (and I can
only guess it never stops), the proper error is thrown, advising to use the
aforementioned flag.

Is there any reason I missed such behavior would be expected ?

I upgraded my cluster (was 11.9, is now 13) to the latest version to be
sure it is still effective : it is.

Here is the version and platform info returned by calling version() :
PostgreSQL 13.0 (Ubuntu 13.0-1.pgdg18.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit

It is my first bug report, sorry if I failed to report some piece of info
:-|

Peace !

Jean Pruliere

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Jean Prulière (#1)
Re: Identity column behavior discrepancies when inserting one or many rows

On Mon, Oct 12, 2020 at 4:04 AM Jean Prulière <jean@oclock.io> wrote:

Here is a very simple script to reproduce what we encountered :

CREATE TABLE test (

id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
sometext text NOT NULL
);

-- that one works : INSERT 0 1
-- I think it should not and throw a 428C9
INSERT INTO test (id, sometext)

VALUES (DEFAULT, 'test');

-- this one does not : SQL state : 428C9
INSERT INTO test (id, sometext)

VALUES (DEFAULT, 'test2'),

(DEFAULT, 'test3');

Though GENERATED ALWAYS implies the absence of the identity column in the
column names list of any INSERT statement, listing it there is OK, without
the OVERRIDING SYSTEM VALUE flag, as long as only one row is inserted (and
DEFAULT is used as value, of course). But starting at 2 rows (and I can
only guess it never stops), the proper error is thrown, advising to use the
aforementioned flag.

Is there any reason I missed such behavior would be expected ?

I agree the inconsistency seems to be undocumented but I wonder why the
second case doesn't work, not why the first one does. INSERT says: "For a
generated column, specifying this is permitted but merely specifies the
normal behavior of computing the column from its generation expression."
That says the first case works, default means the same as omitting the
column altogether.

David J.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#2)
Re: Identity column behavior discrepancies when inserting one or many rows

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Mon, Oct 12, 2020 at 4:04 AM Jean Prulière <jean@oclock.io> wrote:

Here is a very simple script to reproduce what we encountered :

CREATE TABLE test (
id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
sometext text NOT NULL
);

-- that one works : INSERT 0 1
-- I think it should not and throw a 428C9
INSERT INTO test (id, sometext)

VALUES (DEFAULT, 'test');

-- this one does not : SQL state : 428C9
INSERT INTO test (id, sometext)

VALUES (DEFAULT, 'test2'),
(DEFAULT, 'test3');

I agree the inconsistency seems to be undocumented but I wonder why the
second case doesn't work, not why the first one does. INSERT says: "For a
generated column, specifying this is permitted but merely specifies the
normal behavior of computing the column from its generation expression."
That says the first case works, default means the same as omitting the
column altogether.

Yeah. Ideally both examples should work. The fact that the second one
does not is an implementation deficiency: the code that recognizes no-op
DEFAULTs for this purpose doesn't descend into multi-row VALUES lists.
Somebody submitted a patch to improve that awhile ago, but I didn't like
the first version and there hasn't been an update yet:

https://commitfest.postgresql.org/30/2681/

Perhaps you could get your students interested in rewriting that patch?

regards, tom lane