ANSI Compliant Inserts
Reports missing values as bad.
BAD: INSERT INTO tab (col1, col2) VALUES ('val1');
GOOD: INSERT INTO tab (col1, col2) VALUES ('val1', 'val2');
Regress tests against DEFAULT and normal values as they're managed
slightly different.
Attachments:
ansiinsert.patchtext/plain; charset=ISO-8859-1; name=ansiinsert.patchDownload+40-8
Rod Taylor <rbt@zort.ca> writes:
/* ! * XXX It is possible that the targetlist has fewer entries than were ! * in the columns list. We do not consider this an error. Perhaps we ! * should, if the columns list was explicitly given? */ =20=20 /* done building the range table and jointree */ qry->rtable =3D pstate->p_rtable; --- 547,558 ---- } =20=20 /* ! * Ensure that the targetlist has the same number of entries ! * that were present in the columns list. Don't do the check ! * for select statements. */ + if (stmt->cols !=3D NIL && (icolumns !=3D NIL || attnos !=3D NIL)) + elog(ERROR, "INSERT has more target columns than expressions");
What's the rationale for changing this exactly?
The code might or might not need changing (I believe the XXX comment
questioning it is mine, in fact) but changing behavior without any
pghackers discussion is not the way to approach this.
In general I'm suspicious of rejecting cases we used to accept for
no good reason other than that it's not in the spec. There is a LOT
of Postgres behavior that's not in the spec.
regards, tom lane
Tom Lane wrote:
Rod Taylor <rbt@zort.ca> writes:
/* ! * XXX It is possible that the targetlist has fewer entries than were ! * in the columns list. We do not consider this an error. Perhaps we ! * should, if the columns list was explicitly given? */ =20=20 /* done building the range table and jointree */ qry->rtable =3D pstate->p_rtable; --- 547,558 ---- } =20=20 /* ! * Ensure that the targetlist has the same number of entries ! * that were present in the columns list. Don't do the check ! * for select statements. */ + if (stmt->cols !=3D NIL && (icolumns !=3D NIL || attnos !=3D NIL)) + elog(ERROR, "INSERT has more target columns than expressions");What's the rationale for changing this exactly?
The code might or might not need changing (I believe the XXX comment
questioning it is mine, in fact) but changing behavior without any
pghackers discussion is not the way to approach this.In general I'm suspicious of rejecting cases we used to accept for
no good reason other than that it's not in the spec. There is a LOT
of Postgres behavior that's not in the spec.
TODO has:
o Disallow missing columns in INSERT ... VALUES, per ANSI
I think it should be done because it is very easy to miss columns on
INSERT without knowing it. I think our current behavior is too
error-prone. Now, if we want to just throw a NOTICE is such cases, that
would work too.
Clearly he didn't need discussion because it was already on the TODO
list. I guess the question is whether it should have had a question
mark. I certainly didn't think so.
Also, I thought we were going to fix COPY to reject missing columns too.
I just can't see a valid reason for allowing missing columns in either
case, except to hide errors.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
In general I'm suspicious of rejecting cases we used to accept for
no good reason other than that it's not in the spec. There is a LOT
of Postgres behavior that's not in the spec.
TODO has:
o Disallow missing columns in INSERT ... VALUES, per ANSI
Where's the discussion that's the basis of that entry? I don't recall
any existing consensus on this (though maybe I forgot).
There are a fair number of things in the TODO list that you put there
because you liked 'em, but that doesn't mean everyone else agrees.
I certainly will not accept "once it's on the TODO list it cannot be
questioned"...
regards, tom lane
Tom Lane wrote:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
In general I'm suspicious of rejecting cases we used to accept for
no good reason other than that it's not in the spec. There is a LOT
of Postgres behavior that's not in the spec.TODO has:
o Disallow missing columns in INSERT ... VALUES, per ANSIWhere's the discussion that's the basis of that entry? I don't recall
any existing consensus on this (though maybe I forgot).
I assume someone (Peter?) looked it up and reported that our current
behavior was incorrect and not compliant. I didn't do the research in
whether it was compliant.
There are a fair number of things in the TODO list that you put there
because you liked 'em, but that doesn't mean everyone else agrees.
I certainly will not accept "once it's on the TODO list it cannot be
questioned"...
I put it there because I didn't think there was any question. If I was
wrong, I can add a question mark to it.
Do you want to argue we should continue allowing it? Also, what about
missing trailling columns in COPY?
If we continue allowing missing INSERT columns, I am not sure we can
claim it is an extension or whether the standard requires the query to
fail.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
A team member had a bug in their SQL code which would have been caught
with this, so I looked it up. Found the TODO entry indicating it was
something that should be done. It was fairly simple to do, so I went
forward with it.
If it's not wanted, then feel free to reject the patch and remove the
TODO item -- or change the TODO item to indicate discussion is
required.
--
Rod Taylor
Your eyes are weary from staring at the CRT. You feel sleepy. Notice
how restful it is to watch the cursor blink. Close your eyes. The
opinions stated above are yours. You cannot imagine why you ever felt
otherwise.
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Rod Taylor" <rbt@zort.ca>
Cc: <pgsql-patches@postgresql.org>
Sent: Sunday, April 14, 2002 11:09 PM
Subject: Re: [PATCHES] ANSI Compliant Inserts
Rod Taylor <rbt@zort.ca> writes:
/*
! * XXX It is possible that the targetlist has fewer entries than
were
! * in the columns list. We do not consider this an error.
Perhaps we
! * should, if the columns list was explicitly given? */ =20=20 /* done building the range table and jointree */ qry->rtable =3D pstate->p_rtable; --- 547,558 ---- } =20=20 /* ! * Ensure that the targetlist has the same number of entries ! * that were present in the columns list. Don't do the check ! * for select statements. */ + if (stmt->cols !=3D NIL && (icolumns !=3D NIL || attnos !=3D
NIL))
+ elog(ERROR, "INSERT has more target columns than expressions");
What's the rationale for changing this exactly?
The code might or might not need changing (I believe the XXX comment
questioning it is mine, in fact) but changing behavior without any
pghackers discussion is not the way to approach this.In general I'm suspicious of rejecting cases we used to accept for
no good reason other than that it's not in the spec. There is a LOT
of Postgres behavior that's not in the spec.regards, tom lane
---------------------------(end of
broadcast)---------------------------
Show quoted text
TIP 4: Don't 'kill -9' the postmaster
Rod Taylor wrote:
A team member had a bug in their SQL code which would have been caught
with this, so I looked it up. Found the TODO entry indicating it was
something that should be done. It was fairly simple to do, so I went
forward with it.If it's not wanted, then feel free to reject the patch and remove the
TODO item -- or change the TODO item to indicate discussion is
required.
Yes, that is the point. Tom thinks discussion is required on this item,
while I can't imagine why. OK, Tom, let's discuss.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
Do you want to argue we should continue allowing it?
No; I'm objecting that there hasn't been adequate discussion about
this change of behavior.
BTW, if the rationale for the change is "ANSI compliance" then the patch
is still wrong. SQL92 says:
3) No <column name> of T shall be identified more than once. If the
<insert column list> is omitted, then an <insert column list>
that identifies all columns of T in the ascending sequence of
their ordinal positions within T is implicit.
5) Let QT be the table specified by the <query expression>. The
degree of QT shall be equal to the number of <column name>s in
the <insert column list>.
The patch enforces equality only for the case of an explicit <insert
column list> --- which is the behavior I suggested in the original
comment, but the spec clearly requires an exact match for an implicit
list too. How tight do we want to get?
In any case this discussion should be taking place someplace more public
than -patches.
regards, tom lane
I submitted a patch which would make Postgresql ANSI compliant in
regards to INSERT with a provided column list. As Tom states below,
this is not full compliance.
CREATE TABLE tab(col1 text, col2 text);
INSERT INTO tab (col1, col2) VALUES ('val1'); -- bad by spec (enforced
by patch)
INSERT INTO tab (col1, col2) VALUES ('val1', 'val2'); -- good
INSERT INTO tab VALUES ('val1'); -- bad by spec (not enforced)
INSERT INTO tab VALUES ('val1', 'val2'); -- good
Currently in postgres all of the above are valid. I'd like to rule
out the first case (as enforced by the patch) as it's obvious the user
had intended to have two values. Especially useful when the user
misses a value and inserts bad data into the table as a result.
For the latter one, it could be argued that the user understands the
table in question and has inserted the values they require. New
columns are added at the end, and probably don't affect the operation
in question so why should it be changed to suit new columns? But,
automated code should always be written with the columns explicitly
listed, so this may be a user who has simply forgotten to add the
value -- easy to do on wide tables.
Thoughts?
--
Rod Taylor
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "Rod Taylor" <rbt@zort.ca>; <pgsql-patches@postgresql.org>
Sent: Sunday, April 14, 2002 11:49 PM
Subject: Re: [PATCHES] ANSI Compliant Inserts
Bruce Momjian <pgman@candle.pha.pa.us> writes:
Do you want to argue we should continue allowing it?
No; I'm objecting that there hasn't been adequate discussion about
this change of behavior.BTW, if the rationale for the change is "ANSI compliance" then the
patch
is still wrong. SQL92 says:
3) No <column name> of T shall be identified more than
once. If the
<insert column list> is omitted, then an <insert column
list>
that identifies all columns of T in the ascending
sequence of
their ordinal positions within T is implicit.
5) Let QT be the table specified by the <query expression>.
The
degree of QT shall be equal to the number of <column
name>s in
the <insert column list>.
The patch enforces equality only for the case of an explicit <insert
column list> --- which is the behavior I suggested in the original
comment, but the spec clearly requires an exact match for an
implicit
list too. How tight do we want to get?
In any case this discussion should be taking place someplace more
public
Show quoted text
than -patches.
regards, tom lane
"Rod Taylor" <rbt@zort.ca> writes:
CREATE TABLE tab(col1 text, col2 text);
INSERT INTO tab (col1, col2) VALUES ('val1'); -- bad by spec (enforced
by patch)
INSERT INTO tab (col1, col2) VALUES ('val1', 'val2'); -- good
INSERT INTO tab VALUES ('val1'); -- bad by spec (not enforced)
INSERT INTO tab VALUES ('val1', 'val2'); -- good
Currently in postgres all of the above are valid. I'd like to rule
out the first case (as enforced by the patch) as it's obvious the user
had intended to have two values.
Seems reasonable.
For the latter one, it could be argued that the user understands the
table in question and has inserted the values they require.
Ruling out this case would break a technique that I've used a lot in the
past, which is to put defaultable columns (eg, SERIAL columns) at the
end, so that they can simply be left out of quick manual inserts.
So I agree with this part too. (I wouldn't necessarily write
application code that way, but then I believe in the theory that robust
application code should always specify an explicit column list.)
For the record --- I actually am in favor of this patch; but I wanted
to see the change discussed and defended in a more widely-read mailing
list than -patches. If there are no objections from the assembled
hackers, apply away ...
regards, tom lane
[ Discussion moved to hackers.]
We are discussing TODO item:
o Disallow missing columns in INSERT ... VALUES, per ANSI
Tom Lane wrote:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
Do you want to argue we should continue allowing it?
No; I'm objecting that there hasn't been adequate discussion about
this change of behavior.
So, you don't want to allow it, I don't want to allow it, the patch
author doesn't want to allow it. The reason the item doesn't require
much discussion is that I can't imagine anyone arguing we should allow
it. If there is anyone out there that doesn't want the listed TODO item
completed, please chime in now.
BTW, if the rationale for the change is "ANSI compliance" then the patch
is still wrong. SQL92 says:3) No <column name> of T shall be identified more than once. If the
<insert column list> is omitted, then an <insert column list>
that identifies all columns of T in the ascending sequence of
their ordinal positions within T is implicit.5) Let QT be the table specified by the <query expression>. The
degree of QT shall be equal to the number of <column name>s in
the <insert column list>.The patch enforces equality only for the case of an explicit <insert
column list> --- which is the behavior I suggested in the original
comment, but the spec clearly requires an exact match for an implicit
list too. How tight do we want to get?
Yes, I think we want both implicit and explicit column names to match
the VALUES list. We do have DEFAULT for INSERT now, so that should make
things somewhat easier for people wanting to insert DEFAULT values
without specifying the column list.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Rod Taylor wrote:
For the latter one, it could be argued that the user understands the
table in question and has inserted the values they require. New
columns are added at the end, and probably don't affect the operation
in question so why should it be changed to suit new columns? But,
automated code should always be written with the columns explicitly
listed, so this may be a user who has simply forgotten to add the
value -- easy to do on wide tables.
I think our new DEFAULT for insert allows people to properly match all
columns, and I think it is too error prone to allow missing columns in
any INSERT.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Tom Lane wrote:
"Rod Taylor" <rbt@zort.ca> writes:
CREATE TABLE tab(col1 text, col2 text);
INSERT INTO tab (col1, col2) VALUES ('val1'); -- bad by spec (enforced
by patch)
INSERT INTO tab (col1, col2) VALUES ('val1', 'val2'); -- goodINSERT INTO tab VALUES ('val1'); -- bad by spec (not enforced)
INSERT INTO tab VALUES ('val1', 'val2'); -- goodCurrently in postgres all of the above are valid. I'd like to rule
out the first case (as enforced by the patch) as it's obvious the user
had intended to have two values.Seems reasonable.
For the latter one, it could be argued that the user understands the
table in question and has inserted the values they require.Ruling out this case would break a technique that I've used a lot in the
past, which is to put defaultable columns (eg, SERIAL columns) at the
end, so that they can simply be left out of quick manual inserts.
So I agree with this part too. (I wouldn't necessarily write
application code that way, but then I believe in the theory that robust
application code should always specify an explicit column list.)
Yes, I understand the tempation to put the columns needing default at
the end and skipping them on INSERT. However, our new DEFAULT insert
value seems to handle that nicely, certainly better than the old code
did, and I think the added robustness of now requiring full columns on
INSERT is worth it.
I realize this could break some apps, but with the new DEFAULT value, it
seems like a good time to reign in this error-prone capability.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
INSERT INTO tab VALUES ('val1'); -- bad by spec (not enforced)
INSERT INTO tab VALUES ('val1', 'val2'); -- goodI recall that this was the behavior we agreed we wanted. IMHO, it
would
be conditional on the INSERT ... VALUES (DEFAULT) capability being
provided. I'm not sure if that is there yet.
My patch for that was applied a couple weeks ago.
Import Notes
Reference msg id not found: Pine.LNX.4.30.0204150023520.717-100000@peter.localdomain | Resolved by subject fallback
Bruce Momjian <pgman@candle.pha.pa.us> writes:
Tom Lane wrote:
Ruling out this case would break a technique that I've used a lot in the
past, which is to put defaultable columns (eg, SERIAL columns) at the
end, so that they can simply be left out of quick manual inserts.
Yes, I understand the tempation to put the columns needing default at
the end and skipping them on INSERT. However, our new DEFAULT insert
value seems to handle that nicely, certainly better than the old code
did, and I think the added robustness of now requiring full columns on
INSERT is worth it.
If I have two or three defaultable columns (say, a SERIAL primary key
and an insertion timestamp), it's going to be a pain in the neck to
have to write DEFAULT, DEFAULT, ... at the end of every insert.
I feel that people who want error cross-checking on this will have used
an explicit column list anyway. Therefore, Rod's patch tightens the
case that should be tight, while still being appropriately loose for
casual manual inserts.
BTW, I do *not* agree with equating this case with COPY. COPY is mostly
used for loading dumped data, and so it's reasonable to make different
tradeoffs between error checking and friendliness for COPY and INSERT.
regards, tom lane
Rod Taylor writes:
I submitted a patch which would make Postgresql ANSI compliant in
regards to INSERT with a provided column list. As Tom states below,
this is not full compliance.CREATE TABLE tab(col1 text, col2 text);
INSERT INTO tab (col1, col2) VALUES ('val1'); -- bad by spec (enforced
by patch)
INSERT INTO tab (col1, col2) VALUES ('val1', 'val2'); -- goodINSERT INTO tab VALUES ('val1'); -- bad by spec (not enforced)
INSERT INTO tab VALUES ('val1', 'val2'); -- good
I recall that this was the behavior we agreed we wanted. IMHO, it would
be conditional on the INSERT ... VALUES (DEFAULT) capability being
provided. I'm not sure if that is there yet.
--
Peter Eisentraut peter_e@gmx.net
Peter Eisentraut <peter_e@gmx.net> writes:
I recall that this was the behavior we agreed we wanted. IMHO, it would
be conditional on the INSERT ... VALUES (DEFAULT) capability being
provided. I'm not sure if that is there yet.
That is there now. Do you recall when this was discussed before?
I couldn't remember if there'd been any real discussion or not.
regards, tom lane
Peter Eisentraut wrote:
Rod Taylor writes:
I submitted a patch which would make Postgresql ANSI compliant in
regards to INSERT with a provided column list. As Tom states below,
this is not full compliance.CREATE TABLE tab(col1 text, col2 text);
INSERT INTO tab (col1, col2) VALUES ('val1'); -- bad by spec (enforced
by patch)
INSERT INTO tab (col1, col2) VALUES ('val1', 'val2'); -- goodINSERT INTO tab VALUES ('val1'); -- bad by spec (not enforced)
INSERT INTO tab VALUES ('val1', 'val2'); -- goodI recall that this was the behavior we agreed we wanted. IMHO, it would
be conditional on the INSERT ... VALUES (DEFAULT) capability being
provided. I'm not sure if that is there yet.
Yes, it is key to have DEFAULT working before we change this, and it is
in CVS now, committed a week or two ago.
Peter, are you saying you don't want to require all columns to be
specified when INSERT doesn't list the columns?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian writes:
Peter, are you saying you don't want to require all columns to be
specified when INSERT doesn't list the columns?
Yes, that's what I'm saying. Too much breakage and annoyance potential in
that change.
--
Peter Eisentraut peter_e@gmx.net
IMHO, from a developers/users prospective I want to get atleast a NOTICE
when they mismatch, and really preferably I want the query to fail
because if I'm naming a column list then forget a value for it something
is wrong...
Bruce Momjian wrote:
Show quoted text
Tom Lane wrote:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
In general I'm suspicious of rejecting cases we used to accept for
no good reason other than that it's not in the spec. There is a LOT
of Postgres behavior that's not in the spec.TODO has:
o Disallow missing columns in INSERT ... VALUES, per ANSIWhere's the discussion that's the basis of that entry? I don't recall
any existing consensus on this (though maybe I forgot).I assume someone (Peter?) looked it up and reported that our current
behavior was incorrect and not compliant. I didn't do the research in
whether it was compliant.There are a fair number of things in the TODO list that you put there
because you liked 'em, but that doesn't mean everyone else agrees.
I certainly will not accept "once it's on the TODO list it cannot be
questioned"...I put it there because I didn't think there was any question. If I was
wrong, I can add a question mark to it.Do you want to argue we should continue allowing it? Also, what about
missing trailling columns in COPY?If we continue allowing missing INSERT columns, I am not sure we can
claim it is an extension or whether the standard requires the query to
fail.