BUG #5898: Nested "in" clauses hide bad column names

Started by Scott Dunbarabout 15 years ago5 messagesbugs
Jump to latest
#1Scott Dunbar
scott@xigole.com

The following bug has been logged online:

Bug reference: 5898
Logged by: Scott Dunbar
Email address: scott@xigole.com
PostgreSQL version: 9.0.3
Operating system: Ubuntu 10.10
Description: Nested "in" clauses hide bad column names
Details:

I have a nested in clause like:

select respondent_id from respondent where respondent_id in (select
respondent_id from chat_session where project_id in (select project_id from
project where company_id = 4));

However, in this example, there is no column named respondent_id in the
chat_session table. But the query runs and, indeed, returns all of the rows
in respondent. Since this was then part of another nested in it deleted far
more that it was supposed to.

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Scott Dunbar (#1)
Re: BUG #5898: Nested "in" clauses hide bad column names

"Scott Dunbar" <scott@xigole.com> writes:

I have a nested in clause like:

select respondent_id from respondent where respondent_id in (select
respondent_id from chat_session where project_id in (select project_id from
project where company_id = 4));

However, in this example, there is no column named respondent_id in the
chat_session table.

Probably there is one in respondent, though?  This behavior is not a bug
--- what you have there is an outer reference, and it is working exactly
as specified by the SQL standard.  Sub-selects would be a whole lot less
useful if they couldn't refer to variables of the outer query.

regards, tom lane

#3Scott Dunbar
scott@xigole.com
In reply to: Tom Lane (#2)
Re: BUG #5898: Nested "in" clauses hide bad column names

Yes, you're correct. I guess this makes sense but it does seem strange
that I can enter garbage in a query but it still runs. And in my case
the output from this (the entire table) was then used in a delete
statement that toasted the entire table. Allowing bogus SQL just seems
"wrong" but I do understand what's going on.

Thanks for your help.

On 02/22/2011 10:45 AM, Tom Lane wrote:

"Scott Dunbar"<scott@xigole.com> writes:

I have a nested in clause like:
select respondent_id from respondent where respondent_id in (select
respondent_id from chat_session where project_id in (select project_id from
project where company_id = 4));
However, in this example, there is no column named respondent_id in the
chat_session table.

Probably there is one in respondent, though?  This behavior is not a bug
--- what you have there is an outer reference, and it is working exactly
as specified by the SQL standard.  Sub-selects would be a whole lot less
useful if they couldn't refer to variables of the outer query.

regards, tom lane

--
Scott Dunbar
Xigole Systems, Inc.
Enterprise software consulting, development, and hosting
303�667�6343

#4Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Scott Dunbar (#3)
Re: BUG #5898: Nested "in" clauses hide bad column names

Scott Dunbar <scott@xigole.com> wrote:

I guess this makes sense but it does seem strange that I can enter
garbage in a query but it still runs.

It wasn't garbage.

And in my case the output from this (the entire table) was then
used in a delete statement that toasted the entire table.

I can suggest a few ways to protect yourself in such situations.

(1) I generally run any DELETE statement against data I care about
as a SELECT first. It's generally pretty easy to write it as a
SELECT tbl.* FROM which can be converted to DELETE FROM tbl after
reviewing what matches.

(2) When in doubt, use BEGIN; before running the statement. You
can review the count, run SELECTs to look at the results, etc.,
before running COMMIT; to make it "stick".

(3) Less convenient, but sometimes useful, is to EXPLAIN your query
before actually running it. If you do that with your delete, you'll
see the criterion applied to a table other than what you were
expecting, which might alert you to the problem. The estimated row
count at the top level of the plan might be another red flag.

Allowing bogus SQL just seems "wrong" but I do understand what's
going on.

If you did you wouldn't call a well formed, unambiguous, standard-
conforming statement bogus. It did exactly what you said; just not
what you meant. It pays to be a bit paranoid when running ad hoc
DML in case you accidentally don't say what you mean.

-Kevin

#5Eric Schwarzenbach
subscriber@blackbrook.org
In reply to: Scott Dunbar (#3)
Re: BUG #5898: Nested "in" clauses hide bad column names

Using table aliases prevents such problems by disambiguating the column
names. For example, if you had table aliases in just one part of this
query as in the below, you would have gotten an error instead of
deleting all those rows:

select respondent_id from respondent where respondent_id in (select
cs.respondent_id from chat_session cs where cs.project_id in (select project_id from
project where company_id = 4));

Cheers,

Eric

Show quoted text

On 2/22/2011 1:07 PM, Scott Dunbar wrote:

Yes, you're correct. I guess this makes sense but it does seem
strange that I can enter garbage in a query but it still runs. And in
my case the output from this (the entire table) was then used in a
delete statement that toasted the entire table. Allowing bogus SQL
just seems "wrong" but I do understand what's going on.

Thanks for your help.

On 02/22/2011 10:45 AM, Tom Lane wrote:

"Scott Dunbar"<scott@xigole.com> writes:

I have a nested in clause like:
select respondent_id from respondent where respondent_id in (select
respondent_id from chat_session where project_id in (select project_id from
project where company_id = 4));
However, in this example, there is no column named respondent_id in the
chat_session table.

Probably there is one in respondent, though?  This behavior is not a bug
--- what you have there is an outer reference, and it is working exactly
as specified by the SQL standard.  Sub-selects would be a whole lot less
useful if they couldn't refer to variables of the outer query.

regards, tom lane

--
Scott Dunbar
Xigole Systems, Inc.
Enterprise software consulting, development, and hosting
303·667·6343