BUG #5234: ALTER TABLE ... RENAME COLUMN change view definition incorrectly
The following bug has been logged online:
Bug reference: 5234
Logged by: Sergey Burladyan
Email address: eshkinkot@gmail.com
PostgreSQL version: 8.3.8
Operating system: Debian GNU/Linux 5.0.3 (lenny) + testing
Description: ALTER TABLE ... RENAME COLUMN change view definition
incorrectly
Details:
reported by Weed at http://www.sql.ru/forum/actualthread.aspx?tid=717835
(Russian)
Example:
create table a (i int, v text);
create table b (j int, v text);
create view v_using as select * from a left join b using (v);
alter table a rename v to o;
\d v_using
CREATE TABLE
CREATE TABLE
CREATE VIEW
ALTER TABLE
View "public.v_using"
Column | Type | Modifiers
--------+---------+-----------
v | text |
i | integer |
j | integer |
View definition:
SELECT a.o AS v, a.i, b.j
FROM a
LEFT JOIN b USING (v);
View is still working, but it text definition is incorrect:
t1=> select * from v_using ;
v | i | j
---+---+---
(0 rows)
t1=> SELECT a.o AS v, a.i, b.j
t1-> FROM a
t1-> LEFT JOIN b USING (v);
ERROR: 42703: column "v" specified in USING clause does not exist in left
table
LOCATION: transformFromClauseItem, parse_clause.c:813
If you dump database in this state, when you cannot restore this dump
without manual fix:
$ pg_dump -Fc -f dump t1
$ pg_restore dump | grep -A2 VIEW
-- Name: v_using; Type: VIEW; Schema: public; Owner: seb
--
CREATE VIEW v_using AS
SELECT a.o AS v, a.i, b.j FROM (a LEFT JOIN b USING (v));
$ LANG=C sudo -u postgres pg_restore -c -d t1 dump
. . .
pg_restore: [archiver (db)] could not execute query: ERROR: column "v"
specified in USING clause does not exist in left table
Command was: CREATE VIEW v_using AS
SELECT a.o AS v, a.i, b.j FROM (a LEFT JOIN b USING (v));
. . .
On Sun, Dec 6, 2009 at 5:41 PM, Sergey Burladyan <eshkinkot@gmail.com> wrote:
The following bug has been logged online:
Bug reference: 5234
Logged by: Sergey Burladyan
Email address: eshkinkot@gmail.com
PostgreSQL version: 8.3.8
Operating system: Debian GNU/Linux 5.0.3 (lenny) + testing
Description: ALTER TABLE ... RENAME COLUMN change view definition
incorrectly
Details:reported by Weed at http://www.sql.ru/forum/actualthread.aspx?tid=717835
(Russian)Example:
create table a (i int, v text);
create table b (j int, v text);
create view v_using as select * from a left join b using (v);
alter table a rename v to o;
\d v_usingCREATE TABLE
CREATE TABLE
CREATE VIEW
ALTER TABLE
View "public.v_using"
Column | Type | Modifiers
--------+---------+-----------
v | text |
i | integer |
j | integer |
View definition:
SELECT a.o AS v, a.i, b.j
FROM a
LEFT JOIN b USING (v);View is still working, but it text definition is incorrect:
t1=> select * from v_using ;
v | i | j
---+---+---
(0 rows)t1=> SELECT a.o AS v, a.i, b.j
t1-> FROM a
t1-> LEFT JOIN b USING (v);
ERROR: 42703: column "v" specified in USING clause does not exist in left
table
LOCATION: transformFromClauseItem, parse_clause.c:813If you dump database in this state, when you cannot restore this dump
without manual fix:
$ pg_dump -Fc -f dump t1
$ pg_restore dump | grep -A2 VIEW
-- Name: v_using; Type: VIEW; Schema: public; Owner: seb
--CREATE VIEW v_using AS
SELECT a.o AS v, a.i, b.j FROM (a LEFT JOIN b USING (v));$ LANG=C sudo -u postgres pg_restore -c -d t1 dump
. . .
pg_restore: [archiver (db)] could not execute query: ERROR: column "v"
specified in USING clause does not exist in left table
Command was: CREATE VIEW v_using AS
SELECT a.o AS v, a.i, b.j FROM (a LEFT JOIN b USING (v));
The problem here seems to be that get_from_clause_item() is a bit
naive about the JoinExpr representation. usingClause is stored as a
text argument, but get_from_clause_item() is assuming that the schema
definitions have not changed since the parse tree was created.
isNatural has the same problem.
I'm not an expert on this area of the code, but can we just ignore
isNatural and usingClause when deparsing? The attached patch fixes
the bug for me, although I couldn't swear as to whether it breaks
anything else.
...Robert
Attachments:
deparse-without-isnatural-or-using.patchtext/x-diff; charset=US-ASCII; name=deparse-without-isnatural-or-using.patchDownload+32-83
Robert Haas <robertmhaas@gmail.com> writes:
I'm not an expert on this area of the code, but can we just ignore
isNatural and usingClause when deparsing?
No. These properties are *not* ignorable because doing so changes the
set of returned columns --- you should get only one column out not two.
My reading of the spec is that USING (and therefore NATURAL) is defined
to join identically named columns. Therefore, renaming one of the input
columns as the OP did *should* indeed *must* break the view. The problem
is not how to make it work, it's how to give an error message that
doesn't look like an internal failure.
(This wasn't one of the SQL committee's better ideas IMO, but ours not
to reason why.)
regards, tom lane
On Thu, Dec 10, 2009 at 1:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I'm not an expert on this area of the code, but can we just ignore
isNatural and usingClause when deparsing?No. These properties are *not* ignorable because doing so changes the
set of returned columns --- you should get only one column out not two.My reading of the spec is that USING (and therefore NATURAL) is defined
to join identically named columns. Therefore, renaming one of the input
columns as the OP did *should* indeed *must* break the view. The problem
is not how to make it work, it's how to give an error message that
doesn't look like an internal failure.
That seems ugly and unnecessary. I think we might be able to define
ourselves out of this problem. We don't guarantee (and have never
guaranteed) that selecting from a stored view will produce the same
results as re-executing the original query. For example, * refers the
list of columns at definition-time, not execution-time, and if a
column is renamed, the view still refers to the same column; it
doesn't start crashing, nor would we want it to. Similarly, here, the
USING is internally converted to an equality join on the two columns,
and the ambiguous output column is, I think, resolved in favor of one
of them. I think we can just say that that conversion happens in toto
at parse-time, just as the *-to-column-list conversion and the
column-name-to-column-reference conversions do. This seems like a
significantly more useful behavior and as a fringe benefit it
simplifies the code.
Moreover, it's basically what we're already doing for so long as the
view remains safely stored in the database; it just becomes
undumpable. If we adopt your solution, backpatching it might break
working applications and not backpatching it will leave residual
breakage when people try to upgrade. If we just make the dumping code
work the same way the executor already does, we don't have that
problem.
...Robert
"Robert" == Robert Haas <robertmhaas@gmail.com> writes:
On Thu, Dec 10, 2009 at 1:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
My reading of the spec is that USING (and therefore NATURAL) is
defined to join identically named columns. Therefore, renaming
one of the input columns as the OP did *should* indeed *must*
break the view. The problem is not how to make it work, it's how
to give an error message that doesn't look like an internal
failure.
Robert> That seems ugly and unnecessary. I think we might be able to
Robert> define ourselves out of this problem. We don't guarantee
Robert> (and have never guaranteed) that selecting from a stored view
Robert> will produce the same results as re-executing the original
Robert> query. For example, * refers the list of columns at
Robert> definition-time, not execution-time, and if a column is
Robert> renamed, the view still refers to the same column; it doesn't
Robert> start crashing, nor would we want it to. Similarly, here,
Robert> the USING is internally converted to an equality join on the
Robert> two columns, and the ambiguous output column is, I think,
Robert> resolved in favor of one of them. I think we can just say
Robert> that that conversion happens in toto at parse-time, just as
Robert> the *-to-column-list conversion and the
Robert> column-name-to-column-reference conversions do. This seems
Robert> like a significantly more useful behavior and as a fringe
Robert> benefit it simplifies the code.
There's another possible solution (albeit a somewhat nontrivial one)
which came up when a bunch of us were talking about this one on IRC;
which is to handle the problem in the view deparse: if a column used
in a USING clause has been renamed, add an alias to the query that
renames it back, e.g.
select ... from table1 as table1(v,a) join ... using (v)
This would have to affect all the other references to that same column
in the query, so you'd need to do something like this: before deparsing,
walk the query looking for offending USING clauses, and make a list of
renamings to apply to column names.
I haven't tried actually implementing this, but I believe it is
possible.
--
Andrew (irc:RhodiumToad)
On Thu, Dec 10, 2009 at 10:48 AM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
"Robert" == Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Dec 10, 2009 at 1:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> My reading of the spec is that USING (and therefore NATURAL) is
>> defined to join identically named columns. Therefore, renaming
>> one of the input columns as the OP did *should* indeed *must*
>> break the view. The problem is not how to make it work, it's how
>> to give an error message that doesn't look like an internal
>> failure.Robert> That seems ugly and unnecessary. I think we might be able to
Robert> define ourselves out of this problem. We don't guarantee
Robert> (and have never guaranteed) that selecting from a stored view
Robert> will produce the same results as re-executing the original
Robert> query. For example, * refers the list of columns at
Robert> definition-time, not execution-time, and if a column is
Robert> renamed, the view still refers to the same column; it doesn't
Robert> start crashing, nor would we want it to. Similarly, here,
Robert> the USING is internally converted to an equality join on the
Robert> two columns, and the ambiguous output column is, I think,
Robert> resolved in favor of one of them. I think we can just say
Robert> that that conversion happens in toto at parse-time, just as
Robert> the *-to-column-list conversion and the
Robert> column-name-to-column-reference conversions do. This seems
Robert> like a significantly more useful behavior and as a fringe
Robert> benefit it simplifies the code.There's another possible solution (albeit a somewhat nontrivial one)
which came up when a bunch of us were talking about this one on IRC;
which is to handle the problem in the view deparse: if a column used
in a USING clause has been renamed, add an alias to the query that
renames it back, e.g.
select ... from table1 as table1(v,a) join ... using (v)This would have to affect all the other references to that same column
in the query, so you'd need to do something like this: before deparsing,
walk the query looking for offending USING clauses, and make a list of
renamings to apply to column names.I haven't tried actually implementing this, but I believe it is
possible.
What advantage does this offer?
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
On Thu, Dec 10, 2009 at 1:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
My reading of the spec is that USING (and therefore NATURAL) is defined
to join identically named columns. �Therefore, renaming one of the input
columns as the OP did *should* indeed *must* break the view. �The problem
is not how to make it work, it's how to give an error message that
doesn't look like an internal failure.
That seems ugly and unnecessary. I think we might be able to define
ourselves out of this problem. We don't guarantee (and have never
guaranteed) that selecting from a stored view will produce the same
results as re-executing the original query. For example, * refers the
list of columns at definition-time, not execution-time,
Um, aren't you contradicting yourself there?
The problem with USING is that it is not merely a join condition but
affects the set of columns emitted by the join. It can't be converted
to a simple ON without changing the semantics, and I don't believe we
should try.
regards, tom lane
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
There's another possible solution (albeit a somewhat nontrivial one)
which came up when a bunch of us were talking about this one on IRC;
which is to handle the problem in the view deparse: if a column used
in a USING clause has been renamed, add an alias to the query that
renames it back, e.g.
select ... from table1 as table1(v,a) join ... using (v)
Hmm. Cute, but I wonder why we shouldn't just be throwing an error.
As I said last night, the only thing I see wrong with the current
behavior is that the error message isn't phrased in a way that makes
it obvious it's the user's fault.
regards, tom lane
On Thu, Dec 10, 2009 at 11:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
There's another possible solution (albeit a somewhat nontrivial one)
which came up when a bunch of us were talking about this one on IRC;
which is to handle the problem in the view deparse: if a column used
in a USING clause has been renamed, add an alias to the query that
renames it back, e.g.
select ... from table1 as table1(v,a) join ... using (v)Hmm. Cute, but I wonder why we shouldn't just be throwing an error.
As I said last night, the only thing I see wrong with the current
behavior is that the error message isn't phrased in a way that makes
it obvious it's the user's fault.
I think the problem we're trying to solve is that the view works but
it isn't dumpable.
...Robert
On Thu, Dec 10, 2009 at 11:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Thu, Dec 10, 2009 at 1:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
My reading of the spec is that USING (and therefore NATURAL) is defined
to join identically named columns. Therefore, renaming one of the input
columns as the OP did *should* indeed *must* break the view. The problem
is not how to make it work, it's how to give an error message that
doesn't look like an internal failure.That seems ugly and unnecessary. I think we might be able to define
ourselves out of this problem. We don't guarantee (and have never
guaranteed) that selecting from a stored view will produce the same
results as re-executing the original query. For example, * refers the
list of columns at definition-time, not execution-time,Um, aren't you contradicting yourself there?
Not that I can see.
The problem with USING is that it is not merely a join condition but
affects the set of columns emitted by the join. It can't be converted
to a simple ON without changing the semantics, and I don't believe we
should try.
The OP seemed to be pretty clear on what the semantics should be, and
I'm not confused either. Why are you? :-)
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
On Thu, Dec 10, 2009 at 11:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The problem with USING is that it is not merely a join condition but
affects the set of columns emitted by the join. �It can't be converted
to a simple ON without changing the semantics, and I don't believe we
should try.
The OP seemed to be pretty clear on what the semantics should be, and
I'm not confused either. Why are you? :-)
I'm not confused: it's an error condition. You have not explained why
it isn't.
What would actually be ideal is to throw an error on the attempted column
rename, but I doubt the problem is worth the additions to system
infrastructure that would be necessary to make that happen.
regards, tom lane
Robert Haas escribi�:
On Thu, Dec 10, 2009 at 11:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hmm. �Cute, but I wonder why we shouldn't just be throwing an error.
As I said last night, the only thing I see wrong with the current
behavior is that the error message isn't phrased in a way that makes
it obvious it's the user's fault.I think the problem we're trying to solve is that the view works but
it isn't dumpable.
So it would also be a solution that the ALTER TABLE that renamed the
column throwed an error if there's a view that requires the name.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Thu, Dec 10, 2009 at 1:02 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
Robert Haas escribió:
On Thu, Dec 10, 2009 at 11:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hmm. Cute, but I wonder why we shouldn't just be throwing an error.
As I said last night, the only thing I see wrong with the current
behavior is that the error message isn't phrased in a way that makes
it obvious it's the user's fault.I think the problem we're trying to solve is that the view works but
it isn't dumpable.So it would also be a solution that the ALTER TABLE that renamed the
column throwed an error if there's a view that requires the name.
Yes. I don't think it's as good of a solution, but it is a solution.
...Robert
On Thu, Dec 10, 2009 at 12:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Thu, Dec 10, 2009 at 11:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The problem with USING is that it is not merely a join condition but
affects the set of columns emitted by the join. It can't be converted
to a simple ON without changing the semantics, and I don't believe we
should try.The OP seemed to be pretty clear on what the semantics should be, and
I'm not confused either. Why are you? :-)I'm not confused: it's an error condition. You have not explained why
it isn't.
Well, it's obviously not an error condition at present, because the
view works perfectly well. As far as running SELECT statements
against the view, the QUERY structure in ev_rewrite is semantically
equivalent to one without the usingClause. buildMergedJoinVar()
builds a target list entry that refers to one or both of the
underlying base tables, using COALESCE if necessary. Nothing in that
join var depends IN ANY WAY on the usingClause. The JoinExpr
similarly gets rewritten in terms of an equality construct - while
it's true that the usingClause is still present, nothing outside the
parser/deparser actually looks at it.
For that reason, your claim that we CAN'T rewrite USING to a simple ON
without changing the semantics seems to me to be false. For purposes
of 99.44% of the code, we are already doing it. The only problem is,
after we do it, and after using that rewritten version to execute the
query, we maintain the fiction that we haven't done it when we dump
the view back out, leading to inconsistent results.
...Robert