Apparent anomaly with views and unions

Started by Guy Rouillierabout 21 years ago9 messagesgeneral
Jump to latest
#1Guy Rouillier
guyr@masergy.com

I using 8.0.1. I create 3 tables with these definitions:

create table t1 (serv_id varchar(50) not null);
create table t2 (serv_id varchar(50) not null);
create table t3 (serv_id varchar(50) not null);

Now I create a view like this:

create or replace view v1 as select * from t1;

Next, I attempt to update this view like this:
create or replace view v1 as select * from t1 union select * from t2;

I receive: ERROR: cannot change data type of view column "serv_id"

Next, I drop view v1 and recreate it using the union above. I am now
able to do the following without error:

create or replace view v1 as select * from t1 union select * from t2
union select * from t3;

So it looks like changing a view from a non-union to a union is
producing an error, but changing a view that already contains a 2-way
union to one that contains a 3-way union is allowed (and I'm assuming
any n to n+1 union transition is also allowed.)

Is this intentional? It was certainly not predictable.

--
Guy Rouillier

#2Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Guy Rouillier (#1)
Re: Apparent anomaly with views and unions

On Fri, 11 Feb 2005, Guy Rouillier wrote:

I using 8.0.1. I create 3 tables with these definitions:

create table t1 (serv_id varchar(50) not null);
create table t2 (serv_id varchar(50) not null);
create table t3 (serv_id varchar(50) not null);

Now I create a view like this:

create or replace view v1 as select * from t1;

Next, I attempt to update this view like this:
create or replace view v1 as select * from t1 union select * from t2;

I receive: ERROR: cannot change data type of view column "serv_id"

I'm pretty sure PostgreSQL treats the type of serv_id in the new view as
varchar with no limit rather than varchar(50). I think that's also not
entirely up to spec.

In the two to three case it already was a varchar with no limit and so the
type was the same.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#2)
Re: Apparent anomaly with views and unions

Stephan Szabo <sszabo@megazone.bigpanda.com> writes:

On Fri, 11 Feb 2005, Guy Rouillier wrote:

Now I create a view like this:
create or replace view v1 as select * from t1;

Next, I attempt to update this view like this:
create or replace view v1 as select * from t1 union select * from t2;

I receive: ERROR: cannot change data type of view column "serv_id"

I'm pretty sure PostgreSQL treats the type of serv_id in the new view as
varchar with no limit rather than varchar(50).

This is the same issue reported here:
http://archives.postgresql.org/pgsql-general/2004-12/msg00508.php
http://archives.postgresql.org/pgsql-hackers/2004-12/msg00408.php
and is pretty closely related to this:
http://archives.postgresql.org/pgsql-bugs/2005-02/msg00008.php

I was looking at fixing it yesterday. The obvious path to a fix is to
do select_common_type across all the members of a set-operation tree
at once, rather than pairwise as we do it now. That bothers me a bit
because it could change the semantics. Consider for example

SELECT float8col UNION ALL (SELECT numericcol UNION SELECT numericcol)

The final result will be float8 in any case, but I think that a strict
reading of the spec requires the righthand UNION to be done in numeric
type, before we promote to float8. If we promote to float8 and then
union, roundoff could make formerly distinct values not distinct,
resulting in a different number of output rows.

The above is a pretty contrived case though --- in particular,
I had to write UNION ALL for the upper level, else it would do a
duplicate-elimination pass that would mask the issue anyway. In
practice I'm not sure anyone would ever notice the difference if we
took the simple path of selecting one type across the whole tree.

Maybe it would work to resolve types only across nested nodes of exactly
the same set operator. I haven't worked through the details of this
however.

regards, tom lane

#4Guy Rouillier
guyr@masergy.com
In reply to: Tom Lane (#3)
Re: Apparent anomaly with views and unions

Tom and Stephan, thank you both for taking the time to reply. Further
comments inline.

Tom Lane wrote:

Stephan Szabo <sszabo@megazone.bigpanda.com> writes:

On Fri, 11 Feb 2005, Guy Rouillier wrote:

Now I create a view like this:
create or replace view v1 as select * from t1;

Next, I attempt to update this view like this:
create or replace view v1 as select * from t1 union select * from
t2;

I receive: ERROR: cannot change data type of view column "serv_id"

I'm pretty sure PostgreSQL treats the type of serv_id in the new view
as varchar with no limit rather than varchar(50).

This is the same issue reported here:
http://archives.postgresql.org/pgsql-general/2004-12/msg00508.php
http://archives.postgresql.org/pgsql-hackers/2004-12/msg00408.php
and is pretty closely related to this:
http://archives.postgresql.org/pgsql-bugs/2005-02/msg00008.php

We appear to have two issues here:

(1) What is the meaning of "replace" as in replace view?
(2) What are the semantics for multiple set operations?

My original issue deals with (1). I'm unfortunately not well versed in
the SQL spec, but from a layman's point of view, if I'm replacing a view
definition, I don't expect that to be rejected because it is
incompatible with the original view definition. My new definition may
use entirely different set of tables than the original view. "create or
replace view" should accomplish the same thing as "drop view; create
view".

I was looking at fixing it yesterday. The obvious path to a fix is
to do select_common_type across all the members of a set-operation
tree at once, rather than pairwise as we do it now. That bothers me
a bit because it could change the semantics. Consider for example

SELECT float8col UNION ALL (SELECT numericcol UNION SELECT
numericcol)

This seems to be addressing point (2). In this particular case, I would
expect the query engine to follow the submitter's explicit directions,
since he or she provided an explicit grouping. If no parentheses are
included, then the optimizer or query rewriter is free to do as it
wishes.

--
Guy Rouillier

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Guy Rouillier (#4)
Re: Apparent anomaly with views and unions

"Guy Rouillier" <guyr@masergy.com> writes:

We appear to have two issues here:

(1) What is the meaning of "replace" as in replace view?
(2) What are the semantics for multiple set operations?

My original issue deals with (1). I'm unfortunately not well versed in
the SQL spec, but from a layman's point of view, if I'm replacing a view
definition, I don't expect that to be rejected because it is
incompatible with the original view definition.

Well, we don't have the ability to cope with redefining a view in a way
that changes its result column list --- we don't know how to propagate
that into dependent views in any clean way. (Now, ALTER TABLE deals
with this by allowing the change only when there are no dependent views;
maybe it'd be sensible to do the same for REPLACE VIEW.) The immediate
issue here is that the redefinition shouldn't be seen as changing the
result column type. The underlying cause of that hasn't got anything
to do with views per se, it has to do with the way that we impute a
result datatype to a UNION structure.

My new definition may
use entirely different set of tables than the original view. "create or
replace view" should accomplish the same thing as "drop view; create
view".

In the cases I'm concerned about, DROP VIEW would fail unless you used
CASCADE, and the subsequent CREATE VIEW would not magically bring back
the dependent objects the CASCADE had killed. The entire *point* of
REPLACE VIEW is that it does not have the side effect of destroying
dependent objects; and therefore it has to act in a way that ensures
the dependent objects still make sense. So, for the moment anyway,
that means disallowing changes in the output column names and datatypes.

Consider for example

SELECT float8col UNION ALL (SELECT numericcol UNION SELECT
numericcol)

This seems to be addressing point (2). In this particular case, I would
expect the query engine to follow the submitter's explicit directions,
since he or she provided an explicit grouping. If no parentheses are
included, then the optimizer or query rewriter is free to do as it
wishes.

I don't think we get to do it that way. The way the SQL spec is
written, there is not a difference between A UNION B UNION C and
(A UNION B) UNION C. The result is *always* built up from pairwise
UNIONs, and the semantic definition of UNION talks about combining
only two tables at a time.

This consideration doesn't stop us from fixing your problem (since in
the case you care about, all the datatypes are the same anyway) ...
but it puts rather a large crimp in Woehling's problem, which I was
hoping to fix at the same time.

regards, tom lane

#6Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#5)
Re: Apparent anomaly with views and unions

Tom Lane <tgl@sss.pgh.pa.us> writes:

In the cases I'm concerned about, DROP VIEW would fail unless you used
CASCADE, and the subsequent CREATE VIEW would not magically bring back
the dependent objects the CASCADE had killed. The entire *point* of
REPLACE VIEW is that it does not have the side effect of destroying
dependent objects; and therefore it has to act in a way that ensures
the dependent objects still make sense. So, for the moment anyway,
that means disallowing changes in the output column names and datatypes.

There have been a few times when people were confused by the way views adapted
to changes in their base tables. A typical example is when someone wants to
rename a column and then add a new column with the old name. They usually
expect the view to refer to the new column and not automagically change
definition to refer to the renamed column.

Based on the user reactions, it seems like the least-surprise behaviour would
actually be the naive implementation of keeping the raw text defining the view
and when anything changes in the base tables (or views) reinterpreting the
view based on the new context.

That would make both the ALTER TABLE and CREATE OR REPLACE VIEW behaviour
simple to implement. If you rename a column then any query using the view gets
an error until you put back a column with the name used.

It seems like this would be a huge behaviour change though. I don't suppose
the spec says anything about how views are supposed to respond to changes in
base context does it?

--
greg

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#6)
Re: Apparent anomaly with views and unions

Greg Stark <gsstark@mit.edu> writes:

It seems like this would be a huge behaviour change though. I don't suppose
the spec says anything about how views are supposed to respond to changes in
base context does it?

SQL99 doesn't have RENAME COLUMN nor ALTER COLUMN TYPE, so the exact
cases don't arise, but I think this note under ALTER TABLE ADD COLUMN
is relevant:

NOTE 189 - The addition of a column to a table has no effect on
any existing <query expression> included in a view descriptor,
<triggered action> included in a trigger descriptor, or <search
condition> included in a constraint descriptor because any
implicit column references in these descriptor elements are
syntactically substituted by explicit column references under
the Syntax Rules of Subclause 7.11, "<query specification>".
Furthermore, by implication (from the lack of any General Rules
to the contrary), the meaning of a column reference is never
retroactively changed by the addition of a column subsequent
to the invocation of the <SQL schema statement> containing that
column reference.

The last sentence in particular seems to justify the way we do things
now.

The spec is also quite clear that DROP COLUMN (with CASCADE) destroys
any views that mention the column.

I see nothing at all in the spec that justifies the idea of "keep around
the source text and reinterpret it". They don't think that way; they
think in terms of thoroughly-processed "descriptors" stored in the
system catalogs.

regards, tom lane

#8Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#7)
Re: Apparent anomaly with views and unions

Tom Lane <tgl@sss.pgh.pa.us> writes:

I see nothing at all in the spec that justifies the idea of "keep around
the source text and reinterpret it". They don't think that way; they
think in terms of thoroughly-processed "descriptors" stored in the
system catalogs.

Huh. The thing I find most jarring about this way of thinking is that it means
I can have objects in my database that don't correspond to any source code I
have saved.

Say I want to add an expression to a view, I can't safely take the source as I
created it in the past, add the column, and recreate it. The actual view in
the database may no longer correspond to the create statement I have saved.

I suppose you could say that's no different than the create table statements
which could be obsoleted by a series of alter tables.

--
greg

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#8)
Re: Apparent anomaly with views and unions

Greg Stark <gsstark@mit.edu> writes:

Huh. The thing I find most jarring about this way of thinking is that it means
I can have objects in my database that don't correspond to any source code I
have saved.

[ shrug... ] Almost any form of ALTER command gives rise to that
situation; RENAME TABLE being just the most basic. I can't find
anything compelling in that gripe. If you want your saved source
code to always describe the database, you mustn't ALTER at all.

There are certainly lots of people who do things just that way,
but they aren't the ones pushing us to add more and more ALTER
functionality ;-)

regards, tom lane