patch: Add columns via CREATE OR REPLACE VIEW

Started by Robert Haasover 17 years ago30 messageshackers

Jump to latest

Robert Haas

robertmhaas@gmail.com

over 17 years ago

Here's a patch that allows CREATE OR REPLACE VIEW to add new columns
to an existing view.

Any feedback would be appreciated, especially if it meant that I could
fix any problems before the next commitfest.

Thanks,

...Robert

Alvaro Herrera

alvherre@2ndquadrant.com

over 17 years ago

In reply to: Robert Haas (#1)

Re: patch: Add columns via CREATE OR REPLACE VIEW

Robert Haas escribiï¿½:

Here's a patch that allows CREATE OR REPLACE VIEW to add new columns
to an existing view.

Any feedback would be appreciated, especially if it meant that I could
fix any problems before the next commitfest.

What happens with the columns previously defined? What happens if I
specify a different column definition for them; does it raise an error?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Robert Haas

robertmhaas@gmail.com

over 17 years ago

In reply to: Alvaro Herrera (#2)

Re: patch: Add columns via CREATE OR REPLACE VIEW

On Thu, Aug 7, 2008 at 11:17 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Robert Haas escribió:

Here's a patch that allows CREATE OR REPLACE VIEW to add new columns
to an existing view.

Any feedback would be appreciated, especially if it meant that I could
fix any problems before the next commitfest.

What happens with the columns previously defined? What happens if I
specify a different column definition for them; does it raise an error?

Yes. CheckViewTupleDesc() errors out, same as before.

Basically, the old algorithm was:

1. Check [new # of columns] = [old # of columns]
2. For each column: check that [old definition] = [new definition]

The new algorithm is:

1. Check [new # of columns] >= [old # of columns]
2. For each column <= [old # of columns]: check that [old definition]
= [new definition]
3. For each column > [old # of columns]: add the new column to the
relation (as ALTER TABLE ADD COLUMN, except bypassing the usual
prohibition on adding columns to views)

...Robert

Tom Lane

tgl@sss.pgh.pa.us

over 17 years ago

In reply to: Alvaro Herrera (#2)

Re: patch: Add columns via CREATE OR REPLACE VIEW

Alvaro Herrera <alvherre@commandprompt.com> writes:

Robert Haas escribi�:

Here's a patch that allows CREATE OR REPLACE VIEW to add new columns
to an existing view.

What happens with the columns previously defined? What happens if I
specify a different column definition for them; does it raise an error?

The original idea here was to give REPLACE VIEW as much flexibility
as we've recently added for tables via ALTER TABLE, which would ideally
include

1. adding columns
2. renaming columns
3. dropping columns that are not referenced elsewhere
4. changing type of columns that are not referenced elsewhere

But it seems hard to tell the difference between a "rename" and a
"drop". I think that we aren't going to get far on this until we
decide what we will consider to be the identity of a view column.
With regular tables the attnum is a persistent identifier, but that
doesn't seem to play nicely for REPLACE VIEW, at least not if you're
wanting to allow people to remove columns from their view definitions.

Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
rather ALTER VIEW ADD COLUMN and so on. Then column identity seems
a lot easier to keep track of.

Thoughts?

regards, tom lane

Alvaro Herrera

alvherre@2ndquadrant.com

over 17 years ago

In reply to: Tom Lane (#4)

Re: patch: Add columns via CREATE OR REPLACE VIEW

Tom Lane escribiï¿½:

But it seems hard to tell the difference between a "rename" and a
"drop". I think that we aren't going to get far on this until we
decide what we will consider to be the identity of a view column.
With regular tables the attnum is a persistent identifier, but that
doesn't seem to play nicely for REPLACE VIEW, at least not if you're
wanting to allow people to remove columns from their view definitions.

Hmm, maybe we need to pull off the project to separate logical attribute
number from physical and position. It sounds like it could make it
easier for view modification.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

David E. Wheeler

david@kineticode.com

over 17 years ago

In reply to: Tom Lane (#4)

Re: patch: Add columns via CREATE OR REPLACE VIEW

On Aug 7, 2008, at 08:43, Tom Lane wrote:

Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
rather ALTER VIEW ADD COLUMN and so on. Then column identity seems
a lot easier to keep track of.

+1, although what does the standard say?

Best,

David

Robert Haas

robertmhaas@gmail.com

over 17 years ago

In reply to: Tom Lane (#4)

Re: patch: Add columns via CREATE OR REPLACE VIEW

But it seems hard to tell the difference between a "rename" and a
"drop". I think that we aren't going to get far on this until we
decide what we will consider to be the identity of a view column.
With regular tables the attnum is a persistent identifier, but that
doesn't seem to play nicely for REPLACE VIEW, at least not if you're
wanting to allow people to remove columns from their view definitions.

Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
rather ALTER VIEW ADD COLUMN and so on. Then column identity seems
a lot easier to keep track of.

Thoughts?

ISTM that column identity should logically follow column name. If a
user types "CREATE OR REPLACE VIEW sandwhich (bacon, lettuce, bread)
..." and sometime later types "CREATE OR REPLACE VIEW sandwich (bacon,
lettuce, tomato, bread) ..." it seems overwhelmingly likely that their
intention was to insert tomato between lettuce and bread rather than
to rename bread to tomato and add a new column that happens to also be
called bread. If they want the other interpretation, they should use
do "ALTER VIEW sandwhich RENAME COLUMN tomato TO bread" before
executing "CREATE OR REPLACE VIEW".

The problem with "ALTER VIEW ADD COLUMN" is that the new column won't
be of any use until someone does "CREATE OR REPLACE VIEW" to update
the underlying query. And if they're already doing "CREATE OR REPLACE
VIEW", then we might as well let "CREATE OR REPLACE VIEW" generate the
column definitions automatically rather than forcing them to do the
same thing by hand (just as we allowed the user to create the view in
the first place without insisting on explicit column definitions).
The problem is even worse for "ALTER VIEW ALTER COLUMN TYPE". What
exactly will the semantics of the view be after this operation but
before a subsequent update of the query via "CREATE OR REPLACE VIEW"?
There are various options but none of them make much sense.

If you accept the idea that column identity should be based on column
name, then the only two operations that are really necessary are
"CREATE OR REPLACE VIEW" and "ALTER VIEW RENAME COLUMN", and it is
100% clear what the semantics of those operations should be. We may
not choose to fully support all possible alterations that can be made
through this framework for some time to come, but it's extremely easy
to understand where we're trying to get to. If you want to
additionally have "ALTER VIEW ADD/DROP COLUMN" available for those
that may wish to use them, presumably returning NULL for any column
that isn't generated by the query, that's also easy to understand and
well-defined.

...Robert

Andrew Dunstan

andrew@dunslane.net

over 17 years ago

In reply to: Tom Lane (#4)

Re: patch: Add columns via CREATE OR REPLACE VIEW

Tom Lane wrote:

Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
rather ALTER VIEW ADD COLUMN and so on. Then column identity seems
a lot easier to keep track of.

How would that look? Where would we put the new query?

cheers

andrew

Robert Haas

robertmhaas@gmail.com

over 17 years ago

In reply to: Alvaro Herrera (#5)

Re: patch: Add columns via CREATE OR REPLACE VIEW

Hmm, maybe we need to pull off the project to separate logical attribute
number from physical and position. It sounds like it could make it
easier for view modification.

Much easier! It would be a nice feature to have for table as well.
Right now, if you have a table with columns (foo1, foo2, foo3, bar1,
bar2, bar3) and you decide to add column foo4, there's no way to put
it where you intuitively want to put it. Not a big deal, but I'd
definitely use it if we had it.

However, it's not necessary to implement this in order to make
meaningful improvements to CREATE OR REPLACE VIEW. I think the only
thing we need to agree on is that no future implementation of CREATE
OR REPLACE VIEW will ever implicitly rename a column. If we agree on
column name as a measure of column identity, then the change I'm
proposing is forward-compatible with any other enhancements we may
want to make later.

...Robert

#10

Alvaro Herrera

alvherre@2ndquadrant.com

over 17 years ago

In reply to: Andrew Dunstan (#8)

Re: patch: Add columns via CREATE OR REPLACE VIEW

Andrew Dunstan escribiï¿½:

Tom Lane wrote:

Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
rather ALTER VIEW ADD COLUMN and so on. Then column identity seems
a lot easier to keep track of.

How would that look? Where would we put the new query?

I was thinking that the ADD COLUMN should specify the new result list
entry. Of course, this doesn't allow changing the query in more complex
ways (e.g. you can't add a GROUP BY clause that wasn't already there).

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#11

Tom Lane

tgl@sss.pgh.pa.us

over 17 years ago

In reply to: Alvaro Herrera (#10)

Re: patch: Add columns via CREATE OR REPLACE VIEW

Alvaro Herrera <alvherre@commandprompt.com> writes:

Andrew Dunstan escribi�:

Tom Lane wrote:

Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
rather ALTER VIEW ADD COLUMN and so on. Then column identity seems
a lot easier to keep track of.

How would that look? Where would we put the new query?

I was thinking that the ADD COLUMN should specify the new result list
entry.

Yeah, that's what I was thinking too. If you needed to change more
than just the topmost SELECT list, you'd need two steps: an ADD COLUMN
and then CREATE OR REPLACE VIEW to change the query in some way that
doesn't result in changing the output column set. Maybe that's going
to be too awkward to use though.

regards, tom lane

#12

Robert Haas

robertmhaas@gmail.com

over 17 years ago

In reply to: Tom Lane (#11)

Re: patch: Add columns via CREATE OR REPLACE VIEW

I was thinking that the ADD COLUMN should specify the new result list
entry.

Yeah, that's what I was thinking too. If you needed to change more
than just the topmost SELECT list, you'd need two steps: an ADD COLUMN
and then CREATE OR REPLACE VIEW to change the query in some way that
doesn't result in changing the output column set. Maybe that's going
to be too awkward to use though.

To me, that sounds less useful than simply being able to make changes
via CREATE OR REPLACE VIEW, but it's not an either/or proposition, and
I suspect that it's significantly more complicated to implement than
the patch I submitted.

...Robert

#13

Bruce Momjian

bruce@momjian.us

over 17 years ago

In reply to: Robert Haas (#9)

Re: patch: Add columns via CREATE OR REPLACE VIEW

"Robert Haas" <robertmhaas@gmail.com> writes:

I think the only thing we need to agree on is that no future implementation
of CREATE OR REPLACE VIEW will ever implicitly rename a column. If we agree
on column name as a measure of column identity, then the change I'm
proposing is forward-compatible with any other enhancements we may want to
make later.

hm... so what would this output?

CREATE VIEW a AS (select 1 as a, 2 as b);
CREATE VIEW b AS (select x,y FROM a AS a(x,y))
CREATE OR REPLACE VIEW a AS (select 1 as b, 'two' as c, 3 as a);
SELECT * FROM b;

What about this?

CREATE OR REPLACE VIEW a AS (select 1 as b, 2 as c, 'three as a);
SELECT * FROM b;

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

#14

Pavel Stehule

pavel.stehule@gmail.com

over 17 years ago

In reply to: Tom Lane (#4)

Re: patch: Add columns via CREATE OR REPLACE VIEW

Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
rather ALTER VIEW ADD COLUMN and so on. Then column identity seems
a lot easier to keep track of.

I prefere ALTER VIEW too

regards
Pavel Stehule

#15

Tom Lane

tgl@sss.pgh.pa.us

over 17 years ago

In reply to: David E. Wheeler (#6)

Re: patch: Add columns via CREATE OR REPLACE VIEW

"David E. Wheeler" <david@kineticode.com> writes:

On Aug 7, 2008, at 08:43, Tom Lane wrote:

Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
rather ALTER VIEW ADD COLUMN and so on. Then column identity seems
a lot easier to keep track of.

+1, although what does the standard say?

AFAICT the standard doesn't have any way to alter the definition of an
existing view at all. It might be worth asking what other systems do,
though --- can you alter a view in Oracle or DB2 or mysql?

regards, tom lane

#16

David E. Wheeler

david@kineticode.com

over 17 years ago

In reply to: Tom Lane (#15)

Re: patch: Add columns via CREATE OR REPLACE VIEW

On Aug 7, 2008, at 13:01, Tom Lane wrote:

+1, although what does the standard say?

AFAICT the standard doesn't have any way to alter the definition of an
existing view at all. It might be worth asking what other systems do,
though --- can you alter a view in Oracle or DB2 or mysql?

Looks like MySQL 6.0 just does a CREATE OR REPLACE when you do ALTER
VIEW:

http://dev.mysql.com/doc/refman/6.0/en/create-view.html
http://dev.mysql.com/doc/refman/6.0/en/alter-view.html

Oracle doesn't seem to do much with it, either, just recompiles a view:

http://download.oracle.com/docs/cd/B10500_01/server.920/a96540/statements_45a.htm

Note that it says, "This statement does not change the definition of
an existing view. To redefine a view, you must use CREATE VIEW with
the OR REPLACE keywords."

DB2's ALTER VIEW is a bit more promising, though there doesn't seem to
be a way to add columns, just to redefine them:

http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0000894.htm

Best,

David

#17

Robert Haas

robertmhaas@gmail.com

over 17 years ago

In reply to: Bruce Momjian (#13)

Re: patch: Add columns via CREATE OR REPLACE VIEW

I think the only thing we need to agree on is that no future implementation
of CREATE OR REPLACE VIEW will ever implicitly rename a column. If we agree
on column name as a measure of column identity, then the change I'm
proposing is forward-compatible with any other enhancements we may want to
make later.

hm... so what would this output?

CREATE VIEW a AS (select 1 as a, 2 as b);
CREATE VIEW b AS (select x,y FROM a AS a(x,y))
CREATE OR REPLACE VIEW a AS (select 1 as b, 'two' as c, 3 as a);
SELECT * FROM b;

I'm not too familiar with the syntax "a AS a(x, y)" but I think it's
asking that the first two columns of a (whatever they are at the
moment) be aliased to x and y. If we allow reordering columns, it's
going to change the meaning of a number of different expressions. In
addition to the above, you have to worry about at least:

(1) INSERT INTO foo VALUES (a, b, c)
(2) COPY foo FROM wherever
(3) SELECT * FROM foo

Were I implementing the ability to reorder columns (which I have no
current plans to do), I think the logical thing to do would be to
decree that all of this is the user's problem. If you sensibly refer
to columns by name, then you are guaranteed to always be referencing
the same set of columns. If you assume that you know the position of
the columns, you assume the risk of getting burned if that ordering
changes. This is already true: even without the ability to reorder
columns, a table or view can still be dropped and recreated with a
different column ordering, and any client code that stupidly relies on
the columns being in the same order that they were before will blow
up. If people are writing code this way, they should either (1) fix
it to include explicit target lists or (2) not ever change the order
of their table columns. This is true whether or not reordering
columns requires dropping the object in question and all of its
dependencies, or whether it can be done in place, and is not an
argument for refusing to make it easier to do in-place.

With respect to your particular example, I think CREATE OR REPLACE
VIEW should fail with an error, just as this case does:

CREATE VIEW a AS (select 1 as a, 2 as b);
CREATE VIEW b AS (select x,y FROM a AS a(x,y));
ALTER TABLE a RENAME TO a_old; -- move a out of the way so we can
change the definition
CREATE OR REPLACE VIEW a AS (select 1 as b, 'two' as c, 3 as a);
CREATE OR REPLACE VIEW b AS (select x,y FROM a AS a(x,y)); -- recreate
b based on new definition of a
ERROR: cannot change data type of view column "y"

As Tom said upthread, columns should only be allowed to be dropped or
retyped if they have no dependencies. This case is a little weird
because normally the dependency would follow the name, but here
because of the a(x, y) syntax it has to follow the column position,
but it seems clear to me that the type change can't silently propagate
down into view b, so making it error out is the only logical
alternative.

What about this?

CREATE OR REPLACE VIEW a AS (select 1 as b, 2 as c, 'three as a);
SELECT * FROM b;

This seems well defined. a now has columns b-c-a with types
int-int-varchar. b gets a.b as x and a.c as y, so I expect to get (1,
2). If he had written CREATE VIEW b AS (select a as x,b as y FROM a)
then the statement above would fail due to a dependency on the type of
a.

Of course, accepting the patch I submitted doesn't require us to ever
support any of this, since it makes no attempt to reorder or retype
any existing columns. It only allows adding new columns at the end.
If we want to stick with the approach of "don't ever reorder columns",
we could decree that the goal is for CREATE OR REPLACE VIEW to allow
adding new columns and retyping of columns without dependencies, but
that renaming, dropping, and potentially reordering would have to be
done using an ALTER VIEW command.

I'm afraid we're getting off into a discussion of how to support
reordering of columns which I find fascinating but not necessarily
relevant to the patch at hand. I can't think of any imaginable
scenario in which the ability to add new columns at the end of an
existing view via CREATE OR REPLACE VIEW causes any problem for any
feature we might want to implement in the future, and it's clearly
useful. It takes me about half an hour to add a column to a
particular view on one of the systems I work with because of the need
to update all the dependent objects, and this would reduce it to about
10 seconds.

...Robert

#18

Robert Haas

robertmhaas@gmail.com

over 17 years ago

In reply to: Robert Haas (#17)

Re: patch: Add columns via CREATE OR REPLACE VIEW

I'm not too familiar with the syntax "a AS a(x, y)" but I think it's
asking that the first two columns of a (whatever they are at the
moment) be aliased to x and y.

Another possible option would be to introduce a syntax along the lines of

table AS table_alias (column AS column_alias, column AS column_alias)

and decree that a(x, y) is in essence expanded to a(a as x, b as y) at
the time you originally execute the statement, much the same way we
handle *.

This has a major implementation advantage in that it frees us from
worrying about whether columns were originally referenced by position
or by name. It does change the answers I gave in my previous email,
but I guess that doesn't matter very much since (1) we're not
proposing to implement this now and (2) either decision is
upward-compatible with the proposed patch.

Although several people have said that they prefer the idea of using
ALTER VIEW to make changes to views, no one has really expanded on the
reasons for their preference. Also, no one has offered any argument
at all as to why any future ALTER VIEW functionality that might be
added would conflict with the semantics of the patch I proposed.

...Robert

#19

Tom Lane

tgl@sss.pgh.pa.us

over 17 years ago

In reply to: Robert Haas (#18)

Re: patch: Add columns via CREATE OR REPLACE VIEW

"Robert Haas" <robertmhaas@gmail.com> writes:

Although several people have said that they prefer the idea of using
ALTER VIEW to make changes to views, no one has really expanded on the
reasons for their preference.

Because it sidesteps the problem of tracking which column is supposed to
be which. If you try to do it through CREATE OR REPLACE VIEW, you have
to either be extremely restrictive (like probably not allow renaming
of columns at all), or write some AI-complete algorithm to guess at what
the user intended.

regards, tom lane

#20

Robert Haas

robertmhaas@gmail.com

over 17 years ago

In reply to: Tom Lane (#19)

Re: patch: Add columns via CREATE OR REPLACE VIEW

Forgot to copy my response to this to the list.

On Thu, Aug 7, 2008 at 5:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Robert Haas" <robertmhaas@gmail.com> writes:

Although several people have said that they prefer the idea of using
ALTER VIEW to make changes to views, no one has really expanded on the
reasons for their preference.

Because it sidesteps the problem of tracking which column is supposed to
be which. If you try to do it through CREATE OR REPLACE VIEW, you have
to either be extremely restrictive (like probably not allow renaming
of columns at all), or write some AI-complete algorithm to guess at what
the user intended.

The current code takes the approach of being extremely restrictive -
it doesn't let you change anything at all. The code I'm proposing
manages to relax that restriction without creating any ambiguity that
anyone has been able to point out. All of the ambiguities that have
been mentioned are problems that might be created by some other,
entirely hypothetical patch.

...Robert

#21

Tom Lane

tgl@sss.pgh.pa.us

over 17 years ago

In reply to: Robert Haas (#20)

#22

Robert Haas

robertmhaas@gmail.com

over 17 years ago

In reply to: Tom Lane (#21)

#23