Re: ALTER TABLE DROP COLUMN

Started by Bruce Momjianover 25 years ago20 messages
#1Bruce Momjian
pgman@candle.pha.pa.us

Seems we have 4 DROP COLUMN ideas:

Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

Folks, we had better choose one and get started.

Number 1 Hiroshi has ifdef'ed out in the code. Items 1 and 2 have
problems with backend code and 3rd party code not seeing the dropped
columns, or having gaps in the attno numbering. Number 3 has problems
with making it an atomic operation, and number 4 is described below.

---------------------------------------------------------------------------

Bruce Momjian <pgman@candle.pha.pa.us> writes:

You can exclusively lock the table, then do a heap_getnext() scan over
the entire table, remove the dropped column, do a heap_insert(), then a
heap_delete() on the current tuple, making sure to skip over the tuples
inserted by the current transaction. When completed, remove the column
from pg_attribute, mark the transaction as committed (if desired), and
run vacuum over the table to remove the deleted rows.

Hmm, that would work --- the new tuples commit at the same instant that
the schema updates commit, so it should be correct. You have the 2x
disk usage problem, but there's no way around that without losing
rollback ability.

A potentially tricky bit will be persuading the tuple-reading and tuple-
writing subroutines to pay attention to different versions of the tuple
structure for the same table. I haven't looked to see if this will be
difficult or not. If you can pass the TupleDesc explicitly then it
shouldn't be a problem.

I'd suggest that the cleanup vacuum *not* be an automatic part of
the operation; just recommend that people do it ASAP after dropping
a column. Consider needing to drop several columns...

regards, tom lane

************

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#2Hannu Krosing
hannu@tm.ee
In reply to: Bruce Momjian (#1)

Bruce Momjian wrote:

Seems we have 4 DROP COLUMN ideas:

Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

IIRC there was a fifth idea, a variation of 2 that would work better
with
inheritance -

5 all columns have is_real_column attribute that is true for all
coluns
present in that relation, so situations like

create table tab_a(a_i int);
create table tab_b(b_i int) inherits(tab_a);
alter table tab_a add column c_i int;

can be made to work.

It would also require clients to ignore all missing columns that backend
can
pass to them as nulls (which is usually quite cheap in bandwith usage)
in
case of "SELECT **" queries.

We could even rename attno to attid to make folks aware that it is not
be
assumed to be continuous.

Folks, we had better choose one and get started.

Number 1 Hiroshi has ifdef'ed out in the code. Items 1 and 2 have
problems with backend code and 3rd party code not seeing the dropped
columns, or having gaps in the attno numbering.

If we want to make ADD COLUMN to work with inheritance wihout having to
rewrite every single tuple in both parent and inherited tables, we will
have to accept the fact that there are caps in in attno numbering.

Number 3 has problems
with making it an atomic operation, and number 4 is described below.

Nr 4 has still problems with attno numbering _changing_ during drop
which
could either be better or worse for client software than having gaps -
in both cases client must be prepared to deal with runtime changes in
attribute definition.

--------------
Hannu

#3Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Bruce Momjian (#1)
RE: ALTER TABLE DROP COLUMN

-----Original Message-----
From: pgsql-hackers-owner@hub.org
[mailto:pgsql-hackers-owner@hub.org]On Behalf Of Bruce Momjian

Seems we have 4 DROP COLUMN ideas:

Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

Folks, we had better choose one and get started.

Number 1 Hiroshi has ifdef'ed out in the code. Items 1 and 2 have
problems with backend code and 3rd party code not seeing the dropped
columns,

Hmm,doesn't *not seeing* mean the column is dropped ?

or having gaps in the attno numbering. Number 3 has problems
with making it an atomic operation, and number 4 is described below.

Don't forget another important point.

Currently even DROP TABLE doesn't remove related objects completely.
And I don't think I could remove objects related to the dropping column
completely using 1)2) in ALTER TABLE DROP COLUMN implementation.

Using 3)4) we should not only remove objects as 1)2) but also
change attnum-s in all objects related to the relation. Otherwise
PostgreSQL would do the wrong thing silently.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

#4Don Baccus
dhogaza@pacifier.com
In reply to: Hiroshi Inoue (#3)
RE: ALTER TABLE DROP COLUMN

At 01:43 PM 6/10/00 +0900, Hiroshi Inoue wrote:

-----Original Message-----
From: pgsql-hackers-owner@hub.org
[mailto:pgsql-hackers-owner@hub.org]On Behalf Of Bruce Momjian

Seems we have 4 DROP COLUMN ideas:

Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

Folks, we had better choose one and get started.

Oracle gives you the choice between the "cheating" fast method(s) and
the "real" slow (really slow?) real method.

So there's at least real world experience by virtue of example by
the world's most successful database supplier that user control
over "hide the column" and "really delete the column" is valuable.

It really makes a lot of sense to give such a choice. If one
does so by "hiding", at a later date one would think the choice
of "really deleting" would be a possibility. I don't know if
Oracle does this...

If not, they might not care. In today's world, there are bazillions
of dollars for Oracle to scoop up from users who could just as easily
be PG users - all those "we'll fail if don't IPO 'cause we'll never
have any customers" database-backed websites :)

- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Don Baccus (#4)

Don Baccus <dhogaza@pacifier.com> writes:

Oracle gives you the choice between the "cheating" fast method(s) and
the "real" slow (really slow?) real method.

So there's at least real world experience by virtue of example by
the world's most successful database supplier that user control
over "hide the column" and "really delete the column" is valuable.

Sure, but you don't need any help from the database to do "really delete
the column". SELECT INTO... is enough, and it's not even any slower
than the implementations under discussion.

So I'm satisfied if we offer the "hide the column" approach.

Has anyone thought about what happens to table constraints that use the
doomed column? Triggers, RI rules, yadda yadda?

Has anyone thought about undoing a DELETE COLUMN? The data is still
there, at least in tuples that have not been updated, so it's not
totally unreasonable.

regards, tom lane

#6Don Baccus
dhogaza@pacifier.com
In reply to: Tom Lane (#5)

At 01:14 AM 6/10/00 -0400, Tom Lane wrote:

Don Baccus <dhogaza@pacifier.com> writes:

Oracle gives you the choice between the "cheating" fast method(s) and
the "real" slow (really slow?) real method.

So there's at least real world experience by virtue of example by
the world's most successful database supplier that user control
over "hide the column" and "really delete the column" is valuable.

Sure, but you don't need any help from the database to do "really delete
the column". SELECT INTO... is enough, and it's not even any slower
than the implementations under discussion.

So I'm satisfied if we offer the "hide the column" approach.

<shrug> I wouldn't put a "real" drop column at the top of my list
of priorities, but there is something to be said for user convenience.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.

#7Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Hiroshi Inoue (#3)

-----Original Message-----
From: pgsql-hackers-owner@hub.org
[mailto:pgsql-hackers-owner@hub.org]On Behalf Of Bruce Momjian

Seems we have 4 DROP COLUMN ideas:

Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

Folks, we had better choose one and get started.

Number 1 Hiroshi has ifdef'ed out in the code. Items 1 and 2 have
problems with backend code and 3rd party code not seeing the dropped
columns,

Hmm,doesn't *not seeing* mean the column is dropped ?

I meant problems of backend code and 3rd party code _seeing_ the dropped
column in pg_attribute.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#3)

Seems we have 4 DROP COLUMN ideas:
Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

Bruce and I talked about this by phone yesterday, and we realized that
none of these are very satisfactory. #1 and #2 both have the flaw that
applications that examine pg_attribute will probably break: they will
see a sequence of attnum values with gaps in it. And what should the
rel's relnatts field be set to? #3 and #4 are better on that point,
but they leave us with the problem of renumbering references to columns
after the dropped one in constraints, rules, PL functions, etc.

Furthermore, there is a closely related problem that none of these
approaches give us much help on: recursive ALTER TABLE ADD COLUMN.
Right now, ADD puts the new column at the end of each table it's added
to, which often means that it gets a different column number in child
tables than in parent tables. That leads to havoc for pg_dump.

I think the only clean solution is to create a clear distinction between
physical and logical column numbers. Each pg_attribute tuple would need
two attnum fields, and pg_class would need two relnatts fields as well.
A column once created would never change its physical column number, but
its logical column number might change as a consequence of adding or
dropping columns before it. ADD COLUMN would ensure that a column added
to child tables receives the same logical column number as it has in the
parent table, thus solving the dump/reload problem. DROP COLUMN would
assign an invalid logical column number to dropped columns. They could
be numbered zero except that we'd probably still want a unique index on
attrelid+attnum, and the index would complain. I'd suggest using
Hiroshi's idea: give a dropped column a logical attnum equal to
-(physical_attnum + offset).

With this approach, internal operations on tuples would all use
physical column numbers, but operations that interface to the outside
world would present a view of only the valid logical columns. For
example, the parser would only allow logical columns to be referenced
by name; "SELECT *" would expand to valid logical columns in logical-
column-number order; COPY would send or receive valid logical columns
in logical-column-number order; etc.

Stored rules and so forth probably should store physical column numbers
so that they need not be modified during column add/drop.

This would require looking at all the places in the backend to determine
whether they should be working with logical or physical column numbers,
but the design is such that most all places would want to be using
physical numbers, so I don't think it'd be too painful.

Although I'd prefer to give the replacement columns two new names
(eg, "attlnum" and "attpnum") to ensure we find all uses, this would
surely break applications that examine pg_attribute. For compatibility
we'd have to recycle "attnum" and "relnatts" to indicate logical column
number and logical column count, while adding new fields (say "attpnum"
and "relnpatts") for the physical number and count.

Comments?

regards, tom lane

#9Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#8)

Seems we have 4 DROP COLUMN ideas:
Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

Bruce and I talked about this by phone yesterday, and we realized that
none of these are very satisfactory. #1 and #2 both have the flaw that
applications that examine pg_attribute will probably break: they will
see a sequence of attnum values with gaps in it. And what should the
rel's relnatts field be set to? #3 and #4 are better on that point,
but they leave us with the problem of renumbering references to columns
after the dropped one in constraints, rules, PL functions, etc.

Yes, glad you summarized.

Furthermore, there is a closely related problem that none of these
approaches give us much help on: recursive ALTER TABLE ADD COLUMN.
Right now, ADD puts the new column at the end of each table it's added
to, which often means that it gets a different column number in child
tables than in parent tables. That leads to havoc for pg_dump.

Also good point.

I think the only clean solution is to create a clear distinction between
physical and logical column numbers. Each pg_attribute tuple would need
two attnum fields, and pg_class would need two relnatts fields as well.

Excellent idea.

A column once created would never change its physical column number, but
its logical column number might change as a consequence of adding or
dropping columns before it. ADD COLUMN would ensure that a column added
to child tables receives the same logical column number as it has in the
parent table, thus solving the dump/reload problem. DROP COLUMN would
assign an invalid logical column number to dropped columns. They could
be numbered zero except that we'd probably still want a unique index on
attrelid+attnum, and the index would complain. I'd suggest using
Hiroshi's idea: give a dropped column a logical attnum equal to
-(physical_attnum + offset).

My guess is that we would need a unique index on the physical attno, not
the logical one. Multiple zero attno's may be fine.

With this approach, internal operations on tuples would all use
physical column numbers, but operations that interface to the outside
world would present a view of only the valid logical columns. For
example, the parser would only allow logical columns to be referenced
by name; "SELECT *" would expand to valid logical columns in logical-
column-number order; COPY would send or receive valid logical columns
in logical-column-number order; etc.

Yes, the only hard part will be taking values supplied in logical order
and moving them into pysical order. Not too hard with dropped columns,
because they are only gaps, but inheritance would require re-ordering
some of the values supplied by the user. Not hard, just something
additional that is needed.

Stored rules and so forth probably should store physical column numbers
so that they need not be modified during column add/drop.

Yes!

This would require looking at all the places in the backend to determine
whether they should be working with logical or physical column numbers,
but the design is such that most all places would want to be using
physical numbers, so I don't think it'd be too painful.

Agreed. Most are physical.

Although I'd prefer to give the replacement columns two new names
(eg, "attlnum" and "attpnum") to ensure we find all uses, this would
surely break applications that examine pg_attribute. For compatibility
we'd have to recycle "attnum" and "relnatts" to indicate logical column
number and logical column count, while adding new fields (say "attpnum"
and "relnpatts") for the physical number and count.

Can I recommend keeping attnum and relatts as logical, and adding
attheapnum and relheapatts so that it clearly shows these are the heap
values, not the user values.

Great idea. I was seeing things blocked in every option until your
idea.

Also, my guess is that Hiroshi's #ifdef's mark the places we need to
start looking at.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#10Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Tom Lane (#8)
RE: ALTER TABLE DROP COLUMN

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]

Seems we have 4 DROP COLUMN ideas:
Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

Hmm,I've received no pg-ML mails for more than 1 day.
What's happened with pgsql ML ?

Bruce and I talked about this by phone yesterday, and we realized that
none of these are very satisfactory. #1 and #2 both have the flaw that
applications that examine pg_attribute will probably break: they will
see a sequence of attnum values with gaps in it. And what should the
rel's relnatts field be set to? #3 and #4 are better on that point,
but they leave us with the problem of renumbering references to columns
after the dropped one in constraints, rules, PL functions, etc.

Furthermore, there is a closely related problem that none of these
approaches give us much help on: recursive ALTER TABLE ADD COLUMN.
Right now, ADD puts the new column at the end of each table it's added
to, which often means that it gets a different column number in child
tables than in parent tables. That leads to havoc for pg_dump.

Inheritance is one of the reason why I didn't take #2. I don't understand
marking is_dropped is needed or not when pg_attribute is overhauled
for inheritance.
I myself have never wanted to use current inheritance functionality
mainly because of this big flaw. Judging from the recent discussion
about oo(though I don't understand details),the change seems to be
needed in order to make inheritance functionality really useful.

I think the only clean solution is to create a clear distinction between
physical and logical column numbers. Each pg_attribute tuple would need
two attnum fields, and pg_class would need two relnatts fields as well.
A column once created would never change its physical column number, but

I don't understand inheritance well. In the near future wouldn't the
implementation require e.g. attid which is common to all children
of a parent and is never changed ? If so,we would need the third
attid field which is irrevalent to physical/logical position. If not,
physical column number would be sufficient .

its logical column number might change as a consequence of adding or
dropping columns before it. ADD COLUMN would ensure that a column added
to child tables receives the same logical column number as it has in the
parent table, thus solving the dump/reload problem. DROP COLUMN would
assign an invalid logical column number to dropped columns. They could
be numbered zero except that we'd probably still want a unique index on
attrelid+attnum, and the index would complain. I'd suggest using
Hiroshi's idea: give a dropped column a logical attnum equal to
-(physical_attnum + offset).

With this approach, internal operations on tuples would all use
physical column numbers, but operations that interface to the outside
world would present a view of only the valid logical columns. For
example, the parser would only allow logical columns to be referenced
by name; "SELECT *" would expand to valid logical columns in logical-
column-number order; COPY would send or receive valid logical columns
in logical-column-number order; etc.

Stored rules and so forth probably should store physical column numbers
so that they need not be modified during column add/drop.

This would require looking at all the places in the backend to determine
whether they should be working with logical or physical column numbers,
but the design is such that most all places would want to be using
physical numbers, so I don't think it'd be too painful.

Although I'd prefer to give the replacement columns two new names
(eg, "attlnum" and "attpnum") to ensure we find all uses, this would
surely break applications that examine pg_attribute. For compatibility
we'd have to recycle "attnum" and "relnatts" to indicate logical column
number and logical column count, while adding new fields (say "attpnum"
and "relnpatts") for the physical number and count.

I agree with you that we would add attpnum and change the meaing of
attnum as logical column number for backward compatibility.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

#11Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Hiroshi Inoue (#10)

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]

Seems we have 4 DROP COLUMN ideas:
Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

Hmm,I've received no pg-ML mails for more than 1 day.
What's happened with pgsql ML ?

Tom says there are tons of messages in the hub.org mail queue, but they
are not being delivered.

Bruce and I talked about this by phone yesterday, and we realized that
none of these are very satisfactory. #1 and #2 both have the flaw that
applications that examine pg_attribute will probably break: they will
see a sequence of attnum values with gaps in it. And what should the
rel's relnatts field be set to? #3 and #4 are better on that point,
but they leave us with the problem of renumbering references to columns
after the dropped one in constraints, rules, PL functions, etc.

Furthermore, there is a closely related problem that none of these
approaches give us much help on: recursive ALTER TABLE ADD COLUMN.
Right now, ADD puts the new column at the end of each table it's added
to, which often means that it gets a different column number in child
tables than in parent tables. That leads to havoc for pg_dump.

Inheritance is one of the reason why I didn't take #2. I don't understand
marking is_dropped is needed or not when pg_attribute is overhauled
for inheritance.
I myself have never wanted to use current inheritance functionality
mainly because of this big flaw. Judging from the recent discussion
about oo(though I don't understand details),the change seems to be
needed in order to make inheritance functionality really useful.

What would happen is that all the logical attributes would be shifted
over, and a new column added using ADD COLUMN would be put in its place.
Seems it would work fine.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#12Chris Bitmead
chris@bitmead.com
In reply to: Bruce Momjian (#1)

I don't know if this is one of the 5, but my idea of a good
implementation is to do the fast invisible approach, and then update
individual tuples to the new format the next time they happen to be
UPDATEd.

Therefore, ALTER TABLE DROP COLUMN, followed by UPDATE foo SET bar=bar;
would cause the equiv of (4).

--
Chris Bitmead
mailto:chris@bitmead.com
Hannu Krosing wrote:

Show quoted text

Bruce Momjian wrote:

Seems we have 4 DROP COLUMN ideas:

Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

IIRC there was a fifth idea, a variation of 2 that would work better
with
inheritance -

5 all columns have is_real_column attribute that is true for all
coluns
present in that relation, so situations like

create table tab_a(a_i int);
create table tab_b(b_i int) inherits(tab_a);
alter table tab_a add column c_i int;

can be made to work.

It would also require clients to ignore all missing columns that backend
can
pass to them as nulls (which is usually quite cheap in bandwith usage)
in
case of "SELECT **" queries.

We could even rename attno to attid to make folks aware that it is not
be
assumed to be continuous.

Folks, we had better choose one and get started.

Number 1 Hiroshi has ifdef'ed out in the code. Items 1 and 2 have
problems with backend code and 3rd party code not seeing the dropped
columns, or having gaps in the attno numbering.

If we want to make ADD COLUMN to work with inheritance wihout having to
rewrite every single tuple in both parent and inherited tables, we will
have to accept the fact that there are caps in in attno numbering.

Number 3 has problems
with making it an atomic operation, and number 4 is described below.

Nr 4 has still problems with attno numbering _changing_ during drop
which
could either be better or worse for client software than having gaps -
in both cases client must be prepared to deal with runtime changes in
attribute definition.

--------------
Hannu

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Chris Bitmead (#12)

Chris Bitmead <chris@bitmead.com> writes:

I don't know if this is one of the 5, but my idea of a good
implementation is to do the fast invisible approach, and then update
individual tuples to the new format the next time they happen to be
UPDATEd.

How would you tell whether a particular tuple has been updated or not?

Furthermore, how would you remember the old tuple format (or formats)
so that you'd know how to make the conversion?

Seems to me this approach would require adding some sort of table
version number to every tuple header, plus storing a complete set of
system catalog entries for every past version of each table's schema.
That's a heck of a high price, in both storage and complexity, for a
feature of dubious value...

regards, tom lane

#14The Hermit Hacker
scrappy@hub.org
In reply to: Bruce Momjian (#11)

On Sun, 11 Jun 2000, Bruce Momjian wrote:

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]

Seems we have 4 DROP COLUMN ideas:
Method Advantage
-----------------------------------------------------------------
1 invisible column marked by negative attnum fast
2 invisible column marked by is_dropped column fast
3 make copy of table without column col removed
4 make new tuples in existing table without column col removed

Hmm,I've received no pg-ML mails for more than 1 day.
What's happened with pgsql ML ?

Tom says there are tons of messages in the hub.org mail queue, but they
are not being delivered.

I was out for the past 4 days taking a little girl camping for her b-day
... great weekend, but we had a process run rampant over the weekend that
caused the loadavg to skyrocket. For anyone that has ever used sendmail,
they will know that a high load will cause sendmail to essentially shut
itself down, queuing only up to a certain point, refusing connections
after that ... queuing is at a loadavg of 15, refusing connections at 20,
the machine was slightly higher then that ...

Just checked the queue, and now that the load is back down, the queue is
pretty much flushed out again ...

#15Chris Bitmead
chrisb@nimrod.itg.telstra.com.au
In reply to: Bruce Momjian (#1)

Yes, it would need to work as you describe below. Such a scheme is used
in several object databases I know of. (Versant being one) where it
works great. It's not just useful for drop column, but also things like
add column with default value. It means that you can add and drop
columns to your hearts content in the blink of an eye, and yet
ultimately not pay the price in terms of storage costs.

But yep, it's a lot more work, and understandable if there isn't
enthusiasm for doing it.

Tom Lane wrote:

Show quoted text

Chris Bitmead <chris@bitmead.com> writes:

I don't know if this is one of the 5, but my idea of a good
implementation is to do the fast invisible approach, and then update
individual tuples to the new format the next time they happen to be
UPDATEd.

How would you tell whether a particular tuple has been updated or not?

Furthermore, how would you remember the old tuple format (or formats)
so that you'd know how to make the conversion?

Seems to me this approach would require adding some sort of table
version number to every tuple header, plus storing a complete set of
system catalog entries for every past version of each table's schema.
That's a heck of a high price, in both storage and complexity, for a
feature of dubious value...

#16Chris Bitmead
chrisb@nimrod.itg.telstra.com.au
In reply to: Hiroshi Inoue (#10)

Hiroshi Inoue wrote:

I don't understand inheritance well. In the near future wouldn't the
implementation require e.g. attid which is common to all children
of a parent and is never changed ? If so,we would need the third
attid field which is irrevalent to physical/logical position. If not,
physical column number would be sufficient .

We only need something like a unique attid of course if we support
column renaming in child tables. Otherwise the attname is sufficient to
match up child-parent columns.

If/when we support renaming, probably a parent_column_oid in
pg_attribute might be one way to go.

Your idea seems fine Tom.

#17Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Chris Bitmead (#16)
RE: ALTER TABLE DROP COLUMN

-----Original Message-----
From: Chris Bitmead

Hiroshi Inoue wrote:

I don't understand inheritance well. In the near future wouldn't the
implementation require e.g. attid which is common to all children
of a parent and is never changed ? If so,we would need the third
attid field which is irrevalent to physical/logical position. If not,
physical column number would be sufficient .

We only need something like a unique attid of course if we support
column renaming in child tables. Otherwise the attname is sufficient to
match up child-parent columns.

There are some objects which keep plans etc as compiled
state.

create table t1 (i1 int4);
create table t2 (i2 int4) inherits t1;
create table t3 (i3 int4) inherits t2;
alter table t1 add column i4 int4;

For each table,the list of (column, logical number, physical number)
would be as follows.

t1 (i1, 1, 1) (i4, 2, 2)
t2 (i1, 1, 1) (i4, 2, 3) (i2, 3, 2)
t3 (i1, 1, 1) (i4, 2, 4) (i2, 3, 2) (i3, 4, 3)

At this point the compilation of 'select * from t1(*?)' would mean
select (physical #1),(physical #2) from t1 +
select (physical #1),(physical #3) from t2 +
select (physical #1),(physical #4) from t3

Note that physical # aren't common for column i4.
I've wanted to confirm that above compilation would be OK for
the (near) future enhancement of inheritance functionality.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

#18Chris Bitmead
chrisb@nimrod.itg.telstra.com.au
In reply to: Hiroshi Inoue (#17)

On thinking about it I can definitely see your point about wanting an
attrid that is common across the hierarchy, regardless of compiled
plans. There would be some merit in splitting up pg_attribute into two
parts. One part is common across all classes in the hierarchy, the other
part is specific to one class. Then the oid of the common part is the
attrid you refer to.

However, I'm not sure this directly affects Tom's proposal. Selects from
hierarchies are implemented in terms of a union of all the classes in
the hierarchy. Wouldn't the compiled plan refer to physical ids? In any
case, if UNION can be made to work, I would think select hierarchies
automatically would work too.

Hiroshi Inoue wrote:

Show quoted text

-----Original Message-----
From: Chris Bitmead

Hiroshi Inoue wrote:

I don't understand inheritance well. In the near future wouldn't the
implementation require e.g. attid which is common to all children
of a parent and is never changed ? If so,we would need the third
attid field which is irrevalent to physical/logical position. If not,
physical column number would be sufficient .

We only need something like a unique attid of course if we support
column renaming in child tables. Otherwise the attname is sufficient to
match up child-parent columns.

There are some objects which keep plans etc as compiled
state.

create table t1 (i1 int4);
create table t2 (i2 int4) inherits t1;
create table t3 (i3 int4) inherits t2;
alter table t1 add column i4 int4;

For each table,the list of (column, logical number, physical number)
would be as follows.

t1 (i1, 1, 1) (i4, 2, 2)
t2 (i1, 1, 1) (i4, 2, 3) (i2, 3, 2)
t3 (i1, 1, 1) (i4, 2, 4) (i2, 3, 2) (i3, 4, 3)

At this point the compilation of 'select * from t1(*?)' would mean
select (physical #1),(physical #2) from t1 +
select (physical #1),(physical #3) from t2 +
select (physical #1),(physical #4) from t3

Note that physical # aren't common for column i4.
I've wanted to confirm that above compilation would be OK for
the (near) future enhancement of inheritance functionality.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#17)

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

create table t1 (i1 int4);
create table t2 (i2 int4) inherits t1;
create table t3 (i3 int4) inherits t2;
alter table t1 add column i4 int4;

For each table,the list of (column, logical number, physical number)
would be as follows.

t1 (i1, 1, 1) (i4, 2, 2)
t2 (i1, 1, 1) (i4, 2, 3) (i2, 3, 2)
t3 (i1, 1, 1) (i4, 2, 4) (i2, 3, 2) (i3, 4, 3)

At this point the compilation of 'select * from t1(*?)' would mean
select (physical #1),(physical #2) from t1 +
select (physical #1),(physical #3) from t2 +
select (physical #1),(physical #4) from t3

Note that physical # aren't common for column i4.

That's no different from the current situation: the planner already must
(and does) adjust column numbers for each derived table while expanding
an inherited query. It's kind of a pain but hardly an insurmountable
problem.

Currently the matching is done by column name. We could possibly match
on logical column position instead --- not sure if that's better or
worse.

regards, tom lane

#20Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Tom Lane (#19)
RE: ALTER TABLE DROP COLUMN

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

create table t1 (i1 int4);
create table t2 (i2 int4) inherits t1;
create table t3 (i3 int4) inherits t2;
alter table t1 add column i4 int4;

For each table,the list of (column, logical number, physical number)
would be as follows.

t1 (i1, 1, 1) (i4, 2, 2)
t2 (i1, 1, 1) (i4, 2, 3) (i2, 3, 2)
t3 (i1, 1, 1) (i4, 2, 4) (i2, 3, 2) (i3, 4, 3)

At this point the compilation of 'select * from t1(*?)' would mean
select (physical #1),(physical #2) from t1 +
select (physical #1),(physical #3) from t2 +
select (physical #1),(physical #4) from t3

Note that physical # aren't common for column i4.

That's no different from the current situation:

Yes your proposal has no problem currently. I'm only
anxious about oo feature. Recently there has been a
discussion around oo and we would be able to expect
the progress in the near future. If oo people never mind,
your proposal would be OK.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp