Re: ALTER TABLE DROP COLUMN

Started by Bruce Momjian over 25 years ago, 72 messages
#1Bruce Momjian
pgman@candle.pha.pa.us

OK, I am opening this can of worms again. I personally would like to
see this code activated, even if it does take 2x the disk space to alter
a column. Hiroshi had other ideas. Where did we leave this? We have
one month to decide on a plan.

Bruce Momjian <pgman@candle.pha.pa.us> writes:

You can exclusively lock the table, then do a heap_getnext() scan over
the entire table, remove the dropped column, do a heap_insert(), then a
heap_delete() on the current tuple, making sure to skip over the tuples
inserted by the current transaction. When completed, remove the column
from pg_attribute, mark the transaction as committed (if desired), and
run vacuum over the table to remove the deleted rows.

Hmm, that would work --- the new tuples commit at the same instant that
the schema updates commit, so it should be correct. You have the 2x
disk usage problem, but there's no way around that without losing
rollback ability.

A potentially tricky bit will be persuading the tuple-reading and tuple-
writing subroutines to pay attention to different versions of the tuple
structure for the same table. I haven't looked to see if this will be
difficult or not. If you can pass the TupleDesc explicitly then it
shouldn't be a problem.

I'd suggest that the cleanup vacuum *not* be an automatic part of
the operation; just recommend that people do it ASAP after dropping
a column. Consider needing to drop several columns...

regards, tom lane

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
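
For reference, the rewrite approach described above has the same net effect as
the manual copy-and-rename workaround, which can be sketched in plain SQL
(table and column names here are made up purely for illustration):

    -- copy every column except the one being dropped; this duplicate copy
    -- is where the temporary 2x disk usage comes from
    SELECT keep_col1, keep_col2
        INTO TABLE mytable_new
        FROM mytable;
    DROP TABLE mytable;
    ALTER TABLE mytable_new RENAME TO mytable;
    -- indexes, constraints, and grants on the old table must be recreated by hand

The internal implementation would do the equivalent inside one transaction, so
the new tuples become visible atomically with the pg_attribute update.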
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#1)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

OK, I am opening this can of worms again. I personally would like to
see this code activated, even if it does take 2x the disk space to alter
a column. Hiroshi had other ideas. Where did we leave this? We have
one month to decide on a plan.

I think the plan should be to do nothing for 7.1. ALTER DROP COLUMN
isn't an especially pressing feature, and so I don't feel that we
should be hustling to squeeze it in just before beta. We're already
overdue for beta.

regards, tom lane

#3Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Tom Lane (#2)
RE: ALTER TABLE DROP COLUMN

-----Original Message-----
From: Tom Lane

Bruce Momjian <pgman@candle.pha.pa.us> writes:

OK, I am opening this can of worms again. I personally would like to
see this code activated, even if it does take 2x the disk space to alter
a column. Hiroshi had other ideas. Where did we leave this? We have
one month to decide on a plan.

I think the plan should be to do nothing for 7.1. ALTER DROP COLUMN
isn't an especially pressing feature, and so I don't feel that we
should be hustling to squeeze it in just before beta. We're already
overdue for beta.

Seems some people expect the implementation in 7.1.
(recent [GENERAL] drop column?)
I could commit my local branch if people don't mind
backward incompatibility.
I've maintained the branch for more than 1 month
and it implements the following TODOs.

* Add ALTER TABLE DROP COLUMN feature
* ALTER TABLE ADD COLUMN to inherited table put column in wrong place
* Prevent column dropping if column is used by foreign key

Comments ?

Hiroshi Inoue

P.S. I've noticed that get_rte_attribute_name() seems to
break my implementation. I'm not sure if I could solve it.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#3)

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

Seems some people expect the implementation in 7.1.
(recent [GENERAL] drop column?)
I could commit my local branch if people don't mind
backward incompatibility.

I've lost track --- is this different from the _DROP_COLUMN_HACK__
code that's already in CVS? I really really didn't like that
implementation :-(, but I forget what other methods were being
discussed.

P.S. I've noticed that get_rte_attribute_name() seems to
break my implementation. I'm not sure if I could solve it.

That would be a problem --- rule dumping depends on that code to
produce correct aliases, so making it work is not optional.

regards, tom lane

#5Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Hiroshi Inoue (#3)

Tom Lane wrote:

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

Seems some people expect the implementation in 7.1.
(recent [GENERAL] drop column?)
I could commit my local branch if people don't mind
backward incompatibility.

I've lost track --- is this different from the _DROP_COLUMN_HACK__
code that's already in CVS? I really really didn't like that
implementation :-(, but I forget what other methods were being
discussed.

My current local trial implementation follows your idea (logical/
physical attribute numbers).

P.S. I've noticed that get_rte_attribute_name() seems to
break my implementation. I'm not sure if I could solve it.

That would be a problem --- rule dumping depends on that code to
produce correct aliases, so making it work is not optional.

Your change has no problem if logical==physical attribute
numbers.

Regards.

Hiroshi Inoue

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#5)

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

P.S. I've noticed that get_rte_attribute_name() seems to
break my implementation. I'm not sure if I could solve it.

That would be a problem --- rule dumping depends on that code to
produce correct aliases, so making it work is not optional.

Your change has no problem if logical==physical attribute
numbers.

But if they're not, what do we do? Can we define the order of the
alias-name lists as being one or the other numbering? (Offhand I'd
say it should be logical numbering, but I haven't chased the details.)
If neither of those work, we'll need some more complex datastructure
than a simple list.

regards, tom lane

#7Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Hiroshi Inoue (#3)

Tom Lane wrote:

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

P.S. I've noticed that get_rte_attribute_name() seems to
break my implementation. I'm not sure if I could solve it.

That would be a problem --- rule dumping depends on that code to
produce correct aliases, so making it work is not optional.

Your change has no problem if logical==physical attribute
numbers.

But if they're not, what do we do? Can we define the order of the
alias-name lists as being one or the other numbering? (Offhand I'd
say it should be logical numbering, but I haven't chased the details.)
If neither of those work, we'll need some more complex datastructure
than a simple list.

I'm not sure if we could keep invariant attribute numbers.
Though I've used physical attribute numbers as much as possible
in my trial implementation, there's already an exception.
I had to use logical attribute numbers for the FieldSelect node.

Regards.

Hiroshi Inoue

#8Philip Warner
pjw@rhyme.com.au
In reply to: Hiroshi Inoue (#7)

At 12:05 6/10/00 +0900, Hiroshi Inoue wrote:

Tom Lane wrote:

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

P.S. I've noticed that get_rte_attribute_name() seems to
break my implementation. I'm not sure if I could solve it.

That would be a problem --- rule dumping depends on that code to
produce correct aliases, so making it work is not optional.

Your change has no problem if logical==physical attribute
numbers.

But if they're not, what do we do? Can we define the order of the
alias-name lists as being one or the other numbering? (Offhand I'd
say it should be logical numbering, but I haven't chased the details.)
If neither of those work, we'll need some more complex datastructure
than a simple list.

I'm not sure if we could keep invariant attribute numbers.
Though I've used physical attribute numbers as much as possible
in my trial implementation, there's already an exception.
I had to use logical attribute numbers for the FieldSelect node.

Not really a useful suggestion at this stage, but it seems to me that
storing plans and/or parse trees is possibly a false economy. Would it be
worth considering storing the relevant SQL (or a parse tree with field &
table names) and compiling the rule in each backend the first time it is
used? (and keep it for the life of the backend).

This would allow underlying view tables to be deleted/added as well as make
the above problem go away. The 'parse tree with names' would also enable
easy construction of dependency information when and if that is implemented...

----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd. (A.B.N. 75 008 659 498)
Tel: (+61) 0500 83 82 81
Fax: (+61) 0500 83 82 82
Http://www.rhyme.com.au
PGP key available upon request, and from pgp5.ai.mit.edu:11371

#9The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#2)

seconded ...

On Fri, 29 Sep 2000, Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

OK, I am opening this can of worms again. I personally would like to
see this code activated, even if it does take 2x the disk space to alter
a column. Hiroshi had other ideas. Where did we leave this? We have
one month to decide on a plan.

I think the plan should be to do nothing for 7.1. ALTER DROP COLUMN
isn't an especially pressing feature, and so I don't feel that we
should be hustling to squeeze it in just before beta. We're already
overdue for beta.

regards, tom lane

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org

#10The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#4)

On Thu, 5 Oct 2000, Tom Lane wrote:

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

Seems some people expect the implementation in 7.1.
(recent [GENERAL] drop column?)
I could commit my local branch if people don't mind
backward incompatibility.

there have been several ideas thrown back and forth ... the best one that
I saw, forgetting who suggested it, had to do with the idea of locking the
table and doing an effective vacuum on that table with a 'row re-write'
happening ...

Basically, move the first 100 rows to the end of the table file, then take
100 and write it to position 0, 101 to position 1, etc ... that way, at
max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
size ... either method is going to lock the file for a period of time, but
one is much more friendly as far as disk space is concerned *plus*, if RAM
is available for this, it might even be something that the backend could
use up to -S blocks of RAM to do it off disk? If I set -S to 64meg, and
the table is 24Meg in size, it could do it all in memory?

#11Bruce Momjian
pgman@candle.pha.pa.us
In reply to: The Hermit Hacker (#10)

Basically, move the first 100 rows to the end of the table file, then take
100 and write it to position 0, 101 to position 1, etc ... that way, at
max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
size ... either method is going to lock the file for a period of time, but
one is much more friendly as far as disk space is concerned *plus*, if RAM
is available for this, it might even be something that the backend could
use up to -S blocks of RAM to do it off disk? If I set -S to 64meg, and
the table is 24Meg in size, it could do it all in memory?

Yes, I liked that too.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#11)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Basically, move the first 100 rows to the end of the table file, then take
100 and write it to position 0, 101 to position 1, etc ... that way, at
max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
size ... either method is going to lock the file for a period of time, but
one is much more friendly as far as disk space is concerned *plus*, if RAM
is available for this, it might even be something that the backend could
use up to -S blocks of RAM to do it off disk? If I set -S to 64meg, and
the table is 24Meg in size, it could do it all in memory?

Yes, I liked that too.

What happens if you crash partway through?

I don't think it's possible to build a crash-robust rewriting ALTER
process that doesn't use 2X disk space: you must have all the old tuples
AND all the new tuples down on disk simultaneously just before you
commit. The only way around 2X disk space is to adopt some logical
renumbering approach to the columns, so that you can pretend the dropped
column isn't there anymore when it really still is.

regards, tom lane

#13Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#12)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Basically, move the first 100 rows to the end of the table file, then take
100 and write it to position 0, 101 to position 1, etc ... that way, at
max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
size ... either method is going to lock the file for a period of time, but
one is much more friendly as far as disk space is concerned *plus*, if RAM
is available for this, it might even be something that the backend could
use up to -S blocks of RAM to do it off disk? If I set -S to 64meg, and
the table is 24Meg in size, it could do it all in memory?

Yes, I liked that too.

What happens if you crash partway through?

I don't think it's possible to build a crash-robust rewriting ALTER
process that doesn't use 2X disk space: you must have all the old tuples
AND all the new tuples down on disk simultaneously just before you
commit. The only way around 2X disk space is to adopt some logical
renumbering approach to the columns, so that you can pretend the dropped
column isn't there anymore when it really still is.

Yes, I liked the 2X disk space, and making the new tuples visible all at
once at the end.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#14The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#12)

On Mon, 9 Oct 2000, Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Basically, move the first 100 rows to the end of the table file, then take
100 and write it to position 0, 101 to position 1, etc ... that way, at
max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
size ... either method is going to lock the file for a period of time, but
one is much more friendly as far as disk space is concerned *plus*, if RAM
is available for this, it might even be something that the backend could
use up to -S blocks of RAM to do it off disk? If I set -S to 64meg, and
the table is 24Meg in size, it could do it all in memory?

Yes, I liked that too.

What happens if you crash partway through?

what happens if you crash partway through a vacuum?

I don't think it's possible to build a crash-robust rewriting ALTER
process that doesn't use 2X disk space: you must have all the old
tuples AND all the new tuples down on disk simultaneously just before
you commit. The only way around 2X disk space is to adopt some
logical renumbering approach to the columns, so that you can pretend
the dropped column isn't there anymore when it really still is.

how about a combination of the two? basically, we're gonna want a vacuum
of the table after the alter to clean out those extra columns that we've
marked as 'dead' ... basically, anything that avoids that whole 2x disk
space option is cool ...

#15The Hermit Hacker
scrappy@hub.org
In reply to: Bruce Momjian (#13)

On Mon, 9 Oct 2000, Bruce Momjian wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Basically, move the first 100 rows to the end of the table file, then take
100 and write it to position 0, 101 to position 1, etc ... that way, at
max, you are using ( tuple * 100 ) bytes of disk space, vs 2x the table
size ... either method is going to lock the file for a period of time, but
one is much more friendly as far as disk space is concerned *plus*, if RAM
is available for this, it might even be something that the backend could
use up to -S blocks of RAM to do it off disk? If I set -S to 64meg, and
the table is 24Meg in size, it could do it all in memory?

Yes, I liked that too.

What happens if you crash partway through?

I don't think it's possible to build a crash-robust rewriting ALTER
process that doesn't use 2X disk space: you must have all the old tuples
AND all the new tuples down on disk simultaneously just before you
commit. The only way around 2X disk space is to adopt some logical
renumbering approach to the columns, so that you can pretend the dropped
column isn't there anymore when it really still is.

Yes, I liked the 2X disk space, and making the new tuples visible all at
once at the end.

man, are you ever wishy-washy on this issue, aren't you? :) you like not
using 2x, you like using 2x ... :)

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: The Hermit Hacker (#14)

The Hermit Hacker <scrappy@hub.org> writes:

What happens if you crash partway through?

what happens if you crash partway through a vacuum?

Nothing. Vacuum is crash-safe. ALTER TABLE should be too.

regards, tom lane

#17The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#16)

On Mon, 9 Oct 2000, Tom Lane wrote:

The Hermit Hacker <scrappy@hub.org> writes:

What happens if you crash partway through?

what happens if you crash partway through a vacuum?

Nothing. Vacuum is crash-safe. ALTER TABLE should be too.

Sorry, that's what I meant ... why should marking a column as 'deleted'
and running a 'vacuum' to clean up the physical table be any less
crash-safe?

#18Bruce Momjian
pgman@candle.pha.pa.us
In reply to: The Hermit Hacker (#17)

On Mon, 9 Oct 2000, Tom Lane wrote:

The Hermit Hacker <scrappy@hub.org> writes:

What happens if you crash partway through?

what happens if you crash partway through a vacuum?

Nothing. Vacuum is crash-safe. ALTER TABLE should be too.

Sorry, that's what I meant ... why should marking a column as 'deleted'
and running a 'vacuum' to clean up the physical table be any less
crash-safe?

It is not. The only downside is 2x disk space to make new versions of
the tuple.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#19The Hermit Hacker
scrappy@hub.org
In reply to: Bruce Momjian (#18)

On Mon, 9 Oct 2000, Bruce Momjian wrote:

On Mon, 9 Oct 2000, Tom Lane wrote:

The Hermit Hacker <scrappy@hub.org> writes:

What happens if you crash partway through?

what happens if you crash partway through a vacuum?

Nothing. Vacuum is crash-safe. ALTER TABLE should be too.

Sorry, that's what I meant ... why should marking a column as 'deleted'
and running a 'vacuum' to clean up the physical table be any less
crash-safe?

It is not. The only downside is 2x disk space to make new versions of
the tuple.

huh? vacuum moves/cleans up tuples, as well as compresses them, so that
the end result is a smaller table than what it started with, with very
little increase in the total size/space needed to perform the vacuum ...

if we reduced vacuum such that it compressed at the field level vs tuple,
we could move a few tuples to the end of the table (crash safe) and then
move N+1 to position 1 minus that extra field. If we mark the column as
being deleted, then if the system crashes part way through, it should be
possible to continue after the system is brought up, no?

#20Bruce Momjian
pgman@candle.pha.pa.us
In reply to: The Hermit Hacker (#19)

Sorry, that's what I meant ... why should marking a column as 'deleted'
and running a 'vacuum' to clean up the physical table be any less
crash-safe?

It is not. The only downside is 2x disk space to make new versions of
the tuple.

huh? vacuum moves/cleans up tuples, as well as compresses them, so that
the end result is a smaller table than what it started with, with very
little increase in the total size/space needed to perform the vacuum ...

if we reduced vacuum such that it compressed at the field level vs tuple,
we could move a few tuples to the end of the table (crash safe) and then
move N+1 to position 1 minus that extra field. If we mark the column as
being deleted, then if the system crashes part way through, it should be
possible to continue after the system is brought up, no?

If it crashes in the middle, some rows have the column removed, and some
do not.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#21The Hermit Hacker
scrappy@hub.org
In reply to: Bruce Momjian (#20)

On Mon, 9 Oct 2000, Bruce Momjian wrote:

Sorry, that's what I meant ... why should marking a column as 'deleted'
and running a 'vacuum' to clean up the physical table be any less
crash-safe?

It is not. The only downside is 2x disk space to make new versions of
the tuple.

huh? vacuum moves/cleans up tuples, as well as compresses them, so that
the end result is a smaller table than what it started with, with very
little increase in the total size/space needed to perform the vacuum ...

if we reduced vacuum such that it compressed at the field level vs tuple,
we could move a few tuples to the end of the table (crash safe) and then
move N+1 to position 1 minus that extra field. If we mark the column as
being deleted, then if the system crashes part way through, it should be
possible to continue after the system is brought up, no?

If it crashes in the middle, some rows have the column removed, and some
do not.

hrmm .. mvcc uses a timestamp, no? is there no way of using that
timestamp to determine which columns have/haven't been cleaned up
following a crash? maybe some way of marking a table as being in a 'drop
column' mode, so that when it gets brought back up again, it is scanned for
any tuples older than that date?

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: The Hermit Hacker (#19)

The Hermit Hacker <scrappy@hub.org> writes:

It is not. The only downside is 2x disk space to make new versions of
the tuple.

huh? vacuum moves/cleans up tuples, as well as compresses them, so that
the end result is a smaller table than what it started with, with very
little increase in the total size/space needed to perform the vacuum ...

Huh? right back at you ;-). Vacuum is very careful to make sure that
it always has two copies of any tuple it moves. The reason it's not 2x
disk space is that it only moves tuples to fill free space in existing
pages of the file. So the moved tuples take up space-that-was-free as
well as the space they were originally in. But this has nothing
whatever to do with the requirements of ALTER DROP COLUMN --- to be
safe, that must have two copies of every tuple, free space or no free
space.

regards, tom lane

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: The Hermit Hacker (#21)

The Hermit Hacker <scrappy@hub.org> writes:

hrmm .. mvcc uses a timestamp, no? is there no way of using that
timestamp to determine which columns have/haven't been cleaned up
following a crash? maybe some way of marking a table as being in a 'drop
column' mode, so that when it gets brought back up again, it is scanned for
any tuples older than that date?

WAL would provide the framework to do something like that, but I still
say it'd be a bad idea. What you're describing is
irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
We're trying to get rid of statements that act that way, not add more.

I am not convinced that a 2x penalty for DROP COLUMN is such a huge
problem that we should give up all the normal safety features of SQL
in order to avoid it. Seems to me that DROP COLUMN is only a big issue
during DB development, when you're usually working with relatively small
amounts of test data anyway.

regards, tom lane

#24The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#23)

On Mon, 9 Oct 2000, Tom Lane wrote:

The Hermit Hacker <scrappy@hub.org> writes:

hrmm .. mvcc uses a timestamp, no? is there no way of using that
timestamp to determine which columns have/haven't been cleaned up
following a crash? maybe some way of marking a table as being in a 'drop
column' mode, so that when it gets brought back up again, it is scanned for
any tuples older than that date?

WAL would provide the framework to do something like that, but I still
say it'd be a bad idea. What you're describing is
irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
We're trying to get rid of statements that act that way, not add more.

Hrmmmm ... this one I can't really argue, or, at least, can't think of
anything right now :(

I am not convinced that a 2x penalty for DROP COLUMN is such a huge
problem that we should give up all the normal safety features of SQL
in order to avoid it. Seems to me that DROP COLUMN is only a big
issue during DB development, when you're usually working with
relatively small amounts of test data anyway.

Actually, I could see DROP COLUMN being useful in a few other places
... recently, I spent several hours re-structuring a client's database that
had been built by someone else who didn't know what 'relational' means in
RDBMS ... or how about an application developer that decides to
restructure their schemas in a new release and provides an 'upgrade.sql'
script that is designed to do this?

A good example might be the UDMSearch stuff, where you have tables that
are quite large, but they decide that they want to remove the 'base URL'
component of one table and put it into another table? A nice update
script could go something like this (pseudo-code):

ADD COLUMN base_url int;
INSERT INTO new_table SELECT base_url_text FROM table;
DROP COLUMN base_url_text;

That would make for a very painful upgrade process if I have to go through
the trouble of upgrading my hardware to add more space ...
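
Spelled out slightly more concretely (all names below are invented, and the
DROP COLUMN at the end is exactly the statement under discussion here), such
an upgrade script might look like:

    -- move the base-URL text out of a large table into its own table,
    -- leaving only an integer reference behind
    CREATE TABLE base_urls (id serial, url text);
    INSERT INTO base_urls (url)
        SELECT DISTINCT base_url_text FROM pages;
    ALTER TABLE pages ADD COLUMN base_url int;
    UPDATE pages SET base_url = b.id
        FROM base_urls b
        WHERE b.url = pages.base_url_text;
    ALTER TABLE pages DROP COLUMN base_url_text;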

#25Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#23)

The Hermit Hacker <scrappy@hub.org> writes:

hrmm .. mvcc uses a timestamp, no? is there no way of using that
timestamp to determine which columns have/haven't been cleaned up
following a crash? maybe some way of marking a table as being in a 'drop
column' mode, so that when it gets brought back up again, it is scanned for
any tuples older than that date?

WAL would provide the framework to do something like that, but I still
say it'd be a bad idea. What you're describing is
irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
We're trying to get rid of statements that act that way, not add more.

I am not convinced that a 2x penalty for DROP COLUMN is such a huge
problem that we should give up all the normal safety features of SQL
in order to avoid it. Seems to me that DROP COLUMN is only a big issue
during DB development, when you're usually working with relatively small
amounts of test data anyway.

Bingo!

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#26The Hermit Hacker
scrappy@hub.org
In reply to: Bruce Momjian (#25)

On Mon, 9 Oct 2000, Bruce Momjian wrote:

The Hermit Hacker <scrappy@hub.org> writes:

hrmm .. mvcc uses a timestamp, no? is there no way of using that
timestamp to determine which columns have/haven't been cleaned up
following a crash? maybe some way of marking a table as being in a 'drop
column' mode, so that when it gets brought back up again, it is scanned for
any tuples older than that date?

WAL would provide the framework to do something like that, but I still
say it'd be a bad idea. What you're describing is
irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
We're trying to get rid of statements that act that way, not add more.

I am not convinced that a 2x penalty for DROP COLUMN is such a huge
problem that we should give up all the normal safety features of SQL
in order to avoid it. Seems to me that DROP COLUMN is only a big issue
during DB development, when you're usually working with relatively small
amounts of test data anyway.

Bingo!

you are jumping on your 'I agree/Bingo' much much too fast :)

#27Bruce Momjian
pgman@candle.pha.pa.us
In reply to: The Hermit Hacker (#26)

On Mon, 9 Oct 2000, Bruce Momjian wrote:

The Hermit Hacker <scrappy@hub.org> writes:

hrmm .. mvcc uses a timestamp, no? is there no way of using that
timestamp to determine which columns have/haven't been cleaned up
following a crash? maybe some way of marking a table as being in a 'drop
column' mode, so that when it gets brought back up again, it is scanned for
any tuples older than that date?

WAL would provide the framework to do something like that, but I still
say it'd be a bad idea. What you're describing is
irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
We're trying to get rid of statements that act that way, not add more.

I am not convinced that a 2x penalty for DROP COLUMN is such a huge
problem that we should give up all the normal safety features of SQL
in order to avoid it. Seems to me that DROP COLUMN is only a big issue
during DB development, when you're usually working with relatively small
amounts of test data anyway.

Bingo!

you are jumping on your 'I agree/Bingo' much much too fast :)

You know this DROP COLUMN is a hot button for me.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#28The Hermit Hacker
scrappy@hub.org
In reply to: Bruce Momjian (#27)

On Mon, 9 Oct 2000, Bruce Momjian wrote:

On Mon, 9 Oct 2000, Bruce Momjian wrote:

The Hermit Hacker <scrappy@hub.org> writes:

hrmm .. mvcc uses a timestamp, no? is there no way of using that
timestamp to determine which columns have/haven't been cleaned up
following a crash? maybe some way of marking a table as being in a 'drop
column' mode, so that when it gets brought back up again, it is scanned for
any tuples older than that date?

WAL would provide the framework to do something like that, but I still
say it'd be a bad idea. What you're describing is
irrevocable-once-it-starts DROP COLUMN; there is no way to roll it back.
We're trying to get rid of statements that act that way, not add more.

I am not convinced that a 2x penalty for DROP COLUMN is such a huge
problem that we should give up all the normal safety features of SQL
in order to avoid it. Seems to me that DROP COLUMN is only a big issue
during DB development, when you're usually working with relatively small
amounts of test data anyway.

Bingo!

you are jumping on your 'I agree/Bingo' much much too fast :)

You know this DROP COLUMN is a hot button for me.

Ya, but in one email, you appear to agree with me ... then Tom posts a
good point and you jump over to that side ... at least pick a side? :) I
too wish to see it implemented, I just don't want to have to double my
disk space if at some point I decide to upgrade an application and find
out that they decided to change their schema(s) :(

#29Bruce Momjian
pgman@candle.pha.pa.us
In reply to: The Hermit Hacker (#28)

I am not convinced that a 2x penalty for DROP COLUMN is such a huge
problem that we should give up all the normal safety features of SQL
in order to avoid it. Seems to me that DROP COLUMN is only a big issue
during DB development, when you're usually working with relatively small
amounts of test data anyway.

Bingo!

you are jumping on your 'I agree/Bingo' much much too fast :)

You know this DROP COLUMN is a hot button for me.

Ya, but in one email, you appear to agree with me ... then Tom posts a
good point and you jump over to that side ... at least pick a side? :) I
too wish to see it implemented, I just don't want to have to double my
disk space if at some point I decide to upgrade an application and find
out that they decided to change their schema(s) :(

Sorry, I liked the vacuum idea, but with 2x disk usage, not 100 rows at a time.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#30Don Baccus
dhogaza@pacifier.com
In reply to: The Hermit Hacker (#24)

At 07:55 PM 10/9/00 -0300, The Hermit Hacker wrote:

I am not convinced that a 2x penalty for DROP COLUMN is such a huge
problem that we should give up all the normal safety features of SQL
in order to avoid it. Seems to me that DROP COLUMN is only a big
issue during DB development, when you're usually working with
relatively small amounts of test data anyway.

Actually, I could see DROP COLUMN being useful in a few other places
... recently, I spent several hours re-structuring a client's database that
had been built by someone else who didn't know what 'relational' means in
RDBMS ... or how about an application developer that decides to
restructure their schemas in a new release and provides an 'upgrade.sql'
script that is designed to do this?

This last example is one reason DROP COLUMN would be a great help to
the OpenACS development effort.

However, upgrades (new releases) are fairly infrequent, and many users of
current versions won't bother unless they've run into toolkit bugs
(same goes for updating PG). Those who do know that doing an upgrade
will require planning, testing on a system that's not running their
"live" website, and some amount of downtime.

So I don't think a 2x penalty is a huge problem.

That would make for a very painful upgrade process if I have to go through
the trouble of upgrading my hardware to add more space ...

For many folks, if eating 2x the size of a single table runs their system out
of disk space, clearly they should've upgraded long, long ago. An OpenACS
site has hundreds of tables, I can't imagine running my disk space so tight
that I couldn't double the size of one of them long enough to do a DROP
COLUMN.

Obviously, some folks doing other things will have single tables that are huge,
but after all they can always do what they do now - not drop columns.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.

#31Hannu Krosing
hannu@tm.ee
In reply to: The Hermit Hacker (#28)

The Hermit Hacker wrote:

On Mon, 9 Oct 2000, Bruce Momjian wrote:

Ya, but in one email, you appear to agree with me ... then Tom posts a
good point and you jump over to that side ... at least pick a side? :) I
too wish to see it implemented, I just don't want to have to double my
disk space if at some point I decide to upgrade an application and find
out that they decided to change their schema(s) :(

As Don already pointed out, if you don't have enough room to double your
table size you must be running a one-table, append-only application where
you can only do a very limited set of queries.

select * from that_table order by some_column_without_an_index; is definitely
out as it takes many times the space of that_table anyway.

There _may_ be some cases where 2x is unacceptable, but without radically
changing the tables' on-disk structure there is no way to avoid it and still be
able to roll back or even crash cleanly ;)

-----------------
Hannu

#32Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Hannu Krosing (#31)
AW: ALTER TABLE DROP COLUMN

being deleted, then if the system crashes part way through, it should be
possible to continue after the system is brought up, no?

If it crashes in the middle, some rows have the column removed, and some
do not.

We would need to know where this separation is, but we cannot do a rollback,
only a roll forward. Thus probably not acceptable.

Andreas

#33Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Zeugswetter Andreas SB (#32)
AW: ALTER TABLE DROP COLUMN

WAL would provide the framework to do something like that, but I still
say it'd be a bad idea. What you're describing is
irrevocable-once-it-starts DROP COLUMN; there is no way to
roll it back.
We're trying to get rid of statements that act that way, not add more.

Yes.

I am not convinced that a 2x penalty for DROP COLUMN is such a huge
problem that we should give up all the normal safety features of SQL
in order to avoid it. Seems to me that DROP COLUMN is only a
big issue during DB development, when you're usually working with
relatively small amounts of test data anyway.

Here I don't agree: the statement can also be used for an application version
upgrade. This is seen in SAP/R3 with tables > 30 GB.

My conclusion would be that we need both:
1. a fast system table only solution with physical/logical column id
2. a tool that does the cleanup (e.g. vacuum)

Andreas

#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas SB (#33)
Re: AW: ALTER TABLE DROP COLUMN

Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:

My conclusion would be that we need both:
1. a fast system table only solution with physical/logical column id
2. a tool that does the cleanup (e.g. vacuum)

But the peak space usage during cleanup must still be 2X.

regards, tom lane

#35The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#34)
Re: AW: ALTER TABLE DROP COLUMN

On Thu, 12 Oct 2000, Tom Lane wrote:

Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:

My conclusion would be that we need both:
1. a fast system table only solution with physical/logical column id
2. a tool that does the cleanup (e.g. vacuum)

But the peak space usage during cleanup must still be 2X.

Is there no way of doing this such that we have N tuple types in the
table? So that UPDATE/INSERTs are minus the extra column, while the old
ones just have that column marked as deleted? Maybe change the stored
value of the deleted field to some internal value that, when vacuum, or
any other operation, sees it, it 'ignores' that field? Maybe something
such that when you do an 'alter table drop', it effectively does an UPDATE on
that field to set it to the 'drop column' value?

#36Tom Lane
tgl@sss.pgh.pa.us
In reply to: The Hermit Hacker (#35)
Re: AW: ALTER TABLE DROP COLUMN

The Hermit Hacker <scrappy@hub.org> writes:

On Thu, 12 Oct 2000, Tom Lane wrote:

Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:

My conclusion would be that we need both:
1. a fast system table only solution with physical/logical column id
2. a tool that does the cleanup (e.g. vacuum)

But the peak space usage during cleanup must still be 2X.

Is there no way of doing this such that we have N tuple types in the
table? So that UPDATE/INSERTs are minus the extra column, while the old
ones just have that column marked as deleted?

If we bite the bullet to the extent of supporting a distinction between
physical and logical column numbers, then ISTM there's no strong need
to do any of this other stuff at all. I'd expect that an inserted or
updated tuple would have a NULL in any physical column position that
doesn't have an equivalent logical column, so the space cost is minimal
(zero, in fact, if there are any other NULLs in the tuple). Over time
the space occupied by deleted-column data would gradually go away as
tuples got updated.
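
Purely as an illustration of that idea (the DROP COLUMN statement itself is of
course the hypothetical part here), the space could then be reclaimed
incrementally with nothing more than ordinary statements:

    ALTER TABLE big_table DROP COLUMN old_col;  -- catalog-only under the
                                                -- logical/physical numbering scheme
    UPDATE big_table SET some_col = some_col;   -- each updated row gets a new tuple
                                                -- version with a NULL in old_col's slot
    VACUUM big_table;                           -- reclaims the space of the old versions

Such a forced full-table UPDATE needs room for the new tuple versions too, of
course, which is why simply letting the space trickle back as rows are updated
in the normal course of things may be good enough.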

I really don't see why we're expending so much discussion on ways to
reformat all the tuples at once. It can't be done cheaply and I see
no real reason to do it at all, so it seems like we have many
more-profitable problems to work on.

regards, tom lane

#37The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#36)
Re: AW: ALTER TABLE DROP COLUMN

On Thu, 12 Oct 2000, Tom Lane wrote:

The Hermit Hacker <scrappy@hub.org> writes:

On Thu, 12 Oct 2000, Tom Lane wrote:

Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:

My conclusion would be that we need both:
1. a fast system table only solution with physical/logical column id
2. a tool that does the cleanup (e.g. vacuum)

But the peak space usage during cleanup must still be 2X.

Is there no way of doing this such that we have N tuple types in the
table? So that UPDATE/INSERTs are minus the extra column, while the old
ones just have that column marked as deleted?

If we bite the bullet to the extent of supporting a distinction between
physical and logical column numbers, then ISTM there's no strong need
to do any of this other stuff at all.

what does/would it take to implement this?

#38Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: The Hermit Hacker (#37)
AW: AW: ALTER TABLE DROP COLUMN

My conclusion would be that we need both:
1. a fast system table only solution with physical/logical column id
2. a tool that does the cleanup (e.g. vacuum)

But the peak space usage during cleanup must still be 2X.

The difference for a cleanup would be that it does not need to
be rolled back as a whole (like the current vacuum).
A cleanup could be partly done and resumed later; thus a sophisticated
cleanup could avoid 2X.

Andreas

#39Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Zeugswetter Andreas SB (#38)
AW: AW: ALTER TABLE DROP COLUMN

If we bite the bullet to the extent of supporting a distinction between
physical and logical column numbers, then ISTM there's no strong need
to do any of this other stuff at all. I'd expect that an inserted or
updated tuple would have a NULL in any physical column position that
doesn't have an equivalent logical column, so the space cost
is minimal
(zero, in fact, if there are any other NULLs in the tuple). Over time
the space occupied by deleted-column data would gradually go away as
tuples got updated.

This said, I think Hiroshi's patch seems a perfect starting point, no ?

Andreas

#40Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Zeugswetter Andreas SB (#39)
Re: AW: AW: ALTER TABLE DROP COLUMN

If we bite the bullet to the extent of supporting a distinction between
physical and logical column numbers, then ISTM there's no strong need
to do any of this other stuff at all. I'd expect that an inserted or
updated tuple would have a NULL in any physical column position that
doesn't have an equivalent logical column, so the space cost
is minimal
(zero, in fact, if there are any other NULLs in the tuple). Over time
the space occupied by deleted-column data would gradually go away as
tuples got updated.

This said, I think Hiroshi's patch seems a perfect starting point, no ?

Having phantom columns adds additional complexity to the system overall.
We have to decide we really want it before making things more complex
than they already are.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#41The Hermit Hacker
scrappy@hub.org
In reply to: Bruce Momjian (#40)
Re: AW: AW: ALTER TABLE DROP COLUMN

On Fri, 13 Oct 2000, Bruce Momjian wrote:

If we bite the bullet to the extent of supporting a distinction between
physical and logical column numbers, then ISTM there's no strong need
to do any of this other stuff at all. I'd expect that an inserted or
updated tuple would have a NULL in any physical column position that
doesn't have an equivalent logical column, so the space cost
is minimal
(zero, in fact, if there are any other NULLs in the tuple). Over time
the space occupied by deleted-column data would gradually go away as
tuples got updated.

This said, I think Hiroshi's patch seems a perfect starting point, no ?

Having phantom columns adds additional complexity to the system overall.
We have to decide we really want it before making things more complex
than they already are.

My feeling from Tom's email, about changing the "structure" of how a column
is defined, is that he thinks *that* will simplify things, not
make them more complex, but I may be reading things wrong.

Hiroshi's patch would make for a good starting point by bringing in the
ability to do the DROP COLUMN feature, as I understand, without the
rollback capability, with the changes that Tom is proposing bringing it to
a 'rollbackable' stage ...

Again, maybe I am misunderstanding Tom's comments, but the whole column
issue itself sounded like something he wanted to see happen anyway ...

#42Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: The Hermit Hacker (#41)
AW: AW: AW: ALTER TABLE DROP COLUMN

Hiroshi's patch would make for a good starting point by bringing in the
ability to do the DROP COLUMN feature, as I understand, without the
rollback capability,

No, Hiroshi's patch is rollback-enabled, simply because all it does is change
some system tables. It just does not free the space that is used by "old" phantom
columns. This cleanup would need extra work, but for now, I guess it would be fine
to simply say that if you want to regain the space, create a new table and move the
data.

Andreas

#43Hannu Krosing
hannu@tm.ee
In reply to: Zeugswetter Andreas SB (#33)
Re: AW: ALTER TABLE DROP COLUMN

Tom Lane wrote:

Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:

My conclusion would be that we need both:
1. a fast system table only solution with physical/logical column id
2. a tool that does the cleanup (e.g. vacuum)

But the peak space usage during cleanup must still be 2X.

Perhaps he means some kind of off-line cleanup tool that has only the
requirement of being able to continue from where it left off (or
crashed)?

It could be useful in some cases (like removing a column on Saturday
from a terabyte-sized file that is used only on weekdays :)

----------
Hannu

#44Don Baccus
dhogaza@pacifier.com
In reply to: Zeugswetter Andreas SB (#33)
Re: AW: ALTER TABLE DROP COLUMN

At 04:23 PM 10/12/00 +0200, Zeugswetter Andreas SB wrote:

My conclusion would be that we need both:
1. a fast system table only solution with physical/logical column id
2. a tool that does the cleanup (e.g. vacuum)

Oracle provides both styles of "drop column" - the "hide the column's
data and make it logically disappear" style, and the "grind through
and delete all the data as well as make the column disappear from
view". So there's evidence of a need for both styles.

If you choose the "hide the data" style, I don't know if you can later
recover that space.

However, despite the above I think a 2x "grind through and remove the
data" DROP COLUMN would be a welcome first addition, and would meet the
needs of a very high percentage of the current user base. A future
option to just hide the data would most likely be welcome, too.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.

#45The Hermit Hacker
scrappy@hub.org
In reply to: Zeugswetter Andreas SB (#42)
Re: AW: AW: AW: ALTER TABLE DROP COLUMN

On Fri, 13 Oct 2000, Zeugswetter Andreas SB wrote:

Hiroshi's patch would make for a good starting point by bringing in the
ability to do the DROP COLUMN feature, as I understand, without the
rollback capability,

No, Hiroshi's patch is rollback-enabled, simply because all it does is
change some system tables. It just does not free the space that is used by
"old" phantom columns. This cleanup would need extra work, but for
now, I guess it would be fine to simply say that if you want to regain
the space, create a new table and move the data.

okay, but, again based on my impression of what Tom has stated, and
previous conversations on this topic, the key problem is: if I
drop a column and at a later date decide to add a new column of the same name,
what happens?

I *believe* it's that condition that Tom's thought about physical vs
logical column numbers was about ... Tom?

#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#40)
Re: AW: AW: ALTER TABLE DROP COLUMN

Bruce Momjian <pgman@candle.pha.pa.us> writes:

This said, I think Hiroshi's patch seems a perfect starting point, no ?

Having phantom columns adds additional complexity to the system overall.
We have to decide we really want it before making things more complex
than they already are.

I think we do, because it solves more than just the ALTER DROP COLUMN
problem: it cleans up other sore spots too. Like ALTER TABLE ADD COLUMN
in a table with child tables.

Of course, it depends on just how ugly and intrusive the code changes
are to make physical and logical columns distinct. I'd like to think
that some fairly limited changes in and around heap_getattr would do
most of the trick. If we need something as messy as the first-cut
DROP_COLUMN_HACK then I'll look for another way...

regards, tom lane

#47Tom Lane
tgl@sss.pgh.pa.us
In reply to: The Hermit Hacker (#45)
Re: AW: AW: AW: ALTER TABLE DROP COLUMN

The Hermit Hacker <scrappy@hub.org> writes:

okay, but, again based on my impression of what Tom has stated, and
previous conversations on this topic, the key problem is: if I
drop a column and at a later date decide to add a new column of the same name,
what happens?

I'm not very worried about that --- presumably the old column is either
marked as dead or gone entirely from pg_attribute, so why should there
be any confusion?

regards, tom lane

#48The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#47)
Re: AW: AW: AW: ALTER TABLE DROP COLUMN

On Fri, 13 Oct 2000, Tom Lane wrote:

The Hermit Hacker <scrappy@hub.org> writes:

okay, but, again based on my impression of what Tom has stated, and
previous conversations on this topic, the key problem is: if I
drop a column and at a later date decide to add a new column of the same name,
what happens?

I'm not very worried about that --- presumably the old column is either
marked as dead or gone entirely from pg_attribute, so why should there
be any confusion?

right, but, if you add a column after it was previously marked dead, does
it then create a second column in pg_attribute with that same
name? *raised eyebrow*

#49Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Don Baccus (#44)
RE: AW: ALTER TABLE DROP COLUMN

-----Original Message-----
From: Don Baccus [mailto:dhogaza@pacifier.com]

At 04:23 PM 10/12/00 +0200, Zeugswetter Andreas SB wrote:

My conclusion would be that we need both:
1. a fast system table only solution with physical/logical column id
2. a tool that does the cleanup (e.g. vacuum)

Oracle provides both styles of "drop column" - the "hide the column's
data and make it logically disappear" style, and the "grind through
and delete all the data as well as make the column disappear from
view". So there's evidence of a need for both styles.

If you choose the "hide the data" style, I don't know if you can later
recover that space.

However, despite the above I think a 2x "grind through and remove the
data" DROP COLUMN would be a welcome first addition, and would meet the
needs of a very high percentage of the current user base.

This style of "DROP COLUMN" would change the attribute
numbers whose positions are after the dropped column.
Unfortunately we have no mechanism to invalidate/remove
objects (or prepared plans) which use such attribute numbers.
And I've seen no proposal/discussion to solve this problem
for the DROP COLUMN feature. We wouldn't be able to prevent
PostgreSQL from doing the wrong thing silently.
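
A made-up example of the hazard: the rules that implement a view store their
column references as attribute numbers, so renumbering underneath them goes
wrong silently.

    CREATE TABLE t (a int, b int, c int);   -- pg_attribute: a = 1, b = 2, c = 3
    CREATE VIEW v AS SELECT c FROM t;       -- the stored rule refers to c as
                                            -- attribute number 3
    -- a "grind through and renumber" DROP COLUMN of b would leave c with
    -- attribute number 2, while the stored rule for v still points at number 3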

When I used Oracle, I saw neither option of the DROP COLUMN
feature. It seems to tell us that the implementation isn't
that easy. It may not be a bad choice to give up the DROP
COLUMN feature forever.

A future
option to just hide the data would most likely be welcome, too.

My trial implementation using physical/logical attribute numbers
isn't as clean as I expected. I'm inclined to restrict my change to
fix the TODO
* ALTER TABLE ADD COLUMN to inherited table put column in wrong place
though it would also introduce a backward incompatibility.
I could live without the DROP COLUMN feature though I couldn't
live without the ADD COLUMN feature.

Comments ?

Regards.

Hiroshi Inoue

#50Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

My trial implementation using physical/logical attribute numbers
isn't as clean as I expected. I'm inclined to restrict my change to
fix the TODO
* ALTER TABLE ADD COLUMN to inherited table put column in wrong place
though it would also introduce a backward incompatibility.
I could live without the DROP COLUMN feature though I couldn't
live without the ADD COLUMN feature.

We have a DROP COLUMN option in the FAQ, so I don't see a rush there.
Sounds like we need your fix for add column with inheritance, but I
suppose the 2x fix for DROP COLUMN could be used to add columns too.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#51KuroiNeko
evpopkov@carrier.kiev.ua
In reply to: Hiroshi Inoue (#49)
RE: AW: ALTER TABLE DROP COLUMN

Inoue san,

This style of "DROP COLUMN" would change the attribute
numbers whose positions are after the dropped column.
Unfortunately we have no mechanism to invalidate/remove
objects (or prepared plans) which use such attribute numbers.

1 create table alpha( id int4, payload text );
2 insert into alpha( id, payload ) values( 0, 'zero' );
3 create table t( payload text );
4 insert into t( payload ) select payload from alpha;
5 drop table alpha;
6 alter table t rename to alpha;

Not a big deal, right? Also, drop column isn't really needed
that often and requires a lot of manual processing, like updating
views/rules/procedures etc.
On the other hand, when dropping a column (or multiple columns) in a table
with 10+ columns, statements 3 and 4 above may become quite painful.
It'd be nice if drop column were `expanded' to the appropriate queries
automatically. Not sure about the abovementioned attribute numbers in such a
case.
In general, however, if drop column is the only statement that is likely
to affect attribute numbers this way (assuming that add column always adds
and never inserts an attribute), then a fairly simple function in plpgsql,
shipped with template1, will probably do. At least it should work to drop a
single column, because a full-featured function would require an argument list
of variable length.

Ed

---
Well I tried to be meek
And I have tried to be mild
But I spat like a woman
And I sulked like a child
I have lived behind the walls
That have made me alone
Striven for peace
Which I never have known

Dire Straits, Brothers In Arms, The Man's Too Strong (Knopfler)

#52KuroiNeko
evpopkov@carrier.kiev.ua
In reply to: Hiroshi Inoue (#49)
2 attachment(s)
select oid .... for update ....

Here's something I don't understand.... If I'm missing something obvious, please
feel free to kick me in the right direction.
We're given two tables, linked with a foreign key. When an insert is run on the
master table, everything is OK; when trying to insert into a detail table,
a strange query appears in the log (schema and log snippet attached).
In fact, there should be no problem, but if a user has no permission to
update the master table (e.g., this is a log where he can only insert, but
neither delete nor update), then inserting into the detail table fails on a
permission violation at the query:

SELECT oid FROM "alpha" WHERE "id" = $1 FOR UPDATE OF "alpha"

TIA

Ed

---
Well I tried to be meek
And I have tried to be mild
But I spat like a woman
And I sulked like a child
I have lived behind the walls
That have made me alone
Striven for peace
Which I never have known

Dire Straits, Brothers In Arms, The Man's Too Strong (Knopfler)

Attachments:

t.sql (text/plain; charset=us-ascii)
pgsql (text/plain; charset=us-ascii)
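
The attached schema and log are not reproduced here, but a minimal setup that
shows the same behaviour would look roughly like this (names invented,
behaviour as reported for 7.0.x):

    CREATE TABLE alpha (id int4 PRIMARY KEY, payload text);
    CREATE TABLE beta  (alpha_id int4 REFERENCES alpha (id), note text);
    GRANT INSERT ON alpha TO loguser;    -- note: no UPDATE privilege on alpha
    GRANT INSERT ON beta  TO loguser;

    -- as loguser:
    INSERT INTO alpha VALUES (1, 'master');   -- works
    INSERT INTO beta  VALUES (1, 'detail');   -- the RI trigger then runs the
                                              -- SELECT ... FOR UPDATE shown above,
                                              -- which requires UPDATE on alpha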
#53Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

My trial implementation using physical/logical attribute numbers
isn't as clean as I expected. I'm inclined to restrict my change to
fixing the TODO
* ALTER TABLE ADD COLUMN to inherited table put column in wrong place
though it would also introduce a backward incompatibility.

I'm confused --- how will that make things any simpler or cleaner?
You still need the physical/logical column numbering distinction in order
to fix inherited ADD COLUMN, don't you?

regards, tom lane

#54Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: KuroiNeko (#52)
Re: select oid .... for update ....

This is a known problem in 7.0.x (see mailing list archives for more
information). Peter E has a patch for 7.1 to remove this problem.

Stephan Szabo
sszabo@bigpanda.com

On Sun, 15 Oct 2000, KuroiNeko wrote:

Here's something I don't understand.... If I'm missing something obvious, please
feel free to kick me in the right direction.
We're given two tables, linked with a foreign key. When an insert is run on the
master table, everything is OK; when trying to insert into the detail table,
a strange query appears in the log (schema and log snippet attached).
In fact, there should be no problem, but if a user has no permission to
update the master table (e.g., it is a log where he can only insert, but
neither delete nor update), then inserting into the detail table fails on a
permission violation at the query:

SELECT oid FROM "alpha" WHERE "id" = $1 FOR UPDATE OF "alpha"

#55KuroiNeko
evpopkov@carrier.kiev.ua
In reply to: Stephan Szabo (#54)
Permissions, was select oid

This is a known problem in 7.0.x (see mailing list archives for more
information). Peter E has a patch for 7.1 to remove this problem.

Thanks and sorry for the hassle.
While we're on it, if there's any work going on regarding permissions (separating
update/delete etc.), I'd be glad to offer my help, if needed.

Thx

--

Well I tried to be meek
And I have tried to be mild
But I spat like a woman
And I sulked like a child
I have lived behind the walls
That have made me alone
Striven for peace
Which I never have known

Dire Straits, Brothers In Arms, The Man's Too Strong (Knopfler)

#56Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Tom Lane (#53)
RE: AW: ALTER TABLE DROP COLUMN

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

My trial implementation using physical/logical attribute numbers
isn't as clean as I expected. I'm inclined to restrict my change to
fixing the TODO
* ALTER TABLE ADD COLUMN to inherited table put column in wrong place
though it would also introduce a backward incompatibility.

I'm confused --- how will that make things any simpler or cleaner?
You still need the physical/logical column numbering distinction in order
to fix inherited ADD COLUMN, don't you?

Yes, the implementation would be almost the same.
I've been busy for some time and wasn't able to follow
this thread. Don't people love 2x DROP COLUMN ?
I don't object to 2x DROP COLUMN if it could be
implemented properly, though I don't want to implement
it myself. However, I would strongly object to 2x ADD
COLUMN if such an implementation is proposed.

Regards.

Hiroshi Inoue

#57Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

KuroiNeko wrote:

Inoue san,

This style of "DROP COLUMN" would change the attribute
numbers whose positions are after the dropped column.
Unfortunately we have no mechanism to invalidate/remove
objects (or prepared plans) which use such attribute numbers.

1 create table alpha( id int4, payload text );
2 insert into alpha( id, payload ) values( 0, 'zero' );
3 create table t( payload text );
4 insert into t( payload ) select payload from alpha;
5 drop table alpha;
6 alter table t rename to alpha;

Not a big deal, right?

Yes, there's a similar procedure in the FAQ.

Also, drop column isn't really needed
that often and requires a lot of manual processing, like updating
views/rules/procedures etc.

The FAQ doesn't refer to that manual processing at all.
Certainly it's very difficult to cover all the procedures needed to
accomplish "DROP COLUMN". It's one of the reasons why
I've said "DROP COLUMN" isn't that easy.

On the other hand, when dropping a column (or several columns) in a table
with 10+ columns, statements 3 and 4 above may become quite painful.
It'd be nice if drop column were `expanded' to appropriate queries
automatically. Not sure about the above-mentioned attribute numbers in such
a case.
In general, however, if drop column is the only statement that is likely
to affect attribute numbers this way (assuming that add column always appends
and never inserts an attribute), then a fairly simple function in plpgsql,
shipped with template1, will probably do.

plpgsql functions are executed in a transaction.
I don't think plpgsql could execute
"insert (select into) -> drop -> rename"
properly (at least currently).

Regards.

Hiroshi Inoue

At least it should work to drop a
single column, because a full-featured function would require an argument
list of variable length.

Ed

---
Well I tried to be meek
And I have tried to be mild
But I spat like a woman
And I sulked like a child
I have lived behind the walls
That have made me alone
Striven for peace
Which I never have known

Dire Straits, Brothers In Arms, The Man's Too Strong (Knopfler)

#58Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Bruce Momjian (#40)
Re: AW: AW: ALTER TABLE DROP COLUMN

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

This said, I think Hiroshi's patch seems a perfect starting point, no ?

Having phantom columns adds additional complexity to the system overall.
We have to decide we really want it before making things more complex
than they already are.

I think we do, because it solves more than just the ALTER DROP COLUMN
problem: it cleans up other sore spots too. Like ALTER TABLE ADD COLUMN
in a table with child tables.

Of course, it depends on just how ugly and intrusive the code changes
are to make physical and logical columns distinct. I'd like to think
that some fairly limited changes in and around heap_getattr would do
most of the trick. If we need something as messy as the first-cut
DROP_COLUMN_HACK then I'll look for another way...

Hmm, the implementation using physical and logical attribute numbers
would be much more complicated than the first-cut DROP_COLUMN_HACK.
There's no simpler way than the first-cut DROP_COLUMN_HACK.
I see no progress on a 2x DROP COLUMN implementation.

How about giving up DROP COLUMN forever ?

Regards.

Hiroshi Inoue

#59Chris
chris@bitmead.com
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

KuroiNeko wrote:

1 create table alpha( id int4, payload text );

<snip>

Not a big deal, right?

Yes a big deal. You just lost all your oids.

#60Chris
chris@bitmead.com
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

Hiroshi Inoue wrote:

When I used Oracle, I saw neither option of the DROP
COLUMN feature. It seems to tell us that the
implementation isn't
that easy. It may not be a bad choice to give up the DROP
COLUMN feature forever.

Because it's not easy we shouldn't do it? I don't think so. The perfect
solution is lazy updating of tuples but it requires versioning of
meta-data and that requires a bit of work.

However, I would strongly object to 2x
ADD COLUMN if such an implementation is proposed.

Not even 2x for ADD COLUMN DEFAULT ?

#61Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

Chris wrote:

Hiroshi Inoue wrote:

When I used Oracle, I saw neither option of the DROP
COLUMN feature. It seems to tell us that the
implementation isn't
that easy. It may not be a bad choice to give up the DROP
COLUMN feature forever.

Because it's not easy we shouldn't do it? I don't think so. The perfect
solution is lazy updating of tuples but it requires versioning of
meta-data and that requires a bit of work.

However, I would strongly object to 2x
ADD COLUMN if such an implementation is proposed.

Not even 2x for ADD COLUMN DEFAULT ?

Certainly it would need 2x.
However, is ADD COLUMN DEFAULT really needed ?
I would do as follows.

ADD COLUMN (without default)
UPDATE .. SET new_column = new default
ALTER TABLE ALTER COLUMN SET DEFAULT
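
Spelled out with made-up names (just a sketch), that sequence would be:

alter table orders add column status text;
update orders set status = 'open';          -- rewrites every row: this is the 2x cost
alter table orders alter column status set default 'open';
vacuum orders;                              -- reclaims the superseded row versions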

Regards.
Hiroshi Inoue

#62Chris
chris@bitmead.com
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

Hiroshi Inoue wrote:

Certainly it would need 2x.
However is ADD COLUMN DEFAULT really needed ?
I would do as follows.

ADD COLUMN (without default)
UPDATE .. SET new_column = new default
ALTER TABLE ALTER COLUMN SET DEFAULT

Well, in current Postgres that would use 2x. With WAL I presume it
would use a lot of log space and probably a lot more processing. But if
you can do the above, you might as well support the right syntax.

#63Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

Chris wrote:

Hiroshi Inoue wrote:

When I used Oracle, I saw neither option of the DROP
COLUMN feature. It seems to tell us that the
implementation isn't
that easy. It may not be a bad choice to give up the DROP
COLUMN feature forever.

Because it's not easy we shouldn't do it? I don't think so. The perfect
solution is lazy updating of tuples but it requires versioning of
meta-data and that requires a bit of work.

We could easily break the consistency of the DB with a careless
implementation. Is "DROP COLUMN" valuable enough to walk a
tightrope for ? I would agree if it were "ADD COLUMN" that needed
to walk the tightrope.

Regards.

Hiroshi Inoue

#64Chris
chris@bitmead.com
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

Hiroshi Inoue wrote:

We could easily break the consistency of the DB with a
careless implementation.

I'm sure no-one around here would do careless implementations. :-)

#65Hannu Krosing
hannu@tm.ee
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

Chris wrote:

Hiroshi Inoue wrote:

When I used Oracle, I saw neither option of the DROP
COLUMN feature. It seems to tell us that the
implementation isn't
that easy. It may not be a bad choice to give up the DROP
COLUMN feature forever.

Because it's not easy we shouldn't do it? I don't think so. The perfect
solution is lazy updating of tuples but it requires versioning of
meta-data and that requires a bit of work.

I would prefer the logical/physical numbering + typed tuples
(or is it the same thing ;)

It would give us the additional benefit of being able to move to SQL3-wise
correct CREATE TABLE UNDER syntax, with most constraints (primary/foreign key,
unique, ...) carried over automatically if we store the (single-)inheritance
hierarchy in one file.

Others (NOT NULL, CHECK, ...) will need an additional check on the tuple type.

This does not solve the problem for multiple inheritance, but then we could
kludge most of it by inheriting everything from a single root.

I suspect it would still be easier than doing it the other way (by constructing
UNIONs each time and checking several indexes for uniqueness (or creating a new
index type for indexing several separate relations)).

---------------------
Hannu

#66Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Hannu Krosing (#65)
AW: AW: ALTER TABLE DROP COLUMN

This style of "DROP COLUMN" would change the attribute
numbers whose positions are after the dropped column.
Unfortunately we have no mechanism to invalidate/remove
objects (or prepared plans) which use such attribute numbers.
And I've seen no proposal/discussion to solve this problem
for the DROP COLUMN feature. We wouldn't be able to prevent
PostgreSQL from doing the wrong thing silently.

That issue has got me confused now (there was a previous mail where
you suggested using the logical colid most of the time). Why not use the
physical colid in prepared objects/plans? Since those can't currently change,
it seems such plans would not be invalidated.

Andreas

#67Don Baccus
dhogaza@pacifier.com
In reply to: Hiroshi Inoue (#49)
RE: AW: ALTER TABLE DROP COLUMN

At 10:56 PM 10/15/00 +0900, Hiroshi Inoue wrote:

When I used Oracle, I saw neither option of the DROP COLUMN
feature. It seems to tell us that the implementation isn't
that easy. It may not be a bad choice to give up the DROP
COLUMN feature forever.

Both options are in Oracle now, as proudly documented in their
freely accessible on-line documentation. It is very possible
they didn't implement it until version 8, i.e. until a couple of years
ago.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.

#68Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Don Baccus (#67)
AW: AW: AW: ALTER TABLE DROP COLUMN

I think we do, because it solves more than just the ALTER DROP COLUMN
problem: it cleans up other sore spots too. Like ALTER TABLE ADD COLUMN
in a table with child tables.

Yes, we could also implement "add column xx int before someothercolumn"
to add a column in the middle.

Andreas

#69KuroiNeko
evpopkov@carrier.kiev.ua
In reply to: Hiroshi Inoue (#49)
Re: AW: ALTER TABLE DROP COLUMN

Not a big deal, right?

Yes a big deal. You just lost all your oids.

After I hit the wall with oids for the first time, I don't refer to them
anymore :) But yes, you're perfectly right; this is one more reason to have
DDL completely `automated,' i.e. no manual substitutions.
And here the fact that drop column is rarely needed is a double-edged
sword. With things that you don't do often, you're at risk of forgetting
something essential and hosing your data.

--

Well I tried to be meek
And I have tried to be mild
But I spat like a woman
And I sulked like a child
I have lived behind the walls
That have made me alone
Striven for peace
Which I never have known

Dire Straits, Brothers In Arms, The Man's Too Strong (Knopfler)

#70merlin
merlin@crimelabs.net
In reply to: Don Baccus (#67)
RE: AW: ALTER TABLE DROP COLUMN

Both options are in Oracle now, as proudly documented in their
freely accessible on-line documentation. It is very possible
they didn't implement it until version 8, i.e. until a couple of years
ago.

FYI: ALTER TABLE DROP COLUMN was added as of 8 / 8i according to our
Oracle DBA.

- merlin

#71Adam Haberlach
adam@newsnipple.com
In reply to: Chris (#59)
Re: AW: ALTER TABLE DROP COLUMN

On Mon, Oct 16, 2000 at 06:51:10PM +1100, Chris wrote:

KuroiNeko wrote:

1 create table alpha( id int4, payload text );

<snip>

Not a big deal, right?

Yes a big deal. You just lost all your oids.

Been there. Done that. Learned to heed the warnings about using
oids in any kind of persistent manner.

--
Adam Haberlach | ASCII /~\
adam@newsnipple.com | Ribbon \ / Against
http://www.newsnipple.com | Campaign X HTML
'88 EX500 | / \ E-mail

#72Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Zeugswetter Andreas SB (#66)
Re: AW: AW: ALTER TABLE DROP COLUMN

Zeugswetter Andreas SB wrote:

This style of "DROP COLUMN" would change the attribute
numbers whose positions are after the dropped column.
Unfortunately we have no mechanism to invalidate/remove
objects (or prepared plans) which use such attribute numbers.
And I've seen no proposal/discussion to solve this problem
for the DROP COLUMN feature. We wouldn't be able to prevent
PostgreSQL from doing the wrong thing silently.

That issue has got me confused now (there was a previous mail where
you suggested using the logical colid most of the time). Why not use the
physical colid in prepared objects/plans? Since those can't currently change,
it seems such plans would not be invalidated.

Because I couldn't follow this thread well, I don't understand
what people expect now.
I've maintained my trial implementation for 2 months, but
I can't keep it up any longer.

If people prefer 2x DROP COLUMN, I would give up my trial.
I know some people (Hannu, you .. ??) prefer logical and physical
attribute numbers, but Tom seems to hate an ugly implementation.
Unfortunately the implementation is pretty ugly, and
so I may have to give up my trial in that case as well.

Regards.
Hiroshi Inoue