BETWEEN Node & DROP COLUMN
Hi All,
I have given up working on the BETWEEN node. It got to the stage where I
realised I was really out of my depth! Rod Taylor has indicated an interest
in the problem and I have sent him my latest patch, so hopefully he'll be
able to crack it.
So instead, I've taken up with the DROP COLUMN crusade. It seems that the
following are the jobs that need to be done:
* Add attisdropped to pg_attribute
- Looking for takers for this one, otherwise I'll look into it.
* Fill out AlterTableDropColumn
- I've done this, with the assumption that attisdropped exists. It sets
attisdropped to true, drops the column default and renames the column.
(Plus does all other normal ALTER TABLE checks)
* Modify parser and other places to ignore dropped columns
- This is also up for grabs.
* Modify psql and pg_dump to handle dropped columns
- I've done this.
Once the above is done, we have a working drop column implementation.
* Modify all other interfaces, JDBC, etc. to handle dropped cols.
- I think this can be suggested to the relevant developers once the above
is committed!
* Modify VACUUM to add a RECLAIM option to reduce on disk table size.
- This is out of my league, so it's up for grabs
I have approached a couple of people off-list to see if they're interested
in helping, so please post to the list if you intend to work on something.
It has also occurred to me that once drop column exists, users will be able
to change the type of their columns manually (ie. create a new col, update
all values, drop the old col). So, there is no reason why this new
attisdropped field shouldn't allow us to implement a full ALTER TABLE/SET
TYPE sort of feature - cool huh?
Chris
all values, drop the old col). So, there is no reason why this new
attisdropped field shouldn't allow us to implement a full ALTER TABLE/SET
TYPE sort of feature - cool huh?
I've not looked in a while, but the column rename code did not account
for issues in foreign keys, etc. Those should be easier to ferret out
soon, but may not be so nice to change yet.
It should also be noted that an ALTER TABLE / SET TYPE implemented with
the above idea with run into the 2x diskspace issue as well as take
quite a while to process.
I've not looked in a while, but the column rename code did not account
for issues in foreign keys, etc. Those should be easier to ferret out
soon, but may not be so nice to change yet.
Which is probably a good reason for us to offer it as an all-in-one command,
rather than expecting them to do it manually...
It should also be noted that an ALTER TABLE / SET TYPE implemented with
the above idea with run into the 2x diskspace issue as well as take
quite a while to process.
I think that if the 'SET TYPE' operation is ever to be rollback-able, it
will need to use 2x diskspace. If it's overwritten in place, there's no
chance of fallback... I think that a DBA would choose to use the command
knowing full well what it requires? Better than not offering them the
choice at all!
Chris
It should also be noted that an ALTER TABLE / SET TYPE implemented with
the above idea with run into the 2x diskspace issue as well as take
quite a while to process.I think that if the 'SET TYPE' operation is ever to be rollback-able, it
will need to use 2x diskspace. If it's overwritten in place, there's no
chance of fallback... I think that a DBA would choose to use the command
knowing full well what it requires? Better than not offering them the
choice at all!
True, but if we did the multi-version thing in pg_attribute we may be
able to coerce to the right type on the way out making it a high speed
change.
On Wed, 2002-07-03 at 14:32, Rod Taylor wrote:
It should also be noted that an ALTER TABLE / SET TYPE implemented with
the above idea with run into the 2x diskspace issue as well as take
quite a while to process.I think that if the 'SET TYPE' operation is ever to be rollback-able, it
will need to use 2x diskspace. If it's overwritten in place, there's no
chance of fallback... I think that a DBA would choose to use the command
knowing full well what it requires? Better than not offering them the
choice at all!True, but if we did the multi-version thing in pg_attribute we may be
able to coerce to the right type on the way out making it a high speed
change.
If I understand you right, i.e. you want to do the conversion at each
select(), then the change is high speed but all subsequent queries using
it will pay a a speed penalty, not to mention added complexity of the
whole thing.
I don't think that making changes quick autweights added slowness and
complexity - changes are meant to be slow ;)
The real-life analogue to the proposed scenario would be adding one
extra wheel next to each existing one in a car in order to make it
possible to change tyres while driving - while certainly possible nobody
actually does it.
---------------
Hannu
Christopher Kings-Lynne wrote:
Hi All,
I have given up working on the BETWEEN node. It got to the stage where I
realised I was really out of my depth! Rod Taylor has indicated an interest
in the problem and I have sent him my latest patch, so hopefully he'll be
able to crack it.So instead, I've taken up with the DROP COLUMN crusade. It seems that the
following are the jobs that need to be done:
Great crusade!
* Add attisdropped to pg_attribute
- Looking for takers for this one, otherwise I'll look into it.
I can do this for you. Just let me know when.
* Fill out AlterTableDropColumn
- I've done this, with the assumption that attisdropped exists. It sets
attisdropped to true, drops the column default and renames the column.
(Plus does all other normal ALTER TABLE checks)
* Modify parser and other places to ignore dropped columns
- This is also up for grabs.
As I remember, Hiroshi's drop column changed the attribute number to a
special negative value, which required lots of changes to track.
Keeping the same number and just marking the column as dropped is a big
win. This does push the coding out the client though.
* Modify psql and pg_dump to handle dropped columns
- I've done this.Once the above is done, we have a working drop column implementation.
* Modify all other interfaces, JDBC, etc. to handle dropped cols.
- I think this can be suggested to the relevant developers once the above
is committed!* Modify VACUUM to add a RECLAIM option to reduce on disk table size.
- This is out of my league, so it's up for grabs
Will UPDATE on a row set the deleted column to NULL? If so, the
disk space used by the column would go away over time. In fact, a
simple:
UPDATE tab SET col = col;
VACUUM;
would remove the data stored in the deleted column; no change to VACUUM
needed.
I have approached a couple of people off-list to see if they're interested
in helping, so please post to the list if you intend to work on something.It has also occurred to me that once drop column exists, users will be able
to change the type of their columns manually (ie. create a new col, update
all values, drop the old col). So, there is no reason why this new
attisdropped field shouldn't allow us to implement a full ALTER TABLE/SET
TYPE sort of feature - cool huh?
Yep.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Hiroshi Inoue wrote:
As I remember, Hiroshi's drop column changed the attribute number to a
special negative value, which required lots of changes to track.??? What do you mean by *lots of* ?
Yes, please remind me. Was your solution renumbering the attno values?
I think there are fewer cases to fix if we keep the existing attribute
numbering and just mark the column as deleted. Is this accurate?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Import Notes
Reply to msg id not found: 3D239619.A8274AA2@tpf.co.jp | Resolved by subject fallback
Bruce Momjian wrote:
Christopher Kings-Lynne wrote:
Hi All,
I have given up working on the BETWEEN node. It got to the stage where I
realised I was really out of my depth! Rod Taylor has indicated an interest
in the problem and I have sent him my latest patch, so hopefully he'll be
able to crack it.So instead, I've taken up with the DROP COLUMN crusade. It seems that the
following are the jobs that need to be done:Great crusade!
* Add attisdropped to pg_attribute
- Looking for takers for this one, otherwise I'll look into it.I can do this for you. Just let me know when.
* Fill out AlterTableDropColumn
- I've done this, with the assumption that attisdropped exists. It sets
attisdropped to true, drops the column default and renames the column.
(Plus does all other normal ALTER TABLE checks)
* Modify parser and other places to ignore dropped columns
- This is also up for grabs.As I remember, Hiroshi's drop column changed the attribute number to a
special negative value, which required lots of changes to track.
??? What do you mean by *lots of* ?
regards,
Hiroshi Inoue
http://w2422.nsk.ne.jp/~inoue/
Bruce Momjian wrote:
Hiroshi Inoue wrote:
As I remember, Hiroshi's drop column changed the attribute number to a
special negative value, which required lots of changes to track.??? What do you mean by *lots of* ?
Yes, please remind me. Was your solution renumbering the attno values?
Yes though I don't intend to object to Christopher's proposal.
I think there are fewer cases to fix if we keep the existing attribute
numbering and just mark the column as deleted. Is this accurate?
No. I don't understand why you think so.
regards,
Hiroshi Inoue
http://w2422.nsk.ne.jp/~inoue/
Hiroshi Inoue wrote:
Bruce Momjian wrote:
Hiroshi Inoue wrote:
As I remember, Hiroshi's drop column changed the attribute number to a
special negative value, which required lots of changes to track.??? What do you mean by *lots of* ?
Yes, please remind me. Was your solution renumbering the attno values?
Yes though I don't intend to object to Christopher's proposal.
I think there are fewer cases to fix if we keep the existing attribute
numbering and just mark the column as deleted. Is this accurate?No. I don't understand why you think so.
With the isdropped column, you really only need to deal with '*'
expansion in a few places, and prevent the column from being accessed.
With renumbering, the backend loops that go through the attnos have to
be dealt with.
Is this correct? I certainly prefer attno renumbering to isdropped
because it allows us to get DROP COLUMN without any client changes, or
at least with fewer because the dropped column has a negative attno. Is
this accurate?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote:
Hiroshi Inoue wrote:
Bruce Momjian wrote:
Hiroshi Inoue wrote:
As I remember, Hiroshi's drop column changed the attribute number to a
special negative value, which required lots of changes to track.??? What do you mean by *lots of* ?
Yes, please remind me. Was your solution renumbering the attno values?
Yes though I don't intend to object to Christopher's proposal.
I think there are fewer cases to fix if we keep the existing attribute
numbering and just mark the column as deleted. Is this accurate?No. I don't understand why you think so.
With the isdropped column, you really only need to deal with '*'
expansion in a few places, and prevent the column from being accessed.
With renumbering, the backend loops that go through the attnos have to
be dealt with.
I used the following macro in my trial implementation.
#define COLUMN_IS_DROPPED(attribute) ((attribute)->attnum <=
DROP_COLUMN_OFFSET)
The places where the macro was put are exactly the places
where attisdropped must be checked.
The difference is essentially little. Please don't propagate
a wrong information.
Is this correct? I certainly prefer attno renumbering to isdropped
because it allows us to get DROP COLUMN without any client changes,
Unfortunately many apps rely on the fact that the attnos are
consecutive starting from 1. It was the main reason why Tom
rejected my trial. Nothing has changed about it.
regards,
Hiroshi Inoue
http://w2422.nsk.ne.jp/~inoue/
Yes, please remind me. Was your solution renumbering the
attno values?
Yes though I don't intend to object to Christopher's proposal.
Hiroshi,
I am thinking of rolling back my CVS to see if there's code from your
previous test implementation that we can use. Apart from the DropColumn
function itself, what other changes did you make? Did you have
modifications for '*' expansion in the parser, etc.?
Chris
Christopher Kings-Lynne wrote:
Yes, please remind me. Was your solution renumbering the
attno values?
Yes though I don't intend to object to Christopher's proposal.
Hiroshi,
I am thinking of rolling back my CVS to see if there's code from your
previous test implementation that we can use. Apart from the DropColumn
function itself, what other changes did you make? Did you have
modifications for '*' expansion in the parser, etc.?
Yes, please review Hiroshi's work. It is good work. Can we have an
analysis of Hiroshi's approach vs the isdropped case.
Is it better to renumber the attno or set a column to isdropped. The
former may be easier on the clients.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
I am thinking of rolling back my CVS to see if there's code from your
previous test implementation that we can use. Apart from the DropColumn
function itself, what other changes did you make? Did you have
modifications for '*' expansion in the parser, etc.?Yes, please review Hiroshi's work. It is good work. Can we have an
analysis of Hiroshi's approach vs the isdropped case.
Yes, it is. I've rolled it back and I'm already incorporating his changes
to the parser into my patch. I just have to grep all the source code for
'HACK' to find all the changes. It's all very handy.
Is it better to renumber the attno or set a column to isdropped. The
former may be easier on the clients.
Well, obviously I prefer the attisdropped approach. I think it's clearer
and there's less confusion. As a head developer for phpPgAdmin that's what
I'd prefer... Hiroshi obviously prefers his solution, but doesn't object to
mine/Tom's. I think that with all the schema-related changes that clients
will have to handle in 7.3, we may as well hit them with the dropped column
stuff in the same go, that way there's fewer rounds of clients scrambling to
keep up with the server.
I intend to email every single postgres client I can find and tell them
about the new changes, well before we release 7.3...
Chris
Christopher Kings-Lynne wrote:
I am thinking of rolling back my CVS to see if there's code from your
previous test implementation that we can use. Apart from the DropColumn
function itself, what other changes did you make? Did you have
modifications for '*' expansion in the parser, etc.?Yes, please review Hiroshi's work. It is good work. Can we have an
analysis of Hiroshi's approach vs the isdropped case.Yes, it is. I've rolled it back and I'm already incorporating his changes
to the parser into my patch. I just have to grep all the source code for
'HACK' to find all the changes. It's all very handy.
Yes. It should have been accepted long ago, but we were waiting for a
"perfect" solution which we all know now will never come.
Is it better to renumber the attno or set a column to isdropped. The
former may be easier on the clients.Well, obviously I prefer the attisdropped approach. I think it's clearer
and there's less confusion. As a head developer for phpPgAdmin that's what
I'd prefer... Hiroshi obviously prefers his solution, but doesn't object to
OK, can you explain the issues from a server and client perspective,
i.e. renumbering vs isdropped?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Christopher Kings-Lynne wrote:
Yes, please remind me. Was your solution renumbering the
attno values?
Yes though I don't intend to object to Christopher's proposal.
Hiroshi,
I am thinking of rolling back my CVS to see if there's code from your
previous test implementation that we can use. Apart from the DropColumn
function itself, what other changes did you make? Did you have
modifications for '*' expansion in the parser, etc.?
Don't mind my posting.
I'm only correcting a misunderstanding for my work.
regards,
Hiroshi Inoue
http://w2422.nsk.ne.jp/~inoue/
Well, obviously I prefer the attisdropped approach. I think
it's clearer
and there's less confusion. As a head developer for phpPgAdmin
that's what
I'd prefer... Hiroshi obviously prefers his solution, but
doesn't object to
OK, can you explain the issues from a server and client perspective,
i.e. renumbering vs isdropped?
Well in the renumbering case, the client needs to know about missing attnos
and it has to know to ignore negative attnos (which it probably does
already). ie. psql and pg_dump wouldn't have to be modified in that case.
In the isdropped case, the client needs to know to exclude any column with
'attisdropped' set to true.
So in both cases, the client needs to be updated. I personally prefer the
explicit 'is dropped' as opposed to the implicit 'negative number', but hey.
*sigh* Now I've gone and made an argument for the renumbering case. I'm
going to have a good look at Hiroshi's old code and see which one is less
complicated, etc. So far all I've really need to do is redefine Hiroshi's
COLUMN_DROPPED macro.
I'm sure that both methods could be made to handle a 'ALTER TABLE/SET TYPE'
syntax.
Chris
Christopher Kings-Lynne wrote:
Well, obviously I prefer the attisdropped approach. I think
it's clearer
and there's less confusion. As a head developer for phpPgAdmin
that's what
I'd prefer... Hiroshi obviously prefers his solution, but
doesn't object to
OK, can you explain the issues from a server and client perspective,
i.e. renumbering vs isdropped?Well in the renumbering case, the client needs to know about missing attnos
and it has to know to ignore negative attnos (which it probably does
already). ie. psql and pg_dump wouldn't have to be modified in that case.In the isdropped case, the client needs to know to exclude any column with
'attisdropped' set to true.So in both cases, the client needs to be updated. I personally prefer the
explicit 'is dropped' as opposed to the implicit 'negative number', but hey.*sigh* Now I've gone and made an argument for the renumbering case. I'm
going to have a good look at Hiroshi's old code and see which one is less
complicated, etc. So far all I've really need to do is redefine Hiroshi's
COLUMN_DROPPED macro.I'm sure that both methods could be made to handle a 'ALTER TABLE/SET TYPE'
syntax.
Yes! This is exactly what I would like investigated. I am embarrassed
to see that we had Hiroshi's patch all this time and never implemented
it.
I think it underscores that we have drifted too far into the code purity
camp and need a little reality check that users have needs and we should
try to meet them if we want to be successful. How many DROP COLUMN
gripes have we heard over the years! Now I am upset.
OK, I calmed down now. What I would like to know is which DROP COLUMN
method is easier on the server end, and which is easier on the client
end. If one is easier in both places, let's use that.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Unfortunately many apps rely on the fact that the attnos are
consecutive starting from 1. It was the main reason why Tom
rejected my trial. Nothing has changed about it.
OK, I've been looking at Hiroshi's implementation. It's basically
semantically equivalent to mine from what I can see so far. The only
difference really is in how the dropped columns are marked.
I've been ruminating on Hiroshi's statement at the top there. What was the
reasoning for assuming that 'many apps rely on the fact that the attnos are
consecutive'? Is that true? phpPgAdmin doesn't. In fact, phpPgAdmin won't
require any changes with Hiroshi's implementaiton and will require changes
with mine.
Anyway, an app that relies on consecutive attnos is going to have pain
skipping over attisdropped columns anyway???
In fact, I'm now beginning to think that I should just resurrect Hiroshi's
implementation. I'm prepared to do that if people like...
Chris
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
I used the following macro in my trial implementation.
#define COLUMN_IS_DROPPED(attribute) ((attribute)->attnum <=
DROP_COLUMN_OFFSET)
The places where the macro was put are exactly the places
where attisdropped must be checked.
Actually, your trial required column dropped-ness to be checked in
many more places than the proposed approach does. Since you renumbered
the dropped column, nominal column numbers didn't correspond to physical
order of values in tuples anymore; that meant checking for dropped
columns in many low-level tuple manipulations.
Is this correct? I certainly prefer attno renumbering to isdropped
because it allows us to get DROP COLUMN without any client changes,
Unfortunately many apps rely on the fact that the attnos are
consecutive starting from 1. It was the main reason why Tom
rejected my trial. Nothing has changed about it.
I'm still not thrilled about it ... but I don't see a reasonable way
around it, either. I don't see any good way to do DROP COLUMN
without breaking applications that make such assumptions. Unless
you have one, we may as well go for the approach that adds the least
complication to the backend.
regards, tom lane