ALTER TABLE modifications

Started by Rod Taylorover 22 years ago29 messageshackers

rbt@rbt.ca

over 22 years ago

A general re-organization of Alter Table. Node wise, it is a
AlterTableStmt with a list of AlterTableCmds. The Cmds are the
individual actions to be completed (Add constraint, drop constraint, add
column, etc.)

Processing is done in 2 phases. The first phase updates the system
catalogs and creates a work queue for the table scan. The second phase
is to conduct the actual table scan evaluating all constraints and other
per tuple processing simultaneously, as required. This has no effect on
single step operations, but has a large benefit for combinational logic
where multiple table scans would otherwise be required.

Steps for the table scan include expression processing (default or
transform expressions), not null constraint process, other constraints.

The code has been simplified a little. Simple permission and recursion
checks are conducted in the common area and an enum for the command type
is used rather than a Char.

ALTER TABLE ADD COLUMN col DEFAULT 3 NOT NULL;
Allow creation of a column with a default (filled) and not null
constraint. This is internally subdivided into 3 actions much
like the below syntax.

ALTER TABLE tab ADD COLUMN col DEFAULT 3, ADD CHECK (anothercol > 3);
The above combinational syntax is commented out in gram.y. The
support framework is used in both the above and below items, but
arbitrary statements probably have some issues -- I've not
tested enough to determine.

If it is useful, it will be submitted at a later date.

ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
Currently migrates indexes, check constraints, defaults, and the
column definition to the new type with optional transform. If
the tranform is not supplied, a standard assignment cast is
attempted.

One issue is that it detects dependencies way too late in the
game (after the TRANFORM has been applied). I tried mucking up
performDeletion to have a RESTRICT that did not throw the error
at the end (or physically remove the items) but that didn't work
out. Any bright ideas on how to achieve the NOTICES from a
failed DROP COLUMN earlier? Otherwise, I can put in single
object detection at that point.

It does not migrate foreign keys yet (should be easy to add) and
will be forwarded as an independent patch at a later time.
Currently foreign keys are rejected like views, functions, etc.

Comments appreciated.
--
Rod Taylor <pg [at] rbt [dot] ca>

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Bruce Momjian

bruce@momjian.us

over 22 years ago

In reply to: Rod Taylor (#1)

Re: ALTER TABLE modifications

Is this to be applied to CVS HEAD?

---------------------------------------------------------------------------

Rod Taylor wrote:

A general re-organization of Alter Table. Node wise, it is a
AlterTableStmt with a list of AlterTableCmds. The Cmds are the
individual actions to be completed (Add constraint, drop constraint, add
column, etc.)

Processing is done in 2 phases. The first phase updates the system
catalogs and creates a work queue for the table scan. The second phase
is to conduct the actual table scan evaluating all constraints and other
per tuple processing simultaneously, as required. This has no effect on
single step operations, but has a large benefit for combinational logic
where multiple table scans would otherwise be required.

Steps for the table scan include expression processing (default or
transform expressions), not null constraint process, other constraints.

The code has been simplified a little. Simple permission and recursion
checks are conducted in the common area and an enum for the command type
is used rather than a Char.

ALTER TABLE ADD COLUMN col DEFAULT 3 NOT NULL;
Allow creation of a column with a default (filled) and not null
constraint. This is internally subdivided into 3 actions much
like the below syntax.

ALTER TABLE tab ADD COLUMN col DEFAULT 3, ADD CHECK (anothercol > 3);
The above combinational syntax is commented out in gram.y. The
support framework is used in both the above and below items, but
arbitrary statements probably have some issues -- I've not
tested enough to determine.

If it is useful, it will be submitted at a later date.

ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
Currently migrates indexes, check constraints, defaults, and the
column definition to the new type with optional transform. If
the tranform is not supplied, a standard assignment cast is
attempted.

One issue is that it detects dependencies way too late in the
game (after the TRANFORM has been applied). I tried mucking up
performDeletion to have a RESTRICT that did not throw the error
at the end (or physically remove the items) but that didn't work
out. Any bright ideas on how to achieve the NOTICES from a
failed DROP COLUMN earlier? Otherwise, I can put in single
object detection at that point.

It does not migrate foreign keys yet (should be easy to add) and
will be forwarded as an independent patch at a later time.
Currently foreign keys are rejected like views, functions, etc.

Comments appreciated.
--
Rod Taylor <pg [at] rbt [dot] ca>

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

[ Attachment, skipping... ]

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Bruce Momjian (#2)

Re: ALTER TABLE modifications

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Is this to be applied to CVS HEAD?

It sounded like large portions were still at the request-for-comment
stage...

regards, tom lane

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Tom Lane (#3)

Re: ALTER TABLE modifications

On Tue, 2003-11-11 at 23:46, Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Is this to be applied to CVS HEAD?

It sounded like large portions were still at the request-for-comment
stage...

It can be applied to -HEAD without breaking anything or backtracking in
feature set (that I can see) but there is substantial change in the way
things work so a pre-review is warranted and I'm sure changes will be
requested.

If I don't receive comments for corrections by the weekend, I'll finish
off one of the missing components (foreign keys) and submit a
replacement.

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Rod Taylor (#1)

Re: ALTER TABLE modifications

Rod Taylor writes:

ALTER TABLE tab ADD COLUMN col DEFAULT 3, ADD CHECK (anothercol > 3);
The above combinational syntax is commented out in gram.y. The
support framework is used in both the above and below items, but
arbitrary statements probably have some issues -- I've not
tested enough to determine.

If it is useful, it will be submitted at a later date.

I think it's perfectly fine to write two separate ALTER TABLE statements.
No need to introduce this nonstandard syntax.

ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
Currently migrates indexes, check constraints, defaults, and the
column definition to the new type with optional transform. If
the tranform is not supplied, a standard assignment cast is
attempted.

Please don't use the term "transform". It is used by the SQL standard for
other purposes. What kind of object would you put in place of the "..."
anyway? A function? What syntax do other databases use?

--
Peter Eisentraut peter_e@gmx.net

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Peter Eisentraut (#5)

Re: ALTER TABLE modifications

On Wed, 2003-11-12 at 14:02, Peter Eisentraut wrote:

Rod Taylor writes:

ALTER TABLE tab ADD COLUMN col DEFAULT 3, ADD CHECK (anothercol > 3);

I think it's perfectly fine to write two separate ALTER TABLE statements.
No need to introduce this nonstandard syntax.

Yes, it is certainly fine to do so, but much faster to do the above.

The command shown executes nearly 40% faster than 2 independent
statements in a single transaction -- the difference is even more
significant with additional sub-commands.

ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
Currently migrates indexes, check constraints, defaults, and the
column definition to the new type with optional transform. If
the tranform is not supplied, a standard assignment cast is
attempted.

Please don't use the term "transform". It is used by the SQL standard for
other purposes. What kind of object would you put in place of the "..."
anyway? A function? What syntax do other databases use?

I've not found another database which allows this syntax. The suggestion
of TRANSFORM was Toms and was a result of using an assignment cast by
default. Do you have a better term I can use?

http://groups.google.ca/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&frame=right&th=266b02a270a164aa&seekm=1064805960.60248.24.camel%40jester#link4

The ... is an A_Expr which does not accept (among other things)
subselects. CASE statements, equations, etc. work fine.

CREATE TABLE tab (col int2);
-- integer to boolean
ALTER TABLE tab ALTER col TYPE boolean
TRANSFORM CASE WHEN col >= 1 THEN true ELSE false END;

-- or say Bytes to MBytes (original column is int8)
ALTER TABLE tab ALTER col TYPE integer TRANSFORM col / (1024 * 1024);

Hannu Krosing

hannu@tm.ee

over 22 years ago

In reply to: Peter Eisentraut (#5)

Re: ALTER TABLE modifications

Peter Eisentraut kirjutas K, 12.11.2003 kell 21:02:

Rod Taylor writes:

ALTER TABLE tab ADD COLUMN col DEFAULT 3, ADD CHECK (anothercol > 3);
The above combinational syntax is commented out in gram.y. The
support framework is used in both the above and below items, but
arbitrary statements probably have some issues -- I've not
tested enough to determine.

If it is useful, it will be submitted at a later date.

I think it's perfectly fine to write two separate ALTER TABLE statements.

I guess the difference is that each pass (i.e. ALTER TABLE) needs to do
another scan and copy of the table. Putting them in one statement allows
all the alterations to be done in one pass.

No need to introduce this nonstandard syntax.

ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
Currently migrates indexes, check constraints, defaults, and the
column definition to the new type with optional transform. If
the tranform is not supplied, a standard assignment cast is
attempted.

Please don't use the term "transform". It is used by the SQL standard for
other purposes.

Is the "other" use conflicting with this syntax ?

I think we have preferred reusing existing keywords to adding new ones
in the past.

-----------------
Hannu

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Rod Taylor (#6)

Re: ALTER TABLE modifications

Rod Taylor writes:

Yes, it is certainly fine to do so, but much faster to do the above.

Are table schema changes performance-sensitive operations, and are they
usually done in bulk? I doubt it.

I've not found another database which allows this syntax. The suggestion
of TRANSFORM was Toms and was a result of using an assignment cast by
default. Do you have a better term I can use?

I'm not sure I buy this whole concept in the first place. If there is
no cast between type A and type B, then surely changing a table column
from A to B is nonsensical.

-- or say Bytes to MBytes (original column is int8)
ALTER TABLE tab ALTER col TYPE integer TRANSFORM col / (1024 * 1024);

You can do this using a plain column type change plus an UPDATE. I'd
prefer to keep these operations independent.

--
Peter Eisentraut peter_e@gmx.net

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Hannu Krosing (#7)

Re: ALTER TABLE modifications

Hannu Krosing writes:

Please don't use the term "transform". It is used by the SQL standard for
other purposes.

Is the "other" use conflicting with this syntax ?

I think we have preferred reusing existing keywords to adding new ones
in the past.

Maybe (although I don't agree). but we've never intentionally done
terminology overloading.

--
Peter Eisentraut peter_e@gmx.net

#10

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Peter Eisentraut (#8)

Re: ALTER TABLE modifications

On Thu, 2003-11-13 at 09:18, Peter Eisentraut wrote:

Rod Taylor writes:

Yes, it is certainly fine to do so, but much faster to do the above.

Are table schema changes performance-sensitive operations, and are they
usually done in bulk? I doubt it.

Perhaps not for you. But I would disagree with both of those points for
a majority of companies.

If the system has high uptime requirements, the timeframe to accomplish
maintenance decreases substantially. For example, I have two 1.5 hour
maintenance windows per year for maintenance. This is a requirement
given by our clients, nearly all of whom insist on it. Telcos and other
enterprise level environments with 24 / 7 usage patterns have different
demands.

So.. Adding 2 new check constraints to a table with 500million entries
takes way way too much time in my eyes.

Toms request (when I initially brought up fixing ADD COLUMN) was that it
would process the DEFAULT and other new constraints in a single pass.
The above syntax simply uses the framework put in place for ADD COLUMN.

I'll not push it if it does not function with complex combinations,
which has not been tested, as such is not included in this patch.

-- or say Bytes to MBytes (original column is int8)
ALTER TABLE tab ALTER col TYPE integer TRANSFORM col / (1024 * 1024);

You can do this using a plain column type change plus an UPDATE. I'd
prefer to keep these operations independent.

Yes, if you can do the update first.

Ok.. How about boolean to int2 as the user discovered they require a
third or fouth state? Or vice-versa. An MSSQL, MySQL, etc. migration
would leave a bunch of int2's (since they do not support boolean) in the
PostgreSQL database which have a boolean meaning.

rbt=# select '1'::int4::boolean
rbt-# ;
ERROR: cannot cast type integer to boolean
rbt=# select '1'::boolean;
bool
------
t
(1 row)

Are you going to tell these people to write down the Default, add a new
column, copy the data, remove the old column, re-add the default, and
rename the new column to the old column name?

This is the exact reason many folks have asked for this ALTER form to be
added.

Considering there are already 2 sites out there running with this patch
(I have been asked for advance copies) I do not believe the
functionality is overkill in any respect.

Can you please suggest a better term to use in place of TRANSFORM?
Perhaps UPDATE WITH?

#11

Hannu Krosing

hannu@tm.ee

over 22 years ago

In reply to: Rod Taylor (#10)

Re: ALTER TABLE modifications

Rod Taylor kirjutas N, 13.11.2003 kell 16:59:

Can you please suggest a better term to use in place of TRANSFORM?
Perhaps UPDATE WITH?

or perhaps USING, based loosely on our use of USING in CREATE INDEX ?

----------
Hannu

#12

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Hannu Krosing (#11)

Re: ALTER TABLE modifications

-- moved to -hackers

On Thu, 2003-11-13 at 11:35, Hannu Krosing wrote:

Rod Taylor kirjutas N, 13.11.2003 kell 16:59:

Can you please suggest a better term to use in place of TRANSFORM?
Perhaps UPDATE WITH?

or perhaps USING, based loosely on our use of USING in CREATE INDEX ?

USING sounds reasonable to me. Peter?

To those not on -patches, Peter is unhappy with the TRANSFORM clause in:
ALTER TABLE tab ALTER COLUMN col TRANSFORM expression;

The reasoning is that TRANSFORM in the spec means something different
than our use. We're attempting to come up with a replacement term.

#13

Dave Cramer

pg@fastcrypt.com

over 22 years ago

In reply to: Hannu Krosing (#11)

Re: ALTER TABLE modifications

Rod,

I tried the current patch on a RC2 release, and I noticed one
undesirable side affect.

Modifying a column moves it to the end. In high availability situations
this would not be desirable, I would imagine it would break lots of
code.

Dave

Show quoted text

On Thu, 2003-11-13 at 11:35, Hannu Krosing wrote:

Rod Taylor kirjutas N, 13.11.2003 kell 16:59:

Can you please suggest a better term to use in place of TRANSFORM?
Perhaps UPDATE WITH?

or perhaps USING, based loosely on our use of USING in CREATE INDEX ?

----------
Hannu

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#14

Alvaro Herrera

alvherre@dcc.uchile.cl

over 22 years ago

In reply to: Dave Cramer (#13)

Re: ALTER TABLE modifications

On Fri, Nov 14, 2003 at 08:59:05AM -0500, Dave Cramer wrote:

I tried the current patch on a RC2 release, and I noticed one
undesirable side affect.

Modifying a column moves it to the end. In high availability situations
this would not be desirable, I would imagine it would break lots of
code.

This is expected. Doing otherwise would incur into a much bigger
performance hit.

Anyway, IMHO no code should use SELECT * in any case, which is the only
scenario where one would expect physical column order to matter, isn't
it?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La primera ley de las demostraciones en vivo es: no trate de usar el sistema.
Escriba un guiï¿½n que no toque nada para no causar daï¿½os." (Jakob Nielsen)

#15

Hannu Krosing

hannu@tm.ee

over 22 years ago

In reply to: Alvaro Herrera (#14)

Re: ALTER TABLE modifications

Alvaro Herrera kirjutas R, 14.11.2003 kell 16:17:

On Fri, Nov 14, 2003 at 08:59:05AM -0500, Dave Cramer wrote:

I tried the current patch on a RC2 release, and I noticed one
undesirable side affect.

Modifying a column moves it to the end. In high availability situations
this would not be desirable, I would imagine it would break lots of
code.

This is expected. Doing otherwise would incur into a much bigger
performance hit.

Not neccessarily, but it would make the patch much bigger ;)

IIRC there was discussion about splitting colum numbers into physical
and logical numbers at the time when DROP COLUMN was done.

Anyway, IMHO no code should use SELECT * in any case, which is the only
scenario where one would expect physical column order to matter, isn't
it?

and this could also break when just changing the column type.

------------
Hannu

#16

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Dave Cramer (#13)

Re: ALTER TABLE modifications

On Fri, 2003-11-14 at 08:59, Dave Cramer wrote:

Rod,

I tried the current patch on a RC2 release, and I noticed one
undesirable side affect.

Modifying a column moves it to the end. In high availability situations
this would not be desirable, I would imagine it would break lots of
code.

Yes, I've done that to myself a few times.

The method is rename old column, add new column, move data across, move
or reform dependencies, drop old column.

Adding the new column puts it to the end. In order to avoid the
repositioning we would need some kind of a position abstraction from the
physical storage to what the user sees. It's on the TODO list, but not a
part of this patch.

#17

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Rod Taylor (#16)

Re: ALTER TABLE modifications

Rod Taylor writes:

The method is rename old column, add new column, move data across, move
or reform dependencies, drop old column.

I can do this by hand. If we have an explicit command to do it, then it
needs to preserve the table schema. Else, this feature would be mostly
useless and a certain source of complaints.

--
Peter Eisentraut peter_e@gmx.net

#18

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Peter Eisentraut (#17)

Re: ALTER TABLE modifications

On Fri, 2003-11-14 at 09:57, Peter Eisentraut wrote:

Rod Taylor writes:

The method is rename old column, add new column, move data across, move
or reform dependencies, drop old column.

I can do this by hand. If we have an explicit command to do it, then it
needs to preserve the table schema. Else, this feature would be mostly
useless and a certain source of complaints.

The method was agreed to on -hackers prior to any code having been
written. Unless I'm mistaken, the method was suggested to me by other on
-hackers.

Can you give me an alternative on how to approach this problem without
relying on another TODO item labelled ALTER TABLE .. POSITION?

The point of the command isn't to accomplish anything magical, simply to
make it easier.

begin;
Rename old column
Add new column
Copy data
Recreate default
Recreate NOT NULL constraint
Recreate index 1
Recreate index 2 (multi-key)
Recreate check constraint 1
Recreate check constraint 2
Recreate check constraint 3
Recreate foreign key 1
remove old column
commit;
vacuum full table (to get rid of 2x problem since the user was unable to
do a fileswap)

The above is by hand, the below is with this command:

begin;
Alter table .. type ..
commit;

If it becomes possible to change the order without having to goto an
exceptional amount of work (that POSITION item) then it would only be
obvious to attempt to preserve the position.

#19

Dave Cramer

pg@fastcrypt.com

over 22 years ago

In reply to: Peter Eisentraut (#17)

Re: ALTER TABLE modifications

OK,

Here is another approach, that would retain column order. It will
require that the table be locked while this proceeds, but I think this
is a good idea anyway.

lock table
create newtable as select c1, c2, c3::newtype
modify pg_class to point to the new filename
modify existing pg_attribute for the column in question
recreate indexes that exist on the column
unlock table

Dave

Show quoted text

On Fri, 2003-11-14 at 09:57, Peter Eisentraut wrote:

Rod Taylor writes:

The method is rename old column, add new column, move data across, move
or reform dependencies, drop old column.

I can do this by hand. If we have an explicit command to do it, then it
needs to preserve the table schema. Else, this feature would be mostly
useless and a certain source of complaints.

#20

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Rod Taylor (#18)

Re: ALTER TABLE modifications

Rod Taylor <pg@rbt.ca> writes:

On Fri, 2003-11-14 at 09:57, Peter Eisentraut wrote:

I can do this by hand. If we have an explicit command to do it, then it
needs to preserve the table schema. Else, this feature would be mostly
useless and a certain source of complaints.

The method was agreed to on -hackers prior to any code having been
written.

I believe the consensus was that automating what you could do by hand
is still a step forward.

It clearly would be better if we could relabel the logical column
position after finishing the whole process, but I agree with Rod that
that is an independent patch. Combining them into one mega-patch
doesn't sound like good engineering.

I guess the real question here is whether we would want to revert this
capability if a patch to adjust logical column orderings doesn't appear
before 7.5. My vote would be "no", but apparently Peter's is "yes".
Any other opinions?

regards, tom lane

#21

Rod Taylor

rbt@rbt.ca

over 22 years ago

In reply to: Dave Cramer (#19)

#22

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Tom Lane (#20)

#23