Dropping column from big table
Hi All,
It's postgres database version 15.4. We have a table which is daily and is
approx. ~2TB in size having a total ~90 partitions. We have a requirement
to drop columns and add new columns to this table.
I Want to understand, If this can be done online? what is the fastest way
to drop/add columns from such a big table and what will be the consequence
of this in regards to vacuum, post this activity? Or if any other issues we
may face post this?
Regards
Sud
On 7/10/24 13:13, sud wrote:
Hi All,
It's postgres database version 15.4. We have a table which is daily and
is approx. ~2TB in size having a total ~90 partitions. We have a
requirement to drop columns and add new columns to this table.I Want to understand, If this can be done online? what is the fastest
way to drop/add columns from such a big table and what will be the
consequence of this in regards to vacuum, post this activity? Or if any
other issues we may face post this?Regards
Sud
https://www.postgresql.org/docs/current/sql-altertable.html
"The DROP COLUMN form does not physically remove the column, but simply
makes it invisible to SQL operations. Subsequent insert and update
operations in the table will store a null value for the column. Thus,
dropping a column is quick but it will not immediately reduce the
on-disk size of your table, as the space occupied by the dropped column
is not reclaimed. The space will be reclaimed over time as existing rows
are updated.
To force immediate reclamation of space occupied by a dropped column,
you can execute one of the forms of ALTER TABLE that performs a rewrite
of the whole table. This results in reconstructing each row with the
dropped column replaced by a null value.
"
--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Jul 11, 2024 at 2:52 AM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
https://www.postgresql.org/docs/current/sql-altertable.html
"The DROP COLUMN form does not physically remove the column, but simply
makes it invisible to SQL operations. Subsequent insert and update
operations in the table will store a null value for the column. Thus,
dropping a column is quick but it will not immediately reduce the
on-disk size of your table, as the space occupied by the dropped column
is not reclaimed. The space will be reclaimed over time as existing rows
are updated.To force immediate reclamation of space occupied by a dropped column,
you can execute one of the forms of ALTER TABLE that performs a rewrite
of the whole table. This results in reconstructing each row with the
dropped column replaced by a null value.
"
Thank you so much. When you said *"you can execute one of the forms of
ALTER TABLE that performs a rewrite*
*of the whole table."* Does it mean that post "alter table drop column" the
vacuum is going to run longer as it will try to clean up all the rows and
recreate the new rows? But then how can this be avoidable or made better
without impacting the system performance and blocking others?
On Wed, Jul 10, 2024 at 11:28 PM sud <suds1434@gmail.com> wrote:
On Thu, Jul 11, 2024 at 2:52 AM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:https://www.postgresql.org/docs/current/sql-altertable.html
"The DROP COLUMN form does not physically remove the column, but simply
makes it invisible to SQL operations. Subsequent insert and update
operations in the table will store a null value for the column. Thus,
dropping a column is quick but it will not immediately reduce the
on-disk size of your table, as the space occupied by the dropped column
is not reclaimed. The space will be reclaimed over time as existing rows
are updated.To force immediate reclamation of space occupied by a dropped column,
you can execute one of the forms of ALTER TABLE that performs a rewrite
of the whole table. This results in reconstructing each row with the
dropped column replaced by a null value.
"Thank you so much. When you said *"you can execute one of the forms of
ALTER TABLE that performs a rewrite*
*of the whole table."* Does it mean that post "alter table drop column"
the vacuum is going to run longer as it will try to clean up all the rows
and recreate the new rows? But then how can this be avoidable or made
better without impacting the system performance
"Impact" is a non-specific word. "How much impact" depends on how many
autovacuum workers you've set it to use, and how many threads you set in
vacuumdb.
and blocking others?
VACUUM never blocks.
Anyway, DROP is the easy part; it's ADD COLUMN that can take a lot of time
(depending on whether or not you populate the column with a default value).
I'd detach all the partitions from the parent table, and then add the new
column to the not-children in multiple threads, add the column to the
parent and then reattach all of the children. That's the fastest method,
though takes some time to set up.
On Thu, 11 Jul, 2024, 12:46 pm Ron Johnson, <ronljohnsonjr@gmail.com> wrote:
On Wed, Jul 10, 2024 at 11:28 PM sud <suds1434@gmail.com> wrote:
Thank you so much. When you said *"you can execute one of the forms of
ALTER TABLE that performs a rewrite*
*of the whole table."* Does it mean that post "alter table drop column"
the vacuum is going to run longer as it will try to clean up all the rows
and recreate the new rows? But then how can this be avoidable or made
better without impacting the system performance"Impact" is a non-specific word. "How much impact" depends on how many
autovacuum workers you've set it to use, and how many threads you set in
vacuumdb.and blocking others?
VACUUM never blocks.
Anyway, DROP is the easy part; it's ADD COLUMN that can take a lot of time
(depending on whether or not you populate the column with a default value).I'd detach all the partitions from the parent table, and then add the new
column to the not-children in multiple threads, add the column to the
parent and then reattach all of the children. That's the fastest method,
though takes some time to set up.
Thank you so much.
Dropping will take it's own time for post vacuum however as you rightly
said, it won't be blocking which should be fine.
In regards to add column, Detaching all partitions then adding column to
the individual partition in multiple sessions and then reattaching looks to
be a really awesome idea to make it faster. However one doubt, Will it
create issue if there already exists foreign key on this partition table or
say it's the parent to other child partition/nonpartition tables?
On Thu, 2024-07-11 at 13:10 +0530, sud wrote:
Dropping will take it's own time for post vacuum however as you
rightly said, it won't be blocking which should be fine.
I am not certain if you understood this correctly.
Dropping a column is fast, but doesn't reclaim the space.
VACUUM won't block anything, but won't reclaim the space.
VACUUM (FULL) will block everything, but will also not reclaim the space.
You'd need to use a form of ALTER TABLE that rewrites the table,
as indicated in the documentation. However, such an operation
will block all access to the table for a long time, and it will
temporarily need much more space, because it has to hold both the
old and the new copy of the table.
Yours,
Laurenz Albe
On Thu, Jul 11, 2024 at 3:41 AM sud <suds1434@gmail.com> wrote:
On Thu, 11 Jul, 2024, 12:46 pm Ron Johnson, <ronljohnsonjr@gmail.com>
wrote:On Wed, Jul 10, 2024 at 11:28 PM sud <suds1434@gmail.com> wrote:
Thank you so much. When you said *"you can execute one of the forms of
ALTER TABLE that performs a rewrite*
*of the whole table."* Does it mean that post "alter table drop column"
the vacuum is going to run longer as it will try to clean up all the rows
and recreate the new rows? But then how can this be avoidable or made
better without impacting the system performance"Impact" is a non-specific word. "How much impact" depends on how many
autovacuum workers you've set it to use, and how many threads you set in
vacuumdb.and blocking others?
VACUUM never blocks.
Anyway, DROP is the easy part; it's ADD COLUMN that can take a lot of
time (depending on whether or not you populate the column with a default
value).I'd detach all the partitions from the parent table, and then add the new
column to the not-children in multiple threads, add the column to the
parent and then reattach all of the children. That's the fastest method,
though takes some time to set up.Thank you so much.
Dropping will take it's own time for post vacuum however as you rightly
said, it won't be blocking which should be fine.In regards to add column, Detaching all partitions then adding column to
the individual partition in multiple sessions and then reattaching looks to
be a really awesome idea to make it faster.
Do both the DROP and ADD in the same "set". Possibly in the same statement
(which would be fastest if it works), and alternatively on the same command
line. Examples:
psql --host=foo.example.com somedb -c "ALTER TABLE bar_p85 DROP COLUMN
splat, ADD COLUMN barf BIGINT;"
psql --host=foo.example.com somedb -c "ALTER TABLE bar_p85 DROP splat;" -c
ALTER TABLE bar_p85 ADD COLUMN barf BIGINT;"
My syntax is probably wrong, but you get the idea.
However one doubt, Will it create issue if there already exists foreign key
on this partition table or say it's the parent to other child
partition/nonpartition tables?
(Note that detached children have FK constraints.)
It'll certainly create an "issue" if the column you're dropping is part of
the foreign key. 😀
It'll also cause a problem if the table you're dropping from or adding to
is the "target" of the FK, since the source can't check the being-altered
table during the ALTER TABLE statement.
Bottom line: you can optimize for:
1. minimized wall time by doing it in multiple transactions (which
*might* bodge
your application; we don't know it, so can't say for sure), OR
2. assured consistency (one transaction where you just ALTER the parent,
and have it ripple down to the children); it'll take much longer, though.
One other issue: *if* adding the new column requires a rewrite, "ALTER
parent" *might* (but I've never tried it) temporarily use an extra 2TB of
disk space in that single transaction. Doing the ALTERs child by child
minimizes that, since each child's ALTER is it's own transaction.
Whatever you do... test test test.
Show quoted text
On 2024-Jul-11, Ron Johnson wrote:
Anyway, DROP is the easy part; it's ADD COLUMN that can take a lot of time
(depending on whether or not you populate the column with a default value).
Actually, ADD COLUMN with a default does not rewrite the entire table
either, starting from pg11.
"Major enhancements in PostgreSQL 11 include:
[...]
* Many other useful performance improvements, including the ability to
avoid a table rewrite for ALTER TABLE ... ADD COLUMN with a non-null
column default."
https://www.postgresql.org/docs/11/release-11.html
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On 2024-07-11 10:06:47 +0200, Laurenz Albe wrote:
On Thu, 2024-07-11 at 13:10 +0530, sud wrote:
Dropping will take it's own time for post vacuum however as you
rightly said, it won't be blocking which should be fine.I am not certain if you understood this correctly.
Dropping a column is fast, but doesn't reclaim the space.
VACUUM won't block anything, but won't reclaim the space.
VACUUM (FULL) will block everything, but will also not reclaim the space.You'd need to use a form of ALTER TABLE that rewrites the table,
as indicated in the documentation.
Unfortunately the documentation indicates very little. It mentions that
the table will be rewritten with
* SET ACCESS METHOD
* a volatile DEFAULT
* changing the type of an existing column (unless binary coercible)
All three change something which you probably don't want to change.
The documentation also mentions some cases where the table is not
rewritten, so maybe some not explicitely mentioned options rewrite the
table, too.
I would especially expected ALTER TABLE ... CLUSTER to do this, but if
VACUUM FULL preserves the (former) content of dropped columns, maybe
CLUSTER does, too?
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On Sun, 2024-07-14 at 00:05 +0200, Peter J. Holzer wrote:
On 2024-07-11 10:06:47 +0200, Laurenz Albe wrote:
On Thu, 2024-07-11 at 13:10 +0530, sud wrote:
Dropping will take it's own time for post vacuum however as you
rightly said, it won't be blocking which should be fine.I am not certain if you understood this correctly.
Dropping a column is fast, but doesn't reclaim the space.
VACUUM won't block anything, but won't reclaim the space.
VACUUM (FULL) will block everything, but will also not reclaim the space.You'd need to use a form of ALTER TABLE that rewrites the table,
as indicated in the documentation.Unfortunately the documentation indicates very little. It mentions that
the table will be rewritten with* SET ACCESS METHOD
* a volatile DEFAULT
* changing the type of an existing column (unless binary coercible)All three change something which you probably don't want to change.
Hm, true.
You can always do
UPDATE tab SET id = id;
followed by
VACUUM (FULL) tab;
Yours,
Laurenz Albe
On 2024-07-15 13:53:25 +0200, Laurenz Albe wrote:
On Sun, 2024-07-14 at 00:05 +0200, Peter J. Holzer wrote:
On 2024-07-11 10:06:47 +0200, Laurenz Albe wrote:
Dropping a column is fast, but doesn't reclaim the space.
VACUUM won't block anything, but won't reclaim the space.
VACUUM (FULL) will block everything, but will also not reclaim the space.You'd need to use a form of ALTER TABLE that rewrites the table,
as indicated in the documentation.Unfortunately the documentation indicates very little. It mentions that
the table will be rewritten with* SET ACCESS METHOD
* a volatile DEFAULT
* changing the type of an existing column (unless binary coercible)All three change something which you probably don't want to change.
Hm, true.
You can always do
UPDATE tab SET id = id;
followed by
VACUUM (FULL) tab;
Yes, that should work. It needs about twice the size of the table in
temporary space, though.
Since the OP wrote that the table is "daily ... and 90 partitions"
(which understand that there is one partition per day and partitions are
kept for 90 days) it might be better to just wait. After 90 days all the
partitions with the obsolete column will be gone.
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On Mon, Jul 15, 2024 at 7:58 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
Hm, true.
You can always do
UPDATE tab SET id = id;
followed by
VACUUM (FULL) tab;
Yes, that should work. It needs about twice the size of the table in
temporary space, though.Since the OP wrote that the table is "daily ... and 90 partitions"
(which understand that there is one partition per day and partitions are
kept for 90 days) it might be better to just wait. After 90 days all the
partitions with the obsolete column will be gone.
Thank You very much.
As I understand, after dropping a column, it will still internally hold the
"NOT NULL" values in that column for all the rows, even though it's not
visible outside.
So, after the DROP column, it will force update any of the columns as
below, Then it will force create another copy of each of the rows even if
the column is updated to the same value. The new copy will have the dropped
column with values as NULL. And the post "VACUUM FULL '' will clean all the
rows with "NOT NULL '' values of that dropped column and thus reclaim the
space.
But the only issue would be "VACUUM FULL" will take a table lock and also
it may take longer to run this vacuum on the full table considering the
size of the table in TB's. Thus, is it fine to just leave it post execution
of the "update" statement , so that the normal vacuum operation (which will
be online operation) and that will take care of the removal of old rows ?
UPDATE tab SET id = id;
VACUUM (FULL) tab;
And also, As you mentioned we may also leave it as is and wait for the
partition to be dropped, so that the dropped column with "not null" values
which are still lying under the hood and are occupying space will be
removed automatically. But even then, is that dropped column still lying in
the rows with null values in it throughout its lifecycle, till the table
exists in the database?
Seems there is no other option exist to drop the column with space
reclaimed from the table in immediate effect, other than above discussed.
On 2024-07-16 02:00:27 +0530, sud wrote:
On Mon, Jul 15, 2024 at 7:58 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
Hm, true.
You can always do
UPDATE tab SET id = id;
followed by
VACUUM (FULL) tab;
Yes, that should work. It needs about twice the size of the table in
temporary space, though.Since the OP wrote that the table is "daily ... and 90 partitions"
(which understand that there is one partition per day and partitions are
kept for 90 days) it might be better to just wait. After 90 days all the
partitions with the obsolete column will be gone.
Thank You very much.
As I understand, after dropping a column, it will still internally hold the
"NOT NULL" values in that column for all the rows, even though it's not visible
outside.So, after the DROP column, it will force update any of the columns as below,
Then it will force create another copy of each of the rows even if the column
is updated to the same value. The new copy will have the dropped column with
values as NULL. And the post "VACUUM FULL '' will clean all the rows with "NOT
NULL '' values of that dropped column and thus reclaim the space.
Correct.
But the only issue would be "VACUUM FULL" will take a table lock and also it
may take longer to run this vacuum on the full table considering the size of
the table in TB's. Thus, is it fine to just leave it post execution of the
"update" statement , so that the normal vacuum operation (which will be online
operation) and that will take care of the removal of old rows ?
This is unlikely to recover the space.
The UPDATE will duplicate all rows. Since - presumably - there isn't
much free space within each partition the new rows will go at the end of
each partition, effectively doubling its size.
A regular VACUUM (whether autovacuum or invoked manually) will then
remove the old rows. and make the space available for new data. But
since that newly free space is at the beginning of each partition it
can't be returned to the OS. It would be available for new data written
to those partitions I guess not much is written to old partitons.
You could, however, do this in small steps and vacuum after each.
Something like this (in Python)
conn = ...
csr = conn.cursor()
for i in range(100):
csr.execute("UPDATE tab set id = id WHERE id % 100 = %s", (i,))
conn.commit()
csr.execute("VACUUM tab")
conn.commit()
That might just be able to squeeze the new rows in between the existing
rows and not grow the table.
And also, As you mentioned we may also leave it as is and wait for the
partition to be dropped, so that the dropped column with "not null" values
which are still lying under the hood and are occupying space will be removed
automatically. But even then, is that dropped column still lying in the rows
with null values in it throughout its lifecycle, till the table exists in the
database?
Yes. But a nullable column with a null value takes only a single bit of
storage, so that's negligible.
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On Tue, Jul 16, 2024 at 6:07 AM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
But the only issue would be "VACUUM FULL" will take a table lock and
also it
may take longer to run this vacuum on the full table considering the
size of
the table in TB's. Thus, is it fine to just leave it post execution of
the
"update" statement , so that the normal vacuum operation (which will be
online
operation) and that will take care of the removal of old rows ?
This is unlikely to recover the space.
The UPDATE will duplicate all rows. Since - presumably - there isn't
much free space within each partition the new rows will go at the end of
each partition, effectively doubling its size.A regular VACUUM (whether autovacuum or invoked manually) will then
remove the old rows. and make the space available for new data. But
since that newly free space is at the beginning of each partition it
can't be returned to the OS. It would be available for new data written
to those partitions I guess not much is written to old partitons.You could, however, do this in small steps and vacuum after each.
Something like this (in Python)conn = ...
csr = conn.cursor()
for i in range(100):
csr.execute("UPDATE tab set id = id WHERE id % 100 = %s", (i,))
conn.commit()
csr.execute("VACUUM tab")
conn.commit()That might just be able to squeeze the new rows in between the existing
rows and not grow the table.And also, As you mentioned we may also leave it as is and wait for the
partition to be dropped, so that the dropped column with "not null"values
which are still lying under the hood and are occupying space will be
removed
automatically. But even then, is that dropped column still lying in the
rows
with null values in it throughout its lifecycle, till the table exists
in the
database?
Yes. But a nullable column with a null value takes only a single bit of
storage, so that's negligible.
Thank you so much.
Normal vacuum marks the space occupied by the dead tuples as free or
reusable but vacuum full removes those completely. However even with
"vacuum full", the old rows will be removed completely from the storage ,
but the new rows will always be there with the 'dropped' column still
existing under the hood along with the table storage, with just carrying
"null" values in it. However, as it's a single bit of storage so will be
having negligible overhead. If we want to fully remove that column from the
table , we may have to create a new table and dump the data into that from
the existing table and then rename it back to old. Is this understanding
correct?
On Monday, July 15, 2024, sud <suds1434@gmail.com> wrote:
However even with "vacuum full", the old rows will be removed completely
from the storage , but the new rows will always be there with the 'dropped'
column still existing under the hood along with the table storage, with
just carrying "null" values in it. […] Is this understanding correct?
No. The table rewrite process involves creating new tuples that exactly
conform to the current row specification. The potentially non-null data
present in live tuples for columns that have been dropped are not copied
into the newly constructed tuples.
David J.
On Monday, July 15, 2024, David G. Johnston <david.g.johnston@gmail.com>
wrote:
On Monday, July 15, 2024, sud <suds1434@gmail.com> wrote:
However even with "vacuum full", the old rows will be removed completely
from the storage , but the new rows will always be there with the 'dropped'
column still existing under the hood along with the table storage, with
just carrying "null" values in it. […] Is this understanding correct?No. The table rewrite process involves creating new tuples that exactly
conform to the current row specification. The potentially non-null data
present in live tuples for columns that have been dropped are not copied
into the newly constructed tuples.https://github.com/postgres/postgres/blob/d2b74882cab84b9f4fdce0f2f32e89
2ba9164f5c/src/backend/access/heap/heapam_handler.c#L2499
My bad, stopped at the code comment. Apparently the data is just nulled,
not removed, the current row descriptor contains those columns with “is
dropped” and since this behavior doesn’t change the catalogs in this way
the new ones must as well. We just get the space back.
David J.
On Tue, Jul 16, 2024 at 10:26 AM David G. Johnston <
david.g.johnston@gmail.com> wrote:
On Monday, July 15, 2024, David G. Johnston <david.g.johnston@gmail.com>
wrote:On Monday, July 15, 2024, sud <suds1434@gmail.com> wrote:
However even with "vacuum full", the old rows will be removed completely
from the storage , but the new rows will always be there with the 'dropped'
column still existing under the hood along with the table storage, with
just carrying "null" values in it. […] Is this understanding correct?No. The table rewrite process involves creating new tuples that exactly
conform to the current row specification. The potentially non-null data
present in live tuples for columns that have been dropped are not copied
into the newly constructed tuples.My bad, stopped at the code comment. Apparently the data is just nulled,
not removed, the current row descriptor contains those columns with “is
dropped” and since this behavior doesn’t change the catalogs in this way
the new ones must as well. We just get the space back.
Thank you for the confirmation.
And if someone wants to fully remove that column from the table , then the
only option is to create a new table with an exact set of active columns
and insert the data into that from the existing/old table and then rename
it back to old. Is this correct understanding?
On Monday, July 15, 2024, sud <suds1434@gmail.com> wrote:
Thank you for the confirmation.
And if someone wants to fully remove that column from the table , then the
only option is to create a new table with an exact set of active columns
and insert the data into that from the existing/old table and then rename
it back to old. Is this correct understanding?
You haven’t removed the column from the table, you’ve removed the whole
dang table while creating a new, wholly unrelated, one that just happens to
have the same name. But the name of a table isn’t its primary key, though
some places do use it for lookup purposes.
David J.