DROP column: documentation unclear

Started by Adrian von Bidder · about 16 years ago · 18 messages · general
#1Adrian von Bidder
avbidder@fortytwo.ch

Hi,

The documentation about DROP COLUMN is a bit unclear:

| The DROP COLUMN form does not physically remove the column, but simply
| makes it invisible to SQL operations. Subsequent insert and update
| operations in the table will store a null value for the column. Thus,
| dropping a column is quick but it will not immediately reduce the on-disk
| size of your table, as the space occupied by the dropped column is not
| reclaimed. The space will be reclaimed over time as existing rows are
| updated.

"subsequent ... will store a null value" would imply that dropped columns
still take up some space, while "the space will be reclaimed ..." would
suggest that new rows (inserts, or updates under MVCC) no longer contain
the dropped column at all. I'm not quite sure how to interpret this.
What is pg doing?

cheers
-- vbi

--
featured product: PostgreSQL - http://postgresql.org

#2Martijn van Oosterhout
kleptog@svana.org
In reply to: Adrian von Bidder (#1)
Re: DROP column: documentation unclear

On Mon, Mar 08, 2010 at 05:09:14PM +0100, Adrian von Bidder wrote:

Hi,

The documentation about DROP COLUMN is a bit unclear:

| The DROP COLUMN form does not physically remove the column, but simply
| makes it invisible to SQL operations. Subsequent insert and update
| operations in the table will store a null value for the column. Thus,
| dropping a column is quick but it will not immediately reduce the on-disk
| size of your table, as the space occupied by the dropped column is not
| reclaimed. The space will be reclaimed over time as existing rows are
| updated.

"subsequent ... will store a null value" would imply that dropped columns
still take up some space, while "the space will be reclaimed ..." would
suggest that new rows (inserts, or updates under MVCC) no longer contain
the dropped column at all. I'm not quite sure how to interpret this.
What is pg doing?

What you're missing is that in Postgres NULLs are stored as a bit in
the tuple header and there is no data stored for them at all. So in a
sense NULLs take no space (well, one bit each), which means both
statements are true.
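A back-of-the-envelope sketch of why both statements hold. This is a simplified model, not exact PostgreSQL internals: the 23-byte header constant is approximate and alignment padding (MAXALIGN) is ignored.

```python
import math

TUPLE_HEADER_BYTES = 23  # approximate fixed per-row header; padding ignored

def row_overhead(num_columns: int, any_nulls: bool) -> int:
    """Per-row overhead in this model: a fixed header, plus a null bitmap
    of one bit per column that exists only when some column is NULL."""
    bitmap_bytes = math.ceil(num_columns / 8) if any_nulls else 0
    return TUPLE_HEADER_BYTES + bitmap_bytes

# A dropped column is stored as NULL in new rows, so it costs a bitmap
# bit rather than any data bytes:
print(row_overhead(8, any_nulls=False))  # 23: no NULLs, no bitmap
print(row_overhead(8, any_nulls=True))   # 24: bitmap fits in one byte
print(row_overhead(9, any_nulls=True))   # 25: a ninth column needs a second byte
```

So new row versions drop the column's data bytes entirely; only the bitmap bit for it remains.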

Have a nice day,

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.

#3Scot Kreienkamp
SKreien@la-z-boy.com
In reply to: Martijn van Oosterhout (#2)
autovacuum question

Hi everyone,

I have a database that is constantly getting reloaded several times per
day from production backups and is used for reporting purposes. The
problem I'm having with it is that the database seems to be much slower
than the others I have that are more static. I suspect that is due to
the lack of query-planner statistics: analyze is skipped after each
restore because of how long it takes to run.

First, I'm hoping someone here can validate my theory. Second, if
that's true, is there any way to trigger an auto-analyze on a table
automatically the first time a query touches that table? (I ask because
there is no way to shrink a database-wide analyze to fit the window I
have to do it in. The expectations may be
a bit unrealistic here, I know.) Third, what parameters can I set to
make analyze go as fast as possible, knowing that the disks are slow on
it because of the hardware? (Can't do anything about that either, FYI)
Obviously the more memory the better, and setting maintenance_work_mem
higher also helps. Doing a vacuum is kind of pointless because it gets
reloaded every 2 hours, so all I really need is an analyze, I think.

I'm on 8.3.7. Since 8.3.10 is due out I'll probably upgrade to it after
it's shaken out a bit.

Thanks for your help. Any suggestions are welcome.

Scot Kreienkamp
skreien@la-z-boy.com

#4Scott Mead
scott.lists@enterprisedb.com
In reply to: Scot Kreienkamp (#3)
Re: autovacuum question

On Mon, Mar 8, 2010 at 5:13 PM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:

Hi everyone,

I have a database that is constantly getting reloaded several times per
day from production backups and is used for reporting purposes. The
problem I'm having with it is that the database seems to be much slower
than the others I have that are more static. I suspect that is due to
the lack of query-planner statistics: analyze is skipped after each
restore because of how long it takes to run.

First, I'm hoping someone here can validate my theory.

It would seem likely that this could be the problem...

Second, if that's true, is there any way to trigger an auto-analyze on a
table automatically the first time a query touches that table? (I ask
because there is no way to shrink a database-wide analyze to fit the
window I have to do it in. The expectations may be a bit unrealistic
here, I know.)

Why not just add an 'analyze' as the last step of the restore job?
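A sketch of what that restore job might look like; the dump path and database name are illustrative, not from the thread:

```shell
#!/bin/sh
# Hypothetical restore job whose last step refreshes planner statistics.
set -e
DUMP=/backups/production.dump   # illustrative path
DB=reporting                    # illustrative database name

pg_restore --clean --dbname="$DB" "$DUMP"
# Give the planner fresh statistics before reporting queries hit the DB:
psql --dbname="$DB" --command='ANALYZE;'
```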

Third, what parameters can I set to
make analyze go as fast as possible, knowing that the disks are slow on
it because of the hardware? (Can't do anything about that either, FYI)
Obviously the more memory the better, and setting maintenance_work_mem
higher also helps. Doing a vacuum is kind of pointless because it gets
reloaded every 2 hours, so all I really need is an analyze, I think.

Sounds like you've done what you can. How long does an analyze take?

I'm on 8.3.7. Since 8.3.10 is due out I'll probably upgrade to it after
it's shaken out a bit.

Why not move up to 8.4?

--Scott M


Thanks for your help. Any suggestions are welcome.

Scot Kreienkamp
skreien@la-z-boy.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Scot Kreienkamp
SKreien@la-z-boy.com
In reply to: Scott Mead (#4)
Re: autovacuum question

On Mon, Mar 8, 2010 at 5:13 PM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:

Hi everyone,

I have a database that is constantly getting reloaded several times per
day from production backups and is used for reporting purposes. The
problem I'm having with it is that the database seems to be much slower
than the others I have that are more static. I suspect that is due to
the lack of query-planner statistics: analyze is skipped after each
restore because of how long it takes to run.

First, I'm hoping someone here can validate my theory.

It would seem likely that this could be the problem...

Cool.... Thanks. Glad I'm on the right track.

Second, if that's true, is there any way to trigger an auto-analyze on a
table automatically the first time a query touches that table? (I ask
because there is no way to shrink a database-wide analyze to fit the
window I have to do it in. The expectations may be a bit unrealistic
here, I know.)

Why not just add an 'analyze' as the last step of the restore job?

Due to the amount of time it takes. The disks are slow enough to make a
database-wide analyze painful since I would have to repeat it every 1-2
hours, i.e., every reload time.

Third, what parameters can I set to make analyze go as fast as possible,
knowing that the disks are slow on it because of the hardware? (Can't do
anything about that either, FYI.) Obviously the more memory the better,
and setting maintenance_work_mem higher also helps. Doing a vacuum is
kind of pointless because it gets reloaded every 2 hours, so all I
really need is an analyze, I think.

Sounds like you've done what you can. How long does an analyze take?

Last I tried it, it took 15 minutes on a 30 gig database while it was
being used.

I'm on 8.3.7. Since 8.3.10 is due out I'll probably upgrade to it after
it's shaken out a bit.

Why not move up to 8.4?

Because I'm constrained by our application developers who don't have the
time to vet our app against 8.4 yet. I've been pushing for it for the
last 2 months.

--Scott M

Thanks for your help. Any suggestions are welcome.

Scot Kreienkamp
skreien@la-z-boy.com


#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Scot Kreienkamp (#5)
Re: autovacuum question

"Scot Kreienkamp" <SKreien@la-z-boy.com> writes:

Why not just add an 'analyze' as the last step of the restore job?

Due to the amount of time it takes. The disks are slow enough to make a
database-wide analyze painful since I would have to repeat it every 1-2
hours, i.e., every reload time.

You claimed that before. It didn't make any sense then and it doesn't
now. There is no way that an analyze is expensive compared to a
database reload.

Maybe what you need to be doing is rethinking the strategy that involves
reloading every couple of hours...

regards, tom lane

#7Harald Fuchs
hari.fuchs@gmail.com
In reply to: Adrian von Bidder (#1)
Re: DROP column: documentation unclear

In article <20100308213549.GB660@svana.org>,
Martijn van Oosterhout <kleptog@svana.org> writes:

"subsequent ... will store a null value" would imply that dropped columns
still take up some space, while "the space will be reclaimed ..." would
suggest that new rows (inserts, or updates under MVCC) no longer contain
the dropped column at all. I'm not quite sure how to interpret this.
What is pg doing?

What you're missing is that in Postgres NULLs are stored as a bit in
the tuple header and there is no data stored for them at all. So in a
sense NULLs take no space (well, one bit each), which means both
statements are true.

But if you already have eight nullable columns, the (maybe originally
non-null) column which has been dropped would cause the header to be
one byte larger, wouldn't it?

#8Scot Kreienkamp
SKreien@la-z-boy.com
In reply to: Tom Lane (#6)
Re: autovacuum question

Wish I could Tom. I need a non-production, read-write copy of the
database that is updated every 1-2 hours from production. I don't set
this requirement, the business does. I just have to do it if it's
technically possible.

I found a way to do it very easily using LVM snapshots and WAL log
shipping, but the net effect is I'm bringing a new LVM snapshot copy of
the database out of recovery every 1-2 hours. That means I'd have to
spend 15 minutes, or one-quarter of the time, doing an analyze every
time I refresh the database. That's fairly painful. The LVM snap and
restart only takes 1-2 minutes right now.

If you have any other ideas how I can accomplish or improve this I'm all
ears.

Thanks,

Scot Kreienkamp
skreien@la-z-boy.com

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, March 08, 2010 10:32 PM
To: Scot Kreienkamp
Cc: Scott Mead; pgsql-general@postgresql.org
Subject: Re: [GENERAL] autovacuum question

"Scot Kreienkamp" <SKreien@la-z-boy.com> writes:

Why not just add an 'analyze' as the last step of the restore job?

Due to the amount of time it takes. The disks are slow enough to make a
database-wide analyze painful since I would have to repeat it every 1-2
hours, i.e., every reload time.

You claimed that before. It didn't make any sense then and it doesn't
now. There is no way that an analyze is expensive compared to a
database reload.

Maybe what you need to be doing is rethinking the strategy that involves
reloading every couple of hours...

regards, tom lane

#9Greg Stark
gsstark@gmail.com
In reply to: Scot Kreienkamp (#8)
Re: autovacuum question

On Tue, Mar 9, 2010 at 1:47 PM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:

I found a way to do it very easily using LVM snapshots and WAL log
shipping, but the net effect is I'm bringing a new LVM snapshot copy of
the database out of recovery every 1-2 hours.  That means I'd have to
spend 15 minutes, or one-quarter of the time, doing an analyze every
time I refresh the database.  That's fairly painful.  The LVM snap and
restart only takes 1-2 minutes right now.

Your snapshot should have the same stats that the server does, so this
doesn't actually seem to explain the discrepancy.

You may be running into performance problems with LVM if the snapshot is
the one paying the price for all the CoW copies. Or it could be that
doing retail block copies as needed leaves them fragmented, destroying
sequential-scan performance. You might be able to reduce the difference
by doing a vacuum and a checkpoint immediately prior to the snapshot.
That would hopefully set most of the hint bits, so that read-only
queries on the snapshot don't write to blocks just to set them.

There might be an option in LVM to materialize the entire snapshot
which might be able to bring the performance up to the same level and
hopefully allocate all the blocks sequentially.
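That pre-snapshot sequence might be sketched like this; the database name, volume group, and snapshot size are all illustrative, not from the thread:

```shell
#!/bin/sh
# Hypothetical pre-snapshot routine: set hint bits and flush dirty buffers
# before taking the LVM snapshot. Names (proddb, vg0/pgdata) are illustrative.
set -e
psql -d proddb -c 'VACUUM;'        # sets hint bits on recently written rows
psql -d proddb -c 'CHECKPOINT;'    # flushes dirty buffers to disk
lvcreate --size 5G --snapshot --name pgdata_snap /dev/vg0/pgdata
```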

--
greg

#10Scot Kreienkamp
SKreien@la-z-boy.com
In reply to: Bruce Momjian (#9)
Re: autovacuum question

Would the stats come across in WAL log shipping to a physically separate server? My understanding is that they won't.

Thanks,

Scot Kreienkamp
skreien@la-z-boy.com

-----Original Message-----
From: gsstark@gmail.com [mailto:gsstark@gmail.com] On Behalf Of Greg Stark
Sent: Tuesday, March 09, 2010 9:39 AM
To: Scot Kreienkamp
Cc: Tom Lane; Scott Mead; pgsql-general@postgresql.org
Subject: Re: autovacuum question

On Tue, Mar 9, 2010 at 1:47 PM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:

I found a way to do it very easily using LVM snapshots and WAL log
shipping, but the net effect is I'm bringing a new LVM snapshot copy of
the database out of recovery every 1-2 hours.  That means I'd have to
spend 15 minutes, or one-quarter of the time, doing an analyze every
time I refresh the database.  That's fairly painful.  The LVM snap and
restart only takes 1-2 minutes right now.

Your snapshot should have the same stats that the server does, so this
doesn't actually seem to explain the discrepancy.

You may be running into performance problems with LVM if the snapshot is
the one paying the price for all the CoW copies. Or it could be that
doing retail block copies as needed leaves them fragmented, destroying
sequential-scan performance. You might be able to reduce the difference
by doing a vacuum and a checkpoint immediately prior to the snapshot.
That would hopefully set most of the hint bits, so that read-only
queries on the snapshot don't write to blocks just to set them.

There might be an option in LVM to materialize the entire snapshot
which might be able to bring the performance up to the same level and
hopefully allocate all the blocks sequentially.

--
greg

#11Scott Marlowe
scott.marlowe@gmail.com
In reply to: Scot Kreienkamp (#8)
Re: autovacuum question

On Tue, Mar 9, 2010 at 6:47 AM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:

Wish I could Tom.  I need a non-production, read-write copy of the
database that is updated every 1-2 hours from production. I don't set
this requirement, the business does. I just have to do it if it's
technically possible.

I found a way to do it very easily using LVM snapshots and WAL log
shipping, but the net effect is I'm bringing a new LVM snapshot copy of
the database out of recovery every 1-2 hours.  That means I'd have to
spend 15 minutes, or one-quarter of the time, doing an analyze every
time I refresh the database.  That's fairly painful.  The LVM snap and
restart only takes 1-2 minutes right now.

If you have any other ideas how I can accomplish or improve this I'm all
ears.

I'm gonna take a scientific wild-assed guess that the real issue here
is caching, or more specifically, lack thereof when you first start up
your copy of the db.
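If cold cache is the culprit, one crude workaround is to pre-read the data files through the OS page cache right after the snapshot comes up; this is a sketch, and the data-directory path is illustrative:

```shell
#!/bin/sh
# Hypothetical cache-warming step: stream the table files to /dev/null so
# the OS page cache is hot before users run queries. Path is illustrative.
PGDATA=/var/lib/pgsql/data
tar -cf /dev/null "$PGDATA/base"
```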

#12Scott Mead
scott.lists@enterprisedb.com
In reply to: Scott Marlowe (#11)
Re: autovacuum question

On Tue, Mar 9, 2010 at 9:56 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

On Tue, Mar 9, 2010 at 6:47 AM, Scot Kreienkamp <SKreien@la-z-boy.com>
wrote:

Wish I could Tom. I need a non-production, read-write copy of the
database that is updated every 1-2 hours from production. I don't set
this requirement, the business does. I just have to do it if it's
technically possible.

I found a way to do it very easily using LVM snapshots and WAL log
shipping, but the net effect is I'm bringing a new LVM snapshot copy of
the database out of recovery every 1-2 hours. That means I'd have to
spend 15 minutes, or one-quarter of the time, doing an analyze every
time I refresh the database. That's fairly painful. The LVM snap and
restart only takes 1-2 minutes right now.

If you have any other ideas how I can accomplish or improve this I'm all
ears.

I'm gonna take a scientific wild-assed guess that the real issue here
is caching, or more specifically, lack thereof when you first start up
your copy of the db.

ISTM that 9.0's read-only standby feature may be of use to you. I know it
doesn't help you *today* but have you looked at it yet?

--Scott M



#13Scot Kreienkamp
SKreien@la-z-boy.com
In reply to: Scott Mead (#12)
Re: autovacuum question

ISTM that 9.0's read-only standby feature may be of use to you. I know
it doesn't help you *today* but have you looked at it yet?

[Scot Kreienkamp]

I had considered it and it will make my life easier for my reporting
server, but unfortunately in this case I need a read-write copy.

#14Bryan Montgomery
monty@english.net
In reply to: Scot Kreienkamp (#8)
Re: autovacuum question

Could you have two of these non-production databases? Prepare one in the
background, including an analyze, then make it the 'live' non-production
database; use the offline/alternate database for the next load, prepare
it, and switch over when ready.

In this scenario you'd need twice the disk space, I guess, but the
'downtime' would be a lot less. I'd imagine you could have the databases
on different ports and switch them at the roll-over point, or maybe even
just use different database names.

Just an alternative idea to throw out there.

Bryan.

On Tue, Mar 9, 2010 at 8:47 AM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:


Wish I could Tom. I need a non-production, read-write copy of the
database that is updated every 1-2 hours from production. I don't set
this requirement, the business does. I just have to do it if it's
technically possible.

I found a way to do it very easily using LVM snapshots and WAL log
shipping, but the net effect is I'm bringing a new LVM snapshot copy of
the database out of recovery every 1-2 hours. That means I'd have to
spend 15 minutes, or one-quarter of the time, doing an analyze every
time I refresh the database. That's fairly painful. The LVM snap and
restart only takes 1-2 minutes right now.

If you have any other ideas how I can accomplish or improve this I'm all
ears.

Thanks,

Scot Kreienkamp
skreien@la-z-boy.com

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, March 08, 2010 10:32 PM
To: Scot Kreienkamp
Cc: Scott Mead; pgsql-general@postgresql.org
Subject: Re: [GENERAL] autovacuum question

"Scot Kreienkamp" <SKreien@la-z-boy.com> writes:

Why not just add an 'analyze' as the last step of the restore job?

Due to the amount of time it takes. The disks are slow enough to make a
database-wide analyze painful since I would have to repeat it every 1-2
hours, i.e., every reload time.

You claimed that before. It didn't make any sense then and it doesn't
now. There is no way that an analyze is expensive compared to a
database reload.

Maybe what you need to be doing is rethinking the strategy that involves
reloading every couple of hours...

regards, tom lane


#15Scott Mead
scott.lists@enterprisedb.com
In reply to: Scott Mead (#12)
Re: autovacuum question

On Tue, Mar 9, 2010 at 10:01 AM, Scott Mead <scott.lists@enterprisedb.com> wrote:

On Tue, Mar 9, 2010 at 9:56 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

On Tue, Mar 9, 2010 at 6:47 AM, Scot Kreienkamp <SKreien@la-z-boy.com>
wrote:

Wish I could Tom. I need a non-production, read-write copy of the
database that is updated every 1-2 hours from production. I don't set
this requirement, the business does. I just have to do it if it's
technically possible.

I found a way to do it very easily using LVM snapshots and WAL log
shipping, but the net effect is I'm bringing a new LVM snapshot copy of
the database out of recovery every 1-2 hours. That means I'd have to
spend 15 minutes, or one-quarter of the time, doing an analyze every
time I refresh the database. That's fairly painful. The LVM snap and
restart only takes 1-2 minutes right now.

If you have any other ideas how I can accomplish or improve this I'm all
ears.

I'm gonna take a scientific wild-assed guess that the real issue here
is caching, or more specifically, lack thereof when you first start up
your copy of the db.

ISTM that 9.0's read-only standby feature may be of use to you. I know it
doesn't help you *today* but have you looked at it yet?

Okay, so the RO database won't work. How much data are we talking? How
much growth do you see between snapshots?

--Scott M

#16Scot Kreienkamp
SKreien@la-z-boy.com
In reply to: Scott Mead (#15)
Re: autovacuum question

ISTM that 9.0's read-only standby feature may be of use to you. I know
it doesn't help you *today* but have you looked at it yet?

Okay, so the RO database won't work. How much data are we talking?
How much growth do you see between snapshots?

The initial database size is 31 gigs. I give it 5 gigs in the snapshot
to grow in, but I would be surprised if more than 200 megs of data
changes. The actual change rate should be very low.

#17Scot Kreienkamp
SKreien@la-z-boy.com
In reply to: Scott Marlowe (#11)
Re: autovacuum question

I'm gonna take a scientific wild-assed guess that the real issue here
is caching, or more specifically, lack thereof when you first start up
your copy of the db.

[Scot Kreienkamp]
That is definitely one of the problems. No way to help that, as far as
I'm aware.

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Harald Fuchs (#7)
Re: DROP column: documentation unclear

Harald Fuchs <hari.fuchs@gmail.com> writes:

Martijn van Oosterhout <kleptog@svana.org> writes:

What you're missing is that in Postgres NULLs are stored as a bit in
the tuple header and there is no data stored for them at all. So in a
sense NULLs take no space (well, one bit each), which means both
statements are true.

But if you already have eight nullable columns, the (maybe originally
non-null) column which has been dropped would cause the header to be
one byte larger, wouldn't it?

No; the size of the bitmap is equal to the total number of columns, not
the number of columns that are null. One way to think about it is that
the first null in a particular row costs NUMCOLUMNS bits to store, and
then any additional nulls are free.
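That accounting can be put as a toy model; to match the "NUMCOLUMNS bits" description above, rounding the bitmap up to whole bytes is ignored here:

```python
def null_bookkeeping_bits(num_columns: int, num_nulls: int) -> int:
    """Bits spent tracking NULLs in one row under this model: the bitmap
    covers every column, but only exists once some column is NULL."""
    return num_columns if num_nulls > 0 else 0

# The first NULL in a 10-column row pays for the whole bitmap:
first_null_cost = null_bookkeeping_bits(10, 1) - null_bookkeeping_bits(10, 0)
print(first_null_cost)  # 10

# Every additional NULL is free:
extra_null_cost = null_bookkeeping_bits(10, 5) - null_bookkeeping_bits(10, 1)
print(extra_null_cost)  # 0
```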

Anyway, arguing about the size of the null bitmap seems to me to be
beside the point. If you care about the space occupied by a column at
all, it's going to be a lot more than one bit. It's that macroscopic
space usage that the DROP COLUMN documentation is talking about
reclaiming, not whether or not you need a null bitmap.

regards, tom lane