updating all records of a table
Hi:
I have to update all the records of a table. I'm worried about what the table will look like in terms of fragmentation when this is finished. Is there some sort of table healing/reorg/rebuild measure I should take if I want the resulting table to operate at optimal efficiency? What about indexes, should I drop/recreate those?
(I remember the bad-ole days with Oracle where table defragging and index rebuilding was something we had to do)
Thanks for any help !
On Thu, 2011-03-03 at 20:03 -0700, Gauthier, Dave wrote:
Hi:
I have to update all the records of a table. I'm worried about what
the table will look like in terms of fragmentation when this is
finished. Is there some sort of table healing/reorg/rebuild measure I
should take if I want the resulting table to operate at optimal
efficiency? What about indexes, should I drop/recreate those?
Well it depends on the size of table but yes it is going to create a lot
of dead space. A cluster or reindex of the table will solve this for
you.
JD
--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote:
Hi:
I have to update all the records of a table. I'm worried about what the table will look like in terms of fragmentation when this is finished. Is there some sort of table healing/reorg/rebuild measure I should take if I want the resulting table to operate at optimal efficiency? What about indexes, should I drop/recreate those?
Is it really important that it happen in one transaction?
In the past when I've had to do this on large numbers of rows, I
always tried to do it in batches. You can run vacuums in between
groups, so that the table doesn't get too bloated.
Otherwise, yeah, you're better off to do some of the cleanup Joshua
suggested.
A
--
Andrew Sullivan
ajs@crankycanuck.ca
On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote:
On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote:
Hi:
I have to update all the records of a table. I'm worried about what the table will look like in terms of fragmentation when this is finished. Is there some sort of table healing/reorg/rebuild measure I should take if I want the resulting table to operate at optimal efficiency? What about indexes, should I drop/recreate those?
Is it really important that it happen in one transaction?
In the past when I've had to do this on large numbers of rows, I
always tried to do it in batches. You can run vacuums in between
groups, so that the table doesn't get too bloated.Otherwise, yeah, you're better off to do some of the cleanup Joshua
suggested.A
+1
If UPDATE is for all rows, then
1. CTAS with change value in SELECT
2. Rename the tables. -- This will give zero Bloats.
Thanks & Regards,
Vibhor Kumar
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
vibhor.kumar@enterprisedb.com
Blog:http://vibhork.blogspot.com
On 03/04/2011 04:54 AM, Vibhor Kumar wrote:
On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote:
On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote:
Hi:
I have to update all the records of a table. I'm worried about what the table will look like in terms of fragmentation when this is finished. Is there some sort of table healing/reorg/rebuild measure I should take if I want the resulting table to operate at optimal efficiency? What about indexes, should I drop/recreate those?
Is it really important that it happen in one transaction?
In the past when I've had to do this on large numbers of rows, I
always tried to do it in batches. You can run vacuums in between
groups, so that the table doesn't get too bloated.Otherwise, yeah, you're better off to do some of the cleanup Joshua
suggested.A
+1
If UPDATE is for all rows, then
1. CTAS with change value in SELECT
2. Rename the tables. -- This will give zero Bloats.
Elegant, but of course, this doubles the disk space consumed. Not
always tenable.
I like the "cluster" and "reindex" ideas. The table is not that big and I do have the disk space. This table will also grow over time, so if the table ends up taking more space in the end, that's OK, it'll get used. The DB will also be unavailable to the users while this is happening, so I won't have to be contending with interactive users.
Thanks for all the suggestions!
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Rob Sargent
Sent: Friday, March 04, 2011 10:21 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] updating all records of a table
On 03/04/2011 04:54 AM, Vibhor Kumar wrote:
On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote:
On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote:
Hi:
I have to update all the records of a table. I'm worried about what the table will look like in terms of fragmentation when this is finished. Is there some sort of table healing/reorg/rebuild measure I should take if I want the resulting table to operate at optimal efficiency? What about indexes, should I drop/recreate those?
Is it really important that it happen in one transaction?
In the past when I've had to do this on large numbers of rows, I
always tried to do it in batches. You can run vacuums in between
groups, so that the table doesn't get too bloated.Otherwise, yeah, you're better off to do some of the cleanup Joshua
suggested.A
+1
If UPDATE is for all rows, then
1. CTAS with change value in SELECT
2. Rename the tables. -- This will give zero Bloats.
Elegant, but of course, this doubles the disk space consumed. Not
always tenable.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
robjsargent@gmail.com (Rob Sargent) writes:
On 03/04/2011 04:54 AM, Vibhor Kumar wrote:
On Mar 4, 2011, at 5:17 PM, Andrew Sullivan wrote:
On Thu, Mar 03, 2011 at 08:03:59PM -0700, Gauthier, Dave wrote:
Hi:
I have to update all the records of a table. I'm worried about
what the table will look like in terms of fragmentation when this
is finished. Is there some sort of table healing/reorg/rebuild
measure I should take if I want the resulting table to operate at
optimal efficiency? What about indexes, should I drop/recreate
those?Is it really important that it happen in one transaction?
In the past when I've had to do this on large numbers of rows, I
always tried to do it in batches. You can run vacuums in between
groups, so that the table doesn't get too bloated.Otherwise, yeah, you're better off to do some of the cleanup Joshua
suggested.A
+1
If UPDATE is for all rows, then
1. CTAS with change value in SELECT
2. Rename the tables. -- This will give zero Bloats.Elegant, but of course, this doubles the disk space consumed. Not
always tenable.
... But if you needed to do it in one Swell Foop, there really wasn't
any other choice.
The only way *not* to double (or more) space consumption is to do
incremental updates, vacuuming around each increment.
--
select 'cbbrowne' || '@' || 'linuxdatabases.info';
http://www3.sympatico.ca/cbbrowne/lisp.html
We are MICROS~1. You will be assimilated. Resistance is futile.
(Attributed to B.G., Gill Bates)
This has been a great thread! I am missing something because I do not
know what CTAS is. WOuld someone please help me understand.
ray
On Sat, Mar 05, 2011 at 07:38:23AM -0800, ray wrote:
This has been a great thread! I am missing something because I do not
know what CTAS is. WOuld someone please help me understand.
Create Table As Select.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
Patriotism is when love of your own people comes first; nationalism,
when hate for people other than your own comes first.
- Charles de Gaulle
you might consider lowering the fillfactor a bit. It will consume more
space, but it will make updates and inserts faster.
http://www.postgresql.org/docs/8.4/interactive/sql-createtable.html :
fillfactor (integer)
The fillfactor for a table is a percentage between 10 and 100. 100 (complete
packing) is the default. When a smaller fillfactor is specified,
INSERT operations
pack table pages only to the indicated percentage; the remaining space on
each page is reserved for updating rows on that page. This gives UPDATE a
chance to place the updated copy of a row on the same page as the original,
which is more efficient than placing it on a different page. For a table
whose entries are never updated, complete packing is the best choice, but in
heavily updated tables smaller fillfactors are appropriate. This parameter
cannot be set for TOAST tables.
--
"Patriotism is the conviction that your country is superior to all others
because you were born in it." -- George Bernard Shaw