Does VACUUM reorder tables on clustered indices
Does VACUUMing reorder tables on clustered indices or is it only the CLUSTER
command that can do that?
/ r church
----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.
On Sun, 18 Dec 2005, frank church wrote:
Does VACUUMing reorder tables on clustered indices or is it only the CLUSTER
command that can do that?
Cluster does that. Vacuum only cleans dead tuples from the tables.
--
18:02:25 up 4 days, 9:57, 5 users, load average: 1.59, 1.57, 1.62
---------------------------------------------------------
Lic. Martín Marqués | SELECT 'mmarques' ||
Centro de Telemática | '@' || 'unl.edu.ar';
Universidad Nacional | DBA, Programador,
del Litoral | Administrador
---------------------------------------------------------
Martin Marques escribi�:
On Sun, 18 Dec 2005, frank church wrote:
Does VACUUMing reorder tables on clustered indices or is it only the
CLUSTER
command that can do that?Cluster does that. Vacuum only cleans dead tuples from the tables.
Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.
Just for the record, that behavior is seriously broken: it violates
MVCC if any of the deleted tuples are still visible to anyone else.
regards, tom lane
Tom Lane escribi�:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.Just for the record, that behavior is seriously broken: it violates
MVCC if any of the deleted tuples are still visible to anyone else.
Well, FWIW, the fact that the CLUSTER command exists at all is broken
IMHO ... but meanwhile, are you suggesting that CLUSTER should be fixed
to retain tuples that are visible to running transactions?
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Sun, Dec 18, 2005 at 10:08:22PM -0500, Tom Lane wrote:
Just for the record, that behavior is seriously broken: it violates
MVCC if any of the deleted tuples are still visible to anyone else.
Does it remove tuples that VACUUM FULL wouldn't? I always thought it
did essentially the same thing?
A
--
Andrew Sullivan | ajs@crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what
you told them to. That actually seems sort of quaint now.
--J.D. Baldwin
Andrew Sullivan <ajs@crankycanuck.ca> writes:
On Sun, Dec 18, 2005 at 10:08:22PM -0500, Tom Lane wrote:
Just for the record, that behavior is seriously broken: it violates
MVCC if any of the deleted tuples are still visible to anyone else.
Does it remove tuples that VACUUM FULL wouldn't?
Yes. CLUSTER works on SnapshotNow, so it will remove committed-dead
tuples even if there are still open transactions that could see them.
Of course, said transactions couldn't be actively using the table
while the CLUSTER runs, because it takes an exclusive table lock.
But they *could* look at it afterwards. Offhand I think you'd only
be likely to notice the difference if the open transactions were
SERIALIZABLE --- in READ COMMITTED mode, by the time they could look
at the clustered table, they'd likely be using a snapshot that postdates
the DELETE.
[ experiments a bit... ] Hmm. Actually, it's far worse than I
thought. It looks like CLUSTER puts the tuples into the new table with
its own xid, which means that concurrent serializable transactions will
see the new table as completely empty!
<< session 1 >>
regression=# select * from int4_tbl;
f1
-------------
0
123456
-123456
2147483647
-2147483647
(5 rows)
regression=# create index fooi on int4_tbl(f1);
CREATE INDEX
regression=# begin isolation level serializable;
BEGIN
regression=# select 2+2; -- establish transaction snapshot
?column?
----------
4
(1 row)
<< session 2 >>
regression=# delete from int4_tbl where f1 = -123456;
DELETE 1
regression=# cluster fooi on int4_tbl;
CLUSTER
<< back to session 1 >>
regression=# select * from int4_tbl;
f1
----
(0 rows)
regression=# commit;
COMMIT
regression=# select * from int4_tbl;
f1
-------------
-2147483647
0
123456
2147483647
(4 rows)
regards, tom lane
On Sun, Dec 18, 2005 at 07:01:39PM -0300, Alvaro Herrera wrote:
Martin Marques escribi?:
On Sun, 18 Dec 2005, frank church wrote:
Does VACUUMing reorder tables on clustered indices or is it only the
CLUSTER
command that can do that?Cluster does that. Vacuum only cleans dead tuples from the tables.
Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.
It also does a REINDEX...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Hi,
Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data.
Regards
----- Original Message -----
From: "Jim C. Nasby" <jnasby@pervasive.com>
To: "Martin Marques" <martin@bugs.unl.edu.ar>; "frank church"
<pgsql@adontendev.net>; <pgsql-sql@postgresql.org>
Sent: Tuesday, December 20, 2005 10:41 PM
Subject: Re: [SQL] Does VACUUM reorder tables on clustered indices
Show quoted text
On Sun, Dec 18, 2005 at 07:01:39PM -0300, Alvaro Herrera wrote:
Martin Marques escribi?:
On Sun, 18 Dec 2005, frank church wrote:
Does VACUUMing reorder tables on clustered indices or is it only the
CLUSTER
command that can do that?Cluster does that. Vacuum only cleans dead tuples from the tables.
Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.It also does a REINDEX...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
On Wed, Dec 21, 2005 at 12:34:12AM +0100, ipv@tinet.org wrote:
Hi,
Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data.
Why would you vacuum when cluster is just going to wipe out the dead
tuples anyway?
Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.It also does a REINDEX...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
jnasby@pervasive.com ("Jim C. Nasby") writes:
On Wed, Dec 21, 2005 at 12:34:12AM +0100, ipv@tinet.org wrote:
Hi,
Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data.
Why would you vacuum when cluster is just going to wipe out the dead
tuples anyway?
There is one reason to VACUUM before running CLUSTER...
That is that VACUUM will be *guaranteed* to draw all the pages into memory.
Subsequently, you can be certain that the pages are in cache, and that
the CLUSTER should need to do minimal I/O to read data into memory.
If I'm considering clustering the Slony-I "sl_log_1" table, forcing it
into memory *is* something I'll consider doing in order to minimize
the time that would-be writers are blocked from writing...
--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://cbbrowne.com/info/multiplexor.html
Never criticize anybody until you have walked a mile in their shoes,
because by that time you will be a mile away and have their shoes.
-- email sig, Brian Servis
Chris Browne wrote:
jnasby@pervasive.com ("Jim C. Nasby") writes:
On Wed, Dec 21, 2005 at 12:34:12AM +0100, ipv@tinet.org wrote:
Hi,
Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data.
Why would you vacuum when cluster is just going to wipe out the dead
tuples anyway?There is one reason to VACUUM before running CLUSTER...
That is that VACUUM will be *guaranteed* to draw all the pages into memory.
Subsequently, you can be certain that the pages are in cache, and that
the CLUSTER should need to do minimal I/O to read data into memory.If I'm considering clustering the Slony-I "sl_log_1" table, forcing it
into memory *is* something I'll consider doing in order to minimize
the time that would-be writers are blocked from writing...
Why don't you just do SELECT * FROM tab WHERE col != 'lkjasdflkjadsf'.
That should pull things into memory without the VACUUM overhead.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Wed, Dec 21, 2005 at 06:36:45PM -0500, Chris Browne wrote:
If I'm considering clustering the Slony-I "sl_log_1" table, forcing it
into memory *is* something I'll consider doing in order to minimize
the time that would-be writers are blocked from writing...
Given what Tom Lane recently reported (and assuming I understood his
remarks), I think it's a Mighty Bad Idea to CLUSTER sl_log_1.
A
--
Andrew Sullivan | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.
--Brad Holland
Tom, has this bug been addressed or documented?
---------------------------------------------------------------------------
Tom Lane wrote:
Andrew Sullivan <ajs@crankycanuck.ca> writes:
On Sun, Dec 18, 2005 at 10:08:22PM -0500, Tom Lane wrote:
Just for the record, that behavior is seriously broken: it violates
MVCC if any of the deleted tuples are still visible to anyone else.Does it remove tuples that VACUUM FULL wouldn't?
Yes. CLUSTER works on SnapshotNow, so it will remove committed-dead
tuples even if there are still open transactions that could see them.
Of course, said transactions couldn't be actively using the table
while the CLUSTER runs, because it takes an exclusive table lock.
But they *could* look at it afterwards. Offhand I think you'd only
be likely to notice the difference if the open transactions were
SERIALIZABLE --- in READ COMMITTED mode, by the time they could look
at the clustered table, they'd likely be using a snapshot that postdates
the DELETE.[ experiments a bit... ] Hmm. Actually, it's far worse than I
thought. It looks like CLUSTER puts the tuples into the new table with
its own xid, which means that concurrent serializable transactions will
see the new table as completely empty!<< session 1 >>
regression=# select * from int4_tbl;
f1
-------------
0
123456
-123456
2147483647
-2147483647
(5 rows)regression=# create index fooi on int4_tbl(f1);
CREATE INDEX
regression=# begin isolation level serializable;
BEGIN
regression=# select 2+2; -- establish transaction snapshot
?column?
----------
4
(1 row)<< session 2 >>
regression=# delete from int4_tbl where f1 = -123456;
DELETE 1
regression=# cluster fooi on int4_tbl;
CLUSTER<< back to session 1 >>
regression=# select * from int4_tbl;
f1
----
(0 rows)regression=# commit;
COMMIT
regression=# select * from int4_tbl;
f1
-------------
-2147483647
0
123456
2147483647
(4 rows)regards, tom lane
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
Tom, has this bug been addressed or documented?
No. Please add a TODO:
* Make CLUSTER preserve recently-dead tuples per MVCC requirements
I have not tested, but I suspect the table-rewriting variants of ALTER TABLE
have the same problem.
regards, tom lane
Tom Lane wrote:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
Tom, has this bug been addressed or documented?
No. Please add a TODO:
* Make CLUSTER preserve recently-dead tuples per MVCC requirements
I have not tested, but I suspect the table-rewriting variants of ALTER TABLE
have the same problem.
Added.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073