Does VACUUM reorder tables on clustered indices

Started by frank churchabout 20 years ago16 messages
#1frank church
pgsql@adontendev.net

Does VACUUMing reorder tables on clustered indices or is it only the CLUSTER
command that can do that?

/ r church

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

#2Martin Marques
martin@bugs.unl.edu.ar
In reply to: frank church (#1)
Re: Does VACUUM reorder tables on clustered indices

On Sun, 18 Dec 2005, frank church wrote:

Does VACUUMing reorder tables on clustered indices or is it only the CLUSTER
command that can do that?

Cluster does that. Vacuum only cleans dead tuples from the tables.

--
18:02:25 up 4 days, 9:57, 5 users, load average: 1.59, 1.57, 1.62
---------------------------------------------------------
Lic. Martín Marqués | SELECT 'mmarques' ||
Centro de Telemática | '@' || 'unl.edu.ar';
Universidad Nacional | DBA, Programador,
del Litoral | Administrador
---------------------------------------------------------

#3Alvaro Herrera
alvherre@commandprompt.com
In reply to: Martin Marques (#2)
Re: Does VACUUM reorder tables on clustered indices

Martin Marques escribi�:

On Sun, 18 Dec 2005, frank church wrote:

Does VACUUMing reorder tables on clustered indices or is it only the
CLUSTER
command that can do that?

Cluster does that. Vacuum only cleans dead tuples from the tables.

Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#3)
Re: Does VACUUM reorder tables on clustered indices

Alvaro Herrera <alvherre@commandprompt.com> writes:

Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.

Just for the record, that behavior is seriously broken: it violates
MVCC if any of the deleted tuples are still visible to anyone else.

regards, tom lane

#5Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#4)
Re: Does VACUUM reorder tables on clustered indices

Tom Lane escribi�:

Alvaro Herrera <alvherre@commandprompt.com> writes:

Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.

Just for the record, that behavior is seriously broken: it violates
MVCC if any of the deleted tuples are still visible to anyone else.

Well, FWIW, the fact that the CLUSTER command exists at all is broken
IMHO ... but meanwhile, are you suggesting that CLUSTER should be fixed
to retain tuples that are visible to running transactions?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#6Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Tom Lane (#4)
Re: Does VACUUM reorder tables on clustered indices

On Sun, Dec 18, 2005 at 10:08:22PM -0500, Tom Lane wrote:

Just for the record, that behavior is seriously broken: it violates
MVCC if any of the deleted tuples are still visible to anyone else.

Does it remove tuples that VACUUM FULL wouldn't? I always thought it
did essentially the same thing?

A

--
Andrew Sullivan | ajs@crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what
you told them to. That actually seems sort of quaint now.
--J.D. Baldwin

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Sullivan (#6)
Re: Does VACUUM reorder tables on clustered indices

Andrew Sullivan <ajs@crankycanuck.ca> writes:

On Sun, Dec 18, 2005 at 10:08:22PM -0500, Tom Lane wrote:

Just for the record, that behavior is seriously broken: it violates
MVCC if any of the deleted tuples are still visible to anyone else.

Does it remove tuples that VACUUM FULL wouldn't?

Yes. CLUSTER works on SnapshotNow, so it will remove committed-dead
tuples even if there are still open transactions that could see them.
Of course, said transactions couldn't be actively using the table
while the CLUSTER runs, because it takes an exclusive table lock.
But they *could* look at it afterwards. Offhand I think you'd only
be likely to notice the difference if the open transactions were
SERIALIZABLE --- in READ COMMITTED mode, by the time they could look
at the clustered table, they'd likely be using a snapshot that postdates
the DELETE.

[ experiments a bit... ] Hmm. Actually, it's far worse than I
thought. It looks like CLUSTER puts the tuples into the new table with
its own xid, which means that concurrent serializable transactions will
see the new table as completely empty!

<< session 1 >>

regression=# select * from int4_tbl;
f1
-------------
0
123456
-123456
2147483647
-2147483647
(5 rows)

regression=# create index fooi on int4_tbl(f1);
CREATE INDEX
regression=# begin isolation level serializable;
BEGIN
regression=# select 2+2; -- establish transaction snapshot
?column?
----------
4
(1 row)

<< session 2 >>

regression=# delete from int4_tbl where f1 = -123456;
DELETE 1
regression=# cluster fooi on int4_tbl;
CLUSTER

<< back to session 1 >>

regression=# select * from int4_tbl;
f1
----
(0 rows)

regression=# commit;
COMMIT
regression=# select * from int4_tbl;
f1
-------------
-2147483647
0
123456
2147483647
(4 rows)

regards, tom lane

#8Jim C. Nasby
jnasby@pervasive.com
In reply to: Alvaro Herrera (#3)
Re: Does VACUUM reorder tables on clustered indices

On Sun, Dec 18, 2005 at 07:01:39PM -0300, Alvaro Herrera wrote:

Martin Marques escribi?:

On Sun, 18 Dec 2005, frank church wrote:

Does VACUUMing reorder tables on clustered indices or is it only the
CLUSTER
command that can do that?

Cluster does that. Vacuum only cleans dead tuples from the tables.

Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.

It also does a REINDEX...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#9Noname
ipv@tinet.org
In reply to: frank church (#1)
Re: Does VACUUM reorder tables on clustered indices

Hi,

Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data.

Regards

----- Original Message -----
From: "Jim C. Nasby" <jnasby@pervasive.com>
To: "Martin Marques" <martin@bugs.unl.edu.ar>; "frank church"
<pgsql@adontendev.net>; <pgsql-sql@postgresql.org>
Sent: Tuesday, December 20, 2005 10:41 PM
Subject: Re: [SQL] Does VACUUM reorder tables on clustered indices

Show quoted text

On Sun, Dec 18, 2005 at 07:01:39PM -0300, Alvaro Herrera wrote:

Martin Marques escribi?:

On Sun, 18 Dec 2005, frank church wrote:

Does VACUUMing reorder tables on clustered indices or is it only the
CLUSTER
command that can do that?

Cluster does that. Vacuum only cleans dead tuples from the tables.

Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.

It also does a REINDEX...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

#10Jim C. Nasby
jnasby@pervasive.com
In reply to: Noname (#9)
Re: Does VACUUM reorder tables on clustered indices

On Wed, Dec 21, 2005 at 12:34:12AM +0100, ipv@tinet.org wrote:

Hi,

Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data.

Why would you vacuum when cluster is just going to wipe out the dead
tuples anyway?

Note that while reordering, CLUSTER also gets rid of dead tuples, so if
you cluster you don't need to vacuum.

It also does a REINDEX...

--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#11Chris Browne
cbbrowne@acm.org
In reply to: frank church (#1)
Re: Does VACUUM reorder tables on clustered indices

jnasby@pervasive.com ("Jim C. Nasby") writes:

On Wed, Dec 21, 2005 at 12:34:12AM +0100, ipv@tinet.org wrote:

Hi,

Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data.

Why would you vacuum when cluster is just going to wipe out the dead
tuples anyway?

There is one reason to VACUUM before running CLUSTER...

That is that VACUUM will be *guaranteed* to draw all the pages into memory.

Subsequently, you can be certain that the pages are in cache, and that
the CLUSTER should need to do minimal I/O to read data into memory.

If I'm considering clustering the Slony-I "sl_log_1" table, forcing it
into memory *is* something I'll consider doing in order to minimize
the time that would-be writers are blocked from writing...
--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://cbbrowne.com/info/multiplexor.html
Never criticize anybody until you have walked a mile in their shoes,
because by that time you will be a mile away and have their shoes.
-- email sig, Brian Servis

#12Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Chris Browne (#11)
Re: Does VACUUM reorder tables on clustered indices

Chris Browne wrote:

jnasby@pervasive.com ("Jim C. Nasby") writes:

On Wed, Dec 21, 2005 at 12:34:12AM +0100, ipv@tinet.org wrote:

Hi,

Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data.

Why would you vacuum when cluster is just going to wipe out the dead
tuples anyway?

There is one reason to VACUUM before running CLUSTER...

That is that VACUUM will be *guaranteed* to draw all the pages into memory.

Subsequently, you can be certain that the pages are in cache, and that
the CLUSTER should need to do minimal I/O to read data into memory.

If I'm considering clustering the Slony-I "sl_log_1" table, forcing it
into memory *is* something I'll consider doing in order to minimize
the time that would-be writers are blocked from writing...

Why don't you just do SELECT * FROM tab WHERE col != 'lkjasdflkjadsf'.
That should pull things into memory without the VACUUM overhead.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#13Andrew Sullivan
ajs@crankycanuck.ca
In reply to: Chris Browne (#11)
Re: Does VACUUM reorder tables on clustered indices

On Wed, Dec 21, 2005 at 06:36:45PM -0500, Chris Browne wrote:

If I'm considering clustering the Slony-I "sl_log_1" table, forcing it
into memory *is* something I'll consider doing in order to minimize
the time that would-be writers are blocked from writing...

Given what Tom Lane recently reported (and assuming I understood his
remarks), I think it's a Mighty Bad Idea to CLUSTER sl_log_1.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.
--Brad Holland

#14Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#7)
Re: Does VACUUM reorder tables on clustered indices

Tom, has this bug been addressed or documented?

---------------------------------------------------------------------------

Tom Lane wrote:

Andrew Sullivan <ajs@crankycanuck.ca> writes:

On Sun, Dec 18, 2005 at 10:08:22PM -0500, Tom Lane wrote:

Just for the record, that behavior is seriously broken: it violates
MVCC if any of the deleted tuples are still visible to anyone else.

Does it remove tuples that VACUUM FULL wouldn't?

Yes. CLUSTER works on SnapshotNow, so it will remove committed-dead
tuples even if there are still open transactions that could see them.
Of course, said transactions couldn't be actively using the table
while the CLUSTER runs, because it takes an exclusive table lock.
But they *could* look at it afterwards. Offhand I think you'd only
be likely to notice the difference if the open transactions were
SERIALIZABLE --- in READ COMMITTED mode, by the time they could look
at the clustered table, they'd likely be using a snapshot that postdates
the DELETE.

[ experiments a bit... ] Hmm. Actually, it's far worse than I
thought. It looks like CLUSTER puts the tuples into the new table with
its own xid, which means that concurrent serializable transactions will
see the new table as completely empty!

<< session 1 >>

regression=# select * from int4_tbl;
f1
-------------
0
123456
-123456
2147483647
-2147483647
(5 rows)

regression=# create index fooi on int4_tbl(f1);
CREATE INDEX
regression=# begin isolation level serializable;
BEGIN
regression=# select 2+2; -- establish transaction snapshot
?column?
----------
4
(1 row)

<< session 2 >>

regression=# delete from int4_tbl where f1 = -123456;
DELETE 1
regression=# cluster fooi on int4_tbl;
CLUSTER

<< back to session 1 >>

regression=# select * from int4_tbl;
f1
----
(0 rows)

regression=# commit;
COMMIT
regression=# select * from int4_tbl;
f1
-------------
-2147483647
0
123456
2147483647
(4 rows)

regards, tom lane

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#14)
Re: Does VACUUM reorder tables on clustered indices

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom, has this bug been addressed or documented?

No. Please add a TODO:

* Make CLUSTER preserve recently-dead tuples per MVCC requirements

I have not tested, but I suspect the table-rewriting variants of ALTER TABLE
have the same problem.

regards, tom lane

#16Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#15)
Re: Does VACUUM reorder tables on clustered indices

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom, has this bug been addressed or documented?

No. Please add a TODO:

* Make CLUSTER preserve recently-dead tuples per MVCC requirements

I have not tested, but I suspect the table-rewriting variants of ALTER TABLE
have the same problem.

Added.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073