vacuum does not reclaim rows

Started by Tatsuo Ishiiover 22 years ago8 messages
#1Tatsuo Ishii
t-ishii@sra.co.jp

Hi,

Does anybody know why vacuum full does not relcaim deleted rows if a
open transaction which started before the deletion happens is running
even on a different database?

Here is an example:

T1: psql db1
T2: psql db2
T3: psql db2

T2: create table t1(i int); insert into t1 values(1);
T1: begin;
T2: delete from t1;
T3: vacuum full t1;

here vacuum will not reclaim deleted rows.

While tracking this, I found that GetOldestXmin(false) checks the
database id correctly:

if (allDbs || proc->databaseId == MyDatabaseId)

but after that it checks proc->xmin, where xmin may not be running on
the same database. I wonder if this is correct or not. Maybe we should
make sure that xmin is running on the same database if GetOldestXmin()
is called with its arg being set false? This is PostgreSQL 7.3.3.
--
Tatsuo Ishii

#2Kurt Roeckx
Q@ping.be
In reply to: Tatsuo Ishii (#1)
Re: vacuum does not reclaim rows

On Sun, Jul 06, 2003 at 12:00:39AM +0900, Tatsuo Ishii wrote:

Hi,

Does anybody know why vacuum full does not relcaim deleted rows if a
open transaction which started before the deletion happens is running
even on a different database?

On what do you base that it doesn't reclaim those rows?

Kurt

#3Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Kurt Roeckx (#2)
Re: vacuum does not reclaim rows

Does anybody know why vacuum full does not relcaim deleted rows if a
open transaction which started before the deletion happens is running
even on a different database?

On what do you base that it doesn't reclaim those rows?

test=# vacuum full verbose t1;
INFO: --Relation public.t1--
INFO: Pages 1: Changed 1, reaped 0, Empty 0, New 0; Tup 1: Vac 0, Keep/VTL 1/0, UnUsed 0, MinLen 32, MaxLen 32; Re-using: Free/Avail. Space 8136/8136; EndEmpty/Avail. Pages 0/1.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: Rel t1: Pages: 1 --> 1; Tuple(s) moved: 0.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

Here you can see "Pages: 1 --> 1; Tuple(s) moved: 0", it indicates
vacuum actually skips the table and does not reclaim those rows.

Here is a case when vacuum does reclaim.

test=# vacuum full verbose t1;
INFO: --Relation public.t1--
INFO: Pages 1: Changed 0, reaped 1, Empty 0, New 0; Tup 0: Vac 1, Keep/VTL 0/0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 8168/0; EndEmpty/Avail. Pages 1/0.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: Rel t1: Pages: 1 --> 0.
VACUUM
--
Tatsuo Ishii

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tatsuo Ishii (#1)
Re: vacuum does not reclaim rows

Tatsuo Ishii <t-ishii@sra.co.jp> writes:

but after that it checks proc->xmin, where xmin may not be running on
the same database. I wonder if this is correct or not. Maybe we should
make sure that xmin is running on the same database

How would you know? (At the time you are looking, it's quite possible
the other guy's xmin doesn't exist anymore.) In any case you can't just
arbitrarily ignore the other guy's xmin, since it's a proxy for
subsequent transaction IDs as well, and those might be in any database.

It might be possible to do something by having each proc store both
a "local" and a "global" xmin computed as of its current xid start,
but I haven't really thought through the details. In any case, that
would be extra bookkeeping needed during every transaction start,
so I'd want to see proof of a generally-useful improvement in return.

On the whole I'm against changing this logic ... I think the odds
of breaking something are high, and the odds of making a useful
improvement low ...

regards, tom lane

#5Andrew Sullivan
andrew@libertyrms.info
In reply to: Tom Lane (#4)
Re: vacuum does not reclaim rows

On Sat, Jul 05, 2003 at 09:51:16PM -0400, Tom Lane wrote:

but I haven't really thought through the details. In any case, that
would be extra bookkeeping needed during every transaction start,
so I'd want to see proof of a generally-useful improvement in return.

For what it's worth, we have redesigned around this very problem,
because we had a table, vacuumed every 5 minutes, which was always >
50% dead tuples. Of course, we _were_ able to redesign around it,
but I'm not sure whether we just moved the problem to a new location.
We'll see in the upcoming weeks. (Our testing says no, but I always
figure there's _some_ strange client case I never thought of.)

A

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110
#6Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#4)
Re: vacuum does not reclaim rows

I think the big issue is that people think that if they have no one in a
specific database, that VACUUM FULL will completely remove unused
space, while this is not true if there are other backends connected to
other databases.

This might be a stupid question, but why does one backend have to care
about the global xmin at all? Isn't the local xmin the only important
value?

---------------------------------------------------------------------------

Tom Lane wrote:

Tatsuo Ishii <t-ishii@sra.co.jp> writes:

but after that it checks proc->xmin, where xmin may not be running on
the same database. I wonder if this is correct or not. Maybe we should
make sure that xmin is running on the same database

How would you know? (At the time you are looking, it's quite possible
the other guy's xmin doesn't exist anymore.) In any case you can't just
arbitrarily ignore the other guy's xmin, since it's a proxy for
subsequent transaction IDs as well, and those might be in any database.

It might be possible to do something by having each proc store both
a "local" and a "global" xmin computed as of its current xid start,
but I haven't really thought through the details. In any case, that
would be extra bookkeeping needed during every transaction start,
so I'd want to see proof of a generally-useful improvement in return.

On the whole I'm against changing this logic ... I think the odds
of breaking something are high, and the odds of making a useful
improvement low ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#6)
Re: vacuum does not reclaim rows

Bruce Momjian <pgman@candle.pha.pa.us> writes:

This might be a stupid question, but why does one backend have to care
about the global xmin at all? Isn't the local xmin the only important
value?

Not when it's looking at shared tables.

Do you want to try to make the tqual code aware of whether it's looking
at a shared or nonshared table? I don't honestly think it's worth it...

regards, tom lane

#8Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#7)
Re: vacuum does not reclaim rows

Oh, shared tables --- I forgot about those.

---------------------------------------------------------------------------

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

This might be a stupid question, but why does one backend have to care
about the global xmin at all? Isn't the local xmin the only important
value?

Not when it's looking at shared tables.

Do you want to try to make the tqual code aware of whether it's looking
at a shared or nonshared table? I don't honestly think it's worth it...

regards, tom lane

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073