Error correction for n_dead_tuples

Started by ITAGAKI Takahiroover 18 years ago2 messages
#1ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp

Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm concerned that this one creates
an open-loop behavior in which the n_dead_tuples estimate will diverge
arbitrarily far from reality over time. I criticized the original
proposal on that basis, and I'm not convinced this version fixes it,
because of the fact that stats counter updates occur much later than the
actions they count. (My recent patch to rate-limit tabstat messages made
that problem worse, but it existed anyway.) What might make sense is for
vacuum to count the number of dead-but-not-removable tuples it skips over,
and apply that as the value of n_dead_tuples on receipt of the vacuum
message (instead of setting to zero as now). This is likely to be wrong
with respect to the actions of transactions running concurrently with the
vacuum, but I think so is the proposed patch; and at least in this form
the error certainly cannot accumulate across vacuum cycles.

In my understanding, there are two proposal to change the way of updating
n_dead_tuples by vacuum presently:

Set n_dead_tuples to the number of
1. unvacuumable tuples the vacuum has seen
2. dead tuples reported to stats collector after the beginning of vacuum
at the end of vacuum.

Both methods don't accumulate errors across vacuum cycles, because dead
tuples statistics at the beginning of vacuum is cleared in both of them.
Also, if there is no background updates, the n_dead_tuples is certainly
set to zero.

I think the 2nd is better. If we update or delete tuples in the pages
that have been already scanned by vacuum, the vaccum cannot see the
newly created dead tuples. The vacuum could report fewer number as
the unvacuumable tuples.

The following is a test with a patch of the 2nd fix. The vacuum reports
'960 dead row versions cannot be removed yet', that is used in 1st method,
but the actual dead tuples are 2000, that is used in 2nd method.
...so I'll propose the 2nd method again.

Comments welcome.

# SELECT n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'accounts';
n_live_tup | n_dead_tup
------------+------------
100000 | 0

# vacuum verbose accounts; (and 'pgbench -n -N -t2000' concurrently)
INFO: vacuuming "public.accounts"
INFO: index "accounts_pkey" now contains 100778 row versions in 276 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.70 sec.
INFO: "accounts": found 0 removable, 100778 nonremovable row versions in 1679 pages
DETAIL: 960 dead row versions cannot be removed yet.
There were 1626 unused item pointers.
876 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 5.78 sec.
VACUUM

# SELECT n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'accounts';
n_live_tup | n_dead_tup
------------+------------
100000 | 2000

# SELECT tuple_count, dead_tuple_count FROM pgstattuple('accounts');
tuple_count | dead_tuple_count
-------------+------------------
100000 | 2000

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: ITAGAKI Takahiro (#1)
Re: Error correction for n_dead_tuples

I think this patch needs to be revisited in the light of the work I just
did to make pgstats correctly track the effects of aborted transactions.
It seems fairly likely to me that most of the problem you were seeing
was actually due to that stupidity, and not to race conditions between
VACUUM and concurrent transactions. So maybe we no longer need to do
anything.

It might be worth fixing things so that a VACUUM sets n_dead_tuples to
the number of deleted-but-not-removable tuples it saw, rather than
always setting to zero as is presently done.

regards, tom lane