incoherent dead tuples between pg_stat_user_tables and pgstattuple?

Started by Luca Ferrariover 6 years ago5 messagesgeneral
Jump to latest
#1Luca Ferrari
fluca1978@gmail.com

Hello,
running 11.5 I've some misconception about pgstatuple: it seems it
lags behind normal statistics.
I've almost double a table with an update.
The result is:

testdb=# select n_live_tup, n_dead_tup, last_autovacuum,
last_autoanalyze from pg_stat_user_tables where relname = 'foo';
-[ RECORD 1 ]----+------------------------------
n_live_tup | 5021619
n_dead_tup | 5000000
last_autovacuum | 2019-09-27 12:05:36.072376+02
last_autoanalyze | 2019-09-27 12:06:16.538413+02

but pgstatuple provides a quite different output, saying the table is
not full of dead rows:

testdb=# SELECT * FROM pgstattuple( 'foo' );
-[ RECORD 1 ]------+----------
table_len | 930897920
tuple_count | 5000000
tuple_len | 439595972
tuple_percent | 47.22
dead_tuple_count | 88
dead_tuple_len | 7744
dead_tuple_percent | 0
free_space | 447716392
free_percent | 48.1

Even after running an ANALYZE, pgstattuple provides the same results.
After a VACUUM FULL ANALYZE the world is as I would expect it to be:

testdb=# select n_live_tup, n_dead_tup, last_autovacuum,
last_autoanalyze from pg_stat_user_tables where relname = 'foo';
-[ RECORD 1 ]----+------------------------------
n_live_tup | 5000117
n_dead_tup | 0
last_autovacuum | 2019-09-27 12:05:36.072376+02
last_autoanalyze | 2019-09-27 12:06:16.538413+02

Time: 16,159 ms
testdb=# SELECT * FROM pgstattuple( 'foo' );
-[ RECORD 1 ]------+----------
table_len | 465453056
tuple_count | 5000000
tuple_len | 439595972
tuple_percent | 94.44
dead_tuple_count | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 3870144
free_percent | 0.83

Am I missing something?

Thanks,
Luca

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Luca Ferrari (#1)
Re: incoherent dead tuples between pg_stat_user_tables and pgstattuple?

Luca Ferrari <fluca1978@gmail.com> writes:

running 11.5 I've some misconception about pgstatuple: it seems it
lags behind normal statistics.

n_live_tup/n_dead_tup should not be thought to be better than
approximations. Most operations adjust them only incrementally,
and messages to the stats collector can get dropped entirely
under heavy load, causing deltas to go missing. There are
also race conditions involved in some update scenarios.

Even after running an ANALYZE, pgstattuple provides the same results.
After a VACUUM FULL ANALYZE the world is as I would expect it to be:

VACUUM FULL is one of very few operations that reset those counters
to guaranteed-correct values (and I'm not sure the guarantee is
ironclad even in that case).

It's very hard to improve on this without giving up the desirable
ability to have concurrent table updates. If you really want an
accurate row count, COUNT(*) or pgstattuples will give you a more
reliable estimate ... at much higher cost, of course.

regards, tom lane

#3Luca Ferrari
fluca1978@gmail.com
In reply to: Tom Lane (#2)
Re: incoherent dead tuples between pg_stat_user_tables and pgstattuple?

On Fri, Sep 27, 2019 at 3:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

n_live_tup/n_dead_tup should not be thought to be better than
approximations. Most operations adjust them only incrementally,
and messages to the stats collector can get dropped entirely
under heavy load, causing deltas to go missing. There are
also race conditions involved in some update scenarios.

Are you suggesting tha autovacuum made free space available so that
pg_stat_user_tables lags behind? Because I don't see last_autovacuum
changing in the mantime.

Luca

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Luca Ferrari (#3)
Re: incoherent dead tuples between pg_stat_user_tables and pgstattuple?

Luca Ferrari <fluca1978@gmail.com> writes:

On Fri, Sep 27, 2019 at 3:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

n_live_tup/n_dead_tup should not be thought to be better than
approximations. Most operations adjust them only incrementally,
and messages to the stats collector can get dropped entirely
under heavy load, causing deltas to go missing. There are
also race conditions involved in some update scenarios.

Are you suggesting tha autovacuum made free space available so that
pg_stat_user_tables lags behind? Because I don't see last_autovacuum
changing in the mantime.

I'm not suggesting any specific scenario, because you haven't presented
any evidence as to when those counts became off. I'm just saying that
there are plenty of possible mechanisms for them to become so.

regards, tom lane

#5Luca Ferrari
fluca1978@gmail.com
In reply to: Tom Lane (#4)
Re: incoherent dead tuples between pg_stat_user_tables and pgstattuple?

On Fri, Sep 27, 2019 at 4:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not suggesting any specific scenario, because you haven't presented
any evidence as to when those counts became off. I'm just saying that
there are plenty of possible mechanisms for them to become so.

So far autovacuum seems the main reason for pg_stattuple to be _ahead_
of pg_stats (I stand correct).
In fact, turning off autovacuum the pg_stattuple shows the exact
result as I would have expected, and probably the pg_stats I was
referring to with n_dead_tup > 0 was due to a previous UPDATE ran
against the same table.

Apparently this produces always the same results:

testdb=# CREATE TABLE foo( bar text );
CREATE TABLE
testdb=# INSERT INTO foo
SELECT generate_series( 1, 5000000 )::text;
INSERT 0 5000000
testdb=# ANALYZE foo;
ANALYZE

testdb=# show autovacuum;
-[ RECORD 1 ]---
autovacuum | off

testdb=# select current_timestamp;
-[ RECORD 1 ]-----+------------------------------
current_timestamp | 2019-09-27 20:44:39.277217+02

Time: 0,839 ms
testdb=# select n_live_tup, n_dead_tup, last_autovacuum,
last_autoanalyze from pg_stat_user_tables where relname = 'foo';
-[ RECORD 1 ]----+------------------------------
n_live_tup | 5000000
n_dead_tup | 0
last_autovacuum | 2019-09-27 20:29:18.571858+02
last_autoanalyze | 2019-09-27 20:29:40.889657+02

testdb=# UPDATE foo SET bar = lower( bar );
UPDATE 5000000

testdb=# select n_live_tup, n_dead_tup, last_autovacuum,
last_autoanalyze from pg_stat_user_tables where relname = 'foo';
-[ RECORD 1 ]----+------------------------------
n_live_tup | 5000000
n_dead_tup | 5000000
last_autovacuum | 2019-09-27 20:29:18.571858+02
last_autoanalyze | 2019-09-27 20:29:40.889657+02

testdb=# SELECT * FROM pgstattuple( 'foo' );
-[ RECORD 1 ]------+----------
table_len | 362479616
tuple_count | 5000000
tuple_len | 158888896
tuple_percent | 43.83
dead_tuple_count | 5000000
dead_tuple_len | 158888896
dead_tuple_percent | 43.83
free_space | 1240672
free_percent | 0.34