Sudden increase in n_dead_tup with no corresponding insert/update/delete

Started by Matthew Tice10 months ago4 messagesgeneral
Jump to latest
#1Matthew Tice
mjtice@gmail.com

Hi all,

While investigating some potential vacuum improvements to make to a table I happened to notice that one table (along with others) will suddenly increase the number of n_dead_tup reported in pg_stat_user_tables without a corresponding increase in the inserts, updates, or deletes.

For instance, running this query in a 1 second loop

select * from pg_stat_user_tables where relname = 'casino_account_history_lines';

I can see the n_dead_tup column increases until which time the autovacuum process finishes vacuuming the table. Example:

-[ RECORD 1 ]-------+------------------------------
relid | 33378
schemaname | public
relname | casino_account_history_lines
seq_scan | 1122
seq_tup_read | 178229588443
idx_scan | 456779105
idx_tup_fetch | 5539267637
n_tup_ins | 45093031
n_tup_upd | 47289203
n_tup_del | 0
n_tup_hot_upd | 0
n_live_tup | 1646966715
n_dead_tup | 1356331
n_mod_since_analyze | 11498
n_ins_since_vacuum | 6288
last_vacuum | 2025-06-03 14:57:43.46009+00
last_autovacuum | 2025-06-03 19:09:21.595322+00
last_analyze | 2025-06-03 14:57:54.848185+00
last_autoanalyze | 2025-06-03 19:09:48.390396+00
vacuum_count | 2
autovacuum_count | 3973
analyze_count | 6
autoanalyze_count | 3078

--
-- At this point the table is no longer in pg_stat_progress_vacuum and `n_dead_tup` has dropped from 1356331 to 4302
--

-[ RECORD 1 ]-------+------------------------------
relid | 33378
schemaname | public
relname | casino_account_history_lines
seq_scan | 1122
seq_tup_read | 178229588443
idx_scan | 456779364
idx_tup_fetch | 5539267804
n_tup_ins | 45093063
n_tup_upd | 47289232
n_tup_del | 0
n_tup_hot_upd | 0
n_live_tup | 1646961282
n_dead_tup | 4302
n_mod_since_analyze | 11559
n_ins_since_vacuum | 2
last_vacuum | 2025-06-03 14:57:43.46009+00
last_autovacuum | 2025-06-03 19:12:48.107816+00
last_analyze | 2025-06-03 14:57:54.848185+00
last_autoanalyze | 2025-06-03 19:09:48.390396+00
vacuum_count | 2
autovacuum_count | 3974
analyze_count | 6
autoanalyze_count | 3078

--

This seems normal to me, however, while still looking at pg_stat_user_tables in a loop, `n_dead_tup` steadily increases to, in this latest run, `5038` at which point, one second later the number jumps to above 1.2 million:

Tue 03 Jun 2025 07:13:11 PM UTC (every 1s)

-[ RECORD 1 ]-------+------------------------------
relid | 33378
schemaname | public
relname | casino_account_history_lines
seq_scan | 1122
seq_tup_read | 178229588443
idx_scan | 456784246
idx_tup_fetch | 5539271612
n_tup_ins | 45093719
n_tup_upd | 47289968
n_tup_del | 0
n_tup_hot_upd | 0
n_live_tup | 1646961938
n_dead_tup | 5038
n_mod_since_analyze | 12951
n_ins_since_vacuum | 658
last_vacuum | 2025-06-03 14:57:43.46009+00
last_autovacuum | 2025-06-03 19:12:48.107816+00
last_analyze | 2025-06-03 14:57:54.848185+00
last_autoanalyze | 2025-06-03 19:09:48.390396+00
vacuum_count | 2
autovacuum_count | 3974
analyze_count | 6
autoanalyze_count | 3078

Tue 03 Jun 2025 07:13:12 PM UTC (every 1s)

-[ RECORD 1 ]-------+------------------------------
relid | 33378
schemaname | public
relname | casino_account_history_lines
seq_scan | 1122
seq_tup_read | 178229588443
idx_scan | 456784464
idx_tup_fetch | 5539271752
n_tup_ins | 45093746
n_tup_upd | 47289993
n_tup_del | 0
n_tup_hot_upd | 0
n_live_tup | 1647255972
n_dead_tup | 1290579
n_mod_since_analyze | 2
n_ins_since_vacuum | 685
last_vacuum | 2025-06-03 14:57:43.46009+00
last_autovacuum | 2025-06-03 19:12:48.107816+00
last_analyze | 2025-06-03 14:57:54.848185+00
last_autoanalyze | 2025-06-03 19:13:12.125828+00
vacuum_count | 2
autovacuum_count | 3974
analyze_count | 6
autoanalyze_count | 3079

I don't understand where this large increase is coming from when there are no corresponding inserts, updates, or deletes (at the magnitude). This entire process repeats itself and, as mentioned, the same thing is happening on other observed tables.

I'm running version 'PostgreSQL 15.6 (Ubuntu 15.6-1.pgdg22.04+1)'

Thanks,
Matt

#2David Rowley
dgrowleyml@gmail.com
In reply to: Matthew Tice (#1)
Re: Sudden increase in n_dead_tup with no corresponding insert/update/delete

On Wed, 4 Jun 2025 at 07:22, Matthew Tice <mjtice@gmail.com> wrote:

Tue 03 Jun 2025 07:13:11 PM UTC (every 1s)
n_dead_tup | 5038
autoanalyze_count | 3078

Tue 03 Jun 2025 07:13:12 PM UTC (every 1s)
n_dead_tup | 1290579
autoanalyze_count | 3079

I don't understand where this large increase is coming from when there are no corresponding inserts, updates, or deletes (at the magnitude). This entire process repeats itself and, as mentioned, the same thing is happening on other observed tables.

I imagine it's from the auto-analyze that ran. Analyze will try to
estimate the live and dead rows, but since analyze only samples some
blocks, it may come up with something that's not too accurate if the
blocks it happened to sample don't contain similar percentages of dead
rows than the entire table.

See [1]https://github.com/postgres/postgres/blob/REL_15_STABLE/src/backend/commands/analyze.c#L1318.

David

[1]: https://github.com/postgres/postgres/blob/REL_15_STABLE/src/backend/commands/analyze.c#L1318

#3Matthew Tice
mjtice@gmail.com
In reply to: David Rowley (#2)
Re: Sudden increase in n_dead_tup with no corresponding insert/update/delete

On Jun 3, 2025, at 6:23 PM, David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 4 Jun 2025 at 07:22, Matthew Tice <mjtice@gmail.com> wrote:

Tue 03 Jun 2025 07:13:11 PM UTC (every 1s)
n_dead_tup | 5038
autoanalyze_count | 3078

Tue 03 Jun 2025 07:13:12 PM UTC (every 1s)
n_dead_tup | 1290579
autoanalyze_count | 3079

I don't understand where this large increase is coming from when there are no corresponding inserts, updates, or deletes (at the magnitude). This entire process repeats itself and, as mentioned, the same thing is happening on other observed tables.

I imagine it's from the auto-analyze that ran. Analyze will try to
estimate the live and dead rows, but since analyze only samples some
blocks, it may come up with something that's not too accurate if the
blocks it happened to sample don't contain similar percentages of dead
rows than the entire table.

See [1].

David

[1] https://github.com/postgres/postgres/blob/REL_15_STABLE/src/backend/commands/analyze.c#L1318

Thanks, David.

This table is relatively large (1.6B records, 1.5TB, 38 columns). The `default_statistics_target` is set to 300 - so I think that 90000 may not be enough to gather accurate statistics.

#4Ron
ronljohnsonjr@gmail.com
In reply to: Matthew Tice (#3)
Re: Sudden increase in n_dead_tup with no corresponding insert/update/delete

Note also that 15.6 is about 18 months old. Upgrading really does only
take a few minutes, if you download the binaries before installation.

On Wed, Jun 4, 2025 at 2:37 PM Matthew Tice <mjtice@gmail.com> wrote:

On Jun 3, 2025, at 6:23 PM, David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 4 Jun 2025 at 07:22, Matthew Tice <mjtice@gmail.com> wrote:

Tue 03 Jun 2025 07:13:11 PM UTC (every 1s)
n_dead_tup | 5038
autoanalyze_count | 3078

Tue 03 Jun 2025 07:13:12 PM UTC (every 1s)
n_dead_tup | 1290579
autoanalyze_count | 3079

I don't understand where this large increase is coming from when there

are no corresponding inserts, updates, or deletes (at the magnitude). This
entire process repeats itself and, as mentioned, the same thing is
happening on other observed tables.

I imagine it's from the auto-analyze that ran. Analyze will try to
estimate the live and dead rows, but since analyze only samples some
blocks, it may come up with something that's not too accurate if the
blocks it happened to sample don't contain similar percentages of dead
rows than the entire table.

See [1].

David

[1]

https://github.com/postgres/postgres/blob/REL_15_STABLE/src/backend/commands/analyze.c#L1318

Thanks, David.

This table is relatively large (1.6B records, 1.5TB, 38 columns). The
`default_statistics_target` is set to 300 - so I think that 90000 may not
be enough to gather accurate statistics.

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!