Why doesn't autovacuum/analyze run in due time after calling pg_stat_reset?
Hi, all. After calling pg_stat_reset all statistics used by autovacuum
got zeroed, and started accumulating from scratch. Some tables get acted
upon properly, some don't.
foo=> select
relname,n_live_tup,n_dead_tup,cast(round(n_live_tup*cast(current_setting('autovacuum_vacuum_scale_factor')
as float)+current_setting('autovacuum_vacuum_threshold')::int) as
int)-n_dead_tup as left, n_mod_since_analyz
e,cast(round(n_live_tup*cast(current_setting('autovacuum_analyze_scale_factor')
as float)+current_setting('autovacuum_analyze_threshold')::int) as
int)-n_mod_since_analyze as
left,n_ins_since_vacuum,last_autovacuum,last_autoanalyze from p
g_stat_all_tables where n_live_tup>0 and
n_dead_tup>=current_setting('autovacuum_vacuum_threshold')::int order by
8 nulls first, 4;
relname | n_live_tup | n_dead_tup | left |
n_mod_since_analyze | left | n_ins_since_vacuum |
last_autovacuum | last_autoanalyze
--------------------------+------------+------------+----------+---------------------+---------+--------------------+-------------------------------+-------------------------------
fooooooooooo | 32781 | 240663 | -234057 |
513265 | -509937 | 270291 |
|
fooo | 40 | 24891 | -24833 |
49822 | -49768 | 24931 |
|
foooo | 46 | 18991 | -18932 |
19099 | -19044 | 46 |
|
fooooo | 1 | 12687 | -12637 |
40795 | -40745 | 1 |
|
fooooooooooooo | 2393 | 11115 | -10586 |
137599 | -137310 | 2393 |
|
fooooooooo | 9465 | 11919 | -9976 |
352888 | -351892 | 9466 |
|
fooooooooooooooooooooooo | 26 | 2558 | -2503 |
188 | -135 | 2584 |
|
user_sessions | 118 | 1231 | -1157 |
19114 | -19052 | 118 |
|
fooooooooooooooooo | 32 | 562 | -506 |
226 | -173 | 594 |
|
fooooooo | 53 | 537 | -476 |
644 | -589 | 53 |
|
fooooooooooooo | 327 | 524 | -409 |
804 | -721 | 520 |
|
foooooooooooooooo | 46 | 104 | -45 |
457 | -402 | 183 |
|
foooooooooooooooooo | 34 | 93 | -36 |
158 | -105 | 34 |
|
foooooooooooooooo | 47 | 95 | -36 |
364 | -309 | 47 |
|
fooooooooooooooooo | 84 | 91 | -24 |
177 | -119 | 84 |
|
foooooooo | 290504401 | 9540832 | 48560098 |
26663449 | 2387041 | 8319194 | 2023-08-17
08:03:19.159161+00 | 2023-08-17 07:43:54.982375+00
fooooooooooooooooo | 43449 | 3823 | 4917 |
4190 | 205 | 377 | 2023-08-17
08:31:14.5573+00 | 2023-08-17 08:31:15.19454+00
fooooooooooooooooooooo | 3913 | 715 | 118 |
200 | 241 | 0 | 2023-08-17
08:44:04.721272+00 | 2023-08-17 08:47:04.875214+00
foooooooooooooooo | 73 | 63 | 2 |
31 | 26 | 35 | 2023-08-17
08:45:05.115824+00 | 2023-08-17 08:47:05.032251+00
foooooooooooo | 790249 | 126240 | 31860 |
4149 | 74926 | 119413 | 2023-08-17
08:46:10.625275+00 | 2023-08-17 08:49:04.867481+00
pg_shdepend | 1885 | 286 | 141 |
116 | 122 | 270 | 2023-08-17
08:47:03.638639+00 | 2023-08-17 08:48:42.903096+00
pg_index | 993 | 79 | 170 |
10 | 139 | 72 | 2023-08-17
08:48:03.67267+00 | 2023-08-17 08:49:03.723851+00
pg_depend | 9779 | 1027 | 979 |
130 | 898 | 923 | 2023-08-17
08:48:03.743417+00 | 2023-08-17 08:49:03.740913+00
foooooooooooooo | 43699 | 2352 | 6438 |
3527 | 893 | 1175 | 2023-08-17
08:48:03.84116+00 | 2023-08-17 08:48:03.93689+00
pg_attribute | 12478 | 432 | 2114 |
480 | 818 | 288 | 2023-08-17
08:49:03.558385+00 | 2023-08-17 08:49:03.652786+00
foooooooooooooooo | 3717 | 890 | -97 |
893 | -471 | 370 | 2023-08-17
08:49:04.941033+00 | 2023-08-17 08:49:04.94695+00
Juding by columns named "left", tables having negative values should be
vacuumed or analyzed, but they aren't.
The threshold for vacuuming is calculated as n_live_tup multiplied by
autovacuum_vacuum_scale_factore plus autovacuum_vacuum_threshold minus
n_dead_tup.
Accordingly for the analyze threshold.
Is there some kind of minimum n_live_tup that it must reach before acing
on the table? Those values are very low, they don't reflect the real
table size until after vacuum or analyze actually runs.
Thanks for any tips.
On 8/17/23 13:01, rihad wrote:
Hi, all. After calling pg_stat_reset all statistics used by autovacuum
got zeroed, and started accumulating from scratch. Some tables get
acted upon properly, some don't.
Self-replying: yup, it seems there's an arbitrary limit of 100K of
n_live_tup after which autovac/analyze kicks in, or it seems so.
There are still many tables waiting for their turn, which is long due.
Although there are some tables having only 60-70 (not 60-70K) n_live_tup
that have had autovacuum run on them. Weird.
Show quoted text
foo=> select
relname,n_live_tup,n_dead_tup,cast(round(n_live_tup*cast(current_setting('autovacuum_vacuum_scale_factor')
as float)+current_setting('autovacuum_vacuum_threshold')::int) as
int)-n_dead_tup as left, n_mod_since_analyz
e,cast(round(n_live_tup*cast(current_setting('autovacuum_analyze_scale_factor')
as float)+current_setting('autovacuum_analyze_threshold')::int) as
int)-n_mod_since_analyze as
left,n_ins_since_vacuum,last_autovacuum,last_autoanalyze from p
g_stat_all_tables where n_live_tup>0 and
n_dead_tup>=current_setting('autovacuum_vacuum_threshold')::int order
by 8 nulls first, 4;
relname | n_live_tup | n_dead_tup | left |
n_mod_since_analyze | left | n_ins_since_vacuum |
last_autovacuum | last_autoanalyze
--------------------------+------------+------------+----------+---------------------+---------+--------------------+-------------------------------+-------------------------------fooooooooooo | 32781 | 240663 | -234057 |
513265 | -509937 | 270291 |
|
fooo | 40 | 24891 | -24833 |
49822 | -49768 | 24931 |
|
foooo | 46 | 18991 | -18932 |
19099 | -19044 | 46 |
|
fooooo | 1 | 12687 | -12637 |
40795 | -40745 | 1 |
|
fooooooooooooo | 2393 | 11115 | -10586 |
137599 | -137310 | 2393 |
|
fooooooooo | 9465 | 11919 | -9976 |
352888 | -351892 | 9466 |
|
fooooooooooooooooooooooo | 26 | 2558 | -2503 |
188 | -135 | 2584 |
|
user_sessions | 118 | 1231 | -1157 |
19114 | -19052 | 118 |
|
fooooooooooooooooo | 32 | 562 | -506 |
226 | -173 | 594 |
|
fooooooo | 53 | 537 | -476 |
644 | -589 | 53 |
|
fooooooooooooo | 327 | 524 | -409 |
804 | -721 | 520 |
|
foooooooooooooooo | 46 | 104 | -45 |
457 | -402 | 183 |
|
foooooooooooooooooo | 34 | 93 | -36 |
158 | -105 | 34 |
|
foooooooooooooooo | 47 | 95 | -36 |
364 | -309 | 47 |
|
fooooooooooooooooo | 84 | 91 | -24 |
177 | -119 | 84 |
|
foooooooo | 290504401 | 9540832 | 48560098 |
26663449 | 2387041 | 8319194 | 2023-08-17
08:03:19.159161+00 | 2023-08-17 07:43:54.982375+00
fooooooooooooooooo | 43449 | 3823 | 4917 |
4190 | 205 | 377 | 2023-08-17
08:31:14.5573+00 | 2023-08-17 08:31:15.19454+00
fooooooooooooooooooooo | 3913 | 715 | 118 |
200 | 241 | 0 | 2023-08-17
08:44:04.721272+00 | 2023-08-17 08:47:04.875214+00
foooooooooooooooo | 73 | 63 | 2 |
31 | 26 | 35 | 2023-08-17
08:45:05.115824+00 | 2023-08-17 08:47:05.032251+00
foooooooooooo | 790249 | 126240 | 31860 |
4149 | 74926 | 119413 | 2023-08-17
08:46:10.625275+00 | 2023-08-17 08:49:04.867481+00
pg_shdepend | 1885 | 286 | 141 |
116 | 122 | 270 | 2023-08-17
08:47:03.638639+00 | 2023-08-17 08:48:42.903096+00
pg_index | 993 | 79 | 170 |
10 | 139 | 72 | 2023-08-17
08:48:03.67267+00 | 2023-08-17 08:49:03.723851+00
pg_depend | 9779 | 1027 | 979 |
130 | 898 | 923 | 2023-08-17
08:48:03.743417+00 | 2023-08-17 08:49:03.740913+00
foooooooooooooo | 43699 | 2352 | 6438 |
3527 | 893 | 1175 | 2023-08-17
08:48:03.84116+00 | 2023-08-17 08:48:03.93689+00
pg_attribute | 12478 | 432 | 2114 |
480 | 818 | 288 | 2023-08-17
08:49:03.558385+00 | 2023-08-17 08:49:03.652786+00
foooooooooooooooo | 3717 | 890 | -97 |
893 | -471 | 370 | 2023-08-17
08:49:04.941033+00 | 2023-08-17 08:49:04.94695+00Juding by columns named "left", tables having negative values should
be vacuumed or analyzed, but they aren't.The threshold for vacuuming is calculated as n_live_tup multiplied by
autovacuum_vacuum_scale_factore plus autovacuum_vacuum_threshold minus
n_dead_tup.Accordingly for the analyze threshold.
Is there some kind of minimum n_live_tup that it must reach before
acing on the table? Those values are very low, they don't reflect the
real table size until after vacuum or analyze actually runs.Thanks for any tips.
On 8/18/23 22:35, Rihad wrote:
On 8/17/23 13:01, rihad wrote:
Hi, all. After calling pg_stat_reset all statistics used by autovacuum
got zeroed, and started accumulating from scratch. Some tables get
acted upon properly, some don't.Self-replying: yup, it seems there's an arbitrary limit of 100K of
n_live_tup after which autovac/analyze kicks in, or it seems so.
To know rather then guess read:
https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
There are still many tables waiting for their turn, which is long due.
Although there are some tables having only 60-70 (not 60-70K) n_live_tup
that have had autovacuum run on them. Weird.
--
Adrian Klaver
adrian.klaver@aklaver.com
On 8/20/23 20:22, Adrian Klaver wrote:
On 8/18/23 22:35, Rihad wrote:
On 8/17/23 13:01, rihad wrote:
Hi, all. After calling pg_stat_reset all statistics used by
autovacuum got zeroed, and started accumulating from scratch. Some
tables get acted upon properly, some don't.Self-replying: yup, it seems there's an arbitrary limit of 100K of
n_live_tup after which autovac/analyze kicks in, or it seems so.To know rather then guess read:
https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
Sure, I read it before asking.
Taking the first table in the list as an example:
relname | n_live_tup | n_dead_tup | left |
n_mod_since_analyze | left
--------------------------+------------+------------+----------+---------------------+--------
fooooooooooo | 32781 | 240663 | -234057 |
513265 | -509937
n_dead_tup (not the actual value, but some time after calling
pg_stat_reset) is much larger than 20% of n_live_tup 32781, and
n_mod_since_analyze is much larger than 10% of it.
Yet it is kept unvacuumed and unanalyzed for a long time.
autovacuum_(vacuum|analyze)_threshold is 50.
What am I missing?
Show quoted text
There are still many tables waiting for their turn, which is long due.
Although there are some tables having only 60-70 (not 60-70K)
n_live_tup that have had autovacuum run on them. Weird.
On 8/20/23 12:10, Rihad wrote:
On 8/20/23 20:22, Adrian Klaver wrote:
On 8/18/23 22:35, Rihad wrote:
On 8/17/23 13:01, rihad wrote:
Hi, all. After calling pg_stat_reset all statistics used by
autovacuum got zeroed, and started accumulating from scratch. Some
tables get acted upon properly, some don't.Self-replying: yup, it seems there's an arbitrary limit of 100K of
n_live_tup after which autovac/analyze kicks in, or it seems so.To know rather then guess read:
https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
Sure, I read it before asking.
Taking the first table in the list as an example:
relname | n_live_tup | n_dead_tup | left |
n_mod_since_analyze | left
--------------------------+------------+------------+----------+---------------------+--------
fooooooooooo | 32781 | 240663 | -234057 |
513265 | -509937n_dead_tup (not the actual value, but some time after calling
pg_stat_reset) is much larger than 20% of n_live_tup 32781, and
n_mod_since_analyze is much larger than 10% of it.Yet it is kept unvacuumed and unanalyzed for a long time.
autovacuum_(vacuum|analyze)_threshold is 50.
What am I missing?
Hard to say without seeing the actual settings in postgresql.conf that
match:
Most importantly:
autovacuum
and
track_counts
https://www.postgresql.org/docs/current/runtime-config-statistics.html#GUC-TRACK-COUNTS
There are still many tables waiting for their turn, which is long due.
Although there are some tables having only 60-70 (not 60-70K)
n_live_tup that have had autovacuum run on them. Weird.
--
Adrian Klaver
adrian.klaver@aklaver.com
On 8/21/23 00:15, Adrian Klaver wrote:
On 8/20/23 12:10, Rihad wrote:
On 8/20/23 20:22, Adrian Klaver wrote:
On 8/18/23 22:35, Rihad wrote:
On 8/17/23 13:01, rihad wrote:
Hi, all. After calling pg_stat_reset all statistics used by
autovacuum got zeroed, and started accumulating from scratch. Some
tables get acted upon properly, some don't.Self-replying: yup, it seems there's an arbitrary limit of 100K of
n_live_tup after which autovac/analyze kicks in, or it seems so.To know rather then guess read:
https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
Sure, I read it before asking.
Taking the first table in the list as an example:
relname | n_live_tup | n_dead_tup | left |
n_mod_since_analyze | left
--------------------------+------------+------------+----------+---------------------+--------fooooooooooo | 32781 | 240663 | -234057 |
513265 | -509937n_dead_tup (not the actual value, but some time after calling
pg_stat_reset) is much larger than 20% of n_live_tup 32781, and
n_mod_since_analyze is much larger than 10% of it.Yet it is kept unvacuumed and unanalyzed for a long time.
autovacuum_(vacuum|analyze)_threshold is 50.
What am I missing?
Hard to say without seeing the actual settings in postgresql.conf that
match:Most importantly:
autovacuum
and
track_counts
https://www.postgresql.org/docs/current/runtime-config-statistics.html#GUC-TRACK-COUNTS
They are both on and set as per default. Autovac/analyze continue
running on some tables after pg_stat_reset. Just not on all of them,
even thought they should judging by live/dead tuples calculation.
foo=> show track_counts;
track_counts
--------------
on
(1 row)
foo=> show autovacuum;
autovacuum
------------
on
(1 row)
Show quoted text
There are still many tables waiting for their turn, which is long due.
Although there are some tables having only 60-70 (not 60-70K)
n_live_tup that have had autovacuum run on them. Weird.
On 8/20/23 14:10, Rihad wrote:
On 8/20/23 20:22, Adrian Klaver wrote:
On 8/18/23 22:35, Rihad wrote:
On 8/17/23 13:01, rihad wrote:
Hi, all. After calling pg_stat_reset all statistics used by autovacuum
got zeroed, and started accumulating from scratch. Some tables get
acted upon properly, some don't.Self-replying: yup, it seems there's an arbitrary limit of 100K of
n_live_tup after which autovac/analyze kicks in, or it seems so.To know rather then guess read:
https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
Sure, I read it before asking.
Taking the first table in the list as an example:
relname | n_live_tup | n_dead_tup | left |
n_mod_since_analyze | left
--------------------------+------------+------------+----------+---------------------+--------
fooooooooooo | 32781 | 240663 | -234057 |
513265 | -509937n_dead_tup (not the actual value, but some time after calling
pg_stat_reset) is much larger than 20% of n_live_tup 32781, and
n_mod_since_analyze is much larger than 10% of it.Yet it is kept unvacuumed and unanalyzed for a long time.
autovacuum_(vacuum|analyze)_threshold is 50.
What am I missing?
What are your autovacuum_(vacuum|analyze_*scale_factor* values?
--
Born in Arizona, moved to Babylonia.
On 8/20/23 22:31, Rihad wrote:
On 8/21/23 00:15, Adrian Klaver wrote:
On 8/20/23 12:10, Rihad wrote:
On 8/20/23 20:22, Adrian Klaver wrote:
On 8/18/23 22:35, Rihad wrote:
On 8/17/23 13:01, rihad wrote:
Hard to say without seeing the actual settings in postgresql.conf that
match:Most importantly:
autovacuum
and
track_counts
https://www.postgresql.org/docs/current/runtime-config-statistics.html#GUC-TRACK-COUNTS
They are both on and set as per default. Autovac/analyze continue
running on some tables after pg_stat_reset. Just not on all of them,
even thought they should judging by live/dead tuples calculation.foo=> show track_counts;
track_counts
--------------
on
(1 row)foo=> show autovacuum;
autovacuum
------------
on
(1 row)
How about the rest of the settings at?:
https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
Have the storage parameters for the tables been changed per?:
https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
In psql you can do:
\d+ <table_name>
The setting if changed will show up as Options: <setting>
Also are there include directives in use per?:
https://www.postgresql.org/docs/current/config-setting.html#CONFIG-INCLUDES
You can see by looking at the sourcefile field in pg_settings:
https://www.postgresql.org/docs/current/view-pg-settings.html
There are still many tables waiting for their turn, which is long due.
Although there are some tables having only 60-70 (not 60-70K)
n_live_tup that have had autovacuum run on them. Weird.
--
Adrian Klaver
adrian.klaver@aklaver.com
On 8/21/23 20:00, Adrian Klaver wrote:
On 8/20/23 22:31, Rihad wrote:
On 8/21/23 00:15, Adrian Klaver wrote:
On 8/20/23 12:10, Rihad wrote:
On 8/20/23 20:22, Adrian Klaver wrote:
On 8/18/23 22:35, Rihad wrote:
On 8/17/23 13:01, rihad wrote:
Hard to say without seeing the actual settings in postgresql.conf
that match:Most importantly:
autovacuum
and
track_counts
https://www.postgresql.org/docs/current/runtime-config-statistics.html#GUC-TRACK-COUNTS
They are both on and set as per default. Autovac/analyze continue
running on some tables after pg_stat_reset. Just not on all of them,
even thought they should judging by live/dead tuples calculation.foo=> show track_counts;
track_counts
--------------
on
(1 row)foo=> show autovacuum;
autovacuum
------------
on
(1 row)How about the rest of the settings at?:
https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
Have the storage parameters for the tables been changed per?:
https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
In psql you can do:
\d+ <table_name>
The setting if changed will show up as Options: <setting>
Also are there include directives in use per?:
https://www.postgresql.org/docs/current/config-setting.html#CONFIG-INCLUDES
You can see by looking at the sourcefile field in pg_settings:
https://www.postgresql.org/docs/current/view-pg-settings.html
Thanks for the detailed reply, no tables have custom settings.
I need to make it clear once again that all autovac/analyze work as
expected when n_live_tup matches reality, i.e. when analyze has been run
on them since last reset.
A way to fix this is to simply analyze the whole database. Before doing
that, while n_live_tup starts from basically 0 and grows based on DB
activity, these usual calculations of 10-20% table size for
vacuum/analyze don't work. They don't trigger autovac for most tables,
or do it much much later.
Show quoted text
There are still many tables waiting for their turn, which is long
due.Although there are some tables having only 60-70 (not 60-70K)
n_live_tup that have had autovacuum run on them. Weird.
On 8/21/23 09:09, Rihad wrote:
On 8/21/23 20:00, Adrian Klaver wrote:
Thanks for the detailed reply, no tables have custom settings.
I need to make it clear once again that all autovac/analyze work as
expected when n_live_tup matches reality, i.e. when analyze has been run
on them since last reset.A way to fix this is to simply analyze the whole database. Before doing
that, while n_live_tup starts from basically 0 and grows based on DB
activity, these usual calculations of 10-20% table size for
vacuum/analyze don't work. They don't trigger autovac for most tables,
or do it much much later.
You still have not said or shown whether the other autovacuum settings
are the default values or not. Assuming they are, then the only other
explanation I can come up with is that there is a process or processes
that are creating long running open transactions that prevent autovacuum
from running on the affected tables.
--
Adrian Klaver
adrian.klaver@aklaver.com
On 8/21/23 20:17, Adrian Klaver wrote:
On 8/21/23 09:09, Rihad wrote:
On 8/21/23 20:00, Adrian Klaver wrote:
Thanks for the detailed reply, no tables have custom settings.
I need to make it clear once again that all autovac/analyze work as
expected when n_live_tup matches reality, i.e. when analyze has been
run on them since last reset.A way to fix this is to simply analyze the whole database. Before
doing that, while n_live_tup starts from basically 0 and grows based
on DB activity, these usual calculations of 10-20% table size for
vacuum/analyze don't work. They don't trigger autovac for most
tables, or do it much much later.You still have not said or shown whether the other autovacuum settings
are the default values or not. Assuming they are, then the only other
explanation I can come up with is that there is a process or processes
that are creating long running open transactions that prevent
autovacuum from running on the affected tables.
Sorry, they are all as per default, commented out in the config.
There are no long running queries, otherwise they wouldn't be
vacuumed/analyzed in due time after running first manual analyze, which
updates n_live_tup to match reltuples.
On 8/21/23 09:31, Rihad wrote:
On 8/21/23 20:17, Adrian Klaver wrote:
On 8/21/23 09:09, Rihad wrote:
On 8/21/23 20:00, Adrian Klaver wrote:
Sorry, they are all as per default, commented out in the config.
There are no long running queries, otherwise they wouldn't be
vacuumed/analyzed in due time after running first manual analyze, which
updates n_live_tup to match reltuples.
My only remaining suggestion is to closely monitor the Postgres log and
see if provides a clue.
--
Adrian Klaver
adrian.klaver@aklaver.com
On 8/21/23 20:50, Adrian Klaver wrote:
On 8/21/23 09:31, Rihad wrote:
On 8/21/23 20:17, Adrian Klaver wrote:
On 8/21/23 09:09, Rihad wrote:
On 8/21/23 20:00, Adrian Klaver wrote:
Sorry, they are all as per default, commented out in the config.
There are no long running queries, otherwise they wouldn't be
vacuumed/analyzed in due time after running first manual analyze,
which updates n_live_tup to match reltuples.My only remaining suggestion is to closely monitor the Postgres log
and see if provides a clue.
I'm awfully sorry, I read the autovacuum manual carefully, it isn't
n_live_tup, but reltuples that is taken into account during the calculation.
vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
where the vacuum base threshold is autovacuum_vacuum_threshold
<https://www.postgresql.org/docs/current/runtime-config-autovacuum.html#GUC-AUTOVACUUM-VACUUM-THRESHOLD>,
the vacuum scale factor is autovacuum_vacuum_scale_factor
<https://www.postgresql.org/docs/current/runtime-config-autovacuum.html#GUC-AUTOVACUUM-VACUUM-SCALE-FACTOR>,
*and the number of tuples is **|pg_class|**.**|reltuples|**.*
Your first suggestion was to RTFM.