pgsql: Don't consider newly inserted tuples in nbtree VACUUM.

Started by Peter Geogheganover 5 years ago3 messagescomitters
Jump to latest

Don't consider newly inserted tuples in nbtree VACUUM.

Remove the entire idea of "stale stats" within nbtree VACUUM (stop
caring about stats involving the number of inserted tuples). Also
remove the vacuum_cleanup_index_scale_factor GUC/param on the master
branch (though just disable them on postgres 13).

The vacuum_cleanup_index_scale_factor/stats interface made the nbtree AM
partially responsible for deciding when pg_class.reltuples stats needed
to be updated. This seems contrary to the spirit of the index AM API,
though -- it is not actually necessary for an index AM's bulk delete and
cleanup callbacks to provide accurate stats when it happens to be
inconvenient. The core code owns that. (Index AMs have the authority
to perform or not perform certain kinds of deferred cleanup based on
their own considerations, such as page deletion and recycling, but that
has little to do with pg_class.reltuples/num_index_tuples.)

This issue was fairly harmless until the introduction of the
autovacuum_vacuum_insert_threshold feature by commit b07642db, which had
an undesirable interaction with the vacuum_cleanup_index_scale_factor
mechanism: it made insert-driven autovacuums perform full index scans,
even though there is no real benefit to doing so. This has been tied to
a regression with an append-only insert benchmark [1]https://smalldatum.blogspot.com/2021/01/insert-benchmark-postgres-is-still.html.

Also have remaining cases that perform a full scan of an index during a
cleanup-only nbtree VACUUM indicate that the final tuple count is only
an estimate. This prevents vacuumlazy.c from setting the index's
pg_class.reltuples in those cases (it will now only update pg_class when
vacuumlazy.c had TIDs for nbtree to bulk delete). This arguably fixes
an oversight in deduplication-related bugfix commit 48e12913.

[1]: https://smalldatum.blogspot.com/2021/01/insert-benchmark-postgres-is-still.html

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: /messages/by-id/CAD21AoA4WHthN5uU6+WScZ7+J_RcEjmcuH94qcoUPuB42ShXzg@mail.gmail.com
Backpatch: 13-, where autovacuum_vacuum_insert_threshold was added.

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/9f3665fbfc34b963933e51778c7feaa8134ac885

Modified Files
--------------
doc/src/sgml/config.sgml | 40 ---------------
doc/src/sgml/ref/create_index.sgml | 14 ------
src/backend/access/common/reloptions.c | 9 ----
src/backend/access/nbtree/nbtinsert.c | 3 --
src/backend/access/nbtree/nbtpage.c | 40 +++++----------
src/backend/access/nbtree/nbtree.c | 70 ++++++++-------------------
src/backend/access/nbtree/nbtutils.c | 2 -
src/backend/access/nbtree/nbtxlog.c | 2 +-
src/backend/access/rmgrdesc/nbtdesc.c | 5 +-
src/backend/utils/init/globals.c | 2 -
src/backend/utils/misc/guc.c | 10 ----
src/backend/utils/misc/postgresql.conf.sample | 3 --
src/bin/psql/tab-complete.c | 4 +-
src/include/access/nbtree.h | 7 +--
src/include/access/nbtxlog.h | 1 -
src/include/access/xlog_internal.h | 2 +-
src/include/miscadmin.h | 2 -
src/test/regress/expected/btree_index.out | 29 -----------
src/test/regress/sql/btree_index.sql | 19 --------
19 files changed, 42 insertions(+), 222 deletions(-)

In reply to: Peter Geoghegan (#1)
Re: pgsql: Don't consider newly inserted tuples in nbtree VACUUM.

On Wed, Mar 10, 2021 at 4:27 PM Peter Geoghegan <pg@bowt.ie> wrote:

Don't consider newly inserted tuples in nbtree VACUUM.

Remove the entire idea of "stale stats" within nbtree VACUUM (stop
caring about stats involving the number of inserted tuples). Also
remove the vacuum_cleanup_index_scale_factor GUC/param on the master
branch (though just disable them on postgres 13).

It looks like there is an issue on buildfarm animal crake, with
XversionUpgrade-REL_11_STABLE-HEAD:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&amp;dt=2021-03-11%2000%3A33%3A42

The v11 tests set the vacuum_cleanup_index_scale_factor storage
parameter in an index that is not dropped by the same test, to get
pg_dump coverage and so on (see btree_index.sql tests). The pg_upgrade
tests therefore go on to attempt to restore a DB with the parameter
set in the same index. This fails with Postgres 14/master due to my
having removed vacuum_cleanup_index_scale_factor on that branch.

Perhaps the best solution is also the simplest: revert the tests for
vacuum_cleanup_index_scale_factor storage on the backbranches
(Postgres 12 and 13) too. The tests in question don't test much.

Any input on how I should proceed here, Andrew?

--
Peter Geoghegan

In reply to: Peter Geoghegan (#2)
Re: pgsql: Don't consider newly inserted tuples in nbtree VACUUM.

On Wed, Mar 10, 2021 at 5:49 PM Peter Geoghegan <pg@bowt.ie> wrote:

Perhaps the best solution is also the simplest: revert the tests for
vacuum_cleanup_index_scale_factor storage on the backbranches
(Postgres 12 and 13) too. The tests in question don't test much.

I just pushed a fix to REL_11_STABLE and REL_12_STABLE only. I decided
to drop the index with vacuum_cleanup_index_scale_factor set (which is
a more targeted fix). That should make things go back to green on
crake.

Thanks
--
Peter Geoghegan