doc fixes: vacuum_cleanup_index_scale_factor
Introduced 857f9c36cda520030381bd8c2af20adf0ce0e1d4
The "minimal version" should probably be "minimum version", but I didn't
include it here.
Also, the documentation doesn't indicate the default value of -1 (or its
special meaning).
Also, my understanding of this feature changed when I read this logic:
if (cleanup_scale_factor < 0 ||
metad->btm_last_cleanup_num_heap_tuples < 0 ||
info->num_heap_tuples > (1.0 + cleanup_scale_factor) *
metad->btm_last_cleanup_num_heap_tuples)
result = true;
=> That means that _bt_vacuum_needs_cleanup() returns true unless a nondefault
value is set. That feels wrong, since the doc said:
"When no tuples were deleted from the heap, B-tree indexes
might still be scanned during <command>VACUUM</command>
cleanup stage by two reasons."
Which sounds like "being scanned when no tuples were deleted" is exceptional
rather, than the default.
I changed that paragraph based on that understanding (which is consistent with
the commit message). The language could probably be further improved, but
someone should first verify my tentative understanding of the intent.
Justin
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index eabe2a9..18c0ec0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1893,14 +1893,15 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- When no tuples were deleted from the heap, B-tree indexes might still
- be scanned during <command>VACUUM</command> cleanup stage by two
- reasons. The first reason is that B-tree index contains deleted pages
- which can be recycled during cleanup. The second reason is that B-tree
- index statistics is stalled. The criterion of stalled index statistics
- is number of inserted tuples since previous statistics collection
- is greater than <varname>vacuum_cleanup_index_scale_factor</varname>
- fraction of total number of heap tuples.
+ When no tuples were deleted from the heap, B-tree indexes are still
+ scanned during <command>VACUUM</command> cleanup stage unless
+ two conditions are met. First, if a B-tree index contains no deleted pages
+ which can be recycled during cleanup. Second, if B-tree
+ index statistics are not stale. Index statistics are considered stale unless
+ <varname>vacuum_cleanup_index_scale_factor</varname> is non-negative, and the
+ number of inserted tuples since the previous statistics collection is
+ less than that fraction of the total number of heap tuples.
+ The default is -1, meaning index scan during cleanup is not skipped.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 3bcc56e..22b4a75 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -189,7 +189,7 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
if (metad->btm_version < BTREE_VERSION)
_bt_upgrademetapage(metapg);
- /* update cleanup-related infromation */
+ /* update cleanup-related information */
metad->btm_oldest_btpo_xact = oldestBtpoXact;
metad->btm_last_cleanup_num_heap_tuples = numHeapTuples;
MarkBufferDirty(metabuf);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index e5dce00..4e86280 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -818,10 +818,10 @@ _bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
float8 cleanup_scale_factor;
/*
- * If table receives large enough amount of insertions and no cleanup
- * was performed, then index might appear to have stalled statistics.
- * In order to evade that, we perform cleanup when table receives
- * vacuum_cleanup_index_scale_factor fractions of insertions.
+ * If table receives enough insertions and no cleanup
+ * was performed, then index would appear have stale statistics.
+ * If scale factor is set, we avoid that by performing cleanup if the number of added tuples
+ * exceeds vacuum_cleanup_index_scale_factor fraction of original tuple count.
*/
relopts = (StdRdOptions *) info->index->rd_options;
cleanup_scale_factor = (relopts &&
@@ -870,8 +870,8 @@ btbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
&oldestBtpoXact);
/*
- * Update cleanup-related information in metapage. These information
- * is used only for cleanup but keeping up them to date can avoid
+ * Update cleanup-related information in metapage. This information
+ * is used only for cleanup but keeping them up to date can avoid
* unnecessary cleanup even after bulkdelete.
*/
_bt_update_meta_cleanup_info(info->index, oldestBtpoXact,
@@ -899,8 +899,8 @@ btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
* If btbulkdelete was called, we need not do anything, just return the
* stats from the latest btbulkdelete call. If it wasn't called, we might
* still need to do a pass over the index, to recycle any newly-recyclable
- * pages and to obtain index statistics. _bt_vacuum_needs_cleanup checks
- * is there are newly-recyclable or stalled index statistics.
+ * pages or to obtain index statistics. _bt_vacuum_needs_cleanup
+ * determines if if either are needed.
*
* Since we aren't going to actually delete any leaf items, there's no
* need to go through all the vacuum-cycle-ID pushups.
On Tue, May 1, 2018 at 10:30 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
- When no tuples were deleted from the heap, B-tree indexes might still - be scanned during <command>VACUUM</command> cleanup stage by two - reasons. The first reason is that B-tree index contains deleted pages - which can be recycled during cleanup. The second reason is that B-tree - index statistics is stalled. The criterion of stalled index statistics - is number of inserted tuples since previous statistics collection - is greater than <varname>vacuum_cleanup_index_scale_factor</varname> - fraction of total number of heap tuples. + When no tuples were deleted from the heap, B-tree indexes are still + scanned during <command>VACUUM</command> cleanup stage unless + two conditions are met. First, if a B-tree index contains no deleted pages + which can be recycled during cleanup. Second, if B-tree + index statistics are not stale. Index statistics are considered stale unless + <varname>vacuum_cleanup_index_scale_factor</varname> is non-negative, and the + number of inserted tuples since the previous statistics collection is + less than that fraction of the total number of heap tuples. + The default is -1, meaning index scan during cleanup is not skipped.
I agree that this documentation needs to be rewritten but your rewrite
doesn't strike me as very good English either. A sentence of the form
"First, if I like hamburgers." is not correct English.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, May 02, 2018 at 10:54:31AM -0400, Robert Haas wrote:
On Tue, May 1, 2018 at 10:30 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
- When no tuples were deleted from the heap, B-tree indexes might still - be scanned during <command>VACUUM</command> cleanup stage by two - reasons. The first reason is that B-tree index contains deleted pages - which can be recycled during cleanup. The second reason is that B-tree - index statistics is stalled. The criterion of stalled index statistics - is number of inserted tuples since previous statistics collection - is greater than <varname>vacuum_cleanup_index_scale_factor</varname> - fraction of total number of heap tuples. + When no tuples were deleted from the heap, B-tree indexes are still + scanned during <command>VACUUM</command> cleanup stage unless + two conditions are met. First, if a B-tree index contains no deleted pages + which can be recycled during cleanup. Second, if B-tree + index statistics are not stale. Index statistics are considered stale unless + <varname>vacuum_cleanup_index_scale_factor</varname> is non-negative, and the + number of inserted tuples since the previous statistics collection is + less than that fraction of the total number of heap tuples. + The default is -1, meaning index scan during cleanup is not skipped.I agree that this documentation needs to be rewritten but your rewrite
doesn't strike me as very good English either.
2nd attempt
index eabe2a9..e305de9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1893,14 +1893,16 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- When no tuples were deleted from the heap, B-tree indexes might still
- be scanned during <command>VACUUM</command> cleanup stage by two
- reasons. The first reason is that B-tree index contains deleted pages
- which can be recycled during cleanup. The second reason is that B-tree
- index statistics is stalled. The criterion of stalled index statistics
- is number of inserted tuples since previous statistics collection
- is greater than <varname>vacuum_cleanup_index_scale_factor</varname>
- fraction of total number of heap tuples.
+ When no tuples were deleted from the heap, B-tree indexes are still
+ scanned during <command>VACUUM</command> cleanup stage unless two
+ conditions are met: the index contains no deleted pages which can be
+ recycled during cleanup; and, the index statistics are not stale.
+ Index statistics are considered stale unless
+ <varname>vacuum_cleanup_index_scale_factor</varname>
+ is set to a non-negative value, and the number of inserted tuples since
+ the previous statistics collection is less than that fraction of the
+ total number of heap tuples. The default is -1, which means index
+ scans during <command>VACUUM</command> cleanup are not skipped.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 3bcc56e..22b4a75 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -189,7 +189,7 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
if (metad->btm_version < BTREE_VERSION)
_bt_upgrademetapage(metapg);
- /* update cleanup-related infromation */
+ /* update cleanup-related information */
metad->btm_oldest_btpo_xact = oldestBtpoXact;
metad->btm_last_cleanup_num_heap_tuples = numHeapTuples;
MarkBufferDirty(metabuf);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index e5dce00..4e86280 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -818,10 +818,10 @@ _bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
float8 cleanup_scale_factor;
/*
- * If table receives large enough amount of insertions and no cleanup
- * was performed, then index might appear to have stalled statistics.
- * In order to evade that, we perform cleanup when table receives
- * vacuum_cleanup_index_scale_factor fractions of insertions.
+ * If table receives enough insertions and no cleanup
+ * was performed, then index would appear have stale statistics.
+ * If scale factor is set, we avoid that by performing cleanup if the number of added tuples
+ * exceeds vacuum_cleanup_index_scale_factor fraction of original tuple count.
*/
relopts = (StdRdOptions *) info->index->rd_options;
cleanup_scale_factor = (relopts &&
@@ -870,8 +870,8 @@ btbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
&oldestBtpoXact);
/*
- * Update cleanup-related information in metapage. These information
- * is used only for cleanup but keeping up them to date can avoid
+ * Update cleanup-related information in metapage. This information
+ * is used only for cleanup but keeping them up to date can avoid
* unnecessary cleanup even after bulkdelete.
*/
_bt_update_meta_cleanup_info(info->index, oldestBtpoXact,
@@ -899,8 +899,8 @@ btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
* If btbulkdelete was called, we need not do anything, just return the
* stats from the latest btbulkdelete call. If it wasn't called, we might
* still need to do a pass over the index, to recycle any newly-recyclable
- * pages and to obtain index statistics. _bt_vacuum_needs_cleanup checks
- * is there are newly-recyclable or stalled index statistics.
+ * pages or to obtain index statistics. _bt_vacuum_needs_cleanup
+ * determines if if either are needed.
*
* Since we aren't going to actually delete any leaf items, there's no
* need to go through all the vacuum-cycle-ID pushups.
On Wed, May 2, 2018 at 11:43 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
2nd attempt
That looks like good English to me. I can't vouch for whether it's
technically correct.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi!
Thank you for your attention on this subject. It's definitely right,
that documentation needs to be revised in these places.
On Wed, May 2, 2018 at 6:43 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
index eabe2a9..e305de9 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -1893,14 +1893,16 @@ include_dir 'conf.d' </term> <listitem> <para> - When no tuples were deleted from the heap, B-tree indexes might still - be scanned during <command>VACUUM</command> cleanup stage by two - reasons. The first reason is that B-tree index contains deleted pages - which can be recycled during cleanup. The second reason is that B-tree - index statistics is stalled. The criterion of stalled index statistics - is number of inserted tuples since previous statistics collection - is greater than <varname>vacuum_cleanup_index_ scale_factor</varname> - fraction of total number of heap tuples. + When no tuples were deleted from the heap, B-tree indexes are still + scanned during <command>VACUUM</command> cleanup stage unless two + conditions are met: the index contains no deleted pages which can be + recycled during cleanup; and, the index statistics are not stale. + Index statistics are considered stale unless + <varname>vacuum_cleanup_index_scale_factor</varname> + is set to a non-negative value, and the number of inserted tuples since + the previous statistics collection is less than that fraction of the + total number of heap tuples. The default is -1, which means index + scans during <command>VACUUM</command> cleanup are not skipped. </para> </listitem> </varlistentry>
The default value of vacuum_cleanup_index_scale_factor GUC is 0.1,
that means that 10% of tuples need to be inserted in order to trigger
vacuum cleanup. See guc.c
{
{"vacuum_cleanup_index_scale_factor", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Number of tuple inserts prior to index cleanup as a fraction
of reltuples."),
NULL
},
&vacuum_cleanup_index_scale_factor,
0.1, 0.0, 100.0,
NULL, NULL, NULL
},
Default value of vacuum_cleanup_index_scale_factor reloption is -1,
it means that by default value of vacuum_cleanup_index_scale_factor GUC
is used. See following piece of code in _bt_vacuum_needs_cleanup().
cleanup_scale_factor = (relopts &&
relopts->vacuum_cleanup_index_scale_factor >= 0)
? relopts->vacuum_cleanup_index_scale_factor
: vacuum_cleanup_index_scale_factor;
In order to have vacuum cleanup scan every time, one should set
vacuum_cleanup_index_scale_factor GUC to 0. Assuming this,
we need to replace "cleanup_scale_factor < 0" to
"cleanup_scale_factor <= 0" in the following condition:
if (cleanup_scale_factor < 0 ||
metad->btm_last_cleanup_num_heap_tuples < 0 ||
info->num_heap_tuples > (1.0 + cleanup_scale_factor) *
metad->btm_last_cleanup_num_heap_tuples)
result = true;
Another issue is that we by default store -1 in
metad->btm_last_cleanup_num_heap_tuples in order to evade overhead
of meta-page rewrite. metad->btm_last_cleanup_num_heap_tuples is
set at first btcleanup() call when no tuples were deleted. Second and
subsequent btcleanup() calls may skip index scan. This aspect needs
to be properly documented.
I'm going to propose a patch for this subject in a couple of days.
That patch would incorporate some of your changes as well as contain
some changes from me.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi!
I've revised docs and comments, and also made some fixes in the code.
See the attached patchset.
* 0001-vacuum-cleanup-index-scale-factor-user-set.patch
Changes vacuum_cleanup_index_scale_factor GUC to PGC_USERSET,
because it might be useful to change in specific session.
* 0002-vacuum-cleanup-index-scale-factor-tab-complete.patch
Add missing psql tab-complete support for
vacuum_cleanup_index_scale_factor
* 0003-btree-cleanup-condition-fix.patch
Fix condition which triggers btree index cleanup scan to always fire
when vacuum_cleanup_index_scale_factor == 0.
* 0004-btree-cleanup-docs-comments-fixes.patch
Documentation and comment improvements from Justin Pryzby
revised by me.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
0001-vacuum-cleanup-index-scale-factor-user-set.patchapplication/octet-stream; name=0001-vacuum-cleanup-index-scale-factor-user-set.patchDownload
commit 5483a6e6cd2c2bf0ebcb500875a4ddf5b9bf877b
Author: Alexander Korotkov <a.korotkov@postgrespro.ru>
Date: Mon May 7 18:59:53 2018 +0300
Change vacuum_cleanup_index_scale_factor GUC to PGC_USERSET
vacuum_cleanup_index_scale_factor GUC was defined as PGC_SIGHUP. But this
GUC affects not only autovacuum. So it might be useful to change it from user
session in order to influence manually runned VACUUM.
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index f496ad6281..7cd2d2d80e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3238,7 +3238,7 @@ static struct config_real ConfigureNamesReal[] =
},
{
- {"vacuum_cleanup_index_scale_factor", PGC_SIGHUP, AUTOVACUUM,
+ {"vacuum_cleanup_index_scale_factor", PGC_USERSET, AUTOVACUUM,
gettext_noop("Number of tuple inserts prior to index cleanup as a fraction of reltuples."),
NULL
},
0002-vacuum-cleanup-index-scale-factor-tab-complete.patchapplication/octet-stream; name=0002-vacuum-cleanup-index-scale-factor-tab-complete.patchDownload
commit b7e0a2c6166b7c2dde87c94ae5bfb85f62dd27ec
Author: Alexander Korotkov <a.korotkov@postgrespro.ru>
Date: Mon May 7 19:03:05 2018 +0300
Tab-complete support for vacuum_cleanup_index_scale_factor reloption
857f9c36 has introduced vacuum_cleanup_index_scale_factor reloption, but
psql tab-complete support for it was missing. So, add it.
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index b431efc983..7bb47eadc6 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1855,13 +1855,15 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH_CONST("(");
/* ALTER INDEX <foo> SET|RESET ( */
else if (Matches5("ALTER", "INDEX", MatchAny, "RESET", "("))
- COMPLETE_WITH_LIST7("fillfactor", "recheck_on_update",
+ COMPLETE_WITH_LIST8("fillfactor", "recheck_on_update",
+ "vacuum_cleanup_index_scale_factor", /* BTREE */
"fastupdate", "gin_pending_list_limit", /* GIN */
"buffering", /* GiST */
"pages_per_range", "autosummarize" /* BRIN */
);
else if (Matches5("ALTER", "INDEX", MatchAny, "SET", "("))
- COMPLETE_WITH_LIST7("fillfactor =", "recheck_on_update =",
+ COMPLETE_WITH_LIST8("fillfactor =", "recheck_on_update =",
+ "vacuum_cleanup_index_scale_factor =", /* BTREE */
"fastupdate =", "gin_pending_list_limit =", /* GIN */
"buffering =", /* GiST */
"pages_per_range =", "autosummarize =" /* BRIN */
0003-btree-cleanup-condition-fix.patchapplication/octet-stream; name=0003-btree-cleanup-condition-fix.patchDownload
commit da90c3f38c15f4f22a40b5742a0693925f892d15
Author: Alexander Korotkov <a.korotkov@postgrespro.ru>
Date: Mon May 7 19:10:41 2018 +0300
Fix condition for B-tree index cleanup
Zero value of vacuum_cleanup_index_scale_factor means that user wants B-tree
index cleanup to be never skipped.
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index e5dce00876..d894ba0374 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -829,7 +829,7 @@ _bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
? relopts->vacuum_cleanup_index_scale_factor
: vacuum_cleanup_index_scale_factor;
- if (cleanup_scale_factor < 0 ||
+ if (cleanup_scale_factor <= 0 ||
metad->btm_last_cleanup_num_heap_tuples < 0 ||
info->num_heap_tuples > (1.0 + cleanup_scale_factor) *
metad->btm_last_cleanup_num_heap_tuples)
0004-btree-cleanup-docs-comments-fixes.patchapplication/octet-stream; name=0004-btree-cleanup-docs-comments-fixes.patchDownload
commit eba1215ae9f06fb5ac94a9906cbd8d02c8481d5c
Author: Alexander Korotkov <a.korotkov@postgrespro.ru>
Date: Mon May 7 19:13:19 2018 +0300
Documentation and comment improvements for 857f9c36
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index eabe2a9235..785ecf922a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1893,15 +1893,35 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- When no tuples were deleted from the heap, B-tree indexes might still
- be scanned during <command>VACUUM</command> cleanup stage by two
- reasons. The first reason is that B-tree index contains deleted pages
- which can be recycled during cleanup. The second reason is that B-tree
- index statistics is stalled. The criterion of stalled index statistics
- is number of inserted tuples since previous statistics collection
- is greater than <varname>vacuum_cleanup_index_scale_factor</varname>
- fraction of total number of heap tuples.
+ When no tuples were deleted from the heap, B-tree indexes are still
+ scanned during <command>VACUUM</command> cleanup stage unless two
+ conditions are met: the index contains no deleted pages which can be
+ recycled during cleanup; and, the index statistics are not stale.
+ In order to detect stale index statistics, number of total heap tuples
+ during previous statistics collection is memorized in the index
+ meta-page. Once number number of inserted tuples since previous
+ statistics collection is more than
+ <varname>vacuum_cleanup_index_scale_factor</varname> fraction of
+ number of heap tuples memorized in the meta-page, index statistics is
+ considered to be stalled. Note, that number of heap tuples is written
+ to the meta-page at the first time when no dead tuples are found
+ during <command>VACUUM</command> cycle. Thus, skip of B-tree index
+ scan during cleanup stage is only possible in second and subsequent
+ <command>VACUUM</command> cycles detecting no dead tuples.
</para>
+
+ <para>
+ Zero value of <varname>vacuum_cleanup_index_scale_factor</varname>
+ means that index scans during <command>VACUUM</command> cleanup are
+ never skipped. The default value is 0.1; the maximum value is 100.
+ </para>
+
+ <para>
+ Currently, <varname>vacuum_cleanup_index_scale_factor</varname>
+ influences only B-tree indexes, but in future versions it might be
+ applied to other index access methods too.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 3bcc56e9d2..22b4a7578f 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -189,7 +189,7 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
if (metad->btm_version < BTREE_VERSION)
_bt_upgrademetapage(metapg);
- /* update cleanup-related infromation */
+ /* update cleanup-related information */
metad->btm_oldest_btpo_xact = oldestBtpoXact;
metad->btm_last_cleanup_num_heap_tuples = numHeapTuples;
MarkBufferDirty(metabuf);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index d894ba0374..27a3032e42 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -818,10 +818,11 @@ _bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
float8 cleanup_scale_factor;
/*
- * If table receives large enough amount of insertions and no cleanup
- * was performed, then index might appear to have stalled statistics.
- * In order to evade that, we perform cleanup when table receives
- * vacuum_cleanup_index_scale_factor fractions of insertions.
+ * If table receives enough insertions and no cleanup was performed,
+ * then index would appear have stale statistics. If scale factor
+ * is set, we avoid that by performing cleanup if the number of
+ * inserted tuples exceeds vacuum_cleanup_index_scale_factor fraction
+ * of original tuples count.
*/
relopts = (StdRdOptions *) info->index->rd_options;
cleanup_scale_factor = (relopts &&
@@ -870,8 +871,8 @@ btbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
&oldestBtpoXact);
/*
- * Update cleanup-related information in metapage. These information
- * is used only for cleanup but keeping up them to date can avoid
+ * Update cleanup-related information in metapage. This information
+ * is used only for cleanup but keeping them up to date can avoid
* unnecessary cleanup even after bulkdelete.
*/
_bt_update_meta_cleanup_info(info->index, oldestBtpoXact,
@@ -899,8 +900,8 @@ btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
* If btbulkdelete was called, we need not do anything, just return the
* stats from the latest btbulkdelete call. If it wasn't called, we might
* still need to do a pass over the index, to recycle any newly-recyclable
- * pages and to obtain index statistics. _bt_vacuum_needs_cleanup checks
- * is there are newly-recyclable or stalled index statistics.
+ * pages or to obtain index statistics. _bt_vacuum_needs_cleanup
+ * determines if either are needed.
*
* Since we aren't going to actually delete any leaf items, there's no
* need to go through all the vacuum-cycle-ID pushups.
On Mon, May 07, 2018 at 07:26:25PM +0300, Alexander Korotkov wrote:
Hi!
I've revised docs and comments, and also made some fixes in the code.
See the attached patchset.* 0004-btree-cleanup-docs-comments-fixes.patch
Documentation and comment improvements from Justin Pryzby
revised by me.
2nd iteration:
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index eabe2a9235..785ecf922a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1893,15 +1893,35 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- When no tuples were deleted from the heap, B-tree indexes might still
- be scanned during <command>VACUUM</command> cleanup stage by two
- reasons. The first reason is that B-tree index contains deleted pages
- which can be recycled during cleanup. The second reason is that B-tree
- index statistics is stalled. The criterion of stalled index statistics
- is number of inserted tuples since previous statistics collection
- is greater than <varname>vacuum_cleanup_index_scale_factor</varname>
- fraction of total number of heap tuples.
+ When no tuples were deleted from the heap, B-tree indexes are still
+ scanned during <command>VACUUM</command> cleanup stage unless two
+ conditions are met: the index contains no deleted pages which can be
+ recycled during cleanup; and, the index statistics are not stale.
+ In order to detect stale index statistics, number of total heap tuples
should say: "THE number"
+ during previous statistics collection is memorized in the index
s/memorized/stored/
+ meta-page. Once number number of inserted tuples since previous
Should say "Once the number of inserted tuples..."
+ statistics collection is more than
+ <varname>vacuum_cleanup_index_scale_factor</varname> fraction of
+ number of heap tuples memorized in the meta-page, index statistics is
s/memorized/stored/
+ considered to be stalled. Note, that number of heap tuples is written
"THE number"
s/stalled/stale/
+ to the meta-page at the first time when no dead tuples are found
remove "at"
+ during <command>VACUUM</command> cycle. Thus, skip of B-tree index
I think should say: "Thus, skipping of the B-tree index scan"
+ scan during cleanup stage is only possible in second and subsequent
s/in/when/
+ <para>
+ Zero value of <varname>vacuum_cleanup_index_scale_factor</varname>
I would say "A zero value of ..."
Thanks,
Justin
Hi, Justin!
Thank you for revising documentation patch.
On Mon, May 7, 2018 at 7:55 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, May 07, 2018 at 07:26:25PM +0300, Alexander Korotkov wrote:
Hi!
I've revised docs and comments, and also made some fixes in the code.
See the attached patchset.* 0004-btree-cleanup-docs-comments-fixes.patch
Documentation and comment improvements from Justin Pryzby
revised by me.2nd iteration:
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index eabe2a9235..785ecf922a 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -1893,15 +1893,35 @@ include_dir 'conf.d' </term> <listitem> <para> - When no tuples were deleted from the heap, B-tree indexes might still - be scanned during <command>VACUUM</command> cleanup stage by two - reasons. The first reason is that B-tree index contains deleted pages - which can be recycled during cleanup. The second reason is that B-tree - index statistics is stalled. The criterion of stalled index statistics - is number of inserted tuples since previous statistics collection - is greater than <varname>vacuum_cleanup_index_ scale_factor</varname> - fraction of total number of heap tuples. + When no tuples were deleted from the heap, B-tree indexes are still + scanned during <command>VACUUM</command> cleanup stage unless two + conditions are met: the index contains no deleted pages which can be + recycled during cleanup; and, the index statistics are not stale. + In order to detect stale index statistics, number of total heap tuples should say: "THE number"+ during previous statistics collection is memorized in the index
s/memorized/stored/+ meta-page. Once number number of inserted tuples since previous
Should say "Once the number of inserted tuples..."+ statistics collection is more than + <varname>vacuum_cleanup_index_scale_factor</varname> fraction of + number of heap tuples memorized in the meta-page, index statistics is s/memorized/stored/+ considered to be stalled. Note, that number of heap tuples is
written
"THE number"
s/stalled/stale/+ to the meta-page at the first time when no dead tuples are found
remove "at"+ during <command>VACUUM</command> cycle. Thus, skip of B-tree
index
I think should say: "Thus, skipping of the B-tree index scan"+ scan during cleanup stage is only possible in second and
subsequent
s/in/when/+ <para> + Zero value of <varname>vacuum_cleanup_index_ scale_factor</varname> I would say "A zero value of ..."
I've applied all the changes you suggested. Please, find it in the
attached patchset.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
0001-vacuum-cleanup-index-scale-factor-user-set-2.patchapplication/octet-stream; name=0001-vacuum-cleanup-index-scale-factor-user-set-2.patchDownload
commit eabeaa2e373a20c1731025c886340f5551d83be3
Author: Alexander Korotkov <a.korotkov@postgrespro.ru>
Date: Mon May 7 18:59:53 2018 +0300
Change vacuum_cleanup_index_scale_factor GUC to PGC_USERSET
vacuum_cleanup_index_scale_factor GUC was defined as PGC_SIGHUP. But this
GUC affects not only autovacuum. So it might be useful to change it from user
session in order to influence manually runned VACUUM.
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index f496ad6281..7cd2d2d80e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3238,7 +3238,7 @@ static struct config_real ConfigureNamesReal[] =
},
{
- {"vacuum_cleanup_index_scale_factor", PGC_SIGHUP, AUTOVACUUM,
+ {"vacuum_cleanup_index_scale_factor", PGC_USERSET, AUTOVACUUM,
gettext_noop("Number of tuple inserts prior to index cleanup as a fraction of reltuples."),
NULL
},
0002-vacuum-cleanup-index-scale-factor-tab-complete-2.patchapplication/octet-stream; name=0002-vacuum-cleanup-index-scale-factor-tab-complete-2.patchDownload
commit 5bc7e4ee2cad7f715db34fc33cadd1e28fd74b1b
Author: Alexander Korotkov <a.korotkov@postgrespro.ru>
Date: Mon May 7 19:03:05 2018 +0300
Tab-complete support for vacuum_cleanup_index_scale_factor reloption
857f9c36 has introduced vacuum_cleanup_index_scale_factor reloption, but
psql tab-complete support for it was missing. So, add it.
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index b431efc983..7bb47eadc6 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1855,13 +1855,15 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH_CONST("(");
/* ALTER INDEX <foo> SET|RESET ( */
else if (Matches5("ALTER", "INDEX", MatchAny, "RESET", "("))
- COMPLETE_WITH_LIST7("fillfactor", "recheck_on_update",
+ COMPLETE_WITH_LIST8("fillfactor", "recheck_on_update",
+ "vacuum_cleanup_index_scale_factor", /* BTREE */
"fastupdate", "gin_pending_list_limit", /* GIN */
"buffering", /* GiST */
"pages_per_range", "autosummarize" /* BRIN */
);
else if (Matches5("ALTER", "INDEX", MatchAny, "SET", "("))
- COMPLETE_WITH_LIST7("fillfactor =", "recheck_on_update =",
+ COMPLETE_WITH_LIST8("fillfactor =", "recheck_on_update =",
+ "vacuum_cleanup_index_scale_factor =", /* BTREE */
"fastupdate =", "gin_pending_list_limit =", /* GIN */
"buffering =", /* GiST */
"pages_per_range =", "autosummarize =" /* BRIN */
0003-btree-cleanup-condition-fix-2.patchapplication/octet-stream; name=0003-btree-cleanup-condition-fix-2.patchDownload
commit a576ddcbb81554aad8c79d28b05d396ed378417b
Author: Alexander Korotkov <a.korotkov@postgrespro.ru>
Date: Mon May 7 19:10:41 2018 +0300
Fix condition for B-tree index cleanup
Zero value of vacuum_cleanup_index_scale_factor means that user wants B-tree
index cleanup to be never skipped.
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index e5dce00876..d894ba0374 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -829,7 +829,7 @@ _bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
? relopts->vacuum_cleanup_index_scale_factor
: vacuum_cleanup_index_scale_factor;
- if (cleanup_scale_factor < 0 ||
+ if (cleanup_scale_factor <= 0 ||
metad->btm_last_cleanup_num_heap_tuples < 0 ||
info->num_heap_tuples > (1.0 + cleanup_scale_factor) *
metad->btm_last_cleanup_num_heap_tuples)
0004-btree-cleanup-docs-comments-fixes-2.patchapplication/octet-stream; name=0004-btree-cleanup-docs-comments-fixes-2.patchDownload
commit 662ad6b28330a73b286191e5741280b95679c16d
Author: Alexander Korotkov <a.korotkov@postgrespro.ru>
Date: Tue May 8 12:30:07 2018 +0300
Documentation and comment improvements for 857f9c36
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ffea744cb8..8a1686d22e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1893,15 +1893,35 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- When no tuples were deleted from the heap, B-tree indexes might still
- be scanned during <command>VACUUM</command> cleanup stage by two
- reasons. The first reason is that B-tree index contains deleted pages
- which can be recycled during cleanup. The second reason is that B-tree
- index statistics is stalled. The criterion of stalled index statistics
- is number of inserted tuples since previous statistics collection
- is greater than <varname>vacuum_cleanup_index_scale_factor</varname>
- fraction of total number of heap tuples.
+ When no tuples were deleted from the heap, B-tree indexes are still
+ scanned during <command>VACUUM</command> cleanup stage unless two
+ conditions are met: the index contains no deleted pages which can be
+ recycled during cleanup; and, the index statistics are not stale.
+ In order to detect stale index statistics, the number of total heap
+ tuples during previous statistics collection is stored in the index
+ meta-page. Once the number of inserted tuples since previous
+ statistics collection is more than
+ <varname>vacuum_cleanup_index_scale_factor</varname> fraction of
+ number of heap tuples stored in the meta-page, index statistics is
+ considered to be stale. Note, that the number of heap tuples is
+ written to the meta-page the first time when no dead tuples are found
+ during <command>VACUUM</command> cycle. Thus, skipping of the B-tree
+ index scan during cleanup stage is only possible when second and
+ subsequent <command>VACUUM</command> cycles detecting no dead tuples.
</para>
+
+ <para>
+ A zero value of <varname>vacuum_cleanup_index_scale_factor</varname>
+ means that index scans during <command>VACUUM</command> cleanup are
+ never skipped. The default value is 0.1; the maximum value is 100.
+ </para>
+
+ <para>
+ Currently, <varname>vacuum_cleanup_index_scale_factor</varname>
+ influences only B-tree indexes, but in future versions it might be
+ applied to other index access methods too.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 3bcc56e9d2..22b4a7578f 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -189,7 +189,7 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
if (metad->btm_version < BTREE_VERSION)
_bt_upgrademetapage(metapg);
- /* update cleanup-related infromation */
+ /* update cleanup-related information */
metad->btm_oldest_btpo_xact = oldestBtpoXact;
metad->btm_last_cleanup_num_heap_tuples = numHeapTuples;
MarkBufferDirty(metabuf);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index d894ba0374..27a3032e42 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -818,10 +818,11 @@ _bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
float8 cleanup_scale_factor;
/*
- * If table receives large enough amount of insertions and no cleanup
- * was performed, then index might appear to have stalled statistics.
- * In order to evade that, we perform cleanup when table receives
- * vacuum_cleanup_index_scale_factor fractions of insertions.
+ * If table receives enough insertions and no cleanup was performed,
+ * then index would appear have stale statistics. If scale factor
+ * is set, we avoid that by performing cleanup if the number of
+ * inserted tuples exceeds vacuum_cleanup_index_scale_factor fraction
+ * of original tuples count.
*/
relopts = (StdRdOptions *) info->index->rd_options;
cleanup_scale_factor = (relopts &&
@@ -870,8 +871,8 @@ btbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
&oldestBtpoXact);
/*
- * Update cleanup-related information in metapage. These information
- * is used only for cleanup but keeping up them to date can avoid
+ * Update cleanup-related information in metapage. This information
+ * is used only for cleanup but keeping them up to date can avoid
* unnecessary cleanup even after bulkdelete.
*/
_bt_update_meta_cleanup_info(info->index, oldestBtpoXact,
@@ -899,8 +900,8 @@ btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
* If btbulkdelete was called, we need not do anything, just return the
* stats from the latest btbulkdelete call. If it wasn't called, we might
* still need to do a pass over the index, to recycle any newly-recyclable
- * pages and to obtain index statistics. _bt_vacuum_needs_cleanup checks
- * is there are newly-recyclable or stalled index statistics.
+ * pages or to obtain index statistics. _bt_vacuum_needs_cleanup
+ * determines if either are needed.
*
* Since we aren't going to actually delete any leaf items, there's no
* need to go through all the vacuum-cycle-ID pushups.
3rd iteration ; thanks for bearing with me.
On Tue, May 08, 2018 at 12:35:00PM +0300, Alexander Korotkov wrote:
Hi, Justin!
Thank you for revising documentation patch.
On Mon, May 7, 2018 at 7:55 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
+ In order to detect stale index statistics, the number of total heap
+ tuples during previous statistics collection is stored in the index
+ meta-page.
Consider removing: "during previous statistics collection"
Or: "during THE previous statistics collection"
+ Once the number of inserted tuples since previous
since THE previous
+ statistics collection is more than
+ <varname>vacuum_cleanup_index_scale_factor</varname> fraction of
Since the multiplier can be greater than 1, should we say "multiple" instead of
fraction?
+ during <command>VACUUM</command> cycle. Thus, skipping of the B-tree
+ index scan during cleanup stage is only possible when second and
+ subsequent <command>VACUUM</command> cycles detecting no dead tuples.
Change "detecting" to "detect". Or maybe just "find"
Justin
On 05/08/2018 02:05 PM, Justin Pryzby wrote:
3rd iteration ; thanks for bearing with me.
On Tue, May 08, 2018 at 12:35:00PM +0300, Alexander Korotkov wrote:
Hi, Justin!
Thank you for revising documentation patch.
On Mon, May 7, 2018 at 7:55 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
+ In order to detect stale index statistics, the number of total heap + tuples during previous statistics collection is stored in the index + meta-page.Consider removing: "during previous statistics collection"
Or: "during THE previous statistics collection"+ Once the number of inserted tuples since previous
since THE previous
+ statistics collection is more than + <varname>vacuum_cleanup_index_scale_factor</varname> fraction ofSince the multiplier can be greater than 1, should we say "multiple" instead of
fraction?+ during <command>VACUUM</command> cycle. Thus, skipping of the B-tree + index scan during cleanup stage is only possible when second and + subsequent <command>VACUUM</command> cycles detecting no dead tuples.Change "detecting" to "detect". Or maybe just "find"
Justin
Hi Justin,
Thank you for helping with the docs. Attached is another doc patch that
should address most of the issues you've brought up.
I've also reshuffled the text a bit to make it more like an option
description. Hope you'll find it useful.
--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
0004-btree-cleanup-docs-comments-fixes-3.patchtext/x-patch; name=0004-btree-cleanup-docs-comments-fixes-3.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ffea744..c4afd14 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1893,15 +1893,34 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
- When no tuples were deleted from the heap, B-tree indexes might still
- be scanned during <command>VACUUM</command> cleanup stage by two
- reasons. The first reason is that B-tree index contains deleted pages
- which can be recycled during cleanup. The second reason is that B-tree
- index statistics is stalled. The criterion of stalled index statistics
- is number of inserted tuples since previous statistics collection
- is greater than <varname>vacuum_cleanup_index_scale_factor</varname>
- fraction of total number of heap tuples.
+ Specifies the fraction of the total number of heap tuples counted in
+ the previous statistics collection that can be inserted without
+ incurring an index scan at the <command>VACUUM</command> cleanup stage.
+ This setting currently applies to B-tree indexes only.
</para>
+
+ <para>
+ If no tuples were deleted from the heap, B-tree indexes are still
+ scanned at the <command>VACUUM</command> cleanup stage when at least one
+ of the following conditions is met: the index statistics are stale, or
+ the index contains deleted pages that can be recycled during cleanup.
+ Index statistics are considered to be stale if the number of newly
+ inserted tuples exceeds the <varname>vacuum_cleanup_index_scale_factor</varname>
+ fraction of the total number of heap tuples detected by the previous
+ statistics collection. The total number of heap tuples is stored in
+ the index meta-page. Note that the meta-page does not include this data
+ until <command>VACUUM</command> finds no dead tuples, so B-tree index
+ scan at the cleanup stage can only be skipped if the second and
+ subsequent <command>VACUUM</command> cycles detect no dead tuples.
+ </para>
+
+ <para>
+ The value can range from <literal>0</literal> to <literal>100</literal>.
+ When <varname>vacuum_cleanup_index_scale_factor</varname> is set to
+ <literal>0</literal>, index scans are never skipped during
+ <command>VACUUM</command> cleanup. The default value is <literal>0.1</literal>.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 3bcc56e..22b4a75 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -189,7 +189,7 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
if (metad->btm_version < BTREE_VERSION)
_bt_upgrademetapage(metapg);
- /* update cleanup-related infromation */
+ /* update cleanup-related information */
metad->btm_oldest_btpo_xact = oldestBtpoXact;
metad->btm_last_cleanup_num_heap_tuples = numHeapTuples;
MarkBufferDirty(metabuf);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index d894ba0..27a3032 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -818,10 +818,11 @@ _bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
float8 cleanup_scale_factor;
/*
- * If table receives large enough amount of insertions and no cleanup
- * was performed, then index might appear to have stalled statistics.
- * In order to evade that, we perform cleanup when table receives
- * vacuum_cleanup_index_scale_factor fractions of insertions.
+ * If table receives enough insertions and no cleanup was performed,
+ * then index would appear have stale statistics. If scale factor
+ * is set, we avoid that by performing cleanup if the number of
+ * inserted tuples exceeds vacuum_cleanup_index_scale_factor fraction
+ * of original tuples count.
*/
relopts = (StdRdOptions *) info->index->rd_options;
cleanup_scale_factor = (relopts &&
@@ -870,8 +871,8 @@ btbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
&oldestBtpoXact);
/*
- * Update cleanup-related information in metapage. These information
- * is used only for cleanup but keeping up them to date can avoid
+ * Update cleanup-related information in metapage. This information
+ * is used only for cleanup but keeping them up to date can avoid
* unnecessary cleanup even after bulkdelete.
*/
_bt_update_meta_cleanup_info(info->index, oldestBtpoXact,
@@ -899,8 +900,8 @@ btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
* If btbulkdelete was called, we need not do anything, just return the
* stats from the latest btbulkdelete call. If it wasn't called, we might
* still need to do a pass over the index, to recycle any newly-recyclable
- * pages and to obtain index statistics. _bt_vacuum_needs_cleanup checks
- * is there are newly-recyclable or stalled index statistics.
+ * pages or to obtain index statistics. _bt_vacuum_needs_cleanup
+ * determines if either are needed.
*
* Since we aren't going to actually delete any leaf items, there's no
* need to go through all the vacuum-cycle-ID pushups.
thanks to everyone, pushed
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/