GUC for cleanup indexes threshold.
Hi and happy new year.
The lazy vacuum calls lazy_cleanup_index to update the statistics of the
indexes on a table, such as relpages and reltuples, at the end of
lazy_scan_heap. For all index types, lazy_cleanup_index scans all index
pages. This happens even if the table has not been updated at all since
the previous vacuum. The freeze map greatly reduces the execution time
and cost of vacuuming the table when most of the table is frozen, but it
doesn't help with index cleanup. So if a very large static table has an
index, index cleanup is still invoked, and since it always scans all
index pages, it takes a long time, as reported in [1].
The attached patch introduces a new GUC parameter,
vacuum_cleanup_index_scale_factor, which specifies the fraction of
table pages containing dead tuples needed to trigger index cleanup.
The default is 0.0, which means index cleanup is not invoked if the
table has had no updates. In other words, if the table is completely
frozen, lazy vacuum can skip the index scans as well. Increasing this
value can reduce the total time of lazy vacuum, but the index
statistics and free space map are then not updated.
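The check described above can be illustrated with a minimal sketch (the
function name and signature here are hypothetical; the patch itself
compares vacuumed_pages against nblocks * vacuum_cleanup_index_scale_factor
inside lazy_scan_heap):

```c
#include <stdbool.h>

/*
 * Hypothetical simplification of the patch's check: index cleanup runs
 * only when the number of vacuumed heap pages exceeds the fraction of
 * the table given by vacuum_cleanup_index_scale_factor. With the
 * default of 0.0, cleanup is skipped only when no page was vacuumed.
 */
bool
cleanup_index_needed(long vacuumed_pages, long table_pages,
                     double scale_factor)
{
    double threshold = (double) table_pages * scale_factor;

    return (double) vacuumed_pages > threshold;
}
```

With scale_factor 0.05 on a 1000-page table, cleanup would run only after
more than 50 pages were vacuumed.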
[1]: /messages/by-id/MWHPR20MB142177B86D893C946FAFC9A4A18C0@MWHPR20MB1421.namprd20.prod.outlook.com
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
Attachment: cleanup_indexes_threshold_v1.patch (+45 -3)
On Wed, Jan 4, 2017 at 3:21 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Attached patch introduces new GUC parameter
> vacuum_cleanup_index_scale_factor which specifies the fraction of the
> table pages containing dead tuples needed to trigger index cleanup. [...]
Cool. I'll look at this, but not until next CF.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jan 4, 2017 at 1:51 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Attached patch introduces new GUC parameter
> vacuum_cleanup_index_scale_factor which specifies the fraction of the
> table pages containing dead tuples needed to trigger index cleanup. [...]
I was looking into your patch and trying to understand how the
following piece of code works.
+ if (vacuumed_pages > cleanupidx_thresh)
+ {
+ for (i = 0; i < nindexes; i++)
+ lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+ }
So, you are skipping deletion of index entries if the clean-up index
threshold is not reached. But you are removing all dead tuples from the
heap that those same index entries point to. Hence, the index will
contain entries with invalid references. How does that work? How will
you remove those index entries later? (I'm a newbie.)
+ This parameter can only be set anywhere.
Oxymoron. :-)
--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
On Fri, Feb 10, 2017 at 8:01 PM, Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:
> I was looking into your patch and trying to understand how the
> following piece of code works.
Thank you for looking at this patch!
> + if (vacuumed_pages > cleanupidx_thresh)
> + {
> + for (i = 0; i < nindexes; i++)
> + lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
> + }
>
> So, you are skipping deletion of index entries if it does not reach
> the clean-up index threshold. But, you are removing all dead tuples
> from the heap pointed by the same index. Hence, index will contain
> entries with invalid references.
I think not. Before calling lazy_cleanup_index, all garbage on the heap
and in the index should already have been reclaimed by lazy_vacuum_heap
and lazy_vacuum_index.
> + This parameter can only be set anywhere.
> Oxymoron. :-)
Will fix it.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hi, I ran the regression tests and found some failures concerning brin
and gin, though I didn't look into them.
Here's a log:
*** /home/ideriha/postgres-master/src/test/regress/expected/brin.out 2017-02-13 11:33:43.270942937 +0900
--- /home/ideriha/postgres-master/src/test/regress/results/brin.out 2017-02-15 14:58:24.725984474 +0900
***************
*** 403,408 ****
SELECT brin_summarize_new_values('brinidx'); -- ok, no change expected
brin_summarize_new_values
---------------------------
! 0
(1 row)
--- 403,408 ----
SELECT brin_summarize_new_values('brinidx'); -- ok, no change expected
brin_summarize_new_values
---------------------------
! 5
(1 row)
======================================================================
*** /home/ideriha/postgres-master/src/test/regress/expected/gin.out 2016-12-20 16:49:09.513050050 +0900
--- /home/ideriha/postgres-master/src/test/regress/results/gin.out 2017-02-15 14:58:25.536984461 +0900
***************
*** 20,26 ****
select gin_clean_pending_list('gin_test_idx'); -- nothing to flush
gin_clean_pending_list
------------------------
! 0
(1 row)
-- Test vacuuming
--- 20,26 ----
select gin_clean_pending_list('gin_test_idx'); -- nothing to flush
gin_clean_pending_list
------------------------
! 8
(1 row)
-- Test vacuuming
======================================================================
Regards,
Ideriha Takeshi
On Wed, Feb 15, 2017 at 4:39 PM, Ideriha, Takeshi
<ideriha.takeshi@jp.fujitsu.com> wrote:
> Hi, I tried regression test and found some errors concerning brin and
> gin, though I didn't look into this. [...]
Thank you for testing!
It's a bug. Attached latest version patch, which passed make check.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
Attachment: skip_cleanup_indexdes_v2.patch (+45 -3)
On Mon, Feb 13, 2017 at 1:01 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thanks for the explanation. I've looked into the code you referred to,
but I'm still in doubt: vacuumed_pages is incremented only when there
are no indexes, i.e. nindexes = 0. Now, look at the following part of
the patch.
+ /*
+ * Do post-vacuum cleanup and statistics update for each index if
+ * the number of vacuumed page exceeds threshold.
+ */
+ cleanupidx_thresh = (float4) nblocks * vacuum_cleanup_index_scale;
+
+ elog(DEBUG3, "%s: vac: %d (threshold %0.f)",
+ RelationGetRelationName(onerel), nblocks, cleanupidx_thresh);
+ if (vacuumed_pages >= cleanupidx_thresh)
+ {
+ for (i = 0; i < nindexes; i++)
+ lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
+ }
So, unless vacuum_cleanup_index_scale_thresh is zero,
lazy_cleanup_index will never be called. IMO, this seems to be
incorrect. Besides, I've tested with non-zero(0.5)
vacuum_cleanup_index_scale_thresh and the regression tests for brin
and gin fails. (make installcheck)
+ {"vacuum_cleanup_index_scale_factor", PGC_USERSET, CLIENT_CONN_STATEMENT,
+ gettext_noop("Number of pages containing dead tuples prior to vacuum as a fraction of relpages."),
+ NULL
+ },
+ &vacuum_cleanup_index_scale,
+ 0.0, 0.0, 100.0,
+ NULL, NULL, NULL
+ },
Maximum value for vacuum_cleanup_index_scale_factor should be 1
instead of 100. As the code indicates, it is certainly not treated as
a percentage fraction of relpages.
--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
On 15 February 2017 at 08:07, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> It's a bug. Attached latest version patch, which passed make check.
In its current form, I'm not sure this is a good idea. Problems...
1. I'm pretty sure the world doesn't need another VACUUM parameter
I suggest that we use the existing vacuum scale factor/4 to reflect
that indexes are more sensitive to bloat.
2. The current btree vacuum code requires 2 vacuums to fully reuse
half-dead pages. So skipping an index vacuum might mean that second
index scan never happens at all, which would be bad.
I suggest that we store the number of half-dead pages in the metapage
after each VACUUM, so we can decide whether to skip the scan or not.
And we use some math like each half-dead page that needs to be reused
is worth 250 index entries, so the decision to skip is based upon rows
and empty pages, not just recently vacuumed rows.
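A minimal sketch of that decision rule, assuming the half-dead page
count is available from the metapage; the 250-entries-per-page weight
and all names here are illustrative, not from any patch:

```c
#include <stdbool.h>

/* Illustrative weight: each half-dead page that needs to be reused
 * is treated as worth roughly 250 index entries. */
#define HALF_DEAD_PAGE_WEIGHT 250L

/*
 * Decide whether the index cleanup scan may be skipped, based on both
 * recently vacuumed rows and the number of half-dead pages recorded by
 * the previous VACUUM.
 */
bool
skip_index_cleanup(long vacuumed_entries, long half_dead_pages,
                   long threshold_entries)
{
    long effective = vacuumed_entries +
                     half_dead_pages * HALF_DEAD_PAGE_WEIGHT;

    return effective < threshold_entries;
}
```

Under this rule, an index with even a few half-dead pages soon crosses
the threshold, so the scan that makes those pages recyclable still runs.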
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Feb 16, 2017 at 6:17 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> In its current form, I'm not sure this is a good idea. Problems...
>
> 1. I'm pretty sure the world doesn't need another VACUUM parameter
>
> I suggest that we use the existing vacuum scale factor/4 to reflect
> that indexes are more sensitive to bloat.
I do not think it's a good idea to control multiple behaviors with a
single GUC. We don't really know that dividing by 4 will be right for
everyone, or even for most people. It's better to have another
parameter with a sensible default than to hardcode a ratio that might
work out poorly for some people.
> 2. The current btree vacuum code requires 2 vacuums to fully reuse
> half-dead pages. So skipping an index vacuum might mean that second
> index scan never happens at all, which would be bad.
Maybe. If there are a tiny number of those half-dead pages in a huge
index, it probably doesn't matter. Also, I don't think it would never
happen, unless the table just never gets any more updates or deletes -
but that case could also happen today. It's just a matter of
happening less frequently.
I guess the question is whether the accumulation of half-dead pages in
the index could become a problem before the unsetting of all-visible
bits in the heap becomes a problem. If the second one always happens
first, then we don't have an issue here, but if it's possible for the
first one to become a big problem before the second one gets to be a
serious issue, then we need something more sophisticated.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Feb 17, 2017 at 3:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Feb 16, 2017 at 6:17 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> 2. The current btree vacuum code requires 2 vacuums to fully reuse
>> half-dead pages. So skipping an index vacuum might mean that second
>> index scan never happens at all, which would be bad.
>
> Maybe. If there are a tiny number of those half-dead pages in a huge
> index, it probably doesn't matter. [...]
The half-dead pages are never cleaned up if the ratio of pages
containing garbage always stays below the threshold. Also, in a gin
index the pending list is never cleared, which could become a big
problem. I guess that we should take action for each type of index.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On 2/19/17 7:56 PM, Masahiko Sawada wrote:
> The half-dead pages are never cleaned up if the ratio of pages
> containing garbage is always lower than threshold. Also in gin index
> the pending list is never cleared, which become big problem. I guess
> that we should take action for each type of indexes.
What worries me is that each AM is going to have a different notion of
what needs to happen to support this. That indicates that trying to
handle this at the vacuum level is not a good idea.
I think it would be wiser to add support for skipping scans to the AM
API instead. That also means you don't have to add support for this to
every index type to start with.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On Mon, Feb 20, 2017 at 11:35 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> What worries me is that each AM is going to have a different notion of
> what needs to happen to support this. That indicates that trying to
> handle this at the vacuum level is not a good idea.
>
> I think it would be wiser to add support for skipping scans to the AM
> API instead. That also means you don't have to add support for this to
> every index type to start with.
Yeah, and it's better to have it as an index storage parameter.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Mon, Feb 20, 2017 at 7:26 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Fri, Feb 17, 2017 at 3:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Maybe. If there are a tiny number of those half-dead pages in a huge
>> index, it probably doesn't matter. Also, I don't think it would never
>> happen, unless the table just never gets any more updates or deletes -
>> but that case could also happen today. It's just a matter of
>> happening less frequently.
Yeah, that's right, and I am not sure it is worth performing a
complete pass to reclaim dead/deleted pages unless we know somehow
that there are many such pages. Also, I think we do reclaim the
complete page while allocating a new page in btree.
> The half-dead pages are never cleaned up if the ratio of pages
> containing garbage is always lower than threshold.
Which threshold are you referring to here?
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 20 February 2017 at 09:15, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Yeah thats right and I am not sure if it is worth to perform a
> complete pass to reclaim dead/deleted pages unless we know someway
> that there are many such pages.
Agreed.... which is why
> On 16 February 2017 at 11:17, Simon Riggs <simon@2ndquadrant.com> wrote:
>> I suggest that we store the number of half-dead pages in the metapage
>> after each VACUUM, so we can decide whether to skip the scan or not.
>
> Also, I think we do reclaim the
> complete page while allocating a new page in btree.
That's not how it works according to the README at least.
You might be referring to cleaning out killed tuples just before a
page split? That's something different.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Feb 20, 2017 at 3:01 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 20 February 2017 at 09:15, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Also, I think we do reclaim the
>> complete page while allocating a new page in btree.
>
> That's not how it works according to the README at least.
I am referring to code (_bt_getbuf()->if (_bt_page_recyclable(page)));
won't that help us in reclaiming the space?
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Mon, Feb 20, 2017 at 6:15 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Feb 20, 2017 at 7:26 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> The half-dead pages are never cleaned up if the ratio of pages
>> containing garbage is always lower than threshold.
>
> Which threshold are you referring here?
I meant the new parameter in current patch.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On 20 February 2017 at 10:27, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Feb 20, 2017 at 3:01 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> That's not how it works according to the README at least.
>
> I am referring to code (_bt_getbuf()->if (_bt_page_recyclable(page))),
> won't that help us in reclaiming the space?
Not unless the README is incorrect, no.
That section of code is just a retest of pages retrieved from FSM;
they aren't even added there until two scans have occurred and even
then it may not be possible to recycle.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Feb 16, 2017 at 10:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I guess the question is whether the accumulation of half-dead pages in
> the index could become a problem before the unsetting of all-visible
> bits in the heap becomes a problem. If the second one always happens
> first, then we don't have an issue here, but if it's possible for the
> first one to become a big problem before the second one gets to be a
> serious issue, then we need something more sophisticated.
Not getting to a second VACUUM where you might have otherwise can only
be a problem to the extent that users are sensitive to not reclaiming
disk space from indexes at the level of the FSM. It's not accurate to
say that pages could be left "half dead" indefinitely by this patch,
since that is something that lasts only as long as the first phase of
B-Tree page deletion. In fact, the only possible problem is that pages
are recyclable in principle, but that doesn't happen due to this new
GUC.
That isn't analogous to heap bloat at all, because it's not as if
there are any downlinks or right links or left links pointing to the
recyclable (fully deleted) pages; the previous key space *has* in fact
been *fully* reclaimed. These pages are fully dead, and as such are
out of the critical path of index scans entirely once the second phase
finishes. (They only need to continue to physically exist because old
index scans might follow a stale pointer).
Note that there is an interlock against RecentGlobalXmin, respected by
VACUUM, that prevents this sort of recycling. I suspect that the
restrictions on page deletion, as opposed to page recycling, are vastly
more likely to cause pain to users, and that's not made any worse by
this.
--
Peter Geoghegan
On Tue, Feb 21, 2017 at 1:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 20 February 2017 at 10:27, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> I am referring to code (_bt_getbuf()->if (_bt_page_recyclable(page))),
>> won't that help us in reclaiming the space?
>
> Not unless the README is incorrect, no.
Just to ensure that we both have the same understanding, let me try to
write what I understand about this reclaim algorithm. AFAIU, in the
first pass vacuum will mark the half-dead pages as Deleted, and in the
second pass it will record such pages as free in the FSM so that they
can be reused as new pages when the indexam asks for a new block,
instead of extending the index relation. Now, if we introduce this new
GUC, then there are chances that we sometimes skip the second pass
where it would not have been skipped before.
Note that we do perform the second pass in the same vacuum cycle when
index has not been scanned for deleting the tuples as per below code:
btvacuumcleanup()
{
..
    if (stats == NULL)
    {
        stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
        btvacuumscan(info, stats, NULL, NULL, 0);
..
    }
In the above code, stats won't be NULL if the vacuum has scanned the
index for deleting tuples (btbulkdelete). So, based on this, I think
it will skip scanning the index (or recycling pages marked as deleted)
in the second vacuum only when there are no dead tuple removals in
that vacuum. Do we agree till here?
I understand that there could be some delay in reclaiming dead pages
but do you think it is such a big deal that we completely scan the
index for such cases or even try to change the metapage format?
That section of code is just a retest of pages retrieved from FSM;
Yes, I think you are right. In short, I agree that only vacuum can
reclaim half-dead pages.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Wed, Feb 22, 2017 at 12:01 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 21, 2017 at 1:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 20 February 2017 at 10:27, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 20, 2017 at 3:01 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 20 February 2017 at 09:15, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 20, 2017 at 7:26 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Feb 17, 2017 at 3:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Feb 16, 2017 at 6:17 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 15 February 2017 at 08:07, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
It's a bug. Attached latest version patch, which passed make check.
2. The current btree vacuum code requires 2 vacuums to fully reuse
half-dead pages. So skipping an index vacuum might mean that second
index scan never happens at all, which would be bad.

Maybe. If there are a tiny number of those half-dead pages in a huge
index, it probably doesn't matter. Also, I don't think it would never
happen, unless the table just never gets any more updates or deletes -
but that case could also happen today. It's just a matter of it
happening less frequently.

Yeah, that's right, and I am not sure if it is worth performing a
complete pass to reclaim dead/deleted pages unless we know some way
that there are many such pages.

Agreed... which is why
On 16 February 2017 at 11:17, Simon Riggs <simon@2ndquadrant.com> wrote:

I suggest that we store the number of half-dead pages in the metapage
after each VACUUM, so we can decide whether to skip the scan or not.

Also, I think we do reclaim the complete page while allocating a new
page in btree.

That's not how it works according to the README at least.

I am referring to the code (_bt_getbuf() -> if (_bt_page_recyclable(page))) -
won't that help us in reclaiming the space?

Not unless the README is incorrect, no.
Just to ensure that we both have the same understanding, let me try to
write what I understand about this reclaim algorithm. AFAIU, in the
first pass vacuum will mark the half-dead pages as Deleted, and in the
second pass it will record such pages as free in the FSM so that they
can be reused as new pages when the indexam asks for a new block,
instead of extending the index relation. Now, if we introduce this new
GUC, then there are chances that we sometimes skip the second pass
where it would not have been skipped before.

Note that we do perform the second pass in the same vacuum cycle when
the index has not been scanned for deleting the tuples, as per the below code:
The first pass uses cycle id given by _bt_start_vacuum, but the second
pass always uses cycle id 0.
btvacuumcleanup()
{
..
    if (stats == NULL)
    {
        stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
        btvacuumscan(info, stats, NULL, NULL, 0);
..
    }

In the above code, stats won't be NULL if the vacuum has scanned the
index for deleting tuples (btbulkdelete). So, based on this, I think
it will skip scanning the index (or recycling pages marked as deleted)
in the second vacuum only when there are no dead tuple removals in
that vacuum. Do we agree till here?
Agreed.
I understand that there could be some delay in reclaiming dead pages
but do you think it is such a big deal that we completely scan the
index for such cases or even try to change the metapage format?
IIUC, I think that we need to have the number of half-dead pages in the meta page.
Isn't it a problem that the free space map of the btree index is not
vacuumed if all vacuums skip the second pass?
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center