Questions/Observations related to Gist vacuum

Started by Amit Kapila, about 6 years ago · 34 messages
#1 Amit Kapila
amit.kapila16@gmail.com

While reviewing a parallel vacuum patch [1], we noticed a few things
about $SUBJECT implemented in commit -
7df159a620b760e289f1795b13542ed1b3e13b87.

1. A new memory context GistBulkDeleteResult->page_set_context has
been introduced, but it doesn't seem to be used.
2. Right now, in gistbulkdelete we make a note of empty leaf pages and
internal pages, and then in the second pass, during gistvacuumcleanup,
we delete all the empty leaf pages. I was wondering why, unlike nbtree,
we have delayed the deletion of empty pages till gistvacuumcleanup. I
don't see any problem if we do this during gistbulkdelete itself,
similar to nbtree; also, I think there is some advantage in marking the
pages as deleted as early as possible. Basically, if the vacuum
operation is canceled or errors out between gistbulkdelete and
gistvacuumcleanup, then the deleted pages could be marked as recyclable
very early in the next vacuum operation. The other advantage of doing
this during gistbulkdelete is that we can avoid sharing information
between gistbulkdelete and gistvacuumcleanup, which is quite helpful
for a parallel vacuum, as the information is not trivial (it is
internally stored as an in-memory B-tree). OTOH, there might be some
advantage to delaying the deletion of pages, especially in the case of
multiple scans during a single VACUUM command: we could probably delete
all empty leaf pages in one go, which could in some cases lead to fewer
internal page reads. However, I am not sure whether it is really
advantageous to postpone the deletion, as there seem to be some
downsides to it as well. I don't see it documented why, unlike nbtree,
we delay the deletion of empty pages till gistvacuumcleanup, but I
might be missing something.

Thoughts?

[1]: /messages/by-id/CAA4eK1JEQ2y3uNucNopDjK8pse6xSe5=_oknoWfRQvAF=VqsBA@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#2 Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Amit Kapila (#1)
Re: Questions/Observations related to Gist vacuum

On 15/10/2019 09:37, Amit Kapila wrote:

While reviewing a parallel vacuum patch [1], we noticed a few things
about $SUBJECT implemented in commit -
7df159a620b760e289f1795b13542ed1b3e13b87.

1. A new memory context GistBulkDeleteResult->page_set_context has
been introduced, but it doesn't seem to be used.

Oops. internal_page_set and empty_leaf_set were supposed to be allocated
in that memory context. As things stand, we leak them until end of
vacuum, in a multi-pass vacuum.

2. Right now, in gistbulkdelete we make a note of empty leaf pages and
internals pages and then in the second pass during gistvacuumcleanup,
we delete all the empty leaf pages. I was thinking why unlike nbtree,
we have delayed the deletion of empty pages till gistvacuumcleanup. I
don't see any problem if we do this during gistbulkdelete itself
similar to nbtree, also I think there is some advantage in marking the
pages as deleted as early as possible. Basically, if the vacuum
operation is canceled or errored out between gistbulkdelete and
gistvacuumcleanup, then I think the deleted pages could be marked as
recyclable very early in next vacuum operation. The other advantage
of doing this during gistbulkdelete is we can avoid sharing
information between gistbulkdelete and gistvacuumcleanup which is
quite helpful for a parallel vacuum as the information is not trivial
(it is internally stored as in-memory Btree). OTOH, there might be
some advantage for delaying the deletion of pages especially in the
case of multiple scans during a single VACUUM command. We can
probably delete all empty leaf pages in one go which could in some
cases lead to fewer internal page reads. However, I am not sure if it
is really advantageous to postpone the deletion as there seem to be
some downsides to it as well. I don't see it documented why unlike
nbtree we consider delaying deletion of empty pages till
gistvacuumcleanup, but I might be missing something.

Hmm. The thinking is/was that removing the empty pages is somewhat
expensive, because it has to scan all the internal nodes to find the
downlinks to the to-be-deleted pages. Furthermore, it needs to scan all
the internal pages (or at least until it has found all the downlinks),
regardless of how many empty pages are being deleted. So it makes sense
to do it only once, for all the empty pages. You're right, though, that
there would also be advantages in doing it after each pass. All things
considered, I'm not sure which is better.

- Heikki

#3 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Heikki Linnakangas (#2)
1 attachment(s)
Re: Questions/Observations related to Gist vacuum

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 15/10/2019 09:37, Amit Kapila wrote:

While reviewing a parallel vacuum patch [1], we noticed a few things
about $SUBJECT implemented in commit -
7df159a620b760e289f1795b13542ed1b3e13b87.

1. A new memory context GistBulkDeleteResult->page_set_context has
been introduced, but it doesn't seem to be used.

Oops. internal_page_set and empty_leaf_set were supposed to be allocated
in that memory context. As things stand, we leak them until end of
vacuum, in a multi-pass vacuum.

Here is a patch to fix this issue.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

user_correct_memorycontext_gist_stat.patch (application/octet-stream)
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index bf754ea..31e2571 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -169,6 +169,7 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	BlockNumber num_pages;
 	bool		needLock;
 	BlockNumber blkno;
+	MemoryContext oldctx = NULL;
 
 	/*
 	 * Reset counts that will be incremented during the scan; needed in case
@@ -179,8 +180,17 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	stats->stats.pages_deleted = 0;
 	stats->stats.pages_free = 0;
 	MemoryContextReset(stats->page_set_context);
+
+	/*
+	 * Create the internal_page_set and the empty_leaf_set in the
+	 * page_set_context.  Internally, integer set will remember this context
+	 * so the subsequent allocations for these integer sets will be done from
+	 * the same context.
+	 */
+	oldctx = MemoryContextSwitchTo(stats->page_set_context);
 	stats->internal_page_set = intset_create();
 	stats->empty_leaf_set = intset_create();
+	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
 	vstate.info = info;
#4 Amit Kapila
amit.kapila16@gmail.com
In reply to: Heikki Linnakangas (#2)
Re: Questions/Observations related to Gist vacuum

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 15/10/2019 09:37, Amit Kapila wrote:

2. Right now, in gistbulkdelete we make a note of empty leaf pages and
internals pages and then in the second pass during gistvacuumcleanup,
we delete all the empty leaf pages. I was thinking why unlike nbtree,
we have delayed the deletion of empty pages till gistvacuumcleanup. I
don't see any problem if we do this during gistbulkdelete itself
similar to nbtree, also I think there is some advantage in marking the
pages as deleted as early as possible. Basically, if the vacuum
operation is canceled or errored out between gistbulkdelete and
gistvacuumcleanup, then I think the deleted pages could be marked as
recyclable very early in next vacuum operation. The other advantage
of doing this during gistbulkdelete is we can avoid sharing
information between gistbulkdelete and gistvacuumcleanup which is
quite helpful for a parallel vacuum as the information is not trivial
(it is internally stored as in-memory Btree). OTOH, there might be
some advantage for delaying the deletion of pages especially in the
case of multiple scans during a single VACUUM command. We can
probably delete all empty leaf pages in one go which could in some
cases lead to fewer internal page reads. However, I am not sure if it
is really advantageous to postpone the deletion as there seem to be
some downsides to it as well. I don't see it documented why unlike
nbtree we consider delaying deletion of empty pages till
gistvacuumcleanup, but I might be missing something.

Hmm. The thinking is/was that removing the empty pages is somewhat
expensive, because it has to scan all the internal nodes to find the
downlinks to the to-be-deleted pages. Furthermore, it needs to scan all
the internal pages (or at least until it has found all the downlinks),
regardless of how many empty pages are being deleted. So it makes sense
to do it only once, for all the empty pages. You're right though, that
there would be advantages, too, in doing it after each pass.

I was thinking more about this, and it seems there could be more cases
where delaying the delete-marking of pages can further delay the
recycling of pages. It is quite possible that immediately after bulk
delete the value of nextFullXid (used as deleteXid) is X, and during
vacuum cleanup it can be X + N, where N is more likely to be large when
there are multiple passes of the vacuum scan. Now, if we had set the
value of deleteXid to X, the next vacuum would have a better chance of
recycling the page. I am not sure, but in the future we might come up
with something (say, if we can recompute RecentGlobalXmin again) that
lets us recycle pages from the first index scan in the next scan of the
index during a single vacuum operation.

This is just to emphasize the point that delete-marking pages in the
same pass has advantages; otherwise, I understand that there are
advantages in delaying it as well.

All things
considered, I'm not sure which is better.

Yeah, this is a tough call to make, but if we can allow it to delete
the pages in bulkdelete conditionally for parallel vacuum workers,
then it would be better.

I think we have the below options w.r.t. GiST indexes for parallel vacuum:
a. don't allow GiST indexes to participate in parallel vacuum
b. allow deletion of empty leaf pages in bulkdelete for parallel workers
c. always allow deleting empty leaf pages in bulkdelete
d. invent some mechanism to share all the GiST stats via shared memory

(a) is not a very good option, but it is a safe one, as we can extend
it in the future, and we might decide to go with it, especially if we
can't decide among the other options. (b) would serve the need but
would add some additional checks in gistbulkdelete and would look more
like a hack. (c) would be best, but I think it will be difficult to be
sure that it is a good decision for all types of cases. (d) could
require a lot of effort, and I am not sure it is worth adding that
complexity to the proposed patch.

Do you have any thoughts on this?

Just to give you an idea of the current parallel vacuum patch: the
master backend scans the heap and forms the dead tuple array in DSM,
and then we launch one worker for each index, based on the availability
of workers, and share the dead tuple array with each worker. Each
worker performs bulkdelete for its index. At the end, we perform
cleanup of all the indexes, either via a worker or the master backend,
based on some conditions.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#5 Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Amit Kapila (#4)
Re: Questions/Observations related to Gist vacuum

On 16 October 2019 12:57:03 CEST, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi>
wrote:

All things
considered, I'm not sure which is better.

Yeah, this is a tough call to make, but if we can allow it to delete
the pages in bulkdelete conditionally for parallel vacuum workers,
then it would be better.

Yeah, if it's needed for parallel vacuum, maybe that tips the scale.

Hopefully, multi-pass vacuums are rare in practice. And we should lift the current 1 GB limit on the dead TID array, replacing it with something more compact and expandable, to make multi-pass vacuums even more rare. So I don't think we need to jump through many hoops to optimize the multi-pass case.

- Heikki

#6 Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#3)
1 attachment(s)
Re: Questions/Observations related to Gist vacuum

On Wed, Oct 16, 2019 at 11:20 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 15/10/2019 09:37, Amit Kapila wrote:

While reviewing a parallel vacuum patch [1], we noticed a few things
about $SUBJECT implemented in commit -
7df159a620b760e289f1795b13542ed1b3e13b87.

1. A new memory context GistBulkDeleteResult->page_set_context has
been introduced, but it doesn't seem to be used.

Oops. internal_page_set and empty_leaf_set were supposed to be allocated
in that memory context. As things stand, we leak them until end of
vacuum, in a multi-pass vacuum.

Here is a patch to fix this issue.

The patch looks good to me. I have slightly modified the comments and
removed unnecessary initialization.

Heikki, are you fine with me committing and backpatching this to 12?
Let me know if you have a different idea for the fix.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

0001-Fix-memory-leak-introduced-in-commit-7df159a620.patch (application/octet-stream)
From b8ebd7a97e3f22d1f0643100bb45f9a48ce3fedc Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 17 Oct 2019 08:45:43 +0530
Subject: [PATCH] Fix memory leak introduced in commit 7df159a620.

We memorize all internal and empty leaf pages in the 1st vacuum stage for
gist indexes.  They are used in the 2nd stage, to delete all the empty
pages.  There was a memory context page_set_context for this purpose, but
we never used it.

Reported-by: Amit Kapila
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 12, where it got introduced
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
---
 src/backend/access/gist/gistvacuum.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index bf754ea..710e401 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -169,6 +169,7 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	BlockNumber num_pages;
 	bool		needLock;
 	BlockNumber blkno;
+	MemoryContext oldctx;
 
 	/*
 	 * Reset counts that will be incremented during the scan; needed in case
@@ -179,8 +180,17 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	stats->stats.pages_deleted = 0;
 	stats->stats.pages_free = 0;
 	MemoryContextReset(stats->page_set_context);
+
+	/*
+	 * Create the integer sets to remember all the internal and the empty leaf
+	 * pages in page_set_context.  Internally, the integer set will remember
+	 * this context so that the subsequent allocations for these integer sets
+	 * will be done from the same context.
+	 */
+	oldctx = MemoryContextSwitchTo(stats->page_set_context);
 	stats->internal_page_set = intset_create();
 	stats->empty_leaf_set = intset_create();
+	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
 	vstate.info = info;
-- 
1.8.3.1

#7 Amit Kapila
amit.kapila16@gmail.com
In reply to: Heikki Linnakangas (#5)
Re: Questions/Observations related to Gist vacuum

On Wed, Oct 16, 2019 at 7:21 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 16 October 2019 12:57:03 CEST, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi>
wrote:

All things
considered, I'm not sure which is better.

Yeah, this is a tough call to make, but if we can allow it to delete
the pages in bulkdelete conditionally for parallel vacuum workers,
then it would be better.

Yeah, if it's needed for parallel vacuum, maybe that tips the scale.

Makes sense. I think we can write a patch for it and prepare the
parallel vacuum patch on top of it. Once the parallel vacuum is in
committable shape, we can commit the GiST-index-related patch first,
followed by the parallel vacuum patch.

Hopefully, multi-pass vacuums are rare in practice. And we should lift the current 1 GB limit on the dead TID array, replacing it with something more compact and expandable, to make multi-pass vacuums even more rare. So I don't think we need to jump through many hoops to optimize the multi-pass case.

Yeah, that will be a good improvement.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#8 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#7)
Re: Questions/Observations related to Gist vacuum

On Thu, Oct 17, 2019 at 9:15 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Oct 16, 2019 at 7:21 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 16 October 2019 12:57:03 CEST, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi>
wrote:

All things
considered, I'm not sure which is better.

Yeah, this is a tough call to make, but if we can allow it to delete
the pages in bulkdelete conditionally for parallel vacuum workers,
then it would be better.

Yeah, if it's needed for parallel vacuum, maybe that tips the scale.

makes sense. I think we can write a patch for it and prepare the
parallel vacuum patch on top of it. Once the parallel vacuum is in a
committable shape, we can commit the gist-index related patch first
followed by parallel vacuum patch.

+1
I can write a patch for the same.

Hopefully, multi-pass vacuums are rare in practice. And we should lift the current 1 GB limit on the dead TID array, replacing it with something more compact and expandable, to make multi-pass vacuums even more rare. So I don't think we need to jump through many hoops to optimize the multi-pass case.

Yeah, that will be a good improvement.

+1

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#9 Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Amit Kapila (#6)
Re: Questions/Observations related to Gist vacuum

On 17/10/2019 05:31, Amit Kapila wrote:

On Wed, Oct 16, 2019 at 11:20 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 15/10/2019 09:37, Amit Kapila wrote:

While reviewing a parallel vacuum patch [1], we noticed a few things
about $SUBJECT implemented in commit -
7df159a620b760e289f1795b13542ed1b3e13b87.

1. A new memory context GistBulkDeleteResult->page_set_context has
been introduced, but it doesn't seem to be used.

Oops. internal_page_set and empty_leaf_set were supposed to be allocated
in that memory context. As things stand, we leak them until end of
vacuum, in a multi-pass vacuum.

Here is a patch to fix this issue.

The patch looks good to me. I have slightly modified the comments and
removed unnecessary initialization.

Heikki, are you fine me committing and backpatching this to 12? Let
me know if you have a different idea to fix.

Thanks! Looks good to me. Did either of you test it, though, with a
multi-pass vacuum?

- Heikki

#10 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Heikki Linnakangas (#9)
Re: Questions/Observations related to Gist vacuum

On Thu, Oct 17, 2019 at 12:27 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 17/10/2019 05:31, Amit Kapila wrote:

On Wed, Oct 16, 2019 at 11:20 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 15/10/2019 09:37, Amit Kapila wrote:

While reviewing a parallel vacuum patch [1], we noticed a few things
about $SUBJECT implemented in commit -
7df159a620b760e289f1795b13542ed1b3e13b87.

1. A new memory context GistBulkDeleteResult->page_set_context has
been introduced, but it doesn't seem to be used.

Oops. internal_page_set and empty_leaf_set were supposed to be allocated
in that memory context. As things stand, we leak them until end of
vacuum, in a multi-pass vacuum.

Here is a patch to fix this issue.

The patch looks good to me. I have slightly modified the comments and
removed unnecessary initialization.

Heikki, are you fine me committing and backpatching this to 12? Let
me know if you have a different idea to fix.

Thanks! Looks good to me. Did either of you test it, though, with a
multi-pass vacuum?

From my side, I have tested it with a multi-pass vacuum using a GiST
index, and after the fix it's using the expected memory context.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#11 Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#10)
Re: Questions/Observations related to Gist vacuum

On Thu, Oct 17, 2019 at 1:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 12:27 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 17/10/2019 05:31, Amit Kapila wrote:

The patch looks good to me. I have slightly modified the comments and
removed unnecessary initialization.

Heikki, are you fine me committing and backpatching this to 12? Let
me know if you have a different idea to fix.

Thanks! Looks good to me. Did either of you test it, though, with a
multi-pass vacuum?

From my side, I have tested it with the multi-pass vacuum using the
gist index and after the fix, it's using expected memory context.

I have also verified that, but I think what we can additionally test
here is that without the patch it leaks memory into
TopTransactionContext (the CurrentMemoryContext), whereas with the
patch it shouldn't leak during a multi-pass vacuum.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#12 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#11)
Re: Questions/Observations related to Gist vacuum

On Thu, 17 Oct 2019, 14:59 Amit Kapila, <amit.kapila16@gmail.com> wrote:


On Thu, Oct 17, 2019 at 1:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 12:27 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 17/10/2019 05:31, Amit Kapila wrote:

The patch looks good to me. I have slightly modified the comments and
removed unnecessary initialization.

Heikki, are you fine me committing and backpatching this to 12? Let
me know if you have a different idea to fix.

Thanks! Looks good to me. Did either of you test it, though, with a
multi-pass vacuum?

From my side, I have tested it with the multi-pass vacuum using the
gist index and after the fix, it's using expected memory context.

I have also verified that, but I think what additionally we can test
here is that without the patch it will leak the memory in
TopTransactionContext (CurrentMemoryContext), but after patch it
shouldn't leak it during multi-pass vacuum.

Makes sense to me; I will test that by tomorrow.

#13 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#12)
Re: Questions/Observations related to Gist vacuum

On Thu, Oct 17, 2019 at 6:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, 17 Oct 2019, 14:59 Amit Kapila, <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 1:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 12:27 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 17/10/2019 05:31, Amit Kapila wrote:

The patch looks good to me. I have slightly modified the comments and
removed unnecessary initialization.

Heikki, are you fine me committing and backpatching this to 12? Let
me know if you have a different idea to fix.

Thanks! Looks good to me. Did either of you test it, though, with a
multi-pass vacuum?

From my side, I have tested it with the multi-pass vacuum using the
gist index and after the fix, it's using expected memory context.

I have also verified that, but I think what additionally we can test
here is that without the patch it will leak the memory in
TopTransactionContext (CurrentMemoryContext), but after patch it
shouldn't leak it during multi-pass vacuum.

Make sense to me, I will test that by tomorrow.

I have performed a test to observe the memory usage in
TopTransactionContext, using the MemoryContextStats function from gdb.

For testing this, in gistvacuumscan, every time after it resets
page_set_context, I collected a sample. I collected 3 samples each for
head and for the patch. We can clearly see that on head the memory
accumulates in TopTransactionContext, whereas with the patch no memory
accumulates.

head:
TopTransactionContext: 1056832 total in 2 blocks; 3296 free (5
chunks); 1053536 used
GiST VACUUM page set context: 112 total in 0 blocks (0 chunks); 0
free (0 chunks); 112 used
Grand total: 1056944 bytes in 2 blocks; 3296 free (5 chunks); 1053648 used

TopTransactionContext: 1089600 total in 4 blocks; 19552 free (14
chunks); 1070048 used
GiST VACUUM page set context: 112 total in 0 blocks (0 chunks); 0
free (0 chunks); 112 used
Grand total: 1089712 bytes in 4 blocks; 19552 free (14 chunks); 1070160 used

TopTransactionContext: 1122368 total in 5 blocks; 35848 free (20
chunks); 1086520 used
GiST VACUUM page set context: 112 total in 0 blocks (0 chunks); 0
free (0 chunks); 112 used
Grand total: 1122480 bytes in 5 blocks; 35848 free (20 chunks); 1086632 used

With Patch:
TopTransactionContext: 1056832 total in 2 blocks; 3296 free (1
chunks); 1053536 used
GiST VACUUM page set context: 112 total in 0 blocks (0 chunks); 0
free (0 chunks); 112 used
Grand total: 1056944 bytes in 2 blocks; 3296 free (1 chunks); 1053648 used

TopTransactionContext: 1056832 total in 2 blocks; 3296 free (1
chunks); 1053536 used
GiST VACUUM page set context: 112 total in 0 blocks (0 chunks); 0
free (0 chunks); 112 used
Grand total: 1056944 bytes in 2 blocks; 3296 free (1 chunks); 1053648 used

TopTransactionContext: 1056832 total in 2 blocks; 3296 free (1
chunks); 1053536 used
GiST VACUUM page set context: 112 total in 0 blocks (0 chunks); 0
free (0 chunks); 112 used
Grand total: 1056944 bytes in 2 blocks; 3296 free (1 chunks); 1053648 used
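For anyone wanting to repeat the sampling procedure above, a hypothetical gdb command file might look like the following (a sketch, not tested here; it assumes a debug build and attaches to the backend running VACUUM with `gdb -x this_file -p <backend_pid>`; gistvacuumscan is entered once per bulkdelete pass and once at cleanup, so breaking there samples at each pass; MemoryContextStats prints to the backend's stderr):

```
break gistvacuumscan
commands
  call MemoryContextStats(TopTransactionContext)
  continue
end
continue
```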

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#14 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Heikki Linnakangas (#5)
Re: Questions/Observations related to Gist vacuum

On Wed, Oct 16, 2019 at 7:22 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 16 October 2019 12:57:03 CEST, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi>
wrote:

All things
considered, I'm not sure which is better.

Yeah, this is a tough call to make, but if we can allow it to delete
the pages in bulkdelete conditionally for parallel vacuum workers,
then it would be better.

Yeah, if it's needed for parallel vacuum, maybe that tips the scale.

Are we planning to do this only if it is called from parallel vacuum
workers or in general?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#15 Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#13)
Re: Questions/Observations related to Gist vacuum

On Fri, Oct 18, 2019 at 9:34 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 6:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, 17 Oct 2019, 14:59 Amit Kapila, <amit.kapila16@gmail.com> wrote:

On Thu, Oct 17, 2019 at 1:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Oct 17, 2019 at 12:27 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

Thanks! Looks good to me. Did either of you test it, though, with a
multi-pass vacuum?

From my side, I have tested it with the multi-pass vacuum using the
gist index and after the fix, it's using expected memory context.

I have also verified that, but I think what additionally we can test
here is that without the patch it will leak the memory in
TopTransactionContext (CurrentMemoryContext), but after patch it
shouldn't leak it during multi-pass vacuum.

Make sense to me, I will test that by tomorrow.

I have performed the test to observe the memory usage in the
TopTransactionContext using the MemoryContextStats function from gdb.

For testing this, in gistvacuumscan every time, after it resets the
page_set_context, I have collected the sample. I have collected 3
samples for both the head and the patch. We can clearly see that on
the head the memory is getting accumulated in the
TopTransactionContext whereas with the patch there is no memory
getting accumulated.

Thanks for the test. It shows that prior to the patch, memory was
leaked into TopTransactionContext during a multi-pass vacuum, and that
after the patch there is no leak. I will commit the patch early next
week unless Heikki or someone wants some more tests.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#16 Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#14)
Re: Questions/Observations related to Gist vacuum

On Fri, Oct 18, 2019 at 9:41 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Oct 16, 2019 at 7:22 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 16 October 2019 12:57:03 CEST, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi>
wrote:

All things
considered, I'm not sure which is better.

Yeah, this is a tough call to make, but if we can allow it to delete
the pages in bulkdelete conditionally for parallel vacuum workers,
then it would be better.

Yeah, if it's needed for parallel vacuum, maybe that tips the scale.

Are we planning to do this only if it is called from parallel vacuum
workers or in general?

I think we can do it in general, as adding a check for parallel vacuum
there would look a bit hackish. It is not clear whether we get enough
benefit by keeping it for the cleanup phase of the index, as discussed
in the emails above. Heikki, others, let us know if you don't agree.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#17 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#16)
1 attachment(s)
Re: Questions/Observations related to Gist vacuum

On Fri, Oct 18, 2019 at 10:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 18, 2019 at 9:41 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Oct 16, 2019 at 7:22 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 16 October 2019 12:57:03 CEST, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 15, 2019 at 7:13 PM Heikki Linnakangas <hlinnaka@iki.fi>
wrote:

All things
considered, I'm not sure which is better.

Yeah, this is a tough call to make, but if we can allow it to delete
the pages in bulkdelete conditionally for parallel vacuum workers,
then it would be better.

Yeah, if it's needed for parallel vacuum, maybe that tips the scale.

Are we planning to do this only if it is called from parallel vacuum
workers or in general?

I think we can do it in general as adding some check for parallel
vacuum there will look bit hackish.

I agree with that point.
It is not clear if we get enough
benefit by keeping it for cleanup phase of the index as discussed in
emails above. Heikki, others, let us know if you don't agree here.

I have prepared a first version of the patch. Currently, I am
performing empty page deletion in all cases.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

delete_emptypages_in_gistbulkdelete_v1.patch (application/octet-stream)
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index bf754ea..aee7904 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,17 +24,13 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexBulkDeleteResult *stats;	/* kept across vacuum stages. */
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages. They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
@@ -61,7 +57,7 @@ static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
+/* allocate the gist 'stats' struct */
 static GistBulkDeleteResult *
 create_GistBulkDeleteResult(void)
 {
@@ -83,15 +79,28 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
+	GistBulkDeleteResult *gist_stats;
+
+	gist_stats = create_GistBulkDeleteResult();
 
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+
+	gist_stats->stats = stats;
 
 	gistvacuumscan(info, gist_stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, gist_stats);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(gist_stats->page_set_context);
+
+	return stats;
 }
 
 /*
@@ -100,8 +109,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,23 +118,23 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
+		GistBulkDeleteResult *gist_stats = create_GistBulkDeleteResult();
+
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gist_stats->stats = stats;
 		gistvacuumscan(info, gist_stats, NULL, NULL);
-	}
 
-	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
+		/*
+		 * If we saw any empty pages, try to unlink them from the tree so that
+		 * they can be reused.
+		 */
+		gistvacuum_delete_empty_pages(info, gist_stats);
 
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
+		/* we don't need the internal and empty page sets anymore */
+		MemoryContextDelete(gist_stats->page_set_context);
+	}
 
 	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
@@ -137,18 +144,18 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
  * gistvacuumscan --- scan the index for VACUUMing purposes
  *
  * This scans the index for leaf tuples that are deletable according to the
- * vacuum callback, and updates the stats.  Both btbulkdelete and
+ * vacuum callback, and updates the stats->  Both btbulkdelete and
  * btvacuumcleanup invoke this (the latter only if no btbulkdelete call
  * occurred).
  *
@@ -174,11 +181,11 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->stats->estimated_count = false;
+	stats->stats->num_index_tuples = 0;
+	stats->stats->pages_deleted = 0;
+	stats->stats->pages_free = 0;
+
 	stats->internal_page_set = intset_create();
 	stats->empty_leaf_set = intset_create();
 
@@ -247,11 +254,11 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->stats->num_pages = num_pages;
 }
 
 /*
@@ -297,13 +304,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		stats->stats->pages_free++;
+		stats->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		stats->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -378,7 +385,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			stats->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -398,7 +405,7 @@ restart:
 				intset_add_member(stats->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			stats->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -563,7 +570,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		stats->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -655,7 +662,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
#18Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#15)
Re: Questions/Observations related to Gist vacuum

On Fri, Oct 18, 2019 at 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

Thanks for the test. It shows that prior to the patch, memory was
being leaked in TopTransactionContext during a multi-pass vacuum, and
that after the patch there is no leak. I will commit the patch early
next week unless Heikki or someone else wants more tests.

Pushed.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#19Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#18)
Re: Questions/Observations related to Gist vacuum

On Mon, Oct 21, 2019 at 11:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 18, 2019 at 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

Thanks for the test. It shows that prior to patch the memory was
getting leaked in TopTransactionContext during multi-pass vacuum and
after patch, there is no leak. I will commit the patch early next
week unless Heikki or someone wants some more tests.

Pushed.

Thanks.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#20Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Dilip Kumar (#17)
Re: Questions/Observations related to Gist vacuum

Hi!

On 18 Oct 2019, at 13:21, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 10:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I think we can do it in general as adding some check for parallel
vacuum there will look bit hackish.

I agree with that point.
It is not clear if we get enough benefit by keeping it for the cleanup
phase of the index, as discussed in the emails above. Heikki, others,
let us know if you don't agree here.

I have prepared a first version of the patch. Currently, I am
performing an empty page deletion for all the cases.

I've taken a look at the patch, and there is one simple thing I cannot
understand...
We should not call gistvacuum_delete_empty_pages() for the same
gist_stats twice. Put another way, once the function has been called,
we should somehow update or zero empty_leaf_set.
Does this invariant hold in your patch?

Best regards, Andrey Borodin.

#21Dilip Kumar
dilipbalaut@gmail.com
In reply to: Andrey Borodin (#20)
Re: Questions/Observations related to Gist vacuum

On Mon, Oct 21, 2019 at 2:30 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Hi!

On 18 Oct 2019, at 13:21, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Oct 18, 2019 at 10:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I think we can do it in general as adding some check for parallel
vacuum there will look bit hackish.

I agree with that point.
It is not clear if we get enough benefit by keeping it for the cleanup
phase of the index, as discussed in the emails above. Heikki, others,
let us know if you don't agree here.

I have prepared a first version of the patch. Currently, I am
performing an empty page deletion for all the cases.

I've took a look into the patch, and cannot understand one simple thing...
We should not call gistvacuum_delete_empty_pages() for same gist_stats twice.
Another way once the function is called we should somehow update or zero empty_leaf_set.
Does this invariant hold in your patch?

Thanks for looking into the patch. With this patch,
GistBulkDeleteResult is now local to a single gistbulkdelete or
gistvacuumcleanup call. Since we are no longer sharing
GistBulkDeleteResult across the calls, I am not sure how it could be
called twice for the same gist_stats. Am I missing something here?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#22Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Dilip Kumar (#21)
Re: Questions/Observations related to Gist vacuum

On 21 Oct 2019, at 11:12, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Oct 21, 2019 at 2:30 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

I've took a look into the patch, and cannot understand one simple thing...
We should not call gistvacuum_delete_empty_pages() for same gist_stats twice.
Another way once the function is called we should somehow update or zero empty_leaf_set.
Does this invariant hold in your patch?

Thanks for looking into the patch. With this patch now
GistBulkDeleteResult is local to single gistbulkdelete call or
gistvacuumcleanup. So now we are not sharing GistBulkDeleteResult,
across the calls so I am not sure how it will be called twice for the
same gist_stats? I might be missing something here?

Yes, you are right, sorry for the noise.
Currently we do both gistvacuumscan() and
gistvacuum_delete_empty_pages() in both gistbulkdelete() and
gistvacuumcleanup(). Is that intended? The functions gistbulkdelete()
and gistvacuumcleanup() look very similar and share some comments,
which is what caught my attention.

Thanks!

--
Andrey Borodin
Open source RDBMS development team leader
Yandex.Cloud

#23Dilip Kumar
dilipbalaut@gmail.com
In reply to: Andrey Borodin (#22)
Re: Questions/Observations related to Gist vacuum

On Mon, Oct 21, 2019 at 2:58 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

On 21 Oct 2019, at 11:12, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Oct 21, 2019 at 2:30 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

I've took a look into the patch, and cannot understand one simple thing...
We should not call gistvacuum_delete_empty_pages() for same gist_stats twice.
Another way once the function is called we should somehow update or zero empty_leaf_set.
Does this invariant hold in your patch?

Thanks for looking into the patch. With this patch now
GistBulkDeleteResult is local to single gistbulkdelete call or
gistvacuumcleanup. So now we are not sharing GistBulkDeleteResult,
across the calls so I am not sure how it will be called twice for the
same gist_stats? I might be missing something here?

Yes, you are right, sorry for the noise.
Currently we are doing both gistvacuumscan() and gistvacuum_delete_empty_pages() in both gistbulkdelete() and gistvacuumcleanup(). Is it supposed to be so?

There was an issue discussed in the parallel vacuum thread [1], and to
solve it, it has been discussed in this thread [2] that we can delete
empty pages in the bulkdelete phase itself. But that does not mean we
can drop the deletion from the gistvacuumcleanup phase: if
gistbulkdelete is not called at all during a vacuum, then
gistvacuumcleanup will perform both gistvacuumscan and
gistvacuum_delete_empty_pages. In short, whichever pass detects an
empty page also deletes it in that same pass.

Functions gistbulkdelete() and gistvacuumcleanup() look very similar
and share some comments. This is what triggered my attention.

[1]: /messages/by-id/CAA4eK1JEQ2y3uNucNopDjK8pse6xSe5=_oknoWfRQvAF=VqsBA@mail.gmail.com
[2]: /messages/by-id/69EF7B88-F3E7-4E09-824D-694CF39E5683@iki.fi

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#24Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#17)
Re: Questions/Observations related to Gist vacuum

On Fri, Oct 18, 2019 at 4:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have prepared a first version of the patch. Currently, I am
performing an empty page deletion for all the cases.

Few comments:
----------------------
1.
-/*
- * State kept across vacuum stages.
- */
 typedef struct
 {
- IndexBulkDeleteResult stats; /* must be first */
+ IndexBulkDeleteResult *stats; /* kept across vacuum stages. */
  /*
- * These are used to memorize all internal and empty leaf pages in the 1st
- * vacuum stage.  They are used in the 2nd stage, to delete all the empty
- * pages.
+ * These are used to memorize all internal and empty leaf pages. They are
+ * used for deleting all the empty pages.
  */
  IntegerSet *internal_page_set;
  IntegerSet *empty_leaf_set;

Now, if we don't want to share the remaining stats across
gistbulkdelete and gistvacuumcleanup, isn't it better to keep the
information of internal and empty leaf pages as part of GistVacState?
Also, I think it is better to call gistvacuum_delete_empty_pages from
function gistvacuumscan as that will avoid it calling from multiple
places.

2.
- gist_stats->page_set_context = NULL;
- gist_stats->internal_page_set = NULL;
- gist_stats->empty_leaf_set = NULL;

Why have you removed this initialization?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#25Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#24)
Re: Questions/Observations related to Gist vacuum

On Tue, Oct 22, 2019 at 9:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 18, 2019 at 4:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have prepared a first version of the patch. Currently, I am
performing an empty page deletion for all the cases.

Few comments:
----------------------
1.
-/*
- * State kept across vacuum stages.
- */
typedef struct
{
- IndexBulkDeleteResult stats; /* must be first */
+ IndexBulkDeleteResult *stats; /* kept across vacuum stages. */
/*
- * These are used to memorize all internal and empty leaf pages in the 1st
- * vacuum stage.  They are used in the 2nd stage, to delete all the empty
- * pages.
+ * These are used to memorize all internal and empty leaf pages. They are
+ * used for deleting all the empty pages.
*/
IntegerSet *internal_page_set;
IntegerSet *empty_leaf_set;

Now, if we don't want to share the remaining stats across
gistbulkdelete and gistvacuumcleanup, isn't it better to keep the
information of internal and empty leaf pages as part of GistVacState?

Basically, only IndexBulkDeleteResult is now shared across the stages,
so can we move all the members to GistVacState and completely get rid
of GistBulkDeleteResult?

IndexBulkDeleteResult *stats
IntegerSet *internal_page_set;
IntegerSet *empty_leaf_set;
MemoryContext page_set_context;

Also, I think it is better to call gistvacuum_delete_empty_pages from
function gistvacuumscan as that will avoid it calling from multiple
places.

Yeah we can do that.

2.
- gist_stats->page_set_context = NULL;
- gist_stats->internal_page_set = NULL;
- gist_stats->empty_leaf_set = NULL;

Why have you removed this initialization?

This was a post-cleanup reset: since we were returning gist_stats, it
was better to clean it up. But now we are not returning it, so I
thought we don't need this reset. In fact, I think we can now free
gist_stats itself.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#26Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#25)
Re: Questions/Observations related to Gist vacuum

On Tue, Oct 22, 2019 at 10:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 22, 2019 at 9:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 18, 2019 at 4:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have prepared a first version of the patch. Currently, I am
performing an empty page deletion for all the cases.

Few comments:
----------------------
1.
-/*
- * State kept across vacuum stages.
- */
typedef struct
{
- IndexBulkDeleteResult stats; /* must be first */
+ IndexBulkDeleteResult *stats; /* kept across vacuum stages. */
/*
- * These are used to memorize all internal and empty leaf pages in the 1st
- * vacuum stage.  They are used in the 2nd stage, to delete all the empty
- * pages.
+ * These are used to memorize all internal and empty leaf pages. They are
+ * used for deleting all the empty pages.
*/
IntegerSet *internal_page_set;
IntegerSet *empty_leaf_set;

Now, if we don't want to share the remaining stats across
gistbulkdelete and gistvacuumcleanup, isn't it better to keep the
information of internal and empty leaf pages as part of GistVacState?

Basically, only IndexBulkDeleteResult is now shared across the stage
so we can move all members to GistVacState and completely get rid of
GistBulkDeleteResult?

Yes, something like that would be better. Let's try and see how it comes out.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#27Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#26)
1 attachment(s)
Re: Questions/Observations related to Gist vacuum

On Tue, Oct 22, 2019 at 10:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 22, 2019 at 10:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 22, 2019 at 9:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Oct 18, 2019 at 4:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have prepared a first version of the patch. Currently, I am
performing an empty page deletion for all the cases.

Few comments:
----------------------
1.
-/*
- * State kept across vacuum stages.
- */
typedef struct
{
- IndexBulkDeleteResult stats; /* must be first */
+ IndexBulkDeleteResult *stats; /* kept across vacuum stages. */
/*
- * These are used to memorize all internal and empty leaf pages in the 1st
- * vacuum stage.  They are used in the 2nd stage, to delete all the empty
- * pages.
+ * These are used to memorize all internal and empty leaf pages. They are
+ * used for deleting all the empty pages.
*/
IntegerSet *internal_page_set;
IntegerSet *empty_leaf_set;

Now, if we don't want to share the remaining stats across
gistbulkdelete and gistvacuumcleanup, isn't it better to keep the
information of internal and empty leaf pages as part of GistVacState?

Basically, only IndexBulkDeleteResult is now shared across the stage
so we can move all members to GistVacState and completely get rid of
GistBulkDeleteResult?

Yes, something like that would be better. Let's try and see how it comes out.

I have modified it as we discussed. Please take a look.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v2-0001-delete-empty-page-in-gistbulkdelete.patch (application/octet-stream)
From f45fa55c868a1cf596dc9a205fbc1607b33cd8f6 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.kumar@enterprisedb.com>
Date: Tue, 22 Oct 2019 13:54:14 +0530
Subject: [PATCH v2] delete empty page in gistbulkdelete

---
 src/backend/access/gist/gistvacuum.c | 148 ++++++++++++++---------------------
 1 file changed, 59 insertions(+), 89 deletions(-)

diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index 710e401..6551558 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,58 +24,34 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
+/* Working state needed by gistbulkdelete */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexVacuumInfo *info;
+	IndexBulkDeleteResult *stats;
+	IndexBulkDeleteCallback callback;
+	void	   *callback_state;
+	GistNSN		startNSN;
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages. They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
 	MemoryContext page_set_context;
-} GistBulkDeleteResult;
-
-/* Working state needed by gistbulkdelete */
-typedef struct
-{
-	IndexVacuumInfo *info;
-	GistBulkDeleteResult *stats;
-	IndexBulkDeleteCallback callback;
-	void	   *callback_state;
-	GistNSN		startNSN;
 } GistVacState;
 
-static void gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+static void gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   IndexBulkDeleteCallback callback, void *callback_state);
 static void gistvacuumpage(GistVacState *vstate, BlockNumber blkno,
 						   BlockNumber orig_blkno);
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-										  GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+										  GistVacState *stats);
+static bool gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
-static GistBulkDeleteResult *
-create_GistBulkDeleteResult(void)
-{
-	GistBulkDeleteResult *gist_stats;
-
-	gist_stats = (GistBulkDeleteResult *) palloc0(sizeof(GistBulkDeleteResult));
-	gist_stats->page_set_context =
-		GenerationContextCreate(CurrentMemoryContext,
-								"GiST VACUUM page set context",
-								16 * 1024);
-
-	return gist_stats;
-}
-
 /*
  * VACUUM bulkdelete stage: remove index entries.
  */
@@ -83,15 +59,13 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 
-	gistvacuumscan(info, gist_stats, callback, callback_state);
+	gistvacuumscan(info, stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -100,8 +74,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,25 +83,13 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
-		gistvacuumscan(info, gist_stats, NULL, NULL);
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gistvacuumscan(info, stats, NULL, NULL);
 	}
 
 	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
-
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
-
-	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
 	 * double-counting some index tuples, so disbelieve any total that exceeds
 	 * the underlying heap's count ... if we know that accurately.  Otherwise
@@ -137,11 +97,11 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -161,7 +121,7 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
  * The caller is responsible for initially allocating/zeroing a stats struct.
  */
 static void
-gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
 	Relation	rel = info->index;
@@ -175,11 +135,10 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->estimated_count = false;
+	stats->num_index_tuples = 0;
+	stats->pages_deleted = 0;
+	stats->pages_free = 0;
 
 	/*
 	 * Create the integer sets to remember all the internal and the empty leaf
@@ -187,9 +146,12 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * this context so that the subsequent allocations for these integer sets
 	 * will be done from the same context.
 	 */
-	oldctx = MemoryContextSwitchTo(stats->page_set_context);
-	stats->internal_page_set = intset_create();
-	stats->empty_leaf_set = intset_create();
+	vstate.page_set_context = GenerationContextCreate(CurrentMemoryContext,
+												"GiST VACUUM page set context",
+												16 * 1024);
+	oldctx = MemoryContextSwitchTo(vstate.page_set_context);
+	vstate.internal_page_set = intset_create();
+	vstate.empty_leaf_set = intset_create();
 	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
@@ -257,11 +219,20 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->num_pages = num_pages;
+
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, &vstate);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(vstate.page_set_context);
 }
 
 /*
@@ -278,7 +249,6 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 static void
 gistvacuumpage(GistVacState *vstate, BlockNumber blkno, BlockNumber orig_blkno)
 {
-	GistBulkDeleteResult *stats = vstate->stats;
 	IndexVacuumInfo *info = vstate->info;
 	IndexBulkDeleteCallback callback = vstate->callback;
 	void	   *callback_state = vstate->callback_state;
@@ -307,13 +277,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_free++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -388,7 +358,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			vstate->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -405,10 +375,10 @@ restart:
 			 * it up.
 			 */
 			if (blkno == orig_blkno)
-				intset_add_member(stats->empty_leaf_set, blkno);
+				intset_add_member(vstate->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			vstate->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -443,7 +413,7 @@ restart:
 		 * parents of empty leaf pages.
 		 */
 		if (blkno == orig_blkno)
-			intset_add_member(stats->internal_page_set, blkno);
+			intset_add_member(vstate->internal_page_set, blkno);
 	}
 
 	UnlockReleaseBuffer(buffer);
@@ -466,7 +436,7 @@ restart:
  * Scan all internal pages, and try to delete their empty child pages.
  */
 static void
-gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats)
+gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistVacState *vstate)
 {
 	Relation	rel = info->index;
 	BlockNumber empty_pages_remaining;
@@ -475,10 +445,10 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 	/*
 	 * Rescan all inner pages to find those that have empty child pages.
 	 */
-	empty_pages_remaining = intset_num_entries(stats->empty_leaf_set);
-	intset_begin_iterate(stats->internal_page_set);
+	empty_pages_remaining = intset_num_entries(vstate->empty_leaf_set);
+	intset_begin_iterate(vstate->internal_page_set);
 	while (empty_pages_remaining > 0 &&
-		   intset_iterate_next(stats->internal_page_set, &blkno))
+		   intset_iterate_next(vstate->internal_page_set, &blkno))
 	{
 		Buffer		buffer;
 		Page		page;
@@ -521,7 +491,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			BlockNumber leafblk;
 
 			leafblk = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
-			if (intset_is_member(stats->empty_leaf_set, leafblk))
+			if (intset_is_member(vstate->empty_leaf_set, leafblk))
 			{
 				leafs_to_delete[ntodelete] = leafblk;
 				todelete[ntodelete++] = off;
@@ -561,7 +531,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			gistcheckpage(rel, leafbuf);
 
 			LockBuffer(buffer, GIST_EXCLUSIVE);
-			if (gistdeletepage(info, stats,
+			if (gistdeletepage(info, vstate->stats,
 							   buffer, todelete[i] - deleted,
 							   leafbuf))
 				deleted++;
@@ -573,7 +543,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		vstate->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -596,7 +566,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
  * prevented it.
  */
 static bool
-gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   Buffer parentBuffer, OffsetNumber downlink,
 			   Buffer leafBuffer)
 {
@@ -665,7 +635,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
-- 
1.8.3.1

#28Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#27)
Re: Questions/Observations related to Gist vacuum

On Tue, Oct 22, 2019 at 2:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 22, 2019 at 10:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

Basically, only IndexBulkDeleteResult is now shared across the stage
so we can move all members to GistVacState and completely get rid of
GistBulkDeleteResult?

Yes, something like that would be better. Let's try and see how it comes out.

I have modified as we discussed. Please take a look.

Thanks, I haven't reviewed this yet, but it seems to be on the right
lines. Sawada-San, can you please prepare the next version of the
parallel vacuum patch on top of this patch and enable parallel vacuum
for Gist indexes? We can do the review of this patch in detail once
the parallel vacuum patch is in better shape as without that it
wouldn't make sense to commit this patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#29Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#28)
Re: Questions/Observations related to Gist vacuum

On Wed, Oct 23, 2019 at 8:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 22, 2019 at 2:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 22, 2019 at 10:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

Basically, only IndexBulkDeleteResult is now shared across the stage
so we can move all members to GistVacState and completely get rid of
GistBulkDeleteResult?

Yes, something like that would be better. Let's try and see how it comes out.

I have modified as we discussed. Please take a look.

Thanks, I haven't reviewed this yet, but it seems to be on the right
lines. Sawada-San, can you please prepare the next version of the
parallel vacuum patch on top of this patch and enable parallel vacuum
for Gist indexes?

Yeah I've sent the latest patch set that is built on top of this
patch[1]. BTW I looked at this patch briefly but it looks good to me.

[1]: /messages/by-id/CAD21AoBMo9dr_QmhT=dKh7fmiq7tpx+yLHR8nw9i5NZ-SgtaVg@mail.gmail.com

Regards,

--
Masahiko Sawada

#30Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#29)
1 attachment(s)
Re: Questions/Observations related to Gist vacuum

On Fri, Oct 25, 2019 at 9:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Oct 23, 2019 at 8:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Oct 22, 2019 at 2:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 22, 2019 at 10:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I have modified as we discussed. Please take a look.

Thanks, I haven't reviewed this yet, but it seems to be on the right
lines. Sawada-San, can you please prepare the next version of the
parallel vacuum patch on top of this patch and enable parallel vacuum
for Gist indexes?

Yeah I've sent the latest patch set that is built on top of this
patch[1]. BTW I looked at this patch briefly but it looks good to me.

Today, I have looked at this patch and found a few things that need to
be changed:

1.
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-   GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+   GistVacState *stats);

I think stats is not a good name for GistVacState. How about vstate?

2.
+ /* we don't need the internal and empty page sets anymore */
+ MemoryContextDelete(vstate.page_set_context);

After memory context delete, we can reset this and other related
variables as we were doing without the patch.

3. There are a couple of places in code (like comments, README) that
mentions the deletion of empty pages in the second stage of the
vacuum. We should change all such places.

I have modified the patch for the above points and additionally ran
pgindent. Let me know what you think about the attached patch?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v3-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM.patch (application/octet-stream)
From 4a1a58d12c67676db03f73b46e277c7ee290caa4 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 9 Dec 2019 14:12:59 +0530
Subject: [PATCH] Delete empty pages in each pass during GIST VACUUM.

Earlier, we use to postpone deleting empty pages till the second stage of
vacuum to amortize the cost of scanning internal pages.  However, that can
sometimes (say vacuum is canceled or errored between first and second
stage) delay the pages to be recycled.

Another thing is that to facilitate deleting empty pages in the second
stage, we need to share the information of internal and empty pages
between different stages of vacuum.  It will be quite tricky to share this
information via DSM which is required for the upcoming parallel vacuum
patch.

Also, it will bring the logic to reclaim deleted pages more closer to
nbtree where we delete empty pages in each pass.

Overall, the advantages of deleting empty pages in each pass outweigh the
advantages of postoponing the same.

Author: Dilip Kumar, with changes by Amit Kapila
Reviewed-by: Sawada Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
---
 src/backend/access/gist/README       |  23 +++--
 src/backend/access/gist/gistvacuum.c | 160 +++++++++++++++--------------------
 2 files changed, 78 insertions(+), 105 deletions(-)

diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 8cbca69..fffdfff 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -429,18 +429,17 @@ splits during searches, we don't need a "vacuum cycle ID" concept for that
 like B-tree does.
 
 While we scan all the pages, we also make note of any completely empty leaf
-pages. We will try to unlink them from the tree in the second stage. We also
-record the block numbers of all internal pages; they are needed in the second
-stage, to locate parents of the empty pages.
-
-In the second stage, we try to unlink any empty leaf pages from the tree, so
-that their space can be reused. In order to delete an empty page, its
-downlink must be removed from the parent. We scan all the internal pages,
-whose block numbers we memorized in the first stage, and look for downlinks
-to pages that we have memorized as being empty. Whenever we find one, we
-acquire a lock on the parent and child page, re-check that the child page is
-still empty. Then, we remove the downlink and mark the child as deleted, and
-release the locks.
+pages. We will try to unlink them from the tree after the scan. We also record
+the block numbers of all internal pages; they are needed to locate parents of
+the empty pages while unlinking them.
+
+We try to unlink any empty leaf pages from the tree, so that their space can
+be reused. In order to delete an empty page, its downlink must be removed from
+the parent. We scan all the internal pages, whose block numbers we memorized
+in the first stage, and look for downlinks to pages that we have memorized as
+being empty. Whenever we find one, we acquire a lock on the parent and child
+page, re-check that the child page is still empty. Then, we remove the
+downlink and mark the child as deleted, and release the locks.
 
 The insertion algorithm would get confused, if an internal page was completely
 empty. So we never delete the last child of an internal page, even if it's
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index 710e401..730f3e8 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,58 +24,34 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
+/* Working state needed by gistbulkdelete */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexVacuumInfo *info;
+	IndexBulkDeleteResult *stats;
+	IndexBulkDeleteCallback callback;
+	void	   *callback_state;
+	GistNSN		startNSN;
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages. They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
 	MemoryContext page_set_context;
-} GistBulkDeleteResult;
-
-/* Working state needed by gistbulkdelete */
-typedef struct
-{
-	IndexVacuumInfo *info;
-	GistBulkDeleteResult *stats;
-	IndexBulkDeleteCallback callback;
-	void	   *callback_state;
-	GistNSN		startNSN;
 } GistVacState;
 
-static void gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+static void gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   IndexBulkDeleteCallback callback, void *callback_state);
 static void gistvacuumpage(GistVacState *vstate, BlockNumber blkno,
 						   BlockNumber orig_blkno);
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-										  GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+										  GistVacState *vstate);
+static bool gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
-static GistBulkDeleteResult *
-create_GistBulkDeleteResult(void)
-{
-	GistBulkDeleteResult *gist_stats;
-
-	gist_stats = (GistBulkDeleteResult *) palloc0(sizeof(GistBulkDeleteResult));
-	gist_stats->page_set_context =
-		GenerationContextCreate(CurrentMemoryContext,
-								"GiST VACUUM page set context",
-								16 * 1024);
-
-	return gist_stats;
-}
-
 /*
  * VACUUM bulkdelete stage: remove index entries.
  */
@@ -83,15 +59,13 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 
-	gistvacuumscan(info, gist_stats, callback, callback_state);
+	gistvacuumscan(info, stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -100,8 +74,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,25 +83,13 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
-		gistvacuumscan(info, gist_stats, NULL, NULL);
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gistvacuumscan(info, stats, NULL, NULL);
 	}
 
 	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
-
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
-
-	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
 	 * double-counting some index tuples, so disbelieve any total that exceeds
 	 * the underlying heap's count ... if we know that accurately.  Otherwise
@@ -137,11 +97,11 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -153,15 +113,16 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
  * occurred).
  *
  * This also makes note of any empty leaf pages, as well as all internal
- * pages.  The second stage, gistvacuum_delete_empty_pages(), needs that
- * information.  Any deleted pages are added directly to the free space map.
- * (They should've been added there when they were originally deleted, already,
- * but it's possible that the FSM was lost at a crash, for example.)
+ * pages while looping over all index pages.  After scanning all the pages, we
+ * remove the empty pages so that they can be reused.  Any deleted pages are
+ * added directly to the free space map.  (They should've been added there
+ * when they were originally deleted, already, but it's possible that the FSM
+ * was lost at a crash, for example.)
  *
  * The caller is responsible for initially allocating/zeroing a stats struct.
  */
 static void
-gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
 	Relation	rel = info->index;
@@ -175,11 +136,10 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->estimated_count = false;
+	stats->num_index_tuples = 0;
+	stats->pages_deleted = 0;
+	stats->pages_free = 0;
 
 	/*
 	 * Create the integer sets to remember all the internal and the empty leaf
@@ -187,9 +147,12 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * this context so that the subsequent allocations for these integer sets
 	 * will be done from the same context.
 	 */
-	oldctx = MemoryContextSwitchTo(stats->page_set_context);
-	stats->internal_page_set = intset_create();
-	stats->empty_leaf_set = intset_create();
+	vstate.page_set_context = GenerationContextCreate(CurrentMemoryContext,
+													  "GiST VACUUM page set context",
+													  16 * 1024);
+	oldctx = MemoryContextSwitchTo(vstate.page_set_context);
+	vstate.internal_page_set = intset_create();
+	vstate.empty_leaf_set = intset_create();
 	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
@@ -257,11 +220,23 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->num_pages = num_pages;
+
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, &vstate);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(vstate.page_set_context);
+	vstate.page_set_context = NULL;
+	vstate.internal_page_set = NULL;
+	vstate.empty_leaf_set = NULL;
 }
 
 /*
@@ -278,7 +253,6 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 static void
 gistvacuumpage(GistVacState *vstate, BlockNumber blkno, BlockNumber orig_blkno)
 {
-	GistBulkDeleteResult *stats = vstate->stats;
 	IndexVacuumInfo *info = vstate->info;
 	IndexBulkDeleteCallback callback = vstate->callback;
 	void	   *callback_state = vstate->callback_state;
@@ -307,13 +281,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_free++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -388,7 +362,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			vstate->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -405,10 +379,10 @@ restart:
 			 * it up.
 			 */
 			if (blkno == orig_blkno)
-				intset_add_member(stats->empty_leaf_set, blkno);
+				intset_add_member(vstate->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			vstate->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -443,7 +417,7 @@ restart:
 		 * parents of empty leaf pages.
 		 */
 		if (blkno == orig_blkno)
-			intset_add_member(stats->internal_page_set, blkno);
+			intset_add_member(vstate->internal_page_set, blkno);
 	}
 
 	UnlockReleaseBuffer(buffer);
@@ -466,7 +440,7 @@ restart:
  * Scan all internal pages, and try to delete their empty child pages.
  */
 static void
-gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats)
+gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistVacState *vstate)
 {
 	Relation	rel = info->index;
 	BlockNumber empty_pages_remaining;
@@ -475,10 +449,10 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 	/*
 	 * Rescan all inner pages to find those that have empty child pages.
 	 */
-	empty_pages_remaining = intset_num_entries(stats->empty_leaf_set);
-	intset_begin_iterate(stats->internal_page_set);
+	empty_pages_remaining = intset_num_entries(vstate->empty_leaf_set);
+	intset_begin_iterate(vstate->internal_page_set);
 	while (empty_pages_remaining > 0 &&
-		   intset_iterate_next(stats->internal_page_set, &blkno))
+		   intset_iterate_next(vstate->internal_page_set, &blkno))
 	{
 		Buffer		buffer;
 		Page		page;
@@ -521,7 +495,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			BlockNumber leafblk;
 
 			leafblk = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
-			if (intset_is_member(stats->empty_leaf_set, leafblk))
+			if (intset_is_member(vstate->empty_leaf_set, leafblk))
 			{
 				leafs_to_delete[ntodelete] = leafblk;
 				todelete[ntodelete++] = off;
@@ -561,7 +535,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			gistcheckpage(rel, leafbuf);
 
 			LockBuffer(buffer, GIST_EXCLUSIVE);
-			if (gistdeletepage(info, stats,
+			if (gistdeletepage(info, vstate->stats,
 							   buffer, todelete[i] - deleted,
 							   leafbuf))
 				deleted++;
@@ -573,7 +547,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		vstate->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -596,7 +570,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
  * prevented it.
  */
 static bool
-gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   Buffer parentBuffer, OffsetNumber downlink,
 			   Buffer leafBuffer)
 {
@@ -665,7 +639,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
-- 
1.8.3.1

#31Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#30)
1 attachment(s)
Re: Questions/Observations related to Gist vacuum

On Mon, Dec 9, 2019 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I have modified the patch for the above points and additionally ran
pgindent. Let me know what you think about the attached patch?

A new version with a slightly modified commit message.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v4-0001-Delete-empty-pages-in-each-pass-during-GIST-VACUUM.patch (application/octet-stream)
From b44bc6deae88c9bec552f6de4e6e73a1b252c191 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Mon, 9 Dec 2019 14:12:59 +0530
Subject: [PATCH] Delete empty pages in each pass during GIST VACUUM.

Earlier, we used to postpone deleting empty pages till the second stage of
vacuum to amortize the cost of scanning internal pages.  However, that can
sometimes (say vacuum is canceled or errored between first and second
stage) delay the pages to be recycled.

Another thing is that to facilitate deleting empty pages in the second
stage, we need to share the information of internal and empty pages
between different stages of vacuum.  It will be quite tricky to share this
information via DSM which is required for the upcoming parallel vacuum
patch.

Also, it will bring the logic to reclaim deleted pages closer to nbtree
where we delete empty pages in each pass.

Overall, the advantages of deleting empty pages in each pass outweigh the
advantages of postponing the same.

Author: Dilip Kumar, with changes by Amit Kapila
Reviewed-by: Sawada Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
---
 src/backend/access/gist/README       |  23 +++--
 src/backend/access/gist/gistvacuum.c | 160 +++++++++++++++--------------------
 2 files changed, 78 insertions(+), 105 deletions(-)

diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 8cbca69..fffdfff 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -429,18 +429,17 @@ splits during searches, we don't need a "vacuum cycle ID" concept for that
 like B-tree does.
 
 While we scan all the pages, we also make note of any completely empty leaf
-pages. We will try to unlink them from the tree in the second stage. We also
-record the block numbers of all internal pages; they are needed in the second
-stage, to locate parents of the empty pages.
-
-In the second stage, we try to unlink any empty leaf pages from the tree, so
-that their space can be reused. In order to delete an empty page, its
-downlink must be removed from the parent. We scan all the internal pages,
-whose block numbers we memorized in the first stage, and look for downlinks
-to pages that we have memorized as being empty. Whenever we find one, we
-acquire a lock on the parent and child page, re-check that the child page is
-still empty. Then, we remove the downlink and mark the child as deleted, and
-release the locks.
+pages. We will try to unlink them from the tree after the scan. We also record
+the block numbers of all internal pages; they are needed to locate parents of
+the empty pages while unlinking them.
+
+We try to unlink any empty leaf pages from the tree, so that their space can
+be reused. In order to delete an empty page, its downlink must be removed from
+the parent. We scan all the internal pages, whose block numbers we memorized
+in the first stage, and look for downlinks to pages that we have memorized as
+being empty. Whenever we find one, we acquire a lock on the parent and child
+page, re-check that the child page is still empty. Then, we remove the
+downlink and mark the child as deleted, and release the locks.
 
 The insertion algorithm would get confused, if an internal page was completely
 empty. So we never delete the last child of an internal page, even if it's
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index 710e401..730f3e8 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -24,58 +24,34 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 
-/*
- * State kept across vacuum stages.
- */
+/* Working state needed by gistbulkdelete */
 typedef struct
 {
-	IndexBulkDeleteResult stats;	/* must be first */
+	IndexVacuumInfo *info;
+	IndexBulkDeleteResult *stats;
+	IndexBulkDeleteCallback callback;
+	void	   *callback_state;
+	GistNSN		startNSN;
 
 	/*
-	 * These are used to memorize all internal and empty leaf pages in the 1st
-	 * vacuum stage.  They are used in the 2nd stage, to delete all the empty
-	 * pages.
+	 * These are used to memorize all internal and empty leaf pages. They are
+	 * used for deleting all the empty pages.
 	 */
 	IntegerSet *internal_page_set;
 	IntegerSet *empty_leaf_set;
 	MemoryContext page_set_context;
-} GistBulkDeleteResult;
-
-/* Working state needed by gistbulkdelete */
-typedef struct
-{
-	IndexVacuumInfo *info;
-	GistBulkDeleteResult *stats;
-	IndexBulkDeleteCallback callback;
-	void	   *callback_state;
-	GistNSN		startNSN;
 } GistVacState;
 
-static void gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+static void gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   IndexBulkDeleteCallback callback, void *callback_state);
 static void gistvacuumpage(GistVacState *vstate, BlockNumber blkno,
 						   BlockNumber orig_blkno);
 static void gistvacuum_delete_empty_pages(IndexVacuumInfo *info,
-										  GistBulkDeleteResult *stats);
-static bool gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+										  GistVacState *vstate);
+static bool gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 						   Buffer buffer, OffsetNumber downlink,
 						   Buffer leafBuffer);
 
-/* allocate the 'stats' struct that's kept over vacuum stages */
-static GistBulkDeleteResult *
-create_GistBulkDeleteResult(void)
-{
-	GistBulkDeleteResult *gist_stats;
-
-	gist_stats = (GistBulkDeleteResult *) palloc0(sizeof(GistBulkDeleteResult));
-	gist_stats->page_set_context =
-		GenerationContextCreate(CurrentMemoryContext,
-								"GiST VACUUM page set context",
-								16 * 1024);
-
-	return gist_stats;
-}
-
 /*
  * VACUUM bulkdelete stage: remove index entries.
  */
@@ -83,15 +59,13 @@ IndexBulkDeleteResult *
 gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* allocate stats if first time through, else re-use existing struct */
-	if (gist_stats == NULL)
-		gist_stats = create_GistBulkDeleteResult();
+	if (stats == NULL)
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 
-	gistvacuumscan(info, gist_stats, callback, callback_state);
+	gistvacuumscan(info, stats, callback, callback_state);
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -100,8 +74,6 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
-	GistBulkDeleteResult *gist_stats = (GistBulkDeleteResult *) stats;
-
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -111,25 +83,13 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * stats from the latest gistbulkdelete call.  If it wasn't called, we
 	 * still need to do a pass over the index, to obtain index statistics.
 	 */
-	if (gist_stats == NULL)
+	if (stats == NULL)
 	{
-		gist_stats = create_GistBulkDeleteResult();
-		gistvacuumscan(info, gist_stats, NULL, NULL);
+		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+		gistvacuumscan(info, stats, NULL, NULL);
 	}
 
 	/*
-	 * If we saw any empty pages, try to unlink them from the tree so that
-	 * they can be reused.
-	 */
-	gistvacuum_delete_empty_pages(info, gist_stats);
-
-	/* we don't need the internal and empty page sets anymore */
-	MemoryContextDelete(gist_stats->page_set_context);
-	gist_stats->page_set_context = NULL;
-	gist_stats->internal_page_set = NULL;
-	gist_stats->empty_leaf_set = NULL;
-
-	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
 	 * double-counting some index tuples, so disbelieve any total that exceeds
 	 * the underlying heap's count ... if we know that accurately.  Otherwise
@@ -137,11 +97,11 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 */
 	if (!info->estimated_count)
 	{
-		if (gist_stats->stats.num_index_tuples > info->num_heap_tuples)
-			gist_stats->stats.num_index_tuples = info->num_heap_tuples;
+		if (stats->num_index_tuples > info->num_heap_tuples)
+			stats->num_index_tuples = info->num_heap_tuples;
 	}
 
-	return (IndexBulkDeleteResult *) gist_stats;
+	return stats;
 }
 
 /*
@@ -153,15 +113,16 @@ gistvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
  * occurred).
  *
  * This also makes note of any empty leaf pages, as well as all internal
- * pages.  The second stage, gistvacuum_delete_empty_pages(), needs that
- * information.  Any deleted pages are added directly to the free space map.
- * (They should've been added there when they were originally deleted, already,
- * but it's possible that the FSM was lost at a crash, for example.)
+ * pages while looping over all index pages.  After scanning all the pages, we
+ * remove the empty pages so that they can be reused.  Any deleted pages are
+ * added directly to the free space map.  (They should've been added there
+ * when they were originally deleted, already, but it's possible that the FSM
+ * was lost at a crash, for example.)
  *
  * The caller is responsible for initially allocating/zeroing a stats struct.
  */
 static void
-gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   IndexBulkDeleteCallback callback, void *callback_state)
 {
 	Relation	rel = info->index;
@@ -175,11 +136,10 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Reset counts that will be incremented during the scan; needed in case
 	 * of multiple scans during a single VACUUM command.
 	 */
-	stats->stats.estimated_count = false;
-	stats->stats.num_index_tuples = 0;
-	stats->stats.pages_deleted = 0;
-	stats->stats.pages_free = 0;
-	MemoryContextReset(stats->page_set_context);
+	stats->estimated_count = false;
+	stats->num_index_tuples = 0;
+	stats->pages_deleted = 0;
+	stats->pages_free = 0;
 
 	/*
 	 * Create the integer sets to remember all the internal and the empty leaf
@@ -187,9 +147,12 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * this context so that the subsequent allocations for these integer sets
 	 * will be done from the same context.
 	 */
-	oldctx = MemoryContextSwitchTo(stats->page_set_context);
-	stats->internal_page_set = intset_create();
-	stats->empty_leaf_set = intset_create();
+	vstate.page_set_context = GenerationContextCreate(CurrentMemoryContext,
+													  "GiST VACUUM page set context",
+													  16 * 1024);
+	oldctx = MemoryContextSwitchTo(vstate.page_set_context);
+	vstate.internal_page_set = intset_create();
+	vstate.empty_leaf_set = intset_create();
 	MemoryContextSwitchTo(oldctx);
 
 	/* Set up info to pass down to gistvacuumpage */
@@ -257,11 +220,23 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	 * Note that if no recyclable pages exist, we don't bother vacuuming the
 	 * FSM at all.
 	 */
-	if (stats->stats.pages_free > 0)
+	if (stats->pages_free > 0)
 		IndexFreeSpaceMapVacuum(rel);
 
 	/* update statistics */
-	stats->stats.num_pages = num_pages;
+	stats->num_pages = num_pages;
+
+	/*
+	 * If we saw any empty pages, try to unlink them from the tree so that
+	 * they can be reused.
+	 */
+	gistvacuum_delete_empty_pages(info, &vstate);
+
+	/* we don't need the internal and empty page sets anymore */
+	MemoryContextDelete(vstate.page_set_context);
+	vstate.page_set_context = NULL;
+	vstate.internal_page_set = NULL;
+	vstate.empty_leaf_set = NULL;
 }
 
 /*
@@ -278,7 +253,6 @@ gistvacuumscan(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 static void
 gistvacuumpage(GistVacState *vstate, BlockNumber blkno, BlockNumber orig_blkno)
 {
-	GistBulkDeleteResult *stats = vstate->stats;
 	IndexVacuumInfo *info = vstate->info;
 	IndexBulkDeleteCallback callback = vstate->callback;
 	void	   *callback_state = vstate->callback_state;
@@ -307,13 +281,13 @@ restart:
 	{
 		/* Okay to recycle this page */
 		RecordFreeIndexPage(rel, blkno);
-		stats->stats.pages_free++;
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_free++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsDeleted(page))
 	{
 		/* Already deleted, but can't recycle yet */
-		stats->stats.pages_deleted++;
+		vstate->stats->pages_deleted++;
 	}
 	else if (GistPageIsLeaf(page))
 	{
@@ -388,7 +362,7 @@ restart:
 
 			END_CRIT_SECTION();
 
-			stats->stats.tuples_removed += ntodelete;
+			vstate->stats->tuples_removed += ntodelete;
 			/* must recompute maxoff */
 			maxoff = PageGetMaxOffsetNumber(page);
 		}
@@ -405,10 +379,10 @@ restart:
 			 * it up.
 			 */
 			if (blkno == orig_blkno)
-				intset_add_member(stats->empty_leaf_set, blkno);
+				intset_add_member(vstate->empty_leaf_set, blkno);
 		}
 		else
-			stats->stats.num_index_tuples += nremain;
+			vstate->stats->num_index_tuples += nremain;
 	}
 	else
 	{
@@ -443,7 +417,7 @@ restart:
 		 * parents of empty leaf pages.
 		 */
 		if (blkno == orig_blkno)
-			intset_add_member(stats->internal_page_set, blkno);
+			intset_add_member(vstate->internal_page_set, blkno);
 	}
 
 	UnlockReleaseBuffer(buffer);
@@ -466,7 +440,7 @@ restart:
  * Scan all internal pages, and try to delete their empty child pages.
  */
 static void
-gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats)
+gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistVacState *vstate)
 {
 	Relation	rel = info->index;
 	BlockNumber empty_pages_remaining;
@@ -475,10 +449,10 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 	/*
 	 * Rescan all inner pages to find those that have empty child pages.
 	 */
-	empty_pages_remaining = intset_num_entries(stats->empty_leaf_set);
-	intset_begin_iterate(stats->internal_page_set);
+	empty_pages_remaining = intset_num_entries(vstate->empty_leaf_set);
+	intset_begin_iterate(vstate->internal_page_set);
 	while (empty_pages_remaining > 0 &&
-		   intset_iterate_next(stats->internal_page_set, &blkno))
+		   intset_iterate_next(vstate->internal_page_set, &blkno))
 	{
 		Buffer		buffer;
 		Page		page;
@@ -521,7 +495,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			BlockNumber leafblk;
 
 			leafblk = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
-			if (intset_is_member(stats->empty_leaf_set, leafblk))
+			if (intset_is_member(vstate->empty_leaf_set, leafblk))
 			{
 				leafs_to_delete[ntodelete] = leafblk;
 				todelete[ntodelete++] = off;
@@ -561,7 +535,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 			gistcheckpage(rel, leafbuf);
 
 			LockBuffer(buffer, GIST_EXCLUSIVE);
-			if (gistdeletepage(info, stats,
+			if (gistdeletepage(info, vstate->stats,
 							   buffer, todelete[i] - deleted,
 							   leafbuf))
 				deleted++;
@@ -573,7 +547,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
 		ReleaseBuffer(buffer);
 
 		/* update stats */
-		stats->stats.pages_removed += deleted;
+		vstate->stats->pages_removed += deleted;
 
 		/*
 		 * We can stop the scan as soon as we have seen the downlinks, even if
@@ -596,7 +570,7 @@ gistvacuum_delete_empty_pages(IndexVacuumInfo *info, GistBulkDeleteResult *stats
  * prevented it.
  */
 static bool
-gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
+gistdeletepage(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			   Buffer parentBuffer, OffsetNumber downlink,
 			   Buffer leafBuffer)
 {
@@ -665,7 +639,7 @@ gistdeletepage(IndexVacuumInfo *info, GistBulkDeleteResult *stats,
 	/* mark the page as deleted */
 	MarkBufferDirty(leafBuffer);
 	GistPageSetDeleted(leafPage, txid);
-	stats->stats.pages_deleted++;
+	stats->pages_deleted++;
 
 	/* remove the downlink from the parent */
 	MarkBufferDirty(parentBuffer);
-- 
1.8.3.1

#32Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#31)
Re: Questions/Observations related to Gist vacuum

On Mon, Dec 9, 2019 at 2:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Dec 9, 2019 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I have modified the patch for the above points and additionally ran
pgindent. Let me know what you think about the attached patch?

A new version with a slightly modified commit message.

Your changes look fine to me. Thanks!

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#33Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Amit Kapila (#31)
Re: Questions/Observations related to Gist vacuum

On Mon, 9 Dec 2019 at 14:37, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Dec 9, 2019 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I have modified the patch for the above points and additionally ran
pgindent. Let me know what you think about the attached patch?

A new version with a slightly modified commit message.

I reviewed v4 patch and below is the one review comment:

+     * These are used to memorize all internal and empty leaf pages. They are
+     * used for deleting all the empty pages.
      */
After the dot, there should be 2 spaces. Earlier, there were 2 spaces.

Other than that patch looks fine to me.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

#34Dilip Kumar
dilipbalaut@gmail.com
In reply to: Mahendra Singh Thalor (#33)
Re: Questions/Observations related to Gist vacuum

On Thu, Jan 9, 2020 at 4:41 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

On Mon, 9 Dec 2019 at 14:37, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Dec 9, 2019 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I have modified the patch for the above points and additionally ran
pgindent. Let me know what you think about the attached patch?

A new version with a slightly modified commit message.

I reviewed v4 patch and below is the one review comment:

+     * These are used to memorize all internal and empty leaf pages. They are
+     * used for deleting all the empty pages.
*/
After the dot, there should be 2 spaces. Earlier, there were 2 spaces.

Other than that patch looks fine to me.

Thanks for the comment. Amit has already taken care of this before pushing it.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com