pgsql: Add pages deleted from pending list to FSM
Add pages deleted from pending list to FSM
Add pages deleted from GIN's pending list during cleanup to free space map
immediately. Clean up process could be initiated by ordinary insert but adding
page to FSM might occur only at vacuum. On some workload like never-vacuumed
insert-only tables it could cause a huge bloat.
Jeff Janes <jeff.janes@gmail.com>
Branch
------
master
Details
-------
http://git.postgresql.org/pg/commitdiff/e95680832854cf300e64c10de9cc2f586df558e8
Modified Files
--------------
src/backend/access/gin/ginfast.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
On Mon, Sep 7, 2015 at 10:29 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
Add pages deleted from pending list to FSM
Add pages deleted from GIN's pending list during cleanup to free space map
immediately. Clean up process could be initiated by ordinary insert but adding
page to FSM might occur only at vacuum. On some workload like never-vacuumed
insert-only tables it could cause a huge bloat.
+ /*
+ * As pending list pages can have a high churn rate, it is
+ * desirable to recycle them immediately to the FreeSpace Map when
+ * ordinary backends clean the list.
+ */
+ if (fsm_vac && !vac_delay)
+ IndexFreeSpaceMapVacuum(index);
So autoanalyze still doesn't call IndexFreeSpaceMapVacuum().
That is, only backend can clean the list in INSERT-only workload.
I don't think that this is desirable. Because the backend will
periodically take a big performance penalty.
So I'm thinking that even autoanalyze should call IndexFreeSpaceMapVacuum()
to clean the list in a background, in order to avoid such spikes in
INSERT response time. Thought?
+ for (i = 0; i < data.ndeleted; i++)
+ RecordFreeIndexPage(index, freespace[i]);
ginvacuumcleanup calls RecordFreeIndexPage() twice for the same block.
One is in shiftList(), and another is in ginvacuumcleanup().
I'm just wondering if this affects the VACUUM performance especially
when the pending list size is set to very big.
Regards,
--
Fujii Masao
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers
[ moving thread to -hackers ]
Fujii Masao <masao.fujii@gmail.com> writes:
So autoanalyze still doesn't call IndexFreeSpaceMapVacuum().
That is, only backend can clean the list in INSERT-only workload.
I don't think that this is desirable. Because the backend will
periodically take a big performance penalty.
So I'm thinking that even autoanalyze should call IndexFreeSpaceMapVacuum()
to clean the list in a background, in order to avoid such spikes in
INSERT response time. Thought?
It seems quite bizarre for auto-analyze to do that. auto-vacuum, sure,
but I do not think this should get plugged into ANALYZE.
Maybe we need to rethink the rules for what is autovacuum vs. autoanalyze,
or come up with some third thing that the autovac daemon does for indexes.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Reply to both emails in one
Fujii:
So autoanalyze still doesn't call IndexFreeSpaceMapVacuum().
Fixed, see patch
ginvacuumcleanup calls RecordFreeIndexPage() twice for the same block.
fixed too
Tom:
It seems quite bizarre for auto-analyze to do that. auto-vacuum, sure,
but I do not think this should get plugged into ANALYZE.
I agree, but we want to have pending list shorter as possible.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
gin_fsm_analyze.patchtext/plain; charset=UTF-8; name=gin_fsm_analyze.patchDownload
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 76bebad..7be9362 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -434,7 +434,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
END_CRIT_SECTION();
if (needCleanup)
- ginInsertCleanup(ginstate, false, NULL);
+ ginInsertCleanup(ginstate, false, true, NULL);
}
/*
@@ -505,7 +505,7 @@ ginHeapTupleFastCollect(GinState *ginstate,
*/
static bool
shiftList(Relation index, Buffer metabuffer, BlockNumber newHead,
- IndexBulkDeleteResult *stats)
+ bool fill_fsm, IndexBulkDeleteResult *stats)
{
Page metapage;
GinMetaPageData *metadata;
@@ -732,13 +732,19 @@ processPendingPage(BuildAccumulator *accum, KeyArray *ka,
* action of removing a page from the pending list really needs exclusive
* lock.
*
- * vac_delay indicates that ginInsertCleanup is called from vacuum process,
- * so call vacuum_delay_point() periodically.
+ * vac_delay indicates that ginInsertCleanup should so call
+ * vacuum_delay_point() periodically.
+ *
+ * fill_fsm indicates that ginInsertCleanup should add deleted pages
+ * to FSM otherwise caller is responsible to put deleted pages into
+ * FSM.
+ *
* If stats isn't null, we count deleted pending pages into the counts.
*/
void
ginInsertCleanup(GinState *ginstate,
- bool vac_delay, IndexBulkDeleteResult *stats)
+ bool vac_delay, bool fill_fsm,
+ IndexBulkDeleteResult *stats)
{
Relation index = ginstate->index;
Buffer metabuffer,
@@ -899,7 +905,7 @@ ginInsertCleanup(GinState *ginstate,
* remove read pages from pending list, at this point all
* content of read pages is in regular structure
*/
- if (shiftList(index, metabuffer, blkno, stats))
+ if (shiftList(index, metabuffer, blkno, fill_fsm, stats))
{
/* another cleanup process is running concurrently */
LockBuffer(metabuffer, GIN_UNLOCK);
@@ -948,7 +954,7 @@ ginInsertCleanup(GinState *ginstate,
* desirable to recycle them immediately to the FreeSpace Map when
* ordinary backends clean the list.
*/
- if (fsm_vac && !vac_delay)
+ if (fsm_vac && fill_fsm)
IndexFreeSpaceMapVacuum(index);
diff --git a/src/backend/access/gin/ginvacuum.c b/src/backend/access/gin/ginvacuum.c
index 1315762..323cfb0 100644
--- a/src/backend/access/gin/ginvacuum.c
+++ b/src/backend/access/gin/ginvacuum.c
@@ -544,7 +544,7 @@ ginbulkdelete(PG_FUNCTION_ARGS)
/* Yes, so initialize stats to zeroes */
stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
/* and cleanup any pending inserts */
- ginInsertCleanup(&gvs.ginstate, true, stats);
+ ginInsertCleanup(&gvs.ginstate, true, false, stats);
}
/* we'll re-count the tuples each time */
@@ -659,7 +659,7 @@ ginvacuumcleanup(PG_FUNCTION_ARGS)
if (IsAutoVacuumWorkerProcess())
{
initGinState(&ginstate, index);
- ginInsertCleanup(&ginstate, true, stats);
+ ginInsertCleanup(&ginstate, true, true, stats);
}
PG_RETURN_POINTER(stats);
}
@@ -672,7 +672,7 @@ ginvacuumcleanup(PG_FUNCTION_ARGS)
{
stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
initGinState(&ginstate, index);
- ginInsertCleanup(&ginstate, true, stats);
+ ginInsertCleanup(&ginstate, true, false, stats);
}
memset(&idxStat, 0, sizeof(idxStat));
diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h
index acbe36a..5021887 100644
--- a/src/include/access/gin_private.h
+++ b/src/include/access/gin_private.h
@@ -936,7 +936,7 @@ extern void ginHeapTupleFastCollect(GinState *ginstate,
OffsetNumber attnum, Datum value, bool isNull,
ItemPointer ht_ctid);
extern void ginInsertCleanup(GinState *ginstate,
- bool vac_delay, IndexBulkDeleteResult *stats);
+ bool vac_delay, bool fill_fsm, IndexBulkDeleteResult *stats);
/* ginpostinglist.c */
On Fri, Sep 18, 2015 at 6:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[ moving thread to -hackers ]
Fujii Masao <masao.fujii@gmail.com> writes:
So autoanalyze still doesn't call IndexFreeSpaceMapVacuum().
That is, only backend can clean the list in INSERT-only workload.
I don't think that this is desirable. Because the backend will
periodically take a big performance penalty.
Calling IndexFreeSpaceMapVacuum is only need to reuse the space freed up by
cleaning the pending list.
The list is still getting cleaned and truncated by autoanalyze, it is just
that the space is not getting marked for reuse.
When I wrote this, my thought was that a table vacuum is going to call
IndexFreeSpaceMapVacuum() from another place in its code and so should not
call it here as well. I neglected to consider the autoanalyze case.
So I'm thinking that even autoanalyze should call
IndexFreeSpaceMapVacuum()
to clean the list in a background, in order to avoid such spikes in
INSERT response time. Thought?
But it already is cleaning the list. If that space is not marked as
reusable, that causes index bloat, but doesn't cause latency spikes.
It seems quite bizarre for auto-analyze to do that. auto-vacuum, sure,
but I do not think this should get plugged into ANALYZE.
It may be odd that autoanalyze cleans the list at all (which this patch
doesn't change), but given that it does clean the list I don't see why it
would be bizarre to make the cleaned-up space re-usable.
Cheers,
Jeff
On Sat, Sep 19, 2015 at 5:20 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Fri, Sep 18, 2015 at 6:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[ moving thread to -hackers ]
Fujii Masao <masao.fujii@gmail.com> writes:
So autoanalyze still doesn't call IndexFreeSpaceMapVacuum().
That is, only backend can clean the list in INSERT-only workload.
I don't think that this is desirable. Because the backend will
periodically take a big performance penalty.Calling IndexFreeSpaceMapVacuum is only need to reuse the space freed up by
cleaning the pending list.The list is still getting cleaned and truncated by autoanalyze, it is just
that the space is not getting marked for reuse.When I wrote this, my thought was that a table vacuum is going to call
IndexFreeSpaceMapVacuum() from another place in its code and so should not
call it here as well. I neglected to consider the autoanalyze case.So I'm thinking that even autoanalyze should call
IndexFreeSpaceMapVacuum()
to clean the list in a background, in order to avoid such spikes in
INSERT response time. Thought?But it already is cleaning the list. If that space is not marked as
reusable, that causes index bloat, but doesn't cause latency spikes.
Yep, you're right. The problem here is an index bloat.
It seems quite bizarre for auto-analyze to do that. auto-vacuum, sure,
but I do not think this should get plugged into ANALYZE.It may be odd that autoanalyze cleans the list at all (which this patch
doesn't change), but given that it does clean the list I don't see why it
would be bizarre to make the cleaned-up space re-usable.
Agreed. It's odd to clean up the list but not mark the pages as re-usable.
Regards,
--
Fujii Masao
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers