Use PageIndexTupleOverwrite() within nbtsort.c
Attached patch slightly simplifies nbtsort.c by making it use
PageIndexTupleOverwrite() to overwrite the last right non-pivot tuple
with the new high key (pivot tuple). PageIndexTupleOverwrite() is
designed so that code like this doesn't need to delete and re-insert
to replace an existing tuple.
This slightly simplifies the code, and also makes it marginally
faster. I'll add this to the 2019-09 CF.
--
Peter Geoghegan
Attachments:
0001-Overwrite-lastright-item-with-highkey-in-nbtsort.c.patchapplication/octet-stream; name=0001-Overwrite-lastright-item-with-highkey-in-nbtsort.c.patchDownload
From a63250aa4645ff04f468160105da78ec0a69a3ea Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Mon, 13 May 2019 20:01:50 -0700
Subject: [PATCH] Overwrite lastright item with highkey in nbtsort.c.
Using PageIndexTupleOverwrite() instead of deleting and re-inserting
naively makes the code slightly simpler. It also saves us from having
to shift every leaf page's line pointer array back and forth during
index builds, which makes CREATE INDEX marginally faster.
---
src/backend/access/nbtree/nbtsort.c | 21 ++++++---------------
1 file changed, 6 insertions(+), 15 deletions(-)
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index d0b9013caf..0c93ba116b 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -943,7 +943,6 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
{
IndexTuple lastleft;
IndexTuple truncated;
- Size truncsz;
/*
* Truncate away any unneeded attributes from high key on leaf
@@ -961,26 +960,18 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
* choosing a split point here for a benefit that is bound to be
* much smaller.
*
- * Since the truncated tuple is often smaller than the original
- * tuple, it cannot just be copied in place (besides, we want to
- * actually save space on the leaf page). We delete the original
- * high key, and add our own truncated high key at the same
- * offset.
- *
- * Note that the page layout won't be changed very much. oitup is
- * already located at the physical beginning of tuple space, so we
- * only shift the line pointer array back and forth, and overwrite
- * the tuple space previously occupied by oitup. This is fairly
- * cheap.
+ * Overwrite the old item with new truncated high key directly.
+ * oitup is already located at the physical beginning of tuple
+ * space, so this should directly reuse the existing tuple space.
*/
ii = PageGetItemId(opage, OffsetNumberPrev(last_off));
lastleft = (IndexTuple) PageGetItem(opage, ii);
truncated = _bt_truncate(wstate->index, lastleft, oitup,
wstate->inskey);
- truncsz = IndexTupleSize(truncated);
- PageIndexTupleDelete(opage, P_HIKEY);
- _bt_sortaddtup(opage, truncsz, truncated, P_HIKEY);
+ if (!PageIndexTupleOverwrite(opage, P_HIKEY, (Item) truncated,
+ IndexTupleSize(truncated)))
+ elog(ERROR, "failed to add hikey to the index page");
pfree(truncated);
/* oitup should continue to point to the page's high key */
--
2.17.1
16.07.2019 1:12, Peter Geoghegan wrote:
Attached patch slightly simplifies nbtsort.c by making it use
PageIndexTupleOverwrite() to overwrite the last right non-pivot tuple
with the new high key (pivot tuple). PageIndexTupleOverwrite() is
designed so that code like this doesn't need to delete and re-insert
to replace an existing tuple.This slightly simplifies the code, and also makes it marginally
faster. I'll add this to the 2019-09 CF.
I'm okay with this patch.
Should we also update similar code in _bt_mark_page_halfdead()?
I attached a new version of the patch with this change.
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
0001-Overwrite-lastright-item-with-highkey_v2.patchtext/x-patch; name=0001-Overwrite-lastright-item-with-highkey_v2.patchDownload
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 9c1f7de..0e6a27e 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -1653,8 +1653,7 @@ _bt_mark_page_halfdead(Relation rel, Buffer leafbuf, BTStack stack)
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
opaque->btpo_flags |= BTP_HALF_DEAD;
- PageIndexTupleDelete(page, P_HIKEY);
- Assert(PageGetMaxOffsetNumber(page) == 0);
+ Assert(PageGetMaxOffsetNumber(page) == P_HIKEY);
MemSet(&trunctuple, 0, sizeof(IndexTupleData));
trunctuple.t_info = sizeof(IndexTupleData);
if (target != leafblkno)
@@ -1662,8 +1661,8 @@ _bt_mark_page_halfdead(Relation rel, Buffer leafbuf, BTStack stack)
else
BTreeTupleSetTopParent(&trunctuple, InvalidBlockNumber);
- if (PageAddItem(page, (Item) &trunctuple, sizeof(IndexTupleData), P_HIKEY,
- false, false) == InvalidOffsetNumber)
+ if (!PageIndexTupleOverwrite(page, P_HIKEY, (Item) &trunctuple,
+ IndexTupleSize(&trunctuple)))
elog(ERROR, "could not add dummy high key to half-dead page");
/* Must mark buffers dirty before XLogInsert */
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index b30cf9e..82e0881 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -943,7 +943,6 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
{
IndexTuple lastleft;
IndexTuple truncated;
- Size truncsz;
/*
* Truncate away any unneeded attributes from high key on leaf
@@ -961,26 +960,18 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
* choosing a split point here for a benefit that is bound to be
* much smaller.
*
- * Since the truncated tuple is often smaller than the original
- * tuple, it cannot just be copied in place (besides, we want to
- * actually save space on the leaf page). We delete the original
- * high key, and add our own truncated high key at the same
- * offset.
- *
- * Note that the page layout won't be changed very much. oitup is
- * already located at the physical beginning of tuple space, so we
- * only shift the line pointer array back and forth, and overwrite
- * the tuple space previously occupied by oitup. This is fairly
- * cheap.
+ * Overwrite the old item with new truncated high key directly.
+ * oitup is already located at the physical beginning of tuple
+ * space, so this should directly reuse the existing tuple space.
*/
ii = PageGetItemId(opage, OffsetNumberPrev(last_off));
lastleft = (IndexTuple) PageGetItem(opage, ii);
truncated = _bt_truncate(wstate->index, lastleft, oitup,
wstate->inskey);
- truncsz = IndexTupleSize(truncated);
- PageIndexTupleDelete(opage, P_HIKEY);
- _bt_sortaddtup(opage, truncsz, truncated, P_HIKEY);
+ if (!PageIndexTupleOverwrite(opage, P_HIKEY, (Item) truncated,
+ IndexTupleSize(truncated)))
+ elog(ERROR, "failed to add hikey to the index page");
pfree(truncated);
/* oitup should continue to point to the page's high key */
On Tue, Aug 6, 2019 at 8:30 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
Should we also update similar code in _bt_mark_page_halfdead()?
I attached a new version of the patch with this change.
Pushed.
At first I thought that there might be a problem with doing the same
thing within _bt_mark_page_halfdead(), because we still won't use
PageIndexTupleOverwrite() in the corresponding recovery routine -- in
theory, that could break WAL consistency checking because the redo
routine works by reconstructing a half-deleted leaf page from scratch,
resulting in a logically equivalent though physically different page
(even after masking within btree_mask()). However, I eventually
decided that you had it right. Your _bt_mark_page_halfdead() change is
clearer overall and doesn't break WAL consistency checking in
practice, for reasons that are no less obvious than before.
Thanks!
--
Peter Geoghegan