ERROR: failed to add item to the index page
With master (3dbb317d32f4f084ef7badaed8ef36f5c9b85fe6) I'm getting this:
visena=# CREATE INDEX origo_email_part_hdrvl_value_idx ON
public.origo_email_part_headervalue USING btree
(lower(substr((header_value)::text, 0, 1000)) varchar_pattern_ops);
psql: ERROR: failed to add item to the index page
The schema looks like this: create table origo_email_part_headervalue (
entity_idBIGSERIAL PRIMARY KEY, version int8 not null, header_value varchar NOT
NULL, header_id int8 references origo_email_part_header (entity_id), value_index
int NOT NULL DEFAULT0, UNIQUE (header_id, value_index) ); CREATE INDEX
origo_email_part_hdrvl_hdr_id_idxON origo_email_part_headervalue (header_id);
CREATE INDEXorigo_email_part_hdrvl_value_idx ON origo_email_part_headervalue (
lower(substr(header_value, 0, 1000)) varchar_pattern_ops); (haven't tried any
other version so I'm not sure when this started to happen) -- Andreas Joseph
Krogh
Andreas Joseph Krogh <andreas@visena.com> writes:
With master (3dbb317d32f4f084ef7badaed8ef36f5c9b85fe6) I'm getting this:
visena=# CREATE INDEX origo_email_part_hdrvl_value_idx ON
public.origo_email_part_headervalue USING btree
(lower(substr((header_value)::text, 0, 1000)) varchar_pattern_ops);
psql: ERROR: failed to add item to the index page
Hm, your example works for me on HEAD.
Usually, the first thing to suspect when you're tracking HEAD and get
bizarre failures is that you have a messed-up build. Before spending
any time diagnosing more carefully, do "make distclean", reconfigure,
rebuild, reinstall, then see if problem is still there.
(In theory, you can avoid this sort of failure with appropriate use
of --enable-depend, but personally I don't trust that too much.
I find that with ccache + autoconf cache + parallel build, rebuilding
completely is fast enough that it's something I just do routinely
after any git pull. I'd rather use up my remaining brain cells on
other kinds of problems...)
regards, tom lane
På tirsdag 30. april 2019 kl. 15:43:16, skrev Tom Lane <tgl@sss.pgh.pa.us
<mailto:tgl@sss.pgh.pa.us>>: Andreas Joseph Krogh <andreas@visena.com> writes:
With master (3dbb317d32f4f084ef7badaed8ef36f5c9b85fe6) I'm getting this:
visena=# CREATE INDEX origo_email_part_hdrvl_value_idx ON
public.origo_email_part_headervalue USING btree
(lower(substr((header_value)::text, 0, 1000)) varchar_pattern_ops);
psql: ERROR: failed to add item to the index page
Hm, your example works for me on HEAD.
Usually, the first thing to suspect when you're tracking HEAD and get
bizarre failures is that you have a messed-up build. Before spending
any time diagnosing more carefully, do "make distclean", reconfigure,
rebuild, reinstall, then see if problem is still there.
(In theory, you can avoid this sort of failure with appropriate use
of --enable-depend, but personally I don't trust that too much.
I find that with ccache + autoconf cache + parallel build, rebuilding
completely is fast enough that it's something I just do routinely
after any git pull. I'd rather use up my remaining brain cells on
other kinds of problems...)
regards, tom lane I built with this: make distclean && ./configure
--prefix=$HOME/programs/postgresql-master --with-openssl --with-llvm && make -j
8 install-world-contrib-recurse install-world-doc-recurse
It's probably caused by the data: visena=# select count(*) from
origo_email_part_headervalue;
count
----------
14609516
(1 row)
I'll see if I can create a self contained example.
--
Andreas Joseph Krogh
Andreas Joseph Krogh <andreas@visena.com> writes:
I built with this: make distclean && ./configure
--prefix=$HOME/programs/postgresql-master --with-openssl --with-llvm && make -j
8 install-world-contrib-recurse install-world-doc-recurse
--with-llvm, eh? Does it reproduce without that? What platform is
this on, what LLVM version?
I'll see if I can create a self contained example.
Please.
regards, tom lane
På tirsdag 30. april 2019 kl. 15:53:50, skrev Tom Lane <tgl@sss.pgh.pa.us
<mailto:tgl@sss.pgh.pa.us>>: Andreas Joseph Krogh <andreas@visena.com> writes:
I built with this: make distclean && ./configure
--prefix=$HOME/programs/postgresql-master --with-openssl --with-llvm &&
make -j
8 install-world-contrib-recurse install-world-doc-recurse
--with-llvm, eh? Does it reproduce without that? What platform is
this on, what LLVM version?
I'll see if I can create a self contained example.
Please.
regards, tom lane Ubuntu 19.04 $ llvm-config --version
8.0.0
"--with-llvm" was something I had from when pg-11 was master. It might not be
needed anymore? I'm trying a fresh build without --with-llvm and reload of data
now.
--
Andreas Joseph Krogh
På tirsdag 30. april 2019 kl. 16:03:04, skrev Andreas Joseph Krogh <
andreas@visena.com <mailto:andreas@visena.com>>: På tirsdag 30. april 2019 kl.
15:53:50, skrev Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>>:
Andreas Joseph Krogh <andreas@visena.com> writes:
I built with this: make distclean && ./configure
--prefix=$HOME/programs/postgresql-master --with-openssl --with-llvm &&
make -j
8 install-world-contrib-recurse install-world-doc-recurse
--with-llvm, eh? Does it reproduce without that? What platform is
this on, what LLVM version?
I'll see if I can create a self contained example.
Please.
regards, tom lane Ubuntu 19.04 $ llvm-config --version
8.0.0
"--with-llvm" was something I had from when pg-11 was master. It might not be
needed anymore? I'm trying a fresh build without --with-llvm and reload of data
now. Yep, happens without --with-llvm also. I'll try to load only the necessary
table(s) to reproduce. --
Andreas Joseph Krogh
Please fix or abstain from using the MUA that produces this monstrosity
of a Subject: "Sv: Sv: Re: Sv: Re: ERROR: failed to add item to the
index page"
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
På tirsdag 30. april 2019 kl. 16:27:05, skrev Andreas Joseph Krogh <
andreas@visena.com <mailto:andreas@visena.com>>: [snip] Yep, happens without
--with-llvm also. I'll try to load only the necessary table(s) to reproduce. I
have a 1.4GB dump (only one table) which reliably reproduces this error.
Shall I share it off-list? --
Andreas Joseph Krogh
On Tue, Apr 30, 2019 at 9:44 AM Andreas Joseph Krogh
<andreas@visena.com> wrote:
I have a 1.4GB dump (only one table) which reliably reproduces this error.
Shall I share it off-list?
I would be quite interested in this, too, since there is a chance that
it's my bug.
--
Peter Geoghegan
Andreas Joseph Krogh <andreas@visena.com> writes:
I have a 1.4GB dump (only one table) which reliably reproduces this error.
Shall I share it off-list? --
That's awfully large :-(. How do you have in mind to transmit it?
Maybe you could write a short script that generates dummy data
to reproduce the problem?
regards, tom lane
On Tue, Apr 30, 2019 at 9:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andreas Joseph Krogh <andreas@visena.com> writes:
I have a 1.4GB dump (only one table) which reliably reproduces this error.
Shall I share it off-list? --That's awfully large :-(. How do you have in mind to transmit it?
I've send dumps that were larger than that by providing a Google drive
link. Something like that should work reasonably well.
--
Peter Geoghegan
På tirsdag 30. april 2019 kl. 18:48:45, skrev Peter Geoghegan <pg@bowt.ie
<mailto:pg@bowt.ie>>: On Tue, Apr 30, 2019 at 9:47 AM Tom Lane
<tgl@sss.pgh.pa.us> wrote:
Andreas Joseph Krogh <andreas@visena.com> writes:
I have a 1.4GB dump (only one table) which reliably reproduces this error.
Shall I share it off-list? --That's awfully large :-(. How do you have in mind to transmit it?
I've send dumps that were larger than that by providing a Google drive
link. Something like that should work reasonably well. I've sent you guys a
link (Google Drive) off-list.
--
Andreas Joseph Krogh
On Tue, Apr 30, 2019 at 9:56 AM Andreas Joseph Krogh <andreas@visena.com> wrote:
I've sent you guys a link (Google Drive) off-list.
I'll start investigating the problem right away.
Thanks
--
Peter Geoghegan
On Tue, Apr 30, 2019 at 9:59 AM Peter Geoghegan <pg@bowt.ie> wrote:
I'll start investigating the problem right away.
I have found what the problem is. I simply neglected to make a
conservative assumption about suffix truncation needing to add a heap
TID to a leaf page's new high key in nbtsort.c (following commit
dd299df8189), even though I didn't make the same mistake in
nbtsplitloc.c. Not sure how I managed to make such a basic error.
Andreas' test case works fine with the attached patch. I won't push a
fix for this today.
--
Peter Geoghegan
Attachments:
0001-Tentative-fix-for-nbtsort.c-space-bug.patchapplication/octet-stream; name=0001-Tentative-fix-for-nbtsort.c-space-bug.patchDownload
From 7c2c580e16d3d32db4a622e99100cb8ff4ba6b7c Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Tue, 30 Apr 2019 10:55:52 -0700
Subject: [PATCH] Tentative fix for nbtsort.c space bug.
---
src/backend/access/nbtree/nbtsort.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 9ac4c1e1c0..f00e9a6e30 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -879,7 +879,8 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
* fillfactor. However, we must put at least two items on each page, so
* disregard fillfactor if we don't have that many.
*/
- if (pgspc < itupsz || (pgspc < state->btps_full && last_off > P_FIRSTKEY))
+ if (pgspc < itupsz + MAXALIGN(sizeof(ItemPointerData)) ||
+ (pgspc < state->btps_full && last_off > P_FIRSTKEY))
{
/*
* Finish off the page and write it out.
--
2.17.1
På tirsdag 30. april 2019 kl. 19:58:31, skrev Peter Geoghegan <pg@bowt.ie
<mailto:pg@bowt.ie>>: On Tue, Apr 30, 2019 at 9:59 AM Peter Geoghegan
<pg@bowt.ie> wrote:
I'll start investigating the problem right away.
I have found what the problem is. I simply neglected to make a
conservative assumption about suffix truncation needing to add a heap
TID to a leaf page's new high key in nbtsort.c (following commit
dd299df8189), even though I didn't make the same mistake in
nbtsplitloc.c. Not sure how I managed to make such a basic error.
Andreas' test case works fine with the attached patch. I won't push a
fix for this today.
--
Peter Geoghegan Nice, thanks! --
Andreas Joseph Krogh
On Tue, Apr 30, 2019 at 11:54 AM Andreas Joseph Krogh
<andreas@visena.com> wrote:
Nice, thanks!
Thanks for the report!
--
Peter Geoghegan
Hi,
On 2019-04-30 20:54:45 +0200, Andreas Joseph Krogh wrote:
P� tirsdag 30. april 2019 kl. 19:58:31, skrev Peter Geoghegan <pg@bowt.ie
<mailto:pg@bowt.ie>>: On Tue, Apr 30, 2019 at 9:59 AM Peter Geoghegan
<pg@bowt.ie> wrote:I'll start investigating the problem right away.
I have found what the problem is. I simply neglected to make a
conservative assumption about suffix truncation needing to add a heap
TID to a leaf page's new high key in nbtsort.c (following commit
dd299df8189), even though I didn't make the same mistake in
nbtsplitloc.c. Not sure how I managed to make such a basic error.Andreas' test case works fine with the attached patch. I won't push a
fix for this today.--
Peter Geoghegan Nice, thanks! --
Andreas Joseph Krogh
Andreas, unfortunately your emails are pretty unreadable. Check the
quoted email, and the web archive:
/messages/by-id/VisenaEmail.41.51d7719d814a1f54.16a6f98a5e9@tc7-visena
Greetings,
Andres Freund
På tirsdag 30. april 2019 kl. 20:59:43, skrev Andres Freund <andres@anarazel.de
<mailto:andres@anarazel.de>>: [...]
Andreas, unfortunately your emails are pretty unreadable. Check the
quoted email, and the web archive:
/messages/by-id/VisenaEmail.41.51d7719d814a1f54.16a6f98a5e9@tc7-visena
Greetings,
Andres Freund
I know that the text-version is quite unreadable, especially when quoting. My
MUA is web-based and uses CKEditor for composing, and it doesn't care much to
try to format the text/plain version (I know because I've written it, yes and
have yet to fix the Re: Sv: Re: Sv: subject issue...). But it has tons of
benefits CRM- and usage-wise so I prefer to use it. But - how use text/plain
these days:-) --
Andreas Joseph Krogh
Hi,
On 2019-04-30 21:23:21 +0200, Andreas Joseph Krogh wrote:
P� tirsdag 30. april 2019 kl. 20:59:43, skrev Andres Freund <andres@anarazel.de
<mailto:andres@anarazel.de>>: [...]
Andreas, unfortunately your emails are pretty unreadable. Check the
quoted email, and the web archive:/messages/by-id/VisenaEmail.41.51d7719d814a1f54.16a6f98a5e9@tc7-visena
Greetings,
Andres Freund
I know that the text-version is quite unreadable, especially when quoting. My
MUA is web-based and uses CKEditor for composing, and it doesn't care much to
try to format the text/plain version (I know because I've written it, yes and
have yet to fix the Re: Sv: Re: Sv: subject issue...). But it has tons of
benefits CRM- and usage-wise so I prefer to use it. But - how use text/plain
these days:-) --
The standard on pg lists is to write in a manner that's usable for both
text mail readers and the archive. Doesn't terribly matter to the
occasional one-off poster on -general, but you're not that... So please
try to write readable mails for the PG lists.
Greetings,
Andres Freund
På tirsdag 30. april 2019 kl. 21:26:52, skrev Andres Freund <andres@anarazel.de
<mailto:andres@anarazel.de>>: > [...]
The standard on pg lists is to write in a manner that's usable for both >
text mail readers and the archive. Doesn't terribly matter to the > occasional
one-off poster on -general, but you're not that... So please > try to write
readable mails for the PG lists.
Greetings,
Andres Freund ACK. --
Andreas Joseph Krogh
On Tue, Apr 30, 2019 at 10:58 AM Peter Geoghegan <pg@bowt.ie> wrote:j
I have found what the problem is. I simply neglected to make a
conservative assumption about suffix truncation needing to add a heap
TID to a leaf page's new high key in nbtsort.c (following commit
dd299df8189), even though I didn't make the same mistake in
nbtsplitloc.c. Not sure how I managed to make such a basic error.
Attached is a much more polished version of the same patch. I tried to
make clear how the "page full" test (the test that has been fixed to
take heap TID space for high key into account) is related to other
close-by code, such as the tuple space limit budget within
_bt_check_third_page(), and the code that sets up an actual call to
_bt_truncate().
I'll wait a few days before pushing this. This version doesn't feel
too far off being committable. I tested it with some of the CREATE
INDEX tests that I developed during development of the nbtree unique
keys project, including a test with tuples that are precisely at the
1/3 of a page threshold. The new definition of 1/3 of a page takes
high key heap TID overhead into account -- see _bt_check_third_page().
--
Peter Geoghegan
Attachments:
v2-0001-Fix-nbtsort.c-s-page-space-accounting.patchapplication/octet-stream; name=v2-0001-Fix-nbtsort.c-s-page-space-accounting.patchDownload
From 6bf12b80259c48ee613a56e85f37ebb73cfdf819 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Tue, 30 Apr 2019 10:55:52 -0700
Subject: [PATCH v2] Fix nbtsort.c's page space accounting.
Commit dd299df8189 failed to have nbtsort.c conservatively assume that
suffix truncation was ineffective (i.e. that it would represent heap TID
in new high key). Space for a possible heap TID in new leaf page high
key was budgeted within _bt_check_third_page(), but nbtsort.c's
definition of when a page is completely full didn't explicitly consider
high keys where heap TID is represented. When the page was deemed full,
it might already be too late: there might be insufficient space, because
the last existing non-pivot tuple on page gets replaced with new high
key that is slightly larger (larger by the space required to store a
MAXALIGN()'d heap TID item pointer).
To fix, bring nbtsort.c in line with nbtsplitloc.c, which already
explicitly assumes that new high key will need to have a heap TID added
when high key is formed on the leaf level.
Reported-By: Andreas Joseph Krogh
Discussion: https://postgr.es/m/VisenaEmail.c5.3ee7fe277d514162.16a6d785bea@tc7-visena
---
src/backend/access/nbtree/nbtsort.c | 44 ++++++++++++++++++++---------
1 file changed, 30 insertions(+), 14 deletions(-)
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 9ac4c1e1c0..7db468bb58 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -841,6 +841,7 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
OffsetNumber last_off;
Size pgspc;
Size itupsz;
+ bool isleaf;
/*
* This is a handy place to check for cancel interrupts during the btree
@@ -855,9 +856,13 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
pgspc = PageGetFreeSpace(npage);
itupsz = IndexTupleSize(itup);
itupsz = MAXALIGN(itupsz);
+ /* Leaf pages use suffix truncation, and need extra heap TID space */
+ isleaf = state->btps_level == 0;
/*
- * Check whether the item can fit on a btree page at all.
+ * Check whether the item can fit on page, while making sure that page has
+ * at least two tuples (in addition to page high key) before starting next
+ * page.
*
* Every newly built index will treat heap TID as part of the keyspace,
* which imposes the requirement that new high keys must occasionally have
@@ -870,16 +875,27 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
* the reserved space. This should never fail on internal pages.
*/
if (unlikely(itupsz > BTMaxItemSize(npage)))
- _bt_check_third_page(wstate->index, wstate->heap,
- state->btps_level == 0, npage, itup);
+ _bt_check_third_page(wstate->index, wstate->heap, isleaf, npage,
+ itup);
/*
- * Check to see if page is "full". It's definitely full if the item won't
- * fit. Otherwise, compare to the target freespace derived from the
- * fillfactor. However, we must put at least two items on each page, so
- * disregard fillfactor if we don't have that many.
+ * Page is definitely full if the new item won't fit. We take into
+ * account the possible need for heap TID space within _bt_truncate() when
+ * page is a leaf page. It is guaranteed that we can fit at least 2
+ * non-pivot tuples plus a high key with heap TID when finishing off a
+ * leaf page, because _bt_check_third_page() conservatively rejects
+ * oversized non-pivot tuples. (On internal pages we can always fit 3
+ * pivot tuples, including high key.)
+ *
+ * Most of the time, page is only "full" in the sense that inserting new
+ * tuple would cause us to exceed fillfactor-wise limit (no need to take
+ * heap TID space into account in this soft limit). However, we must
+ * always leave at least two items plus high key on each page before
+ * starting a new page, so disregard fillfactor if we don't have enough
+ * items to make that work yet.
*/
- if (pgspc < itupsz || (pgspc < state->btps_full && last_off > P_FIRSTKEY))
+ if (pgspc - (isleaf ? MAXALIGN(sizeof(ItemPointerData)) : 0) < itupsz ||
+ (pgspc < state->btps_full && last_off > P_FIRSTKEY))
{
/*
* Finish off the page and write it out.
@@ -889,7 +905,6 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
ItemId ii;
ItemId hii;
IndexTuple oitup;
- BTPageOpaque opageop = (BTPageOpaque) PageGetSpecialPointer(opage);
/* Create new page of same level */
npage = _bt_blnewpage(state->btps_level);
@@ -917,7 +932,7 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
ItemIdSetUnused(ii); /* redundant */
((PageHeader) opage)->pd_lower -= sizeof(ItemIdData);
- if (P_ISLEAF(opageop))
+ if (isleaf)
{
IndexTuple lastleft;
IndexTuple truncated;
@@ -944,8 +959,9 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
* to actually save space on the leaf page). We delete the
* original high key, and add our own truncated high key at the
* same offset. It's okay if the truncated tuple is slightly
- * larger due to containing a heap TID value, since this case is
- * known to _bt_check_third_page(), which reserves space.
+ * larger due to containing a heap TID value, since that was taken
+ * into account when we determined that page is full; the extra
+ * space must already be available on page.
*
* Note that the page layout won't be changed very much. oitup is
* already located at the physical beginning of tuple space, so we
@@ -979,9 +995,9 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
Assert((BTreeTupleGetNAtts(state->btps_minkey, wstate->index) <=
IndexRelationGetNumberOfKeyAttributes(wstate->index) &&
BTreeTupleGetNAtts(state->btps_minkey, wstate->index) > 0) ||
- P_LEFTMOST(opageop));
+ P_LEFTMOST((BTPageOpaque) PageGetSpecialPointer(opage)));
Assert(BTreeTupleGetNAtts(state->btps_minkey, wstate->index) == 0 ||
- !P_LEFTMOST(opageop));
+ !P_LEFTMOST((BTPageOpaque) PageGetSpecialPointer(opage)));
BTreeInnerTupleSetDownLink(state->btps_minkey, oblkno);
_bt_buildadd(wstate, state->btps_next, state->btps_minkey);
pfree(state->btps_minkey);
--
2.17.1
On Tue, Apr 30, 2019 at 6:28 PM Peter Geoghegan <pg@bowt.ie> wrote:
Attached is a much more polished version of the same patch. I tried to
make clear how the "page full" test (the test that has been fixed to
take heap TID space for high key into account) is related to other
close-by code, such as the tuple space limit budget within
_bt_check_third_page(), and the code that sets up an actual call to
_bt_truncate().
Pushed, though final version does the test a little differently. It
adds the required heap TID space to itupsz, rather than subtracting it
from pgspc. This is actually representative of the underlying logic,
and avoids unsigned underflow.
--
Peter Geoghegan
På torsdag 02. mai 2019 kl. 21:38:02, skrev Peter Geoghegan <pg@bowt.ie>:
Pushed, though final version does the test a little differently. It
adds the required heap TID space to itupsz, rather than subtracting it
from pgspc. This is actually representative of the underlying logic,
and avoids unsigned underflow. Thanks!
--
Andreas Joseph Krogh