[sqlsmith] PANIC: failed to add BRIN tuple

Started by Andreas Seltenreich · over 9 years ago · 10 messages
#1 Andreas Seltenreich
seltenreich@gmx.de

There was one instance of this PANIC when testing with the regression db
of master at 50e5315.

,----
| WARNING: specified item offset is too large
| PANIC: failed to add BRIN tuple
| server closed the connection unexpectedly
`----

It is reproducible with the query below on this instance only. I've put
the data directory (20MB) here:

http://ansel.ydns.eu/~andreas/brincrash.tar.xz

The instance was running on Debian Jessie amd64. Query and Backtrace
below.

regards,
Andreas

--8<---------------cut here---------------start------------->8---
update public.brintest set byteacol = null, charcol =
public.brintest.charcol, int2col = null, int4col =
public.brintest.int4col, textcol = public.brintest.textcol, oidcol =
cast(coalesce(cast(coalesce(null, public.brintest.oidcol) as oid),
pg_catalog.pg_my_temp_schema()) as oid), tidcol =
public.brintest.tidcol, float8col = public.brintest.float8col,
macaddrcol = null, cidrcol = public.brintest.cidrcol, datecol =
public.brintest.datecol, timecol = public.brintest.timecol,
timestamptzcol = pg_catalog.clock_timestamp(), intervalcol =
public.brintest.intervalcol, timetzcol = public.brintest.timetzcol,
bitcol = public.brintest.bitcol, varbitcol =
public.brintest.varbitcol, uuidcol = null returning
public.brintest.byteacol as c0;
--8<---------------cut here---------------end--------------->8---

Core was generated by `postgres: smith regression [local] UPDATE '.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fd2cda67067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007fd2cda67067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007fd2cda68448 in __GI_abort () at abort.c:89
#2 0x00000000007ec969 in errfinish (dummy=dummy@entry=0) at elog.c:557
#3 0x00000000007f011c in elog_finish (elevel=elevel@entry=20, fmt=fmt@entry=0x82ca8f "failed to add BRIN tuple") at elog.c:1378
#4 0x0000000000470618 in brin_doupdate (idxrel=0x101f4c0, pagesPerRange=1, revmap=0x10d20e50, heapBlk=8, oldbuf=2878, oldoff=9, origtup=0x10d864a8, origsz=6144, newtup=0x5328a88, newsz=6144, samepage=1 '\001') at brin_pageops.c:184
#5 0x000000000046e5bb in brininsert (idxRel=0x101f4c0, values=0x211b, nulls=0x6 <error: Cannot access memory at address 0x6>, heaptid=0xffffffffffffffff, heapRel=0x7fd2ce6fd700, checkUnique=UNIQUE_CHECK_NO) at brin.c:244
#6 0x00000000005d887f in ExecInsertIndexTuples (slot=0xe92a560, tupleid=0x10d21084, estate=0x9ed8a68, noDupErr=0 '\000', specConflict=0x0, arbiterIndexes=0x0) at execIndexing.c:383
#7 0x00000000005f74d5 in ExecUpdate (tupleid=0x7ffe11ea74a0, oldtuple=0x211b, slot=0xe92a560, planSlot=0xffffffffffffffff, epqstate=0x7fd2ce6fd700, estate=0x9ed8a68, canSetTag=1 '\001') at nodeModifyTable.c:1015
#8 0x00000000005f7b6c in ExecModifyTable (node=0x9ed8d28) at nodeModifyTable.c:1501
#9 0x00000000005dd5d8 in ExecProcNode (node=node@entry=0x9ed8d28) at execProcnode.c:396
#10 0x00000000005d962f in ExecutePlan (dest=0xde86040, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_UPDATE, use_parallel_mode=<optimized out>, planstate=0x9ed8d28, estate=0x9ed8a68) at execMain.c:1567
#11 standard_ExecutorRun (queryDesc=0xde860d8, direction=<optimized out>, count=0) at execMain.c:338
#12 0x00000000006f74c9 in ProcessQuery (plan=<optimized out>, sourceText=0xd74e88 "update public.brintest[...]", params=0x0, dest=0xde86040, completionTag=0x7ffe11ea7670 "") at pquery.c:185
#13 0x00000000006f775f in PortalRunMulti (portal=portal@entry=0xde8abf0, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0xde86040, altdest=0xc96680 <donothingDR>, completionTag=completionTag@entry=0x7ffe11ea7670 "") at pquery.c:1267
#14 0x00000000006f7a0c in FillPortalStore (portal=portal@entry=0xde8abf0, isTopLevel=isTopLevel@entry=1 '\001') at pquery.c:1044
#15 0x00000000006f845d in PortalRun (portal=0xde8abf0, count=9223372036854775807, isTopLevel=<optimized out>, dest=0x9ee76b8, altdest=0x9ee76b8, completionTag=0x7ffe11ea7a20 "") at pquery.c:782
#16 0x00000000006f5c63 in exec_simple_query (query_string=<optimized out>) at postgres.c:1094
#17 PostgresMain (argc=233352176, argv=0xe8ad358, dbname=0xcf7508 "regression", username=0xe8ad3b0 "Xӊ\016") at postgres.c:4059
#18 0x000000000046c8b2 in BackendRun (port=0xd1c580) at postmaster.c:4258
#19 BackendStartup (port=0xd1c580) at postmaster.c:3932
#20 ServerLoop () at postmaster.c:1690
#21 0x000000000069081e in PostmasterMain (argc=argc@entry=4, argv=argv@entry=0xcf64f0) at postmaster.c:1298
#22 0x000000000046d80d in main (argc=4, argv=0xcf64f0) at main.c:228


#2 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andreas Seltenreich (#1)
Re: [sqlsmith] PANIC: failed to add BRIN tuple

Andreas Seltenreich wrote:

> There was one instance of this PANIC when testing with the regression db
> of master at 50e5315.
>
> ,----
> | WARNING: specified item offset is too large
> | PANIC: failed to add BRIN tuple
> | server closed the connection unexpectedly
> `----
>
> It is reproducible with the query below on this instance only. I've put
> the data directory (20MB) here:
>
> http://ansel.ydns.eu/~andreas/brincrash.tar.xz

Thanks for all the details. I'll be looking into this tomorrow.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#3 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andreas Seltenreich (#1)
Re: [sqlsmith] PANIC: failed to add BRIN tuple

Andreas Seltenreich wrote:

> There was one instance of this PANIC when testing with the regression db
> of master at 50e5315.
>
> ,----
> | WARNING: specified item offset is too large
> | PANIC: failed to add BRIN tuple
> | server closed the connection unexpectedly
> `----
>
> It is reproducible with the query below on this instance only.

Hm, so this is an over-eager check. As I understand it, what seems to
have happened is that the page was filled with up to 10 tuples, then
tuples 0-8 were removed, probably moved to other pages. When this
update runs, it needs to update the remaining tuple, which is a normal
thing for BRIN to do. But BRIN doesn't want the item number to change,
because it is referenced from the "revmap"; so it removes the item and
then wants to insert it again at the same offset. But bufpage.c does
not expect callers to put items beyond one past the last currently used
item, so it raises a warning and returns without doing the insert.
This drives BRIN crazy.
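
In code terms, the sequence on the BRIN side and the check that rejects
it look roughly like this (simplified excerpts of the code involved, not
the literal source):

/* brin_doupdate(), same-page update: the tuple is removed and then
 * re-inserted at its original offset, because the revmap entry keeps
 * pointing at (block, oldoff). */
PageIndexDeleteNoCompact(oldpage, &oldoff, 1);
if (PageAddItem(oldpage, (Item) newtup, newsz, oldoff,
				true /* overwrite */, false /* is_heap */) ==
	InvalidOffsetNumber)
	elog(ERROR, "failed to add BRIN tuple");

/* PageAddItem(), bufpage.c: "limit" is
 * OffsetNumberNext(PageGetMaxOffsetNumber(page)), i.e. one past the
 * last existing line pointer, and any offset beyond that is refused.
 * That is where the warning comes from; the elog(ERROR) above then
 * escalates to the reported PANIC because it fires inside a critical
 * section. */
if (offsetNumber > limit)
{
	elog(WARNING, "specified item offset is too large");
	return InvalidOffsetNumber;
}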

I tried simply removing the "return InvalidOffsetNumber" line from that
block (so that it still emits the warning but does execute the index
insert), and everything seems to behave correctly. I suppose a simple
fix would be to add a flag to PageAddItem() and skip this block in that
case, but that would break the ABI in 9.5. I'm not yet sure what a good
fix is. Maybe a new routine specific to the needs of BRIN is called
for.
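
To illustrate (just a sketch, nothing settled yet): one possible shape
is to keep the current signature as a thin wrapper and move the real
work into a flags-taking routine, so third-party callers keep working:

/* sketch only: a flags bitmask instead of the two booleans */
#define PAI_OVERWRITE			(1 << 0)
#define PAI_IS_HEAP				(1 << 1)
#define PAI_ALLOW_LARGE_OFFSET	(1 << 2)

extern OffsetNumber PageAddItemFlags(Page page, Item item, Size size,
									 OffsetNumber offsetNumber, int flags);

/* existing signature preserved, so the 9.5 ABI does not change */
OffsetNumber
PageAddItem(Page page, Item item, Size size, OffsetNumber offsetNumber,
			bool overwrite, bool is_heap)
{
	return PageAddItemFlags(page, item, size, offsetNumber,
							(overwrite ? PAI_OVERWRITE : 0) |
							(is_heap ? PAI_IS_HEAP : 0));
}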

I would also like to come up with a way to cover this scenario with a
new regression test.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#4 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andreas Seltenreich (#1)
1 attachment(s)
Re: [sqlsmith] PANIC: failed to add BRIN tuple

Andreas Seltenreich wrote:

> There was one instance of this PANIC when testing with the regression db
> of master at 50e5315.
>
> ,----
> | WARNING: specified item offset is too large
> | PANIC: failed to add BRIN tuple
> | server closed the connection unexpectedly
> `----
>
> It is reproducible with the query below on this instance only. I've put
> the data directory (20MB) here:
>
> http://ansel.ydns.eu/~andreas/brincrash.tar.xz
>
> The instance was running on Debian Jessie amd64. Query and Backtrace
> below.

How long does it take for you to reproduce this panic in the unpatched
code? This patch fixes the immediate problem, but it'd be good if the
bug is fixed more generally.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

brincrash.patch (text/x-diff; charset=us-ascii)
commit 7ecb10d515980f592a3bd7fb27f883af8404d762
Author:     Alvaro Herrera <alvherre@alvh.no-ip.org>
AuthorDate: Tue May 24 19:09:37 2016 -0400
CommitDate: Tue May 24 19:10:01 2016 -0400

    fix PageAddItem BRIN bug

diff --git a/src/backend/access/brin/brin_pageops.c b/src/backend/access/brin/brin_pageops.c
index d0ca485..a49eccd 100644
--- a/src/backend/access/brin/brin_pageops.c
+++ b/src/backend/access/brin/brin_pageops.c
@@ -179,8 +179,9 @@ brin_doupdate(Relation idxrel, BlockNumber pagesPerRange,
 
 		START_CRIT_SECTION();
 		PageIndexDeleteNoCompact(oldpage, &oldoff, 1);
-		if (PageAddItem(oldpage, (Item) newtup, newsz, oldoff, true,
-						false) == InvalidOffsetNumber)
+		if (PageAddItemFlags(oldpage, (Item) newtup, newsz, oldoff,
+							 PAI_OVERWRITE | PAI_ALLOW_LARGE_OFFSET) ==
+			InvalidOffsetNumber)
 			elog(ERROR, "failed to add BRIN tuple");
 		MarkBufferDirty(oldbuf);
 
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 0189bc9..cc7a4d8 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -153,12 +153,12 @@ PageIsVerified(Page page, BlockNumber blkno)
 
 
 /*
- *	PageAddItem
+ *	PageAddItemFlags
  *
  *	Add an item to a page.  Return value is offset at which it was
  *	inserted, or InvalidOffsetNumber if there's not room to insert.
  *
- *	If overwrite is true, we just store the item at the specified
+ *	If flag PAI_OVERWRITE is set, we just store the item at the specified
  *	offsetNumber (which must be either a currently-unused item pointer,
  *	or one past the last existing item).  Otherwise,
  *	if offsetNumber is valid and <= current max offset in the page,
@@ -167,18 +167,20 @@ PageIsVerified(Page page, BlockNumber blkno)
  *	If offsetNumber is not valid, then assign one by finding the first
  *	one that is both unused and deallocated.
  *
- *	If is_heap is true, we enforce that there can't be more than
+ *	If flag PAI_IS_HEAP is set, we enforce that there can't be more than
  *	MaxHeapTuplesPerPage line pointers on the page.
  *
+ *	If flag PAI_ALLOW_LARGE_OFFSET is not set, we disallow placing items
+ *	beyond one past the last existing item.
+ *
  *	!!! EREPORT(ERROR) IS DISALLOWED HERE !!!
  */
 OffsetNumber
-PageAddItem(Page page,
-			Item item,
-			Size size,
-			OffsetNumber offsetNumber,
-			bool overwrite,
-			bool is_heap)
+PageAddItemFlags(Page page,
+				 Item item,
+				 Size size,
+				 OffsetNumber offsetNumber,
+				 int flags)
 {
 	PageHeader	phdr = (PageHeader) page;
 	Size		alignedSize;
@@ -209,7 +211,7 @@ PageAddItem(Page page,
 	if (OffsetNumberIsValid(offsetNumber))
 	{
 		/* yes, check it */
-		if (overwrite)
+		if ((flags & PAI_OVERWRITE) != 0)
 		{
 			if (offsetNumber < limit)
 			{
@@ -257,13 +259,18 @@ PageAddItem(Page page,
 		}
 	}
 
-	if (offsetNumber > limit)
+	/*
+	 * Reject placing items beyond the first unused line pointer, unless
+	 * caller explicitly asked for that.
+	 */
+	if (((flags & PAI_ALLOW_LARGE_OFFSET) == 0) && offsetNumber > limit)
 	{
 		elog(WARNING, "specified item offset is too large");
 		return InvalidOffsetNumber;
 	}
 
-	if (is_heap && offsetNumber > MaxHeapTuplesPerPage)
+	/* Reject placing items beyond heap boundary, if heap */
+	if (((flags & PAI_IS_HEAP) != 0) && offsetNumber > MaxHeapTuplesPerPage)
 	{
 		elog(WARNING, "can't put more than MaxHeapTuplesPerPage items in a heap page");
 		return InvalidOffsetNumber;
@@ -324,6 +331,21 @@ PageAddItem(Page page,
 }
 
 /*
+ * PageAddItemFlags shim, for compatibility.
+ */
+OffsetNumber
+PageAddItem(Page page, Item item, Size size, OffsetNumber offsetNumber,
+			bool overwrite, bool is_heap)
+{
+	int		flags = 0;
+
+	flags |= overwrite ? PAI_OVERWRITE : 0;
+	flags |= is_heap ? PAI_IS_HEAP : 0;
+
+	return PageAddItemFlags(page, item, size, offsetNumber, flags);
+}
+
+/*
  * PageGetTempPage
  *		Get a temporary page in local memory for special processing.
  *		The returned page is not initialized at all; caller must do that.
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 63053d4..5ca1db7 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -407,11 +407,16 @@ do { \
  *		extern declarations
  * ----------------------------------------------------------------
  */
+#define PAI_OVERWRITE			(1 << 0)
+#define PAI_IS_HEAP				(1 << 1)
+#define PAI_ALLOW_LARGE_OFFSET	(1 << 2)
 
 extern void PageInit(Page page, Size pageSize, Size specialSize);
 extern bool PageIsVerified(Page page, BlockNumber blkno);
 extern OffsetNumber PageAddItem(Page page, Item item, Size size,
 			OffsetNumber offsetNumber, bool overwrite, bool is_heap);
+extern OffsetNumber PageAddItemFlags(Page page, Item item, Size size,
+				 OffsetNumber offsetNumber, int flags);
 extern Page PageGetTempPage(Page page);
 extern Page PageGetTempPageCopy(Page page);
 extern Page PageGetTempPageCopySpecial(Page page);
#5 Andreas Seltenreich
seltenreich@gmx.de
In reply to: Alvaro Herrera (#4)
Re: [sqlsmith] PANIC: failed to add BRIN tuple

Alvaro Herrera writes:

> How long does it take for you to reproduce this panic in the unpatched
> code?

Very long, I'm afraid. I only observed it once, and according to the
logging database, about 4e9 random queries were generated since testing
with 9.5 code.

I could probably speed it up by creating lots of additional BRIN indexes
in the regression database, and by compiling a sqlsmith that generates
update statements only.


#6 Andreas Seltenreich
seltenreich@gmx.de
In reply to: Andreas Seltenreich (#5)
Re: [sqlsmith] PANIC: failed to add BRIN tuple

I wrote:

> Alvaro Herrera writes:
>
>> How long does it take for you to reproduce this panic in the unpatched
>> code?
>
> I could probably speed it up by creating lots of additional BRIN indexes
> in the regression database, and by compiling a sqlsmith that generates
> update statements only.

This actually worked. Sqlsmith triggered the BRIN panic twice in 80e6
queries (vs. once in 4e9 before). I pushed the modified version to the
branch "modify-heavy". Re-fuzzing now with your patch applied.

andreas


#7 Andreas Seltenreich
seltenreich@gmx.de
In reply to: Andreas Seltenreich (#6)
Re: [sqlsmith] PANIC: failed to add BRIN tuple

I wrote:

> Re-fuzzing now with your patch applied.

This so far yielded three BRIN core dumps on different machines with the
same backtraces. Crash recovery fails in these cases.

I've put the data directory here (before recovery):

http://ansel.ydns.eu/~andreas/brincrash2-spidew.tar.xz (9.1M)

Corresponding backtraces of the backend and startup core files below.

regards,
Andreas

Core was generated by `postgres: smith brintest [local] UPDATE '.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f5a49a9c067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f5a49a9c067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f5a49a9d448 in __GI_abort () at abort.c:89
#2 0x00000000007ec979 in errfinish (dummy=dummy@entry=0) at elog.c:557
#3 0x00000000007f012c in elog_finish (elevel=elevel@entry=20, fmt=fmt@entry=0x989cf0 "incorrect index offsets supplied") at elog.c:1378
#4 0x00000000006eda2f in PageIndexDeleteNoCompact (page=page@entry=0x7f5a48c6f200 "Y", itemnos=itemnos@entry=0x7ffe6d0d0abc, nitems=nitems@entry=1) at bufpage.c:1011
#5 0x0000000000470119 in brin_doupdate (idxrel=0x1c5a530, pagesPerRange=1, revmap=0x91a7d90, heapBlk=2166, oldbuf=3782, oldoff=2, origtup=0x719db80, origsz=3656, newtup=0x754e188, newsz=3656, samepage=1 '\001') at brin_pageops.c:181
#6 0x000000000046e5db in brininsert (idxRel=0x1c5a530, values=0x6591, nulls=0x6 <error: Cannot access memory at address 0x6>, heaptid=0xffffffffffffffff, heapRel=0x7f5a4a72f700, checkUnique=UNIQUE_CHECK_NO) at brin.c:244
#7 0x00000000005d888f in ExecInsertIndexTuples (slot=0x91a4870, tupleid=0x91a7df4, estate=0x91b9b38, noDupErr=0 '\000', specConflict=0x0, arbiterIndexes=0x0) at execIndexing.c:383
#8 0x00000000005f74e5 in ExecUpdate (tupleid=0x7ffe6d0d0ed0, oldtuple=0x6591, slot=0x91a4870, planSlot=0xffffffffffffffff, epqstate=0x7f5a4a72f700, estate=0x91b9b38, canSetTag=1 '\001') at nodeModifyTable.c:1015
#9 0x00000000005f7b7c in ExecModifyTable (node=0x71861c0) at nodeModifyTable.c:1501
#10 0x00000000005dd5e8 in ExecProcNode (node=node@entry=0x71861c0) at execProcnode.c:396
#11 0x00000000005d963f in ExecutePlan (dest=0x17bb870, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_UPDATE, use_parallel_mode=<optimized out>, planstate=0x71861c0, estate=0x91b9b38) at execMain.c:1567
#12 standard_ExecutorRun (queryDesc=0x17bb908, direction=<optimized out>, count=0) at execMain.c:338
#13 0x00000000006f74d9 in ProcessQuery (plan=<optimized out>, sourceText=0x1670e18 "update public.brintest set \n charcol = null, \n namecol = pg_catalog.name(cast(null as character varying)\n ), \n int8col = null, \n int2col = public.brintest.int2col, \n textcol = null, \n float4col = null, \n float8col = cast(coalesce(null,\n\n public.brintest.float8col) as double precision), \n inetcol = public.brintest.inetcol, \n cidrcol = public.brintest.cidrcol, \n bpcharcol = null, \n datecol = (select datecol from public.brintest limit 1 offset 25)\n, \n timecol = (select timecol from public.brintest limit 1 offset 42)\n, \n timetzcol = (select timetzcol from public.brintest limit 1 offset 3)\n, \n numericcol = (select numericcol from public.brintest limit 1 offset 32)\n, \n uuidcol = public.brintest.uuidcol\nreturning \n public.brintest.datecol as c0;", params=0x0, dest=0x17bb870, completionTag=0x7ffe6d0d10a0 "") at pquery.c:185
#14 0x00000000006f776f in PortalRunMulti (portal=portal@entry=0x1611b68, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x17bb870, altdest=0xc96680 <donothingDR>, completionTag=completionTag@entry=0x7ffe6d0d10a0 "") at pquery.c:1267
#15 0x00000000006f7a1c in FillPortalStore (portal=portal@entry=0x1611b68, isTopLevel=isTopLevel@entry=1 '\001') at pquery.c:1044
#16 0x00000000006f846d in PortalRun (portal=0x1611b68, count=9223372036854775807, isTopLevel=<optimized out>, dest=0x756c870, altdest=0x756c870, completionTag=0x7ffe6d0d1450 "") at pquery.c:782
#17 0x00000000006f5c73 in exec_simple_query (query_string=<optimized out>) at postgres.c:1094
#18 PostgresMain (argc=23141224, argv=0x2cab518, dbname=0x15f3578 "brintest", username=0x2cab570 "\030\265\312\002") at postgres.c:4059
#19 0x000000000046c8d2 in BackendRun (port=0x1618880) at postmaster.c:4258
#20 BackendStartup (port=0x1618880) at postmaster.c:3932
#21 ServerLoop () at postmaster.c:1690
#22 0x00000000006907fe in PostmasterMain (argc=argc@entry=4, argv=argv@entry=0x15f2560) at postmaster.c:1298
#23 0x000000000046d82d in main (argc=4, argv=0x15f2560) at main.c:228
(gdb) frame 4
#4 0x00000000006eda2f in PageIndexDeleteNoCompact (page=page@entry=0x7f5a48c6f200 "Y", itemnos=itemnos@entry=0x7ffe6d0d0abc, nitems=nitems@entry=1) at bufpage.c:1011
1011 elog(ERROR, "incorrect index offsets supplied");
(gdb) list
1006 }
1007 }
1008
1009 /* this will catch invalid or out-of-order itemnos[] */
1010 if (nextitm != nitems)
1011 elog(ERROR, "incorrect index offsets supplied");
1012
1013 if (empty)
1014 {
1015 /* Page is completely empty, so just reset it quickly */
(gdb) up
#5 0x0000000000470119 in brin_doupdate (idxrel=0x1c5a530, pagesPerRange=1, revmap=0x91a7d90, heapBlk=2166, oldbuf=3782, oldoff=2, origtup=0x719db80, origsz=3656, newtup=0x754e188, newsz=3656, samepage=1 '\001') at brin_pageops.c:181
(gdb) list
176 brin_initialize_empty_new_buffer(idxrel, newbuf);
177 UnlockReleaseBuffer(newbuf);
178 }
179
180 START_CRIT_SECTION();
181 PageIndexDeleteNoCompact(oldpage, &oldoff, 1);
182 if (PageAddItemFlags(oldpage, (Item) newtup, newsz, oldoff,
183 PAI_OVERWRITE | PAI_ALLOW_LARGE_OFFSET) ==
184 InvalidOffsetNumber)
185 elog(ERROR, "failed to add BRIN tuple");

2016-05-25 13:54:55.955 CEST [14614] LOG: redo starts at 59/DB000050
2016-05-25 13:54:56.014 CEST [14614] PANIC: brin_xlog_samepage_update: failed to add tuple
2016-05-25 13:54:56.014 CEST [14614] CONTEXT: xlog redo at 59/EC09EBD0 for BRIN/SAMEPAGE_UPDATE: offnum 2
$ gdb postgres data/postgres.14614@.core
Core was generated by `postgres: startup process recovering 000000010000005'.
Program terminated with signal SIGABRT, Aborted.
(gdb) bt
bt
#0 0x00007f234df5d507 in __GI_raise (sig=sig@entry=6)
at ../sysdeps/unix/sysv/linux/raise.c:55
#1 0x00007f234df5e8da in __GI_abort () at abort.c:89
#2 0x00000000007ed289 in errfinish (dummy=dummy@entry=0) at elog.c:557
#3 0x00000000007f0a5c in elog_finish (elevel=elevel@entry=22,
fmt=fmt@entry=0x82ec08 "brin_xlog_samepage_update: failed to add tuple")
at elog.c:1378
#4 0x0000000000473be5 in brin_xlog_samepage_update (record=<optimized out>)
at brin_xlog.c:198
#5 brin_redo (record=<optimized out>) at brin_xlog.c:279
#6 0x00000000004fbeaa in StartupXLOG () at xlog.c:6871
#7 0x0000000000691d91 in StartupProcessMain () at startup.c:216
#8 0x0000000000509f57 in AuxiliaryProcessMain (argc=argc@entry=2,
argv=argv@entry=0x7ffe245352b0) at bootstrap.c:419
#9 0x000000000068eeea in StartChildProcess (type=StartupProcess)
at postmaster.c:5227
#10 0x00000000006917a5 in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x2c9c540) at postmaster.c:1290
#11 0x000000000046db36 in main (argc=3, argv=0x2c9c540) at main.c:228
(gdb) frame 4
#4 0x0000000000473be5 in brin_xlog_samepage_update (record=<optimized out>)
at brin_xlog.c:198
(gdb) list
list
193 elog(PANIC, "brin_xlog_samepage_update: invalid max offset number");
194
195 PageIndexDeleteNoCompact(page, &offnum, 1);
196 offnum = PageAddItem(page, (Item) brintuple, tuplen, offnum, true, false);
197 if (offnum == InvalidOffsetNumber)
198 elog(PANIC, "brin_xlog_samepage_update: failed to add tuple");
199
200 PageSetLSN(page, lsn);
201 MarkBufferDirty(buffer);
202 }


#8 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andreas Seltenreich (#7)
2 attachment(s)
Re: [sqlsmith] PANIC: failed to add BRIN tuple

Andreas Seltenreich wrote:

> I wrote:
>
>> Re-fuzzing now with your patch applied.
>
> This so far yielded three BRIN core dumps on different machines with the
> same backtraces. Crash recovery fails in these cases.
>
> I've put the data directory here (before recovery):
>
> http://ansel.ydns.eu/~andreas/brincrash2-spidew.tar.xz (9.1M)
>
> Corresponding backtraces of the backend and startup core files below.

Ah, of course. These two crashes are two different bugs: the xlog one
is because I forgot to pass the new PageAddItem() flag in the
corresponding replay code; and the one during normal operation is
because I neglected to make the patched PageAddItem() compute pd_lower
correctly, which in effect left pages looking as if they were empty.
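
To spell out the pd_lower part: PageGetMaxOffsetNumber() derives the
number of line pointers from pd_lower, so placing an item at a large
offset without also advancing pd_lower leaves the page looking (almost)
empty to every later reader. The attached patch therefore makes sure
the line pointer array covers the offset actually used:

/* excerpt from the attached patch */
if ((flags & PAI_ALLOW_LARGE_OFFSET) != 0)
	lower = Max(phdr->pd_lower,
				SizeOfPageHeaderData + sizeof(ItemIdData) * offsetNumber);
else if (offsetNumber == limit || needshuffle)
	lower = phdr->pd_lower + sizeof(ItemIdData);
else
	lower = phdr->pd_lower;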

I came up with a simple-minded test program (attached) which reproduces
the reported crashes in a couple of seconds; it can be improved further
for stronger BRIN testing, but at least with this I am confident that
the crashes at hand are gone. If you can re-run sqlsmith and see if you
can find different bugs, I'd appreciate it.

I'm not going to push this just yet, in order to give others time to
comment on the new PageAddItemFlags API I'm adding.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

brincrash-2.patch (text/x-diff; charset=us-ascii)
commit 6fa8e8ce3812dd6be2ae962ffe076a06482906c9
Author:     Alvaro Herrera <alvherre@alvh.no-ip.org>
AuthorDate: Fri May 27 18:14:01 2016 -0400
CommitDate: Fri May 27 18:14:01 2016 -0400

    Fix PageAddItem BRIN bug
    
    BRIN was relying on the ability of removing a tuple from an index page,
    then putting it back again later; but the PageAddItem interface doesn't
    actually allow this to happen for tuples beyond the first free one past
    the last used item.  To fix, create a new function PageAddItemFlags
    which is like PageAddItem except that it has a flags bitmap; the
    "overwrite" and "is_heap" boolean flags in the original become
    PAI_OVERWRITE and PAI_IS_HEAP flags, and a new flag
    PAI_ALLOW_LARGE_OFFSET enables the behavior required by BRIN.
    PageAddItem() remains with the original signature, for compatibility
    with third-party modules (and other callers in core code are not
    modified, either).
    
    I also added a new error check in brinGetTupleForHeapBlock to raise an
    error if an offset found in the revmap is not actually marked as live by
    the page header, which would have detected this sooner (and, in any
    case, react with a "corrupted BRIN index" error, rather than a hard
    crash, suggesting a REINDEX.)
    
    Reported by Andreas Seltenreich as detected by his handy sqlsmith
    fuzzer.

diff --git a/src/backend/access/brin/brin_pageops.c b/src/backend/access/brin/brin_pageops.c
index d0ca485..4b1afce 100644
--- a/src/backend/access/brin/brin_pageops.c
+++ b/src/backend/access/brin/brin_pageops.c
@@ -179,8 +179,8 @@ brin_doupdate(Relation idxrel, BlockNumber pagesPerRange,
 
 		START_CRIT_SECTION();
 		PageIndexDeleteNoCompact(oldpage, &oldoff, 1);
-		if (PageAddItem(oldpage, (Item) newtup, newsz, oldoff, true,
-						false) == InvalidOffsetNumber)
+		if (PageAddItemFlags(oldpage, (Item) newtup, newsz, oldoff,
+							 PAI_OVERWRITE | PAI_ALLOW_LARGE_OFFSET) == InvalidOffsetNumber)
 			elog(ERROR, "failed to add BRIN tuple");
 		MarkBufferDirty(oldbuf);
 
diff --git a/src/backend/access/brin/brin_revmap.c b/src/backend/access/brin/brin_revmap.c
index 812f76c..853181b 100644
--- a/src/backend/access/brin/brin_revmap.c
+++ b/src/backend/access/brin/brin_revmap.c
@@ -272,6 +272,10 @@ brinGetTupleForHeapBlock(BrinRevmap *revmap, BlockNumber heapBlk,
 		/* If we land on a revmap page, start over */
 		if (BRIN_IS_REGULAR_PAGE(page))
 		{
+			if (*off > PageGetMaxOffsetNumber(page))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg_internal("corrupted BRIN index: inconsistent range map")));
 			lp = PageGetItemId(page, *off);
 			if (ItemIdIsUsed(lp))
 			{
diff --git a/src/backend/access/brin/brin_xlog.c b/src/backend/access/brin/brin_xlog.c
index deb7af4..3103775 100644
--- a/src/backend/access/brin/brin_xlog.c
+++ b/src/backend/access/brin/brin_xlog.c
@@ -193,7 +193,8 @@ brin_xlog_samepage_update(XLogReaderState *record)
 			elog(PANIC, "brin_xlog_samepage_update: invalid max offset number");
 
 		PageIndexDeleteNoCompact(page, &offnum, 1);
-		offnum = PageAddItem(page, (Item) brintuple, tuplen, offnum, true, false);
+		offnum = PageAddItemFlags(page, (Item) brintuple, tuplen, offnum,
+								  PAI_OVERWRITE | PAI_ALLOW_LARGE_OFFSET);
 		if (offnum == InvalidOffsetNumber)
 			elog(PANIC, "brin_xlog_samepage_update: failed to add tuple");
 
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 0189bc9..d227d4b 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -153,12 +153,12 @@ PageIsVerified(Page page, BlockNumber blkno)
 
 
 /*
- *	PageAddItem
+ *	PageAddItemFlags
  *
  *	Add an item to a page.  Return value is offset at which it was
  *	inserted, or InvalidOffsetNumber if there's not room to insert.
  *
- *	If overwrite is true, we just store the item at the specified
+ *	If flag PAI_OVERWRITE is set, we just store the item at the specified
  *	offsetNumber (which must be either a currently-unused item pointer,
  *	or one past the last existing item).  Otherwise,
  *	if offsetNumber is valid and <= current max offset in the page,
@@ -167,18 +167,20 @@ PageIsVerified(Page page, BlockNumber blkno)
  *	If offsetNumber is not valid, then assign one by finding the first
  *	one that is both unused and deallocated.
  *
- *	If is_heap is true, we enforce that there can't be more than
+ *	If flag PAI_IS_HEAP is set, we enforce that there can't be more than
  *	MaxHeapTuplesPerPage line pointers on the page.
  *
+ *	If flag PAI_ALLOW_LARGE_OFFSET is not set, we disallow placing items
+ *	beyond one past the last existing item.
+ *
  *	!!! EREPORT(ERROR) IS DISALLOWED HERE !!!
  */
 OffsetNumber
-PageAddItem(Page page,
-			Item item,
-			Size size,
-			OffsetNumber offsetNumber,
-			bool overwrite,
-			bool is_heap)
+PageAddItemFlags(Page page,
+				 Item item,
+				 Size size,
+				 OffsetNumber offsetNumber,
+				 int flags)
 {
 	PageHeader	phdr = (PageHeader) page;
 	Size		alignedSize;
@@ -209,7 +211,7 @@ PageAddItem(Page page,
 	if (OffsetNumberIsValid(offsetNumber))
 	{
 		/* yes, check it */
-		if (overwrite)
+		if ((flags & PAI_OVERWRITE) != 0)
 		{
 			if (offsetNumber < limit)
 			{
@@ -257,13 +259,18 @@ PageAddItem(Page page,
 		}
 	}
 
-	if (offsetNumber > limit)
+	/*
+	 * Reject placing items beyond the first unused line pointer, unless
+	 * caller asked for that behavior specifically.
+	 */
+	if (((flags & PAI_ALLOW_LARGE_OFFSET) == 0) && offsetNumber > limit)
 	{
 		elog(WARNING, "specified item offset is too large");
 		return InvalidOffsetNumber;
 	}
 
-	if (is_heap && offsetNumber > MaxHeapTuplesPerPage)
+	/* Reject placing items beyond heap boundary, if heap */
+	if (((flags & PAI_IS_HEAP) != 0) && offsetNumber > MaxHeapTuplesPerPage)
 	{
 		elog(WARNING, "can't put more than MaxHeapTuplesPerPage items in a heap page");
 		return InvalidOffsetNumber;
@@ -275,7 +282,10 @@ PageAddItem(Page page,
 	 * Note: do arithmetic as signed ints, to avoid mistakes if, say,
 	 * alignedSize > pd_upper.
 	 */
-	if (offsetNumber == limit || needshuffle)
+	if ((flags & PAI_ALLOW_LARGE_OFFSET) != 0)
+		lower = Max(phdr->pd_lower,
+					SizeOfPageHeaderData + sizeof(ItemIdData) * offsetNumber);
+	else if (offsetNumber == limit || needshuffle)
 		lower = phdr->pd_lower + sizeof(ItemIdData);
 	else
 		lower = phdr->pd_lower;
@@ -324,6 +334,36 @@ PageAddItem(Page page,
 }
 
 /*
+ *	PageAddItem, compatibility layer on top of PageAddItemFlags.
+ *
+ *	Add an item to a page.  Return value is offset at which it was
+ *	inserted, or InvalidOffsetNumber if there's not room to insert.
+ *
+ *	If overwrite is true, we just store the item at the specified
+ *	offsetNumber (which must be either a currently-unused item pointer,
+ *	or one past the last existing item).  Otherwise,
+ *	if offsetNumber is valid and <= current max offset in the page,
+ *	insert item into the array at that position by shuffling ItemId's
+ *	down to make room.
+ *	If offsetNumber is not valid, then assign one by finding the first
+ *	one that is both unused and deallocated.
+ *
+ *	If is_heap is true, we enforce that there can't be more than
+ *	MaxHeapTuplesPerPage line pointers on the page.
+ */
+OffsetNumber
+PageAddItem(Page page, Item item, Size size, OffsetNumber offsetNumber,
+			bool overwrite, bool is_heap)
+{
+	int		flags = 0;
+
+	flags |= overwrite ? PAI_OVERWRITE : 0;
+	flags |= is_heap ? PAI_IS_HEAP : 0;
+
+	return PageAddItemFlags(page, item, size, offsetNumber, flags);
+}
+
+/*
  * PageGetTempPage
  *		Get a temporary page in local memory for special processing.
  *		The returned page is not initialized at all; caller must do that.
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 63053d4..5ca1db7 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -407,11 +407,16 @@ do { \
  *		extern declarations
  * ----------------------------------------------------------------
  */
+#define PAI_OVERWRITE			(1 << 0)
+#define PAI_IS_HEAP				(1 << 1)
+#define PAI_ALLOW_LARGE_OFFSET	(1 << 2)
 
 extern void PageInit(Page page, Size pageSize, Size specialSize);
 extern bool PageIsVerified(Page page, BlockNumber blkno);
 extern OffsetNumber PageAddItem(Page page, Item item, Size size,
 			OffsetNumber offsetNumber, bool overwrite, bool is_heap);
+extern OffsetNumber PageAddItemFlags(Page page, Item item, Size size,
+				 OffsetNumber offsetNumber, int flags);
 extern Page PageGetTempPage(Page page);
 extern Page PageGetTempPageCopy(Page page);
 extern Page PageGetTempPageCopySpecial(Page page);
brintest.pl (text/x-perl; charset=us-ascii)
#9 Andreas Seltenreich
seltenreich@gmx.de
In reply to: Alvaro Herrera (#8)
Re: [sqlsmith] PANIC: failed to add BRIN tuple

Alvaro Herrera writes:

> If you can re-run sqlsmith and see if you can find different bugs, I'd
> appreciate it.
>
> [...]
>
> [2. text/x-diff; brincrash-2.patch]

BRIN has been inconspicuous since applying this patch. All coredumps I
see now are either due to the parallel worker shutdown issue or acl.c's
text/name confusion, both reported earlier.

regards,
Andreas


#10 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andreas Seltenreich (#9)
Re: [sqlsmith] PANIC: failed to add BRIN tuple

Andreas Seltenreich wrote:

> Alvaro Herrera writes:
>
>> If you can re-run sqlsmith and see if you can find different bugs, I'd
>> appreciate it.
>
>> [...]
>
>> [2. text/x-diff; brincrash-2.patch]
>
> BRIN has been inconspicuous since applying this patch. All coredumps I
> see now are either due to the parallel worker shutdown issue or acl.c's
> text/name confusion, both reported earlier.

Great. Pushed patch, after some further cosmetic changes.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
