BUG #14668: BRIN open autosummarize=on , database will crash

#1德哥
digoal@126.com

The following bug has been logged on the website:

Bug reference: 14668
Logged by: Zhou Digoal
Email address: digoal@126.com
PostgreSQL version: 10beta1
Operating system: CentOS 6.x x64
Description:

Hi,
When I test a BRIN index with autosummarize=on, inserting data crashes the
database.

```
postgres=# create table test(id serial8, c1 int, c2 int);
CREATE TABLE
postgres=# create index idx_test_1 on test using brin(id) with
(pages_per_range=1,autosummarize=on);
CREATE INDEX

vi test.sql
\set c1 random(1,10000)
\set c2 random(1,1000000)
insert into test (c1,c2) values (:c1, :c2);

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 100
```

Then PostgreSQL crashes.

log

```
,,0,LOG,00000,"server process (PID 38060) was terminated by signal 11:
Segmentation fault","Failed process was running: insert into test (c1,c2)
values ($1, $2);",,,,,,,"LogChildExit, postmaster.c:3553",""
,,0,LOG,00000,"terminating any other active server
processes",,,,,,,,"HandleChildCrash, postmaster.c:3273",""
,71/2,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,37/2,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,70/2,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,69/2,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,1/0,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,35/2,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,,0,LOG,00000,"all server processes terminated;
reinitializing",,,,,,,,"PostmasterStateMachine, postmaster.c:3800",""
,,0,LOG,00000,"database system was interrupted; last known up at 2017-05-24
14:22:26 CST",,,,,,,,"StartupXLOG, xlog.c:6256",""
,,0,LOG,00000,"database system was not properly shut down; automatic
recovery in progress",,,,,,,,"StartupXLOG, xlog.c:6759",""
,,0,LOG,00000,"redo starts at 0/408E7140",,,,,,,,"StartupXLOG,
xlog.c:7014",""
,,0,LOG,00000,"invalid record length at 0/4090F028: wanted 24, got
0",,,,,,,,"ReadRecord, xlog.c:4184",""
,,0,LOG,00000,"redo done at 0/4090EEF0",,,,,,,,"StartupXLOG,
xlog.c:7286",""
,,0,LOG,00000,"last completed transaction was at log time 2017-05-24
14:24:49.118091+08",,,,,,,,"StartupXLOG, xlog.c:7291",""
,,0,LOG,00000,"checkpoint starting: end-of-recovery
immediate",,,,,,,,"LogCheckpointStart, xlog.c:8369",""
,,0,LOG,00000,"checkpoint complete: wrote 117 buffers (0.0%); 0 WAL file(s)
added, 0 removed, 0 recycled; write=0.082 s, sync=0.016 s, total=0.104 s;
sync files=38, longest=0.004 s, average=0.000 s; distance=159 kB,
estimate=159 kB",,,,,,,,"LogCheckpointEnd, xlog.c:8451",""
,,0,LOG,00000,"database system is ready to accept
connections",,,,,,,,"reaper, postmaster.c:2866",""
```

core file

```
(gdb) bt
#0 0x00000000008fc1c3 in yy_transition ()
Cannot access memory at address 0x7fff048a5788
```

best regards,
digoal

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2Thomas Munro
thomas.munro@gmail.com
In reply to: 德哥 (#1)
Re: BUG #14668: BRIN open autosummarize=on , database will crash

On Wed, May 24, 2017 at 6:33 PM, <digoal@126.com> wrote:

```
postgres=# create table test(id serial8, c1 int, c2 int);
CREATE TABLE
postgres=# create index idx_test_1 on test using brin(id) with
(pages_per_range=1,autosummarize=on);
CREATE INDEX

vi test.sql
\set c1 random(1,10000)
\set c2 random(1,1000000)
insert into test (c1,c2) values (:c1, :c2);

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 100
```

Then PostgreSQL crashes.

Reproduced here.

frame #3: 0x000000010ac2d6f0
postgres`ExceptionalCondition(conditionName="!(pointer !=
((void*)0))", errorType="FailedAssertion",
fileName="../../../../src/include/utils/memutils.h", lineNumber=116) +
128 at assert.c:54
frame #4: 0x000000010ac6f856
postgres`GetMemoryChunkContext(pointer=0x0000000000000000) + 54 at
memutils.h:116
frame #5: 0x000000010ac6f725
postgres`pfree(pointer=0x0000000000000000) + 21 at mcxt.c:952
frame #6: 0x000000010a5cabd5
postgres`brin_free_tuple(tuple=0x0000000000000000) + 21 at
brin_tuple.c:310
frame #7: 0x000000010a5c2b88
postgres`brininsert(idxRel=0x000000010b190638,
values=0x00007fff5563dbd0, nulls="", heaptid=0x00007fcb67801b8c,
heapRel=0x000000010b18b1d0, checkUnique=UNIQUE_CHECK_NO,
indexInfo=0x00007fcb67800aa0) + 680 at brin.c:193

I guess brin_free_tuple(lastPageTuple) should only be called if it's
not NULL, so I guess brin.c lacks an "else" here:

-                       brin_free_tuple(lastPageTuple);
+                       else
+                               brin_free_tuple(lastPageTuple);

It doesn't crash for me with that change.
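
To illustrate the idea behind that one-line change, here is a minimal, self-contained C sketch. It is a toy model only: `ToyTuple`, `toy_lookup_tuple`, `toy_free_tuple`, and `toy_guarded_check` are hypothetical names invented for this example and are not PostgreSQL functions. The sketch assumes a lookup that may return NULL and a free routine that, like the asserting `pfree` in the backtrace above, refuses a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a summary tuple; not PostgreSQL's BrinTuple. */
typedef struct ToyTuple
{
    int block_no;
} ToyTuple;

/* Counts successful frees so the behaviour can be observed. */
static int toy_free_count = 0;

/* Like the asserting pfree() in the backtrace, this rejects NULL outright. */
static void
toy_free_tuple(ToyTuple *tuple)
{
    assert(tuple != NULL);
    free(tuple);
    toy_free_count++;
}

/* Hypothetical lookup: returns NULL when the range has no summary tuple. */
static ToyTuple *
toy_lookup_tuple(int block_no, int summarized)
{
    ToyTuple *tuple;

    if (!summarized)
        return NULL;
    tuple = malloc(sizeof(ToyTuple));
    if (tuple == NULL)
        abort();            /* out of memory: give up in this toy model */
    tuple->block_no = block_no;
    return tuple;
}

/*
 * Guarded caller: frees only when the lookup returned a tuple.
 * Returns 1 if the range is unsummarized (work would be requested), else 0.
 */
static int
toy_guarded_check(int block_no, int summarized)
{
    ToyTuple *last = toy_lookup_tuple(block_no, summarized);

    if (last == NULL)
        return 1;           /* nothing to free */
    toy_free_tuple(last);   /* safe: pointer is known to be non-NULL */
    return 0;
}
```

Calling `toy_guarded_check(7, 0)` takes the NULL branch and never reaches `toy_free_tuple`, while `toy_guarded_check(7, 1)` frees exactly one tuple. This shows only the NULL guard; whether the tuple should be freed at all is a separate question.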

--
Thomas Munro
http://www.enterprisedb.com


#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Thomas Munro (#2)
Re: BUG #14668: BRIN open autosummarize=on , database will crash

Thomas Munro wrote:

I guess brin_free_tuple(lastPageTuple) should only be called if it's
not NULL, so I guess brin.c lacks an "else" here:

-                       brin_free_tuple(lastPageTuple);
+                       else
+                               brin_free_tuple(lastPageTuple);

It doesn't crash for me with that change.

Pushed fix. Actually that's not correct either, because tuples returned
by brinGetTupleForHeapBlock are not supposed to be freed at all since
they are shared buffer items. The correct thing to do there was to
release the buffer lock ...

Thanks for the report and analysis.
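
The ownership point above can be sketched with a small, self-contained C model. Everything here is hypothetical and simplified: `ToyBuffer`, `toy_get_tuple`, `toy_unlock`, and `toy_borrowing_check` are invented names, not PostgreSQL's buffer manager API. The model assumes that a successful lookup returns a pointer into a share-locked buffer, and that a failed lookup returns NULL with no lock held.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared buffer page. */
typedef struct ToyBuffer
{
    int share_locks;    /* number of share locks currently held */
    int has_tuple;      /* whether the page holds a summary tuple */
    int tuple_value;    /* the "tuple" lives inside the buffer itself */
} ToyBuffer;

/*
 * Returns a pointer INTO the buffer and leaves it share-locked on success.
 * Returns NULL, with no lock held, when there is no tuple (an assumption
 * of this toy model).
 */
static const int *
toy_get_tuple(ToyBuffer *buf)
{
    if (!buf->has_tuple)
        return NULL;
    buf->share_locks++;
    return &buf->tuple_value;
}

/* Drops one share lock; the buffer memory itself is untouched. */
static void
toy_unlock(ToyBuffer *buf)
{
    assert(buf->share_locks > 0);
    buf->share_locks--;
}

/* Returns 1 if summarization should be requested, 0 otherwise. */
static int
toy_borrowing_check(ToyBuffer *buf)
{
    const int *tuple = toy_get_tuple(buf);

    if (tuple == NULL)
        return 1;       /* no tuple and no lock to release */
    /* The pointer is borrowed: never free() it, just drop the lock. */
    toy_unlock(buf);
    return 0;
}
```

In this model the caller never owns the tuple memory, so the only cleanup it owes is releasing the lock it was handed. Both paths leave `share_locks` at zero.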

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#4Thomas Munro
thomas.munro@gmail.com
In reply to: Alvaro Herrera (#3)
Re: BUG #14668: BRIN open autosummarize=on , database will crash

On Wed, May 31, 2017 at 10:19 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Pushed fix. Actually that's not correct either, because tuples returned
by brinGetTupleForHeapBlock are not supposed to be freed at all since
they are shared buffer items. The correct thing to do there was to
release the buffer lock ...

Ugh, right, thanks. I'll blame the myopic analysis on jetlag and lack
of coffee.

--
Thomas Munro
http://www.enterprisedb.com
