BUG #5622: Query failed: server closed the connection unexpectedly
The following bug has been logged online:
Bug reference: 5622
Logged by: Thue Janus Kristensen
Email address: thuejk@gmail.com
PostgreSQL version: 8.4.4
Operating system: Ubuntu 10.04 LTS i386
Description: Query failed: server closed the connection unexpectedly
Details:
I have a 100% reproducible server crash with postgresql. This crash does not
happen on my 8.3 installation.
When the crash occurs I get
pg_query(): Query failed: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
However, I have been unable to simplify the error condition :(. I tried
dumping 8000 lines of sql, but when I run them separately, I don't get a
crash.
The last lines before the crash are ($db->query is a prepared query,
$db->simple_query is a non-prepared query):
<?php
$db->query("SAVEPOINT lala");
$res = questions::move_question($t->e, $g1q1_id, $g2_id); //some queries
here
$db->query("ROLLBACK TO SAVEPOINT lala");
$db->query("RELEASE SAVEPOINT lala");
//some queries here
$res = aas::insert_in_group($t->e, $aa_id, $g2_id);
$res = questions::move_question($t->e, $g1q1_id, $g2_id);
//crash when line below is run!
$db->simple_query("SET CONSTRAINTS ALL IMMEDIATE");
?>
/var/log/syslog:
Aug 17 23:08:26 thue-laptop kernel: [46188.626357] postgres[11461]: segfault
at 21723858 ip 00772286 sp bf8a4a80 error 4 in postgres[621000+446000]
/var/log/postgresql/postgresql-8.4-main.log
2010-08-17 23:08:26 CEST LOG: server process (PID 11461) was terminated by
signal 11: Segmentation fault
2010-08-17 23:08:26 CEST LOG: terminating any other active server
processes
2010-08-17 23:08:26 CEST WARNING: terminating connection because of crash
of another server process
2010-08-17 23:08:26 CEST DETAIL: The postmaster has commanded this server
process to roll back the current transaction and exit,\
because another server process exited abnormally and possibly corrupted
shared memory.
2010-08-17 23:08:26 CEST HINT: In a moment you should be able to reconnect
to the database and repeat your command.
2010-08-17 23:08:26 CEST LOG: all server processes terminated;
reinitializing
2010-08-17 23:08:26 CEST LOG: database system was interrupted; last known
up at 2010-08-17 23:04:02 CEST
2010-08-17 23:08:26 CEST LOG: database system was not properly shut down;
automatic recovery in progress
2010-08-17 23:08:26 CEST LOG: redo starts at 0/37500054
I realize that this is probably not enough to understand the bug, but I will
be happy to try to debug it further, if I am given a pointer as to how.
On Tue, Aug 17, 2010 at 5:26 PM, Thue Janus Kristensen <thuejk@gmail.com> wrote:
The following bug has been logged online:
Bug reference: 5622
Logged by: Thue Janus Kristensen
Email address: thuejk@gmail.com
PostgreSQL version: 8.4.4
Operating system: Ubuntu 10.04 LTS i386
Description: Query failed: server closed the connection unexpectedly
Details:I have a 100% reproducible server crash with postgresql. This crash does not
happen on my 8.3 installation.When the crash occurs I get
pg_query(): Query failed: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.However, I have been unable to simplify the error condition :(. I tried
dumping 8000 lines of sql, but when I run them separately, I don't get a
crash.The last lines before the crash are ($db->query is a prepared query,
$db->simple_query is a non-prepared query):
<?php
$db->query("SAVEPOINT lala");
$res = questions::move_question($t->e, $g1q1_id, $g2_id); //some queries
here
$db->query("ROLLBACK TO SAVEPOINT lala");
$db->query("RELEASE SAVEPOINT lala");
//some queries here
$res = aas::insert_in_group($t->e, $aa_id, $g2_id);
$res = questions::move_question($t->e, $g1q1_id, $g2_id);
//crash when line below is run!
$db->simple_query("SET CONSTRAINTS ALL IMMEDIATE");
?>/var/log/syslog:
Aug 17 23:08:26 thue-laptop kernel: [46188.626357] postgres[11461]: segfault
at 21723858 ip 00772286 sp bf8a4a80 error 4 in postgres[621000+446000]/var/log/postgresql/postgresql-8.4-main.log
2010-08-17 23:08:26 CEST LOG: server process (PID 11461) was terminated by
signal 11: Segmentation fault
2010-08-17 23:08:26 CEST LOG: terminating any other active server
processes
2010-08-17 23:08:26 CEST WARNING: terminating connection because of crash
of another server process
2010-08-17 23:08:26 CEST DETAIL: The postmaster has commanded this server
process to roll back the current transaction and exit,\
because another server process exited abnormally and possibly corrupted
shared memory.
2010-08-17 23:08:26 CEST HINT: In a moment you should be able to reconnect
to the database and repeat your command.
2010-08-17 23:08:26 CEST LOG: all server processes terminated;
reinitializing
2010-08-17 23:08:26 CEST LOG: database system was interrupted; last known
up at 2010-08-17 23:04:02 CEST
2010-08-17 23:08:26 CEST LOG: database system was not properly shut down;
automatic recovery in progress
2010-08-17 23:08:26 CEST LOG: redo starts at 0/37500054I realize that this is probably not enough to understand the bug, but I will
be happy to try to debug it further, if I am given a pointer as to how.
Well, obviously the best thing would be to isolate a reproducible test
case. But maybe a good start would be to try to get a list of the
exact series of SQL statements that are being executed. Perhaps you
could set log_min_duration_statement=0 and then find 'em in the logs.
If you do this with some trivial query in the places where you have
"some queries here", it doesn't crash.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On 18/08/10 05:26, Thue Janus Kristensen wrote:
/var/log/syslog:
Aug 17 23:08:26 thue-laptop kernel: [46188.626357] postgres[11461]: segfault
at 21723858 ip 00772286 sp bf8a4a80 error 4 in postgres[621000+446000]
If you're game, it might also be useful to see what the backend was up
to when it crashed. There are some instructions on how to get a stack
trace on the wiki:
http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
--
Craig Ringer
Tech-related writing: http://soapyfrogs.blogspot.com/
On Tue, Aug 17, 2010 at 10:55 PM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
On 18/08/10 05:26, Thue Janus Kristensen wrote:
/var/log/syslog:
Aug 17 23:08:26 thue-laptop kernel: [46188.626357] postgres[11461]: segfault
at 21723858 ip 00772286 sp bf8a4a80 error 4 in postgres[621000+446000]If you're game, it might also be useful to see what the backend was up
to when it crashed. There are some instructions on how to get a stack
trace on the wiki:http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
Yeah, that would be very helpful.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
I finally succeeded in creating a test case, after much experimentation.
Running the attached sql crashes my postgresql server 100% of the time.
Regards, Thue
On Tue, Aug 17, 2010 at 11:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:
Show quoted text
On Tue, Aug 17, 2010 at 5:26 PM, Thue Janus Kristensen <thuejk@gmail.com>
wrote:The following bug has been logged online:
Bug reference: 5622
Logged by: Thue Janus Kristensen
Email address: thuejk@gmail.com
PostgreSQL version: 8.4.4
Operating system: Ubuntu 10.04 LTS i386
Description: Query failed: server closed the connectionunexpectedly
Details:
I have a 100% reproducible server crash with postgresql. This crash does
not
happen on my 8.3 installation.
When the crash occurs I get
pg_query(): Query failed: server closed the connectionunexpectedly
This probably means the server terminated abnormally
before or while processing the request.However, I have been unable to simplify the error condition :(. I tried
dumping 8000 lines of sql, but when I run them separately, I don't get a
crash.The last lines before the crash are ($db->query is a prepared query,
$db->simple_query is a non-prepared query):
<?php
$db->query("SAVEPOINT lala");
$res = questions::move_question($t->e, $g1q1_id, $g2_id); //somequeries
here
$db->query("ROLLBACK TO SAVEPOINT lala");
$db->query("RELEASE SAVEPOINT lala");
//some queries here
$res = aas::insert_in_group($t->e, $aa_id, $g2_id);
$res = questions::move_question($t->e, $g1q1_id, $g2_id);
//crash when line below is run!
$db->simple_query("SET CONSTRAINTS ALL IMMEDIATE");
?>/var/log/syslog:
Aug 17 23:08:26 thue-laptop kernel: [46188.626357] postgres[11461]:segfault
at 21723858 ip 00772286 sp bf8a4a80 error 4 in postgres[621000+446000]
/var/log/postgresql/postgresql-8.4-main.log
2010-08-17 23:08:26 CEST LOG: server process (PID 11461) was terminatedby
signal 11: Segmentation fault
2010-08-17 23:08:26 CEST LOG: terminating any other active server
processes
2010-08-17 23:08:26 CEST WARNING: terminating connection because ofcrash
of another server process
2010-08-17 23:08:26 CEST DETAIL: The postmaster has commanded thisserver
process to roll back the current transaction and exit,\
because another server process exited abnormally and possibly corrupted
shared memory.
2010-08-17 23:08:26 CEST HINT: In a moment you should be able toreconnect
to the database and repeat your command.
2010-08-17 23:08:26 CEST LOG: all server processes terminated;
reinitializing
2010-08-17 23:08:26 CEST LOG: database system was interrupted; lastknown
up at 2010-08-17 23:04:02 CEST
2010-08-17 23:08:26 CEST LOG: database system was not properly shutdown;
automatic recovery in progress
2010-08-17 23:08:26 CEST LOG: redo starts at 0/37500054I realize that this is probably not enough to understand the bug, but I
will
be happy to try to debug it further, if I am given a pointer as to how.
Well, obviously the best thing would be to isolate a reproducible test
case. But maybe a good start would be to try to get a list of the
exact series of SQL statements that are being executed. Perhaps you
could set log_min_duration_statement=0 and then find 'em in the logs.
If you do this with some trivial query in the places where you have
"some queries here", it doesn't crash.--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Attachments:
bug.sqltext/x-sql; charset=US-ASCII; name=bug.sqlDownload
Thue Janus Kristensen <thuejk@gmail.com> writes:
I finally succeeded in creating a test case, after much experimentation.
Running the attached sql crashes my postgresql server 100% of the time.
Cool, replicated here on CVS HEAD:
Program terminated with signal 10, Bus error.
#0 afterTriggerCheckState (evtshared=0x40163089) at trigger.c:2850
2850 if ((evtshared->ats_event & AFTER_TRIGGER_DEFERRABLE) == 0)
(gdb) bt
#0 afterTriggerCheckState (evtshared=0x40163089) at trigger.c:2850
#1 0x2f61f4 in afterTriggerMarkEvents (events=0x40163089, move_list=0x0,
immediate_only=1 '\001') at trigger.c:3219
#2 0x2f76dc in AfterTriggerSetState (stmt=0x40064af0) at trigger.c:4149
#3 0x418250 in standard_ProcessUtility (parsetree=0x40064af0,
queryString=0x400641a8 "SET CONSTRAINTS ALL IMMEDIATE;", params=0x0,
isTopLevel=1 '\001', dest=0x40064cd8, completionTag=0x7b03b308 "")
at utility.c:1226
#4 0x416a70 in ProcessUtility (parsetree=0x40064af0,
queryString=0x400641a8 "SET CONSTRAINTS ALL IMMEDIATE;", params=0x0,
isTopLevel=1 '\001', dest=0x40064cd8, completionTag=0x7b03b308 "")
at utility.c:323
#5 0x415740 in PortalRunUtility (portal=0x400e1fa0, utilityStmt=0x40064af0,
isTopLevel=1 '\001', dest=0x40064cd8, completionTag=0x7b03b308 "")
at pquery.c:1191
#6 0x4158f0 in PortalRunMulti (portal=0x400e1fa0, isTopLevel=1 '\001',
dest=0x40064cd8, altdest=0x40064cd8, completionTag=0x7b03b308 "")
at pquery.c:1298
#7 0x414f64 in PortalRun (portal=0x400e1fa0, count=2147483647,
isTopLevel=1 '\001', dest=0x40064cd8, altdest=0x40064cd8,
completionTag=0x7b03b308 "") at pquery.c:822
#8 0x40e7c0 in exec_simple_query (
query_string=0x400641a8 "SET CONSTRAINTS ALL IMMEDIATE;")
Looking at it ...
regards, tom lane
Thue Janus Kristensen <thuejk@gmail.com> writes:
I finally succeeded in creating a test case, after much experimentation.
Running the attached sql crashes my postgresql server 100% of the time.
Cute problem. The attached fix cures it for me; would you see if it
solves your original case too?
regards, tom lane
On Thu, Aug 19, 2010 at 5:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thue Janus Kristensen <thuejk@gmail.com> writes:
I finally succeeded in creating a test case, after much experimentation.
Running the attached sql crashes my postgresql server 100% of the time.Cute problem.
Generating the test case was very "interesting" black box debugging. Now why
does it need at least 7 (and not 1-6) updates to that unrelated row in that
unrelated table to crash...? At least it was deterministic.
The attached fix cures it for me; would you see if it
solves your original case too?
Yes, it fixes my original problem too, thanks :).
Regards, Thue