BUG #5004: pg_freespacemap make a SegFault
The following bug has been logged online:
Bug reference: 5004
Logged by: Sebastien Lardiere
Email address: slardiere@hi-media.com
PostgreSQL version: 8.3.7
Operating system: Debian Etch
Description: pg_freespacemap make a SegFault
Details:
I've got a crash with a cluster. Nothing was found in the logfile except a message
about a segfault, so I took a core dump; here is the backtrace:
Core was generated by `postgres: postgres postgres [local] SELECT'.
Program terminated with signal 11, Segmentation fault.
#0 pg_freespacemap_pages (fcinfo=0x7fff4a9bc250) at pg_freespacemap.c:162
162 fctx->record[i].reltablespace = fsmrel->key.spcNode;
(gdb) bt
#0 pg_freespacemap_pages (fcinfo=0x7fff4a9bc250) at pg_freespacemap.c:162
#1 0x0000000000526781 in ExecMakeTableFunctionResult (funcexpr=0x29c2408,
econtext=0x29c1b70, expectedDesc=0x29c1ed0, returnDesc=0x7fff4a9bc6d0) at
execQual.c:1566
#2 0x00000000005330d2 in FunctionNext (node=0x29bf620) at
nodeFunctionscan.c:68
#3 0x000000000052881c in ExecScan (node=0x7fc03f6c5370, accessMtd=0x533030
<FunctionNext>) at execScan.c:68
#4 0x0000000000521f6d in ExecProcNode (node=0x29bf620) at
execProcnode.c:356
#5 0x000000000052ca40 in ExecAgg (node=0x29c17f0) at nodeAgg.c:874
#6 0x0000000000521fed in ExecProcNode (node=0x29c17f0) at
execProcnode.c:394
#7 0x0000000000520ffd in ExecutorRun (queryDesc=<value optimized out>,
direction=ForwardScanDirection, count=0) at execMain.c:1335
#8 0x00000000005ba0d6 in PortalRunSelect (portal=0x29b47a0, forward=<value
optimized out>, count=0, dest=0x29af198) at pquery.c:943
#9 0x00000000005bb159 in PortalRun (portal=0x29b47a0,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x29af198,
altdest=0x29af198, completionTag=0x7fff4a9bcf40 "") at pquery.c:769
#10 0x00000000005b6d2d in exec_simple_query (query_string=0x2969070 "select
count(*) as pages from pg_freespacemap_pages ") at postgres.c:1004
#11 0x00000000005b8071 in PostgresMain (argc=4, argv=<value optimized out>,
username=0x28bf4b0 "postgres") at postgres.c:3631
#12 0x000000000058ca1b in ServerLoop () at postmaster.c:3207
#13 0x000000000058d73e in PostmasterMain (argc=5, argv=0x28ba310) at
postmaster.c:1029
#14 0x0000000000544c15 in main (argc=5, argv=<value optimized out>) at
main.c:188
The backtrace shows contrib/pg_freespacemap in use. A munin plugin has sent the
query "select count(*) as pages from pg_freespacemap_pages " every 5 minutes
(for a year now) to produce a graph.
I notice that the graph has shown our free space map as nearly empty (a few
thousand pages) since our first crash. Sometimes the number of pages
increases, and then we get a crash.
If you want more details, just ask.
Thanks,
PS: Sorry for my poor English
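The monitoring setup described in the report can be sketched roughly as follows. This is only an illustrative outline, not the reporter's actual plugin: the function names, the graph title and the field name "pages" are assumptions, and the database call is passed in as a hypothetical run_query callable so that no particular driver is implied. Only the SQL text is taken from the report.

```python
# Query text as quoted in the bug report.
FSM_QUERY = "select count(*) as pages from pg_freespacemap_pages"


def munin_config():
    """Lines a munin plugin prints when it is called with 'config'.

    The title and labels are made up for this sketch.
    """
    return "\n".join([
        "graph_title PostgreSQL FSM pages",
        "graph_vlabel pages",
        "pages.label pages tracked in FSM",
    ])


def munin_values(run_query):
    """Run the count query via the caller-supplied function and format it.

    run_query is a hypothetical callable: it takes the SQL string and
    returns the single count value.
    """
    pages = int(run_query(FSM_QUERY))
    return "pages.value %d" % pages


def main(argv, run_query):
    """Dispatch between the 'config' call and the normal value fetch."""
    if len(argv) > 1 and argv[1] == "config":
        return munin_config()
    return munin_values(run_query)
```

With a stand-in such as main(["fsm_pages"], lambda sql: 4200), the sketch returns the line munin would graph. In a real plugin, run_query would open a connection and execute the statement on every polling interval.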
"Sebastien Lardiere" <slardiere@hi-media.com> writes:
Description: pg_freespacemap make a SegFault
There's a post-8.3.7 fix that might cure this:
http://archives.postgresql.org/pgsql-committers/2009-04/msg00108.php
regards, tom lane
On Fri, Aug 21, 2009 at 04:26:11PM +0000, Sebastien Lardiere wrote:
Details: I've got a crash with a cluster. Nothing was found in the logfile except a message
about a segfault, so I took a core dump; here is the backtrace:
Can you check if you had any vacuums running at the time of the crash?
It might be in logs, something like:
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
STATEMENT: vacuum
If yes, how many vacuum jobs were there?
depesz
--
Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
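The log check suggested above could be automated along these lines. This is a minimal sketch, assuming the server log is available as an iterable of text lines. The function name and the list of message-level tags are choices made for this example, not anything PostgreSQL provides. It reports STATEMENT lines mentioning vacuum only when they belong to a "crash of another server process" message.

```python
import re

# Text of the WARNING quoted in the message above.
CRASH_MARKER = "terminating connection because of crash of another server process"

# Tags that start a new log message. DETAIL, HINT and STATEMENT lines
# are continuations of the message before them.
NEW_MESSAGE_TAGS = ("LOG:", "WARNING:", "ERROR:", "FATAL:", "PANIC:", "NOTICE:")


def vacuum_statements_after_crash(log_lines):
    """Return STATEMENT lines mentioning vacuum within a crash message."""
    hits = []
    in_crash_block = False
    for line in log_lines:
        if CRASH_MARKER in line:
            in_crash_block = True
            continue
        if any(tag in line for tag in NEW_MESSAGE_TAGS):
            # A different message has started, so the crash block is over.
            in_crash_block = False
            continue
        if in_crash_block and "STATEMENT:" in line:
            # \b keeps "autovacuum" from matching; only plain VACUUM counts.
            if re.search(r"\bvacuum\b", line, re.IGNORECASE):
                hits.append(line.strip())
    return hits
```

The length of the returned list gives a rough count of vacuum jobs interrupted by a crash. An empty list suggests no manual vacuum was running in a backend that logged the crash warning.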
On 22/08/2009 19:52, hubert depesz lubaczewski wrote:
On Fri, Aug 21, 2009 at 04:26:11PM +0000, Sebastien Lardiere wrote:
Details: I've got a crash with a cluster. Nothing was found in the logfile except a message
about a segfault, so I took a core dump; here is the backtrace:
Can you check if you had any vacuums running at the time of the crash?
Yes, autovacuum is on. There was no "normal" (manual) vacuum running during
the crash, except for the last one.
Nevertheless, the day before the first crash, I ran a big delete of 23
million rows, and pg_freespacemap showed a big increase in the number
of pages in the FSM. Since then, whenever the number of pages in the FSM
increases, Pg crashes; but:
It might be in logs, something like:
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
STATEMENT: vacuum
If yes, how many vacuum jobs were there?
I have never seen these messages with vacuum in the logs; Pg always crashes on
the query:
"select count(*) as pages from pg_freespacemap_pages"
The Munin graph (attached) shows the behavior:
a big increase, then the first crash, and then, each time there is a
significant increase, a crash followed by a reset of the FSM.
I have disabled the plugin, so there are no more queries against
pg_freespacemap, and no more crashes.
--
Sébastien Lardière
Attachments:
bdd1-pg_fsm-week.png (image/png)
On 21/08/2009 18:51, Tom Lane wrote:
"Sebastien Lardiere"<slardiere@hi-media.com> writes:
Description: pg_freespacemap make a SegFault
There's a post-8.3.7 fix that might cure this:
http://archives.postgresql.org/pgsql-committers/2009-04/msg00108.php
regards, tom lane
OK, I'll try to apply this patch.
Thanks,
--
Sébastien Lardière