BUG #5235: Segmentation fault under high load through JDBC
The following bug has been logged online:
Bug reference: 5235
Logged by: Oleg Yurchenko
Email address: oleg@fts.ee
PostgreSQL version: 8.4.1
Operating system: FreeBSD 8.0-RELEASE #0 Generic kernel i386
Description: Segmentation fault under high load through JDBC
Details:
Postgres-8.4.1 back-end crashes with segmentation fault after 20-30 min of
high load through postgres-jdbc. Tried different versions of jdbc:
postgresql-8.4-701.jdbc3.jar, postgresql-8.3-605.jdbc3.jar,
postgresql-jdbc-8.3.603_1
Core dump and bt are following:
# gdb -c /usr/local/pgsql/data/postgres.core postgres
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...
Core was generated by `postgres'.
Program terminated with signal 11, Segmentation fault.
#0 0x0839380b in MemoryContextAllocZeroAligned (context=0x2d9eaf70,
size=16) at mcxt.c:559
559 mcxt.c: No such file or directory.
in mcxt.c
[New Thread 28b01140 (LWP 100115)]
#(gdb)bt
#14185 0x081c1024 in ExecTargetList (targetlist=0x2b61a158,
econtext=0x2b619330, values=0x2b61a0c8, isnull=0x2b61a0d8 "",
itemIsDone=0x2b61a170, isDone=0xbfbfe4f0) at execQual.c:5007
#14186 0x081c14cd in ExecProject (projInfo=0x2b61a0e8, isDone=0xbfbfe4f0) at
execQual.c:5222
#14187 0x081c1623 in ExecScan (node=0x2b6192a8, accessMtd=0x81d5120
<SubqueryNext>) at execScan.c:143
#14188 0x081d516a in ExecSubqueryScan (node=0x2b6192a8) at
nodeSubqueryscan.c:85
#14189 0x081b7f9e in ExecProcNode (node=0x2b6192a8) at execProcnode.c:381
#14190 0x081b598e in ExecutePlan (estate=0x2b619018, planstate=0x2b6192a8,
operation=CMD_SELECT, numberTuples=0,
---Type <return> to continue, or q <return> to quit---
direction=ForwardScanDirection, dest=0x28bc4420) at execMain.c:1504
#14191 0x081b3d53 in standard_ExecutorRun (queryDesc=0x2b789e40,
direction=ForwardScanDirection, count=0) at execMain.c:309
#14192 0x081b3c66 in ExecutorRun (queryDesc=0x2b789e40,
direction=ForwardScanDirection, count=0) at execMain.c:258
#14193 0x082958a7 in PortalRunSelect (portal=0x2ab68018, forward=1 '\001',
count=0, dest=0x28bc4420) at pquery.c:953
#14194 0x082955c0 in PortalRun (portal=0x2ab68018, count=2147483647,
isTopLevel=1 '\001', dest=0x28bc4420,
altdest=0x28bc4420, completionTag=0xbfbfe7d4 "") at pquery.c:779
#14195 0x0829155f in exec_execute_message (portal_name=0x28bc4018 "",
max_rows=2147483647) at postgres.c:1928
#14196 0x08293e23 in PostgresMain (argc=4, argv=0x28b3b890,
username=0x28b3b7c0 "tad") at postgres.c:3671
#14197 0x0825e2d0 in BackendRun (port=0x28b2f600) at postmaster.c:3447
#14198 0x0825d7b3 in BackendStartup (port=0x28b2f600) at postmaster.c:3061
#14199 0x0825ad4a in ServerLoop () at postmaster.c:1387
#14200 0x0825a466 in PostmasterMain (argc=3, argv=0xbfbfec2c) at
postmaster.c:1040
#14201 0x081ea4e5 in main (argc=3, argv=0xbfbfec2c) at main.c:188
Regards,
Oleg
"Oleg Yurchenko" <oleg@fts.ee> writes:
Program terminated with signal 11, Segmentation fault.
#0 0x0839380b in MemoryContextAllocZeroAligned (context=0x2d9eaf70,
size=16) at mcxt.c:559
559 mcxt.c: No such file or directory.
in mcxt.c
[New Thread 28b01140 (LWP 100115)]
#(gdb)bt
#14185 0x081c1024 in ExecTargetList (targetlist=0x2b61a158,
econtext=0x2b619330, values=0x2b61a0c8, isnull=0x2b61a0d8 "",
itemIsDone=0x2b61a170, isDone=0xbfbfe4f0) at execQual.c:5007
#14186 0x081c14cd in ExecProject (projInfo=0x2b61a0e8, isDone=0xbfbfe4f0) at
execQual.c:5222
So where are the 14184 intermediate call levels?
If that's actually accurate and not a symptom of gdb being confused,
it's reasonable to guess that something went into infinite recursion
and the segfault occurred when it ran out of stack space. But there's
no evidence here to suggest what that was. Have you got any
potentially-recursive C or Perl or Python functions? Because the system
ought to notice when it's getting into recursion trouble with any
higher-level code.
regards, tom lane
On 8/12/2009 8:09 AM, Tom Lane wrote:
"Oleg Yurchenko"<oleg@fts.ee> writes:
Program terminated with signal 11, Segmentation fault.
#0 0x0839380b in MemoryContextAllocZeroAligned (context=0x2d9eaf70,
size=16) at mcxt.c:559
559 mcxt.c: No such file or directory.
in mcxt.c
[New Thread 28b01140 (LWP 100115)]#(gdb)bt
#14185 0x081c1024 in ExecTargetList (targetlist=0x2b61a158,
econtext=0x2b619330, values=0x2b61a0c8, isnull=0x2b61a0d8 "",
itemIsDone=0x2b61a170, isDone=0xbfbfe4f0) at execQual.c:5007
#14186 0x081c14cd in ExecProject (projInfo=0x2b61a0e8, isDone=0xbfbfe4f0) at
execQual.c:5222So where are the 14184 intermediate call levels?
To generate a full backtrace, disable paging and enable logging with:
(gdb) set pagination off
(gdb) set logging file debuglog.txt
(gdb) set logging on
(gdb) bt
Gzip the resulting 'debuglog.txt' and put it somewhere accessible, then
post a link to this mailing list.
(If your PostgreSQL was built with full debugging support, rather than
just having debuginfo packages installed for a distro version, run 'gdb
full' instead of 'gdb' to get stack frame details).
Alternately, it might be helpful enough to just go back toward the
bottom of the stack and dump the first few hundred frames:
(gdb) set pagination off
(gdb) set logging file debuglog.txt
(gdb) set logging on
(gdb) bt -200
... then paste the result directly into this.
--
Craig Ringer
This the end of core dump. It is 8.3M bzip-ed. I can provide it on the
request.
I'm trying to compile Openbravo ERP application. There is no
C/Perl/Python functions.
My investigations show that pgsql catches segmentation fault on some|
||ported Oracle PLSQL function|:
Query in the source code is following: select
instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;
Best regards,
Oleg.
Tom Lane wrote:
Show quoted text
"Oleg Yurchenko" <oleg@fts.ee> writes:
Program terminated with signal 11, Segmentation fault.
#0 0x0839380b in MemoryContextAllocZeroAligned (context=0x2d9eaf70,
size=16) at mcxt.c:559
559 mcxt.c: No such file or directory.
in mcxt.c
[New Thread 28b01140 (LWP 100115)]#(gdb)bt
#14185 0x081c1024 in ExecTargetList (targetlist=0x2b61a158,
econtext=0x2b619330, values=0x2b61a0c8, isnull=0x2b61a0d8 "",
itemIsDone=0x2b61a170, isDone=0xbfbfe4f0) at execQual.c:5007
#14186 0x081c14cd in ExecProject (projInfo=0x2b61a0e8, isDone=0xbfbfe4f0) at
execQual.c:5222So where are the 14184 intermediate call levels?
If that's actually accurate and not a symptom of gdb being confused,
it's reasonable to guess that something went into infinite recursion
and the segfault occurred when it ran out of stack space. But there's
no evidence here to suggest what that was. Have you got any
potentially-recursive C or Perl or Python functions? Because the system
ought to notice when it's getting into recursion trouble with any
higher-level code.regards, tom lane
2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>:
This the end of core dump. It is 8.3M bzip-ed. I can provide it on the
request.
I think maybe the beginning would be more useful than the end.
...Robert
Both files are there.
Oleg.
Robert Haas wrote:
Show quoted text
2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>:
This the end of core dump. It is 8.3M bzip-ed. I can provide it on the
request.I think maybe the beginning would be more useful than the end.
...Robert
2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>:
Both files are there.
Both files are where? I don't see an attachment or a link.
...Robert
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Once more:<br>
<br>
<a class="moz-txt-link-abbreviated"
href="http://www.fts.ee/pgsqldebug.tgz">http://www.fts.ee/pgsqldebug.tgz</a>
- with loging enabled
<br>
<a class="moz-txt-link-abbreviated"
href="http://www.ftse.ee/pg_core.tar.bzip2">http://www.ftse.ee/pg_core.tar.bzip2</a>
- full core dump
<br>
<br>
Robert Haas wrote:
<blockquote
cite="mid:603c8f070912081124s4190ae1dm5dc31e06d5965c15@mail.gmail.com"
type="cite">
<pre wrap="">2009/12/8 Oleg Jurtšenko <a class="moz-txt-link-rfc2396E" href="mailto:oleg.jurtsenko@fts.ee"><oleg.jurtsenko@fts.ee></a>:
</pre>
<blockquote type="cite">
<pre wrap="">Both files are there.
</pre>
</blockquote>
<pre wrap=""><!---->
Both files are where? I don't see an attachment or a link.
...Robert
</pre>
</blockquote>
</body>
</html>
2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>:
Once more:
http://www.fts.ee/pgsqldebug.tgz - with loging enabled
http://www.ftse.ee/pg_core.tar.bzip2 - full core dump
It looks like you've got a pl/pgsql function that called itself
recursively 1417 times before running out of stack space. What do you
have max_stack_depth set to?
...Robert
You are right, it crushes on following statement: "select
instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;"
max_stack_depth is commented out, I think it has the default value: #max_stack_depth = 2MB
I'm attaching related functions.
Oleg
Robert Haas wrote:
Show quoted text
2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>:
Once more:
http://www.fts.ee/pgsqldebug.tgz - with loging enabled
http://www.ftse.ee/pg_core.tar.bzip2 - full core dumpIt looks like you've got a pl/pgsql function that called itself
recursively 1417 times before running out of stack space. What do you
have max_stack_depth set to?...Robert
Attachments:
OB2.50MP9_functions.txttext/plain; name=OB2.50MP9_functions.txtDownload+0-8
2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>:
You are right, it crushes on following statement: "select
instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;"max_stack_depth is commented out, I think it has the default value:
#max_stack_depth = 2MB
Well, my guess is you have your kernel limit for max stack depth set
to something very small. See:
http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html#GUC-MAX-STACK-DEPTH
You can do "SHOW max_stack_depth;" to confirm the setting for that
parameter. But I'm not quite sure how to check what value is being
applied to PG. Sounds like it's smaller than 2MB, though. You may be
able to reduce max_stack_depth to prevent the crash, but then you'll
get an error instead. If you want the function to actually run to
completion (and assuming this isn't an infinite loop) you'll need to
increase the kernel limit being applied to PG.
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>:
You are right, it crushes on following statement: "select
instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;"max_stack_depth is commented out, I think it has the default value:
#max_stack_depth = 2MB
Well, my guess is you have your kernel limit for max stack depth set
to something very small. See:
http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html#GUC-MAX-STACK-DEPTH
You can do "SHOW max_stack_depth;" to confirm the setting for that
parameter. But I'm not quite sure how to check what value is being
applied to PG. Sounds like it's smaller than 2MB, though. You may be
able to reduce max_stack_depth to prevent the crash, but then you'll
get an error instead.
The weird thing about this is that recent versions of PG try to adjust
max_stack_depth automatically. The only ways I can see for that to
fail is if
(1) the platform hasn't got getrlimit(RLIMIT_STACK), or
(2) the effective stack rlimit is so tiny Postgres doesn't believe it,
which looks to be anything under 100KB.
The claim in the docs that the default value is 2MB is a vast
oversimplification of reality, so I'd be interested to know what "show
max_stack_depth" actually reports. It'd also be useful to run
"ulimit -a" in the context in which the postmaster is normally started
(that's NOT your interactive shell session, usually --- try adding
that to the postmaster start script).
regards, tom lane
(resending with list cc'd)
Oleg Jurt�enko wrote:
Core dump file are available here:
www.fts.ee/pgsqldebug.tgz - with loging enabled
Well, it certainly looks recursive:
#0 0x0839380b in MemoryContextAllocZeroAligned (context=0x2d9eaf70, size=16) at mcxt.c:559
#1 0x081bfc16 in ExecInitExpr (node=0x2b7f0078, parent=0x2da151a8) at execQual.c:4392
#2 0x081c0d83 in ExecInitExpr (node=0x2b7f0050, parent=0x2da151a8) at execQual.c:4786
#3 0x081bf769 in ExecInitExpr (node=0x2b7f0028, parent=0x2da151a8) at execQual.c:4280
#4 0x081c0d83 in ExecInitExpr (node=0x2b7947e8, parent=0x2da151a8) at execQual.c:4786
#5 0x081cc7ba in ExecInitIndexScan (node=0x2b794488, estate=0x2da15018, eflags=0) at nodeIndexscan.c:536
#6 0x081b7b08 in ExecInitNode (node=0x2b794488, estate=0x2da15018, eflags=0) at execProcnode.c:179
#7 0x081b487d in InitPlan (queryDesc=0x2da13040, eflags=0) at execMain.c:835
#8 0x081b3be8 in standard_ExecutorStart (queryDesc=0x2da13040, eflags=0) at execMain.c:219
#9 0x081b3a67 in ExecutorStart (queryDesc=0x2da13040, eflags=0) at execMain.c:148
#10 0x081dc566 in _SPI_pquery (queryDesc=0x2da13040, fire_triggers=1 '\001', tcount=1) at spi.c:2010
#11 0x081dc1de in _SPI_execute_plan (plan=0x2b793c18, paramLI=0x2da13018, snapshot=0x0, crosscheck_snapshot=0x0, read_only=0 '\0', fire_triggers=1 '\001', tcount=1) at spi.c:1834
#12 0x081d9830 in SPI_execute_plan (plan=0x2b793c18, Values=0x2da11238, Nulls=0x2da11248 "nn", read_only=0 '\0', tcount=1) at spi.c:392
#13 0x28a60023 in exec_stmt_execsql (estate=0xbfa00ebc, stmt=0x2b7f9948) at pl_exec.c:2781
#14 0x28a5d4dd in exec_stmt (estate=0xbfa00ebc, stmt=0x2b7f9948) at pl_exec.c:1297
#15 0x28a5d25d in exec_stmts (estate=0xbfa00ebc, stmts=0x2b7f9850) at pl_exec.c:1200
#16 0x28a5ce2f in exec_stmt_block (estate=0xbfa00ebc, block=0x2b7f9ef0) at pl_exec.c:1012
#17 0x28a5b91c in plpgsql_exec_function (func=0x2b787d28, fcinfo=0xbfa01000) at pl_exec.c:315
#18 0x28a57541 in plpgsql_call_handler (fcinfo=0xbfa01000) at pl_handler.c:95
#19 0x081bae61 in ExecMakeFunctionResultNoSets (fcache=0x2b64d218, econtext=0x2c7bc390, isNull=0xbfa012df "", isDone=0x0) at execQual.c:1752
#20 0x28a6335c in exec_eval_simple_expr (estate=0xbfa0147c, expr=0x2b7f9998, result=0xbfa012a8, isNull=0xbfa012df "", rettype=0xbfa012e0) at pl_exec.c:4450
#21 0x28a629e5 in exec_eval_expr (estate=0xbfa0147c, expr=0x2b7f9998, isNull=0xbfa012df "", rettype=0xbfa012e0) at pl_exec.c:4061
#22 0x28a6154a in exec_assign_expr (estate=0xbfa0147c, target=0x2da0b118, expr=0x2b7f9998) at pl_exec.c:3428
#23 0x28a5d633 in exec_stmt_assign (estate=0xbfa0147c, stmt=0x2b7f9980) at pl_exec.c:1345
#24 0x28a5d357 in exec_stmt (estate=0xbfa0147c, stmt=0x2b7f9980) at pl_exec.c:1237
#25 0x28a5d25d in exec_stmts (estate=0xbfa0147c, stmts=0x2b7f9850) at pl_exec.c:1200
#26 0x28a5ce2f in exec_stmt_block (estate=0xbfa0147c, block=0x2b7f9ef0) at pl_exec.c:1012
#27 0x28a5b91c in plpgsql_exec_function (func=0x2b787d28, fcinfo=0xbfa015c0) at pl_exec.c:315
#28 0x28a57541 in plpgsql_call_handler (fcinfo=0xbfa015c0) at pl_handler.c:95
... blah blah ...
#14169 0x081bacf6 in ExecMakeFunctionResult (fcache=0x2b64d218, econtext=0x2b64d110, isNull=0xbfbfdb8f "", isDone=0x0) at execQual.c:1685
#14170 0x081bb631 in ExecEvalFunc (fcache=0x2b64d218, econtext=0x2b64d110, isNull=0xbfbfdb8f "", isDone=0x0) at execQual.c:2116
#14171 0x28a6335c in exec_eval_simple_expr (estate=0xbfbfdd2c, expr=0x2b7f9998, result=0xbfbfdb58, isNull=0xbfbfdb8f "", rettype=0xbfbfdb90) at pl_exec.c:4450
#14172 0x28a629e5 in exec_eval_expr (estate=0xbfbfdd2c, expr=0x2b7f9998, isNull=0xbfbfdb8f "", rettype=0xbfbfdb90) at pl_exec.c:4061
#14173 0x28a6154a in exec_assign_expr (estate=0xbfbfdd2c, target=0x2b6464d8, expr=0x2b7f9998) at pl_exec.c:3428
#14174 0x28a5d633 in exec_stmt_assign (estate=0xbfbfdd2c, stmt=0x2b7f9980) at pl_exec.c:1345
#14175 0x28a5d357 in exec_stmt (estate=0xbfbfdd2c, stmt=0x2b7f9980) at pl_exec.c:1237
#14176 0x28a5d25d in exec_stmts (estate=0xbfbfdd2c, stmts=0x2b7f9850) at pl_exec.c:1200
#14177 0x28a5ce2f in exec_stmt_block (estate=0xbfbfdd2c, block=0x2b7f9ef0) at pl_exec.c:1012
#14178 0x28a5b91c in plpgsql_exec_function (func=0x2b787d28, fcinfo=0xbfbfdea4) at pl_exec.c:315
#14179 0x28a57541 in plpgsql_call_handler (fcinfo=0xbfbfdea4) at pl_handler.c:95
#14180 0x081bacf6 in ExecMakeFunctionResult (fcache=0x2b6197c0, econtext=0x2b619330, isNull=0xbfbfe344 "<E8>8<D1>*(9<D1>*\027", isDone=0xbfbfe120) at execQual.c:1685
#14181 0x081bb631 in ExecEvalFunc (fcache=0x2b6197c0, econtext=0x2b619330, isNull=0xbfbfe344 "<E8>8<D1>*(9<D1>*\027", isDone=0xbfbfe120) at execQual.c:2116
#14182 0x081ba291 in ExecEvalFuncArgs (fcinfo=0xbfbfe1a4, argList=0x2b619c40, econtext=0x2b619330) at execQual.c:1216
#14183 0x081ba834 in ExecMakeFunctionResult (fcache=0x2b6193b8, econtext=0x2b619330, isNull=0x2b61a0d8 "", isDone=0x2b61a170) at execQual.c:1463
#14184 0x081bb631 in ExecEvalFunc (fcache=0x2b6193b8, econtext=0x2b619330, isNull=0x2b61a0d8 "", isDone=0x2b61a170) at execQual.c:2116
#14185 0x081c1024 in ExecTargetList (targetlist=0x2b61a158, econtext=0x2b619330, values=0x2b61a0c8, isnull=0x2b61a0d8 "", itemIsDone=0x2b61a170, isDone=0xbfbfe4f0) at execQual.c:5007
#14186 0x081c14cd in ExecProject (projInfo=0x2b61a0e8, isDone=0xbfbfe4f0) at execQual.c:5222
#14187 0x081c1623 in ExecScan (node=0x2b6192a8, accessMtd=0x81d5120 <SubqueryNext>) at execScan.c:143
#14188 0x081d516a in ExecSubqueryScan (node=0x2b6192a8) at nodeSubqueryscan.c:85
#14189 0x081b7f9e in ExecProcNode (node=0x2b6192a8) at execProcnode.c:381
#14190 0x081b598e in ExecutePlan (estate=0x2b619018, planstate=0x2b6192a8, operation=CMD_SELECT, numberTuples=0, direction=ForwardScanDirection, dest=0x28bc4420) at execMain.c:1504
#14191 0x081b3d53 in standard_ExecutorRun (queryDesc=0x2b789e40, direction=ForwardScanDirection, count=0) at execMain.c:309
#14192 0x081b3c66 in ExecutorRun (queryDesc=0x2b789e40, direction=ForwardScanDirection, count=0) at execMain.c:258
#14193 0x082958a7 in PortalRunSelect (portal=0x2ab68018, forward=1 '\001', count=0, dest=0x28bc4420) at pquery.c:953
#14194 0x082955c0 in PortalRun (portal=0x2ab68018, count=2147483647, isTopLevel=1 '\001', dest=0x28bc4420, altdest=0x28bc4420, completionTag=0xbfbfe7d4 "") at pquery.c:779
#14195 0x0829155f in exec_execute_message (portal_name=0x28bc4018 "", max_rows=2147483647) at postgres.c:1928
#14196 0x08293e23 in PostgresMain (argc=4, argv=0x28b3b890, username=0x28b3b7c0 "tad") at postgres.c:3671
#14197 0x0825e2d0 in BackendRun (port=0x28b2f600) at postmaster.c:3447
#14198 0x0825d7b3 in BackendStartup (port=0x28b2f600) at postmaster.c:3061
#14199 0x0825ad4a in ServerLoop () at postmaster.c:1387
#14200 0x0825a466 in PostmasterMain (argc=3, argv=0xbfbfec2c) at postmaster.c:1040
#14201 0x081ea4e5 in main (argc=3, argv=0xbfbfec2c) at main.c:188
Recursion within PL/PgSQL?
--
Craig Ringer
Import Notes
Reply to msg id not found: 4B1E9384.7030605@fts.ee
Recursion within PL/PgSQL?
er ... sorry for stating the belated and obvious. I was dropped from the
CC on the other branch of this thread so it wasn't hitting my INBOX and
didn't realise it'd carried on until I saw it in my Pg list folder.
--
Craig Ringer
I'm not sure about the theory about recursion and infinity loop. I have
tested different versions of Postgres and FreeBSD. Please take a look on
results below.
Well, output of "ulimit -a":
$ ulimit -a
cpu time (seconds, -t) unlimited
file size (512-blocks, -f) unlimited
data seg size (kbytes, -d) 524288
stack size (kbytes, -s) 65536
core file size (512-blocks, -c) unlimited
max memory size (kbytes, -m) unlimited
locked memory (kbytes, -l) unlimited
max user processes (-u) 4986
open files (-n) 9972
virtual mem size (kbytes, -v) unlimited
swap limit (kbytes, -w) unlimited
sbsize (bytes, -b) unlimited
pseudo-terminals (-p) unlimited
Output of "SHOW max_stack_depth;"
postgres=# SHOW max_stack_depth;
max_stack_depth
-----------------
2MB
(1 row)
I tried to execute "select instr(ad_parent_tree(?,?),'|'||'?'||'|') AS
isItsOwnChild from dual;" query with psql terminal and got segmentation
fault as well.
The most interesting thing is that this function makes segmentation
fault also on FreeBSD 7.2 with Postgresql-8.3.7.
Consequentially, both Postgresql-8.3.7 and Postresql-8.4.1 are affected.
Oleg.
Tom Lane wrote:
Show quoted text
Robert Haas <robertmhaas@gmail.com> writes:
2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>:
You are right, it crushes on following statement: "select
instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;"max_stack_depth is commented out, I think it has the default value:
#max_stack_depth = 2MBWell, my guess is you have your kernel limit for max stack depth set
to something very small. See:http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html#GUC-MAX-STACK-DEPTH
You can do "SHOW max_stack_depth;" to confirm the setting for that
parameter. But I'm not quite sure how to check what value is being
applied to PG. Sounds like it's smaller than 2MB, though. You may be
able to reduce max_stack_depth to prevent the crash, but then you'll
get an error instead.The weird thing about this is that recent versions of PG try to adjust
max_stack_depth automatically. The only ways I can see for that to
fail is if(1) the platform hasn't got getrlimit(RLIMIT_STACK), or
(2) the effective stack rlimit is so tiny Postgres doesn't believe it,
which looks to be anything under 100KB.The claim in the docs that the default value is 2MB is a vast
oversimplification of reality, so I'd be interested to know what "show
max_stack_depth" actually reports. It'd also be useful to run
"ulimit -a" in the context in which the postmaster is normally started
(that's NOT your interactive shell session, usually --- try adding
that to the postmaster start script).regards, tom lane
On Tue, Dec 8, 2009 at 11:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>:
You are right, it crushes on following statement: "select
instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;"max_stack_depth is commented out, I think it has the default value:
#max_stack_depth = 2MBWell, my guess is you have your kernel limit for max stack depth set
to something very small. See:http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html#GUC-MAX-STACK-DEPTH
You can do "SHOW max_stack_depth;" to confirm the setting for that
parameter. But I'm not quite sure how to check what value is being
applied to PG. Sounds like it's smaller than 2MB, though. You may be
able to reduce max_stack_depth to prevent the crash, but then you'll
get an error instead.The weird thing about this is that recent versions of PG try to adjust
max_stack_depth automatically. The only ways I can see for that to
fail is if(1) the platform hasn't got getrlimit(RLIMIT_STACK), or
(2) the effective stack rlimit is so tiny Postgres doesn't believe it,
which looks to be anything under 100KB.
How about (3) getrlimit(RLIMIT_STACK) lies through its teeth, by
ignoring the existence of another and lower limit imposed elsewhere?
A little Googling seems to reveal that FreeBSD has a parameter called
MAXSSIZ (and possibly a variant for 64-bit builds). I kind find a lot
of people talking about needing to raise it (for MySQL, among other
things), but I haven't been able to determine for certain what the
default is. Perhaps it is set to a really low value on the OP's
system?
...Robert
"Robert" == Robert Haas <robertmhaas@gmail.com> writes:
Robert> How about (3) getrlimit(RLIMIT_STACK) lies through its teeth,
Robert> by ignoring the existence of another and lower limit imposed
Robert> elsewhere?
Robert> A little Googling seems to reveal that FreeBSD has a
Robert> parameter called MAXSSIZ (and possibly a variant for 64-bit
Robert> builds). I kind find a lot of people talking about needing
Robert> to raise it (for MySQL, among other things), but I haven't
Robert> been able to determine for certain what the default is.
Robert> Perhaps it is set to a really low value on the OP's system?
The default is 64MB on i386, 512MB on amd64; that's where the
getrlimit value comes from unless it's been explicitly reduced
somewhere. The kernel MAXSSIZ sets the value of the hard limit for
RLIMIT_STACK for proc0, and everything else inherits that. All
setrlimit calls for RLIMIT_STACK are explicitly clamped to MAXSSIZ, so
there's no way to set that value higher than the kernel limit, and no
way for getrlimit to report a value higher than the real limit.
--
Andrew (irc:RhodiumToad)
"Oleg" == Oleg Jurtšenko <oleg@fts.ee> writes:
Oleg> I tried to execute "select instr(ad_parent_tree(?,?),'|'||'?'||'|') AS
Oleg> isItsOwnChild from dual;" query with psql terminal and got
Oleg> segmentation fault as well.
Oleg> The most interesting thing is that this function makes segmentation
Oleg> fault also on FreeBSD 7.2 with Postgresql-8.3.7.
What are the definitions of your instr() and ad_parent_tree() functions?
--
Andrew (irc:RhodiumToad)
Functions are attached
Oleg
Andrew Gierth wrote:
Show quoted text
"Oleg" == Oleg Jurtšenko <oleg@fts.ee> writes:
Oleg> I tried to execute "select instr(ad_parent_tree(?,?),'|'||'?'||'|') AS
Oleg> isItsOwnChild from dual;" query with psql terminal and got
Oleg> segmentation fault as well.Oleg> The most interesting thing is that this function makes segmentation
Oleg> fault also on FreeBSD 7.2 with Postgresql-8.3.7.What are the definitions of your instr() and ad_parent_tree() functions?
Attachments:
OB2.50MP9_functions.txttext/plain; name=OB2.50MP9_functions.txtDownload+0-8
Andrew Gierth wrote:
"Robert" == Robert Haas <robertmhaas@gmail.com> writes:
Robert> How about (3) getrlimit(RLIMIT_STACK) lies through its teeth,
Robert> by ignoring the existence of another and lower limit imposed
Robert> elsewhere?Robert> A little Googling seems to reveal that FreeBSD has a
Robert> parameter called MAXSSIZ (and possibly a variant for 64-bit
Robert> builds). I kind find a lot of people talking about needing
Robert> to raise it (for MySQL, among other things), but I haven't
Robert> been able to determine for certain what the default is.
Robert> Perhaps it is set to a really low value on the OP's system?The default is 64MB on i386, 512MB on amd64; that's where the
getrlimit value comes from unless it's been explicitly reduced
somewhere. The kernel MAXSSIZ sets the value of the hard limit for
RLIMIT_STACK for proc0, and everything else inherits that. All
setrlimit calls for RLIMIT_STACK are explicitly clamped to MAXSSIZ, so
there's no way to set that value higher than the kernel limit, and no
way for getrlimit to report a value higher than the real limit.
I vaguely recall issues in the past with linking of postgresql (or PLs
that require it) against libc_r causing some rather small stack limits
being imposed under some circumstances but I don't recall the details
any more...
Stefan