lots of values for IN() clause

Started by Markus Wanner · over 19 years ago · 15 messages · general
#1Markus Wanner
markus@bluegap.ch

Hi,

One of our PostgreSQL 8.1.5 databases constantly crashed on a certain
query (backend SEGFAULTs). I've figured the crashes were caused by a
very long IN() clause.

You can easily reproduce the crash by feeding the output of the python
script below to your database.

Fortunately, 8.2 (as of 09/19/2006) no longer crashes. Anyway, I think
it would be helpful to have at least a better error message instead of a
SEGFAULT for 8.1. I didn't test earlier versions.

Regards

Markus

python script mentioned:

#!/usr/bin/python

values = 100000

print "CREATE TABLE test (a INT, b INT);"
print "SELECT a, b FROM test WHERE b IN (" + \
",".join([str(x) for x in range(values)]) + \
");"
print "DROP TABLE test;"

#2Shelby Cain
alyandon@yahoo.com
In reply to: Markus Wanner (#1)
Re: lots of values for IN() clause

I don't get a segfault on 8.0.8 under linux or 8.1.4 under win32. The backend (correctly I assume) issues a hint to increase max_stack_depth in both cases.

$ psql -h localhost -p 5432 -d test -U readwrite < script.sql

Password:

CREATE TABLE

ERROR: stack depth limit exceeded

HINT: Increase the configuration parameter "max_stack_depth".

DROP TABLE

Regards,

Shelby Cain


#3Markus Wanner
markus@bluegap.ch
In reply to: Shelby Cain (#2)
Re: lots of values for IN() clause

Hi,

thanks for testing; unfortunately I don't have an 8.0 around. And as 8.2
works and is probably coming very soon...

Regards

Markus


#4Shelby Cain
alyandon@yahoo.com
In reply to: Markus Wanner (#3)
Re: lots of values for IN() clause

No segfault in 8.1.5 on win32 either...


#5Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Markus Wanner (#1)
Re: lots of values for IN() clause

Markus Schiltknecht wrote:

Hi,

One of our PostgreSQL 8.1.5 databases constantly crashed on a certain
query (backend SEGFAULTs). I've figured the crashes were caused by a
very long IN() clause.

You can easily reproduce the crash by feeding the output of the python
script below to your database.

I'd argue that you have max_stack_depth set to an invalid value (higher
than your true stack limit). I tried your example here on 8.1.5 and got
this:

alvherre=# \i foo
CREATE TABLE
psql:foo:2: ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth".
DROP TABLE

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#6Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#5)
Re: lots of values for IN() clause

Alvaro Herrera wrote:


I'd argue that you have max_stack_depth set to an invalid value (higher
than your true stack limit). I tried your example here on 8.1.5 and got
this:

Forgot to quote the value:

alvherre=# show max_stack_depth ;
max_stack_depth
-----------------
2048
(1 fila)

This is a stock ix86 machine running Linux 2.6.

--
Alvaro Herrera http://www.flickr.com/photos/alvherre/
"They all went naked as their mothers bore them, the women too, though I saw
no more than one, and she quite young; and all whom I saw were young men,
none of them more than XXX years of age" (Cristóbal Colón)

#7Richard Huxton
dev@archonet.com
In reply to: Markus Wanner (#3)
Re: lots of values for IN() clause

Markus Schiltknecht wrote:

Hi,

thanks for testing; unfortunately I don't have an 8.0 around. And as 8.2
works and is probably coming very soon...

If you can reliably reproduce it (I can't here, on Debian x86), a bug
report on the bugs mailing list or the website would probably be
appreciated by the developers. Include the PG version, OS version, method
of install, etc.

--
Richard Huxton
Archonet Ltd

#8Richard Huxton
dev@archonet.com
In reply to: Markus Wanner (#1)
Re: lots of values for IN() clause

Markus Schiltknecht wrote:

Hi,

One of our PostgreSQL 8.1.5 databases constantly crashed on a certain
query (backend SEGFAULTs). I've figured the crashes were caused by a
very long IN() clause.

(Sorry for replying twice in a row)

Could it be out-of-memory rather than a segfault you're seeing? That is,
you've got max_stack_depth set high enough that you're exhausting system
RAM (or some other limit).

--
Richard Huxton
Archonet Ltd

#9Markus Wanner
markus@bluegap.ch
In reply to: Alvaro Herrera (#5)
Re: lots of values for IN() clause

Hello Alvaro,

yeah, thanks, that's it. postgresql.conf had:

max_stack_depth = 8192 # min 100, size in KB

I don't know who put it at 8192. According to the fine manual [1], it
should be set to something below 'ulimit -s', which gives 8192 on the
machine in question. I've now set it to 7000, and I too get the warning
instead of a SEGFAULT.
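
The relationship can be sanity-checked from a script; a minimal sketch using Python's resource module (the 7000 kB setting and the "megabyte or so" margin are taken from this discussion, not a hard rule):

```python
import resource

# What `ulimit -s` reports, but in bytes: the soft stack limit from
# getrlimit(RLIMIT_STACK). RLIM_INFINITY means "unlimited".
soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)

max_stack_depth_kb = 7000  # the value to be set in postgresql.conf (in kB)

if soft == resource.RLIM_INFINITY:
    print("stack limit is unlimited; pick max_stack_depth conservatively")
else:
    headroom_kb = soft // 1024 - max_stack_depth_kb
    # The docs suggest keeping a safety margin of a megabyte or so.
    print("OK" if headroom_kb >= 1024 else
          "too high: leave ~1 MB below `ulimit -s`")
```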

Thank you!

Markus

[1]: http://www.postgresql.org/docs/8.1/interactive/runtime-config-resource.html


#10Markus Wanner
markus@bluegap.ch
In reply to: Richard Huxton (#7)
Re: lots of values for IN() clause

Hi,

Richard Huxton wrote:

If you can reliably reproduce it (I can't here, on Debian x86), a bug
report on the bugs mailing list or the website would probably be
appreciated by the developers. Include the PG version, OS version, method
of install, etc.

I've thought about that, but I somehow just *knew* it was my fault and
not a bug in PostgreSQL ;-)

OTOH, having to configure such things is not exactly user friendly. I
guess it's difficult to determine the stack limit in a cross-platform
way. Or does that configuration option have other reasons for existence?

Regards

Markus

#11Martijn van Oosterhout
kleptog@svana.org
In reply to: Markus Wanner (#10)
Re: lots of values for IN() clause

On Thu, Nov 02, 2006 at 05:16:40PM +0100, Markus Schiltknecht wrote:

OTOH, having to configure such things is not exactly user friendly. I
guess it's difficult to determine the stack limit in a cross-platform
way. Or does having that configuration option other reasons for existence?

A patch went in recently that (on platforms where it's possible) tries
to determine the maximum stack depth and complains if you set it too
large. So this problem should go away in the future...

There might be situations where you want to be able to use a larger
stack, that's why it's configurable. PostgreSQL now actually checks
whether the number you provided makes sense.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


From each according to his ability. To each according to his ability to litigate.

#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Markus Wanner (#9)
Re: lots of values for IN() clause

Markus Schiltknecht wrote:

Hello Alvaro,

yeah, thanks, that's it. postgresql.conf had:

max_stack_depth = 8192 # min 100, size in KB

I don't know who put it at 8192. According to the fine manual at [1], it
should be set to something below 'ulimit -s', which gives 8192 on the
machine in question. I've now set it to 7000 and I also get a warning
instead of a SEGFAULT.

I notice it crashes with max_stack_depth set to 8173, but correctly
detects the error with max_stack_depth set to 8172. The doc suggests a
safety margin of "a megabyte or so", so I think we are conforming to our
docs here.

Tom recently added a check for getrlimit(RLIMIT_STACK), but I don't know
if that considered the "megabyte or so". *peeks the code* Yeah,
there's a 512 kb "daylight", but there's also an absolute maximum of
2MB.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#13Gurjeet Singh
singh.gurjeet@gmail.com
In reply to: Alvaro Herrera (#12)
Re: lots of values for IN() clause

But I do not understand why the IN list has to make so many recursive
calls?

I know if it were possible, it'd have been done already... but can 'making
it iterative' (whatever 'it' stands for) be reconsidered?

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com


#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gurjeet Singh (#13)
Re: lots of values for IN() clause

"Gurjeet Singh" <singh.gurjeet@gmail.com> writes:

But I do not understand why the IN list has to make so many recursive
calls???

Existing releases effectively expand "foo IN (1,2,3,...)" to
"(((foo = 1) OR foo = 2) OR foo = 3) ..." which can be a deeply
nested OR structure. IIRC this is changed in 8.2.
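
A toy illustration of that expansion (not the backend's actual code): a left-deep OR tree makes depth, and hence stack use during a recursive walk, grow linearly with the length of the IN list.

```python
# Mimic the pre-8.2 expansion: foo IN (1,2,3,...) becomes a left-deep
# tree of OR nodes, one level per extra value.
def build_or_tree(values):
    node = ("eq", values[0])
    for v in values[1:]:
        node = ("or", node, ("eq", v))
    return node

def depth(node):
    # A recursive walk, like the backend's recursive expression routines:
    # one stack frame per OR level, so stack use grows with list length.
    if node[0] == "eq":
        return 1
    return 1 + depth(node[1])

print(depth(build_or_tree(list(range(500)))))  # 500: linear in the IN-list size
```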

regards, tom lane

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#12)
Re: lots of values for IN() clause

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom recently added a check for getrlimit(RLIMIT_STACK), but I don't know
if that considered the "megabyte or so". *peeks the code* Yeah,
there's a 512 kb "daylight", but there's also an absolute maximum of
2MB.

No, there's no absolute maximum (unless I blew the logic). The
*default* if you do not specify max_stack_depth at all is capped at 2MB,
which is the same as the old default. I did that to avoid creating any
new failure if getrlimit lies to us for some reason. Possibly once we
have more confidence in that code, we can be more aggressive about
setting max_stack_depth to getrlimit(RLIMIT_STACK) minus 512K or so.

regards, tom lane