PostgreSQL 8.0.6 crash
PostgreSQL promptly uses all available memory for the query and
subsequently crashes.
I'm sure it can be corrected with a setting, but should it crash?
freedb=# create table ucode as select distinct ucode from cdtitles group
by ucode having count(ucode)>1 ;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
(Table layout)
freedb=# \d cdtitles
Table "public.cdtitles"
Column | Type | Modifiers
-----------+-------------------+-----------------------------------
cdid | integer | default nextval('seq_cdid'::text)
cdcode | character varying |
ucode | character varying |
title | character varying |
artist | character varying |
genre | character varying |
dgenre | character varying |
year | integer |
processed | character varying |
submit | character varying |
revision | integer |
disclen | integer |
tracks | integer |
extkey | integer |
offsets | integer[] |
Indexes:
"cdtitles_ucode" btree (ucode) CLUSTER
(Size of table)
freedb=# select count(*) from cdtitles ;
count
---------
1927912
(1 row)
(Sample row)
freedb=# select * from cdtitles where cdid = 100001 limit 1;
cdid | cdcode | ucode | title |
artist | genre | dgenre | year | processed
| submit | revision | disclen | tracks | extkey |
offsets
--------+----------+-----------------------+--------------------------+------------+-------+--------+------+-------------------------------------------------+-----------------+----------+---------+--------+--------+-----------------------------------------------------------------------------------------------------------------------------------------------
100001 | 320e9e14 | 0ea0-14-3399-0006ad6a | Var Slutar Vägarna
(CD1) | Streaplers | rock | | 0 | cddbd v1.5PL3 Copyright (c)
Steve Scherf et al. | dBpowerAMP V4.0 | 0 | | 19 |
|
{21,150,13920,31180,43664,59907,73163,86629,98447,113684,130205,141682,154920,166783,179028,192689,205161,222164,234649,249692,264340,3744,0}
(1 row)
"Mark Woodward" <pgsql@mohawksoft.com> writes:
PostgreSQL promptly uses all available memory for the query and
subsequently crashes.
I'll bet a nickel this is on a Linux machine with OOM kill enabled.
What does the postmaster log show --- or look in the kernel log to
see if it mentions anything about an out-of-memory kill.
freedb=# create table ucode as select distinct ucode from cdtitles group
by ucode having count(ucode)>1 ;
server closed the connection unexpectedly
What does EXPLAIN show as the plan for that? If it's a hash aggregate,
try with "enable_hashagg" turned off. How many distinct ucode values
are there in the table?
regards, tom lane
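(For reference, the experiment Tom suggests might look like this in psql; enable_hashagg is a session-local planner setting, and the table and column names are taken from the report above:)

```sql
-- Look at the plan first; a HashAggregate node means the whole hash
-- table must fit in memory, since hashagg in this era cannot spill.
EXPLAIN SELECT DISTINCT ucode FROM cdtitles
GROUP BY ucode HAVING count(ucode) > 1;

-- Disable hash aggregation for this session only and compare; the
-- planner should fall back to a sort-based GroupAggregate plan.
SET enable_hashagg = off;
EXPLAIN SELECT DISTINCT ucode FROM cdtitles
GROUP BY ucode HAVING count(ucode) > 1;
```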
More info: the machine has 512M RAM and 512M swap.
Work mem is set to: work_mem = 1024
This shouldn't have crashed, should it?
"Mark Woodward" <pgsql@mohawksoft.com> writes:
-> HashAggregate (cost=106527.68..106528.68 rows=200 width=32)
Filter: (count(ucode) > 1)
-> Seq Scan on cdtitles (cost=0.00..96888.12 rows=1927912
width=32)
Well, shouldn't hash aggregate respect work memory limits?
If the planner thought there were 1.7M distinct values, it wouldn't have
used hash aggregate ... but it only thinks there are 200, which IIRC is
the default assumption. Have you ANALYZEd this table lately?
Meanwhile, I'd strongly recommend turning off OOM kill. That's got to
be the single worst design decision in the entire Linux kernel.
regards, tom lane
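(A quick way to check what Tom is asking about; pg_stats.n_distinct is the planner's estimate of the number of distinct values, with a negative value meaning a fraction of the table's rows:)

```sql
-- Refresh the statistics the planner uses for row estimates.
ANALYZE cdtitles;

-- After analyzing, this should be far from the default guess of 200;
-- a negative value means "this fraction of all rows is distinct".
SELECT n_distinct FROM pg_stats
WHERE tablename = 'cdtitles' AND attname = 'ucode';
```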
"Mark Woodward" <pgsql@mohawksoft.com> writes:
PostgreSQL promptly uses all available memory for the query and
subsequently crashes.
I'll bet a nickel this is on a Linux machine with OOM kill enabled.
What does the postmaster log show --- or look in the kernel log to
see if it mentions anything about an out-of-memory kill.
That's a no-brainer. Maybe I should have rephrased the condition; "crash"
may be the wrong word, as it was definitely killed for being out of memory. Sorry.
freedb=# create table ucode as select distinct ucode from cdtitles group
by ucode having count(ucode)>1 ;
server closed the connection unexpectedly
What does EXPLAIN show as the plan for that? If it's a hash aggregate,
try with "enable_hashagg" turned off. How many distinct ucode values
are there in the table?
There are over 1.7M distinct rows, about 200K non-distinct that I want to
somehow remove.
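(As an aside, one way to actually delete the duplicate rows, sketched under the assumption that cdid is unique; it keeps the lowest cdid per ucode, so test it on a copy first:)

```sql
-- For each ucode, delete every row except the one with the smallest
-- cdid; the btree index on ucode helps the correlated subquery.
DELETE FROM cdtitles
WHERE EXISTS (
    SELECT 1 FROM cdtitles d
    WHERE d.ucode = cdtitles.ucode
      AND d.cdid < cdtitles.cdid
);
```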
It does have hash aggregate:
freedb=# explain select distinct ucode from cdtitles group by ucode having
count(ucode)>1 ;
QUERY PLAN
-------------------------------------------------------------------------------------
Unique (cost=106536.32..106537.32 rows=200 width=32)
-> Sort (cost=106536.32..106536.82 rows=200 width=32)
Sort Key: ucode
-> HashAggregate (cost=106527.68..106528.68 rows=200 width=32)
Filter: (count(ucode) > 1)
-> Seq Scan on cdtitles (cost=0.00..96888.12 rows=1927912
width=32)
(6 rows)
Well, shouldn't hash aggregate respect work memory limits?
"Mark Woodward" <pgsql@mohawksoft.com> writes:
Still, I would say that it is extremely bad behavior for not having
stats, wouldn't you think?
Think of it as a kernel bug.
Meanwhile, I'd strongly recommend turning off OOM kill. That's got to
be the single worst design decision in the entire Linux kernel.
How is this any different than FreeBSD having a default 512M process
size limit? On FreeBSD, the process would have been killed earlier.
No, the process would have been politely told it was out of memory, and
would have told you the same. If the kernel's way of notifying a
process that it's out of memory is SIGKILL, there is not a damn thing
that we can do to operate robustly.
regards, tom lane
"Mark Woodward" <pgsql@mohawksoft.com> writes:
-> HashAggregate (cost=106527.68..106528.68 rows=200
width=32)
Filter: (count(ucode) > 1)
-> Seq Scan on cdtitles (cost=0.00..96888.12
rows=1927912
width=32)Well, shouldn't hash aggregate respect work memory limits?
If the planner thought there were 1.7M distinct values, it wouldn't have
used hash aggregate ... but it only thinks there are 200, which IIRC is
the default assumption. Have you ANALYZEd this table lately?
I thought that I had, but I did CLUSTER at some point. Or maybe I didn't;
I'm not sure. I have been working on a file reader/parser/importer
program. I created and dropped the DB so many times it is hard to keep
track. Still, I would say that it is extremely bad behavior for not having
stats, wouldn't you think?
Meanwhile, I'd strongly recommend turning off OOM kill. That's got to
be the single worst design decision in the entire Linux kernel.
How is this any different than FreeBSD having a default 512M process
size limit? On FreeBSD, the process would have been killed earlier.
Tom Lane wrote:
"Mark Woodward" <pgsql@mohawksoft.com> writes:
Still, I would say that it is extremely bad behavior for not having
stats, wouldn't you think?
Think of it as a kernel bug.
Meanwhile, I'd strongly recommend turning off OOM kill. That's got to
be the single worst design decision in the entire Linux kernel.
How is this any different than FreeBSD having a default 512M process
size limit? On FreeBSD, the process would have been killed earlier.
No, the process would have been politely told it was out of memory, and
would have told you the same. If the kernel's way of notifying a
process that it's out of memory is SIGKILL, there is not a damn thing
that we can do to operate robustly.
And we have docco on it:
http://www.postgresql.org/docs/current/static/kernel-resources.html#AEN18105
which I assume is still current. Back in October I mentioned the OOM
killer to Andrew Morton - his reaction was a very pained look and a curt
admonition: "turn it off".
cheers
andrew
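(The page Andrew cites recommends disabling memory overcommit on Linux 2.6 kernels, so that over-limit allocations fail with ENOMEM instead of inviting the OOM killer; this is a root-level system setting, not a PostgreSQL one:)

```shell
# Take effect immediately (Linux 2.6 or later):
sysctl -w vm.overcommit_memory=2

# Or persist across reboots by adding this line to /etc/sysctl.conf:
#   vm.overcommit_memory = 2
```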
"Mark Woodward" <pgsql@mohawksoft.com> writes:
Still, I would say that it is extremely bad behavior for not having
stats, wouldn't you think?
Think of it as a kernel bug.
While I respect your viewpoint that the Linux kernel should not kill an
offending process if the system runs out of memory, I sort of disagree, in
that the OOM killer is a disaster preventer. It should be viewed as a
last-ditch "him or me" choice the kernel needs to make, and it should not
get into that position in the first place.
Regardless, it is troubling that failing to have current stats can cause
the system, with a large data set, to exceed working memory limits.
I think it is still a bug. While it may manifest itself as a pg crash on
Linux because of a feature with which you have issue, the fact remains
that PG is exceeding its working memory limit.
Should failing to run ANALYZE cause this behavior?
If so, how does this get clearly documented?
If not, can it be prevented?
Meanwhile, I'd strongly recommend turning off OOM kill. That's got to
be the single worst design decision in the entire Linux kernel.
How is this any different than FreeBSD having a default 512M process
size limit? On FreeBSD, the process would have been killed earlier.
No, the process would have been politely told it was out of memory, and
would have told you the same. If the kernel's way of notifying a
process that it's out of memory is SIGKILL, there is not a damn thing
that we can do to operate robustly.
Let's not waste time on a Linux discussion. Linux and FreeBSD each have
their strengths, and debating the merits of either is long and
contentious. Both systems are fine, just with some subtle differences in
design goals.
"Mark Woodward" <pgsql@mohawksoft.com> writes:
I think it is still a bug. While it may manifest itself as a pg crash on
Linux because of a feature with which you have issue, the fact remains
that PG is exceeding its working memory limit.
The problem is that *we have no way to know what that limit is* ---
short of exceeding it and being summarily killed. (BTW, the kernel
doesn't know what the limit is either.) There is simply not any way
to operate robustly under the OOM-kill regime.
While I'll certainly acknowledge that it'd be nice if hashagg had
spill-to-disk capability, that wouldn't alter the fundamental fact that
if you want reliable behavior you MUST turn off OOM kill. There is not
anything we can do at the database level to work around that kernel-level
misdesign.
regards, tom lane
"Mark Woodward" <pgsql@mohawksoft.com> writes:
Again, regardless of OS used, hashagg will exceed "working memory" as
defined in postgresql.conf.
So? If you've got OOM kill enabled, it can zap a process whether it's
strictly adhered to work_mem or not. The OOM killer is entirely capable
of choosing a victim process whose memory footprint hasn't changed
materially since it started (eg, the postmaster).
regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
"Mark Woodward" <pgsql@mohawksoft.com> writes:
Again, regardless of OS used, hashagg will exceed "working memory" as
defined in postgresql.conf.
So? If you've got OOM kill enabled, it can zap a process whether it's
strictly adhered to work_mem or not. The OOM killer is entirely capable
of choosing a victim process whose memory footprint hasn't changed
materially since it started (eg, the postmaster).
Unless I've missed something here, disabling the OOM killer doesn't
really solve the problem here. What solves the problem is running
ANALYZE but it's still certainly the case that it would make some sense
for the Postmaster, upon realizing that it's gone well beyond its
work_mem boundary, to ideally switch plans to one which isn't going to
exceed its work_mem limit. Less ideally, it could give up and issue an
error to the user instead of running the box out of memory.
I appreciate that this is probably not very easy to implement, but I
do believe the current situation could be improved in this regard.
Thanks,
Stephen
"Mark Woodward" <pgsql@mohawksoft.com> writes:
I think it is still a bug. While it may manifest itself as a pg crash on
Linux because of a feature with which you have issue, the fact remains
that PG is exceeding its working memory limit.
The problem is that *we have no way to know what that limit is* ---
short of exceeding it and being summarily killed. (BTW, the kernel
doesn't know what the limit is either.) There is simply not any way
to operate robustly under the OOM-kill regime.
No, you misunderstand what I said, the "working memory" as defined in
postgresql.conf. I don't care about the OS debate.
While I'll certainly acknowledge that it'd be nice if hashagg had
spill-to-disk capability, that wouldn't alter the fundamental fact that
if you want reliable behavior you MUST turn off OOM kill. There is not
anything we can do at the database level to work around that kernel-level
misdesign.
Again, regardless of OS used, hashagg will exceed "working memory" as
defined in postgresql.conf.
At issue is would a lack of ANALYZE justify this behavior? If so, it
should be documented.
"Mark Woodward" <pgsql@mohawksoft.com> writes:
Again, regardless of OS used, hashagg will exceed "working memory" as
defined in postgresql.conf.
So? If you've got OOM kill enabled, it can zap a process whether it's
strictly adhered to work_mem or not. The OOM killer is entirely capable
of choosing a victim process whose memory footprint hasn't changed
materially since it started (eg, the postmaster).
Sorry, I must strongly disagree here. The postgresql.conf "working mem" is
a VERY IMPORTANT setting; it is intended to limit the consumption of
memory by the PostgreSQL process. Often times PostgreSQL will work
alongside other application servers on the same system; in fact, it may be a
sub-part of application servers on the same system. (This is, in fact, how
it is used on one of my site servers.)
Clearly, if the server will use 1000 times this number (set for 1024K, but
exceeding 1G), this is broken, and it may cause other systems to fail or
perform very poorly.
If it is not something that can be fixed, it should be clearly documented.
On Thu, Feb 09, 2006 at 02:03:41PM -0500, Mark Woodward wrote:
"Mark Woodward" <pgsql@mohawksoft.com> writes:
Again, regardless of OS used, hashagg will exceed "working memory" as
defined in postgresql.conf.
So? If you've got OOM kill enabled, it can zap a process whether it's
strictly adhered to work_mem or not. The OOM killer is entirely capable
of choosing a victim process whose memory footprint hasn't changed
materially since it started (eg, the postmaster).
Sorry, I must strongly disagree here. The postgresql.conf "working mem" is
a VERY IMPORTANT setting; it is intended to limit the consumption of
memory by the PostgreSQL process. Often times PostgreSQL will work
Actually, no, it's not designed for that at all.
alongside other application servers on the same system; in fact, it may be a
sub-part of application servers on the same system. (This is, in fact, how
it is used on one of my site servers.)
Clearly, if the server will use 1000 times this number (set for 1024K, but
exceeding 1G), this is broken, and it may cause other systems to fail or
perform very poorly.
If it is not something that can be fixed, it should be clearly documented.
work_mem (integer)
Specifies the amount of memory to be used by internal sort
operations and hash tables before switching to temporary disk files.
The value is specified in kilobytes, and defaults to 1024 kilobytes
(1 MB). Note that for a complex query, several sort or hash
operations might be running in parallel; each one will be allowed to
use as much memory as this value specifies before it starts to put
data into temporary files. Also, several running sessions could be
doing such operations concurrently. So the total memory used could
be many times the value of work_mem; it is necessary to keep this
fact in mind when choosing the value. Sort operations are used for
ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash
joins, hash-based aggregation, and hash-based processing of IN
subqueries.
So it says right there that it's very easy to exceed work_mem by a very
large amount. Granted, this is a very painful problem to deal with and
will hopefully be changed at some point, but it's pretty clear as to how
this works.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
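(To see the documented spill-to-disk behavior for sorts, as opposed to hashagg, one can watch a big sort at the default setting; pgsql_tmp is where the 8.x server keeps its temporary sort files:)

```sql
-- 1 MB per sort operation, the default; the value is in kilobytes.
SET work_mem = 1024;

-- Sorting ~1.9M varchar values in 1 MB of sort memory will spill;
-- temporary files appear under $PGDATA/base/<dboid>/pgsql_tmp while
-- the query runs, and the query completes rather than crashing.
EXPLAIN ANALYZE SELECT ucode FROM cdtitles ORDER BY ucode;
```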
On Thu, Feb 09, 2006 at 11:42:57AM -0500, Mark Woodward wrote:
"Mark Woodward" <pgsql@mohawksoft.com> writes:
Still, I would say that it is extremely bad behavior for not having
stats, wouldn't you think?
Think of it as a kernel bug.
While I respect your viewpoint that the Linux kernel should not kill an
offending process if the system runs out of memory, I sort of disagree, in
that the OOM killer is a disaster preventer. It should be viewed as a
last-ditch "him or me" choice the kernel needs to make, and it should not
get into that position in the first place.
I've had processes run away on a FreeBSD box before, to the extent of
running entirely out of swap and memory. Instead of random processes
just dying for no apparent reason, I instead started getting a bunch of
out-of-memory errors. No disaster, I just fixed the problem and life
went on.
Well, ok, the box did become rather unresponsive when my fix for the
problem meant that all of a sudden there were about 950 perl processes
trying to run at the same time. I wish I'd captured top showing 900+
runnable processes. But after a few minutes the processes started
completing and exiting and everything was soon back to normal. I rather
doubt Linux would survive that...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
* Jim C. Nasby (jnasby@pervasive.com) wrote:
On Thu, Feb 09, 2006 at 02:03:41PM -0500, Mark Woodward wrote:
If it is not something that can be fixed, it should be clearly documented.
work_mem (integer)
Specifies the amount of memory to be used by internal sort
operations and hash tables before switching to temporary disk files.
The value is specified in kilobytes, and defaults to 1024 kilobytes
(1 MB). Note that for a complex query, several sort or hash
operations might be running in parallel; each one will be allowed to
use as much memory as this value specifies before it starts to put
data into temporary files. Also, several running sessions could be
doing such operations concurrently. So the total memory used could
be many times the value of work_mem; it is necessary to keep this
fact in mind when choosing the value. Sort operations are used for
ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash
joins, hash-based aggregation, and hash-based processing of IN
subqueries.
So it says right there that it's very easy to exceed work_mem by a very
large amount. Granted, this is a very painful problem to deal with and
will hopefully be changed at some point, but it's pretty clear as to how
this works.
It also says that when it goes over, it'll spill to disk. Additionally,
we're talking about one hash here, not multiple ones. It seems at least
misleading since, if I understand correctly, Postgres isn't actually
actively checking whether the amount of memory used by an in-progress
hash creation has gone over the limit; rather, it guesses at how much
memory will be used during the planning stage to decide whether a hash
plan is possible or not. That guess can certainly be wrong, but there's
nothing in place to handle the situation where the guess is wrong...
Thanks,
Stephen
Stephen Frost <sfrost@snowman.net> writes:
Unless I've missed something here, disabling the OOM killer doesn't
really solve the problem here.
Well in a way it does. Postgres would get an out-of-memory error from malloc
which it would handle properly and the world would be happy.
Except not quite, since I think an out of memory error still means that
backend exits instead of just that query failing. That means if you have an
application running such as apache then all subsequent transactions on that
connection fail too, instead of just the transaction that misbehaved.
And as the other poster mentioned, having Postgres use up every available byte
of memory isn't really very friendly to anything else running on the box.
It doesn't seem like a bad idea to have a max_memory parameter such that,
if a backend ever exceeded it, the current transaction would immediately be
aborted. That would let an application continue operating normally after
getting an error.
--
greg
Greg Stark <gsstark@mit.edu> writes:
Except not quite, since I think an out of memory error still means that
backend exits instead of just that query failing.
Not at all! PG will recover from this perfectly well ... if it's given
the opportunity, rather than being SIGKILLed.
It doesn't seem like a bad idea to have a max_memory parameter that if a
backend ever exceeded it would immediately abort the current
transaction.
See ulimit (or local equivalent).
regards, tom lane
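(Tom's ulimit suggestion, sketched for the script that starts the postmaster; the 512 MB cap is only an example figure. Child backends inherit the limit, so an over-limit malloc returns NULL and PostgreSQL raises an ordinary "out of memory" error instead of being SIGKILLed:)

```shell
# Example only: cap virtual address space at 512 MB (value in KB)
# before starting the postmaster; backends inherit the limit.
ulimit -v 524288
pg_ctl start -D /path/to/data
```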
On Thu, Feb 09, 2006 at 02:03:41PM -0500, Mark Woodward wrote:
"Mark Woodward" <pgsql@mohawksoft.com> writes:
Again, regardless of OS used, hashagg will exceed "working memory" as
defined in postgresql.conf.
So? If you've got OOM kill enabled, it can zap a process whether it's
strictly adhered to work_mem or not. The OOM killer is entirely capable
of choosing a victim process whose memory footprint hasn't changed
materially since it started (eg, the postmaster).
Sorry, I must strongly disagree here. The postgresql.conf "working mem" is
a VERY IMPORTANT setting; it is intended to limit the consumption of
memory by the PostgreSQL process. Often times PostgreSQL will work
Actually, no, it's not designed for that at all.
I guess that's a matter of opinion.
alongside other application servers on the same system; in fact, it may be a
sub-part of application servers on the same system. (This is, in fact, how
it is used on one of my site servers.)
Clearly, if the server will use 1000 times this number (set for 1024K, but
exceeding 1G), this is broken, and it may cause other systems to fail or
perform very poorly.
If it is not something that can be fixed, it should be clearly documented.
work_mem (integer)
Specifies the amount of memory to be used by internal sort
operations and hash tables before switching to temporary disk files.
The value is specified in kilobytes, and defaults to 1024 kilobytes
(1 MB). Note that for a complex query, several sort or hash
operations might be running in parallel; each one will be allowed to
use as much memory as this value specifies before it starts to put
data into temporary files. Also, several running sessions could be
doing such operations concurrently. So the total memory used could
be many times the value of work_mem; it is necessary to keep this
fact in mind when choosing the value. Sort operations are used for
ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash
joins, hash-based aggregation, and hash-based processing of IN
subqueries.
So it says right there that it's very easy to exceed work_mem by a very
large amount. Granted, this is a very painful problem to deal with and
will hopefully be changed at some point, but it's pretty clear as to how
this works.
Well, if you read that paragraph carefully, I'll admit that I was a little
too literal in my statement, applying it to the "process" and not to
specific functions within the process, but in the documentation:
"each one will be allowed to use as much memory as this value specifies
before it starts to put data into temporary files."
According to the documentation, the behavior of hashagg is broken. It did
not use up to this amount and then start to put data into temporary files;
it used 1000 times this limit and was killed by the OS.
I think it should be documented, as the behavior is unpredictable.