Running out of memory the hard way ...

Started by Bill Moran, about 9 years ago · 6 messages · general
#1Bill Moran
wmoran@potentialtech.com

If you run a transaction with lots of server-side functions that use a
lot of memory, this can trigger the OOM killer in Linux, causing the
PostgreSQL backend to receive a SIGKILL and all the associated bad
stuff.

Tuning the OOM killer is not sufficient. No setting I've found for the
OOM killer will guarantee that it won't SIGKILL a process that's essentially
untenable anyway (because at some point it's going to use more memory than
actually exists on the system). Additionally, "add more RAM" doesn't
"solve" the problem; it only delays it until datasets scale up to even
larger transactions that use even more memory.

This is particularly prevalent with PostGIS, because some PostGIS functions
are very memory intensive, but I'd be willing to bet real money that I could
trigger it with just about any stored procedure that allocates memory in
such a way that it doesn't get reclaimed until the transaction completes.
See as an example: https://trac.osgeo.org/postgis/ticket/3445
If anyone wants to investigate this but is having trouble reproducing it, I
can construct specific failure scenarios fairly easily.

Another workaround is to run the offending statements in smaller
transactional batches. This is the best solution I've found so far, but
it's not quite ideal. In particular it requires the client program to
reimplement transaction guarantees on the client side. Sometimes this
isn't necessary, but other times it is.
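
The batching workaround described above can be sketched as follows. The table, column, PostGIS call, and id range are all hypothetical, and `echo` stands in for `psql -d mydb -c` so the sketch runs without a server:

```shell
# Split one huge transaction into fixed-size id ranges, each committed
# separately, so no single transaction accumulates unbounded memory.
TOTAL=2500   # highest id to process (assumed known in advance)
BATCH=1000   # rows per transaction

start=1
while [ "$start" -le "$TOTAL" ]; do
    end=$(( start + BATCH - 1 ))
    [ "$end" -gt "$TOTAL" ] && end=$TOTAL
    # Each range is its own transaction; backend memory is reclaimed at COMMIT.
    echo "BEGIN; UPDATE big_table SET geom = ST_Buffer(geom, 10.0) WHERE id BETWEEN $start AND $end; COMMIT;"
    start=$(( end + 1 ))
done
```

The trade-off is exactly the one noted above: if the job dies halfway, some batches are committed and others are not, so the client must track progress and handle restarts itself.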

What I feel is the best way to mitigate the situation is to have some
setting that limits the maximum RAM any backend can consume. Attempting to
exceed this limit would cause an error and rollback for that particular
backend without affecting other backends. This would provide information
to the client process that can be rationally interpreted by client code,
resulting in either an error that a developer can understand, or possibly
adaptive code that changes behavior to accommodate limits on the server
side.

My first question: does this setting exist somewhere and I'm simply not
finding it for some reason?

Assuming this doesn't exist (I haven't found it), my next question is
whether there's a philosophical or technical reason that such a feature
doesn't exist. Should I take this discussion to -hackers?

--
Bill Moran <wmoran@potentialtech.com>

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Bill Moran (#1)
Re: Running out of memory the hard way ...

Bill Moran wrote:

If you run a transaction with lots of server side functions that use a
lot of memory, this can trigger the OOM killer in Linux, causing the
PostgreSQL backend to receive a SIGKILL and all the associated bad
stuff.

Tuning the OOM killer is not sufficient. No setting I've found for the
OOM killer will guarantee that it won't SIGKILL a process that's essentially
untenable anyway (because at some point it's going to use more memory than
actually exists on the system). Additionally, "add more RAM" doesn't
"solve" the problem; it only delays it until datasets scale up to even
larger transactions that use even more memory.

I routinely set vm.overcommit_memory = 2 and configure vm.overcommit_ratio
so that the kernel does not try to commit more memory than there is in
the machine.

That should get rid of the problem, of course at the risk of leaving
some memory unused.
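
Concretely, that configuration looks something like this sketch. It must be run as root on Linux, and the ratio value is illustrative; it should be derived from your actual RAM and swap sizes:

```shell
# Disable heuristic overcommit so malloc() fails up front instead of
# the OOM killer firing later.
sysctl -w vm.overcommit_memory=2    # strict accounting: CommitLimit enforced
sysctl -w vm.overcommit_ratio=90    # CommitLimit = swap + 90% of RAM
# To persist across reboots, add the same keys to /etc/sysctl.conf:
#   vm.overcommit_memory = 2
#   vm.overcommit_ratio = 90
```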

This is particularly prevalent with PostGIS, because some PostGIS functions
are very memory intensive, but I'd be willing to bet real money that I could
trigger it with just about any stored procedure that allocates memory in
such a way that it doesn't get reclaimed until the transaction completes.

[...]

What I feel is the best way to mitigate the situation, is to have some
setting that limits the maximum RAM any backend can consume. Attempting to
exceed this limit would cause an error and rollback for that particular
backend without affecting other backends.

[...]

My first question: does this setting exist somewhere and I'm simply not
finding it for some reason?

Assuming this doesn't exist (I haven't found it) my next question is
whether there's a philosophical or technical reason that such a feature
doesn't exist? Should I take this discussion to -hackers?

I don't think that there is such a setting.

work_mem sets a limit per operation, but that is a soft limit that
PostgreSQL server code can choose to ignore if it pleases.
Moreover, it does not limit the *total* memory a backend can use.

I'd delegate that problem to the operating system which, after all,
should know best of all how much memory a process uses.
And I don't see a big advantage in a PostgreSQL generated error message
over an "out of memory" error that is propagated from the operating system.

Of course, if there is no way to limit the amount of memory per process
(excluding shared memory!), you have a point.

Yours,
Laurenz Albe


#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Laurenz Albe (#2)
Re: Running out of memory the hard way ...

Albe Laurenz <laurenz.albe@wien.gv.at> writes:

Bill Moran wrote:

What I feel is the best way to mitigate the situation, is to have some
setting that limits the maximum RAM any backend can consume.

I'd delegate that problem to the operating system which, after all,
should know best of all how much memory a process uses.

I've had some success using ulimit in the past, although it does have
the disadvantage that you have to impose the same limit on every PG
process. (You set it before starting the postmaster and it inherits
to every child process.) If memory serves, limiting with the -v switch
works better than -d or -m on Linux; but I might be misremembering.
Conceivably we could add code to let the ulimit be set per-process,
if the use-case were strong enough.
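
A sketch of this approach, imposed in the shell session that launches the postmaster so every child inherits it. The 8 GiB figure and the data directory path are illustrative only:

```shell
# Cap per-process virtual address space before starting the postmaster,
# so backends get a normal "out of memory" error from malloc() instead
# of being SIGKILLed by the OOM killer.
ulimit -v $(( 8 * 1024 * 1024 ))   # -v takes kilobytes: 8 GiB cap
ulimit -v                          # confirm the limit took effect
# pg_ctl -D /var/lib/postgresql/data start   # children inherit the cap
```

Note that `ulimit -v` limits address space rather than resident memory, which is why it tends to work better than `-d` or `-m` on Linux, where the latter are only partially enforced.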

To implement a limit inside PG, we'd have to add expensive bookkeeping
to the palloc/pfree mechanism, and even that would be no panacea because
it would fail to account for memory allocated directly from malloc.
Hence, you could be pretty certain that it would be wildly inaccurate
for sessions using third-party code such as PostGIS or Python. An
OS-enforced limit definitely sounds better from here.

regards, tom lane


#4Paul Ramsey
pramsey@cleverelephant.ca
In reply to: Tom Lane (#3)
Re: Running out of memory the hard way ...

On Wed, Feb 8, 2017 at 7:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Albe Laurenz <laurenz.albe@wien.gv.at> writes:

Bill Moran wrote:

What I feel is the best way to mitigate the situation, is to have some
setting that limits the maximum RAM any backend can consume.

I'd delegate that problem to the operating system which, after all,
should know best of all how much memory a process uses.

I've had some success using ulimit in the past, although it does have
the disadvantage that you have to impose the same limit on every PG
process. (You set it before starting the postmaster and it inherits
to every child process.) If memory serves, limiting with the -v switch
works better than -d or -m on Linux; but I might be misremembering.
Conceivably we could add code to let the ulimit be set per-process,
if the use-case were strong enough.

To implement a limit inside PG, we'd have to add expensive bookkeeping
to the palloc/pfree mechanism, and even that would be no panacea because
it would fail to account for memory allocated directly from malloc.
Hence, you could be pretty certain that it would be wildly inaccurate
for sessions using third-party code such as PostGIS or Python. An
OS-enforced limit definitely sounds better from here.

Confirming what Tom said, with respect to the specific example in this
thread: a large proportion of the allocations in the memory-hungry bits of
PostGIS are in fact bare malloc calls made via the GEOS library.

P

#5Bill Moran
wmoran@potentialtech.com
In reply to: Tom Lane (#3)
Re: Running out of memory the hard way ...

On Wed, 08 Feb 2017 10:44:24 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

Albe Laurenz <laurenz.albe@wien.gv.at> writes:

Bill Moran wrote:

What I feel is the best way to mitigate the situation, is to have some
setting that limits the maximum RAM any backend can consume.

I'd delegate that problem to the operating system which, after all,
should know best of all how much memory a process uses.

I've had some success using ulimit in the past, although it does have
the disadvantage that you have to impose the same limit on every PG
process. (You set it before starting the postmaster and it inherits
to every child process.) If memory serves, limiting with the -v switch
works better than -d or -m on Linux; but I might be misremembering.
Conceivably we could add code to let the ulimit be set per-process,
if the use-case were strong enough.

Thanks, Tom. I'm not sure why I didn't think to use this. Although part
of the problem may be that most of the links that come up in a Google
search on this topic don't seem to include this suggestion.

Hopefully having this in the list archives will make the search easier
for the next person who has this issue. Does anyone know if there are
any suggestions to this effect in the official documentation? If not,
I'll try to make some time to submit a patch.

To implement a limit inside PG, we'd have to add expensive bookkeeping
to the palloc/pfree mechanism, and even that would be no panacea because
it would fail to account for memory allocated directly from malloc.
Hence, you could be pretty certain that it would be wildly inaccurate
for sessions using third-party code such as PostGIS or Python. An
OS-enforced limit definitely sounds better from here.

Unfortunate but understandable.

--
Bill Moran <wmoran@potentialtech.com>


#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bill Moran (#5)
Re: Running out of memory the hard way ...

Bill Moran <wmoran@potentialtech.com> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Bill Moran wrote:

What I feel is the best way to mitigate the situation, is to have some
setting that limits the maximum RAM any backend can consume.

I've had some success using ulimit in the past,

Hopefully having this in the list archives will make the search easier
for the next person who has this issue. Does anyone know if there are
any suggestions to this effect in the official documentation? If not,
I'll try to make some time to submit a patch.

[ grepgrepgrep... ] There is not. There are several mentions of ulimit
in the docs, but all of them point towards removing unreasonably-tight
default limits, not towards intentionally adding a limit. I think
something in or near 18.4.2 "Resource Limits" might be appropriate.

regards, tom lane
