Out of Memory

Started by greigwiseover 7 years ago21 messagesgeneral
Jump to latest
#1greigwise
greigwise@comcast.net

Hello, I'm running postgres 9.6.10 on Centos 7. Seeing the occasional out
of memory error trying to run a query. In the logs I see something like
this:

Grand total: 462104832 bytes in 795 blocks; 142439136 free (819860 chunks);
319665696 used
2018-09-20 18:08:01 UTC xxxx 5ba3e1a2.7a8a dbname ERROR: out of memory
2018-09-20 18:08:01 UTC xxxx 5ba3e1a2.7a8a dbname DETAIL: Failed on request
of size 2016.

If I have 142439136 free, then why am I failing on a request of size 2016?

Am I misunderstanding here?

Thanks,
Greig Wise

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#2Paul Carlucci
paul.carlucci@gmail.com
In reply to: greigwise (#1)
Re: Out of Memory

My first two guesses are ulimit or numa. numactl --hardware will show your
how many nodes your box has and if you're exhausting any of them.

On Thu, Sep 20, 2018, 6:11 PM greigwise <greigwise@comcast.net> wrote:

Show quoted text

Hello, I'm running postgres 9.6.10 on Centos 7. Seeing the occasional out
of memory error trying to run a query. In the logs I see something like
this:

Grand total: 462104832 bytes in 795 blocks; 142439136 free (819860 chunks);
319665696 used
2018-09-20 18:08:01 UTC xxxx 5ba3e1a2.7a8a dbname ERROR: out of memory
2018-09-20 18:08:01 UTC xxxx 5ba3e1a2.7a8a dbname DETAIL: Failed on
request
of size 2016.

If I have 142439136 free, then why am I failing on a request of size 2016?

Am I misunderstanding here?

Thanks,
Greig Wise

--
Sent from:
http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#3greigwise
greigwise@comcast.net
In reply to: Paul Carlucci (#2)
Re: Out of Memory

ulimit -a for the postgres user shows memory unlimited.

numactl --hardware gives command not found.

Thanks again.

Greig

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: greigwise (#1)
Re: Out of Memory

greigwise <greigwise@comcast.net> writes:

Hello, I'm running postgres 9.6.10 on Centos 7. Seeing the occasional out
of memory error trying to run a query. In the logs I see something like
this:

Grand total: 462104832 bytes in 795 blocks; 142439136 free (819860 chunks);
319665696 used
2018-09-20 18:08:01 UTC xxxx 5ba3e1a2.7a8a dbname ERROR: out of memory
2018-09-20 18:08:01 UTC xxxx 5ba3e1a2.7a8a dbname DETAIL: Failed on request
of size 2016.

If I have 142439136 free, then why am I failing on a request of size 2016?

The free space must be in contexts other than the one that last little
request wanted space in. Overall, you've got about 460MB of space
consumed in that session, so it's not *that* surprising that you got OOM.
(At least, it's unsurprising on a 32-bit machine. If the server is
64-bit I'd have thought the kernel would be a bit more liberal.)

But anyway, this looks like a mighty inefficient usage pattern at best,
and maybe a memory leak at worst. Can you create a self-contained test
case that does this?

regards, tom lane

#5greigwise
greigwise@comcast.net
In reply to: Tom Lane (#4)
Re: Out of Memory

Well, we are 64-bit.... I'll see if I can make some kind of self contained
test to repeat it.

Thanks,
Greig

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#6greigwise
greigwise@comcast.net
In reply to: greigwise (#5)
Re: Out of Memory

Well, I've been unsuccessful so far on creating a standalone test.

I have put some scripting in place to capture some additional information on
the server with the out of memory issues. I have a script which just
periodically dumps the output of free -m to a text file.

So, the output of free -m immediately before and after the out of memory
error looks like this:

Just before:
total used free shared buff/cache
available
Mem: 14877 2978 132 3553 11766
7943
Swap: 0 0 0

Just after:
total used free shared buff/cache
available
Mem: 14877 2946 649 3548 11280
7982
Swap: 0 0 0

If I have nearly 8 GB of memory left, why am I getting out of memory errors?

Thanks,
Greig

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: greigwise (#6)
Re: Out of Memory

greigwise <greigwise@comcast.net> writes:

If I have nearly 8 GB of memory left, why am I getting out of memory errors?

Probably the postmaster is running under restrictive ulimit settings.

regards, tom lane

#8greigwise
greigwise@comcast.net
In reply to: Tom Lane (#7)
Re: Out of Memory

Tom Lane-2 wrote

greigwise &lt;

greigwise@

&gt; writes:

If I have nearly 8 GB of memory left, why am I getting out of memory
errors?

Probably the postmaster is running under restrictive ulimit settings.

regards, tom lane

If I login as the user which runs postmaster, I get this:

-bash-4.2$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 59341
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

Max memory size is unlimited as is virtual memory... is there something else
there I should be concerned with in regard to out of memory?

Thanks again.
Greig

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#9Bill Moran
wmoran@potentialtech.com
In reply to: greigwise (#6)
Re: Out of Memory

On Tue, 25 Sep 2018 11:34:19 -0700 (MST)
greigwise <greigwise@comcast.net> wrote:

Well, I've been unsuccessful so far on creating a standalone test.

I have put some scripting in place to capture some additional information on
the server with the out of memory issues. I have a script which just
periodically dumps the output of free -m to a text file.

So, the output of free -m immediately before and after the out of memory
error looks like this:

Just before:
total used free shared buff/cache
available
Mem: 14877 2978 132 3553 11766
7943
Swap: 0 0 0

Just after:
total used free shared buff/cache
available
Mem: 14877 2946 649 3548 11280
7982
Swap: 0 0 0

If I have nearly 8 GB of memory left, why am I getting out of memory errors?

Doesn't the default NUMA setup mean that it can't actually allocate all
the available memory to a single NUMA zone (or whatever it's called)?

Or am I talking ancient Linux history with that?

--
Bill Moran

#10Tory M Blue
tmblue@gmail.com
In reply to: Bill Moran (#9)
Re: Out of Memory

On Tue, Sep 25, 2018 at 2:05 PM PT <wmoran@potentialtech.com> wrote:

On Tue, 25 Sep 2018 11:34:19 -0700 (MST)
greigwise <greigwise@comcast.net> wrote:

Well, I've been unsuccessful so far on creating a standalone test.

I have put some scripting in place to capture some additional

information on

the server with the out of memory issues. I have a script which just
periodically dumps the output of free -m to a text file.

So, the output of free -m immediately before and after the out of memory
error looks like this:

Just before:
total used free shared buff/cache
available
Mem: 14877 2978 132 3553 11766

7943
Swap: 0 0 0

Just after:
total used free shared buff/cache
available
Mem: 14877 2946 649 3548 11280

7982
Swap: 0 0 0

If I have nearly 8 GB of memory left, why am I getting out of memory

errors?

Doesn't the default NUMA setup mean that it can't actually allocate all
the available memory to a single NUMA zone (or whatever it's called)?

Or am I talking ancient Linux history with that?

--
Bill Moran

By the way we have similar issues running in our smaller stage
environment. 9.5.6 on CentOS 7.2

Only ever my stage environment on smaller AWS t2 boxen. Memory looks fine
but Postgres says it has no access to any. And my settings are really low.

Not sure if it's the same issue, but we run into it a few times a year in
my stage environment. Also running postgres in unlimited mode for ulimit.

Tory

#11greigwise
greigwise@comcast.net
In reply to: greigwise (#8)
Re: Out of Memory

There is also this:

-bash-4.2$ prlimit -p 6590
RESOURCE DESCRIPTION SOFT HARD UNITS
AS address space limit unlimited unlimited bytes
CORE max core file size 0 unlimited blocks
CPU CPU time unlimited unlimited seconds
DATA max data size unlimited unlimited bytes
FSIZE max file size unlimited unlimited blocks
LOCKS max number of file locks held unlimited unlimited
MEMLOCK max locked-in-memory address space 65536 65536 bytes
MSGQUEUE max bytes in POSIX mqueues 819200 819200 bytes
NICE max nice prio allowed to raise 0 0
NOFILE max number of open files 1024 4096
NPROC max number of processes 4096 59341
RSS max resident set size unlimited unlimited pages
RTPRIO max real-time priority 0 0
RTTIME timeout for real-time tasks unlimited unlimited microsecs
SIGPENDING max number of pending signals 59341 59341
STACK max stack size 8388608 unlimited bytes

Is it possible that the fact that my stack size is limited is what is
causing my issue?

Thanks,
Greig

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: greigwise (#11)
Re: Out of Memory

greigwise <greigwise@comcast.net> writes:

Is it possible that the fact that my stack size is limited is what is
causing my issue?

No. If you were hitting that limit you'd get a message specifically
talking about stack.

regards, tom lane

#13greigwise
greigwise@comcast.net
In reply to: Tom Lane (#12)
Re: Out of Memory

Tom Lane-2 wrote

greigwise &lt;

greigwise@

&gt; writes:

Is it possible that the fact that my stack size is limited is what is
causing my issue?

No. If you were hitting that limit you'd get a message specifically
talking about stack.

regards, tom lane

Well, darn. I'm at a loss... any suggestions of what I should do next to
troubleshoot this?

Thanks.
Greig

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#14greigwise
greigwise@comcast.net
In reply to: greigwise (#13)
Re: Out of Memory

I think I figured it out:

vm.overcommit_memory = 2
vm.overcommit_ratio = 50

Only allows me to use 50% of my RAM... ugh! I have 16 GB, so when only 8 is
left, I start seeing OOM. Will increase this setting and see if it helps.

Thanks everyone for the help.

Greig

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#15greigwise
greigwise@comcast.net
In reply to: greigwise (#14)
Re: Out of Memory

All right.. one more thing here. Any suggestions for how to set overcommit
on a postgres db server with 16 GB of RAM and no swap? I think I want
vm.overcommit_memory = 2, as I understand that prevents the OOM killer from
zapping me. Is 100% the right way to go for overcommit_ratio? Is there a
drawback to this?

Thanks again.
Greig

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#16Laurenz Albe
laurenz.albe@cybertec.at
In reply to: greigwise (#15)
Re: Out of Memory

greigwise wrote:

All right.. one more thing here. Any suggestions for how to set overcommit
on a postgres db server with 16 GB of RAM and no swap? I think I want
vm.overcommit_memory = 2, as I understand that prevents the OOM killer from
zapping me. Is 100% the right way to go for overcommit_ratio? Is there a
drawback to this?

vm.overcommit_memory = 2
vm_overcommit_ratio = 100

Linux commits (swap * overcommit_ratio * RAM / 100),
so without any swap the correct value would be 100.

I don't know how safe it is to got entirely without swap.
I would add a little bit just to be sure.

Yours,
Laurenz Albe

#17Christoph Moench-Tegeder
cmt@burggraben.net
In reply to: Laurenz Albe (#16)
Re: Out of Memory

## Laurenz Albe (laurenz.albe@cybertec.at):

vm.overcommit_memory = 2
vm_overcommit_ratio = 100

Linux commits (swap * overcommit_ratio * RAM / 100),

^
That should be a "+".

See Documentation/sysctl/vm.txt and Documentation/vm/overcommit-accounting.rst
in your kernel tree.

so without any swap the correct value would be 100.

I don't know how safe it is to got entirely without swap.

That's completely safe. Even more: if your production machine ever needs
swap, it's already mostly dead in the water. So there's also no benefit in
having the kernel copy stuff to swap "just in case" (read: set sysctl
vm.swappiness to 0) and use the swap disk space for something more
productive. If your memory usage ever exceeds available memory, something
already went horribly wrong, and you want to know about it right now.
A good way to know about this kind of mistakes is having stuff fail
hard, obviously and loudly (i.e. Out Of Memory). Do not try to save
some day by swapping: stuff is "sometimes slow" and if you really need
the swap, everything crawls to a halt anyway.
Of course, this does not hold for personal workstations and
the like where you might want to use the swap space for suspend to
disk, or have some buffer in case of runaway processes.

Regards,
Christoph

--
Spare Space.

#18Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Christoph Moench-Tegeder (#17)
Re: Out of Memory

Christoph Moench-Tegeder wrote:

## Laurenz Albe (laurenz.albe@cybertec.at):

vm.overcommit_memory = 2
vm_overcommit_ratio = 100

Linux commits (swap * overcommit_ratio * RAM / 100),

^
That should be a "+".

Yes; shame on me for careless typing, and thank you for the
correction.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#19Rob Sargent
robjsargent@gmail.com
In reply to: Laurenz Albe (#18)
Re: Out of Memory

On Sep 27, 2018, at 3:45 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Christoph Moench-Tegeder wrote:

## Laurenz Albe (laurenz.albe@cybertec.at):

vm.overcommit_memory = 2
vm_overcommit_ratio = 100

Linux commits (swap * overcommit_ratio * RAM / 100),

^
That should be a "+".

Yes; shame on me for careless typing, and thank you for the
correction.

Are there any parentheses needed in that formula?

#20Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Rob Sargent (#19)
Re: Out of Memory

Rob Sargent wrote:

Christoph Moench-Tegeder wrote:

## Laurenz Albe (laurenz.albe@cybertec.at):

vm.overcommit_memory = 2
vm_overcommit_ratio = 100

Linux commits (swap * overcommit_ratio * RAM / 100),

^
That should be a "+".

Yes; shame on me for careless typing, and thank you for the
correction.

Are there any parentheses needed in that formula?

No. It is swap space plus a certain percentage of RAM.

I don't know how the Linux developers came up with that
weird formula.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#21Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Laurenz Albe (#20)