Out of Memory
Hello, I'm running Postgres 9.6.10 on CentOS 7. I'm seeing the occasional out
of memory error when trying to run a query. In the logs I see something like
this:
Grand total: 462104832 bytes in 795 blocks; 142439136 free (819860 chunks);
319665696 used
2018-09-20 18:08:01 UTC xxxx 5ba3e1a2.7a8a dbname ERROR: out of memory
2018-09-20 18:08:01 UTC xxxx 5ba3e1a2.7a8a dbname DETAIL: Failed on request
of size 2016.
If I have 142439136 free, then why am I failing on a request of size 2016?
Am I misunderstanding here?
Thanks,
Greig Wise
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
My first two guesses are ulimit or NUMA. numactl --hardware will show you
how many nodes your box has and whether you're exhausting any of them.
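[Editor's note: a quick way to run both of the suggested checks from a shell — a sketch only; numactl comes from the "numactl" package on CentOS and may not be installed.]

```shell
# Quick triage for the two guesses above: per-process limits and NUMA layout.
ulimit -a
# If numactl is missing, the box is most likely a single NUMA node anyway.
numactl --hardware 2>/dev/null || echo "numactl not installed"
```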
ulimit -a for the postgres user shows memory unlimited.
numactl --hardware gives command not found.
Thanks again.
Greig
greigwise <greigwise@comcast.net> writes:
If I have 142439136 free, then why am I failing on a request of size 2016?
The free space must be in contexts other than the one that last little
request wanted space in. Overall, you've got about 460MB of space
consumed in that session, so it's not *that* surprising that you got OOM.
(At least, it's unsurprising on a 32-bit machine. If the server is
64-bit I'd have thought the kernel would be a bit more liberal.)
But anyway, this looks like a mighty inefficient usage pattern at best,
and maybe a memory leak at worst. Can you create a self-contained test
case that does this?
regards, tom lane
Well, we are 64-bit... I'll see if I can make some kind of self-contained
test to repeat it.
Thanks,
Greig
Well, I've been unsuccessful so far on creating a standalone test.
I have put some scripting in place to capture some additional information on
the server with the out of memory issues. I have a script which just
periodically dumps the output of free -m to a text file.
So, the output of free -m immediately before and after the out of memory
error looks like this:
Just before:

              total        used        free      shared  buff/cache   available
Mem:          14877        2978         132        3553       11766        7943
Swap:             0           0           0

Just after:

              total        used        free      shared  buff/cache   available
Mem:          14877        2946         649        3548       11280        7982
Swap:             0           0           0
If I have nearly 8 GB of memory left, why am I getting out of memory errors?
Thanks,
Greig
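[Editor's note: a minimal sketch of the sampler described above — append timestamped `free -m` snapshots to a text file. The log path, interval, and sample count are hypothetical knobs, not from the original post.]

```shell
#!/bin/sh
# Periodically append timestamped `free -m` output to a log file.
# LOG, INTERVAL and COUNT are made-up defaults; adjust to taste.
LOG="${LOG:-free_m.log}"
INTERVAL="${INTERVAL:-1}"
COUNT="${COUNT:-3}"
i=0
while [ "$i" -lt "$COUNT" ]; do
    { date -u '+%Y-%m-%d %H:%M:%S UTC'; free -m; echo; } >> "$LOG"
    i=$((i + 1))
    sleep "$INTERVAL"
done
```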
greigwise <greigwise@comcast.net> writes:
If I have nearly 8 GB of memory left, why am I getting out of memory errors?
Probably the postmaster is running under restrictive ulimit settings.
regards, tom lane
Tom Lane wrote:
Probably the postmaster is running under restrictive ulimit settings.
If I login as the user which runs postmaster, I get this:
-bash-4.2$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 59341
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Max memory size is unlimited as is virtual memory... is there something else
there I should be concerned with in regard to out of memory?
Thanks again.
Greig
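[Editor's note: one caveat worth adding here — the ulimit of an interactive login shell is not necessarily what the running postmaster got; a service started by systemd or an init script can have different limits. A sketch of checking the live process instead; the pidfile path is a guess based on the standard CentOS/PGDG layout.]

```shell
# Check the limits of the running postmaster rather than a login shell.
# The pidfile path is an assumption (PGDG packaging default); fall back
# to the current shell just so the command has something to show.
pid=$(head -n 1 /var/lib/pgsql/9.6/data/postmaster.pid 2>/dev/null)
pid=${pid:-$$}
cat "/proc/$pid/limits"
```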
On Tue, 25 Sep 2018 11:34:19 -0700 (MST)
greigwise <greigwise@comcast.net> wrote:
If I have nearly 8 GB of memory left, why am I getting out of memory errors?
Doesn't the default NUMA setup mean that it can't actually allocate all
the available memory to a single NUMA zone (or whatever it's called)?
Or am I talking ancient Linux history with that?
--
Bill Moran
On Tue, Sep 25, 2018 at 2:05 PM PT <wmoran@potentialtech.com> wrote:
Doesn't the default NUMA setup mean that it can't actually allocate all
the available memory to a single NUMA zone (or whatever it's called)?
By the way, we have similar issues running in our smaller stage
environment: 9.5.6 on CentOS 7.2.
It only ever happens in my stage environment, on smaller AWS t2 boxen.
Memory looks fine, but Postgres says it has no access to any, and my
settings are really low. Not sure if it's the same issue, but we run into
it a few times a year in my stage environment. Also running postgres in
unlimited mode for ulimit.
Tory
There is also this:
-bash-4.2$ prlimit -p 6590
RESOURCE DESCRIPTION SOFT HARD UNITS
AS address space limit unlimited unlimited bytes
CORE max core file size 0 unlimited blocks
CPU CPU time unlimited unlimited seconds
DATA max data size unlimited unlimited bytes
FSIZE max file size unlimited unlimited blocks
LOCKS max number of file locks held unlimited unlimited
MEMLOCK max locked-in-memory address space 65536 65536 bytes
MSGQUEUE max bytes in POSIX mqueues 819200 819200 bytes
NICE max nice prio allowed to raise 0 0
NOFILE max number of open files 1024 4096
NPROC max number of processes 4096 59341
RSS max resident set size unlimited unlimited pages
RTPRIO max real-time priority 0 0
RTTIME timeout for real-time tasks unlimited unlimited microsecs
SIGPENDING max number of pending signals 59341 59341
STACK max stack size 8388608 unlimited bytes
Is it possible that the fact that my stack size is limited is what is
causing my issue?
Thanks,
Greig
greigwise <greigwise@comcast.net> writes:
Is it possible that the fact that my stack size is limited is what is
causing my issue?
No. If you were hitting that limit you'd get a message specifically
talking about stack.
regards, tom lane
Tom Lane wrote:
No. If you were hitting that limit you'd get a message specifically
talking about stack.
Well, darn. I'm at a loss... any suggestions of what I should do next to
troubleshoot this?
Thanks.
Greig
I think I figured it out:
vm.overcommit_memory = 2
vm.overcommit_ratio = 50
Only allows me to commit 50% of my RAM... ugh! I have 16 GB, so once about
8 GB is committed, I start seeing OOM. Will increase this setting and see if
it helps.
Thanks everyone for the help.
Greig
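[Editor's note: the ceiling Greig hit can be read straight out of /proc. A sketch, Linux-only: with vm.overcommit_memory=2, and ignoring vm.overcommit_kbytes and hugepages, the kernel's commit limit is SwapTotal + MemTotal * overcommit_ratio / 100.]

```shell
# Compute the commit limit enforced under vm.overcommit_memory=2:
# CommitLimit = SwapTotal + MemTotal * overcommit_ratio / 100
ratio=$(cat /proc/sys/vm/overcommit_ratio)
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
swap_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
echo "computed limit: $(( (swap_kb + mem_kb * ratio / 100) / 1024 )) MiB"
# Compare with what the kernel reports, and how much is already committed:
grep -E '^(CommitLimit|Committed_AS)' /proc/meminfo
```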
All right.. one more thing here. Any suggestions for how to set overcommit
on a postgres db server with 16 GB of RAM and no swap? I think I want
vm.overcommit_memory = 2, as I understand that prevents the OOM killer from
zapping me. Is 100% the right way to go for overcommit_ratio? Is there a
drawback to this?
Thanks again.
Greig
greigwise wrote:
All right.. one more thing here. Any suggestions for how to set overcommit
on a postgres db server with 16 GB of RAM and no swap? I think I want
vm.overcommit_memory = 2, as I understand that prevents the OOM killer from
zapping me. Is 100% the right way to go for overcommit_ratio? Is there a
drawback to this?
vm.overcommit_memory = 2
vm.overcommit_ratio = 100
Linux commits (swap * overcommit_ratio * RAM / 100),
so without any swap the correct value would be 100.
I don't know how safe it is to go entirely without swap.
I would add a little bit just to be sure.
Yours,
Laurenz Albe
## Laurenz Albe (laurenz.albe@cybertec.at):
vm.overcommit_memory = 2
vm.overcommit_ratio = 100
Linux commits (swap * overcommit_ratio * RAM / 100),
                    ^
That should be a "+".
See Documentation/sysctl/vm.txt and Documentation/vm/overcommit-accounting.rst
in your kernel tree.
so without any swap the correct value would be 100.
I don't know how safe it is to go entirely without swap.
That's completely safe. Even more: if your production machine ever needs
swap, it's already mostly dead in the water. So there's also no benefit in
having the kernel copy stuff to swap "just in case" (read: set sysctl
vm.swappiness to 0) and use the swap disk space for something more
productive. If your memory usage ever exceeds available memory, something
already went horribly wrong, and you want to know about it right now.
A good way to know about this kind of mistake is having stuff fail
hard, obviously and loudly (i.e. Out Of Memory). Do not try to save
the day by swapping: stuff is "sometimes slow", and if you really need
the swap, everything crawls to a halt anyway.
Of course, this does not hold for personal workstations and
the like where you might want to use the swap space for suspend to
disk, or have some buffer in case of runaway processes.
Regards,
Christoph
--
Spare Space.
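[Editor's note: pulling the thread's advice together, the resulting sysctl configuration for a dedicated, no-swap database server would look like this. A sketch only: the filename is arbitrary, and the values are the ones suggested above.]

```
# /etc/sysctl.d/99-postgres.conf (hypothetical filename)
# Settings suggested in this thread for a dedicated database server
# with 16 GB RAM and no swap; load with `sysctl --system`.
vm.overcommit_memory = 2
vm.overcommit_ratio = 100
```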
Christoph Moench-Tegeder wrote:
## Laurenz Albe (laurenz.albe@cybertec.at):
vm.overcommit_memory = 2
vm.overcommit_ratio = 100
Linux commits (swap * overcommit_ratio * RAM / 100),
                    ^
That should be a "+".
Yes; shame on me for careless typing, and thank you for the
correction.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On Sep 27, 2018, at 3:45 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
Christoph Moench-Tegeder wrote:
## Laurenz Albe (laurenz.albe@cybertec.at):
vm.overcommit_memory = 2
vm.overcommit_ratio = 100
Linux commits (swap * overcommit_ratio * RAM / 100),
                    ^
That should be a "+".
Yes; shame on me for careless typing, and thank you for the
correction.
Are there any parentheses needed in that formula?
Rob Sargent wrote:
Christoph Moench-Tegeder wrote:
## Laurenz Albe (laurenz.albe@cybertec.at):
vm.overcommit_memory = 2
vm.overcommit_ratio = 100
Linux commits (swap * overcommit_ratio * RAM / 100),
                    ^
That should be a "+".
Yes; shame on me for careless typing, and thank you for the
correction.
Are there any parentheses needed in that formula?
No. It is swap space plus a certain percentage of RAM.
I don't know how the Linux developers came up with that
weird formula.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
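[Editor's note: as a worked example of that corrected formula (CommitLimit = SwapTotal + RAM * overcommit_ratio / 100) on the 16 GB, no-swap box from this thread — a sketch; numbers in MiB.]

```shell
# CommitLimit = SwapTotal + RAM * overcommit_ratio / 100, in MiB.
ram=16384; swap=0
for ratio in 50 100; do
    echo "overcommit_ratio=$ratio -> CommitLimit=$(( swap + ram * ratio / 100 )) MiB"
done
```

This prints 8192 MiB for ratio 50 (the ceiling Greig was hitting) and 16384 MiB for ratio 100.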