Memory Leak executing small queries without closing the connection
Hello,
We are running PostgreSQL 9.4.5 on FreeBSD 10.1 and have multiple worker
processes connected to PostgreSQL via persistent connections. They
perform only simple queries: SELECTs on primary keys and simple
INSERTs/UPDATEs. Normally nearly all the workers are idle, but they still
consume the maximum configured work_mem on the PostgreSQL server, and the
memory is also resident. If some other queries come in, we run into
out-of-memory situations. So it looks like PostgreSQL has memory leaks.
I found a test scenario that reproduces it on a newer FreeBSD 10.2 VM
as well as on a Fedora 23 VM (both with PostgreSQL 9.4.5).
Executed in psql over one persistent connection:
-- Create the table
CREATE TABLE t_random AS SELECT s, md5(random()::text) FROM
generate_Series(1,100000000) s;
-- Create the index
CREATE INDEX ON t_random(s);
-- Restart psql with a new connection:
-- Memory goes slightly up after each execution, even after canceling:
-- Memory leak on FreeBSD 10.2/Fedora 23 with PostgreSQL 9.4.5 when canceling
-- the query or executing it multiple times
SELECT * FROM t_random ORDER BY md5 LIMIT 100000;
-- Therefore I created a function:
CREATE OR REPLACE FUNCTION execmultiplei(IN num int8)
RETURNS void AS $$
BEGIN
-- RAISE NOTICE 'num=%', num;
FOR i IN 1..num LOOP
PERFORM * FROM t_random WHERE s = i;
END LOOP;
END;
$$ LANGUAGE plpgsql;
-- Test it several times
SELECT execmultiplei(10000000);
-- Linux testing (FreeBSD is similar), relevant part is RES (resident
memory):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
-- after startup of psql
26851 postgres 20 0 2363276 7432 6292 S 0.0 0.2 0:00.00
postgres: postgres postgres [local] idle
-- Memory goes up, ok so far
26851 postgres 20 0 2365732 255152 253548 R 99.0 6.3 0:10.77
postgres: postgres postgres [local] SELECT
26851 postgres 20 0 2365732 408464 406788 R 100.0 10.1 0:17.81
postgres: postgres postgres [local] SELECT
26851 postgres 20 0 2365732 864472 862576 R 100.0 21.4 0:38.90
postgres: postgres postgres [local] SELECT
-- Function execmultiplei and transaction terminated, but memory still
allocated!!!
26851 postgres 20 0 2365732 920668 918748 S 0.0 22.7 0:41.40
postgres: postgres postgres [local] idle
-- Calling it again
26851 postgres 20 0 2365732 920668 918748 R 99.0 22.7 0:46.51
postgres: postgres postgres [local] SELECT
-- idle again, memory still allocated
26851 postgres 20 0 2365732 920668 918748 S 0.0 22.7 1:22.54
postgres: postgres postgres [local] idle
Memory is only released when psql exits. According to the
PostgreSQL design, memory should be freed when the transaction completes.
top command on FreeBSD: top -SaPz -o res -s 1
top command on Linux: top -o RES -d 1
Config: VMs with 4GB of RAM, 2 vCPUs
shared_buffers = 2048MB # min 128kB
effective_cache_size = 2GB
work_mem = 892MB
wal_buffers = 8MB
checkpoint_segments = 16
Any ideas?
Thank you.
Ciao,
Gerhard
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Hello,
some further details from the original FreeBSD 10.1 machine:
Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free
Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
77941 pgsql 5 20 0 7925M 7296M usem 2 352:34 6.98%
postgres: username dbnamee 127.0.0.1(43367) (postgres)
75862 pgsql 1 20 0 7321M 7266M select 4 7:04 0.00%
postgres: checkpointer process (postgres)
75863 pgsql 1 20 0 7321M 7260M select 4 3:34 0.00%
postgres: writer process (postgres)
75867 pgsql 1 20 0 7323M 7237M sbwait 0 3:19 0.00%
postgres: username dbnamee ipaddress(58977) (postgres)
76178 pgsql 1 20 0 7323M 7237M sbwait 2 3:18 0.00%
postgres: username dbnamee ipaddress(35807) (postgres)
75868 pgsql 1 20 0 7323M 7237M sbwait 0 3:20 0.00%
postgres: username dbnamee ipaddress(58978) (postgres)
75884 pgsql 1 20 0 7323M 7236M sbwait 3 3:20 0.00%
postgres: username dbnamee ipaddress(59908) (postgres)
75869 pgsql 1 20 0 7323M 7236M sbwait 3 3:20 0.00%
postgres: username dbnamee ipaddress(58982) (postgres)
76181 pgsql 1 20 0 7323M 7236M sbwait 2 3:18 0.00%
postgres: username dbnamee ipaddress(35813) (postgres)
75883 pgsql 1 20 0 7323M 7236M sbwait 0 3:20 0.00%
postgres: username dbnamee ipaddress(59907) (postgres)
76180 pgsql 1 20 0 7323M 7236M sbwait 1 3:19 0.00%
postgres: username dbnamee ipaddress(35811) (postgres)
76177 pgsql 1 20 0 7323M 7236M sbwait 1 3:18 0.00%
postgres: username dbnamee ipaddress(35712) (postgres)
76179 pgsql 1 20 0 7323M 7236M sbwait 5 3:18 0.00%
postgres: username dbnamee ipaddress(35810) (postgres)
64951 pgsql 1 75 0 7375M 662M CPU1 1 0:01 11.47%
postgres: username dbnamee 127.0.0.1(32073) (postgres)
64950 pgsql 1 77 0 7325M 598M CPU6 6 0:02 16.55%
postgres: username dbnamee 127.0.0.1(44251) (postgres)
Out of memory:
kernel: swap_pager_getswapspace(4): failed
kernel: swap_pager_getswapspace(8): failed
kernel: swap_pager_getswapspace(3): failed
The main issue, IMHO (as far as I understand the FreeBSD memory system),
is that 20G are INACTIVE. When I subtract the shared memory, ~13GB
should still be available, but it remains allocated, just inactive
(INACTIVE). Inactive memory might be clean or dirty. Since we get into
out-of-memory situations, the memory is likely dirty (otherwise it
would have been reused).
Config:
Memory: 32GB, Swap: 512MB
maintenance_work_mem = 512MB
effective_cache_size = 10GB
work_mem = 892MB
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 7080MB
max_connections = 80
autovacuum_max_workers = 3
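As a rough sanity check on this config: work_mem can be used once per sort/hash node, per backend, so the theoretical worst case far exceeds the machine's RAM. A back-of-the-envelope sketch (the one-sort-per-connection assumption is illustrative and optimistic; complex queries can allocate work_mem several times each):

```shell
#!/bin/sh
# Worst-case memory demand for the posted settings (illustrative:
# assumes every one of the 80 allowed connections runs a single
# sort/hash that uses its full work_mem).
work_mem_mb=892
max_connections=80
shared_buffers_mb=7080
worst_mb=$((max_connections * work_mem_mb + shared_buffers_mb))
echo "worst case: ${worst_mb} MB vs 32768 MB of RAM"
```

Even this optimistic lower bound is more than double the installed 32GB.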
Thnx.
Ciao,
Gerhard
On 13.12.2015 08:49, Gerhard Wiesinger wrote:
<snip: original message quoted in full>
On Sun, 13 Dec 2015 09:57:21 +0100
Gerhard Wiesinger <lists@wiesinger.com> wrote:
some further details from the original FreeBSD 10.1 machine:
Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free
Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
77941 pgsql 5 20 0 7925M 7296M usem 2 352:34 6.98%
postgres: username dbnamee 127.0.0.1(43367) (postgres)
<snip>
I see no evidence of an actual leak here. Each process is basically using
the 7G of shared_buffers you have allocated in the config (which is only
7G _total_ for all processes, since it's shared memory).
Out of memory:
kernel: swap_pager_getswapspace(4): failed
kernel: swap_pager_getswapspace(8): failed
kernel: swap_pager_getswapspace(3): failed
The main issue, IMHO (as far as I understand the FreeBSD memory system),
is that 20G are INACTIVE. When I subtract the shared memory, ~13GB
should still be available, but it remains allocated, just inactive
(INACTIVE). Inactive memory might be clean or dirty. Since we get into
out-of-memory situations, the memory is likely dirty (otherwise it
would have been reused).
Not quite correct. Inactive memory is _always_ available for re-use.
Config:
Memory: 32GB, Swap: 512MB
Probably not the cause of this problem, but this is a non-optimal layout.
If you're going to use swap at all, it should generally be 2x the size of
your RAM for optimal performance under memory pressure.
maintenance_work_mem = 512MB
effective_cache_size = 10GB
work_mem = 892MB
I expect that this value is the cause of the problem. The scenario you
describe below is sorting a large table on an unindexed column, meaning
it will have to use all that work_mem. I'd be interested to see the
output of:
EXPLAIN ANALYZE SELECT * FROM t_random ORDER BY md5 LIMIT 100000;
But even without that information, I'd recommend you reduce work_mem
to about 16M or so.
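One hedged way to apply that suggestion (a sketch; the override value is illustrative): lower the default in postgresql.conf, and let the few sessions that genuinely need large in-memory sorts raise it for themselves, since work_mem can be changed per session:

```
work_mem = 16MB        # low, safe default (was 892MB)
```

A reporting session can still run SET work_mem = '256MB'; before its big query without affecting the idle workers.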
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 7080MB
max_connections = 80
autovacuum_max_workers = 3
[snip]
We are running PostgreSQL 9.4.5 on FreeBSD 10.1 and have multiple
worker processes connected via persistent connections to PostgreSQL,
they perform just simple queries with SELECT on primary keys and
simple INSERTS/UPDATES.
That's not at all the test scenario you show below. The scenario below
is a large sort operation on a non-indexed column, which is vastly
different from a single-row fetch based on an index.
Normally nearly all the workers are idle but
they still consume the maximum configured work mem on the PostgreSQL
server and the memory is also resident.
I see no evidence of that in your top output. Each process has a
reference to the 7G of shared_buffers you allocated, which is memory
shared by all processes, and is expected. I'm not as familiar with
Linux top, but the results appear to be the same.
If some other queries get in
we get into out of memory situations. So it looks like PostgreSQL has
memory leaks.
<snip: test scenario quoted in full>
--
Bill Moran
Hello Bill,
Thank you for your response, comments inline:
On 13.12.2015 16:05, Bill Moran wrote:
On Sun, 13 Dec 2015 09:57:21 +0100
Gerhard Wiesinger <lists@wiesinger.com> wrote:
some further details from the original FreeBSD 10.1 machine:
Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free
Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
77941 pgsql 5 20 0 7925M 7296M usem 2 352:34 6.98%
postgres: username dbnamee 127.0.0.1(43367) (postgres)
<snip>
I see no evidence of an actual leak here. Each process is basically using
the 7G of shared_buffers you have allocated in the config (which is only
7G _total_ for all processes, since it's shared memory)
OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed?
Out of memory:
kernel: swap_pager_getswapspace(4): failed
kernel: swap_pager_getswapspace(8): failed
kernel: swap_pager_getswapspace(3): failed
The main issue, IMHO (as far as I understand the FreeBSD memory system),
is that 20G are INACTIVE. When I subtract the shared memory, ~13GB
should still be available, but it remains allocated, just inactive
(INACTIVE). Inactive memory might be clean or dirty. Since we get into
out-of-memory situations, the memory is likely dirty (otherwise it
would have been reused).
Not quite correct. Inactive memory is _always_ available for re-use.
Are you sure that's true?
Monitoring inactive memory:
cat vm_stat.sh
#!/usr/bin/env bash
while true; do
    date +%Y.%m.%d.%H.%M.%S
    sysctl -a | grep 'vm.stats.vm.'
    sleep 1
done
Even when we run out of memory with swap_pager_getswapspace, inactive
memory (from the log file) stays at around 20GB (it doesn't go down or up):
vm.stats.vm.v_inactive_count: 5193530 (* 4k pages is around 20GB)
So we have 20GB of inactive memory, but we still get out of memory with
kernel: swap_pager_getswapspace(4): failed. Any ideas why?
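The page-count arithmetic above checks out (a quick sketch, assuming the standard 4 KiB page size):

```shell
#!/bin/sh
# Convert the sampled inactive page count to mebibytes (4 KiB pages).
inactive_pages=5193530
inactive_mb=$((inactive_pages * 4 / 1024))
echo "~${inactive_mb} MB inactive (~20 GB)"
```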
Config:
Memory: 32GB, Swap: 512MB
Probably not the cause of this problem, but this is a non-optimal layout.
If you're going to use swap at all, it should generally be 2x the size of
your RAM for optimal performance under memory pressure.
Yes, we will try to disable it entirely. Nevertheless, why do we run out
of memory/swap?
maintenance_work_mem = 512MB
effective_cache_size = 10GB
work_mem = 892MB
I expect that this value is the cause of the problem. The scenario you
describe below is sorting a large table on an unindexed column, meaning
it will have to use all that work_mem. I'd be interested to see the
output of:
EXPLAIN ANALYZE SELECT * FROM t_random ORDER BY md5 LIMIT 100000;
That was only a test query; it has nothing to do with the production
queries. Those are mostly SELECTs/INSERTs/UPDATEs on primary keys.
But even without that information, I'd recommend you reduce work_mem
to about 16M or so.
Why so low? Sorting for reporting or other long-running queries would
then be done on disk instead of in memory.
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 7080MB
max_connections = 80
autovacuum_max_workers = 3
[snip]
We are running PostgreSQL 9.4.5 on FreeBSD 10.1 and have multiple
worker processes connected via persistent connections to PostgreSQL,
they perform just simple queries with SELECT on primary keys and
simple INSERTS/UPDATES.
That's not at all the test scenario you show below. The scenario below
is a large sort operation on a non-indexed column, which is vastly
different from a single-row fetch based on an index.
Yes, that non-indexed select was just for testing purposes.
Normally nearly all the workers are idle but
they still consume the maximum configured work mem on the PostgreSQL
server and the memory is also resident.
I see no evidence of that in your top output. Each process has a
reference to the 7G of shared_buffers you allocated, which is memory
shared by all processes, and is expected. I'm not as familiar with
Linux top, but the results appear to be the same.
Yes, that might be the case, but if it is n times the same 7G of shared
memory, then we have ~20GB of inactive memory available, so plenty of
memory. So why are we getting: kernel: swap_pager_getswapspace(4): failed?
Thnx.
Ciao,
Gerhard
On Sun, 13 Dec 2015 16:35:08 +0100
Gerhard Wiesinger <lists@wiesinger.com> wrote:
Hello Bill,
Thank you for your response, comments inline:
On 13.12.2015 16:05, Bill Moran wrote:
On Sun, 13 Dec 2015 09:57:21 +0100
Gerhard Wiesinger <lists@wiesinger.com> wrote:
some further details from the original FreeBSD 10.1 machine:
Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free
Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
77941 pgsql 5 20 0 7925M 7296M usem 2 352:34 6.98%
postgres: username dbnamee 127.0.0.1(43367) (postgres)
<snip>
I see no evidence of an actual leak here. Each process is basically using
the 7G of shared_buffers you have allocated in the config (which is only
7G _total_ for all processes, since it's shared memory).
OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed?
You haven't provided enough information to isolate that cause yet. What's
in the Postgres log? Surely it will have logged something when its request
for RAM was denied, and it should be more informational than the OS'
generic message.
Out of memory:
kernel: swap_pager_getswapspace(4): failed
kernel: swap_pager_getswapspace(8): failed
kernel: swap_pager_getswapspace(3): failed
The main issue, IMHO (as far as I understand the FreeBSD memory system),
is that 20G are INACTIVE. When I subtract the shared memory, ~13GB
should still be available, but it remains allocated, just inactive
(INACTIVE). Inactive memory might be clean or dirty. Since we get into
out-of-memory situations, the memory is likely dirty (otherwise it
would have been reused).
Not quite correct. Inactive memory is _always_ available for re-use.
Are you sure that's true?
Yes. Read The Design and Implementation of FreeBSD for the details.
Monitoring inactive memory:
cat vm_stat.sh
#!/usr/bin/env bash
while true; do
    date +%Y.%m.%d.%H.%M.%S
    sysctl -a | grep 'vm.stats.vm.'
    sleep 1
done
Even when we run out of memory with swap_pager_getswapspace, inactive
memory (from the log file) stays at around 20GB (it doesn't go down or up):
vm.stats.vm.v_inactive_count: 5193530 (* 4k pages is around 20GB)
So we have 20GB of inactive memory, but we still get out of memory with
kernel: swap_pager_getswapspace(4): failed. Any ideas why?
Theory: If the planner decides it needs to do 30 sort operations for a
query, it will try to allocate 27G of RAM, which exceeds what's available,
and therefore never gets allocated. So you get the "out of space" message,
but the actual memory usage doesn't change.
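The arithmetic behind that theory (a sketch; 30 concurrent sort allocations is Bill's hypothetical number, and the planner reserving work_mem per sort node is the stated assumption):

```shell
#!/bin/sh
# 30 hypothetical sort nodes, each entitled to the full work_mem of 892MB.
sorts=30
work_mem_mb=892
total_mb=$((sorts * work_mem_mb))
echo "${total_mb} MB (~26 GB) requested at once"
```

That lands in the same ballpark as the ~27G figure above, and comfortably exceeds the RAM left after shared_buffers.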
maintenance_work_mem = 512MB
effective_cache_size = 10GB
work_mem = 892MB
I expect that this value is the cause of the problem. The scenario you
describe below is sorting a large table on an unindexed column, meaning
it will have to use all that work_mem. I'd be interested to see the
output of:
EXPLAIN ANALYZE SELECT * FROM t_random ORDER BY md5 LIMIT 100000;
That was only a test query; it has nothing to do with the production
queries. Those are mostly SELECTs/INSERTs/UPDATEs on primary keys.
Then provide the _actual_ queries and the EXPLAIN ANALYZE and the table
layouts and basic data distribution of the actual cause. If your test
case is completely non-representative of what's happening, then you're
not going to get useful answers.
But even without that information, I'd recommend you reduce work_mem
to about 16M or so.
Why so low? Sorting for reporting or other long-running queries would
then be done on disk instead of in memory.
Even a simple query could involve multiple sorts, and you're allowing
each sort to use up to 890M of RAM (which is _not_ shared). As noted
earlier, even a moderately complex query could exceed the available
RAM on the system. But since you don't provide the actual queries and
tables causing problems, I can only guess. And since you appear to
have already decided what the cause of the problem is, then crafted
completely non-relevent queries that you think prove your point, I'm
not sure there's anything I can do to help you.
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 7080MB
max_connections = 80
autovacuum_max_workers = 3
[snip]
We are running PostgreSQL 9.4.5 on FreeBSD 10.1 and have multiple
worker processes connected via persistent connections to PostgreSQL,
they perform just simple queries with SELECT on primary keys and
simple INSERTS/UPDATES.
That's not at all the test scenario you show below. The scenario below
is a large sort operation on a non-indexed column, which is vastly
different from a single-row fetch based on an index.
Yes, that non-indexed select was just for testing purposes.
Completely useless. I can always craft some sort of nonsense case that
will show that something that no sane person would ever do doesn't work.
Especially if I set configuration limits (such as work_mem) to
unreasonably high values.
Normally nearly all the workers are idle but
they still consume the maximum configured work mem on the PostgreSQL
server and the memory is also resident.
I see no evidence of that in your top output. Each process has a
reference to the 7G of shared_buffers you allocated, which is memory
shared by all processes, and is expected. I'm not as familiar with
Linux top, but the results appear to be the same.
Yes, that might be the case, but if it is n times the same 7G of shared
memory, then we have ~20GB of inactive memory available, so plenty of
memory. So why are we getting: kernel: swap_pager_getswapspace(4): failed?
My advice:
First, set work_mem to 16M and see if the problem goes away. If you
still want to investigate exactly what was causing the problem after
that, then post the _actual_ scenario that's causing it. I find it
_extremely_ unlikely that it's being caused by SELECT * FROM table
WHERE indexed_column = ?. And as a result, you should gather more
information on which queries are actually failing to execute. If you've
got a lot of data and are doing even moderately complex JOINs, you
could easily be allocating many gigs of sort space for each query.
With only 32G of RAM, a few dozen connections could quickly exhaust
that.
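One low-risk way to gather that information (a sketch, assuming access to postgresql.conf; both settings exist in 9.4): log every temp file the server writes, which exposes sorts and hashes that spill past work_mem, together with the statements that caused them:

```
log_temp_files = 0                 # log every temp file, with its size
log_min_duration_statement = 250   # log statements slower than 250 ms
```

After a reload, the server log should name the specific queries doing large sorts.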
--
Bill Moran
Gerhard Wiesinger <lists@wiesinger.com> writes:
Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free
Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse
OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed?
Just judging from the name of the function, I would bet this is a direct
result of having only 512M of swap configured. As Bill already pointed
out, that's a pretty useless choice on a system with 32G of RAM. As soon
as the kernel tries to push out any significant amount of idle processes,
it's gonna be out of swap space. The numbers you show above prove that
it is almost out of free swap already.
Also, while that 20G of "inactive" pages may be candidates for reuse,
they probably can't actually be reused without swapping them out ...
and there's noplace for that data to go.
regards, tom lane
On 13.12.2015 18:17, Tom Lane wrote:
Gerhard Wiesinger <lists@wiesinger.com> writes:
Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free
Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse
OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed?
Just judging from the name of the function, I would bet this is a direct
result of having only 512M of swap configured. As Bill already pointed
out, that's a pretty useless choice on a system with 32G of RAM. As soon
as the kernel tries to push out any significant amount of idle processes,
it's gonna be out of swap space. The numbers you show above prove that
it is almost out of free swap already.
The system wasn't designed by me; I wouldn't have done it that way
either. Would swapoff help?
Also, while that 20G of "inactive" pages may be candidates for reuse,
they probably can't actually be reused without swapping them out ...
and there's noplace for that data to go.
There is nothing in syslog (where Postgres logs) at the time
swap_pager_getswapspace is logged.
But why do we have 20G of inactive pages? They are still allocated by
the kernel or user space. As you can see below (top output), the
non-Postgres processes are around 9G in virtual size, resident even
lower. The system is nearly idle, and the queries are typically no
longer active after one second. So where does the rest of the ~11G of
inactive pages come from (if it isn't a Postgres/FreeBSD memory leak)?
I read that Postgres has its own memory allocator:
https://www.reddit.com/r/programming/comments/18zija/github_got_30_better_performance_using_tcmalloc/
Might that be an issue with double allocation/freeing, combined with the
"cheese hole" problem of memory fragmentation?
https://www.opennet.ru/base/dev/fbsdvm.txt.html
inactive pages not actively used by programs which are
dirty and (at some point) need to be written
to their backing store (typically disk).
These pages are still associated with objects and
can be reclaimed if a program references them.
Pages can be moved from the active to the inactive
queue at any time with little adverse effect.
Moving pages to the cache queue has bigger
consequences (note 1)
https://unix.stackexchange.com/questions/134862/what-do-the-different-memory-counters-in-freebsd-mean
Active: Memory currently being used by a process
Inactive: Memory that has been freed but is still cached since it
may be used again. If more Free memory is required, this memory can be
cleared and become free. This memory is not cleared before it is needed,
because "free memory is wasted memory", it doesn't cost anything to keep
the old data around in case it is needed again.
Wired: Memory in use by the Kernel. This memory cannot be swapped
out (GW: including ZFS cache!!!)
Cache: Memory being used to cache data, can be freed immediately if
required
Buffers: Disk cache
Free: Memory that is completely free and ready to use. Inactive,
Cache and Buffers can become free if they are cleaned up.
Thnx.
Ciao,
Gerhard
================================================================================================================================================================
last pid: 7277; load averages: 0.91, 0.96, 1.02 up 18+06:22:31
18:57:54
135 processes: 2 running, 132 sleeping, 1 waiting
Mem: 8020M Active, 19G Inact, 3537M Wired, 299M Cache, 1679M Buf, 38M Free
Swap: 512M Total, 501M Used, 12M Free, 97% Inuse
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
77941 pgsql 5 20 0 7921M 7295M usem 7 404:32 10.25%
postgres
79570 pgsql 1 20 0 7367M 6968M sbwait 6 4:24 0.59% postgres
73449 pgsql 1 27 0 7367M 6908M sbwait 4 8:08 5.08% postgres
74209 pgsql 1 27 0 7367M 6803M sbwait 0 6:37 1.46% postgres
74207 pgsql 1 25 0 7367M 6748M sbwait 6 6:34 1.76% postgres
74206 pgsql 1 22 0 7367M 6548M sbwait 5 6:44 1.86% postgres
73380 pgsql 1 23 0 7367M 7265M sbwait 4 8:15 1.17% postgres
74208 pgsql 1 24 0 7367M 7223M sbwait 1 6:30 4.59% postgres
79569 pgsql 1 24 0 7367M 7105M sbwait 3 4:36 1.17% postgres
74210 pgsql 1 29 0 7363M 7182M sbwait 5 6:41 5.47% postgres
73479 pgsql 1 22 0 7363M 6560M sbwait 6 7:14 3.56% postgres
83030 pgsql 1 20 0 7329M 193M sbwait 5 0:00 0.00% postgres
76178 pgsql 1 20 0 7323M 7245M sbwait 2 3:44 0.00% postgres
75867 pgsql 1 20 0 7323M 7245M sbwait 2 3:45 0.00% postgres
75869 pgsql 1 20 0 7323M 7245M sbwait 2 3:46 0.00% postgres
75883 pgsql 1 20 0 7323M 7245M sbwait 7 3:46 0.00% postgres
76180 pgsql 1 20 0 7323M 7245M sbwait 7 3:45 0.00% postgres
75868 pgsql 1 20 0 7323M 7245M sbwait 6 3:46 0.00% postgres
76177 pgsql 1 20 0 7323M 7245M sbwait 3 3:45 0.00% postgres
76179 pgsql 1 20 0 7323M 7245M sbwait 3 3:44 0.00% postgres
75884 pgsql 1 20 0 7323M 7245M sbwait 7 3:46 0.00% postgres
76181 pgsql 1 20 0 7323M 7245M sbwait 7 3:44 0.00% postgres
75862 pgsql 1 20 0 7321M 7266M select 0 8:00 0.00% postgres
75863 pgsql 1 20 0 7321M 7261M select 4 4:04 0.00% postgres
4361 pgsql 1 20 0 7321M 206M sbwait 2 0:00 0.00%
postgres
7021 pgsql 1 20 0 7321M 181M sbwait 0 0:00 0.00%
postgres
83028 pgsql 1 20 0 7321M 180M sbwait 3 0:00 0.00% postgres
97964 pgsql 1 20 0 7321M 175M sbwait 4 0:00 0.00% postgres
97963 pgsql 1 20 0 7321M 175M sbwait 0 0:01 0.00% postgres
75865 pgsql 1 20 0 7321M 173M select 4 0:05 0.00% postgres
75864 pgsql 1 20 0 7321M 173M select 2 0:54 0.00% postgres
593 pgsql 1 21 0 7317M 173M select 4 1:50 0.00%
postgres
32387 myusername 9 20 0 980M 375M uwait 5 69:03 1.27% node
71419 myusername 9 20 0 798M 146M uwait 1 0:40 0.00% node
77938 myusername 9 20 0 795M 115M uwait 7 691:07 15.58% node
831 myusername 9 20 0 628M 30076K uwait 7 6:28 0.00% node
77937 myusername 5 20 0 618M 21352K usem 4 0:00 0.00% node
71418 myusername 5 20 0 618M 18000K usem 0 0:00 0.00% node
622 myusername 1 20 0 261M 3388K kqread 3 41:01 0.00% nginx
623 myusername 1 20 0 261M 2528K kqread 3 0:04 0.00% nginx
621 root 1 20 0 261M 2448K pause 0 0:00 0.00% nginx
604 myusername 1 20 0 235M 12144K kqread 1 1:27 0.00% uwsgi
602 myusername 1 20 0 232M 11592K kqread 0 1:11 0.00% uwsgi
600 myusername 1 20 0 231M 8736K kqread 6 0:58 0.00% uwsgi
598 myusername 1 20 0 230M 13196K kqread 0 0:17 0.00% uwsgi
608 myusername 1 20 0 227M 11204K kqread 7 0:46 0.00% uwsgi
606 myusername 1 20 0 227M 10328K kqread 1 0:59 0.00% uwsgi
610 myusername 1 20 0 227M 11656K kqread 0 1:09 0.00% uwsgi
603 myusername 1 20 0 215M 21048K kqread 4 4:35 0.00% uwsgi
599 myusername 1 20 0 215M 19756K kqread 5 0:52 0.00% uwsgi
605 myusername 1 20 0 215M 19496K kqread 4 4:01 0.00% uwsgi
597 myusername 1 20 0 215M 19284K kqread 0 3:21 0.00% uwsgi
609 myusername 1 20 0 215M 19304K kqread 1 3:10 0.00% uwsgi
601 myusername 1 20 0 215M 17208K kqread 2 4:47 0.00% uwsgi
607 myusername 1 20 0 214M 17912K kqread 5 1:44 0.00% uwsgi
595 myusername 1 20 0 213M 12224K kqread 4 0:01 0.00% uwsgi
596 myusername 1 20 0 211M 4112K kqread 6 0:00 0.00% uwsgi
532 myusername 1 20 0 100M 4036K kqread 0 0:51 0.00% uwsgi
538 myusername 1 20 0 100M 3556K kqread 1 0:51 0.00% uwsgi
92090 myusername 1 20 0 86480K 5576K select 2 0:00 0.00% sshd
84665 myusername 1 20 0 86480K 5460K select 4 0:00 0.00% sshd
84249 myusername 1 20 0 86480K 5456K select 1 0:00 0.00% sshd
72867 myusername 1 20 0 86480K 5348K select 2 0:00 0.00% sshd
75956 myusername 1 20 0 86480K 5328K select 7 0:00 0.00% sshd
92075 root 1 21 0 86480K 5220K select 7 0:00 0.00% sshd
84653 root 1 21 0 86480K 5132K select 7 0:00 0.00% sshd
84175 root 1 20 0 86480K 5124K select 7 0:00 0.00% sshd
72863 root 1 20 0 86480K 4980K select 4 0:00 0.00% sshd
75950 root 1 20 0 86480K 4952K select 5 0:00 0.00% sshd
653 root 1 20 0 61212K 1756K select 7 0:03 0.00% sshd
92133 myusername 1 20 0 47720K 2100K wait 4 0:00 0.00% su
84828 myusername 1 20 0 47720K 2068K wait 4 0:00 0.00% su
84275 myusername 1 20 0 47720K 2068K wait 3 0:00 0.00% su
83806 myusername 1 20 0 47720K 2052K wait 0 0:00 0.00% su
72875 myusername 1 20 0 47720K 1968K wait 2 0:00 0.00% su
83852 root 1 20 0 47708K 2032K wait 2 0:00 0.00% su
84923 root 1 20 0 47708K 2032K wait 7 0:00 0.00% su
6080 myusername 1 20 0 46496K 10540K select 7 11:47 0.00% tmux
75866 pgsql 1 20 0 37304K 4964K select 1 1:56 0.00% postgres
360 unbound 1 20 0 33016K 1652K select 6 0:00 0.00% unbound
68658 myusername 1 20 0 28408K 2300K ttyin 0 0:00 0.00% psql
656 root 1 20 0 24112K 2104K select 2 0:13 0.00% sendmail
659 smmsp 1 20 0 24112K 1932K pause 1 0:00 0.00% sendmail
92160 root 1 20 0 23572K 3560K ttyin 2 0:00 0.00% csh
84847 root 1 20 0 23572K 3540K pause 4 0:00 0.00% csh
83809 root 1 20 0 23572K 3516K pause 7 0:00 0.00% csh
84294 root 1 20 0 23572K 3424K pause 6 0:00 0.00% csh
72876 root 1 43 0 23572K 3316K ttyin 0 0:00 0.00% csh
7276 root 1 20 0 21920K 2264K CPU4 4 0:00 0.00% top
83853 pgsql 1 47 0 17068K 2404K ttyin 6 0:00 0.00% sh
84924 pgsql 1 20 0 17068K 2384K wait 5 0:00 0.00% sh
580 _ntp 1 20 -20 16740K 1056K select 2 0:00 0.00% ntpd
579 root 1 20 -20 16732K 1036K select 6 0:00 0.00% ntpd
582 _ntp 1 52 0 16732K 904K select 4 0:00 0.00% ntpd
663 root 1 20 0 16596K 1064K nanslp 5 0:06 0.00% cron
92095 myusername 1 20 0 15668K 2940K wait 7 0:00 0.00% bash
84255 myusername 1 20 0 15668K 2932K wait 6 0:00 0.00% bash
84675 myusername 1 20 0 15668K 2932K wait 3 0:00 0.00% bash
75957 myusername 1 20 0 15668K 2820K wait 6 0:00 0.00% bash
72869 myusername 1 20 0 15668K 2772K wait 6 0:00 0.00% bash
68644 myusername 1 20 0 15668K 1620K wait 2 0:00 0.00% bash
7109 myusername 1 20 0 15668K 1000K wait 3 0:00 0.00% bash
7317 myusername 1 31 0 15668K 956K ttyin 2 0:00 0.00% bash
6120 myusername 1 20 0 15668K 868K wait 4 0:00 0.00% bash
32681 myusername 1 20 0 15668K 780K ttyin 2 0:00 0.00% bash
84934 pgsql 1 52 0 14716K 2556K nanslp 3 0:02 0.00% cmdwatch
405 root 1 20 0 14504K 984K select 5 68:40 0.00% syslogd
712 root 1 52 0 14492K 808K ttyin 1 0:00 0.00% getty
711 root 1 52 0 14492K 808K ttyin 5 0:00 0.00% getty
716 root 1 52 0 14492K 808K ttyin 2 0:00 0.00% getty
710 root 1 52 0 14492K 808K ttyin 7 0:00 0.00% getty
714 root 1 52 0 14492K 808K ttyin 5 0:00 0.00% getty
713 root 1 52 0 14492K 808K ttyin 0 0:00 0.00% getty
715 root 1 52 0 14492K 808K ttyin 3 0:00 0.00% getty
709 root 1 52 0 14492K 808K ttyin 2 0:00 0.00% getty
316 root 1 20 0 13164K 508K select 4 0:00 0.00% devd
7277 root 1 20 0 12372K 1760K piperd 1 0:00 0.00% sed
1 root 1 20 0 9472K 116K wait 4 0:00 0.00% init
12 root 18 -84 - 0K 288K WAIT 0 91:43 0.00% intr
0 root 12 -8 0 0K 192K - 0 131:24 0.00% kernel
15 root 12 -68 - 0K 192K - 7 1:16 0.00% usb
13 root 3 -8 - 0K 48K - 1 64:25 0.00% geom
2 root 3 -16 - 0K 48K - 0 49:12 0.00% cam
9 root 2 20 - 0K 32K sdflus 7 1:43 0.00% bufdaemon
6 root 1 -16 - 0K 16K psleep 1 149:47 0.00% pagedaemon
5 root 1 -8 - 0K 16K m:w1 5 45:03 0.00% g_mirror gm0
14 root 1 -16 - 0K 16K - 3 31:12 0.00% rand_harvestq
17 root 1 16 - 0K 16K syncer 5 10:25 0.00% syncer
16 root 1 -16 - 0K 16K vlruwt 5 0:03 0.00% vnlru
4 root 1 -16 - 0K 16K idle 7 0:00 0.00% enc_daemon0
8 root 1 155 ki31 0K 16K pgzero 7 0:00 0.00% pagezero
3 root 1 -16 - 0K 16K waitin 7 0:00 0.00% sctp_iterator
7 root 1 -16 - 0K 16K psleep 2 0:00 0.00% vmdaemon
10 root 1 -16 - 0K 16K audit_ 0 0:00 0.00% audit
================================================================================================================================================================
#!/usr/bin/env bash
# Log non-idle backends once per second
while true; do
    date +%Y.%m.%d.%H.%M.%S
    echo "SELECT pid, state, now() - xact_start AS time FROM pg_stat_activity
          WHERE state <> 'idle' AND pg_backend_pid() <> pid ORDER BY pid" \
        | psql postgres
    sleep 1
done
2015.12.13.19.54.34
pid | state | time
-------+--------+-----------------
76179 | active | 00:00:00.001182
(1 row)
2015.12.13.19.54.35
pid | state | time
-----+-------+------
(0 rows)
2015.12.13.19.54.36
pid | state | time
-------+--------+------------------
73479 | active | 00:00:00.115812
75868 | active | -00:00:00.000506
(2 rows)
2015.12.13.19.54.37
pid | state | time
-----+-------+------
(0 rows)
2015.12.13.19.54.38
pid | state | time
-------+--------+-----------------
74206 | active | 00:00:00.750855
(1 row)
2015.12.13.19.54.39
pid | state | time
-------+--------+-----------------
77941 | active | 00:00:00.003457
79569 | active | 00:00:00.061027
(2 rows)
2015.12.13.19.54.40
pid | state | time
-------+--------+---------------
74207 | active | 00:00:00.2614
(1 row)
================================================================================================================================================================
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Sun, 13 Dec 2015 20:09:04 +0100
Gerhard Wiesinger <lists@wiesinger.com> wrote:
On 13.12.2015 18:17, Tom Lane wrote:
Gerhard Wiesinger <lists@wiesinger.com> writes:
Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free
Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse
OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed?
Just judging from the name of the function, I would bet this is a direct
result of having only 512M of swap configured. As Bill already pointed
out, that's a pretty useless choice on a system with 32G of RAM. As soon
as the kernel tries to push out any significant amount of idle processes,
it's gonna be out of swap space. The numbers you show above prove that
it is almost out of free swap already.
The system wasn't designed by me, I wouldn't do it either that way. Does
swapoff help?
FreeBSD and Linux (and most modern OS) are designed to have swap,
and usually more swap than RAM. I have never heard a good reason for
not using swap, and the reasons I _have_ heard have always been by
people misinformed about how the OS works.
If someone has a _good_ explanation for why you wouldn't want any
swap on a DB server, I'd love to hear it; but everything I've heard
up till now has been speculation based on misinformation.
IOW: no, you should not turn swap off, you should instead allocate
the appropriate amount of swap space.
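A minimal sketch of what "allocate the appropriate amount of swap" can look like on FreeBSD 10.x, assuming a file-backed swap device; the path /usr/swap0, the md unit number, and the 32G size are illustrative choices, not anything from this thread:

```shell
# One-time provisioning (as root): create and enable a 32G swap file,
# roughly 1x RAM on this 32G box. The size is a judgment call, not a rule.
dd if=/dev/zero of=/usr/swap0 bs=1m count=32768
chmod 0600 /usr/swap0
mdconfig -a -t vnode -f /usr/swap0 -u 99   # attach as memory disk md99
swapon /dev/md99

# Persistent equivalent: one line in /etc/fstab (FreeBSD 10.x syntax):
# md99  none  swap  sw,file=/usr/swap0  0  0

swapinfo -h   # verify the new device is in use
```

A dedicated swap partition avoids the small md-layer overhead, but on an already-provisioned machine a file-backed device is the low-disruption option.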
Also, while that 20G of "inactive" pages may be candidates for reuse,
they probably can't actually be reused without swapping them out ...
and there's no place for that data to go.
There is nothing in syslog (where Postgres logs) at the times when
swap_pager_getswapspace is logged.
But why do we have 20G of Inactive pages? They are still allocated by
kernel or user space. As you can see below (top output), non-Postgres
processes are around 9G in virtual size, and even less resident. The system
is nearly idle, and the queries typically aren't active after one second
again. So where does the rest of the 11G of Inactive pages come
from (if it isn't a Postgres/FreeBSD memory leak)?
I read that Postgres has its own memory allocator:
https://www.reddit.com/r/programming/comments/18zija/github_got_30_better_performance_using_tcmalloc/
Might that be an issue with double allocation/freeing, or with
"cheese hole"-style memory fragmentation?
If there were a memory leak in either FreeBSD or Postgres of the
seriousness you're describing that were as easy to trigger as you
claim, I would expect the mailing lists and other support forums
to be exploding in panic. Notice that they are not. Also, I still
don't see _ANY_ evidence of a leak. I see evidence that something
is happening that is trying to allocate a LOT of RAM, that isn't
available on your system; but that's not the same as a leak.
https://www.opennet.ru/base/dev/fbsdvm.txt.html
inactive pages not actively used by programs which are
dirty and (at some point) need to be written
to their backing store (typically disk).
These pages are still associated with objects and
can be reclaimed if a program references them.
Pages can be moved from the active to the inactive
queue at any time with little adverse effect.
Moving pages to the cache queue has bigger
consequences (note 1)
Correct, but, when under pressure, the system _will_ recycle those
pages to be available.
Tom might be correct in that the system thinks they are inactive
because it could easily push them out to swap, but then it can't
_actually_ do that because you haven't allocated enough swap, but
that doesn't match my understanding of how inactive is used. A
question of that detail would be better asked on a FreeBSD forum,
as the differences between different VM implementations can be
pretty specific and technical.
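One way to see how FreeBSD is actually distributing pages across the queues, rather than eyeballing top: a sketch reading the vm.stats.vm.* sysctl counters (names as on FreeBSD 10.x; they report pages, not bytes). The fallbacks are only there so the script degrades gracefully on other systems:

```shell
#!/bin/sh
# Dump FreeBSD's page-queue sizes in MB via sysctl.
pages_to_mb() {
    # $1 = page count, $2 = page size in bytes
    awk -v n="$1" -v sz="$2" 'BEGIN { printf "%d", n * sz / (1024 * 1024) }'
}

psz=$(sysctl -n vm.stats.vm.v_page_size 2>/dev/null || echo 4096)
for q in active inactive wire cache free; do
    n=$(sysctl -n "vm.stats.vm.v_${q}_count" 2>/dev/null || echo 0)
    echo "${q}: $(pages_to_mb "$n" "$psz") MB"
done
```

Sampling this once a second, the way the pg_stat_activity script below does, would show whether the Inactive queue grows in step with query activity or with something else.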
[snip]
Mem: 8020M Active, 19G Inact, 3537M Wired, 299M Cache, 1679M Buf, 38M Free
Swap: 512M Total, 501M Used, 12M Free, 97% Inuse
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
77941 pgsql 5 20 0 7921M 7295M usem 7 404:32 10.25% postgres
79570 pgsql 1 20 0 7367M 6968M sbwait 6 4:24 0.59% postgres
[snip about 30 identical PG processes]
32387 myusername 9 20 0 980M 375M uwait 5 69:03 1.27% node
[snip similar processes]
622 myusername 1 20 0 261M 3388K kqread 3 41:01 0.00% nginx
[snip similar processes]
Wait ... this is a combined HTTP/Postgres server? You didn't mention that
earlier, and it's kind of important.
What evidence do you have that Postgres is actually the part of
this system running out of memory? I don't see any such evidence in any of
your emails, and (based on experience) I find it pretty likely that whatever
is running under node is doing something in a horrifically memory-inefficient
manner. Since you mention that you see nothing in the PG logs, that makes it
even more likely (to me) that you're looking entirely in the wrong place.
I'd be willing to bet a steak dinner that if you put the web server on a
different server than the DB, that the memory problems would follow the
web server and not the DB server.
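Short of moving the web server to another box, the evidence being asked for here can be approximated by attributing resident memory per command name; a sketch (`ps -axo rss=,comm=` works on both FreeBSD and Linux, RSS in KB):

```shell
# Sum resident memory per command name, largest first.
ps -axo rss=,comm= | awk '
    { kb[$2] += $1 }
    END { for (c in kb) printf "%8.1f MB  %s\n", kb[c] / 1024, c }
' | sort -rn
```

One caveat: each postgres backend's RSS includes the pages of shared_buffers it has touched, so summing RSS across 30 persistent backends counts the same shared memory many times over. A large "postgres" total from this kind of tally is therefore not, by itself, evidence of a leak.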
--
Bill Moran
On 13.12.2015 21:14, Bill Moran wrote:
Wait ... this is a combined HTTP/Postgres server? You didn't mention that
earlier, and it's kind of important.
What evidence do you have that Postgres is actually the part of
this system running out of memory?
For me the complete picture doesn't look consistent.
I don't see any such evidence in any of
your emails, and (based on experience) I find it pretty likely that whatever
is running under node is doing something in a horrifically memory-inefficient
manner. Since you mention that you see nothing in the PG logs, that makes it
even more likely (to me) that you're looking entirely in the wrong place.
I'd be willing to bet a steak dinner that if you put the web server on a
different server than the DB, that the memory problems would follow the
web server and not the DB server.
Changes in config:
track_activity_query_size = 102400
work_mem = 100MB
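Worth noting alongside that config change: work_mem is a per-sort/per-hash limit for each backend, not a per-connection cap, so the theoretical ceiling can exceed max_connections times work_mem. A back-of-envelope sketch, where all three input values are assumptions for illustration, not settings from this thread:

```shell
# Rough worst-case bound for work_mem consumption.
max_connections=100
work_mem_mb=100
nodes_per_query=2   # concurrent sort/hash nodes per active query (a guess)
echo "worst case: $(( max_connections * work_mem_mb * nodes_per_query )) MB"
# -> worst case: 20000 MB
```

On a 32G box that also runs node and uwsgi, a bound in that range leaves little headroom once shared_buffers and the OS caches are added in.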
OK, we restarted PostgreSQL, keeping it stopped for a few seconds, and logged
top every second:
When PostgreSQL was down, nearly all memory was freed; that looks good to me.
So it is likely that node and other processes are not the cause.
Mem: 742M Active, 358M Inact, 1420M Wired, 21G Cache, 871M Buf, 8110M Free
Swap: 512M Total, 477M Used, 35M Free, 93% Inuse
When PostgreSQL restarted, Inactive was growing fast (~1min):
Mem: 7998M Active, 18G Inact, 2763M Wired, 1766M Cache, 1889M Buf, 1041M Free
Swap: 512M Total, 472M Used, 41M Free, 92% Inuse
After a few minutes we are back again at the same situation:
Mem: 8073M Active, 20G Inact, 2527M Wired, 817M Cache, 1677M Buf, 268M Free
Swap: 512M Total, 472M Used, 41M Free, 92% Inuse
The steak dinner is mine :-) Donating to the PostgreSQL community :-)
Any further ideas? I don't think this is normal system behaviour.
Ciao,
Gerhard
On Sun, 13 Dec 2015 22:23:19 +0100
Gerhard Wiesinger <lists@wiesinger.com> wrote:
On 13.12.2015 21:14, Bill Moran wrote:
Wait ... this is a combined HTTP/Postgres server? You didn't mention that
earlier, and it's kind of important.
What evidence do you have that Postgres is actually the part of
this system running out of memory?
For me the complete picture doesn't look consistent.
That's because you haven't gathered enough of the right type of information.
I don't see any such evidence in any of
your emails, and (based on experience) I find it pretty likely that whatever
is running under node is doing something in a horrifically memory-inefficient
manner. Since you mention that you see nothing in the PG logs, that makes it
even more likely (to me) that you're looking entirely in the wrong place.
I'd be willing to bet a steak dinner that if you put the web server on a
different server than the DB, that the memory problems would follow the
web server and not the DB server.
Changes in config:
track_activity_query_size = 102400
work_mem = 100MB
OK, we restarted PostgreSQL, keeping it stopped for a few seconds, and logged
top every second:
When PostgreSQL was down, nearly all memory was freed; that looks good to me.
So it is likely that node and other processes are not the cause.
Mem: 742M Active, 358M Inact, 1420M Wired, 21G Cache, 871M Buf, 8110M Free
Swap: 512M Total, 477M Used, 35M Free, 93% Inuse
When PostgreSQL restarted, Inactive was growing fast (~1min):
Mem: 7998M Active, 18G Inact, 2763M Wired, 1766M Cache, 1889M Buf, 1041M Free
Swap: 512M Total, 472M Used, 41M Free, 92% Inuse
After a few minutes we are back again at the same situation:
Mem: 8073M Active, 20G Inact, 2527M Wired, 817M Cache, 1677M Buf, 268M Free
Swap: 512M Total, 472M Used, 41M Free, 92% Inuse
The steak dinner is mine :-) Donating to the PostgreSQL community :-)
Based on the fact that the inactive memory increased? Your understanding
of inactive memory in FreeBSD is incorrect. Those pages are probably DB
pages that the OS is keeping in inactive memory because Postgres requests
them over and over, which is what the OS is supposed to do to ensure the
best performance. Are you seeing any out of swap space errors? Even if
you are, you still haven't determined if the problem is the result of
Postgres or the node.js stuff you have running. I don't know what node.js
might be caching on the client side ... do you?
No. Until you can actually report back something other than wild
speculation, I'll keep that steak dinner for myself. Besides, that bet
was based on you putting the PG server on separate hardware from the
web server, which you didn't do.
--
Bill Moran
On 12/13/2015 01:23 PM, Gerhard Wiesinger wrote:
On 13.12.2015 21:14, Bill Moran wrote:
Wait ... this is a combined HTTP/Postgres server? You didn't mention that
earlier, and it's kind of important.
What evidence do you have that Postgres is actually the part of
this system running out of memory?
For me the complete picture doesn't look consistent.
I don't see any such evidence in any of
your emails, and (based on experience) I find it pretty likely that whatever
is running under node is doing something in a horrifically memory-inefficient
manner. Since you mention that you see nothing in the PG logs, that makes it
even more likely (to me) that you're looking entirely in the wrong place.
I'd be willing to bet a steak dinner that if you put the web server on a
different server than the DB, that the memory problems would follow the
web server and not the DB server.
Changes in config:
track_activity_query_size = 102400
work_mem = 100MB
OK, we restarted PostgreSQL, keeping it stopped for a few seconds, and logged
top every second:
When PostgreSQL was down, nearly all memory was freed; that looks good to me.
So it is likely that node and other processes are not the cause.
Mem: 742M Active, 358M Inact, 1420M Wired, 21G Cache, 871M Buf, 8110M Free
Swap: 512M Total, 477M Used, 35M Free, 93% Inuse
When PostgreSQL restarted, Inactive was growing fast (~1min):
Mem: 7998M Active, 18G Inact, 2763M Wired, 1766M Cache, 1889M Buf, 1041M Free
Swap: 512M Total, 472M Used, 41M Free, 92% Inuse
After a few minutes we are back again at the same situation:
Mem: 8073M Active, 20G Inact, 2527M Wired, 817M Cache, 1677M Buf, 268M Free
Swap: 512M Total, 472M Used, 41M Free, 92% Inuse
The steak dinner is mine :-) Donating to the PostgreSQL community :-)
To me, all the above proves is that this is a complete-system issue: only
with all the parts running do you get a problem. It still does not indicate
which part, or which interaction of parts, is the issue. This is further
muddied by there being no description of what, if anything, you were doing
in the above scenarios.
Any further ideas? I don't think this is normal system behaviour.
Ciao,
Gerhard
--
Adrian Klaver
adrian.klaver@aklaver.com