Server side backend permanent session memory usage ?

Started by Day, David · over 9 years ago · 5 messages · general
#1 Day, David
dday@redcom.com

Hi,

There seems to be an unbounded growth of memory usage by the backend postgres process representing a "permanent" session in our system.

The SIZE/RES values reported by the "top" utility seem to exceed what I would expect given the memory and buffer tuning parameters in postgresql.conf. (Though I find a theoretical per-session maximum a bit difficult to nail down from the documentation.)

I have seen other reports of session backend memory usage on Stack Overflow, where the workaround amounted to periodically dropping and restoring the pool's connections. Otherwise, over time (many days), the connections' backends collectively keep accumulating memory, swap space starts coming into play, and system performance takes a big hit.

I have implemented the session-refresh concept and begun testing for improvement, but I at least wanted to report this and get some feedback on my apparent lack of understanding of what is going on under the hood here.

Some system notes:
FreeBSD 10.x
Postgres 9.3.5 and 9.5 systems
Hardware: 2 GB / 2-core and 4 GB / 4-core memory

There are no memory-related errors reported in the system logs or Postgres logs.
There are no outstanding uncommitted transactions (per pgAdmin III views).
The offered load is a very regular set of queries distributed across 5-10 "permanent" sessions processing for a VoIP softswitch.
I have played around with adjusting work_mem and shared_buffers up and down, with no impact on the observed memory growth.

On initial session connection, the "top" utility reports on the order of 220M/38000K (SIZE/RES) per connection.
After 4 days of operation the same session will indicate 516M/300M.

Thank you for any insights or suggestions on how to better analyze what is transpiring.

Best Regards

Dave Day

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Day, David (#1)
Re: Server side backend permanent session memory usage ?

"Day, David" <dday@redcom.com> writes:

There seems to be an unbounded growth of memory usage by the backend postgres process representing a "permanent" session in our system.

It's hard to evaluate this report with so little information, but there
are at least three possible explanations:

1. Many people misunderstand "top"'s output and believe they are seeing
memory bloat when they aren't. This happens because top only charges
pages of shared memory to a process after the process has first physically
touched those pages. So a PG process's claimed use of shared memory will
gradually grow from nothing to the whole shared-memory area, as it has
occasion to make use of different shared buffers, lock table entries, etc.
You can correct for this by subtracting the SHR (shared) column from
the process's reported size, but people often fail to. (Note: it's
possible that FreeBSD's implementation doesn't suffer from this problem,
but the issue definitely exists on e.g. Linux.)
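The correction Tom describes can be sketched numerically; the RES/SHR figures below are hypothetical, chosen only to show the arithmetic, not taken from this thread:

```shell
# top's RES column includes whatever shared memory the backend has touched,
# so subtract SHR to estimate the process's private footprint.
RES_KB=307200   # 300M resident, as top might report (hypothetical)
SHR_KB=262144   # 256M of that is shared buffers the process has touched
PRIVATE_KB=$(( RES_KB - SHR_KB ))
echo "private ~ ${PRIVATE_KB} KB"
```

A backend whose RES grows in step with SHR is likely just touching more of the shared-buffer pool, not leaking.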

2. If, over time, the queries issued to the process touch many different
tables (I'm talking thousands of tables), or execute large numbers of
distinct plpgsql functions, etc, then you will get bloating of the
internal caches that hold copies of that catalog data. PG generally
operates on the principle that cached is better than not cached, so it
doesn't try to limit the size of those caches; but in some installations
that can cause problems. If this is your situation, then indeed
restarting the sessions periodically may be necessary.

3. Or you might have found an actual memory leak. PG's memory usage
conventions are such that true leaks that persist across transactions
are pretty rare ... but I won't say it doesn't happen.

If you've eliminated point 1 and want to try to look into the other
theories, you could do this: attach to a recently-started session with
gdb, and execute
call MemoryContextStats(TopMemoryContext)
quit
This will cause a memory usage map to get dumped to stderr (hopefully
you are starting the postmaster in such a way that that gets captured
to a log file rather than sent to /dev/null). Save that. Wait until
you see bloat, reattach and repeat, compare the memory maps. Let us
know what you see. If possible, compare maps taken at points where
the session is idle and waiting for input.
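For reference, the attach/call/quit sequence can also be run non-interactively with gdb's batch mode; the PID here is a placeholder, and in practice you would take it from pg_stat_activity or ps:

```shell
# Batch-mode equivalent of: attach, call MemoryContextStats, quit.
# 12345 stands in for a real backend PID.
PID=12345
GDB_CMD="gdb -p $PID -batch -ex 'call MemoryContextStats(TopMemoryContext)'"
echo "$GDB_CMD"
```

The context dump still goes to the backend's stderr, so the note above about capturing the postmaster's stderr to a log file applies either way.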

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3 Day, David
dday@redcom.com
In reply to: Tom Lane (#2)
Re: Server side backend permanent session memory usage ?

Hi,

I've been banging my head against this issue for a while without updating this thread.

I have implemented a refresh connection behavior from our pooler/client side which
seems to be a successful work-around for memory loss indications on the backend side.

With that "solution" in my pocket, and at the suggestion/hunch of one of our developers, I ran this experiment.

We have a server-side function written in plpgsql, called by the client-side application. That function does work that includes
invoking a server-side plpython2u function.

If I repeat calling this server-side logic/function outside our app, from a psql-invoked connection, and monitor
memory growth on the backend side, I see top reporting a continual growth trend after each burst of invocations.
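One way to quantify that growth trend, rather than eyeballing top, is to sample the backend's resident size between bursts. This sketch uses the shell's own PID ($$) purely so it runs anywhere; in practice you would substitute the backend PID:

```shell
# Sample the resident set size (KB) of a process between bursts of calls.
# The shell itself ($$) stands in here for the postgres backend PID.
PID=$$
RSS_KB=$(ps -o rss= -p "$PID" | tr -d ' ')
echo "pid=$PID rss_kb=$RSS_KB"
```

Logging that line with a timestamp after each burst gives a series you can compare directly, instead of screenshots of top.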

The premise is that repeated invocation of the Python interpreter, and/or the Python function's use of a "subprocess" method,
does not clean up correctly, and that this accounts for the memory growth and system-performance degradation.

FreeBSD 10.1, postgres 9.5.3 or postgres 9.3.11, running on VM, no relevant problems indicated in log files.

I could attempt to create a self-contained example for the pg community to explore, or, preferably, I can attach gdb to the backend
for those who would like to suggest some particular details that might confirm or rule out this theory.

Thanks

Dave Day

--------
Tom,

Thank you for that very considered answer.
I will put that information to use over the next couple of days and get back to the gen-list with my findings.

I hear what you are saying about item 1. I suspect it is at least an indicator of the degradation, if not the actual cause of it. Historically, the only thing I have had to restart to recover normal operation is the pooling agent. (And the pooler's/client's memory stats are pretty stable over run time.)

There are only about 120 tables in two schemas that could be accessed by these sessions, and I suspect what they are actually repeatedly accessing is a very small subset of that (20-30 tables and perhaps 30-40 functions), which perhaps makes item 2 unlikely.

Item 3 - should be doable to get these results - might have some information by tomorrow.

Regards

Dave Day


#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Day, David (#3)
Re: Server side backend permanent session memory usage ?

"Day, David" <dday@redcom.com> writes:

We have a server side function written in plpgsql called by the client side application. That fx does work that includes
Invoking a server side plpython2u function.
If I repeat calling this server-side logic/function outside our app from a psql-invoked connection and I monitor
the memory growth on the backend side I see top reporting a continual growth trend after each burst of invocations.
The premise was that the issue of the repeated invocation of the python interpreter and/or the python function use of a "subprocess" method
does not cleanup correctly and this accounts for the memory growth and system performance degradation.
FreeBSD 10.1, postgres 9.5.3 or postgres 9.3.11, running on VM, no relevant problems indicated in log files.

It'd be very useful to find out whether you can reproduce the leakage in
9.6beta (any beta release will do). There are a couple of significant
fixes in plpython that might be relevant --- I'm looking at these in
particular:

https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=8c75ad436
https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=1d2fe56e4

We did not back-patch those changes because they seemed rather
invasive/risky compared to the value for average users.

If you still see misbehavior in 9.6, it'd be worth trying to extract
a self-contained example.

regards, tom lane


#5 Day, David
dday@redcom.com
In reply to: Tom Lane (#4)
Re: Server side backend permanent session memory usage ?


Tom,

I would venture that the first item you listed probably explains our observations.

I plan to write up an internal trouble report on our end, so that when we start using 9.6 we will disable the connection-refresh
workaround, re-check the issue, and report back at that time. (Not soon, unfortunately.)

In the meantime, I guess this is a word of caution to users of 9.5 and earlier versions who use the Python extension and have long-lived connections.

Thanks so much for your assistance.

Best Regards

Dave Day
