Is a connection max lifetime useful in a connection pool?
Hello,
I am designing and implementing a connection pool for psycopg3 [1][2].
Some of the inspiration is coming from HikariCP [3], a Java connection
pool.
One of the HikariCP configuration parameters is "maxLifetime", whose
description is: "This property controls the maximum lifetime of a
connection in the pool. [...] **We strongly recommend setting this
value, and it should be several seconds shorter than any database or
infrastructure imposed connection time limit.**" (bold is theirs,
default value is 30 mins).
When discussing the pool features in the psycopg mailing list someone
pointed out "what is the utility of this parameter? connections don't
rot, do they?"
Hikari is a generic connection pool, not one specific for Postgres. So
I'm wondering: is there any value in periodically deleting and
recreating connections for a Postgres-specific connection pool? Is a
MaxLifetime parameter useful?
Thank you very much,
-- Daniele
[1]: https://www.psycopg.org/articles/2021/01/17/pool-design/
[2]: https://github.com/psycopg/psycopg3/blob/connection-pool/psycopg3/psycopg3/pool.py
[3]: https://github.com/brettwooldridge/HikariCP
Hi
On Sun, 21 Feb 2021 at 19:05, Daniele Varrazzo <
daniele.varrazzo@gmail.com> wrote:
Is a MaxLifetime parameter useful?
I have very strong experience here - it is very useful. Long-lived
PostgreSQL processes can hold a lot of allocated memory, with some
unhappy consequences: the operating system's memory can become
fragmented, and the operating system can start using swap. Another
issue is a bloated catalog cache inside the processes. Both issues
depend on application design, catalog size, and other factors, and the
simplest fix for them is setting a short lifetime for Postgres
sessions - 1 hour is a usual value.
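The recycling described above can be sketched in a few lines of Python. This is only an illustration; `PooledConnection`, `MAX_LIFETIME`, and `expired()` are made-up names, not psycopg3's actual pool API:

```python
import time

MAX_LIFETIME = 3600.0  # seconds; the "1 hour" usual value mentioned above

class PooledConnection:
    """Wrap a raw connection together with its creation timestamp."""

    def __init__(self, raw_conn):
        self.raw = raw_conn
        self.created_at = time.monotonic()

    def expired(self, max_lifetime=MAX_LIFETIME):
        # Once a connection exceeds its maximum lifetime the pool should
        # close it and open a fresh one, releasing the backend process
        # along with its accumulated memory and catalog caches.
        return time.monotonic() - self.created_at > max_lifetime
```

A pool would run this check when a connection is returned or handed out, replacing expired connections instead of reusing them.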
Regards
Pavel
Greetings,
* Daniele Varrazzo (daniele.varrazzo@gmail.com) wrote:
Is a MaxLifetime parameter useful?
Short answer- yes. In particular, what I read into HikariCP's
documentation is that they've had cases where, say, a firewall in the
middle is configured to just rudely drop a connection after a certain
amount of time (which can take some time to detect if the firewall
simply stops forwarding packets associated with that connection).
There's another PG-specific reason though: on systems with loads of
tables / objects, each object that a given backend touches ends up in a
per-backend cache. This is great because it helps a lot when the same
objects are used over and over, but when there's lots of objects getting
touched the per-backend memory usage can increase (and no, it doesn't
ever go down; work is being done to improve on that situation but it
hasn't been fixed so far, afaik). If backends are never let go,
eventually all the backends end up with entries cached for all the
objects making for a fair bit of memory being used. Naturally, the
trade off here is that a brand new backend won't have anything in the
cache and therefore will have a 'slow start' when it comes to answering
queries (this is part of where PG's reputation for slow starting comes
from actually and why connection poolers are particularly useful for
PG).
There are other reasons too- PG (rarely, but it happens) and extensions
(a bit more often..) can end up leaking per-backend memory, meaning that
the backend memory usage increases without any actual benefit. Dropping
and reconnecting can help address that (though, of course, we'd like to
fix all such cases).
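One practical refinement when implementing this (my own suggestion, not something stated above) is to randomize each connection's lifetime slightly, so that connections opened together do not all expire at the same moment and reconnect in a thundering herd. A minimal sketch, with made-up names:

```python
import random

def effective_lifetime(max_lifetime, jitter=0.10):
    """Return a per-connection lifetime randomized by +/- `jitter`
    (a fraction of max_lifetime), so that a batch of connections
    created at pool startup expires spread out over time rather
    than all at once."""
    return max_lifetime * random.uniform(1.0 - jitter, 1.0 + jitter)
```

Each new connection would get its own `effective_lifetime(MAX_LIFETIME)` instead of the fixed value.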
Thanks,
Stephen
On Sun, 21 Feb 2021 at 19:12, Pavel Stehule <pavel.stehule@gmail.com> wrote:
I have very strong experience - it is very useful.
On Sun, 21 Feb 2021 at 19:26, Stephen Frost <sfrost@snowman.net> wrote:
Short answer- yes.
Sounds good. Thank you very much for your insight!
-- Daniele
Hi,
On 2021-02-21 19:05:03 +0100, Daniele Varrazzo wrote:
Is a MaxLifetime parameter useful?
It's extremely useful. If your pooler is used in a large application
with different "parts" or your application uses schema based
multi-tenancy or such, you can end up with the various per-connection
caches getting very large without providing much benefit. Unfortunately
we do not yet have effective "pressure" against that. Similarly, if you
have an application using prepared statements you can end up with enough
prepared statements for that to be a memory usage issue.
Additionally, if there's ever a problem with memory leakage, be it in
core PG or some extension, being able to limit the harm of that can be a
life saver.
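The prepared-statement buildup Andres mentions can be bounded on the client side with an LRU cap, so only the most recently used statements stay prepared on the server. This is a sketch under assumptions of mine - `PreparedStatementCache` and `get_or_prepare` are hypothetical names, and psycopg3's real mechanism may differ:

```python
from collections import OrderedDict

class PreparedStatementCache:
    """Cap the number of server-side prepared statements: evict the
    least recently used one once the limit is exceeded (a real pool
    would also issue DEALLOCATE for the evicted statement)."""

    def __init__(self, maxsize=100):
        self.maxsize = maxsize
        self._stmts = OrderedDict()  # query text -> statement name

    def get_or_prepare(self, query, prepare):
        if query in self._stmts:
            # Cache hit: mark as most recently used, skip re-preparing.
            self._stmts.move_to_end(query)
            return self._stmts[query]
        name = prepare(query)  # e.g. runs PREPARE on the server
        self._stmts[query] = name
        if len(self._stmts) > self.maxsize:
            self._stmts.popitem(last=False)  # evict the LRU entry
        return name
```

With a cap like this, per-backend memory from prepared statements stays bounded even on long-lived connections, complementing (rather than replacing) a max lifetime.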
Greetings,
Andres Freund
On Mon, Feb 22, 2021 at 7:52 AM Andres Freund <andres@anarazel.de> wrote:
It's extremely useful.
+1, I have had to rely on similar cleanup in other poolers multiple times.
If your pooler is used in a large application
with different "parts" or your application uses schema based
multi-tenancy or such, you can end up with the various per-connection
caches getting very large without providing much benefit.
And in some cases negative cache entries can be a problem too.