Connection queuing by connection pooling libraries
Hi All,
A basic question on handling a large number of concurrent requests to the DB.
I have a cloud service which can receive a large number of requests, which
will obviously trigger DB operations.
Every DB has some maximum connection limit, which can get exhausted under a
large number of requests.
I know DB connection pooling can be used to reuse connections, but it will
not help when there are a large number of active concurrent connections.
My queries are already optimised and short-lived.
For that I need some queuing mechanism, like PgBouncer for Postgres:
https://www.percona.com/blog/2021/02/26/connection-queuing-in-pgbouncer-is-it-a-magical-remedy/
PgBouncer, as I understand it, is a proxy which needs to be installed
separately on the web or DB server.
I was wondering whether the normal client-side DB connection pooling
libraries, like Apache DBCP, can also provide similar connection queuing
while running in the application runtime.
Or is there some other way to handle this problem?
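To make the question concrete, here is a minimal sketch (plain Python, not DBCP itself) of what client-side queuing amounts to: a bounded pool where requests beyond the limit wait in line until a connection is released, roughly the behaviour DBCP's maxTotal/maxWaitMillis settings give you.

```python
import queue
import threading
import time

class QueuingPool:
    """Client-side pool: at most max_total "connections"; extra borrowers
    wait in line, failing only after max_wait_s (like DBCP's
    maxTotal + maxWaitMillis)."""
    def __init__(self, max_total, max_wait_s):
        self._slots = queue.Queue()
        for i in range(max_total):
            self._slots.put(f"conn-{i}")   # stand-ins for real connections
        self._max_wait_s = max_wait_s

    def borrow(self):
        try:
            return self._slots.get(timeout=self._max_wait_s)
        except queue.Empty:
            raise TimeoutError("no connection became free in time")

    def release(self, conn):
        self._slots.put(conn)

pool = QueuingPool(max_total=2, max_wait_s=1.0)
a = pool.borrow()
b = pool.borrow()          # pool exhausted; the next borrow must wait

def worker(results):
    results.append(pool.borrow())   # queues until release() below

results = []
t = threading.Thread(target=worker, args=(results,))
t.start()
time.sleep(0.1)
pool.release(a)            # frees a slot; the queued borrower proceeds
t.join()
```

Note this queues only within one application instance, which is the crux of the question below: a server-side pooler queues across all clients of the database.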
Best Regards,
Saurav
On Tue, 19 Oct 2021 at 22:45, Saurav Sarkar <saurav.sarkar1@gmail.com>
wrote:
DBCP – BasicDataSource Configuration (apache.org)
<https://commons.apache.org/proper/commons-dbcp/configuration.html>
From what I see here, this seems to be similar to other client-side
connection libraries/poolers, which works, but the major benefit of
PgBouncer is server-side connection pooling, in particular transaction-level
pooling, which would work pretty well in your case, since you say you have
short-lived connections.
Creating a new connection on PostgreSQL directly is expensive (in
resources), as it spawns a new process.
PgBouncer provides a workaround: it manages the opening and closing of
connections on the server side for you, and allocates the already-opened
server connections to frontend connections. When your frontend connection
is done with its transaction, PgBouncer closes the frontend connection but
does not close the backend connection "yet". If there is a new request for
a connection, it reuses the old backend connection and allocates it to the
new frontend connection.
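That behaviour is PgBouncer's transaction pooling mode; a minimal pgbouncer.ini sketch looks like the following (hostnames, pool sizes, and auth settings here are illustrative assumptions, not recommendations):

```ini
[databases]
app = host=127.0.0.1 port=5432 dbname=app

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction    ; release the server connection at transaction end
default_pool_size = 20     ; backend connections per (database, user) pair
max_client_conn = 1000     ; frontend connections allowed to connect and queue
```

Clients then connect to port 6432 instead of 5432; up to max_client_conn frontends can queue for the default_pool_size backends.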
Hence the caveat: you cannot use server-side prepared statements, and there
is no guarantee which backend connection gets mapped to which frontend
connection.
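A toy illustration of why that caveat exists (this is a simulation, not real driver code): PREPARE creates session state on one particular backend, and under transaction pooling the next transaction may be routed to a different backend that has never seen the statement.

```python
# Each PostgreSQL backend keeps its own named prepared statements; under
# transaction pooling, consecutive transactions may land on different backends.
backends = {1: set(), 2: set()}   # backend pid -> statement names it knows

def run_txn(backend, stmts):
    """Run a transaction's statements against one simulated backend."""
    for s in stmts:
        if s.startswith("PREPARE "):
            backends[backend].add(s.split()[1])
        elif s.startswith("EXECUTE "):
            name = s.split()[1]
            if name not in backends[backend]:
                return f'ERROR: prepared statement "{name}" does not exist'
    return "OK"

# Txn 1 lands on backend 1 and prepares a statement...
run_txn(1, ["PREPARE q AS SELECT 1"])
# ...but txn 2 is routed to backend 2, which has never seen it.
result = run_txn(2, ["EXECUTE q"])
```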
Also, it is important to note that having too many connections sitting idle
in a transaction can result in bloat: due to MVCC, dead rows cannot be
cleaned up while an open transaction might still see them, which leads to
bloat and eventually slowdowns and other problems.
The best use case I have seen for PgBouncer is not just connection pooling,
but also handling autoscaling to a large extent. Especially with
microservices: if your app has a connection pool of 10 and you scale your
app to 10 instances, without PgBouncer it would use all 100 backend
connections; but if the connections are short-lived transactions, it might
work well with even 50 backend connections.
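The arithmetic above can be sketched quickly; the workload numbers below are my own illustrative assumptions, not figures from the thread:

```python
# Without a server-side pooler, backend connections scale with app instances.
instances, pool_per_instance = 10, 10
direct_backend_conns = instances * pool_per_instance   # 10 x 10 = 100

# With transaction pooling, a backend is held only for the duration of a
# transaction, so the backend pool can be sized for *concurrent transactions*
# (Little's law: in-flight = arrival rate x residence time).
txns_per_sec = 2000     # assumed workload
avg_txn_ms = 5          # "short-lived" transactions, ~5 ms each
concurrent_txns = txns_per_sec * avg_txn_ms // 1000
```

With these assumptions, ~10 busy backends can serve what would otherwise tie up 100 direct connections.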
There are other administrative benefits of PgBouncer too, beyond just
connection pooling.
The biggest drawbacks I have had with PgBouncer are around enterprise auth
support (LDAP/Kerberos etc.), and that it masks the client's IP with its
own when you look at pg_stat_activity.
But since we use one DB per app, we are able to handle the above issues
with fewer unknowns.
--
Thanks,
Vijay
Mumbai, India
On Tue, 19 Oct 2021 at 23:09, Vijaykumar Jain <
vijaykumarjain.github@gmail.com> wrote:
BTW, there are other alternatives to PgBouncer which are in active
development, in case you wish to explore:
yandex/odyssey: Scalable PostgreSQL connection pooler (github.com)
<https://github.com/yandex/odyssey>
pgagroal
<https://agroal.github.io/pgagroal/>
Also, some people (like myself :)) split reads/writes and use a primary and
a set of replicas for connections, to handle scale.
So a pretty common architecture makes use of:
Scaling PostgreSQL using Connection Poolers and Load Balancers for an
Enterprise Grade environment - Percona Database Performance Blog
<https://www.percona.com/blog/2018/10/02/scaling-postgresql-using-connection-poolers-and-load-balancers-for-an-enterprise-grade-environment/>
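As a sketch of the read/write split idea (the DSNs below are hypothetical; real routers, like those in the architectures above, do much more):

```python
import random

# Hypothetical DSNs; in practice these would point at the primary and replicas.
PRIMARY = "postgres://primary:5432/app"
REPLICAS = ["postgres://replica1:5432/app", "postgres://replica2:5432/app"]

def route(sql: str) -> str:
    """Naive router: read-only statements go to a replica, everything else
    to the primary. Real routers also pin a session to the primary after a
    write, to avoid reading stale data from a lagging replica."""
    if sql.lstrip().upper().startswith("SELECT"):
        return random.choice(REPLICAS)
    return PRIMARY
```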
On Tue, 19 Oct 2021 at 23:20, Vijaykumar Jain <
vijaykumarjain.github@gmail.com> wrote:
Also, please check out (I forgot to link it earlier):
Number Of Database Connections - PostgreSQL wiki
<https://wiki.postgresql.org/wiki/Number_Of_Database_Connections>
It explains the reasons why too many direct connections may result in
performance issues.
Thanks a lot, Vijay, for your valuable inputs.
Best Regards,
Saurav
On Wed, Oct 20, 2021 at 1:06 AM Vijaykumar Jain <
vijaykumarjain.github@gmail.com> wrote: