Built-in connection pooling

Started by Konstantin Knizhnik, about 8 years ago. 131 messages. pgsql-hackers.
#1 Konstantin Knizhnik
k.knizhnik@postgrespro.ru

Hi hackers,

My recent experiments with a pthread version of Postgres show that
although pthreads offer some performance advantages over processes for
a large number of connections, they still cannot eliminate the need
for connection pooling. Even a large number of inactive connections
causes significant degradation of Postgres performance.

So we need connection pooling. Most enterprise systems working with
Postgres use pgbouncer or similar tools.
But pgbouncer has the following drawbacks:
1. It is an extra entity that complicates system installation and
administration.
2. Pgbouncer itself can be a bottleneck and a point of failure. For
example, with SSL enabled, pgbouncer's single-threaded model becomes
the limiting factor when many clients try to reestablish connections
simultaneously. This is why some companies build hierarchies of pgbouncers.
3. Using a pool_mode other than "session" makes it impossible to use
prepared statements and session variables.
The lack of prepared statements can by itself slow down simple queries
up to two times.

So I thought about built-in connection pooling for Postgres. Ideally it
should be integrated with pthreads, because then scheduling of sessions
could be done more flexibly and easily.
But I decided to start with a patch against vanilla Postgres.

The idea is the following:
1. We start some number of normal backends, which form a backend pool
for serving client sessions.
2. When the number of connections exceeds the number of backends, then
instead of spawning a new backend we choose one of the existing backends
and redirect the connection to it.
There is a more or less portable way in Unix to pass socket descriptors
between processes using Unix sockets, see for example
https://stackoverflow.com/questions/28003921/sending-file-descriptor-by-linux-socket/
(this is one of the places where a pthreads Postgres would win). So a
session is bound to a backend. Backends are chosen using a round-robin
policy, which should guarantee a more or less uniform distribution of
sessions between backends as long as the number of sessions is much
larger than the number of backends. But certainly skews in client
application access patterns can violate this assumption.
3. Rescheduling is done at the transaction level, so one procarray entry
per backend is enough to handle locks correctly. Transaction-level
pooling also eliminates the problem of false deadlocks (caused by a lack
of free executors in the pool), and it minimizes the changes needed in
the Postgres core to maintain correct session context:
there is no need to suspend/resume transaction state, static variables,
and so on.
4. In the main Postgres query loop in PostgresMain, we determine the
moment when the backend is not in a transaction, perform a select over
the sockets of all active sessions, and choose one of them.
5. When a client disconnects, we close the session but do not terminate
the backend.
6. To support prepared statements, we append the session identifier to
the statement name, so prepared statements of different sessions do not
interfere with each other. Since a session is bound to a backend, it
remains possible to use prepared statements.
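
For readers unfamiliar with descriptor passing, step 2 can be sketched
with Python's standard wrappers around the same SCM_RIGHTS mechanism (a
minimal illustration only; the patch itself does this in C with
sendmsg/recvmsg):

```python
import os
import socket

# A socketpair stands in for the postmaster<->backend Unix socket.
postmaster, backend = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# A pipe stands in for the accepted client connection.
client_r, client_w = os.pipe()

# "Postmaster" side: pass the descriptor along with a small message.
socket.send_fds(postmaster, [b"take this client"], [client_r])

# "Backend" side: receive a new descriptor referring to the same file.
msg, fds, flags, addr = socket.recv_fds(backend, 1024, 1)

os.write(client_w, b"SELECT 1;")
print(os.read(fds[0], 9).decode())  # the backend reads through the passed fd
```

Both descriptors now refer to the same underlying connection, which is
exactly what lets a pooled backend take over a client socket accepted
by the postmaster.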

This is the minimal plan for embedded session pooling that I decided to
implement as a prototype.
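
Step 4 of the plan, waiting at a transaction boundary on the sockets of
all sessions bound to a backend, can be sketched with Python's selectors
module (an illustration of the multiplexing idea, not the patch's C
code):

```python
import selectors
import socket

sel = selectors.DefaultSelector()

# Two socketpairs stand in for two client sessions bound to one backend.
c1, s1 = socket.socketpair()
c2, s2 = socket.socketpair()
sel.register(s1, selectors.EVENT_READ, "session-1")
sel.register(s2, selectors.EVENT_READ, "session-2")

# Only session 2 sends a query, so the backend wakes up for it alone.
c2.sendall(b"BEGIN;")
ready = sel.select(timeout=1)
for key, _events in ready:
    print(key.data, key.fileobj.recv(64).decode())  # prints: session-2 BEGIN;
```

The backend then runs that session's transaction to completion before
returning to the select loop, which is why a long-running transaction
stalls all other sessions on the same backend (see issue 1 below).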

Several things are not addressed yet:

1. Temporary tables. In principle they can be handled the same way as
prepared statements: by appending the session identifier to the name of
the table.
But that requires adjusting references to the table in all queries,
which is much more complicated than in the case of prepared statements.
2. Session-level GUCs. In principle it is not difficult to remember the
GUCs modified by a session and save/restore them on session switch.
It is just not implemented yet.
3. Support for multiple users/databases/... This is the most critical
drawback. Right now my prototype implementation assumes that all clients
connect to the same database
under the same user with the same connection options. This is a
challenge on which I would like to hear the opinion of the community.
The database and user names are retrieved from the client connection by
the ProcessStartupPacket function. In vanilla Postgres this function is
executed by the spawned backend, so I do not know which database a
client is going to access before calling this function and reading data
from the client's socket.
Now I just choose a random backend and assign the connection to it.
But it can happen that this backend is working with a different
database/user; currently I just return an error in this case. It is
certainly possible to call ProcessStartupPacket in the postmaster and
then select a proper backend working with the specified database/user.
But I am afraid the postmaster could become a bottleneck in that case,
especially with SSL. Also, a larger number of databases/users can
significantly reduce the efficiency of pooling if each backend is
responsible for only one database/user combination. Maybe a backend
should be bound only to a database, with the concrete role set on
session switch; but that may require flushing backend caches, which
devalues the idea of embedded session pooling. This problem could easily
be solved in a multithreaded Postgres, where a session can easily be
reassigned to another thread.

Now, the results shown by my prototype. I used pgbench with scale factor
100 in read-only mode (-S option).
The precise pgbench command is "pgbench -S -c N -M prepared -T 100 -P 1 -n".
Results in the table below are in kTPS:

Connections | Vanilla Postgres | Postgres with session pool size=10
------------+------------------+-----------------------------------
         10 |              186 |                                181
        100 |              118 |                                224
       1000 |               59 |                                191

As you can see, instead of performance degrading with an increasing
number of connections, Postgres with the session pool shows stable
performance. Moreover, for vanilla Postgres the best results on my
system are obtained with 10 connections, while Postgres with the session
pool shows better performance with 100 connections and the same number
of spawned backends.

My patch against Postgres is attached to this mail.
To switch on session pooling, set session_pool_size to some non-zero
value. Another GUC variable I have added is "max_sessions", which
specifies the maximal number of sessions handled by one backend. So the
total number of handled client connections is session_pool_size * max_sessions.
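
With the patch applied, enabling the pool might look like this in
postgresql.conf (the GUC names are those introduced by the patch; the
values are an arbitrary example):

```
# postgresql.conf (patched server)
session_pool_size = 10   # pooled backends serving client sessions
max_sessions = 100       # max sessions multiplexed onto one backend
# total client connections handled: 10 * 100 = 1000
```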

Certainly this is just a prototype, far from practical use.
In addition to the challenges mentioned above, there are some other
issues to consider:

1. A long-living transaction in a client application blocks all other
sessions in the backend and so can suspend the work of Postgres.
So Uber-style programming, where a database transaction is started when
the car door opens and finished at the end of the trip, is completely
incompatible with this approach.
2. A fatal error causes disconnection not only of the one client that
caused the problem, but of the whole bunch of client sessions scheduled
to that backend.
3. It is possible to use PL APIs such as plpython, but session-level
variables may not be used.
4. There may be memory leaks caused by allocating memory with malloc,
or in the top memory context, which is expected to be freed on backend
exit.
Such memory is not deallocated at session close, so a large number of
handled sessions can cause memory exhaustion.
5. Some applications that handle multiple connections inside a single
thread and multiplex them at the statement level (rather than at the
transaction level) may not work correctly.
This seems to be quite an exotic use case, but pgbench actually behaves
this way! This is why an attempt to run pgbench with multi-statement
transactions (-N) will fail if the number of threads (-j) is smaller
than the number of connections (-c).
6. The approach of passing socket descriptors between processes is
implemented only for Unix and tested only on Linux, although it is
expected to also work on macOS and other Unix dialects. Windows is not
supported yet.

I will be glad to receive any feedback and suggestions concerning the
prospects of embedded connection pooling.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

session_pool.patch (text/x-patch, +472/-28)
#2 Ivan Novick
inovick@pivotal.io
In reply to: Konstantin Knizhnik (#1)
Re: Built-in connection pooling

+1 to the concept... A lot of users could benefit if we did this in a
good way.

On Wed, Jan 17, 2018 at 8:09 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


--
Ivan Novick, Product Manager Pivotal Greenplum
inovick@pivotal.io -- (Mobile) 408-230-6491
https://www.youtube.com/GreenplumDatabase

#3 Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Konstantin Knizhnik (#1)
Re: Built-in connection pooling

On 17.01.2018 19:09, Konstantin Knizhnik wrote:


Attached please find a new version of the patch with a few fixes,
and more results on a NUMA system with 144 cores and 3TB of RAM.

Read-only pgbench (-S):

Connections | Vanilla Postgres, kTPS | Session pool size 256, kTPS
------------+------------------------+----------------------------
         1k |                   1300 |                        1505
        10k |                    633 |                        1519
       100k |                      - |                        1425

Read-write contention test: access to a small number of records, with
1% of updates.

Clients | Vanilla Postgres, TPS | Session pool size 256, TPS
--------+-----------------------+---------------------------
    100 |                557232 |                     573319
    200 |                520395 |                     551670
    300 |                511423 |                     533773
    400 |                468562 |                     523091
    500 |                442268 |                     514056
    600 |                401860 |                     526704
    700 |                363912 |                     530317
    800 |                325148 |                     512238
    900 |                301310 |                     512844
   1000 |                278829 |                     554516

So, as you can see, there is no degradation of performance with an increased number of connections when session pooling is used.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

session_pool-2.patch (text/x-patch, +475/-28)
#4 Claudio Freire
klaussfreire@gmail.com
In reply to: Konstantin Knizhnik (#3)
Re: Built-in connection pooling

On Thu, Jan 18, 2018 at 11:48 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


TBH, the tests you should be running are comparisons with a similar pool
size managed by pgbouncer, not just vanilla unlimited Postgres.

Of course a limited pool size will beat thousands of concurrent queries
by a large margin. The real question is whether the pthread-based
approach beats the pgbouncer approach.

#5 Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Konstantin Knizhnik (#3)
Re: Built-in connection pooling

Hi Konstantin,

On 01/18/2018 03:48 PM, Konstantin Knizhnik wrote:

On 17.01.2018 19:09, Konstantin Knizhnik wrote:

Hi hackers,

...

I haven't looked at the code yet, but after reading your message I have
a simple question - how is this going to work with SSL? If you're only
passing a file descriptor, that does not seem to be sufficient for the
backends to do crypto (that requires the SSL stuff from Port).

Maybe I'm missing something and it already works, though ...

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6 Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Tomas Vondra (#5)
Re: Built-in connection pooling

On 18.01.2018 18:02, Tomas Vondra wrote:


Ooops, I missed this aspect with SSL. Thank you.
A new version of the patch, which correctly maintains session context,
is attached.
Now each session has its own allocator, which should be used instead of
TopMemoryAllocator.
SSL connections work now.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

session_pool-3.patch (text/x-patch, +518/-25)
#7 Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Claudio Freire (#4)
Re: Built-in connection pooling

On 18.01.2018 18:00, Claudio Freire wrote:


TBH, the tests you should be running are comparisons with a similar
pool size managed by pgbouncer, not just vanilla unlimited postgres.

Of course a limited pool size will beat thousands of concurrent
queries by a large margin. The real question is whether a
pthread-based approach beats the pgbouncer approach.

Below are the results with pgbouncer:

Results in kTPS:

Connections | Vanilla Postgres | Built-in session pool size 256 | pgbouncer, transaction pooling, pool size 256 | 10 pgbouncers, pool size 20
------------+------------------+--------------------------------+-----------------------------------------------+-----------------------------
         1k |             1300 |                           1505 |                                           105 |                         751
        10k |              633 |                           1519 |                                            94 |                         664
       100k |                - |                           1425 |                                             - |                           -

("-" here means that I failed to start that number of connections,
because of "resource temporarily unavailable" and similar errors).

So a single pgbouncer is 10 times slower than a direct connection to
Postgres.
No surprise here: pgbouncer is single-threaded, and its CPU usage is
almost 100%.
So we have to launch several instances of pgbouncer and somehow
distribute the load between them.
In Linux it is possible to use
SO_REUSEPORT (https://lwn.net/Articles/542629/) to perform load balancing
between several pgbouncer instances.
But you would have to edit the pgbouncer code: it doesn't support such a
mode. So I started several instances of pgbouncer on different ports and
explicitly distributed several pgbench instances between them.
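
The kernel-level balancing mentioned above can be sketched as follows (a
minimal illustration, not pgbouncer code): each pooler process opens its
own listening socket on the same port with SO_REUSEPORT set, and the
kernel spreads incoming connections among them.

```python
import socket

def make_listener(port: int = 0) -> socket.socket:
    # Each pooler process would create its own socket like this; with
    # SO_REUSEPORT set, several of them may bind the same port and the
    # kernel distributes incoming connections among them.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen()
    return s

a = make_listener()           # kernel picks a free port
port = a.getsockname()[1]
b = make_listener(port)       # second listener on the very same port
print(a.getsockname() == b.getsockname())  # True: both share the port
```

Without SO_REUSEPORT on both sockets, the second bind would fail with
EADDRINUSE, which is why pgbouncer would need a code change to use this.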

But even in this case, performance is twice slower than with a direct
connection and built-in session pooling.
That is because of the lack of prepared statements, which I cannot use
with pgbouncer in statement/transaction pooling mode.

Also, please notice that with session pooling, performance is better
than with vanilla Postgres.
That is because with session pooling we can open more connections
without launching more backends.
It is especially noticeable on my local desktop with 4 cores: for normal
Postgres the optimal number of connections is about 10, but with session
pooling 100 connections show an about 30% better result.

So, summarizing all of the above:

1. pgbouncer doesn't allow the use of prepared statements, and that
causes up to a two-times performance penalty.
2. pgbouncer is single-threaded and cannot efficiently handle more than
about 1k connections.
3. pgbouncer can never provide better performance than an application
connected directly to Postgres with the optimal number of connections.
In contrast, session pooling can provide better performance than vanilla
Postgres with the optimal number of connections.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#8Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Konstantin Knizhnik (#6)
Re: Built-in connection pooling

On 01/19/2018 10:52 AM, Konstantin Knizhnik wrote:

On 18.01.2018 18:02, Tomas Vondra wrote:

Hi Konstantin,

On 01/18/2018 03:48 PM, Konstantin Knizhnik wrote:

On 17.01.2018 19:09, Konstantin Knizhnik wrote:

Hi hackers,

...

I haven't looked at the code yet, but after reading your message I have
a simple question - how is this going to work with SSL? If you're only
passing a file descriptor, that does not seem to be sufficient for the
backends to do crypto (that requires the SSL stuff from Port).

Maybe I'm missing something and it already works, though ...

regards

Ooops, I missed this aspect with SSL. Thank you.
A new version of the patch, which correctly maintains the session
context, is attached.
Now each session has its own allocator, which should be used instead
of TopMemoryAllocator. SSL connections work now.

OK. I've looked at the code, but I have a rather hard time understanding
it, because there are pretty much no comments explaining the intent of
the added code :-( I strongly suggest improving that, to help reviewers.

The questions I'm asking myself are mostly these:

1) When assigning a backend, we first try to get one from a pool, which
happens right at the beginning of BackendStartup. If we find a usable
backend, we send the info to the backend (pg_send_sock/pg_recv_sock).

But AFAICS this only happens at connection time, right? But in your
initial message you say "Rescheduling is done at transaction level,"
which in my understanding means "transaction pooling". So, how does that
part work?

2) How does this deal with backends for different databases? I don't see
any checks that the requested database matches the backend database (nor
any code switching the backend from one db to another - which would be
very tricky, I think).

3) Is there any sort of shrinking the pools? I mean, if the backend is
idle for certain period of time (or when we need backends for other
databases), does it get closed automatically?

Furthermore, I'm rather confused about the meaning of session_pool_size.
I mean, that GUC determines the number of backends in the pool, it has
nothing to do with sessions per se, right? Which would mean it's a bit
misleading to name it "session_..." (particularly if the pooling happens
at transaction level, not session level - which is question #1).

When I've been thinking about adding a built-in connection pool, my
rough plan was mostly "bgworker doing something like pgbouncer" (that
is, listening on a separate port and proxying everything to regular
backends). Obviously, that has pros and cons, and probably would not
serve the threading use case well.

But it would have some features that I find valuable - for example, it's
trivial to decide which connection requests may or may not be served
from a pool (by connection to the main port or pool port).

That is not to say the bgworker approach is better than what you're
proposing, but I wonder if that would be possible with your approach.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Tomas Vondra (#8)
Re: Built-in connection pooling

On 19.01.2018 18:53, Tomas Vondra wrote:

On 01/19/2018 10:52 AM, Konstantin Knizhnik wrote:

On 18.01.2018 18:02, Tomas Vondra wrote:

Hi Konstantin,

On 01/18/2018 03:48 PM, Konstantin Knizhnik wrote:

On 17.01.2018 19:09, Konstantin Knizhnik wrote:

Hi hackers,

...

I haven't looked at the code yet, but after reading your message I have
a simple question - how is this going to work with SSL? If you're only
passing a file descriptor, that does not seem to be sufficient for the
backends to do crypto (that requires the SSL stuff from Port).

Maybe I'm missing something and it already works, though ...

regards

Ooops, I missed this aspect with SSL. Thank you.
A new version of the patch, which correctly maintains the session
context, is attached.
Now each session has its own allocator, which should be used instead
of TopMemoryAllocator. SSL connections work now.

OK. I've looked at the code, but I have a rather hard time understanding
it, because there are pretty much no comments explaining the intent of
the added code :-( I strongly suggest improving that, to help reviewers.

Sorry, sorry, sorry...
There are some comments and I will add more.

The questions I'm asking myself are mostly these:

1) When assigning a backend, we first try to get one from a pool, which
happens right at the beginning of BackendStartup. If we find a usable
backend, we send the info to the backend (pg_send_sock/pg_recv_sock).

But AFAICS this only happens at connection time, right? But in your
initial message you say "Rescheduling is done at transaction level,"
which in my understanding means "transaction pooling". So, how does that
part work?

Here it is:

    ChooseSession:
        DoingCommandRead = true;
        /* Select which client session is ready to send new query */
        if (WaitEventSetWait(SessionPool, -1, &ready_client, 1,
                             PG_WAIT_CLIENT) != 1)
        ...
        if (ready_client.fd == SessionPoolSock)
        {
            /* Here we handle case of attaching new session */
            ...
        }
        else /* and here we handle a query (new transaction) from a client */
        {
            elog(DEBUG2, "Switch to session %d in backend %d",
                 ready_client.fd, MyProcPid);
            CurrentSession = (SessionContext *) ready_client.user_data;
            MyProcPort = CurrentSession->port;
        }

2) How does this deal with backends for different databases? I don't see
any checks that the requested database matches the backend database (nor
any code switching the backend from one db to another - which would be
very tricky, I think).

As I wrote in the initial mail, this problem is not handled now.
It is expected that all clients are connected to the same database using
the same user.
I only check and report an error if this assumption is violated.
Definitely it should be fixed, and it is one of the main challenges with
this approach! I want to receive some advice from the community about
the best way of solving it.
The problem is that we get the information about database/user in the
ProcessStartupPackage function in the backend, when the session is
already assigned to a particular backend.
We either have to somehow redirect the session to some other backend
(somehow notify the postmaster that we are not able to handle it), or
obtain the database/user name in the postmaster. But that means
ProcessStartupPackage would have to be called in the postmaster, and the
postmaster would have to read from the client's socket.
I am afraid that the postmaster could become a bottleneck in this case.

The problem could be solved much more easily in a pthread version of
Postgres: reassigning a session to another executor (thread) can be done
much more easily there, and there is no need for the unportable trick of
passing a file descriptor to another process.
In the future I am going to combine the two. The problem is that the
pthread version of Postgres is still in a very raw state.

3) Is there any sort of shrinking the pools? I mean, if the backend is
idle for certain period of time (or when we need backends for other
databases), does it get closed automatically?

When a client disconnects, its session is closed, but the backend is not
terminated even if there are no more sessions at this backend.
This was done intentionally, to avoid permanent spawning of new processes
when one or a few clients frequently connect/disconnect to the database.

Furthermore, I'm rather confused about the meaning of session_pool_size.
I mean, that GUC determines the number of backends in the pool, it has
nothing to do with sessions per se, right? Which would mean it's a bit
misleading to name it "session_..." (particularly if the pooling happens
at transaction level, not session level - which is question #1).

Yeah, it is not the right name. It means the maximal number of backends
which should be used to serve client sessions.
But "max backends" is already used and has a completely different meaning.

When I've been thinking about adding a built-in connection pool, my
rough plan was mostly "bgworker doing something like pgbouncer" (that
is, listening on a separate port and proxying everything to regular
backends). Obviously, that has pros and cons, and probably would not
serve the threading use case well.

And we will get the same problem as with pgbouncer: one process will not
be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers...
But in any case it is more efficient to multiplex sessions in the
backends themselves.

But it would have some features that I find valuable - for example, it's
trivial to decide which connection requests may or may not be served
from a pool (by connection to the main port or pool port).

That is not to say the bgworker approach is better than what you're
proposing, but I wonder if that would be possible with your approach.

regards

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#10Pavel Stehule
pavel.stehule@gmail.com
In reply to: Konstantin Knizhnik (#9)
Re: Built-in connection pooling

When I've been thinking about adding a built-in connection pool, my

rough plan was mostly "bgworker doing something like pgbouncer" (that
is, listening on a separate port and proxying everything to regular
backends). Obviously, that has pros and cons, and probably would not
serve the threading use case well.

And we will get the same problem as with pgbouncer: one process will not
be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers...
But in any case it is more efficient to multiplex sessions in the
backends themselves.

pgbouncer holds the client connection the whole time. When we implement
listeners, all the work can be done by worker processes, not by the listeners.

Regards

Pavel


But it would have some features that I find valuable - for example, it's

trivial to decide which connection requests may or may not be served
from a pool (by connection to the main port or pool port).

That is not to say the bgworker approach is better than what you're
proposing, but I wonder if that would be possible with your approach.

regards

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#11Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Pavel Stehule (#10)
Re: Built-in connection pooling

On 19.01.2018 19:28, Pavel Stehule wrote:

When I've been thinking about adding a built-in connection
pool, my
rough plan was mostly "bgworker doing something like
pgbouncer" (that
is, listening on a separate port and proxying everything to
regular
backends). Obviously, that has pros and cons, and probably
would not
serve the threading use case well.

And we will get the same problem as with pgbouncer: one process will not
be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers...
But in any case it is more efficient to multiplex sessions in the
backends themselves.

pgbouncer holds the client connection the whole time. When we implement
listeners, all the work can be done by worker processes, not by the listeners.

Sorry, I do not understand your point.
In my case pgbench establishes a connection to pgbouncer only once, at
the beginning of the test.
And pgbouncer spends all its time in context switches (CPU usage is
100%, mostly in kernel space: the top of the profile consists of kernel
functions).
The picture will be the same if, instead of pgbouncer, you do such
scheduling in one bgworker.
Even modern systems are not able to perform more than a few hundred
thousand context switches per second.
So with a single multiplexing thread or process you cannot get more
than ~100k TPS, while on a powerful NUMA system it is possible to achieve
millions of TPS.
This is illustrated by the results I sent in the previous mail: by
spawning 10 instances of pgbouncer I was able to get 7 times higher
speed.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#12Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Konstantin Knizhnik (#9)
Re: Built-in connection pooling

On 01/19/2018 05:17 PM, Konstantin Knizhnik wrote:

On 19.01.2018 18:53, Tomas Vondra wrote:

...

The questions I'm asking myself are mostly these:

1) When assigning a backend, we first try to get one from a pool, which
happens right at the beginning of BackendStartup. If we find a usable
backend, we send the info to the backend (pg_send_sock/pg_recv_sock).

But AFAICS this only happens at connection time, right? But in your
initial message you say "Rescheduling is done at transaction level,"
which in my understanding means "transaction pooling". So, how does that
part work?

Here it is:

              ChooseSession:
...

OK, thanks.

2) How does this deal with backends for different databases? I don't see
any checks that the requested database matches the backend database (nor
any code switching the backend from one db to another - which would be
very tricky, I think).

As I wrote in the initial mail, this problem is not handled now.
It is expected that all clients are connected to the same database using
the same user.
I only check and report an error if this assumption is violated.
Definitely it should be fixed, and it is one of the main challenges with
this approach! I want to receive some advice from the community about
the best way of solving it.
The problem is that we get the information about database/user in the
ProcessStartupPackage function in the backend, when the session is
already assigned to a particular backend.
We either have to somehow redirect the session to some other backend
(somehow notify the postmaster that we are not able to handle it), or
obtain the database/user name in the postmaster. But that means
ProcessStartupPackage would have to be called in the postmaster, and the
postmaster would have to read from the client's socket.
I am afraid that the postmaster could become a bottleneck in this case.

Hmmm, that's unfortunate. I guess you'll have to process the startup
packet in the main process, before it gets forked. At least partially.

The problem could be solved much more easily in a pthread version of
Postgres: reassigning a session to another executor (thread) can be done
much more easily there, and there is no need for the unportable trick of
passing a file descriptor to another process.
In the future I am going to combine the two. The problem is that the
pthread version of Postgres is still in a very raw state.

Yeah. Unfortunately, we're using processes now, and switching to threads
will take time (assuming it happens at all).

3) Is there any sort of shrinking the pools? I mean, if the backend is
idle for certain period of time (or when we need backends for other
databases), does it get closed automatically?

When a client disconnects, its session is closed, but the backend is not
terminated even if there are no more sessions at this backend.
This was done intentionally, to avoid permanent spawning of new processes
when one or a few clients frequently connect/disconnect to the database.

Sure, but it means a short peak will exhaust the backends indefinitely.
That's acceptable for a PoC, but I think needs to be fixed eventually.

Furthermore, I'm rather confused about the meaning of session_pool_size.
I mean, that GUC determines the number of backends in the pool, it has
nothing to do with sessions per se, right? Which would mean it's a bit
misleading to name it "session_..." (particularly if the pooling happens
at transaction level, not session level - which is question #1).

Yeah, it is not the right name. It means the maximal number of backends
which should be used to serve client sessions.
But "max backends" is already used and has a completely different meaning.

When I've been thinking about adding a built-in connection pool, my
rough plan was mostly "bgworker doing something like pgbouncer" (that
is, listening on a separate port and proxying everything to regular
backends). Obviously, that has pros and cons, and probably would not
serve the threading use case well.

And we will get the same problem as with pgbouncer: one process will not
be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers...
But in any case it is more efficient to multiplex sessions in the
backends themselves.

Well, I haven't said it has to be single-threaded like pgbouncer. I
don't see why the bgworker could not use multiple threads internally (of
course, it'd need to be careful not to mess with the stuff that is not
thread-safe).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#13Pavel Stehule
pavel.stehule@gmail.com
In reply to: Konstantin Knizhnik (#11)
Re: Built-in connection pooling

2018-01-19 17:53 GMT+01:00 Konstantin Knizhnik <k.knizhnik@postgrespro.ru>:

On 19.01.2018 19:28, Pavel Stehule wrote:

When I've been thinking about adding a built-in connection pool, my

rough plan was mostly "bgworker doing something like pgbouncer" (that
is, listening on a separate port and proxying everything to regular
backends). Obviously, that has pros and cons, and probably would not
serve the threading use case well.

And we will get the same problem as with pgbouncer: one process will not
be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers...
But in any case it is more efficient to multiplex sessions in the
backends themselves.

pgbouncer holds the client connection the whole time. When we implement
listeners, all the work can be done by worker processes, not by the listeners.

Sorry, I do not understand your point.
In my case pgbench establishes a connection to pgbouncer only once, at
the beginning of the test.
And pgbouncer spends all its time in context switches (CPU usage is
100%, mostly in kernel space: the top of the profile consists of kernel
functions).
The picture will be the same if, instead of pgbouncer, you do such
scheduling in one bgworker.
Even modern systems are not able to perform more than a few hundred
thousand context switches per second.
So with a single multiplexing thread or process you cannot get more
than ~100k TPS, while on a powerful NUMA system it is possible to achieve
millions of TPS.
This is illustrated by the results I sent in the previous mail: by
spawning 10 instances of pgbouncer I was able to get 7 times higher
speed.

pgbouncer is proxy software. I don't think a native pooler should be a
proxy too. So comparing pgbouncer with a hypothetical native pooler is
not fair, because pgbouncer passes through all communication.

Regards

Pavel


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#14Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Konstantin Knizhnik (#11)
Re: Built-in connection pooling

On 01/19/2018 05:53 PM, Konstantin Knizhnik wrote:

On 19.01.2018 19:28, Pavel Stehule wrote:

When I've been thinking about adding a built-in connection
pool, my
rough plan was mostly "bgworker doing something like
pgbouncer" (that
is, listening on a separate port and proxying everything to
regular
backends). Obviously, that has pros and cons, and probably
would not
serve the threading use case well.

And we will get the same problem as with pgbouncer: one process will not
be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers...
But in any case it is more efficient to multiplex sessions in the
backends themselves.

pgbouncer holds the client connection the whole time. When we implement
listeners, all the work can be done by worker processes, not by the listeners.

Sorry, I do not understand your point.
In my case pgbench establishes a connection to pgbouncer only once, at
the beginning of the test.
And pgbouncer spends all its time in context switches (CPU usage is
100%, mostly in kernel space: the top of the profile consists of kernel
functions).
The picture will be the same if, instead of pgbouncer, you do such
scheduling in one bgworker.
Even modern systems are not able to perform more than a few hundred
thousand context switches per second.
So with a single multiplexing thread or process you cannot get more
than ~100k TPS, while on a powerful NUMA system it is possible to achieve
millions of TPS.
This is illustrated by the results I sent in the previous mail: by
spawning 10 instances of pgbouncer I was able to get 7 times higher
speed.

AFAICS making pgbouncer multi-threaded would not be hugely complicated.
A simple solution would be a fixed number of worker threads, and client
connections randomly assigned to them.

But this generally is not a common bottleneck in practical workloads (of
course, YMMV).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15Claudio Freire
klaussfreire@gmail.com
In reply to: Konstantin Knizhnik (#11)
Re: Built-in connection pooling

On Fri, Jan 19, 2018 at 1:53 PM, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:

On 19.01.2018 19:28, Pavel Stehule wrote:

When I've been thinking about adding a built-in connection pool, my

rough plan was mostly "bgworker doing something like pgbouncer" (that
is, listening on a separate port and proxying everything to regular
backends). Obviously, that has pros and cons, and probably would not
serve the threading use case well.

And we will get the same problem as with pgbouncer: one process will not
be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers...
But in any case it is more efficient to multiplex sessions in the
backends themselves.

pgbouncer holds the client connection the whole time. When we implement
listeners, all the work can be done by worker processes, not by the listeners.

Sorry, I do not understand your point.
In my case pgbench establishes a connection to pgbouncer only once, at
the beginning of the test.
And pgbouncer spends all its time in context switches (CPU usage is
100%, mostly in kernel space: the top of the profile consists of kernel
functions).
The picture will be the same if, instead of pgbouncer, you do such
scheduling in one bgworker.
Even modern systems are not able to perform more than a few hundred
thousand context switches per second.
So with a single multiplexing thread or process you cannot get more
than ~100k TPS, while on a powerful NUMA system it is possible to achieve
millions of TPS.
This is illustrated by the results I sent in the previous mail: by
spawning 10 instances of pgbouncer I was able to get 7 times higher
speed.

I'm sure pgbouncer can be improved. I've seen async code handle millions of
packets per second (zmq), pgbouncer shouldn't be radically different.

#16Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Claudio Freire (#15)
Re: Built-in connection pooling

On 01/19/2018 06:03 PM, Claudio Freire wrote:

On Fri, Jan 19, 2018 at 1:53 PM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> wrote:

On 19.01.2018 19:28, Pavel Stehule wrote:

When I've been thinking about adding a built-in connection
pool, my
rough plan was mostly "bgworker doing something like
pgbouncer" (that
is, listening on a separate port and proxying everything
to regular
backends). Obviously, that has pros and cons, and probably
would not
serve the threading use case well.

And we will get the same problem as with pgbouncer: one process will not
be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers...
But in any case it is more efficient to multiplex sessions in the
backends themselves.

pgbouncer holds the client connection the whole time. When we implement
listeners, all the work can be done by worker processes, not by the
listeners.

Sorry, I do not understand your point.
In my case pgbench establishes a connection to pgbouncer only once, at
the beginning of the test.
And pgbouncer spends all its time in context switches (CPU usage is
100%, mostly in kernel space: the top of the profile consists of kernel
functions).
The picture will be the same if, instead of pgbouncer, you do such
scheduling in one bgworker.
Even modern systems are not able to perform more than a few hundred
thousand context switches per second.
So with a single multiplexing thread or process you cannot get more
than ~100k TPS, while on a powerful NUMA system it is possible to achieve
millions of TPS.
This is illustrated by the results I sent in the previous mail: by
spawning 10 instances of pgbouncer I was able to get 7 times higher
speed.

I'm sure pgbouncer can be improved. I've seen async code handle millions
of packets per second (zmq), pgbouncer shouldn't be radically different.

The trouble is pgbouncer is not handling individual packets. It needs to
do additional processing to assemble the messages, understand the state
of the connection (e.g. to do transaction pooling) etc. Or handle SSL.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Tomas Vondra (#12)
Re: Built-in connection pooling

On 19.01.2018 19:59, Tomas Vondra wrote:

The problem could be solved much more easily in a pthread version of
Postgres: reassigning a session to another executor (thread) can be done
much more easily there, and there is no need for the unportable trick of
passing a file descriptor to another process.
In the future I am going to combine the two. The problem is that the
pthread version of Postgres is still in a very raw state.

Yeah. Unfortunately, we're using processes now, and switching to threads
will take time (assuming it happens at all).

I have to agree with you.

3) Is there any sort of shrinking the pools? I mean, if the backend is
idle for certain period of time (or when we need backends for other
databases), does it get closed automatically?

When a client disconnects, its session is closed, but the backend is not
terminated even if there are no more sessions at this backend.
This was done intentionally, to avoid permanent spawning of new processes
when one or a few clients frequently connect/disconnect to the database.

Sure, but it means a short peak will exhaust the backends indefinitely.
That's acceptable for a PoC, but I think needs to be fixed eventually.

Sorry, I do not understand this.
You specify the size of the backend pool which will serve client
sessions, and that size is chosen to provide the best performance on the
particular system and workload.
So the number of backends will never exceed this optimal value, even in
case of a "short peak".
From my point of view, terminating backends when there are no active
sessions is the wrong idea in any case; it was not a temporary decision
just for the PoC.

Well, I haven't said it has to be single-threaded like pgbouncer. I
don't see why the bgworker could not use multiple threads internally (of
course, it'd need to be careful not to mess with the stuff that is not
thread-safe).

Certainly an architecture with N scheduling bgworkers and M executors
(backends) may be more flexible than one where scheduling is done in the
executor itself. But we would have to pay extra cost for the redirection.
I am not sure that it would ultimately allow us to reach better performance.
A more flexible solution in many cases doesn't mean a more efficient one.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#18Claudio Freire
klaussfreire@gmail.com
In reply to: Tomas Vondra (#16)
Re: Built-in connection pooling

On Fri, Jan 19, 2018 at 2:06 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:

On 01/19/2018 06:03 PM, Claudio Freire wrote:

On Fri, Jan 19, 2018 at 1:53 PM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> wrote:

On 19.01.2018 19:28, Pavel Stehule wrote:

When I've been thinking about adding a built-in connection
pool, my
rough plan was mostly "bgworker doing something like
pgbouncer" (that
is, listening on a separate port and proxying everything
to regular
backends). Obviously, that has pros and cons, and probably
would not
serve the threading use case well.

And we will get the same problem as with pgbouncer: one process will not
be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers...
But in any case it is more efficient to multiplex sessions in the
backends themselves.

pgbouncer holds the client connection the whole time. When we implement
listeners, all the work can be done by worker processes, not by the
listeners.

Sorry, I do not understand your point.
In my case pgbench establishes a connection to pgbouncer only once, at
the beginning of the test.
And pgbouncer spends all its time in context switches (CPU usage is
100%, mostly in kernel space: the top of the profile consists of kernel
functions).
The picture will be the same if, instead of pgbouncer, you do such
scheduling in one bgworker.
Even modern systems are not able to perform more than a few hundred
thousand context switches per second.
So with a single multiplexing thread or process you cannot get more
than ~100k TPS, while on a powerful NUMA system it is possible to achieve
millions of TPS.
This is illustrated by the results I sent in the previous mail: by
spawning 10 instances of pgbouncer I was able to get 7 times higher
speed.

I'm sure pgbouncer can be improved. I've seen async code handle millions
of packets per second (zmq), pgbouncer shouldn't be radically different.

The trouble is pgbouncer is not handling individual packets. It needs to
do additional processing to assemble the messages, understand the state
of the connection (e.g. to do transaction pooling) etc. Or handle SSL.

I understand. But zmq also has to process framing very similar to the fe
protocol, so I'm still hopeful.

#19Konstantin Knizhnik
k.knizhnik@postgrespro.ru
In reply to: Pavel Stehule (#13)
Re: Built-in connection pooling

On 19.01.2018 20:01, Pavel Stehule wrote:

2018-01-19 17:53 GMT+01:00 Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>>:

On 19.01.2018 19:28, Pavel Stehule wrote:

When I've been thinking about adding a built-in
connection pool, my
rough plan was mostly "bgworker doing something like
pgbouncer" (that
is, listening on a separate port and proxying everything
to regular
backends). Obviously, that has pros and cons, and
probably would not
serve the threading use case well.

And we will get the same problem as with pgbouncer: one process will not
be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers...
But in any case it is more efficient to multiplex sessions in the
backends themselves.

pgbouncer holds the client connection the whole time. When we implement
listeners, all the work can be done by worker processes, not by the
listeners.

Sorry, I do not understand your point.
In my case pgbench establishes a connection to pgbouncer only once, at
the beginning of the test.
And pgbouncer spends all its time in context switches (CPU usage is
100%, mostly in kernel space: the top of the profile consists of kernel
functions).
The picture will be the same if, instead of pgbouncer, you do such
scheduling in one bgworker.
Even modern systems are not able to perform more than a few hundred
thousand context switches per second.
So with a single multiplexing thread or process you cannot get more
than ~100k TPS, while on a powerful NUMA system it is possible to achieve
millions of TPS.
This is illustrated by the results I sent in the previous mail: by
spawning 10 instances of pgbouncer I was able to get 7 times higher
speed.

pgbouncer is proxy software. I don't think a native pooler should
be a proxy too. So comparing pgbouncer with a hypothetical native
pooler is not fair, because pgbouncer passes through all the
communication.

If we have separate scheduling bgworker(s), as Tomas proposed, then
in any case we will have to do some kind of redirection.
It can be done in a more efficient way than using Unix sockets (as
in the case of a locally installed pgbouncer), but even if we use a
shared memory queue, performance will be comparable and still
limited by the number of context switches. It is possible to
increase it by combining several requests into one parcel.
But that further complicates the communication protocol between
clients, scheduling proxies and executors.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#20Claudio Freire
klaussfreire@gmail.com
In reply to: Konstantin Knizhnik (#17)
Re: Built-in connection pooling

On Fri, Jan 19, 2018 at 2:07 PM, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:

Well, I haven't said it has to be single-threaded like pgbouncer. I
don't see why the bgworker could not use multiple threads internally (of
course, it'd need to take care not to mess with the stuff that is not
thread-safe).

Certainly an architecture with N scheduling bgworkers and M
executors (backends) may be more flexible
than a solution where scheduling is done in the executor itself. But we
will have to pay an extra cost for redirection.
I am not sure that it will ultimately allow us to reach better
performance.
A more flexible solution in many cases does not mean a more efficient
solution.

I think you can take the best of both worlds.

You can take your approach of passing around fds, and build a "load
balancing protocol" in a bgworker.

The postmaster sends the socket to the bgworker, the bgworker waits for a
command as pgbouncer does, but instead of proxying everything, when
commands arrive, it passes the socket to a backend to handle.

That way, the bgworker can do what pgbouncer does (handle different
pooling modes, match backends to databases, etc.), but it doesn't have
to proxy all the data: it just delegates handling of a command to a
backend and forgets about that socket.

Sounds like it could work.
