too many clients already

Started by Abraham, Dannyabout 6 years ago12 messagesgeneral
Jump to latest
#1Abraham, Danny
danny_abraham@bmc.com

Hi,

Will appreciate a hint here.

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL: sorry, too many clients already"
and transient difficulty to log in.

Happens on all PG versions (Tested 9.5,10.4,11.5)

Big installation: max_connections is 1200, shared_buffers is 2GB

But .. select count(*) from pg_stat_activity is only 66.

Thanks

Danny

#2Rob Sargent
robjsargent@gmail.com
In reply to: Abraham, Danny (#1)
Re: too many clients already

On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:

Hi,

Will appreciate a hint here.

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL: sorry, too many clients already"
and transient difficulty to log in.

Happens on all PG versions (Tested 9.5,10.4,11.5)

Big installation: max_connections is 1200, shared_buffers is 2GB

But .. select count(*) from pg_stat_activity is only 66.

Thanks

Danny

Lots of idle, kept-alive clients? Do you have a connection pooler (e.g. pg-bouncer)?

#3Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Abraham, Danny (#1)
Re: too many clients already

On 4/2/20 8:06 AM, Abraham, Danny wrote:

Hi,

Will appreciate a hint here.

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL: sorry, too many clients already"
and transient difficulty to log in.

Happens on all PG versions (Tested 9.5,10.4,11.5)

Big installation: max_connections is 1200, shared_buffers is 2GB

But .. select count(*) from pg_stat_activity is only 66.

On the chance pg_stat_activity is lying to you what does:

ps ax | grep post

show for backends?

For example:

ps ax | grep post
1217 ? Ss 0:00 /usr/lib/postfix/bin//master -w
1233 ? S 0:00 /usr/local/pgsql12/bin/postmaster -D
/usr/local/pgsql12/data
1253 ? Ss 0:00 postgres: logger
1277 ? Ss 0:00 postgres: checkpointer
1278 ? Ss 0:00 postgres: background writer
1279 ? Ss 0:00 postgres: walwriter
1280 ? Ss 0:00 postgres: autovacuum launcher
1281 ? Ss 0:00 postgres: stats collector
1282 ? Ss 0:00 postgres: logical replication launcher
*4693 ? Ss 0:00 postgres: aklaver task_manager [local] idle
*4907 ? Ss 0:00 postgres: aklaver production [local] idle

Thanks

Danny

--
Adrian Klaver
adrian.klaver@aklaver.com

#4Abraham, Danny
danny_abraham@bmc.com
In reply to: Rob Sargent (#2)
RE: Re: too many clients already

No pg-bouncer or connection pooling.
ps -elf | grep postgres | grep idle | wc -l ==> 61

and BTW: Running, say 500 one command psql in parallel will have the same affect..

-----Original Message-----
From: Rob Sargent <robjsargent@gmail.com>
Sent: Thursday, April 02, 2020 6:10 PM
To: Abraham, Danny <danny_abraham@bmc.com>
Cc: pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:

Hi,

Will appreciate a hint here.

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL: sorry, too many clients already"
and transient difficulty to log in.

Happens on all PG versions (Tested 9.5,10.4,11.5)

Big installation: max_connections is 1200, shared_buffers is 2GB

But .. select count(*) from pg_stat_activity is only 66.

Thanks

Danny

Lots of idle, kept-alive clients? Do you have a connection pooler (e.g. pg-bouncer)?

#5Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Abraham, Danny (#4)
Re: too many clients already

On 4/2/20 8:22 AM, Abraham, Danny wrote:

No pg-bouncer or connection pooling.
ps -elf | grep postgres | grep idle | wc -l ==> 61

and BTW: Running, say 500 one command psql in parallel will have the same affect..

Hmm. In psql on the cluster in question what does below return?:

show max_connections;

-----Original Message-----
From: Rob Sargent <robjsargent@gmail.com>
Sent: Thursday, April 02, 2020 6:10 PM
To: Abraham, Danny <danny_abraham@bmc.com>
Cc: pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:

Hi,

Will appreciate a hint here.

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL: sorry, too many clients already"
and transient difficulty to log in.

Happens on all PG versions (Tested 9.5,10.4,11.5)

Big installation: max_connections is 1200, shared_buffers is 2GB

But .. select count(*) from pg_stat_activity is only 66.

Thanks

Danny

Lots of idle, kept-alive clients? Do you have a connection pooler (e.g. pg-bouncer)?

--
Adrian Klaver
adrian.klaver@aklaver.com

#6Abraham, Danny
danny_abraham@bmc.com
In reply to: Adrian Klaver (#5)
RE: Re: too many clients already

Big installation: max_connections is 1200, shared_buffers is 2GB

-----Original Message-----
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, April 02, 2020 6:30 PM
To: Abraham, Danny <danny_abraham@bmc.com>; pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On 4/2/20 8:22 AM, Abraham, Danny wrote:

No pg-bouncer or connection pooling.
ps -elf | grep postgres | grep idle | wc -l ==> 61

and BTW: Running, say 500 one command psql in parallel will have the same affect..

Hmm. In psql on the cluster in question what does below return?:

show max_connections;

-----Original Message-----
From: Rob Sargent <robjsargent@gmail.com>
Sent: Thursday, April 02, 2020 6:10 PM
To: Abraham, Danny <danny_abraham@bmc.com>
Cc: pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:

Hi,

Will appreciate a hint here.

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL: sorry, too many clients already"
and transient difficulty to log in.

Happens on all PG versions (Tested 9.5,10.4,11.5)

Big installation: max_connections is 1200, shared_buffers is 2GB

But .. select count(*) from pg_stat_activity is only 66.

Thanks

Danny

Lots of idle, kept-alive clients? Do you have a connection pooler (e.g. pg-bouncer)?

--
Adrian Klaver
adrian.klaver@aklaver.com

#7Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Abraham, Danny (#6)
Re: too many clients already

On 4/2/20 8:35 AM, Abraham, Danny wrote:

Big installation: max_connections is 1200, shared_buffers is 2GB

Have you confirmed that the above is actually in effect by doing?:

show max_connections;

-----Original Message-----
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, April 02, 2020 6:30 PM
To: Abraham, Danny <danny_abraham@bmc.com>; pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On 4/2/20 8:22 AM, Abraham, Danny wrote:

No pg-bouncer or connection pooling.
ps -elf | grep postgres | grep idle | wc -l ==> 61

and BTW: Running, say 500 one command psql in parallel will have the same affect..

Hmm. In psql on the cluster in question what does below return?:

show max_connections;

-----Original Message-----
From: Rob Sargent <robjsargent@gmail.com>
Sent: Thursday, April 02, 2020 6:10 PM
To: Abraham, Danny <danny_abraham@bmc.com>
Cc: pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:

Hi,

Will appreciate a hint here.

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL: sorry, too many clients already"
and transient difficulty to log in.

Happens on all PG versions (Tested 9.5,10.4,11.5)

Big installation: max_connections is 1200, shared_buffers is 2GB

But .. select count(*) from pg_stat_activity is only 66.

Thanks

Danny

Lots of idle, kept-alive clients? Do you have a connection pooler (e.g. pg-bouncer)?

--
Adrian Klaver
adrian.klaver@aklaver.com

#8Abraham, Danny
danny_abraham@bmc.com
In reply to: Adrian Klaver (#7)
RE: Re: too many clients already

va-tlv-ctm-qa22.isr.bmc.com% sql
psql: FATAL: sorry, too many clients already
va-tlv-ctm-qa22.isr.bmc.com% sql
psql (11.5)
Type "help" for help.

ctrlmdb=> show max_connections;
max_connections
-----------------
1200
(1 row)

ctrlmdb=> show shared_buffers;
shared_buffers
----------------
2000MB
(1 row)

-----Original Message-----
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, April 02, 2020 6:37 PM
To: Abraham, Danny <danny_abraham@bmc.com>; pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On 4/2/20 8:35 AM, Abraham, Danny wrote:

Big installation: max_connections is 1200, shared_buffers is 2GB

Have you confirmed that the above is actually in effect by doing?:

show max_connections;

-----Original Message-----
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, April 02, 2020 6:30 PM
To: Abraham, Danny <danny_abraham@bmc.com>; pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On 4/2/20 8:22 AM, Abraham, Danny wrote:

No pg-bouncer or connection pooling.
ps -elf | grep postgres | grep idle | wc -l ==> 61

and BTW: Running, say 500 one command psql in parallel will have the same affect..

Hmm. In psql on the cluster in question what does below return?:

show max_connections;

-----Original Message-----
From: Rob Sargent <robjsargent@gmail.com>
Sent: Thursday, April 02, 2020 6:10 PM
To: Abraham, Danny <danny_abraham@bmc.com>
Cc: pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

On Apr 2, 2020, at 9:06 AM, Abraham, Danny <danny_abraham@bmc.com> wrote:

Hi,

Will appreciate a hint here.

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL: sorry, too many clients already"
and transient difficulty to log in.

Happens on all PG versions (Tested 9.5,10.4,11.5)

Big installation: max_connections is 1200, shared_buffers is 2GB

But .. select count(*) from pg_stat_activity is only 66.

Thanks

Danny

Lots of idle, kept-alive clients? Do you have a connection pooler (e.g. pg-bouncer)?

--
Adrian Klaver
adrian.klaver@aklaver.com

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Abraham, Danny (#1)
Re: too many clients already

"Abraham, Danny" <danny_abraham@bmc.com> writes:

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL: sorry, too many clients already"
and transient difficulty to log in.
Happens on all PG versions (Tested 9.5,10.4,11.5)
Big installation: max_connections is 1200, shared_buffers is 2GB
But .. select count(*) from pg_stat_activity is only 66.

I'd be suspicious that there are a lot of clients stuck in connection
startup (likely the authentication phase); those connections aren't going
to show in pg_stat_activity until they finish connecting. The "ps"
suggestion Adrian gave you would not show them either, because they're
not going to say "idle".

Enabling log_connections and watching the postmaster log would help
prove or disprove that theory.

regards, tom lane

#10Abraham, Danny
danny_abraham@bmc.com
In reply to: Tom Lane (#9)
RE: Re: too many clients already

Well, I guess the questions is - how do I optimize PG for a stream of very short life checks...
See below:

2020-04-02 11:05:37.010 CDTLOG: connection received: host=10.64.72.157 port=45799
2020-04-02 11:05:37.014 CDTLOG: connection received: host=10.64.72.157 port=45814
2020-04-02 11:05:37.014 CDTLOG: connection received: host=10.64.72.157 port=45813
2020-04-02 11:05:37.018 CDTFATAL: sorry, too many clients already
2020-04-02 11:05:37.015 CDTLOG: connection received: host=10.64.72.157 port=45815
2020-04-02 11:05:37.015 CDTLOG: connection received: host=10.64.72.157 port=45817
2020-04-02 11:05:37.015 CDTLOG: connection received: host=10.64.72.157 port=45809
2020-04-02 11:05:37.015 CDTLOG: connection received: host=10.64.72.157 port=45818
2020-04-02 11:05:37.016 CDTLOG: connection received: host=10.64.72.157 port=45819
2020-04-02 11:05:37.021 CDTFATAL: sorry, too many clients already
2020-04-02 11:05:37.021 CDTFATAL: sorry, too many clients already
2020-04-02 11:05:37.021 CDTFATAL: sorry, too many clients already
2020-04-02 11:05:37.021 CDTFATAL: sorry, too many clients already
2020-04-02 11:05:37.021 CDTFATAL: sorry, too many clients already
2020-04-02 11:05:37.022 CDTFATAL: sorry, too many clients already
2020-04-02 11:05:37.022 CDTFATAL: sorry, too many clients already
-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Thursday, April 02, 2020 6:52 PM
To: Abraham, Danny <danny_abraham@bmc.com>
Cc: pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

"Abraham, Danny" <danny_abraham@bmc.com> writes:

Running on a big and stressed AIX platform and receiving lots of "CDTFATAL: sorry, too many clients already"
and transient difficulty to log in.
Happens on all PG versions (Tested 9.5,10.4,11.5) Big installation:
max_connections is 1200, shared_buffers is 2GB
But .. select count(*) from pg_stat_activity is only 66.

I'd be suspicious that there are a lot of clients stuck in connection startup (likely the authentication phase); those connections aren't going to show in pg_stat_activity until they finish connecting. The "ps"
suggestion Adrian gave you would not show them either, because they're not going to say "idle".

Enabling log_connections and watching the postmaster log would help prove or disprove that theory.

regards, tom lane

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Abraham, Danny (#10)
Re: too many clients already

"Abraham, Danny" <danny_abraham@bmc.com> writes:

Well, I guess the questions is - how do I optimize PG for a stream of very short life checks...

You should be using a connection pooler for a load like that.
PG backends are fairly heavyweight things --- you don't want
to fire one up for just a single query, at least not when
there are many such queries per second.

I think pgbouncer and pgpool are the most widely used options,
but this is a bit outside my expertise.

regards, tom lane

#12Abraham, Danny
danny_abraham@bmc.com
In reply to: Tom Lane (#11)
RE: Re: too many clients already

Agree.

I suspect that this is a mal configured pgpool - the developer thinks that the pool is reusing connections,
While it is, in fact, reopening them.

-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Thursday, April 02, 2020 7:40 PM
To: Abraham, Danny <danny_abraham@bmc.com>
Cc: pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: too many clients already

"Abraham, Danny" <danny_abraham@bmc.com> writes:

Well, I guess the questions is - how do I optimize PG for a stream of very short life checks...

You should be using a connection pooler for a load like that.
PG backends are fairly heavyweight things --- you don't want to fire one up for just a single query, at least not when there are many such queries per second.

I think pgbouncer and pgpool are the most widely used options, but this is a bit outside my expertise.

regards, tom lane