Vacuum full connection exhaustion

Started by Costa Alexoglou, over 1 year ago. 8 messages, in general.
#1 Costa Alexoglou
costa@dbtune.com

Hey folks,

I noticed something weird, and not sure if this is the expected behaviour
or not in PostgreSQL.

So I am running Benchbase (a benchmark framework) with 50 terminals (50
concurrent connections).
There are 2-3 additional connections, one for a postgres-exporter container
for example.

So far so good, and with a `max_connections` at 100 there is no problem.
What happens is that if I manually execute `VACUUM FULL`, the connections
are exhausted.

I also tried this with `max_connections` at 150 to see if it just “doubles”
the current connections, but as it turned out, the connections were still
exhausted until `max_connections` was reached.

This was cross-checked: the postgres-exporter could not connect, and I was
also unable to connect manually with `psql`.

Is this expected or is this a bug?

postgres-exporter logs:
```
sql: error: connection to server on socket
"/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: sorry, too many clients
already
```
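For anyone reproducing this, one way to watch connection usage while the benchmark runs is to compare the live backend count against the configured limit (a generic sketch, not part of the original report; run it from a session opened before connections are exhausted):

```sql
-- Current backend count vs. the configured connection limit.
SELECT count(*)                                        AS current_connections,
       current_setting('max_connections')::int         AS max_connections,
       current_setting('superuser_reserved_connections')::int AS reserved
FROM pg_stat_activity;
```

Note that a few slots are reserved for superusers, so non-superuser clients hit the wall slightly before `count(*)` reaches `max_connections`.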

#2 Ron
ronljohnsonjr@gmail.com
In reply to: Costa Alexoglou (#1)
Re: Vacuum full connection exhaustion

On Thu, Aug 8, 2024 at 5:18 AM Costa Alexoglou <costa@dbtune.com> wrote:

Hey folks,

I noticed something weird, and not sure if this is the expected behaviour
or not in PostgreSQL.

So I am running Benchbase (a benchmark framework) with 50 terminals (50
concurrent connections).
There are 2-3 additional connections, one for a postgres-exporter
container for example.

So far so good, and with a `max_connections` at 100 there is no problem.
What happens is that if I execute manually `VACUUM FULL`

Off-topic, but... *WHY??* It almost certainly does not do what you think
it does, especially if it's just a bare `VACUUM FULL;`

the connections are exhausted.

Connect to the relevant database and run this query. Don't disconnect, and
keep running it over and over again as you run the "VACUUM FULL;". That'll
tell you exactly what happens.
select pid
      ,datname as db
      ,application_name as app_name
      ,case
           when client_hostname is not null then client_hostname
           else client_addr::text
       end as client_name
      ,usename
      ,to_char((extract(epoch from now() - backend_start))/60.0, '99,999.00') as backend_min
      ,to_char(query_start, 'YYYY-MM-DD HH24:MI:SS.MS') as "Query Start"
      ,to_char((extract(epoch from now() - query_start))/60.0, '99,999.00') as qry_min
      ,to_char(xact_start, 'YYYY-MM-DD HH24:MI:SS.MS') as "Txn Start"
      ,to_char((extract(epoch from now() - xact_start))/60.0, '999.00') as txn_min
      ,state
      ,query
from pg_stat_activity
where pid != pg_backend_pid()
order by 6 desc;

Also tried this with 150 `max_connections` to see if it just “doubles” the
current connections, but as it turned out, it still exhausted all the
connections until it reached `max_connections`.

Double it again?

This was cross-checked, as the postgres-exporter could not connect, and I
manually was not allowed to connect with `psql`.

Is this expected or is this a bug?

Depends on what you set these to:
autovacuum_max_workers
max_parallel_maintenance_workers
max_parallel_workers
max_parallel_workers_per_gather
max_worker_processes
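The settings above can all be checked in one query via the `pg_settings` view; a minimal sketch:

```sql
-- Inspect the worker-related settings listed above.
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_max_workers',
               'max_parallel_maintenance_workers',
               'max_parallel_workers',
               'max_parallel_workers_per_gather',
               'max_worker_processes');
```

Note, though, that background workers come out of `max_worker_processes`, not out of `max_connections`, so these settings explain extra server processes rather than exhausted client slots.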

--
Death to America, and butter sauce!
Iraq lobster...

#3 Francisco Olarte
folarte@peoplecall.com
In reply to: Costa Alexoglou (#1)
Re: Vacuum full connection exhaustion

On Thu, 8 Aug 2024 at 11:18, Costa Alexoglou <costa@dbtune.com> wrote:
...

So I am running Benchbase (a benchmark framework) with 50 terminals (50 concurrent connections).
There are 2-3 additional connections, one for a postgres-exporter container for example.

...

So far so good, and with a `max_connections` at 100 there is no problem. What happens is that if I execute manually `VACUUM FULL` the connections are exhausted.
Also tried this with 150 `max_connections` to see if it just “doubles” the current connections, but as it turned out, it still exhausted all the connections until it reached `max_connections`.
This was cross-checked, as the postgres-exporter could not connect, and I manually was not allowed to connect with `psql`.

Have you tried to check where the connections are coming from and what
they are doing? Apart from the max_parallel_workers settings already
mentioned by Ron, in a scenario that combines a long-lived locking process
(VACUUM FULL) with a potentially aggressive connector (a benchmark tool),
I would verify that the benchmark tool is not timing out and disconnecting
improperly, leaving connections hung up.

Francisco Olarte.

#4 Christophe Pettus
xof@thebuild.com
In reply to: Costa Alexoglou (#1)
Re: Vacuum full connection exhaustion

On Aug 7, 2024, at 10:34, Costa Alexoglou <costa@dbtune.com> wrote:

Hey folks,

I noticed something weird, and not sure if this is the expected behaviour or not in PostgreSQL.

So I am running Benchbase (a benchmark framework) with 50 terminals (50 concurrent connections).
There are 2-3 additional connections, one for a postgres-exporter container for example.

So far so good, and with a `max_connections` at 100 there is no problem. What happens is that if I execute manually `VACUUM FULL` the connections are exhausted.

VACUUM FULL takes an exclusive lock on the table that it is operating on. It's possible that a connection becomes blocked on that exclusive lock waiting for the VACUUM FULL to finish, the application sees the connection stopped and fires up another one (this is common in container-based applications), that one blocks... until all of the connections are full of queries waiting on that VACUUM FULL.
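If this is what is happening, the pile-up is directly visible while the `VACUUM FULL` runs; a minimal sketch using `pg_blocking_pids()` (available since PostgreSQL 9.6):

```sql
-- List sessions waiting on a lock, and which backend(s) block each one.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       wait_event,
       state,
       left(query, 60)       AS query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
```

With the hypothesis above, every row's `blocked_by` array would contain the PID of the backend running the `VACUUM FULL`.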

#5 David Rowley
dgrowleyml@gmail.com
In reply to: Christophe Pettus (#4)
Re: Vacuum full connection exhaustion

On Fri, 9 Aug 2024 at 02:12, Christophe Pettus <xof@thebuild.com> wrote:

VACUUM FULL takes an exclusive lock on the table that it is operating on. It's possible that a connection becomes blocked on that exclusive lock waiting for the VACUUM FULL to finish, the application sees the connection stopped and fires up another one (this is common in container-based applications), that one blocks... until all of the connections are full of queries waiting on that VACUUM FULL.

I also imagine this is the cause. One way to test would be to do:
BEGIN; LOCK TABLE <name of table>; and see if the connections pile up
in a similar way to when the VACUUM FULL command is used.
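Spelled out, the test looks like this (using a hypothetical table name `mytable`; `VACUUM FULL` takes an ACCESS EXCLUSIVE lock, so `LOCK TABLE` here mimics it exactly):

```sql
-- Session 1: take the same ACCESS EXCLUSIVE lock that VACUUM FULL would,
-- and hold it by leaving the transaction open.
BEGIN;
LOCK TABLE mytable IN ACCESS EXCLUSIVE MODE;
-- ...do not COMMIT or ROLLBACK yet...

-- Session 2 (run repeatedly while the benchmark is active):
-- count the backends stuck waiting on a lock.
SELECT count(*) AS waiting_backends
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
```

If `waiting_backends` climbs the same way it does during `VACUUM FULL`, the lock, not the vacuum itself, is what exhausts the connections.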

David

#6 Ron
ronljohnsonjr@gmail.com
In reply to: Christophe Pettus (#4)
Re: Vacuum full connection exhaustion

On Thu, Aug 8, 2024 at 10:12 AM Christophe Pettus <xof@thebuild.com> wrote:

On Aug 7, 2024, at 10:34, Costa Alexoglou <costa@dbtune.com> wrote:

Hey folks,

I noticed something weird, and not sure if this is the expected

behaviour or not in PostgreSQL.

So I am running Benchbase (a benchmark framework) with 50 terminals (50

concurrent connections).

There are 2-3 additional connections, one for a postgres-exporter

container for example.

So far so good, and with a `max_connections` at 100 there is no problem.

What happens is that if I execute manually `VACUUM FULL` the connections
are exhausted.

VACUUM FULL takes an exclusive lock on the table that it is operating on.
It's possible that a connection becomes blocked on that exclusive lock
waiting for the VACUUM FULL to finish, the application sees the connection
stopped and fires up another one (this is common in container-based
applications), that one blocks... until all of the connections are full of
queries waiting on that VACUUM FULL.

"I see a lock, so let's cause another one!" That's crazy.


#7 Christophe Pettus
xof@thebuild.com
In reply to: Ron (#6)
Re: Vacuum full connection exhaustion

On Aug 8, 2024, at 21:15, Ron Johnson <ronljohnsonjr@gmail.com> wrote:

"I see a lock, so let's cause another one!" That's crazy.

It's more "Oh, look, I need a connection to service this web request, but my pool is empty, so I'll just fire up a new connection to the server," lather, rinse, repeat. Pretty common these days, sadly.

#8 Costa Alexoglou
costa@dbtune.com
In reply to: David Rowley (#5)
Re: Vacuum full connection exhaustion

On Fri, Aug 9, 2024 at 1:02 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 9 Aug 2024 at 02:12, Christophe Pettus <xof@thebuild.com> wrote:

VACUUM FULL takes an exclusive lock on the table that it is operating

on. It's possible that a connection becomes blocked on that exclusive lock
waiting for the VACUUM FULL to finish, the application sees the connection
stopped and fires up another one (this is common in container-based
applications), that one blocks... until all of the connections are full of
queries waiting on that VACUUM FULL.

I also imagine this is the cause. One way to test would be to do:
BEGIN; LOCK TABLE <name of table>; and see if the connections pile up
in a similar way to when the VACUUM FULL command is used.

David

Thanks, folks. David, that's a really straightforward way to test. I
validated this: when I lock the two tables involved in the benchmark, the
connections keep growing until they reach `max_connections`.