VACUUM FULL connection exhaustion
Hey folks,
I noticed something weird, and I'm not sure whether this is expected
behaviour in PostgreSQL.
So I am running Benchbase (a benchmark framework) with 50 terminals (50
concurrent connections).
There are 2-3 additional connections, one for a postgres-exporter container
for example.
So far so good, and with a `max_connections` at 100 there is no problem.
What happens is that if I manually execute `VACUUM FULL`, the connections
are exhausted.
I also tried this with `max_connections` at 150 to see if it just "doubles"
the current connections, but as it turned out, it still exhausted all the
connections until it reached `max_connections`.
This was cross-checked: the postgres-exporter could not connect, and I
could not connect manually with `psql` either.
Is this expected or is this a bug?
postgres-exporter logs:
```
sql: error: connection to server on socket
"/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: sorry, too many clients
already
```
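As a quick cross-check from a session that is still open (new connections are refused once the limit is hit, apart from `superuser_reserved_connections` slots), a query along these lines shows how close the server is to the limit:

```sql
-- Compare the current backend count against the configured limit
SELECT count(*) AS current_connections,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity;
```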
On Thu, Aug 8, 2024 at 5:18 AM Costa Alexoglou <costa@dbtune.com> wrote:
Hey folks,
I noticed something weird, and not sure if this is the expected behaviour
or not in PostgreSQL.
So I am running Benchbase (a benchmark framework) with 50 terminals (50
concurrent connections).
There are 2-3 additional connections, one for a postgres-exporter
container for example.
So far so good, and with a `max_connections` at 100 there is no problem.
What happens is that if I execute manually `VACUUM FULL`
Off-topic, but... WHY?? It almost certainly does not do what you think
it does, especially if it's just "VACUUM FULL;".
the connections are exhausted.
Connect to the relevant database and run this query. Don't disconnect, and
keep running it over and over again as you run the "VACUUM FULL;". That'll
tell you exactly what happens.
```
select pid
,datname as db
,application_name as app_name
,case
    when client_hostname is not null then client_hostname
    else client_addr::text
 end as client_name
,usename
,to_char((EXTRACT(epoch FROM now() - backend_start))/60.0, '99,999.00')
    as backend_min
,to_char(query_start, 'YYYY-MM-DD HH24:MI:SS.MS') as "Query Start"
,to_char((EXTRACT(epoch FROM now() - query_start))/60.0, '99,999.00')
    as qry_min
,to_char(xact_start, 'YYYY-MM-DD HH24:MI:SS.MS') as "Txn Start"
,to_char((EXTRACT(epoch FROM now() - xact_start))/60.0, '999.00')
    as txn_min
,state
,query
from pg_stat_activity
where pid != pg_backend_pid()
order by 6 desc;
```
Also tried this with 150 `max_connections` to see if it just “doubles” the
current connections, but as it turned out, it still exhausted all the
connections until it reached `max_connections`.
Double it again?
This was cross-checked, as the postgres-exporter could not connect, and I
manually was not allowed to connect with `psql`.
Is this expected or is this a bug?
Depends on what you set these to:
autovacuum_max_workers
max_parallel_maintenance_workers
max_parallel_workers
max_parallel_workers_per_gather
max_worker_processes
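The current values of all of these can be read from a live session, for example:

```sql
-- Current values of the worker- and connection-related settings above
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_max_workers',
               'max_parallel_maintenance_workers',
               'max_parallel_workers',
               'max_parallel_workers_per_gather',
               'max_worker_processes',
               'max_connections');
```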
--
Death to America, and butter sauce!
Iraq lobster...
On Thu, 8 Aug 2024 at 11:18, Costa Alexoglou <costa@dbtune.com> wrote:
...
So I am running Benchbase (a benchmark framework) with 50 terminals (50 concurrent connections).
There are 2-3 additional connections, one for a postgres-exporter container for example.
...
So far so good, and with a `max_connections` at 100 there is no problem. What happens is that if I execute manually `VACUUM FULL` the connections are exhausted.
Also tried this with 150 `max_connections` to see if it just “doubles” the current connections, but as it turned out, it still exhausted all the connections until it reached `max_connections`.
This was cross-checked, as the postgres-exporter could not connect, and I manually was not allowed to connect with `psql`.
Have you tried to check where the connections are coming from and what
they are doing? Apart from the max-parallel-worker stuff already
commented on by Ron, in a scenario with a long-lived locking process
(VACUUM FULL) combined with a potentially aggressive connector (a
benchmark tool), I would verify the benchmark tool is not timing out
and disconnecting improperly, leaving connections hung up.
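One way to see where the connections are coming from, sketched here as a grouping over `pg_stat_activity`, is:

```sql
-- Count backends grouped by origin and state to see who is piling up
SELECT application_name, client_addr, state, count(*)
FROM pg_stat_activity
GROUP BY 1, 2, 3
ORDER BY count(*) DESC;
```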
Francisco Olarte.
On Aug 7, 2024, at 10:34, Costa Alexoglou <costa@dbtune.com> wrote:
Hey folks,
I noticed something weird, and not sure if this is the expected behaviour or not in PostgreSQL.
So I am running Benchbase (a benchmark framework) with 50 terminals (50 concurrent connections).
There are 2-3 additional connections, one for a postgres-exporter container for example.
So far so good, and with a `max_connections` at 100 there is no problem. What happens is that if I execute manually `VACUUM FULL` the connections are exhausted.
VACUUM FULL takes an exclusive lock on the table that it is operating on. It's possible that a connection becomes blocked on that exclusive lock waiting for the VACUUM FULL to finish, the application sees the connection stopped and fires up another one (this is common in container-based applications), that one blocks... until all of the connections are full of queries waiting on that VACUUM FULL.
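If that is what is happening, the waiting backends should be visible while the VACUUM FULL runs; a query along these lines lists each blocked backend and the pids blocking it:

```sql
-- Backends waiting on a heavyweight lock, and which pids block them
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       state,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
```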
On Fri, 9 Aug 2024 at 02:12, Christophe Pettus <xof@thebuild.com> wrote:
VACUUM FULL takes an exclusive lock on the table that it is operating on. It's possible that a connection becomes blocked on that exclusive lock waiting for the VACUUM FULL to finish, the application sees the connection stopped and fires up another one (this is common in container-based applications), that one blocks... until all of the connections are full of queries waiting on that VACUUM FULL.
I also imagine this is the cause. One way to test would be to do:
BEGIN; LOCK TABLE <name of table>; and see if the connections pile up
in a similar way to when the VACUUM FULL command is used.
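Spelled out, the test would look something like this (`my_table` is a placeholder for one of the benchmark tables):

```sql
-- In one psql session, hold the same lock VACUUM FULL would take
BEGIN;
LOCK TABLE my_table IN ACCESS EXCLUSIVE MODE;
-- ...leave the transaction open while the benchmark runs,
-- then release the lock:
ROLLBACK;
```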
David
On Thu, Aug 8, 2024 at 10:12 AM Christophe Pettus <xof@thebuild.com> wrote:
On Aug 7, 2024, at 10:34, Costa Alexoglou <costa@dbtune.com> wrote:
Hey folks,
I noticed something weird, and not sure if this is the expected
behaviour or not in PostgreSQL.
So I am running Benchbase (a benchmark framework) with 50 terminals (50
concurrent connections).
There are 2-3 additional connections, one for a postgres-exporter
container for example.
So far so good, and with a `max_connections` at 100 there is no problem.
What happens is that if I execute manually `VACUUM FULL` the connections
are exhausted.
VACUUM FULL takes an exclusive lock on the table that it is operating on.
It's possible that a connection becomes blocked on that exclusive lock
waiting for the VACUUM FULL to finish, the application sees the connection
stopped and fires up another one (this is common in container-based
applications), that one blocks... until all of the connections are full of
queries waiting on that VACUUM FULL.
"I see a lock, so let's cause another one!" That's crazy.
--
Death to America, and butter sauce.
Iraq lobster!
On Aug 8, 2024, at 21:15, Ron Johnson <ronljohnsonjr@gmail.com> wrote:
"I see a lock, so let's cause another one!" That's crazy.
It's more "Oh, look, I need a connection to service this web request, but my pool is empty, so I'll just fire up a new connection to the server," lather, rinse, repeat. Pretty common these days, sadly.
On Fri, Aug 9, 2024 at 1:02 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 9 Aug 2024 at 02:12, Christophe Pettus <xof@thebuild.com> wrote:
VACUUM FULL takes an exclusive lock on the table that it is operating
on. It's possible that a connection becomes blocked on that exclusive lock
waiting for the VACUUM FULL to finish, the application sees the connection
stopped and fires up another one (this is common in container-based
applications), that one blocks... until all of the connections are full of
queries waiting on that VACUUM FULL.
I also imagine this is the cause. One way to test would be to do:
BEGIN; LOCK TABLE <name of table>; and see if the connections pile up
in a similar way to when the VACUUM FULL command is used.
David
Thanks, folks. David, that's a really straightforward way to test. I
validated this: when I lock the two tables involved in the benchmark, the
connections keep growing until they reach `max_connections`.
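For anyone reproducing this, one way to watch the pile-up live is from a separate `psql` session (`\watch` is a psql metacommand that re-runs the preceding query on an interval):

```sql
-- Count backends per state; sessions stuck behind the lock show up
-- as 'active' with a Lock wait event in pg_stat_activity
SELECT state, count(*) AS backends
FROM pg_stat_activity
GROUP BY state;
-- in psql, follow the query with:  \watch 2
```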