tcp keepalives not sent during long query

Started by Willy-Bas Loosover 3 years ago12 messagesgeneral
Jump to latest
#1Willy-Bas Loos
willybas@gmail.com

Hi!

Some users of our database have a NAT firewall and keep a postgres client
(e.g. pgAdmin ) open for hours. To prevent the connection from being killed
by the firewall due to inactivity, we configured tcp_keepalives_idle = 120
so that the server sends keepalives and keeps the connection active. (this
is on debian)

We've recently upgraded from postgres 9.3 to 13 and from debian 6 to 11.
I'm getting the complaint that since the upgrade, the connection breaks.
But only when they run a long query.

Has anything changed in postgres that might cause this? e.g. that
keepalives are only sent when the session is idle?

Thanks
--
Willy-Bas Loos

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Willy-Bas Loos (#1)
Re: tcp keepalives not sent during long query

On Wed, 2022-12-14 at 08:55 +0100, Willy-Bas Loos wrote:

Some users of our database have a NAT firewall and keep a postgres client (e.g. pgAdmin )
open for hours. To prevent the connection from being killed by the firewall due to inactivity,
we configured tcp_keepalives_idle = 120 so that the server sends keepalives and keeps the
connection active. (this is on debian)

We've recently upgraded from postgres 9.3 to 13 and from debian 6 to 11.
I'm getting the complaint that since the upgrade, the connection breaks. But only when they run a long query.

Has anything changed in postgres that might cause this? e.g. that keepalives are only sent when the session is idle?

It is the operating system kernel that sends keepalives, so that should be independent of
what the PostgreSQL backend is doing.

Yours,
Laurenz Albe

#3Willy-Bas Loos
willybas@gmail.com
In reply to: Laurenz Albe (#2)
Re: tcp keepalives not sent during long query

Thanks for your answer. I was afraid someone would say that...
I was hoping that the keepalives would be more of a matter of cooperation
between postgres and the OS.

On Wed, Dec 14, 2022 at 10:52 AM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

On Wed, 2022-12-14 at 08:55 +0100, Willy-Bas Loos wrote:

Some users of our database have a NAT firewall and keep a postgres

client (e.g. pgAdmin )

open for hours. To prevent the connection from being killed by the

firewall due to inactivity,

we configured tcp_keepalives_idle = 120 so that the server sends

keepalives and keeps the

connection active. (this is on debian)

We've recently upgraded from postgres 9.3 to 13 and from debian 6 to 11.
I'm getting the complaint that since the upgrade, the connection breaks.

But only when they run a long query.

Has anything changed in postgres that might cause this? e.g. that

keepalives are only sent when the session is idle?

It is the operating system kernel that sends keepalives, so that should be
independent of
what the PostgreSQL backend is doing.

Yours,
Laurenz Albe

--
Willy-Bas Loos

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Willy-Bas Loos (#3)
Re: tcp keepalives not sent during long query

Willy-Bas Loos <willybas@gmail.com> writes:

Thanks for your answer. I was afraid someone would say that...
I was hoping that the keepalives would be more of a matter of cooperation
between postgres and the OS.

No, we just apply the setting to the open socket and trust the OS
to do it.

Are you quite certain that you applied the configuration change to
your new installation?

It'd be worth doing

show tcp_keepalives_idle;

in one of the sessions where you are having trouble. And maybe
check the other keepalives settings too?

regards, tom lane

#5Willy-Bas Loos
willybas@gmail.com
In reply to: Tom Lane (#4)
Re: tcp keepalives not sent during long query

On Wed, Dec 14, 2022 at 6:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

It'd be worth doing

show tcp_keepalives_idle;

Wow, you're right! It's in the postgresql.conf but it isn't set when I

reload the server
A restart also doesn't do it and even doing SET tcp_keepalives_idle=120;
doesn't work.
It gives me a confirmation, but then when I SHOW the value, it gives me 0.

wbloos=# set tcp_keepalives_idle=120;
SET
wbloos=# show tcp_keepalives_idle;
tcp_keepalives_idle
---------------------
0
(1 row)

--
Willy-Bas Loos

#6Willy-Bas Loos
willybas@gmail.com
In reply to: Willy-Bas Loos (#5)
Re: tcp keepalives not sent during long query

The version is PostgreSQL 13.8 (Debian 13.8-0+deb11u1) on
x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110,
64-bit

#7Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Willy-Bas Loos (#5)
Re: tcp keepalives not sent during long query

On Thu, 2022-12-15 at 08:31 +0100, Willy-Bas Loos wrote:

On Wed, Dec 14, 2022 at 6:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

It'd be worth doing

show tcp_keepalives_idle;

Wow, you're right! It's in the postgresql.conf but it isn't set when I reload the server
A restart also doesn't do it and even doing SET tcp_keepalives_idle=120; doesn't work.
It gives me a confirmation, but then when I SHOW the value, it gives me 0.

wbloos=# set tcp_keepalives_idle=120;
SET
wbloos=# show tcp_keepalives_idle;
 tcp_keepalives_idle
---------------------
 0
(1 row)

One good way to debug this is

SELECT setting, source, sourcefile, sourceline
FROM pg_settings
WHERE name = 'tcp_keepalives_idle';

That will tell you from where you get the parameter value.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#8Willy-Bas Loos
willybas@gmail.com
In reply to: Laurenz Albe (#7)
Re: tcp keepalives not sent during long query

Nice query, i keep learning new stuff here.
Anyway, that shows the correct line (80) in the config file, but the wrong
value.
Namely 0, where the config file has 120

On Thu, Dec 15, 2022 at 12:37 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

On Thu, 2022-12-15 at 08:31 +0100, Willy-Bas Loos wrote:

On Wed, Dec 14, 2022 at 6:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

It'd be worth doing

show tcp_keepalives_idle;

Wow, you're right! It's in the postgresql.conf but it isn't set when I

reload the server

A restart also doesn't do it and even doing SET tcp_keepalives_idle=120;

doesn't work.

It gives me a confirmation, but then when I SHOW the value, it gives me

0.

wbloos=# set tcp_keepalives_idle=120;
SET
wbloos=# show tcp_keepalives_idle;
tcp_keepalives_idle
---------------------
0
(1 row)

One good way to debug this is

SELECT setting, source, sourcefile, sourceline
FROM pg_settings
WHERE name = 'tcp_keepalives_idle';

That will tell you from where you get the parameter value.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

--
Willy-Bas Loos

#9Geoff Winkless
pgsqladmin@geoff.dj
In reply to: Willy-Bas Loos (#5)
Re: tcp keepalives not sent during long query

On Thu, 15 Dec 2022 at 07:31, Willy-Bas Loos <willybas@gmail.com> wrote:

wbloos=# set tcp_keepalives_idle=120;
SET
wbloos=# show tcp_keepalives_idle;
tcp_keepalives_idle
---------------------
0

Are you connected in this psql session via tcp or unix domain socket?

"In sessions connected via a Unix-domain socket, this parameter is
ignored and always reads as zero."

Geoff

#10Willy-Bas Loos
willybas@gmail.com
In reply to: Geoff Winkless (#9)
Re: tcp keepalives not sent during long query

On Thu, Dec 15, 2022 at 2:04 PM Geoff Winkless <pgsqladmin@geoff.dj> wrote:

Are you connected in this psql session via tcp or unix domain socket?

Right, got me again. That was a Unix-domain socket.

When I do SHOW tcp_keepalives_idle; from pgAdmin it shows me 120, which is
correct.
Thanks for clarifying that.

So that means I still don't know why the connections are breaking.
I know that this could be anything, in any case not due to the postgres
server.

Our ISP has inspected the network traffic and indeed found empty TCP ACK
packages being sent back and forth to/from the user's IP, supposedly
keepalives.
I contacted the user and doublechecked their statement that they only have
the issue when running long queries. Turns out that this is not the case.
The connection also breaks on idle query windows only then they just
reconnect so it's not a problem.

The user now indicated that they can work around the issue by creating a
table as a result, instead of simply selecting the data to be displayed in
the client.
So we decided to cease our efforts to fix the issue.
Thanks a lot for your help!

--
Willy-Bas Loos

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Willy-Bas Loos (#5)
Re: tcp keepalives not sent during long query

Willy-Bas Loos <willybas@gmail.com> writes:

It gives me a confirmation, but then when I SHOW the value, it gives me 0.

wbloos=# set tcp_keepalives_idle=120;
SET
wbloos=# show tcp_keepalives_idle;
tcp_keepalives_idle
---------------------
0
(1 row)

That's the behavior I'd expect on a local (Unix-socket) connection
... you sure you're doing this from one of the problematic clients?

regards, tom lane

#12Willy-Bas Loos
willybas@gmail.com
In reply to: Tom Lane (#11)
Re: tcp keepalives not sent during long query

Yes exactly, Geoff Winkless pointed that out too.
I thought I'd found a cause for the breaking connections, but I hadn't.
Thanks a lot for your help!

On Thu, Dec 15, 2022 at 3:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Willy-Bas Loos <willybas@gmail.com> writes:

It gives me a confirmation, but then when I SHOW the value, it gives me

0.

wbloos=# set tcp_keepalives_idle=120;
SET
wbloos=# show tcp_keepalives_idle;
tcp_keepalives_idle
---------------------
0
(1 row)

That's the behavior I'd expect on a local (Unix-socket) connection
... you sure you're doing this from one of the problematic clients?

regards, tom lane

--
Willy-Bas Loos