BUG #17776: Connections are terminated unexpectedly sometimes

Started by PG Bug reporting form, about 3 years ago, 5 messages, bugs
#1 PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17776
Logged by: Francesco Tagliani
Email address: fran.tm213@gmail.com
PostgreSQL version: 14.6
Operating system: Ubuntu 20
Description:

I am running a Ruby on Rails app with PostgreSQL 14.6 on Ubuntu 20.
For some reason, connections are terminated unexpectedly while a
background job is running.
The background job requires around 100~200 connections over
6~7 hours.

I've installed PgHero and configured PostgreSQL based on
PGTune (https://pgtune.leopard.in.ua/).
This is the current configuration:

# DB Version: 14
# OS Type: linux
# DB Type: web
# Total Memory (RAM): 48 GB
# CPUs num: 12
# Connections num: 1000
# Data Storage: hdd

max_connections = 1000
shared_buffers = 12GB
effective_cache_size = 36GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 4
effective_io_concurrency = 2
work_mem = 3145kB
min_wal_size = 1GB
max_wal_size = 4GB
max_worker_processes = 12
max_parallel_workers_per_gather = 4
max_parallel_workers = 12
max_parallel_maintenance_workers = 4

I've checked the PostgreSQL log around that time:

2023-02-06 06:45:41.702 UTC [341438] user@db_production LOG: could not
receive data from client: Connection reset by peer
2023-02-06 08:26:56.134 UTC [352190] user@db_production LOG: could not
receive data from client: Connection reset by peer
2023-02-06 08:26:56.134 UTC [352191] user@db_production LOG: could not
receive data from client: Connection reset by peer

I cannot find any other related log entries.
Can you please advise me on how to debug or fix this issue?

Thanks

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: PG Bug reporting form (#1)
Re: BUG #17776: Connections are terminated unexpectedly sometimes

PG Bug reporting form <noreply@postgresql.org> writes:

I am running Ruby on Rails app with postgresql 14.6 on ubuntu 20.
For some reason, the connections are terminated unexpectedly while running a
background job.
Regarding background job, it requires around 100~200 connections over
6~7hrs.
I've checked postgresql log around that time.
2023-02-06 06:45:41.702 UTC [341438] user@db_production LOG: could not
receive data from client: Connection reset by peer
2023-02-06 08:26:56.134 UTC [352190] user@db_production LOG: could not
receive data from client: Connection reset by peer
2023-02-06 08:26:56.134 UTC [352191] user@db_production LOG: could not
receive data from client: Connection reset by peer

Looks like something in your network infrastructure is timing out
and dropping idle connections too quickly. If you can't fix the
actual problem, Postgres' TCP-timeout settings might provide a
workaround by maintaining an illusion that the connection is
busy. See

tcp_keepalives_count
tcp_keepalives_idle
tcp_keepalives_interval
tcp_user_timeout

regards, tom lane
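
As a rough sketch of how the first three settings interact, assuming Linux keepalive semantics: the first probe is sent after `tcp_keepalives_idle` seconds of silence, further probes follow every `tcp_keepalives_interval` seconds, and the connection is considered dead after `tcp_keepalives_count` unanswered probes. The helper names and the 240-second idle cutoff below are hypothetical and only illustrate the arithmetic; they are not values recommended anywhere in this thread.

```python
# Sketch only: helper names and sample numbers are illustrative assumptions.


def dead_peer_detection_seconds(idle: int, interval: int, count: int) -> int:
    """Approximate time to notice a dead peer: the idle wait plus all unanswered probes."""
    return idle + interval * count


def keeps_connection_warm(idle: int, middlebox_idle_timeout: int) -> bool:
    """True if the first probe goes out before a middlebox would drop the idle flow.

    In postgresql.conf, 0 means "use the system default", so 0 is treated
    here as "not known to be warm".
    """
    return 0 < idle < middlebox_idle_timeout


def render_conf(idle: int, interval: int, count: int) -> str:
    """Render the three keepalive settings as postgresql.conf lines."""
    return "\n".join([
        f"tcp_keepalives_idle = {idle}",
        f"tcp_keepalives_interval = {interval}",
        f"tcp_keepalives_count = {count}",
    ])


if __name__ == "__main__":
    # Linux kernel defaults: 7200 s idle, 75 s interval, 9 probes.
    print(dead_peer_detection_seconds(7200, 75, 9))
    # Hypothetical network that drops flows after 240 s of silence.
    print(keeps_connection_warm(7200, 240))
    print(keeps_connection_warm(60, 240))
    print(render_conf(60, 10, 6))
```

With the kernel defaults, the first probe would only be sent after two hours, so any device that silently drops idle flows sooner than that would win the race. Lowering `tcp_keepalives_idle` below the suspected cutoff is the part that keeps the connection looking busy.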

#3 Francesco Tagliani
fran.tm213@gmail.com
In reply to: Tom Lane (#2)
Re: BUG #17776: Connections are terminated unexpectedly sometimes

Thanks for the tip, Tom!

I've checked the four settings you suggested, and they're not configured.
That means they're at their defaults.

# - TCP settings -
# see "man tcp" for details

#tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
# 0 selects the system default
#tcp_keepalives_count = 0 # TCP_KEEPCNT;
# 0 selects the system default
#tcp_user_timeout = 0 # TCP_USER_TIMEOUT, in milliseconds;
# 0 selects the system default

#client_connection_check_interval = 0 # time between checks for client
# disconnection while running queries;
# 0 for never
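
Since 0 selects the system default, it can help to see what that default actually is. The following is a minimal sketch, assuming a Linux host where the kernel-wide keepalive defaults are exposed under `/proc/sys/net/ipv4`; the function name and the mapping dictionary are made up for illustration. `tcp_user_timeout` is left out because this sketch only covers the three keepalive sysctls.

```python
from pathlib import Path
from typing import Dict, Optional

# Linux exposes the kernel-wide keepalive defaults under this directory.
PROC_DIR = Path("/proc/sys/net/ipv4")

# postgresql.conf setting -> kernel sysctl file that supplies its default
SYSCTL_FILES = {
    "tcp_keepalives_idle": "tcp_keepalive_time",
    "tcp_keepalives_interval": "tcp_keepalive_intvl",
    "tcp_keepalives_count": "tcp_keepalive_probes",
}


def system_keepalive_defaults(proc_dir: Path = PROC_DIR) -> Dict[str, Optional[int]]:
    """Return the kernel default behind each setting, or None if it cannot be read."""
    defaults: Dict[str, Optional[int]] = {}
    for setting, filename in SYSCTL_FILES.items():
        try:
            defaults[setting] = int((proc_dir / filename).read_text().strip())
        except (OSError, ValueError):
            # Missing file (non-Linux host, restricted container) or odd content.
            defaults[setting] = None
    return defaults


if __name__ == "__main__":
    for name, value in system_keepalive_defaults().items():
        print(f"{name}: {value}")
```

Run on the database server itself, this prints the values that apply whenever the corresponding setting is left at 0.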

Do you have any recommended values for them?

I found some logs I don't understand.

```
2023-02-05 00:37:16.858 UTC [1739876] [unknown]@[unknown] FATAL:
unsupported frontend protocol 65363.19778: server supports 3.0 to 3.0
2023-02-06 12:26:48.844 UTC [467480] [unknown]@[unknown] FATAL:
unsupported frontend protocol 16.0: server supports 3.0 to 3.0
2023-02-06 02:17:08.205 UTC [257863] [unknown]@[unknown] FATAL:
unsupported frontend protocol 65363.19778: server supports 3.0 to 3.0
2023-02-06 06:37:00.411 UTC [345467] [unknown]@[unknown] FATAL:
unsupported frontend protocol 0.0: server supports 3.0 to 3.0
2023-02-06 06:37:00.679 UTC [345468] [unknown]@[unknown] FATAL:
unsupported frontend protocol 255.255: server supports 3.0 to 3.0
```
Do you have any suggestions for those logs?

Thanks


#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Francesco Tagliani (#3)
Re: BUG #17776: Connections are terminated unexpectedly sometimes

Francesco Tagliani <fran.tm213@gmail.com> writes:

I found some logs I don't understand.

```
2023-02-05 00:37:16.858 UTC [1739876] [unknown]@[unknown] FATAL:
unsupported frontend protocol 65363.19778: server supports 3.0 to 3.0
2023-02-06 12:26:48.844 UTC [467480] [unknown]@[unknown] FATAL:
unsupported frontend protocol 16.0: server supports 3.0 to 3.0
2023-02-06 02:17:08.205 UTC [257863] [unknown]@[unknown] FATAL:
unsupported frontend protocol 65363.19778: server supports 3.0 to 3.0
2023-02-06 06:37:00.411 UTC [345467] [unknown]@[unknown] FATAL:
unsupported frontend protocol 0.0: server supports 3.0 to 3.0
2023-02-06 06:37:00.679 UTC [345468] [unknown]@[unknown] FATAL:
unsupported frontend protocol 255.255: server supports 3.0 to 3.0
```

Something port-scanning your server, perhaps? None of those
protocol numbers match anything that Postgres-related code
would use.

regards, tom lane
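
One way to look at those protocol numbers: the startup packet carries the version as a 32-bit big-endian integer, with the major number in the high 16 bits and the minor number in the low 16 bits. The sketch below reverses the logged "major.minor" text back into the four bytes the client actually sent. The function name is hypothetical, and reading anything into the resulting bytes is an inference, not something stated in the thread.

```python
import struct


def protocol_bytes(version: str) -> bytes:
    """Turn a logged 'major.minor' protocol number back into its 4 wire bytes."""
    major, minor = (int(part) for part in version.split("."))
    # Two unsigned 16-bit integers in network (big-endian) byte order.
    return struct.pack("!HH", major, minor)


if __name__ == "__main__":
    for logged in ["65363.19778", "16.0", "0.0", "255.255", "3.0"]:
        print(f"{logged:>12} -> {protocol_bytes(logged)!r}")
```

For example, 65363.19778 decodes to the bytes `\xffSMB`, the marker that begins SMB1 messages, which would be consistent with a scanner probing for a file-sharing service rather than a PostgreSQL client speaking protocol 3.0.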

#5 Francesco Tagliani
fran.tm213@gmail.com
In reply to: Tom Lane (#4)
Re: BUG #17776: Connections are terminated unexpectedly sometimes

Hi Tom,
The Ubuntu server runs on Azure, so is it possible that a port-scanning app is
installed automatically by Azure on Ubuntu?

Regarding the TCP values, should I keep the current values, or do you have any
recommended values?

Thanks
