FDW connection drops with "Connection timed out" during async append query due to TCP receive buffer filling up

Started by jiye · 26 days ago · 7 messages · bugs
#1jiye
jiye_sw@126.com

Hello:

I encountered an issue with foreign table asynchronous scans.

I'm testing on PostgreSQL 17 in a Windows WSL environment, with TCP keepalive settings configured as follows:
configure:
tcp_keepalives_idle = 2
tcp_keepalives_interval = 2
tcp_keepalives_count = 3
tcp_user_timeout = 9

The underlying table on the remote server, accessed through the foreign table f_ftab, was created with:
sql:
CREATE TABLE ftab (a int, data varchar);
I inserted test data using:
sql:
INSERT INTO ftab
SELECT generate_series(1, 10000),
substring(md5(random()::text) || repeat(md5(random()::text), 127), 1, 40960);
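The report does not show how the foreign side was wired up. A minimal postgres_fdw setup that would make f_ftab point at ftab might look like this; the server name, connection details, and user mapping are assumptions, not taken from the original report:
```sql
-- Hypothetical postgres_fdw setup; names and connection details are assumed.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER remote_srv
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host '127.0.0.1', port '5432', dbname 'postgres');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER remote_srv
    OPTIONS (user 'postgres');

-- Same column list as the remote table ftab shown above;
-- async_capable enables asynchronous execution for this table.
CREATE FOREIGN TABLE f_ftab (a int, data varchar)
    SERVER remote_srv
    OPTIONS (schema_name 'public', table_name 'ftab', async_capable 'true');
```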
Then I executed an async append query that unions with a simple query:
sql:
SELECT f1(), 'local-data'
UNION ALL
SELECT * FROM f_ftab
LIMIT 2;
Note: f1() is a function that sleeps for 10 seconds, created by:
sql
CREATE OR REPLACE FUNCTION f1()
RETURNS integer
LANGUAGE plpgsql AS $$
BEGIN
PERFORM pg_sleep(10);
RETURN 1;
END; $$;
The abnormal behavior was reproduced: the foreign data server connection was disconnected, with the following error logs:
text
2026-03-10 17:52:55.969 CST [820] LOG: statement: FETCH 100 FROM c1
2026-03-10 17:52:56.495 CST [820] LOG: could not receive data from client: Connection timed out
2026-03-10 17:52:56.495 CST [820] LOG: unexpected EOF on client connection with an open transaction
The async foreign table query failed with the error:
text
ERROR: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
invalid socket
CONTEXT: remote SQL command: SELECT a, data FROM public.ftab
WARNING: no connection to the server
This is a minimal working example. In practice, if the local table scan takes too long and the foreign table has sufficiently wide rows, this issue may reproduce.
In my understanding, when performing a local sequential scan, the PostgreSQL backend fetches data from the local plan without fetching any data from the FDW. As a result, the TCP receive buffer may become full, causing the FDW connection to be disconnected.
I believe this is a minor issue. How can I resolve this problem?
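If that theory is right, one way to observe it from the remote side while the query is stuck on f1() is to inspect the backend serving the cursor; this diagnostic query is a sketch, not from the original report, and the expected wait event is an assumption:
```sql
-- Run on the remote server while the local side is still inside f1().
-- If the local backend has stopped draining the connection, the remote
-- backend executing "FETCH 100 FROM c1" would typically be blocked
-- writing result rows to a full socket (wait_event = 'ClientWrite').
SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE query LIKE 'FETCH%';
```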

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: jiye (#1)
Re: FDW connection drops with "Connection timed out" during async append query due to TCP receive buffer filling up

On Tue, 2026-03-10 at 18:17 +0800, jiye wrote:

I am not a network expert, but if you set tcp_user_timeout = 9, why are you surprised if the
connection times out after 9 seconds?

Yours,
Laurenz Albe

#3jiye
jiye_sw@126.com
In reply to: Laurenz Albe (#2)
Re:Re: FDW connection drops with "Connection timed out" during async append query due to TCP receive buffer filling up

I am not surprised by the connection timeout; I just wonder whether there is some solution for this, because the same SQL query runs fine if we do not use an asynchronous foreign table scan.
I believe that a long local table scan increases the likelihood of a connection timeout. And our application sets the tcp_user_timeout parameter for high-availability reasons, so it cannot be changed arbitrarily.

With the async FDW scan off, the SQL executes successfully, as follows:

postgres=# set enable_async_append TO off;
SET
postgres=# select f1(), *, '1','1','1','1','1' from local_fs union all select 1, * from foreign_sales limit 2;
 f1 | id | sale_date  | product | amount  | region  | ?column? | ?column? | ?column? | ?column? | ?column?
----+----+------------+---------+---------+---------+----------+----------+----------+----------+----------
  1 |  1 | 2026-02-10 | pro     | 1000.00 | tianjin | 1        | 1        | 1        | 1        | 1
  1 |  2 | 2026-02-10 | pro     | 1000.00 | tianjin | 1        | 1        | 1        | 1        | 1
(2 rows)

2026-03-11 09:08:50.659 CST [861] LOG: duration: 20021.652 ms plan:
    Query Text: select f1(), *, '1','1','1','1','1' from local_fs union all select 1, * from foreign_sales limit 2;
    Limit (cost=0.00..0.54 rows=2 width=189) (actual time=10011.214..20021.544 rows=2 loops=1)
      -> Append (cost=0.00..2789043.10 rows=10240148 width=189) (actual time=10011.212..20021.539 rows=2 loops=1)
            -> Seq Scan on local_fs (cost=0.00..2737726.20 rows=10240120 width=189) (actual time=10011.192..20021.516 rows=2 loops=1)
            -> Foreign Scan on foreign_sales (cost=100.00..116.16 rows=28 width=2858) (never executed)
2026-03-11 09:08:50.659 CST [1324] LOG: statement: COMMIT TRANSACTION

At 2026-03-11 01:16:04, "Laurenz Albe" <laurenz.albe@cybertec.at> wrote:


#4David G. Johnston
david.g.johnston@gmail.com
In reply to: jiye (#1)
Re: FDW connection drops with "Connection timed out" during async append query due to TCP receive buffer filling up

On Tuesday, March 10, 2026, jiye <jiye_sw@126.com> wrote:

This is a minimal working example. In practice, if the local table scan
takes too long and the foreign table has sufficiently wide rows, this issue
may reproduce.
In my understanding, when performing a local sequential scan, the
PostgreSQL backend fetches data from the local plan without fetching any
data from the FDW. As a result, the TCP receive buffer may become full,
causing the FDW connection to be disconnected.
I believe this is a minor issue. How can I resolve this problem?

Do not establish a timeout that the execution of the query cannot beat.
Or, I think, at least ensure the non-async portion of the query can produce
a row within the allotted time so the async node is polled within the
timeout. IIUC, the general loop flow is: begin append, begin async, poll
async, poll non-async, poll async, poll non-async, etc…. There will usually
be some lag between async polls. The tcp timeout has to be large enough to
accommodate your reality. No different than if you used a statement
timeout.

I don’t actually know whether “buffer filling up” is accurate or
relevant here. It doesn’t seem that way. You haven’t demonstrated that
scenario here, just a timeout being reached.

https://github.com/postgres/postgres/blob/82467f627bd478569de04f4a3f1993098e80c812/src/backend/executor/nodeAppend.c#L342

And since the main design point of async is that any of the async scans may
be polled at any time, it is necessary for all such scans to be initialized
before any polling begins, which starts the clock on all of them.

If you don’t want a connection timeout to happen do not set one. That’s
the resolution here so far as I can tell.

David J.

#5jiye
jiye_sw@126.com
In reply to: David G. Johnston (#4)
Re:Re: FDW connection drops with "Connection timed out" during async append query due to TCP receive buffer filling up

Sorry, I made a mistake about the tcp_user_timeout configuration. Our app sets it to 9000 (9 seconds), but the query still errors out even with 9000; it just takes a little longer to fail.
And about this point:
=> I don’t actually know whether or if “buffer filling up” is accurate or relevant here. It doesn’t seem that way. You haven’t demonstrated that scenario here, just a timeout being reached.

First, I actually captured a tcp dump, and "the TCP buffer filling up" seems to be demonstrated by "TCP window full" packets.
Second, if the fetched rows are not sufficiently wide, the issue does not reproduce.

So I suspect that the reason for this connection timeout is that the TCP buffer is full.
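For reference, tcp_user_timeout is specified in milliseconds, so a value of 9 means 9 ms, not 9 seconds. If the application-wide setting cannot change, postgres_fdw passes libpq connection options through its server definition, so in principle a looser timeout could be applied to just the FDW connection; the server name here is hypothetical:
```sql
-- Hypothetical: give only the FDW connection a larger TCP user timeout.
-- The value is in milliseconds; 0 means use the operating system default.
ALTER SERVER remote_srv OPTIONS (ADD tcp_user_timeout '60000');
```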

At 2026-03-11 13:30:03, "David G. Johnston" <david.g.johnston@gmail.com> wrote:


#6Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: jiye (#5)
Re: Re: FDW connection drops with "Connection timed out" during async append query due to TCP receive buffer filling up

On Wed, Mar 11, 2026 at 3:25 PM jiye <jiye_sw@126.com> wrote:


I think this problem is not with async execution, but with your
environment; if the root cause is “TCP window full”, it might help
to 1) retrieve only the needed columns from the remote server
and 2) decrease the fetch_size option for postgres_fdw.
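A sketch of those two suggestions, using hypothetical server and table names (fetch_size defaults to 100 in postgres_fdw):
```sql
-- 1) Fetch only the needed column instead of the wide "data" column:
SELECT a FROM f_ftab LIMIT 2;

-- 2) Reduce fetch_size so each batch pushes less data through the socket.
--    It can be set per server or per foreign table (the table-level
--    setting overrides the server-level one):
ALTER SERVER remote_srv OPTIONS (ADD fetch_size '10');
ALTER FOREIGN TABLE f_ftab OPTIONS (ADD fetch_size '10');
```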

Best regards,
Etsuro Fujita

#7jiye
jiye_sw@126.com
In reply to: Etsuro Fujita (#6)
Re:Re: Re: FDW connection drops with "Connection timed out" during async append query due to TCP receive buffer filling up

We have successfully reproduced this issue and gained a clearer understanding of its root cause. The application uses a cursor to fetch partial results in batches, with a delay between consecutive fetch operations. When the interval between two batches exceeds the tcp_user_timeout threshold, the connection is terminated unexpectedly.

In my analysis, during cursor-based queries, applications typically retrieve results in partial batches. If the number of rows fetched in a single batch is smaller than the number of rows scanned from the local table, the executor is unable to proceed with fetching rows from the foreign table. While we have attempted workarounds such as adjusting the fetch size, tuning TCP buffer parameters, and modifying the tcp_user_timeout value, these measures only mitigate the symptoms without addressing the underlying problem.

To achieve a fundamental resolution, I propose two potential solutions:

1. Alternate row fetching: modify the executor to alternately retrieve rows from the local table and the foreign table, ensuring balanced data flow between the two data sources.
2. Asynchronous tuple storage: implement a tuple storage mechanism to asynchronously cache results from the foreign table. This would allow the executor to fetch foreign table results into the storage buffer independently, preventing TCP window exhaustion and decoupling local and foreign data retrieval.

At 2026-03-11 16:01:02, "Etsuro Fujita" <etsuro.fujita@gmail.com> wrote:
