postgres_fdw: incomplete subabort cleanup of connections used in async execution

Started by Etsuro Fujitaabout 4 years ago4 messages
#1Etsuro Fujita
etsuro.fujita@gmail.com
1 attachment(s)

Hi,

While working on [1], I noticed $SUBJECT: postgres_fdw resets the
per-connection states of connections, which store async requests sent
to remote servers in async_capable mode, during post-abort
(pgfdw_xact_callback()), but it fails to do so during post-subabort
(pgfdw_subxact_callback()). This causes a crash when re-executing a
query that was aborted in a subtransaction:

postgres=# create table t (a text, b text);
postgres=# create or replace function slow_data (name text, duration
float) returns setof t as $$ begin perform pg_sleep(duration); return
query select name, generate_series(1, 1000)::text; end; $$ language
plpgsql;
postgres=# create view v1 as select * from slow_data('foo', 2.5);
postgres=# create view v2 as select * from slow_data('bar', 5.0);
postgres=# create extension postgres_fdw;
postgres=# create server loopback1 foreign data wrapper postgres_fdw
options (dbname 'postgres', async_capable 'true');
postgres=# create server loopback2 foreign data wrapper postgres_fdw
options (dbname 'postgres', async_capable 'true');
postgres=# create user mapping for current_user server loopback1;
postgres=# create user mapping for current_user server loopback2;
postgres=# create foreign table list_p1 (a text, b text) server
loopback1 options (table_name 'v1');
postgres=# create foreign table list_p2 (a text, b text) server
loopback2 options (table_name 'v2');
postgres=# create table list_pt (a text, b text) partition by list (a);
postgres=# alter table list_pt attach partition list_p1 for values in ('foo');
postgres=# alter table list_pt attach partition list_p2 for values in ('bar');

postgres=# begin;
BEGIN
postgres=*# savepoint s1;
SAVEPOINT
postgres=*# select count(*) from list_pt;
^CCancel request sent
ERROR: canceling statement due to user request
postgres=!# rollback to savepoint s1;
ROLLBACK
postgres=*# select count(*) from list_pt;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

When canceling the SELECT, an async request sent to a remote server
using a connection is canceled, but it’s stored in the per-connection
state of the connection even after the failed subtransaction for the
reason above, so when re-executing the SELECT, postgres_fdw processes
the invalid async request to re-use the connection in GetConnection(),
causing a segmentation fault. This would be my oversight in commit
27e1f1456. :-(

To fix, I modified pgfdw_abort_cleanup() to reset the per-connection
state in the post-subabort case as well. Also, I modified the
initialization so that it’s done only if necessary, to save cycles,
and improved a comment on the initialization a bit. Attached is a
patch for that.

Best regards,
Etsuro Fujita

Attachments:

fix-postgres-fdw-subabort-cleanup.patchapplication/octet-stream; name=fix-postgres-fdw-subabort-cleanup.patchDownload
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 6bac4ad23e..10e888f5f9 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -1405,10 +1405,17 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
 
 		entry->have_prep_stmt = false;
 		entry->have_error = false;
-		/* Also reset per-connection state */
-		memset(&entry->state, 0, sizeof(entry->state));
 	}
 
+	/*
+	 * If pendingAreq of the per-connection state is not NULL, it means that
+	 * an asynchronous fetch begun by fetch_more_data_begin() was not done
+	 * successfully and thus the per-connection state was not reset in
+	 * fetch_more_data(); in that case reset the per-connection state here.
+	 */
+	if (entry->state.pendingAreq)
+		memset(&entry->state, 0, sizeof(entry->state));
+
 	/* Disarm changing_xact_state if it all worked */
 	entry->changing_xact_state = false;
 }
#2Alexander Pyhalov
a.pyhalov@postgrespro.ru
In reply to: Etsuro Fujita (#1)
Re: postgres_fdw: incomplete subabort cleanup of connections used in async execution

Etsuro Fujita писал 2021-12-19 13:25:

Hi,

While working on [1], I noticed $SUBJECT: postgres_fdw resets the
per-connection states of connections, which store async requests sent
to remote servers in async_capable mode, during post-abort
(pgfdw_xact_callback()), but it fails to do so during post-subabort
(pgfdw_subxact_callback()). This causes a crash when re-executing a
query that was aborted in a subtransaction:

Hi.
Looks good to me.
--
Best regards,
Alexander Pyhalov,
Postgres Professional

#3Etsuro Fujita
etsuro.fujita@gmail.com
In reply to: Alexander Pyhalov (#2)
Re: postgres_fdw: incomplete subabort cleanup of connections used in async execution

Hi Alexander,

On Wed, Dec 22, 2021 at 1:08 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:

Etsuro Fujita писал 2021-12-19 13:25:

While working on [1], I noticed $SUBJECT: postgres_fdw resets the
per-connection states of connections, which store async requests sent
to remote servers in async_capable mode, during post-abort
(pgfdw_xact_callback()), but it fails to do so during post-subabort
(pgfdw_subxact_callback()). This causes a crash when re-executing a
query that was aborted in a subtransaction:

Looks good to me.

Great! Thanks for reviewing!

Best regards,
Etsuro Fujita

#4Etsuro Fujita
etsuro.fujita@gmail.com
In reply to: Etsuro Fujita (#3)
Re: postgres_fdw: incomplete subabort cleanup of connections used in async execution

On Wed, Dec 22, 2021 at 2:40 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:

On Wed, Dec 22, 2021 at 1:08 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:

Looks good to me.

Great! Thanks for reviewing!

I've committed the patch.

Best regards,
Etsuro Fujita