postgres_fdw: incomplete subabort cleanup of connections used in async execution
Hi,
While working on [1], I noticed $SUBJECT: postgres_fdw resets the
per-connection states of connections, which store async requests sent
to remote servers in async_capable mode, during post-abort
(pgfdw_xact_callback()), but it fails to do so during post-subabort
(pgfdw_subxact_callback()). This causes a crash when re-executing a
query that was aborted in a subtransaction:
postgres=# create table t (a text, b text);
postgres=# create or replace function slow_data (name text, duration
float) returns setof t as $$ begin perform pg_sleep(duration); return
query select name, generate_series(1, 1000)::text; end; $$ language
plpgsql;
postgres=# create view v1 as select * from slow_data('foo', 2.5);
postgres=# create view v2 as select * from slow_data('bar', 5.0);
postgres=# create extension postgres_fdw;
postgres=# create server loopback1 foreign data wrapper postgres_fdw
options (dbname 'postgres', async_capable 'true');
postgres=# create server loopback2 foreign data wrapper postgres_fdw
options (dbname 'postgres', async_capable 'true');
postgres=# create user mapping for current_user server loopback1;
postgres=# create user mapping for current_user server loopback2;
postgres=# create foreign table list_p1 (a text, b text) server
loopback1 options (table_name 'v1');
postgres=# create foreign table list_p2 (a text, b text) server
loopback2 options (table_name 'v2');
postgres=# create table list_pt (a text, b text) partition by list (a);
postgres=# alter table list_pt attach partition list_p1 for values in ('foo');
postgres=# alter table list_pt attach partition list_p2 for values in ('bar');
postgres=# begin;
BEGIN
postgres=*# savepoint s1;
SAVEPOINT
postgres=*# select count(*) from list_pt;
^CCancel request sent
ERROR: canceling statement due to user request
postgres=!# rollback to savepoint s1;
ROLLBACK
postgres=*# select count(*) from list_pt;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
When canceling the SELECT, an async request sent to a remote server
using a connection is canceled, but it’s stored in the per-connection
state of the connection even after the failed subtransaction for the
reason above, so when re-executing the SELECT, postgres_fdw processes
the invalid async request to re-use the connection in GetConnection(),
causing a segmentation fault. This would be my oversight in commit
27e1f1456. :-(
To fix, I modified pgfdw_abort_cleanup() to reset the per-connection
state in the post-subabort case as well. Also, I modified the
initialization so that it’s done only if necessary, to save cycles,
and improved a comment on the initialization a bit. Attached is a
patch for that.
Best regards,
Etsuro Fujita
Attachments:
fix-postgres-fdw-subabort-cleanup.patchapplication/octet-stream; name=fix-postgres-fdw-subabort-cleanup.patchDownload
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 6bac4ad23e..10e888f5f9 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -1405,10 +1405,17 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
entry->have_prep_stmt = false;
entry->have_error = false;
- /* Also reset per-connection state */
- memset(&entry->state, 0, sizeof(entry->state));
}
+ /*
+ * If pendingAreq of the per-connection state is not NULL, it means that
+ * an asynchronous fetch begun by fetch_more_data_begin() was not done
+ * successfully and thus the per-connection state was not reset in
+ * fetch_more_data(); in that case reset the per-connection state here.
+ */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
/* Disarm changing_xact_state if it all worked */
entry->changing_xact_state = false;
}
Etsuro Fujita писал 2021-12-19 13:25:
Hi,
While working on [1], I noticed $SUBJECT: postgres_fdw resets the
per-connection states of connections, which store async requests sent
to remote servers in async_capable mode, during post-abort
(pgfdw_xact_callback()), but it fails to do so during post-subabort
(pgfdw_subxact_callback()). This causes a crash when re-executing a
query that was aborted in a subtransaction:
Hi.
Looks good to me.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Hi Alexander,
On Wed, Dec 22, 2021 at 1:08 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
Etsuro Fujita писал 2021-12-19 13:25:
While working on [1], I noticed $SUBJECT: postgres_fdw resets the
per-connection states of connections, which store async requests sent
to remote servers in async_capable mode, during post-abort
(pgfdw_xact_callback()), but it fails to do so during post-subabort
(pgfdw_subxact_callback()). This causes a crash when re-executing a
query that was aborted in a subtransaction:
Looks good to me.
Great! Thanks for reviewing!
Best regards,
Etsuro Fujita
On Wed, Dec 22, 2021 at 2:40 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Wed, Dec 22, 2021 at 1:08 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:Looks good to me.
Great! Thanks for reviewing!
I've committed the patch.
Best regards,
Etsuro Fujita