postgres_fdw: commit remote (sub)transactions in parallel during pre-commit
Hi,
As I said before [1], I’m working on $SUBJECT. Attached is a WIP
patch for that. The patch is pretty simple: if a server option added
by the patch “parallel_commit” is enabled, 1) asynchronously send
COMMIT TRANSACTION (RELEASE SAVEPOINT) to all remote servers involved
in a local (sub)transaction, then 2) wait for the results from the
remote servers in the order that the command was sent to the remote
servers, when called from pgfdw_xact_callback (pgfdw_subxact_callback)
during pre-commit. The patch also parallelizes clearing prepared
statements the same way during pre-commit. (The option is false by
default.)
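In outline, the difference between the existing serial commit and the proposed parallel commit can be sketched with a toy model (illustrative C only; the stub functions stand in for the libpq send/wait calls, and none of this is the patch's actual code):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of the two commit strategies.  ToyConn stands in for a
 * postgres_fdw connection cache entry; send_commit()/wait_commit()
 * stand in for sending "COMMIT TRANSACTION" and waiting on its result.
 */
typedef struct ToyConn
{
	int			sent;		/* COMMIT command sent? */
	int			committed;	/* result collected? */
} ToyConn;

static void
send_commit(ToyConn *c)
{
	c->sent = 1;
}

static void
wait_commit(ToyConn *c)
{
	assert(c->sent);		/* must not wait before sending */
	c->committed = 1;
}

/* Existing behavior: send to one server and wait before moving on. */
static void
commit_serial(ToyConn *conns, size_t n)
{
	for (size_t i = 0; i < n; i++)
	{
		send_commit(&conns[i]);
		wait_commit(&conns[i]);
	}
}

/*
 * parallel_commit behavior: first send COMMIT to every server
 * asynchronously, then collect the results in the order the commands
 * were sent, so the remote servers commit concurrently.
 */
static void
commit_parallel(ToyConn *conns, size_t n)
{
	for (size_t i = 0; i < n; i++)
		send_commit(&conns[i]);	/* like do_sql_command_begin() */
	for (size_t i = 0; i < n; i++)
		wait_commit(&conns[i]);	/* like do_sql_command_end() */
}
```

With N servers each taking about t ms to commit, the serial loop waits roughly N*t in total, while the parallel version waits roughly for the slowest server only, which is consistent with the latency numbers below.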
I evaluated the effectiveness of the patch using a simple
multi-statement transaction:
BEGIN;
SAVEPOINT s;
INSERT INTO ft1 VALUES (10, 10);
INSERT INTO ft2 VALUES (20, 20);
RELEASE SAVEPOINT s;
COMMIT;
where ft1 and ft2 are foreign tables created on different foreign
servers hosted on different machines. I ran the transaction five
times using the patch with the option enabled/disabled, and measured
the latencies for the RELEASE and COMMIT commands in each run. The
average latencies for these commands over the five runs are:
* RELEASE
parallel_commit=0: 0.385 ms
parallel_commit=1: 0.221 ms
* COMMIT
parallel_commit=0: 1.660 ms
parallel_commit=1: 0.861 ms
With the option enabled, the average latencies for both commands are
reduced significantly!
I think we could extend this to abort cleanup of remote
(sub)transactions during post-abort. Anyway, I think this is useful,
so I’ll add this to the upcoming commitfest.
Best regards,
Etsuro Fujita
[1]: /messages/by-id/CAPmGK177E6HPcCQB4-s+m9AcCZDHCC2drZy+FKnnvEXw9kXoXQ@mail.gmail.com
Attachments:
postgres-fdw-parallel-commit-1.patch
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4aff315b7c..03223afd53 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -58,6 +58,7 @@ typedef struct ConnCacheEntry
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
+ bool parallel_commit; /* do we commit (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -92,6 +93,8 @@ static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void disconnect_pg_server(ConnCacheEntry *entry);
static void check_conn_params(const char **keywords, const char **values, UserMapping *user);
static void configure_remote_session(PGconn *conn);
+static void do_sql_command_begin(PGconn *conn, const char *sql);
+static void do_sql_command_end(PGconn *conn, const char *sql, bool ignore_errors);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -100,6 +103,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
void *arg);
static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
+static void pgfdw_reset_xact_nesting_depth(ConnCacheEntry *entry);
static bool pgfdw_cancel_query(PGconn *conn);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
@@ -318,12 +322,15 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
* By default, all the connections to any foreign servers are kept open.
*/
entry->keep_connections = true;
+ entry->parallel_commit = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
if (strcmp(def->defname, "keep_connections") == 0)
entry->keep_connections = defGetBoolean(def);
+ if (strcmp(def->defname, "parallel_commit") == 0)
+ entry->parallel_commit = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -589,11 +596,36 @@ do_sql_command(PGconn *conn, const char *sql)
{
PGresult *res;
+ do_sql_command_begin(conn, sql);
+ res = pgfdw_get_result(conn, sql);
+ if (PQresultStatus(res) != PGRES_COMMAND_OK)
+ pgfdw_report_error(ERROR, res, conn, true, sql);
+ PQclear(res);
+}
+
+static void
+do_sql_command_begin(PGconn *conn, const char *sql)
+{
if (!PQsendQuery(conn, sql))
pgfdw_report_error(ERROR, NULL, conn, false, sql);
+}
+
+static void
+do_sql_command_end(PGconn *conn, const char *sql, bool ignore_errors)
+{
+ PGresult *res;
+
+ /* Consume whatever data is available from the socket */
+ if (!PQconsumeInput(conn))
+ pgfdw_report_error(ERROR, NULL, conn, false, sql);
res = pgfdw_get_result(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
- pgfdw_report_error(ERROR, res, conn, true, sql);
+ {
+ if (ignore_errors)
+ pgfdw_report_error(WARNING, res, conn, true, sql);
+ else
+ pgfdw_report_error(ERROR, res, conn, true, sql);
+ }
PQclear(res);
}
@@ -851,6 +883,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
{
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
+ List *pending_xacts = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -888,6 +921,13 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Commit all remote transactions during pre-commit */
entry->changing_xact_state = true;
+ if (entry->parallel_commit)
+ {
+ do_sql_command_begin(entry->conn,
+ "COMMIT TRANSACTION");
+ pending_xacts = lappend(pending_xacts, entry);
+ continue;
+ }
do_sql_command(entry->conn, "COMMIT TRANSACTION");
entry->changing_xact_state = false;
@@ -943,23 +983,68 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
}
- /* Reset state to show we're out of a transaction */
- entry->xact_depth = 0;
-
/*
- * If the connection isn't in a good idle state, it is marked as
- * invalid or keep_connections option of its server is disabled, then
- * discard it to recover. Next GetConnection will open a new
- * connection.
+ * Reset state to show we're out of a transaction, and, if necessary,
+ * discard the connection
*/
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
- entry->changing_xact_state ||
- entry->invalidated ||
- !entry->keep_connections)
+ pgfdw_reset_xact_nesting_depth(entry);
+ }
+
+ if (pending_xacts)
+ {
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT);
+
+ foreach(lc, pending_xacts)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ do_sql_command_end(entry->conn, "COMMIT TRANSACTION", false);
+ entry->changing_xact_state = false;
+
+ /* Do a DEALLOCATE ALL if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ entry->changing_xact_state = true;
+ do_sql_command_begin(entry->conn, "DEALLOCATE ALL");
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /*
+ * Reset state to show we're out of a transaction, and, if
+ * necessary, discard the connection
+ */
+ pgfdw_reset_xact_nesting_depth(entry);
+ }
+
+ if (pending_deallocs)
{
- elog(DEBUG3, "discarding connection %p", entry->conn);
- disconnect_pg_server(entry);
+ foreach(lc, pending_deallocs)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ /* Ignore errors in the DEALLOCATE (see note above) */
+ do_sql_command_end(entry->conn, "DEALLOCATE ALL", true);
+ entry->changing_xact_state = false;
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /*
+ * Reset state to show we're out of a transaction, and, if
+ * necessary, discard the connection
+ */
+ pgfdw_reset_xact_nesting_depth(entry);
+ }
}
}
@@ -984,6 +1069,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
int curlevel;
+ List *pending_subxacts = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1026,6 +1112,12 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
/* Commit all remote subtransactions during pre-commit */
snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
entry->changing_xact_state = true;
+ if (entry->parallel_commit)
+ {
+ do_sql_command_begin(entry->conn, sql);
+ pending_subxacts = lappend(pending_subxacts, entry);
+ continue;
+ }
do_sql_command(entry->conn, sql);
entry->changing_xact_state = false;
}
@@ -1041,6 +1133,27 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
/* OK, we're outta that level of subtransaction */
entry->xact_depth--;
}
+
+ if (pending_subxacts)
+ {
+ char sql[100];
+ ListCell *lc;
+
+ Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
+
+ snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
+ foreach(lc, pending_subxacts)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ do_sql_command_end(entry->conn, sql, false);
+ entry->changing_xact_state = false;
+
+ /* OK, we're outta that level of subtransaction */
+ entry->xact_depth--;
+ }
+ }
}
/*
@@ -1132,6 +1245,28 @@ pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry)
server->servername)));
}
+static void
+pgfdw_reset_xact_nesting_depth(ConnCacheEntry *entry)
+{
+ /* Reset state to show we're out of a transaction */
+ entry->xact_depth = 0;
+
+ /*
+ * If the connection isn't in a good idle state, it is marked as invalid
+ * or keep_connections option of its server is disabled, then discard it
+ * to recover. Next GetConnection will open a new connection.
+ */
+ if (PQstatus(entry->conn) != CONNECTION_OK ||
+ PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
+ entry->changing_xact_state ||
+ entry->invalidated ||
+ !entry->keep_connections)
+ {
+ elog(DEBUG3, "discarding connection %p", entry->conn);
+ disconnect_pg_server(entry);
+ }
+}
+
/*
* Cancel the currently-in-progress query (whose query text we do not have)
* and ignore the result. Returns true if we successfully cancel the query
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index fd141a0fa5..54b473de3d 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9452,7 +9452,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10768,3 +10768,76 @@ ERROR: invalid value for integer option "batch_size": 100$%$#$#
ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
ERROR: invalid option "nonexistent"
HINT: There are no valid options in this context.
+-- ===================================================================
+-- test parallel commit
+-- ===================================================================
+ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+CREATE TABLE ploc1 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc1');
+CREATE TABLE ploc2 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem2 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc2');
+BEGIN;
+INSERT INTO prem1 VALUES (101, 'foo');
+INSERT INTO prem2 VALUES (201, 'bar');
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+-----
+ 101 | foo
+(1 row)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+-----
+ 201 | bar
+(1 row)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (102, 'foofoo');
+INSERT INTO prem2 VALUES (202, 'barbar');
+RELEASE SAVEPOINT s;
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+(2 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+(2 rows)
+
+-- This tests parallelizing clearing prepared statements during pre-commit
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (103, 'foofoo');
+INSERT INTO prem2 VALUES (203, 'barbar');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (104, 'test1');
+INSERT INTO prem2 VALUES (204, 'test2');
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | test1
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | test2
+(3 rows)
+
+ALTER SERVER loopback OPTIONS (DROP parallel_commit);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 48c7417e6e..1be5830c53 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -120,6 +120,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "updatable") == 0 ||
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
+ strcmp(def->defname, "parallel_commit") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -248,6 +249,7 @@ InitPgFdwOptions(void)
/* async_capable is available on both server and table */
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
+ {"parallel_commit", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 43c30d492d..6bb8937ced 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3426,3 +3426,46 @@ CREATE FOREIGN TABLE inv_bsz (c1 int )
-- No option is allowed to be specified at foreign data wrapper level
ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
+
+-- ===================================================================
+-- test parallel commit
+-- ===================================================================
+ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+
+CREATE TABLE ploc1 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc1');
+CREATE TABLE ploc2 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem2 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc2');
+
+BEGIN;
+INSERT INTO prem1 VALUES (101, 'foo');
+INSERT INTO prem2 VALUES (201, 'bar');
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (102, 'foofoo');
+INSERT INTO prem2 VALUES (202, 'barbar');
+RELEASE SAVEPOINT s;
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+-- This tests parallelizing clearing prepared statements during pre-commit
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (103, 'foofoo');
+INSERT INTO prem2 VALUES (203, 'barbar');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (104, 'test1');
+INSERT INTO prem2 VALUES (204, 'test2');
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+ALTER SERVER loopback OPTIONS (DROP parallel_commit);
On 2021/10/31 18:05, Etsuro Fujita wrote:
Hi,
As I said before [1], I’m working on $SUBJECT. Attached is a WIP
patch for that.
Thanks for the patch!
The patch is pretty simple: if a server option added
by the patch “parallel_commit” is enabled,
Could you tell me why the parameter is necessary?
Can't we always enable the feature?
* RELEASE
parallel_commit=0: 0.385 ms
parallel_commit=1: 0.221 ms
* COMMIT
parallel_commit=0: 1.660 ms
parallel_commit=1: 0.861 ms
With the option enabled, the average latencies for both commands are
reduced significantly!
Sounds great!
I think we could extend this to abort cleanup of remote
(sub)transactions during post-abort. Anyway, I think this is useful,
so I’ll add this to the upcoming commitfest.
Thanks!
+ /* Consume whatever data is available from the socket */
+ if (!PQconsumeInput(conn))
+ pgfdw_report_error(ERROR, NULL, conn, false, sql);
Without the patch, PQconsumeInput() is not called before pgfdw_get_result()
But could you tell me why you added PQconsumeInput() there?
When ignore_errors argument is true, the error reported by
PQconsumeInput() should be ignored?
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
I evaluated the effectiveness of the patch using a simple
multi-statement transaction:
BEGIN;
SAVEPOINT s;
INSERT INTO ft1 VALUES (10, 10);
INSERT INTO ft2 VALUES (20, 20);
RELEASE SAVEPOINT s;
COMMIT;
where ft1 and ft2 are foreign tables created on different foreign
servers hosted on different machines. I ran the transaction five
times using the patch with the option enabled/disabled, and measured
the latencies for the RELEASE and COMMIT commands in each run. The
average latencies for these commands over the five runs are:
* RELEASE
parallel_commit=0: 0.385 ms
parallel_commit=1: 0.221 ms
* COMMIT
parallel_commit=0: 1.660 ms
parallel_commit=1: 0.861 ms
With the option enabled, the average latencies for both commands are
reduced significantly!
Following your instructions, I performed some basic tests to compare
performance before and after the patch. In my testing environment (two
foreign servers on the same local machine), the results vary:
sometimes the time spent on RELEASE and COMMIT without the patch is
close to the patched times, but the patched version always seems to
perform better. I then ran a 1-million-tuple insert; the 5-run
averages are:
Before
RELEASE 0.171 ms, COMMIT 1.861 ms
After
RELEASE 0.147 ms, COMMIT 1.305 ms
Best regards,
--
David
Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
On Mon, Nov 1, 2021 at 3:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2021/10/31 18:05, Etsuro Fujita wrote:
The patch is pretty simple: if a server option added
by the patch “parallel_commit” is enabled,
Could you tell me why the parameter is necessary?
Can't we always enable the feature?
I don’t think parallel commit would cause performance degradation,
even in the case when there is just a single remote (sub)transaction
to commit when called from pgfdw_xact_callback
(pgfdw_subxact_callback), so I think it might be OK to enable it by
default. But my concern about doing so is the remote side: during
those functions, if there are a lot of (sub)transactions on a single
remote server that need to be committed, parallel commit would
increase the remote server’s load at (sub)transaction end more than
serial commit (the existing implementation) does, as the requests to
commit those (sub)transactions are sent to the remote server at the
same time, which some users might want to avoid.
I think we could extend this to abort cleanup of remote
(sub)transactions during post-abort. Anyway, I think this is useful,
so I’ll add this to the upcoming commitfest.
Thanks!
I'll update the patch as such in the next version.
+ /* Consume whatever data is available from the socket */
+ if (!PQconsumeInput(conn))
+ pgfdw_report_error(ERROR, NULL, conn, false, sql);
Without the patch, PQconsumeInput() is not called before pgfdw_get_result()
But could you tell me why you added PQconsumeInput() there?
The reason is that there might be the result already before calling
pgfdw_get_result(), in which case PQconsumeInput() followed by
PQisBusy() would allow us to call PQgetResult() without doing
WaitLatchOrSocket(), which I think is rather expensive.
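The saving comes from the cheap non-blocking peek at the socket. As a minimal stand-alone illustration of that fast path (plain POSIX sockets here, not libpq; the function name is made up for the demo):

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Demo of peeking at a socket non-blockingly before falling back to an
 * expensive blocking wait.  PQconsumeInput() + PQisBusy() play this
 * role in front of WaitLatchOrSocket().  Returns 0 on success, a
 * nonzero step number on failure.
 */
static int
nonblocking_peek_demo(void)
{
	int			sv[2];
	char		buf[16];
	ssize_t		n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return 1;

	/*
	 * No data yet: the non-blocking read returns immediately with
	 * EAGAIN/EWOULDBLOCK instead of sleeping in the kernel, so the
	 * caller knows it must do the real (expensive) wait.
	 */
	n = recv(sv[0], buf, sizeof(buf), MSG_DONTWAIT);
	if (!(n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)))
		return 2;

	/*
	 * The result has already arrived: the same cheap call consumes it
	 * and the blocking wait can be skipped entirely.
	 */
	if (write(sv[1], "ok", 2) != 2)
		return 3;
	n = recv(sv[0], buf, sizeof(buf), MSG_DONTWAIT);
	if (!(n == 2 && memcmp(buf, "ok", 2) == 0))
		return 4;

	close(sv[0]);
	close(sv[1]);
	return 0;
}
```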
When ignore_errors argument is true, the error reported by
PQconsumeInput() should be ignored?
I’m not sure about that, because the error might be caused by e.g.,
OOM in the local server, in which case I don’t think it is safe to
ignore it and continue the (sub)transaction-end processing.
Thanks for reviewing!
Best regards,
Etsuro Fujita
On Tue, Nov 2, 2021 at 7:47 AM David Zhang <david.zhang@highgo.ca> wrote:
Following your instructions, I performed some basic tests to compare
performance before and after the patch. In my testing environment (two
foreign servers on the same local machine), the results vary:
sometimes the time spent on RELEASE and COMMIT without the patch is
close to the patched times, but the patched version always seems to
perform better. I then ran a 1-million-tuple insert; the 5-run
averages are:
Before
RELEASE 0.171 ms, COMMIT 1.861 ms
After
RELEASE 0.147 ms, COMMIT 1.305 ms
Good to know! Thanks for testing!
Best regards,
Etsuro Fujita
On 2021/11/07 18:06, Etsuro Fujita wrote:
On Mon, Nov 1, 2021 at 3:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2021/10/31 18:05, Etsuro Fujita wrote:
The patch is pretty simple: if a server option added
by the patch “parallel_commit” is enabled,
Could you tell me why the parameter is necessary?
Can't we always enable the feature?
I don’t think parallel commit would cause performance degradation,
even in the case when there is just a single remote (sub)transaction
to commit when called from pgfdw_xact_callback
(pgfdw_subxact_callback), so I think it might be OK to enable it by
default. But my concern about doing so is the remote side: during
those functions, if there are a lot of (sub)transactions on a single
remote server that need to be committed, parallel commit would
increase the remote server’s load at (sub)transaction end more than
serial commit (the existing implementation) does, as the requests to
commit those (sub)transactions are sent to the remote server at the
same time, which some users might want to avoid.
Thanks for explaining this! But probably I failed to get your point.
Sorry... Whichever approach we take, parallel or serial commit, the number
of transactions to commit on the remote server is the same, isn't it?
For example, please imagine the case where a client requests
ten transactions per second to the local server. Each transaction
accesses the foreign table, which means that ten transaction
commit operations per second are requested to the remote server.
Unless I'm missing something, this number doesn't change whether
the foreign transaction is committed in parallel or not.
Thought?
I think we could extend this to abort cleanup of remote
(sub)transactions during post-abort. Anyway, I think this is useful,
so I’ll add this to the upcoming commitfest.
Thanks!
I'll update the patch as such in the next version.
IMO it's better to implement and commit these features gradually
if possible. Which would simplify the patch and make it
easier-to-review. So I think that it's better to implement
this feature as a separate patch.
+ /* Consume whatever data is available from the socket */
+ if (!PQconsumeInput(conn))
+ pgfdw_report_error(ERROR, NULL, conn, false, sql);
Without the patch, PQconsumeInput() is not called before pgfdw_get_result()
But could you tell me why you added PQconsumeInput() there?
The reason is that there might be the result already before calling
pgfdw_get_result(), in which case PQconsumeInput() followed by
PQisBusy() would allow us to call PQgetResult() without doing
WaitLatchOrSocket(), which I think is rather expensive.
Understood. It's helpful to add the comment about why PQconsumeInput()
is called there.
Also could you tell me how much expensive it is?
When ignore_errors argument is true, the error reported by
PQconsumeInput() should be ignored?
I’m not sure about that, because the error might be caused by e.g.,
OOM in the local server, in which case I don’t think it is safe to
ignore it and continue the (sub)transaction-end processing.
But the existing code ignores the error entirely, doesn't it?
If it's unsafe to do that, probably we should fix that at first?
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Dear Fujita-san,
I love your proposal because it will remove a bottleneck
for PostgreSQL built-in sharding.
I read your patch briefly and I think basically it's good.
Currently I have only one comment.
In your patch, postgres_fdw sends a COMMIT command to all entries in the hash table
and waits for the result without a timeout from the first entry.
I think this specification is good because it's very simple,
but if a COMMIT for a particular foreign server could take some time,
I thought it might be more efficient to stop waiting for results and look at the next entry.
This is how it works. First, we define a function similar to pgfdw_get_result()
so that we can specify the timeout time as an argument to WaitLatchOrSocket().
Then change the function called by do_sql_command_end() to the new one,
and change the callback function to skip if the result has not yet arrived.
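That polling idea could be modeled roughly as follows (a toy sketch, not the PoC: a countdown counter stands in for "the COMMIT result did not arrive before the wait timed out"):

```c
#include <stddef.h>

/* Toy stand-in for a pending remote COMMIT. */
typedef struct SlowConn
{
	int			polls_needed;	/* timeouts that will expire before the
								 * result arrives */
	int			done;			/* result collected? */
} SlowConn;

/*
 * Stands in for a WaitLatchOrSocket() call with a short timeout:
 * returns 1 when the COMMIT result has arrived, 0 if the timeout
 * expired first.
 */
static int
try_get_result(SlowConn *c)
{
	if (c->polls_needed > 0)
	{
		c->polls_needed--;
		return 0;
	}
	c->done = 1;
	return 1;
}

/*
 * Keep cycling over the pending connections, skipping any whose COMMIT
 * hasn't finished yet, until all results are collected.  Returns the
 * number of passes over the list, to show that a slow server no longer
 * blocks collection from the fast ones.
 */
static int
collect_round_robin(SlowConn *conns, size_t n)
{
	size_t		remaining = n;
	int			rounds = 0;

	while (remaining > 0)
	{
		rounds++;
		for (size_t i = 0; i < n; i++)
			if (!conns[i].done && try_get_result(&conns[i]))
				remaining--;
	}
	return rounds;
}
```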
How is it? Is it an unnecessary assumption that COMMIT takes time? Or is this the next step?
I will put a PoC if needed.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Mon, Nov 8, 2021 at 1:13 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2021/11/07 18:06, Etsuro Fujita wrote:
On Mon, Nov 1, 2021 at 3:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
Could you tell me why the parameter is necessary?
Can't we always enable the feature?
I think it might be OK to enable it by
default. But my concern about doing so is the remote side: during
those functions, if there are a lot of (sub)transactions on a single
remote server that need to be committed, parallel commit would
increase the remote server’s load at (sub)transaction end more than
serial commit (the existing implementation) does, as the requests to
commit those (sub)transactions are sent to the remote server at the
same time, which some users might want to avoid.
Thanks for explaining this! But probably I failed to get your point.
Sorry... Whichever approach we take, parallel or serial commit, the number
of transactions to commit on the remote server is the same, isn't it?
For example, please imagine the case where a client requests
ten transactions per second to the local server. Each transaction
accesses the foreign table, which means that ten transaction
commit operations per second are requested to the remote server.
Unless I'm missing something, this number doesn't change whether
the foreign transaction is committed in parallel or not.
Sorry, my explanation was not enough, but I don’t think this is always
true. Let me explain using an example:
create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres', parallel_commit 'true');
create user mapping for current_user server loopback;
create table t1 (a int, b int);
create table t2 (a int, b int);
create foreign table ft1 (a int, b int) server loopback options
(table_name 't1');
create foreign table ft2 (a int, b int) server loopback options
(table_name 't2');
create role view_owner superuser;
create user mapping for view_owner server loopback;
grant SELECT on ft1 to view_owner;
create view v1 as select * from ft1;
alter view v1 owner to view_owner;
begin;
insert into v1 values (10, 10);
insert into ft2 values (20, 20);
commit;
For this transaction, since the first insert is executed as the view
owner while the second insert is executed as the current user, we
create a connection to the foreign server for each of the users to
execute the inserts. This leads to sending two commit commands to the
foreign server at the same time during pre-commit.
To avoid spike loads on a remote server induced by such a workload, I
think it’s a good idea to have a server option to control whether this
is enabled, but I might be too worried about that, so I want to hear
the opinions of people.
IMO it's better to implement and commit these features gradually
if possible. Which would simplify the patch and make it
easier-to-review. So I think that it's better to implement
this feature as a separate patch.
Ok, I'll create a patch for abort cleanup separately.
+ /* Consume whatever data is available from the socket */
+ if (!PQconsumeInput(conn))
+ pgfdw_report_error(ERROR, NULL, conn, false, sql);
Without the patch, PQconsumeInput() is not called before pgfdw_get_result()
But could you tell me why you added PQconsumeInput() there?
The reason is that there might be the result already before calling
pgfdw_get_result(), in which case PQconsumeInput() followed by
PQisBusy() would allow us to call PQgetResult() without doing
WaitLatchOrSocket(), which I think is rather expensive.
Understood. It's helpful to add the comment about why PQconsumeInput()
is called there.
Ok.
Also could you tell me how much expensive it is?
IIUC I think the overheads of WaitLatchOrSocket() incurred by a series
of epoll system calls are much larger compared to the overheads of
PQconsumeInput() incurred by a recv system call in non-blocking mode
when no data is available. I didn’t do testing, though.
Actually, we already use this optimization in libpqrcv_receive() for
the caller of that function to avoid doing WaitLatchOrSocket()?
When ignore_errors argument is true, the error reported by
PQconsumeInput() should be ignored?
I’m not sure about that, because the error might be caused by e.g.,
OOM in the local server, in which case I don’t think it is safe to
ignore it and continue the (sub)transaction-end processing.
But the existing code ignores the error entirely, doesn't it?
If it's unsafe to do that, probably we should fix that at first?
I changed my mind; I’ll update the patch to ignore the error as
before, because 1) as far as I know, there are no reports from the
field concerning that we ignore all kinds of errors in cleaning up the
prepared statements, so maybe we don’t need to change that, and 2) we
already committed at least one of the remote transactions, so it’s not
good to abort the local transaction unless we really have to.
Thanks!
Best regards,
Etsuro Fujita
Kuroda-san,
On Thu, Nov 11, 2021 at 11:27 AM kuroda.hayato@fujitsu.com
<kuroda.hayato@fujitsu.com> wrote:
I love your proposal because it will remove a bottleneck
for PostgreSQL built-in sharding.
I read your patch briefly and I think basically it's good.
Great! Thanks for reviewing!
Currently I have only one comment.
In your patch, postgres_fdw sends a COMMIT command to all entries in the hash table
and waits for the result without a timeout from the first entry.
I think this specification is good because it's very simple,
but if a COMMIT for a particular foreign server could take some time,
I thought it might be more efficient to stop waiting for results and look at the next entry.
This is how it works. First, we define a function similar to pgfdw_get_result()
so that we can specify the timeout time as an argument to WaitLatchOrSocket().
Then change the function called by do_sql_command_end() to the new one,
and change the callback function to skip if the result has not yet arrived.
How is it? Is it an unnecessary assumption that COMMIT takes time? Or is this the next step?
I will put a PoC if needed.
Hmm, I'm not sure the cost-effectiveness of this optimization is
really high, because if the timeout expired, it means that something
unusual would have happened, and that it would take a long time for
the COMMIT command to complete (or abort at worst). So even if we
processed the rest of the entries while waiting for the command
result, we could not reduce the total time very much.  Maybe I'm missing
something, though.
Best regards,
Etsuro Fujita
On 2021/11/16 18:55, Etsuro Fujita wrote:
Sorry, my explanation was not enough, but I don’t think this is always
true. Let me explain using an example:

create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres', parallel_commit 'true');
create user mapping for current_user server loopback;
create table t1 (a int, b int);
create table t2 (a int, b int);
create foreign table ft1 (a int, b int) server loopback options
(table_name 't1');
create foreign table ft2 (a int, b int) server loopback options
(table_name 't2');
create role view_owner superuser;
create user mapping for view_owner server loopback;
grant SELECT on ft1 to view_owner;
create view v1 as select * from ft1;
alter view v1 owner to view_owner;

begin;
insert into v1 values (10, 10);
insert into ft2 values (20, 20);
commit;

For this transaction, since the first insert is executed as the view
owner while the second insert is executed as the current user, we
create a connection to the foreign server for each of the users to
execute the inserts. This leads to sending two commit commands to the
foreign server at the same time during pre-commit.

To avoid spike loads on a remote server induced by such a workload, I
think it’s a good idea to have a server option to control whether this
is enabled,
I understand your point. But even if the option is disabled (i.e., the
commit command is sent to each foreign server in a serial way),
multiple queries can still run on the server concurrently,
which may cause a performance "spike". Other clients may open several
sessions to the server and issue queries at the same time. Other
sessions using postgres_fdw may send commit commands at the same time.
If we want to avoid that "spike", probably we need to decrease
max_connections or use connection pooling, etc. So ISTM that it's
half-baked and not enough to provide an option that controls
whether postgres_fdw issues the commit command in a parallel or serial way.
but I might be too worried about that, so I want to hear
the opinions of people.
Yes.
IIUC I think the overheads of WaitLatchOrSocket() incurred by a series
of epoll system calls are much larger than the overheads of
PQconsumeInput() incurred by a recv system call in non-blocking mode
when no data is available. I didn’t do testing, though.
Understood.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
On Thu, Nov 18, 2021 at 1:09 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2021/11/16 18:55, Etsuro Fujita wrote:
Sorry, my explanation was not enough, but I don’t think this is always
true. Let me explain using an example:

create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres', parallel_commit 'true');
create user mapping for current_user server loopback;
create table t1 (a int, b int);
create table t2 (a int, b int);
create foreign table ft1 (a int, b int) server loopback options
(table_name 't1');
create foreign table ft2 (a int, b int) server loopback options
(table_name 't2');
create role view_owner superuser;
create user mapping for view_owner server loopback;
grant SELECT on ft1 to view_owner;
create view v1 as select * from ft1;
alter view v1 owner to view_owner;

begin;
insert into v1 values (10, 10);
insert into ft2 values (20, 20);
commit;

For this transaction, since the first insert is executed as the view
owner while the second insert is executed as the current user, we
create a connection to the foreign server for each of the users to
execute the inserts. This leads to sending two commit commands to the
foreign server at the same time during pre-commit.

To avoid spike loads on a remote server induced by such a workload, I
think it’s a good idea to have a server option to control whether this
is enabled,

I understand your point. But even if the option is disabled (i.e., the
commit command is sent to each foreign server in a serial way),
multiple queries can still run on the server concurrently,
which may cause a performance "spike". Other clients may open several
sessions to the server and issue queries at the same time. Other
sessions using postgres_fdw may send commit commands at the same time.
If we want to avoid that "spike", probably we need to decrease
max_connections or use connection pooling, etc.
I think that what you are discussing here is a related but
different issue, because the patch doesn't increase the number of
connections to the remote server needed for processing a
single transaction compared to before.

My concern about the patch is that in parallel-commit mode,
transactions like the above example might increase the remote server's
load at transaction end, while using the same number of
connections to the remote server as before, because multiple COMMIT
commands are sent to the remote server at the same time, not
sequentially as before. The option could be used to avoid such a
spike load without changing any settings on the remote server if
necessary. Also, the option can be added at no extra cost, so there
seems to me to be no reason to remove it.
Anyway, I'd like to hear the opinions of others.
Thanks!
Best regards,
Etsuro Fujita
On 2021/11/16 18:55, Etsuro Fujita wrote:
I changed my mind; I’ll update the patch to ignore the error as
before, because 1) as far as I know, there are no reports from the
field that we ignore all kinds of errors in cleaning up the
prepared statements, so maybe we don’t need to change that, and 2) we
already committed at least one of the remote transactions, so it’s not
good to abort the local transaction unless we really have to.
Are you planning to update the patch? In addition to this change,
at least documentation about the new parallel_commit parameter needs
to be included in the patch.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
On Fri, Dec 3, 2021 at 6:07 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2021/11/16 18:55, Etsuro Fujita wrote:
I changed my mind; I’ll update the patch to ignore the error as
before, because 1) as far as I know, there are no reports from the
field that we ignore all kinds of errors in cleaning up the
prepared statements, so maybe we don’t need to change that, and 2) we
already committed at least one of the remote transactions, so it’s not
good to abort the local transaction unless we really have to.
Done.
Are you planning to update the patch? In addition to this change,
at least documentation about the new parallel_commit parameter needs
to be included in the patch.
Done. Attached is a new version.
* 0001
This is an updated version of the previous patch. In addition to the
above, I expanded a comment in do_sql_command_end() a bit to explain
why we do PQconsumeInput() before doing pgfdw_get_result(), to address
your comment. Also, I moved the code that finishes closing pending
(sub)transactions in pgfdw_xact_callback() (pgfdw_subxact_callback())
into separate functions. Finally, I modified the regression test cases
a bit to access multiple foreign servers.
* 0002
This is a WIP patch for parallel abort. I added an option
parallel_abort for this, because I thought it would be good to
enable/disable these separately. I didn’t do any performance tests
yet.
Sorry for the long delay.
Best regards,
Etsuro Fujita
Attachments:
v2-0001-postgres-fdw-Add-support-for-parallel-commit.patchapplication/octet-stream; name=v2-0001-postgres-fdw-Add-support-for-parallel-commit.patchDownload
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 80db19e401..6337623ca4 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -58,6 +58,7 @@ typedef struct ConnCacheEntry
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
+ bool parallel_commit; /* do we commit (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -92,6 +93,8 @@ static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void disconnect_pg_server(ConnCacheEntry *entry);
static void check_conn_params(const char **keywords, const char **values, UserMapping *user);
static void configure_remote_session(PGconn *conn);
+static void do_sql_command_begin(PGconn *conn, const char *sql);
+static void do_sql_command_end(PGconn *conn, const char *sql, bool ignore_errors);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -100,6 +103,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
void *arg);
static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
+static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
@@ -107,6 +111,8 @@ static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql,
bool toplevel);
+static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
+static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -318,12 +324,15 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
* By default, all the connections to any foreign servers are kept open.
*/
entry->keep_connections = true;
+ entry->parallel_commit = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
if (strcmp(def->defname, "keep_connections") == 0)
entry->keep_connections = defGetBoolean(def);
+ if (strcmp(def->defname, "parallel_commit") == 0)
+ entry->parallel_commit = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -625,11 +634,41 @@ do_sql_command(PGconn *conn, const char *sql)
{
PGresult *res;
+ do_sql_command_begin(conn, sql);
+ res = pgfdw_get_result(conn, sql);
+ if (PQresultStatus(res) != PGRES_COMMAND_OK)
+ pgfdw_report_error(ERROR, res, conn, true, sql);
+ PQclear(res);
+}
+
+static void
+do_sql_command_begin(PGconn *conn, const char *sql)
+{
if (!PQsendQuery(conn, sql))
pgfdw_report_error(ERROR, NULL, conn, false, sql);
+}
+
+static void
+do_sql_command_end(PGconn *conn, const char *sql, bool ignore_errors)
+{
+ PGresult *res;
+
+ /*
+ * Consume whatever data is available from the socket (Note that if all
+ * data is available, this allows us to call PQgetResult without forcing
+ * the overhead of WaitLatchOrSocket in pgfdw_get_result, which would be
+ * very large compared to the overhead of PQconsumeInput.)
+ */
+ if (!PQconsumeInput(conn))
+ pgfdw_report_error(WARNING, NULL, conn, false, sql);
res = pgfdw_get_result(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
- pgfdw_report_error(ERROR, res, conn, true, sql);
+ {
+ if (ignore_errors)
+ pgfdw_report_error(WARNING, res, conn, true, sql);
+ else
+ pgfdw_report_error(ERROR, res, conn, true, sql);
+ }
PQclear(res);
}
@@ -888,6 +927,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
{
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
+ List *pending_entries = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -925,6 +965,13 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Commit all remote transactions during pre-commit */
entry->changing_xact_state = true;
+ if (entry->parallel_commit)
+ {
+ do_sql_command_begin(entry->conn,
+ "COMMIT TRANSACTION");
+ pending_entries = lappend(pending_entries, entry);
+ continue;
+ }
do_sql_command(entry->conn, "COMMIT TRANSACTION");
entry->changing_xact_state = false;
@@ -981,23 +1028,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* Reset state to show we're out of a transaction */
- entry->xact_depth = 0;
+ pgfdw_reset_xact_state(entry, true);
+ }
- /*
- * If the connection isn't in a good idle state, it is marked as
- * invalid or keep_connections option of its server is disabled, then
- * discard it to recover. Next GetConnection will open a new
- * connection.
- */
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
- entry->changing_xact_state ||
- entry->invalidated ||
- !entry->keep_connections)
- {
- elog(DEBUG3, "discarding connection %p", entry->conn);
- disconnect_pg_server(entry);
- }
+ /* If there are any pending remote transactions, finish closing them */
+ if (pending_entries)
+ {
+ Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
}
/*
@@ -1021,6 +1060,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
int curlevel;
+ List *pending_entries = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1063,6 +1103,12 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
/* Commit all remote subtransactions during pre-commit */
snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
entry->changing_xact_state = true;
+ if (entry->parallel_commit)
+ {
+ do_sql_command_begin(entry->conn, sql);
+ pending_entries = lappend(pending_entries, entry);
+ continue;
+ }
do_sql_command(entry->conn, sql);
entry->changing_xact_state = false;
}
@@ -1076,7 +1122,14 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* OK, we're outta that level of subtransaction */
- entry->xact_depth--;
+ pgfdw_reset_xact_state(entry, false);
+ }
+
+ /* If there are any pending remote subtransactions, finish closing them */
+ if (pending_entries)
+ {
+ Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries);
}
}
@@ -1169,6 +1222,40 @@ pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry)
server->servername)));
}
+/*
+ * Reset state to show we're out of a (sub)transaction.
+ */
+static void
+pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
+{
+ if (toplevel)
+ {
+ /* Reset state to show we're out of a transaction */
+ entry->xact_depth = 0;
+
+ /*
+ * If the connection isn't in a good idle state, it is marked as
+ * invalid or keep_connections option of its server is disabled, then
+ * discard it to recover. Next GetConnection will open a new
+ * connection.
+ */
+ if (PQstatus(entry->conn) != CONNECTION_OK ||
+ PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
+ entry->changing_xact_state ||
+ entry->invalidated ||
+ !entry->keep_connections)
+ {
+ elog(DEBUG3, "discarding connection %p", entry->conn);
+ disconnect_pg_server(entry);
+ }
+ }
+ else
+ {
+ /* Reset state to show we're out of a subtransaction */
+ entry->xact_depth--;
+ }
+}
+
/*
* Cancel the currently-in-progress query (whose query text we do not have)
* and ignore the result. Returns true if we successfully cancel the query
@@ -1449,6 +1536,82 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
entry->changing_xact_state = false;
}
+static void
+pgfdw_finish_pre_commit_cleanup(List *pending_entries)
+{
+ ConnCacheEntry *entry;
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ Assert(pending_entries);
+
+ foreach(lc, pending_entries)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ do_sql_command_end(entry->conn, "COMMIT TRANSACTION", false);
+ entry->changing_xact_state = false;
+
+ /* Do a DEALLOCATE ALL if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ entry->changing_xact_state = true;
+ do_sql_command_begin(entry->conn, "DEALLOCATE ALL");
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ pgfdw_reset_xact_state(entry, true);
+ }
+
+ if (pending_deallocs)
+ {
+ foreach(lc, pending_deallocs)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ /* Ignore errors in the DEALLOCATE (see note above) */
+ do_sql_command_end(entry->conn, "DEALLOCATE ALL", true);
+ entry->changing_xact_state = false;
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ pgfdw_reset_xact_state(entry, true);
+ }
+ }
+}
+
+static void
+pgfdw_finish_pre_subcommit_cleanup(List *pending_entries)
+{
+ ConnCacheEntry *entry;
+ int curlevel;
+ char sql[100];
+ ListCell *lc;
+
+ Assert(pending_entries);
+
+ curlevel = GetCurrentTransactionNestLevel();
+ snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
+
+ foreach(lc, pending_entries)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ do_sql_command_end(entry->conn, sql, false);
+ entry->changing_xact_state = false;
+
+ pgfdw_reset_xact_state(entry, false);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 7720ab9c58..b5d08881d5 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10825,3 +10825,79 @@ ERROR: invalid value for integer option "batch_size": 100$%$#$#
ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
ERROR: invalid option "nonexistent"
HINT: There are no valid options in this context.
+-- ===================================================================
+-- test parallel commit
+-- ===================================================================
+ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+CREATE TABLE ploc1 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc1');
+CREATE TABLE ploc2 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem2 (f1 int, f2 text)
+ SERVER loopback2 OPTIONS (table_name 'ploc2');
+BEGIN;
+INSERT INTO prem1 VALUES (101, 'foo');
+INSERT INTO prem2 VALUES (201, 'bar');
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+-----
+ 101 | foo
+(1 row)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+-----
+ 201 | bar
+(1 row)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (102, 'foofoo');
+INSERT INTO prem2 VALUES (202, 'barbar');
+RELEASE SAVEPOINT s;
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+(2 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+(2 rows)
+
+-- This tests executing DEALLOCATE ALL against foreign servers in parallel
+-- during pre-commit
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (103, 'baz');
+INSERT INTO prem2 VALUES (203, 'qux');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (104, 'bazbaz');
+INSERT INTO prem2 VALUES (204, 'quxqux');
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index c2c4e36802..29fa9461d7 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -121,6 +121,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "updatable") == 0 ||
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
+ strcmp(def->defname, "parallel_commit") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -249,6 +250,7 @@ InitPgFdwOptions(void)
/* async_capable is available on both server and table */
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
+ {"parallel_commit", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index beeac8af1e..09b7e2afba 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3452,3 +3452,49 @@ CREATE FOREIGN TABLE inv_bsz (c1 int )
-- No option is allowed to be specified at foreign data wrapper level
ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
+
+-- ===================================================================
+-- test parallel commit
+-- ===================================================================
+ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+
+CREATE TABLE ploc1 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc1');
+CREATE TABLE ploc2 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem2 (f1 int, f2 text)
+ SERVER loopback2 OPTIONS (table_name 'ploc2');
+
+BEGIN;
+INSERT INTO prem1 VALUES (101, 'foo');
+INSERT INTO prem2 VALUES (201, 'bar');
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (102, 'foofoo');
+INSERT INTO prem2 VALUES (202, 'barbar');
+RELEASE SAVEPOINT s;
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+-- This tests executing DEALLOCATE ALL against foreign servers in parallel
+-- during pre-commit
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (103, 'baz');
+INSERT INTO prem2 VALUES (203, 'qux');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (104, 'bazbaz');
+INSERT INTO prem2 VALUES (204, 'quxqux');
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 41cdb9ea1b..77ace9039f 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -455,6 +455,43 @@ OPTIONS (ADD password_required 'false');
</variablelist>
</sect3>
+ <sect3>
+ <title>Transaction Management Options</title>
+
+ <para>
+ In cases that multiple remote (sub)transactions are opened in a local
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits
+ those remote (sub)transactions one by one when the local (sub)transaction
+ commits.
+ This can be optimized with the following option:
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>parallel_commit</literal> (<type>boolean</type>)</term>
+ <listitem>
+ <para>
+ This option controls whether <filename>postgres_fdw</filename> commits
+ multiple remote (sub)transactions opened in a local (sub)transaction
+ in parallel when the local (sub)transaction commits.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <para>
+ Note that if many remote (sub)transactions are opened on a remote server
+ in a local (sub)transaction, this option might increase the remote
+ server’s load significantly when those remote (sub)transactions are
+ committed. So be careful when using this option.
+ </para>
+
+ </sect3>
+
<sect3>
<title>Updatability Options</title>
v2-0002-postgres-fdw-Add-support-for-parallel-abort.patchapplication/octet-stream; name=v2-0002-postgres-fdw-Add-support-for-parallel-abort.patchDownload
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 6337623ca4..07e449f6c2 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -80,6 +81,18 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/* macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -105,14 +118,26 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ bool ignore_errors,
+ TimestampTz endtime);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql,
bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, const char *sql,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -325,6 +350,7 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -333,6 +359,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -928,6 +956,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1022,7 +1051,16 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
- pgfdw_abort_cleanup(entry, "ABORT TRANSACTION", true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry,
+ "ABORT TRANSACTION",
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, "ABORT TRANSACTION", true);
break;
}
}
@@ -1032,11 +1070,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending remote transactions, finish closing them */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1061,6 +1109,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1118,7 +1167,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
snprintf(sql, sizeof(sql),
"ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
curlevel, curlevel);
- pgfdw_abort_cleanup(entry, sql, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, sql,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, sql, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1126,10 +1183,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending remote subtransactions, finish closing them */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1273,11 +1339,7 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
@@ -1285,6 +1347,17 @@ pgfdw_cancel_query(PGconn *conn)
*/
endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
+
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
* amount of time that we might block inside PQgetCancel().
@@ -1303,6 +1376,15 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1337,9 +1419,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1349,6 +1429,14 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
*/
endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, ignore_errors, endtime);
+}
+
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1359,6 +1447,16 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ bool ignore_errors, TimestampTz endtime)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1536,6 +1634,53 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
entry->changing_xact_state = false;
}
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, const char *sql,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
static void
pgfdw_finish_pre_commit_cleanup(List *pending_entries)
{
@@ -1612,6 +1757,151 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries)
}
}
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do so before issuing the
+ * cancel request, as in normal mode, but that is problematic: if,
+ * for example, it took longer than 30 seconds to process the first
+ * few entries in the cancel_requested list, each of the remaining
+ * entries would hit its timeout while being processed, causing its
+ * connection to be slammed shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ 30000);
+
+ /* Get and discard the result of the query. */
+ if (!pgfdw_cancel_query_end(entry->conn, endtime))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ if (!pending_entries)
+ return;
+
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+
+ /* Get the result of the command. */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, false, endtime))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /*
+ * If called for cleanup at main-transaction end, do a DEALLOCATE ALL
+ * if needed.
+ */
+ if (toplevel)
+ {
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ /* Also reset per-connection state */
+ memset(&entry->state, 0, sizeof(entry->state));
+ }
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+
+ /* Get the result of the command. */
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ true, endtime))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ /* Also reset per-connection state */
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index b5d08881d5..592ec4e61b 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, parallel_abort, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10826,10 +10826,12 @@ ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
ERROR: invalid option "nonexistent"
HINT: There are no valid options in this context.
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -10899,5 +10901,52 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 29fa9461d7..d21d5d7a4f 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -122,6 +122,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -251,6 +252,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 09b7e2afba..242572169d 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3454,10 +3454,12 @@ CREATE FOREIGN TABLE inv_bsz (c1 int )
ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3496,5 +3498,26 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 77ace9039f..ea4ce1daeb 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -460,10 +460,10 @@ OPTIONS (ADD password_required 'false');
<para>
In cases that multiple remote (sub)transactions are opened in a local
- (sub)transaction, by default <filename>postgres_fdw</filename> commits
- those remote (sub)transactions one by one when the local (sub)transaction
- commits.
- This can be optimized with the following option:
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote (sub)transactions one by one when the local
+ (sub)transaction commits or aborts.
+ This can be optimized with the following options:
</para>
<variablelist>
@@ -481,13 +481,26 @@ OPTIONS (ADD password_required 'false');
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
+ <para>
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ multiple remote (sub)transactions opened in a local (sub)transaction
+ in parallel when the local (sub)transaction aborts.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
<para>
Note that if many remote (sub)transactions are opened on a remote server
- in a local (sub)transaction, this option might increase the remote
+ in a local (sub)transaction, these options might increase the remote
server’s load significantly when those remote (sub)transactions are
- committed. So be careful when using this option.
+ committed or aborted. So be careful when using these options.
</para>
</sect3>
On 2022/01/06 17:29, Etsuro Fujita wrote:
On Fri, Dec 3, 2021 at 6:07 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2021/11/16 18:55, Etsuro Fujita wrote:
I changed my mind; I’ll update the patch to ignore the error as
before, because 1) as far as I know, there are no reports from the
field concerning that we ignore all kinds of errors in cleaning up the
prepared statements, so maybe we don’t need to change that, and 2) we
already committed at least one of the remote transactions, so it’s not
good to abort the local transaction unless we really have to.
Done.
Are you planning to update the patch? In addition to this change,
at least documentation about new parallel_commit parameter needs
to be included in the patch.
Done. Attached is a new version.
* 0001
This is an updated version of the previous patch. In addition to the
above, I expanded a comment in do_sql_command_end() a bit to explain
why we do PQconsumeInput() before doing pgfdw_get_result(), to address
your comment. Also, I moved the code to finish closing pending
(sub)transactions in pgfdw_xact_callback()(pgfdw_subxact_callback())
into separate functions. Also, I modified regression test cases a bit
to access multiple foreign servers.
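For intuition, the begin/end split the patch uses can be sketched as follows. This is an illustrative toy in plain Python, not the libpq-based implementation; `commit_on_server` and the 50 ms round-trip figure are made up. The "begin" phase issues the command to every remote connection without waiting, and the "end" phase collects the replies in the order the commands were sent, so the waits overlap:

```python
# Toy model of parallel commit: fire COMMIT at all servers, then collect
# replies in submission order. The sleep stands in for one network round trip.
import time
from concurrent.futures import ThreadPoolExecutor

def commit_on_server(round_trip):
    time.sleep(round_trip)  # simulated remote COMMIT round trip
    return "COMMIT"

round_trips = [0.05, 0.05, 0.05]  # three hypothetical remote servers

start = time.monotonic()
with ThreadPoolExecutor(max_workers=len(round_trips)) as pool:
    # "begin" phase: issue the command to every server without waiting
    pending = [pool.submit(commit_on_server, rt) for rt in round_trips]
    # "end" phase: wait for the results in the order the commands were sent
    results = [f.result() for f in pending]
elapsed = time.monotonic() - start

print(results)  # three committed remote transactions
```

Because the three 50 ms waits run concurrently, the elapsed time is close to one round trip rather than the 150 ms a serial loop would pay.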
Thanks for updating the patch!
At first I'm reading the 0001 patch. Here are the comments for the patch.
0001 patch failed to be applied. Could you rebase the patch?
+ entry->changing_xact_state = true;
+ do_sql_command_begin(entry->conn, "DEALLOCATE ALL");
+ pending_deallocs = lappend(pending_deallocs, entry);
Originally entry->changing_xact_state is not set to true when executing DEALLOCATE ALL. But the patch does that. Why do we need this change?
The source comment explains that we intentionally ignore errors in the DEALLOCATE. But the patch changes DEALLOCATE ALL so that it's executed via do_sql_command_begin(), which can cause an error. Is this OK?
+ if (ignore_errors)
+ pgfdw_report_error(WARNING, res, conn, true, sql);
When DEALLOCATE fails, originally not even a warning message is logged. But the patch changes DEALLOCATE so that its result is received via do_sql_command_end(), which can log a warning message even when the ignore_errors argument is enabled. Why do we need to change the behavior?
+ <para>
+ This option controls whether <filename>postgres_fdw</filename> commits
+ multiple remote (sub)transactions opened in a local (sub)transaction
+ in parallel when the local (sub)transaction commits.
Since parallel_commit is an option for foreign server, how the server with this option enabled is handled by postgres_fdw should be documented, instead?
+ <para>
+ Note that if many remote (sub)transactions are opened on a remote server
+ in a local (sub)transaction, this option might increase the remote
+ server’s load significantly when those remote (sub)transactions are
+ committed. So be careful when using this option.
+ </para>
This paragraph should be inside the listitem for parallel_commit, shouldn't it?
Couldn't async_capable=true also cause a similar issue? If so, shouldn't this kind of note also be documented for async_capable?
This explains that the remote server's load will be increased *significantly*. But is the "significantly" part really true? I'd like to know how much parallel_commit=true can actually increase the load on a remote server.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Hi,
On Thu, Jan 06, 2022 at 05:29:23PM +0900, Etsuro Fujita wrote:
Done. Attached is a new version.
The patchset doesn't apply anymore:
http://cfbot.cputube.org/patch_36_3392.log
=== Applying patches on top of PostgreSQL commit ID 43c2175121c829c8591fc5117b725f1f22bfb670 ===
=== applying patch ./v2-0001-postgres-fdw-Add-support-for-parallel-commit.patch
patching file contrib/postgres_fdw/connection.c
patching file contrib/postgres_fdw/expected/postgres_fdw.out
Hunk #2 FAILED at 10825.
1 out of 2 hunks FAILED -- saving rejects to file contrib/postgres_fdw/expected/postgres_fdw.out.rej
patching file contrib/postgres_fdw/option.c
patching file contrib/postgres_fdw/sql/postgres_fdw.sql
Hunk #1 FAILED at 3452.
1 out of 1 hunk FAILED -- saving rejects to file contrib/postgres_fdw/sql/postgres_fdw.sql.rej
patching file doc/src/sgml/postgres-fdw.sgml
I also see that Fujii-san raised some questions, so for now I will simply change
the patch status to Waiting on Author.
On Thu, Jan 13, 2022 at 11:54 AM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:
At first I'm reading the 0001 patch. Here are the comments for the patch.
Thanks for reviewing!
0001 patch failed to be applied. Could you rebase the patch?
Done. Attached is an updated version of the patch set.
+ entry->changing_xact_state = true;
+ do_sql_command_begin(entry->conn, "DEALLOCATE ALL");
+ pending_deallocs = lappend(pending_deallocs, entry);
Originally entry->changing_xact_state is not set to true when executing DEALLOCATE ALL. But the patch does that. Why do we need this change?
The source comment explains that we intentionally ignore errors in the DEALLOCATE. But the patch changes DEALLOCATE ALL so that it's executed via do_sql_command_begin() that can cause an error. Is this OK?
+ if (ignore_errors)
+ pgfdw_report_error(WARNING, res, conn, true, sql);
When DEALLOCATE fails, originally not even a warning message is logged. But the patch changes DEALLOCATE so that its result is received via do_sql_command_end(), which can log a warning message even when the ignore_errors argument is enabled. Why do we need to change the behavior?
Yeah, we don’t need to change the behavior, as discussed before, so I
fixed these. I came back to the patch after a while, so I forgot about
that. :-(
+ <para>
+ This option controls whether <filename>postgres_fdw</filename> commits
+ multiple remote (sub)transactions opened in a local (sub)transaction
+ in parallel when the local (sub)transaction commits.
Since parallel_commit is an option for foreign server, how the server with this option enabled is handled by postgres_fdw should be documented, instead?
Agreed. I rewrote this slightly like the attached. Does that make sense?
+ <para>
+ Note that if many remote (sub)transactions are opened on a remote server
+ in a local (sub)transaction, this option might increase the remote
+ server’s load significantly when those remote (sub)transactions are
+ committed. So be careful when using this option.
+ </para>
This paragraph should be inside the listitem for parallel_commit, shouldn't it?
I had put this note outside the list because it is rewritten into a note
about both the parallel_commit and parallel_abort options in the
following patch. But it would be good to keep the parallel-commit patch
independent, so I moved it into the list.
Couldn't async_capable=true also cause a similar issue? If so, shouldn't this kind of note also be documented for async_capable?
That’s right. I think it would be good to add a similar note about
that, but I’d like to leave that for another patch.
This explains that the remote server's load will be increased *significantly*. But is the "significantly" part really true?
I think that that would depend on how many transactions are committed
on the remote side at the same time. But the word “significantly”
might be too strong, so I dropped the word.
I'd like to know how much parallel_commit=true can actually increase the load on a remote server.
Ok, I’ll do a load test.
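As a back-of-envelope model of what such a load test probes (the numbers below are hypothetical, not measurements): with sequential commits the local backend pays one full round trip per remote server, while with parallel commit the round trips overlap, so the elapsed time is bounded by the slowest server:

```python
# Rough latency model for commit of N remote transactions.
# Round-trip times are hypothetical, in microseconds.

def sequential_commit_us(round_trips_us):
    """COMMIT is sent to each server only after the previous reply arrives."""
    return sum(round_trips_us)

def parallel_commit_us(round_trips_us):
    """COMMIT is sent to all servers first; replies arrive concurrently,
    so elapsed time is governed by the slowest server."""
    return max(round_trips_us)

round_trips_us = [400, 500, 400]  # three remote servers

print(sequential_commit_us(round_trips_us))  # 1300
print(parallel_commit_us(round_trips_us))    # 500
```

The same model also hints at why remote load can spike: a remote server receives the same number of COMMITs either way, but with parallel commit they arrive concentrated within one round-trip window instead of being spread over the serial total.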
About the #0002 patch:
This is in preparation for the parallel-abort patch (#0003), but I’d
like to propose a minor cleanup for commit 85c696112: 1) build an
abort command to be sent to the remote in pgfdw_abort_cleanup(), using
a macro, only when/if necessary, as before, and 2) add/modify comments
a little bit.
Sorry for the delay again.
Best regards,
Etsuro Fujita
Attachments:
v3-0001-postgres-fdw-Add-support-for-parallel-commit.patchapplication/octet-stream; name=v3-0001-postgres-fdw-Add-support-for-parallel-commit.patchDownload
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 29fcb6a76e..d59f91f14c 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -58,6 +58,7 @@ typedef struct ConnCacheEntry
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
+ bool parallel_commit; /* do we commit (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -92,6 +93,8 @@ static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void disconnect_pg_server(ConnCacheEntry *entry);
static void check_conn_params(const char **keywords, const char **values, UserMapping *user);
static void configure_remote_session(PGconn *conn);
+static void do_sql_command_begin(PGconn *conn, const char *sql);
+static void do_sql_command_end(PGconn *conn, const char *sql);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -100,6 +103,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
void *arg);
static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
+static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
@@ -107,6 +111,8 @@ static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql,
bool toplevel);
+static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
+static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -316,14 +322,20 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
* is changed will be closed and re-made later.
*
* By default, all the connections to any foreign servers are kept open.
+ *
+ * Also determine whether to commit (sub)transactions opened on the
+ * remote server in parallel at (sub)transaction end.
*/
entry->keep_connections = true;
+ entry->parallel_commit = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
if (strcmp(def->defname, "keep_connections") == 0)
entry->keep_connections = defGetBoolean(def);
+ if (strcmp(def->defname, "parallel_commit") == 0)
+ entry->parallel_commit = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -633,6 +645,32 @@ do_sql_command(PGconn *conn, const char *sql)
PQclear(res);
}
+static void
+do_sql_command_begin(PGconn *conn, const char *sql)
+{
+ if (!PQsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, NULL, conn, false, sql);
+}
+
+static void
+do_sql_command_end(PGconn *conn, const char *sql)
+{
+ PGresult *res;
+
+ /*
+ * Consume whatever data is available from the socket. (Note that if all
+ * data is available, this allows us to call PQgetResult without forcing
+ * the overhead of WaitLatchOrSocket in pgfdw_get_result, which would be
+ * very large compared to the overhead of PQconsumeInput.)
+ */
+ if (!PQconsumeInput(conn))
+ pgfdw_report_error(ERROR, NULL, conn, false, sql);
+ res = pgfdw_get_result(conn, sql);
+ if (PQresultStatus(res) != PGRES_COMMAND_OK)
+ pgfdw_report_error(ERROR, res, conn, true, sql);
+ PQclear(res);
+}
+
/*
* Start remote transaction or subtransaction, if needed.
*
@@ -888,6 +926,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
{
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
+ List *pending_entries = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -925,6 +964,12 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Commit all remote transactions during pre-commit */
entry->changing_xact_state = true;
+ if (entry->parallel_commit)
+ {
+ do_sql_command_begin(entry->conn, "COMMIT TRANSACTION");
+ pending_entries = lappend(pending_entries, entry);
+ continue;
+ }
do_sql_command(entry->conn, "COMMIT TRANSACTION");
entry->changing_xact_state = false;
@@ -981,23 +1026,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* Reset state to show we're out of a transaction */
- entry->xact_depth = 0;
+ pgfdw_reset_xact_state(entry, true);
+ }
- /*
- * If the connection isn't in a good idle state, it is marked as
- * invalid or keep_connections option of its server is disabled, then
- * discard it to recover. Next GetConnection will open a new
- * connection.
- */
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
- entry->changing_xact_state ||
- entry->invalidated ||
- !entry->keep_connections)
- {
- elog(DEBUG3, "discarding connection %p", entry->conn);
- disconnect_pg_server(entry);
- }
+ /* If there are any pending remote transactions, finish closing them */
+ if (pending_entries)
+ {
+ Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
}
/*
@@ -1021,6 +1058,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
int curlevel;
+ List *pending_entries = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1063,6 +1101,12 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
/* Commit all remote subtransactions during pre-commit */
snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
entry->changing_xact_state = true;
+ if (entry->parallel_commit)
+ {
+ do_sql_command_begin(entry->conn, sql);
+ pending_entries = lappend(pending_entries, entry);
+ continue;
+ }
do_sql_command(entry->conn, sql);
entry->changing_xact_state = false;
}
@@ -1076,7 +1120,14 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* OK, we're outta that level of subtransaction */
- entry->xact_depth--;
+ pgfdw_reset_xact_state(entry, false);
+ }
+
+ /* If there are any pending remote subtransactions, finish closing them */
+ if (pending_entries)
+ {
+ Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries);
}
}
@@ -1169,6 +1220,40 @@ pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry)
server->servername)));
}
+/*
+ * Reset state to show we're out of a (sub)transaction.
+ */
+static void
+pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
+{
+ if (toplevel)
+ {
+ /* Reset state to show we're out of a transaction */
+ entry->xact_depth = 0;
+
+ /*
+ * If the connection isn't in a good idle state, it is marked as
+ * invalid or keep_connections option of its server is disabled, then
+ * discard it to recover. Next GetConnection will open a new
+ * connection.
+ */
+ if (PQstatus(entry->conn) != CONNECTION_OK ||
+ PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
+ entry->changing_xact_state ||
+ entry->invalidated ||
+ !entry->keep_connections)
+ {
+ elog(DEBUG3, "discarding connection %p", entry->conn);
+ disconnect_pg_server(entry);
+ }
+ }
+ else
+ {
+ /* Reset state to show we're out of a subtransaction */
+ entry->xact_depth--;
+ }
+}
+
/*
* Cancel the currently-in-progress query (whose query text we do not have)
* and ignore the result. Returns true if we successfully cancel the query
@@ -1456,6 +1541,91 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
entry->changing_xact_state = false;
}
+static void
+pgfdw_finish_pre_commit_cleanup(List *pending_entries)
+{
+ ConnCacheEntry *entry;
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ Assert(pending_entries);
+
+ foreach(lc, pending_entries)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ do_sql_command_end(entry->conn, "COMMIT TRANSACTION");
+ entry->changing_xact_state = false;
+
+ /* Do a DEALLOCATE ALL if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ /* Ignore errors in the DEALLOCATE (see note above) */
+ if (PQsendQuery(entry->conn, "DEALLOCATE ALL"))
+ {
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ pgfdw_reset_xact_state(entry, true);
+ }
+
+ if (pending_deallocs)
+ {
+ foreach(lc, pending_deallocs)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+ PGresult *res;
+
+ /* Ignore errors in the DEALLOCATE (see note above) */
+ if ((res = PQgetResult(entry->conn)) != NULL)
+ {
+ PQclear(res);
+ /*
+ * Stop if the connection is lost (else we'll loop infinitely)
+ */
+ if (PQstatus(entry->conn) == CONNECTION_BAD)
+ break;
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ pgfdw_reset_xact_state(entry, true);
+ }
+ }
+}
+
+/*
+ * Finish pre-subcommit cleanup of connections: wait for and process the
+ * results of the RELEASE SAVEPOINT commands sent earlier in parallel.
+ */
+static void
+pgfdw_finish_pre_subcommit_cleanup(List *pending_entries)
+{
+ ConnCacheEntry *entry;
+ int curlevel;
+ char sql[100];
+ ListCell *lc;
+
+ Assert(pending_entries);
+
+ curlevel = GetCurrentTransactionNestLevel();
+ snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
+
+ foreach(lc, pending_entries)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ do_sql_command_end(entry->conn, sql);
+ entry->changing_xact_state = false;
+
+ pgfdw_reset_xact_state(entry, false);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index b2e02caefe..8043b207c5 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10913,3 +10913,79 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
--Clean up
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
+-- ===================================================================
+-- test parallel commit
+-- ===================================================================
+ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+CREATE TABLE ploc1 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc1');
+CREATE TABLE ploc2 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem2 (f1 int, f2 text)
+ SERVER loopback2 OPTIONS (table_name 'ploc2');
+BEGIN;
+INSERT INTO prem1 VALUES (101, 'foo');
+INSERT INTO prem2 VALUES (201, 'bar');
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+-----
+ 101 | foo
+(1 row)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+-----
+ 201 | bar
+(1 row)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (102, 'foofoo');
+INSERT INTO prem2 VALUES (202, 'barbar');
+RELEASE SAVEPOINT s;
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+(2 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+(2 rows)
+
+-- This tests executing DEALLOCATE ALL against foreign servers in parallel
+-- during pre-commit
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (103, 'baz');
+INSERT INTO prem2 VALUES (203, 'qux');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (104, 'bazbaz');
+INSERT INTO prem2 VALUES (204, 'quxqux');
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index fc3ce6a53a..a09c0b6db7 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -121,6 +121,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "updatable") == 0 ||
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
+ strcmp(def->defname, "parallel_commit") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -249,6 +250,7 @@ InitPgFdwOptions(void)
/* async_capable is available on both server and table */
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
+ {"parallel_commit", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index e050639b57..2dc6386b40 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3504,3 +3504,49 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
--Clean up
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
+
+-- ===================================================================
+-- test parallel commit
+-- ===================================================================
+ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+
+CREATE TABLE ploc1 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc1');
+CREATE TABLE ploc2 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem2 (f1 int, f2 text)
+ SERVER loopback2 OPTIONS (table_name 'ploc2');
+
+BEGIN;
+INSERT INTO prem1 VALUES (101, 'foo');
+INSERT INTO prem2 VALUES (201, 'bar');
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (102, 'foofoo');
+INSERT INTO prem2 VALUES (202, 'barbar');
+RELEASE SAVEPOINT s;
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+-- This tests executing DEALLOCATE ALL against foreign servers in parallel
+-- during pre-commit
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (103, 'baz');
+INSERT INTO prem2 VALUES (203, 'qux');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (104, 'bazbaz');
+INSERT INTO prem2 VALUES (204, 'quxqux');
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 2bb31f1125..7886436aae 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -455,6 +455,50 @@ OPTIONS (ADD password_required 'false');
</variablelist>
</sect3>
+ <sect3>
+ <title>Transaction Management Options</title>
+
+ <para>
+ When multiple remote (sub)transactions are involved in a local
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits
+ those remote (sub)transactions one by one when the local (sub)transaction
+ commits.
+ Performance can be improved with the following option:
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>parallel_commit</literal> (<type>boolean</type>)</term>
+ <listitem>
+ <para>
+ This option controls whether <filename>postgres_fdw</filename> commits
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction commits.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
+ </para>
+
+ <para>
+ If multiple foreign servers with this option enabled are involved in
+ a local (sub)transaction, remote (sub)transactions opened on those
+ foreign servers in the local (sub)transaction are committed in parallel
+ across those foreign servers.
+ </para>
+
+ <para>
+ For a foreign server with this option enabled, if many remote
+ (sub)transactions are opened on it in a local (sub)transaction, this
+ option might increase the remote server's load when the local
+ (sub)transaction commits, so be careful when using this option.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </sect3>
+
<sect3>
<title>Updatability Options</title>
Attachment: v3-0002-postgres_fdw-Minor-cleanup-for-pgfdw_abort_cleanup.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index d59f91f14c..22e72c99c7 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -80,6 +80,18 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/* macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -109,8 +121,7 @@ static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
-static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql,
- bool toplevel);
+static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries);
static bool UserMappingPasswordRequired(UserMapping *user);
@@ -1019,8 +1030,8 @@ pgfdw_xact_callback(XactEvent event, void *arg)
break;
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
-
- pgfdw_abort_cleanup(entry, "ABORT TRANSACTION", true);
+ /* Rollback all remote transactions during abort */
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1113,10 +1124,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- curlevel, curlevel);
- pgfdw_abort_cleanup(entry, sql, false);
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1469,10 +1477,7 @@ exit: ;
}
/*
- * Abort remote transaction.
- *
- * The statement specified in "sql" is sent to the remote server,
- * in order to rollback the remote transaction.
+ * Abort remote transaction or subtransaction.
*
* "toplevel" should be set to true if toplevel (main) transaction is
* rollbacked, false otherwise.
@@ -1480,8 +1485,10 @@ exit: ;
* Set entry->changing_xact_state to false on success, true on failure.
*/
static void
-pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
+pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
{
+ char sql[100];
+
/*
* Don't try to clean up the connection if we're already in error
* recursion trouble.
@@ -1513,8 +1520,9 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
- return; /* Unable to abort remote transaction */
+ return; /* Unable to abort remote (sub)transaction */
if (toplevel)
{
Attachment: v3-0003-postgres-fdw-Add-support-for-parallel-abort.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 22e72c99c7..9b041361f3 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -117,13 +118,25 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ bool ignore_errors,
+ TimestampTz endtime);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -334,11 +347,12 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*
* By default, all the connections to any foreign servers are kept open.
*
- * Also determine to commit (sub)transactions opened on the remote server
- * in parallel at (sub)transaction end.
+ * Also determine to commit/abort (sub)transactions opened on the remote
+ * server in parallel at (sub)transaction end.
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -347,6 +361,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -938,6 +954,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1031,7 +1048,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
/* Rollback all remote transactions during abort */
- pgfdw_abort_cleanup(entry, true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, true,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1041,11 +1066,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending remote transactions, finish closing them */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1070,6 +1105,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1124,7 +1160,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- pgfdw_abort_cleanup(entry, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, false,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1132,10 +1176,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending remote subtransactions, finish closing them */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1279,11 +1332,7 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
@@ -1291,6 +1340,17 @@ pgfdw_cancel_query(PGconn *conn)
*/
endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
+
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
* amount of time that we might block inside PQgetCancel().
@@ -1309,6 +1369,15 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1343,9 +1412,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1355,6 +1422,14 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
*/
endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, ignore_errors, endtime);
+}
+
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1365,6 +1440,16 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ bool ignore_errors, TimestampTz endtime)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1549,6 +1634,56 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
entry->changing_xact_state = false;
}
+/*
+ * Like pgfdw_abort_cleanup, but just send the abort command or cancel
+ * request and add the entry to pending_entries or cancel_requested for
+ * later processing, rather than waiting for the result here. Returns
+ * false if the entry is unsalvageable.
+ */
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ char sql[100];
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
static void
pgfdw_finish_pre_commit_cleanup(List *pending_entries)
{
@@ -1634,6 +1769,155 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries)
}
}
+/*
+ * Finish abort cleanup of connections: process the results of the cancel
+ * requests and abort commands sent earlier in parallel.
+ */
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do so before issuing the
+ * cancel request, as in normal mode, but that is problematic: if,
+ * for example, it took longer than 30 seconds to process the first
+ * few entries in the cancel_requested list, each of the remaining
+ * entries would time out while being processed, slamming its
+ * connection shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ 30000);
+
+ /* Get and discard the result of the query. */
+ if (!pgfdw_cancel_query_end(entry->conn, endtime))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ if (!pending_entries)
+ return;
+
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+
+ /* Get the result of the command. */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, false, endtime))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /*
+ * If called for cleanup at main-transaction end, do a DEALLOCATE ALL
+ * if needed.
+ */
+ if (toplevel)
+ {
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ }
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+
+ /* Get the result of the command. */
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ true, endtime))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 8043b207c5..2adb0ac125 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, parallel_abort, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10914,10 +10914,12 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -10987,5 +10989,52 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index a09c0b6db7..e8c8b4ab36 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -122,6 +122,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -251,6 +252,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 2dc6386b40..58ec80fad7 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3506,10 +3506,12 @@ RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3548,5 +3550,26 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 7886436aae..19e408ec9a 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -460,10 +460,10 @@ OPTIONS (ADD password_required 'false');
<para>
When multiple remote (sub)transactions are involved in a local
- (sub)transaction, by default <filename>postgres_fdw</filename> commits
- those remote (sub)transactions one by one when the local (sub)transaction
- commits.
- Performance can be improved with the following option:
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote (sub)transactions one by one when the local
+ (sub)transaction commits or aborts.
+ Performance can be improved with the following options:
</para>
<variablelist>
@@ -478,25 +478,38 @@ OPTIONS (ADD password_required 'false');
This option can only be specified for foreign servers, not per-table.
The default is <literal>false</literal>.
</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
<para>
- If multiple foreign servers with this option enabled are involved in
- a local (sub)transaction, remote (sub)transactions opened on those
- foreign servers in the local (sub)transaction are committed in parallel
- across those foreign servers.
- </para>
-
- <para>
- For a foreign server with this option enabled, if many remote
- (sub)transactions are opened on it in a local (sub)transaction, this
- option might increase the remote server's load when the local
- (sub)transactions commits, so be careful when using this option.
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction aborts.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
+ <para>
+ If multiple foreign servers with this option enabled are involved in a
+ local (sub)transaction, remote (sub)transactions opened on those foreign
+ servers in the local (sub)transaction are committed or aborted in parallel
+ across those foreign servers.
+ </para>
+
+ <para>
+ For a foreign server with this option enabled, if many remote
+ (sub)transactions are opened on it in a local (sub)transaction, this
+ option might increase the remote server's load when the local
+ (sub)transaction commits or aborts, so be careful when using this option.
+ </para>
+
</sect3>
<sect3>
Hi Julien,
On Sun, Jan 16, 2022 at 1:07 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
On Thu, Jan 06, 2022 at 05:29:23PM +0900, Etsuro Fujita wrote:
Done. Attached is a new version.
I also see that Fujii-san raised some questions, so for now I will simply change
the patch status to Waiting on Author.
Thanks for letting me know!
I posted a new version of the patchset which addresses Fujii-san’s
comments. I changed the status and moved the patchset to the next
commitfest.
Best regards,
Etsuro Fujita
On 2022/02/07 14:35, Etsuro Fujita wrote:
The 0001 patch failed to apply. Could you rebase it?
Done. Attached is an updated version of the patch set.
Thanks for updating the patch! Here are the review comments for the 0001 patch.
I got the following compiler warning.
[16:58:07.120] connection.c: In function ‘pgfdw_finish_pre_commit_cleanup’:
[16:58:07.120] connection.c:1726:4: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
[16:58:07.120] 1726 | PGresult *res;
[16:58:07.120] | ^~~~~~~~
+ /* Ignore errors in the DEALLOCATE (see note above) */
+ if ((res = PQgetResult(entry->conn)) != NULL)
Doesn't PQgetResult() need to be called repeatedly until it returns NULL or the connection is lost, because there can be more than one message to receive?
+ if (pending_deallocs)
+ {
+ foreach(lc, pending_deallocs)
If pending_deallocs is NIL, we don't enter this foreach loop, so the "if (pending_deallocs)" test seems unnecessary.
entry->keep_connections = defGetBoolean(def);
+ if (strcmp(def->defname, "parallel_commit") == 0)
+ entry->parallel_commit = defGetBoolean(def);
Isn't it better to use "else if" here, instead?
+static void do_sql_command_begin(PGconn *conn, const char *sql);
+static void do_sql_command_end(PGconn *conn, const char *sql);
To simplify the code more, I'm tempted to change do_sql_command() so that it just calls the above two functions, instead of calling PQsendQuery() and pgfdw_get_result() directly. Thoughts? If we do this, we probably also need to change do_sql_command_end() so that it accepts a boolean flag which specifies whether PQconsumeInput() is called, as follows.
do_sql_command_end(PGconn *conn, const char *sql, bool consumeInput)
{
/*
* If any data is expected to be available from the socket, consume it.
* ...
* When parallel_commit is enabled, since there can be a time window between
* sending query and receiving result, we can expect data is already available
* from the socket. In this case we try to consume it at first.... Otherwise..
*/
if (consumeInput && !PQconsumeInput(conn))
...
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
On Tue, Feb 8, 2022 at 3:49 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
Here are the review comments for the 0001 patch.
I got the following compiler warning.
[16:58:07.120] connection.c: In function ‘pgfdw_finish_pre_commit_cleanup’:
[16:58:07.120] connection.c:1726:4: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
[16:58:07.120] 1726 | PGresult *res;
[16:58:07.120] | ^~~~~~~~
Sorry, I didn’t notice this, because my compiler doesn’t produce it.
I tried to fix it. Attached is an updated version of the patch set.
I hope this works for you.
+ /* Ignore errors in the DEALLOCATE (see note above) */
+ if ((res = PQgetResult(entry->conn)) != NULL)

Doesn't PQgetResult() need to be called repeatedly until it returns NULL or the connection is lost, because there can be more than one message to receive?
Yeah, we would only receive a single message here, but PQgetResult must
still be called repeatedly until it returns NULL (see the documentation
note about it in libpq.sgml); otherwise the PQtransactionStatus of the
connection would remain PQTRANS_ACTIVE, causing the connection to be
closed at transaction end, because pgfdw_reset_xact_state, called from
pgfdw_xact_callback, does this:
/*
* If the connection isn't in a good idle state, it is marked as
* invalid or keep_connections option of its server is disabled, then
* discard it to recover. Next GetConnection will open a new
* connection.
*/
if (PQstatus(entry->conn) != CONNECTION_OK ||
PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
entry->changing_xact_state ||
entry->invalidated ||
!entry->keep_connections)
{
elog(DEBUG3, "discarding connection %p", entry->conn);
disconnect_pg_server(entry);
}
But I noticed a brown-paper-bag bug in the bit you showed above: the
if test should be a while loop. :-( I fixed this in the attached.
+ if (pending_deallocs)
+ {
+ foreach(lc, pending_deallocs)

If pending_deallocs is NIL, we don't enter this foreach loop. So probably "if (pending_deallocs)" seems not necessary.
Yeah, I think we could omit the if test, but I added it to match other
places (see e.g., foreign_grouping_ok() in postgres_fdw.c). It looks
cleaner to me to have it before the loop.
entry->keep_connections = defGetBoolean(def);
+ if (strcmp(def->defname, "parallel_commit") == 0)
+ entry->parallel_commit = defGetBoolean(def);

Isn't it better to use "else if" here, instead?
Yeah, that would be better. Done.
+static void do_sql_command_begin(PGconn *conn, const char *sql);
+static void do_sql_command_end(PGconn *conn, const char *sql);

To simplify the code more, I'm tempted to change do_sql_command() so that it just calls the above two functions, instead of calling PQsendQuery() and pgfdw_get_result() directly. Thoughts? If we do this, we probably also need to change do_sql_command_end() so that it accepts a boolean flag which specifies whether PQconsumeInput() is called, as follows.
Done. Actually, I was planning to do this for consistency with a
similar refactoring for pgfdw_cancel_query and
pgfdw_exec_cleanup_query that had been done in the parallel-abort patch.
I tweaked comments/docs a little bit as well.
Thanks for reviewing!
Best regards,
Etsuro Fujita
Attachments:
v4-0001-postgres-fdw-Add-support-for-parallel-commit.patchapplication/octet-stream; name=v4-0001-postgres-fdw-Add-support-for-parallel-commit.patchDownload
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 29fcb6a76e..3f4e71da9e 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -58,6 +58,7 @@ typedef struct ConnCacheEntry
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
+ bool parallel_commit; /* do we commit (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -92,6 +93,9 @@ static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void disconnect_pg_server(ConnCacheEntry *entry);
static void check_conn_params(const char **keywords, const char **values, UserMapping *user);
static void configure_remote_session(PGconn *conn);
+static void do_sql_command_begin(PGconn *conn, const char *sql);
+static void do_sql_command_end(PGconn *conn, const char *sql,
+ bool consume_input);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -100,6 +104,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
void *arg);
static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
+static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
@@ -107,6 +112,8 @@ static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql,
bool toplevel);
+static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
+static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -316,14 +323,20 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
* is changed will be closed and re-made later.
*
* By default, all the connections to any foreign servers are kept open.
+ *
+ * Also determine to commit (sub)transactions opened on the remote server
+ * in parallel at (sub)transaction end.
*/
entry->keep_connections = true;
+ entry->parallel_commit = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
if (strcmp(def->defname, "keep_connections") == 0)
entry->keep_connections = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_commit") == 0)
+ entry->parallel_commit = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -623,10 +636,30 @@ configure_remote_session(PGconn *conn)
void
do_sql_command(PGconn *conn, const char *sql)
{
- PGresult *res;
+ do_sql_command_begin(conn, sql);
+ do_sql_command_end(conn, sql, false);
+}
+static void
+do_sql_command_begin(PGconn *conn, const char *sql)
+{
if (!PQsendQuery(conn, sql))
pgfdw_report_error(ERROR, NULL, conn, false, sql);
+}
+
+static void
+do_sql_command_end(PGconn *conn, const char *sql, bool consume_input)
+{
+ PGresult *res;
+
+ /*
+ * Consume whatever data is available from the socket if requested. Note
+ * that if all data is available, this allows us to call PQgetResult
+ * without forcing the overhead of WaitLatchOrSocket in pgfdw_get_result,
+ * which would be very large compared to the overhead of PQconsumeInput.
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ pgfdw_report_error(ERROR, NULL, conn, false, sql);
res = pgfdw_get_result(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
@@ -888,6 +921,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
{
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
+ List *pending_entries = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -925,6 +959,12 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Commit all remote transactions during pre-commit */
entry->changing_xact_state = true;
+ if (entry->parallel_commit)
+ {
+ do_sql_command_begin(entry->conn, "COMMIT TRANSACTION");
+ pending_entries = lappend(pending_entries, entry);
+ continue;
+ }
do_sql_command(entry->conn, "COMMIT TRANSACTION");
entry->changing_xact_state = false;
@@ -981,23 +1021,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* Reset state to show we're out of a transaction */
- entry->xact_depth = 0;
+ pgfdw_reset_xact_state(entry, true);
+ }
- /*
- * If the connection isn't in a good idle state, it is marked as
- * invalid or keep_connections option of its server is disabled, then
- * discard it to recover. Next GetConnection will open a new
- * connection.
- */
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
- entry->changing_xact_state ||
- entry->invalidated ||
- !entry->keep_connections)
- {
- elog(DEBUG3, "discarding connection %p", entry->conn);
- disconnect_pg_server(entry);
- }
+ /* If there are any pending connections, finish cleaning them up */
+ if (pending_entries)
+ {
+ Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
}
/*
@@ -1021,6 +1053,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
int curlevel;
+ List *pending_entries = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1063,6 +1096,12 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
/* Commit all remote subtransactions during pre-commit */
snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
entry->changing_xact_state = true;
+ if (entry->parallel_commit)
+ {
+ do_sql_command_begin(entry->conn, sql);
+ pending_entries = lappend(pending_entries, entry);
+ continue;
+ }
do_sql_command(entry->conn, sql);
entry->changing_xact_state = false;
}
@@ -1076,7 +1115,14 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* OK, we're outta that level of subtransaction */
- entry->xact_depth--;
+ pgfdw_reset_xact_state(entry, false);
+ }
+
+ /* If there are any pending connections, finish cleaning them up */
+ if (pending_entries)
+ {
+ Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries);
}
}
@@ -1169,6 +1215,40 @@ pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry)
server->servername)));
}
+/*
+ * Reset state to show we're out of a (sub)transaction.
+ */
+static void
+pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
+{
+ if (toplevel)
+ {
+ /* Reset state to show we're out of a transaction */
+ entry->xact_depth = 0;
+
+ /*
+ * If the connection isn't in a good idle state, it is marked as
+ * invalid or keep_connections option of its server is disabled, then
+ * discard it to recover. Next GetConnection will open a new
+ * connection.
+ */
+ if (PQstatus(entry->conn) != CONNECTION_OK ||
+ PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
+ entry->changing_xact_state ||
+ entry->invalidated ||
+ !entry->keep_connections)
+ {
+ elog(DEBUG3, "discarding connection %p", entry->conn);
+ disconnect_pg_server(entry);
+ }
+ }
+ else
+ {
+ /* Reset state to show we're out of a subtransaction */
+ entry->xact_depth--;
+ }
+}
+
/*
* Cancel the currently-in-progress query (whose query text we do not have)
* and ignore the result. Returns true if we successfully cancel the query
@@ -1456,6 +1536,116 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
entry->changing_xact_state = false;
}
+/*
+ * Finish pre-commit cleanup of connections on which we have sent a COMMIT
+ * command.
+ */
+static void
+pgfdw_finish_pre_commit_cleanup(List *pending_entries)
+{
+ ConnCacheEntry *entry;
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ Assert(pending_entries);
+
+ /*
+ * Get the result of the COMMIT command for each of the pending entries
+ */
+ foreach(lc, pending_entries)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ /*
+ * We might already have received the result on the socket, so pass
+ * consume_input=true to try to consume it first
+ */
+ do_sql_command_end(entry->conn, "COMMIT TRANSACTION", true);
+ entry->changing_xact_state = false;
+
+ /* Do a DEALLOCATE ALL in parallel if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ /* Ignore errors in the DEALLOCATE (see note above) */
+ if (PQsendQuery(entry->conn, "DEALLOCATE ALL"))
+ {
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ pgfdw_reset_xact_state(entry, true);
+ }
+
+ /*
+ * Get the result of the DEALLOCATE command for each of the pending
+ * entries if any
+ */
+ if (pending_deallocs)
+ {
+ foreach(lc, pending_deallocs)
+ {
+ PGresult *res;
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ /* Ignore errors in the DEALLOCATE (see note above) */
+ while ((res = PQgetResult(entry->conn)) != NULL)
+ {
+ PQclear(res);
+ /*
+ * Stop if the connection is lost (else we'll loop infinitely)
+ */
+ if (PQstatus(entry->conn) == CONNECTION_BAD)
+ break;
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ pgfdw_reset_xact_state(entry, true);
+ }
+ }
+}
+
+/*
+ * Finish pre-subcommit cleanup of connections on which we have sent a RELEASE
+ * command.
+ */
+static void
+pgfdw_finish_pre_subcommit_cleanup(List *pending_entries)
+{
+ ConnCacheEntry *entry;
+ int curlevel;
+ char sql[100];
+ ListCell *lc;
+
+ Assert(pending_entries);
+
+ /*
+ * Get the result of the RELEASE command for each of the pending entries
+ */
+ curlevel = GetCurrentTransactionNestLevel();
+ snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
+ foreach(lc, pending_entries)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ /*
+ * We might already have received the result on the socket, so pass
+ * consume_input=true to try to consume it first
+ */
+ do_sql_command_end(entry->conn, sql, true);
+ entry->changing_xact_state = false;
+
+ pgfdw_reset_xact_state(entry, false);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index b2e02caefe..8043b207c5 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10913,3 +10913,79 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
--Clean up
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
+-- ===================================================================
+-- test parallel commit
+-- ===================================================================
+ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+CREATE TABLE ploc1 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc1');
+CREATE TABLE ploc2 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem2 (f1 int, f2 text)
+ SERVER loopback2 OPTIONS (table_name 'ploc2');
+BEGIN;
+INSERT INTO prem1 VALUES (101, 'foo');
+INSERT INTO prem2 VALUES (201, 'bar');
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+-----
+ 101 | foo
+(1 row)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+-----
+ 201 | bar
+(1 row)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (102, 'foofoo');
+INSERT INTO prem2 VALUES (202, 'barbar');
+RELEASE SAVEPOINT s;
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+(2 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+(2 rows)
+
+-- This tests executing DEALLOCATE ALL against foreign servers in parallel
+-- during pre-commit
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (103, 'baz');
+INSERT INTO prem2 VALUES (203, 'qux');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (104, 'bazbaz');
+INSERT INTO prem2 VALUES (204, 'quxqux');
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index fc3ce6a53a..a09c0b6db7 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -121,6 +121,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "updatable") == 0 ||
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
+ strcmp(def->defname, "parallel_commit") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -249,6 +250,7 @@ InitPgFdwOptions(void)
/* async_capable is available on both server and table */
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
+ {"parallel_commit", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index e050639b57..2dc6386b40 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3504,3 +3504,49 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
--Clean up
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
+
+-- ===================================================================
+-- test parallel commit
+-- ===================================================================
+ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+
+CREATE TABLE ploc1 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc1');
+CREATE TABLE ploc2 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem2 (f1 int, f2 text)
+ SERVER loopback2 OPTIONS (table_name 'ploc2');
+
+BEGIN;
+INSERT INTO prem1 VALUES (101, 'foo');
+INSERT INTO prem2 VALUES (201, 'bar');
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (102, 'foofoo');
+INSERT INTO prem2 VALUES (202, 'barbar');
+RELEASE SAVEPOINT s;
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+-- This tests executing DEALLOCATE ALL against foreign servers in parallel
+-- during pre-commit
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (103, 'baz');
+INSERT INTO prem2 VALUES (203, 'qux');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (104, 'bazbaz');
+INSERT INTO prem2 VALUES (204, 'quxqux');
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 2bb31f1125..238e2f84a6 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -455,6 +455,52 @@ OPTIONS (ADD password_required 'false');
</variablelist>
</sect3>
+ <sect3>
+ <title>Transaction Management Options</title>
+
+ <para>
+ When multiple remote (sub)transactions are involved in a local
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits
+ those remote (sub)transactions one by one when the local (sub)transaction
+ commits.
+ Performance can be improved with the following option:
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>parallel_commit</literal> (<type>boolean</type>)</term>
+ <listitem>
+ <para>
+ This option controls whether <filename>postgres_fdw</filename> commits
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction commits.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
+ </para>
+
+ <para>
+ If multiple foreign servers with this option enabled are involved in
+ a local (sub)transaction, multiple remote (sub)transactions opened on
+ those foreign servers in the local (sub)transaction are committed in
+ parallel across those foreign servers when the local (sub)transaction
+ commits.
+ </para>
+
+ <para>
+ For a foreign server with this option enabled, if many remote
+ (sub)transactions are opened on the foreign server in a local
+ (sub)transaction, this option might increase the remote server's load
+ when the local (sub)transaction commits, so be careful when using this
+ option.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </sect3>
+
<sect3>
<title>Updatability Options</title>
v4-0002-postgres_fdw-Minor-cleanup-for-pgfdw_abort_cleanup.patchapplication/octet-stream; name=v4-0002-postgres_fdw-Minor-cleanup-for-pgfdw_abort_cleanup.patchDownload
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 3f4e71da9e..2deb4b2c4a 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -80,6 +80,18 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/* macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -110,8 +122,7 @@ static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
-static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql,
- bool toplevel);
+static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries);
static bool UserMappingPasswordRequired(UserMapping *user);
@@ -1014,8 +1025,8 @@ pgfdw_xact_callback(XactEvent event, void *arg)
break;
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
-
- pgfdw_abort_cleanup(entry, "ABORT TRANSACTION", true);
+ /* Rollback all remote transactions during abort */
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1108,10 +1119,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- curlevel, curlevel);
- pgfdw_abort_cleanup(entry, sql, false);
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1464,10 +1472,7 @@ exit: ;
}
/*
- * Abort remote transaction.
- *
- * The statement specified in "sql" is sent to the remote server,
- * in order to rollback the remote transaction.
+ * Abort remote transaction or subtransaction.
*
* "toplevel" should be set to true if toplevel (main) transaction is
* rollbacked, false otherwise.
@@ -1475,8 +1480,10 @@ exit: ;
* Set entry->changing_xact_state to false on success, true on failure.
*/
static void
-pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
+pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
{
+ char sql[100];
+
/*
* Don't try to clean up the connection if we're already in error
* recursion trouble.
@@ -1508,8 +1515,9 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
- return; /* Unable to abort remote transaction */
+ return; /* Unable to abort remote (sub)transaction */
if (toplevel)
{
v4-0003-postgres-fdw-Add-support-for-parallel-abort.patchapplication/octet-stream; name=v4-0003-postgres-fdw-Add-support-for-parallel-abort.patchDownload
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 2deb4b2c4a..bd11e762b7 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -118,13 +119,25 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ bool ignore_errors,
+ TimestampTz endtime);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -335,11 +348,12 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*
* By default, all the connections to any foreign servers are kept open.
*
- * Also determine to commit (sub)transactions opened on the remote server
- * in parallel at (sub)transaction end.
+ * Also determine to commit/abort (sub)transactions opened on the remote
+ * server in parallel at (sub)transaction end.
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -348,6 +362,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
else if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -933,6 +949,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1026,7 +1043,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
/* Rollback all remote transactions during abort */
- pgfdw_abort_cleanup(entry, true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, true,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1036,11 +1061,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1065,6 +1100,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1119,7 +1155,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- pgfdw_abort_cleanup(entry, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, false,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1127,10 +1171,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1274,11 +1327,7 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
@@ -1286,6 +1335,17 @@ pgfdw_cancel_query(PGconn *conn)
*/
endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
+
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
* amount of time that we might block inside PQgetCancel().
@@ -1304,6 +1364,15 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1338,9 +1407,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1350,6 +1417,14 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
*/
endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, ignore_errors, endtime);
+}
+
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1360,6 +1435,16 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ bool ignore_errors, TimestampTz endtime)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1544,6 +1629,56 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
entry->changing_xact_state = false;
}
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ char sql[100];
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
/*
* Finish pre-commit cleanup of connections on which we have sent a COMMIT
* command.
@@ -1654,6 +1789,159 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries)
}
}
+/*
+ * Finish (sub)abort cleanup of connections on which we have sent a (sub)abort
+ * command or cancel request.
+ */
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do so before issuing
+ * cancel request like in normal mode, but that is problematic,
+ * because if, for example, it took longer than 30 seconds to
+ * process the first few entries in the cancel_requested list, it
+ * would cause a timeout for each of the remaining entries in the
+ * list when processing it, leading to slamming the connection of
+ * it shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ 30000);
+
+ /* Get and discard the result of the query. */
+ if (!pgfdw_cancel_query_end(entry->conn, endtime))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ if (!pending_entries)
+ return;
+
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+
+ /* Get the result of the command. */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, false, endtime))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /*
+ * If called for cleanup at main-transaction end, do a DEALLOCATE ALL
+ * if needed.
+ */
+ if (toplevel)
+ {
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ }
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+
+ /* Get the result of the command. */
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ true, endtime))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 8043b207c5..2adb0ac125 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, parallel_abort, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10914,10 +10914,12 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -10987,5 +10989,52 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index a09c0b6db7..e8c8b4ab36 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -122,6 +122,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -251,6 +252,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 2dc6386b40..58ec80fad7 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3506,10 +3506,12 @@ RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3548,5 +3550,26 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 238e2f84a6..196e0a986c 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -460,10 +460,10 @@ OPTIONS (ADD password_required 'false');
<para>
When multiple remote (sub)transactions are involved in a local
- (sub)transaction, by default <filename>postgres_fdw</filename> commits
- those remote (sub)transactions one by one when the local (sub)transaction
- commits.
- Performance can be improved with the following option:
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote (sub)transactions one by one when the local
+ (sub)transaction commits or aborts.
+ Performance can be improved with the following options:
</para>
<variablelist>
@@ -478,27 +478,40 @@ OPTIONS (ADD password_required 'false');
This option can only be specified for foreign servers, not per-table.
The default is <literal>false</literal>.
</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
<para>
- If multiple foreign servers with this option enabled are involved in
- a local (sub)transaction, multiple remote (sub)transactions opened on
- those foreign servers in the local (sub)transaction are committed in
- parallel across those foreign servers when the local (sub)transaction
- commits.
- </para>
-
- <para>
- For a foreign server with this option enabled, if many remote
- (sub)transactions are opened on the foreign server in a local
- (sub)transaction, this option might increase the remote server’s load
- when the local (sub)transaction commits, so be careful when using this
- option.
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction aborts.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
+ <para>
+ If multiple foreign servers with these options enabled are involved in a
+ local (sub)transaction, multiple remote (sub)transactions opened on those
+ foreign servers in the local (sub)transaction are committed or aborted in
+ parallel across those foreign servers when the local (sub)transaction
+ commits or aborts.
+ </para>
+
+ <para>
+ For a foreign server with these options enabled, if many remote
+ (sub)transactions are opened on the foreign server in a local
+ (sub)transaction, these options might increase the remote server’s load
+ when the local (sub)transaction commits or aborts, so be careful when
+ using these options.
+ </para>
+
</sect3>
<sect3>
On 2022/02/11 21:59, Etsuro Fujita wrote:
I tweaked comments/docs a little bit as well.
Thanks for updating the patches!
I reviewed 0001 patch. It looks good to me except the following minor things. If these are addressed, I think that the 0001 patch can be marked as ready for committer.
+ * Also determine to commit (sub)transactions opened on the remote server
+ * in parallel at (sub)transaction end.
Like the comment "Determine whether to keep the connection ...", "determine to commit" should be "determine whether to commit"?
"remote server" should be "remote servers"?
+ curlevel = GetCurrentTransactionNestLevel();
+ snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
Why does pgfdw_finish_pre_subcommit_cleanup() need to call GetCurrentTransactionNestLevel() and construct the "RELEASE SAVEPOINT" query string again? pgfdw_subxact_callback() already does them and probably we can make it pass either of them to pgfdw_finish_pre_subcommit_cleanup() as its argument.
+ This option controls whether <filename>postgres_fdw</filename> commits
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction commits.
"a foreign server" should be "foreign servers"?
"in a local (sub)transaction" part seems not to be necessary.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Thanks a lot for updating the patch.
Tried to apply the patches to master branch, no warning found and
regression test passed.
Now, we have many places (5) calling the same function with a constant
number 30000. Is this a good time to consider redefining this number as a
macro somewhere?
Thank you,
On 2022-02-17 8:46 a.m., Fujii Masao wrote:
On 2022/02/11 21:59, Etsuro Fujita wrote:
I tweaked comments/docs a little bit as well.
Thanks for updating the patches!
I reviewed 0001 patch. It looks good to me except the following minor
things. If these are addressed, I think that the 0001 patch can be
marked as ready for committer.
+ * Also determine to commit (sub)transactions opened on the remote server
+ * in parallel at (sub)transaction end.
Like the comment "Determine whether to keep the connection ...",
"determine to commit" should be "determine whether to commit"?
"remote server" should be "remote servers"?
+ curlevel = GetCurrentTransactionNestLevel();
+ snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
Why does pgfdw_finish_pre_subcommit_cleanup() need to call
GetCurrentTransactionNestLevel() and construct the "RELEASE SAVEPOINT"
query string again? pgfdw_subxact_callback() already does them and
probably we can make it pass either of them to
pgfdw_finish_pre_subcommit_cleanup() as its argument.
+ This option controls whether <filename>postgres_fdw</filename> commits
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction commits.
"a foreign server" should be "foreign servers"?
"in a local (sub)transaction" part seems not to be necessary.
Regards,
--
David
Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
On Fri, Feb 18, 2022 at 1:46 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
I reviewed 0001 patch. It looks good to me except the following minor things. If these are addressed, I think that the 0001 patch can be marked as ready for committer.
OK
+ * Also determine to commit (sub)transactions opened on the remote server
+ * in parallel at (sub)transaction end.
Like the comment "Determine whether to keep the connection ...", "determine to commit" should be "determine whether to commit"?
Agreed. I’ll change it as such.
"remote server" should be "remote servers"?
Maybe I’m missing something, but we determine this for the given
remote server, so it seems to me correct to say “the remote server”,
not “the remote servers”.
+ curlevel = GetCurrentTransactionNestLevel();
+ snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
Why does pgfdw_finish_pre_subcommit_cleanup() need to call GetCurrentTransactionNestLevel() and construct the "RELEASE SAVEPOINT" query string again? pgfdw_subxact_callback() already does them and probably we can make it pass either of them to pgfdw_finish_pre_subcommit_cleanup() as its argument.
Yeah, that would save cycles, but I think it would make the code a bit
unclean, IMO. (To save cycles, I think we could also modify
pgfdw_subxact_callback() to reuse the query in the while loop in that
function when processing multiple open remote subtransactions there,
but that would make the code a bit complicated, so I don’t think it’s a
good idea to do so, either.) So I’d vote for reconstructing the query
in pgfdw_finish_pre_subcommit_cleanup() as we do in
pgfdw_subxact_callback().
To avoid calling GetCurrentTransactionNestLevel() again, I think we
could pass the curlevel variable to that function.
+ This option controls whether <filename>postgres_fdw</filename> commits
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction commits.
"a foreign server" should be "foreign servers"?
I thought it would be good to say “a foreign server”, not “foreign
servers”, because it makes clear that even remote transactions opened
on a single foreign server are committed in parallel. (To say that
this option is not for a specific foreign server, I added to the
documentation “This option can only be specified for foreign
servers”.)
"in a local (sub)transaction" part seems not to be necessary.
And I thought adding it would make clear which remote transactions are
committed in parallel. But maybe I’m missing something, so could you
elaborate a bit more on these?
Thanks for reviewing!
Best regards,
Etsuro Fujita
On Sat, Feb 19, 2022 at 6:55 AM David Zhang <david.zhang@highgo.ca> wrote:
Tried to apply the patches to master branch, no warning found and
regression test passed.
Thanks for testing!
Now, we have many places (5) calling the same function with a constant
number 30000. Is this a good time to consider redefining this number as a
macro somewhere?
Yeah, I think that is a good idea. I’ll do so in the next version of
the parallel-abort patch (#0003) if no objections.
Best regards,
Etsuro Fujita
On 2022/02/21 14:45, Etsuro Fujita wrote:
On Fri, Feb 18, 2022 at 1:46 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
I reviewed 0001 patch. It looks good to me except the following minor things. If these are addressed, I think that the 0001 patch can be marked as ready for committer.
OK
+ * Also determine to commit (sub)transactions opened on the remote server
+ * in parallel at (sub)transaction end.
Like the comment "Determine whether to keep the connection ...", "determine to commit" should be "determine whether to commit"?
Agreed. I’ll change it as such.
Thanks! If that's updated, IMO it's ok to commit the 0001 patch.
After the commit, I will review 0002 and 0003 patches.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
On Tue, Feb 22, 2022 at 1:03 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2022/02/21 14:45, Etsuro Fujita wrote:
On Fri, Feb 18, 2022 at 1:46 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
I reviewed 0001 patch. It looks good to me except the following minor things. If these are addressed, I think that the 0001 patch can be marked as ready for committer.
+ * Also determine to commit (sub)transactions opened on the remote server
+ * in parallel at (sub)transaction end.
Like the comment "Determine whether to keep the connection ...", "determine to commit" should be "determine whether to commit"?
Agreed. I’ll change it as such.
Done.
Thanks! If that's updated, IMO it's ok to commit the 0001 patch.
Cool! Attached is an updated patch. Other changes besides that:
1) I added the curlevel parameter to
pgfdw_finish_pre_subcommit_cleanup() to avoid doing
GetCurrentTransactionNestLevel() there, as proposed, and 2) tweaked
comments a bit further, mostly for/in
pgfdw_finish_pre_commit_cleanup() and
pgfdw_finish_pre_subcommit_cleanup(). Barring objections, I’ll commit
the patch.
Thanks for reviewing!
Best regards,
Etsuro Fujita
Attachments:
v5-0001-postgres-fdw-Add-support-for-parallel-commit.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index f753c6e232..8c64d42dda 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -58,6 +58,7 @@ typedef struct ConnCacheEntry
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
+ bool parallel_commit; /* do we commit (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -92,6 +93,9 @@ static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void disconnect_pg_server(ConnCacheEntry *entry);
static void check_conn_params(const char **keywords, const char **values, UserMapping *user);
static void configure_remote_session(PGconn *conn);
+static void do_sql_command_begin(PGconn *conn, const char *sql);
+static void do_sql_command_end(PGconn *conn, const char *sql,
+ bool consume_input);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -100,6 +104,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
void *arg);
static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
+static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
@@ -107,6 +112,9 @@ static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql,
bool toplevel);
+static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
+static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
+ int curlevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -316,14 +324,20 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
* is changed will be closed and re-made later.
*
* By default, all the connections to any foreign servers are kept open.
+ *
+ * Also determine whether to commit (sub)transactions opened on the remote
+ * server in parallel at (sub)transaction end.
*/
entry->keep_connections = true;
+ entry->parallel_commit = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
if (strcmp(def->defname, "keep_connections") == 0)
entry->keep_connections = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_commit") == 0)
+ entry->parallel_commit = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -623,10 +637,30 @@ configure_remote_session(PGconn *conn)
void
do_sql_command(PGconn *conn, const char *sql)
{
- PGresult *res;
+ do_sql_command_begin(conn, sql);
+ do_sql_command_end(conn, sql, false);
+}
+
+static void
+do_sql_command_begin(PGconn *conn, const char *sql)
+{
if (!PQsendQuery(conn, sql))
pgfdw_report_error(ERROR, NULL, conn, false, sql);
+}
+
+static void
+do_sql_command_end(PGconn *conn, const char *sql, bool consume_input)
+{
+ PGresult *res;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows pgfdw_get_result to
+ * call PQgetResult without forcing the overhead of WaitLatchOrSocket,
+ * which would be large compared to the overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ pgfdw_report_error(ERROR, NULL, conn, false, sql);
res = pgfdw_get_result(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
@@ -888,6 +922,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
{
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
+ List *pending_entries = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -925,6 +960,12 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Commit all remote transactions during pre-commit */
entry->changing_xact_state = true;
+ if (entry->parallel_commit)
+ {
+ do_sql_command_begin(entry->conn, "COMMIT TRANSACTION");
+ pending_entries = lappend(pending_entries, entry);
+ continue;
+ }
do_sql_command(entry->conn, "COMMIT TRANSACTION");
entry->changing_xact_state = false;
@@ -981,23 +1022,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* Reset state to show we're out of a transaction */
- entry->xact_depth = 0;
+ pgfdw_reset_xact_state(entry, true);
+ }
- /*
- * If the connection isn't in a good idle state, it is marked as
- * invalid or keep_connections option of its server is disabled, then
- * discard it to recover. Next GetConnection will open a new
- * connection.
- */
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
- entry->changing_xact_state ||
- entry->invalidated ||
- !entry->keep_connections)
- {
- elog(DEBUG3, "discarding connection %p", entry->conn);
- disconnect_pg_server(entry);
- }
+ /* If there are any pending connections, finish cleaning them up */
+ if (pending_entries)
+ {
+ Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
}
/*
@@ -1021,6 +1054,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
int curlevel;
+ List *pending_entries = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1063,6 +1097,12 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
/* Commit all remote subtransactions during pre-commit */
snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
entry->changing_xact_state = true;
+ if (entry->parallel_commit)
+ {
+ do_sql_command_begin(entry->conn, sql);
+ pending_entries = lappend(pending_entries, entry);
+ continue;
+ }
do_sql_command(entry->conn, sql);
entry->changing_xact_state = false;
}
@@ -1076,7 +1116,14 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* OK, we're outta that level of subtransaction */
- entry->xact_depth--;
+ pgfdw_reset_xact_state(entry, false);
+ }
+
+ /* If there are any pending connections, finish cleaning them up */
+ if (pending_entries)
+ {
+ Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
}
}
@@ -1169,6 +1216,40 @@ pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry)
server->servername)));
}
+/*
+ * Reset state to show we're out of a (sub)transaction.
+ */
+static void
+pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
+{
+ if (toplevel)
+ {
+ /* Reset state to show we're out of a transaction */
+ entry->xact_depth = 0;
+
+ /*
+ * If the connection isn't in a good idle state, it is marked as
+ * invalid or keep_connections option of its server is disabled, then
+ * discard it to recover. Next GetConnection will open a new
+ * connection.
+ */
+ if (PQstatus(entry->conn) != CONNECTION_OK ||
+ PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
+ entry->changing_xact_state ||
+ entry->invalidated ||
+ !entry->keep_connections)
+ {
+ elog(DEBUG3, "discarding connection %p", entry->conn);
+ disconnect_pg_server(entry);
+ }
+ }
+ else
+ {
+ /* Reset state to show we're out of a subtransaction */
+ entry->xact_depth--;
+ }
+}
+
/*
* Cancel the currently-in-progress query (whose query text we do not have)
* and ignore the result. Returns true if we successfully cancel the query
@@ -1456,6 +1537,112 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
entry->changing_xact_state = false;
}
+/*
+ * Finish pre-commit cleanup of connections on each of which we've sent a
+ * COMMIT command to the remote server.
+ */
+static void
+pgfdw_finish_pre_commit_cleanup(List *pending_entries)
+{
+ ConnCacheEntry *entry;
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ Assert(pending_entries);
+
+ /*
+ * Get the result of the COMMIT command for each of the pending entries
+ */
+ foreach(lc, pending_entries)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ /*
+ * We might already have received the result on the socket, so pass
+ * consume_input=true to try to consume it first
+ */
+ do_sql_command_end(entry->conn, "COMMIT TRANSACTION", true);
+ entry->changing_xact_state = false;
+
+ /* Do a DEALLOCATE ALL in parallel if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ /* Ignore errors (see notes in pgfdw_xact_callback) */
+ if (PQsendQuery(entry->conn, "DEALLOCATE ALL"))
+ {
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ pgfdw_reset_xact_state(entry, true);
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_deallocs)
+ return;
+
+ /*
+ * Get the result of the DEALLOCATE command for each of the pending
+ * entries
+ */
+ foreach(lc, pending_deallocs)
+ {
+ PGresult *res;
+
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ /* Ignore errors (see notes in pgfdw_xact_callback) */
+ while ((res = PQgetResult(entry->conn)) != NULL)
+ {
+ PQclear(res);
+ /* Stop if the connection is lost (else we'll loop infinitely) */
+ if (PQstatus(entry->conn) == CONNECTION_BAD)
+ break;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ pgfdw_reset_xact_state(entry, true);
+ }
+}
+
+/*
+ * Finish pre-subcommit cleanup of connections on each of which we've sent a
+ * RELEASE command to the remote server.
+ */
+static void
+pgfdw_finish_pre_subcommit_cleanup(List *pending_entries, int curlevel)
+{
+ ConnCacheEntry *entry;
+ char sql[100];
+ ListCell *lc;
+
+ Assert(pending_entries);
+
+ /*
+ * Get the result of the RELEASE command for each of the pending entries
+ */
+ snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
+ foreach(lc, pending_entries)
+ {
+ entry = (ConnCacheEntry *) lfirst(lc);
+
+ Assert(entry->changing_xact_state);
+ /*
+ * We might already have received the result on the socket, so pass
+ * consume_input=true to try to consume it first
+ */
+ do_sql_command_end(entry->conn, sql, true);
+ entry->changing_xact_state = false;
+
+ pgfdw_reset_xact_state(entry, false);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 057342083c..f210f91188 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10933,3 +10933,79 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
--Clean up
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
+-- ===================================================================
+-- test parallel commit
+-- ===================================================================
+ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+CREATE TABLE ploc1 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc1');
+CREATE TABLE ploc2 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem2 (f1 int, f2 text)
+ SERVER loopback2 OPTIONS (table_name 'ploc2');
+BEGIN;
+INSERT INTO prem1 VALUES (101, 'foo');
+INSERT INTO prem2 VALUES (201, 'bar');
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+-----
+ 101 | foo
+(1 row)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+-----
+ 201 | bar
+(1 row)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (102, 'foofoo');
+INSERT INTO prem2 VALUES (202, 'barbar');
+RELEASE SAVEPOINT s;
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+(2 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+(2 rows)
+
+-- This tests executing DEALLOCATE ALL against foreign servers in parallel
+-- during pre-commit
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (103, 'baz');
+INSERT INTO prem2 VALUES (203, 'qux');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (104, 'bazbaz');
+INSERT INTO prem2 VALUES (204, 'quxqux');
+COMMIT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index af38e956e7..76d0b6dd0f 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -121,6 +121,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "updatable") == 0 ||
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
+ strcmp(def->defname, "parallel_commit") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -249,6 +250,7 @@ InitPgFdwOptions(void)
/* async_capable is available on both server and table */
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
+ {"parallel_commit", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 6c9f579c41..95b6b7192e 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3515,3 +3515,49 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
--Clean up
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
+
+-- ===================================================================
+-- test parallel commit
+-- ===================================================================
+ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+
+CREATE TABLE ploc1 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
+ SERVER loopback OPTIONS (table_name 'ploc1');
+CREATE TABLE ploc2 (f1 int, f2 text);
+CREATE FOREIGN TABLE prem2 (f1 int, f2 text)
+ SERVER loopback2 OPTIONS (table_name 'ploc2');
+
+BEGIN;
+INSERT INTO prem1 VALUES (101, 'foo');
+INSERT INTO prem2 VALUES (201, 'bar');
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (102, 'foofoo');
+INSERT INTO prem2 VALUES (202, 'barbar');
+RELEASE SAVEPOINT s;
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+-- This tests executing DEALLOCATE ALL against foreign servers in parallel
+-- during pre-commit
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (103, 'baz');
+INSERT INTO prem2 VALUES (203, 'qux');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (104, 'bazbaz');
+INSERT INTO prem2 VALUES (204, 'quxqux');
+COMMIT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index dc57fe4b0d..8ebf0dc3a0 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -456,6 +456,52 @@ OPTIONS (ADD password_required 'false');
</variablelist>
</sect3>
+ <sect3>
+ <title>Transaction Management Options</title>
+
+ <para>
+ When multiple remote (sub)transactions are involved in a local
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits
+ those remote (sub)transactions one by one when the local (sub)transaction
+ commits.
+ Performance can be improved with the following option:
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>parallel_commit</literal> (<type>boolean</type>)</term>
+ <listitem>
+ <para>
+ This option controls whether <filename>postgres_fdw</filename> commits
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction commits.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
+ </para>
+
+ <para>
+ If multiple foreign servers with this option enabled are involved in
+ a local (sub)transaction, multiple remote (sub)transactions opened on
+ those foreign servers in the local (sub)transaction are committed in
+ parallel across those foreign servers when the local (sub)transaction
+ commits.
+ </para>
+
+ <para>
+ For a foreign server with this option enabled, if many remote
+ (sub)transactions are opened on the foreign server in a local
+ (sub)transaction, this option might increase the remote server's load
+ when the local (sub)transaction commits, so be careful when using this
+ option.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </sect3>
+
<sect3>
<title>Updatability Options</title>
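The do_sql_command_begin/do_sql_command_end split in the patch follows libpq's standard asynchronous pattern: PQsendQuery dispatches a command without waiting, PQconsumeInput drains whatever has already arrived on the socket, and PQgetResult blocks for the rest. A minimal standalone sketch of that pattern outside postgres_fdw (the connection string is hypothetical, error handling is trimmed, and a live server is required to run it):

```c
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
	/* Hypothetical connection string; adjust for your environment. */
	PGconn	   *conn = PQconnectdb("dbname=postgres");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		PQfinish(conn);
		return EXIT_FAILURE;
	}

	/* "begin" phase: dispatch the command and return immediately. */
	if (!PQsendQuery(conn, "COMMIT TRANSACTION"))
		fprintf(stderr, "send failed: %s", PQerrorMessage(conn));

	/* ... commands for other connections could be dispatched here ... */

	/* "end" phase: consume any data already on the socket, then wait. */
	if (!PQconsumeInput(conn))
		fprintf(stderr, "consume failed: %s", PQerrorMessage(conn));

	while ((res = PQgetResult(conn)) != NULL)
	{
		if (PQresultStatus(res) != PGRES_COMMAND_OK)
			fprintf(stderr, "command failed: %s",
					PQresultErrorMessage(res));
		PQclear(res);
	}

	PQfinish(conn);
	return EXIT_SUCCESS;
}
```

Repeating the "begin" phase over all connections before running any "end" phase is exactly how pgfdw_xact_callback overlaps the COMMIT round trips.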
On Wed, Feb 23, 2022 at 3:30 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Attached is an updated patch.
Barring objections, I’ll commit the patch.
I have committed the patch. I think the 0003 patch needs rebase.
I'll update the patch.
Best regards,
Etsuro Fujita
On Thu, Feb 24, 2022 at 2:49 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I think the 0003 patch needs rebase.
I'll update the patch.
Here is an updated version. I added to the 0003 patch a macro for
defining the milliseconds to wait, as proposed by David upthread.
Best regards,
Etsuro Fujita
Attachments:
v5-0002-postgres_fdw-Minor-cleanup-for-pgfdw_abort_cleanup.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 8c64d42dda..fdf4cb90be 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -80,6 +80,18 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/* macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -110,8 +122,7 @@ static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
-static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql,
- bool toplevel);
+static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
@@ -1015,8 +1026,8 @@ pgfdw_xact_callback(XactEvent event, void *arg)
break;
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
-
- pgfdw_abort_cleanup(entry, "ABORT TRANSACTION", true);
+ /* Rollback all remote transactions during abort */
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1109,10 +1120,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- curlevel, curlevel);
- pgfdw_abort_cleanup(entry, sql, false);
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1465,10 +1473,7 @@ exit: ;
}
/*
- * Abort remote transaction.
- *
- * The statement specified in "sql" is sent to the remote server,
- * in order to rollback the remote transaction.
+ * Abort remote transaction or subtransaction.
*
* "toplevel" should be set to true if the toplevel (main) transaction is
* rolled back, false otherwise.
@@ -1476,8 +1481,10 @@ exit: ;
* Set entry->changing_xact_state to false on success, true on failure.
*/
static void
-pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
+pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
{
+ char sql[100];
+
/*
* Don't try to clean up the connection if we're already in error
* recursion trouble.
@@ -1509,8 +1516,9 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
- return; /* Unable to abort remote transaction */
+ return; /* Unable to abort remote (sub)transaction */
if (toplevel)
{
v5-0003-postgres-fdw-Add-support-for-parallel-abort.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index fdf4cb90be..89802cf5d0 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -80,6 +81,9 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/* Milliseconds to wait to cancel a query or execute a cleanup query */
+#define CONNECTION_CLEANUP_TIMEOUT 30000
+
/* macro for constructing abort command to be sent */
#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
do { \
@@ -118,14 +122,26 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ bool ignore_errors,
+ TimestampTz endtime);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -336,11 +352,12 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*
* By default, all the connections to any foreign servers are kept open.
*
- * Also determine whether to commit (sub)transactions opened on the remote
- * server in parallel at (sub)transaction end.
+ * Also determine whether to commit/abort (sub)transactions opened on the
+ * remote server in parallel at (sub)transaction end.
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -349,6 +366,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
else if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -934,6 +953,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1027,7 +1047,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
/* Rollback all remote transactions during abort */
- pgfdw_abort_cleanup(entry, true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, true,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1037,11 +1065,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1066,6 +1104,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1120,7 +1159,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- pgfdw_abort_cleanup(entry, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, false,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1128,10 +1175,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1275,17 +1331,25 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
* the connection is dead.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
@@ -1305,6 +1369,15 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1339,9 +1412,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1349,8 +1420,17 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
* place (e.g. statement timeout, user cancel), so the timeout shouldn't
* be too long.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, ignore_errors, endtime);
+}
+
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1361,6 +1441,16 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ bool ignore_errors, TimestampTz endtime)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1545,6 +1635,56 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
entry->changing_xact_state = false;
}
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ char sql[100];
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
/*
* Finish pre-commit cleanup of connections on each of which we've sent a
* COMMIT command to the remote server.
@@ -1651,6 +1791,168 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries, int curlevel)
}
}
+/*
+ * Finish (sub)abort cleanup of connections on each of which we've sent a
+ * (sub)abort command or cancel request to the remote server.
+ */
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ /*
+ * If cancel requests have been issued, get and discard the results of the
+ * queries first.
+ */
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do this before issuing
+ * cancel request like in normal mode, but that is problematic,
+ * because if, for example, it took longer than 30 seconds to
+ * process the first few entries in the cancel_requested list, it
+ * would cause a timeout error when processing each of the
+ * remaining entries in the list, leading to slamming that entry's
+ * connection shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_end(entry->conn, endtime))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /* Send a (sub)abort command in parallel if needed */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_entries)
+ return;
+
+ /*
+ * Get the result of the (sub)abort command for each of the pending
+ * entries
+ */
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, false, endtime))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ if (toplevel)
+ {
+ /* Do a DEALLOCATE ALL in parallel if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ }
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ /*
+ * Get the result of the DEALLOCATE command for each of the pending
+ * entries
+ */
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ true, endtime))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f210f91188..ed5bf9208f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, parallel_abort, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10934,10 +10934,12 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -11007,5 +11009,52 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 572591a558..59f865fac3 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -122,6 +122,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -251,6 +252,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 95b6b7192e..bd26739d9a 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3517,10 +3517,12 @@ RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3559,5 +3561,26 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 8ebf0dc3a0..e5ca003e9b 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -461,10 +461,10 @@ OPTIONS (ADD password_required 'false');
<para>
When multiple remote (sub)transactions are involved in a local
- (sub)transaction, by default <filename>postgres_fdw</filename> commits
- those remote (sub)transactions one by one when the local (sub)transaction
- commits.
- Performance can be improved with the following option:
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote (sub)transactions one by one when the local
+ (sub)transaction commits or aborts.
+ Performance can be improved with the following options:
</para>
<variablelist>
@@ -479,27 +479,40 @@ OPTIONS (ADD password_required 'false');
This option can only be specified for foreign servers, not per-table.
The default is <literal>false</literal>.
</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
<para>
- If multiple foreign servers with this option enabled are involved in
- a local (sub)transaction, multiple remote (sub)transactions opened on
- those foreign servers in the local (sub)transaction are committed in
- parallel across those foreign servers when the local (sub)transaction
- commits.
- </para>
-
- <para>
- For a foreign server with this option enabled, if many remote
- (sub)transactions are opened on the foreign server in a local
- (sub)transaction, this option might increase the remote server’s load
- when the local (sub)transaction commits, so be careful when using this
- option.
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction aborts.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
+ <para>
+ If multiple foreign servers with these options enabled are involved in a
+ local (sub)transaction, multiple remote (sub)transactions opened on those
+ foreign servers in the local (sub)transaction are committed or aborted in
+ parallel across those foreign servers when the local (sub)transaction
+ commits or aborts.
+ </para>
+
+ <para>
+ For a foreign server with these options enabled, if many remote
+ (sub)transactions are opened on the foreign server in a local
+ (sub)transaction, these options might increase the remote server’s load
+ when the local (sub)transaction commits or aborts, so be careful when
+ using these options.
+ </para>
+
</sect3>
<sect3>
On Mon, Feb 28, 2022 at 6:53 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> Here is an updated version. I added to the 0003 patch a macro for
> defining the milliseconds to wait, as proposed by David upthread.
I modified the 0003 patch further: 1) I added to
pgfdw_cancel_query_end/pgfdw_exec_cleanup_query_end the PQconsumeInput
optimization that we have in do_sql_command_end, and 2) I
added/tweaked comments a bit further. Attached is an updated version.
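As an aside, the begin/end split the patch uses can be illustrated with a small Python toy model (the names here are invented for illustration; the real code is C using libpq): first submit the command on every connection without waiting, then collect the results in the same order, so the per-server waits overlap instead of serializing.

```python
# Toy model of the parallel begin/end split (hypothetical names, not the
# patch's API): "begin" only submits the command; "end" waits for its
# result, so all submissions happen before any wait.
log = []

def cleanup_query_begin(conn, sql):
    log.append(("send", conn, sql))  # submit without waiting
    return True

def cleanup_query_end(conn, sql):
    log.append(("wait", conn, sql))  # now collect the result
    return "ok"

conns = ["ft1_server", "ft2_server"]
pending = [c for c in conns if cleanup_query_begin(c, "ABORT TRANSACTION")]
results = [cleanup_query_end(c, "ABORT TRANSACTION") for c in pending]

# Every send precedes every wait, unlike the serial one-by-one approach.
assert [op for op, _, _ in log] == ["send", "send", "wait", "wait"]
```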
Like [1], I ran a simple performance test using the following transaction:
BEGIN;
SAVEPOINT s;
INSERT INTO ft1 VALUES (10, 10);
INSERT INTO ft2 VALUES (20, 20);
ROLLBACK TO SAVEPOINT s;
RELEASE SAVEPOINT s;
INSERT INTO ft1 VALUES (10, 10);
INSERT INTO ft2 VALUES (20, 20);
ABORT;
where ft1 is a foreign table created on a foreign server hosted on the
same machine as the local server, and ft2 is a foreign table created
on a foreign server hosted on a different machine. (In this test I
used two machines, while in [1] I used three machines: one for the
local server and the others for ft1 and ft2.) The average latencies
for the ROLLBACK TO SAVEPOINT and ABORT commands over ten runs of the
above transaction with the parallel_abort option disabled/enabled are:
* ROLLBACK TO SAVEPOINT
parallel_abort=0: 0.3217 ms
parallel_abort=1: 0.2396 ms
* ABORT
parallel_abort=0: 0.4749 ms
parallel_abort=1: 0.3733 ms
This option reduces the latency for ROLLBACK TO SAVEPOINT by 25.5
percent, and the latency for ABORT by 21.4 percent. From the results,
I think the patch is useful.
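The reported reductions follow directly from the measured averages:

```python
# Recompute the reported reductions from the average latencies above (ms).
rollback_off, rollback_on = 0.3217, 0.2396
abort_off, abort_on = 0.4749, 0.3733

rollback_pct = (rollback_off - rollback_on) / rollback_off * 100
abort_pct = (abort_off - abort_on) / abort_off * 100

assert round(rollback_pct, 1) == 25.5  # ROLLBACK TO SAVEPOINT
assert round(abort_pct, 1) == 21.4     # ABORT
```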
Best regards,
Etsuro Fujita
[1]: /messages/by-id/CAPmGK17dAZCXvwnfpr1eTfknTGdt=hYTV9405Gt5SqPOX8K84w@mail.gmail.com
Attachments:
v6-0002-postgres_fdw-Minor-cleanup-for-pgfdw_abort_cleanup.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 8c64d42dda..fdf4cb90be 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -80,6 +80,18 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/* macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -110,8 +122,7 @@ static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
-static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql,
- bool toplevel);
+static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
@@ -1015,8 +1026,8 @@ pgfdw_xact_callback(XactEvent event, void *arg)
break;
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
-
- pgfdw_abort_cleanup(entry, "ABORT TRANSACTION", true);
+ /* Rollback all remote transactions during abort */
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1109,10 +1120,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- curlevel, curlevel);
- pgfdw_abort_cleanup(entry, sql, false);
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1465,10 +1473,7 @@ exit: ;
}
/*
- * Abort remote transaction.
- *
- * The statement specified in "sql" is sent to the remote server,
- * in order to rollback the remote transaction.
+ * Abort remote transaction or subtransaction.
*
* "toplevel" should be set to true if toplevel (main) transaction is
* rollbacked, false otherwise.
@@ -1476,8 +1481,10 @@ exit: ;
* Set entry->changing_xact_state to false on success, true on failure.
*/
static void
-pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
+pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
{
+ char sql[100];
+
/*
* Don't try to clean up the connection if we're already in error
* recursion trouble.
@@ -1509,8 +1516,9 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
- return; /* Unable to abort remote transaction */
+ return; /* Unable to abort remote (sub)transaction */
if (toplevel)
{
v6-0003-postgres-fdw-Add-support-for-parallel-abort.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index fdf4cb90be..7f209fd5b3 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -80,6 +81,9 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/* Milliseconds to wait to cancel a query or execute a cleanup query */
+#define CONNECTION_CLEANUP_TIMEOUT 30000
+
/* macro for constructing abort command to be sent */
#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
do { \
@@ -118,14 +122,28 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
+ bool consume_input);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime,
+ bool consume_input,
+ bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -336,11 +354,12 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*
* By default, all the connections to any foreign servers are kept open.
*
- * Also determine whether to commit (sub)transactions opened on the remote
- * server in parallel at (sub)transaction end.
+ * Also determine whether to commit/abort (sub)transactions opened on the
+ * remote server in parallel at (sub)transaction end.
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -349,6 +368,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
else if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -934,6 +955,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1027,7 +1049,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
/* Rollback all remote transactions during abort */
- pgfdw_abort_cleanup(entry, true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, true,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1037,11 +1067,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1066,6 +1106,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1120,7 +1161,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- pgfdw_abort_cleanup(entry, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, false,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1128,10 +1177,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1275,17 +1333,25 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
* the connection is dead.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime, false);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
@@ -1305,6 +1371,31 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime, bool consume_input)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("could not get result of cancel request: %s",
+ pchomp(PQerrorMessage(conn)))));
+ return false;
+ }
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1339,9 +1430,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1349,8 +1438,18 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
* place (e.g. statement timeout, user cancel), so the timeout shouldn't
* be too long.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, endtime,
+ false, ignore_errors);
+}
+
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1361,6 +1460,30 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime, bool consume_input,
+ bool ignore_errors)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ pgfdw_report_error(WARNING, NULL, conn, false, query);
+ return false;
+ }
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1545,6 +1668,65 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
entry->changing_xact_state = false;
}
+/*
+ * Like pgfdw_abort_cleanup, submit an abort command or cancel request, but
+ * don't wait for the result.
+ *
+ * Returns true if the abort command or cancel request is successfully issued,
+ * false otherwise. If the abort command is successfully issued, the given
+ * connection cache entry is appended to *pending_entries. Otherwise, if the
+ * cancel request is successfully issued, it's appended to *cancel_requested.
+ */
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ char sql[100];
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
/*
* Finish pre-commit cleanup of connections on each of which we've sent a
* COMMIT command to the remote server.
@@ -1651,6 +1833,168 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries, int curlevel)
}
}
+/*
+ * Finish abort cleanup of connections on each of which we've sent an abort
+ * command or cancel request to the remote server.
+ */
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ /*
+ * For each of the pending cancel requests (if any), get and discard the
+ * result of the query, and submit an abort command to the remote server.
+ */
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do this before issuing
+ * cancel request like in normal mode, but that is problematic,
+ * because if, for example, it took longer than 30 seconds to
+ * process the first few entries in the cancel_requested list, it
+ * would cause a timeout error when processing each of the
+ * remaining entries in the list, leading to slamming that entry's
+ * connection shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_end(entry->conn, endtime, true))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /* Send an abort command in parallel if needed */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_entries)
+ return;
+
+ /*
+ * Get the result of the abort command for each of the pending entries
+ */
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, endtime,
+ true, false))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ if (toplevel)
+ {
+ /* Do a DEALLOCATE ALL in parallel if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ }
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ /*
+ * Get the result of the DEALLOCATE command for each of the pending
+ * entries
+ */
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ endtime, true, true))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f210f91188..ed5bf9208f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, parallel_abort, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10934,10 +10934,12 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -11007,5 +11009,52 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 572591a558..59f865fac3 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -122,6 +122,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -251,6 +252,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 95b6b7192e..bd26739d9a 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3517,10 +3517,12 @@ RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3559,5 +3561,26 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 8ebf0dc3a0..e5ca003e9b 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -461,10 +461,10 @@ OPTIONS (ADD password_required 'false');
<para>
When multiple remote (sub)transactions are involved in a local
- (sub)transaction, by default <filename>postgres_fdw</filename> commits
- those remote (sub)transactions one by one when the local (sub)transaction
- commits.
- Performance can be improved with the following option:
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote (sub)transactions one by one when the local
+ (sub)transaction commits or aborts.
+ Performance can be improved with the following options:
</para>
<variablelist>
@@ -479,27 +479,40 @@ OPTIONS (ADD password_required 'false');
This option can only be specified for foreign servers, not per-table.
The default is <literal>false</literal>.
</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
<para>
- If multiple foreign servers with this option enabled are involved in
- a local (sub)transaction, multiple remote (sub)transactions opened on
- those foreign servers in the local (sub)transaction are committed in
- parallel across those foreign servers when the local (sub)transaction
- commits.
- </para>
-
- <para>
- For a foreign server with this option enabled, if many remote
- (sub)transactions are opened on the foreign server in a local
- (sub)transaction, this option might increase the remote server’s load
- when the local (sub)transaction commits, so be careful when using this
- option.
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction aborts.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
+ <para>
+ If multiple foreign servers with these options enabled are involved in a
+ local (sub)transaction, multiple remote (sub)transactions opened on those
+ foreign servers in the local (sub)transaction are committed or aborted in
+ parallel across those foreign servers when the local (sub)transaction
+ commits or aborts.
+ </para>
+
+ <para>
+ For a foreign server with these options enabled, if many remote
+ (sub)transactions are opened on the foreign server in a local
+ (sub)transaction, these options might increase the remote server’s load
+ when the local (sub)transaction commits or aborts, so be careful when
+ using these options.
+ </para>
+
</sect3>
<sect3>
Applied patches v6-0002 and v6-0003 to the master branch, and the `make
check` tests pass.
Here are my test results, averaged over 10 runs on 3 virtual machines:
before the patches:
abort.1 = 2.5473 ms
abort.2 = 4.1572 ms
after the patches with OPTIONS (ADD parallel_abort 'true'):
abort.1 = 1.7136 ms
abort.2 = 2.5833 ms
Overall, this is about a 32-37% improvement in my testing environment.
On 2022-03-05 2:32 a.m., Etsuro Fujita wrote:
On Mon, Feb 28, 2022 at 6:53 PM Etsuro Fujita<etsuro.fujita@gmail.com> wrote:
Here is an updated version. I added to the 0003 patch a macro for
defining the milliseconds to wait, as proposed by David upthread.
I modified the 0003 patch further: 1) I added to
pgfdw_cancel_query_end/pgfdw_exec_cleanup_query_end the PQconsumeInput
optimization that we have in do_sql_command_end, and 2) I
added/tweaked comments a bit further. Attached is an updated version.
Like [1], I ran a simple performance test using the following transaction:
BEGIN;
SAVEPOINT s;
INSERT INTO ft1 VALUES (10, 10);
INSERT INTO ft2 VALUES (20, 20);
ROLLBACK TO SAVEPOINT s;
RELEASE SAVEPOINT s;
INSERT INTO ft1 VALUES (10, 10);
INSERT INTO ft2 VALUES (20, 20);
ABORT;
where ft1 is a foreign table created on a foreign server hosted on the
same machine as the local server, and ft2 is a foreign table created
on a foreign server hosted on a different machine. (In this test I
used two machines, while in [1] I used three machines: one for the
local server and the others for ft1 and ft2.) The average latencies
for the ROLLBACK TO SAVEPOINT and ABORT commands over ten runs of the
above transaction with the parallel_abort option disabled/enabled are:
* ROLLBACK TO SAVEPOINT
parallel_abort=0: 0.3217 ms
parallel_abort=1: 0.2396 ms
* ABORT
parallel_abort=0: 0.4749 ms
parallel_abort=1: 0.3733 ms
This option reduces the latency for ROLLBACK TO SAVEPOINT by 25.5
percent, and the latency for ABORT by 21.4 percent. From the results,
I think the patch is useful.
Best regards,
Etsuro Fujita
[1] /messages/by-id/CAPmGK17dAZCXvwnfpr1eTfknTGdt=hYTV9405Gt5SqPOX8K84w@mail.gmail.com
--
David
Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
On Sat, Mar 12, 2022 at 10:02 AM David Zhang <david.zhang@highgo.ca> wrote:
Applied patches v6-0002 and v6-0003 to the master branch, and the `make check` tests pass.
Here are my test results, averaged over 10 runs on 3 virtual machines:
before the patches:
abort.1 = 2.5473 ms
abort.2 = 4.1572 ms
after the patches with OPTIONS (ADD parallel_abort 'true'):
abort.1 = 1.7136 ms
abort.2 = 2.5833 ms
Overall, this is about a 32-37% improvement in my testing environment.
I think that is a great improvement. Thanks for testing!
Best regards,
Etsuro Fujita
On Sat, Mar 5, 2022 at 7:32 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Attached is an updated version.
In the 0002 patch I introduced a macro for building an abort command
in preparation for the parallel abort patch (0003), but I moved it to
0003. Attached is a new patch set. The new version of 0002 is just a
cleanup patch (see the commit message in 0002), and I think it's
committable, so I'm planning to commit it if there are no objections.
Best regards,
Etsuro Fujita
Attachments:
v7-0002-postgres_fdw-Minor-cleanup-for-pgfdw_abort_cleanup.patchapplication/octet-stream; name=v7-0002-postgres_fdw-Minor-cleanup-for-pgfdw_abort_cleanup.patchDownload
From 54a26d1e652f85a3fee23aa4d7b2849ceb0487f1 Mon Sep 17 00:00:00 2001
From: Etsuro Fujita <etsuro.fujita@gmail.com>
Date: Thu, 24 Mar 2022 13:16:09 +0900
Subject: [PATCH] postgres_fdw: Minor cleanup for pgfdw_abort_cleanup().
Commit 85c696112 introduced this function to deduplicate code in the
transaction callback functions, but the SQL command passed as an
argument to it was useless when it returned before aborting a remote
transaction using the command. Modify pgfdw_abort_cleanup() so that it
constructs the command when/if necessary, as before, removing the
argument from it. Also update comments in both pgfdw_abort_cleanup()
and the calling function.
Etsuro Fujita, reviewed by David Zhang.
Discussion: https://postgr.es/m/CAPmGK158hrd%3DZfXmgkmNFHivgh18e4oE2Gz151C2Q4OBDjZ08A%40mail.gmail.com
---
contrib/postgres_fdw/connection.c | 29 +++++++++++++++--------------
1 file changed, 15 insertions(+), 14 deletions(-)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 74d3e73205..129ca79221 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -110,8 +110,7 @@ static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
-static void pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql,
- bool toplevel);
+static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
@@ -1015,8 +1014,8 @@ pgfdw_xact_callback(XactEvent event, void *arg)
break;
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
-
- pgfdw_abort_cleanup(entry, "ABORT TRANSACTION", true);
+ /* Rollback all remote transactions during abort */
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1109,10 +1108,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- curlevel, curlevel);
- pgfdw_abort_cleanup(entry, sql, false);
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1465,10 +1461,7 @@ exit: ;
}
/*
- * Abort remote transaction.
- *
- * The statement specified in "sql" is sent to the remote server,
- * in order to rollback the remote transaction.
+ * Abort remote transaction or subtransaction.
*
* "toplevel" should be set to true if toplevel (main) transaction is
* rollbacked, false otherwise.
@@ -1476,8 +1469,10 @@ exit: ;
* Set entry->changing_xact_state to false on success, true on failure.
*/
static void
-pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
+pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
{
+ char sql[100];
+
/*
* Don't try to clean up the connection if we're already in error
* recursion trouble.
@@ -1509,8 +1504,14 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, const char *sql, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
+ if (toplevel)
+ snprintf(sql, sizeof(sql), "ABORT TRANSACTION");
+ else
+ snprintf(sql, sizeof(sql),
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
+ entry->xact_depth, entry->xact_depth);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
- return; /* Unable to abort remote transaction */
+ return; /* Unable to abort remote (sub)transaction */
if (toplevel)
{
--
2.14.3 (Apple Git-98)
v7-0003-postgres-fdw-Add-support-for-parallel-abort.patchapplication/octet-stream; name=v7-0003-postgres-fdw-Add-support-for-parallel-abort.patchDownload
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 129ca79221..a7672012f1 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -80,6 +81,21 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/* Milliseconds to wait to cancel a query or execute a cleanup query */
+#define CONNECTION_CLEANUP_TIMEOUT 30000
+
+/* Macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -106,14 +122,28 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
+ bool consume_input);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime,
+ bool consume_input,
+ bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -324,11 +354,12 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*
* By default, all the connections to any foreign servers are kept open.
*
- * Also determine whether to commit (sub)transactions opened on the remote
- * server in parallel at (sub)transaction end.
+ * Also determine whether to commit/abort (sub)transactions opened on the
+ * remote server in parallel at (sub)transaction end.
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -337,6 +368,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
else if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -922,6 +955,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1015,7 +1049,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
/* Rollback all remote transactions during abort */
- pgfdw_abort_cleanup(entry, true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, true,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1025,11 +1067,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1054,6 +1106,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1108,7 +1161,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- pgfdw_abort_cleanup(entry, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, false,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1116,10 +1177,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1263,17 +1333,25 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
* the connection is dead.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime, false);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
@@ -1293,6 +1371,31 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime, bool consume_input)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("could not get result of cancel request: %s",
+ pchomp(PQerrorMessage(conn)))));
+ return false;
+ }
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1327,9 +1430,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1337,8 +1438,18 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
* place (e.g. statement timeout, user cancel), so the timeout shouldn't
* be too long.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, endtime,
+ false, ignore_errors);
+}
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1349,6 +1460,30 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime, bool consume_input,
+ bool ignore_errors)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ pgfdw_report_error(WARNING, NULL, conn, false, query);
+ return false;
+ }
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1504,12 +1639,7 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
- if (toplevel)
- snprintf(sql, sizeof(sql), "ABORT TRANSACTION");
- else
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- entry->xact_depth, entry->xact_depth);
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
return; /* Unable to abort remote (sub)transaction */
@@ -1538,6 +1668,65 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
entry->changing_xact_state = false;
}
+/*
+ * Like pgfdw_abort_cleanup, submit an abort command or cancel request, but
+ * don't wait for the result.
+ *
+ * Returns true if the abort command or cancel request is successfully issued,
+ * false otherwise. If the abort command is successfully issued, the given
+ * connection cache entry is appended to *pending_entries. Otherwise, if the
+ * cancel request is successfully issued, it's appended to *cancel_requested.
+ */
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ char sql[100];
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
/*
* Finish pre-commit cleanup of connections on each of which we've sent a
* COMMIT command to the remote server.
@@ -1644,6 +1833,168 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries, int curlevel)
}
}
+/*
+ * Finish abort cleanup of connections on each of which we've sent an abort
+ * command or cancel request to the remote server.
+ */
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ /*
+ * For each of the pending cancel requests (if any), get and discard the
+ * result of the query, and submit an abort command to the remote server.
+ */
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do this before issuing
+ * the cancel request, as in normal mode, but that is problematic,
+ * because if, for example, it took longer than 30 seconds to
+ * process the first few entries in the cancel_requested list, it
+ * would cause a timeout error when processing each of the
+ * remaining entries in the list, leading to slamming that entry's
+ * connection shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_end(entry->conn, endtime, true))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /* Send an abort command in parallel if needed */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_entries)
+ return;
+
+ /*
+ * Get the result of the abort command for each of the pending entries
+ */
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, endtime,
+ true, false))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ if (toplevel)
+ {
+ /* Do a DEALLOCATE ALL in parallel if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ }
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ /*
+ * Get the result of the DEALLOCATE command for each of the pending
+ * entries
+ */
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ endtime, true, true))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f210f91188..ed5bf9208f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, parallel_abort, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10934,10 +10934,12 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -11007,5 +11009,52 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 572591a558..59f865fac3 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -122,6 +122,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -251,6 +252,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 95b6b7192e..bd26739d9a 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3517,10 +3517,12 @@ RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3559,5 +3561,26 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index d8dc715587..b86b3e9cc3 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -461,10 +461,10 @@ OPTIONS (ADD password_required 'false');
<para>
When multiple remote (sub)transactions are involved in a local
- (sub)transaction, by default <filename>postgres_fdw</filename> commits
- those remote (sub)transactions one by one when the local (sub)transaction
- commits.
- Performance can be improved with the following option:
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote (sub)transactions one by one when the local
+ (sub)transaction commits or aborts.
+ Performance can be improved with the following options:
</para>
<variablelist>
@@ -479,27 +479,40 @@ OPTIONS (ADD password_required 'false');
This option can only be specified for foreign servers, not per-table.
The default is <literal>false</literal>.
</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
<para>
- If multiple foreign servers with this option enabled are involved in
- a local (sub)transaction, multiple remote (sub)transactions opened on
- those foreign servers in the local (sub)transaction are committed in
- parallel across those foreign servers when the local (sub)transaction
- commits.
- </para>
-
- <para>
- For a foreign server with this option enabled, if many remote
- (sub)transactions are opened on the foreign server in a local
- (sub)transaction, this option might increase the remote server’s load
- when the local (sub)transaction commits, so be careful when using this
- option.
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction aborts.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
+ <para>
+ If multiple foreign servers with these options enabled are involved in a
+ local (sub)transaction, multiple remote (sub)transactions opened on those
+ foreign servers in the local (sub)transaction are committed or aborted in
+ parallel across those foreign servers when the local (sub)transaction
+ commits or aborts.
+ </para>
+
+ <para>
+ For a foreign server with these options enabled, if many remote
+ (sub)transactions are opened on the foreign server in a local
+ (sub)transaction, these options might increase the remote server’s load
+ when the local (sub)transaction commits or aborts, so be careful when
+ using these options.
+ </para>
+
</sect3>
<sect3>
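Before the follow-up message below, a standalone illustration of the core pattern both patch versions implement: submit the command to every remote server first, then collect the results in submission order. This is a toy model only; `MockConn`, `send_command_nowait`, `wait_for_result`, and `run_two_phase` are invented names for this sketch, not the actual postgres_fdw API, which operates on `ConnCacheEntry` lists in contrib/postgres_fdw/connection.c.

```c
#include <assert.h>

/* Toy model of the two-phase scheme behind parallel_commit/parallel_abort:
 * phase 1 submits the command to each remote server without waiting for a
 * reply; phase 2 waits for the results in the order the commands were
 * sent.  All names here are invented for illustration. */

typedef struct MockConn
{
	const char *name;
	int			sent;		/* has a command been submitted? */
	int			collected;	/* has its result been collected? */
} MockConn;

/* Stand-in for pgfdw_exec_cleanup_query_begin: submit, return at once. */
static int
send_command_nowait(MockConn *conn, const char *sql)
{
	(void) sql;
	conn->sent = 1;
	return 1;				/* pretend submission always succeeds */
}

/* Stand-in for pgfdw_exec_cleanup_query_end: wait for one result. */
static int
wait_for_result(MockConn *conn)
{
	if (!conn->sent)
		return 0;
	conn->collected = 1;
	return 1;
}

/* Run both phases; returns the number of results collected. */
static int
run_two_phase(MockConn *conns, int n, const char *sql)
{
	MockConn   *pending_entries[16];
	int			pending = 0;
	int			done = 0;

	/* Phase 1: submit to every server, remembering who is pending. */
	for (int i = 0; i < n; i++)
		if (send_command_nowait(&conns[i], sql))
			pending_entries[pending++] = &conns[i];

	/* Phase 2: collect results in submission order. */
	for (int i = 0; i < pending; i++)
		done += wait_for_result(pending_entries[i]);

	return done;
}
```

The real code additionally handles submission failures (resetting the entry's transaction state) and, in the abort path, keeps a separate `cancel_requested` list for connections whose in-flight query first had to be cancelled.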
On Thu, Mar 24, 2022 at 1:34 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Attached is a new patch set. The new version of 0002 is just a
cleanup patch (see the commit message in 0002), and I think it's
committable, so I'm planning to commit it, if no objections.
Done.
Attached is the 0003 patch, which is the same as the one I sent yesterday.
Best regards,
Etsuro Fujita
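An aside for readers of the attached patch: the `CONSTRUCT_ABORT_COMMAND` macro it introduces can be exercised standalone to see the two commands it produces. The stripped-down `ConnCacheEntry` below models only `xact_depth`; everything else is omitted for the sketch.

```c
#include <stdio.h>
#include <string.h>
#include <assert.h>

/* Minimal stand-in for the real ConnCacheEntry: only the field the
 * macro reads is modeled here. */
typedef struct
{
	int			xact_depth;
} ConnCacheEntry;

/* Copied from the patch: build the command that rolls back either the
 * whole remote transaction or just the current remote subtransaction. */
#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
	do { \
		if (toplevel) \
			snprintf((sql), sizeof(sql), \
					 "ABORT TRANSACTION"); \
		else \
			snprintf((sql), sizeof(sql), \
					 "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
					 (entry)->xact_depth, (entry)->xact_depth); \
	} while (0)
```

For a subtransaction at depth 2, the macro yields `ROLLBACK TO SAVEPOINT s2; RELEASE SAVEPOINT s2`; at top level it yields `ABORT TRANSACTION`.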
Attachments:
v7-0003-postgres-fdw-Add-support-for-parallel-abort.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 129ca79221..a7672012f1 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -80,6 +81,21 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/* Milliseconds to wait to cancel a query or execute a cleanup query */
+#define CONNECTION_CLEANUP_TIMEOUT 30000
+
+/* Macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -106,14 +122,28 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
+ bool consume_input);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime,
+ bool consume_input,
+ bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -324,11 +354,12 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*
* By default, all the connections to any foreign servers are kept open.
*
- * Also determine whether to commit (sub)transactions opened on the remote
- * server in parallel at (sub)transaction end.
+ * Also determine whether to commit/abort (sub)transactions opened on the
+ * remote server in parallel at (sub)transaction end.
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -337,6 +368,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
else if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -922,6 +955,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1015,7 +1049,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
/* Rollback all remote transactions during abort */
- pgfdw_abort_cleanup(entry, true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, true,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1025,11 +1067,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1054,6 +1106,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1108,7 +1161,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- pgfdw_abort_cleanup(entry, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, false,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1116,10 +1177,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1263,17 +1333,25 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
* the connection is dead.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime, false);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
@@ -1293,6 +1371,31 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime, bool consume_input)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("could not get result of cancel request: %s",
+ pchomp(PQerrorMessage(conn)))));
+ return false;
+ }
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1327,9 +1430,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1337,8 +1438,18 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
* place (e.g. statement timeout, user cancel), so the timeout shouldn't
* be too long.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, endtime,
+ false, ignore_errors);
+}
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1349,6 +1460,30 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime, bool consume_input,
+ bool ignore_errors)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ pgfdw_report_error(WARNING, NULL, conn, false, query);
+ return false;
+ }
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1504,12 +1639,7 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
- if (toplevel)
- snprintf(sql, sizeof(sql), "ABORT TRANSACTION");
- else
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- entry->xact_depth, entry->xact_depth);
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
return; /* Unable to abort remote (sub)transaction */
@@ -1538,6 +1668,65 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
entry->changing_xact_state = false;
}
+/*
+ * Like pgfdw_abort_cleanup, submit an abort command or cancel request, but
+ * don't wait for the result.
+ *
+ * Returns true if the abort command or cancel request is successfully issued,
+ * false otherwise. If the abort command is successfully issued, the given
+ * connection cache entry is appended to *pending_entries. Otherwise, if the
+ * cancel request is successfully issued, it's appended to *cancel_requested.
+ */
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ char sql[100];
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
/*
* Finish pre-commit cleanup of connections on each of which we've sent a
* COMMIT command to the remote server.
@@ -1644,6 +1833,168 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries, int curlevel)
}
}
+/*
+ * Finish abort cleanup of connections on each of which we've sent an abort
+ * command or cancel request to the remote server.
+ */
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ /*
+ * For each of the pending cancel requests (if any), get and discard the
+ * result of the query, and submit an abort command to the remote server.
+ */
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do this before issuing
+ * cancel request like in normal mode, but that is problematic,
+ * because if, for example, it took longer than 30 seconds to
+ * process the first few entries in the cancel_requested list, it
+ * would cause a timeout error when processing each of the
+ * remaining entries in the list, leading to slamming that entry's
+ * connection shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_end(entry->conn, endtime, true))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /* Send an abort command in parallel if needed */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_entries)
+ return;
+
+ /*
+ * Get the result of the abort command for each of the pending entries
+ */
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, endtime,
+ true, false))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ if (toplevel)
+ {
+ /* Do a DEALLOCATE ALL in parallel if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ }
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ /*
+ * Get the result of the DEALLOCATE command for each of the pending
+ * entries
+ */
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ endtime, true, true))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f210f91188..ed5bf9208f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9509,7 +9509,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, parallel_abort, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -10934,10 +10934,12 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -11007,5 +11009,52 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 572591a558..59f865fac3 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -122,6 +122,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -251,6 +252,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 95b6b7192e..bd26739d9a 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3517,10 +3517,12 @@ RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3559,5 +3561,26 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index d8dc715587..b86b3e9cc3 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -461,10 +461,10 @@ OPTIONS (ADD password_required 'false');
<para>
When multiple remote (sub)transactions are involved in a local
- (sub)transaction, by default <filename>postgres_fdw</filename> commits
- those remote (sub)transactions one by one when the local (sub)transaction
- commits.
- Performance can be improved with the following option:
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote (sub)transactions one by one when the local
+ (sub)transaction commits or aborts.
+ Performance can be improved with the following options:
</para>
<variablelist>
@@ -479,27 +479,40 @@ OPTIONS (ADD password_required 'false');
This option can only be specified for foreign servers, not per-table.
The default is <literal>false</literal>.
</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
<para>
- If multiple foreign servers with this option enabled are involved in
- a local (sub)transaction, multiple remote (sub)transactions opened on
- those foreign servers in the local (sub)transaction are committed in
- parallel across those foreign servers when the local (sub)transaction
- commits.
- </para>
-
- <para>
- For a foreign server with this option enabled, if many remote
- (sub)transactions are opened on the foreign server in a local
- (sub)transaction, this option might increase the remote server’s load
- when the local (sub)transaction commits, so be careful when using this
- option.
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction aborts.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
+ <para>
+ If multiple foreign servers with these options enabled are involved in a
+ local (sub)transaction, multiple remote (sub)transactions opened on those
+ foreign servers in the local (sub)transaction are committed or aborted in
+ parallel across those foreign servers when the local (sub)transaction
+ commits or aborts.
+ </para>
+
+ <para>
+ For a foreign server with these options enabled, if many remote
+ (sub)transactions are opened on the foreign server in a local
+ (sub)transaction, these options might increase the remote server’s load
+ when the local (sub)transaction commits or aborts, so be careful when
+ using these options.
+ </para>
+
</sect3>
<sect3>
I tried to apply the patch to master and planned to run some tests, but got the errors below due to other commits.
$ git apply --check
v7-0003-postgres-fdw-Add-support-for-parallel-abort.patch
error: patch failed: doc/src/sgml/postgres-fdw.sgml:479
error: doc/src/sgml/postgres-fdw.sgml: patch does not apply
+ * remote server in parallel at (sub)transaction end.
Here, I think the comment above could potentially apply to multiple remote servers.
Is there a way to avoid such repeated comments? For example, the same comment below appears in two places (line 231 and line 296).
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
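The optimization the quoted comment describes, draining any bytes already buffered on the socket before paying for a blocking wait, can be sketched in Python with a socketpair. This is only an illustrative analogue of the PQconsumeInput-before-WaitLatchOrSocket pattern; none of the names below are postgres_fdw or libpq API:

```python
import select
import socket

def read_reply(sock, timeout=5.0):
    """Try a non-blocking read first; only fall back to a blocking wait
    (analogous to calling PQconsumeInput before WaitLatchOrSocket)."""
    sock.setblocking(False)
    try:
        # Fast path: the reply is already buffered, so no wait is needed.
        return sock.recv(4096)
    except BlockingIOError:
        pass
    # Nothing buffered yet; now pay the cost of blocking in select().
    ready, _, _ = select.select([sock], [], [], timeout)
    if not ready:
        raise TimeoutError("no reply within timeout")
    return sock.recv(4096)

a, b = socket.socketpair()
b.sendall(b"COMMIT ok")   # reply is already in the kernel buffer
print(read_reply(a))      # fast path: recv succeeds without select()
```

In the patch the same idea is why `consume_input` is passed to the `*_end` functions: when many connections finish around the same time, most results are already readable and the wait primitive can be skipped.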
On 2022-03-24 11:46 p.m., Etsuro Fujita wrote:
On Thu, Mar 24, 2022 at 1:34 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Attached is a new patch set. The new version of 0002 is just a
cleanup patch (see the commit message in 0002), and I think it's
committable, so I'm planning to commit it, if no objections.

Done.
Attached is the 0003 patch, which is the same as the one I sent yesterday.
Best regards,
Etsuro Fujita
--
Best regards,
David
Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
Hi,
On Wed, Apr 20, 2022 at 4:55 AM David Zhang <david.zhang@highgo.ca> wrote:
I tried to apply the patch to master and planned to run some tests, but got the errors below due to other commits.
I rebased the patch against HEAD. Attached is an updated patch.
+ * remote server in parallel at (sub)transaction end.
Here, I think the comment above could potentially apply to multiple
remote server(s).
I agree on that point, but I think it's correct to say "the remote
server" here, because we determine this for the given remote server.
Maybe I'm missing something, so could you elaborate on it?
Not sure if there is a way to avoid repeated comments? For example, the
same comment below appears in two places (line 231 and line 296).
+ /*
+  * If requested, consume whatever data is available from the socket.
+  * (Note that if all data is available, this allows
+  * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+  * overhead of WaitLatchOrSocket, which would be large compared to the
+  * overhead of PQconsumeInput.)
+  */
IMO it's OK to have this in multiple places, because 1) it wouldn't be
that long, and 2) we already duplicated comments like this in the same
file in v14 and earlier. Here is such an example in
pgfdw_xact_callback() and pgfdw_subxact_callback() in that file in
those versions:
/*
* If a command has been submitted to the remote server by
* using an asynchronous execution function, the command
* might not have yet completed. Check to see if a
* command is still being processed by the remote server,
* and if so, request cancellation of the command.
*/
Thanks for reviewing! Sorry for the delay.
Best regards,
Etsuro Fujita
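The core idea of the patch, issuing the (sub)transaction-end command to every remote server first and only then collecting the results, makes the total wait roughly the slowest server's latency rather than the sum of all of them. A toy model of that effect (plain Python threads standing in for remote servers; none of these names come from the patch):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def remote_commit(latency):
    """Stand-in for one remote COMMIT round trip."""
    time.sleep(latency)
    return "COMMIT"

latencies = [0.05, 0.05, 0.05]   # three foreign servers

# Serial (default): wait for each server in turn -> sum of latencies.
t0 = time.monotonic()
serial = [remote_commit(l) for l in latencies]
serial_time = time.monotonic() - t0

# Parallel: submit all commits first, then wait for results in order,
# mirroring how the patch sends commands and later collects replies.
t0 = time.monotonic()
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(remote_commit, l) for l in latencies]
    parallel = [f.result() for f in futures]
parallel_time = time.monotonic() - t0

assert serial == parallel == ["COMMIT"] * 3
print(f"serial {serial_time:.2f}s, parallel {parallel_time:.2f}s")
```

This matches the measurements quoted earlier in the thread, where RELEASE and COMMIT latencies roughly halved with two foreign servers once parallel_commit was enabled.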
Attachments:
v8-postgres-fdw-Add-support-for-parallel-abort.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index f9b8c01f3b..64ffc2564d 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -80,6 +81,21 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/* Milliseconds to wait to cancel a query or execute a cleanup query */
+#define CONNECTION_CLEANUP_TIMEOUT 30000
+
+/* Macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -106,14 +122,28 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
+ bool consume_input);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime,
+ bool consume_input,
+ bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -324,11 +354,12 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*
* By default, all the connections to any foreign servers are kept open.
*
- * Also determine whether to commit (sub)transactions opened on the remote
- * server in parallel at (sub)transaction end.
+ * Also determine whether to commit/abort (sub)transactions opened on the
+ * remote server in parallel at (sub)transaction end.
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -337,6 +368,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
else if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -922,6 +955,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1015,7 +1049,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
/* Rollback all remote transactions during abort */
- pgfdw_abort_cleanup(entry, true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, true,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1025,11 +1067,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1054,6 +1106,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1108,7 +1161,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- pgfdw_abort_cleanup(entry, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, false,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1116,10 +1177,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1263,17 +1333,25 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
* the connection is dead.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime, false);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
@@ -1293,6 +1371,31 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime, bool consume_input)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("could not get result of cancel request: %s",
+ pchomp(PQerrorMessage(conn)))));
+ return false;
+ }
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1327,9 +1430,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1337,8 +1438,18 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
* place (e.g. statement timeout, user cancel), so the timeout shouldn't
* be too long.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, endtime,
+ false, ignore_errors);
+}
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1349,6 +1460,30 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime, bool consume_input,
+ bool ignore_errors)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ pgfdw_report_error(WARNING, NULL, conn, false, query);
+ return false;
+ }
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1504,12 +1639,7 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
- if (toplevel)
- snprintf(sql, sizeof(sql), "ABORT TRANSACTION");
- else
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- entry->xact_depth, entry->xact_depth);
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
return; /* Unable to abort remote (sub)transaction */
@@ -1538,6 +1668,65 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
entry->changing_xact_state = false;
}
+/*
+ * Like pgfdw_abort_cleanup, submit an abort command or cancel request, but
+ * don't wait for the result.
+ *
+ * Returns true if the abort command or cancel request is successfully issued,
+ * false otherwise. If the abort command is successfully issued, the given
+ * connection cache entry is appended to *pending_entries. Otherwise, if the
+ * cancel request is successfully issued, it's appended to *cancel_requested.
+ */
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ char sql[100];
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
/*
* Finish pre-commit cleanup of connections on each of which we've sent a
* COMMIT command to the remote server.
@@ -1644,6 +1833,168 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries, int curlevel)
}
}
+/*
+ * Finish abort cleanup of connections on each of which we've sent an abort
+ * command or cancel request to the remote server.
+ */
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ /*
+ * For each of the pending cancel requests (if any), get and discard the
+ * result of the query, and submit an abort command to the remote server.
+ */
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do this before issuing
+ * cancel request like in normal mode, but that is problematic,
+ * because if, for example, it took longer than 30 seconds to
+ * process the first few entries in the cancel_requested list, it
+ * would cause a timeout error when processing each of the
+ * remaining entries in the list, leading to slamming that entry's
+ * connection shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_end(entry->conn, endtime, true))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /* Send an abort command in parallel if needed */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_entries)
+ return;
+
+ /*
+ * Get the result of the abort command for each of the pending entries
+ */
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, endtime,
+ true, false))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ if (toplevel)
+ {
+ /* Do a DEALLOCATE ALL in parallel if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ }
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ /*
+ * Get the result of the DEALLOCATE command for each of the pending
+ * entries
+ */
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ endtime, true, true))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 44457f930c..c7ba7b7a27 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9529,7 +9529,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, parallel_abort, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -11218,10 +11218,12 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -11291,5 +11293,52 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 572591a558..59f865fac3 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -122,6 +122,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -251,6 +252,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 92d1212027..7c5573ba9a 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3591,10 +3591,12 @@ RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3633,5 +3635,26 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index b43d0aecba..29b46e1793 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -461,10 +461,10 @@ OPTIONS (ADD password_required 'false');
<para>
When multiple remote (sub)transactions are involved in a local
- (sub)transaction, by default <filename>postgres_fdw</filename> commits
- those remote (sub)transactions one by one when the local (sub)transaction
- commits.
- Performance can be improved with the following option:
+ (sub)transaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote (sub)transactions one by one when the local
+ (sub)transaction commits or aborts.
+ Performance can be improved with the following options:
</para>
<variablelist>
@@ -479,27 +479,40 @@ OPTIONS (ADD password_required 'false');
This option can only be specified for foreign servers, not per-table.
The default is <literal>false</literal>.
</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
<para>
- If multiple foreign servers with this option enabled are involved in
- a local (sub)transaction, multiple remote (sub)transactions opened on
- those foreign servers in the local (sub)transaction are committed in
- parallel across those foreign servers when the local (sub)transaction
- commits.
- </para>
-
- <para>
- For a foreign server with this option enabled, if many remote
- (sub)transactions are opened on the foreign server in a local
- (sub)transaction, this option might increase the remote server's load
- when the local (sub)transaction commits, so be careful when using this
- option.
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ remote (sub)transactions opened on a foreign server in a local
+ (sub)transaction in parallel when the local (sub)transaction aborts.
+ This option can only be specified for foreign servers, not per-table.
+ The default is <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
+ <para>
+ If multiple foreign servers with these options enabled are involved in a
+ local (sub)transaction, multiple remote (sub)transactions opened on those
+ foreign servers in the local (sub)transaction are committed or aborted in
+ parallel across those foreign servers when the local (sub)transaction
+ commits or aborts.
+ </para>
+
+ <para>
+ For a foreign server with these options enabled, if many remote
+ (sub)transactions are opened on the foreign server in a local
+ (sub)transaction, these options might increase the remote server's load
+ when the local (sub)transaction commits or aborts, so be careful when
+ using these options.
+ </para>
+
</sect3>
<sect3>
Thanks a lot for the patch update.
On 2022-05-02 1:25 a.m., Etsuro Fujita wrote:
Hi,
On Wed, Apr 20, 2022 at 4:55 AM David Zhang <david.zhang@highgo.ca> wrote:
I tried to apply the patch to master and plan to run some tests, but got
below errors due to other commits.
I rebased the patch against HEAD. Attached is an updated patch.
Applied the patch v8 to master branch today, and the `make check` is OK.
I also repeated the previous performance tests on three virtual Ubuntu
18.04 machines, and the performance improvement of parallel abort,
averaged over 10 runs, is more consistent.
before:
abort.1 = 2.6344 ms
abort.2 = 4.2799 ms
after:
abort.1 = 1.4105 ms
abort.2 = 2.2075 ms
+ * remote server in parallel at (sub)transaction end.
Here, I think the comment above could potentially apply to multiple
remote server(s).
I agree on that point, but I think it's correct to say "the remote
server" here, because we determine this for the given remote server.
Maybe I'm missing something, so could you elaborate on it?
Your understanding is correct. I was thinking `remote server(s)` would
be easier for end users to understand, but this is a comment in the
source code, so either way is fine for me.
Not sure if there is a way to avoid repeated comments? For example, the
same comment below appears in two places (line 231 and line 296).

+	/*
+	 * If requested, consume whatever data is available from the socket.
+	 * (Note that if all data is available, this allows
+	 * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+	 * overhead of WaitLatchOrSocket, which would be large compared to the
+	 * overhead of PQconsumeInput.)
+	 */

IMO I think it's OK to have this in multiple places, because 1) IMO it
wouldn't be that long, and 2) we already duplicated comments like this
in the same file in v14 and earlier. Here is such an example in
pgfdw_xact_callback() and pgfdw_subxact_callback() in that file in
those versions:

/*
 * If a command has been submitted to the remote server by
 * using an asynchronous execution function, the command
 * might not have yet completed. Check to see if a
 * command is still being processed by the remote server,
 * and if so, request cancellation of the command.
 */

Thanks for reviewing! Sorry for the delay.
Best regards,
Etsuro Fujita
--
Best regards,
David
Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
On Thu, May 5, 2022 at 6:39 AM David Zhang <david.zhang@highgo.ca> wrote:
On 2022-05-02 1:25 a.m., Etsuro Fujita wrote:
On Wed, Apr 20, 2022 at 4:55 AM David Zhang <david.zhang@highgo.ca> wrote:
I tried to apply the patch to master and plan to run some tests, but got
below errors due to other commits.
I rebased the patch against HEAD. Attached is an updated patch.
Applied the patch v8 to master branch today, and the `make check` is OK.
I also repeated the previous performance tests on three virtual Ubuntu
18.04 machines, and the performance improvement of parallel abort,
averaged over 10 runs, is more consistent.

before:
abort.1 = 2.6344 ms
abort.2 = 4.2799 ms

after:
abort.1 = 1.4105 ms
abort.2 = 2.2075 ms
Good to know! Thanks for testing!
+ * remote server in parallel at (sub)transaction end.
Here, I think the comment above could potentially apply to multiple
remote server(s).
I agree on that point, but I think it's correct to say "the remote
server" here, because we determine this for the given remote server.
Maybe I'm missing something, so could you elaborate on it?
Your understanding is correct. I was thinking `remote server(s)` would
be easier for end users to understand, but this is a comment in the
source code, so either way is fine for me.
Ok, but I noticed that the comment failed to mention that the
parallel_commit option is disabled by default. Also, I noticed a
comment above it:
* It's enough to determine this only when making new connection because
* all the connections to the foreign server whose keep_connections option
* is changed will be closed and re-made later.
This would apply to the parallel_commit option as well. How about
updating these like the attached? (I simplified the latter comment
and moved it to a more appropriate place.)
Best regards,
Etsuro Fujita
Attachments:
update-comment-in-make_new_connection.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index f9b8c01f3b..541526ab80 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -318,14 +318,15 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
* open even after the transaction using it ends, so that the subsequent
* transactions can re-use it.
*
- * It's enough to determine this only when making new connection because
- * all the connections to the foreign server whose keep_connections option
- * is changed will be closed and re-made later.
- *
* By default, all the connections to any foreign servers are kept open.
*
* Also determine whether to commit (sub)transactions opened on the remote
- * server in parallel at (sub)transaction end.
+ * server in parallel at (sub)transaction end, which is disabled by
+ * default.
+ *
+ * Note: it's enough to determine these only when making a new connection
+ * because if these settings for it are changed, it will be closed and
+ * re-made later.
*/
entry->keep_connections = true;
entry->parallel_commit = false;
On Fri, May 6, 2022 at 7:08 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Wed, Apr 20, 2022 at 4:55 AM David Zhang <david.zhang@highgo.ca> wrote:
+ * remote server in parallel at (sub)transaction end.
I noticed that the comment failed to mention that the
parallel_commit option is disabled by default. Also, I noticed a
comment above it:

* It's enough to determine this only when making new connection because
* all the connections to the foreign server whose keep_connections option
* is changed will be closed and re-made later.

This would apply to the parallel_commit option as well. How about
updating these like the attached? (I simplified the latter comment
and moved it to a more appropriate place.)
I’m planning to commit this as a follow-up patch for commit 04e706d42.
Best regards,
Etsuro Fujita
On Wed, May 11, 2022 at 7:39 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Fri, May 6, 2022 at 7:08 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Wed, Apr 20, 2022 at 4:55 AM David Zhang <david.zhang@highgo.ca> wrote:
+ * remote server in parallel at (sub)transaction end.
I noticed that the comment failed to mention that the
parallel_commit option is disabled by default. Also, I noticed a
comment above it:

* It's enough to determine this only when making new connection because
* all the connections to the foreign server whose keep_connections option
* is changed will be closed and re-made later.

This would apply to the parallel_commit option as well. How about
updating these like the attached? (I simplified the latter comment
and moved it to a more appropriate place.)
I'm planning to commit this as a follow-up patch for commit 04e706d42.
Done.
Best regards,
Etsuro Fujita
On 5/12/22 01:46, Etsuro Fujita wrote:
On Wed, May 11, 2022 at 7:39 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I’m planning to commit this as a follow-up patch for commit 04e706d42.
Done.
FYI, I think cfbot is confused about the patch under review here. (When
I first opened the thread I thought the patch had already been committed.)
For new reviewers: it looks like v8, upthread, is the proposal.
--Jacob
Hi Jacob,
On Fri, Jul 1, 2022 at 3:50 AM Jacob Champion <jchampion@timescale.com> wrote:
On 5/12/22 01:46, Etsuro Fujita wrote:
On Wed, May 11, 2022 at 7:39 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I’m planning to commit this as a follow-up patch for commit 04e706d42.
Done.
FYI, I think cfbot is confused about the patch under review here. (When
I first opened the thread I thought the patch had already been committed.)
I should have attached the patch in the previous email.
For new reviewers: it looks like v8, upthread, is the proposal.
The patch needs rebase due to commits 4036bcbbb, 8c8d307f8 and
82699edbf, so I updated the patch. Attached is a new version, in
which I also tweaked comments a little bit.
Thanks for taking care of this!
Best regards,
Etsuro Fujita
Attachments:
v9-postgres-fdw-Add-support-for-parallel-abort.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index cffb6f8310..c2312c4e50 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -80,6 +81,25 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/*
+ * Milliseconds to wait to cancel an in-progress query or execute a cleanup
+ * query; if it takes longer than 30 seconds to do these, we assume the
+ * connection is dead.
+ */
+#define CONNECTION_CLEANUP_TIMEOUT 30000
+
+/* Macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -106,14 +126,28 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
+ bool consume_input);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime,
+ bool consume_input,
+ bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -320,8 +354,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*
* By default, all the connections to any foreign servers are kept open.
*
- * Also determine whether to commit (sub)transactions opened on the remote
- * server in parallel at (sub)transaction end, which is disabled by
+ * Also determine whether to commit/abort (sub)transactions opened on the
+ * remote server in parallel at (sub)transaction end, which is disabled by
* default.
*
* Note: it's enough to determine these only when making a new connection
@@ -330,6 +364,7 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -338,6 +373,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
else if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -923,6 +960,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1016,7 +1054,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
/* Rollback all remote transactions during abort */
- pgfdw_abort_cleanup(entry, true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, true,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1026,11 +1072,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1055,6 +1111,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1109,7 +1166,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- pgfdw_abort_cleanup(entry, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, false,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1117,10 +1182,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1264,17 +1338,25 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
* the connection is dead.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime, false);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
@@ -1294,6 +1376,31 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime, bool consume_input)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("could not get result of cancel request: %s",
+ pchomp(PQerrorMessage(conn)))));
+ return false;
+ }
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1328,9 +1435,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1338,8 +1443,18 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
* place (e.g. statement timeout, user cancel), so the timeout shouldn't
* be too long.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, endtime,
+ false, ignore_errors);
+}
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1350,6 +1465,30 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime, bool consume_input,
+ bool ignore_errors)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ pgfdw_report_error(WARNING, NULL, conn, false, query);
+ return false;
+ }
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1505,12 +1644,7 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
- if (toplevel)
- snprintf(sql, sizeof(sql), "ABORT TRANSACTION");
- else
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- entry->xact_depth, entry->xact_depth);
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
return; /* Unable to abort remote (sub)transaction */
@@ -1539,6 +1673,65 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
entry->changing_xact_state = false;
}
+/*
+ * Like pgfdw_abort_cleanup, submit an abort command or cancel a request, but
+ * don't wait for the result.
+ *
+ * Returns true if the abort command or cancel request is successfully issued,
+ * false otherwise. If the abort command is successfully issued, the given
+ * connection cache entry is appended to *pending_entries. Otherwise, if the
+ * cancel request is successfully issued, it's appended to *cancel_requested.
+ */
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ char sql[100];
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
/*
* Finish pre-commit cleanup of connections on each of which we've sent a
* COMMIT command to the remote server.
@@ -1647,6 +1840,168 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries, int curlevel)
}
}
+/*
+ * Finish abort cleanup of connections on each of which we've sent an abort
+ * command or cancel request to the remote server.
+ */
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ /*
+ * For each of the pending cancel requests (if any), get and discard the
+ * result of the query, and submit an abort command to the remote server.
+ */
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do this before issuing
+ * cancel request like in normal mode, but that is problematic,
+ * because if, for example, it took longer than 30 seconds to
+ * process the first few entries in the cancel_requested list, it
+ * would cause a timeout error when processing each of the
+ * remaining entries in the list, leading to slamming that entry's
+ * connection shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_end(entry->conn, endtime, true))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /* Send an abort command in parallel if needed */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_entries)
+ return;
+
+ /*
+ * Get the result of the abort command for each of the pending entries
+ */
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, endtime,
+ true, false))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ if (toplevel)
+ {
+ /* Do a DEALLOCATE ALL in parallel if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ }
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ /*
+ * Get the result of the DEALLOCATE command for each of the pending
+ * entries
+ */
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ endtime, true, true))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 44457f930c..c7ba7b7a27 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9529,7 +9529,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, parallel_abort, keep_connections
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
@@ -11218,10 +11218,12 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -11291,5 +11293,52 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 572591a558..59f865fac3 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -122,6 +122,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -251,6 +252,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 92d1212027..7c5573ba9a 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3591,10 +3591,12 @@ RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3633,5 +3635,26 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index bfd344cdc0..7b8b959462 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -465,12 +465,13 @@ OPTIONS (ADD password_required 'false');
corresponding remote transactions, and subtransactions are managed by
creating corresponding remote subtransactions. When multiple remote
transactions are involved in the current local transaction, by default
- <filename>postgres_fdw</filename> commits those remote transactions
- serially when the local transaction is committed. When multiple remote
- subtransactions are involved in the current local subtransaction, by
- default <filename>postgres_fdw</filename> commits those remote
- subtransactions serially when the local subtransaction is committed.
- Performance can be improved with the following option:
+ <filename>postgres_fdw</filename> commits or aborts those remote
+ transactions serially when the local transaction is committed or aborted.
+ When multiple remote subtransactions are involved in the current local
+ subtransaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote subtransactions serially when the local subtransaction
+ is committed or aborted.
+ Performance can be improved with the following options:
</para>
<variablelist>
@@ -486,24 +487,38 @@ OPTIONS (ADD password_required 'false');
specified for foreign servers, not per-table. The default is
<literal>false</literal>.
</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
<para>
- If multiple foreign servers with this option enabled are involved in a
- local transaction, multiple remote transactions on those foreign
- servers are committed in parallel across those foreign servers when
- the local transaction is committed.
- </para>
-
- <para>
- When this option is enabled, a foreign server with many remote
- transactions may see a negative performance impact when the local
- transaction is committed.
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ in parallel remote transactions opened on a foreign server in a local
+ transaction when the local transaction is aborted. This setting also
+ applies to remote and local subtransactions. This option can only be
+ specified for foreign servers, not per-table. The default is
+ <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
+ <para>
+ If multiple foreign servers with these options enabled are involved in a
+ local transaction, multiple remote transactions on those foreign servers
+ are committed or aborted in parallel across those foreign servers when
+ the local transaction is committed or aborted.
+ </para>
+
+ <para>
+ When these options are enabled, a foreign server with many remote
+ transactions may see a negative performance impact when the local
+ transaction is committed or aborted.
+ </para>
+
</sect3>
<sect3>
Hi Etsuro,
The patch needs rebase due to commits 4036bcbbb, 8c8d307f8 and
82699edbf, so I updated the patch. Attached is a new version, in
which I also tweaked comments a little bit.
After rebasing the file `postgres_fdw.out` and applying the patch to the master branch,
make and make check are all OK for postgres_fdw.
--
David
Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
Hi David,
On Sat, Oct 1, 2022 at 5:54 AM David Zhang <david.zhang@highgo.ca> wrote:
After rebasing the file `postgres_fdw.out` and applying the patch to the master branch,
make and make check are all OK for postgres_fdw.
Thanks for testing! Attached is a rebased version of the patch.
Best regards,
Etsuro Fujita
Attachments:
v10-postgres-fdw-Add-support-for-parallel-abort.patchapplication/octet-stream; name=v10-postgres-fdw-Add-support-for-parallel-abort.patchDownload
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index f0c45b00db..864f9516ad 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -80,6 +81,25 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/*
+ * Milliseconds to wait to cancel an in-progress query or execute a cleanup
+ * query; if it takes longer than 30 seconds to do these, we assume the
+ * connection is dead.
+ */
+#define CONNECTION_CLEANUP_TIMEOUT 30000
+
+/* Macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -106,14 +126,28 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
+ bool consume_input);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime,
+ bool consume_input,
+ bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -320,8 +354,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*
* By default, all the connections to any foreign servers are kept open.
*
- * Also determine whether to commit (sub)transactions opened on the remote
- * server in parallel at (sub)transaction end, which is disabled by
+ * Also determine whether to commit/abort (sub)transactions opened on the
+ * remote server in parallel at (sub)transaction end, which is disabled by
* default.
*
* Note: it's enough to determine these only when making a new connection
@@ -330,6 +364,7 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -338,6 +373,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
else if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -923,6 +960,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1016,7 +1054,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
/* Rollback all remote transactions during abort */
- pgfdw_abort_cleanup(entry, true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, true,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1026,11 +1072,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1055,6 +1111,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1109,7 +1166,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- pgfdw_abort_cleanup(entry, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, false,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1117,10 +1182,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1264,17 +1338,25 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
* the connection is dead.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime, false);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
@@ -1294,6 +1376,31 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime, bool consume_input)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("could not get result of cancel request: %s",
+ pchomp(PQerrorMessage(conn)))));
+ return false;
+ }
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1328,9 +1435,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1338,8 +1443,18 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
* place (e.g. statement timeout, user cancel), so the timeout shouldn't
* be too long.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, endtime,
+ false, ignore_errors);
+}
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1350,6 +1465,30 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime, bool consume_input,
+ bool ignore_errors)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ pgfdw_report_error(WARNING, NULL, conn, false, query);
+ return false;
+ }
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1505,12 +1644,7 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
- if (toplevel)
- snprintf(sql, sizeof(sql), "ABORT TRANSACTION");
- else
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- entry->xact_depth, entry->xact_depth);
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
return; /* Unable to abort remote (sub)transaction */
@@ -1539,6 +1673,65 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
entry->changing_xact_state = false;
}
+/*
+ * Like pgfdw_abort_cleanup, submit an abort command or a cancel request, but
+ * don't wait for the result.
+ *
+ * Returns true if the abort command or cancel request is successfully issued,
+ * false otherwise. If the abort command is successfully issued, the given
+ * connection cache entry is appended to *pending_entries. Otherwise, if the
+ * cancel request is successfully issued, it's appended to *cancel_requested.
+ */
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ char sql[100];
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
/*
* Finish pre-commit cleanup of connections on each of which we've sent a
* COMMIT command to the remote server.
@@ -1647,6 +1840,168 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries, int curlevel)
}
}
+/*
+ * Finish abort cleanup of connections on each of which we've sent an abort
+ * command or cancel request to the remote server.
+ */
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ /*
+ * For each of the pending cancel requests (if any), get and discard the
+ * result of the query, and submit an abort command to the remote server.
+ */
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do this before issuing
+ * cancel request like in normal mode, but that is problematic,
+ * because if, for example, it took longer than 30 seconds to
+ * process the first few entries in the cancel_requested list, it
+ * would cause a timeout error when processing each of the
+ * remaining entries in the list, leading to slamming that entry's
+ * connection shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_end(entry->conn, endtime, true))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /* Send an abort command in parallel if needed */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_entries)
+ return;
+
+ /*
+ * Get the result of the abort command for each of the pending entries
+ */
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, endtime,
+ true, false))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ if (toplevel)
+ {
+ /* Do a DEALLOCATE ALL in parallel if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ }
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ /*
+ * Get the result of the DEALLOCATE command for each of the pending
+ * entries
+ */
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ endtime, true, true))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 558e94b845..5ddb483cd5 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -11501,10 +11501,12 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -11574,5 +11576,52 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index fa80ee2a55..df6680fe27 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -125,6 +125,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -254,6 +255,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index b0dbb41fb5..4d5140669b 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3744,10 +3744,12 @@ RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3786,5 +3788,26 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 527f4deaaa..2b309ad002 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -469,12 +469,13 @@ OPTIONS (ADD password_required 'false');
corresponding remote transactions, and subtransactions are managed by
creating corresponding remote subtransactions. When multiple remote
transactions are involved in the current local transaction, by default
- <filename>postgres_fdw</filename> commits those remote transactions
- serially when the local transaction is committed. When multiple remote
- subtransactions are involved in the current local subtransaction, by
- default <filename>postgres_fdw</filename> commits those remote
- subtransactions serially when the local subtransaction is committed.
- Performance can be improved with the following option:
+ <filename>postgres_fdw</filename> commits or aborts those remote
+ transactions serially when the local transaction is committed or aborted.
+ When multiple remote subtransactions are involved in the current local
+ subtransaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote subtransactions serially when the local subtransaction
+ is committed or aborted.
+ Performance can be improved with the following options:
</para>
<variablelist>
@@ -490,24 +491,38 @@ OPTIONS (ADD password_required 'false');
specified for foreign servers, not per-table. The default is
<literal>false</literal>.
</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
<para>
- If multiple foreign servers with this option enabled are involved in a
- local transaction, multiple remote transactions on those foreign
- servers are committed in parallel across those foreign servers when
- the local transaction is committed.
- </para>
-
- <para>
- When this option is enabled, a foreign server with many remote
- transactions may see a negative performance impact when the local
- transaction is committed.
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ in parallel remote transactions opened on a foreign server in a local
+ transaction when the local transaction is aborted. This setting also
+ applies to remote and local subtransactions. This option can only be
+ specified for foreign servers, not per-table. The default is
+ <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
+ <para>
+ If multiple foreign servers with these options enabled are involved in a
+ local transaction, multiple remote transactions on those foreign servers
+ are committed or aborted in parallel across those foreign servers when
+ the local transaction is committed or aborted.
+ </para>
+
+ <para>
+ When these options are enabled, a foreign server with many remote
+ transactions may see a negative performance impact when the local
+ transaction is committed or aborted.
+ </para>
+
</sect3>
<sect3>
On Tue, 1 Nov 2022 at 15:54, Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Hi David,
On Sat, Oct 1, 2022 at 5:54 AM David Zhang <david.zhang@highgo.ca> wrote:
After rebasing `postgres_fdw.out` and applying the patch to the master branch,
make and make check are all OK for postgres_fdw.
Thanks for testing! Attached is a rebased version of the patch.
The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
b37a0832396414e8469d4ee4daea33396bde39b0 ===
=== applying patch ./v10-postgres-fdw-Add-support-for-parallel-abort.patch
patching file contrib/postgres_fdw/connection.c
patching file contrib/postgres_fdw/expected/postgres_fdw.out
Hunk #1 succeeded at 11704 (offset 203 lines).
Hunk #2 FAILED at 11576.
1 out of 2 hunks FAILED -- saving rejects to file
contrib/postgres_fdw/expected/postgres_fdw.out.rej
patching file contrib/postgres_fdw/option.c
Hunk #2 succeeded at 272 (offset 17 lines).
patching file contrib/postgres_fdw/sql/postgres_fdw.sql
Hunk #1 succeeded at 3894 (offset 150 lines).
Hunk #2 FAILED at 3788.
1 out of 2 hunks FAILED -- saving rejects to file
contrib/postgres_fdw/sql/postgres_fdw.sql.rej
[1]: http://cfbot.cputube.org/patch_41_3392.log
Regards,
Vignesh
Hi Vignesh,
On Wed, Jan 4, 2023 at 9:19 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 1 Nov 2022 at 15:54, Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Attached is a rebased version of the patch.
The patch does not apply on top of HEAD as in [1], please post a rebased patch:
I rebased the patch. Attached is an updated patch.
Thanks!
Best regards,
Etsuro Fujita
Attachments:
v11-postgres-fdw-Add-support-for-parallel-abort.patch (application/octet-stream)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index ed75ce3f79..781cbcb94e 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -59,6 +59,7 @@ typedef struct ConnCacheEntry
bool have_error; /* have any subxacts aborted in this xact? */
bool changing_xact_state; /* xact state change in process */
bool parallel_commit; /* do we commit (sub)xacts in parallel? */
+ bool parallel_abort; /* do we abort (sub)xacts in parallel? */
bool invalidated; /* true if reconnect is pending */
bool keep_connections; /* setting value of keep_connections
* server option */
@@ -80,6 +81,25 @@ static unsigned int prep_stmt_number = 0;
/* tracks whether any work is needed in callback functions */
static bool xact_got_connection = false;
+/*
+ * Milliseconds to wait to cancel an in-progress query or execute a cleanup
+ * query; if it takes longer than 30 seconds to do these, we assume the
+ * connection is dead.
+ */
+#define CONNECTION_CLEANUP_TIMEOUT 30000
+
+/* Macro for constructing abort command to be sent */
+#define CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel) \
+ do { \
+ if (toplevel) \
+ snprintf((sql), sizeof(sql), \
+ "ABORT TRANSACTION"); \
+ else \
+ snprintf((sql), sizeof(sql), \
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", \
+ (entry)->xact_depth, (entry)->xact_depth); \
+ } while(0)
+
/*
* SQL functions
*/
@@ -106,14 +126,28 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
static bool pgfdw_cancel_query(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
+ bool consume_input);
static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
bool ignore_errors);
+static bool pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query);
+static bool pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime,
+ bool consume_input,
+ bool ignore_errors);
static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime,
PGresult **result, bool *timed_out);
static void pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel);
+static bool pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries,
+ List **cancel_requested);
static void pgfdw_finish_pre_commit_cleanup(List *pending_entries);
static void pgfdw_finish_pre_subcommit_cleanup(List *pending_entries,
int curlevel);
+static void pgfdw_finish_abort_cleanup(List *pending_entries,
+ List *cancel_requested,
+ bool toplevel);
static bool UserMappingPasswordRequired(UserMapping *user);
static bool disconnect_cached_connections(Oid serverid);
@@ -320,8 +354,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*
* By default, all the connections to any foreign servers are kept open.
*
- * Also determine whether to commit (sub)transactions opened on the remote
- * server in parallel at (sub)transaction end, which is disabled by
+ * Also determine whether to commit/abort (sub)transactions opened on the
+ * remote server in parallel at (sub)transaction end, which is disabled by
* default.
*
* Note: it's enough to determine these only when making a new connection
@@ -330,6 +364,7 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
*/
entry->keep_connections = true;
entry->parallel_commit = false;
+ entry->parallel_abort = false;
foreach(lc, server->options)
{
DefElem *def = (DefElem *) lfirst(lc);
@@ -338,6 +373,8 @@ make_new_connection(ConnCacheEntry *entry, UserMapping *user)
entry->keep_connections = defGetBoolean(def);
else if (strcmp(def->defname, "parallel_commit") == 0)
entry->parallel_commit = defGetBoolean(def);
+ else if (strcmp(def->defname, "parallel_abort") == 0)
+ entry->parallel_abort = defGetBoolean(def);
}
/* Now try to make the connection */
@@ -923,6 +960,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
@@ -1016,7 +1054,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
case XACT_EVENT_PARALLEL_ABORT:
case XACT_EVENT_ABORT:
/* Rollback all remote transactions during abort */
- pgfdw_abort_cleanup(entry, true);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, true,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, true);
break;
}
}
@@ -1026,11 +1072,21 @@ pgfdw_xact_callback(XactEvent event, void *arg)
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
- event == XACT_EVENT_PRE_COMMIT);
- pgfdw_finish_pre_commit_cleanup(pending_entries);
+ if (event == XACT_EVENT_PARALLEL_PRE_COMMIT ||
+ event == XACT_EVENT_PRE_COMMIT)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_commit_cleanup(pending_entries);
+ }
+ else
+ {
+ Assert(event == XACT_EVENT_PARALLEL_ABORT ||
+ event == XACT_EVENT_ABORT);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ true);
+ }
}
/*
@@ -1055,6 +1111,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
ConnCacheEntry *entry;
int curlevel;
List *pending_entries = NIL;
+ List *cancel_requested = NIL;
/* Nothing to do at subxact start, nor after commit. */
if (!(event == SUBXACT_EVENT_PRE_COMMIT_SUB ||
@@ -1109,7 +1166,15 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
else
{
/* Rollback all remote subtransactions during abort */
- pgfdw_abort_cleanup(entry, false);
+ if (entry->parallel_abort)
+ {
+ if (pgfdw_abort_cleanup_begin(entry, false,
+ &pending_entries,
+ &cancel_requested))
+ continue;
+ }
+ else
+ pgfdw_abort_cleanup(entry, false);
}
/* OK, we're outta that level of subtransaction */
@@ -1117,10 +1182,19 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
}
/* If there are any pending connections, finish cleaning them up */
- if (pending_entries)
+ if (pending_entries || cancel_requested)
{
- Assert(event == SUBXACT_EVENT_PRE_COMMIT_SUB);
- pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ if (event == SUBXACT_EVENT_PRE_COMMIT_SUB)
+ {
+ Assert(cancel_requested == NIL);
+ pgfdw_finish_pre_subcommit_cleanup(pending_entries, curlevel);
+ }
+ else
+ {
+ Assert(event == SUBXACT_EVENT_ABORT_SUB);
+ pgfdw_finish_abort_cleanup(pending_entries, cancel_requested,
+ false);
+ }
}
}
@@ -1264,17 +1338,25 @@ pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel)
static bool
pgfdw_cancel_query(PGconn *conn)
{
- PGcancel *cancel;
- char errbuf[256];
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to cancel the query and discard the result, assume
* the connection is dead.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_begin(conn))
+ return false;
+ return pgfdw_cancel_query_end(conn, endtime, false);
+}
+
+static bool
+pgfdw_cancel_query_begin(PGconn *conn)
+{
+ PGcancel *cancel;
+ char errbuf[256];
/*
* Issue cancel request. Unfortunately, there's no good way to limit the
@@ -1294,6 +1376,31 @@ pgfdw_cancel_query(PGconn *conn)
PQfreeCancel(cancel);
}
+ return true;
+}
+
+static bool
+pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime, bool consume_input)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("could not get result of cancel request: %s",
+ pchomp(PQerrorMessage(conn)))));
+ return false;
+ }
+
/* Get and discard the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1328,9 +1435,7 @@ pgfdw_cancel_query(PGconn *conn)
static bool
pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
{
- PGresult *result = NULL;
TimestampTz endtime;
- bool timed_out;
/*
* If it takes too long to execute a cleanup query, assume the connection
@@ -1338,8 +1443,18 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
* place (e.g. statement timeout, user cancel), so the timeout shouldn't
* be too long.
*/
- endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), 30000);
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_begin(conn, query))
+ return false;
+ return pgfdw_exec_cleanup_query_end(conn, query, endtime,
+ false, ignore_errors);
+}
+static bool
+pgfdw_exec_cleanup_query_begin(PGconn *conn, const char *query)
+{
/*
* Submit a query. Since we don't use non-blocking mode, this also can
* block. But its risk is relatively small, so we ignore that for now.
@@ -1350,6 +1465,30 @@ pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors)
return false;
}
+ return true;
+}
+
+static bool
+pgfdw_exec_cleanup_query_end(PGconn *conn, const char *query,
+ TimestampTz endtime, bool consume_input,
+ bool ignore_errors)
+{
+ PGresult *result = NULL;
+ bool timed_out;
+
+ /*
+ * If requested, consume whatever data is available from the socket.
+ * (Note that if all data is available, this allows
+ * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+ * overhead of WaitLatchOrSocket, which would be large compared to the
+ * overhead of PQconsumeInput.)
+ */
+ if (consume_input && !PQconsumeInput(conn))
+ {
+ pgfdw_report_error(WARNING, NULL, conn, false, query);
+ return false;
+ }
+
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result, &timed_out))
{
@@ -1505,12 +1644,7 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
!pgfdw_cancel_query(entry->conn))
return; /* Unable to cancel running query */
- if (toplevel)
- snprintf(sql, sizeof(sql), "ABORT TRANSACTION");
- else
- snprintf(sql, sizeof(sql),
- "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
- entry->xact_depth, entry->xact_depth);
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
if (!pgfdw_exec_cleanup_query(entry->conn, sql, false))
return; /* Unable to abort remote (sub)transaction */
@@ -1539,6 +1673,65 @@ pgfdw_abort_cleanup(ConnCacheEntry *entry, bool toplevel)
entry->changing_xact_state = false;
}
+/*
+ * Like pgfdw_abort_cleanup, submit an abort command or cancel request, but
+ * don't wait for the result.
+ *
+ * Returns true if the abort command or cancel request is successfully issued,
+ * false otherwise. If the abort command is successfully issued, the given
+ * connection cache entry is appended to *pending_entries. Otherwise, if the
+ * cancel request is successfully issued, it's appended to *cancel_requested.
+ */
+static bool
+pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
+ List **pending_entries, List **cancel_requested)
+{
+ /*
+ * Don't try to clean up the connection if we're already in error
+ * recursion trouble.
+ */
+ if (in_error_recursion_trouble())
+ entry->changing_xact_state = true;
+
+ /*
+ * If connection is already unsalvageable, don't touch it further.
+ */
+ if (entry->changing_xact_state)
+ return false;
+
+ /*
+ * Mark this connection as in the process of changing transaction state.
+ */
+ entry->changing_xact_state = true;
+
+ /* Assume we might have lost track of prepared statements */
+ entry->have_error = true;
+
+ /*
+ * If a command has been submitted to the remote server by using an
+ * asynchronous execution function, the command might not have yet
+ * completed. Check to see if a command is still being processed by the
+ * remote server, and if so, request cancellation of the command.
+ */
+ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE)
+ {
+ if (!pgfdw_cancel_query_begin(entry->conn))
+ return false; /* Unable to cancel running query */
+ *cancel_requested = lappend(*cancel_requested, entry);
+ }
+ else
+ {
+ char sql[100];
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ return false; /* Unable to abort remote transaction */
+ *pending_entries = lappend(*pending_entries, entry);
+ }
+
+ return true;
+}
+
/*
* Finish pre-commit cleanup of connections on each of which we've sent a
* COMMIT command to the remote server.
@@ -1647,6 +1840,168 @@ pgfdw_finish_pre_subcommit_cleanup(List *pending_entries, int curlevel)
}
}
+/*
+ * Finish abort cleanup of connections on each of which we've sent an abort
+ * command or cancel request to the remote server.
+ */
+static void
+pgfdw_finish_abort_cleanup(List *pending_entries, List *cancel_requested,
+ bool toplevel)
+{
+ List *pending_deallocs = NIL;
+ ListCell *lc;
+
+ /*
+ * For each of the pending cancel requests (if any), get and discard the
+ * result of the query, and submit an abort command to the remote server.
+ */
+ if (cancel_requested)
+ {
+ foreach(lc, cancel_requested)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. You might think we should do this before issuing
+ * cancel request like in normal mode, but that is problematic,
+ * because if, for example, it took longer than 30 seconds to
+ * process the first few entries in the cancel_requested list, it
+ * would cause a timeout error when processing each of the
+ * remaining entries in the list, leading to slamming that entry's
+ * connection shut.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_cancel_query_end(entry->conn, endtime, true))
+ {
+ /* Unable to cancel running query */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ /* Send an abort command in parallel if needed */
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn, sql))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_entries = lappend(pending_entries, entry);
+ }
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_entries)
+ return;
+
+ /*
+ * Get the result of the abort command for each of the pending entries
+ */
+ foreach(lc, pending_entries)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+ char sql[100];
+
+ Assert(entry->changing_xact_state);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ CONSTRUCT_ABORT_COMMAND(sql, entry, toplevel);
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, sql, endtime,
+ true, false))
+ {
+ /* Unable to abort remote (sub)transaction */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+
+ if (toplevel)
+ {
+ /* Do a DEALLOCATE ALL in parallel if needed */
+ if (entry->have_prep_stmt && entry->have_error)
+ {
+ if (!pgfdw_exec_cleanup_query_begin(entry->conn,
+ "DEALLOCATE ALL"))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+ else
+ pending_deallocs = lappend(pending_deallocs, entry);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+ }
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+
+ /* No further work if no pending entries */
+ if (!pending_deallocs)
+ return;
+ Assert(toplevel);
+
+ /*
+ * Get the result of the DEALLOCATE command for each of the pending
+ * entries
+ */
+ foreach(lc, pending_deallocs)
+ {
+ ConnCacheEntry *entry = (ConnCacheEntry *) lfirst(lc);
+ TimestampTz endtime;
+
+ Assert(entry->changing_xact_state);
+ Assert(entry->have_prep_stmt);
+ Assert(entry->have_error);
+
+ /*
+ * Set end time. We do this now, not before issuing the command like
+ * in normal mode, for the same reason as for the cancel_requested
+ * entries.
+ */
+ endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+ CONNECTION_CLEANUP_TIMEOUT);
+
+ if (!pgfdw_exec_cleanup_query_end(entry->conn, "DEALLOCATE ALL",
+ endtime, true, true))
+ {
+ /* Trouble clearing prepared statements */
+ pgfdw_reset_xact_state(entry, toplevel);
+ continue;
+ }
+ entry->have_prep_stmt = false;
+ entry->have_error = false;
+
+ /* Reset the per-connection state if needed */
+ if (entry->state.pendingAreq)
+ memset(&entry->state, 0, sizeof(entry->state));
+
+ /* We're done with this entry; unset the changing_xact_state flag */
+ entry->changing_xact_state = false;
+ pgfdw_reset_xact_state(entry, toplevel);
+ }
+}
+
/*
* List active foreign server connections.
*
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index c0267a99d2..0271566fcf 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -11704,10 +11704,12 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
SERVER loopback OPTIONS (table_name 'ploc1');
@@ -11777,8 +11779,55 @@ SELECT * FROM prem2;
204 | quxqux
(3 rows)
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+ f1 | f2
+-----+--------
+ 101 | foo
+ 102 | foofoo
+ 104 | bazbaz
+(3 rows)
+
+SELECT * FROM prem2;
+ f1 | f2
+-----+--------
+ 201 | bar
+ 202 | barbar
+ 204 | quxqux
+(3 rows)
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
-- ===================================================================
-- test for ANALYZE sampling
-- ===================================================================
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 984e4d168a..ab00ca9df3 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -125,6 +125,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
strcmp(def->defname, "truncatable") == 0 ||
strcmp(def->defname, "async_capable") == 0 ||
strcmp(def->defname, "parallel_commit") == 0 ||
+ strcmp(def->defname, "parallel_abort") == 0 ||
strcmp(def->defname, "keep_connections") == 0)
{
/* these accept only boolean values */
@@ -271,6 +272,7 @@ InitPgFdwOptions(void)
{"async_capable", ForeignServerRelationId, false},
{"async_capable", ForeignTableRelationId, false},
{"parallel_commit", ForeignServerRelationId, false},
+ {"parallel_abort", ForeignServerRelationId, false},
{"keep_connections", ForeignServerRelationId, false},
{"password_required", UserMappingRelationId, false},
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index c37aa80383..8ef78fc190 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3894,10 +3894,12 @@ RESET postgres_fdw.application_name;
RESET debug_discard_caches;
-- ===================================================================
--- test parallel commit
+-- test parallel commit and parallel abort
-- ===================================================================
ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback OPTIONS (ADD parallel_abort 'true');
ALTER SERVER loopback2 OPTIONS (ADD parallel_commit 'true');
+ALTER SERVER loopback2 OPTIONS (ADD parallel_abort 'true');
CREATE TABLE ploc1 (f1 int, f2 text);
CREATE FOREIGN TABLE prem1 (f1 int, f2 text)
@@ -3936,8 +3938,29 @@ COMMIT;
SELECT * FROM prem1;
SELECT * FROM prem2;
+BEGIN;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
+BEGIN;
+SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ROLLBACK TO SAVEPOINT s;
+RELEASE SAVEPOINT s;
+INSERT INTO prem1 VALUES (105, 'test1');
+INSERT INTO prem2 VALUES (205, 'test2');
+ABORT;
+SELECT * FROM prem1;
+SELECT * FROM prem2;
+
ALTER SERVER loopback OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback OPTIONS (DROP parallel_abort);
ALTER SERVER loopback2 OPTIONS (DROP parallel_commit);
+ALTER SERVER loopback2 OPTIONS (DROP parallel_abort);
-- ===================================================================
-- test for ANALYZE sampling
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 78f2d7d8d5..8b84381c27 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -504,12 +504,13 @@ OPTIONS (ADD password_required 'false');
corresponding remote transactions, and subtransactions are managed by
creating corresponding remote subtransactions. When multiple remote
transactions are involved in the current local transaction, by default
- <filename>postgres_fdw</filename> commits those remote transactions
- serially when the local transaction is committed. When multiple remote
- subtransactions are involved in the current local subtransaction, by
- default <filename>postgres_fdw</filename> commits those remote
- subtransactions serially when the local subtransaction is committed.
- Performance can be improved with the following option:
+ <filename>postgres_fdw</filename> commits or aborts those remote
+ transactions serially when the local transaction is committed or aborted.
+ When multiple remote subtransactions are involved in the current local
+ subtransaction, by default <filename>postgres_fdw</filename> commits or
+ aborts those remote subtransactions serially when the local subtransaction
+ is committed or aborted.
+ Performance can be improved with the following options:
</para>
<variablelist>
@@ -525,24 +526,38 @@ OPTIONS (ADD password_required 'false');
specified for foreign servers, not per-table. The default is
<literal>false</literal>.
</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>parallel_abort</literal> (<type>boolean</type>)</term>
+ <listitem>
<para>
- If multiple foreign servers with this option enabled are involved in a
- local transaction, multiple remote transactions on those foreign
- servers are committed in parallel across those foreign servers when
- the local transaction is committed.
- </para>
-
- <para>
- When this option is enabled, a foreign server with many remote
- transactions may see a negative performance impact when the local
- transaction is committed.
+ This option controls whether <filename>postgres_fdw</filename> aborts
+ in parallel remote transactions opened on a foreign server in a local
+ transaction when the local transaction is aborted. This setting also
+ applies to remote and local subtransactions. This option can only be
+ specified for foreign servers, not per-table. The default is
+ <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
+ <para>
+ If multiple foreign servers with these options enabled are involved in a
+ local transaction, multiple remote transactions on those foreign servers
+ are committed or aborted in parallel across those foreign servers when
+ the local transaction is committed or aborted.
+ </para>
+
+ <para>
+ When these options are enabled, a foreign server with many remote
+ transactions may see a negative performance impact when the local
+ transaction is committed or aborted.
+ </para>
+
</sect3>
<sect3 id="postgres-fdw-options-updatability">
On Wed, Jan 18, 2023 at 8:06 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I rebased the patch. Attached is an updated patch.
The parallel-abort patch received a review from David, and I addressed
his comments. Also, he tested with the patch, and showed that it
reduces time taken to abort remote transactions. So, if there are no
objections, I will commit the patch.
Best regards,
Etsuro Fujita
On Tue, Apr 4, 2023 at 7:28 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
The parallel-abort patch received a review from David, and I addressed
his comments. Also, he tested with the patch, and showed that it
reduces time taken to abort remote transactions. So, if there are no
objections, I will commit the patch.
Pushed after adding/modifying comments a little bit.
Best regards,
Etsuro Fujita