Async execution of postgres_fdw.
Hello, this is the 2nd session of 'introducing parallelism using
postgres_fdw'.
The two attached patches are as follows:
- 0001-Async-exec-of-postgres_fdw.patch
Main patch, which contains all the functional changes.
- 0002-rename-PGConn-variable.patch
Renames the PGconn member of PgFdwConn from conn to pgconn for
readability. No functional change.
* Outline of this patch
After some consideration following the previous discussion and
comments from others, I judged that the original (WIP) patch was
overdone as a first step, so I reduced the patch to minimal
functionality. The new patch does the following:
- Wraps PGconn in PgFdwConn in order to handle multiple scans
on one connection.
- Adds the core async logic to fetch_more_data().
- Invokes remote commands asynchronously in ExecInitForeignScan.
- Cancels the async invocation if any other foreign scan,
modify, or delete uses the same connection.
Cancellation is done by immediately fetching the result of the
already-invoked async command.
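The connection-sharing rule above can be sketched as a small state
model. This is only an illustration, not the code in the patch: it
makes no libpq calls, send_fetch() and drain_async() are hypothetical
stand-ins for PQsendQuery() and PQgetResult(), and the
PGFDW_CONN_SYNC_RUNNING state is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified connection states; the patch also has a SYNC_RUNNING state. */
typedef enum { CONN_IDLE, CONN_ASYNC_RUNNING } ConnState;

typedef struct Scan
{
    int buffered_batches;   /* batches already pulled into local memory */
} Scan;

typedef struct Conn
{
    int       nscans;       /* scans currently registered on this connection */
    ConnState state;
    Scan     *async_scan;   /* owner of the in-flight request, if any */
} Conn;

/* Hypothetical stand-in for PQsendQuery(): issue a FETCH without waiting. */
void
send_fetch(Conn *conn, Scan *scan)
{
    assert(conn->state == CONN_IDLE);
    conn->async_scan = scan;
    conn->state = CONN_ASYNC_RUNNING;
}

/* Hypothetical stand-in for PQgetResult(): hand the pending result to its owner. */
void
drain_async(Conn *conn)
{
    if (conn->state != CONN_ASYNC_RUNNING)
        return;
    conn->async_scan->buffered_batches++;
    conn->async_scan = NULL;
    conn->state = CONN_IDLE;
}

/* Register a user of the connection; a second user "cancels" async mode. */
void
get_connection(Conn *conn)
{
    if (++conn->nscans > 1)
        drain_async(conn);
}

/* Returns 1 if the scan was started asynchronously, 0 otherwise. */
int
begin_scan(Conn *conn, Scan *scan)
{
    get_connection(conn);
    if (conn->nscans == 1)
    {
        send_fetch(conn, scan);
        return 1;
    }
    return 0;
}
```

In this model the first scan on a connection goes asynchronous, and a
second scan arriving on the same connection forces the pending result
to be fetched first, so the connection is idle again before anyone
else uses it.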
* Where this patch will be effective.
With the upcoming inheritance-partition feature, this patch enables
starting and running foreign scans asynchronously. It will be more
effective for longer turnaround times (TAT) or remote startup
times, and for a larger number of foreign servers. There is no
negative performance effect in other situations.
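As a rough way to see why longer remote startup times and more
servers help, here is a back-of-the-envelope model. It is a sketch
under strong assumptions: each foreign scan uses its own connection,
the remote startups overlap perfectly, and local processing cost is
ignored. The function names and any numbers fed to them are
hypothetical, not measurements.

```c
#include <assert.h>
#include <stddef.h>

/* Wait time when each remote scan starts only after the previous one is ready. */
double
sequential_wait(const double *startup_ms, size_t n)
{
    double total = 0.0;

    for (size_t i = 0; i < n; i++)
        total += startup_ms[i];
    return total;
}

/* Wait time when all remote scans are started up front and overlap ideally. */
double
overlapped_wait(const double *startup_ms, size_t n)
{
    double longest;

    assert(n > 0);
    longest = startup_ms[0];
    for (size_t i = 1; i < n; i++)
        if (startup_ms[i] > longest)
            longest = startup_ms[i];
    return longest;
}
```

Under this model the saving is the sum of the startup times minus the
largest one, so it grows with both the number of servers and the
per-server startup time.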
* Concerns about this patch.
- This breaks the assumption that a scan starts at ExecForeignScan,
not ExecInitForeignScan, which might cause problems.
- The error reporting code in do_sql_command is quite ugly.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Async-exec-of-postgres_fdw.patch (text/x-patch; charset=us-ascii)
From 4b56fcd0687172e3cccb329bc17e78935657f58f Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 28 Nov 2014 10:52:41 +0900
Subject: [PATCH 1/2] Async exec of postgres_fdw.
---
contrib/postgres_fdw/connection.c | 102 ++++++++++++-------
contrib/postgres_fdw/postgres_fdw.c | 191 ++++++++++++++++++++++++++++--------
contrib/postgres_fdw/postgres_fdw.h | 28 +++++-
3 files changed, 242 insertions(+), 79 deletions(-)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 116be7d..8b1c738 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -44,7 +44,7 @@ typedef struct ConnCacheKey
typedef struct ConnCacheEntry
{
ConnCacheKey key; /* hash key (must be first) */
- PGconn *conn; /* connection to foreign server, or NULL */
+ PgFdwConn *conn; /* connection to foreign server, or NULL */
int xact_depth; /* 0 = no xact open, 1 = main xact open, 2 =
* one level of subxact open, etc */
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
@@ -93,7 +93,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
* be useful and not mere pedantry. We could not flush any active connections
* mid-transaction anyway.
*/
-PGconn *
+PgFdwConn *
GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt)
{
@@ -160,16 +160,36 @@ GetConnection(ForeignServer *server, UserMapping *user,
entry->xact_depth = 0; /* just to be sure */
entry->have_prep_stmt = false;
entry->have_error = false;
- entry->conn = connect_pg_server(server, user);
+
+ /* This should be in the same memory context as the hashtable */
+ entry->conn =
+ (PgFdwConn *) MemoryContextAllocZero(CacheMemoryContext,
+ sizeof(PgFdwConn));
+
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ entry->conn->conn, server->servername);
}
+ if (entry->conn->conn == NULL)
+ {
+ entry->conn->conn = connect_pg_server(server, user);
+ entry->conn->nscans = 0;
+ entry->conn->async_state = PGFDW_CONN_IDLE;
+ entry->conn->async_scan = NULL;
+ }
/*
* Start a new transaction or subtransaction if needed.
*/
begin_remote_xact(entry);
+ /*
+ * Cancel async query if there's another foreign scan node sharing this
+ * connection.
+ */
+ if (++entry->conn->nscans > 1 &&
+ entry->conn->async_state == PGFDW_CONN_ASYNC_RUNNING)
+ fetch_more_data(entry->conn->async_scan);
+
/* Remember if caller will prepare statements */
entry->have_prep_stmt |= will_prep_stmt;
@@ -182,7 +202,7 @@ GetConnection(ForeignServer *server, UserMapping *user,
static PGconn *
connect_pg_server(ForeignServer *server, UserMapping *user)
{
- PGconn *volatile conn = NULL;
+ PGconn *volatile conn = NULL;
/*
* Use PG_TRY block to ensure closing connection on error.
@@ -355,7 +375,12 @@ do_sql_command(PGconn *conn, const char *sql)
res = PQexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
- pgfdw_report_error(ERROR, res, conn, true, sql);
+ {
+ PgFdwConn tmpfdwconn;
+
+ tmpfdwconn.conn = conn;
+ pgfdw_report_error(ERROR, res, &tmpfdwconn, true, sql);
+ }
PQclear(res);
}
@@ -380,13 +405,13 @@ begin_remote_xact(ConnCacheEntry *entry)
const char *sql;
elog(DEBUG3, "starting remote transaction on connection %p",
- entry->conn);
+ &entry->conn);
if (IsolationIsSerializable())
sql = "START TRANSACTION ISOLATION LEVEL SERIALIZABLE";
else
sql = "START TRANSACTION ISOLATION LEVEL REPEATABLE READ";
- do_sql_command(entry->conn, sql);
+ do_sql_command(entry->conn->conn, sql);
entry->xact_depth = 1;
}
@@ -400,7 +425,7 @@ begin_remote_xact(ConnCacheEntry *entry)
char sql[64];
snprintf(sql, sizeof(sql), "SAVEPOINT s%d", entry->xact_depth + 1);
- do_sql_command(entry->conn, sql);
+ do_sql_command(entry->conn->conn, sql);
entry->xact_depth++;
}
}
@@ -409,13 +434,13 @@ begin_remote_xact(ConnCacheEntry *entry)
* Release connection reference count created by calling GetConnection.
*/
void
-ReleaseConnection(PGconn *conn)
+ReleaseConnection(PgFdwConn *conn)
{
- /*
- * Currently, we don't actually track connection references because all
- * cleanup is managed on a transaction or subtransaction basis instead. So
- * there's nothing to do here.
- */
+ if (--conn->nscans == 0)
+ {
+ if (conn->async_scan)
+ finish_async_connection(conn->async_scan);
+ }
}
/*
@@ -430,7 +455,7 @@ ReleaseConnection(PGconn *conn)
* collisions are highly improbable; just be sure to use %u not %d to print.
*/
unsigned int
-GetCursorNumber(PGconn *conn)
+GetCursorNumber(PgFdwConn *conn)
{
return ++cursor_number;
}
@@ -444,7 +469,7 @@ GetCursorNumber(PGconn *conn)
* increasing the risk of prepared-statement name collisions by resetting.
*/
unsigned int
-GetPrepStmtNumber(PGconn *conn)
+GetPrepStmtNumber(PgFdwConn *conn)
{
return ++prep_stmt_number;
}
@@ -463,7 +488,7 @@ GetPrepStmtNumber(PGconn *conn)
* marked with have_error = true.
*/
void
-pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql)
{
/* If requested, PGresult must be released before leaving this function. */
@@ -491,7 +516,7 @@ pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
* return NULL, not a PGresult at all.
*/
if (message_primary == NULL)
- message_primary = PQerrorMessage(conn);
+ message_primary = PQerrorMessage(conn->conn);
ereport(elevel,
(errcode(sqlstate),
@@ -536,20 +561,20 @@ pgfdw_xact_callback(XactEvent event, void *arg)
PGresult *res;
/* Ignore cache entry if no open connection right now */
- if (entry->conn == NULL)
+ if (entry->conn->conn == NULL)
continue;
/* If it has an open remote transaction, try to close it */
if (entry->xact_depth > 0)
{
elog(DEBUG3, "closing remote transaction on connection %p",
- entry->conn);
+ entry->conn->conn);
switch (event)
{
case XACT_EVENT_PRE_COMMIT:
/* Commit all remote transactions during pre-commit */
- do_sql_command(entry->conn, "COMMIT TRANSACTION");
+ do_sql_command(entry->conn->conn, "COMMIT TRANSACTION");
/*
* If there were any errors in subtransactions, and we
@@ -568,7 +593,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
*/
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PQexec(entry->conn->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -598,7 +623,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Assume we might have lost track of prepared statements */
entry->have_error = true;
/* If we're aborting, abort all remote transactions too */
- res = PQexec(entry->conn, "ABORT TRANSACTION");
+ res = PQexec(entry->conn->conn, "ABORT TRANSACTION");
/* Note: can't throw ERROR, it would be infinite loop */
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true,
@@ -609,7 +634,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* As above, make sure to clear any prepared stmts */
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PQexec(entry->conn->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -621,17 +646,19 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Reset state to show we're out of a transaction */
entry->xact_depth = 0;
-
+ entry->conn->nscans = 0;
+ entry->conn->async_state = PGFDW_CONN_IDLE;
+ entry->conn->async_scan = NULL;
/*
* If the connection isn't in a good idle state, discard it to
* recover. Next GetConnection will open a new connection.
*/
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
+ if (PQstatus(entry->conn->conn) != CONNECTION_OK ||
+ PQtransactionStatus(entry->conn->conn) != PQTRANS_IDLE)
{
- elog(DEBUG3, "discarding connection %p", entry->conn);
- PQfinish(entry->conn);
- entry->conn = NULL;
+ elog(DEBUG3, "discarding connection %p", entry->conn->conn);
+ PQfinish(entry->conn->conn);
+ entry->conn->conn = NULL;
}
}
@@ -677,11 +704,18 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
PGresult *res;
char sql[100];
+ /* Shut down asynchronous scan if running */
+ if (entry->conn->async_scan && PQisBusy(entry->conn->conn))
+ PQconsumeInput(entry->conn->conn);
+ entry->conn->async_scan = NULL;
+ entry->conn->async_state = PGFDW_CONN_IDLE;
+ entry->conn->nscans = 0;
+
/*
* We only care about connections with open remote subtransactions of
* the current level.
*/
- if (entry->conn == NULL || entry->xact_depth < curlevel)
+ if (entry->conn->conn == NULL || entry->xact_depth < curlevel)
continue;
if (entry->xact_depth > curlevel)
@@ -692,7 +726,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
{
/* Commit all remote subtransactions during pre-commit */
snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
- do_sql_command(entry->conn, sql);
+ do_sql_command(entry->conn->conn, sql);
}
else
{
@@ -702,7 +736,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
snprintf(sql, sizeof(sql),
"ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
curlevel, curlevel);
- res = PQexec(entry->conn, sql);
+ res = PQexec(entry->conn->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true, sql);
else
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index c3039a6..b912091 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -136,7 +136,7 @@ typedef struct PgFdwScanState
List *retrieved_attrs; /* list of retrieved attribute numbers */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
unsigned int cursor_number; /* quasi-unique ID for my cursor */
bool cursor_exists; /* have we created the cursor? */
int numParams; /* number of parameters passed to query */
@@ -156,6 +156,7 @@ typedef struct PgFdwScanState
/* working memory contexts */
MemoryContext batch_cxt; /* context holding current batch of tuples */
MemoryContext temp_cxt; /* context for per-tuple temporary data */
+ ExprContext *econtext; /* copy of ps_ExprContext of ForeignScanState */
} PgFdwScanState;
/*
@@ -167,7 +168,7 @@ typedef struct PgFdwModifyState
AttInMetadata *attinmeta; /* attribute datatype conversion metadata */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
char *p_name; /* name of prepared statement, if created */
/* extracted fdw_private data */
@@ -298,7 +299,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
double *p_rows, int *p_width,
Cost *p_startup_cost, Cost *p_total_cost);
static void get_remote_estimate(const char *sql,
- PGconn *conn,
+ PgFdwConn *conn,
double *rows,
int *width,
Cost *startup_cost,
@@ -306,9 +307,8 @@ static void get_remote_estimate(const char *sql,
static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
EquivalenceClass *ec, EquivalenceMember *em,
void *arg);
-static void create_cursor(ForeignScanState *node);
-static void fetch_more_data(ForeignScanState *node);
-static void close_cursor(PGconn *conn, unsigned int cursor_number);
+static void create_cursor(PgFdwScanState *node);
+static void close_cursor(PgFdwConn *conn, unsigned int cursor_number);
static void prepare_foreign_modify(PgFdwModifyState *fmstate);
static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
ItemPointer tupleid,
@@ -329,7 +329,6 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
MemoryContext temp_context);
static void conversion_error_callback(void *arg);
-
/*
* Foreign-data wrapper handler function: return a struct with pointers
* to my callback routines.
@@ -982,6 +981,15 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
fsstate->param_values = (const char **) palloc0(numParams * sizeof(char *));
else
fsstate->param_values = NULL;
+
+ fsstate->econtext = node->ss.ps.ps_ExprContext;
+
+ /*
+ * Start scanning asynchronously if it is the first scan on this
+ * connection.
+ */
+ if (fsstate->conn->nscans == 1)
+ create_cursor(fsstate);
}
/*
@@ -1000,7 +1008,7 @@ postgresIterateForeignScan(ForeignScanState *node)
* cursor on the remote side.
*/
if (!fsstate->cursor_exists)
- create_cursor(node);
+ create_cursor(fsstate);
/*
* Get some more tuples, if we've run out.
@@ -1009,7 +1017,7 @@ postgresIterateForeignScan(ForeignScanState *node)
{
/* No point in another fetch if we already detected EOF, though. */
if (!fsstate->eof_reached)
- fetch_more_data(node);
+ fetch_more_data(fsstate);
/* If we didn't get any tuples, must be end of data. */
if (fsstate->next_tuple >= fsstate->num_tuples)
return ExecClearTuple(slot);
@@ -1069,7 +1077,7 @@ postgresReScanForeignScan(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fsstate->conn, sql);
+ res = PQexec(fsstate->conn->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fsstate->conn, true, sql);
PQclear(res);
@@ -1398,7 +1406,7 @@ postgresExecForeignInsert(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
+ res = PQexecPrepared(fmstate->conn->conn,
fmstate->p_name,
fmstate->p_nums,
p_values,
@@ -1468,7 +1476,7 @@ postgresExecForeignUpdate(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
+ res = PQexecPrepared(fmstate->conn->conn,
fmstate->p_name,
fmstate->p_nums,
p_values,
@@ -1538,7 +1546,7 @@ postgresExecForeignDelete(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
+ res = PQexecPrepared(fmstate->conn->conn,
fmstate->p_name,
fmstate->p_nums,
p_values,
@@ -1594,7 +1602,7 @@ postgresEndForeignModify(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fmstate->conn, sql);
+ res = PQexec(fmstate->conn->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, sql);
PQclear(res);
@@ -1726,7 +1734,7 @@ estimate_path_cost_size(PlannerInfo *root,
List *local_join_conds;
StringInfoData sql;
List *retrieved_attrs;
- PGconn *conn;
+ PgFdwConn *conn;
Selectivity local_sel;
QualCost local_cost;
@@ -1836,7 +1844,7 @@ estimate_path_cost_size(PlannerInfo *root,
* The given "sql" must be an EXPLAIN command.
*/
static void
-get_remote_estimate(const char *sql, PGconn *conn,
+get_remote_estimate(const char *sql, PgFdwConn *conn,
double *rows, int *width,
Cost *startup_cost, Cost *total_cost)
{
@@ -1852,7 +1860,7 @@ get_remote_estimate(const char *sql, PGconn *conn,
/*
* Execute EXPLAIN remotely.
*/
- res = PQexec(conn, sql);
+ res = PQexec(conn->conn, sql);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql);
@@ -1917,13 +1925,12 @@ ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
* Create cursor for node's query with current parameter values.
*/
static void
-create_cursor(ForeignScanState *node)
+create_cursor(PgFdwScanState *fsstate)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
- ExprContext *econtext = node->ss.ps.ps_ExprContext;
+ ExprContext *econtext = fsstate->econtext;
int numParams = fsstate->numParams;
const char **values = fsstate->param_values;
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
StringInfoData buf;
PGresult *res;
@@ -1985,7 +1992,7 @@ create_cursor(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecParams(conn, buf.data, numParams, NULL, values,
+ res = PQexecParams(conn->conn, buf.data, numParams, NULL, values,
NULL, NULL, 0);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, fsstate->query);
@@ -2001,15 +2008,18 @@ create_cursor(ForeignScanState *node)
/* Clean up */
pfree(buf.data);
+
+ /* Start async scan if this is the first scan */
+ if (fsstate->conn->nscans == 1)
+ fetch_more_data(fsstate);
}
/*
* Fetch some more rows from the node's cursor.
*/
-static void
-fetch_more_data(ForeignScanState *node)
+void
+fetch_more_data(PgFdwScanState *fsstate)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
PGresult *volatile res = NULL;
MemoryContext oldcontext;
@@ -2024,7 +2034,7 @@ fetch_more_data(ForeignScanState *node)
/* PGresult must be released before leaving this function. */
PG_TRY();
{
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
char sql[64];
int fetch_size;
int numrows;
@@ -2036,9 +2046,63 @@ fetch_more_data(ForeignScanState *node)
snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
fetch_size, fsstate->cursor_number);
- res = PQexec(conn, sql);
+ switch (conn->async_state)
+ {
+ case PGFDW_CONN_IDLE:
+ Assert(conn->async_scan == NULL);
+
+ if (conn->nscans == 1)
+ {
+ conn->async_scan = fsstate;
+
+ if (!PQsendQuery(conn->conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false,
+ fsstate->query);
+
+ conn->async_state = PGFDW_CONN_ASYNC_RUNNING;
+ goto end_of_fetch;
+ }
+
+ /* Synchronous query execution */
+ conn->async_state = PGFDW_CONN_SYNC_RUNNING;
+ res = PQexec(conn->conn, sql);
+ break;
+
+ case PGFDW_CONN_ASYNC_RUNNING:
+ Assert(conn->async_scan != NULL);
+
+ res = PQgetResult(conn->conn);
+ if (PQntuples(res) == fetch_size)
+ {
+ /*
+ * Connection state doesn't go to IDLE even if all data
+ * has been sent to client for asynchronous query. One
+ * more PQgetResult() is needed to reset the state to
+ * IDLE. See PQexecFinish() for details.
+ */
+ if (PQgetResult(conn->conn) != NULL)
+ elog(ERROR, "Connection status error.");
+ }
+
+ if (conn->nscans == 1)
+ break;
+
+ /*
+ * If nscans is more than 1, stop invoking command asynchronously
+ * for multiple scans on this connection. If nscans is zero, async
+ * command on this connection should be finished immediately.
+ */
+ conn->async_state = PGFDW_CONN_SYNC_RUNNING;
+ break;
+
+ default:
+ elog(ERROR, "unexpected async state : %d", conn->async_state);
+ break;
+
+ }
+
/* On error, report the original query, not the FETCH. */
- if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ if (res && PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
/* Convert the data into HeapTuples */
@@ -2066,6 +2130,36 @@ fetch_more_data(ForeignScanState *node)
PQclear(res);
res = NULL;
+
+ switch(conn->async_state)
+ {
+ case PGFDW_CONN_ASYNC_RUNNING:
+ if (!fsstate->eof_reached)
+ {
+ /*
+ * We can immediately request the next bunch of tuples if
+ * we're on asynchronous connection.
+ */
+ if (!PQsendQuery(conn->conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
+ }
+ else
+ conn->async_state = PGFDW_CONN_IDLE;
+ break;
+
+
+ case PGFDW_CONN_SYNC_RUNNING:
+ conn->async_state = PGFDW_CONN_IDLE;
+ conn->async_scan = NULL;
+ break;
+
+ default:
+ elog(ERROR, "Unexpected async state: %d", conn->async_state);
+ break;
+ }
+
+end_of_fetch:
+ ; /* Nothing to do here but needed to make compiler quiet. */
}
PG_CATCH();
{
@@ -2079,6 +2173,23 @@ fetch_more_data(ForeignScanState *node)
}
/*
+ * Force cancelling async command state.
+ */
+void
+finish_async_connection(PgFdwScanState *fsstate)
+{
+ /* Finish async command if any */
+ if (fsstate->conn->async_state == PGFDW_CONN_ASYNC_RUNNING)
+ fetch_more_data(fsstate->conn->async_scan);
+ fsstate->conn->async_scan = NULL;
+ Assert(fsstate->conn->async_state == PGFDW_CONN_IDLE);
+
+ /* Immediately discard the result */
+ fsstate->next_tuple = 0;
+ fsstate->num_tuples = 0;
+}
+
+/*
* Force assorted GUC parameters to settings that ensure that we'll output
* data values in a form that is unambiguous to the remote server.
*
@@ -2132,7 +2243,7 @@ reset_transmission_modes(int nestlevel)
* Utility routine to close a cursor.
*/
static void
-close_cursor(PGconn *conn, unsigned int cursor_number)
+close_cursor(PgFdwConn *conn, unsigned int cursor_number)
{
char sql[64];
PGresult *res;
@@ -2143,7 +2254,7 @@ close_cursor(PGconn *conn, unsigned int cursor_number)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(conn, sql);
+ res = PQexec(conn->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -2175,7 +2286,7 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQprepare(fmstate->conn,
+ res = PQprepare(fmstate->conn->conn,
p_name,
fmstate->query,
0,
@@ -2297,7 +2408,7 @@ postgresAnalyzeForeignTable(Relation relation,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2329,7 +2440,7 @@ postgresAnalyzeForeignTable(Relation relation,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PQexec(conn->conn, sql.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2379,7 +2490,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
unsigned int cursor_number;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2423,7 +2534,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PQexec(conn->conn, sql.data);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
PQclear(res);
@@ -2453,7 +2564,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
snprintf(fetch_sql, sizeof(fetch_sql), "FETCH %d FROM c%u",
fetch_size, cursor_number);
- res = PQexec(conn, fetch_sql);
+ res = PQexec(conn->conn, fetch_sql);
/* On error, report the original query, not the FETCH. */
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2582,7 +2693,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
bool import_not_null = true;
ForeignServer *server;
UserMapping *mapping;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData buf;
PGresult *volatile res = NULL;
int numrows,
@@ -2615,7 +2726,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
conn = GetConnection(server, mapping, false);
/* Don't attempt to import collation if remote server hasn't got it */
- if (PQserverVersion(conn) < 90100)
+ if (PQserverVersion(conn->conn) < 90100)
import_collate = false;
/* Create workspace for strings */
@@ -2628,7 +2739,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfoString(&buf, "SELECT 1 FROM pg_catalog.pg_namespace WHERE nspname = ");
deparseStringLiteral(&buf, stmt->remote_schema);
- res = PQexec(conn, buf.data);
+ res = PQexec(conn->conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
@@ -2723,7 +2834,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfo(&buf, " ORDER BY c.relname, a.attnum");
/* Fetch the data */
- res = PQexec(conn, buf.data);
+ res = PQexec(conn->conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 0382c55..2472451 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -20,17 +20,35 @@
#include "libpq-fe.h"
+typedef enum PgFdwConnState
+{
+ PGFDW_CONN_IDLE,
+ PGFDW_CONN_ASYNC_RUNNING,
+ PGFDW_CONN_SYNC_RUNNING
+} PgFdwConnState;
+
+typedef struct PgFdwConn
+{
+ PGconn *conn;
+ int nscans;
+ PgFdwConnState async_state;
+ struct PgFdwScanState *async_scan;
+} PgFdwConn;
+
+
/* in postgres_fdw.c */
extern int set_transmission_modes(void);
extern void reset_transmission_modes(int nestlevel);
+extern void fetch_more_data(struct PgFdwScanState *node);
+extern void finish_async_connection(struct PgFdwScanState *fsstate);
/* in connection.c */
-extern PGconn *GetConnection(ForeignServer *server, UserMapping *user,
+extern PgFdwConn *GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt);
-extern void ReleaseConnection(PGconn *conn);
-extern unsigned int GetCursorNumber(PGconn *conn);
-extern unsigned int GetPrepStmtNumber(PGconn *conn);
-extern void pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+extern void ReleaseConnection(PgFdwConn *conn);
+extern unsigned int GetCursorNumber(PgFdwConn *conn);
+extern unsigned int GetPrepStmtNumber(PgFdwConn *conn);
+extern void pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql);
/* in option.c */
--
2.1.0.GIT
0002-rename-PGConn-variable.patch (text/x-patch; charset=us-ascii)
From 2d76622c655294b2e6b54fab606ab9c1501c17f0 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 4 Dec 2014 16:48:22 +0900
Subject: [PATCH 2/2] rename PGConn variable
---
contrib/postgres_fdw/connection.c | 46 ++++++++++++++++++-------------------
contrib/postgres_fdw/postgres_fdw.c | 40 ++++++++++++++++----------------
contrib/postgres_fdw/postgres_fdw.h | 2 +-
3 files changed, 44 insertions(+), 44 deletions(-)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 8b1c738..3d5c8dc 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -167,12 +167,12 @@ GetConnection(ForeignServer *server, UserMapping *user,
sizeof(PgFdwConn));
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn->conn, server->servername);
+ entry->conn->pgconn, server->servername);
}
- if (entry->conn->conn == NULL)
+ if (entry->conn->pgconn == NULL)
{
- entry->conn->conn = connect_pg_server(server, user);
+ entry->conn->pgconn = connect_pg_server(server, user);
entry->conn->nscans = 0;
entry->conn->async_state = PGFDW_CONN_IDLE;
entry->conn->async_scan = NULL;
@@ -378,7 +378,7 @@ do_sql_command(PGconn *conn, const char *sql)
{
PgFdwConn tmpfdwconn;
- tmpfdwconn.conn = conn;
+ tmpfdwconn.pgconn = conn;
pgfdw_report_error(ERROR, res, &tmpfdwconn, true, sql);
}
PQclear(res);
@@ -411,7 +411,7 @@ begin_remote_xact(ConnCacheEntry *entry)
sql = "START TRANSACTION ISOLATION LEVEL SERIALIZABLE";
else
sql = "START TRANSACTION ISOLATION LEVEL REPEATABLE READ";
- do_sql_command(entry->conn->conn, sql);
+ do_sql_command(entry->conn->pgconn, sql);
entry->xact_depth = 1;
}
@@ -425,7 +425,7 @@ begin_remote_xact(ConnCacheEntry *entry)
char sql[64];
snprintf(sql, sizeof(sql), "SAVEPOINT s%d", entry->xact_depth + 1);
- do_sql_command(entry->conn->conn, sql);
+ do_sql_command(entry->conn->pgconn, sql);
entry->xact_depth++;
}
}
@@ -516,7 +516,7 @@ pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
* return NULL, not a PGresult at all.
*/
if (message_primary == NULL)
- message_primary = PQerrorMessage(conn->conn);
+ message_primary = PQerrorMessage(conn->pgconn);
ereport(elevel,
(errcode(sqlstate),
@@ -561,20 +561,20 @@ pgfdw_xact_callback(XactEvent event, void *arg)
PGresult *res;
/* Ignore cache entry if no open connection right now */
- if (entry->conn->conn == NULL)
+ if (entry->conn->pgconn == NULL)
continue;
/* If it has an open remote transaction, try to close it */
if (entry->xact_depth > 0)
{
elog(DEBUG3, "closing remote transaction on connection %p",
- entry->conn->conn);
+ entry->conn->pgconn);
switch (event)
{
case XACT_EVENT_PRE_COMMIT:
/* Commit all remote transactions during pre-commit */
- do_sql_command(entry->conn->conn, "COMMIT TRANSACTION");
+ do_sql_command(entry->conn->pgconn, "COMMIT TRANSACTION");
/*
* If there were any errors in subtransactions, and we
@@ -593,7 +593,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
*/
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn->conn, "DEALLOCATE ALL");
+ res = PQexec(entry->conn->pgconn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -623,7 +623,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Assume we might have lost track of prepared statements */
entry->have_error = true;
/* If we're aborting, abort all remote transactions too */
- res = PQexec(entry->conn->conn, "ABORT TRANSACTION");
+ res = PQexec(entry->conn->pgconn, "ABORT TRANSACTION");
/* Note: can't throw ERROR, it would be infinite loop */
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true,
@@ -634,7 +634,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* As above, make sure to clear any prepared stmts */
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn->conn, "DEALLOCATE ALL");
+ res = PQexec(entry->conn->pgconn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -653,12 +653,12 @@ pgfdw_xact_callback(XactEvent event, void *arg)
* If the connection isn't in a good idle state, discard it to
* recover. Next GetConnection will open a new connection.
*/
- if (PQstatus(entry->conn->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn->conn) != PQTRANS_IDLE)
+ if (PQstatus(entry->conn->pgconn) != CONNECTION_OK ||
+ PQtransactionStatus(entry->conn->pgconn) != PQTRANS_IDLE)
{
- elog(DEBUG3, "discarding connection %p", entry->conn->conn);
- PQfinish(entry->conn->conn);
- entry->conn->conn = NULL;
+ elog(DEBUG3, "discarding connection %p", entry->conn->pgconn);
+ PQfinish(entry->conn->pgconn);
+ entry->conn->pgconn = NULL;
}
}
@@ -705,8 +705,8 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
char sql[100];
/* Shut down asynchronous scan if running */
- if (entry->conn->async_scan && PQisBusy(entry->conn->conn))
- PQconsumeInput(entry->conn->conn);
+ if (entry->conn->async_scan && PQisBusy(entry->conn->pgconn))
+ PQconsumeInput(entry->conn->pgconn);
entry->conn->async_scan = NULL;
entry->conn->async_state = PGFDW_CONN_IDLE;
entry->conn->nscans = 0;
@@ -715,7 +715,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
* We only care about connections with open remote subtransactions of
* the current level.
*/
- if (entry->conn->conn == NULL || entry->xact_depth < curlevel)
+ if (entry->conn->pgconn == NULL || entry->xact_depth < curlevel)
continue;
if (entry->xact_depth > curlevel)
@@ -726,7 +726,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
{
/* Commit all remote subtransactions during pre-commit */
snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);
- do_sql_command(entry->conn->conn, sql);
+ do_sql_command(entry->conn->pgconn, sql);
}
else
{
@@ -736,7 +736,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
snprintf(sql, sizeof(sql),
"ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
curlevel, curlevel);
- res = PQexec(entry->conn->conn, sql);
+ res = PQexec(entry->conn->pgconn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true, sql);
else
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index b912091..e82ec82 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1077,7 +1077,7 @@ postgresReScanForeignScan(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fsstate->conn->conn, sql);
+ res = PQexec(fsstate->conn->pgconn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fsstate->conn, true, sql);
PQclear(res);
@@ -1406,7 +1406,7 @@ postgresExecForeignInsert(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn->conn,
+ res = PQexecPrepared(fmstate->conn->pgconn,
fmstate->p_name,
fmstate->p_nums,
p_values,
@@ -1476,7 +1476,7 @@ postgresExecForeignUpdate(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn->conn,
+ res = PQexecPrepared(fmstate->conn->pgconn,
fmstate->p_name,
fmstate->p_nums,
p_values,
@@ -1546,7 +1546,7 @@ postgresExecForeignDelete(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn->conn,
+ res = PQexecPrepared(fmstate->conn->pgconn,
fmstate->p_name,
fmstate->p_nums,
p_values,
@@ -1602,7 +1602,7 @@ postgresEndForeignModify(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fmstate->conn->conn, sql);
+ res = PQexec(fmstate->conn->pgconn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, sql);
PQclear(res);
@@ -1860,7 +1860,7 @@ get_remote_estimate(const char *sql, PgFdwConn *conn,
/*
* Execute EXPLAIN remotely.
*/
- res = PQexec(conn->conn, sql);
+ res = PQexec(conn->pgconn, sql);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql);
@@ -1992,7 +1992,7 @@ create_cursor(PgFdwScanState *fsstate)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecParams(conn->conn, buf.data, numParams, NULL, values,
+ res = PQexecParams(conn->pgconn, buf.data, numParams, NULL, values,
NULL, NULL, 0);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, fsstate->query);
@@ -2055,7 +2055,7 @@ fetch_more_data(PgFdwScanState *fsstate)
{
conn->async_scan = fsstate;
- if (!PQsendQuery(conn->conn, sql))
+ if (!PQsendQuery(conn->pgconn, sql))
pgfdw_report_error(ERROR, res, conn, false,
fsstate->query);
@@ -2065,13 +2065,13 @@ fetch_more_data(PgFdwScanState *fsstate)
/* Synchronous query execution */
conn->async_state = PGFDW_CONN_SYNC_RUNNING;
- res = PQexec(conn->conn, sql);
+ res = PQexec(conn->pgconn, sql);
break;
case PGFDW_CONN_ASYNC_RUNNING:
Assert(conn->async_scan != NULL);
- res = PQgetResult(conn->conn);
+ res = PQgetResult(conn->pgconn);
if (PQntuples(res) == fetch_size)
{
/*
@@ -2080,7 +2080,7 @@ fetch_more_data(PgFdwScanState *fsstate)
* more PQgetResult() is needed to reset the state to
* IDLE. See PQexecFinish() for details.
*/
- if (PQgetResult(conn->conn) != NULL)
+ if (PQgetResult(conn->pgconn) != NULL)
elog(ERROR, "Connection status error.");
}
@@ -2140,7 +2140,7 @@ fetch_more_data(PgFdwScanState *fsstate)
* We can immediately request the next bunch of tuples if
* we're on asynchronous connection.
*/
- if (!PQsendQuery(conn->conn, sql))
+ if (!PQsendQuery(conn->pgconn, sql))
pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
}
else
@@ -2254,7 +2254,7 @@ close_cursor(PgFdwConn *conn, unsigned int cursor_number)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(conn->conn, sql);
+ res = PQexec(conn->pgconn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -2286,7 +2286,7 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQprepare(fmstate->conn->conn,
+ res = PQprepare(fmstate->conn->pgconn,
p_name,
fmstate->query,
0,
@@ -2440,7 +2440,7 @@ postgresAnalyzeForeignTable(Relation relation,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn->conn, sql.data);
+ res = PQexec(conn->pgconn, sql.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2534,7 +2534,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn->conn, sql.data);
+ res = PQexec(conn->pgconn, sql.data);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
PQclear(res);
@@ -2564,7 +2564,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
snprintf(fetch_sql, sizeof(fetch_sql), "FETCH %d FROM c%u",
fetch_size, cursor_number);
- res = PQexec(conn->conn, fetch_sql);
+ res = PQexec(conn->pgconn, fetch_sql);
/* On error, report the original query, not the FETCH. */
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2726,7 +2726,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
conn = GetConnection(server, mapping, false);
/* Don't attempt to import collation if remote server hasn't got it */
- if (PQserverVersion(conn->conn) < 90100)
+ if (PQserverVersion(conn->pgconn) < 90100)
import_collate = false;
/* Create workspace for strings */
@@ -2739,7 +2739,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfoString(&buf, "SELECT 1 FROM pg_catalog.pg_namespace WHERE nspname = ");
deparseStringLiteral(&buf, stmt->remote_schema);
- res = PQexec(conn->conn, buf.data);
+ res = PQexec(conn->pgconn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
@@ -2834,7 +2834,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfo(&buf, " ORDER BY c.relname, a.attnum");
/* Fetch the data */
- res = PQexec(conn->conn, buf.data);
+ res = PQexec(conn->pgconn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 2472451..5dfc04a 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -29,7 +29,7 @@ typedef enum PgFdwConnState
typedef struct PgFdwConn
{
- PGconn *conn;
+ PGconn *pgconn;
int nscans;
PgFdwConnState async_state;
struct PgFdwScanState *async_scan;
--
2.1.0.GIT
Hi Horiguchi-san,
Here are my comments on the two patches together.
Sanity
1. The patch applies cleanly but has trailing white spaces.
[ashutosh@ubuntu coderoot]git apply
/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch
/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:41: trailing whitespace.
entry->conn =
/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:44: trailing whitespace.
/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:611: trailing
whitespace.
warning: 3 lines add whitespace errors.
2. The patches compile cleanly.
3. The regression is clean, even in contrib/postgres_fdw and
contrib/file_fdw
Tests
-------
We need tests to make sure that the logic remains intact even after further
changes in this area. A couple of tests that run multiple foreign scans
within the same query, each fetching more rows than the fetch size (100), would be
required. Also, some DMLs that involve multiple foreign scans would test
the sanity when UPDATE/DELETE interleave with such scans. By multiple foreign
scans I mean both multiple scans on a single foreign server and multiple
scans spread across multiple foreign servers.
Code
-------
Because the previous "conn" is now replaced by "conn->pgconn", the double
indirection makes the code a bit ugly and prone to segfaults (conn being
NULL or an invalid pointer). Can we minimize such code or wrap it in a
macro?
We need some comments on the definition of PgFdwConn explaining the
purpose of the structure and of each of its members.
The same applies to enum PgFdwConnState. In fact, the state diagram of a
connection has become more complicated with the async connections, so it
might be better to explain that state diagram in one place in the code
(through comments). The definition of the enum might be a good place to do
that. Otherwise, the logic of connection maintenance is spread over multiple
places and is difficult to understand by looking at the code.
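As an illustration of the kind of state diagram being asked for, here is a minimal sketch. The three state names come from the patch; the helper functions and the transition rules are only one simplified reading of the thread (async only while a single scan uses the connection, fall back to synchronous once it is shared, reset to idle at (sub)transaction cleanup) and are not taken from the patch itself.

```c
#include <assert.h>

/* State names are from the patch; everything else here is hypothetical. */
typedef enum PgFdwConnState
{
	PGFDW_CONN_IDLE,			/* no command in flight */
	PGFDW_CONN_ASYNC_RUNNING,	/* FETCH sent, result not yet collected */
	PGFDW_CONN_SYNC_RUNNING		/* connection shared, plain blocking calls */
} PgFdwConnState;

/* A scan asks for rows: go async only if it is the sole user. */
static PgFdwConnState
state_on_fetch(PgFdwConnState cur, int nscans)
{
	if (cur != PGFDW_CONN_IDLE)
		return cur;
	return (nscans == 1) ? PGFDW_CONN_ASYNC_RUNNING : PGFDW_CONN_SYNC_RUNNING;
}

/* Another scan or modify claims the connection: an async run is finished. */
static PgFdwConnState
state_on_share(PgFdwConnState cur)
{
	return (cur == PGFDW_CONN_ASYNC_RUNNING) ? PGFDW_CONN_SYNC_RUNNING : cur;
}

/* (Sub)transaction cleanup puts the connection back to idle. */
static PgFdwConnState
state_on_reset(void)
{
	return PGFDW_CONN_IDLE;
}
```

A comment block above the enum could carry the same three rules in prose, so that readers do not have to reconstruct them from fetch_more_data() and the transaction callbacks.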
In function GetConnection(), at line
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ entry->conn->pgconn, server->servername);
entry->conn->pgconn is not necessarily a new connection and may be NULL
(the next line checks it for NULL). So I think this elog should be
moved into the following if block, after pgconn has been initialised with
the new connection.
+ if (entry->conn->pgconn == NULL)
+ {
+ entry->conn->pgconn = connect_pg_server(server, user);
+ entry->conn->nscans = 0;
+ entry->conn->async_state = PGFDW_CONN_IDLE;
+ entry->conn->async_scan = NULL;
+ }
The condition if (entry->conn == NULL) in GetConnection() used to track
whether there is an active PGconn for the given entry; now it tracks whether
the entry has a PgFdwConn.
Please see more comments inline.
On Mon, Dec 15, 2014 at 2:39 PM, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:
* Outline of this patch
From some consideration after the previous discussion and
comments from others, I judged the original (WIP) patch was
overdone as the first step. So I reduced the patch to minimal
function. The new patch does the following,
- Wrapping PGconn by PgFdwConn in order to handle multiple scans
on one connection.
- The core async logic was added in fetch_more_data().
It might help if you can explain this logic in this mail as well as in code
(as per my comment above).
- Invoking remote commands asynchronously in ExecInitForeignScan.
- Canceling async invocation if any other foreign scans,
modifies, deletes use the same connection.
Cancellation is done by immediately fetching the return of
already-invoked async command.
* Where this patch will be effective.
With upcoming inheritance-partition feature, this patch enables
starting and running foreign scans asynchronously. It will be more
effective for longer TAT or remote startup times, and larger
number of foreign servers. No negative performance effect on
other situations.
AFAIU, this logic sends only the first query asynchronously, not all
of them. Is that right? If yes, I think it's a severe limitation of the
feature. For a query involving multiple foreign scans, only the first one
will be done in async fashion and not the rest. Sorry if my understanding
is wrong.
I think we need some data which shows the speedup from this patch. You may
construct a case where a single query involves multiple foreign scans, and
we can check the speedup obtained against the number of foreign
scans.
* Concerns about this patch.
- This breaks the assumption that scan starts at ExecForeignScan,
not ExecInitForeignScan, which might cause some problem.
This should be fine as long as it doesn't have any side effects like
sending a query during EXPLAIN (which has been taken care of in this patch).
Do you think we need any special handling for PREPAREd statements?
- error reporting code in do_sql_command is quite ugly..
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Hello, thank you for the comment, Ashutosh.
I'll return after the New Year holidays. Very sorry for not
addressing them sooner, but I will have more time for this then.
Have a happy holiday.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello, thank you for the comment.
This is the second version of the patch.
- Refactored to make the code simpler and clearer.
- Added comment about logic outline and struct members.
- Removed trailing whitespace.
- No additional test yet.
======
warning: 3 lines add whitespace errors.
Whoops. Fixed.
2. The patches compile cleanly.
3. The regression is clean, even in contrib/postgres_fdw and
contrib/file_fdw
Tests
-------
We need tests to make sure that the logic remains intact even after further
changes in this area. Couple of tests which require multiple foreign scans
within the same query fetching rows more than fetch size (100) would be
required. Also, some DMLs, which involve multiple foreign scans would test
the sanity when UPDATE/DELETE interleave such scans. By multiple foreign
scans I mean both multiple scans on a single foreign server and multiple
scans spread across multiple foreign servers.
Additional tests might indeed be needed. Some of the behavior related
to this patch is exercised implicitly by the present regression
tests, but there are no explicit ones.
fetch_size is currently a bare constant, so I don't think it is
necessary to test other fetch sizes. Even if a different size
could cause a problem, it will be found when the
different number is actually applied.
In the current design, an async scan is started only for the first
scan on the connection, and if the next scan or modify claims the
same connection, the async state is immediately finished and
the connection behaves the same as in the unpatched version. But since
asynchronous/parallel scan is being introduced in some form, that kind
of test seems to be needed.
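The sharing rule described here can be sketched with a small mock. This is not the patch's code: MockConn and the mock_* functions are hypothetical stand-ins that only mirror the nscans counting visible in GetConnection() and ReleaseConnection(), and "draining" stands for fetching the pending result of the already-sent command.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PgFdwConn; no libpq involved. */
typedef struct MockConn
{
	int		nscans;			/* number of scans/modifies using the connection */
	bool	async_running;	/* an async FETCH is in flight */
	int		drained;		/* how many times a pending result was collected */
} MockConn;

/* Collect the pending result, which ends the async run. */
static void
mock_drain(MockConn *c)
{
	if (c->async_running)
	{
		c->async_running = false;
		c->drained++;
	}
}

/* Mirrors GetConnection(): a second user finishes the async query at once. */
static void
mock_get_connection(MockConn *c)
{
	if (++c->nscans > 1)
		mock_drain(c);
}

/* The sole user of a connection may start its scan asynchronously. */
static void
mock_begin_scan(MockConn *c)
{
	if (c->nscans == 1)
		c->async_running = true;
}

/* Mirrors ReleaseConnection(): the last user cleans up any async run. */
static void
mock_release_connection(MockConn *c)
{
	if (--c->nscans == 0)
		mock_drain(c);
}
```

An explicit regression test for async canceling would exercise the same sequence: one scan starts, a second scan or modify claims the connection, and the results must match the unpatched behavior.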
Multi-server tests are not done in the unpatched version either, but
there's no difference between multiple foreign servers on the same
remote server and foreign servers distributed over multiple remotes. The async
scan of this patch works only within the same foreign server, so there
seems to be no need for that kind of test. Do you have any specific
concern about this?
In any case, I will add some explicit tests for async canceling in
the next patch.
Code
-------
Because previous "conn" is now replaced by "conn->pgconn", the double
indirection makes the code a bit ugly and prone to segfaults (conn being
NULL or invalid pointer). Can we minimize such code or wrap it under a
macro?
Agreed. It was an annoyance for me too. I've done the following
things to encapsulate PgFdwConn to some extent in the second
version of this patch. They are described below.
We need some comments about the structure definition of PgFdwConn and its
members explaining the purpose of this structure and its members.
Thank you for pointing that out. I forgot that. I added simple
comments there.
Same is the case with enum PgFdwConnState. In fact, the state diagram of a
connection has become more complicated with the async connections, so it
might be better to explain that state diagram at one place in the code
(through comments). The definition of the enum might be a good place to do
that.
I added a comment describing the logic and the meaning of the
states just above the enum declaration.
Otherwise, the logic of connection maintenance is spread at multiple
places and is difficult to understand by looking at the code.
In function GetConnection(), at line
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ entry->conn->pgconn, server->servername);
Thank you, I replaced conn's in this form with PFC_PGCONN(conn).
entry->conn->pgconn may not necessarily be a new connection and may be NULL
(as the next line check it for being NULL). So, I think this line should be
moved within the following if block after pgconn has been initialised with
the new connection.
+ if (entry->conn->pgconn == NULL)
+ {
+ entry->conn->pgconn = connect_pg_server(server, user);
+ entry->conn->nscans = 0;
+ entry->conn->async_state = PGFDW_CONN_IDLE;
+ entry->conn->async_scan = NULL;
+ }
The if condition if (entry->conn == NULL) in GetConnection(), used to track
whether there is a PGConn active for the given entry, now it tracks whether
it has PgFdwConn for the same.
After some consideration, I decided to make PgFdwConn be
handled more like PGconn. The patch has shrunk as a
result and looks clearer.
- Added macros to encapsulate PgFdwConn struct. (One of them is a function)
- Added macros to call PQxxx functions taking PgFdwConn.
- connect_pg_server() returns PgFdwConn.
- connection.c does not touch the inside of PgFdwConn except at a
few points. PgFdwConn's memory is allocated with malloc(),
like PGconn, and freed by PFCfinish(), which is the counterpart
of PQfinish().
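A rough sketch of what such encapsulation might look like follows. The names PFC_PGCONN and PFC_NSCANS appear in the patch, but their definitions are not shown in this mail, so the macro bodies below are guesses. MockPGconn and pfc_server_version() are hypothetical stand-ins so that the example compiles without libpq.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for libpq's opaque PGconn. */
typedef struct MockPGconn
{
	int		server_version;
} MockPGconn;

/* Wrapper struct, reduced to the two members used in this sketch. */
typedef struct PgFdwConn
{
	MockPGconn *pgconn;		/* underlying connection */
	int			nscans;		/* number of scans sharing it */
} PgFdwConn;

/* Guessed definitions: keep the double indirection in one place. */
#define PFC_PGCONN(c)	((c)->pgconn)
#define PFC_NSCANS(c)	((c)->nscans)

/*
 * Function-style wrapper in the spirit of the PFCxxx() calls: it takes the
 * wrapper and tolerates a NULL wrapper or a NULL inner connection, returning
 * 0 in that case.
 */
static int
pfc_server_version(const PgFdwConn *c)
{
	if (c == NULL || c->pgconn == NULL)
		return 0;
	return c->pgconn->server_version;
}
```

Callers then write pfc_server_version(conn) or PFC_PGCONN(conn) instead of conn->pgconn, which addresses the readability and NULL-safety concern raised in the review.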
As a result of those changes, this patch has taken the
following shape.
- All places where PGconn was used now use PgFdwConn. They are
seemingly simple replacements.
- The major functional changes are concentrated within
fetch_more_data(), postgresBeginForeignScan(), GetConnection(),
ReleaseConnection(), and the additional function
finish_async_connection().
Please see more comments inline.
* Outline of this patch
From some consideration after the previous discussion and
comments from others, I judged the original (WIP) patch was
overdone as the first step. So I reduced the patch to minimal
function. The new patch does the following,
- Wrapping PGconn by PgFdwConn in order to handle multiple scans
on one connection.
- The core async logic was added in fetch_more_data().
It might help if you can explain this logic in this mail as well as in code
(as per my comment above).
I wrote the outline of the logic in the comment for enum
PgFdwConnState in postgres_fdw.h. Does it make sense?
* Where this patch will be effective.
With upcoming inheritance-partition feature, this patch enables
stating and running foreign scans asynchronously. It will be more
effective for longer TAT or remote startup times, and larger
number of foreign servers. No negative performance effect on
other situations.
AFAIU, this logic sends only the first query in asynchronous manner not all
of them. Is that right? If yes, I think it's a sever limitation of the
feature. For a query involving multiple foreign scans, only the first one
will be done in async fashion and not the rest. Sorry, if my understanding
is wrong.
You're right on the first point. So the domain where I think this is
effective is sharding. Each remote server can have a
dedicated PGconn connection in that case. In addition, the
ongoing FDW join pushdown work should increase the chance of async
execution in this manner. I found that it is difficult to find
an appropriate policy for managing the load on the remote server
when there are multiple PGconn connections to a single remote server, so that
would be the next issue.
I think, we need some data which shows the speed up by this patch. You may
construct a case, where a single query involved multiple foreign scans, and
we can check what is the speed up obtained against the number of foreign
scans.
Agreed, I'll show you some such figures afterwards.
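Until measured figures are available, the expected shape of the result can be written down as a back-of-envelope model. This is an assumption, not data: it supposes each foreign scan has its own connection (the sharding case), that remote latency dominates, and that async execution overlaps the waits perfectly. The function names and the latencies in the usage note are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Serial execution: the waits for each remote scan add up. */
static double
serial_time(const double *latency, size_t n)
{
	double	total = 0.0;

	for (size_t i = 0; i < n; i++)
		total += latency[i];
	return total;
}

/* Ideal overlap: all scans wait concurrently, so the slowest one dominates. */
static double
overlapped_time(const double *latency, size_t n)
{
	double	worst = 0.0;

	for (size_t i = 0; i < n; i++)
		if (latency[i] > worst)
			worst = latency[i];
	return worst;
}

/* Upper bound on the speedup under the assumptions above. */
static double
ideal_speedup(const double *latency, size_t n)
{
	double	overlapped = overlapped_time(latency, n);

	return (overlapped > 0.0) ? serial_time(latency, n) / overlapped : 1.0;
}
```

With four scans of equal latency the model gives a bound of 4x. Real measurements should fall below this bound, and scans sharing one connection get no overlap at all under the current design.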
* Concerns about this patch.
- This breaks the assumption that scan starts at ExecForeignScan,
not ExecInitForeignScan, which might cause some problem.
This should be fine as long as it doesn't have any side effects like
sending query during EXPLAIN (which has been taken care of in this patch.)
Do you think, we need any special handling for PREPAREd statements?
I suppose there's no difference between PREPAREd and
non-PREPAREd statements at the level of FDW.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Asynchronous-execution-of-postgres_fdw-v2.patch (text/x-patch; charset=us-ascii)
From a04a2f8ff32cf3095f7769eedde11ca946f024e5 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 28 Nov 2014 10:52:41 +0900
Subject: [PATCH] Asynchronous execution of postgres_fdw v2
This is the modified version of Asynchronous execution of
postgres_fdw.
- Refactored to make the code simpler.
- Added comment about logic outline and struct members.
---
contrib/postgres_fdw/connection.c | 84 ++++++------
contrib/postgres_fdw/postgres_fdw.c | 255 +++++++++++++++++++++++++++---------
contrib/postgres_fdw/postgres_fdw.h | 84 +++++++++++-
3 files changed, 318 insertions(+), 105 deletions(-)
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4e02cb2..574b08e 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -44,7 +44,7 @@ typedef struct ConnCacheKey
typedef struct ConnCacheEntry
{
ConnCacheKey key; /* hash key (must be first) */
- PGconn *conn; /* connection to foreign server, or NULL */
+ PgFdwConn *conn; /* connection to foreign server, or NULL */
int xact_depth; /* 0 = no xact open, 1 = main xact open, 2 =
* one level of subxact open, etc */
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
@@ -64,10 +64,10 @@ static unsigned int prep_stmt_number = 0;
static bool xact_got_connection = false;
/* prototypes of private functions */
-static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
+static PgFdwConn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void check_conn_params(const char **keywords, const char **values);
-static void configure_remote_session(PGconn *conn);
-static void do_sql_command(PGconn *conn, const char *sql);
+static void configure_remote_session(PgFdwConn *conn);
+static void do_sql_command(PgFdwConn *conn, const char *sql);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -93,7 +93,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
* be useful and not mere pedantry. We could not flush any active connections
* mid-transaction anyway.
*/
-PGconn *
+PgFdwConn *
GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt)
{
@@ -161,7 +161,7 @@ GetConnection(ForeignServer *server, UserMapping *user,
entry->have_error = false;
entry->conn = connect_pg_server(server, user);
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ PFC_PGCONN(entry->conn), server->servername);
}
/*
@@ -169,6 +169,13 @@ GetConnection(ForeignServer *server, UserMapping *user,
*/
begin_remote_xact(entry);
+ /*
+ * Finish async query immediately if another foreign scan node sharing
+ * this connection comes.
+ */
+ if (++entry->conn->nscans > 1 && PFC_IS_ASYNC_RUNNING(entry->conn))
+ fetch_more_data(entry->conn->async_scan);
+
/* Remember if caller will prepare statements */
entry->have_prep_stmt |= will_prep_stmt;
@@ -178,10 +185,10 @@ GetConnection(ForeignServer *server, UserMapping *user,
/*
* Connect to remote server using specified server and user mapping properties.
*/
-static PGconn *
+static PgFdwConn *
connect_pg_server(ForeignServer *server, UserMapping *user)
{
- PGconn *volatile conn = NULL;
+ PgFdwConn *volatile conn = NULL;
/*
* Use PG_TRY block to ensure closing connection on error.
@@ -223,14 +230,14 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
/* verify connection parameters and make connection */
check_conn_params(keywords, values);
- conn = PQconnectdbParams(keywords, values, false);
- if (!conn || PQstatus(conn) != CONNECTION_OK)
+ conn = PFCconnectdbParams(keywords, values, false);
+ if (!conn || PFCstatus(conn) != CONNECTION_OK)
{
char *connmessage;
int msglen;
/* libpq typically appends a newline, strip that */
- connmessage = pstrdup(PQerrorMessage(conn));
+ connmessage = pstrdup(PFCerrorMessage(conn));
msglen = strlen(connmessage);
if (msglen > 0 && connmessage[msglen - 1] == '\n')
connmessage[msglen - 1] = '\0';
@@ -246,7 +253,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
* otherwise, he's piggybacking on the postgres server's user
* identity. See also dblink_security_check() in contrib/dblink.
*/
- if (!superuser() && !PQconnectionUsedPassword(conn))
+ if (!superuser() && !PFCconnectionUsedPassword(conn))
ereport(ERROR,
(errcode(ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED),
errmsg("password is required"),
@@ -263,7 +270,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
{
/* Release PGconn data structure if we managed to create one */
if (conn)
- PQfinish(conn);
+ PFCfinish(conn);
PG_RE_THROW();
}
PG_END_TRY();
@@ -312,9 +319,9 @@ check_conn_params(const char **keywords, const char **values)
* there are any number of ways to break things.
*/
static void
-configure_remote_session(PGconn *conn)
+configure_remote_session(PgFdwConn *conn)
{
- int remoteversion = PQserverVersion(conn);
+ int remoteversion = PFCserverVersion(conn);
/* Force the search path to contain only pg_catalog (see deparse.c) */
do_sql_command(conn, "SET search_path = pg_catalog");
@@ -348,11 +355,11 @@ configure_remote_session(PGconn *conn)
* Convenience subroutine to issue a non-data-returning SQL command to remote
*/
static void
-do_sql_command(PGconn *conn, const char *sql)
+do_sql_command(PgFdwConn *conn, const char *sql)
{
PGresult *res;
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -379,7 +386,7 @@ begin_remote_xact(ConnCacheEntry *entry)
const char *sql;
elog(DEBUG3, "starting remote transaction on connection %p",
- entry->conn);
+ PFC_PGCONN(entry->conn));
if (IsolationIsSerializable())
sql = "START TRANSACTION ISOLATION LEVEL SERIALIZABLE";
@@ -408,13 +415,11 @@ begin_remote_xact(ConnCacheEntry *entry)
* Release connection reference count created by calling GetConnection.
*/
void
-ReleaseConnection(PGconn *conn)
+ReleaseConnection(PgFdwConn *conn)
{
- /*
- * Currently, we don't actually track connection references because all
- * cleanup is managed on a transaction or subtransaction basis instead. So
- * there's nothing to do here.
- */
+ /* ongoing async query should be canceled if no scans left */
+ if (--PFC_NSCANS(conn) == 0)
+ finish_async_connection(conn);
}
/*
@@ -429,7 +434,7 @@ ReleaseConnection(PGconn *conn)
* collisions are highly improbable; just be sure to use %u not %d to print.
*/
unsigned int
-GetCursorNumber(PGconn *conn)
+GetCursorNumber(PgFdwConn *conn)
{
return ++cursor_number;
}
@@ -443,7 +448,7 @@ GetCursorNumber(PGconn *conn)
* increasing the risk of prepared-statement name collisions by resetting.
*/
unsigned int
-GetPrepStmtNumber(PGconn *conn)
+GetPrepStmtNumber(PgFdwConn *conn)
{
return ++prep_stmt_number;
}
@@ -462,7 +467,7 @@ GetPrepStmtNumber(PGconn *conn)
* marked with have_error = true.
*/
void
-pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql)
{
/* If requested, PGresult must be released before leaving this function. */
@@ -490,7 +495,7 @@ pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
* return NULL, not a PGresult at all.
*/
if (message_primary == NULL)
- message_primary = PQerrorMessage(conn);
+ message_primary = PFCerrorMessage(conn);
ereport(elevel,
(errcode(sqlstate),
@@ -542,7 +547,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
if (entry->xact_depth > 0)
{
elog(DEBUG3, "closing remote transaction on connection %p",
- entry->conn);
+ PFC_PGCONN(entry->conn));
switch (event)
{
@@ -567,7 +572,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
*/
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -597,7 +602,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Assume we might have lost track of prepared statements */
entry->have_error = true;
/* If we're aborting, abort all remote transactions too */
- res = PQexec(entry->conn, "ABORT TRANSACTION");
+ res = PFCexec(entry->conn, "ABORT TRANSACTION");
/* Note: can't throw ERROR, it would be infinite loop */
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true,
@@ -608,7 +613,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* As above, make sure to clear any prepared stmts */
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -620,17 +625,17 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Reset state to show we're out of a transaction */
entry->xact_depth = 0;
+ PFC_RESET(entry->conn);
/*
* If the connection isn't in a good idle state, discard it to
* recover. Next GetConnection will open a new connection.
*/
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
+ if (PFCstatus(entry->conn) != CONNECTION_OK ||
+ PFCtransactionStatus(entry->conn) != PQTRANS_IDLE)
{
- elog(DEBUG3, "discarding connection %p", entry->conn);
- PQfinish(entry->conn);
- entry->conn = NULL;
+ elog(DEBUG3, "discarding connection %p", PFC_PGCONN(entry->conn));
+ PFCfinish(entry->conn);
}
}
@@ -676,6 +681,9 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
PGresult *res;
char sql[100];
+ /* Shut down asynchronous scan if running */
+ PFC_RESET(entry->conn);
+
/*
* We only care about connections with open remote subtransactions of
* the current level.
@@ -701,7 +709,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
snprintf(sql, sizeof(sql),
"ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
curlevel, curlevel);
- res = PQexec(entry->conn, sql);
+ res = PFCexec(entry->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true, sql);
else
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d76e739..552b0d4 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -136,7 +136,7 @@ typedef struct PgFdwScanState
List *retrieved_attrs; /* list of retrieved attribute numbers */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
unsigned int cursor_number; /* quasi-unique ID for my cursor */
bool cursor_exists; /* have we created the cursor? */
int numParams; /* number of parameters passed to query */
@@ -156,6 +156,7 @@ typedef struct PgFdwScanState
/* working memory contexts */
MemoryContext batch_cxt; /* context holding current batch of tuples */
MemoryContext temp_cxt; /* context for per-tuple temporary data */
+ ExprContext *econtext; /* copy of ps_ExprContext of ForeignScanState */
} PgFdwScanState;
/*
@@ -167,7 +168,7 @@ typedef struct PgFdwModifyState
AttInMetadata *attinmeta; /* attribute datatype conversion metadata */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
char *p_name; /* name of prepared statement, if created */
/* extracted fdw_private data */
@@ -298,7 +299,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
double *p_rows, int *p_width,
Cost *p_startup_cost, Cost *p_total_cost);
static void get_remote_estimate(const char *sql,
- PGconn *conn,
+ PgFdwConn *conn,
double *rows,
int *width,
Cost *startup_cost,
@@ -306,9 +307,8 @@ static void get_remote_estimate(const char *sql,
static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
EquivalenceClass *ec, EquivalenceMember *em,
void *arg);
-static void create_cursor(ForeignScanState *node);
-static void fetch_more_data(ForeignScanState *node);
-static void close_cursor(PGconn *conn, unsigned int cursor_number);
+static void create_cursor(PgFdwScanState *node);
+static void close_cursor(PgFdwConn *conn, unsigned int cursor_number);
static void prepare_foreign_modify(PgFdwModifyState *fmstate);
static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
ItemPointer tupleid,
@@ -329,6 +329,18 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
MemoryContext temp_context);
static void conversion_error_callback(void *arg);
+/* wrapper functions for libpq functions */
+PgFdwConn *
+PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname)
+{
+ PgFdwConn *ret = PFC_ALLOCATE();
+
+ PFC_INIT(ret);
+ PFC_PGCONN(ret) = PQconnectdbParams(keywords, values, expand_dbname);
+
+ return ret;
+}
/*
* Foreign-data wrapper handler function: return a struct with pointers
@@ -982,6 +994,15 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
fsstate->param_values = (const char **) palloc0(numParams * sizeof(char *));
else
fsstate->param_values = NULL;
+
+ fsstate->econtext = node->ss.ps.ps_ExprContext;
+
+ /*
+ * Start scanning asynchronously if it is the first scan on this
+ * connection.
+ */
+ if (PFC_NSCANS(fsstate->conn) == 1)
+ create_cursor(fsstate);
}
/*
@@ -1000,7 +1021,7 @@ postgresIterateForeignScan(ForeignScanState *node)
* cursor on the remote side.
*/
if (!fsstate->cursor_exists)
- create_cursor(node);
+ create_cursor(fsstate);
/*
* Get some more tuples, if we've run out.
@@ -1009,7 +1030,7 @@ postgresIterateForeignScan(ForeignScanState *node)
{
/* No point in another fetch if we already detected EOF, though. */
if (!fsstate->eof_reached)
- fetch_more_data(node);
+ fetch_more_data(fsstate);
/* If we didn't get any tuples, must be end of data. */
if (fsstate->next_tuple >= fsstate->num_tuples)
return ExecClearTuple(slot);
@@ -1069,7 +1090,7 @@ postgresReScanForeignScan(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fsstate->conn, sql);
+ res = PFCexec(fsstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fsstate->conn, true, sql);
PQclear(res);
@@ -1398,13 +1419,13 @@ postgresExecForeignInsert(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1468,13 +1489,13 @@ postgresExecForeignUpdate(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1538,13 +1559,13 @@ postgresExecForeignDelete(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1594,7 +1615,7 @@ postgresEndForeignModify(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fmstate->conn, sql);
+ res = PFCexec(fmstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, sql);
PQclear(res);
@@ -1726,7 +1747,7 @@ estimate_path_cost_size(PlannerInfo *root,
List *local_join_conds;
StringInfoData sql;
List *retrieved_attrs;
- PGconn *conn;
+ PgFdwConn *conn;
Selectivity local_sel;
QualCost local_cost;
@@ -1836,7 +1857,7 @@ estimate_path_cost_size(PlannerInfo *root,
* The given "sql" must be an EXPLAIN command.
*/
static void
-get_remote_estimate(const char *sql, PGconn *conn,
+get_remote_estimate(const char *sql, PgFdwConn *conn,
double *rows, int *width,
Cost *startup_cost, Cost *total_cost)
{
@@ -1852,7 +1873,7 @@ get_remote_estimate(const char *sql, PGconn *conn,
/*
* Execute EXPLAIN remotely.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql);
@@ -1917,13 +1938,12 @@ ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
* Create cursor for node's query with current parameter values.
*/
static void
-create_cursor(ForeignScanState *node)
+create_cursor(PgFdwScanState *fsstate)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
- ExprContext *econtext = node->ss.ps.ps_ExprContext;
+ ExprContext *econtext = fsstate->econtext;
int numParams = fsstate->numParams;
const char **values = fsstate->param_values;
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
StringInfoData buf;
PGresult *res;
@@ -1985,8 +2005,8 @@ create_cursor(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecParams(conn, buf.data, numParams, NULL, values,
- NULL, NULL, 0);
+ res = PFCexecParams(conn, buf.data, numParams, NULL, values,
+ NULL, NULL, 0);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, fsstate->query);
PQclear(res);
@@ -2001,15 +2021,18 @@ create_cursor(ForeignScanState *node)
/* Clean up */
pfree(buf.data);
+
+ /* Start async scan if this is the first scan */
+ if (PFC_NSCANS(conn) == 1)
+ fetch_more_data(fsstate);
}
/*
* Fetch some more rows from the node's cursor.
*/
-static void
-fetch_more_data(ForeignScanState *node)
+void
+fetch_more_data(PgFdwScanState *fsstate)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
PGresult *volatile res = NULL;
MemoryContext oldcontext;
@@ -2024,7 +2047,7 @@ fetch_more_data(ForeignScanState *node)
/* PGresult must be released before leaving this function. */
PG_TRY();
{
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
char sql[64];
int fetch_size;
int numrows;
@@ -2036,9 +2059,64 @@ fetch_more_data(ForeignScanState *node)
snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
fetch_size, fsstate->cursor_number);
- res = PQexec(conn, sql);
+ switch (conn->async_state)
+ {
+ case PGFDW_CONN_IDLE:
+ Assert(conn->async_scan == NULL);
+
+ /* Do async fetch only when only one scan uses this connection */
+ if (conn->nscans == 1)
+ {
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false,
+ fsstate->query);
+
+ conn->async_state = PGFDW_CONN_ASYNC_RUNNING;
+ conn->async_scan = fsstate;
+ goto end_of_fetch;
+ }
+
+ /* Do synchronous query execution */
+ conn->async_state = PGFDW_CONN_SYNC_RUNNING;
+ res = PFCexec(conn, sql);
+ break;
+
+ case PGFDW_CONN_ASYNC_RUNNING:
+ Assert(conn->async_scan != NULL);
+
+ res = PFCgetResult(conn);
+ if (PQntuples(res) == fetch_size)
+ {
+ /*
+ * Connection state doesn't go to IDLE even if all data
+ * has been sent to client for asynchronous query. One
+ * more PQgetResult() is needed to reset the state to
+ * IDLE. See PQexecFinish() for details.
+ */
+ if (PFCgetResult(conn) != NULL)
+ elog(ERROR, "Connection status error.");
+ }
+
+ if (conn->nscans == 1)
+ break;
+
+ /*
+ * If nscans is more than 1, stop invoking commands asynchronously
+ * for multiple scans on this connection. If nscans is zero, the async
+ * command on this connection should be finished immediately.
+ */
+ conn->async_state = PGFDW_CONN_SYNC_RUNNING;
+ conn->async_scan = NULL;
+ break;
+
+ default:
+ elog(ERROR, "unexpected async state : %d", conn->async_state);
+ break;
+
+ }
+
/* On error, report the original query, not the FETCH. */
- if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ if (res && PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
/* Convert the data into HeapTuples */
@@ -2066,6 +2144,33 @@ fetch_more_data(ForeignScanState *node)
PQclear(res);
res = NULL;
+
+ switch(conn->async_state)
+ {
+ case PGFDW_CONN_ASYNC_RUNNING:
+ if (!fsstate->eof_reached)
+ {
+ /*
+ * We can immediately request the next bunch of tuples if
+ * we're on asynchronous connection.
+ */
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
+ break;
+ }
+
+ /* Fall through */
+ case PGFDW_CONN_SYNC_RUNNING:
+ PFC_SET_IDLE(conn);
+ break;
+
+ default:
+ elog(ERROR, "unexpected async state: %d", conn->async_state);
+ break;
+ }
+
+end_of_fetch:
+ ; /* Nothing to do here but needed to make compiler quiet. */
}
PG_CATCH();
{
@@ -2079,6 +2184,32 @@ fetch_more_data(ForeignScanState *node)
}
/*
+ * Force cancelling async command state.
+ */
+void
+finish_async_connection(PgFdwConn *conn)
+{
+ PgFdwScanState *fsstate = conn->async_scan;
+ PgFdwConn *async_conn;
+
+ /* Nothing to do if no async connection */
+ if (fsstate == NULL) return;
+ async_conn = fsstate->conn;
+ Assert(async_conn);
+
+ /* Finish async command if any */
+ if (PFC_IS_ASYNC_RUNNING(async_conn))
+ fetch_more_data(async_conn->async_scan);
+
+ Assert(async_conn->async_state == PGFDW_CONN_IDLE &&
+ async_conn->async_scan == NULL);
+
+ /* Immediately discard the result */
+ fsstate->next_tuple = 0;
+ fsstate->num_tuples = 0;
+}
+
+/*
* Force assorted GUC parameters to settings that ensure that we'll output
* data values in a form that is unambiguous to the remote server.
*
@@ -2132,7 +2263,7 @@ reset_transmission_modes(int nestlevel)
* Utility routine to close a cursor.
*/
static void
-close_cursor(PGconn *conn, unsigned int cursor_number)
+close_cursor(PgFdwConn *conn, unsigned int cursor_number)
{
char sql[64];
PGresult *res;
@@ -2143,7 +2274,7 @@ close_cursor(PGconn *conn, unsigned int cursor_number)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -2175,11 +2306,11 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQprepare(fmstate->conn,
- p_name,
- fmstate->query,
- 0,
- NULL);
+ res = PFCprepare(fmstate->conn,
+ p_name,
+ fmstate->query,
+ 0,
+ NULL);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -2297,7 +2428,7 @@ postgresAnalyzeForeignTable(Relation relation,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2329,7 +2460,7 @@ postgresAnalyzeForeignTable(Relation relation,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2379,7 +2510,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
unsigned int cursor_number;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2423,7 +2554,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
PQclear(res);
@@ -2453,7 +2584,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
snprintf(fetch_sql, sizeof(fetch_sql), "FETCH %d FROM c%u",
fetch_size, cursor_number);
- res = PQexec(conn, fetch_sql);
+ res = PFCexec(conn, fetch_sql);
/* On error, report the original query, not the FETCH. */
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2582,7 +2713,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
bool import_not_null = true;
ForeignServer *server;
UserMapping *mapping;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData buf;
PGresult *volatile res = NULL;
int numrows,
@@ -2615,7 +2746,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
conn = GetConnection(server, mapping, false);
/* Don't attempt to import collation if remote server hasn't got it */
- if (PQserverVersion(conn) < 90100)
+ if (PFCserverVersion(conn) < 90100)
import_collate = false;
/* Create workspace for strings */
@@ -2628,7 +2759,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfoString(&buf, "SELECT 1 FROM pg_catalog.pg_namespace WHERE nspname = ");
deparseStringLiteral(&buf, stmt->remote_schema);
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
@@ -2723,7 +2854,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfo(&buf, " ORDER BY c.relname, a.attnum");
/* Fetch the data */
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..2c81189 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -20,17 +20,91 @@
#include "libpq-fe.h"
+/*
+ * PgFdwConnState - states of PgFdwConn
+ *
+ * PgFdwConn manages asynchronous query execution status on a PGconn
+ * connection. Since one PGconn cannot accept multiple asynchronous queries
+ * at once, the ongoing async query is immediately finished by another claim
+ * of the PgFdwConn to use. This state transition is represented using the
+ * enumeration PgFdwConnState and mainly made within fetch_more_data().
+ *
+ * PGFDW_CONN_ASYNC_RUNNING is the state entered when fetch_more_data() is
+ * called on a PgFdwConn that is in IDLE state and is used by only one scan.
+ * When called on a PgFdwConn in this state, fetch_more_data() sends the
+ * next FETCH request after getting the result of the previous request.
+ *
+ * PGFDW_CONN_SYNC_RUNNING is rather an internal state in
+ * fetch_more_data(). It indicates that the function shouldn't send the next
+ * fetch request after getting the result.
+ */
+typedef enum PgFdwConnState
+{
+ PGFDW_CONN_IDLE, /* running no query */
+ PGFDW_CONN_ASYNC_RUNNING, /* running a query asynchronously */
+ PGFDW_CONN_SYNC_RUNNING /* running a query synchronously */
+} PgFdwConnState;
+
+typedef struct PgFdwConn
+{
+ PGconn *pgconn; /* libpq connection for this connection */
+ int nscans; /* number of scans using this connection */
+ PgFdwConnState async_state;/* query running state */
+ struct PgFdwScanState *async_scan; /* the scan currently running an
+ * async query on this connection */
+} PgFdwConn;
+
+/* Macros to operate PgFdwConn */
+#define PFC_IS_ASYNC_RUNNING(c) ((c)->async_state == PGFDW_CONN_ASYNC_RUNNING)
+#define PFC_PGCONN(c) ((c)->pgconn)
+#define PFC_NSCANS(c) ((c)->nscans)
+#define PFC_SET_IDLE(c) ((c)->async_scan = NULL, \
+ (c)->async_state = PGFDW_CONN_IDLE)
+#define PFC_RESET(c) \
+ ((PFC_IS_ASYNC_RUNNING(c) ? PFCconsumeInput(c):0), \
+ PFC_SET_IDLE(c), PFC_NSCANS(c) = 0)
+#define PFC_INIT(c) (PFC_NSCANS(c) = 0, PFC_SET_IDLE(c))
+
+#define PFC_ALLOCATE() ((PgFdwConn *)malloc(sizeof(PgFdwConn)))
+#define PFC_FREE(c) free(c)
+
+/* libpq wrappers to take PgFdwConn* instead of PGconn* */
+#define PFCsendQuery(c,q) PQsendQuery((c)->pgconn, (q))
+#define PFCexec(c, q) PQexec((c)->pgconn, (q))
+#define PFCexecParams(c, q, n, t, v, l, f, rf) \
+ PQexecParams((c)->pgconn,(q),(n),(t),(v),(l),(f),(rf))
+#define PFCprepare(c, sn, q, n, t) PQprepare((c)->pgconn,(sn),(q),(n),(t))
+#define PFCexecPrepared(c, sn, n, v, l, f, rf) \
+ PQexecPrepared((c)->pgconn,(sn),(n),(v),(l),(f),(rf))
+#define PFCgetResult(c) PQgetResult((c)->pgconn)
+#define PFCconsumeInput(c) PQconsumeInput((c)->pgconn)
+#define PFCisBusy(c) PQisBusy((c)->pgconn)
+#define PFCstatus(c) PQstatus((c)->pgconn)
+#define PFCtransactionStatus(c) PQtransactionStatus((c)->pgconn)
+#define PFCserverVersion(c) PQserverVersion((c)->pgconn)
+#define PFCerrorMessage(c) PQerrorMessage((c)->pgconn)
+#define PFCconnectionUsedPassword(c) PQconnectionUsedPassword((c)->pgconn)
+
+/* This is not a simple wrapper of PQfinish: it also frees the PgFdwConn */
+#define PFCfinish(c) (PQfinish((c)->pgconn), PFC_FREE(c))
+
+/* libpq wrapper functions */
+extern PgFdwConn *PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname);
+
/* in postgres_fdw.c */
extern int set_transmission_modes(void);
extern void reset_transmission_modes(int nestlevel);
+extern void fetch_more_data(struct PgFdwScanState *node);
+extern void finish_async_connection(PgFdwConn *conn);
/* in connection.c */
-extern PGconn *GetConnection(ForeignServer *server, UserMapping *user,
+extern PgFdwConn *GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt);
-extern void ReleaseConnection(PGconn *conn);
-extern unsigned int GetCursorNumber(PGconn *conn);
-extern unsigned int GetPrepStmtNumber(PGconn *conn);
-extern void pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+extern void ReleaseConnection(PgFdwConn *conn);
+extern unsigned int GetCursorNumber(PgFdwConn *conn);
+extern unsigned int GetPrepStmtNumber(PgFdwConn *conn);
+extern void pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql);
/* in option.c */
--
2.1.0.GIT
On Fri, Jan 9, 2015 at 2:00 PM, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello, thank you for the comment.
This is the second version of the patch.
- Refactored to make the code simpler and clearer.
- Added comments about the logic outline and struct members.
- Removed trailing white spaces.
- No additional tests yet.
======
warning: 3 lines add whitespace errors.
Whoops. Fixed.
2. The patches compile cleanly.
3. The regression is clean, even in contrib/postgres_fdw and
contrib/file_fdw
Tests
-------
We need tests to make sure that the logic remains intact even after further
changes in this area. A couple of tests which require multiple foreign scans
within the same query, fetching more rows than the fetch size (100), would be
required. Also, some DMLs which involve multiple foreign scans would test
the sanity when UPDATE/DELETE interleave such scans. By multiple foreign
scans I mean both multiple scans on a single foreign server and multiple
scans spread across multiple foreign servers.
Additional tests indeed might be needed. Some of the tests related
to this patch are implicitly done in the present regression
tests, but there are no explicit ones.
fetch_size is currently a bare constant, so I think it is not so
necessary to test other fetch sizes. Even if a different size
could potentially cause a problem, it will be found when the
different number is actually applied.
In the current design, an async scan is started only on the first
scan on the connection, and if the next scan or modify claims the
same connection, the async state is immediately finished and
it behaves the same as the unpatched version. But since
asynchronous/parallel scan is introduced in some form, that kind
of test seems to be needed.
Multi-server tests are not done in the unpatched version either, but
there's no difference between multiple foreign servers on the same
remote server and those distributed over multiple remotes. The async
scan of this patch works only within the same foreign server, so there
seems to be no need for that kind of test. Do you have any specific
concern about this?
After all, I will add some explicit tests for async-canceling in
the next patch.
Code
-------
Because previous "conn" is now replaced by "conn->pgconn", the double
indirection makes the code a bit ugly and prone to segfaults (conn being
NULL or invalid pointer). Can we minimize such code or wrap it under a
macro?
Agreed. It was an annoyance for me too. I've done the following
things to encapsulate PgFdwConn to some extent in the second
version of this patch. They are described below.
Looks better.
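To illustrate the concern about the double indirection, here is a minimal sketch of a NULL-tolerant accessor. It is not part of the patch: FakePGconn, pfc_pgconn_or_null() and pfc_server_version() are hypothetical names, and FakePGconn merely stands in for libpq's opaque PGconn so that the example compiles without libpq.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for libpq's opaque PGconn; hypothetical, for illustration only. */
typedef struct FakePGconn
{
    int server_version;
} FakePGconn;

/* Wrapper shaped like the patch's PgFdwConn, with the fields trimmed. */
typedef struct PgFdwConn
{
    FakePGconn *pgconn;     /* underlying connection, may be NULL */
    int         nscans;     /* number of scans using this connection */
} PgFdwConn;

/*
 * Hypothetical helper: return the inner connection, or NULL when either
 * the wrapper or the inner pointer is missing.
 */
static inline FakePGconn *
pfc_pgconn_or_null(const PgFdwConn *conn)
{
    return conn ? conn->pgconn : NULL;
}

/*
 * Example wrapper built on the helper.  It returns 0 when there is no
 * usable connection, which is also what PQserverVersion() reports for a
 * bad connection.
 */
static inline int
pfc_server_version(const PgFdwConn *conn)
{
    FakePGconn *pg = pfc_pgconn_or_null(conn);

    return pg ? pg->server_version : 0;
}
```

With function-style wrappers like these, a NULL PgFdwConn degrades to the "bad connection" answer instead of dereferencing a NULL pointer, at the cost of one extra branch per call.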
We need some comments about the structure definition of PgFdwConn and its
members explaining the purpose of this structure and its members.
Thank you for pointing that out. I forgot that. I added simple
comments there.
Same is the case with enum PgFdwConnState. In fact, the state diagram of a
connection has become more complicated with the async connections, so it
might be better to explain that state diagram at one place in the code
(through comments). The definition of the enum might be a good place to do
that.
I added a comment describing the logic and the meaning of the
states just above the enum declaration.
This needs to be clarified further. But that can wait till we finalise the
approach and the patch. Esp. the comment below is confusing:
+ * PGFDW_CONN_SYNC_RUNNING is rather an internal state in
+ * fetch_more_data(). It indicates that the function shouldn't send the next
+ * fetch requst after getting the result.
I couldn't get the meaning of the second sentence, esp. its connection
with synchronous-ness.
Otherwise, the logic of connection maintenance is spread at multiple
places and is difficult to understand by looking at the code.
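As a reading aid for the state discussion, the transitions that the patch's fetch_more_data() appears to implement can be modelled without libpq. This is only a sketch under that reading of the patch: ModelConn, model_fetch() and the queries_sent counter are made-up names, and sending a query or reading a result is reduced to bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum
{
    CONN_IDLE,              /* no query in flight */
    CONN_ASYNC_RUNNING,     /* a FETCH was sent, its result not yet read */
    CONN_SYNC_RUNNING       /* transient: read a result, send nothing more */
} ConnState;

typedef struct ModelConn
{
    ConnState state;
    int       nscans;       /* scans currently sharing the connection */
    int       queries_sent; /* FETCH requests issued so far */
} ModelConn;

/*
 * One modelled call of the fetch step.  Returns false if the call only
 * launched an async FETCH, true if it consumed a batch of rows.  'eof'
 * says whether the consumed batch was the last one.
 */
static bool
model_fetch(ModelConn *conn, bool eof)
{
    switch (conn->state)
    {
        case CONN_IDLE:
            conn->queries_sent++;
            if (conn->nscans == 1)
            {
                /* Sole user of the connection: send and return at once. */
                conn->state = CONN_ASYNC_RUNNING;
                return false;
            }
            /* Shared connection: plain synchronous round trip. */
            conn->state = CONN_SYNC_RUNNING;
            break;

        case CONN_ASYNC_RUNNING:
            /* Collect the pending result; stop prefetching if shared. */
            if (conn->nscans != 1)
                conn->state = CONN_SYNC_RUNNING;
            break;

        default:
            assert(0);      /* SYNC_RUNNING never survives a call */
            return false;
    }

    /* A result has been consumed at this point. */
    if (conn->state == CONN_ASYNC_RUNNING && !eof)
    {
        conn->queries_sent++;   /* keep the next FETCH in flight */
        return true;
    }

    conn->state = CONN_IDLE;
    return true;
}
```

In this model a connection with a single scan stays in the async state and always has the next FETCH in flight. Once a second scan shows up, the pending result is consumed and the connection drops back to idle, which is how "SYNC_RUNNING means do not send the next fetch" can be read.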
In function GetConnection(), at line
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ entry->conn->pgconn, server->servername);
Thank you, I replaced conn's in this form with PFC_PGCONN(conn).
This looks better.
entry->conn->pgconn may not necessarily be a new connection and may be NULL
(as the next line checks it for being NULL). So, I think this line should be
moved within the following if block after pgconn has been initialised with
the new connection.
+ if (entry->conn->pgconn == NULL)
+ {
+ entry->conn->pgconn = connect_pg_server(server, user);
+ entry->conn->nscans = 0;
+ entry->conn->async_state = PGFDW_CONN_IDLE;
+ entry->conn->async_scan = NULL;
+ }
The if condition if (entry->conn == NULL) in GetConnection() used to track
whether there is a PGconn active for the given entry; now it tracks whether
it has a PgFdwConn for the same.
After some consideration, I decided to make PgFdwConn be
handled more similarly to PGconn. This patch has shrunk as a
result and has become clearer.
I think it's still prone to segfaults considering two pointer indirections.
- Added macros to encapsulate PgFdwConn struct. (One of them is a function)
- Added macros to call PQxxx functions taking PgFdwConn.
- connect_pg_server() returns PgFdwConn.
- connection.c does not touch the inside of PgFdwConn except at a
few points. The PgFdwConn's memory is allocated with malloc()
like PGconn and freed by PFCfinish(), which is the counterpart
of PQfinish().
As the result of those changes, this patch has taken the
following shape.
- All points where PGconn was used now use PgFdwConn. They are
seemingly simple replacements.
- The major functional changes are concentrated within
fetch_more_data(), postgresBeginForeignScan(), GetConnection(),
ReleaseConnection(), and the additional function
finish_async_connection().
Please see more comments inline.
* Outline of this patch
From some consideration after the previous discussion and
comments from others, I judged the original (WIP) patch was
overdone as the first step. So I reduced the patch to minimal
function. The new patch does the following,
- Wrapping PGconn by PgFdwConn in order to handle multiple scans
on one connection.
- The core async logic was added in fetch_more_data().
It might help if you can explain this logic in this mail as well as in code
(as per my comment above).
I wrote the outline of the logic in the comment for enum
PgFdwConnState in postgres_fdw.h. Does it make sense?
The point about two different ForeignScan nodes using the same connection
needs some clarification, I guess. It's not very clear, why would there be
more queries run on the same connection. I know why this happens, but it's
important to mention it somewhere. If it's already mentioned somewhere in
the file, sorry for not paying attention to that.
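One way to picture why two ForeignScan nodes end up on the same connection: postgres_fdw caches connections per foreign server and user, so every scan in a query that targets the same server as the same user gets the same cached entry. The sketch below models that with a tiny array-based cache. CachedConn, model_get_connection() and the fixed-size array are illustrative stand-ins (the real cache is a hash table), and bumping nscans on each lookup is an assumption about how the patch counts scans.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONNS 8

/* Illustrative cache entry; the real entry wraps a libpq connection. */
typedef struct CachedConn
{
    unsigned int serverid;  /* stand-in for the foreign server OID */
    unsigned int userid;    /* stand-in for the local user OID */
    int          in_use;    /* slot holds a live entry */
    int          nscans;    /* scans currently sharing this entry */
} CachedConn;

static CachedConn conn_cache[MAX_CONNS];

/*
 * Return the cached entry for (serverid, userid), creating it on first
 * use.  Each call counts as one more scan on that connection.  Returns
 * NULL only when the toy cache is full.
 */
static CachedConn *
model_get_connection(unsigned int serverid, unsigned int userid)
{
    CachedConn *free_slot = NULL;

    for (int i = 0; i < MAX_CONNS; i++)
    {
        CachedConn *c = &conn_cache[i];

        if (c->in_use && c->serverid == serverid && c->userid == userid)
        {
            c->nscans++;    /* second and later scans share the entry */
            return c;
        }
        if (!c->in_use && free_slot == NULL)
            free_slot = c;
    }

    if (free_slot == NULL)
        return NULL;

    free_slot->serverid = serverid;
    free_slot->userid = userid;
    free_slot->in_use = 1;
    free_slot->nscans = 1;
    return free_slot;
}
```

Two scans of foreign tables on the same server therefore see nscans == 2 on one shared entry, which is exactly the situation in which the patch falls back to synchronous fetching.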
* Where this patch will be effective.
With upcoming inheritance-partition feature, this patch enables
starting and running foreign scans asynchronously. It will be more
effective for longer TAT or remote startup times, and larger
number of foreign servers. No negative performance effect on
other situations.
AFAIU, this logic sends only the first query in asynchronous manner, not all
of them. Is that right? If yes, I think it's a severe limitation of the
feature. For a query involving multiple foreign scans, only the first one
will be done in async fashion and not the rest. Sorry, if my understanding
is wrong.
You're right on the first point. So the domain where I think this is
effective is sharding. Each remote server can have a dedicated
PGconn connection in that case. In addition, the ongoing FDW join
pushdown work should increase the chance of async execution in this
manner. I found that it is difficult to find an appropriate policy
for managing the load on the remote server when there are multiple
PGconn connections to a single remote server, so that would be the
next issue.
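As a rough illustration of why sharding is the favorable case, the C sketch below models elapsed time under two simplifying assumptions that are not measurements: synchronous scans pay the sum of the per-server latencies, while fully overlapped asynchronous scans pay roughly the latency of the slowest server. The function names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: synchronous scans wait for each server in turn. */
double
sync_elapsed_ms(const double *latency_ms, size_t nservers)
{
    double total = 0.0;
    size_t i;

    assert(latency_ms != NULL || nservers == 0);
    for (i = 0; i < nservers; i++)
        total += latency_ms[i];
    return total;
}

/* Hypothetical model: fully overlapped scans wait only for the slowest. */
double
async_elapsed_ms(const double *latency_ms, size_t nservers)
{
    double slowest = 0.0;
    size_t i;

    assert(latency_ms != NULL || nservers == 0);
    for (i = 0; i < nservers; i++)
    {
        if (latency_ms[i] > slowest)
            slowest = latency_ms[i];
    }
    return slowest;
}
```

Under this model the gap between the two figures grows with the number of servers and with per-server latency, which is the situation described above. Real figures would still need to be measured.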
I think there is more chance that there will be more than one ForeignScan
node trying to interact with a foreign server, even after the push-down work.
The current solution doesn't address that. We actually need parallel
querying in two cases
1. Querying multiple servers in parallel
2. Querying same server (by two querists) in parallel within the same query
e.g. an un-pushable join.
We need a solution which works in both cases.
Is it possible to use the parallel query infrastructure being built by
Robert or to do something like parallel seq scan? That will work, not just
for Postgres FDW but all the FDWs.
I think we need some data which shows the speedup from this patch. You may
construct a case where a single query involves multiple foreign scans, and
we can check what speedup is obtained against the number of foreign
scans.
Agreed, I'll show you some such figures afterwards.
* Concerns about this patch.
- This breaks the assumption that scan starts at ExecForeignScan,
not ExecInitForeignScan, which might cause some problem.
This should be fine as long as it doesn't have any side effects like
sending a query during EXPLAIN (which has been taken care of in this patch).
Do you think we need any special handling for PREPAREd statements?
I suppose there's no difference between PREPAREd and
non-PREPAREd statements at the level of the FDW.
In case of Prepared statements, ExecInit is called at the end of planning,
without subsequent execution like the case of EXPLAIN. I see that the patch
handles EXPLAIN well, but I didn't see any specific code for PREPARE.
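The EXPLAIN concern can be pictured with a small guard in C. This is only a sketch under assumptions: DEMO_FLAG_EXPLAIN_ONLY and may_start_remote_query() are hypothetical names standing in for the executor's eflags check, and the sketch says nothing about PREPARE, which remains an open question here.

```c
#include <assert.h>

/* Hypothetical flag bit standing in for the executor's "EXPLAIN only" flag. */
#define DEMO_FLAG_EXPLAIN_ONLY 0x0001

/*
 * Decide whether scan initialization may contact the remote server.
 * Returns 1 when an eager (async) start is acceptable, 0 otherwise.
 */
int
may_start_remote_query(int eflags, int nscans_on_conn)
{
    assert(nscans_on_conn >= 1);    /* the caller already holds the connection */

    if (eflags & DEMO_FLAG_EXPLAIN_ONLY)
        return 0;                   /* plain EXPLAIN must not run the query */
    return nscans_on_conn == 1;     /* only the first scan starts eagerly */
}
```

If PREPARE turns out to initialize the plan without executing it, a similar guard would be needed for that path as well.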
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Hello. This is a version 3 patch.
- PgFdwConnState is removed
- PgFdwConn is isolated as a separate module.
- State transition was simplified, I think.
- Comment about multiple scans on a connection is added.
- The issue of PREPARE is not addressed yet.
- This is meant to show how the new style looks, so comments are
still lacking for the PgFdwConn functions.
- Rebased to current master.
=======
I added a comment describing the logic and the meaning of the
states just above the enum declaration.
This needs to be clarified further. But that can wait till we finalise the
approach and the patch. Esp. the comment below is confusing:
+ * PGFDW_CONN_SYNC_RUNNING is rather an internal state in
+ * fetch_more_data(). It indicates that the function shouldn't send the next
+ * fetch request after getting the result.
I couldn't get the meaning of the second sentence, esp. its connection
with synchronous-ness.
In this version, I removed PgFdwConnState. Now whether an async
fetch is running is indicated by the existence of async_scan. I
think the complicated state transition has disappeared.
The if condition if (entry->conn == NULL) in GetConnection() used to track
whether there is a PGconn active for the given entry; now it tracks whether
it has a PgFdwConn for the same.
After some consideration, I decided to make PgFdwConn be
handled more similarly to PGconn. The patch has shrunk as a
result and now looks clearer.
I think it's still prone to segfaults considering two pointer indirections.
PGconn itself already involves two-level indirection, and PgFdwConn
has hidden its details mainly using macros. I may have misunderstood
you, but if you're worried that PgFdwConn.pgconn can be set from
anywhere, we should separate PgFdwConn into another
C module and hide all the details as PGconn does. It is shown as
the separate patch. But I feel it is a bit overdone because it is not
an end-user interface.
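For readers unfamiliar with the "hide all the details as PGconn does" idea, the following is a minimal C sketch of an opaque handle. All names (DemoConn, demo_conn_create, and so on) are hypothetical. In a real module the first part would live in a header and the struct layout would stay private to the .c file; both are shown in one block here only so it compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* --- what a header would expose: an incomplete type plus accessors --- */
typedef struct demo_conn DemoConn;

DemoConn   *demo_conn_create(void *raw);
void       *demo_conn_get_raw(const DemoConn *conn);
void        demo_conn_destroy(DemoConn *conn);

/* --- what the .c file would keep private: the struct layout --- */
struct demo_conn
{
    void   *raw;        /* stands in for the wrapped PGconn pointer */
    int     nscans;     /* number of scans using this connection */
};

DemoConn *
demo_conn_create(void *raw)
{
    DemoConn   *conn = malloc(sizeof(DemoConn));

    if (conn == NULL)
        return NULL;    /* out of memory: caller must check */
    conn->raw = raw;
    conn->nscans = 0;
    return conn;
}

void *
demo_conn_get_raw(const DemoConn *conn)
{
    assert(conn != NULL);
    return conn->raw;
}

void
demo_conn_destroy(DemoConn *conn)
{
    free(conn);         /* frees only the wrapper, not the raw handle */
}
```

Callers can only go through the accessors, so the wrapped pointer cannot be overwritten from arbitrary places. The cost is one function call per access, which is the "a bit overdone" trade-off mentioned above.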
I wrote the outline of the logic in the comment for enum
PgFdwConnState in postgres_fdw.h. Does it make sense?
The point about two different ForeignScan nodes using the same connection
needs some clarification, I guess. It's not very clear, why would there be
more queries run on the same connection. I know why this happens, but it's
important to mention it somewhere. If it's already mentioned somewhere in
the file, sorry for not paying attention to that.
Yeah. It is just what I stumbled on. I changed the comment in
fetch_more_data() like below. Does it make sense?
| /*
| * In the current postgres_fdw implementation, multiple PgFdwScanStates
| * on the same foreign server and mapped user share the same
| * connection to the remote server (see GetConnection() in
| * connection.c) and individual scans on it are separated using
| * cursors. Since one connection cannot accept two or more
| * asynchronous queries simultaneously, we should stop the async
| * fetching if another scan comes.
| */
|
| if (PFCgetNscans(conn) > 1)
| PFCsetAsyncScan(conn, NULL);
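The bookkeeping described in that comment can be sketched in isolation. The C below is a simplified model, not the patch itself: SharedConn and DemoScan are hypothetical stand-ins for PgFdwConn and PgFdwScanState, and setting the drained flag stands in for collecting the pending result via fetch_more_data().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PgFdwScanState. */
typedef struct DemoScan
{
    int         drained;    /* 1 once the pending async result was collected */
} DemoScan;

/* Hypothetical stand-in for PgFdwConn. */
typedef struct SharedConn
{
    int         nscans;     /* scans currently attached to the connection */
    DemoScan   *async_scan; /* scan with an async fetch in flight, or NULL */
} SharedConn;

/* Attach a scan; a second scan forces the in-flight fetch to be collected. */
void
shared_conn_attach(SharedConn *conn)
{
    conn->nscans++;
    if (conn->nscans > 1 && conn->async_scan != NULL)
    {
        conn->async_scan->drained = 1;  /* stands in for fetch_more_data() */
        conn->async_scan = NULL;        /* no more async on this connection */
    }
}

/* Start an async fetch only when this scan is alone on the connection. */
int
shared_conn_try_async(SharedConn *conn, DemoScan *scan)
{
    if (conn->nscans != 1)
        return 0;
    conn->async_scan = scan;
    return 1;
}

/* Detach a scan when it ends. */
void
shared_conn_detach(SharedConn *conn)
{
    assert(conn->nscans > 0);
    conn->nscans--;
}
```

The point of the model is that the connection falls back to the ordinary synchronous behavior as soon as a second scan attaches, so sharing one connection never has two async queries outstanding.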
I think there is more chance that there will be more than one ForeignScan
node trying to interact with a foreign server, even after the push-down work.
The current solution doesn't address that. We actually need parallel
querying in two cases
1. Querying multiple servers in parallel
2. Querying same server (by two querists) in parallel within the same query
e.g. an un-pushable join.
We need a solution which works in both cases.
The first point is realized by this patch with some
limitations. The second point is what my old patch did: it made a
dedicated connection for each individual scan, up to some fixed number,
aside from the main connection; the overflowed scans then go to the
main connection and are done in the manner the unpatched
postgres_fdw does.
I was thinking that the 'some fixed number' could be set by a
parameter of the foreign server, but I got a comment that it could
fill up the remote server without reasonable load and/or bandwidth
management. So I abandoned the multiple-connection solution and
decided to do everything on the first connection. That is how the
current patch came about.
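For reference, the abandoned multiple-connection policy can be written down as a tiny C function. This is a reconstruction from the description above, not code from the old patch: pick_connection_slot() and the slot numbering (0 for the shared main connection, 1..max_dedicated for dedicated ones) are assumptions made for illustration.

```c
#include <assert.h>

/*
 * Hypothetical policy: slot 0 is the shared main connection; the first
 * max_dedicated scans get dedicated slots 1..max_dedicated, and every
 * later scan overflows to slot 0.
 */
int
pick_connection_slot(int scan_index, int max_dedicated)
{
    assert(max_dedicated >= 0);

    if (scan_index >= 0 && scan_index < max_dedicated)
        return scan_index + 1;  /* dedicated connection */
    return 0;                   /* overflow: share the main connection */
}
```

With max_dedicated set to 0 this degenerates to the single-connection behavior, which is effectively what the current patch does.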
Is it possible to use the parallel query infrastructure being built by
Robert or to do something like parallel seq scan? That will work, not just
for Postgres FDW but all the FDWs.
I haven't looked closely at the patch, but if my understanding from a
glance is correct, the parallel scan divides the target table
into multiple parts, then runs subscans on every part in parallel.
It might allocate dedicated processes for every child scan on a
partitioned table.
But I think, from the performance point of view, each of the multiple
foreign scans doesn't need a corresponding local process. But if the
parallel scan infrastructure makes the mechanism simpler, using
it is a very promising choice.
In case of Prepared statements, ExecInit is called at the end of planning,
without subsequent execution like the case of EXPLAIN. I see that the patch
handles EXPLAIN well, but I didn't see any specific code for PREPARE.
I'll look into the case after this, but I'd like to send a
revised patch at this point.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Asynchronous-execution-of-postgres_fdw-v3.patch (text/x-patch; charset=us-ascii)
From 58757978b5625aa3ae9a99fdcd9d6db393e62a5a Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Tue, 13 Jan 2015 19:20:35 +0900
Subject: [PATCH] Asynchronous execution of postgres_fdw v3
This is the modified version of Asynchronous execution of
postgres_fdw.
- Remove PgFdwAsyncState.
- Separate PgFdwConn into individual module
---
contrib/postgres_fdw/Makefile | 2 +-
contrib/postgres_fdw/PgFdwConn.c | 200 +++++++++++++++++++++++++++++++
contrib/postgres_fdw/PgFdwConn.h | 62 ++++++++++
contrib/postgres_fdw/connection.c | 86 +++++++------
contrib/postgres_fdw/postgres_fdw.c | 232 ++++++++++++++++++++++++++----------
contrib/postgres_fdw/postgres_fdw.h | 16 ++-
6 files changed, 490 insertions(+), 108 deletions(-)
create mode 100644 contrib/postgres_fdw/PgFdwConn.c
create mode 100644 contrib/postgres_fdw/PgFdwConn.h
diff --git a/contrib/postgres_fdw/Makefile b/contrib/postgres_fdw/Makefile
index d2b98e1..d0913e2 100644
--- a/contrib/postgres_fdw/Makefile
+++ b/contrib/postgres_fdw/Makefile
@@ -1,7 +1,7 @@
# contrib/postgres_fdw/Makefile
MODULE_big = postgres_fdw
-OBJS = postgres_fdw.o option.o deparse.o connection.o $(WIN32RES)
+OBJS = postgres_fdw.o PgFdwConn.o option.o deparse.o connection.o $(WIN32RES)
PGFILEDESC = "postgres_fdw - foreign data wrapper for PostgreSQL"
PG_CPPFLAGS = -I$(libpq_srcdir)
diff --git a/contrib/postgres_fdw/PgFdwConn.c b/contrib/postgres_fdw/PgFdwConn.c
new file mode 100644
index 0000000..b13b597
--- /dev/null
+++ b/contrib/postgres_fdw/PgFdwConn.c
@@ -0,0 +1,200 @@
+/*-------------------------------------------------------------------------
+ *
+ * PgFdwConn.c
+ * PGconn extending wrapper to enable asynchronous query.
+ *
+ * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/postgres_fdw/PgFdwConn.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "PgFdwConn.h"
+
+#define PFC_ALLOCATE() ((PgFdwConn *)malloc(sizeof(PgFdwConn)))
+#define PFC_FREE(c) free(c)
+
+struct pgfdw_conn
+{
+ PGconn *pgconn; /* libpq connection for this connection */
+ int nscans; /* number of scans using this connection */
+ struct PgFdwScanState *async_scan; /* the scan currently running
+ * async query on this connection */
+};
+
+void
+PFCsetAsyncScan(PgFdwConn *conn, struct PgFdwScanState *scan)
+{
+ conn->async_scan = scan;
+}
+
+struct PgFdwScanState *
+PFCgetAsyncScan(PgFdwConn *conn)
+{
+ return conn->async_scan;
+}
+
+int
+PFCisAsyncRunning(PgFdwConn *conn)
+{
+ return conn->async_scan != NULL;
+}
+
+PGconn *
+PFCgetPGconn(PgFdwConn *conn)
+{
+ return conn->pgconn;
+}
+
+int
+PFCgetNscans(PgFdwConn *conn)
+{
+ return conn->nscans;
+}
+
+int
+PFCincrementNscans(PgFdwConn *conn)
+{
+ return ++conn->nscans;
+}
+
+int
+PFCdecrementNscans(PgFdwConn *conn)
+{
+ Assert(conn->nscans > 0);
+ return --conn->nscans;
+}
+
+void
+PFCcancelAsync(PgFdwConn *conn)
+{
+ if (PFCisAsyncRunning(conn))
+ PFCconsumeInput(conn);
+}
+
+void
+PFCinit(PgFdwConn *conn)
+{
+ conn->async_scan = NULL;
+ conn->nscans = 0;
+}
+
+int
+PFCsendQuery(PgFdwConn *conn, const char *query)
+{
+ return PQsendQuery(conn->pgconn, query);
+}
+
+PGresult *
+PFCexec(PgFdwConn *conn, const char *query)
+{
+ return PQexec(conn->pgconn, query);
+}
+
+PGresult *
+PFCexecParams(PgFdwConn *conn,
+ const char *command,
+ int nParams,
+ const Oid *paramTypes,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat)
+{
+ return PQexecParams(conn->pgconn,
+ command, nParams, paramTypes, paramValues,
+ paramLengths, paramFormats, resultFormat);
+}
+
+PGresult *
+PFCprepare(PgFdwConn *conn,
+ const char *stmtName, const char *query,
+ int nParams, const Oid *paramTypes)
+{
+ return PQprepare(conn->pgconn, stmtName, query, nParams, paramTypes);
+}
+
+PGresult *
+PFCexecPrepared(PgFdwConn *conn,
+ const char *stmtName,
+ int nParams,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat)
+{
+ return PQexecPrepared(conn->pgconn,
+ stmtName, nParams, paramValues, paramLengths,
+ paramFormats, resultFormat);
+}
+
+PGresult *
+PFCgetResult(PgFdwConn *conn)
+{
+ return PQgetResult(conn->pgconn);
+}
+
+int
+PFCconsumeInput(PgFdwConn *conn)
+{
+ return PQconsumeInput(conn->pgconn);
+}
+
+int
+PFCisBusy(PgFdwConn *conn)
+{
+ return PQisBusy(conn->pgconn);
+}
+
+ConnStatusType
+PFCstatus(const PgFdwConn *conn)
+{
+ return PQstatus(conn->pgconn);
+}
+
+PGTransactionStatusType
+PFCtransactionStatus(const PgFdwConn *conn)
+{
+ return PQtransactionStatus(conn->pgconn);
+}
+
+int
+PFCserverVersion(const PgFdwConn *conn)
+{
+ return PQserverVersion(conn->pgconn);
+}
+
+char *
+PFCerrorMessage(const PgFdwConn *conn)
+{
+ return PQerrorMessage(conn->pgconn);
+}
+
+int
+PFCconnectionUsedPassword(const PgFdwConn *conn)
+{
+ return PQconnectionUsedPassword(conn->pgconn);
+}
+
+void
+PFCfinish(PgFdwConn *conn)
+{
+ PQfinish(conn->pgconn);
+ PFC_FREE(conn);
+}
+
+PgFdwConn *
+PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname)
+{
+ PgFdwConn *ret = PFC_ALLOCATE();
+
+ PFCinit(ret);
+ ret->pgconn = PQconnectdbParams(keywords, values, expand_dbname);
+
+ return ret;
+}
diff --git a/contrib/postgres_fdw/PgFdwConn.h b/contrib/postgres_fdw/PgFdwConn.h
new file mode 100644
index 0000000..2771de5
--- /dev/null
+++ b/contrib/postgres_fdw/PgFdwConn.h
@@ -0,0 +1,62 @@
+/*-------------------------------------------------------------------------
+ *
+ * PgFdwConn.h
+ * PGconn extending wrapper to enable asynchronous query.
+ *
+ * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/postgres_fdw/PgFdwConn.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PGFDWCONN_H
+#define PGFDWCONN_H
+
+#include "libpq-fe.h"
+
+typedef struct pgfdw_conn PgFdwConn;
+struct PgFdwScanState;
+
+extern void PFCsetAsyncScan(PgFdwConn *conn, struct PgFdwScanState *scan);
+extern struct PgFdwScanState *PFCgetAsyncScan(PgFdwConn *conn);
+extern int PFCisAsyncRunning(PgFdwConn *conn);
+extern PGconn *PFCgetPGconn(PgFdwConn *conn);
+extern int PFCgetNscans(PgFdwConn *conn);
+extern int PFCincrementNscans(PgFdwConn *conn);
+extern int PFCdecrementNscans(PgFdwConn *conn);
+extern void PFCcancelAsync(PgFdwConn *conn);
+extern void PFCinit(PgFdwConn *conn);
+extern int PFCsendQuery(PgFdwConn *conn, const char *query);
+extern PGresult *PFCexec(PgFdwConn *conn, const char *query);
+extern PGresult *PFCexecParams(PgFdwConn *conn,
+ const char *command,
+ int nParams,
+ const Oid *paramTypes,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat);
+extern PGresult *PFCprepare(PgFdwConn *conn,
+ const char *stmtName, const char *query,
+ int nParams, const Oid *paramTypes);
+extern PGresult *PFCexecPrepared(PgFdwConn *conn,
+ const char *stmtName,
+ int nParams,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat);
+extern PGresult *PFCgetResult(PgFdwConn *conn);
+extern int PFCconsumeInput(PgFdwConn *conn);
+extern int PFCisBusy(PgFdwConn *conn);
+extern ConnStatusType PFCstatus(const PgFdwConn *conn);
+extern PGTransactionStatusType PFCtransactionStatus(const PgFdwConn *conn);
+extern int PFCserverVersion(const PgFdwConn *conn);
+extern char *PFCerrorMessage(const PgFdwConn *conn);
+extern int PFCconnectionUsedPassword(const PgFdwConn *conn);
+extern void PFCfinish(PgFdwConn *conn);
+extern PgFdwConn *PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname);
+
+#endif /* PGFDWCONN_H */
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4e02cb2..5bf08ec 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -44,7 +44,7 @@ typedef struct ConnCacheKey
typedef struct ConnCacheEntry
{
ConnCacheKey key; /* hash key (must be first) */
- PGconn *conn; /* connection to foreign server, or NULL */
+ PgFdwConn *conn; /* connection to foreign server, or NULL */
int xact_depth; /* 0 = no xact open, 1 = main xact open, 2 =
* one level of subxact open, etc */
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
@@ -64,10 +64,10 @@ static unsigned int prep_stmt_number = 0;
static bool xact_got_connection = false;
/* prototypes of private functions */
-static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
+static PgFdwConn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void check_conn_params(const char **keywords, const char **values);
-static void configure_remote_session(PGconn *conn);
-static void do_sql_command(PGconn *conn, const char *sql);
+static void configure_remote_session(PgFdwConn *conn);
+static void do_sql_command(PgFdwConn *conn, const char *sql);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -93,7 +93,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
* be useful and not mere pedantry. We could not flush any active connections
* mid-transaction anyway.
*/
-PGconn *
+PgFdwConn *
GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt)
{
@@ -161,7 +161,7 @@ GetConnection(ForeignServer *server, UserMapping *user,
entry->have_error = false;
entry->conn = connect_pg_server(server, user);
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ PFCgetPGconn(entry->conn), server->servername);
}
/*
@@ -169,6 +169,13 @@ GetConnection(ForeignServer *server, UserMapping *user,
*/
begin_remote_xact(entry);
+ /*
+ * Finish async query immediately if another foreign scan node sharing
+ * this connection comes.
+ */
+ if (PFCincrementNscans(entry->conn) > 1 && PFCisAsyncRunning(entry->conn))
+ fetch_more_data(PFCgetAsyncScan(entry->conn));
+
/* Remember if caller will prepare statements */
entry->have_prep_stmt |= will_prep_stmt;
@@ -178,10 +185,10 @@ GetConnection(ForeignServer *server, UserMapping *user,
/*
* Connect to remote server using specified server and user mapping properties.
*/
-static PGconn *
+static PgFdwConn *
connect_pg_server(ForeignServer *server, UserMapping *user)
{
- PGconn *volatile conn = NULL;
+ PgFdwConn *volatile conn = NULL;
/*
* Use PG_TRY block to ensure closing connection on error.
@@ -223,14 +230,14 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
/* verify connection parameters and make connection */
check_conn_params(keywords, values);
- conn = PQconnectdbParams(keywords, values, false);
- if (!conn || PQstatus(conn) != CONNECTION_OK)
+ conn = PFCconnectdbParams(keywords, values, false);
+ if (!conn || PFCstatus(conn) != CONNECTION_OK)
{
char *connmessage;
int msglen;
/* libpq typically appends a newline, strip that */
- connmessage = pstrdup(PQerrorMessage(conn));
+ connmessage = pstrdup(PFCerrorMessage(conn));
msglen = strlen(connmessage);
if (msglen > 0 && connmessage[msglen - 1] == '\n')
connmessage[msglen - 1] = '\0';
@@ -246,7 +253,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
* otherwise, he's piggybacking on the postgres server's user
* identity. See also dblink_security_check() in contrib/dblink.
*/
- if (!superuser() && !PQconnectionUsedPassword(conn))
+ if (!superuser() && !PFCconnectionUsedPassword(conn))
ereport(ERROR,
(errcode(ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED),
errmsg("password is required"),
@@ -263,7 +270,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
{
/* Release PGconn data structure if we managed to create one */
if (conn)
- PQfinish(conn);
+ PFCfinish(conn);
PG_RE_THROW();
}
PG_END_TRY();
@@ -312,9 +319,9 @@ check_conn_params(const char **keywords, const char **values)
* there are any number of ways to break things.
*/
static void
-configure_remote_session(PGconn *conn)
+configure_remote_session(PgFdwConn *conn)
{
- int remoteversion = PQserverVersion(conn);
+ int remoteversion = PFCserverVersion(conn);
/* Force the search path to contain only pg_catalog (see deparse.c) */
do_sql_command(conn, "SET search_path = pg_catalog");
@@ -348,11 +355,11 @@ configure_remote_session(PGconn *conn)
* Convenience subroutine to issue a non-data-returning SQL command to remote
*/
static void
-do_sql_command(PGconn *conn, const char *sql)
+do_sql_command(PgFdwConn *conn, const char *sql)
{
PGresult *res;
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -379,7 +386,7 @@ begin_remote_xact(ConnCacheEntry *entry)
const char *sql;
elog(DEBUG3, "starting remote transaction on connection %p",
- entry->conn);
+ PFCgetPGconn(entry->conn));
if (IsolationIsSerializable())
sql = "START TRANSACTION ISOLATION LEVEL SERIALIZABLE";
@@ -408,13 +415,11 @@ begin_remote_xact(ConnCacheEntry *entry)
* Release connection reference count created by calling GetConnection.
*/
void
-ReleaseConnection(PGconn *conn)
+ReleaseConnection(PgFdwConn *conn)
{
- /*
- * Currently, we don't actually track connection references because all
- * cleanup is managed on a transaction or subtransaction basis instead. So
- * there's nothing to do here.
- */
+ /* ongoing async query should be canceled if no scans left */
+ if (PFCdecrementNscans(conn) == 0)
+ finish_async_connection(conn);
}
/*
@@ -429,7 +434,7 @@ ReleaseConnection(PGconn *conn)
* collisions are highly improbable; just be sure to use %u not %d to print.
*/
unsigned int
-GetCursorNumber(PGconn *conn)
+GetCursorNumber(PgFdwConn *conn)
{
return ++cursor_number;
}
@@ -443,7 +448,7 @@ GetCursorNumber(PGconn *conn)
* increasing the risk of prepared-statement name collisions by resetting.
*/
unsigned int
-GetPrepStmtNumber(PGconn *conn)
+GetPrepStmtNumber(PgFdwConn *conn)
{
return ++prep_stmt_number;
}
@@ -462,7 +467,7 @@ GetPrepStmtNumber(PGconn *conn)
* marked with have_error = true.
*/
void
-pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql)
{
/* If requested, PGresult must be released before leaving this function. */
@@ -490,7 +495,7 @@ pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
* return NULL, not a PGresult at all.
*/
if (message_primary == NULL)
- message_primary = PQerrorMessage(conn);
+ message_primary = PFCerrorMessage(conn);
ereport(elevel,
(errcode(sqlstate),
@@ -542,7 +547,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
if (entry->xact_depth > 0)
{
elog(DEBUG3, "closing remote transaction on connection %p",
- entry->conn);
+ PFCgetPGconn(entry->conn));
switch (event)
{
@@ -567,7 +572,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
*/
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -597,7 +602,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Assume we might have lost track of prepared statements */
entry->have_error = true;
/* If we're aborting, abort all remote transactions too */
- res = PQexec(entry->conn, "ABORT TRANSACTION");
+ res = PFCexec(entry->conn, "ABORT TRANSACTION");
/* Note: can't throw ERROR, it would be infinite loop */
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true,
@@ -608,7 +613,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* As above, make sure to clear any prepared stmts */
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -620,17 +625,19 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Reset state to show we're out of a transaction */
entry->xact_depth = 0;
+ PFCcancelAsync(entry->conn);
+ PFCinit(entry->conn);
/*
* If the connection isn't in a good idle state, discard it to
* recover. Next GetConnection will open a new connection.
*/
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
+ if (PFCstatus(entry->conn) != CONNECTION_OK ||
+ PFCtransactionStatus(entry->conn) != PQTRANS_IDLE)
{
- elog(DEBUG3, "discarding connection %p", entry->conn);
- PQfinish(entry->conn);
- entry->conn = NULL;
+ elog(DEBUG3, "discarding connection %p",
+ PFCgetPGconn(entry->conn));
+ PFCfinish(entry->conn);
}
}
@@ -676,6 +683,9 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
PGresult *res;
char sql[100];
+ /* Shut down asynchronous scan if running */
+ PFCcancelAsync(entry->conn);
+
/*
* We only care about connections with open remote subtransactions of
* the current level.
@@ -701,7 +711,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
snprintf(sql, sizeof(sql),
"ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
curlevel, curlevel);
- res = PQexec(entry->conn, sql);
+ res = PFCexec(entry->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true, sql);
else
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d76e739..08d9ca6 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -136,7 +136,7 @@ typedef struct PgFdwScanState
List *retrieved_attrs; /* list of retrieved attribute numbers */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
unsigned int cursor_number; /* quasi-unique ID for my cursor */
bool cursor_exists; /* have we created the cursor? */
int numParams; /* number of parameters passed to query */
@@ -156,6 +156,7 @@ typedef struct PgFdwScanState
/* working memory contexts */
MemoryContext batch_cxt; /* context holding current batch of tuples */
MemoryContext temp_cxt; /* context for per-tuple temporary data */
+ ExprContext *econtext; /* copy of ps_ExprContext of ForeignScanState */
} PgFdwScanState;
/*
@@ -167,7 +168,7 @@ typedef struct PgFdwModifyState
AttInMetadata *attinmeta; /* attribute datatype conversion metadata */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
char *p_name; /* name of prepared statement, if created */
/* extracted fdw_private data */
@@ -298,7 +299,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
double *p_rows, int *p_width,
Cost *p_startup_cost, Cost *p_total_cost);
static void get_remote_estimate(const char *sql,
- PGconn *conn,
+ PgFdwConn *conn,
double *rows,
int *width,
Cost *startup_cost,
@@ -306,9 +307,8 @@ static void get_remote_estimate(const char *sql,
static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
EquivalenceClass *ec, EquivalenceMember *em,
void *arg);
-static void create_cursor(ForeignScanState *node);
-static void fetch_more_data(ForeignScanState *node);
-static void close_cursor(PGconn *conn, unsigned int cursor_number);
+static void create_cursor(PgFdwScanState *node);
+static void close_cursor(PgFdwConn *conn, unsigned int cursor_number);
static void prepare_foreign_modify(PgFdwModifyState *fmstate);
static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
ItemPointer tupleid,
@@ -329,7 +329,6 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
MemoryContext temp_context);
static void conversion_error_callback(void *arg);
-
/*
* Foreign-data wrapper handler function: return a struct with pointers
* to my callback routines.
@@ -982,6 +981,15 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
fsstate->param_values = (const char **) palloc0(numParams * sizeof(char *));
else
fsstate->param_values = NULL;
+
+ fsstate->econtext = node->ss.ps.ps_ExprContext;
+
+ /*
+ * Start scanning asynchronously if it is the first scan on this
+ * connection.
+ */
+ if (PFCgetNscans(fsstate->conn) == 1)
+ create_cursor(fsstate);
}
/*
@@ -1000,7 +1008,7 @@ postgresIterateForeignScan(ForeignScanState *node)
* cursor on the remote side.
*/
if (!fsstate->cursor_exists)
- create_cursor(node);
+ create_cursor(fsstate);
/*
* Get some more tuples, if we've run out.
@@ -1009,7 +1017,7 @@ postgresIterateForeignScan(ForeignScanState *node)
{
/* No point in another fetch if we already detected EOF, though. */
if (!fsstate->eof_reached)
- fetch_more_data(node);
+ fetch_more_data(fsstate);
/* If we didn't get any tuples, must be end of data. */
if (fsstate->next_tuple >= fsstate->num_tuples)
return ExecClearTuple(slot);
@@ -1069,7 +1077,7 @@ postgresReScanForeignScan(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fsstate->conn, sql);
+ res = PFCexec(fsstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fsstate->conn, true, sql);
PQclear(res);
@@ -1398,13 +1406,13 @@ postgresExecForeignInsert(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1468,13 +1476,13 @@ postgresExecForeignUpdate(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1538,13 +1546,13 @@ postgresExecForeignDelete(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1594,7 +1602,7 @@ postgresEndForeignModify(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fmstate->conn, sql);
+ res = PFCexec(fmstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, sql);
PQclear(res);
@@ -1726,7 +1734,7 @@ estimate_path_cost_size(PlannerInfo *root,
List *local_join_conds;
StringInfoData sql;
List *retrieved_attrs;
- PGconn *conn;
+ PgFdwConn *conn;
Selectivity local_sel;
QualCost local_cost;
@@ -1836,7 +1844,7 @@ estimate_path_cost_size(PlannerInfo *root,
* The given "sql" must be an EXPLAIN command.
*/
static void
-get_remote_estimate(const char *sql, PGconn *conn,
+get_remote_estimate(const char *sql, PgFdwConn *conn,
double *rows, int *width,
Cost *startup_cost, Cost *total_cost)
{
@@ -1852,7 +1860,7 @@ get_remote_estimate(const char *sql, PGconn *conn,
/*
* Execute EXPLAIN remotely.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql);
@@ -1917,13 +1925,12 @@ ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
* Create cursor for node's query with current parameter values.
*/
static void
-create_cursor(ForeignScanState *node)
+create_cursor(PgFdwScanState *fsstate)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
- ExprContext *econtext = node->ss.ps.ps_ExprContext;
+ ExprContext *econtext = fsstate->econtext;
int numParams = fsstate->numParams;
const char **values = fsstate->param_values;
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
StringInfoData buf;
PGresult *res;
@@ -1985,8 +1992,8 @@ create_cursor(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecParams(conn, buf.data, numParams, NULL, values,
- NULL, NULL, 0);
+ res = PFCexecParams(conn, buf.data, numParams, NULL, values,
+ NULL, NULL, 0);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, fsstate->query);
PQclear(res);
@@ -2001,15 +2008,21 @@ create_cursor(ForeignScanState *node)
/* Clean up */
pfree(buf.data);
+
+ /*
+ * Start async scan if this is the first scan. See fetch_more_data() for
+ * details
+ */
+ if (PFCgetNscans(conn) == 1)
+ fetch_more_data(fsstate);
}
/*
* Fetch some more rows from the node's cursor.
*/
-static void
-fetch_more_data(ForeignScanState *node)
+void
+fetch_more_data(PgFdwScanState *fsstate)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
PGresult *volatile res = NULL;
MemoryContext oldcontext;
@@ -2024,7 +2037,7 @@ fetch_more_data(ForeignScanState *node)
/* PGresult must be released before leaving this function. */
PG_TRY();
{
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
char sql[64];
int fetch_size;
int numrows;
@@ -2036,9 +2049,57 @@ fetch_more_data(ForeignScanState *node)
snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
fetch_size, fsstate->cursor_number);
- res = PQexec(conn, sql);
+ if (PFCisAsyncRunning(conn))
+ {
+ /* Get result of running async fetch */
+ res = PFCgetResult(conn);
+ if (PQntuples(res) == fetch_size)
+ {
+ /*
+ * Connection state doesn't go to IDLE even if all data
+ * has been sent to client for asynchronous query. One
+ * more PQgetResult() is needed to reset the state to
+ * IDLE. See PQexecFinish() for details.
+ */
+ if (PFCgetResult(conn) != NULL)
+ elog(ERROR, "Connection status error.");
+ }
+
+ /*
+ * In the current postgres_fdw implementation, multiple PgFdwScanStates
+ * on the same foreign server and mapped user share the same
+ * connection to the remote server (see GetConnection() in
+ * connection.c), and individual scans on it are separated using
+ * cursors. Since one connection cannot accept two or more
+ * asynchronous queries simultaneously, we should stop async
+ * fetching if another scan arrives.
+ */
+ if (PFCgetNscans(conn) > 1)
+ PFCsetAsyncScan(conn, NULL);
+ }
+ else
+ {
+ /*
+ * If no async scan is running and the number of scans running on
+ * this connection is 1, start async fetch.
+ */
+ if (PFCgetNscans(conn) == 1)
+ {
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false,
+ fsstate->query);
+
+ PFCsetAsyncScan(conn, fsstate);
+ goto end_of_fetch;
+ }
+
+ /* Otherwise do synchronous query execution */
+ PFCsetAsyncScan(conn, NULL);
+ res = PFCexec(conn, sql);
+ }
+
/* On error, report the original query, not the FETCH. */
- if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ if (res && PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
/* Convert the data into HeapTuples */
@@ -2066,6 +2127,26 @@ fetch_more_data(ForeignScanState *node)
PQclear(res);
res = NULL;
+
+ if (PFCisAsyncRunning(conn))
+ {
+ if (!fsstate->eof_reached)
+ {
+ /*
+ * We can immediately request the next bunch of tuples if
+ * we're on asynchronous connection.
+ */
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
+ }
+ else
+ {
+ PFCsetAsyncScan(conn, NULL);
+ }
+ }
+
+end_of_fetch:
+ ; /* Nothing to do here but needed to make compiler quiet. */
}
PG_CATCH();
{
@@ -2079,6 +2160,31 @@ fetch_more_data(ForeignScanState *node)
}
/*
+ * Force cancelling async command state.
+ */
+void
+finish_async_connection(PgFdwConn *conn)
+{
+ PgFdwScanState *fsstate = PFCgetAsyncScan(conn);
+ PgFdwConn *async_conn;
+
+ /* Nothing to do if no async connection */
+ if (fsstate == NULL) return;
+ async_conn = fsstate->conn;
+ Assert(async_conn && PFCgetNscans(async_conn) != 1);
+
+ /* Finish async command if any */
+ if (PFCisAsyncRunning(async_conn))
+ fetch_more_data(PFCgetAsyncScan(async_conn));
+
+ Assert(!PFCisAsyncRunning(async_conn));
+
+ /* Immediately discard the result */
+ fsstate->next_tuple = 0;
+ fsstate->num_tuples = 0;
+}
+
+/*
* Force assorted GUC parameters to settings that ensure that we'll output
* data values in a form that is unambiguous to the remote server.
*
@@ -2132,7 +2238,7 @@ reset_transmission_modes(int nestlevel)
* Utility routine to close a cursor.
*/
static void
-close_cursor(PGconn *conn, unsigned int cursor_number)
+close_cursor(PgFdwConn *conn, unsigned int cursor_number)
{
char sql[64];
PGresult *res;
@@ -2143,7 +2249,7 @@ close_cursor(PGconn *conn, unsigned int cursor_number)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -2175,11 +2281,11 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQprepare(fmstate->conn,
- p_name,
- fmstate->query,
- 0,
- NULL);
+ res = PFCprepare(fmstate->conn,
+ p_name,
+ fmstate->query,
+ 0,
+ NULL);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -2297,7 +2403,7 @@ postgresAnalyzeForeignTable(Relation relation,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2329,7 +2435,7 @@ postgresAnalyzeForeignTable(Relation relation,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2379,7 +2485,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
unsigned int cursor_number;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2423,7 +2529,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
PQclear(res);
@@ -2453,7 +2559,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
snprintf(fetch_sql, sizeof(fetch_sql), "FETCH %d FROM c%u",
fetch_size, cursor_number);
- res = PQexec(conn, fetch_sql);
+ res = PFCexec(conn, fetch_sql);
/* On error, report the original query, not the FETCH. */
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2582,7 +2688,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
bool import_not_null = true;
ForeignServer *server;
UserMapping *mapping;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData buf;
PGresult *volatile res = NULL;
int numrows,
@@ -2615,7 +2721,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
conn = GetConnection(server, mapping, false);
/* Don't attempt to import collation if remote server hasn't got it */
- if (PQserverVersion(conn) < 90100)
+ if (PFCserverVersion(conn) < 90100)
import_collate = false;
/* Create workspace for strings */
@@ -2628,7 +2734,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfoString(&buf, "SELECT 1 FROM pg_catalog.pg_namespace WHERE nspname = ");
deparseStringLiteral(&buf, stmt->remote_schema);
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
@@ -2723,7 +2829,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfo(&buf, " ORDER BY c.relname, a.attnum");
/* Fetch the data */
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..cac7dfc 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -18,19 +18,23 @@
#include "nodes/relation.h"
#include "utils/relcache.h"
-#include "libpq-fe.h"
+#include "PgFdwConn.h"
+
+struct PgFdwScanState;
/* in postgres_fdw.c */
extern int set_transmission_modes(void);
extern void reset_transmission_modes(int nestlevel);
+extern void fetch_more_data(struct PgFdwScanState *node);
+extern void finish_async_connection(PgFdwConn *fsstate);
/* in connection.c */
-extern PGconn *GetConnection(ForeignServer *server, UserMapping *user,
+extern PgFdwConn *GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt);
-extern void ReleaseConnection(PGconn *conn);
-extern unsigned int GetCursorNumber(PGconn *conn);
-extern unsigned int GetPrepStmtNumber(PGconn *conn);
-extern void pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+extern void ReleaseConnection(PgFdwConn *conn);
+extern unsigned int GetCursorNumber(PgFdwConn *conn);
+extern unsigned int GetPrepStmtNumber(PgFdwConn *conn);
+extern void pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql);
/* in option.c */
--
2.1.0.GIT
Hello,
- The issue of PREPARE is not addressed yet.
...
In case of Prepared statements, ExecInit is called at the end of planning,
without subsequent execution like the case of EXPLAIN. I see that the patch
handles EXPLAIN well, but I didn't see any specific code for PREPARE.
I'll look into the case after this, but I'd like to send a
revised patch at this point.
Mmm.. CreateExecutorState() appears to be called when evaluating
expressions in predicates, clauses, or EXECUTE parameters. All
of these run the SQL execution, if any, to completion. And I
couldn't reproduce the situation you mentioned.
Could you give me an example or illustration about such a
situation where ExecInit alone is called without
IterateForeignScan?
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jan 13, 2015 at 6:46 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Is it possible to use the parallel query infrastructure being built by
Robert or to do something like parallel seq scan? That will work, not just
for Postgres FDW but all the FDWs.
But, I think, from the performance view, every scan of multiple
foreign scans doesn't need a corresponding local process.
Quite so. I think this is largely a separate project.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jan 13, 2015 at 8:46 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
I'll look into the case after this, but I'd like to send a
revised patch at this point.
Hm. Seems like this patch is not completely baked yet. Horiguchi-san,
as you are obviously still working on it, would you agree to move it
to the next CF?
--
Michael
Hello,
I'll look into the case after this, but I'd like to send a
revised patch at this point.
Hm. Seems like this patch is not completely baked yet. Horiguchi-san,
as you are obviously still working on it, would you agree to move it
to the next CF?
Yes, that's fine with me. Thank you.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
I revised the patch so that async scans are done more
aggressively, and measured execution time for two very simple cases.
As a result, a simple seq scan gained 5% and a hash join of two
foreign tables gained about 150% (2.4 times faster).
While measuring the performance, I noticed that in many cases, such
as hash joins or sorted joins, each scan in a query runs all at
once rather than alternating with the others. So I
modified the patch so that async fetch is done more
aggressively. The new v4 patch is attached. The following numbers
are based on it.
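To illustrate why prefetching the next batch can help at all, here is a toy cost model, not a measurement of the patch. It assumes every batch costs a fixed remote_ms to fetch and a fixed local_ms to process; the function names and parameters are hypothetical and exist only for this sketch.

```c
#include <assert.h>

/*
 * Toy model only: each batch takes remote_ms to arrive from the remote
 * server and local_ms to process locally. Names are hypothetical.
 */

/* Synchronous fetching: fetch and process strictly alternate. */
double
sync_total_ms(int nbatches, double remote_ms, double local_ms)
{
    assert(nbatches >= 0);
    return nbatches * (remote_ms + local_ms);
}

/*
 * Asynchronous prefetch: the request for batch i+1 is sent as soon as
 * batch i has arrived, so fetching overlaps with local processing.
 * The first fetch and the last processing step cannot overlap anything.
 */
double
async_total_ms(int nbatches, double remote_ms, double local_ms)
{
    double slower = (remote_ms > local_ms) ? remote_ms : local_ms;

    assert(nbatches >= 0);
    if (nbatches == 0)
        return 0.0;
    return remote_ms + (nbatches - 1) * slower + local_ms;
}
```

In this model a single batch gains nothing, and the gain approaches the smaller of the two per-batch costs as the number of batches grows. Real behavior depends on network and server effects that the model ignores.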
========
Simple seq scan for the first test.
CREATE TABLE lt1 (a int, b timestamp, c text);
CREATE SERVER sv1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost');
CREATE USER MAPPING FOR PUBLIC SERVER sv1;
CREATE FOREIGN TABLE ft1 () SERVER sv1 OPTIONS (table_name 'lt1');
INSERT INTO lt1 (SELECT a, now(), repeat('x', 128) FROM generate_series(0, 999999) a);
For this case, I took the average execution time over 10 runs of the
following query, for both master head and the patched version. The
fetch size is 100.
postgres=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT * FROM ft1;
QUERY PLAN
------------------------------------------------------------------
Foreign Scan on ft1 (actual time=0.795..4175.706 rows=1000000 loops=1)
Planning time: 0.060 ms
Execution time: 4276.043 ms
master head : avg = 4256.621, std dev = 17.099
patched pgfdw: avg = 4036.463, std dev = 2.608
The patched version is faster by about 5%. This should be purely the
result of asynchronous fetching, not including the effect of
starting remote execution early in ExecInit.
Interestingly, as fetch_count gets larger, the gain rises even
though the number of queries sent decreases.
master head : avg = 2622.759, std dev = 38.379
patched pgfdw: avg = 2277.622, std dev = 27.269
About 15% gain. And for 10000,
master head : avg = 2000.980, std dev = 6.434
patched pgfdw: avg = 1616.793, std dev = 13.192
19%. It is natural that execution time drops as fetch size
increases, but I haven't found the reason why the patch's gain
also increases.
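For reference, a "gain" between two averages can be read either as the share of run time saved or as a speedup factor. The sketch below shows both calculations; the helper names are hypothetical and only the averages quoted above are taken from the measurements.

```c
#include <assert.h>

/* Share of run time saved, in percent: (before - after) / before * 100. */
double
time_reduction_pct(double before_ms, double after_ms)
{
    assert(before_ms > 0.0);
    return (before_ms - after_ms) / before_ms * 100.0;
}

/* Speedup factor: how many times faster the "after" run is. */
double
speedup_factor(double before_ms, double after_ms)
{
    assert(after_ms > 0.0);
    return before_ms / after_ms;
}
```

With the averages above, time_reduction_pct(4256.621, 4036.463) is about 5.2 and time_reduction_pct(2000.980, 1616.793) is about 19.2. For the middle pair, the time reduction is about 13.2 while speedup_factor(2622.759, 2277.622) is about 1.15, so the quoted 15% seems to correspond to the speedup reading.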
======================
The second case is a simple join of two foreign tables sharing
one connection.
The master head runs this query in about 16 seconds with almost
no fluctuation among multiple tries.
=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT x.a, x.c, y.c
FROM ft1 AS x JOIN ft1 AS y on x.a = y.a;
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (actual time=7541.831..15924.631 rows=1000000 loops=1)
Hash Cond: (x.a = y.a)
-> Foreign Scan on ft1 x (actual time=1.176..6553.480 rows=1000000 loops=1)
-> Hash (actual time=7539.761..7539.761 rows=1000000 loops=1)
Buckets: 32768 Batches: 64 Memory Usage: 2829kB
-> Foreign Scan on ft1 y (actual time=1.067..6529.165 rows=1000000 loops=1)
Planning time: 0.223 ms
Execution time: 15973.916 ms
But the v4 patch mysteriously accelerates this query to 6.5 seconds.
=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT x.a, x.c, y.c
FROM ft1 AS x JOIN ft1 AS y on x.a = y.a;
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (actual time=2556.977..5812.937 rows=1000000 loops=1)
Hash Cond: (x.a = y.a)
-> Foreign Scan on ft1 x (actual time=32.689..1936.565 rows=1000000 loops=1)
-> Hash (actual time=2523.810..2523.810 rows=1000000 loops=1)
Buckets: 32768 Batches: 64 Memory Usage: 2829kB
-> Foreign Scan on ft1 y (actual time=50.345..1928.411 rows=1000000 loops=1)
Planning time: 0.220 ms
Execution time: 6512.043 ms
The result data does not seem to be broken. I don't know the reason
yet, but I'll investigate it.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Asynchronous-execution-of-postgres_fdw-v4.patch (text/x-patch; charset=us-ascii)
From edba0530fb6a9c5a4e6def055757d6d60bce9171 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Tue, 13 Jan 2015 19:20:35 +0900
Subject: [PATCH] Asynchronous execution of postgres_fdw v4
This is the modified version of Asynchronous execution of
postgres_fdw.
- Do async fetch more aggressively than v3.
- No additional tests yet :(
---
contrib/postgres_fdw/Makefile | 2 +-
contrib/postgres_fdw/PgFdwConn.c | 200 +++++++++++++++++++++++++
contrib/postgres_fdw/PgFdwConn.h | 61 ++++++++
contrib/postgres_fdw/connection.c | 82 ++++++-----
contrib/postgres_fdw/postgres_fdw.c | 283 +++++++++++++++++++++++++++---------
contrib/postgres_fdw/postgres_fdw.h | 15 +-
6 files changed, 527 insertions(+), 116 deletions(-)
create mode 100644 contrib/postgres_fdw/PgFdwConn.c
create mode 100644 contrib/postgres_fdw/PgFdwConn.h
diff --git a/contrib/postgres_fdw/Makefile b/contrib/postgres_fdw/Makefile
index d2b98e1..d0913e2 100644
--- a/contrib/postgres_fdw/Makefile
+++ b/contrib/postgres_fdw/Makefile
@@ -1,7 +1,7 @@
# contrib/postgres_fdw/Makefile
MODULE_big = postgres_fdw
-OBJS = postgres_fdw.o option.o deparse.o connection.o $(WIN32RES)
+OBJS = postgres_fdw.o PgFdwConn.o option.o deparse.o connection.o $(WIN32RES)
PGFILEDESC = "postgres_fdw - foreign data wrapper for PostgreSQL"
PG_CPPFLAGS = -I$(libpq_srcdir)
diff --git a/contrib/postgres_fdw/PgFdwConn.c b/contrib/postgres_fdw/PgFdwConn.c
new file mode 100644
index 0000000..b13b597
--- /dev/null
+++ b/contrib/postgres_fdw/PgFdwConn.c
@@ -0,0 +1,200 @@
+/*-------------------------------------------------------------------------
+ *
+ * PgFdwConn.c
+ * PGconn extending wrapper to enable asynchronous query.
+ *
+ * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/postgres_fdw/PgFdwConn.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "PgFdwConn.h"
+
+#define PFC_ALLOCATE() ((PgFdwConn *)malloc(sizeof(PgFdwConn)))
+#define PFC_FREE(c) free(c)
+
+struct pgfdw_conn
+{
+ PGconn *pgconn; /* libpq connection for this connection */
+ int nscans; /* number of scans using this connection */
+ struct PgFdwScanState *async_scan; /* the scan currently running an
+ * async query on this connection */
+};
+
+void
+PFCsetAsyncScan(PgFdwConn *conn, struct PgFdwScanState *scan)
+{
+ conn->async_scan = scan;
+}
+
+struct PgFdwScanState *
+PFCgetAsyncScan(PgFdwConn *conn)
+{
+ return conn->async_scan;
+}
+
+int
+PFCisAsyncRunning(PgFdwConn *conn)
+{
+ return conn->async_scan != NULL;
+}
+
+PGconn *
+PFCgetPGconn(PgFdwConn *conn)
+{
+ return conn->pgconn;
+}
+
+int
+PFCgetNscans(PgFdwConn *conn)
+{
+ return conn->nscans;
+}
+
+int
+PFCincrementNscans(PgFdwConn *conn)
+{
+ return ++conn->nscans;
+}
+
+int
+PFCdecrementNscans(PgFdwConn *conn)
+{
+ Assert(conn->nscans > 0);
+ return --conn->nscans;
+}
+
+void
+PFCcancelAsync(PgFdwConn *conn)
+{
+ if (PFCisAsyncRunning(conn))
+ PFCconsumeInput(conn);
+}
+
+void
+PFCinit(PgFdwConn *conn)
+{
+ conn->async_scan = NULL;
+ conn->nscans = 0;
+}
+
+int
+PFCsendQuery(PgFdwConn *conn, const char *query)
+{
+ return PQsendQuery(conn->pgconn, query);
+}
+
+PGresult *
+PFCexec(PgFdwConn *conn, const char *query)
+{
+ return PQexec(conn->pgconn, query);
+}
+
+PGresult *
+PFCexecParams(PgFdwConn *conn,
+ const char *command,
+ int nParams,
+ const Oid *paramTypes,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat)
+{
+ return PQexecParams(conn->pgconn,
+ command, nParams, paramTypes, paramValues,
+ paramLengths, paramFormats, resultFormat);
+}
+
+PGresult *
+PFCprepare(PgFdwConn *conn,
+ const char *stmtName, const char *query,
+ int nParams, const Oid *paramTypes)
+{
+ return PQprepare(conn->pgconn, stmtName, query, nParams, paramTypes);
+}
+
+PGresult *
+PFCexecPrepared(PgFdwConn *conn,
+ const char *stmtName,
+ int nParams,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat)
+{
+ return PQexecPrepared(conn->pgconn,
+ stmtName, nParams, paramValues, paramLengths,
+ paramFormats, resultFormat);
+}
+
+PGresult *
+PFCgetResult(PgFdwConn *conn)
+{
+ return PQgetResult(conn->pgconn);
+}
+
+int
+PFCconsumeInput(PgFdwConn *conn)
+{
+ return PQconsumeInput(conn->pgconn);
+}
+
+int
+PFCisBusy(PgFdwConn *conn)
+{
+ return PQisBusy(conn->pgconn);
+}
+
+ConnStatusType
+PFCstatus(const PgFdwConn *conn)
+{
+ return PQstatus(conn->pgconn);
+}
+
+PGTransactionStatusType
+PFCtransactionStatus(const PgFdwConn *conn)
+{
+ return PQtransactionStatus(conn->pgconn);
+}
+
+int
+PFCserverVersion(const PgFdwConn *conn)
+{
+ return PQserverVersion(conn->pgconn);
+}
+
+char *
+PFCerrorMessage(const PgFdwConn *conn)
+{
+ return PQerrorMessage(conn->pgconn);
+}
+
+int
+PFCconnectionUsedPassword(const PgFdwConn *conn)
+{
+ return PQconnectionUsedPassword(conn->pgconn);
+}
+
+void
+PFCfinish(PgFdwConn *conn)
+{
+ PQfinish(conn->pgconn);
+ PFC_FREE(conn);
+}
+
+PgFdwConn *
+PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname)
+{
+ PgFdwConn *ret = PFC_ALLOCATE();
+
+ PFCinit(ret);
+ ret->pgconn = PQconnectdbParams(keywords, values, expand_dbname);
+
+ return ret;
+}
diff --git a/contrib/postgres_fdw/PgFdwConn.h b/contrib/postgres_fdw/PgFdwConn.h
new file mode 100644
index 0000000..f695f5a
--- /dev/null
+++ b/contrib/postgres_fdw/PgFdwConn.h
@@ -0,0 +1,61 @@
+/*-------------------------------------------------------------------------
+ *
+ * PgFdwConn.h
+ * PGconn extending wrapper to enable asynchronous query.
+ *
+ * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/postgres_fdw/PgFdwConn.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PGFDWCONN_H
+#define PGFDWCONN_H
+
+#include "libpq-fe.h"
+
+typedef struct pgfdw_conn PgFdwConn;
+struct PgFdwScanState;
+
+extern void PFCsetAsyncScan(PgFdwConn *conn, struct PgFdwScanState *scan);
+extern struct PgFdwScanState *PFCgetAsyncScan(PgFdwConn *conn);
+extern int PFCisAsyncRunning(PgFdwConn *conn);
+extern PGconn *PFCgetPGconn(PgFdwConn *conn);
+extern int PFCgetNscans(PgFdwConn *conn);
+extern int PFCincrementNscans(PgFdwConn *conn);
+extern int PFCdecrementNscans(PgFdwConn *conn);
+extern void PFCcancelAsync(PgFdwConn *conn);
+extern void PFCinit(PgFdwConn *conn);
+extern int PFCsendQuery(PgFdwConn *conn, const char *query);
+extern PGresult *PFCexec(PgFdwConn *conn, const char *query);
+extern PGresult *PFCexecParams(PgFdwConn *conn,
+ const char *command,
+ int nParams,
+ const Oid *paramTypes,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat);
+extern PGresult *PFCprepare(PgFdwConn *conn,
+ const char *stmtName, const char *query,
+ int nParams, const Oid *paramTypes);
+extern PGresult *PFCexecPrepared(PgFdwConn *conn,
+ const char *stmtName,
+ int nParams,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat);
+extern PGresult *PFCgetResult(PgFdwConn *conn);
+extern int PFCconsumeInput(PgFdwConn *conn);
+extern int PFCisBusy(PgFdwConn *conn);
+extern ConnStatusType PFCstatus(const PgFdwConn *conn);
+extern PGTransactionStatusType PFCtransactionStatus(const PgFdwConn *conn);
+extern int PFCserverVersion(const PgFdwConn *conn);
+extern char *PFCerrorMessage(const PgFdwConn *conn);
+extern int PFCconnectionUsedPassword(const PgFdwConn *conn);
+extern void PFCfinish(PgFdwConn *conn);
+extern PgFdwConn *PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname);
+#endif /* PGFDWCONN_H */
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4e02cb2..2517f6b 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -44,7 +44,7 @@ typedef struct ConnCacheKey
typedef struct ConnCacheEntry
{
ConnCacheKey key; /* hash key (must be first) */
- PGconn *conn; /* connection to foreign server, or NULL */
+ PgFdwConn *conn; /* connection to foreign server, or NULL */
int xact_depth; /* 0 = no xact open, 1 = main xact open, 2 =
* one level of subxact open, etc */
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
@@ -64,10 +64,10 @@ static unsigned int prep_stmt_number = 0;
static bool xact_got_connection = false;
/* prototypes of private functions */
-static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
+static PgFdwConn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void check_conn_params(const char **keywords, const char **values);
-static void configure_remote_session(PGconn *conn);
-static void do_sql_command(PGconn *conn, const char *sql);
+static void configure_remote_session(PgFdwConn *conn);
+static void do_sql_command(PgFdwConn *conn, const char *sql);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -93,7 +93,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
* be useful and not mere pedantry. We could not flush any active connections
* mid-transaction anyway.
*/
-PGconn *
+PgFdwConn *
GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt)
{
@@ -161,9 +161,12 @@ GetConnection(ForeignServer *server, UserMapping *user,
entry->have_error = false;
entry->conn = connect_pg_server(server, user);
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ PFCgetPGconn(entry->conn), server->servername);
+
}
+ PFCincrementNscans(entry->conn);
+
/*
* Start a new transaction or subtransaction if needed.
*/
@@ -178,10 +181,10 @@ GetConnection(ForeignServer *server, UserMapping *user,
/*
* Connect to remote server using specified server and user mapping properties.
*/
-static PGconn *
+static PgFdwConn *
connect_pg_server(ForeignServer *server, UserMapping *user)
{
- PGconn *volatile conn = NULL;
+ PgFdwConn *volatile conn = NULL;
/*
* Use PG_TRY block to ensure closing connection on error.
@@ -223,14 +226,14 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
/* verify connection parameters and make connection */
check_conn_params(keywords, values);
- conn = PQconnectdbParams(keywords, values, false);
- if (!conn || PQstatus(conn) != CONNECTION_OK)
+ conn = PFCconnectdbParams(keywords, values, false);
+ if (!conn || PFCstatus(conn) != CONNECTION_OK)
{
char *connmessage;
int msglen;
/* libpq typically appends a newline, strip that */
- connmessage = pstrdup(PQerrorMessage(conn));
+ connmessage = pstrdup(PFCerrorMessage(conn));
msglen = strlen(connmessage);
if (msglen > 0 && connmessage[msglen - 1] == '\n')
connmessage[msglen - 1] = '\0';
@@ -246,7 +249,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
* otherwise, he's piggybacking on the postgres server's user
* identity. See also dblink_security_check() in contrib/dblink.
*/
- if (!superuser() && !PQconnectionUsedPassword(conn))
+ if (!superuser() && !PFCconnectionUsedPassword(conn))
ereport(ERROR,
(errcode(ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED),
errmsg("password is required"),
@@ -263,7 +266,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
{
/* Release PGconn data structure if we managed to create one */
if (conn)
- PQfinish(conn);
+ PFCfinish(conn);
PG_RE_THROW();
}
PG_END_TRY();
@@ -312,9 +315,9 @@ check_conn_params(const char **keywords, const char **values)
* there are any number of ways to break things.
*/
static void
-configure_remote_session(PGconn *conn)
+configure_remote_session(PgFdwConn *conn)
{
- int remoteversion = PQserverVersion(conn);
+ int remoteversion = PFCserverVersion(conn);
/* Force the search path to contain only pg_catalog (see deparse.c) */
do_sql_command(conn, "SET search_path = pg_catalog");
@@ -348,11 +351,11 @@ configure_remote_session(PGconn *conn)
* Convenience subroutine to issue a non-data-returning SQL command to remote
*/
static void
-do_sql_command(PGconn *conn, const char *sql)
+do_sql_command(PgFdwConn *conn, const char *sql)
{
PGresult *res;
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -379,7 +382,7 @@ begin_remote_xact(ConnCacheEntry *entry)
const char *sql;
elog(DEBUG3, "starting remote transaction on connection %p",
- entry->conn);
+ PFCgetPGconn(entry->conn));
if (IsolationIsSerializable())
sql = "START TRANSACTION ISOLATION LEVEL SERIALIZABLE";
@@ -408,13 +411,11 @@ begin_remote_xact(ConnCacheEntry *entry)
* Release connection reference count created by calling GetConnection.
*/
void
-ReleaseConnection(PGconn *conn)
+ReleaseConnection(PgFdwConn *conn)
{
- /*
- * Currently, we don't actually track connection references because all
- * cleanup is managed on a transaction or subtransaction basis instead. So
- * there's nothing to do here.
- */
+ /* ongoing async query should be canceled if no scans left */
+ if (PFCdecrementNscans(conn) == 0)
+ finish_async_query(conn);
}
/*
@@ -429,7 +430,7 @@ ReleaseConnection(PGconn *conn)
* collisions are highly improbable; just be sure to use %u not %d to print.
*/
unsigned int
-GetCursorNumber(PGconn *conn)
+GetCursorNumber(PgFdwConn *conn)
{
return ++cursor_number;
}
@@ -443,7 +444,7 @@ GetCursorNumber(PGconn *conn)
* increasing the risk of prepared-statement name collisions by resetting.
*/
unsigned int
-GetPrepStmtNumber(PGconn *conn)
+GetPrepStmtNumber(PgFdwConn *conn)
{
return ++prep_stmt_number;
}
@@ -462,7 +463,7 @@ GetPrepStmtNumber(PGconn *conn)
* marked with have_error = true.
*/
void
-pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql)
{
/* If requested, PGresult must be released before leaving this function. */
@@ -490,7 +491,7 @@ pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
* return NULL, not a PGresult at all.
*/
if (message_primary == NULL)
- message_primary = PQerrorMessage(conn);
+ message_primary = PFCerrorMessage(conn);
ereport(elevel,
(errcode(sqlstate),
@@ -542,7 +543,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
if (entry->xact_depth > 0)
{
elog(DEBUG3, "closing remote transaction on connection %p",
- entry->conn);
+ PFCgetPGconn(entry->conn));
switch (event)
{
@@ -567,7 +568,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
*/
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -597,7 +598,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Assume we might have lost track of prepared statements */
entry->have_error = true;
/* If we're aborting, abort all remote transactions too */
- res = PQexec(entry->conn, "ABORT TRANSACTION");
+ res = PFCexec(entry->conn, "ABORT TRANSACTION");
/* Note: can't throw ERROR, it would be infinite loop */
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true,
@@ -608,7 +609,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* As above, make sure to clear any prepared stmts */
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -620,17 +621,19 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Reset state to show we're out of a transaction */
entry->xact_depth = 0;
+ PFCcancelAsync(entry->conn);
+ PFCinit(entry->conn);
/*
* If the connection isn't in a good idle state, discard it to
* recover. Next GetConnection will open a new connection.
*/
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
+ if (PFCstatus(entry->conn) != CONNECTION_OK ||
+ PFCtransactionStatus(entry->conn) != PQTRANS_IDLE)
{
- elog(DEBUG3, "discarding connection %p", entry->conn);
- PQfinish(entry->conn);
- entry->conn = NULL;
+ elog(DEBUG3, "discarding connection %p",
+ PFCgetPGconn(entry->conn));
+ PFCfinish(entry->conn);
}
}
@@ -676,6 +679,9 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
PGresult *res;
char sql[100];
+ /* Shut down asynchronous scan if running */
+ PFCcancelAsync(entry->conn);
+
/*
* We only care about connections with open remote subtransactions of
* the current level.
@@ -701,7 +707,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
snprintf(sql, sizeof(sql),
"ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
curlevel, curlevel);
- res = PQexec(entry->conn, sql);
+ res = PFCexec(entry->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true, sql);
else
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d76e739..1dfb221 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -123,6 +123,12 @@ enum FdwModifyPrivateIndex
FdwModifyPrivateRetrievedAttrs
};
+typedef enum fetch_mode {
+ START_ONLY,
+ FORCE_SYNC,
+ ALLOW_ASYNC
+} fetch_mode;
+
/*
* Execution state of a foreign scan using postgres_fdw.
*/
@@ -136,7 +142,7 @@ typedef struct PgFdwScanState
List *retrieved_attrs; /* list of retrieved attribute numbers */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
unsigned int cursor_number; /* quasi-unique ID for my cursor */
bool cursor_exists; /* have we created the cursor? */
int numParams; /* number of parameters passed to query */
@@ -156,6 +162,7 @@ typedef struct PgFdwScanState
/* working memory contexts */
MemoryContext batch_cxt; /* context holding current batch of tuples */
MemoryContext temp_cxt; /* context for per-tuple temporary data */
+ ExprContext *econtext; /* copy of ps_ExprContext of ForeignScanState */
} PgFdwScanState;
/*
@@ -167,7 +174,7 @@ typedef struct PgFdwModifyState
AttInMetadata *attinmeta; /* attribute datatype conversion metadata */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
char *p_name; /* name of prepared statement, if created */
/* extracted fdw_private data */
@@ -298,7 +305,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
double *p_rows, int *p_width,
Cost *p_startup_cost, Cost *p_total_cost);
static void get_remote_estimate(const char *sql,
- PGconn *conn,
+ PgFdwConn *conn,
double *rows,
int *width,
Cost *startup_cost,
@@ -306,9 +313,9 @@ static void get_remote_estimate(const char *sql,
static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
EquivalenceClass *ec, EquivalenceMember *em,
void *arg);
-static void create_cursor(ForeignScanState *node);
-static void fetch_more_data(ForeignScanState *node);
-static void close_cursor(PGconn *conn, unsigned int cursor_number);
+static void create_cursor(PgFdwScanState *node);
+static void close_cursor(PgFdwConn *conn, unsigned int cursor_number);
+static void fetch_more_data(PgFdwScanState *node, fetch_mode cmd);
static void prepare_foreign_modify(PgFdwModifyState *fmstate);
static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
ItemPointer tupleid,
@@ -329,7 +336,6 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
MemoryContext temp_context);
static void conversion_error_callback(void *arg);
-
/*
* Foreign-data wrapper handler function: return a struct with pointers
* to my callback routines.
@@ -982,6 +988,15 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
fsstate->param_values = (const char **) palloc0(numParams * sizeof(char *));
else
fsstate->param_values = NULL;
+
+ fsstate->econtext = node->ss.ps.ps_ExprContext;
+
+ /*
+ * Start scanning asynchronously if it is the first scan on this
+ * connection.
+ */
+ if (PFCgetNscans(fsstate->conn) == 1)
+ create_cursor(fsstate);
}
/*
@@ -1000,7 +1015,10 @@ postgresIterateForeignScan(ForeignScanState *node)
* cursor on the remote side.
*/
if (!fsstate->cursor_exists)
- create_cursor(node);
+ {
+ finish_async_query(fsstate->conn);
+ create_cursor(fsstate);
+ }
/*
* Get some more tuples, if we've run out.
@@ -1009,7 +1027,7 @@ postgresIterateForeignScan(ForeignScanState *node)
{
/* No point in another fetch if we already detected EOF, though. */
if (!fsstate->eof_reached)
- fetch_more_data(node);
+ fetch_more_data(fsstate, ALLOW_ASYNC);
/* If we didn't get any tuples, must be end of data. */
if (fsstate->next_tuple >= fsstate->num_tuples)
return ExecClearTuple(slot);
@@ -1069,7 +1087,7 @@ postgresReScanForeignScan(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fsstate->conn, sql);
+ res = PFCexec(fsstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fsstate->conn, true, sql);
PQclear(res);
@@ -1392,19 +1410,22 @@ postgresExecForeignInsert(EState *estate,
/* Convert parameters needed by prepared statement to text form */
p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1462,19 +1483,22 @@ postgresExecForeignUpdate(EState *estate,
(ItemPointer) DatumGetPointer(datum),
slot);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1532,19 +1556,22 @@ postgresExecForeignDelete(EState *estate,
(ItemPointer) DatumGetPointer(datum),
NULL);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1594,7 +1621,7 @@ postgresEndForeignModify(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fmstate->conn, sql);
+ res = PFCexec(fmstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, sql);
PQclear(res);
@@ -1726,7 +1753,7 @@ estimate_path_cost_size(PlannerInfo *root,
List *local_join_conds;
StringInfoData sql;
List *retrieved_attrs;
- PGconn *conn;
+ PgFdwConn *conn;
Selectivity local_sel;
QualCost local_cost;
@@ -1836,7 +1863,7 @@ estimate_path_cost_size(PlannerInfo *root,
* The given "sql" must be an EXPLAIN command.
*/
static void
-get_remote_estimate(const char *sql, PGconn *conn,
+get_remote_estimate(const char *sql, PgFdwConn *conn,
double *rows, int *width,
Cost *startup_cost, Cost *total_cost)
{
@@ -1852,7 +1879,7 @@ get_remote_estimate(const char *sql, PGconn *conn,
/*
* Execute EXPLAIN remotely.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql);
@@ -1917,13 +1944,12 @@ ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
* Create cursor for node's query with current parameter values.
*/
static void
-create_cursor(ForeignScanState *node)
+create_cursor(PgFdwScanState *fsstate)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
- ExprContext *econtext = node->ss.ps.ps_ExprContext;
+ ExprContext *econtext = fsstate->econtext;
int numParams = fsstate->numParams;
const char **values = fsstate->param_values;
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
StringInfoData buf;
PGresult *res;
@@ -1985,8 +2011,8 @@ create_cursor(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecParams(conn, buf.data, numParams, NULL, values,
- NULL, NULL, 0);
+ res = PFCexecParams(conn, buf.data, numParams, NULL, values,
+ NULL, NULL, 0);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, fsstate->query);
PQclear(res);
@@ -2001,55 +2027,128 @@ create_cursor(ForeignScanState *node)
/* Clean up */
pfree(buf.data);
+
+ /*
+ * Start async scan if this is the first scan. See fetch_more_data() for
+ * details
+ */
+ if (PFCgetNscans(conn) == 1)
+ fetch_more_data(fsstate, START_ONLY);
}
/*
* Fetch some more rows from the node's cursor.
*/
static void
-fetch_more_data(ForeignScanState *node)
+fetch_more_data(PgFdwScanState *fsstate, fetch_mode cmd)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
PGresult *volatile res = NULL;
MemoryContext oldcontext;
/*
* We'll store the tuples in the batch_cxt. First, flush the previous
- * batch.
+ * batch. Some tuples may be left unread when asynchronous fetching is
+ * interrupted, so skip the flush in that case to preserve them. This
+ * happens no more than twice in succession.
*/
- fsstate->tuples = NULL;
- MemoryContextReset(fsstate->batch_cxt);
+ if (fsstate->next_tuple >= fsstate->num_tuples)
+ {
+ fsstate->tuples = NULL;
+ MemoryContextReset(fsstate->batch_cxt);
+ }
oldcontext = MemoryContextSwitchTo(fsstate->batch_cxt);
/* PGresult must be released before leaving this function. */
PG_TRY();
{
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
char sql[64];
int fetch_size;
- int numrows;
+ int numrows, addrows, restrows;
+ HeapTuple *tmptuples;
int i;
/* The fetch size is arbitrary, but shouldn't be enormous. */
- fetch_size = 100;
+ fetch_size = 10000;
snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
fetch_size, fsstate->cursor_number);
- res = PQexec(conn, sql);
+ if (PFCisAsyncRunning(conn))
+ {
+ Assert (cmd != START_ONLY);
+
+ /*
+ * If the target fsstate is not the scan state that the running
+ * async fetch belongs to, store that fetch's result into its own
+ * scan state, then synchronously fetch data for the target fsstate.
+ */
+ if (fsstate != PFCgetAsyncScan(conn))
+ {
+ fetch_more_data(PFCgetAsyncScan(conn), FORCE_SYNC);
+ res = PFCexec(conn, sql);
+ }
+ else
+ {
+ /* Get result of running async fetch */
+ res = PFCgetResult(conn);
+ if (PQntuples(res) == fetch_size)
+ {
+ /*
+ * Connection state doesn't go to IDLE even if all data
+ * has been sent to client for asynchronous query. One
+ * more PQgetResult() is needed to reset the state to
+ * IDLE. See PQexecFinish() for details.
+ */
+ if (PFCgetResult(conn) != NULL)
+ elog(ERROR, "Connection status error.");
+ }
+ }
+ PFCsetAsyncScan(conn, NULL);
+ }
+ else
+ {
+ if (cmd == START_ONLY)
+ {
+ Assert(PFCgetNscans(conn) == 1);
+
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false,
+ fsstate->query);
+
+ PFCsetAsyncScan(conn, fsstate);
+ goto end_of_fetch;
+ }
+
+ /* Otherwise do synchronous query execution */
+ PFCsetAsyncScan(conn, NULL);
+ res = PFCexec(conn, sql);
+ }
+
/* On error, report the original query, not the FETCH. */
- if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ if (res && PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
- /* Convert the data into HeapTuples */
- numrows = PQntuples(res);
+ /* allocate tuple storage */
+ tmptuples = fsstate->tuples;
+ addrows = PQntuples(res);
+ restrows = fsstate->num_tuples - fsstate->next_tuple;
+ numrows = restrows + addrows;
fsstate->tuples = (HeapTuple *) palloc0(numrows * sizeof(HeapTuple));
+
+ Assert(restrows == 0 || tmptuples);
+
+ /* copy unread tuples if any */
+ for (i = 0 ; i < restrows ; i++)
+ fsstate->tuples[i] = tmptuples[fsstate->next_tuple + i];
+
fsstate->num_tuples = numrows;
fsstate->next_tuple = 0;
- for (i = 0; i < numrows; i++)
+ /* Convert the data into HeapTuples */
+ for (i = 0 ; i < addrows; i++)
{
- fsstate->tuples[i] =
+ fsstate->tuples[restrows + i] =
make_tuple_from_result_row(res, i,
fsstate->rel,
fsstate->attinmeta,
@@ -2066,6 +2165,23 @@ fetch_more_data(ForeignScanState *node)
PQclear(res);
res = NULL;
+
+ if (cmd == ALLOW_ASYNC)
+ {
+ if (!fsstate->eof_reached)
+ {
+ /*
+ * We can immediately request the next bunch of tuples if
+ * we're on asynchronous connection.
+ */
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
+ PFCsetAsyncScan(conn, fsstate);
+ }
+ }
+
+end_of_fetch:
+ ; /* Nothing to do here but needed to make compiler quiet. */
}
PG_CATCH();
{
@@ -2079,6 +2195,28 @@ fetch_more_data(ForeignScanState *node)
}
/*
+ * Force cancelling async command state.
+ */
+void
+finish_async_query(PgFdwConn *conn)
+{
+ PgFdwScanState *fsstate = PFCgetAsyncScan(conn);
+ PgFdwConn *async_conn;
+
+ /* Nothing to do if no async connection */
+ if (fsstate == NULL) return;
+ async_conn = fsstate->conn;
+ if (!async_conn ||
+ PFCgetNscans(async_conn) == 1 ||
+ !PFCisAsyncRunning(async_conn))
+ return;
+
+ fetch_more_data(PFCgetAsyncScan(async_conn), FORCE_SYNC);
+
+ Assert(!PFCisAsyncRunning(async_conn));
+}
+
+/*
* Force assorted GUC parameters to settings that ensure that we'll output
* data values in a form that is unambiguous to the remote server.
*
@@ -2132,7 +2270,7 @@ reset_transmission_modes(int nestlevel)
* Utility routine to close a cursor.
*/
static void
-close_cursor(PGconn *conn, unsigned int cursor_number)
+close_cursor(PgFdwConn *conn, unsigned int cursor_number)
{
char sql[64];
PGresult *res;
@@ -2143,7 +2281,7 @@ close_cursor(PGconn *conn, unsigned int cursor_number)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -2165,6 +2303,9 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
GetPrepStmtNumber(fmstate->conn));
p_name = pstrdup(prep_name);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* We intentionally do not specify parameter types here, but leave the
* remote server to derive them by default. This avoids possible problems
@@ -2175,11 +2316,11 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQprepare(fmstate->conn,
- p_name,
- fmstate->query,
- 0,
- NULL);
+ res = PFCprepare(fmstate->conn,
+ p_name,
+ fmstate->query,
+ 0,
+ NULL);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -2297,7 +2438,7 @@ postgresAnalyzeForeignTable(Relation relation,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2329,7 +2470,7 @@ postgresAnalyzeForeignTable(Relation relation,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2379,7 +2520,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
unsigned int cursor_number;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2423,7 +2564,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
PQclear(res);
@@ -2453,7 +2594,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
snprintf(fetch_sql, sizeof(fetch_sql), "FETCH %d FROM c%u",
fetch_size, cursor_number);
- res = PQexec(conn, fetch_sql);
+ res = PFCexec(conn, fetch_sql);
/* On error, report the original query, not the FETCH. */
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2582,7 +2723,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
bool import_not_null = true;
ForeignServer *server;
UserMapping *mapping;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData buf;
PGresult *volatile res = NULL;
int numrows,
@@ -2615,7 +2756,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
conn = GetConnection(server, mapping, false);
/* Don't attempt to import collation if remote server hasn't got it */
- if (PQserverVersion(conn) < 90100)
+ if (PFCserverVersion(conn) < 90100)
import_collate = false;
/* Create workspace for strings */
@@ -2628,7 +2769,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfoString(&buf, "SELECT 1 FROM pg_catalog.pg_namespace WHERE nspname = ");
deparseStringLiteral(&buf, stmt->remote_schema);
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
@@ -2723,7 +2864,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfo(&buf, " ORDER BY c.relname, a.attnum");
/* Fetch the data */
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..b117a88 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -18,19 +18,22 @@
#include "nodes/relation.h"
#include "utils/relcache.h"
-#include "libpq-fe.h"
+#include "PgFdwConn.h"
+
+struct PgFdwScanState;
/* in postgres_fdw.c */
extern int set_transmission_modes(void);
extern void reset_transmission_modes(int nestlevel);
+extern void finish_async_query(PgFdwConn *conn);
/* in connection.c */
-extern PGconn *GetConnection(ForeignServer *server, UserMapping *user,
+extern PgFdwConn *GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt);
-extern void ReleaseConnection(PGconn *conn);
-extern unsigned int GetCursorNumber(PGconn *conn);
-extern unsigned int GetPrepStmtNumber(PGconn *conn);
-extern void pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+extern void ReleaseConnection(PgFdwConn *conn);
+extern unsigned int GetCursorNumber(PgFdwConn *conn);
+extern unsigned int GetPrepStmtNumber(PgFdwConn *conn);
+extern void pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql);
/* in option.c */
--
2.1.0.GIT
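The buffer handling that fetch_more_data() gains in the patch above (keep the unread tuples, then append the newly fetched ones) can be sketched in isolation as follows. This is only an illustration under stated assumptions: TupleBuf and append_batch are hypothetical names that do not exist in postgres_fdw, plain ints stand in for HeapTuples, and calloc/free stand in for allocation in batch_cxt.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the tuple fields of PgFdwScanState. */
typedef struct TupleBuf
{
    int *tuples;      /* current batch (ints instead of HeapTuples) */
    int  num_tuples;  /* number of entries in tuples */
    int  next_tuple;  /* index of the next unread entry */
} TupleBuf;

/*
 * Append 'addrows' newly fetched values to the buffer, moving any entries
 * that have not been read yet to the front of the new array.
 * Returns 0 on success, -1 if allocation fails.
 */
static int
append_batch(TupleBuf *buf, const int *fetched, int addrows)
{
    int  restrows = buf->num_tuples - buf->next_tuple;
    int  numrows = restrows + addrows;
    int *merged;
    int  i;

    assert(restrows >= 0 && addrows >= 0);
    assert(restrows == 0 || buf->tuples != NULL);

    /* Allocate at least one slot so an empty batch is not mistaken for failure. */
    merged = calloc(numrows > 0 ? (size_t) numrows : 1, sizeof(int));
    if (merged == NULL)
        return -1;

    /* Copy unread entries first, if any. */
    for (i = 0; i < restrows; i++)
        merged[i] = buf->tuples[buf->next_tuple + i];

    /* Then append the new batch after them. */
    for (i = 0; i < addrows; i++)
        merged[restrows + i] = fetched[i];

    free(buf->tuples);
    buf->tuples = merged;
    buf->num_tuples = numrows;
    buf->next_tuple = 0;
    return 0;
}
```

Starting from an empty buffer, appending {1, 2, 3}, reading two entries, and then appending {4, 5} leaves the buffer holding 3, 4, 5 with next_tuple reset to 0.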
Hello, that was a silly mistake. fetch_size was set to 10000 in the v4
patch. The v5 patch fixes that point.
But the v4 patch mysteriously accelerates this query, 6.5 seconds.
=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT x.a, x.c, y.c
FROM ft1 AS x JOIN ft1 AS y on x.a = y.a;
...
Execution time: 6512.043 ms
fetch_size was 10000 in that run. With fetch_size = 100 I got about
13.0 seconds, which is about 19% faster than the original.
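As a rough check of the percentages quoted in this thread, the two usual ways of expressing a gain can be computed as below. This is only an illustrative sketch: the function names time_reduction and speedup_ratio are made up for this note and are not part of the patch, and the 13.0 seconds figure is the approximate value quoted above.

```c
#include <assert.h>

/* Fraction of the baseline time that was saved: (base - patched) / base. */
static double
time_reduction(double base_ms, double patched_ms)
{
    assert(base_ms > 0.0);
    return (base_ms - patched_ms) / base_ms;
}

/* How many times faster the patched run is: base / patched. */
static double
speedup_ratio(double base_ms, double patched_ms)
{
    assert(patched_ms > 0.0);
    return base_ms / patched_ms;
}
```

With the master run of 15973.916 ms as the baseline, time_reduction(15973.916, 13000.0) falls between 0.18 and 0.19, and speedup_ratio(15973.916, 6512.043) falls between 2.4 and 2.5, which agrees with the "2.4 times faster" figure for the fetch_size = 10000 run.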
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
=======
15 17:18:49 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20150116.171849.109146500.horiguchi.kyotaro@lab.ntt.co.jp>
I revised the patch so that async scan is done more
aggressively, and measured execution time for two very simple cases.

As a result, a simple seq scan gained 5% and a hash join of two
foreign tables gained 150% (2.4 times faster).

While measuring the performance, I noticed that in many cases, such
as hash joins or sorted joins, each scan in a query runs all at once
rather than alternating with the others. So I
modified the patch so that async fetch is done more
aggressively. The new v4 patch is attached. The following numbers
are based on it.

========
The first test is a simple seq scan.

CREATE TABLE lt1 (a int, b timestamp, c text);
CREATE SERVER sv1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost');
CREATE USER MAPPING FOR PUBLIC SERVER sv1;
CREATE FOREIGN TABLE ft1 () SERVER sv1 OPTIONS (table_name 'lt1');
INSERT INTO lt1 (SELECT a, now(), repeat('x', 128) FROM generate_series(0, 999999) a);

For this case, I took the average execution time over 10 runs of the
following query for both master head and the patched version. The
fetch size is 100.

postgres=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT * FROM ft1;
QUERY PLAN
------------------------------------------------------------------
Foreign Scan on ft1 (actual time=0.795..4175.706 rows=1000000 loops=1)
Planning time: 0.060 ms
Execution time: 4276.043 ms

master head : avg = 4256.621, std dev = 17.099
patched pgfdw: avg = 4036.463, std dev = 2.608

The patched version is faster by about 5%. This should be purely the
result of asynchronous fetching, not including the effect of
starting remote execution early in ExecInit.

Interestingly, as the fetch size gets larger, the gain rises in
spite of the decrease in the number of queries sent.

master head : avg = 2622.759, std dev = 38.379
patched pgfdw: avg = 2277.622, std dev = 27.269

About 15% gain. And for 10000,

master head : avg = 2000.980, std dev = 6.434
patched pgfdw: avg = 1616.793, std dev = 13.192

19%. It is natural that execution time decreases as the
fetch size increases, but I haven't found the reason why the patch's gain
also increases.

======================
The second case is a simple join of two foreign tables sharing
one connection.

The master head runs this query in about 16 seconds with almost
no fluctuation among multiple tries.

=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT x.a, x.c, y.c
FROM ft1 AS x JOIN ft1 AS y on x.a = y.a;
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (actual time=7541.831..15924.631 rows=1000000 loops=1)
Hash Cond: (x.a = y.a)
-> Foreign Scan on ft1 x (actual time=1.176..6553.480 rows=1000000 loops=1)
-> Hash (actual time=7539.761..7539.761 rows=1000000 loops=1)
Buckets: 32768 Batches: 64 Memory Usage: 2829kB
-> Foreign Scan on ft1 y (actual time=1.067..6529.165 rows=1000000 loops=1)
Planning time: 0.223 ms
Execution time: 15973.916 ms

But the v4 patch mysteriously accelerates this query to 6.5 seconds.

=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT x.a, x.c, y.c
FROM ft1 AS x JOIN ft1 AS y on x.a = y.a;
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (actual time=2556.977..5812.937 rows=1000000 loops=1)
Hash Cond: (x.a = y.a)
-> Foreign Scan on ft1 x (actual time=32.689..1936.565 rows=1000000 loops=1)
-> Hash (actual time=2523.810..2523.810 rows=1000000 loops=1)
Buckets: 32768 Batches: 64 Memory Usage: 2829kB
-> Foreign Scan on ft1 y (actual time=50.345..1928.411 rows=1000000 loops=1)
Planning time: 0.220 ms
Execution time: 6512.043 ms

The result data does not seem to be broken. I don't know the reason yet, but
I'll investigate it.
Attachments:
0001-Asynchronous-execution-of-postgres_fdw-v5.patchtext/x-patch; charset=us-asciiDownload
From faea77944d4d3e3332d9723958f548356e3bceba Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Tue, 13 Jan 2015 19:20:35 +0900
Subject: [PATCH] Asynchronous execution of postgres_fdw v5
This is the modified version of Asynchronous execution of
postgres_fdw.
- Do async fetch more aggressively than v3.
- No additional tests yet :(
---
contrib/postgres_fdw/Makefile | 2 +-
contrib/postgres_fdw/PgFdwConn.c | 200 +++++++++++++++++++++++++
contrib/postgres_fdw/PgFdwConn.h | 61 ++++++++
contrib/postgres_fdw/connection.c | 82 ++++++-----
contrib/postgres_fdw/postgres_fdw.c | 281 +++++++++++++++++++++++++++---------
contrib/postgres_fdw/postgres_fdw.h | 15 +-
6 files changed, 526 insertions(+), 115 deletions(-)
create mode 100644 contrib/postgres_fdw/PgFdwConn.c
create mode 100644 contrib/postgres_fdw/PgFdwConn.h
diff --git a/contrib/postgres_fdw/Makefile b/contrib/postgres_fdw/Makefile
index d2b98e1..d0913e2 100644
--- a/contrib/postgres_fdw/Makefile
+++ b/contrib/postgres_fdw/Makefile
@@ -1,7 +1,7 @@
# contrib/postgres_fdw/Makefile
MODULE_big = postgres_fdw
-OBJS = postgres_fdw.o option.o deparse.o connection.o $(WIN32RES)
+OBJS = postgres_fdw.o PgFdwConn.o option.o deparse.o connection.o $(WIN32RES)
PGFILEDESC = "postgres_fdw - foreign data wrapper for PostgreSQL"
PG_CPPFLAGS = -I$(libpq_srcdir)
diff --git a/contrib/postgres_fdw/PgFdwConn.c b/contrib/postgres_fdw/PgFdwConn.c
new file mode 100644
index 0000000..b13b597
--- /dev/null
+++ b/contrib/postgres_fdw/PgFdwConn.c
@@ -0,0 +1,200 @@
+/*-------------------------------------------------------------------------
+ *
+ * PgFdwConn.c
+ * PGconn extending wrapper to enable asynchronous query.
+ *
+ * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/postgres_fdw/PgFdwConn.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "PgFdwConn.h"
+
+#define PFC_ALLOCATE() ((PgFdwConn *)malloc(sizeof(PgFdwConn)))
+#define PFC_FREE(c) free(c)
+
+struct pgfdw_conn
+{
+ PGconn *pgconn; /* libpq connection for this connection */
+ int nscans; /* number of scans using this connection */
+ struct PgFdwScanState *async_scan; /* the scan currently running an
+ * async query on this connection */
+};
+
+void
+PFCsetAsyncScan(PgFdwConn *conn, struct PgFdwScanState *scan)
+{
+ conn->async_scan = scan;
+}
+
+struct PgFdwScanState *
+PFCgetAsyncScan(PgFdwConn *conn)
+{
+ return conn->async_scan;
+}
+
+int
+PFCisAsyncRunning(PgFdwConn *conn)
+{
+ return conn->async_scan != NULL;
+}
+
+PGconn *
+PFCgetPGconn(PgFdwConn *conn)
+{
+ return conn->pgconn;
+}
+
+int
+PFCgetNscans(PgFdwConn *conn)
+{
+ return conn->nscans;
+}
+
+int
+PFCincrementNscans(PgFdwConn *conn)
+{
+ return ++conn->nscans;
+}
+
+int
+PFCdecrementNscans(PgFdwConn *conn)
+{
+ Assert(conn->nscans > 0);
+ return --conn->nscans;
+}
+
+void
+PFCcancelAsync(PgFdwConn *conn)
+{
+ if (PFCisAsyncRunning(conn))
+ PFCconsumeInput(conn);
+}
+
+void
+PFCinit(PgFdwConn *conn)
+{
+ conn->async_scan = NULL;
+ conn->nscans = 0;
+}
+
+int
+PFCsendQuery(PgFdwConn *conn, const char *query)
+{
+ return PQsendQuery(conn->pgconn, query);
+}
+
+PGresult *
+PFCexec(PgFdwConn *conn, const char *query)
+{
+ return PQexec(conn->pgconn, query);
+}
+
+PGresult *
+PFCexecParams(PgFdwConn *conn,
+ const char *command,
+ int nParams,
+ const Oid *paramTypes,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat)
+{
+ return PQexecParams(conn->pgconn,
+ command, nParams, paramTypes, paramValues,
+ paramLengths, paramFormats, resultFormat);
+}
+
+PGresult *
+PFCprepare(PgFdwConn *conn,
+ const char *stmtName, const char *query,
+ int nParams, const Oid *paramTypes)
+{
+ return PQprepare(conn->pgconn, stmtName, query, nParams, paramTypes);
+}
+
+PGresult *
+PFCexecPrepared(PgFdwConn *conn,
+ const char *stmtName,
+ int nParams,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat)
+{
+ return PQexecPrepared(conn->pgconn,
+ stmtName, nParams, paramValues, paramLengths,
+ paramFormats, resultFormat);
+}
+
+PGresult *
+PFCgetResult(PgFdwConn *conn)
+{
+ return PQgetResult(conn->pgconn);
+}
+
+int
+PFCconsumeInput(PgFdwConn *conn)
+{
+ return PQconsumeInput(conn->pgconn);
+}
+
+int
+PFCisBusy(PgFdwConn *conn)
+{
+ return PQisBusy(conn->pgconn);
+}
+
+ConnStatusType
+PFCstatus(const PgFdwConn *conn)
+{
+ return PQstatus(conn->pgconn);
+}
+
+PGTransactionStatusType
+PFCtransactionStatus(const PgFdwConn *conn)
+{
+ return PQtransactionStatus(conn->pgconn);
+}
+
+int
+PFCserverVersion(const PgFdwConn *conn)
+{
+ return PQserverVersion(conn->pgconn);
+}
+
+char *
+PFCerrorMessage(const PgFdwConn *conn)
+{
+ return PQerrorMessage(conn->pgconn);
+}
+
+int
+PFCconnectionUsedPassword(const PgFdwConn *conn)
+{
+ return PQconnectionUsedPassword(conn->pgconn);
+}
+
+void
+PFCfinish(PgFdwConn *conn)
+{
+ PQfinish(conn->pgconn);
+ PFC_FREE(conn);
+}
+
+PgFdwConn *
+PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname)
+{
+ PgFdwConn *ret = PFC_ALLOCATE();
+
+ PFCinit(ret);
+ ret->pgconn = PQconnectdbParams(keywords, values, expand_dbname);
+
+ return ret;
+}
diff --git a/contrib/postgres_fdw/PgFdwConn.h b/contrib/postgres_fdw/PgFdwConn.h
new file mode 100644
index 0000000..f695f5a
--- /dev/null
+++ b/contrib/postgres_fdw/PgFdwConn.h
@@ -0,0 +1,61 @@
+/*-------------------------------------------------------------------------
+ *
+ * PgFdwConn.h
+ * PGconn extending wrapper to enable asynchronous query.
+ *
+ * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/postgres_fdw/PgFdwConn.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PGFDWCONN_H
+#define PGFDWCONN_H
+
+#include "libpq-fe.h"
+
+typedef struct pgfdw_conn PgFdwConn;
+struct PgFdwScanState;
+
+extern void PFCsetAsyncScan(PgFdwConn *conn, struct PgFdwScanState *scan);
+extern struct PgFdwScanState *PFCgetAsyncScan(PgFdwConn *conn);
+extern int PFCisAsyncRunning(PgFdwConn *conn);
+extern PGconn *PFCgetPGconn(PgFdwConn *conn);
+extern int PFCgetNscans(PgFdwConn *conn);
+extern int PFCincrementNscans(PgFdwConn *conn);
+extern int PFCdecrementNscans(PgFdwConn *conn);
+extern void PFCcancelAsync(PgFdwConn *conn);
+extern void PFCinit(PgFdwConn *conn);
+extern int PFCsendQuery(PgFdwConn *conn, const char *query);
+extern PGresult *PFCexec(PgFdwConn *conn, const char *query);
+extern PGresult *PFCexecParams(PgFdwConn *conn,
+ const char *command,
+ int nParams,
+ const Oid *paramTypes,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat);
+extern PGresult *PFCprepare(PgFdwConn *conn,
+ const char *stmtName, const char *query,
+ int nParams, const Oid *paramTypes);
+extern PGresult *PFCexecPrepared(PgFdwConn *conn,
+ const char *stmtName,
+ int nParams,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat);
+extern PGresult *PFCgetResult(PgFdwConn *conn);
+extern int PFCconsumeInput(PgFdwConn *conn);
+extern int PFCisBusy(PgFdwConn *conn);
+extern ConnStatusType PFCstatus(const PgFdwConn *conn);
+extern PGTransactionStatusType PFCtransactionStatus(const PgFdwConn *conn);
+extern int PFCserverVersion(const PgFdwConn *conn);
+extern char *PFCerrorMessage(const PgFdwConn *conn);
+extern int PFCconnectionUsedPassword(const PgFdwConn *conn);
+extern void PFCfinish(PgFdwConn *conn);
+extern PgFdwConn *PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname);
+#endif /* PGFDWCONN_H */
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4e02cb2..2517f6b 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -44,7 +44,7 @@ typedef struct ConnCacheKey
typedef struct ConnCacheEntry
{
ConnCacheKey key; /* hash key (must be first) */
- PGconn *conn; /* connection to foreign server, or NULL */
+ PgFdwConn *conn; /* connection to foreign server, or NULL */
int xact_depth; /* 0 = no xact open, 1 = main xact open, 2 =
* one level of subxact open, etc */
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
@@ -64,10 +64,10 @@ static unsigned int prep_stmt_number = 0;
static bool xact_got_connection = false;
/* prototypes of private functions */
-static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
+static PgFdwConn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void check_conn_params(const char **keywords, const char **values);
-static void configure_remote_session(PGconn *conn);
-static void do_sql_command(PGconn *conn, const char *sql);
+static void configure_remote_session(PgFdwConn *conn);
+static void do_sql_command(PgFdwConn *conn, const char *sql);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -93,7 +93,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
* be useful and not mere pedantry. We could not flush any active connections
* mid-transaction anyway.
*/
-PGconn *
+PgFdwConn *
GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt)
{
@@ -161,9 +161,12 @@ GetConnection(ForeignServer *server, UserMapping *user,
entry->have_error = false;
entry->conn = connect_pg_server(server, user);
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ PFCgetPGconn(entry->conn), server->servername);
+
}
+ PFCincrementNscans(entry->conn);
+
/*
* Start a new transaction or subtransaction if needed.
*/
@@ -178,10 +181,10 @@ GetConnection(ForeignServer *server, UserMapping *user,
/*
* Connect to remote server using specified server and user mapping properties.
*/
-static PGconn *
+static PgFdwConn *
connect_pg_server(ForeignServer *server, UserMapping *user)
{
- PGconn *volatile conn = NULL;
+ PgFdwConn *volatile conn = NULL;
/*
* Use PG_TRY block to ensure closing connection on error.
@@ -223,14 +226,14 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
/* verify connection parameters and make connection */
check_conn_params(keywords, values);
- conn = PQconnectdbParams(keywords, values, false);
- if (!conn || PQstatus(conn) != CONNECTION_OK)
+ conn = PFCconnectdbParams(keywords, values, false);
+ if (!conn || PFCstatus(conn) != CONNECTION_OK)
{
char *connmessage;
int msglen;
/* libpq typically appends a newline, strip that */
- connmessage = pstrdup(PQerrorMessage(conn));
+ connmessage = pstrdup(PFCerrorMessage(conn));
msglen = strlen(connmessage);
if (msglen > 0 && connmessage[msglen - 1] == '\n')
connmessage[msglen - 1] = '\0';
@@ -246,7 +249,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
* otherwise, he's piggybacking on the postgres server's user
* identity. See also dblink_security_check() in contrib/dblink.
*/
- if (!superuser() && !PQconnectionUsedPassword(conn))
+ if (!superuser() && !PFCconnectionUsedPassword(conn))
ereport(ERROR,
(errcode(ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED),
errmsg("password is required"),
@@ -263,7 +266,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
{
/* Release PGconn data structure if we managed to create one */
if (conn)
- PQfinish(conn);
+ PFCfinish(conn);
PG_RE_THROW();
}
PG_END_TRY();
@@ -312,9 +315,9 @@ check_conn_params(const char **keywords, const char **values)
* there are any number of ways to break things.
*/
static void
-configure_remote_session(PGconn *conn)
+configure_remote_session(PgFdwConn *conn)
{
- int remoteversion = PQserverVersion(conn);
+ int remoteversion = PFCserverVersion(conn);
/* Force the search path to contain only pg_catalog (see deparse.c) */
do_sql_command(conn, "SET search_path = pg_catalog");
@@ -348,11 +351,11 @@ configure_remote_session(PGconn *conn)
* Convenience subroutine to issue a non-data-returning SQL command to remote
*/
static void
-do_sql_command(PGconn *conn, const char *sql)
+do_sql_command(PgFdwConn *conn, const char *sql)
{
PGresult *res;
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -379,7 +382,7 @@ begin_remote_xact(ConnCacheEntry *entry)
const char *sql;
elog(DEBUG3, "starting remote transaction on connection %p",
- entry->conn);
+ PFCgetPGconn(entry->conn));
if (IsolationIsSerializable())
sql = "START TRANSACTION ISOLATION LEVEL SERIALIZABLE";
@@ -408,13 +411,11 @@ begin_remote_xact(ConnCacheEntry *entry)
* Release connection reference count created by calling GetConnection.
*/
void
-ReleaseConnection(PGconn *conn)
+ReleaseConnection(PgFdwConn *conn)
{
- /*
- * Currently, we don't actually track connection references because all
- * cleanup is managed on a transaction or subtransaction basis instead. So
- * there's nothing to do here.
- */
+ /* ongoing async query should be canceled if no scans left */
+ if (PFCdecrementNscans(conn) == 0)
+ finish_async_query(conn);
}
/*
@@ -429,7 +430,7 @@ ReleaseConnection(PGconn *conn)
* collisions are highly improbable; just be sure to use %u not %d to print.
*/
unsigned int
-GetCursorNumber(PGconn *conn)
+GetCursorNumber(PgFdwConn *conn)
{
return ++cursor_number;
}
@@ -443,7 +444,7 @@ GetCursorNumber(PGconn *conn)
* increasing the risk of prepared-statement name collisions by resetting.
*/
unsigned int
-GetPrepStmtNumber(PGconn *conn)
+GetPrepStmtNumber(PgFdwConn *conn)
{
return ++prep_stmt_number;
}
@@ -462,7 +463,7 @@ GetPrepStmtNumber(PGconn *conn)
* marked with have_error = true.
*/
void
-pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql)
{
/* If requested, PGresult must be released before leaving this function. */
@@ -490,7 +491,7 @@ pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
* return NULL, not a PGresult at all.
*/
if (message_primary == NULL)
- message_primary = PQerrorMessage(conn);
+ message_primary = PFCerrorMessage(conn);
ereport(elevel,
(errcode(sqlstate),
@@ -542,7 +543,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
if (entry->xact_depth > 0)
{
elog(DEBUG3, "closing remote transaction on connection %p",
- entry->conn);
+ PFCgetPGconn(entry->conn));
switch (event)
{
@@ -567,7 +568,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
*/
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -597,7 +598,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Assume we might have lost track of prepared statements */
entry->have_error = true;
/* If we're aborting, abort all remote transactions too */
- res = PQexec(entry->conn, "ABORT TRANSACTION");
+ res = PFCexec(entry->conn, "ABORT TRANSACTION");
/* Note: can't throw ERROR, it would be infinite loop */
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true,
@@ -608,7 +609,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* As above, make sure to clear any prepared stmts */
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -620,17 +621,19 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Reset state to show we're out of a transaction */
entry->xact_depth = 0;
+ PFCcancelAsync(entry->conn);
+ PFCinit(entry->conn);
/*
* If the connection isn't in a good idle state, discard it to
* recover. Next GetConnection will open a new connection.
*/
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
+ if (PFCstatus(entry->conn) != CONNECTION_OK ||
+ PFCtransactionStatus(entry->conn) != PQTRANS_IDLE)
{
- elog(DEBUG3, "discarding connection %p", entry->conn);
- PQfinish(entry->conn);
- entry->conn = NULL;
+ elog(DEBUG3, "discarding connection %p",
+ PFCgetPGconn(entry->conn));
+ PFCfinish(entry->conn);
}
}
@@ -676,6 +679,9 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
PGresult *res;
char sql[100];
+ /* Shut down asynchronous scan if running */
+ PFCcancelAsync(entry->conn);
+
/*
* We only care about connections with open remote subtransactions of
* the current level.
@@ -701,7 +707,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
snprintf(sql, sizeof(sql),
"ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
curlevel, curlevel);
- res = PQexec(entry->conn, sql);
+ res = PFCexec(entry->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true, sql);
else
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d76e739..f7b0207 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -123,6 +123,12 @@ enum FdwModifyPrivateIndex
FdwModifyPrivateRetrievedAttrs
};
+typedef enum fetch_mode {
+ START_ONLY,
+ FORCE_SYNC,
+ ALLOW_ASYNC
+} fetch_mode;
+
/*
* Execution state of a foreign scan using postgres_fdw.
*/
@@ -136,7 +142,7 @@ typedef struct PgFdwScanState
List *retrieved_attrs; /* list of retrieved attribute numbers */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
unsigned int cursor_number; /* quasi-unique ID for my cursor */
bool cursor_exists; /* have we created the cursor? */
int numParams; /* number of parameters passed to query */
@@ -156,6 +162,7 @@ typedef struct PgFdwScanState
/* working memory contexts */
MemoryContext batch_cxt; /* context holding current batch of tuples */
MemoryContext temp_cxt; /* context for per-tuple temporary data */
+ ExprContext *econtext; /* copy of ps_ExprContext of ForeignScanState */
} PgFdwScanState;
/*
@@ -167,7 +174,7 @@ typedef struct PgFdwModifyState
AttInMetadata *attinmeta; /* attribute datatype conversion metadata */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
char *p_name; /* name of prepared statement, if created */
/* extracted fdw_private data */
@@ -298,7 +305,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
double *p_rows, int *p_width,
Cost *p_startup_cost, Cost *p_total_cost);
static void get_remote_estimate(const char *sql,
- PGconn *conn,
+ PgFdwConn *conn,
double *rows,
int *width,
Cost *startup_cost,
@@ -306,9 +313,9 @@ static void get_remote_estimate(const char *sql,
static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
EquivalenceClass *ec, EquivalenceMember *em,
void *arg);
-static void create_cursor(ForeignScanState *node);
-static void fetch_more_data(ForeignScanState *node);
-static void close_cursor(PGconn *conn, unsigned int cursor_number);
+static void create_cursor(PgFdwScanState *node);
+static void close_cursor(PgFdwConn *conn, unsigned int cursor_number);
+static void fetch_more_data(PgFdwScanState *node, fetch_mode cmd);
static void prepare_foreign_modify(PgFdwModifyState *fmstate);
static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
ItemPointer tupleid,
@@ -329,7 +336,6 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
MemoryContext temp_context);
static void conversion_error_callback(void *arg);
-
/*
* Foreign-data wrapper handler function: return a struct with pointers
* to my callback routines.
@@ -982,6 +988,15 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
fsstate->param_values = (const char **) palloc0(numParams * sizeof(char *));
else
fsstate->param_values = NULL;
+
+ fsstate->econtext = node->ss.ps.ps_ExprContext;
+
+ /*
+ * Start scanning asynchronously if it is the first scan on this
+ * connection.
+ */
+ if (PFCgetNscans(fsstate->conn) == 1)
+ create_cursor(fsstate);
}
/*
@@ -1000,7 +1015,10 @@ postgresIterateForeignScan(ForeignScanState *node)
* cursor on the remote side.
*/
if (!fsstate->cursor_exists)
- create_cursor(node);
+ {
+ finish_async_query(fsstate->conn);
+ create_cursor(fsstate);
+ }
/*
* Get some more tuples, if we've run out.
@@ -1009,7 +1027,7 @@ postgresIterateForeignScan(ForeignScanState *node)
{
/* No point in another fetch if we already detected EOF, though. */
if (!fsstate->eof_reached)
- fetch_more_data(node);
+ fetch_more_data(fsstate, ALLOW_ASYNC);
/* If we didn't get any tuples, must be end of data. */
if (fsstate->next_tuple >= fsstate->num_tuples)
return ExecClearTuple(slot);
@@ -1069,7 +1087,7 @@ postgresReScanForeignScan(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fsstate->conn, sql);
+ res = PFCexec(fsstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fsstate->conn, true, sql);
PQclear(res);
@@ -1392,19 +1410,22 @@ postgresExecForeignInsert(EState *estate,
/* Convert parameters needed by prepared statement to text form */
p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1462,19 +1483,22 @@ postgresExecForeignUpdate(EState *estate,
(ItemPointer) DatumGetPointer(datum),
slot);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1532,19 +1556,22 @@ postgresExecForeignDelete(EState *estate,
(ItemPointer) DatumGetPointer(datum),
NULL);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1594,7 +1621,7 @@ postgresEndForeignModify(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fmstate->conn, sql);
+ res = PFCexec(fmstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, sql);
PQclear(res);
@@ -1726,7 +1753,7 @@ estimate_path_cost_size(PlannerInfo *root,
List *local_join_conds;
StringInfoData sql;
List *retrieved_attrs;
- PGconn *conn;
+ PgFdwConn *conn;
Selectivity local_sel;
QualCost local_cost;
@@ -1836,7 +1863,7 @@ estimate_path_cost_size(PlannerInfo *root,
* The given "sql" must be an EXPLAIN command.
*/
static void
-get_remote_estimate(const char *sql, PGconn *conn,
+get_remote_estimate(const char *sql, PgFdwConn *conn,
double *rows, int *width,
Cost *startup_cost, Cost *total_cost)
{
@@ -1852,7 +1879,7 @@ get_remote_estimate(const char *sql, PGconn *conn,
/*
* Execute EXPLAIN remotely.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql);
@@ -1917,13 +1944,12 @@ ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
* Create cursor for node's query with current parameter values.
*/
static void
-create_cursor(ForeignScanState *node)
+create_cursor(PgFdwScanState *fsstate)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
- ExprContext *econtext = node->ss.ps.ps_ExprContext;
+ ExprContext *econtext = fsstate->econtext;
int numParams = fsstate->numParams;
const char **values = fsstate->param_values;
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
StringInfoData buf;
PGresult *res;
@@ -1985,8 +2011,8 @@ create_cursor(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecParams(conn, buf.data, numParams, NULL, values,
- NULL, NULL, 0);
+ res = PFCexecParams(conn, buf.data, numParams, NULL, values,
+ NULL, NULL, 0);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, fsstate->query);
PQclear(res);
@@ -2001,33 +2027,45 @@ create_cursor(ForeignScanState *node)
/* Clean up */
pfree(buf.data);
+
+ /*
+ * Start async scan if this is the first scan. See fetch_more_data() for
+ * details
+ */
+ if (PFCgetNscans(conn) == 1)
+ fetch_more_data(fsstate, START_ONLY);
}
/*
* Fetch some more rows from the node's cursor.
*/
static void
-fetch_more_data(ForeignScanState *node)
+fetch_more_data(PgFdwScanState *fsstate, fetch_mode cmd)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
PGresult *volatile res = NULL;
MemoryContext oldcontext;
/*
* We'll store the tuples in the batch_cxt. First, flush the previous
- * batch.
+ * batch. Some tuples may be left unread when asynchronous fetching is
+ * interrupted; in that case, don't flush, so that the unread tuples are
+ * preserved. This occurs no more than twice in succession.
*/
- fsstate->tuples = NULL;
- MemoryContextReset(fsstate->batch_cxt);
+ if (fsstate->next_tuple >= fsstate->num_tuples)
+ {
+ fsstate->tuples = NULL;
+ MemoryContextReset(fsstate->batch_cxt);
+ }
oldcontext = MemoryContextSwitchTo(fsstate->batch_cxt);
/* PGresult must be released before leaving this function. */
PG_TRY();
{
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
char sql[64];
int fetch_size;
- int numrows;
+ int numrows, addrows, restrows;
+ HeapTuple *tmptuples;
int i;
/* The fetch size is arbitrary, but shouldn't be enormous. */
@@ -2036,20 +2074,81 @@ fetch_more_data(ForeignScanState *node)
snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
fetch_size, fsstate->cursor_number);
- res = PQexec(conn, sql);
+ if (PFCisAsyncRunning(conn))
+ {
+ Assert (cmd != START_ONLY);
+
+ /*
+ * If the target fsstate is different from the scan state that the
+ * current async fetch running for, the result should be stored
+ * into it, then synchronously fetch data for the target fsstate.
+ */
+ if (fsstate != PFCgetAsyncScan(conn))
+ {
+ fetch_more_data(PFCgetAsyncScan(conn), FORCE_SYNC);
+ res = PFCexec(conn, sql);
+ }
+ else
+ {
+ /* Get result of running async fetch */
+ res = PFCgetResult(conn);
+ if (PQntuples(res) == fetch_size)
+ {
+ /*
+ * Connection state doesn't go to IDLE even if all data
+ * has been sent to client for asynchronous query. One
+ * more PQgetResult() is needed to reset the state to
+ * IDLE. See PQexecFinish() for details.
+ */
+ if (PFCgetResult(conn) != NULL)
+ elog(ERROR, "Connection status error.");
+ }
+ }
+ PFCsetAsyncScan(conn, NULL);
+ }
+ else
+ {
+ if (cmd == START_ONLY)
+ {
+ Assert(PFCgetNscans(conn) == 1);
+
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false,
+ fsstate->query);
+
+ PFCsetAsyncScan(conn, fsstate);
+ goto end_of_fetch;
+ }
+
+ /* Otherwise do synchronous query execution */
+ PFCsetAsyncScan(conn, NULL);
+ res = PFCexec(conn, sql);
+ }
+
/* On error, report the original query, not the FETCH. */
- if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ if (res && PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
- /* Convert the data into HeapTuples */
- numrows = PQntuples(res);
+ /* allocate tuple storage */
+ tmptuples = fsstate->tuples;
+ addrows = PQntuples(res);
+ restrows = fsstate->num_tuples - fsstate->next_tuple;
+ numrows = restrows + addrows;
fsstate->tuples = (HeapTuple *) palloc0(numrows * sizeof(HeapTuple));
+
+ Assert(restrows == 0 || tmptuples);
+
+ /* copy unread tuples if any */
+ for (i = 0 ; i < restrows ; i++)
+ fsstate->tuples[i] = tmptuples[fsstate->next_tuple + i];
+
fsstate->num_tuples = numrows;
fsstate->next_tuple = 0;
- for (i = 0; i < numrows; i++)
+ /* Convert the data into HeapTuples */
+ for (i = 0 ; i < addrows; i++)
{
- fsstate->tuples[i] =
+ fsstate->tuples[restrows + i] =
make_tuple_from_result_row(res, i,
fsstate->rel,
fsstate->attinmeta,
@@ -2066,6 +2165,23 @@ fetch_more_data(ForeignScanState *node)
PQclear(res);
res = NULL;
+
+ if (cmd == ALLOW_ASYNC)
+ {
+ if (!fsstate->eof_reached)
+ {
+ /*
+ * We can immediately request the next bunch of tuples if
+ * we're on asynchronous connection.
+ */
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
+ PFCsetAsyncScan(conn, fsstate);
+ }
+ }
+
+end_of_fetch:
+ ; /* Nothing to do here but needed to make compiler quiet. */
}
PG_CATCH();
{
@@ -2079,6 +2195,28 @@ fetch_more_data(ForeignScanState *node)
}
/*
+ * Force cancelling async command state.
+ */
+void
+finish_async_query(PgFdwConn *conn)
+{
+ PgFdwScanState *fsstate = PFCgetAsyncScan(conn);
+ PgFdwConn *async_conn;
+
+ /* Nothing to do if no async connection */
+ if (fsstate == NULL) return;
+ async_conn = fsstate->conn;
+ if (!async_conn ||
+ PFCgetNscans(async_conn) == 1 ||
+ !PFCisAsyncRunning(async_conn))
+ return;
+
+ fetch_more_data(PFCgetAsyncScan(async_conn), FORCE_SYNC);
+
+ Assert(!PFCisAsyncRunning(async_conn));
+}
+
+/*
* Force assorted GUC parameters to settings that ensure that we'll output
* data values in a form that is unambiguous to the remote server.
*
@@ -2132,7 +2270,7 @@ reset_transmission_modes(int nestlevel)
* Utility routine to close a cursor.
*/
static void
-close_cursor(PGconn *conn, unsigned int cursor_number)
+close_cursor(PgFdwConn *conn, unsigned int cursor_number)
{
char sql[64];
PGresult *res;
@@ -2143,7 +2281,7 @@ close_cursor(PGconn *conn, unsigned int cursor_number)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -2165,6 +2303,9 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
GetPrepStmtNumber(fmstate->conn));
p_name = pstrdup(prep_name);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* We intentionally do not specify parameter types here, but leave the
* remote server to derive them by default. This avoids possible problems
@@ -2175,11 +2316,11 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQprepare(fmstate->conn,
- p_name,
- fmstate->query,
- 0,
- NULL);
+ res = PFCprepare(fmstate->conn,
+ p_name,
+ fmstate->query,
+ 0,
+ NULL);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -2297,7 +2438,7 @@ postgresAnalyzeForeignTable(Relation relation,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2329,7 +2470,7 @@ postgresAnalyzeForeignTable(Relation relation,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2379,7 +2520,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
unsigned int cursor_number;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2423,7 +2564,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
PQclear(res);
@@ -2453,7 +2594,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
snprintf(fetch_sql, sizeof(fetch_sql), "FETCH %d FROM c%u",
fetch_size, cursor_number);
- res = PQexec(conn, fetch_sql);
+ res = PFCexec(conn, fetch_sql);
/* On error, report the original query, not the FETCH. */
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2582,7 +2723,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
bool import_not_null = true;
ForeignServer *server;
UserMapping *mapping;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData buf;
PGresult *volatile res = NULL;
int numrows,
@@ -2615,7 +2756,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
conn = GetConnection(server, mapping, false);
/* Don't attempt to import collation if remote server hasn't got it */
- if (PQserverVersion(conn) < 90100)
+ if (PFCserverVersion(conn) < 90100)
import_collate = false;
/* Create workspace for strings */
@@ -2628,7 +2769,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfoString(&buf, "SELECT 1 FROM pg_catalog.pg_namespace WHERE nspname = ");
deparseStringLiteral(&buf, stmt->remote_schema);
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
@@ -2723,7 +2864,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfo(&buf, " ORDER BY c.relname, a.attnum");
/* Fetch the data */
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..b117a88 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -18,19 +18,22 @@
#include "nodes/relation.h"
#include "utils/relcache.h"
-#include "libpq-fe.h"
+#include "PgFdwConn.h"
+
+struct PgFdwScanState;
/* in postgres_fdw.c */
extern int set_transmission_modes(void);
extern void reset_transmission_modes(int nestlevel);
+extern void finish_async_query(PgFdwConn *conn);
/* in connection.c */
-extern PGconn *GetConnection(ForeignServer *server, UserMapping *user,
+extern PgFdwConn *GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt);
-extern void ReleaseConnection(PGconn *conn);
-extern unsigned int GetCursorNumber(PGconn *conn);
-extern unsigned int GetPrepStmtNumber(PGconn *conn);
-extern void pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+extern void ReleaseConnection(PgFdwConn *conn);
+extern unsigned int GetCursorNumber(PgFdwConn *conn);
+extern unsigned int GetPrepStmtNumber(PgFdwConn *conn);
+extern void pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql);
/* in option.c */
--
2.1.0.GIT
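The change to fetch_more_data() in the patch above keeps rows that are still unread when an asynchronous fetch is interrupted, and appends the newly fetched rows behind them. The following is only a minimal sketch of that buffer handling, not the patch's code: DemoScanState and demo_append_batch are hypothetical names, plain ints stand in for HeapTuple, and malloc/free stand in for palloc in batch_cxt.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for PgFdwScanState; ints replace HeapTuple. */
typedef struct DemoScanState
{
    int *tuples;      /* current batch of rows */
    int  num_tuples;  /* number of entries in tuples */
    int  next_tuple;  /* index of the next unread entry */
} DemoScanState;

/*
 * Append a freshly fetched batch behind the rows that are still unread.
 * Returns 0 on success, -1 if memory could not be allocated.
 */
static int
demo_append_batch(DemoScanState *st, const int *batch, int addrows)
{
    int  restrows;
    int  numrows;
    int *merged = NULL;

    assert(st->next_tuple >= 0 && st->next_tuple <= st->num_tuples);
    assert(addrows >= 0);

    restrows = st->num_tuples - st->next_tuple;   /* unread rows */
    numrows = restrows + addrows;

    if (numrows > 0)
    {
        merged = malloc((size_t) numrows * sizeof(int));
        if (merged == NULL)
            return -1;

        /* Unread rows keep their order at the front of the new array. */
        if (restrows > 0)
            memcpy(merged, st->tuples + st->next_tuple,
                   (size_t) restrows * sizeof(int));

        /* New rows follow the unread ones. */
        if (addrows > 0)
            memcpy(merged + restrows, batch,
                   (size_t) addrows * sizeof(int));
    }

    free(st->tuples);
    st->tuples = merged;
    st->num_tuples = numrows;
    st->next_tuple = 0;
    return 0;
}
```

For example, starting from an empty state, appending {1, 2, 3}, reading two rows, and then appending {4, 5} leaves the buffer holding 3, 4, 5 with next_tuple reset to 0.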
I think it's telling that varying the fetch size doubled the performance,
even on localhost. If you were to repeat this test across a network, the
performance difference would be far more drastic.
I understand the desire to keep the fetch size small by default, but I
think your results demonstrate how important the value is. At the very
least, it is worth reconsidering this "arbitrary" value. However, I think
the real solution is to make this configurable. It probably should be a
new option on the foreign server or table, but an argument could be made
for it to be global across the server just like work_mem.
Obviously, this shouldn't block your current patch but it's worth revisiting.
- Matt Kelly
Hello, thank you for the comment. I added an experimental adaptive
fetch size feature in this v6 patch.
At Tue, 20 Jan 2015 04:51:13 +0000, Matt Kelly <mkellycs@gmail.com> wrote in <CA+KcUkhLUo+Vaj4xR8GVsof_nW79uDZTDYhOSdt13CFJkaEEdQ@mail.gmail.com>
> I think it's telling that varying the fetch size doubled the performance,
> even on localhost. If you were to repeat this test across a network, the
> performance difference would be far more drastic.
I certainly think so.
> I understand the desire to keep the fetch size small by default, but I
> think your results demonstrate how important the value is. At the very
> least, it is worth reconsidering this "arbitrary" value. However, I think
> the real solution is to make this configurable. It probably should be a
> new option on the foreign server or table, but an argument could be made
> for it to be global across the server just like work_mem.
The optimal fetch count varies depending on the query. From the
performance point of view alone, it should equal the table size
for a simple scan on a table. Most joins also do not need to read
the target relations simultaneously. (A local merge join on
remote sorted results is not available, since the FDW is not
aware of the sortedness, but that may change in the near future.)
So I have found no appropriate policy for deciding the number.
Another point of view is the memory requirement. This wouldn't
matter when using the single-row mode of libpq, but that mode
doesn't allow multiple simultaneous queries. The space needed for
the fetch buffer varies widely in proportion to the average row
length. If it is 1 KB, 10000 rows require over 10 MB, which is
larger than the default value of work_mem. In this version I
tried an adaptive fetch_size based on the fetch duration and the
required buffer size of the previous turn. But a hard limit
cannot be imposed, since we cannot know the mean row length in
advance. So, for example, if the average row length suddenly
grows from 1 KB to 10 KB while fetch_size is 10000, 100 MB is
required for that turn. I think that, for ordinary cases, the
maximum fetch size cannot exceed 1000.
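The arithmetic behind these figures can be sketched as follows. This is only an illustration: demo_batch_kb and demo_max_fetch_size are hypothetical helpers, the estimate counts row payload only (tuple headers and palloc overhead would add to it), and the 4096 kB constant is PostgreSQL's default work_mem of 4MB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PostgreSQL's default work_mem is 4MB, i.e. 4096 kB. */
#define DEMO_DEFAULT_WORK_MEM_KB 4096

/* Lower bound on the fetch buffer size, in kB, for one batch. */
static size_t
demo_batch_kb(size_t avg_row_bytes, size_t fetch_size)
{
    /* Guard against overflow of the multiplication below. */
    assert(fetch_size == 0 || avg_row_bytes <= SIZE_MAX / fetch_size);
    return (avg_row_bytes * fetch_size) / 1024;
}

/* Largest fetch size whose payload fits in budget_kb, or 0 if unknown. */
static size_t
demo_max_fetch_size(size_t avg_row_bytes, size_t budget_kb)
{
    if (avg_row_bytes == 0)
        return 0;               /* row length not known yet */
    return (budget_kb * 1024) / avg_row_bytes;
}
```

With 1024-byte rows, demo_batch_kb(1024, 10000) gives 10000 kB, already more than twice the default work_mem; with 10240-byte rows the same fetch size needs 100000 kB. Under the default budget, demo_max_fetch_size(1024, DEMO_DEFAULT_WORK_MEM_KB) allows 4096 rows and demo_max_fetch_size(10240, DEMO_DEFAULT_WORK_MEM_KB) only 409.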
Attached is the new version, which implements the adaptive fetch
size. Simple test runs showed the values below. A single scan was
boosted by about 5% (no effect?) and a join by 33%. I don't
understand the former case, so I'll examine it tomorrow. This
doesn't seem so promising, though.
=====
master=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT * FROM ft1;
QUERY PLAN
-------------------------------------------------------------------------
Foreign Scan on ft1 (actual time=1.741..10046.272 rows=1000000 loops=1)
Planning time: 0.084 ms
Execution time: 10145.730 ms
(3 rows)
patched=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT * FROM ft1;
QUERY PLAN
------------------------------------------------------------------------
Foreign Scan on ft1 (actual time=1.072..9582.980 rows=1000000 loops=1)
Planning time: 0.077 ms
Execution time: 9683.164 ms
(3 rows)
================================
postgres=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT x.a, x.c, y.c FROM ft1 AS x JOIN ft1 AS y on x.a = y.a;
QUERY PLAN
--------------------------------------------------------------------------------------
Merge Join (actual time=18191.739..19534.001 rows=1000000 loops=1)
Merge Cond: (x.a = y.a)
-> Sort (actual time=9031.155..9294.465 rows=1000000 loops=1)
Sort Key: x.a
Sort Method: external sort Disk: 142728kB
-> Foreign Scan on ft1 x (actual time=1.156..6486.632 rows=1000000 loops=1)
-> Sort (actual time=9160.577..9479.076 rows=1000000 loops=1)
Sort Key: y.a
Sort Method: external sort Disk: 146632kB
-> Foreign Scan on ft1 y (actual time=0.641..6517.594 rows=1000000 loops=1)
Planning time: 0.203 ms
Execution time: 19626.881 ms
(12 rows)
patched=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT x.a, x.c, y.c FROM ft1 AS x JOIN ft1 AS y on x.a = y.a;
QUERY PLAN
--------------------------------------------------------------------------------------
Merge Join (actual time=11790.690..13134.071 rows=1000000 loops=1)
Merge Cond: (x.a = y.a)
-> Sort (actual time=8149.225..8413.611 rows=1000000 loops=1)
Sort Key: x.a
Sort Method: external sort Disk: 142728kB
-> Foreign Scan on ft1 x (actual time=0.679..3989.160 rows=1000000 loops=1)
-> Sort (actual time=3641.457..3957.240 rows=1000000 loops=1)
Sort Key: y.a
Sort Method: external sort Disk: 146632kB
-> Foreign Scan on ft1 y (actual time=0.605..1852.655 rows=1000000 loops=1)
Planning time: 0.203 ms
Execution time: 13226.414 ms
(12 rows)
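The percentages quoted earlier can be checked against the "Execution time" lines of the plans above. A tiny helper for that arithmetic (the function name is hypothetical, for illustration only):

```c
#include <assert.h>

/* Relative reduction in execution time, in percent of the original time. */
double
reduction_percent(double before_ms, double after_ms)
{
    assert(before_ms > 0.0);
    return (before_ms - after_ms) / before_ms * 100.0;
}
```

For the single scan, reduction_percent(10145.730, 9683.164) comes to about 4.6, and for the join, reduction_percent(19626.881, 13226.414) comes to about 32.6, matching the "about 5%" and "33%" figures.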
Obviously, this shouldn't block your current patch, but it's worth revisiting.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Asynchronous-execution-of-postgres_fdw-v6.patch (text/x-patch; charset=us-ascii)
From 8408ea7c5642a59428952162253640df007485a5 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Tue, 13 Jan 2015 19:20:35 +0900
Subject: [PATCH] Asynchronous execution of postgres_fdw v6
This is the modified version of Asynchronous execution of
postgres_fdw.
- Experimental adaptive fetch size added.
---
contrib/postgres_fdw/Makefile | 2 +-
contrib/postgres_fdw/PgFdwConn.c | 200 +++++++++++++++++++
contrib/postgres_fdw/PgFdwConn.h | 61 ++++++
contrib/postgres_fdw/connection.c | 82 ++++----
contrib/postgres_fdw/postgres_fdw.c | 386 +++++++++++++++++++++++++++++-------
contrib/postgres_fdw/postgres_fdw.h | 15 +-
6 files changed, 624 insertions(+), 122 deletions(-)
create mode 100644 contrib/postgres_fdw/PgFdwConn.c
create mode 100644 contrib/postgres_fdw/PgFdwConn.h
diff --git a/contrib/postgres_fdw/Makefile b/contrib/postgres_fdw/Makefile
index d2b98e1..d0913e2 100644
--- a/contrib/postgres_fdw/Makefile
+++ b/contrib/postgres_fdw/Makefile
@@ -1,7 +1,7 @@
# contrib/postgres_fdw/Makefile
MODULE_big = postgres_fdw
-OBJS = postgres_fdw.o option.o deparse.o connection.o $(WIN32RES)
+OBJS = postgres_fdw.o PgFdwConn.o option.o deparse.o connection.o $(WIN32RES)
PGFILEDESC = "postgres_fdw - foreign data wrapper for PostgreSQL"
PG_CPPFLAGS = -I$(libpq_srcdir)
diff --git a/contrib/postgres_fdw/PgFdwConn.c b/contrib/postgres_fdw/PgFdwConn.c
new file mode 100644
index 0000000..b13b597
--- /dev/null
+++ b/contrib/postgres_fdw/PgFdwConn.c
@@ -0,0 +1,200 @@
+/*-------------------------------------------------------------------------
+ *
+ * PgFdwConn.c
+ * PGconn extending wrapper to enable asynchronous query.
+ *
+ * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/postgres_fdw/PgFdwConn.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "PgFdwConn.h"
+
+#define PFC_ALLOCATE() ((PgFdwConn *)malloc(sizeof(PgFdwConn)))
+#define PFC_FREE(c) free(c)
+
+struct pgfdw_conn
+{
+ PGconn *pgconn; /* libpq connection for this connection */
+ int nscans; /* number of scans using this connection */
+ struct PgFdwScanState *async_scan; /* the scan currently running an
+ * async query on this connection */
+};
+
+void
+PFCsetAsyncScan(PgFdwConn *conn, struct PgFdwScanState *scan)
+{
+ conn->async_scan = scan;
+}
+
+struct PgFdwScanState *
+PFCgetAsyncScan(PgFdwConn *conn)
+{
+ return conn->async_scan;
+}
+
+int
+PFCisAsyncRunning(PgFdwConn *conn)
+{
+ return conn->async_scan != NULL;
+}
+
+PGconn *
+PFCgetPGconn(PgFdwConn *conn)
+{
+ return conn->pgconn;
+}
+
+int
+PFCgetNscans(PgFdwConn *conn)
+{
+ return conn->nscans;
+}
+
+int
+PFCincrementNscans(PgFdwConn *conn)
+{
+ return ++conn->nscans;
+}
+
+int
+PFCdecrementNscans(PgFdwConn *conn)
+{
+ Assert(conn->nscans > 0);
+ return --conn->nscans;
+}
+
+void
+PFCcancelAsync(PgFdwConn *conn)
+{
+ if (PFCisAsyncRunning(conn))
+ PFCconsumeInput(conn);
+}
+
+void
+PFCinit(PgFdwConn *conn)
+{
+ conn->async_scan = NULL;
+ conn->nscans = 0;
+}
+
+int
+PFCsendQuery(PgFdwConn *conn, const char *query)
+{
+ return PQsendQuery(conn->pgconn, query);
+}
+
+PGresult *
+PFCexec(PgFdwConn *conn, const char *query)
+{
+ return PQexec(conn->pgconn, query);
+}
+
+PGresult *
+PFCexecParams(PgFdwConn *conn,
+ const char *command,
+ int nParams,
+ const Oid *paramTypes,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat)
+{
+ return PQexecParams(conn->pgconn,
+ command, nParams, paramTypes, paramValues,
+ paramLengths, paramFormats, resultFormat);
+}
+
+PGresult *
+PFCprepare(PgFdwConn *conn,
+ const char *stmtName, const char *query,
+ int nParams, const Oid *paramTypes)
+{
+ return PQprepare(conn->pgconn, stmtName, query, nParams, paramTypes);
+}
+
+PGresult *
+PFCexecPrepared(PgFdwConn *conn,
+ const char *stmtName,
+ int nParams,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat)
+{
+ return PQexecPrepared(conn->pgconn,
+ stmtName, nParams, paramValues, paramLengths,
+ paramFormats, resultFormat);
+}
+
+PGresult *
+PFCgetResult(PgFdwConn *conn)
+{
+ return PQgetResult(conn->pgconn);
+}
+
+int
+PFCconsumeInput(PgFdwConn *conn)
+{
+ return PQconsumeInput(conn->pgconn);
+}
+
+int
+PFCisBusy(PgFdwConn *conn)
+{
+ return PQisBusy(conn->pgconn);
+}
+
+ConnStatusType
+PFCstatus(const PgFdwConn *conn)
+{
+ return PQstatus(conn->pgconn);
+}
+
+PGTransactionStatusType
+PFCtransactionStatus(const PgFdwConn *conn)
+{
+ return PQtransactionStatus(conn->pgconn);
+}
+
+int
+PFCserverVersion(const PgFdwConn *conn)
+{
+ return PQserverVersion(conn->pgconn);
+}
+
+char *
+PFCerrorMessage(const PgFdwConn *conn)
+{
+ return PQerrorMessage(conn->pgconn);
+}
+
+int
+PFCconnectionUsedPassword(const PgFdwConn *conn)
+{
+ return PQconnectionUsedPassword(conn->pgconn);
+}
+
+void
+PFCfinish(PgFdwConn *conn)
+{
+ PQfinish(conn->pgconn);
+ PFC_FREE(conn);
+}
+
+PgFdwConn *
+PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname)
+{
+ PgFdwConn *ret = PFC_ALLOCATE();
+
+ PFCinit(ret);
+ ret->pgconn = PQconnectdbParams(keywords, values, expand_dbname);
+
+ return ret;
+}
diff --git a/contrib/postgres_fdw/PgFdwConn.h b/contrib/postgres_fdw/PgFdwConn.h
new file mode 100644
index 0000000..f695f5a
--- /dev/null
+++ b/contrib/postgres_fdw/PgFdwConn.h
@@ -0,0 +1,61 @@
+/*-------------------------------------------------------------------------
+ *
+ * PgFdwConn.h
+ * PGconn extending wrapper to enable asynchronous query.
+ *
+ * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/postgres_fdw/PgFdwConn.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PGFDWCONN_H
+#define PGFDWCONN_H
+
+#include "libpq-fe.h"
+
+typedef struct pgfdw_conn PgFdwConn;
+struct PgFdwScanState;
+
+extern void PFCsetAsyncScan(PgFdwConn *conn, struct PgFdwScanState *scan);
+extern struct PgFdwScanState *PFCgetAsyncScan(PgFdwConn *conn);
+extern int PFCisAsyncRunning(PgFdwConn *conn);
+extern PGconn *PFCgetPGconn(PgFdwConn *conn);
+extern int PFCgetNscans(PgFdwConn *conn);
+extern int PFCincrementNscans(PgFdwConn *conn);
+extern int PFCdecrementNscans(PgFdwConn *conn);
+extern void PFCcancelAsync(PgFdwConn *conn);
+extern void PFCinit(PgFdwConn *conn);
+extern int PFCsendQuery(PgFdwConn *conn, const char *query);
+extern PGresult *PFCexec(PgFdwConn *conn, const char *query);
+extern PGresult *PFCexecParams(PgFdwConn *conn,
+ const char *command,
+ int nParams,
+ const Oid *paramTypes,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat);
+extern PGresult *PFCprepare(PgFdwConn *conn,
+ const char *stmtName, const char *query,
+ int nParams, const Oid *paramTypes);
+extern PGresult *PFCexecPrepared(PgFdwConn *conn,
+ const char *stmtName,
+ int nParams,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat);
+extern PGresult *PFCgetResult(PgFdwConn *conn);
+extern int PFCconsumeInput(PgFdwConn *conn);
+extern int PFCisBusy(PgFdwConn *conn);
+extern ConnStatusType PFCstatus(const PgFdwConn *conn);
+extern PGTransactionStatusType PFCtransactionStatus(const PgFdwConn *conn);
+extern int PFCserverVersion(const PgFdwConn *conn);
+extern char *PFCerrorMessage(const PgFdwConn *conn);
+extern int PFCconnectionUsedPassword(const PgFdwConn *conn);
+extern void PFCfinish(PgFdwConn *conn);
+extern PgFdwConn *PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname);
+#endif /* PGFDWCONN_H */
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4e02cb2..2517f6b 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -44,7 +44,7 @@ typedef struct ConnCacheKey
typedef struct ConnCacheEntry
{
ConnCacheKey key; /* hash key (must be first) */
- PGconn *conn; /* connection to foreign server, or NULL */
+ PgFdwConn *conn; /* connection to foreign server, or NULL */
int xact_depth; /* 0 = no xact open, 1 = main xact open, 2 =
* one level of subxact open, etc */
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
@@ -64,10 +64,10 @@ static unsigned int prep_stmt_number = 0;
static bool xact_got_connection = false;
/* prototypes of private functions */
-static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
+static PgFdwConn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void check_conn_params(const char **keywords, const char **values);
-static void configure_remote_session(PGconn *conn);
-static void do_sql_command(PGconn *conn, const char *sql);
+static void configure_remote_session(PgFdwConn *conn);
+static void do_sql_command(PgFdwConn *conn, const char *sql);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -93,7 +93,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
* be useful and not mere pedantry. We could not flush any active connections
* mid-transaction anyway.
*/
-PGconn *
+PgFdwConn *
GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt)
{
@@ -161,9 +161,12 @@ GetConnection(ForeignServer *server, UserMapping *user,
entry->have_error = false;
entry->conn = connect_pg_server(server, user);
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ PFCgetPGconn(entry->conn), server->servername);
+
}
+ PFCincrementNscans(entry->conn);
+
/*
* Start a new transaction or subtransaction if needed.
*/
@@ -178,10 +181,10 @@ GetConnection(ForeignServer *server, UserMapping *user,
/*
* Connect to remote server using specified server and user mapping properties.
*/
-static PGconn *
+static PgFdwConn *
connect_pg_server(ForeignServer *server, UserMapping *user)
{
- PGconn *volatile conn = NULL;
+ PgFdwConn *volatile conn = NULL;
/*
* Use PG_TRY block to ensure closing connection on error.
@@ -223,14 +226,14 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
/* verify connection parameters and make connection */
check_conn_params(keywords, values);
- conn = PQconnectdbParams(keywords, values, false);
- if (!conn || PQstatus(conn) != CONNECTION_OK)
+ conn = PFCconnectdbParams(keywords, values, false);
+ if (!conn || PFCstatus(conn) != CONNECTION_OK)
{
char *connmessage;
int msglen;
/* libpq typically appends a newline, strip that */
- connmessage = pstrdup(PQerrorMessage(conn));
+ connmessage = pstrdup(PFCerrorMessage(conn));
msglen = strlen(connmessage);
if (msglen > 0 && connmessage[msglen - 1] == '\n')
connmessage[msglen - 1] = '\0';
@@ -246,7 +249,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
* otherwise, he's piggybacking on the postgres server's user
* identity. See also dblink_security_check() in contrib/dblink.
*/
- if (!superuser() && !PQconnectionUsedPassword(conn))
+ if (!superuser() && !PFCconnectionUsedPassword(conn))
ereport(ERROR,
(errcode(ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED),
errmsg("password is required"),
@@ -263,7 +266,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
{
/* Release PGconn data structure if we managed to create one */
if (conn)
- PQfinish(conn);
+ PFCfinish(conn);
PG_RE_THROW();
}
PG_END_TRY();
@@ -312,9 +315,9 @@ check_conn_params(const char **keywords, const char **values)
* there are any number of ways to break things.
*/
static void
-configure_remote_session(PGconn *conn)
+configure_remote_session(PgFdwConn *conn)
{
- int remoteversion = PQserverVersion(conn);
+ int remoteversion = PFCserverVersion(conn);
/* Force the search path to contain only pg_catalog (see deparse.c) */
do_sql_command(conn, "SET search_path = pg_catalog");
@@ -348,11 +351,11 @@ configure_remote_session(PGconn *conn)
* Convenience subroutine to issue a non-data-returning SQL command to remote
*/
static void
-do_sql_command(PGconn *conn, const char *sql)
+do_sql_command(PgFdwConn *conn, const char *sql)
{
PGresult *res;
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -379,7 +382,7 @@ begin_remote_xact(ConnCacheEntry *entry)
const char *sql;
elog(DEBUG3, "starting remote transaction on connection %p",
- entry->conn);
+ PFCgetPGconn(entry->conn));
if (IsolationIsSerializable())
sql = "START TRANSACTION ISOLATION LEVEL SERIALIZABLE";
@@ -408,13 +411,11 @@ begin_remote_xact(ConnCacheEntry *entry)
* Release connection reference count created by calling GetConnection.
*/
void
-ReleaseConnection(PGconn *conn)
+ReleaseConnection(PgFdwConn *conn)
{
- /*
- * Currently, we don't actually track connection references because all
- * cleanup is managed on a transaction or subtransaction basis instead. So
- * there's nothing to do here.
- */
+ /* ongoing async query should be canceled if no scans left */
+ if (PFCdecrementNscans(conn) == 0)
+ finish_async_query(conn);
}
/*
@@ -429,7 +430,7 @@ ReleaseConnection(PGconn *conn)
* collisions are highly improbable; just be sure to use %u not %d to print.
*/
unsigned int
-GetCursorNumber(PGconn *conn)
+GetCursorNumber(PgFdwConn *conn)
{
return ++cursor_number;
}
@@ -443,7 +444,7 @@ GetCursorNumber(PGconn *conn)
* increasing the risk of prepared-statement name collisions by resetting.
*/
unsigned int
-GetPrepStmtNumber(PGconn *conn)
+GetPrepStmtNumber(PgFdwConn *conn)
{
return ++prep_stmt_number;
}
@@ -462,7 +463,7 @@ GetPrepStmtNumber(PGconn *conn)
* marked with have_error = true.
*/
void
-pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql)
{
/* If requested, PGresult must be released before leaving this function. */
@@ -490,7 +491,7 @@ pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
* return NULL, not a PGresult at all.
*/
if (message_primary == NULL)
- message_primary = PQerrorMessage(conn);
+ message_primary = PFCerrorMessage(conn);
ereport(elevel,
(errcode(sqlstate),
@@ -542,7 +543,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
if (entry->xact_depth > 0)
{
elog(DEBUG3, "closing remote transaction on connection %p",
- entry->conn);
+ PFCgetPGconn(entry->conn));
switch (event)
{
@@ -567,7 +568,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
*/
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -597,7 +598,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Assume we might have lost track of prepared statements */
entry->have_error = true;
/* If we're aborting, abort all remote transactions too */
- res = PQexec(entry->conn, "ABORT TRANSACTION");
+ res = PFCexec(entry->conn, "ABORT TRANSACTION");
/* Note: can't throw ERROR, it would be infinite loop */
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true,
@@ -608,7 +609,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* As above, make sure to clear any prepared stmts */
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -620,17 +621,19 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Reset state to show we're out of a transaction */
entry->xact_depth = 0;
+ PFCcancelAsync(entry->conn);
+ PFCinit(entry->conn);
/*
* If the connection isn't in a good idle state, discard it to
* recover. Next GetConnection will open a new connection.
*/
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
+ if (PFCstatus(entry->conn) != CONNECTION_OK ||
+ PFCtransactionStatus(entry->conn) != PQTRANS_IDLE)
{
- elog(DEBUG3, "discarding connection %p", entry->conn);
- PQfinish(entry->conn);
- entry->conn = NULL;
+ elog(DEBUG3, "discarding connection %p",
+ PFCgetPGconn(entry->conn));
+ PFCfinish(entry->conn);
}
}
@@ -676,6 +679,9 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
PGresult *res;
char sql[100];
+ /* Shut down asynchronous scan if running */
+ PFCcancelAsync(entry->conn);
+
/*
* We only care about connections with open remote subtransactions of
* the current level.
@@ -701,7 +707,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
snprintf(sql, sizeof(sql),
"ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
curlevel, curlevel);
- res = PQexec(entry->conn, sql);
+ res = PFCexec(entry->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true, sql);
else
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d76e739..2c58377 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -46,6 +46,24 @@ PG_MODULE_MAGIC;
/* Default CPU cost to process 1 row (above and beyond cpu_tuple_cost). */
#define DEFAULT_FDW_TUPLE_COST 0.01
+/* Fetch size at startup. This might better be a GUC parameter. */
+#define MIN_FETCH_SIZE 100
+
+/* Maximum fetch size. This might better be a GUC parameter. */
+#define MAX_FETCH_SIZE 1000
+
+/*
+ * Maximum size for fetch buffer in kilobytes. Ditto.
+ *
+ * This should be far larger than sizeof(HeapTuple) * MAX_FETCH_SIZE. This is
+ * not a hard limit because we cannot know in advance the average row length
+ * returned.
+ */
+#define MAX_FETCH_BUFFER_SIZE 10000 /* 10MB */
+
+/* Maximum duration allowed for a single fetch, in milliseconds */
+#define MAX_FETCH_DURATION 500
+
/*
* FDW-specific planner information kept in RelOptInfo.fdw_private for a
* foreign table. This information is collected by postgresGetForeignRelSize.
@@ -123,6 +141,12 @@ enum FdwModifyPrivateIndex
FdwModifyPrivateRetrievedAttrs
};
+typedef enum fetch_mode {
+ START_ONLY,
+ FORCE_SYNC,
+ ALLOW_ASYNC
+} fetch_mode;
+
/*
* Execution state of a foreign scan using postgres_fdw.
*/
@@ -136,7 +160,7 @@ typedef struct PgFdwScanState
List *retrieved_attrs; /* list of retrieved attribute numbers */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
unsigned int cursor_number; /* quasi-unique ID for my cursor */
bool cursor_exists; /* have we created the cursor? */
int numParams; /* number of parameters passed to query */
@@ -148,7 +172,12 @@ typedef struct PgFdwScanState
HeapTuple *tuples; /* array of currently-retrieved tuples */
int num_tuples; /* # of tuples in array */
int next_tuple; /* index of next one to return */
-
+ int fetch_size; /* rows to be fetched at once */
+ int successive_async; /* # of successive fetches at this
+ fetch_size */
+ long last_fetch_req_at; /* The time of the last fetch request, in
+ * milliseconds*/
+ int last_buf_size; /* Buffer size required for the last fetch */
/* batch-level state, for optimizing rewinds and avoiding useless fetch */
int fetch_ct_2; /* Min(# of fetches done, 2) */
bool eof_reached; /* true if last fetch reached EOF */
@@ -156,6 +185,7 @@ typedef struct PgFdwScanState
/* working memory contexts */
MemoryContext batch_cxt; /* context holding current batch of tuples */
MemoryContext temp_cxt; /* context for per-tuple temporary data */
+ ExprContext *econtext; /* copy of ps_ExprContext of ForeignScanState */
} PgFdwScanState;
/*
@@ -167,7 +197,7 @@ typedef struct PgFdwModifyState
AttInMetadata *attinmeta; /* attribute datatype conversion metadata */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
char *p_name; /* name of prepared statement, if created */
/* extracted fdw_private data */
@@ -298,7 +328,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
double *p_rows, int *p_width,
Cost *p_startup_cost, Cost *p_total_cost);
static void get_remote_estimate(const char *sql,
- PGconn *conn,
+ PgFdwConn *conn,
double *rows,
int *width,
Cost *startup_cost,
@@ -306,9 +336,9 @@ static void get_remote_estimate(const char *sql,
static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
EquivalenceClass *ec, EquivalenceMember *em,
void *arg);
-static void create_cursor(ForeignScanState *node);
-static void fetch_more_data(ForeignScanState *node);
-static void close_cursor(PGconn *conn, unsigned int cursor_number);
+static void create_cursor(PgFdwScanState *node);
+static void close_cursor(PgFdwConn *conn, unsigned int cursor_number);
+static void fetch_more_data(PgFdwScanState *node, fetch_mode cmd);
static void prepare_foreign_modify(PgFdwModifyState *fmstate);
static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
ItemPointer tupleid,
@@ -329,7 +359,6 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
MemoryContext temp_context);
static void conversion_error_callback(void *arg);
-
/*
* Foreign-data wrapper handler function: return a struct with pointers
* to my callback routines.
@@ -982,6 +1011,19 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
fsstate->param_values = (const char **) palloc0(numParams * sizeof(char *));
else
fsstate->param_values = NULL;
+
+ fsstate->econtext = node->ss.ps.ps_ExprContext;
+
+ fsstate->fetch_size = MIN_FETCH_SIZE;
+ fsstate->successive_async = 0;
+ fsstate->last_buf_size = 0;
+
+ /*
+ * Start scanning asynchronously if it is the first scan on this
+ * connection.
+ */
+ if (PFCgetNscans(fsstate->conn) == 1)
+ create_cursor(fsstate);
}
/*
@@ -1000,7 +1042,10 @@ postgresIterateForeignScan(ForeignScanState *node)
* cursor on the remote side.
*/
if (!fsstate->cursor_exists)
- create_cursor(node);
+ {
+ finish_async_query(fsstate->conn);
+ create_cursor(fsstate);
+ }
/*
* Get some more tuples, if we've run out.
@@ -1009,7 +1054,7 @@ postgresIterateForeignScan(ForeignScanState *node)
{
/* No point in another fetch if we already detected EOF, though. */
if (!fsstate->eof_reached)
- fetch_more_data(node);
+ fetch_more_data(fsstate, ALLOW_ASYNC);
/* If we didn't get any tuples, must be end of data. */
if (fsstate->next_tuple >= fsstate->num_tuples)
return ExecClearTuple(slot);
@@ -1069,7 +1114,7 @@ postgresReScanForeignScan(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fsstate->conn, sql);
+ res = PFCexec(fsstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fsstate->conn, true, sql);
PQclear(res);
@@ -1392,19 +1437,22 @@ postgresExecForeignInsert(EState *estate,
/* Convert parameters needed by prepared statement to text form */
p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1462,19 +1510,22 @@ postgresExecForeignUpdate(EState *estate,
(ItemPointer) DatumGetPointer(datum),
slot);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1532,19 +1583,22 @@ postgresExecForeignDelete(EState *estate,
(ItemPointer) DatumGetPointer(datum),
NULL);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1594,7 +1648,7 @@ postgresEndForeignModify(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fmstate->conn, sql);
+ res = PFCexec(fmstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, sql);
PQclear(res);
@@ -1726,7 +1780,7 @@ estimate_path_cost_size(PlannerInfo *root,
List *local_join_conds;
StringInfoData sql;
List *retrieved_attrs;
- PGconn *conn;
+ PgFdwConn *conn;
Selectivity local_sel;
QualCost local_cost;
@@ -1836,7 +1890,7 @@ estimate_path_cost_size(PlannerInfo *root,
* The given "sql" must be an EXPLAIN command.
*/
static void
-get_remote_estimate(const char *sql, PGconn *conn,
+get_remote_estimate(const char *sql, PgFdwConn *conn,
double *rows, int *width,
Cost *startup_cost, Cost *total_cost)
{
@@ -1852,7 +1906,7 @@ get_remote_estimate(const char *sql, PGconn *conn,
/*
* Execute EXPLAIN remotely.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql);
@@ -1917,13 +1971,12 @@ ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
* Create cursor for node's query with current parameter values.
*/
static void
-create_cursor(ForeignScanState *node)
+create_cursor(PgFdwScanState *fsstate)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
- ExprContext *econtext = node->ss.ps.ps_ExprContext;
+ ExprContext *econtext = fsstate->econtext;
int numParams = fsstate->numParams;
const char **values = fsstate->param_values;
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
StringInfoData buf;
PGresult *res;
@@ -1985,8 +2038,8 @@ create_cursor(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecParams(conn, buf.data, numParams, NULL, values,
- NULL, NULL, 0);
+ res = PFCexecParams(conn, buf.data, numParams, NULL, values,
+ NULL, NULL, 0);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, fsstate->query);
PQclear(res);
@@ -2001,71 +2054,216 @@ create_cursor(ForeignScanState *node)
/* Clean up */
pfree(buf.data);
+
+ /*
+ * Start async scan if this is the first scan. See fetch_more_data() for
+ * details
+ */
+ if (PFCgetNscans(conn) == 1)
+ fetch_more_data(fsstate, START_ONLY);
}
/*
* Fetch some more rows from the node's cursor.
*/
static void
-fetch_more_data(ForeignScanState *node)
+fetch_more_data(PgFdwScanState *fsstate, fetch_mode cmd)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
PGresult *volatile res = NULL;
MemoryContext oldcontext;
/*
* We'll store the tuples in the batch_cxt. First, flush the previous
- * batch.
+ * batch. Some tuples may be left unread when an asynchronous fetch is
+ * interrupted; in that case, don't flush, so that they are preserved.
+ * This occurs no more than twice successively.
*/
- fsstate->tuples = NULL;
- MemoryContextReset(fsstate->batch_cxt);
+ if (fsstate->next_tuple >= fsstate->num_tuples)
+ {
+ fsstate->tuples = NULL;
+ MemoryContextReset(fsstate->batch_cxt);
+ }
oldcontext = MemoryContextSwitchTo(fsstate->batch_cxt);
/* PGresult must be released before leaving this function. */
PG_TRY();
{
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
char sql[64];
- int fetch_size;
- int numrows;
+ int numrows, addrows, restrows;
+ HeapTuple *tmptuples;
+ int prev_fetch_size = fsstate->fetch_size;
+ int new_fetch_size = fsstate->fetch_size;
int i;
+ struct timeval tv = {0, 0};
+ long current_time;
+ int fetch_buf_size;
- /* The fetch size is arbitrary, but shouldn't be enormous. */
- fetch_size = 100;
+ gettimeofday(&tv, NULL);
+ current_time = tv.tv_sec * 1000 + tv.tv_usec / 1000;
+ /*
+ * Adaptive fetch size
+ *
+ * Since we don't know in advance how long fetching takes or how much
+ * space the received tuples need, change fetch_size dynamically
+ * according to the maximum allowed duration and the maximum buffer
+ * space.
+ */
+ if (fsstate->last_buf_size > MAX_FETCH_BUFFER_SIZE)
+ {
+ new_fetch_size =
+ (int)((double)fsstate->fetch_size * MAX_FETCH_BUFFER_SIZE /
+ fsstate->last_buf_size);
+ }
+ if (PFCisBusy(conn) &&
+ fsstate->fetch_size > MIN_FETCH_SIZE &&
+ fsstate->last_fetch_req_at + MAX_FETCH_DURATION <
+ current_time)
+ {
+ int tmp_fetch_size = fsstate->fetch_size / 2;
+ if (tmp_fetch_size < new_fetch_size)
+ new_fetch_size = tmp_fetch_size;
+ }
+
+ /* Increase if not decreased and other conditions match. */
+ if (new_fetch_size == fsstate->fetch_size &&
+ fsstate->successive_async > 8 &&
+ fsstate->fetch_size < MAX_FETCH_SIZE)
+ new_fetch_size = fsstate->fetch_size * 2;
+
+ /* Apply the new fetch size, clamped to the allowed range. */
+ if (new_fetch_size != fsstate->fetch_size)
+ {
+ if (new_fetch_size > MAX_FETCH_SIZE)
+ fsstate->fetch_size = MAX_FETCH_SIZE;
+ else if (new_fetch_size < MIN_FETCH_SIZE)
+ fsstate->fetch_size = MIN_FETCH_SIZE;
+ else
+ fsstate->fetch_size = new_fetch_size;
+ fsstate->successive_async = 0;
+ }
+
+
+ /* Making the query to fetch tuples */
snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
- fetch_size, fsstate->cursor_number);
+ fsstate->fetch_size, fsstate->cursor_number);
+
+ if (PFCisAsyncRunning(conn))
+ {
+ Assert (cmd != START_ONLY);
+
+ /*
+ * If the target fsstate is different from the scan state that the
+ * current async fetch running for, the result should be stored
+ * into it, then synchronously fetch data for the target fsstate.
+ */
+ if (fsstate != PFCgetAsyncScan(conn))
+ {
+ fetch_more_data(PFCgetAsyncScan(conn), FORCE_SYNC);
+ res = PFCexec(conn, sql);
+ }
+ else
+ {
+ /* Get result of running async fetch */
+ res = PFCgetResult(conn);
+ if (PQntuples(res) == prev_fetch_size)
+ {
+ /*
+ * Connection state doesn't go to IDLE even if all data
+ * has been sent to client for asynchronous query. One
+ * more PQgetResult() is needed to reset the state to
+ * IDLE. See PQexecFinish() for details.
+ */
+ if (PFCgetResult(conn) != NULL)
+ elog(ERROR, "Connection status error.");
+ }
+ }
+ PFCsetAsyncScan(conn, NULL);
+ }
+ else
+ {
+ if (cmd == START_ONLY)
+ {
+ Assert(PFCgetNscans(conn) == 1);
+
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false,
+ fsstate->query);
+ fsstate->last_fetch_req_at = current_time;
+
+ PFCsetAsyncScan(conn, fsstate);
+ goto end_of_fetch;
+ }
+
+ /* Otherwise do synchronous query execution */
+ PFCsetAsyncScan(conn, NULL);
+ res = PFCexec(conn, sql);
+ }
- res = PQexec(conn, sql);
/* On error, report the original query, not the FETCH. */
- if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ if (res && PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
- /* Convert the data into HeapTuples */
- numrows = PQntuples(res);
- fsstate->tuples = (HeapTuple *) palloc0(numrows * sizeof(HeapTuple));
+ /* allocate tuple storage */
+ tmptuples = fsstate->tuples;
+ addrows = PQntuples(res);
+ restrows = fsstate->num_tuples - fsstate->next_tuple;
+ numrows = restrows + addrows;
+ fetch_buf_size = numrows * sizeof(HeapTuple);
+ fsstate->tuples = (HeapTuple *) palloc0(fetch_buf_size);
+
+ Assert(restrows == 0 || tmptuples);
+
+ /* copy unread tuples if any */
+ for (i = 0 ; i < restrows ; i++)
+ fsstate->tuples[i] = tmptuples[fsstate->next_tuple + i];
+
fsstate->num_tuples = numrows;
fsstate->next_tuple = 0;
- for (i = 0; i < numrows; i++)
+ /* Convert the data into HeapTuples */
+ for (i = 0 ; i < addrows; i++)
{
- fsstate->tuples[i] =
+ HeapTuple tup =
make_tuple_from_result_row(res, i,
fsstate->rel,
fsstate->attinmeta,
fsstate->retrieved_attrs,
fsstate->temp_cxt);
+ fsstate->tuples[restrows + i] = tup;
+ fetch_buf_size += (HEAPTUPLESIZE + tup->t_len);
}
+ fsstate->last_buf_size = fetch_buf_size / 1024; /* in kilobytes */
+
/* Update fetch_ct_2 */
if (fsstate->fetch_ct_2 < 2)
fsstate->fetch_ct_2++;
/* Must be EOF if we didn't get as many tuples as we asked for. */
- fsstate->eof_reached = (numrows < fetch_size);
+ fsstate->eof_reached = (numrows < prev_fetch_size);
PQclear(res);
res = NULL;
+
+ if (cmd == ALLOW_ASYNC)
+ {
+ if (!fsstate->eof_reached)
+ {
+ /*
+ * We can immediately request the next bunch of tuples if
+ * we're on asynchronous connection.
+ */
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
+ fsstate->last_fetch_req_at = current_time;
+ PFCsetAsyncScan(conn, fsstate);
+ }
+ }
+
+end_of_fetch:
+ ; /* Nothing to do here but needed to make compiler quiet. */
}
PG_CATCH();
{
@@ -2075,10 +2273,41 @@ fetch_more_data(ForeignScanState *node)
}
PG_END_TRY();
+ if (PFCisAsyncRunning(fsstate->conn))
+ fsstate->successive_async++;
+ else
+ {
+ /* Reset fetch_size if the async_fetch stopped */
+ fsstate->successive_async = 0;
+ fsstate->fetch_size = MIN_FETCH_SIZE;
+ }
+
MemoryContextSwitchTo(oldcontext);
}
/*
+ * Force cancelling async command state.
+ */
+void
+finish_async_query(PgFdwConn *conn)
+{
+ PgFdwScanState *fsstate = PFCgetAsyncScan(conn);
+ PgFdwConn *async_conn;
+
+ /* Nothing to do if no async connection */
+ if (fsstate == NULL) return;
+ async_conn = fsstate->conn;
+ if (!async_conn ||
+ PFCgetNscans(async_conn) == 1 ||
+ !PFCisAsyncRunning(async_conn))
+ return;
+
+ fetch_more_data(PFCgetAsyncScan(async_conn), FORCE_SYNC);
+
+ Assert(!PFCisAsyncRunning(async_conn));
+}
+
+/*
* Force assorted GUC parameters to settings that ensure that we'll output
* data values in a form that is unambiguous to the remote server.
*
@@ -2132,7 +2361,7 @@ reset_transmission_modes(int nestlevel)
* Utility routine to close a cursor.
*/
static void
-close_cursor(PGconn *conn, unsigned int cursor_number)
+close_cursor(PgFdwConn *conn, unsigned int cursor_number)
{
char sql[64];
PGresult *res;
@@ -2143,7 +2372,7 @@ close_cursor(PGconn *conn, unsigned int cursor_number)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -2165,6 +2394,9 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
GetPrepStmtNumber(fmstate->conn));
p_name = pstrdup(prep_name);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* We intentionally do not specify parameter types here, but leave the
* remote server to derive them by default. This avoids possible problems
@@ -2175,11 +2407,11 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQprepare(fmstate->conn,
- p_name,
- fmstate->query,
- 0,
- NULL);
+ res = PFCprepare(fmstate->conn,
+ p_name,
+ fmstate->query,
+ 0,
+ NULL);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -2297,7 +2529,7 @@ postgresAnalyzeForeignTable(Relation relation,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2329,7 +2561,7 @@ postgresAnalyzeForeignTable(Relation relation,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2379,7 +2611,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
unsigned int cursor_number;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2423,7 +2655,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
PQclear(res);
@@ -2453,7 +2685,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
snprintf(fetch_sql, sizeof(fetch_sql), "FETCH %d FROM c%u",
fetch_size, cursor_number);
- res = PQexec(conn, fetch_sql);
+ res = PFCexec(conn, fetch_sql);
/* On error, report the original query, not the FETCH. */
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2582,7 +2814,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
bool import_not_null = true;
ForeignServer *server;
UserMapping *mapping;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData buf;
PGresult *volatile res = NULL;
int numrows,
@@ -2615,7 +2847,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
conn = GetConnection(server, mapping, false);
/* Don't attempt to import collation if remote server hasn't got it */
- if (PQserverVersion(conn) < 90100)
+ if (PFCserverVersion(conn) < 90100)
import_collate = false;
/* Create workspace for strings */
@@ -2628,7 +2860,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfoString(&buf, "SELECT 1 FROM pg_catalog.pg_namespace WHERE nspname = ");
deparseStringLiteral(&buf, stmt->remote_schema);
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
@@ -2723,7 +2955,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfo(&buf, " ORDER BY c.relname, a.attnum");
/* Fetch the data */
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..b117a88 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -18,19 +18,22 @@
#include "nodes/relation.h"
#include "utils/relcache.h"
-#include "libpq-fe.h"
+#include "PgFdwConn.h"
+
+struct PgFdwScanState;
/* in postgres_fdw.c */
extern int set_transmission_modes(void);
extern void reset_transmission_modes(int nestlevel);
+extern void finish_async_query(PgFdwConn *conn);
/* in connection.c */
-extern PGconn *GetConnection(ForeignServer *server, UserMapping *user,
+extern PgFdwConn *GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt);
-extern void ReleaseConnection(PGconn *conn);
-extern unsigned int GetCursorNumber(PGconn *conn);
-extern unsigned int GetPrepStmtNumber(PGconn *conn);
-extern void pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+extern void ReleaseConnection(PgFdwConn *conn);
+extern unsigned int GetCursorNumber(PgFdwConn *conn);
+extern unsigned int GetPrepStmtNumber(PgFdwConn *conn);
+extern void pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql);
/* in option.c */
--
2.1.0.GIT
I'm trying to compare v5 and v6 on my laptop right now. Apparently my
laptop is quite a bit faster than your machine because the tests complete
in roughly 3.3 seconds.
I added more data and didn't see anything other than noise. (Then again
the queries were dominated by the disk sort so I should retry with larger
work_mem). I'll try it again when I have more time to play with it. I
suspect the benefits would be more clear over a network.
Larger than default work_mem yes, but I think one of the prime use cases for
the fdw is for more warehouse-style situations (PostgresXL-style use
cases). In those cases, work_mem might reasonably be set to 1GB. Then
even if you have 10KB rows you can fetch about 100,000 rows and still be using
less than work_mem. A simpler change would be to vary it with respect to
work_mem.
Half-baked idea: I know it's the wrong time in the execution phase, but if
you are using remote estimates for cost there should also be a row width
estimate, which I believe is derived from pg_statistic and its mean column
width.
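A minimal sketch of that idea, assuming a row width estimate and a memory budget (for example some fraction of work_mem) are both available at the point where the fetch size is chosen. The helper name and its parameters are hypothetical and are not part of postgres_fdw:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: pick a FETCH row count so that
 * rows * est_row_width stays within budget_bytes, clamped to
 * [min_rows, max_rows].  The width is only an estimate, so the
 * result bounds the expected memory use, not the actual one.
 */
int
fetch_size_for_budget(size_t budget_bytes, size_t est_row_width,
                      int min_rows, int max_rows)
{
    size_t rows;

    assert(min_rows > 0 && min_rows <= max_rows);

    /* No usable estimate: stay at the conservative minimum. */
    if (est_row_width == 0)
        return min_rows;

    rows = budget_bytes / est_row_width;

    if (rows < (size_t) min_rows)
        return min_rows;
    if (rows > (size_t) max_rows)
        return max_rows;
    return (int) rows;
}
```

For example, a 1MB budget with an estimated width of 140 bytes gives 7489 rows before clamping. Since the estimate can be badly off for a particular result set, this would probably still need a safety net at fetch time.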
It's actually a pity that there is no way to set fetch sizes based on "give
me as many tuples as will fit in less than x amount of memory", because
that is almost always exactly what you want. Even when writing application
code, I've never actually wanted precisely 10,000 rows; I've always wanted
"a reasonable size chunk that could fit into memory" and then backed my way
into how many rows I wanted. If we were to extend FETCH to support syntax
like: FETCH FORWARD '10MB' FROM ...; then we would eliminate the need to
estimate the value on the fly.
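To make the intended semantics concrete, here is a small sketch of what "as many tuples as fit in x bytes" could mean on the sending side. FETCH has no such syntax today, and the function below is purely illustrative: its name and the rule of always returning at least one row (so that an oversized row cannot stall the cursor) are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical: given the byte length of each upcoming row, return how
 * many rows can be sent without the running total exceeding
 * budget_bytes.  The first row is always accepted, even if it alone is
 * larger than the budget, so the caller always makes progress.
 */
size_t
rows_within_budget(const size_t *row_lens, size_t nrows, size_t budget_bytes)
{
    size_t used = 0;
    size_t i;

    assert(row_lens != NULL || nrows == 0);

    for (i = 0; i < nrows; i++)
    {
        /* Stop before the row that would push us over the budget. */
        if (i > 0 && used + row_lens[i] > budget_bytes)
            break;
        used += row_lens[i];
    }
    return i;
}
```

With three 400-byte rows and a 1000-byte budget this returns 2. The appeal of doing this on the remote side is that it sees the real row lengths, so the client no longer has to guess a row count.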
The async stuff, however, is a huge improvement over the last time I played
with the fdw. The two active postgres processes were easily consuming a
core and a half of CPU. I think it's not worth tying these two things
together. It's probably worth making these two separate discussions and
separate patches.
- Matt Kelly
*Just sanity checking myself: Shutting down the server, applying the
different patch, 'make clean install' in postgres_fdw, and then restarting
the server should obviously be sufficient to make sure it's running the new
code because that is all linked at runtime, right?
Hello, thank you for looking at this, and sorry that the last patch
was buggy: the adaptive fetch size did not work.
The attached is the fixed patch. It apparently improves the
performance for the test case shown in the previous mail, in
which the average tuple length is about 140 bytes.
21 Jan 2015 05:22:34 +0000, Matt Kelly <mkellycs@gmail.com> wrote in <CA+KcUkg4cvDLf4v0M9_rVv_ZuAsG1oDHPj_YvczJa6w2nSkwNQ@mail.gmail.com>
I'm trying to compare v5 and v6 on my laptop right now. Apparently my
laptop is quite a bit faster than your machine because the tests complete
in roughly 3.3 seconds.
I added more data and didn't see anything other than noise. (Then again
the queries were dominated by the disk sort so I should retry with larger
work_mem). I'll try it again when I have more time to play with it. I
suspect the benefits would be more clear over a network.
Larger than default work_mem yes, but I think one of the prime use cases for
the fdw is for more warehouse-style situations (PostgresXL-style use
cases). In those cases, work_mem might reasonably be set to 1GB. Then
even if you have 10KB rows you can fetch about 100,000 rows and still be using
less than work_mem. A simpler change would be to vary it with respect to
work_mem.
Agreed about the nature of the typical workload for postgres
FDW. But I think the server itself, including postgres_fdw, should not
crash even on a sudden explosion of tuple length. The number 100
seems to be safe enough, but 1000 seems suspicious, and 10000
looks dangerous from that standpoint.
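The adaptive fetch size is meant to cover exactly this case. The policy can be sketched in isolation as follows. This is a simplified illustration rather than the code in the attached patch: the constants mirror the patch's #defines, while FetchTuning, clamp_fetch_size(), adjust_fetch_size() and the last_fetch_too_slow flag (standing in for the "connection still busy after MAX_FETCH_DURATION" check) are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirror the patch's #defines; treat them as tunables. */
#define MIN_FETCH_SIZE                 100
#define MAX_FETCH_SIZE                 1000
#define MAX_FETCH_BUFFER_SIZE          10000   /* kilobytes */
#define INCREASE_FETCH_SIZE_THRESHOLD  8

/* Hypothetical stand-in for the relevant scan-state fields. */
typedef struct FetchTuning
{
    int fetch_size;        /* rows requested per FETCH */
    int last_buf_size;     /* size of the previous batch, in kilobytes */
    int successive_async;  /* async fetches since the last size change */
} FetchTuning;

int
clamp_fetch_size(int n)
{
    if (n > MAX_FETCH_SIZE)
        return MAX_FETCH_SIZE;
    if (n < MIN_FETCH_SIZE)
        return MIN_FETCH_SIZE;
    return n;
}

void
adjust_fetch_size(FetchTuning *t, bool last_fetch_too_slow)
{
    int new_size;

    assert(t != NULL);
    new_size = t->fetch_size;

    /* Shrink proportionally if the last batch overflowed the buffer. */
    if (t->last_buf_size > MAX_FETCH_BUFFER_SIZE)
        new_size = (int) ((double) t->fetch_size * MAX_FETCH_BUFFER_SIZE /
                          t->last_buf_size);

    /* Halve if the previous fetch ran past its deadline. */
    if (last_fetch_too_slow && t->fetch_size / 2 < new_size)
        new_size = t->fetch_size / 2;

    /* Grow only when nothing above asked for a decrease. */
    if (new_size == t->fetch_size &&
        t->successive_async > INCREASE_FETCH_SIZE_THRESHOLD)
        new_size = t->fetch_size * 2;

    new_size = clamp_fetch_size(new_size);
    if (new_size != t->fetch_size)
    {
        t->fetch_size = new_size;
        t->successive_async = 0;
    }
}
```

A batch of 500 rows that occupied 20000kB shrinks the next request to 250 rows, so a sudden jump in tuple length costs at most one oversized batch. The decision is computed into a local variable and committed once, so a grow and a shrink cannot overwrite each other.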
Half-baked idea: I know it's the wrong time in the execution phase, but if
you are using remote estimates for cost there should also be a row width
estimate, which I believe is derived from pg_statistic and its mean column
width.
It reduces the chance of claiming an unexpected amount of memory, but
the chance still remains.
It's actually a pity that there is no way to set fetch sizes based on "give
me as many tuples as will fit in less than x amount of memory", because
that is almost always exactly what you want. Even when writing application
code, I've never actually wanted precisely 10,000 rows; I've always wanted
"a reasonable size chunk that could fit into memory" and then backed my way
into how many rows I wanted. If we were to extend FETCH to support syntax
like: FETCH FORWARD '10MB' FROM ...; then we would eliminate the need to
estimate the value on the fly.
I didn't think about that. It makes sense, at least to me :) There
would be many cases where the *amount* of data is more crucial
than the number of rows. I'll work on it.
The async stuff, however, is a huge improvement over the last time I played
with the fdw. The two active postgres processes were easily consuming a
core and a half of CPU. I think it's not worth tying these two things
together. It's probably worth making these two separate discussions and
separate patches.
Yes, they can be separated and also should be. I'll split them
after this.
- Matt Kelly
*Just sanity checking myself: Shutting down the server, applying the
different patch, 'make clean install' in postgres_fdw, and then restarting
the server should obviously be sufficient to make sure it's running the new
code because that is all linked at runtime, right?
Yes, it's enough, and I also did so. This patch touches only
postgres_fdw.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Asynchronous-execution-of-postgres_fdw-v7.patch (text/x-patch; charset=us-ascii)
From 12d0e8666871e2272beb1b85903baf0be92881b5 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Tue, 13 Jan 2015 19:20:35 +0900
Subject: [PATCH] Asynchronous execution of postgres_fdw v7
This is the bug-fixed version of v6, which was a modified version of
Asynchronous execution of postgres_fdw.
---
contrib/postgres_fdw/Makefile | 2 +-
contrib/postgres_fdw/PgFdwConn.c | 200 ++++++++++++++++++
contrib/postgres_fdw/PgFdwConn.h | 61 ++++++
contrib/postgres_fdw/connection.c | 82 ++++----
contrib/postgres_fdw/postgres_fdw.c | 391 +++++++++++++++++++++++++++++-------
contrib/postgres_fdw/postgres_fdw.h | 15 +-
6 files changed, 629 insertions(+), 122 deletions(-)
create mode 100644 contrib/postgres_fdw/PgFdwConn.c
create mode 100644 contrib/postgres_fdw/PgFdwConn.h
diff --git a/contrib/postgres_fdw/Makefile b/contrib/postgres_fdw/Makefile
index d2b98e1..d0913e2 100644
--- a/contrib/postgres_fdw/Makefile
+++ b/contrib/postgres_fdw/Makefile
@@ -1,7 +1,7 @@
# contrib/postgres_fdw/Makefile
MODULE_big = postgres_fdw
-OBJS = postgres_fdw.o option.o deparse.o connection.o $(WIN32RES)
+OBJS = postgres_fdw.o PgFdwConn.o option.o deparse.o connection.o $(WIN32RES)
PGFILEDESC = "postgres_fdw - foreign data wrapper for PostgreSQL"
PG_CPPFLAGS = -I$(libpq_srcdir)
diff --git a/contrib/postgres_fdw/PgFdwConn.c b/contrib/postgres_fdw/PgFdwConn.c
new file mode 100644
index 0000000..b13b597
--- /dev/null
+++ b/contrib/postgres_fdw/PgFdwConn.c
@@ -0,0 +1,200 @@
+/*-------------------------------------------------------------------------
+ *
+ * PgFdwConn.c
+ * PGconn extending wrapper to enable asynchronous query.
+ *
+ * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/postgres_fdw/PgFdwConn.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "PgFdwConn.h"
+
+#define PFC_ALLOCATE() ((PgFdwConn *)malloc(sizeof(PgFdwConn)))
+#define PFC_FREE(c) free(c)
+
+struct pgfdw_conn
+{
+ PGconn *pgconn; /* libpq connection for this connection */
+ int nscans; /* number of scans using this connection */
+ struct PgFdwScanState *async_scan; /* the scan currently running an
+ * async query on this connection */
+};
+
+void
+PFCsetAsyncScan(PgFdwConn *conn, struct PgFdwScanState *scan)
+{
+ conn->async_scan = scan;
+}
+
+struct PgFdwScanState *
+PFCgetAsyncScan(PgFdwConn *conn)
+{
+ return conn->async_scan;
+}
+
+int
+PFCisAsyncRunning(PgFdwConn *conn)
+{
+ return conn->async_scan != NULL;
+}
+
+PGconn *
+PFCgetPGconn(PgFdwConn *conn)
+{
+ return conn->pgconn;
+}
+
+int
+PFCgetNscans(PgFdwConn *conn)
+{
+ return conn->nscans;
+}
+
+int
+PFCincrementNscans(PgFdwConn *conn)
+{
+ return ++conn->nscans;
+}
+
+int
+PFCdecrementNscans(PgFdwConn *conn)
+{
+ Assert(conn->nscans > 0);
+ return --conn->nscans;
+}
+
+void
+PFCcancelAsync(PgFdwConn *conn)
+{
+ if (PFCisAsyncRunning(conn))
+ PFCconsumeInput(conn);
+}
+
+void
+PFCinit(PgFdwConn *conn)
+{
+ conn->async_scan = NULL;
+ conn->nscans = 0;
+}
+
+int
+PFCsendQuery(PgFdwConn *conn, const char *query)
+{
+ return PQsendQuery(conn->pgconn, query);
+}
+
+PGresult *
+PFCexec(PgFdwConn *conn, const char *query)
+{
+ return PQexec(conn->pgconn, query);
+}
+
+PGresult *
+PFCexecParams(PgFdwConn *conn,
+ const char *command,
+ int nParams,
+ const Oid *paramTypes,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat)
+{
+ return PQexecParams(conn->pgconn,
+ command, nParams, paramTypes, paramValues,
+ paramLengths, paramFormats, resultFormat);
+}
+
+PGresult *
+PFCprepare(PgFdwConn *conn,
+ const char *stmtName, const char *query,
+ int nParams, const Oid *paramTypes)
+{
+ return PQprepare(conn->pgconn, stmtName, query, nParams, paramTypes);
+}
+
+PGresult *
+PFCexecPrepared(PgFdwConn *conn,
+ const char *stmtName,
+ int nParams,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat)
+{
+ return PQexecPrepared(conn->pgconn,
+ stmtName, nParams, paramValues, paramLengths,
+ paramFormats, resultFormat);
+}
+
+PGresult *
+PFCgetResult(PgFdwConn *conn)
+{
+ return PQgetResult(conn->pgconn);
+}
+
+int
+PFCconsumeInput(PgFdwConn *conn)
+{
+ return PQconsumeInput(conn->pgconn);
+}
+
+int
+PFCisBusy(PgFdwConn *conn)
+{
+ return PQisBusy(conn->pgconn);
+}
+
+ConnStatusType
+PFCstatus(const PgFdwConn *conn)
+{
+ return PQstatus(conn->pgconn);
+}
+
+PGTransactionStatusType
+PFCtransactionStatus(const PgFdwConn *conn)
+{
+ return PQtransactionStatus(conn->pgconn);
+}
+
+int
+PFCserverVersion(const PgFdwConn *conn)
+{
+ return PQserverVersion(conn->pgconn);
+}
+
+char *
+PFCerrorMessage(const PgFdwConn *conn)
+{
+ return PQerrorMessage(conn->pgconn);
+}
+
+int
+PFCconnectionUsedPassword(const PgFdwConn *conn)
+{
+ return PQconnectionUsedPassword(conn->pgconn);
+}
+
+void
+PFCfinish(PgFdwConn *conn)
+{
+ return PQfinish(conn->pgconn);
+ PFC_FREE(conn);
+}
+
+PgFdwConn *
+PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname)
+{
+ PgFdwConn *ret = PFC_ALLOCATE();
+
+ PFCinit(ret);
+ ret->pgconn = PQconnectdbParams(keywords, values, expand_dbname);
+
+ return ret;
+}
diff --git a/contrib/postgres_fdw/PgFdwConn.h b/contrib/postgres_fdw/PgFdwConn.h
new file mode 100644
index 0000000..f695f5a
--- /dev/null
+++ b/contrib/postgres_fdw/PgFdwConn.h
@@ -0,0 +1,61 @@
+/*-------------------------------------------------------------------------
+ *
+ * PgFdwConn.h
+ * PGconn extending wrapper to enable asynchronous query.
+ *
+ * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/postgres_fdw/PgFdwConn.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PGFDWCONN_H
+#define PGFDWCONN_H
+
+#include "libpq-fe.h"
+
+typedef struct pgfdw_conn PgFdwConn;
+struct PgFdwScanState;
+
+extern void PFCsetAsyncScan(PgFdwConn *conn, struct PgFdwScanState *scan);
+extern struct PgFdwScanState *PFCgetAsyncScan(PgFdwConn *conn);
+extern int PFCisAsyncRunning(PgFdwConn *conn);
+extern PGconn *PFCgetPGconn(PgFdwConn *conn);
+extern int PFCgetNscans(PgFdwConn *conn);
+extern int PFCincrementNscans(PgFdwConn *conn);
+extern int PFCdecrementNscans(PgFdwConn *conn);
+extern void PFCcancelAsync(PgFdwConn *conn);
+extern void PFCinit(PgFdwConn *conn);
+extern int PFCsendQuery(PgFdwConn *conn, const char *query);
+extern PGresult *PFCexec(PgFdwConn *conn, const char *query);
+extern PGresult *PFCexecParams(PgFdwConn *conn,
+ const char *command,
+ int nParams,
+ const Oid *paramTypes,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat);
+extern PGresult *PFCprepare(PgFdwConn *conn,
+ const char *stmtName, const char *query,
+ int nParams, const Oid *paramTypes);
+extern PGresult *PFCexecPrepared(PgFdwConn *conn,
+ const char *stmtName,
+ int nParams,
+ const char *const * paramValues,
+ const int *paramLengths,
+ const int *paramFormats,
+ int resultFormat);
+extern PGresult *PFCgetResult(PgFdwConn *conn);
+extern int PFCconsumeInput(PgFdwConn *conn);
+extern int PFCisBusy(PgFdwConn *conn);
+extern ConnStatusType PFCstatus(const PgFdwConn *conn);
+extern PGTransactionStatusType PFCtransactionStatus(const PgFdwConn *conn);
+extern int PFCserverVersion(const PgFdwConn *conn);
+extern char *PFCerrorMessage(const PgFdwConn *conn);
+extern int PFCconnectionUsedPassword(const PgFdwConn *conn);
+extern void PFCfinish(PgFdwConn *conn);
+extern PgFdwConn *PFCconnectdbParams(const char *const * keywords,
+ const char *const * values, int expand_dbname);
+#endif /* PGFDWCONN_H */
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4e02cb2..2517f6b 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -44,7 +44,7 @@ typedef struct ConnCacheKey
typedef struct ConnCacheEntry
{
ConnCacheKey key; /* hash key (must be first) */
- PGconn *conn; /* connection to foreign server, or NULL */
+ PgFdwConn *conn; /* connection to foreign server, or NULL */
int xact_depth; /* 0 = no xact open, 1 = main xact open, 2 =
* one level of subxact open, etc */
bool have_prep_stmt; /* have we prepared any stmts in this xact? */
@@ -64,10 +64,10 @@ static unsigned int prep_stmt_number = 0;
static bool xact_got_connection = false;
/* prototypes of private functions */
-static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
+static PgFdwConn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void check_conn_params(const char **keywords, const char **values);
-static void configure_remote_session(PGconn *conn);
-static void do_sql_command(PGconn *conn, const char *sql);
+static void configure_remote_session(PgFdwConn *conn);
+static void do_sql_command(PgFdwConn *conn, const char *sql);
static void begin_remote_xact(ConnCacheEntry *entry);
static void pgfdw_xact_callback(XactEvent event, void *arg);
static void pgfdw_subxact_callback(SubXactEvent event,
@@ -93,7 +93,7 @@ static void pgfdw_subxact_callback(SubXactEvent event,
* be useful and not mere pedantry. We could not flush any active connections
* mid-transaction anyway.
*/
-PGconn *
+PgFdwConn *
GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt)
{
@@ -161,9 +161,12 @@ GetConnection(ForeignServer *server, UserMapping *user,
entry->have_error = false;
entry->conn = connect_pg_server(server, user);
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ PFCgetPGconn(entry->conn), server->servername);
+
}
+ PFCincrementNscans(entry->conn);
+
/*
* Start a new transaction or subtransaction if needed.
*/
@@ -178,10 +181,10 @@ GetConnection(ForeignServer *server, UserMapping *user,
/*
* Connect to remote server using specified server and user mapping properties.
*/
-static PGconn *
+static PgFdwConn *
connect_pg_server(ForeignServer *server, UserMapping *user)
{
- PGconn *volatile conn = NULL;
+ PgFdwConn *volatile conn = NULL;
/*
* Use PG_TRY block to ensure closing connection on error.
@@ -223,14 +226,14 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
/* verify connection parameters and make connection */
check_conn_params(keywords, values);
- conn = PQconnectdbParams(keywords, values, false);
- if (!conn || PQstatus(conn) != CONNECTION_OK)
+ conn = PFCconnectdbParams(keywords, values, false);
+ if (!conn || PFCstatus(conn) != CONNECTION_OK)
{
char *connmessage;
int msglen;
/* libpq typically appends a newline, strip that */
- connmessage = pstrdup(PQerrorMessage(conn));
+ connmessage = pstrdup(PFCerrorMessage(conn));
msglen = strlen(connmessage);
if (msglen > 0 && connmessage[msglen - 1] == '\n')
connmessage[msglen - 1] = '\0';
@@ -246,7 +249,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
* otherwise, he's piggybacking on the postgres server's user
* identity. See also dblink_security_check() in contrib/dblink.
*/
- if (!superuser() && !PQconnectionUsedPassword(conn))
+ if (!superuser() && !PFCconnectionUsedPassword(conn))
ereport(ERROR,
(errcode(ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED),
errmsg("password is required"),
@@ -263,7 +266,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
{
/* Release PGconn data structure if we managed to create one */
if (conn)
- PQfinish(conn);
+ PFCfinish(conn);
PG_RE_THROW();
}
PG_END_TRY();
@@ -312,9 +315,9 @@ check_conn_params(const char **keywords, const char **values)
* there are any number of ways to break things.
*/
static void
-configure_remote_session(PGconn *conn)
+configure_remote_session(PgFdwConn *conn)
{
- int remoteversion = PQserverVersion(conn);
+ int remoteversion = PFCserverVersion(conn);
/* Force the search path to contain only pg_catalog (see deparse.c) */
do_sql_command(conn, "SET search_path = pg_catalog");
@@ -348,11 +351,11 @@ configure_remote_session(PGconn *conn)
* Convenience subroutine to issue a non-data-returning SQL command to remote
*/
static void
-do_sql_command(PGconn *conn, const char *sql)
+do_sql_command(PgFdwConn *conn, const char *sql)
{
PGresult *res;
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -379,7 +382,7 @@ begin_remote_xact(ConnCacheEntry *entry)
const char *sql;
elog(DEBUG3, "starting remote transaction on connection %p",
- entry->conn);
+ PFCgetPGconn(entry->conn));
if (IsolationIsSerializable())
sql = "START TRANSACTION ISOLATION LEVEL SERIALIZABLE";
@@ -408,13 +411,11 @@ begin_remote_xact(ConnCacheEntry *entry)
* Release connection reference count created by calling GetConnection.
*/
void
-ReleaseConnection(PGconn *conn)
+ReleaseConnection(PgFdwConn *conn)
{
- /*
- * Currently, we don't actually track connection references because all
- * cleanup is managed on a transaction or subtransaction basis instead. So
- * there's nothing to do here.
- */
+ /* ongoing async query should be canceled if no scans left */
+ if (PFCdecrementNscans(conn) == 0)
+ finish_async_query(conn);
}
/*
@@ -429,7 +430,7 @@ ReleaseConnection(PGconn *conn)
* collisions are highly improbable; just be sure to use %u not %d to print.
*/
unsigned int
-GetCursorNumber(PGconn *conn)
+GetCursorNumber(PgFdwConn *conn)
{
return ++cursor_number;
}
@@ -443,7 +444,7 @@ GetCursorNumber(PGconn *conn)
* increasing the risk of prepared-statement name collisions by resetting.
*/
unsigned int
-GetPrepStmtNumber(PGconn *conn)
+GetPrepStmtNumber(PgFdwConn *conn)
{
return ++prep_stmt_number;
}
@@ -462,7 +463,7 @@ GetPrepStmtNumber(PGconn *conn)
* marked with have_error = true.
*/
void
-pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql)
{
/* If requested, PGresult must be released before leaving this function. */
@@ -490,7 +491,7 @@ pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
* return NULL, not a PGresult at all.
*/
if (message_primary == NULL)
- message_primary = PQerrorMessage(conn);
+ message_primary = PFCerrorMessage(conn);
ereport(elevel,
(errcode(sqlstate),
@@ -542,7 +543,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
if (entry->xact_depth > 0)
{
elog(DEBUG3, "closing remote transaction on connection %p",
- entry->conn);
+ PFCgetPGconn(entry->conn));
switch (event)
{
@@ -567,7 +568,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
*/
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -597,7 +598,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Assume we might have lost track of prepared statements */
entry->have_error = true;
/* If we're aborting, abort all remote transactions too */
- res = PQexec(entry->conn, "ABORT TRANSACTION");
+ res = PFCexec(entry->conn, "ABORT TRANSACTION");
/* Note: can't throw ERROR, it would be infinite loop */
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true,
@@ -608,7 +609,7 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* As above, make sure to clear any prepared stmts */
if (entry->have_prep_stmt && entry->have_error)
{
- res = PQexec(entry->conn, "DEALLOCATE ALL");
+ res = PFCexec(entry->conn, "DEALLOCATE ALL");
PQclear(res);
}
entry->have_prep_stmt = false;
@@ -620,17 +621,19 @@ pgfdw_xact_callback(XactEvent event, void *arg)
/* Reset state to show we're out of a transaction */
entry->xact_depth = 0;
+ PFCcancelAsync(entry->conn);
+ PFCinit(entry->conn);
/*
* If the connection isn't in a good idle state, discard it to
* recover. Next GetConnection will open a new connection.
*/
- if (PQstatus(entry->conn) != CONNECTION_OK ||
- PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
+ if (PFCstatus(entry->conn) != CONNECTION_OK ||
+ PFCtransactionStatus(entry->conn) != PQTRANS_IDLE)
{
- elog(DEBUG3, "discarding connection %p", entry->conn);
- PQfinish(entry->conn);
- entry->conn = NULL;
+ elog(DEBUG3, "discarding connection %p",
+ PFCgetPGconn(entry->conn));
+ PFCfinish(entry->conn);
}
}
@@ -676,6 +679,9 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
PGresult *res;
char sql[100];
+ /* Shut down asynchronous scan if running */
+ PFCcancelAsync(entry->conn);
+
/*
* We only care about connections with open remote subtransactions of
* the current level.
@@ -701,7 +707,7 @@ pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid,
snprintf(sql, sizeof(sql),
"ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
curlevel, curlevel);
- res = PQexec(entry->conn, sql);
+ res = PFCexec(entry->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(WARNING, res, entry->conn, true, sql);
else
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d76e739..a146a2b 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -46,6 +46,27 @@ PG_MODULE_MAGIC;
/* Default CPU cost to process 1 row (above and beyond cpu_tuple_cost). */
#define DEFAULT_FDW_TUPLE_COST 0.01
+/* Fetch size at startup. This might better be a GUC parameter */
+#define MIN_FETCH_SIZE 100
+
+/* Maximum fetch size. This might better be a GUC parameter */
+#define MAX_FETCH_SIZE 1000
+
+/*
+ * Maximum size for fetch buffer in kilobytes. Ditto.
+ *
+ * This should be far larger than sizeof(HeapTuple) * MAX_FETCH_SIZE. This is
+ * not a hard limit because we cannot know in advance the average row length
+ * returned.
+ */
+#define MAX_FETCH_BUFFER_SIZE 10000 /* 10MB */
+
+/* Maximum duration allowed for a single fetch, in milliseconds */
+#define MAX_FETCH_DURATION 500
+
+/* Number of successive async fetches to enlarge fetch_size */
+#define INCREASE_FETCH_SIZE_THRESHOLD 8
+
/*
* FDW-specific planner information kept in RelOptInfo.fdw_private for a
* foreign table. This information is collected by postgresGetForeignRelSize.
@@ -123,6 +144,12 @@ enum FdwModifyPrivateIndex
FdwModifyPrivateRetrievedAttrs
};
+typedef enum fetch_mode {
+ START_ONLY,
+ FORCE_SYNC,
+ ALLOW_ASYNC
+} fetch_mode;
+
/*
* Execution state of a foreign scan using postgres_fdw.
*/
@@ -136,7 +163,7 @@ typedef struct PgFdwScanState
List *retrieved_attrs; /* list of retrieved attribute numbers */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
unsigned int cursor_number; /* quasi-unique ID for my cursor */
bool cursor_exists; /* have we created the cursor? */
int numParams; /* number of parameters passed to query */
@@ -148,7 +175,12 @@ typedef struct PgFdwScanState
HeapTuple *tuples; /* array of currently-retrieved tuples */
int num_tuples; /* # of tuples in array */
int next_tuple; /* index of next one to return */
-
+ int fetch_size; /* rows to be fetched at once */
+ int successive_async; /* # of successive fetches at this
+ fetch_size */
+ long last_fetch_req_at; /* The time of the last fetch request, in
+ * milliseconds*/
+ int last_buf_size; /* Buffer size required for the last fetch */
/* batch-level state, for optimizing rewinds and avoiding useless fetch */
int fetch_ct_2; /* Min(# of fetches done, 2) */
bool eof_reached; /* true if last fetch reached EOF */
@@ -156,6 +188,7 @@ typedef struct PgFdwScanState
/* working memory contexts */
MemoryContext batch_cxt; /* context holding current batch of tuples */
MemoryContext temp_cxt; /* context for per-tuple temporary data */
+ ExprContext *econtext; /* copy of ps_ExprContext of ForeignScanState */
} PgFdwScanState;
/*
@@ -167,7 +200,7 @@ typedef struct PgFdwModifyState
AttInMetadata *attinmeta; /* attribute datatype conversion metadata */
/* for remote query execution */
- PGconn *conn; /* connection for the scan */
+ PgFdwConn *conn; /* connection for the scan */
char *p_name; /* name of prepared statement, if created */
/* extracted fdw_private data */
@@ -298,7 +331,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
double *p_rows, int *p_width,
Cost *p_startup_cost, Cost *p_total_cost);
static void get_remote_estimate(const char *sql,
- PGconn *conn,
+ PgFdwConn *conn,
double *rows,
int *width,
Cost *startup_cost,
@@ -306,9 +339,9 @@ static void get_remote_estimate(const char *sql,
static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
EquivalenceClass *ec, EquivalenceMember *em,
void *arg);
-static void create_cursor(ForeignScanState *node);
-static void fetch_more_data(ForeignScanState *node);
-static void close_cursor(PGconn *conn, unsigned int cursor_number);
+static void create_cursor(PgFdwScanState *node);
+static void close_cursor(PgFdwConn *conn, unsigned int cursor_number);
+static void fetch_more_data(PgFdwScanState *node, fetch_mode cmd);
static void prepare_foreign_modify(PgFdwModifyState *fmstate);
static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
ItemPointer tupleid,
@@ -329,7 +362,6 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
MemoryContext temp_context);
static void conversion_error_callback(void *arg);
-
/*
* Foreign-data wrapper handler function: return a struct with pointers
* to my callback routines.
@@ -982,6 +1014,19 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
fsstate->param_values = (const char **) palloc0(numParams * sizeof(char *));
else
fsstate->param_values = NULL;
+
+ fsstate->econtext = node->ss.ps.ps_ExprContext;
+
+ fsstate->fetch_size = MIN_FETCH_SIZE;
+ fsstate->successive_async = 0;
+ fsstate->last_buf_size = 0;
+
+ /*
+ * Start scanning asynchronously if it is the first scan on this
+ * connection.
+ */
+ if (PFCgetNscans(fsstate->conn) == 1)
+ create_cursor(fsstate);
}
/*
@@ -1000,7 +1045,10 @@ postgresIterateForeignScan(ForeignScanState *node)
* cursor on the remote side.
*/
if (!fsstate->cursor_exists)
- create_cursor(node);
+ {
+ finish_async_query(fsstate->conn);
+ create_cursor(fsstate);
+ }
/*
* Get some more tuples, if we've run out.
@@ -1009,7 +1057,7 @@ postgresIterateForeignScan(ForeignScanState *node)
{
/* No point in another fetch if we already detected EOF, though. */
if (!fsstate->eof_reached)
- fetch_more_data(node);
+ fetch_more_data(fsstate, ALLOW_ASYNC);
/* If we didn't get any tuples, must be end of data. */
if (fsstate->next_tuple >= fsstate->num_tuples)
return ExecClearTuple(slot);
@@ -1069,7 +1117,7 @@ postgresReScanForeignScan(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fsstate->conn, sql);
+ res = PFCexec(fsstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fsstate->conn, true, sql);
PQclear(res);
@@ -1392,19 +1440,22 @@ postgresExecForeignInsert(EState *estate,
/* Convert parameters needed by prepared statement to text form */
p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1462,19 +1513,22 @@ postgresExecForeignUpdate(EState *estate,
(ItemPointer) DatumGetPointer(datum),
slot);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1532,19 +1586,22 @@ postgresExecForeignDelete(EState *estate,
(ItemPointer) DatumGetPointer(datum),
NULL);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* Execute the prepared statement, and check for success.
*
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecPrepared(fmstate->conn,
- fmstate->p_name,
- fmstate->p_nums,
- p_values,
- NULL,
- NULL,
- 0);
+ res = PFCexecPrepared(fmstate->conn,
+ fmstate->p_name,
+ fmstate->p_nums,
+ p_values,
+ NULL,
+ NULL,
+ 0);
if (PQresultStatus(res) !=
(fmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -1594,7 +1651,7 @@ postgresEndForeignModify(EState *estate,
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(fmstate->conn, sql);
+ res = PFCexec(fmstate->conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, sql);
PQclear(res);
@@ -1726,7 +1783,7 @@ estimate_path_cost_size(PlannerInfo *root,
List *local_join_conds;
StringInfoData sql;
List *retrieved_attrs;
- PGconn *conn;
+ PgFdwConn *conn;
Selectivity local_sel;
QualCost local_cost;
@@ -1836,7 +1893,7 @@ estimate_path_cost_size(PlannerInfo *root,
* The given "sql" must be an EXPLAIN command.
*/
static void
-get_remote_estimate(const char *sql, PGconn *conn,
+get_remote_estimate(const char *sql, PgFdwConn *conn,
double *rows, int *width,
Cost *startup_cost, Cost *total_cost)
{
@@ -1852,7 +1909,7 @@ get_remote_estimate(const char *sql, PGconn *conn,
/*
* Execute EXPLAIN remotely.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql);
@@ -1917,13 +1974,12 @@ ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
* Create cursor for node's query with current parameter values.
*/
static void
-create_cursor(ForeignScanState *node)
+create_cursor(PgFdwScanState *fsstate)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
- ExprContext *econtext = node->ss.ps.ps_ExprContext;
+ ExprContext *econtext = fsstate->econtext;
int numParams = fsstate->numParams;
const char **values = fsstate->param_values;
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
StringInfoData buf;
PGresult *res;
@@ -1985,8 +2041,8 @@ create_cursor(ForeignScanState *node)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexecParams(conn, buf.data, numParams, NULL, values,
- NULL, NULL, 0);
+ res = PFCexecParams(conn, buf.data, numParams, NULL, values,
+ NULL, NULL, 0);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, fsstate->query);
PQclear(res);
@@ -2001,71 +2057,215 @@ create_cursor(ForeignScanState *node)
/* Clean up */
pfree(buf.data);
+
+ /*
+ * Start async scan if this is the first scan. See fetch_more_data() for
+ * details
+ */
+ if (PFCgetNscans(conn) == 1)
+ fetch_more_data(fsstate, START_ONLY);
}
/*
* Fetch some more rows from the node's cursor.
*/
static void
-fetch_more_data(ForeignScanState *node)
+fetch_more_data(PgFdwScanState *fsstate, fetch_mode cmd)
{
- PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
PGresult *volatile res = NULL;
MemoryContext oldcontext;
/*
* We'll store the tuples in the batch_cxt. First, flush the previous
- * batch.
+ * batch. Some tuples may be left unread when asynchronous fetching is
+ * interrupted; in that case, don't flush, so the unread tuples are
+ * preserved. This occurs no more than twice in succession.
*/
- fsstate->tuples = NULL;
- MemoryContextReset(fsstate->batch_cxt);
+ if (fsstate->next_tuple >= fsstate->num_tuples)
+ {
+ fsstate->tuples = NULL;
+ MemoryContextReset(fsstate->batch_cxt);
+ }
oldcontext = MemoryContextSwitchTo(fsstate->batch_cxt);
/* PGresult must be released before leaving this function. */
PG_TRY();
{
- PGconn *conn = fsstate->conn;
+ PgFdwConn *conn = fsstate->conn;
char sql[64];
- int fetch_size;
- int numrows;
+ int numrows, addrows, restrows;
+ HeapTuple *tmptuples;
+ int prev_fetch_size = fsstate->fetch_size;
+ int new_fetch_size = fsstate->fetch_size;
int i;
+ struct timeval tv = {0, 0};
+ long current_time;
+ int fetch_buf_size;
- /* The fetch size is arbitrary, but shouldn't be enormous. */
- fetch_size = 100;
+ gettimeofday(&tv, NULL);
+ current_time = tv.tv_sec * 1000 + tv.tv_usec / 1000;
+ /*
+ * Doing adaptive fetch size
+ *
+ * Since we don't have enough knowledge about how long fetching takes
+ * or how large space needed for received tuples in advance, change
+ * fetch_size dynamically according to maximal allowed duration and
+ * buffer space.
+ */
+ if (fsstate->last_buf_size > MAX_FETCH_BUFFER_SIZE)
+ {
+ new_fetch_size =
+ (int)((double)fsstate->fetch_size * MAX_FETCH_BUFFER_SIZE /
+ fsstate->last_buf_size);
+ }
+ if (PFCisBusy(conn) &&
+ fsstate->fetch_size > MIN_FETCH_SIZE &&
+ fsstate->last_fetch_req_at + MAX_FETCH_DURATION <
+ current_time)
+ {
+ int tmp_fetch_size = fsstate->fetch_size / 2;
+ if (tmp_fetch_size < new_fetch_size)
+ new_fetch_size = tmp_fetch_size;
+ }
+
+ /* Increase if not decreased and other conditions match. */
+ if (new_fetch_size == fsstate->fetch_size &&
+ fsstate->successive_async >= INCREASE_FETCH_SIZE_THRESHOLD &&
+ fsstate->fetch_size < MAX_FETCH_SIZE)
+ new_fetch_size *= 2;
+
+ /* Change fetch_size as calculated above */
+ if (new_fetch_size != fsstate->fetch_size)
+ {
+ if (new_fetch_size > MAX_FETCH_SIZE)
+ fsstate->fetch_size = MAX_FETCH_SIZE;
+ else if (new_fetch_size < MIN_FETCH_SIZE)
+ fsstate->fetch_size = MIN_FETCH_SIZE;
+ else
+ fsstate->fetch_size = new_fetch_size;
+ fsstate->successive_async = 0;
+ }
+
+ /* Make the query to fetch tuples */
snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
- fetch_size, fsstate->cursor_number);
+ fsstate->fetch_size, fsstate->cursor_number);
+
+ if (PFCisAsyncRunning(conn))
+ {
+ Assert (cmd != START_ONLY);
+
+ /*
+ * If the target fsstate is different from the scan state that the
+ * current async fetch running for, the result should be stored
+ * into it, then synchronously fetch data for the target fsstate.
+ */
+ if (fsstate != PFCgetAsyncScan(conn))
+ {
+ fetch_more_data(PFCgetAsyncScan(conn), FORCE_SYNC);
+ res = PFCexec(conn, sql);
+ }
+ else
+ {
+ /* Get result of running async fetch */
+ res = PFCgetResult(conn);
+ if (PQntuples(res) == prev_fetch_size)
+ {
+ /*
+ * Connection state doesn't go to IDLE even if all data
+ * has been sent to client for asynchronous query. One
+ * more PQgetResult() is needed to reset the state to
+ * IDLE. See PQexecFinish() for details.
+ */
+ if (PFCgetResult(conn) != NULL)
+ elog(ERROR, "Connection status error.");
+ }
+ }
+ PFCsetAsyncScan(conn, NULL);
+ }
+ else
+ {
+ if (cmd == START_ONLY)
+ {
+ Assert(PFCgetNscans(conn) == 1);
+
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false,
+ fsstate->query);
+ fsstate->last_fetch_req_at = current_time;
+
+ PFCsetAsyncScan(conn, fsstate);
+ goto end_of_fetch;
+ }
+
+ /* Otherwise do synchronous query execution */
+ PFCsetAsyncScan(conn, NULL);
+ res = PFCexec(conn, sql);
+ }
- res = PQexec(conn, sql);
/* On error, report the original query, not the FETCH. */
- if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ if (res && PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
- /* Convert the data into HeapTuples */
- numrows = PQntuples(res);
- fsstate->tuples = (HeapTuple *) palloc0(numrows * sizeof(HeapTuple));
+ /* allocate tuple storage */
+ tmptuples = fsstate->tuples;
+ addrows = PQntuples(res);
+ restrows = fsstate->num_tuples - fsstate->next_tuple;
+ numrows = restrows + addrows;
+ fetch_buf_size = numrows * sizeof(HeapTuple);
+ fsstate->tuples = (HeapTuple *) palloc0(fetch_buf_size);
+
+ Assert(restrows == 0 || tmptuples);
+
+ /* copy unread tuples if any */
+ for (i = 0 ; i < restrows ; i++)
+ fsstate->tuples[i] = tmptuples[fsstate->next_tuple + i];
+
fsstate->num_tuples = numrows;
fsstate->next_tuple = 0;
- for (i = 0; i < numrows; i++)
+ /* Convert the data into HeapTuples */
+ for (i = 0 ; i < addrows; i++)
{
- fsstate->tuples[i] =
+ HeapTuple tup =
make_tuple_from_result_row(res, i,
fsstate->rel,
fsstate->attinmeta,
fsstate->retrieved_attrs,
fsstate->temp_cxt);
+ fsstate->tuples[restrows + i] = tup;
+ fetch_buf_size += (HEAPTUPLESIZE + tup->t_len);
}
+ fsstate->last_buf_size = fetch_buf_size / 1024; /* in kilobytes */
+
/* Update fetch_ct_2 */
if (fsstate->fetch_ct_2 < 2)
fsstate->fetch_ct_2++;
/* Must be EOF if we didn't get as many tuples as we asked for. */
- fsstate->eof_reached = (numrows < fetch_size);
+ fsstate->eof_reached = (numrows < prev_fetch_size);
PQclear(res);
res = NULL;
+
+ if (cmd == ALLOW_ASYNC)
+ {
+ if (!fsstate->eof_reached)
+ {
+ /*
+ * We can immediately request the next bunch of tuples if
+ * we're on asynchronous connection.
+ */
+ if (!PFCsendQuery(conn, sql))
+ pgfdw_report_error(ERROR, res, conn, false, fsstate->query);
+ fsstate->last_fetch_req_at = current_time;
+ PFCsetAsyncScan(conn, fsstate);
+ }
+ }
+
+end_of_fetch:
+ ; /* Nothing to do here but needed to make compiler quiet. */
}
PG_CATCH();
{
@@ -2075,10 +2275,44 @@ fetch_more_data(ForeignScanState *node)
}
PG_END_TRY();
+ if (PFCisAsyncRunning(fsstate->conn))
+ {
+ if (fsstate->successive_async < INCREASE_FETCH_SIZE_THRESHOLD)
+ fsstate->successive_async++;
+ }
+ else
+ {
+ /* Reset fetch_size if the async_fetch stopped */
+ fsstate->successive_async = 0;
+ fsstate->fetch_size = MIN_FETCH_SIZE;
+ }
+
MemoryContextSwitchTo(oldcontext);
}
/*
+ * Force cancelling async command state.
+ */
+void
+finish_async_query(PgFdwConn *conn)
+{
+ PgFdwScanState *fsstate = PFCgetAsyncScan(conn);
+ PgFdwConn *async_conn;
+
+ /* Nothing to do if no async connection */
+ if (fsstate == NULL) return;
+ async_conn = fsstate->conn;
+ if (!async_conn ||
+ PFCgetNscans(async_conn) == 1 ||
+ !PFCisAsyncRunning(async_conn))
+ return;
+
+ fetch_more_data(PFCgetAsyncScan(async_conn), FORCE_SYNC);
+
+ Assert(!PFCisAsyncRunning(async_conn));
+}
+
+/*
* Force assorted GUC parameters to settings that ensure that we'll output
* data values in a form that is unambiguous to the remote server.
*
@@ -2132,7 +2366,7 @@ reset_transmission_modes(int nestlevel)
* Utility routine to close a cursor.
*/
static void
-close_cursor(PGconn *conn, unsigned int cursor_number)
+close_cursor(PgFdwConn *conn, unsigned int cursor_number)
{
char sql[64];
PGresult *res;
@@ -2143,7 +2377,7 @@ close_cursor(PGconn *conn, unsigned int cursor_number)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQexec(conn, sql);
+ res = PFCexec(conn, sql);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, true, sql);
PQclear(res);
@@ -2165,6 +2399,9 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
GetPrepStmtNumber(fmstate->conn));
p_name = pstrdup(prep_name);
+ /* Finish async query if running */
+ finish_async_query(fmstate->conn);
+
/*
* We intentionally do not specify parameter types here, but leave the
* remote server to derive them by default. This avoids possible problems
@@ -2175,11 +2412,11 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
* We don't use a PG_TRY block here, so be careful not to throw error
* without releasing the PGresult.
*/
- res = PQprepare(fmstate->conn,
- p_name,
- fmstate->query,
- 0,
- NULL);
+ res = PFCprepare(fmstate->conn,
+ p_name,
+ fmstate->query,
+ 0,
+ NULL);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
@@ -2297,7 +2534,7 @@ postgresAnalyzeForeignTable(Relation relation,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2329,7 +2566,7 @@ postgresAnalyzeForeignTable(Relation relation,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2379,7 +2616,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
ForeignTable *table;
ForeignServer *server;
UserMapping *user;
- PGconn *conn;
+ PgFdwConn *conn;
unsigned int cursor_number;
StringInfoData sql;
PGresult *volatile res = NULL;
@@ -2423,7 +2660,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
/* In what follows, do not risk leaking any PGresults. */
PG_TRY();
{
- res = PQexec(conn, sql.data);
+ res = PFCexec(conn, sql.data);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
PQclear(res);
@@ -2453,7 +2690,7 @@ postgresAcquireSampleRowsFunc(Relation relation, int elevel,
snprintf(fetch_sql, sizeof(fetch_sql), "FETCH %d FROM c%u",
fetch_size, cursor_number);
- res = PQexec(conn, fetch_sql);
+ res = PFCexec(conn, fetch_sql);
/* On error, report the original query, not the FETCH. */
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, sql.data);
@@ -2582,7 +2819,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
bool import_not_null = true;
ForeignServer *server;
UserMapping *mapping;
- PGconn *conn;
+ PgFdwConn *conn;
StringInfoData buf;
PGresult *volatile res = NULL;
int numrows,
@@ -2615,7 +2852,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
conn = GetConnection(server, mapping, false);
/* Don't attempt to import collation if remote server hasn't got it */
- if (PQserverVersion(conn) < 90100)
+ if (PFCserverVersion(conn) < 90100)
import_collate = false;
/* Create workspace for strings */
@@ -2628,7 +2865,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfoString(&buf, "SELECT 1 FROM pg_catalog.pg_namespace WHERE nspname = ");
deparseStringLiteral(&buf, stmt->remote_schema);
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
@@ -2723,7 +2960,7 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid)
appendStringInfo(&buf, " ORDER BY c.relname, a.attnum");
/* Fetch the data */
- res = PQexec(conn, buf.data);
+ res = PFCexec(conn, buf.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
pgfdw_report_error(ERROR, res, conn, false, buf.data);
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..b117a88 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -18,19 +18,22 @@
#include "nodes/relation.h"
#include "utils/relcache.h"
-#include "libpq-fe.h"
+#include "PgFdwConn.h"
+
+struct PgFdwScanState;
/* in postgres_fdw.c */
extern int set_transmission_modes(void);
extern void reset_transmission_modes(int nestlevel);
+extern void finish_async_query(PgFdwConn *conn);
/* in connection.c */
-extern PGconn *GetConnection(ForeignServer *server, UserMapping *user,
+extern PgFdwConn *GetConnection(ForeignServer *server, UserMapping *user,
bool will_prep_stmt);
-extern void ReleaseConnection(PGconn *conn);
-extern unsigned int GetCursorNumber(PGconn *conn);
-extern unsigned int GetPrepStmtNumber(PGconn *conn);
-extern void pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
+extern void ReleaseConnection(PgFdwConn *conn);
+extern unsigned int GetCursorNumber(PgFdwConn *conn);
+extern unsigned int GetPrepStmtNumber(PgFdwConn *conn);
+extern void pgfdw_report_error(int elevel, PGresult *res, PgFdwConn *conn,
bool clear, const char *sql);
/* in option.c */
--
2.1.0.GIT
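The adaptive fetch-size rules added to fetch_more_data() in the patch above can be read as a pure function of the previous batch's statistics. The following is only a sketch of that reading, not code from the patch: the helper name next_fetch_size() and its parameter list are hypothetical, and the side effect of resetting successive_async whenever the size changes is left out. The constants are copied from the patch.

```c
#include <stdbool.h>

/* Constants copied from the patch. */
#define MIN_FETCH_SIZE 100
#define MAX_FETCH_SIZE 1000
#define MAX_FETCH_BUFFER_SIZE 10000		/* kilobytes */
#define MAX_FETCH_DURATION 500			/* milliseconds */
#define INCREASE_FETCH_SIZE_THRESHOLD 8

/*
 * Hypothetical helper (not in the patch): compute the fetch size for the
 * next FETCH from what was observed for the previous one.
 */
static int
next_fetch_size(int fetch_size, int last_buf_size_kb, bool conn_busy,
				long elapsed_ms, int successive_async)
{
	int			new_size = fetch_size;

	/* Shrink proportionally if the last batch exceeded the buffer budget. */
	if (last_buf_size_kb > MAX_FETCH_BUFFER_SIZE)
		new_size = (int) ((double) fetch_size * MAX_FETCH_BUFFER_SIZE /
						  last_buf_size_kb);

	/* Halve if the in-flight fetch has been running for too long. */
	if (conn_busy && fetch_size > MIN_FETCH_SIZE &&
		elapsed_ms > MAX_FETCH_DURATION)
	{
		int			half = fetch_size / 2;

		if (half < new_size)
			new_size = half;
	}

	/* Double only if nothing above asked for a decrease. */
	if (new_size == fetch_size &&
		successive_async >= INCREASE_FETCH_SIZE_THRESHOLD &&
		fetch_size < MAX_FETCH_SIZE)
		new_size *= 2;

	/* Clamp to the allowed range. */
	if (new_size > MAX_FETCH_SIZE)
		new_size = MAX_FETCH_SIZE;
	else if (new_size < MIN_FETCH_SIZE)
		new_size = MIN_FETCH_SIZE;

	return new_size;
}
```

Under these assumptions, a scan that has completed eight successive async fetches at size 100 moves to 200, while a batch that needed twice the buffer budget at size 400 drops back to 200.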
Kyotaro,
* Kyotaro HORIGUCHI (horiguchi.kyotaro@lab.ntt.co.jp) wrote:
The attached is the fixed patch. It apparently improves the
performance for the test case shown in the previous mail, in
which the average tuple length is about 140 bytes.
I'm all for improving performance of postgres_fdw and would like to see
us support sending queries off to be worked asynchronously, but starting
execution on the remote server during ExecInitNode is against the
documented FDW API spec. I discussed exactly this issue over a year
ago here:
/messages/by-id/20131104032604.GB2706@tamriel.snowman.net
Sadly, there weren't any direct responses to that email, but I do recall
having a discussion on another thread (or in person?) with Tom where we
ended up agreeing that we can't simply remove that requirement from the
docs or the API.
I certainly appreciate that you've put quite a bit of effort into this
but I'm afraid we can't accept it while it's starting to run a query on
the remote side during the ExecInitNode phase. The query cannot start
executing on the remote side until IterateForeignScan is called.
You might consider looking at the other suggestion in that email with
regard to adding an Async mechanism to the executor. I didn't get to
the point of writing code, but I did think about it a fair bit and still
believe that could work.
I'm not going to change the status of this patch in the CommitFest at
this time, in case anyone else feels I've misunderstood or not correctly
analyzed what the patch does (I'll admit, I've only read it and not
actually compiled it or run it with a debugger, but I'm pretty sure my
reading of what's happening is correct..), but I'm afraid this is going
to have to end up as Rejected.
Thanks!
Stephen
Stephen Frost <sfrost@snowman.net> writes:
I'm all for improving performance of postgres_fdw and would like to see
us support sending queries off to be worked asynchronously, but starting
execution on the remote server during ExecInitNode is against the
documented FDW API spec. I discussed exactly this issue over a year
ago here:
Sadly, there weren't any direct responses to that email, but I do recall
having a discussion on another thread (or in person?) with Tom where we
ended up agreeing that we can't simply remove that requirement from the
docs or the API.
Yeah. There are at least a couple of reasons why not:
* ExecInitNode only creates the runtime data structures, it should not
begin execution. It's possible for example that the scan will never be
iterated at all; say it's on the inside of a nestloop and the outer
relation turns out to be empty. It's not apparent why starting the remote
query a few microseconds sooner is worth the risk of demanding useless
computation.
* If the scan is parameterized (again, it's on the inside of a nestloop,
and the outer relation is supplying join key values), those parameter
values are simply not available at ExecInitNode time.
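As a toy illustration of the two points above (this is not PostgreSQL code; ToyScan and the toy_* functions are hypothetical names), the sketch below defers the "remote query" to the first iterate call. When the outer side of a nested loop is empty, the inner scan is initialized but never iterated, so no remote work is started at all.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a foreign scan's execution state. */
typedef struct ToyScan
{
	bool		cursor_exists;	/* has the remote cursor been created? */
	int			remote_queries_started; /* counts simulated remote starts */
	int			next_row;		/* next row to return */
	int			total_rows;		/* rows the remote side would return */
} ToyScan;

/* Init phase: only set up data structures, start nothing remotely. */
static void
toy_begin_scan(ToyScan *scan, int total_rows)
{
	scan->cursor_exists = false;
	scan->remote_queries_started = 0;
	scan->next_row = 0;
	scan->total_rows = total_rows;
}

/* Iterate phase: the remote query is started lazily on the first call. */
static bool
toy_iterate_scan(ToyScan *scan, int *row)
{
	if (!scan->cursor_exists)
	{
		scan->cursor_exists = true;
		scan->remote_queries_started++;
	}
	if (scan->next_row >= scan->total_rows)
		return false;
	*row = scan->next_row++;
	return true;
}

/* Nested loop with the toy scan on the inner side; returns joined rows. */
static int
toy_nestloop(int outer_rows, ToyScan *inner)
{
	int			joined = 0;
	int			o;

	for (o = 0; o < outer_rows; o++)
	{
		int			row;

		inner->next_row = 0;	/* rescan the inner side */
		while (toy_iterate_scan(inner, &row))
			joined++;
	}
	return joined;
}
```

With an empty outer relation, toy_nestloop() never calls toy_iterate_scan(), so remote_queries_started stays at zero. Starting the query in toy_begin_scan() instead would have spent that remote work for nothing.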
Also, so far as a quick review of the actual patch goes, I would really
like to see this lose the "PFC" wrapper layer, which accounts for 95% of
the code churn in the patch and doesn't seem to add any actual value.
What it does add is unchecked malloc failure conditions.
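For readers unfamiliar with the concern, here is a generic sketch of what a checked allocation in a connection wrapper might look like. It is not taken from the patch: ToyConn, toy_conn_wrap() and toy_conn_free() are hypothetical names, and inside the backend one would more likely allocate with palloc(), which raises an error on failure instead of returning NULL.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wrapper around an opaque connection handle. */
typedef struct ToyConn
{
	void	   *raw;			/* the wrapped connection */
	int			nscans;			/* number of scans sharing it */
	bool		async_running;	/* is an async command in flight? */
} ToyConn;

/* Returns NULL on allocation failure; the caller must check the result. */
static ToyConn *
toy_conn_wrap(void *raw)
{
	ToyConn    *conn = malloc(sizeof(ToyConn));

	if (conn == NULL)
		return NULL;			/* report failure instead of crashing later */

	conn->raw = raw;
	conn->nscans = 0;
	conn->async_running = false;
	return conn;
}

static void
toy_conn_free(ToyConn *conn)
{
	free(conn);
}
```

The point is only that every malloc() result is tested before use; without the wrapper layer there would be no such allocation to check.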
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tom,
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Stephen Frost <sfrost@snowman.net> writes:
I'm all for improving performance of postgres_fdw and would like to see
us support sending queries off to be worked asynchronously, but starting
execution on the remote server during ExecInitNode is against the
documented FDW API spec. I discussed exactly this issue over a year
ago here:
Sadly, there weren't any direct responses to that email, but I do recall
having a discussion on another thread (or in person?) with Tom where we
ended up agreeing that we can't simply remove that requirement from the
docs or the API.
Yeah. There are at least a couple of reasons why not:
Thanks for the reminders of those.
Also, so far as a quick review of the actual patch goes, I would really
like to see this lose the "PFC" wrapper layer, which accounts for 95% of
the code churn in the patch and doesn't seem to add any actual value.
What it does add is unchecked malloc failure conditions.
Agreed, the wrapper isn't doing anything particularly useful; I had
started out thinking that would be my first comment until it became
clear where all the performance improvement was coming from.
I've gone ahead and marked this as Rejected. The concept of async
execution of postgres_fdw is certainly still open and a worthwhile goal,
but this implementation isn't the way to achieve that.
Thanks!
Stephen
Hello.
I've gone ahead and marked this as Rejected. The concept of async
execution of postgres_fdw is certainly still open and a worthwhile goal,
but this implementation isn't the way to achieve that.
That sounds fair. I'm glad we agree that the goal is worthwhile.
I'll post another implementation soon.
Thank you.
--
Kyotaro Horiguchi
NTT Open Source Software Center