We need to support ForeignRecheck for late row locking, don't we?

Started by Etsuro Fujitaover 10 years ago6 messages
#1Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
1 attachment(s)

While working on the issue "Foreign join pushdown vs EvalPlanQual", I
happened to notice odd behaviors of late row locking in FDWs. An
example will be shown below using the attached postgres_fdw patch, which
is basically the same as [1]/messages/by-id/16016.1431455074@sss.pgh.pa.us, but I changed its isolation level to READ
COMMITTED and modified it so that the output parameter "updated" of
RefetchForeignRow is updated accordingly.

[Create an environment]

mydatabase=# create table mytable (a int);
mydatabase=# insert into mytable values (1);
postgres=# create extension postgres_fdw;
postgres=# create server myserver foreign data wrapper postgres_fdw
options (dbname 'mydatabase');
postgres=# create user mapping for current_user server myserver;
postgres=# create foreign table ftable (a int) server myserver options
(table_name 'mytable');

[Run concurrent transactions that behave oddly]

mydatabase=# begin;
mydatabase=# update mytable set a = a + 1;
postgres=# select * from ftable where a = 1 for update;
mydatabase=# commit;
(After the commit, the following result will be shown in the terminal
for the postgres database. Note that the result doesn't satify a = 1.)
a
---
2
(1 row)

I think the reason for that is because we don't check pushed-down quals
inside an EPQ testing even if what was fetched by RefetchForeignRow was
an updated version of the tuple rather than the same version previously
obtained. So, to fix this, I'd like to propose that pushed-down quals
be checked in ForeignRecheck.

Comments welcome!

Best regards,
Etsuro Fujita

[1]: /messages/by-id/16016.1431455074@sss.pgh.pa.us

Attachments:

postgres-fdw-late-locking-v2.patchtext/x-diff; name=postgres-fdw-late-locking-v2.patchDownload
*** a/contrib/postgres_fdw/connection.c
--- b/contrib/postgres_fdw/connection.c
***************
*** 384,390 **** begin_remote_xact(ConnCacheEntry *entry)
  		if (IsolationIsSerializable())
  			sql = "START TRANSACTION ISOLATION LEVEL SERIALIZABLE";
  		else
! 			sql = "START TRANSACTION ISOLATION LEVEL REPEATABLE READ";
  		do_sql_command(entry->conn, sql);
  		entry->xact_depth = 1;
  	}
--- 384,390 ----
  		if (IsolationIsSerializable())
  			sql = "START TRANSACTION ISOLATION LEVEL SERIALIZABLE";
  		else
! 			sql = "START TRANSACTION ISOLATION LEVEL READ COMMITTED";
  		do_sql_command(entry->conn, sql);
  		entry->xact_depth = 1;
  	}
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 89,94 **** typedef struct PgFdwRelationInfo
--- 89,96 ----
   *
   * 1) SELECT statement text to be sent to the remote server
   * 2) Integer list of attribute numbers retrieved by the SELECT
+  * 3) SELECT statement text to be sent to the remote server
+  * 4) Integer list of attribute numbers retrieved by the SELECT
   *
   * These items are indexed with the enum FdwScanPrivateIndex, so an item
   * can be fetched with list_nth().  For example, to get the SELECT statement:
***************
*** 99,105 **** enum FdwScanPrivateIndex
  	/* SQL statement to execute remotely (as a String node) */
  	FdwScanPrivateSelectSql,
  	/* Integer list of attribute numbers retrieved by the SELECT */
! 	FdwScanPrivateRetrievedAttrs
  };
  
  /*
--- 101,111 ----
  	/* SQL statement to execute remotely (as a String node) */
  	FdwScanPrivateSelectSql,
  	/* Integer list of attribute numbers retrieved by the SELECT */
! 	FdwScanPrivateRetrievedAttrs,
! 	/* SQL statement to execute remotely (as a String node) */
! 	FdwScanPrivateSelectSql2,
! 	/* Integer list of attribute numbers retrieved by SELECT */
! 	FdwScanPrivateRetrievedAttrs2
  };
  
  /*
***************
*** 187,192 **** typedef struct PgFdwModifyState
--- 193,222 ----
  } PgFdwModifyState;
  
  /*
+  * Execution state for fetching/locking foreign rows.
+  */
+ typedef struct PgFdwFetchState
+ {
+ 	Relation	rel;			/* relcache entry for the foreign table */
+ 	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
+ 
+ 	/* for remote query execution */
+ 	PGconn	   *conn;			/* connection for the fetch */
+ 	char	   *p_name;			/* name of prepared statement, if created */
+ 
+ 	/* extracted fdw_private data */
+ 	char	   *query;			/* text of SELECT command */
+ 	List	   *retrieved_attrs;	/* attr numbers retrieved by SELECT */
+ 
+ 	/* info about parameters for prepared statement */
+ 	int			p_nums;			/* number of parameters to transmit */
+ 	FmgrInfo   *p_flinfo;		/* output conversion functions for them */
+ 
+ 	/* working memory context */
+ 	MemoryContext temp_cxt;		/* context for per-tuple temporary data */
+ } PgFdwFetchState;
+ 
+ /*
   * Workspace for analyzing a foreign table.
   */
  typedef struct PgFdwAnalyzeState
***************
*** 277,282 **** static TupleTableSlot *postgresExecForeignDelete(EState *estate,
--- 307,318 ----
  static void postgresEndForeignModify(EState *estate,
  						 ResultRelInfo *resultRelInfo);
  static int	postgresIsForeignRelUpdatable(Relation rel);
+ static RowMarkType postgresGetForeignRowMarkType(RangeTblEntry *rte,
+ 							  LockClauseStrength strength);
+ static HeapTuple postgresRefetchForeignRow(EState *estate,
+ 						  ExecRowMark *erm,
+ 						  Datum rowid,
+ 						  bool *updated);
  static void postgresExplainForeignScan(ForeignScanState *node,
  						   ExplainState *es);
  static void postgresExplainForeignModify(ModifyTableState *mtstate,
***************
*** 307,321 **** static void get_remote_estimate(const char *sql,
  static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
  						  EquivalenceClass *ec, EquivalenceMember *em,
  						  void *arg);
  static void create_cursor(ForeignScanState *node);
  static void fetch_more_data(ForeignScanState *node);
  static void close_cursor(PGconn *conn, unsigned int cursor_number);
! static void prepare_foreign_modify(PgFdwModifyState *fmstate);
! static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
! 						 ItemPointer tupleid,
! 						 TupleTableSlot *slot);
  static void store_returning_result(PgFdwModifyState *fmstate,
  					   TupleTableSlot *slot, PGresult *res);
  static int postgresAcquireSampleRowsFunc(Relation relation, int elevel,
  							  HeapTuple *rows, int targrows,
  							  double *totalrows,
--- 343,368 ----
  static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
  						  EquivalenceClass *ec, EquivalenceMember *em,
  						  void *arg);
+ static List *create_foreign_fetch_info(PlannerInfo *root,
+ 						  RelOptInfo *baserel,
+ 						  RowMarkType markType);
  static void create_cursor(ForeignScanState *node);
  static void fetch_more_data(ForeignScanState *node);
  static void close_cursor(PGconn *conn, unsigned int cursor_number);
! static char *setup_prep_stmt(PGconn *conn, char *query);
! static const char **convert_prep_stmt_params(ItemPointer tupleid,
! 						 TupleTableSlot *slot,
! 						 int p_nums,
! 						 FmgrInfo *p_flinfo,
! 						 List *target_attrs,
! 						 MemoryContext temp_context);
  static void store_returning_result(PgFdwModifyState *fmstate,
  					   TupleTableSlot *slot, PGresult *res);
+ static void init_foreign_fetch_state(EState *estate,
+ 						 ExecRowMark *erm,
+ 						 List *fdw_private,
+ 						 int eflags);
+ static void finish_foreign_fetch_state(EState *estate, ExecRowMark *erm);
  static int postgresAcquireSampleRowsFunc(Relation relation, int elevel,
  							  HeapTuple *rows, int targrows,
  							  double *totalrows,
***************
*** 359,364 **** postgres_fdw_handler(PG_FUNCTION_ARGS)
--- 406,415 ----
  	routine->EndForeignModify = postgresEndForeignModify;
  	routine->IsForeignRelUpdatable = postgresIsForeignRelUpdatable;
  
+ 	/* Functions for SELECT FOR UPDATE/SHARE row locking */
+ 	routine->GetForeignRowMarkType = postgresGetForeignRowMarkType;
+ 	routine->RefetchForeignRow = postgresRefetchForeignRow;
+ 
  	/* Support functions for EXPLAIN */
  	routine->ExplainForeignScan = postgresExplainForeignScan;
  	routine->ExplainForeignModify = postgresExplainForeignModify;
***************
*** 747,752 **** postgresGetForeignPlan(PlannerInfo *root,
--- 798,804 ----
  	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
  	Index		scan_relid = baserel->relid;
  	List	   *fdw_private;
+ 	List	   *fdw_private2 = NIL;
  	List	   *remote_conds = NIL;
  	List	   *local_exprs = NIL;
  	List	   *params_list = NIL;
***************
*** 837,856 **** postgresGetForeignPlan(PlannerInfo *root,
  			 * complete information about, and (b) it wouldn't work anyway on
  			 * older remote servers.  Likewise, we don't worry about NOWAIT.
  			 */
! 			switch (rc->strength)
  			{
! 				case LCS_NONE:
! 					/* No locking needed */
! 					break;
! 				case LCS_FORKEYSHARE:
! 				case LCS_FORSHARE:
! 					appendStringInfoString(&sql, " FOR SHARE");
! 					break;
! 				case LCS_FORNOKEYUPDATE:
! 				case LCS_FORUPDATE:
! 					appendStringInfoString(&sql, " FOR UPDATE");
! 					break;
  			}
  		}
  	}
  
--- 889,914 ----
  			 * complete information about, and (b) it wouldn't work anyway on
  			 * older remote servers.  Likewise, we don't worry about NOWAIT.
  			 */
! 			if (rc->markType == ROW_MARK_COPY)
  			{
! 				switch (rc->strength)
! 				{
! 					case LCS_NONE:
! 						/* No locking needed */
! 						break;
! 					case LCS_FORKEYSHARE:
! 					case LCS_FORSHARE:
! 						appendStringInfoString(&sql, " FOR SHARE");
! 						break;
! 					case LCS_FORNOKEYUPDATE:
! 					case LCS_FORUPDATE:
! 						appendStringInfoString(&sql, " FOR UPDATE");
! 						break;
! 				}
  			}
+ 			else
+ 				fdw_private2 = create_foreign_fetch_info(root, baserel,
+ 														 rc->markType);
  		}
  	}
  
***************
*** 860,865 **** postgresGetForeignPlan(PlannerInfo *root,
--- 918,925 ----
  	 */
  	fdw_private = list_make2(makeString(sql.data),
  							 retrieved_attrs);
+ 	if (fdw_private2)
+ 		fdw_private = list_concat(fdw_private, fdw_private2);
  
  	/*
  	 * Create the ForeignScan node from target list, local filtering
***************
*** 888,893 **** postgresBeginForeignScan(ForeignScanState *node, int eflags)
--- 948,954 ----
  	EState	   *estate = node->ss.ps.state;
  	PgFdwScanState *fsstate;
  	RangeTblEntry *rte;
+ 	ExecRowMark *erm;
  	Oid			userid;
  	ForeignTable *table;
  	ForeignServer *server;
***************
*** 988,993 **** postgresBeginForeignScan(ForeignScanState *node, int eflags)
--- 1049,1061 ----
  		fsstate->param_values = (const char **) palloc0(numParams * sizeof(char *));
  	else
  		fsstate->param_values = NULL;
+ 
+ 	/*
+ 	 * Initialize state for fetching/locking foreign rows if needed.
+ 	 */
+ 	erm = ExecFindRowMark(estate, fsplan->scan.scanrelid, true);
+ 	if (erm && erm->relation && erm->ermExtra == NULL)
+ 		init_foreign_fetch_state(estate, erm, fsplan->fdw_private, eflags);
  }
  
  /*
***************
*** 1095,1101 **** postgresReScanForeignScan(ForeignScanState *node)
--- 1163,1172 ----
  static void
  postgresEndForeignScan(ForeignScanState *node)
  {
+ 	ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
+ 	EState	   *estate = node->ss.ps.state;
  	PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
+ 	ExecRowMark *erm;
  
  	/* if fsstate is NULL, we are in EXPLAIN; nothing to do */
  	if (fsstate == NULL)
***************
*** 1109,1114 **** postgresEndForeignScan(ForeignScanState *node)
--- 1180,1192 ----
  	ReleaseConnection(fsstate->conn);
  	fsstate->conn = NULL;
  
+ 	/*
+ 	 * Finish state for fetching/locking foreign rows if needed.
+ 	 */
+ 	erm = ExecFindRowMark(estate, fsplan->scan.scanrelid, true);
+ 	if (erm && erm->relation && erm->ermExtra != NULL)
+ 		finish_foreign_fetch_state(estate, erm);
+ 
  	/* MemoryContexts will be deleted automatically. */
  }
  
***************
*** 1406,1415 **** postgresExecForeignInsert(EState *estate,
  
  	/* Set up the prepared statement on the remote server, if we didn't yet */
  	if (!fmstate->p_name)
! 		prepare_foreign_modify(fmstate);
  
  	/* Convert parameters needed by prepared statement to text form */
! 	p_values = convert_prep_stmt_params(fmstate, NULL, slot);
  
  	/*
  	 * Execute the prepared statement, and check for success.
--- 1484,1497 ----
  
  	/* Set up the prepared statement on the remote server, if we didn't yet */
  	if (!fmstate->p_name)
! 		fmstate->p_name = setup_prep_stmt(fmstate->conn, fmstate->query);
  
  	/* Convert parameters needed by prepared statement to text form */
! 	p_values = convert_prep_stmt_params(NULL, slot,
! 										fmstate->p_nums,
! 										fmstate->p_flinfo,
! 										fmstate->target_attrs,
! 										fmstate->temp_cxt);
  
  	/*
  	 * Execute the prepared statement, and check for success.
***************
*** 1466,1472 **** postgresExecForeignUpdate(EState *estate,
  
  	/* Set up the prepared statement on the remote server, if we didn't yet */
  	if (!fmstate->p_name)
! 		prepare_foreign_modify(fmstate);
  
  	/* Get the ctid that was passed up as a resjunk column */
  	datum = ExecGetJunkAttribute(planSlot,
--- 1548,1554 ----
  
  	/* Set up the prepared statement on the remote server, if we didn't yet */
  	if (!fmstate->p_name)
! 		fmstate->p_name = setup_prep_stmt(fmstate->conn, fmstate->query);
  
  	/* Get the ctid that was passed up as a resjunk column */
  	datum = ExecGetJunkAttribute(planSlot,
***************
*** 1477,1485 **** postgresExecForeignUpdate(EState *estate,
  		elog(ERROR, "ctid is NULL");
  
  	/* Convert parameters needed by prepared statement to text form */
! 	p_values = convert_prep_stmt_params(fmstate,
! 										(ItemPointer) DatumGetPointer(datum),
! 										slot);
  
  	/*
  	 * Execute the prepared statement, and check for success.
--- 1559,1570 ----
  		elog(ERROR, "ctid is NULL");
  
  	/* Convert parameters needed by prepared statement to text form */
! 	p_values = convert_prep_stmt_params((ItemPointer) DatumGetPointer(datum),
! 										slot,
! 										fmstate->p_nums,
! 										fmstate->p_flinfo,
! 										fmstate->target_attrs,
! 										fmstate->temp_cxt);
  
  	/*
  	 * Execute the prepared statement, and check for success.
***************
*** 1536,1542 **** postgresExecForeignDelete(EState *estate,
  
  	/* Set up the prepared statement on the remote server, if we didn't yet */
  	if (!fmstate->p_name)
! 		prepare_foreign_modify(fmstate);
  
  	/* Get the ctid that was passed up as a resjunk column */
  	datum = ExecGetJunkAttribute(planSlot,
--- 1621,1627 ----
  
  	/* Set up the prepared statement on the remote server, if we didn't yet */
  	if (!fmstate->p_name)
! 		fmstate->p_name = setup_prep_stmt(fmstate->conn, fmstate->query);
  
  	/* Get the ctid that was passed up as a resjunk column */
  	datum = ExecGetJunkAttribute(planSlot,
***************
*** 1547,1555 **** postgresExecForeignDelete(EState *estate,
  		elog(ERROR, "ctid is NULL");
  
  	/* Convert parameters needed by prepared statement to text form */
! 	p_values = convert_prep_stmt_params(fmstate,
! 										(ItemPointer) DatumGetPointer(datum),
! 										NULL);
  
  	/*
  	 * Execute the prepared statement, and check for success.
--- 1632,1643 ----
  		elog(ERROR, "ctid is NULL");
  
  	/* Convert parameters needed by prepared statement to text form */
! 	p_values = convert_prep_stmt_params((ItemPointer) DatumGetPointer(datum),
! 										NULL,
! 										fmstate->p_nums,
! 										fmstate->p_flinfo,
! 										fmstate->target_attrs,
! 										fmstate->temp_cxt);
  
  	/*
  	 * Execute the prepared statement, and check for success.
***************
*** 1671,1676 **** postgresIsForeignRelUpdatable(Relation rel)
--- 1759,1862 ----
  }
  
  /*
+  * postgresGetForeignRowMarkType
+  *		Get rowmark type to use for a particular table
+  */
+ static RowMarkType
+ postgresGetForeignRowMarkType(RangeTblEntry *rte, LockClauseStrength strength)
+ {
+ 	switch (strength)
+ 	{
+ 		case LCS_NONE:
+ 			return ROW_MARK_REFERENCE;
+ 		case LCS_FORKEYSHARE:
+ 			return ROW_MARK_KEYSHARE;
+ 		case LCS_FORSHARE:
+ 			return ROW_MARK_SHARE;
+ 		case LCS_FORNOKEYUPDATE:
+ 			return ROW_MARK_NOKEYEXCLUSIVE;
+ 		case LCS_FORUPDATE:
+ 			return ROW_MARK_EXCLUSIVE;
+ 	}
+ 	return ROW_MARK_COPY;		/* shouldn't happen */
+ }
+ 
+ /*
+  * postgresRefetchForeignRow
+  *		Re-fetch one tuple from a foreign table, possibly locking it
+  */
+ static HeapTuple
+ postgresRefetchForeignRow(EState *estate,
+ 						  ExecRowMark *erm,
+ 						  Datum rowid,
+ 						  bool *updated)
+ {
+ 	PgFdwFetchState *ffstate = (PgFdwFetchState *) erm->ermExtra;
+ 	ItemPointer tupleid = (ItemPointer) DatumGetPointer(rowid);
+ 	const char **p_values;
+ 	PGresult   *res;
+ 	HeapTuple	tuple;
+ 
+ 	*updated = false;
+ 
+ 	/* Set up the prepared statement on the remote server, if we didn't yet */
+ 	if (!ffstate->p_name)
+ 		ffstate->p_name = setup_prep_stmt(ffstate->conn, ffstate->query);
+ 
+ 	/* Convert parameters needed by prepared statement to text form */
+ 	p_values = convert_prep_stmt_params(tupleid, NULL,
+ 										ffstate->p_nums,
+ 										ffstate->p_flinfo,
+ 										NIL,
+ 										ffstate->temp_cxt);
+ 
+ 	/*
+ 	 * Execute the prepared statement, and check for success.
+ 	 *
+ 	 * We don't use a PG_TRY block here, so be careful not to throw error
+ 	 * without releasing the PGresult.
+ 	 */
+ 	res = PQexecPrepared(ffstate->conn,
+ 						 ffstate->p_name,
+ 						 ffstate->p_nums,
+ 						 p_values,
+ 						 NULL,
+ 						 NULL,
+ 						 0);
+ 	if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ 		pgfdw_report_error(ERROR, res, ffstate->conn, true, ffstate->query);
+ 
+ 	/* PGresult must be released before leaving this function. */
+ 	PG_TRY();
+ 	{
+ 		/* Create the tuple */
+ 		tuple = make_tuple_from_result_row(res, 0,
+ 										   ffstate->rel,
+ 										   ffstate->attinmeta,
+ 										   ffstate->retrieved_attrs,
+ 										   ffstate->temp_cxt);
+ 		tuple->t_tableOid = erm->relid;
+ 
+ 		PQclear(res);
+ 		res = NULL;
+ 	}
+ 	PG_CATCH();
+ 	{
+ 		if (res)
+ 			PQclear(res);
+ 		PG_RE_THROW();
+ 	}
+ 	PG_END_TRY();
+ 
+ 	MemoryContextReset(ffstate->temp_cxt);
+ 
+ 	if (!ItemPointerEquals(tupleid, &(tuple->t_self)))
+ 		*updated = true;
+ 
+ 	return tuple;
+ }
+ 
+ /*
   * postgresExplainForeignScan
   *		Produce extra output for EXPLAIN of a ForeignScan on a foreign table
   */
***************
*** 1933,1938 **** ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
--- 2119,2170 ----
  }
  
  /*
+  * Create the FDW-private information for fetching/locking foreign rows.
+  */
+ static List *
+ create_foreign_fetch_info(PlannerInfo *root,
+ 						  RelOptInfo *baserel,
+ 						  RowMarkType markType)
+ {
+ 	StringInfoData sql;
+ 	List	   *retrieved_attrs;
+ 	Bitmapset  *attrs_used = NULL;
+ 
+ 	/*
+ 	 * Build the query string to be sent for execution.
+ 	 */
+ 	initStringInfo(&sql);
+ 	/* Add ctid to attrs_used. */
+ 	attrs_used = bms_add_member(attrs_used,
+ 								SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber);
+ 	/* Add a whole-row var to attrs_used to retrieve all the columns. */
+ 	attrs_used = bms_add_member(attrs_used,
+ 								0 - FirstLowInvalidHeapAttributeNumber);
+ 	deparseSelectSql(&sql, root, baserel, attrs_used, &retrieved_attrs);
+ 	appendStringInfoString(&sql, " WHERE ctid = $1");
+ 
+ 	switch (markType)
+ 	{
+ 		case ROW_MARK_EXCLUSIVE:
+ 		case ROW_MARK_NOKEYEXCLUSIVE:
+ 			appendStringInfoString(&sql, " FOR UPDATE");
+ 			break;
+ 		case ROW_MARK_SHARE:
+ 		case ROW_MARK_KEYSHARE:
+ 			appendStringInfoString(&sql, " FOR SHARE");
+ 			break;
+ 		default:
+ 			break;
+ 	}
+ 
+ 	/*
+ 	 * Build the fdw_private list that will be available to the executor.
+ 	 * Items in the list must match enum FdwFetchPrivateIndex, above.
+ 	 */
+ 	return list_make2(makeString(sql.data), retrieved_attrs);
+ }
+ 
+ /*
   * Create cursor for node's query with current parameter values.
   */
  static void
***************
*** 2169,2179 **** close_cursor(PGconn *conn, unsigned int cursor_number)
  }
  
  /*
!  * prepare_foreign_modify
   *		Establish a prepared statement for execution of INSERT/UPDATE/DELETE
   */
! static void
! prepare_foreign_modify(PgFdwModifyState *fmstate)
  {
  	char		prep_name[NAMEDATALEN];
  	char	   *p_name;
--- 2401,2412 ----
  }
  
  /*
!  * setup_prep_stmt
   *		Establish a prepared statement for execution of INSERT/UPDATE/DELETE
+  *		or re-fetching tuples for EvalPlanQual rechecking
   */
! static char *
! setup_prep_stmt(PGconn *conn, char *query)
  {
  	char		prep_name[NAMEDATALEN];
  	char	   *p_name;
***************
*** 2181,2187 **** prepare_foreign_modify(PgFdwModifyState *fmstate)
  
  	/* Construct name we'll use for the prepared statement. */
  	snprintf(prep_name, sizeof(prep_name), "pgsql_fdw_prep_%u",
! 			 GetPrepStmtNumber(fmstate->conn));
  	p_name = pstrdup(prep_name);
  
  	/*
--- 2414,2420 ----
  
  	/* Construct name we'll use for the prepared statement. */
  	snprintf(prep_name, sizeof(prep_name), "pgsql_fdw_prep_%u",
! 			 GetPrepStmtNumber(conn));
  	p_name = pstrdup(prep_name);
  
  	/*
***************
*** 2194,2211 **** prepare_foreign_modify(PgFdwModifyState *fmstate)
  	 * We don't use a PG_TRY block here, so be careful not to throw error
  	 * without releasing the PGresult.
  	 */
! 	res = PQprepare(fmstate->conn,
! 					p_name,
! 					fmstate->query,
! 					0,
! 					NULL);
  
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
! 		pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query);
  	PQclear(res);
  
  	/* This action shows that the prepare has been done. */
! 	fmstate->p_name = p_name;
  }
  
  /*
--- 2427,2440 ----
  	 * We don't use a PG_TRY block here, so be careful not to throw error
  	 * without releasing the PGresult.
  	 */
! 	res = PQprepare(conn, p_name, query, 0, NULL);
  
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
! 		pgfdw_report_error(ERROR, res, conn, true, query);
  	PQclear(res);
  
  	/* This action shows that the prepare has been done. */
! 	return p_name;
  }
  
  /*
***************
*** 2218,2253 **** prepare_foreign_modify(PgFdwModifyState *fmstate)
   * Data is constructed in temp_cxt; caller should reset that after use.
   */
  static const char **
! convert_prep_stmt_params(PgFdwModifyState *fmstate,
! 						 ItemPointer tupleid,
! 						 TupleTableSlot *slot)
  {
  	const char **p_values;
  	int			pindex = 0;
  	MemoryContext oldcontext;
  
! 	oldcontext = MemoryContextSwitchTo(fmstate->temp_cxt);
  
! 	p_values = (const char **) palloc(sizeof(char *) * fmstate->p_nums);
  
  	/* 1st parameter should be ctid, if it's in use */
  	if (tupleid != NULL)
  	{
  		/* don't need set_transmission_modes for TID output */
! 		p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
  											  PointerGetDatum(tupleid));
  		pindex++;
  	}
  
  	/* get following parameters from slot */
! 	if (slot != NULL && fmstate->target_attrs != NIL)
  	{
  		int			nestlevel;
  		ListCell   *lc;
  
  		nestlevel = set_transmission_modes();
  
! 		foreach(lc, fmstate->target_attrs)
  		{
  			int			attnum = lfirst_int(lc);
  			Datum		value;
--- 2447,2485 ----
   * Data is constructed in temp_cxt; caller should reset that after use.
   */
  static const char **
! convert_prep_stmt_params(ItemPointer tupleid,
! 						 TupleTableSlot *slot,
! 						 int p_nums,
! 						 FmgrInfo *p_flinfo,
! 						 List *target_attrs,
! 						 MemoryContext temp_context)
  {
  	const char **p_values;
  	int			pindex = 0;
  	MemoryContext oldcontext;
  
! 	oldcontext = MemoryContextSwitchTo(temp_context);
  
! 	p_values = (const char **) palloc(sizeof(char *) * p_nums);
  
  	/* 1st parameter should be ctid, if it's in use */
  	if (tupleid != NULL)
  	{
  		/* don't need set_transmission_modes for TID output */
! 		p_values[pindex] = OutputFunctionCall(&p_flinfo[pindex],
  											  PointerGetDatum(tupleid));
  		pindex++;
  	}
  
  	/* get following parameters from slot */
! 	if (slot != NULL && target_attrs != NIL)
  	{
  		int			nestlevel;
  		ListCell   *lc;
  
  		nestlevel = set_transmission_modes();
  
! 		foreach(lc, target_attrs)
  		{
  			int			attnum = lfirst_int(lc);
  			Datum		value;
***************
*** 2257,2263 **** convert_prep_stmt_params(PgFdwModifyState *fmstate,
  			if (isnull)
  				p_values[pindex] = NULL;
  			else
! 				p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
  													  value);
  			pindex++;
  		}
--- 2489,2495 ----
  			if (isnull)
  				p_values[pindex] = NULL;
  			else
! 				p_values[pindex] = OutputFunctionCall(&p_flinfo[pindex],
  													  value);
  			pindex++;
  		}
***************
*** 2265,2271 **** convert_prep_stmt_params(PgFdwModifyState *fmstate,
  		reset_transmission_modes(nestlevel);
  	}
  
! 	Assert(pindex == fmstate->p_nums);
  
  	MemoryContextSwitchTo(oldcontext);
  
--- 2497,2503 ----
  		reset_transmission_modes(nestlevel);
  	}
  
! 	Assert(pindex == p_nums);
  
  	MemoryContextSwitchTo(oldcontext);
  
***************
*** 2305,2310 **** store_returning_result(PgFdwModifyState *fmstate,
--- 2537,2645 ----
  }
  
  /*
+  * init_foreign_fetch_state
+  *		Initialize an execution state for fetching/locking foreign rows
+  */
+ static void
+ init_foreign_fetch_state(EState *estate,
+ 						 ExecRowMark *erm,
+ 						 List *fdw_private,
+ 						 int eflags)
+ {
+ 	PgFdwFetchState *ffstate;
+ 	Relation	rel = erm->relation;
+ 	RangeTblEntry *rte;
+ 	Oid			userid;
+ 	ForeignTable *table;
+ 	ForeignServer *server;
+ 	UserMapping *user;
+ 	Oid			typefnoid;
+ 	bool		isvarlena;
+ 
+ 	/* Begin constructing PgFdwFetchState. */
+ 	ffstate = (PgFdwFetchState *) palloc0(sizeof(PgFdwFetchState));
+ 	ffstate->rel = rel;
+ 
+ 	/*
+ 	 * Identify which user to do the remote access as.  This should match what
+ 	 * ExecCheckRTEPerms() does.
+ 	 */
+ 	rte = rt_fetch(erm->rti, estate->es_range_table);
+ 	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+ 
+ 	/* Get info about foreign table. */
+ 	table = GetForeignTable(RelationGetRelid(rel));
+ 	server = GetForeignServer(table->serverid);
+ 	user = GetUserMapping(userid, server->serverid);
+ 
+ 	/* Open connection; report that we'll create a prepared statement. */
+ 	ffstate->conn = GetConnection(server, user, true);
+ 	ffstate->p_name = NULL;		/* prepared statement not made yet */
+ 
+ 	/* Deconstruct fdw_private data. */
+ 	ffstate->query = strVal(list_nth(fdw_private,
+ 									 FdwScanPrivateSelectSql2));
+ 	ffstate->retrieved_attrs = (List *) list_nth(fdw_private,
+ 											  FdwScanPrivateRetrievedAttrs2);
+ 
+ 	/* Create context for per-tuple temp workspace. */
+ 	ffstate->temp_cxt = AllocSetContextCreate(estate->es_query_cxt,
+ 											  "postgres_fdw temporary data",
+ 											  ALLOCSET_SMALL_MINSIZE,
+ 											  ALLOCSET_SMALL_INITSIZE,
+ 											  ALLOCSET_SMALL_MAXSIZE);
+ 
+ 	/* Prepare for input conversion of SELECT results. */
+ 	ffstate->attinmeta = TupleDescGetAttInMetadata(RelationGetDescr(rel));
+ 
+ 	/* Prepare for output conversion of parameters used in prepared stmt. */
+ 	ffstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo));
+ 	ffstate->p_nums = 0;
+ 
+ 	/* Only one transmittable parameter will be ctid */
+ 	getTypeOutputInfo(TIDOID, &typefnoid, &isvarlena);
+ 	fmgr_info(typefnoid, &ffstate->p_flinfo[ffstate->p_nums]);
+ 	ffstate->p_nums++;
+ 
+ 	erm->ermExtra = ffstate;
+ }
+ 
+ /*
+  * finish_foreign_fetch_state
+  *		Finish an execution state for fetching/locking foreign rows
+  */
+ static void
+ finish_foreign_fetch_state(EState *estate, ExecRowMark *erm)
+ {
+ 	PgFdwFetchState *ffstate = (PgFdwFetchState *) erm->ermExtra;
+ 
+ 	/* If we created a prepared statement, destroy it */
+ 	if (ffstate->p_name)
+ 	{
+ 		char		sql[64];
+ 		PGresult   *res;
+ 
+ 		snprintf(sql, sizeof(sql), "DEALLOCATE %s", ffstate->p_name);
+ 
+ 		/*
+ 		 * We don't use a PG_TRY block here, so be careful not to throw error
+ 		 * without releasing the PGresult.
+ 		 */
+ 		res = PQexec(ffstate->conn, sql);
+ 		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+ 			pgfdw_report_error(ERROR, res, ffstate->conn, true, sql);
+ 		PQclear(res);
+ 		ffstate->p_name = NULL;
+ 	}
+ 
+ 	/* Release remote connection */
+ 	ReleaseConnection(ffstate->conn);
+ 	ffstate->conn = NULL;
+ 
+ 	erm->ermExtra = NULL;
+ }
+ 
+ /*
   * postgresAnalyzeForeignTable
   *		Test whether analyzing this foreign table is supported
   */
#2Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#1)
1 attachment(s)
Re: We need to support ForeignRecheck for late row locking, don't we?

On 2015/07/22 19:10, Etsuro Fujita wrote:

While working on the issue "Foreign join pushdown vs EvalPlanQual", I
happened to notice odd behaviors of late row locking in FDWs.

I think the reason for that is because we don't check pushed-down quals
inside an EPQ testing even if what was fetched by RefetchForeignRow was
an updated version of the tuple rather than the same version previously
obtained. So, to fix this, I'd like to propose that pushed-down quals
be checked in ForeignRecheck.

Attached is a patch for that.

* I've modified ForeignRecheck so as to check pushed-down quals whether
doing late locking or early locking. I think we could probably make
ForeignRecheck do so only when doing late locking, but I'm not sure it's
worth complicating the code.

* I've made the above change only for simple foreign table scans that
have scanrelid > 0 and fdw_scan_tlist = NIL. As for simple foreign
table scans that have scanrelid > 0 and *fdw_scan_tlist is non-NIL*, I
think we are under discussion in another thread I started. Will update
as necessary.

* Sorry, I've not fully updated comments and docs yet. Will update.

I'd be happy if I could get feedback earlier.

Best regards,
Etsuro Fujita

Attachments:

ForeignRecheck-v1.patchtext/x-diff; name=ForeignRecheck-v1.patchDownload
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
***************
*** 563,569 **** fileGetForeignPlan(PlannerInfo *root,
  							scan_relid,
  							NIL,	/* no expressions to evaluate */
  							best_path->fdw_private,
! 							NIL /* no custom tlist */ );
  }
  
  /*
--- 563,570 ----
  							scan_relid,
  							NIL,	/* no expressions to evaluate */
  							best_path->fdw_private,
! 							NIL,	/* no custom tlist */
! 							NIL /* no pushed-down qual */ );
  }
  
  /*
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 748,753 **** postgresGetForeignPlan(PlannerInfo *root,
--- 748,754 ----
  	Index		scan_relid = baserel->relid;
  	List	   *fdw_private;
  	List	   *remote_conds = NIL;
+ 	List	   *remote_exprs = NIL;
  	List	   *local_exprs = NIL;
  	List	   *params_list = NIL;
  	List	   *retrieved_attrs;
***************
*** 769,776 **** postgresGetForeignPlan(PlannerInfo *root,
  	 *
  	 * This code must match "extract_actual_clauses(scan_clauses, false)"
  	 * except for the additional decision about remote versus local execution.
! 	 * Note however that we only strip the RestrictInfo nodes from the
! 	 * local_exprs list, since appendWhereClause expects a list of
  	 * RestrictInfos.
  	 */
  	foreach(lc, scan_clauses)
--- 770,777 ----
  	 *
  	 * This code must match "extract_actual_clauses(scan_clauses, false)"
  	 * except for the additional decision about remote versus local execution.
! 	 * Note however that we don't strip the RestrictInfo nodes from the
! 	 * remote_conds list, since appendWhereClause expects a list of
  	 * RestrictInfos.
  	 */
  	foreach(lc, scan_clauses)
***************
*** 784,794 **** postgresGetForeignPlan(PlannerInfo *root,
--- 785,801 ----
  			continue;
  
  		if (list_member_ptr(fpinfo->remote_conds, rinfo))
+ 		{
  			remote_conds = lappend(remote_conds, rinfo);
+ 			remote_exprs = lappend(remote_exprs, rinfo->clause);
+ 		}
  		else if (list_member_ptr(fpinfo->local_conds, rinfo))
  			local_exprs = lappend(local_exprs, rinfo->clause);
  		else if (is_foreign_expr(root, baserel, rinfo->clause))
+ 		{
  			remote_conds = lappend(remote_conds, rinfo);
+ 			remote_exprs = lappend(remote_exprs, rinfo->clause);
+ 		}
  		else
  			local_exprs = lappend(local_exprs, rinfo->clause);
  	}
***************
*** 874,880 **** postgresGetForeignPlan(PlannerInfo *root,
  							scan_relid,
  							params_list,
  							fdw_private,
! 							NIL /* no custom tlist */ );
  }
  
  /*
--- 881,888 ----
  							scan_relid,
  							params_list,
  							fdw_private,
! 							NIL,	/* no custom tlist */
! 							remote_exprs);
  }
  
  /*
*** a/src/backend/executor/nodeForeignscan.c
--- b/src/backend/executor/nodeForeignscan.c
***************
*** 72,79 **** ForeignNext(ForeignScanState *node)
  static bool
  ForeignRecheck(ForeignScanState *node, TupleTableSlot *slot)
  {
! 	/* There are no access-method-specific conditions to recheck. */
! 	return true;
  }
  
  /* ----------------------------------------------------------------
--- 72,90 ----
  static bool
  ForeignRecheck(ForeignScanState *node, TupleTableSlot *slot)
  {
! 	ExprContext *econtext;
! 
! 	/*
! 	 * extract necessary information from foreign scan node
! 	 */
! 	econtext = node->ss.ps.ps_ExprContext;
! 
! 	/* Does the tuple meet the pushed-down-qual condition? */
! 	econtext->ecxt_scantuple = slot;
! 
! 	ResetExprContext(econtext);
! 
! 	return ExecQual(node->pushedDownQual, econtext, false);
  }
  
  /* ----------------------------------------------------------------
***************
*** 135,140 **** ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
--- 146,154 ----
  	scanstate->ss.ps.qual = (List *)
  		ExecInitExpr((Expr *) node->scan.plan.qual,
  					 (PlanState *) scanstate);
+ 	scanstate->pushedDownQual = (List *)
+ 		ExecInitExpr((Expr *) node->pushedDownQual,
+ 					 (PlanState *) scanstate);
  
  	/*
  	 * tuple table initialization
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 604,609 **** _copyForeignScan(const ForeignScan *from)
--- 604,610 ----
  	COPY_NODE_FIELD(fdw_private);
  	COPY_NODE_FIELD(fdw_scan_tlist);
  	COPY_BITMAPSET_FIELD(fs_relids);
+ 	COPY_NODE_FIELD(pushedDownQual);
  	COPY_SCALAR_FIELD(fsSystemCol);
  
  	return newnode;
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 570,575 **** _outForeignScan(StringInfo str, const ForeignScan *node)
--- 570,576 ----
  	WRITE_NODE_FIELD(fdw_private);
  	WRITE_NODE_FIELD(fdw_scan_tlist);
  	WRITE_BITMAPSET_FIELD(fs_relids);
+ 	WRITE_NODE_FIELD(pushedDownQual);
  	WRITE_BOOL_FIELD(fsSystemCol);
  }
  
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 2109,2114 **** create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
--- 2109,2116 ----
  			replace_nestloop_params(root, (Node *) scan_plan->scan.plan.qual);
  		scan_plan->fdw_exprs = (List *)
  			replace_nestloop_params(root, (Node *) scan_plan->fdw_exprs);
+ 		scan_plan->pushedDownQual = (List *)
+ 			replace_nestloop_params(root, (Node *) scan_plan->pushedDownQual);
  	}
  
  	/*
***************
*** 3692,3698 **** make_foreignscan(List *qptlist,
  				 Index scanrelid,
  				 List *fdw_exprs,
  				 List *fdw_private,
! 				 List *fdw_scan_tlist)
  {
  	ForeignScan *node = makeNode(ForeignScan);
  	Plan	   *plan = &node->scan.plan;
--- 3694,3701 ----
  				 Index scanrelid,
  				 List *fdw_exprs,
  				 List *fdw_private,
! 				 List *fdw_scan_tlist,
! 				 List *pushedDownQual)
  {
  	ForeignScan *node = makeNode(ForeignScan);
  	Plan	   *plan = &node->scan.plan;
***************
*** 3710,3715 **** make_foreignscan(List *qptlist,
--- 3713,3719 ----
  	node->fdw_scan_tlist = fdw_scan_tlist;
  	/* fs_relids will be filled in by create_foreignscan_plan */
  	node->fs_relids = NULL;
+ 	node->pushedDownQual = pushedDownQual;
  	/* fsSystemCol will be filled in by create_foreignscan_plan */
  	node->fsSystemCol = false;
  
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
***************
*** 1132,1137 **** set_foreignscan_references(PlannerInfo *root,
--- 1132,1140 ----
  			fix_scan_list(root, fscan->scan.plan.qual, rtoffset);
  		fscan->fdw_exprs =
  			fix_scan_list(root, fscan->fdw_exprs, rtoffset);
+ 		/* pushedDownQual needs the same adjustments, too */
+ 		fscan->pushedDownQual =
+ 			fix_scan_list(root, fscan->pushedDownQual, rtoffset);
  	}
  
  	/* Adjust fs_relids if needed */
*** a/src/backend/optimizer/plan/subselect.c
--- b/src/backend/optimizer/plan/subselect.c
***************
*** 2368,2373 **** finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params,
--- 2368,2379 ----
  		case T_ForeignScan:
  			finalize_primnode((Node *) ((ForeignScan *) plan)->fdw_exprs,
  							  &context);
+ 
+ 			/*
+ 			 * we need not look at pushedDownQual, since it will have the same
+ 			 * param references as fdw_exprs.
+ 			 */
+ 
  			/* We assume fdw_scan_tlist cannot contain Params */
  			context.paramids = bms_add_members(context.paramids, scan_params);
  			break;
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 1571,1576 **** typedef struct WorkTableScanState
--- 1571,1577 ----
  typedef struct ForeignScanState
  {
  	ScanState	ss;				/* its first field is NodeTag */
+ 	List	   *pushedDownQual;	/* list of pushed-down quals, if any */
  	/* use struct pointer to avoid including fdwapi.h here */
  	struct FdwRoutine *fdwroutine;
  	void	   *fdw_state;		/* foreign-data wrapper can keep state here */
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
***************
*** 517,522 **** typedef struct ForeignScan
--- 517,523 ----
  	List	   *fdw_private;	/* private data for FDW */
  	List	   *fdw_scan_tlist; /* optional tlist describing scan tuple */
  	Bitmapset  *fs_relids;		/* RTIs generated by this scan */
+ 	List	   *pushedDownQual;	/* list of pushed-down quals, if any */
  	bool		fsSystemCol;	/* true if any "system column" is needed */
  } ForeignScan;
  
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
***************
*** 45,51 **** extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
  				  Index scanrelid, Plan *subplan);
  extern ForeignScan *make_foreignscan(List *qptlist, List *qpqual,
  				 Index scanrelid, List *fdw_exprs, List *fdw_private,
! 				 List *fdw_scan_tlist);
  extern Append *make_append(List *appendplans, List *tlist);
  extern RecursiveUnion *make_recursive_union(List *tlist,
  					 Plan *lefttree, Plan *righttree, int wtParam,
--- 45,51 ----
  				  Index scanrelid, Plan *subplan);
  extern ForeignScan *make_foreignscan(List *qptlist, List *qpqual,
  				 Index scanrelid, List *fdw_exprs, List *fdw_private,
! 				 List *fdw_scan_tlist, List *pushedDownQual);
  extern Append *make_append(List *appendplans, List *tlist);
  extern RecursiveUnion *make_recursive_union(List *tlist,
  					 Plan *lefttree, Plan *righttree, int wtParam,
#3Kouhei Kaigai
kaigai@ak.jp.nec.com
In reply to: Etsuro Fujita (#2)
Re: We need to support ForeignRecheck for late row locking, don't we?

Fujita-san,

On 2015/07/22 19:10, Etsuro Fujita wrote:

While working on the issue "Foreign join pushdown vs EvalPlanQual", I
happened to notice odd behaviors of late row locking in FDWs.

I think the reason for that is because we don't check pushed-down quals
inside an EPQ testing even if what was fetched by RefetchForeignRow was
an updated version of the tuple rather than the same version previously
obtained. So, to fix this, I'd like to propose that pushed-down quals
be checked in ForeignRecheck.

Attached is a patch for that.

* I've modified ForeignRecheck so as to check pushed-down quals whether
doing late locking or early locking. I think we could probably make
ForeignRecheck do so only when doing late locking, but I'm not sure it's
worth complicating the code.

* I've made the above change only for simple foreign table scans that
have scanrelid > 0 and fdw_scan_tlist = NIL. As for simple foreign
table scans that have scanrelid > 0 and *fdw_scan_tlist is non-NIL*, I
think we are under discussion in another thread I started. Will update
as necessary.

* Sorry, I've not fully updated comments and docs yet. Will update.

I'd be happy if I could get feedback earlier.

Isn't it an option to put a new callback in ForeignRecheck?

FDW driver knows its private data structure includes expression node
that was pushed down to the remote side. So, it seems to me the best
way to consult FDW driver whether the supplied tuple should be visible
according to the pushed down qualifier.

More or less, this fix need a new interface contract around EvalPlanQual
logic. It is better to give FDW driver more flexibility of its private
data structure and the way to process recheck logic, rather than special
purpose variable.

If FDW driver managed pushed-down expression in its own format, requirement
to pushedDownQual makes them to have qualifier redundantly.
The callback approach does not have such kind of concern.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Kouhei Kaigai (#3)
Re: We need to support ForeignRecheck for late row locking, don't we?

KaiGai-san,

On 2015/07/24 23:51, Kouhei Kaigai wrote:

On 2015/07/22 19:10, Etsuro Fujita wrote:

While working on the issue "Foreign join pushdown vs EvalPlanQual", I
happened to notice odd behaviors of late row locking in FDWs.

I think the reason for that is because we don't check pushed-down quals
inside an EPQ testing even if what was fetched by RefetchForeignRow was
an updated version of the tuple rather than the same version previously
obtained. So, to fix this, I'd like to propose that pushed-down quals
be checked in ForeignRecheck.

* I've modified ForeignRecheck so as to check pushed-down quals whether
doing late locking or early locking.

Isn't it an option to put a new callback in ForeignRecheck?

FDW driver knows its private data structure includes expression node
that was pushed down to the remote side. So, it seems to me the best
way to consult FDW driver whether the supplied tuple should be visible
according to the pushed down qualifier.

More or less, this fix need a new interface contract around EvalPlanQual
logic. It is better to give FDW driver more flexibility of its private
data structure and the way to process recheck logic, rather than special
purpose variable.

If FDW driver managed pushed-down expression in its own format, requirement
to pushedDownQual makes them to have qualifier redundantly.
The callback approach does not have such kind of concern.

That might be an idea, but is there any performance disadvantage as
discussed in [1]/messages/by-id/5590ED5C.2040200@lab.ntt.co.jp?; it looks like that that needs to perform another
remote query to see if the supplied tuple satisfies the pushed-down
quals during EPQ testing.

Best regards,
Etsuro Fujita

[1]: /messages/by-id/5590ED5C.2040200@lab.ntt.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Kouhei Kaigai
kaigai@ak.jp.nec.com
In reply to: Etsuro Fujita (#4)
Re: We need to support ForeignRecheck for late row locking, don't we?

On 2015/07/24 23:51, Kouhei Kaigai wrote:

On 2015/07/22 19:10, Etsuro Fujita wrote:

While working on the issue "Foreign join pushdown vs EvalPlanQual", I
happened to notice odd behaviors of late row locking in FDWs.

I think the reason for that is because we don't check pushed-down quals
inside an EPQ testing even if what was fetched by RefetchForeignRow was
an updated version of the tuple rather than the same version previously
obtained. So, to fix this, I'd like to propose that pushed-down quals
be checked in ForeignRecheck.

* I've modified ForeignRecheck so as to check pushed-down quals whether
doing late locking or early locking.

Isn't it an option to put a new callback in ForeignRecheck?

FDW driver knows its private data structure includes expression node
that was pushed down to the remote side. So, it seems to me the best
way to consult FDW driver whether the supplied tuple should be visible
according to the pushed down qualifier.

More or less, this fix need a new interface contract around EvalPlanQual
logic. It is better to give FDW driver more flexibility of its private
data structure and the way to process recheck logic, rather than special
purpose variable.

If FDW driver managed pushed-down expression in its own format, requirement
to pushedDownQual makes them to have qualifier redundantly.
The callback approach does not have such kind of concern.

That might be an idea, but is there any performance disadvantage as
discussed in [1]?; it looks like that that needs to perform another
remote query to see if the supplied tuple satisfies the pushed-down
quals during EPQ testing.

I expect the callback of ForeignRecheck runs ExecQual() towards
the qualifier expression pushed-down but saved on the private data
of ForeignScanState. It does not need to kick another remote query
(unless FDW driver is not designed), so performance disadvantage is
none or quite limited.

Also, let's assume the case when scanrelid == 0 (join pushdown).
It is easy to put special code path if scanrelid == 0, that
implies ScanState is either ForeignScan or CustomScan.
If ForeignRecheck (= recheckMtd) is called instead of the if-
block below of the Assert() on ExecScanFetch, FDW driver will be
able to put its own special code path to run alternative sub-plan.
How this alternative sub-plan works? It walks down the sub-plan
tree that is typically consists of NestLoop + ForeignScan for
example, then ExecScanFetch() is called again towards ScanState
with scanrelid > 0 at that time.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Kouhei Kaigai (#5)
Re: We need to support ForeignRecheck for late row locking, don't we?

On 2015/07/27 18:16, Kouhei Kaigai wrote:

On 2015/07/24 23:51, Kouhei Kaigai wrote:

On 2015/07/22 19:10, Etsuro Fujita wrote:

While working on the issue "Foreign join pushdown vs EvalPlanQual", I
happened to notice odd behaviors of late row locking in FDWs.

I think the reason for that is because we don't check pushed-down quals
inside an EPQ testing even if what was fetched by RefetchForeignRow was
an updated version of the tuple rather than the same version previously
obtained. So, to fix this, I'd like to propose that pushed-down quals
be checked in ForeignRecheck.

* I've modified ForeignRecheck so as to check pushed-down quals whether
doing late locking or early locking.

Isn't it an option to put a new callback in ForeignRecheck?

FDW driver knows its private data structure includes expression node
that was pushed down to the remote side. So, it seems to me the best
way to consult FDW driver whether the supplied tuple should be visible
according to the pushed down qualifier.

More or less, this fix need a new interface contract around EvalPlanQual
logic. It is better to give FDW driver more flexibility of its private
data structure and the way to process recheck logic, rather than special
purpose variable.

If FDW driver managed pushed-down expression in its own format, requirement
to pushedDownQual makes them to have qualifier redundantly.
The callback approach does not have such kind of concern.

That might be an idea, but is there any performance disadvantage as
discussed in [1]?; it looks like that that needs to perform another
remote query to see if the supplied tuple satisfies the pushed-down
quals during EPQ testing.

I expect the callback of ForeignRecheck runs ExecQual() towards
the qualifier expression pushed-down but saved on the private data
of ForeignScanState. It does not need to kick another remote query
(unless FDW driver is not designed), so performance disadvantage is
none or quite limited.

The advantages look not clear to me.

I think the callback approach would be a good idea if FDWs were able to
do the recheck more efficiently in their own ways than the core, for
example.

Best regards,
Etsuro Fujita

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers