Exposing quals

Started by David Fetterover 17 years ago12 messages
#1David Fetter
david@fetter.org
1 attachment(s)

Folks,

I'd like to make it possible to see qualifiers from inside functions
in PostgreSQL. For my particular case, and for dblink, it would
allow user-space code to some remove bottlenecks which make currently
make inter-DBMS communication extremely inefficient.

The current code returns one (potentially) big string with the quals
ANDed together. I'm thinking that the new code would:

1. Return just a pointer.

2. Add some functions like rsinfo_get_node_tree_str() which can turn
same into (in this case, a string, but others might get tree
structures) on demand from client code.

3. Add wrapper functions in each untrusted PL (trusted PLs shouldn't
be doing anything that needs this).

4. Document the above.

Please find attached the patch, and thanks to Neil Conway and Korry
Douglas for the code, and to Jan Wieck for helping me hammer out the
scheme above. Mistakes are all mine ;)

Comments? Questions?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

db_link_qual_pushdown-6.patchtext/plain; charset=us-asciiDownload
Index: contrib/dblink/dblink.c
===================================================================
RCS file: /projects/cvsroot/pgsql/contrib/dblink/dblink.c,v
retrieving revision 1.73
diff -c -c -r1.73 dblink.c
*** contrib/dblink/dblink.c	4 Apr 2008 17:02:56 -0000	1.73
--- contrib/dblink/dblink.c	8 Apr 2008 18:29:05 -0000
***************
*** 752,757 ****
--- 752,758 ----
  		char	   *conname = NULL;
  		remoteConn *rconn = NULL;
  		bool		fail = true;	/* default to backward compatible */
+ 		ReturnSetInfo	*rsi;
  
  		/* create a function context for cross-call persistence */
  		funcctx = SRF_FIRSTCALL_INIT();
***************
*** 826,831 ****
--- 827,843 ----
  				elog(ERROR, "wrong number of arguments");
  		}
  
+ 		if (sql && rsi->qual)
+ 		{
+ 			char *quals = rsinfo_get_qual_str(rsi);
+ 			char *qualifiedQuery = palloc(strlen(sql) + strlen(" WHERE ") +
+ 										  strlen(quals) + 1);
+ 
+ 			sprintf(qualifiedQuery, "%s WHERE %s", sql, quals);
+ 
+ 			sql = qualifiedQuery;
+ 		}
+ 
  		if (!conn)
  			DBLINK_CONN_NOT_AVAIL;
  
Index: src/backend/executor/execQual.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/execQual.c,v
retrieving revision 1.228
diff -c -c -r1.228 execQual.c
*** src/backend/executor/execQual.c	25 Mar 2008 22:42:43 -0000	1.228
--- src/backend/executor/execQual.c	8 Apr 2008 18:29:06 -0000
***************
*** 45,50 ****
--- 45,51 ----
  #include "funcapi.h"
  #include "miscadmin.h"
  #include "nodes/makefuncs.h"
+ #include "optimizer/clauses.h"
  #include "optimizer/planmain.h"
  #include "parser/parse_expr.h"
  #include "utils/acl.h"
***************
*** 1415,1420 ****
--- 1416,1440 ----
  	return result;
  }
  
+ /*
+  *
+  * Get either an empty string or a batch of qualifiers.
+  *
+  */
+ char *
+ rsinfo_get_qual_str(ReturnSetInfo *rsinfo)
+ {
+ 	Node	*qual;
+ 	List	*context;
+ 
+ 	if (rsinfo->qual == NIL)
+ 		return pstrdup("");
+ 
+ 	qual = (Node *) make_ands_explicit(rsinfo->qual);
+ 	context = deparse_context_for_plan(NULL, NULL, rsinfo->rtable);
+ 
+ 	return deparse_expression(qual, context, false, false);
+ }
  
  /*
   *		ExecMakeTableFunctionResult
***************
*** 1426,1431 ****
--- 1446,1452 ----
  Tuplestorestate *
  ExecMakeTableFunctionResult(ExprState *funcexpr,
  							ExprContext *econtext,
+ 							List *qual, List *rtable,
  							TupleDesc expectedDesc,
  							TupleDesc *returnDesc)
  {
***************
*** 1458,1463 ****
--- 1479,1486 ----
  	InitFunctionCallInfoData(fcinfo, NULL, 0, NULL, (Node *) &rsinfo);
  	rsinfo.type = T_ReturnSetInfo;
  	rsinfo.econtext = econtext;
+ 	rsinfo.qual = qual;
+ 	rsinfo.rtable = rtable;
  	rsinfo.expectedDesc = expectedDesc;
  	rsinfo.allowedModes = (int) (SFRM_ValuePerCall | SFRM_Materialize);
  	rsinfo.returnMode = SFRM_ValuePerCall;
Index: src/backend/executor/nodeFunctionscan.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeFunctionscan.c,v
retrieving revision 1.46
diff -c -c -r1.46 nodeFunctionscan.c
*** src/backend/executor/nodeFunctionscan.c	29 Feb 2008 02:49:39 -0000	1.46
--- src/backend/executor/nodeFunctionscan.c	8 Apr 2008 18:29:06 -0000
***************
*** 68,73 ****
--- 68,75 ----
  		node->tuplestorestate = tuplestorestate =
  			ExecMakeTableFunctionResult(node->funcexpr,
  										econtext,
+ 										node->ss.ps.plan->qual,
+ 										estate->es_range_table,
  										node->tupdesc,
  										&funcTupdesc);
  
Index: src/include/executor/executor.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/executor/executor.h,v
retrieving revision 1.147
diff -c -c -r1.147 executor.h
*** src/include/executor/executor.h	28 Mar 2008 00:21:56 -0000	1.147
--- src/include/executor/executor.h	8 Apr 2008 18:29:06 -0000
***************
*** 176,181 ****
--- 176,182 ----
  					   ExprDoneCond *isDone);
  extern Tuplestorestate *ExecMakeTableFunctionResult(ExprState *funcexpr,
  							ExprContext *econtext,
+ 							List *qual, List *rtable,
  							TupleDesc expectedDesc,
  							TupleDesc *returnDesc);
  extern Datum ExecEvalExprSwitchContext(ExprState *expression, ExprContext *econtext,
***************
*** 187,192 ****
--- 188,194 ----
  extern int	ExecCleanTargetListLength(List *targetlist);
  extern TupleTableSlot *ExecProject(ProjectionInfo *projInfo,
  			ExprDoneCond *isDone);
+ extern char *rsinfo_get_qual_str(ReturnSetInfo *rsinfo);
  
  /*
   * prototypes from functions in execScan.c
Index: src/include/nodes/execnodes.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/nodes/execnodes.h,v
retrieving revision 1.183
diff -c -c -r1.183 execnodes.h
*** src/include/nodes/execnodes.h	1 Jan 2008 19:45:58 -0000	1.183
--- src/include/nodes/execnodes.h	8 Apr 2008 18:29:06 -0000
***************
*** 168,173 ****
--- 168,175 ----
  	ExprContext *econtext;		/* context function is being called in */
  	TupleDesc	expectedDesc;	/* tuple descriptor expected by caller */
  	int			allowedModes;	/* bitmask: return modes caller can handle */
+ 	List		*qual;			/* any quals to be applied to result */
+ 	List		*rtable;
  	/* result status from function (but pre-initialized by caller): */
  	SetFunctionReturnMode returnMode;	/* actual return mode */
  	ExprDoneCond isDone;		/* status for ValuePerCall mode */
Index: src/pl/plperl/plperl.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plperl/plperl.c,v
retrieving revision 1.139
diff -c -c -r1.139 plperl.c
*** src/pl/plperl/plperl.c	28 Mar 2008 00:21:56 -0000	1.139
--- src/pl/plperl/plperl.c	8 Apr 2008 18:29:07 -0000
***************
*** 950,956 ****
  	ENTER;
  	SAVETMPS;
  	PUSHMARK(SP);
! 	XPUSHs(sv_2mortal(newSVstring("our $_TD; local $_TD=$_[0]; shift;")));
  	XPUSHs(sv_2mortal(newSVstring(s)));
  	PUTBACK;
  
--- 950,956 ----
  	ENTER;
  	SAVETMPS;
  	PUSHMARK(SP);
! 	XPUSHs(sv_2mortal(newSVstring("our $_TD; local $_TD=$_[0]; shift; our $_QUAL; local $_QUAL=$_[0]; shift;")));
  	XPUSHs(sv_2mortal(newSVstring(s)));
  	PUTBACK;
  
***************
*** 1058,1063 ****
--- 1058,1075 ----
  
  	XPUSHs(&PL_sv_undef);		/* no trigger data */
  
+ 	if (fcinfo->resultinfo && ((ReturnSetInfo *)fcinfo->resultinfo)->qual)
+ 	{
+ 		ReturnSetInfo *rsi = (ReturnSetInfo *)fcinfo->resultinfo;
+ 		HV *hv = newHV();
+ 
+ 		hv_store_string(hv, "qual_string", newSVstring(rsinfo_get_qual_str(rsi)));
+ 
+ 		XPUSHs(newRV_noinc((SV *)hv));
+ 	}
+ 	else
+ 		XPUSHs(&PL_sv_undef);	/* no qualifier string */
+ 
  	for (i = 0; i < desc->nargs; i++)
  	{
  		if (fcinfo->argnull[i])
#2Simon Riggs
simon@2ndquadrant.com
In reply to: David Fetter (#1)
Re: Exposing quals

On Mon, 2008-05-05 at 12:01 -0700, David Fetter wrote:

Please find attached the patch, and thanks to Neil Conway and Korry
Douglas for the code, and to Jan Wieck for helping me hammer out the
scheme above. Mistakes are all mine ;)

I see no negative comments to this patch on -hackers.

This was discussed here
http://wiki.postgresql.org/wiki/PgCon_2008_Developer_Meeting#SQL.2FMED
and I had understood the consensus to be that we would go ahead with
this?

The notes say "Heikki doesn't think this is a long term solution", but
in the following discussion it was the *only* way of doing this that
will work with non-PostgreSQL databases. So it seems like the way we
would want to go, yes?

So, can we add this to the CommitFest July page so it can receive some
substantial constructive/destructive comments?

This could be an important feature in conjunction with Hot Standby.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#3Noname
andrew@dunslane.net
In reply to: Simon Riggs (#2)
Re: Exposing quals

On Mon, 2008-05-05 at 12:01 -0700, David Fetter wrote:

Please find attached the patch, and thanks to Neil Conway and Korry
Douglas for the code, and to Jan Wieck for helping me hammer out the
scheme above. Mistakes are all mine ;)

I see no negative comments to this patch on -hackers.

This was discussed here
http://wiki.postgresql.org/wiki/PgCon_2008_Developer_Meeting#SQL.2FMED
and I had understood the consensus to be that we would go ahead with
this?

The notes say "Heikki doesn't think this is a long term solution", but
in the following discussion it was the *only* way of doing this that
will work with non-PostgreSQL databases. So it seems like the way we
would want to go, yes?

So, can we add this to the CommitFest July page so it can receive some
substantial constructive/destructive comments?

This could be an important feature in conjunction with Hot Standby.

The notes say at the end:

"Jan thinks that showing the node tree will work better. But others don't
agree with him -- it wouldn't work for PL/perlU. But Jan thinks it would
work to give it a pointer to the parse tree and the range, we'd need to
add an access function for the PL."

For the record, I agree with Jan's suggestion of passing a pointer to the
parse tree, and offline gave David a suggestion verbally as to how this
could be handled for PL/PerlU.

I don't think we should be tied too closely to a string representation,
although possibly the first and simplest callback function would simply
stringify the quals.

cheers

andrew

#4David Fetter
david@fetter.org
In reply to: Noname (#3)
New relkind (was Re: Exposing quals)

On Mon, Jul 07, 2008 at 06:46:29PM -0400, Andrew Dunstan wrote:

On Mon, 2008-05-05 at 12:01 -0700, David Fetter wrote:

Please find attached the patch, and thanks to Neil Conway and
Korry Douglas for the code, and to Jan Wieck for helping me
hammer out the scheme above. Mistakes are all mine ;)

I see no negative comments to this patch on -hackers.

This was discussed here
http://wiki.postgresql.org/wiki/PgCon_2008_Developer_Meeting#SQL.2FMED
and I had understood the consensus to be that we would go ahead
with this?

The notes say "Heikki doesn't think this is a long term solution",
but in the following discussion it was the *only* way of doing
this that will work with non-PostgreSQL databases. So it seems
like the way we would want to go, yes?

So, can we add this to the CommitFest July page so it can receive
some substantial constructive/destructive comments?

This could be an important feature in conjunction with Hot
Standby.

The notes say at the end:

"Jan thinks that showing the node tree will work better. But others
don't agree with him -- it wouldn't work for PL/perlU. But Jan
thinks it would work to give it a pointer to the parse tree and the
range, we'd need to add an access function for the PL."

For the record, I agree with Jan's suggestion of passing a pointer
to the parse tree, and offline gave David a suggestion verbally as
to how this could be handled for PL/PerlU.

I don't think we should be tied too closely to a string
representation, although possibly the first and simplest callback
function would simply stringify the quals.

As I understand Jan's plan, the idea is to create a new relkind with
an exit to user code at leaf nodes in the plan tree. This would
require an API design for both user C code and for each PL to use, but
would then allow PostgreSQL's optimizer to work on JOINs, etc.

Jan, have I got that right so far? Do you have something in the way
of a rough patch, docs, etc. for this?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#5Simon Riggs
simon@2ndquadrant.com
In reply to: David Fetter (#4)
Re: New relkind (was Re: Exposing quals)

On Mon, 2008-07-07 at 16:26 -0700, David Fetter wrote:

On Mon, Jul 07, 2008 at 06:46:29PM -0400, Andrew Dunstan wrote:

For the record, I agree with Jan's suggestion of passing a pointer
to the parse tree, and offline gave David a suggestion verbally as
to how this could be handled for PL/PerlU.

I don't think we should be tied too closely to a string
representation, although possibly the first and simplest callback
function would simply stringify the quals.

As I understand Jan's plan, the idea is to create a new relkind with
an exit to user code at leaf nodes in the plan tree. This would
require an API design for both user C code and for each PL to use, but
would then allow PostgreSQL's optimizer to work on JOINs, etc.

Jan, have I got that right so far? Do you have something in the way
of a rough patch, docs, etc. for this?

It sounds like we can make it happen as text for other DBMS and as plan
nodes for PostgreSQL, which is the best solution all round.

Personally not too worried which way we do this - as long as we do it
for 8.4 :-) It's obviously happening in the background, so I'll leave it
alone.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#6Hans-Juergen Schoenig
postgres@cybertec.at
In reply to: Simon Riggs (#5)
Re: New relkind (was Re: Exposing quals)

Simon Riggs wrote:

On Mon, 2008-07-07 at 16:26 -0700, David Fetter wrote:

On Mon, Jul 07, 2008 at 06:46:29PM -0400, Andrew Dunstan wrote:

For the record, I agree with Jan's suggestion of passing a pointer
to the parse tree, and offline gave David a suggestion verbally as
to how this could be handled for PL/PerlU.

I don't think we should be tied too closely to a string
representation, although possibly the first and simplest callback
function would simply stringify the quals.

As I understand Jan's plan, the idea is to create a new relkind with
an exit to user code at leaf nodes in the plan tree. This would
require an API design for both user C code and for each PL to use, but
would then allow PostgreSQL's optimizer to work on JOINs, etc.

Jan, have I got that right so far? Do you have something in the way
of a rough patch, docs, etc. for this?

It sounds like we can make it happen as text for other DBMS and as plan
nodes for PostgreSQL, which is the best solution all round.

Personally not too worried which way we do this - as long as we do it
for 8.4 :-) It's obviously happening in the background, so I'll leave it
alone.

I think the concept involving the plan tree is gold. Hannu Krosing
mentioned some idea like that recently as well.
If the function had the chance to tell the planner how it is gonna
operate (e.g produces sorted output, etc.) it would be perfect.
The golden thing here would be if we could teach a function whether it
is " STREAMABLE | NOT STREAMABLE". streamable would make sure that we
don't have to materialize the output of a set returning function. this
would allow google-like analysis in postgresql easily by allowing to
fetch data from any amount of data from any data source.

best regards,

hans

--
Cybertec Sch�nig & Sch�nig GmbH
PostgreSQL Solutions and Support
Gr�hrm�hlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com

#7Heikki Linnakangas
heikki@enterprisedb.com
In reply to: Simon Riggs (#2)
Re: Exposing quals

Simon Riggs wrote:

The notes say "Heikki doesn't think this is a long term solution", but
in the following discussion it was the *only* way of doing this that
will work with non-PostgreSQL databases. So it seems like the way we
would want to go, yes?

How did you come to the conclusion that this is the only way that will
work with non-PostgreSQL databases? I don't see any limitations like
that in any of the proposed approaches.

I guess I should clarify my position on this:

We should start moving towards a full SQL:MED solution that will
ultimately support pushing down joins, aggregates etc. to the remote
database. Including support for transaction control, using 2PC, and cost
estimation and intelligent planning.

This should be done in an extensible way, so that people can write their
own plugins to connect to different RDBMSs, as well as simple data
sources like flat files. The plugin needs to be able to control which
parts of a plan tree can be pushed down to the remote source, estimate
the cost of remote execution, and map remote data types to local ones.
And it then needs to be able to construct and execute the remote parts
of a plan.

We're obviously not going to get all that overnight, but whatever we
implement now should be the first step towards that, rather than
something that we need to deprecate and replace in the future.
Unfortunately I don't see a way to extend the proposed "exposing quals
to functions" patch to do more than just that.

The list of functionality a full-blown plugin will need is quite long. I
don't think there's any hope of supporting all that without reaching
into some PostgreSQL internal data structures, particularly the planner
structures like RelOptInfo, Path and Plan. The plugins will be more
tightly integrated into the system than say user defined data types.
They will need to be written in C, and they will be somewhat version
dependent. Simpler plugins, like one to read CSV files, with no "pushing
down" and no update support, will need less access to internals, and
thus will be less version dependent, so pgfoundry projects like that
will be feasible.

Note that the dependency on internal data structures doesn't go away by
saying that they're passed as text; the text representation of our data
structures is version dependent as well.

So what would the plugin API look like? To hook into the planner, I'm
envisioning the plugin would define these functions:

/*
* Generate a remote plan for executing a whole subquery remotely. For
* example, if the query is an aggregate, we might be able to execute
* the whole aggregate in the remote database. This will be called
* from grouping_planner(), like optimize_minmax_aggregates().
* Returns NULL if remote execution is not possible. (a dummy
* implementation can always return NULL.
*/
Plan *generate_remote_path(PlannerInfo *, List *tlist);

/*
* Generate a path for executing one relation in remote
* database. The relation can be a base (non-join) remote relation,
* or a join involving a remote relation. Can return NULL for join
* relations if the join can't be executed remotely.
*/
Path *generate_remote_path(PlannerInfo *, RelOptInfo *)

/*
* Create a Plan node from a Path. Called from create_plan, when
* the planner chooses to use a remote path. A typical implementation
* would create the SQL string to be executed in the remote database,
* and return a RemotePlan node with that SQL string in it.
*/
Plan *create_remote_plan(PlannerInfo *, RemotePath *)

On the execution side, the plugin needs to be able to execute a
previously generated RemotePlan. There would be a new executor node
type, a RemoteScan, that would be similar to a seq scan or index scan,
but delegates the actual execution to the plugin. The execution part of
the plugin API would reflect the API of executor nodes, something like:

void *scan_open(RemotePlan *)
HeapTuple *scan_getnext(void *scanstate)
void scan_close(void *scanstate)

The presumption here is that you would define remote tables with the
appropriate SQL:MED statements beforehand (CREATE FOREIGN TABLE).
However, it is flexible enough that you could implement the "exposing
quals to functions" functionality with this as well:
generate_remote_path() would need to recognize the function scans that
it can handle, and return a RemotePath struct with all the same
information as create_functionscan_path does (the cost estimates could
be adjusted for the pushed down quals at this point as well).
create_remote_plan would return a FunctionScan node, but with the extra
qualifiers passed into the function as arguments. In case of dblink, it
could just add extra WHERE clauses to the query that's being passed as
argument. I'm not proposing that we do the stuff described in this
paragraph, just using it as an example of the flexibility.

BTW, I think the "exposing quals to functions" functionality could be
implemented as a planner hook as well. The hook would call the standard
planner, and modify the plan tree after that, passing the quals as extra
arguments to functions that can take advantage of them.

A "foreign data wrapper" interface is also defined in the SQL/MED
standard. I've only looked at it briefly, but it seems provide roughly
the same functionality as the API I defined above. It would be a good
idea to look at that, though I don't think that part of the standard is
very widely adopted.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#8Simon Riggs
simon@2ndquadrant.com
In reply to: Heikki Linnakangas (#7)
Re: Exposing quals

On Tue, 2008-07-08 at 17:51 +0300, Heikki Linnakangas wrote:

Simon Riggs wrote:

The notes say "Heikki doesn't think this is a long term solution", but
in the following discussion it was the *only* way of doing this that
will work with non-PostgreSQL databases. So it seems like the way we
would want to go, yes?

How did you come to the conclusion that this is the only way that will
work with non-PostgreSQL databases?

SQL, in text form, is the interface to other databases. You can't pass
half a plan tree to Oracle, especially not a PostgreSQL plan tree. It
has to be text if you wish to send a query to another RDBMS, or another
version of PostgreSQL.

We should start moving towards a full SQL:MED solution that will
ultimately support pushing down joins, aggregates etc. to the remote
database. Including support for transaction control, using 2PC, and
cost estimation and intelligent planning.

This should be done in an extensible way, so that people can write
their own plugins to connect to different RDBMSs, as well as simple
data sources like flat files. The plugin needs to be able to control
which parts of a plan tree can be pushed down to the remote source,
estimate the cost of remote execution, and map remote data types to
local ones. And it then needs to be able to construct and execute the
remote parts of a plan.

So if I understand you, you want to pass the partial plan tree and then
have a plugin construct the SQL text. Sounds like a great approach.

Maybe you thought I meant internal interfaces should be in text? No,
that would be bizarre. I meant we should not attempt to pass partial
plan trees outside of the database, since that would limit the feature
to only working with the same version of PostgreSQL database. I support
your wish to have something that can work with other types of database.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#9Heikki Linnakangas
heikki@enterprisedb.com
In reply to: Simon Riggs (#8)
Re: Exposing quals

Simon Riggs wrote:

SQL, in text form, is the interface to other databases. You can't pass
half a plan tree to Oracle, especially not a PostgreSQL plan tree. It
has to be text if you wish to send a query to another RDBMS, or another
version of PostgreSQL.

Oh, I see. Agreed.

Though note that there's big differences in SQL dialects, so a
one-size-fits-all approach to generating SQL to be executed in the
remote database won't work. (not that I think anyone has suggested that)

So if I understand you, you want to pass the partial plan tree and then
have a plugin construct the SQL text.

Exactly.

Maybe you thought I meant internal interfaces should be in text?

Yeah, that's exactly what I thought you meant.

No,
that would be bizarre. I meant we should not attempt to pass partial
plan trees outside of the database, since that would limit the feature
to only working with the same version of PostgreSQL database.

Agreed. I'm glad we're on the same page now.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#10David Fetter
david@fetter.org
In reply to: Heikki Linnakangas (#9)
Re: Exposing quals

On Tue, Jul 08, 2008 at 06:22:23PM +0300, Heikki Linnakangas wrote:

Simon Riggs wrote:

SQL, in text form, is the interface to other databases. You can't
pass half a plan tree to Oracle, especially not a PostgreSQL plan
tree. It has to be text if you wish to send a query to another
RDBMS, or another version of PostgreSQL.

Oh, I see. Agreed.

Though note that there's big differences in SQL dialects, so a
one-size-fits-all approach to generating SQL to be executed in the
remote database won't work. (not that I think anyone has suggested
that)

So if I understand you, you want to pass the partial plan tree and
then have a plugin construct the SQL text.

Exactly.

Maybe you thought I meant internal interfaces should be in text?

Yeah, that's exactly what I thought you meant.

No, that would be bizarre. I meant we should not attempt to pass
partial plan trees outside of the database, since that would limit
the feature to only working with the same version of PostgreSQL
database.

Agreed. I'm glad we're on the same page now.

Everybody's weighed in on this thread except the guy who's actually
doing the work.

Jan?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#11Jan Wieck
JanWieck@Yahoo.com
In reply to: David Fetter (#10)
Re: Exposing quals

On 7/8/2008 11:38 AM, David Fetter wrote:

On Tue, Jul 08, 2008 at 06:22:23PM +0300, Heikki Linnakangas wrote:

Simon Riggs wrote:

SQL, in text form, is the interface to other databases. You can't
pass half a plan tree to Oracle, especially not a PostgreSQL plan
tree. It has to be text if you wish to send a query to another
RDBMS, or another version of PostgreSQL.

Oh, I see. Agreed.

Though note that there's big differences in SQL dialects, so a
one-size-fits-all approach to generating SQL to be executed in the
remote database won't work. (not that I think anyone has suggested
that)

So if I understand you, you want to pass the partial plan tree and
then have a plugin construct the SQL text.

Exactly.

Maybe you thought I meant internal interfaces should be in text?

Yeah, that's exactly what I thought you meant.

No, that would be bizarre. I meant we should not attempt to pass
partial plan trees outside of the database, since that would limit
the feature to only working with the same version of PostgreSQL
database.

Agreed. I'm glad we're on the same page now.

Everybody's weighed in on this thread except the guy who's actually
doing the work.

Jan?

Here,

I talked to my supervisor here in Toronto (that's where I am this week)
and Afilias actually sees enough value in this for me to go and spend
time officially on it.

The ideas I have so far are as follows:

Down in the exec nodes like SeqScan or IndexScan, there are several
parts available that are important.

- Scanned relation
- Targetlist
- Filter (for SeqScan)
- IndexQual (for IndexScan)

These pieces are available at least in the scans Init function and
actually can be converted back into some SQL statement that effectively
represents this one single table scan. However, parsing it back at that
point is nonsense, as we cannot expect everything out there to actually
be an SQL database.

Also, both the qualification as well as the targetlist can contain
things like user defined function calls. We neither want to deny nor
require that this sort of construct is actually handed over to the
external data source, so the interface needs to be more flexible.
Therefore it is best to divide the functionality into several user exit
functions.

The several functions that implement a scan type inside of the executor
very much resemble opening a cursor for a single table query, fetching
rows from it, eventually (in the case of a nested loop for example)
close and reopen the cursor with different key values from the outer
tuple, close the cursor. So it makes total sense to actually require an
implementation of an external data source to provide functions to open a
cursor, fetch rows, close the cursor.

There will be some connection and transaction handling around all this
that I have in mind but think it would distract from the problem to be
solved right here, so more about that another time.

The C implementation for open cursor would be called with a scan handle,
containing the connection, the relname, the targetlist and the
qualification subtrees. These are modified from the real ones in the
scan node so that all Var's have varno=1 and that all OUTER Var's have
been replaced with a Const that reflects the current outer tuples
values. From here there are several support functions available to "dumb
down" each of those to whatever the external data source may support. In
case of the targetlist, this could mean to filter out a unique list of
Var nodes only, removing all expressions from it. In case of the
qualification, this could mean remove everything that isn't a standard
operator (=, <>, ...), or remove everything that isn't Postgres builtin.
Finally, there is a support function that will build a SQL statement
according to what's left inside that scan handle.

The scan handle would track which modifications have been done to the
various pieces so that the outer support framework knows if it gets back
the originally requested targetlist, or if it has to run the projection
on the returned unique list of Var's. And if it has to recheck the
returned tuples for qualification, because some of the qual's had been
removed.

In order to allow the user exits to be written in PL's, I can think of
makiing a complex data type containing the scan handle. The subtrees
could be accessed by the PL via support functions that return them in
nodeToString() or other formats.

I'll try to write up a more complete proposal until end of next week.

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

#12David Fetter
david@fetter.org
In reply to: Jan Wieck (#11)
Re: Exposing quals

On Tue, Jul 08, 2008 at 02:41:45PM -0400, Jan Wieck wrote:

Here,

I talked to my supervisor here in Toronto (that's where I am this week)
and Afilias actually sees enough value in this for me to go and spend
time officially on it.

Yay!

The ideas I have so far are as follows:

Down in the exec nodes like SeqScan or IndexScan, there are several
parts available that are important.

- Scanned relation
- Targetlist
- Filter (for SeqScan)
- IndexQual (for IndexScan)

These pieces are available at least in the scans Init function and
actually can be converted back into some SQL statement that effectively
represents this one single table scan. However, parsing it back at that
point is nonsense, as we cannot expect everything out there to actually
be an SQL database.

Also, both the qualification as well as the targetlist can contain
things like user defined function calls. We neither want to deny nor
require that this sort of construct is actually handed over to the
external data source, so the interface needs to be more flexible.
Therefore it is best to divide the functionality into several user exit
functions.

Right :)

The several functions that implement a scan type inside of the
executor very much resemble opening a cursor for a single table
query, fetching rows from it, eventually (in the case of a nested
loop for example) close and reopen the cursor with different key
values from the outer tuple, close the cursor. So it makes total
sense to actually require an implementation of an external data
source to provide functions to open a cursor, fetch rows, close the
cursor.

That's not unreasonable, given that the simplest kind of external data
source would likely be along the lines of fopen().

There will be some connection and transaction handling around all
this that I have in mind but think it would distract from the
problem to be solved right here, so more about that another time.

Maybe something like a way of setting, for each relation, what kind
(if any) of transactions are available?

The C implementation for open cursor would be called with a scan
handle, containing the connection, the relname, the targetlist and
the qualification subtrees. These are modified from the real ones
in the scan node so that all Var's have varno=1 and that all OUTER
Var's have been replaced with a Const that reflects the current
outer tuples values. From here there are several support functions
available to "dumb down" each of those to whatever the external
data source may support. In case of the targetlist, this could mean
to filter out a unique list of Var nodes only, removing all
expressions from it. In case of the qualification, this could mean
remove everything that isn't a standard operator (=, <>, ...), or
remove everything that isn't Postgres builtin. Finally, there is a
support function that will build a SQL statement according to
what's left inside that scan handle.

Interesting.

The scan handle would track which modifications have been done to the
various pieces so that the outer support framework knows if it gets back
the originally requested targetlist, or if it has to run the projection
on the returned unique list of Var's. And if it has to recheck the
returned tuples for qualification, because some of the qual's had been
removed.

In order to allow the user exits to be written in PL's, I can think of
makiing a complex data type containing the scan handle. The subtrees
could be accessed by the PL via support functions that return them in
nodeToString() or other formats.

I'll try to write up a more complete proposal until end of next week.

Yay!

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate