[v9.5] Custom Plan API

Started by KaiGai Koheialmost 12 years ago64 messageshackers
Jump to latest
#1KaiGai Kohei
kaigai@ak.jp.nec.com

Prior to the development cycle towards v9.5, I'd like to reopen
the discussion of custom-plan interface. Even though we had lots
of discussion during the last three commit-fests, several issues
are still under discussion. So, I'd like to clarify direction of
the implementation, prior to the first commit-fest.

(1) DDL support and system catalog

Simon suggested that DDL command should be supported to track custom-
plan providers being installed, and to avoid nonsense hook calls
if it is an obvious case that custom-plan provider can help. It also
makes sense to give a chance to load extensions once installed.
(In the previous design, I assumed modules are loaded by LOAD command
or *_preload_libraries parameters).

I tried to implement the following syntax:

CREATE CUSTOM PLAN <name> FOR (scan|join|any) HANDLER <func_name>;

It records a particular function as an entrypoint of custom-plan provider,
then it will be called when planner tries to find out the best path to scan
or join relations. This function takes an argument (INTERNAL type) that packs
information to construct and register an alternative scan/join path, like
PlannerInfo, RelOptInfo and so on.

(*) The data structure below will be supplied, in case of scan path.
typedef struct {
uint32 custom_class;
PlannerInfo *root;
RelOptInfo *baserel;
RangeTblEntry *rte;
} customScanArg;

This function, usually implemented with C-language, can construct a custom
object being delivered from CustomPath type that contains a set of function
pointers; including functions that populate another objects delivered from
CustomPlan or CustomPlanState as I did in the patch towards v9.4 development.

Properties of individual custom-plan providers are recorded in the
pg_custom_plan system catalog. Right now, its definition is quite simple
- only superuser can create / drop custom-plan providers, and its definition
does not belong to a particular namespace.
Because of this assumption (only superuser can touch), I don't put database
ACL mechanism here.
What kind of characteristics should be there?

(2) Static functions to be exported

Tom concerned that custom-plan API needs several key functions can be
called by extensions, although these are declared as static functions,
thus, it looks like a part of interfaces.
Once people thought it is stable ones we can use beyond the version up,
it may become a barrier to the future improvement in the core code.
Is it a right understanding, isn't it?

One solution is to write a notice clearly, like: "these external functions
are not stable interfaces, so extension should not assumed these functions
are available beyond future version up".

Nevertheless, more stable functions are more kindness for authors of extensions.
So, I tried a few approaches.

First of all, we categorized functions into three categories.
(A) It walks on plan/expression tree recursively.
(B) It modifies internal state of the core backend.
(C) It is commonly used but in a particular source file.

Although the number of functions are not so many, (A) and (B) must have
its entrypoint from extensions. If unavailable, extension needs to manage
a copied code with small enhancement by itself, and its burden is similar
to just branching the tree.
Example of (A) are: create_plan_recurse, set_plan_refs, ...
Example of (B) are: fix_expr_common, ...

On the other hands, (C) functions are helpful if available, however, it
is not mandatory requirement to implement.

Our first trial, according to the proposition by Tom, is to investigate
a common walker function on plan tree as we are now doing on expression
tree. We expected, we can give function pointers of key routines to
extensions, instead of exporting the static functions.
However, it didn't work well because existing recursive call takes
various kind of jobs for each plan-node type, so it didn't fit a structure
of walker functions; that applies a uniform operation for each node.

Note that, I assumed the following walker functions that applies plan_walker
or expr_walker on the underlying plan/expression trees.
bool
plan_tree_walker(Plan *plan,
bool (*plan_walker) (),
bool (*expr_walker) (),
void *context)
Please tell me if it is different from your ideas, I'll reconsider it.

On the next, I tried another approach that gives function pointers of
(A) and (B) functions as a part of custom-plan interface.
It is workable at least, however, it seems to me its interface definition
has advantage in comparison to the original approach.

For example, below is definition of the callback in setref.c.

+   void    (*SetCustomPlanRef)(PlannerInfo *root,
+                               CustomPlan *custom_plan,
+                               int rtoffset,
+                               Plan *(*fn_set_plan_refs)(PlannerInfo *root,
+                                                         Plan *plan,
+                                                         int rtoffset),
+                               void (*fn_fix_expr_common)(PlannerInfo *root,
+                                                          Node *node));

Extension needs set_plan_refs() and fix_expr_common() at least, I added
function pointers of them. But this definition has to be updated according
to the future update of these functions. It does not seem to me a proper
way to smooth the impact of future internal change.

So, I'd like to find out where is a good common ground to solve the matter.

One idea is the first simple solution. The core PostgreSQL will be developed
independently from the out-of-tree modules, so we don't care about stability
of declaration of internal functions, even if it is exported to multiple
source files. (I believe it is our usual manner.)

One other idea is, a refactoring of the core backend to consolidate routines
per plan-node, not processing stage. For example, createplan.c contains most
of codes commonly needed to create plan, in addition to individual plan node.
Let's assume a function like create_seqscan_plan() are located in a separated
source file, then routines to be exported become clear.
One expected disadvantage is, this refactoring makes complicated to back patches.

Do you have any other ideas to implement it well?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Show quoted text

-----Original Message-----
From: Kohei KaiGai [mailto:kaigai@kaigai.gr.jp]
Sent: Tuesday, April 29, 2014 10:07 AM
To: Kaigai Kouhei(海外 浩平)
Cc: Tom Lane; Andres Freund; Robert Haas; Simon Riggs; PgHacker; Stephen
Frost; Shigeru Hanada; Jim Mlodgenski; Peter Eisentraut
Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

Yeah. I'm still not exactly convinced that custom-scan will ever
allow independent development of new plan types (which, with all due
respect to Robert, is what it was being sold as last year in Ottawa).
But I'm not opposed in principle to committing it, if we can find a
way to have a cleaner API for things like setrefs.c. It seems like
late-stage planner processing in general is an issue for this patch
(createplan.c and subselect.c are also looking messy). EXPLAIN isn't

too great either.

I'm not sure exactly what to do about those cases, but I wonder
whether things would get better if we had the equivalent of
expression_tree_walker/mutator capability for plan nodes. The state
of affairs in setrefs and subselect, at least, is a bit reminiscent
of the bad old days when we had lots of different bespoke code for
traversing expression trees.

Hmm. If we have something like expression_tree_walker/mutator for plan
nodes, we can pass a walker/mutator function's pointer instead of
exposing static functions that takes recursive jobs.
If custom-plan provider (that has sub-plans) got a callback with
walker/ mutator pointer, all it has to do for sub-plans are calling
this new plan-tree walking support routine with supplied walker/mutator.
It seems to me more simple design than what I did.

I tried to code the similar walker/mutator functions on plan-node tree,
however, it was not available to implement these routines enough simple,
because the job of walker/mutator functions are not uniform thus caller
side also must have a large switch-case branches.

I picked up setrefs.c for my investigation.
The set_plan_refs() applies fix_scan_list() on the expression tree being
appeared in the plan node if it is delivered from Scan, however, it also
applies set_join_references() for subclass of Join, or
set_dummy_tlist_references() for some other plan nodes.
It implies that the walker/mutator functions of Plan node has to apply
different operation according to the type of Plan node. I'm not certain
how much different forms are needed.
(In addition, set_plan_refs() performs usually like a walker, but often
performs as a mutator if trivial subquery....)

I'm expecting the function like below. It allows to call plan_walker
function for each plan-node and also allows to call expr_walker function
for each expression-node on the plan node.

bool
plan_tree_walker(Plan *plan,
bool (*plan_walker) (),
bool (*expr_walker) (),
void *context)

I'd like to see if something other form to implement this routine.

One alternative idea to give custom-plan provider a chance to handle its
subplans is, to give function pointers (1) to handle recursion of plan-tree
and (2) to set up backend's internal state.
In case of setrefs.c, set_plan_refs() and fix_expr_common() are minimum
necessity for extensions. It also kills necessity to export static
functions.

How about your thought?
--
KaiGai Kohei <kaigai@kaigai.gr.jp>

Attachments:

pgsql-v9.5-custom-plan-with-ctidscan.v0.patchapplication/octet-stream; name=pgsql-v9.5-custom-plan-with-ctidscan.v0.patchDownload+3327-38
#2Simon Riggs
simon@2ndQuadrant.com
In reply to: KaiGai Kohei (#1)
Re: [v9.5] Custom Plan API

On 7 May 2014 02:05, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Prior to the development cycle towards v9.5, I'd like to reopen
the discussion of custom-plan interface. Even though we had lots
of discussion during the last three commit-fests, several issues
are still under discussion. So, I'd like to clarify direction of
the implementation, prior to the first commit-fest.

(1) DDL support and system catalog

Simon suggested that DDL command should be supported to track custom-
plan providers being installed, and to avoid nonsense hook calls
if it is an obvious case that custom-plan provider can help. It also
makes sense to give a chance to load extensions once installed.
(In the previous design, I assumed modules are loaded by LOAD command
or *_preload_libraries parameters).

I tried to implement the following syntax:

CREATE CUSTOM PLAN <name> FOR (scan|join|any) HANDLER <func_name>;

Thank you for exploring that thought and leading the way on this
research. I've been thinking about this also.

What I think we need is a declarative form that expresses the linkage
between base table(s) and a related data structures that can be used
to optimize a query, while still providing accurate results.

In other DBMS, we have concepts such as a JoinIndex or a MatView which
allow some kind of lookaside behaviour. Just for clarity, a concrete
example is Oracle's Materialized Views which can be set using ENABLE
QUERY REWRITE so that the MatView can be used as an alternative path
for a query. We do already have this concept in PostgreSQL, where an
index can be used to perform an IndexOnlyScan rather than accessing
the heap itself.

We have considerable evidence that the idea of alternate data
structures results in performance gains.
* KaiGai's work - https://wiki.postgresql.org/wiki/PGStrom
* /messages/by-id/52C59858.9090500@garret.ru
* http://citusdata.github.io/cstore_fdw/
* University of Manchester - exploring GPUs as part of the AXLE project
* Barcelona SuperComputer Centre - exploring FPGAs, as part of the AXLE project
* Some other authors have also cited gains using GPU technology in databases

So I would like to have a mechanism that provides a *generic*
Lookaside for a table or foreign table.

Tom and Kevin have previously expressed that MatViews would represent
a planning problem, in the general case. One way to solve that
planning issue is to link structures directly together, in the same
way that an index and a table are linked. We can then process the
lookaside in the same way we handle a partial index - check
prerequisites and if usable, calculate a cost for the alternate path.
We need not add planning time other than to the tables that might
benefit from that.

Roughly, I'm thinking of this...

CREATE LOOKASIDE ON foo
TO foo_mat_view;

and also this...

CREATE LOOKASIDE ON foo
TO foo_as_a_foreign_table /* e.g. PGStrom */

This would allow the planner to consider alternate plans for foo_mv
during set_plain_rel_pathlist() similarly to the way it considers
index paths, in one of the common cases that the mat view covers just
one table.

This concept is similar to ENABLE QUERY REWRITE in Oracle, but this
thought goes much further, to include any generic user-defined data
structure or foreign table.

Do we need this? For MVs, we *might* be able to deduce that the MV is
rewritable for "foo", but that is not deducible for Foreign Tables, by
current definition, so I prefer the explicit definition of objects
that are linked - since doing this for indexes is already familiar to
people.

Having an explicit linkage between data structures allows us to
enhance an existing application by transaparently adding new
structures, just as we already do with indexes. Specifically, that we
allow more than one lookaside structure on any one table.

Forget the exact name, thats not important. But I think the
requirements here are...

* Explicit definition that we are attaching an alternate path onto a
table (conceptually similar to adding an index)

* Ability to check that the alternate path is viable (similar to the
way we validate use of partial indexes prior to usage)
Checks on columns(SELECT), rows(WHERE), aggregations(GROUP)

* Ability to consider access cost for both normal table and alternate
path (like an index) - this allows the alternate path to *not* be
chosen when we are performing some operation that is sub-optimal (for
whatever reason).

* There may be some need to define operator classes that are
implemented via the alternate path

which works for single tables, but a later requirement would then be

* allows the join of one or more tables to be replaced with a single lookaside

Hopefully, we won't need a "Custom Plan" at all, just the ability to
lookaside when useful.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Simon Riggs (#2)
Re: [v9.5] Custom Plan API

On 7 May 2014 02:05, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Prior to the development cycle towards v9.5, I'd like to reopen the
discussion of custom-plan interface. Even though we had lots of
discussion during the last three commit-fests, several issues are
still under discussion. So, I'd like to clarify direction of the
implementation, prior to the first commit-fest.

(1) DDL support and system catalog

Simon suggested that DDL command should be supported to track custom-
plan providers being installed, and to avoid nonsense hook calls if it
is an obvious case that custom-plan provider can help. It also makes
sense to give a chance to load extensions once installed.
(In the previous design, I assumed modules are loaded by LOAD command
or *_preload_libraries parameters).

I tried to implement the following syntax:

CREATE CUSTOM PLAN <name> FOR (scan|join|any) HANDLER <func_name>;

Thank you for exploring that thought and leading the way on this research.
I've been thinking about this also.

What I think we need is a declarative form that expresses the linkage between
base table(s) and a related data structures that can be used to optimize
a query, while still providing accurate results.

In other DBMS, we have concepts such as a JoinIndex or a MatView which allow
some kind of lookaside behaviour. Just for clarity, a concrete example is
Oracle's Materialized Views which can be set using ENABLE QUERY REWRITE
so that the MatView can be used as an alternative path for a query. We do
already have this concept in PostgreSQL, where an index can be used to
perform an IndexOnlyScan rather than accessing the heap itself.

We have considerable evidence that the idea of alternate data structures
results in performance gains.
* KaiGai's work - https://wiki.postgresql.org/wiki/PGStrom
* /messages/by-id/52C59858.9090500@garret.ru
* http://citusdata.github.io/cstore_fdw/
* University of Manchester - exploring GPUs as part of the AXLE project
* Barcelona SuperComputer Centre - exploring FPGAs, as part of the AXLE
project
* Some other authors have also cited gains using GPU technology in databases

So I would like to have a mechanism that provides a *generic* Lookaside
for a table or foreign table.

Tom and Kevin have previously expressed that MatViews would represent a
planning problem, in the general case. One way to solve that planning issue
is to link structures directly together, in the same way that an index and
a table are linked. We can then process the lookaside in the same way we
handle a partial index - check prerequisites and if usable, calculate a
cost for the alternate path.
We need not add planning time other than to the tables that might benefit
from that.

Roughly, I'm thinking of this...

CREATE LOOKASIDE ON foo
TO foo_mat_view;

and also this...

CREATE LOOKASIDE ON foo
TO foo_as_a_foreign_table /* e.g. PGStrom */

This would allow the planner to consider alternate plans for foo_mv during
set_plain_rel_pathlist() similarly to the way it considers index paths,
in one of the common cases that the mat view covers just one table.

This concept is similar to ENABLE QUERY REWRITE in Oracle, but this thought
goes much further, to include any generic user-defined data structure or
foreign table.

Let me clarify. This mechanism allows to add alternative scan/join paths
including built-in ones, not only custom enhanced plan/exec node, isn't it?
Probably, it is a variation of above proposition if we install a handler
function that proposes built-in path nodes towards the request for scan/join.

Do we need this? For MVs, we *might* be able to deduce that the MV is
rewritable for "foo", but that is not deducible for Foreign Tables, by
current definition, so I prefer the explicit definition of objects that
are linked - since doing this for indexes is already familiar to people.

Having an explicit linkage between data structures allows us to enhance
an existing application by transaparently adding new structures, just as
we already do with indexes. Specifically, that we allow more than one
lookaside structure on any one table.

Not only alternative data structure, alternative method to scan/join towards
same data structure is also important, isn't it?

Forget the exact name, thats not important. But I think the requirements
here are...

* Explicit definition that we are attaching an alternate path onto a table
(conceptually similar to adding an index)

I think the syntax allows "tables", not only a particular table.
It will inform the core planner this lookaside/customplan (name is not
important, anyway this feature...) can provide alternative path towards
the set of relations; being considered. So, it allows to reduce number of
function calls on planner stage.

* Ability to check that the alternate path is viable (similar to the way
we validate use of partial indexes prior to usage)
Checks on columns(SELECT), rows(WHERE), aggregations(GROUP)

I never deny it... but do you think this feature from the initial version??

* Ability to consider access cost for both normal table and alternate path
(like an index) - this allows the alternate path to *not* be chosen when
we are performing some operation that is sub-optimal (for whatever reason).

It is an usual job of existing planner, isn't it?

* There may be some need to define operator classes that are implemented
via the alternate path

which works for single tables, but a later requirement would then be

* allows the join of one or more tables to be replaced with a single lookaside

It's higher priority for me, and I guess it is same in MatView usage.

Hopefully, we won't need a "Custom Plan" at all, just the ability to
lookaside when useful.

Probably, lookaside is a special case in the scenario that custom-plan can
provide. I also think it is an attractive use case if we can redirect
a particular complicated join into a MatView reference. So, it makes sense
to bundle a handler function to replace join by matview reference.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Simon Riggs
simon@2ndQuadrant.com
In reply to: KaiGai Kohei (#3)
Re: [v9.5] Custom Plan API

On 7 May 2014 08:17, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Let me clarify. This mechanism allows to add alternative scan/join paths
including built-in ones, not only custom enhanced plan/exec node, isn't it?
Probably, it is a variation of above proposition if we install a handler
function that proposes built-in path nodes towards the request for scan/join.

Yes, I am looking for a way to give you the full extent of your
requirements, within the Postgres framework. I have time and funding
to assist you in achieving this in a general way that all may make use
of.

Not only alternative data structure, alternative method to scan/join towards
same data structure is also important, isn't it?

Agreed. My proposal is that if the planner allows the lookaside to an
FDW then we pass the query for full execution on the FDW. That means
that the scan, aggregate and join could take place via the FDW. i.e.
"Custom Plan" == lookaside + FDW

Or put another way, if we add Lookaside then we can just plug in the
pgstrom FDW directly and we're done. And everybody else's FDW will
work as well, so Citus etcc will not need to recode.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Simon Riggs (#4)
Re: [v9.5] Custom Plan API

-----Original Message-----
From: Simon Riggs [mailto:simon@2ndQuadrant.com]
Sent: Wednesday, May 07, 2014 5:02 PM
To: Kaigai Kouhei(海外 浩平)
Cc: Tom Lane; Robert Haas; Andres Freund; PgHacker; Stephen Frost; Shigeru
Hanada; Jim Mlodgenski; Peter Eisentraut; Kohei KaiGai
Subject: Re: [v9.5] Custom Plan API

On 7 May 2014 08:17, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Let me clarify. This mechanism allows to add alternative scan/join
paths including built-in ones, not only custom enhanced plan/exec node,

isn't it?

Probably, it is a variation of above proposition if we install a
handler function that proposes built-in path nodes towards the request

for scan/join.

Yes, I am looking for a way to give you the full extent of your requirements,
within the Postgres framework. I have time and funding to assist you in
achieving this in a general way that all may make use of.

Not only alternative data structure, alternative method to scan/join
towards same data structure is also important, isn't it?

Agreed. My proposal is that if the planner allows the lookaside to an FDW
then we pass the query for full execution on the FDW. That means that the
scan, aggregate and join could take place via the FDW. i.e.
"Custom Plan" == lookaside + FDW

Or put another way, if we add Lookaside then we can just plug in the pgstrom
FDW directly and we're done. And everybody else's FDW will work as well,
so Citus etcc will not need to recode.

Hmm. That sounds me, you intend to make FDW perform as a central facility
to host pluggable plan/exec stuff. Even though we have several things to be
clarified, I also think it's a direction worth to investigate.

Let me list up the things to be clarified / developed randomly.

* Join replacement by FDW; We still don't have consensus about join replacement
by FDW. Probably, it will be designed to remote-join implementation primarily,
however, things to do is similar. We may need to revisit the Hanada-san's
proposition in the past.

* Lookaside for ANY relations; I want planner to try GPU-scan for any relations
once installed, to reduce user's administration cost.
It needs lookaside allow to specify a particular foreign-server, not foreign-
table, then create ForeignScan node that is not associated with a particular
foreign-table.

* ForeignScan node that is not associated with a particular foreign-table.
Once we try to apply ForeignScan node instead of Sort or Aggregate, existing
FDW implementation needs to be improved. These nodes scan on a materialized
relation (generated on the fly), however, existing FDW code assumes
ForeignScan node is always associated with a particular foreign-table.
We need to eliminate this restriction.

* FDW method for MultiExec. In case when we can stack multiple ForeignScan
nodes, it's helpful to support to exchange scanned tuples in their own
data format. Let's assume two ForeignScan nodes are stacked. One performs
like Sort, another performs like Scan. If they internally handle column-
oriented data format, TupleTableSlot is not a best way for data exchange.

* Lookaside on the INSERT/UPDATE/DELETE. Probably, it can be implemented
using writable FDW feature. Not a big issue, but don't forget it...

How about your opinion?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Simon Riggs
simon@2ndQuadrant.com
In reply to: KaiGai Kohei (#5)
Re: [v9.5] Custom Plan API

On 7 May 2014 10:06, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Let me list up the things to be clarified / developed randomly.

* Join replacement by FDW; We still don't have consensus about join replacement
by FDW. Probably, it will be designed to remote-join implementation primarily,
however, things to do is similar. We may need to revisit the Hanada-san's
proposition in the past.

Agreed. We need to push down joins into FDWs and we need to push down
aggregates also, so they can be passed to FDWs. I'm planning to look
at aggregate push down.

* Lookaside for ANY relations; I want planner to try GPU-scan for any relations
once installed, to reduce user's administration cost.
It needs lookaside allow to specify a particular foreign-server, not foreign-
table, then create ForeignScan node that is not associated with a particular
foreign-table.

IMHO we would not want to add indexes to every column, on every table,
nor would we wish to use lookaside for all tables. It is a good thing
to be able to add optimizations for individual tables. GPUs are not
good for everything; it is good to be able to leverage their
strengths, yet avoid their weaknesses.

If do you want that, you can write an Event Trigger that automatically
adds a lookaside for any table.

* ForeignScan node that is not associated with a particular foreign-table.
Once we try to apply ForeignScan node instead of Sort or Aggregate, existing
FDW implementation needs to be improved. These nodes scan on a materialized
relation (generated on the fly), however, existing FDW code assumes
ForeignScan node is always associated with a particular foreign-table.
We need to eliminate this restriction.

I don't think we need to do that, given the above.

* FDW method for MultiExec. In case when we can stack multiple ForeignScan
nodes, it's helpful to support to exchange scanned tuples in their own
data format. Let's assume two ForeignScan nodes are stacked. One performs
like Sort, another performs like Scan. If they internally handle column-
oriented data format, TupleTableSlot is not a best way for data exchange.

I agree TupleTableSlot may not be best way for bulk data movement. We
probably need to look at buffering/bulk movement between executor
nodes in general, which would be of benefit for the FDW case also.
This would be a problem even for Custom Scans as originally presented
also, so I don't see much change there.

* Lookaside on the INSERT/UPDATE/DELETE. Probably, it can be implemented
using writable FDW feature. Not a big issue, but don't forget it...

Yes, possible.

I hope these ideas make sense. This is early days and there may be
other ideas and much detail yet to come.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#4)
Re: [v9.5] Custom Plan API

* Simon Riggs (simon@2ndQuadrant.com) wrote:

Agreed. My proposal is that if the planner allows the lookaside to an
FDW then we pass the query for full execution on the FDW. That means
that the scan, aggregate and join could take place via the FDW. i.e.
"Custom Plan" == lookaside + FDW

How about we get that working for FDWs to begin with and then we can
come back to this idea..? We're pretty far from join-pushdown or
aggregate-pushdown to FDWs, last I checked, and having those would be a
massive win for everyone using FDWs.

Thanks,

Stephen

#8Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#6)
Re: [v9.5] Custom Plan API

* Simon Riggs (simon@2ndQuadrant.com) wrote:

IMHO we would not want to add indexes to every column, on every table,
nor would we wish to use lookaside for all tables. It is a good thing
to be able to add optimizations for individual tables. GPUs are not
good for everything; it is good to be able to leverage their
strengths, yet avoid their weaknesses.

It's the optimizer's job to figure out which path to pick though, based
on which will have the lowest cost.

If do you want that, you can write an Event Trigger that automatically
adds a lookaside for any table.

This sounds terribly ugly and like we're pushing optimization decisions
on to the user instead of just figuring out what the best answer is.

I agree TupleTableSlot may not be best way for bulk data movement. We
probably need to look at buffering/bulk movement between executor
nodes in general, which would be of benefit for the FDW case also.
This would be a problem even for Custom Scans as originally presented
also, so I don't see much change there.

Being able to do bulk movement would be useful, but (as I proposed
months ago) being able to do asyncronous returns would be extremely
useful also, when you consider FDWs and Append()- the main point there
being that you want to keep the FDWs busy and working in parallel.

Thanks,

Stephen

#9Simon Riggs
simon@2ndQuadrant.com
In reply to: Stephen Frost (#8)
Re: [v9.5] Custom Plan API

On 7 May 2014 17:43, Stephen Frost <sfrost@snowman.net> wrote:

* Simon Riggs (simon@2ndQuadrant.com) wrote:

IMHO we would not want to add indexes to every column, on every table,
nor would we wish to use lookaside for all tables. It is a good thing
to be able to add optimizations for individual tables. GPUs are not
good for everything; it is good to be able to leverage their
strengths, yet avoid their weaknesses.

It's the optimizer's job to figure out which path to pick though, based
on which will have the lowest cost.

Of course. I'm not suggesting otherwise.

If do you want that, you can write an Event Trigger that automatically
adds a lookaside for any table.

This sounds terribly ugly and like we're pushing optimization decisions
on to the user instead of just figuring out what the best answer is.

I'm proposing that we use a declarative approach, just like we do when
we say CREATE INDEX.

The idea is that we only consider a lookaside when a lookaside has
been declared. Same as when we add an index, the optimizer considers
whether to use that index. What we don't want to happen is that the
optimizer considers a GIN plan, even when a GIN index is not
available.

I'll explain it more at the developer meeting. It probably sounds a
bit weird at first.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#9)
Re: [v9.5] Custom Plan API

* Simon Riggs (simon@2ndQuadrant.com) wrote:

On 7 May 2014 17:43, Stephen Frost <sfrost@snowman.net> wrote:

It's the optimizer's job to figure out which path to pick though, based
on which will have the lowest cost.

Of course. I'm not suggesting otherwise.

If do you want that, you can write an Event Trigger that automatically
adds a lookaside for any table.

This sounds terribly ugly and like we're pushing optimization decisions
on to the user instead of just figuring out what the best answer is.

I'm proposing that we use a declarative approach, just like we do when
we say CREATE INDEX.

There's quite a few trade-offs when it comes to indexes though. I'm
trying to figure out when you wouldn't want to use a GPU, if it's
available to you and the cost model says it's faster? To me, that's
kind of like saying you want a declarative approach for when to use a
HashJoin.

The idea is that we only consider a lookaside when a lookaside has
been declared. Same as when we add an index, the optimizer considers
whether to use that index. What we don't want to happen is that the
optimizer considers a GIN plan, even when a GIN index is not
available.

Yes, I understood your proposal- I just don't agree with it. ;)

For MatViews and/or Indexes, there are trade-offs to be had as it
relates to disk space, insert speed, etc.

Thanks,

Stephen

#11KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Simon Riggs (#6)
Re: [v9.5] Custom Plan API

Let me list up the things to be clarified / developed randomly.

* Join replacement by FDW; We still don't have consensus about join

replacement

by FDW. Probably, it will be designed to remote-join implementation

primarily,

however, things to do is similar. We may need to revisit the Hanada-san's
proposition in the past.

Agreed. We need to push down joins into FDWs and we need to push down
aggregates also, so they can be passed to FDWs. I'm planning to look at
aggregate push down.

Probably, it's a helpful feature.

* Lookaside for ANY relations; I want planner to try GPU-scan for any

relations

once installed, to reduce user's administration cost.
It needs lookaside allow to specify a particular foreign-server, not

foreign-

table, then create ForeignScan node that is not associated with a

particular

foreign-table.

IMHO we would not want to add indexes to every column, on every table, nor
would we wish to use lookaside for all tables. It is a good thing to be
able to add optimizations for individual tables. GPUs are not good for
everything; it is good to be able to leverage their strengths, yet avoid
their weaknesses.

If do you want that, you can write an Event Trigger that automatically adds
a lookaside for any table.

It may be a solution if we try to replace scan on a relation by a ForeignScan,
in other words, a case when we can describe 1:1 relationship between a table
and a foreign-table; being alternatively scanned.

Is it possible to fit a case when a ForeignScan replaces a built-in Join plans?
I don't think it is a realistic assumption to set up lookaside configuration
for all the possible combination of joins, preliminary.

I have an idea; if lookaside accept a function, foreign-server or something
subjective entity as an alternative path, it will be able to create paths
on the fly, not only preconfigured foreign-tables.
This idea will take two forms of DDL commands as:

CREATE LOOKASIDE <name> ON <target reltaion>
TO <alternative table/matview/foreign table/...>;

CREATE LOOKASIDE <name> ON <target relation>
EXECUTE <path generator function>;

Things to do internally is same. TO- form kicks a built-in routine, instead
of user defined function, to add alternative scan/join paths according to
the supplied table/matview/foreign table and so on.

* ForeignScan node that is not associated with a particular foreign-table.
Once we try to apply ForeignScan node instead of Sort or Aggregate,

existing

FDW implementation needs to be improved. These nodes scan on a

materialized

relation (generated on the fly), however, existing FDW code assumes
ForeignScan node is always associated with a particular foreign-table.
We need to eliminate this restriction.

I don't think we need to do that, given the above.

It makes a problem if ForeignScan is chosen as alternative path of Join.

The target-list of Join node are determined according to the query form
on the fly, so we cannot expect a particular TupleDesc to be returned
preliminary. Once we try to apply ForeignScan instead of Join node, it
has to have its TupleDesc depending on a set of joined relations.

I think, it is more straightforward approach to allow ForeignScan that
is not associated to a particular (cataloged) relations.

* FDW method for MultiExec. In case when we can stack multiple ForeignScan
nodes, it's helpful to support to exchange scanned tuples in their own
data format. Let's assume two ForeignScan nodes are stacked. One

performs

like Sort, another performs like Scan. If they internally handle column-
oriented data format, TupleTableSlot is not a best way for data

exchange.

I agree TupleTableSlot may not be best way for bulk data movement. We
probably need to look at buffering/bulk movement between executor nodes
in general, which would be of benefit for the FDW case also.
This would be a problem even for Custom Scans as originally presented also,
so I don't see much change there.

Yes. I is the reason why my Custom Scan proposition supports MultiExec method.

* Lookaside on the INSERT/UPDATE/DELETE. Probably, it can be implemented
using writable FDW feature. Not a big issue, but don't forget it...

Yes, possible.

I hope these ideas make sense. This is early days and there may be other
ideas and much detail yet to come.

I'd like to agree general direction. My biggest concern towards FDW is
transparency for application. If lookaside allows to redirect a reference
towards scan/join on regular relations by ForeignScan (as an alternative
method to execute), here is no strong reason to stick on custom-plan.

However, existing ForeignScan node does not support to work without
a particular foreign table. It may become a restriction if we try to
replace Join node by ForeignScan, and it is my worry.
(Even it may be solved during Join replacement by FDW works.)

One other point I noticed.

* SubPlan support; if an extension support its special logic to join relations,
but don't want to support various method to scan relations, it is natural to
leverage built-in scan logics (like SeqScan, ...).
I want ForeignScan to support to have SubPlans if FDW driver has capability.
I believe it can be implemented according to the existing manner, but we
need to expose several static functions to handle plan-tree recursively.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Simon Riggs
simon@2ndQuadrant.com
In reply to: KaiGai Kohei (#11)
Re: [v9.5] Custom Plan API

On 8 May 2014 01:49, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

* ForeignScan node that is not associated with a particular foreign-table.
Once we try to apply ForeignScan node instead of Sort or Aggregate,

existing

FDW implementation needs to be improved. These nodes scan on a

materialized

relation (generated on the fly), however, existing FDW code assumes
ForeignScan node is always associated with a particular foreign-table.
We need to eliminate this restriction.

I don't think we need to do that, given the above.

It makes a problem if ForeignScan is chosen as alternative path of Join.

The target-list of Join node are determined according to the query form
on the fly, so we cannot expect a particular TupleDesc to be returned
preliminary. Once we try to apply ForeignScan instead of Join node, it
has to have its TupleDesc depending on a set of joined relations.

I think, it is more straightforward approach to allow ForeignScan that
is not associated to a particular (cataloged) relations.

From your description, my understanding is that you would like to
stream data from 2 standard tables to the GPU, then perform a join on
the GPU itself.

I have been told that is not likely to be useful because of the data
transfer overheads.

Or did I misunderstand, and that this is intended to get around the
current lack of join pushdown into FDWs?

Can you be specific about the actual architecture you wish for, so we
can understand how to generalise that into an API?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Simon Riggs
simon@2ndQuadrant.com
In reply to: Stephen Frost (#10)
Re: [v9.5] Custom Plan API

On 7 May 2014 18:39, Stephen Frost <sfrost@snowman.net> wrote:

* Simon Riggs (simon@2ndQuadrant.com) wrote:

On 7 May 2014 17:43, Stephen Frost <sfrost@snowman.net> wrote:

It's the optimizer's job to figure out which path to pick though, based
on which will have the lowest cost.

Of course. I'm not suggesting otherwise.

If do you want that, you can write an Event Trigger that automatically
adds a lookaside for any table.

This sounds terribly ugly and like we're pushing optimization decisions
on to the user instead of just figuring out what the best answer is.

I'm proposing that we use a declarative approach, just like we do when
we say CREATE INDEX.

There's quite a few trade-offs when it comes to indexes though. I'm
trying to figure out when you wouldn't want to use a GPU, if it's
available to you and the cost model says it's faster? To me, that's
kind of like saying you want a declarative approach for when to use a
HashJoin.

I'm proposing something that is like an index, not like a plan node.

The reason that proposal is being made is that we need to consider
data structure, data location and processing details.

* In the case of Mat Views, if there is no Mat View, then we can't use
it - we can't replace that with just any mat view instead
* GPUs and other special processing units have finite data transfer
rates, so other people have proposed that they retain data on the
GPU/SPU - so we want to do a lookaside only for situations where the
data is already prepared to handle a lookaside.
* The other cases I cited of in-memory data structures are all
pre-arranged items with structures suited to processing particular
types of query

Given that I count 4-5 beneficial use cases for this index-like
lookaside, it seems worth investing time in.

It appears that Kaigai wishes something else in addition to this
concept, so there may be some confusion from that. I'm sure it will
take a while to really understand all the ideas and possibilities.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#12)
Re: [v9.5] Custom Plan API

Simon,

* Simon Riggs (simon@2ndQuadrant.com) wrote:

On 8 May 2014 01:49, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

From your description, my understanding is that you would like to

stream data from 2 standard tables to the GPU, then perform a join on
the GPU itself.

I have been told that is not likely to be useful because of the data
transfer overheads.

That was my original understanding and, I believe, the case at one
point, however...

Or did I misunderstand, and that this is intended to get around the
current lack of join pushdown into FDWs?

I believe the issue with the transfer speeds to the GPU have been either
eliminated or at least reduced to the point where it's practical now.
This is all based on prior discussions with KaiGai- I've not done any
testing myself. In any case, this is exactly what they're looking to
do, as I understand it, and to do the same with aggregates that work
well on GPUs.

Can you be specific about the actual architecture you wish for, so we
can understand how to generalise that into an API?

It's something that *could* be done with FDWs, once they have the
ability to have join push-down and aggregate push-down, but I (and, as I
understand it, Tom) feel isn't really the right answer for this because
the actual *data* is completely under PG in this scenario. It's just
in-memory processing that's being done on the GPU and in the GPU's
memory.

KaiGai has speculated about other possibilities (eg: having the GPU's
memory also used as some kind of multi-query cache, which would reduce
the transfer costs, but at a level of complexity regarding that cache
that I'm not sure it'd be sensible to try and do and, in any case, could
be done later and might make sense independently, if we could make it
work for, say, a memcached environment too; I'm thinking it would be
transaction-specific, but even that would be pretty tricky unless we
held locks across every row...).

Thanks,

Stephen

#15Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#13)
Re: [v9.5] Custom Plan API

Simon,

* Simon Riggs (simon@2ndQuadrant.com) wrote:

I'm proposing something that is like an index, not like a plan node.

The reason that proposal is being made is that we need to consider
data structure, data location and processing details.

* In the case of Mat Views, if there is no Mat View, then we can't use
it - we can't replace that with just any mat view instead

I agree with you about MatView's. There are clear trade-offs there,
similar to those with indexes.

* GPUs and other special processing units have finite data transfer
rates, so other people have proposed that they retain data on the
GPU/SPU - so we want to do a lookaside only for situations where the
data is already prepared to handle a lookaside.

I've heard this and I'm utterly unconvinced that it could be made to
work at all- and it's certainly moving the bar of usefullness quite far
away, making the whole thing much less practical. If we can't cost for
this transfer rate and make use of GPUs for medium-to-large size queries
which are only transient, then perhaps shoving all GPU work out across
an FDW is actually the right solution, and make that like some kind of
MatView as you're proposing- but I don't see how you're going to manage
updates and invalidation of that data in a sane way for a multi-user PG
system.

* The other cases I cited of in-memory data structures are all
pre-arranged items with structures suited to processing particular
types of query

If it's transient in-memory work, I'd like to see our generalized
optimizer consider them all instead of pushing that job on the user to
decide when the optimizer should consider certain methods.

Given that I count 4-5 beneficial use cases for this index-like
lookaside, it seems worth investing time in.

I'm all for making use of MatViews and GPUs, but there's more than one
way to get there and look-asides feels like pushing the decision,
unnecessarily, on to the user.

Thanks,

Stephen

#16Shigeru Hanada
shigeru.hanada@gmail.com
In reply to: KaiGai Kohei (#5)
Re: [v9.5] Custom Plan API

2014-05-07 18:06 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:

Let me list up the things to be clarified / developed randomly.

* Join replacement by FDW; We still don't have consensus about join replacement
by FDW. Probably, it will be designed to remote-join implementation primarily,
however, things to do is similar. We may need to revisit the Hanada-san's
proposition in the past.

I can't recall the details soon but the reason I gave up was about
introducing ForiegnJoinPath node, IIRC. I'll revisit the discussion
and my proposal.
--
Shigeru HANADA

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Simon Riggs (#12)
Re: [v9.5] Custom Plan API

* ForeignScan node that is not associated with a particular

foreign-table.

Once we try to apply ForeignScan node instead of Sort or
Aggregate,

existing

FDW implementation needs to be improved. These nodes scan on a

materialized

relation (generated on the fly), however, existing FDW code assumes
ForeignScan node is always associated with a particular

foreign-table.

We need to eliminate this restriction.

I don't think we need to do that, given the above.

It makes a problem if ForeignScan is chosen as alternative path of Join.

The target-list of Join node are determined according to the query
form on the fly, so we cannot expect a particular TupleDesc to be
returned preliminary. Once we try to apply ForeignScan instead of Join
node, it has to have its TupleDesc depending on a set of joined relations.

I think, it is more straightforward approach to allow ForeignScan that
is not associated to a particular (cataloged) relations.

From your description, my understanding is that you would like to stream
data from 2 standard tables to the GPU, then perform a join on the GPU itself.

I have been told that is not likely to be useful because of the data transfer
overheads.

Here are two solutions. One is currently I'm working; in case when number
of rows in left- and right- tables are not balanced well, we can keep a hash
table in the GPU DRAM, then we transfer the data stream chunk-by-chunk from
the other side. Kernel execution and data transfer can be run asynchronously,
so it allows to hide data transfer cost as long as we have enough number of
chunks, like processor pipelining.
Other solution is "integrated" GPU that kills necessity of data transfer,
like Intel's Haswell, AMD's Kaveri or Nvidia's Tegra K1; all majors are
moving to same direction.

Or did I misunderstand, and that this is intended to get around the current
lack of join pushdown into FDWs?

The logic above is obviously executed on the extension side, so it needs
ForeignScan node to perform like Join node; that reads two input relation
streams and output one joined relation stream.

It is quite similar to expected FDW join-pushdown design. It will consume
(remote) two relations and generates one output stream; looks like a scan
on a particular relation (but no catalog definition here).

Probably, it shall be visible to local backend as follows:
(it is a result of previous prototype based on custom-plan api)

postgres=# EXPLAIN VERBOSE SELECT count(*) FROM
pgbench1_branches b JOIN pgbench1_accounts a ON a.bid = b.bid WHERE aid < 100;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=101.60..101.61 rows=1 width=0)
Output: count(*)
-> Custom Scan (postgres-fdw) (cost=100.00..101.43 rows=71 width=0)
Remote SQL: SELECT NULL FROM (public.pgbench_branches r1 JOIN public.pgbench_accounts r2 ON ((r1.bid = r2.bid))) WHERE ((r2.aid < 100))
(4 rows)

The place of "Custom Scan" node will be ForeignScan, if Join pushdown got supported.
At that time, what relation should be scanned by this ForeignScan?
It is the reason why I proposed ForeignScan node without particular relation.

Can you be specific about the actual architecture you wish for, so we can
understand how to generalise that into an API?

If we push the role of CustomPlan node into ForeignScan, I want to use this node
to acquire control during query planning/execution.

As I did in the custom-plan patch, first of all, I want extension to have
a chance to add alternative path towards particular scan/join.
If extension can take over the execution, it will generate a ForeignPath
(or CustomPath) node then call add_path(). As usual manner, planner decide
whether the alternative path is cheaper than other candidates.

In case when it replaced scan relation by ForeignScan, it is almost same as
existing API doing, except for the underlying relation is regular one, not
foreign table.

In case when it replaced join relations by ForeignScan, it will be almost
same as expected ForeignScan with join-pushed down. Unlike usual table scan,
it does not have actual relation definition on catalog, and its result
tuple-slot is determined on the fly.
One thing different from the remote-join is, this ForeignScan node may have
sub-plans locally, if FDW driver (e.g GPU execution) may have capability on
Join only, but no relation scan portion.
So, unlike its naming, I want ForeignScan to support to have sub-plans if
FDW driver supports the capability.

Does it make you clear? Or, makes you more confused??

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Simon Riggs
simon@2ndQuadrant.com
In reply to: KaiGai Kohei (#17)
Re: [v9.5] Custom Plan API

On 8 May 2014 04:33, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

From your description, my understanding is that you would like to stream
data from 2 standard tables to the GPU, then perform a join on the GPU itself.

I have been told that is not likely to be useful because of the data transfer
overheads.

Here are two solutions. One is currently I'm working; in case when number
of rows in left- and right- tables are not balanced well, we can keep a hash
table in the GPU DRAM, then we transfer the data stream chunk-by-chunk from
the other side. Kernel execution and data transfer can be run asynchronously,
so it allows to hide data transfer cost as long as we have enough number of
chunks, like processor pipelining.

Makes sense to me, thanks for explaining.

The hardware-enhanced hash join sounds like a great idea.

My understanding is we would need

* a custom cost-model
* a custom execution node

The main question seems to be whether doing that would be allowable,
cos its certainly doable.

I'm still looking for a way to avoid adding planning time for all
queries though.

Other solution is "integrated" GPU that kills necessity of data transfer,
like Intel's Haswell, AMD's Kaveri or Nvidia's Tegra K1; all majors are
moving to same direction.

Sounds useful, but very non-specific, as yet.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Simon Riggs
simon@2ndQuadrant.com
In reply to: Stephen Frost (#15)
Re: [v9.5] Custom Plan API

On 8 May 2014 03:36, Stephen Frost <sfrost@snowman.net> wrote:

Given that I count 4-5 beneficial use cases for this index-like
lookaside, it seems worth investing time in.

I'm all for making use of MatViews and GPUs, but there's more than one
way to get there and look-asides feels like pushing the decision,
unnecessarily, on to the user.

I'm not sure I understand where most of your comments come from, so
its clear we're not talking about the same things yet.

We have multiple use cases where an alternate data structure could be
used to speed up queries.

My goal is to use the alternate data structure(s)

1) if the data structure contains matching data for the current query
2) only when the user has explicitly stated it would be correct to do
so, and they wish it
3) transparently to the application, rather than forcing them to recode
4) after fully considering cost-based optimization, which we can only
do if it is transparent

all of which is how mat views work in other DBMS. My additional requirement is

5) allow this to work with data structures outside the normal
heap/index/block structures, since we have multiple already working
examples of such things and many users wish to leverage those in their
applications

which I now understand is different from the main thrust of Kaigai's
proposal, so I will restate this later on another thread.

The requirement is similar to the idea of running

CREATE MATERIALIZED VIEW foo
BUILD DEFERRED
REFRESH COMPLETE
ON DEMAND
ENABLE QUERY REWRITE
ON PREBUILT TABLE

but expands on that to encompass any external data structure.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#4)
Re: [v9.5] Custom Plan API

On Wed, May 7, 2014 at 4:01 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Agreed. My proposal is that if the planner allows the lookaside to an
FDW then we pass the query for full execution on the FDW. That means
that the scan, aggregate and join could take place via the FDW. i.e.
"Custom Plan" == lookaside + FDW

Or put another way, if we add Lookaside then we can just plug in the
pgstrom FDW directly and we're done. And everybody else's FDW will
work as well, so Citus etcc will not need to recode.

As Stephen notes downthread, Tom has already expressed opposition to
this idea on other threads, and I tend to agree with him, at least to
some degree. I think the drive to use foreign data wrappers for
PGStrom, CitusDB, and other things that aren't really foreign data
wrappers as originally conceived is a result of the fact that we've
got only one interface in this area that looks remotely like something
pluggable; and so everyone's trying to fit things into the constraints
of that interface whether it's actually a good fit or not.
Unfortunately, I think what CitusDB really wants is pluggable storage,
and what PGStrom really wants is custom paths, and I don't think
either of those things is the same as what FDWs provide.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#19)
#22Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#20)
#23Simon Riggs
simon@2ndQuadrant.com
In reply to: Stephen Frost (#21)
#24Simon Riggs
simon@2ndQuadrant.com
In reply to: KaiGai Kohei (#17)
#25Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#23)
#26KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Robert Haas (#20)
#27Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#24)
#28Simon Riggs
simon@2ndQuadrant.com
In reply to: Stephen Frost (#25)
#29Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#2)
#30Simon Riggs
simon@2ndQuadrant.com
In reply to: Stephen Frost (#27)
#31Simon Riggs
simon@2ndQuadrant.com
In reply to: Stephen Frost (#29)
#32Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#30)
#33Simon Riggs
simon@2ndQuadrant.com
In reply to: Stephen Frost (#32)
#34Simon Riggs
simon@2ndQuadrant.com
In reply to: KaiGai Kohei (#1)
#35Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#34)
#36Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#35)
#37Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#36)
#38Simon Riggs
simon@2ndQuadrant.com
In reply to: Stephen Frost (#35)
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#36)
#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#37)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#37)
#42Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#40)
In reply to: KaiGai Kohei (#26)
#44Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#39)
#45KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Tom Lane (#39)
#46KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: KaiGai Kohei (#45)
#47KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Peter Geoghegan (#43)
#48Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#37)
#49KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Stephen Frost (#48)
#50Stephen Frost
sfrost@snowman.net
In reply to: Peter Geoghegan (#43)
#51Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#44)
#52Stephen Frost
sfrost@snowman.net
In reply to: KaiGai Kohei (#45)
#53KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Stephen Frost (#52)
#54Stephen Frost
sfrost@snowman.net
In reply to: KaiGai Kohei (#49)
#55Stephen Frost
sfrost@snowman.net
In reply to: KaiGai Kohei (#53)
#56KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Stephen Frost (#55)
In reply to: Stephen Frost (#50)
#58Stephen Frost
sfrost@snowman.net
In reply to: KaiGai Kohei (#56)
#59Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#51)
#60Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#59)
#61Simon Riggs
simon@2ndQuadrant.com
In reply to: Stephen Frost (#48)
#62Stephen Frost
sfrost@snowman.net
In reply to: Simon Riggs (#61)
#63Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#40)
#64KaiGai Kohei
kaigai@ak.jp.nec.com
In reply to: Simon Riggs (#63)