SQL/MED - core functionality

Started by Shigeru HANADAover 15 years ago61 messageshackers
Jump to latest
#1Shigeru HANADA
hanada@metrosystems.co.jp

Hi hackers,

Attached is a patch that adds core functionality of SQL/MED. This
patch provides:

* new option HANDLER for FOREIGN DATA WRAPPER
* CREATE/ALTER DDLs are supported
* psql \dew command shows handler option too
* pg_dump can dump HANDLER option

* new object type FOREIGN TABLE
* CREATE/ALTER/DROP DDLs are supported
* system columns except TABLEOID are not supported
* inheriting normal table is supported
* psql \d shows detail of foreign tables
* psql \det lists foreign tables
* psql \dE lists foreign tables in \d format
* pg_dump can dump the definition
* information_schema views added
* foreign table is read-only, so INSERT/UPDATE/DELETE are denied
* ANALYZE and VACUUM skips foreign tables

* new executor node ForeignScan
* it's a counterpart of SeqScan
* this node scans one foreign table at a time
* FDW HANDLER is necessary to execute SELECT statement

Patches for FDWs which can be used to execute SELECT statement will be
posted in their own thread soon.

"SQL/MED - file_fdw" : FDW for external PostgreSQL
"SEL/MED - postgresql_fdw" : FDW for server-side file (CSV, TEXT)

I would reuse existing CommitFest item "SQL/MED" for this patch, and
add new item for each FDW patch.

Regards,
--
Shigeru Hanada

Attachments:

fdw_core.patch.gzapplication/octet-stream; name=fdw_core.patch.gzDownload
#2Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Shigeru HANADA (#1)
Re: SQL/MED - core functionality

On 25.11.2010 09:34, Shigeru HANADA wrote:

Attached is a patch that adds core functionality of SQL/MED. This
patch provides:

* new option HANDLER for FOREIGN DATA WRAPPER
* CREATE/ALTER DDLs are supported
* psql \dew command shows handler option too
* pg_dump can dump HANDLER option

* new object type FOREIGN TABLE
* CREATE/ALTER/DROP DDLs are supported
* system columns except TABLEOID are not supported
* inheriting normal table is supported
* psql \d shows detail of foreign tables
* psql \det lists foreign tables
* psql \dE lists foreign tables in \d format
* pg_dump can dump the definition
* information_schema views added
* foreign table is read-only, so INSERT/UPDATE/DELETE are denied
* ANALYZE and VACUUM skips foreign tables

* new executor node ForeignScan
* it's a counterpart of SeqScan
* this node scans one foreign table at a time
* FDW HANDLER is necessary to execute SELECT statement

Patches for FDWs which can be used to execute SELECT statement will be
posted in their own thread soon.

"SQL/MED - file_fdw" : FDW for external PostgreSQL
"SEL/MED - postgresql_fdw" : FDW for server-side file (CSV, TEXT)

I would reuse existing CommitFest item "SQL/MED" for this patch, and
add new item for each FDW patch.

Looking at the API again, there's a few things I don't like about it:

* It's tied to the ForeignScanState, so all the executor state
structures are exposed to the FDW implementation. It feels like a
modularity violation that the FDW Iterate function returns the tuple by
storing it directly in scanstate->ss.ss_ScanTupleSlot for example. And
it's not going to work for remote scans that don't go through the
executor, for example if you wanted to rewrite contrib/dblink to use
foreign data wrappers. Or the SQL/MED passthrough mode.

* There's no clear Plan stage in the API. Except for EstimateCosts,
which just fills in the estimated costs in RelOptInfo, so it needs to
understand quite a lot of the planner data structures to come up with a
reasonable estimate. But if it e.g wants to apply a qual remotely, like
the PostgreSQL FDW does, it has to check for such quals at execution
time. And as I complained before, you don't get any meaningful EXPLAIN
output.

I propose the attached API instead. This has a clear separation between
plan and execution. I'm sure we'll have to extend the API in the future
FDWs want tighter integration, but I think this is a good start. It
makes it quite straightforward to write simple FDW like the file FDW,
without having to know anything about the executor or planner internals,
but provides enough flexibility to cover the functionality in your
PostgreSQL FDW.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachments:

fdwapi.htext/x-chdr; name=fdwapi.hDownload
#3Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Heikki Linnakangas (#2)
Re: SQL/MED - core functionality

On Thu, Nov 25, 2010 at 22:03, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I propose the attached API instead. This has a clear separation between plan
and execution.

The APIs seem to be cleaner. The previous ones might be too straight
implementation of the SQL standard.

But I have some questions about the new APIs:
1. Doesn't FdwPlan need to inherit Plan struct?
2. Doesn't FdwPlan need to support copyObject()?
3. If "Datum *values, bool *isnulls" is the better interface,
why do we use TupleTableSlot? We might have the similar issue
in the index-only scan; it also handles virtual tuples.

--
Itagaki Takahiro

#4Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Itagaki Takahiro (#3)
Re: SQL/MED - core functionality

On 25.11.2010 16:16, Itagaki Takahiro wrote:

On Thu, Nov 25, 2010 at 22:03, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I propose the attached API instead. This has a clear separation between plan
and execution.

The APIs seem to be cleaner. The previous ones might be too straight
implementation of the SQL standard.

But I have some questions about the new APIs:
1. Doesn't FdwPlan need to inherit Plan struct?
2. Doesn't FdwPlan need to support copyObject()?

No. You'll need a ForeignScan object in the planner that supports
copyObject(), just like in your patch. ForeignScan points to the
FdwPlan, but the FDW doesn't need to know anything about that stuff.

I left out some details on what exactly FdwPlan should contain and what
it's lifecycle should be. I'm thinking that it should be allocated in
the CurrentMemoryContext that's active when the FDW Plan routine is
called, which would be the same context where we store all the Plan
objects. It should not be modified after creation, so that it doesn't
need to be copied when the ForeignScan is copied with copyObject(). It
should not contain transient state information like connection objects,
or references to a remotely prepared cursor etc. It must be possible to
call BeginScan multiple times with the same FdwPlan object, so that it
can be stored in a prepared plan that is executed multiple times.

For a typical case like the PostgreSQL FDW, it would contain the foreign
server's OID, and the constructed SQL query that will be sent to the
remote server on execution. For the file FDW, it will probably contain
the filename, and the format options in some pre-parsed format.

3. If "Datum *values, bool *isnulls" is the better interface,
why do we use TupleTableSlot?

I'm not wedded to that part, but in general, the less the FDW needs to
know about PostgreSQL internals the better. There's performance gain
from passing a TupleTableSlot to the FDW, but the ForeignScan node will
certainly store the datums/isnulls array to a TupleTableSlot to pass on
the tuple.

We might have the similar issue
in the index-only scan; it also handles virtual tuples.

Index-only scans are a very different story, that's going to be tightly
internal to the planner and executor, there's no externally-visible API
there.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#4)
Re: SQL/MED - core functionality

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

I left out some details on what exactly FdwPlan should contain and what
it's lifecycle should be. I'm thinking that it should be allocated in
the CurrentMemoryContext that's active when the FDW Plan routine is
called, which would be the same context where we store all the Plan
objects. It should not be modified after creation, so that it doesn't
need to be copied when the ForeignScan is copied with copyObject(). It
should not contain transient state information like connection objects,
or references to a remotely prepared cursor etc. It must be possible to
call BeginScan multiple times with the same FdwPlan object, so that it
can be stored in a prepared plan that is executed multiple times.

The above statements seem mutually contradictory. In particular,
I think you're proposing that copyObject copy only a pointer and not the
whole plan tree when copying ForeignScan. That is entirely
unworkable/unacceptable: quite aside from the semantic ugliness, it will
fail altogether for cached plans.

regards, tom lane

#6Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#5)
Re: SQL/MED - core functionality

On 25.11.2010 18:18, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

I left out some details on what exactly FdwPlan should contain and what
it's lifecycle should be. I'm thinking that it should be allocated in
the CurrentMemoryContext that's active when the FDW Plan routine is
called, which would be the same context where we store all the Plan
objects. It should not be modified after creation, so that it doesn't
need to be copied when the ForeignScan is copied with copyObject(). It
should not contain transient state information like connection objects,
or references to a remotely prepared cursor etc. It must be possible to
call BeginScan multiple times with the same FdwPlan object, so that it
can be stored in a prepared plan that is executed multiple times.

The above statements seem mutually contradictory. In particular,
I think you're proposing that copyObject copy only a pointer and not the
whole plan tree when copying ForeignScan.

Right.

That is entirely
unworkable/unacceptable: quite aside from the semantic ugliness, it will
fail altogether for cached plans.

Hmm, I see, cached plans are planned in a shorter-lived context first,
and copied to permanent storage afterwards. Needs more thought then.
Maybe the FDW needs to provide a copyFdwPlan() function to copy FdwPlans
returned by that FDW.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#6)
Re: SQL/MED - core functionality

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Hmm, I see, cached plans are planned in a shorter-lived context first,
and copied to permanent storage afterwards. Needs more thought then.
Maybe the FDW needs to provide a copyFdwPlan() function to copy FdwPlans
returned by that FDW.

Or just specify a format for the extra information. Perhaps it could be
thought of as being a value of type bytea? Obviously we can't just have
a fixed amount of info, but maybe a blob with a length word is enough.

regards, tom lane

#8Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#7)
Re: SQL/MED - core functionality

On 25.11.2010 18:28, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

Hmm, I see, cached plans are planned in a shorter-lived context first,
and copied to permanent storage afterwards. Needs more thought then.
Maybe the FDW needs to provide a copyFdwPlan() function to copy FdwPlans
returned by that FDW.

Or just specify a format for the extra information. Perhaps it could be
thought of as being a value of type bytea? Obviously we can't just have
a fixed amount of info, but maybe a blob with a length word is enough.

That seems quite awkward to work with. Let's at least make it a Node *,
so that you can store a Value or List there, or anything else that
already has copyObject support.

I think the PostgreSQL FDW would want to store the remote query there.
But it's not a stretch that you want to use parameter markers in the
remote query, with the parameter values determined at runtime. In that
case you'd also store a list of Exprs for the parameter values (Hmm,
BeginScan needs an ExprContext for that..). This is very hand-wavy, but
I think we'll hit the wall with a single blob pretty quickly.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#8)
Re: SQL/MED - core functionality

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 25.11.2010 18:28, Tom Lane wrote:

Or just specify a format for the extra information. Perhaps it could be
thought of as being a value of type bytea? Obviously we can't just have
a fixed amount of info, but maybe a blob with a length word is enough.

That seems quite awkward to work with. Let's at least make it a Node *,
so that you can store a Value or List there, or anything else that
already has copyObject support.

Yeah, that works. A struct could be emulated by using a List with a
known order of elements. If someone did need a binary blob, they could
represent it as a Const of type bytea.

regards, tom lane

#10Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Tom Lane (#9)
Re: SQL/MED - core functionality

Thanks for the comments.

I'll revise the patch along the discussion. Before starting code work,
please let me summarize the discussion.

* Generally, we should keep FDWs away from PostgreSQL internals,
such as TupleTableSlot.

* FDW should have planner hook which allows FDW to create FDW-specific
plan (FdwPlan in Heikki's proposal) for a scan on a foreign table.

* FdwPlan, a part of ForeignScan plan node, should be able to be
copied in generic way because plans would be copied into another
memory context during caching. It might be better to represent
FdwPlan with Node or List.

* FdwExecutionState, a part of ForeignScanState, should be used
instead of ForeignScanState to remove executor details from FDW
implementation.
# ISTM that FdwExecutionState would be replace FdwReply.

Regards,
--
Shigeru Hanada

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Shigeru HANADA (#10)
Re: SQL/MED - core functionality

Shigeru HANADA <hanada@metrosystems.co.jp> writes:

I'll revise the patch along the discussion. Before starting code work,
please let me summarize the discussion.

* Generally, we should keep FDWs away from PostgreSQL internals,
such as TupleTableSlot.

* FDW should have planner hook which allows FDW to create FDW-specific
plan (FdwPlan in Heikki's proposal) for a scan on a foreign table.

* FdwPlan, a part of ForeignScan plan node, should be able to be
copied in generic way because plans would be copied into another
memory context during caching. It might be better to represent
FdwPlan with Node or List.

* FdwExecutionState, a part of ForeignScanState, should be used
instead of ForeignScanState to remove executor details from FDW
implementation.
# ISTM that FdwExecutionState would be replace FdwReply.

FWIW, I think that the notion that FDW can be "kept away from Postgres
internals" is ridiculous on its face. Re-read the above list and ask
yourself exactly which of the bullet points above are not talking about
being hip-deep in Postgres internals. In particular, arbitrarily
avoiding use of TupleTableSlot is silly. It's a fundamental part of
many executor APIs.

It would be a good idea to avoid use of internals in the wire protocol
to another Postgres database; but the interfaces to the local database
can hardly avoid that, and we shouldn't bend them out of shape to meet
entirely-arbitrary requirements about what to use or not use.

regards, tom lane

#12Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#11)
Re: SQL/MED - core functionality

On Fri, Nov 26, 2010 at 11:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Shigeru HANADA <hanada@metrosystems.co.jp> writes:

I'll revise the patch along the discussion.  Before starting code work,
please let me summarize the discussion.

* Generally, we should keep FDWs away from PostgreSQL internals,
such as TupleTableSlot.

* FDW should have planner hook which allows FDW to create FDW-specific
plan (FdwPlan in Heikki's proposal) for a scan on a foreign table.

* FdwPlan, a part of ForeignScan plan node, should be able to be
copied in generic way because plans would be copied into another
memory context during caching.  It might be better to represent
FdwPlan with Node or List.

* FdwExecutionState, a part of ForeignScanState, should be used
instead of ForeignScanState to remove executor details from FDW
implementation.
# ISTM that FdwExecutionState would be replace FdwReply.

FWIW, I think that the notion that FDW can be "kept away from Postgres
internals" is ridiculous on its face.  Re-read the above list and ask
yourself exactly which of the bullet points above are not talking about
being hip-deep in Postgres internals.  In particular, arbitrarily
avoiding use of TupleTableSlot is silly.  It's a fundamental part of
many executor APIs.

It would be a good idea to avoid use of internals in the wire protocol
to another Postgres database; but the interfaces to the local database
can hardly avoid that, and we shouldn't bend them out of shape to meet
entirely-arbitrary requirements about what to use or not use.

But there's probably some value in minimizing the number of
unnecessary interfaces that get exposed to the plugins. There's no
help for the fact that the FDWs are going to need about Datums, but do
we gain anything by making them also know about TupleTableSlot? If
so, fine; if not, don't.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#13Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Shigeru HANADA (#1)
Re: SQL/MED - core functionality

2010/11/25 Shigeru HANADA <hanada@metrosystems.co.jp>:

Hi hackers,

Attached is a patch that adds core functionality of SQL/MED.  This
patch provides:

   "SQL/MED - file_fdw"       : FDW for external PostgreSQL
   "SEL/MED - postgresql_fdw" : FDW for server-side file (CSV, TEXT)

I've tried SQL/MED with postgresql_fdw last night, and found myself
confusing the long setup procedure. A simplest way to use it AFAIK is:

1.CREATE FOREIGN DATA WRAPPER ... (or run install sql script)
2.CREATE SERVER ... FOREIGN DATA WRAPPER ...
3.CREATE USER MAPPING FOR ...
4.CREATE FOREIGN TALBE( ... )

From a user's view, this is very long way to see a simplest foreign
table. I know it is based on the standard, but I really want a
shortcut. Especially, I don't understand why CREATE USER MAPPING FOR
current_user SERVER <server> is needed for default use case. If you
forget CREATE USER MAPPING and do CREATE FOREIGN TABLE, it raises an
error. User mapping is useful if the local user and remote user should
be mapped but I imagine in most cases they are the same.
postgresql_fdw can tell the remote user by conninfo string, in
addition.

This is another topic, but it would be useful if CREATE FOREIGN TABLE
can omit column definitions since fdw usually knows what should be
there in the definitions. I some times mistyped the column names
between remote and local and resulted in fail on execution.

If these are discussed before, I apologize for noise.

Basically, I would love to see this in the next release. Good work!

Regards,

--
Hitoshi Harada

#14Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Hitoshi Harada (#13)
Re: SQL/MED - core functionality

On Wed, Dec 1, 2010 at 12:30, Hitoshi Harada <umi.tanuki@gmail.com> wrote:

From a user's view, this is very long way to see a simplest foreign
table. I know it is based on the standard, but I really want a
shortcut. Especially, I don't understand why CREATE USER MAPPING FOR
current_user SERVER <server> is needed for default use case. If you
forget CREATE USER MAPPING and do CREATE FOREIGN TABLE, it raises an
error. User mapping is useful if the local user and remote user should
be mapped but I imagine in most cases they are the same.
postgresql_fdw can tell the remote user by conninfo string, in
addition.

How do you connect to the remote server when password is required?
I think we cannot pass through passwords to the remote server
even if the same user is used on the local and remote servers.

However, SERVER and USER MAPPING might be useless for file_fdw because
SERVER is always the local file system and USER is always the OS's user
who started the postmaster. I'm not sure how we should treat cases
where those settings don't have any configurations.

--
Itagaki Takahiro

#15Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Itagaki Takahiro (#14)
Re: SQL/MED - core functionality

2010/12/1 Itagaki Takahiro <itagaki.takahiro@gmail.com>:

On Wed, Dec 1, 2010 at 12:30, Hitoshi Harada <umi.tanuki@gmail.com> wrote:

From a user's view, this is very long way to see a simplest foreign
table. I know it is based on the standard, but I really want a
shortcut. Especially, I don't understand why CREATE USER MAPPING FOR
current_user SERVER <server> is needed for default use case. If you
forget CREATE USER MAPPING and do CREATE FOREIGN TABLE, it raises an
error. User mapping is useful if the local user and remote user should
be mapped but I imagine in most cases they are the same.
postgresql_fdw can tell the remote user by conninfo string, in
addition.

How do you connect to the remote server when password is required?
I think we cannot pass through passwords to the remote server
even if the same user is used on the local and remote servers.

Sure, it is limited on trust authentication only. If you need more
complicated connection, do USER MAPPING. But it's not clear to me that
it should be required in any case.

Regards,

--
Hitoshi Harada

#16Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Hitoshi Harada (#13)
Re: SQL/MED - core functionality

On Wed, 1 Dec 2010 12:30:46 +0900
Hitoshi Harada <umi.tanuki@gmail.com> wrote:

This is another topic, but it would be useful if CREATE FOREIGN TABLE
can omit column definitions since fdw usually knows what should be
there in the definitions. I some times mistyped the column names
between remote and local and resulted in fail on execution.

The SQL/MED standard includes "IMPORT FOREIGN SCHEMA schema FROM
SERVER server" syntax which imports definitions of remote tables in
the specified schema into local PostgreSQL, and you can optionally specify
the list of target tables with "LIMIT TO table_list" option.

This syntax would make defining foreign tables easier, but it needs to
enhance both FDW API and core parser.

Regards,
--
Shigeru Hanada

#17Peter Eisentraut
peter_e@gmx.net
In reply to: Hitoshi Harada (#13)
Re: SQL/MED - core functionality

On ons, 2010-12-01 at 12:30 +0900, Hitoshi Harada wrote:

I've tried SQL/MED with postgresql_fdw last night, and found myself
confusing the long setup procedure. A simplest way to use it AFAIK is:

1.CREATE FOREIGN DATA WRAPPER ... (or run install sql script)
2.CREATE SERVER ... FOREIGN DATA WRAPPER ...
3.CREATE USER MAPPING FOR ...
4.CREATE FOREIGN TALBE( ... )

From a user's view, this is very long way to see a simplest foreign
table. I know it is based on the standard, but I really want a
shortcut. Especially, I don't understand why CREATE USER MAPPING FOR
current_user SERVER <server> is needed for default use case. If you
forget CREATE USER MAPPING and do CREATE FOREIGN TABLE, it raises an
error. User mapping is useful if the local user and remote user should
be mapped but I imagine in most cases they are the same.
postgresql_fdw can tell the remote user by conninfo string, in
addition.

I reviewed the standard about this, and a lot of things are
implementation-defined. I think user mappings could be made optional.

This is another topic, but it would be useful if CREATE FOREIGN TABLE
can omit column definitions since fdw usually knows what should be
there in the definitions. I some times mistyped the column names
between remote and local and resulted in fail on execution.

Also, according to the standard, the column list in CREATE FOREIGN TABLE
is optional (if you can get it in some automatic way, of course).

#18Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Peter Eisentraut (#17)
Re: SQL/MED - core functionality

On Sun, 12 Dec 2010 23:47:53 +0200
Peter Eisentraut <peter_e@gmx.net> wrote:

On ons, 2010-12-01 at 12:30 +0900, Hitoshi Harada wrote:

From a user's view, this is very long way to see a simplest foreign
table. I know it is based on the standard, but I really want a
shortcut. Especially, I don't understand why CREATE USER MAPPING FOR
current_user SERVER <server> is needed for default use case. If you
forget CREATE USER MAPPING and do CREATE FOREIGN TABLE, it raises an
error. User mapping is useful if the local user and remote user should
be mapped but I imagine in most cases they are the same.
postgresql_fdw can tell the remote user by conninfo string, in
addition.

I reviewed the standard about this, and a lot of things are
implementation-defined. I think user mappings could be made optional.

Simple FDWs such as File FDW might not have concept of "user" on
remote side. In such case, it would be enough to control access
privilege per local user with GRANT/REVOKE SELECT statement.

This is another topic, but it would be useful if CREATE FOREIGN TABLE
can omit column definitions since fdw usually knows what should be
there in the definitions. I some times mistyped the column names
between remote and local and resulted in fail on execution.

Also, according to the standard, the column list in CREATE FOREIGN TABLE
is optional (if you can get it in some automatic way, of course).

To allow omitting column definitions for that purpose, a way to create
ero-column tables would have to be provided. New syntax which allows
FDWs to determine column definition would be necessary.

ex)
-- Create foo from the remote table foo on the server bar
CREATE FOREIGN TABLE foo SERVER bar;
-- Create zero-column table foo
CREATE FOREIGN TABLE foo () SERVER bar;

To support this feature, another hook function need to be added to FDW
API. ISTM that this feature should be considered with IMPORT SCHEMA
statement.

Regards,
--
Shigeru Hanada

#19Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Heikki Linnakangas (#2)
Re: SQL/MED - core functionality

On Thu, 25 Nov 2010 15:03:29 +0200
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

I propose the attached API instead. This has a clear separation between
plan and execution. I'm sure we'll have to extend the API in the future
FDWs want tighter integration, but I think this is a good start. It
makes it quite straightforward to write simple FDW like the file FDW,
without having to know anything about the executor or planner internals,
but provides enough flexibility to cover the functionality in your
PostgreSQL FDW.

Thanks for the comments and the proposal.

I've revised fdw_core patch along your proposal and subsequent
discussion, and tried to fix FDW patches pgsql_fdw and file_fdw
according to fdw_core. Attached is a WIP which includes changes
below.

(1) Introduce FdwPlan and FdwExecutionState
IIUC, FdwPlan should have copyObject support because it is a part of
Plan node and would be copied when the plan was cached. So FdwPlan
need to be a Node, and hold private information in List. In contrast,
FdwExecutionState.private doesn't have such limitation because it
would not be copied.

One problem to solve is that current PostgreSQL FDW needs PlanState to
generate foreign query string with deparse_expression(). It would be
able to generate SQL string from PlannerInfo and RelOptInfo in
PlanRelScan() without deparse_expression(), but it might need so many
codes.

(2) Pass ParamListInfo to BeginScan, not ForeignScanState
To handle parameters of EXECUTE statement, BeginScan need to get
ParamListInfo from EState. So I've added ParamListInfo parameter to
BeginScan.

(3) Iterate() returns result with TupleTableSlot parameter
How about receiving result with TupleTableSlot which was specified by
caller? This would allow FDWs to forget about ScanState. In this
design, the prototype of Iterate is:

void *(*Iterate)(FdwExecutionState *fstate, TupleTableSlot *slot);

(4) Support EXPLAIN information
Now EXPLAIN VERBOSE shows FDW-specific information. Here is a sample
of PostgreSQL FDW. File FDW might show filename and size.

postgres=# explain verbose select * From accounts where aid = 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Foreign Scan on local.accounts (cost=0.00..0.00 rows=1 width=100)
Output: aid, bid, abalance, filler
Filter: (accounts.aid = 1)
FDW-Info: SELECT aid, bid, abalance, filler FROM public.pgbench_accounts accounts WHERE (aid = 1)
(4 rows)

Regards,
--
Shigeru Hanada

Attachments:

fdw_core.patch.gzapplication/octet-stream; name=fdw_core.patch.gzDownload
#20Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Shigeru HANADA (#19)
Re: SQL/MED - core functionality

2010/12/14 Shigeru HANADA <hanada@metrosystems.co.jp>:

On Thu, 25 Nov 2010 15:03:29 +0200
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

I propose the attached API instead. This has a clear separation between
plan and execution. I'm sure we'll have to extend the API in the future
FDWs want tighter integration, but I think this is a good start. It
makes it quite straightforward to write simple FDW like the file FDW,
without having to know anything about the executor or planner internals,
but provides enough flexibility to cover the functionality in your
PostgreSQL FDW.

Thanks for the comments and the proposal.

I've revised fdw_core patch along your proposal and subsequent
discussion, and tried to fix FDW patches pgsql_fdw and file_fdw
according to fdw_core.

Reviewing the patch a little, it occurred to me that it might be a
good idea to split the patch into two pieces again. One is for adding
CREATE FOREIGN TABLE syntax and actual creation and the other is for
FDW APIs. ISTM syntax and and utility processing of FOREIGN TABLE has
been stable in the latest patches, and the discussion should be
concentrated on APIs. APIs change don't harm SQL interfaces. Of course
CREATE FOREIGN TABLE is not useful as alone, but it would be helpful
to be reviewed easily.

Regards,

--
Hitoshi Harada

#21Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Shigeru HANADA (#18)
#22Robert Haas
robertmhaas@gmail.com
In reply to: Shigeru HANADA (#19)
#23Pavel Stehule
pavel.stehule@gmail.com
In reply to: Robert Haas (#22)
#24Peter Eisentraut
peter_e@gmx.net
In reply to: Shigeru HANADA (#18)
#25Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Robert Haas (#22)
#26Robert Haas
robertmhaas@gmail.com
In reply to: Pavel Stehule (#23)
#27Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#26)
#28Robert Haas
robertmhaas@gmail.com
In reply to: Shigeru HANADA (#25)
#29Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#27)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#29)
#31Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#29)
#32David Fetter
david@fetter.org
In reply to: Itagaki Takahiro (#31)
#33Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#31)
#34Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Shigeru HANADA (#1)
#35Simon Riggs
simon@2ndQuadrant.com
In reply to: Shigeru HANADA (#34)
#36Andrew Dunstan
andrew@dunslane.net
In reply to: Simon Riggs (#35)
#37Simon Riggs
simon@2ndQuadrant.com
In reply to: Andrew Dunstan (#36)
#38Andrew Dunstan
andrew@dunslane.net
In reply to: Simon Riggs (#37)
#39David Fetter
david@fetter.org
In reply to: Simon Riggs (#37)
#40David Fetter
david@fetter.org
In reply to: Simon Riggs (#35)
#41Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: David Fetter (#39)
#42David Fetter
david@fetter.org
In reply to: Alvaro Herrera (#41)
#43Simon Riggs
simon@2ndQuadrant.com
In reply to: David Fetter (#39)
#44Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Simon Riggs (#35)
#45Simon Riggs
simon@2ndQuadrant.com
In reply to: Shigeru HANADA (#44)
#46Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Simon Riggs (#45)
#47Robert Haas
robertmhaas@gmail.com
In reply to: Shigeru HANADA (#34)
#48Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#47)
#49Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Robert Haas (#48)
#50Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#48)
#51Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#48)
#52Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Heikki Linnakangas (#50)
#53Robert Haas
robertmhaas@gmail.com
In reply to: Shigeru HANADA (#49)
#54Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#50)
#55Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#51)
#56Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#48)
#57Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#56)
#58Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Robert Haas (#57)
#59Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Robert Haas (#56)
#60Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Robert Haas (#56)
#61Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Shigeru HANADA (#60)