patch: SQL/MED(FDW) DDL
Hello.
This is a proposal patch for SQL/MED for 9.1.
At the prev. CF, this patch had so many features to make it hard
to review all of them. So I devided it into smaller parts:
(1) foreign table DDL support (this proposal):
- support for foreign table DDL syntax (CREATE/ALTER FOREIGN TABLE)
- Definition of FDW routine interface and system catalogs for it.
(2) SELECT support for external PostgreSQL tables.
First of all, I'd like to discuss and make agreement on the basic design
and its implementation for FDW interface.
Comments and ideas would be very appriciated, especially
on how system catalog stores the information about
foreign tables, server, and user mappings.
Detail information are described in the wiki:
http://wiki.postgresql.org/wiki/SQL/MED
I'll add this item to the CF site.
And WIP patch for (2) will be available on the page for how (1) is utilized.
- Open Issue:
- fdw connection params issue:
In this implimentation, foreign table connection params(including
passwords) are visible to all users. To avoid it, store them in the
user_mapping object, not in foreign server object.
# It should be improved, though.
- fdw security issue:
It depends on how indivisual fdw extensions are designed.
I'll focus on it after basic agreement on fdw interface.
- prev. CF page:
https://commitfest.postgresql.org/action/patch_view?id=326
- code repository:
http://repo.or.cz/w/pgsql-fdw.git (branch: foreign_table).
repository branches structures is as follows:
master
+foreign_table: (active) branch for this proposal
- foreign_scan: (active) WIP branch for select support
+fdw_ddl: (will be obsolute) branch for last CF
- dblink
- copy_from
Thanks in advance,
SAKAMOTO Masahiko
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
e-mail: sakamoto.masahiko@oss.ntt.co.jp
Attachments:
fdw_table20100914.patch.gzapplication/x-gzip; name=fdw_table20100914.patch.gzDownload+0-1
On 15/09/10 08:46, SAKAMOTO Masahiko wrote:
This is a proposal patch for SQL/MED for 9.1.
At the prev. CF, this patch had so many features to make it hard
to review all of them. So I devided it into smaller parts:
(1) foreign table DDL support (this proposal):
- support for foreign table DDL syntax (CREATE/ALTER FOREIGN TABLE)
- Definition of FDW routine interface and system catalogs for it.
(2) SELECT support for external PostgreSQL tables.First of all, I'd like to discuss and make agreement on the basic design
and its implementation for FDW interface.
Comments and ideas would be very appriciated, especially
on how system catalog stores the information about
foreign tables, server, and user mappings.
In my mind the key question is: what does the API for writing foreign
data wrappers look like? I couldn't figure that out by looking at the patch.
The API needs to be simple and version-independent, so that you can
write simple wrappers like the flat file wrapper easily. At the same
time, it needs to be very flexible, so that it allows safely pushing
down all kinds constructs like functions, aggregates and joins. The
planner needs to know which constructs the wrapper can handle and get
cost estimates for the foreign scans. Those two goals are at odds.
I've been thinking about that for some time. Here's one sketch I made a
while ago:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg00395.php.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
2010/9/15 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:
In my mind the key question is: what does the API for writing foreign
data wrappers look like? I couldn't figure that out by looking at the patch.The API needs to be simple and version-independent, so that you can
write simple wrappers like the flat file wrapper easily. At the same
time, it needs to be very flexible, so that it allows safely pushing
down all kinds constructs like functions, aggregates and joins. The
planner needs to know which constructs the wrapper can handle and get
cost estimates for the foreign scans. Those two goals are at odds.I've been thinking about that for some time. Here's one sketch I made a
while ago:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg00395.php.
I wonder if we might not think of the foreign data wrapper problem as
an instance of an even more general problem. Right now, we have
semi-pluggable index access methods - it's probably not quite possible
to implement one as a standalone module because of XLOG, and maybe
other reasons, but at least there's some abstraction layer there.
Perhaps we should be thinking about a similar facility of table-access
methods. What if someone wants to implement column-oriented storage,
or index-organized tables, or or tables that are really slow under
heavy write loads but can instantaneously compute SELECT COUNT(*) FROM
table, or mauve-colored tables with magic pixie dust? I don't want to
raise the bar for this project to the point where we can never get it
off the ground, but if there's a way to avoid baking in the assumption
that only foreign tables can ever have special capabilities, that
might be valuable.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On 15/09/10 21:21, Robert Haas wrote:
I wonder if we might not think of the foreign data wrapper problem as
an instance of an even more general problem. Right now, we have
semi-pluggable index access methods - it's probably not quite possible
to implement one as a standalone module because of XLOG, and maybe
other reasons, but at least there's some abstraction layer there.
Perhaps we should be thinking about a similar facility of table-access
methods. What if someone wants to implement column-oriented storage,
or index-organized tables, or or tables that are really slow under
heavy write loads but can instantaneously compute SELECT COUNT(*) FROM
table, or mauve-colored tables with magic pixie dust? I don't want to
raise the bar for this project to the point where we can never get it
off the ground, but if there's a way to avoid baking in the assumption
that only foreign tables can ever have special capabilities, that
might be valuable.
Well, you could implement all that as a foreign data wrappers. Tables
made out of pixie dust feels pretty foreign to me ;-).
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Wed, Sep 15, 2010 at 2:28 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
On 15/09/10 21:21, Robert Haas wrote:
I wonder if we might not think of the foreign data wrapper problem as
an instance of an even more general problem. Right now, we have
semi-pluggable index access methods - it's probably not quite possible
to implement one as a standalone module because of XLOG, and maybe
other reasons, but at least there's some abstraction layer there.
Perhaps we should be thinking about a similar facility of table-access
methods. What if someone wants to implement column-oriented storage,
or index-organized tables, or or tables that are really slow under
heavy write loads but can instantaneously compute SELECT COUNT(*) FROM
table, or mauve-colored tables with magic pixie dust? I don't want to
raise the bar for this project to the point where we can never get it
off the ground, but if there's a way to avoid baking in the assumption
that only foreign tables can ever have special capabilities, that
might be valuable.Well, you could implement all that as a foreign data wrappers. Tables made
out of pixie dust feels pretty foreign to me ;-).
Eh, maybe. It doesn't seem like the best name, if we're actually
managing the underlying data blocks with our smgr layer, etc.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes:
2010/9/15 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:
I've been thinking about that for some time. Here's one sketch I made a
while ago:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg00395.php.
I wonder if we might not think of the foreign data wrapper problem as
an instance of an even more general problem. Right now, we have
semi-pluggable index access methods - it's probably not quite possible
to implement one as a standalone module because of XLOG, and maybe
other reasons, but at least there's some abstraction layer there.
Perhaps we should be thinking about a similar facility of table-access
methods. What if someone wants to implement column-oriented storage,
or index-organized tables, or or tables that are really slow under
heavy write loads but can instantaneously compute SELECT COUNT(*) FROM
table, or mauve-colored tables with magic pixie dust? I don't want to
raise the bar for this project to the point where we can never get it
off the ground, but if there's a way to avoid baking in the assumption
that only foreign tables can ever have special capabilities, that
might be valuable.
Well, the problem is to not draw the abstraction boundary so high that
your plugins have to reimplement the world to get anything done.
mysql got this wrong IMO, and are still paying the price in the form of
bizarre functional incompatibilities between their different storage
engines.
As an example, I don't think there is any sane way to provide
column-oriented storage as a plugin. The entire executor is based
around the assumption that table scans return a row at a time; in
consequence, the entire planner is too. You can't have a plugin that
replaces all of that. You could probably build a plugin that allows
columnar storage but reconstructs rows to return to the executor ... but
having to do that would largely destroy any advantages of a columnar DB,
I fear.
Yet there are other cases that probably *could* work well based on a
storage-level abstraction boundary; index-organized tables for instance.
So I think we need to have some realistic idea of what we want to
support and design an API accordingly, not hope that if we don't
know what we want we will somehow manage to pick an API that makes
all things possible.
I'm personally more worried about whether Heikki's sketch has the
boundary too high-level than too low-level. It might work all right
for handing off to a full-fledged remote database, particularly if
the other DB is also Postgres; but otherwise it's leaving a lot of
work to be done by the plugin author. And at the same time I don't
think it's exposing enough information to let the local planner do
anything intelligent in terms of trading off remote vs. local work.
regards, tom lane
On Wed, Sep 15, 2010 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Well, the problem is to not draw the abstraction boundary so high that
your plugins have to reimplement the world to get anything done.
mysql got this wrong IMO, and are still paying the price in the form of
bizarre functional incompatibilities between their different storage
engines.
Yeah, as far as I can tell there is pretty much universal consensus
that they got that wrong. Actually, I have no personal opinion on the
topic, having no familiarity with the innards of MySQL: but that is
what people keep telling me.
As an example, I don't think there is any sane way to provide
column-oriented storage as a plugin. The entire executor is based
around the assumption that table scans return a row at a time; in
consequence, the entire planner is too. You can't have a plugin that
replaces all of that. You could probably build a plugin that allows
columnar storage but reconstructs rows to return to the executor ... but
having to do that would largely destroy any advantages of a columnar DB,
I fear.
Yeah, I don't know. A columnar DB is a bit like making "SELECT * FROM
table" really mean some kind of join between table_part1, table_part2,
and table_part3 (which could then perhaps be reordered, a candidate
for join removal, etc.). But I have no position on whether whatever
infrastructure we'd need to support that is any way related to the
problem du jour. It's worth noting, however, that even if we give up
on a column-oriented storage within PG, we might easily be talking to
a column-oriented DB on the other end of an SQL/MED connection; and
we'd like to be able to handle that sanely.
Yet there are other cases that probably *could* work well based on a
storage-level abstraction boundary; index-organized tables for instance.
So I think we need to have some realistic idea of what we want to
support and design an API accordingly, not hope that if we don't
know what we want we will somehow manage to pick an API that makes
all things possible.
Agreed. Random ideas: index-organized tables, tables that use a
rollback log rather than VACUUM, tables that use strict two-phase
locking rather than MVCC, tables that have no concurrency control at
all and you get dirty reads (could be useful for logging tables),
write-once read-many tables, compressed tables, encrypted tables,
tables in formats used by previous versions of PostgreSQL, tables that
store data in a round-robin fashion (like MRTG rrdtool). Within the
general orbit of index-organized tables, you can wonder about
different kinds of indices: btree, hash, and gist all seem promising.
You can even imagine a GIST-like structure that does something like
maintain running totals for certain columns on each non-leaf page, to
speed up SUM operations. Feel free to ignore whatever of that seems
irrelevant.
I'm personally more worried about whether Heikki's sketch has the
boundary too high-level than too low-level. It might work all right
for handing off to a full-fledged remote database, particularly if
the other DB is also Postgres; but otherwise it's leaving a lot of
work to be done by the plugin author. And at the same time I don't
think it's exposing enough information to let the local planner do
anything intelligent in terms of trading off remote vs. local work.
Yeah, I think the API for exposing cost information needs a lot of thought.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
2010/9/16 Robert Haas <robertmhaas@gmail.com>:
On Wed, Sep 15, 2010 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yet there are other cases that probably *could* work well based on a
storage-level abstraction boundary; index-organized tables for instance.
So I think we need to have some realistic idea of what we want to
support and design an API accordingly, not hope that if we don't
know what we want we will somehow manage to pick an API that makes
all things possible.Agreed. Random ideas: index-organized tables...
I'd love to see a table that is based on one of the existing KVSs.
--
Hitoshi Harada
On Wed, Sep 15, 2010 at 8:53 PM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
2010/9/16 Robert Haas <robertmhaas@gmail.com>:
On Wed, Sep 15, 2010 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yet there are other cases that probably *could* work well based on a
storage-level abstraction boundary; index-organized tables for instance.
So I think we need to have some realistic idea of what we want to
support and design an API accordingly, not hope that if we don't
know what we want we will somehow manage to pick an API that makes
all things possible.Agreed. Random ideas: index-organized tables...
I'd love to see a table that is based on one of the existing KVSs.
I'm not familiar with the term KVS?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Wed, Sep 15, 2010 at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Sep 15, 2010 at 8:53 PM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
2010/9/16 Robert Haas <robertmhaas@gmail.com>:
On Wed, Sep 15, 2010 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yet there are other cases that probably *could* work well based on a
storage-level abstraction boundary; index-organized tables for instance.
So I think we need to have some realistic idea of what we want to
support and design an API accordingly, not hope that if we don't
know what we want we will somehow manage to pick an API that makes
all things possible.Agreed. Random ideas: index-organized tables...
I'd love to see a table that is based on one of the existing KVSs.
I'm not familiar with the term KVS?
Oh, key-value store, I bet. Yeah, that would be cool.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
2010/9/16 Robert Haas <robertmhaas@gmail.com>:
On Wed, Sep 15, 2010 at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Sep 15, 2010 at 8:53 PM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
2010/9/16 Robert Haas <robertmhaas@gmail.com>:
On Wed, Sep 15, 2010 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yet there are other cases that probably *could* work well based on a
storage-level abstraction boundary; index-organized tables for instance.
So I think we need to have some realistic idea of what we want to
support and design an API accordingly, not hope that if we don't
know what we want we will somehow manage to pick an API that makes
all things possible.Agreed. Random ideas: index-organized tables...
I'd love to see a table that is based on one of the existing KVSs.
I'm not familiar with the term KVS?
Oh, key-value store, I bet. Yeah, that would be cool.
That's it. Like Redis, Tokyo Cabinet, or something.
--
Hitoshi Harada
Hitoshi Harada <umi.tanuki@gmail.com> writes:
2010/9/16 Robert Haas <robertmhaas@gmail.com>:
Oh, key-value store, I bet. �Yeah, that would be cool.
That's it. Like Redis, Tokyo Cabinet, or something.
What exactly do those get you that an ordinary index, or at worst an
index-organized table, doesn't get you?
regards, tom lane
On 16/09/10 13:22, Tom Lane wrote:
Hitoshi Harada<umi.tanuki@gmail.com> writes:
2010/9/16 Robert Haas<robertmhaas@gmail.com>:
Oh, key-value store, I bet. Yeah, that would be cool.
That's it. Like Redis, Tokyo Cabinet, or something.
What exactly do those get you that an ordinary index, or at worst an
index-organized table, doesn't get you?regards, tom lane
It is pretty rare to see key value stores vs relational engines
discussed without a descent into total foolishiness, but this Wikipedia
page looks like a reasonable summary:
Mark Kirkwood <mark.kirkwood@catalyst.net.nz> writes:
On 16/09/10 13:22, Tom Lane wrote:
What exactly do those get you that an ordinary index, or at worst an
index-organized table, doesn't get you?
It is pretty rare to see key value stores vs relational engines
discussed without a descent into total foolishiness, but this Wikipedia
page looks like a reasonable summary:
http://en.wikipedia.org/wiki/NoSQL
That doesn't do anything at all to answer my question. I don't want
to debate NoSQL versus traditional RDBMS here. What I asked was:
given that PG is a traditional RDBMS, what exactly are you hoping
to accomplish by putting a key-value storage mechanism in it? And
if you did, how would that be different from an index-organized table?
regards, tom lane
On Wed, Sep 15, 2010 at 9:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hitoshi Harada <umi.tanuki@gmail.com> writes:
2010/9/16 Robert Haas <robertmhaas@gmail.com>:
Oh, key-value store, I bet. Yeah, that would be cool.
That's it. Like Redis, Tokyo Cabinet, or something.
What exactly do those get you that an ordinary index, or at worst an
index-organized table, doesn't get you?
Speed, hopefully. Because otherwise there's little reason to think
that abandoning transactional semantics, durability, and strong typing
are good ideas.
We spend a lot of time focusing on people who need things to be MORE
ACID (sync rep and true serializability are just two examples) and
that's really great stuff but on occasion it can be useful to turn the
dial in the other direction. There are an awful lot of solutions out
there that are based on eventually consistent replication, in-memory
databases with periodic snapshots to disk, etc. and, while I sure as
hell wouldn't advocate using those solutions to target nuclear
weapons, they are proving to be pretty useful for things like social
networking, storing the results of giant web crawls, etc., because
they reduce the amount of hardware that you need to buy to process a
given volume of data.
The work I'm doing on unlogged tables is a good example of how we can
provide users with the flexibility they need. I just did my first
test (sorry, it's not in good enough shape to post for the CF, more's
the pity) and a bulk insert into an unlogged table was ~3x faster than
the same thing on a regular table. There are a lot of people who can
benefit from that sort of thing. For example, you can imagine that if
you have a "sessions" table where you store a record for each
currently-logged-in user, an unlogged table would be fine. If the
database crashes and comes back up again, everyone has to log in
again, but that's a rare event and not a disaster if it happens.
Another avenue that I think we should explore in this regard is tables
that have some sort of weakened MVCC properties. The most obvious
example is a table where all tuples are visible to everyone, period.
You can imagine using this as a logging table, for example., in lieu
of autonomous transactions. There might be other cases that are
useful, too: what about a table where only committed tuples are
visible, but we ignore the effect of snapshots (i.e. if the inserting
or deleting transaction is committed, then we assume that the
XMIN/XMAX is visible to the current snapshot without checking)? That
might seem like playing with firecrackers, but suppose the table has
very low concurrency: e.g. overnight you update data in bulk, then
after that's done you run daily reporting queries, then during the day
users run small read-only queries interactively. You pretty much know
that you're never going to have a tuple that's visible to some
transactions but not others, so why pay the overhead of checking every
time?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
2010/9/15 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:
The API needs to be simple and version-independent, so that you can
write simple wrappers like the flat file wrapper easily. At the same
time, it needs to be very flexible, so that it allows safely pushing
down all kinds constructs like functions, aggregates and joins. The
planner needs to know which constructs the wrapper can handle and get
cost estimates for the foreign scans. Those two goals are at odds.
The patch didn't say nothing about the API design, no?
It just implement FOREIGN TABLE commands in the SQL standard,
and we need the part anyway if we want to support the standard.
--
Itagaki Takahiro
On 16/09/10 14:05, Robert Haas wrote:
On Wed, Sep 15, 2010 at 9:22 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
Hitoshi Harada<umi.tanuki@gmail.com> writes:
2010/9/16 Robert Haas<robertmhaas@gmail.com>:
Oh, key-value store, I bet. Yeah, that would be cool.
That's it. Like Redis, Tokyo Cabinet, or something.
What exactly do those get you that an ordinary index, or at worst an
index-organized table, doesn't get you?For example, you can imagine that if
you have a "sessions" table where you store a record for each
currently-logged-in user, an unlogged table would be fine. If the
database crashes and comes back up again, everyone has to log in
again, but that's a rare event and not a disaster if it happens.
Or perhaps even a "sessions" type table where the rows are overwritten
in place in some manner, to avoid bloat.
regards
Mark
2010/9/16 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
On 16/09/10 14:05, Robert Haas wrote:
On Wed, Sep 15, 2010 at 9:22 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
Hitoshi Harada<umi.tanuki@gmail.com> writes:
2010/9/16 Robert Haas<robertmhaas@gmail.com>:
Oh, key-value store, I bet. Yeah, that would be cool.
That's it. Like Redis, Tokyo Cabinet, or something.
What exactly do those get you that an ordinary index, or at worst an
index-organized table, doesn't get you?For example, you can imagine that if
you have a "sessions" table where you store a record for each
currently-logged-in user, an unlogged table would be fine. If the
database crashes and comes back up again, everyone has to log in
again, but that's a rare event and not a disaster if it happens.Or perhaps even a "sessions" type table where the rows are overwritten in
place in some manner, to avoid bloat.
My answer is "variety". If an index-organized table was the one best
solution, there would not been so many KVSes these days.
Regards,
--
Hitoshi Harada
(2010/09/16 11:20), Itagaki Takahiro wrote:
2010/9/15 Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>:
The API needs to be simple and version-independent, so that you can
write simple wrappers like the flat file wrapper easily. At the same
time, it needs to be very flexible, so that it allows safely pushing
down all kinds constructs like functions, aggregates and joins. The
planner needs to know which constructs the wrapper can handle and get
cost estimates for the foreign scans. Those two goals are at odds.The patch didn't say nothing about the API design, no?
It just implement FOREIGN TABLE commands in the SQL standard,
and we need the part anyway if we want to support the standard.
you are right, sorry. Attached includes
- FDW routine interface and
- FDW implementation of FDW for external PostgreSQL.
This should be patched after the previous patch.
This is designed to be used in the executor module.
The API has some simple interfaces, such as FreeConnection,
Open, Iterate, Close, ReOpen, and OnError.
In this current FDW implementation for external PG,
It can push-down WHERE-clause by reconstructing query
that shouled be sent to the foreign server.
I think this FDW routine interface is simple, but
extension develpers can add intelligent features to some extent.
FDW routine interface design and what this fdw-for-pg can do
is summarized in:
http://wiki.postgresql.org/wiki/SQL/MED#PostgreSQL
Regards,
SAKAMOTO Masahiko
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
e-mail: sakamoto.masahiko@oss.ntt.co.jp
Attachments:
And this is sample script to play the feature of this patch.
It includes:
- create foreign data wrapper and foreign server(PG at localhost)
- select foreign PostgreSQL table.
*Important*
This script initializes your database cluster(specified by $PGDATA).
Run with care....
Regards,
SAKAMOTO Masahiko
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
e-mail: sakamoto.masahiko@oss.ntt.co.jp