Status of FDW pushdowns
Where are we on the remaining possible pushdowns for foreign data
wrappers, particularly the Postgres one? I know we do WHERE restriction
pushdowns in 9.3, but what about join and aggregate pushdowns? Is
anyone working on those?
I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.
Do we document these missing features anywhere?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
Where are we on the remaining possible pushdowns for foreign data
wrappers, particularly the Postgres one? I know we do WHERE restriction
pushdowns in 9.3, but what about join and aggregate pushdowns? Is
anyone working on those?I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.
By 'insignificant' you mean 'necessary to do any non-trivial real
work'. Personally, I'd prefer it if FDW was extended to allow
arbitrary parameterized queries like every other database connectivity
API ever made ever. But in lieu of that, I'll take as much push down
power as possible :-D.
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2013/11/21 Bruce Momjian <bruce@momjian.us>:
Where are we on the remaining possible pushdowns for foreign data
wrappers, particularly the Postgres one? I know we do WHERE restriction
pushdowns in 9.3, but what about join and aggregate pushdowns? Is
anyone working on those?I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.Do we document these missing features anywhere?
Probably, custom-scan api will provide more flexible way to push-down
aggregate, sort or other stuff performing on regular tables, not only
foreign tables.
It allows extensions to offer alternative scan/join path on the planning
stage, then executor callbacks its custom logic instead of the built-in
one, if its cost is cheaper.
Right now, it performs on relation scan or join only. However, we will be
able to apply same concept on aggregation.
For example, an aggregation node on a foreign table scan is a good
candidate to push down because it can be replaced with a custom-
logic that scans a materialized result of the remote aggregation query,
if its cost is enough cheap than local aggregation.
Probably, we need to add a hook and some logic to compare the
built-in aggregation and alternative paths provided by extensions.
It is also helpful for the people who want to implement something like
"parallel aggregate" performing on regular tables, not only foreign table.
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Kohei KaiGai <kaigai@kaigai.gr.jp> writes:
Right now, it performs on relation scan or join only. However, we will be
able to apply same concept on aggregation.
For example, an aggregation node on a foreign table scan is a good
candidate to push down because it can be replaced with a custom-
logic that scans a materialized result of the remote aggregation query,
if its cost is enough cheap than local aggregation.
Probably, we need to add a hook and some logic to compare the
built-in aggregation and alternative paths provided by extensions.
Note that this is another thing that's blocked on Path-ifying the work
now done in grouping_planner. We don't currently have a way to represent
a local aggregation, much less a remote one, as a Path. We definitely
need that before we can open up any of that logic to FDWs.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Merlin Moncure <mmoncure@gmail.com> writes:
On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.
By 'insignificant' you mean 'necessary to do any non-trivial real
work'. Personally, I'd prefer it if FDW was extended to allow
arbitrary parameterized queries like every other database connectivity
API ever made ever.
[ shrug... ] So use dblink. For better or worse, the FDW stuff is
following the SQL standard's SQL/MED design, which does not do it
like that.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2013/11/22 Kohei KaiGai <kaigai@kaigai.gr.jp>:
2013/11/21 Bruce Momjian <bruce@momjian.us>:
Where are we on the remaining possible pushdowns for foreign data
wrappers, particularly the Postgres one? I know we do WHERE restriction
pushdowns in 9.3, but what about join and aggregate pushdowns? Is
anyone working on those?I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.Do we document these missing features anywhere?
Probably, custom-scan api will provide more flexible way to push-down
aggregate, sort or other stuff performing on regular tables, not only
foreign tables.
It allows extensions to offer alternative scan/join path on the planning
stage, then executor callbacks its custom logic instead of the built-in
one, if its cost is cheaper.
IIRC, sort push-down is already supported. We can provide sorted
pathes by setting Pathkeys to additional ForeignPath. postgres_fdw
doesn't support this feature because we couldn't get consensus about
how to limit sort variation. One idea was to allow to define "foreign
index" on foreign tables to indicate which column combination is
reasonably sortable.
--
Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>:
Merlin Moncure <mmoncure@gmail.com> writes:
On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.By 'insignificant' you mean 'necessary to do any non-trivial real
work'. Personally, I'd prefer it if FDW was extended to allow
arbitrary parameterized queries like every other database connectivity
API ever made ever.[ shrug... ] So use dblink. For better or worse, the FDW stuff is
following the SQL standard's SQL/MED design, which does not do it
like that.
Pass-through mode mentioned in SQL/MED standard might be what he wants.
--
Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Nov 21, 2013 at 10:46:14AM -0500, Tom Lane wrote:
Merlin Moncure <mmoncure@gmail.com> writes:
On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.By 'insignificant' you mean 'necessary to do any non-trivial real
work'. Personally, I'd prefer it if FDW was extended to allow
arbitrary parameterized queries like every other database connectivity
API ever made ever.[ shrug... ] So use dblink.
Not with a non-PostgreSQL data source.
For better or worse, the FDW stuff is following the SQL standard's
SQL/MED design, which does not do it like that.
What SQL/MED specifies along this line is purely a caution against
making a specification without a reference implementation. If I'm
reading it correctly, it's literally impossible to make what they
suggest safe.
Given those givens, we're free to do this in a way that's not
barking-at-the-moon crazy. At least two inter-database communication
links which work with PostgreSQL do this..
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:
2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>:
Merlin Moncure <mmoncure@gmail.com> writes:
On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.By 'insignificant' you mean 'necessary to do any non-trivial real
work'. Personally, I'd prefer it if FDW was extended to allow
arbitrary parameterized queries like every other database connectivity
API ever made ever.[ shrug... ] So use dblink. For better or worse, the FDW stuff is
following the SQL standard's SQL/MED design, which does not do it
like that.Pass-through mode mentioned in SQL/MED standard might be what he wants.
happen to have a link handy?
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Nov 22, 2013 at 08:25:05AM -0600, Merlin Moncure wrote:
On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>:
Merlin Moncure <mmoncure@gmail.com> writes:
On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.By 'insignificant' you mean 'necessary to do any non-trivial real
work'. Personally, I'd prefer it if FDW was extended to allow
arbitrary parameterized queries like every other database connectivity
API ever made ever.[ shrug... ] So use dblink. For better or worse, the FDW stuff is
following the SQL standard's SQL/MED design, which does not do it
like that.Pass-through mode mentioned in SQL/MED standard might be what he wants.
happen to have a link handy?
http://www.wiscorp.com/sql20nn.zip
You'll want to look at the PDF with MED in its title.
Passthrough mode, which is how the standard "handles" this problem is
basically a thing where you set it to be on, then everything your send
until setting it to off is passed through to the remote side. The
people writing the standard didn't think too much about the
possibility that the remote side might speak a broader or different
dialect of SQL from the local server. They also didn't imagine cases
where what's being passed isn't SQL at all.
In addition to breaking any possible parser, the "feature" as
described in the standard is just ripe for un-patchable exploits *in
its design*.
Of all the misdesign-by-committee contained in the standard, this
piece is far and away the stupidest I've encountered to date. We
should not even vaguely attempt to implement it.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Merlin Moncure <mmoncure@gmail.com> writes:
By 'insignificant' you mean 'necessary to do any non-trivial real
work'. Personally, I'd prefer it if FDW was extended to allow
arbitrary parameterized queries like every other database connectivity
API ever made ever. But in lieu of that, I'll take as much push down
power as possible :-D.
That sounds more like FOREIGN VIEW and FOREIGN FUNCTION to me, where you
could have the whole control of the local/remote boundaries.
I mean that when planning a query using a FOREIGN VIEW it would probably
make sense to consider it as a CTE as far as the optimizer is concerned.
About FOREIGN FUNCTION, that would allow to inject arbitrary parameters
anywhere in the remote query when doing SQL functions. We have a very
nice version of FOREIGN FUNCTION already, that's plproxy.
Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Merlin,
2013/11/22 Merlin Moncure <mmoncure@gmail.com>:
On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>:
Merlin Moncure <mmoncure@gmail.com> writes:
On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.By 'insignificant' you mean 'necessary to do any non-trivial real
work'. Personally, I'd prefer it if FDW was extended to allow
arbitrary parameterized queries like every other database connectivity
API ever made ever.[ shrug... ] So use dblink. For better or worse, the FDW stuff is
following the SQL standard's SQL/MED design, which does not do it
like that.Pass-through mode mentioned in SQL/MED standard might be what he wants.
happen to have a link handy?
SQL/MED standard doesn't say much about PASS THROUGH mode, especially
about interaction between client. Besides it, I think it would be
nice to allow arbitrary FDW as backend of dblink interface like this:
postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
postgres=> SELECT * FROM dblink('con1', 'some query written in remote
syntax') as t(/* record type definition */...);
This provides a way to execute query without defining foreign table.
--
Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Shigeru Hanada escribi�:
SQL/MED standard doesn't say much about PASS THROUGH mode, especially
about interaction between client. Besides it, I think it would be
nice to allow arbitrary FDW as backend of dblink interface like this:postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
postgres=> SELECT * FROM dblink('con1', 'some query written in remote
syntax') as t(/* record type definition */...);This provides a way to execute query without defining foreign table.
Seems to me that if you want to read remote tables without creating a
foreign table, you could define them locally using something like the
WITH syntax and then use them normally in the rest of the query.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Seems to me that if you want to read remote tables without creating a
foreign table, you could define them locally using something like the
WITH syntax and then use them normally in the rest of the query.
I guess what's needed here is a kind of barrier that allows pushing a
whole arbitrary subquery (with joins and quals and whatnot) down to the
remote side.
My current thinking about how to solve that would be to add a notion of
FOREIGN VIEW in our system, which would basically implement that barrier
and send the view definition on the remote, with known quals values as
constants, or something like that.
Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Nov 27, 2013 at 4:20 AM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:
Hi Merlin,
2013/11/22 Merlin Moncure <mmoncure@gmail.com>:
On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:2013/11/22 Tom Lane <tgl@sss.pgh.pa.us>:
Merlin Moncure <mmoncure@gmail.com> writes:
On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian <bruce@momjian.us> wrote:
I know join pushdowns seem insignificant, but it helps to restrict what
data must be passed back because you would only pass back joined rows.By 'insignificant' you mean 'necessary to do any non-trivial real
work'. Personally, I'd prefer it if FDW was extended to allow
arbitrary parameterized queries like every other database connectivity
API ever made ever.[ shrug... ] So use dblink. For better or worse, the FDW stuff is
following the SQL standard's SQL/MED design, which does not do it
like that.Pass-through mode mentioned in SQL/MED standard might be what he wants.
happen to have a link handy?
SQL/MED standard doesn't say much about PASS THROUGH mode, especially
about interaction between client. Besides it, I think it would be
nice to allow arbitrary FDW as backend of dblink interface like this:postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
postgres=> SELECT * FROM dblink('con1', 'some query written in remote
syntax') as t(/* record type definition */...);This provides a way to execute query without defining foreign table.
yeah. (thanks for indulging -- this is barely on topic I guess).
if it were possible to create a supporting function (say, fdw_link)
that could somehow interface with a previously established server, it
could probably be worked out. Then all FDW could leverage
parameterization without having to copy and paste the pgsql-fdw qual
push code. But that would be a fairly large break from the rest of
the FDW syntax and having to define the record at each call site is
admittedly a bit of a headache.
Hm, another way to think about this would be to somehow abstract the
qual push into a library so that it could be accessed by other FDWs if
they opted in. This would address my chief complaint that only the
pgsql-fdw (the only database for which we already have an in-core high
quality connection api) driver could tap the excellent work you've
done. If this were even possible, it would probably result in more
fdw API changes.
If my:
SELECT * FROM big_sql_server_foreign_table WHERE id = x;
was fast, that'd be pretty nice.
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2013/11/27 Dimitri Fontaine <dimitri@2ndquadrant.fr>:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Seems to me that if you want to read remote tables without creating a
foreign table, you could define them locally using something like the
WITH syntax and then use them normally in the rest of the query.I guess what's needed here is a kind of barrier that allows pushing a
whole arbitrary subquery (with joins and quals and whatnot) down to the
remote side.
Yes, a big problem is how to skip parsing remote query in PG context.
Bare query string (other than string literal) always parsed by PG
parser, but remote side would have different syntax and semantics, as
Dimitri says we need to pass whole of arbitrary query string to remote
side as-is.
My current thinking about how to solve that would be to add a notion of
FOREIGN VIEW in our system, which would basically implement that barrier
and send the view definition on the remote, with known quals values as
constants, or something like that.
I'm sorry but I don't see the point here. Do you mean that user
executes CREATE FOREIGN VIEW in advance and uses the view in a
subsequent query? Or, allow new syntax like WITH alias AS FOREIGN VIEW
(remote query)?
I think it's nice to support executing ad-hoc remote query written in
the syntax which is valid only on remote data source through FDW, and
at the moment dblink interface seems feasible for that purpose.
--
Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Shigeru Hanada <shigeru.hanada@gmail.com> writes:
I'm sorry but I don't see the point here. Do you mean that user
executes CREATE FOREIGN VIEW in advance and uses the view in a
Yes that's what I mean.
I think it's nice to support executing ad-hoc remote query written in
the syntax which is valid only on remote data source through FDW, and
at the moment dblink interface seems feasible for that purpose.
I guess the view query would have to be validated by the FDW, which
would just receive a text.
Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I guess the view query would have to be validated by the FDW, which
would just receive a text.
+1
This is not exactly in context, but I and David Fetter discussed
recently how we could do similar thing.
This would work,but how can we do it for FDWs which do not parse SQL?
Am I missing something here?
Regards,
Atri
--
Regards,
Atri
l'apprenant
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Atri Sharma <atri.jiit@gmail.com> writes:
This would work,but how can we do it for FDWs which do not parse SQL?
Am I missing something here?
Worst case:
CREATE FOREIGN VIEW foo
AS $$
whatever syntax is accepted on the other side
$$;
Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Nov 27, 2013 at 11:08 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
Atri Sharma <atri.jiit@gmail.com> writes:
This would work,but how can we do it for FDWs which do not parse SQL?
Am I missing something here?Worst case:
CREATE FOREIGN VIEW foo
AS $$
whatever syntax is accepted on the other side
That doesnt sound like a very good idea.
Can we add a function to the FDW API to define a SQL to foreign server
side conversion?
I am just musing though.
Regards,
Atri
--
Regards,
Atri
l'apprenant
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers