Getting sorted data from foreign server for merge join
Hi All,
PFA patch to get data sorted from the foreign server (postgres_fdw)
according to the pathkeys useful for merge join.
For a given base relation (extendable to join when that becomes available
in postgres_fdw), the patch tries to find merge joinable clauses. It then
adds paths with pathkeys extracted out of these merge joinable clauses. The
merge joinable clauses form equivalence classes. The patch searches in
root->eq_classes for equivalence members belonging to the given relation.
For every such expression it creates a single member pathkey list and
corresponding path. The test postgres_fdw.sql has an existing join which
uses merge join. With this patch the data is sorted on the foreign server
than locally.
While mergejoinable clauses can be obtained from rel->joininfo as well. But
rel->joininfo contains other clauses as well and we need extra efforts to
remove duplicates if the same expression appears in multiple merge joinable
clauses.
Two joining relations can have multiple merge joinable clauses, requiring
multi-member pathkeys so that merge join is efficient to the maximum
extent. The order in which the expressions appears in pathkeys can change
the costs of sorting the data according to the pathkeys, depending upon the
expressions and the presence of indexes containing those expressions. Thus
ideally we would need to club all the expressions appearing in all the
clauses for given two relations and create paths with pathkeys for every
order of these expressions.That explodes the number of possible paths. We
may restrict the number of paths created by considering only certain orders
like sort_inner_and_outer(). In any case, costing such paths increases the
planning time which may not be worth it. So, this patch uses a heuristic
approach of creating single member pathkeys path for every merge joinable
expression.
The pathkeys need to be canonicalised using make_canonical_pathkey(), which
is a static function. I have added a TODO and comments in the patch
explaining possible ways to avoid "extern"alization of this function.
Comments/suggestions are welcome.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachments:
pg_sort_all_pd.patchtext/x-diff; charset=US-ASCII; name=pg_sort_all_pd.patchDownload+190-52
On Thu, Nov 5, 2015 at 11:54 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
Hi All,
PFA patch to get data sorted from the foreign server (postgres_fdw)
according to the pathkeys useful for merge join.For a given base relation (extendable to join when that becomes available in
postgres_fdw), the patch tries to find merge joinable clauses. It then adds
paths with pathkeys extracted out of these merge joinable clauses. The merge
joinable clauses form equivalence classes. The patch searches in
root->eq_classes for equivalence members belonging to the given relation.
For every such expression it creates a single member pathkey list and
corresponding path. The test postgres_fdw.sql has an existing join which
uses merge join. With this patch the data is sorted on the foreign server
than locally.While mergejoinable clauses can be obtained from rel->joininfo as well. But
rel->joininfo contains other clauses as well and we need extra efforts to
remove duplicates if the same expression appears in multiple merge joinable
clauses.Two joining relations can have multiple merge joinable clauses, requiring
multi-member pathkeys so that merge join is efficient to the maximum extent.
The order in which the expressions appears in pathkeys can change the costs
of sorting the data according to the pathkeys, depending upon the
expressions and the presence of indexes containing those expressions. Thus
ideally we would need to club all the expressions appearing in all the
clauses for given two relations and create paths with pathkeys for every
order of these expressions.That explodes the number of possible paths. We
may restrict the number of paths created by considering only certain orders
like sort_inner_and_outer(). In any case, costing such paths increases the
planning time which may not be worth it. So, this patch uses a heuristic
approach of creating single member pathkeys path for every merge joinable
expression.The pathkeys need to be canonicalised using make_canonical_pathkey(), which
is a static function. I have added a TODO and comments in the patch
explaining possible ways to avoid "extern"alization of this function.Comments/suggestions are welcome.
I think this approach is generally reasonable, but I suggested parts
of it, so may be biased. I would be interested in hearing the
opinions of others.
Random notes:
"possibily" is a typo.
usable_pklist is confusing because it seems like it might be talking
about primary keys rather than pathkeys. Also, I realize now, looking
at this again, that you're saying "usable" when what I really think
you mean is "useful". Lots of pathkeys are usable, but only a few of
those are useful. I suggest renaming usable_pathkeys to
query_pathkeys and usable_pklist to useful_pathkeys. Similarly, let's
rename generate_pathkeys_for_relation() to
get_useful_pathkeys_for_relation().
Although I'm usually on the side of marking things as extern whenever
we find it convenient, I'm nervous about doing that to
make_canonical_pathkey(), because it has side effects. Searching the
list of canonical pathkeys for the one we need is reasonable, but is
it really right to ever think that we might create a new one at this
stage? Maybe it is, but if so I'd like to hear a good explanation as
to why.
Is the comment "Equivalence classes covering relations other than the
current one are of interest here" missing a "not"?
I don't find this comment illuminating:
+ * In case of child relation, we need to check that the
+ * equivalence class indicates a join to a relation other than
+ * parents, other children and itself (something similar to above).
+ * Otherwise we will end up creating useless paths. The code below is
+ * similar to generate_implied_equalities_for_column(), which might
+ * give a hint.
That basically just says that we have to do it this way because the
other way would be wrong. But it doesn't say WHY the other way would
be wrong. Then a few lines later, you have another comment which says
the same thing again:
+ /*
+ * Ignore equivalence members which correspond to children
+ * or same relation or to parent relations
+ */
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Friday, November 6, 2015 10:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:
I think this approach is generally reasonable, but I suggested
parts of it, so may be biased. I would be interested in hearing
the opinions of others.
Has anyone taken a close look at what happens if the two sides of
the merge join have different implementations of the same collation
name? Is there anything we should do to defend against the
problem?
We already face the issue of corrupted indexes when we have
different revisions of glibc on a primary and a standby or when the
OS on a server is updated, so this wouldn't be entirely a *new*
problem:
/messages/by-id/BA6132ED-1F6B-4A0B-AC22-81278F5AB81E@tripadvisor.com
... but it would be a brand-new way to hit it, and we might be able
to spot the problem in a merge join by watching for rows being fed
to either side of the join which are not in order according to the
machine doing the join.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Nov 6, 2015 at 11:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Nov 5, 2015 at 11:54 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:Hi All,
PFA patch to get data sorted from the foreign server (postgres_fdw)
according to the pathkeys useful for merge join.For a given base relation (extendable to join when that becomes
available in
postgres_fdw), the patch tries to find merge joinable clauses. It then
adds
paths with pathkeys extracted out of these merge joinable clauses. The
merge
joinable clauses form equivalence classes. The patch searches in
root->eq_classes for equivalence members belonging to the given relation.
For every such expression it creates a single member pathkey list and
corresponding path. The test postgres_fdw.sql has an existing join which
uses merge join. With this patch the data is sorted on the foreign server
than locally.While mergejoinable clauses can be obtained from rel->joininfo as well.
But
rel->joininfo contains other clauses as well and we need extra efforts to
remove duplicates if the same expression appears in multiple mergejoinable
clauses.
Two joining relations can have multiple merge joinable clauses, requiring
multi-member pathkeys so that merge join is efficient to the maximumextent.
The order in which the expressions appears in pathkeys can change the
costs
of sorting the data according to the pathkeys, depending upon the
expressions and the presence of indexes containing those expressions.Thus
ideally we would need to club all the expressions appearing in all the
clauses for given two relations and create paths with pathkeys for every
order of these expressions.That explodes the number of possible paths. We
may restrict the number of paths created by considering only certainorders
like sort_inner_and_outer(). In any case, costing such paths increases
the
planning time which may not be worth it. So, this patch uses a heuristic
approach of creating single member pathkeys path for every merge joinable
expression.The pathkeys need to be canonicalised using make_canonical_pathkey(),
which
is a static function. I have added a TODO and comments in the patch
explaining possible ways to avoid "extern"alization of this function.Comments/suggestions are welcome.
I think this approach is generally reasonable, but I suggested parts
of it, so may be biased. I would be interested in hearing the
opinions of others.Random notes:
"possibily" is a typo.
usable_pklist is confusing because it seems like it might be talking
about primary keys rather than pathkeys. Also, I realize now, looking
at this again, that you're saying "usable" when what I really think
you mean is "useful". Lots of pathkeys are usable, but only a few of
those are useful. I suggest renaming usable_pathkeys to
query_pathkeys and usable_pklist to useful_pathkeys. Similarly, let's
rename generate_pathkeys_for_relation() to
get_useful_pathkeys_for_relation().Although I'm usually on the side of marking things as extern whenever
we find it convenient, I'm nervous about doing that to
make_canonical_pathkey(), because it has side effects. Searching the
list of canonical pathkeys for the one we need is reasonable, but is
it really right to ever think that we might create a new one at this
stage? Maybe it is, but if so I'd like to hear a good explanation as
to why.Is the comment "Equivalence classes covering relations other than the
current one are of interest here" missing a "not"?I don't find this comment illuminating:
+ * In case of child relation, we need to check that the + * equivalence class indicates a join to a relation other than + * parents, other children and itself (something similar to above). + * Otherwise we will end up creating useless paths. The code below is + * similar to generate_implied_equalities_for_column(), which might + * give a hint.That basically just says that we have to do it this way because the
other way would be wrong. But it doesn't say WHY the other way would
be wrong. Then a few lines later, you have another comment which says
the same thing again:+ /* + * Ignore equivalence members which correspond to children + * or same relation or to parent relations + */--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Sorry to barge in late, but I was wondering if what we've learned with this
patch can be applied to the case of asserting a sort order on a query
returning from dblink().
This is a common use-case for us where one machine will issue a query
request to a non-Pg database that happens to speak libpq (Vertica,
Redshift, I guess Greenplum would be one too), but for whom foreign data
wrappers aren't yet an option, and then join that data to local tables,
many of which have indexes matching the sort order that I know is coming
from the dblink(), but have no way of telling that to the query planner.
I'm guessing the answer is "no", but I wanted to make you aware of a
similar real-world need.
On Fri, Nov 6, 2015 at 4:54 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
PFA patch to get data sorted from the foreign server (postgres_fdw)
according to the pathkeys useful for merge join.
An idle thought. There are going to be a lot of cases where different
software systems actually disagree about collation rules. I wonder if
it would be valuable to have a node that just checks that each row is
in fact greater than the previous row and throws an error if not. That
can be optional or a parameter of the FDW but it's probably cheap
enough to have enabled by default. It would save a lot of difficult to
heartache since the behaviour if the inputs aren't correctly sorted
will be strangely incorrect join results. Often the results may just
be missing or duplicated rows and that can be easy to miss and lead to
corrupted databases or security problems.
--
greg
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Nov 7, 2015 at 12:07 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
Sorry to barge in late, but I was wondering if what we've learned with this
patch can be applied to the case of asserting a sort order on a query
returning from dblink().
Nope.
Sorry to the bearer of bad news, but that would be a different and
much harder development project. Set-returning functions have no way
to indicate anything about the ordering of their return values at
present. We could invent something, but the work involved wouldn't
have much to do with this patch.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Nov 7, 2015 at 5:42 PM, Greg Stark <stark@mit.edu> wrote:
On Fri, Nov 6, 2015 at 4:54 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:PFA patch to get data sorted from the foreign server (postgres_fdw)
according to the pathkeys useful for merge join.An idle thought. There are going to be a lot of cases where different
software systems actually disagree about collation rules. I wonder if
it would be valuable to have a node that just checks that each row is
in fact greater than the previous row and throws an error if not. That
can be optional or a parameter of the FDW but it's probably cheap
enough to have enabled by default. It would save a lot of difficult to
heartache since the behaviour if the inputs aren't correctly sorted
will be strangely incorrect join results. Often the results may just
be missing or duplicated rows and that can be easy to miss and lead to
corrupted databases or security problems.
It's not a bad thought, but it could come up even locally - we've had
more than one situation where indexes have gotten corrupted by
updating glibc. The new glibc doesn't agree with the old one on what
the collation ordering is, and so the indexes are wrong with respect
to the new glibc version. If we were going to design something like
this, rather than making it a separate node, I'd be inclined to create
it as a C-callable function that could be invoked anywhere we want to
check that the ordering is valid. I suspect you're wrong about the
cost, though: I bet it wouldn't be too hard to find cases where it
imposes a really noticeable penalty.
Also, to be clear, I don't think this patch needs to solve that problem.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Nov 6, 2015 at 2:03 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
Has anyone taken a close look at what happens if the two sides of
the merge join have different implementations of the same collation
name? Is there anything we should do to defend against the
problem?
The issue of FDWs vs. collations has been thought about to some degree
(see FDW_COLLATE_NONE/SAFE/UNSAFE), but I don't quite understand that
stuff in detail.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015/11/10 0:56, Robert Haas wrote:
On Sat, Nov 7, 2015 at 12:07 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
Sorry to barge in late, but I was wondering if what we've learned with this
patch can be applied to the case of asserting a sort order on a query
returning from dblink().Nope.
Sorry to the bearer of bad news, but that would be a different and
much harder development project. Set-returning functions have no way
to indicate anything about the ordering of their return values at
present. We could invent something, but the work involved wouldn't
have much to do with this patch.
There was a patch (which, it seems, was rejected [1]https://commitfest.postgresql.org/4/88/) to do this sort of
thing.
Thanks,
Amit
[1]: https://commitfest.postgresql.org/4/88/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Nov 9, 2015 at 9:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Nov 6, 2015 at 2:03 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
Has anyone taken a close look at what happens if the two sides of
the merge join have different implementations of the same collation
name? Is there anything we should do to defend against the
problem?The issue of FDWs vs. collations has been thought about to some degree
(see FDW_COLLATE_NONE/SAFE/UNSAFE), but I don't quite understand that
stuff in detail> .
collations arising from a foreign table's var are considered to be safer
(FDW_COLLATE_SAFE) to push down to the foreign server , since they are
either local default collation or are assumed to be same on foreign and
local server as per user declaration. The onus of making that sure is on
the user who declares particular collation for a foreign table var. As said
upthread, different glibc implementations can cause collation ordering
mismatch, this patch will be susceptible to the same problem as
master/standby problem.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On 16 November 2015 at 17:47, Ashutosh Bapat <
ashutosh.bapat@enterprisedb.com> wrote:
collations arising from a foreign table's var are considered to be safer
(FDW_COLLATE_SAFE) to push down to the foreign server , since they are
either local default collation or are assumed to be same on foreign and
local server as per user declaration. The onus of making that sure is on
the user who declares particular collation for a foreign table var. As said
upthread, different glibc implementations can cause collation ordering
mismatch, this patch will be susceptible to the same problem as
master/standby problem.
Yeah. It's less bad than the related problems we already have:
* Upgrading glibc can cause your indexes to no longer match your local
collations
* Different glibc versions on master and replica(s) can have the same effect
I don't see a problem with adding another way this same issue can be
expressed, since there's no sane way to fix it _properly_ without either
versioned collations in glibc or bringing collation onboard into Pg.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Thanks Robert for your comments. Please see my replies inlined. Updated
patch is attached.
On Fri, Nov 6, 2015 at 10:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
I think this approach is generally reasonable, but I suggested parts
of it, so may be biased. I would be interested in hearing the
opinions of others.Random notes:
"possibily" is a typo.
Fixed.
usable_pklist is confusing because it seems like it might be talking
about primary keys rather than pathkeys. Also, I realize now, looking
at this again, that you're saying "usable" when what I really think
you mean is "useful". Lots of pathkeys are usable, but only a few of
those are useful. I suggest renaming usable_pathkeys to
query_pathkeys
Done.
and usable_pklist to useful_pathkeys.
pathkeys is being used to mean a list of pathkey nodes. What we want here
is list of pathkeys so I have renamed it as useful_pathkeys_list, somewhat
long but clear.
Similarly, let's
rename generate_pathkeys_for_relation() to
get_useful_pathkeys_for_relation().
Done.
Although I'm usually on the side of marking things as extern whenever
we find it convenient, I'm nervous about doing that to
make_canonical_pathkey(), because it has side effects. Searching the
list of canonical pathkeys for the one we need is reasonable, but is
it really right to ever think that we might create a new one at this
stage? Maybe it is, but if so I'd like to hear a good explanation as
to why.
For a foreign table no pathkeys are created before creating paths for
individual foreign table since there are no indexes. For a regular table,
the pathkeys get created for useful indexes. Exception to this is if the
expression appears in the ORDER BY clause, the pathkey for the same is
created while handling ORDER BY clause. So, we will always need to create
pathkey for a foreign table unless the corresponding expression does not
appear in the ORDER BY clause. This can be verified by breaking in
make_canonical_pathkey() while executing "explain verbose select * from ft1
join lt using (val);" ft1(val int, val2 int) is foreign table and lt (val
int, val2 int) is local table. You will hit the first breakpoint with stack
trace
Breakpoint 1, make_canonical_pathkey (root=0x231d858, eclass=0x231f030,
opfamily=1976, strategy=1, nulls_first=0 '\000') at pathkeys.c:60
(gdb) where
#0 make_canonical_pathkey (root=0x231d858, eclass=0x231f030,
opfamily=1976, strategy=1, nulls_first=0 '\000') at pathkeys.c:60
#1 0x00007f6740077e39 in get_useful_pathkeys_for_relation (root=0x231d858,
rel=0x231e390) at postgres_fdw.c:663
#2 0x00007f6740077f34 in postgresGetForeignPaths (root=0x231d858,
baserel=0x231e390, foreigntableid=16393) at postgres_fdw.c:705
#3 0x00000000007079cf in set_foreign_pathlist (root=0x231d858,
rel=0x231e390, rte=0x231c488) at allpaths.c:600
#4 0x0000000000707653 in set_rel_pathlist (root=0x231d858, rel=0x231e390,
rti=1, rte=0x231c488) at allpaths.c:395
#5 0x000000000070735f in set_base_rel_pathlists (root=0x231d858) at
allpaths.c:277
at this point
(gdb) p root->canon_pathkeys
$1 = (List *) 0x0
so get_useful_pathkeys_for_relation->make_canonical_pathkey() will create
the first pathkey.
I have left the corresponding TODO untouched, if anybody else wants to
review that part of the code. I will remove it in the final version of the
patch.
Is the comment "Equivalence classes covering relations other than the
current one are of interest here" missing a "not"?
The sentence is correct. We need equivalence class covering relations other
than the current one, because only such equivalence classes indicate joins
between two relations. If an equivalence class covers only a single baserel
(has only a single member in ec_relids), it indicates equivalence between
columns/expressions of the same table, which can not result in a join. I
have changed to comment to be more elaborate.
I don't find this comment illuminating:
+ * In case of child relation, we need to check that the + * equivalence class indicates a join to a relation other than + * parents, other children and itself (something similar to above). + * Otherwise we will end up creating useless paths. The code below is + * similar to generate_implied_equalities_for_column(), which might + * give a hint.That basically just says that we have to do it this way because the
other way would be wrong. But it doesn't say WHY the other way would
be wrong.
There's no "other way" which is wrong. What's the other way you are talking
about?
Anway, I have updated the comment to be more readable.
Then a few lines later, you have another comment which says
the same thing again:+ /* + * Ignore equivalence members which correspond to children + * or same relation or to parent relations + */
Updated this too.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachments:
pg_sort_all_pd_v2.patchtext/x-diff; charset=US-ASCII; name=pg_sort_all_pd_v2.patchDownload+219-58
On Tue, Nov 17, 2015 at 8:33 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
Although I'm usually on the side of marking things as extern whenever
we find it convenient, I'm nervous about doing that to
make_canonical_pathkey(), because it has side effects. Searching the
list of canonical pathkeys for the one we need is reasonable, but is
it really right to ever think that we might create a new one at this
stage? Maybe it is, but if so I'd like to hear a good explanation as
to why.For a foreign table no pathkeys are created before creating paths for
individual foreign table since there are no indexes. For a regular table,
the pathkeys get created for useful indexes.
OK.
Could some portion of the second foreach loop inside
get_useful_pathkeys_for_relation be replaced by a call to
eclass_useful_for_merging? The logic looks very similar.
More broadly, I'm wondering about the asymmetries between this code
and pathkeys_useful_for_merging(). The latter has a
right_merge_direction() test and a case for non-EC-derivable join
clauses that aren't present in your code. I wonder if we could just
generate pathkeys speculatively and then test which ones survive
truncate_useless_pathkeys(), or something like that. This isn't an
enormously complicated patch, but it duplicates more of the logic in
pathkeys.c than I'd like.
I'm inclined to think that we shouldn't consider pathkeys that might
be useful for merge-joining unless we're using remote estimates. It
seems more speculative than pushing down a toplevel sort.
This patch seems rather light on test cases.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Nov 19, 2015 at 7:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Nov 17, 2015 at 8:33 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:Although I'm usually on the side of marking things as extern whenever
we find it convenient, I'm nervous about doing that to
make_canonical_pathkey(), because it has side effects. Searching the
list of canonical pathkeys for the one we need is reasonable, but is
it really right to ever think that we might create a new one at this
stage? Maybe it is, but if so I'd like to hear a good explanation as
to why.For a foreign table no pathkeys are created before creating paths for
individual foreign table since there are no indexes. For a regular table,
the pathkeys get created for useful indexes.OK.
Could some portion of the second foreach loop inside
get_useful_pathkeys_for_relation be replaced by a call to
eclass_useful_for_merging? The logic looks very similar.
Yes. I have incorporated this change in the patch attached.
More broadly, I'm wondering about the asymmetries between this code
and pathkeys_useful_for_merging(). The latter has a
right_merge_direction() test and a case for non-EC-derivable join
clauses that aren't present in your code.
I wonder if we could just
generate pathkeys speculatively and then test which ones survive
truncate_useless_pathkeys(), or something like that. This isn't an
enormously complicated patch, but it duplicates more of the logic in
pathkeys.c than I'd like.
pathkeys_useful_for_merging(), truncate_useless_pathkeys() and host of
functions in that area are crafted to assess the usefulness of given
pathkeys which for regular tables are "speculated" from indexes on that
table. Foreign tables do not have indexes and neither we have information
about the indexes (if any) on foreign server, thus pathkeys to be assessed
are not readily available. Hence we need some other way of "speculating"
pathkeys for foreign tables. We can not just generate pathkeys because
there are infinite possible expressions on which pathkeys can be built. So,
we hunt for the expressions at various places like root_pathkeys, merge
joinable clauses etc. The seeming duplication of code is because the places
where we are hunting for useful pathkeys in case of foreign table are same
as the places used for assessing usefulness for pathkeys in case of regular
table. Thus in case of foreign tables, we always generate useful pathkeys,
which do not need any further assessment. For regular tables we have set of
pathkeys which need to be assessed. Because of this difference the code
differs in details.
right_merge_direction() compares whether a given pathkey (readily available
or derived from an index) has the same direction as required by
root->query_pathkeys or ascending direction. For foreign tables, the
pathkeys are themselves created from root->query_pathkeys, thus doesn't
require this assessment. The patch creates pathkeys other than those from
root->query_pathkeys in the ascending order (like other places in the code
eg. select_outer_pathkeys_for_merge()), which is what
right_merge_direction() assesses. So again we don't need to call
right_merge_direction().
non-EC-derivable join is interesting. Thanks for bringing it out. These are
mergejoinable clauses in an OUTER join, where columns from inner side can
be NULL, and thus not exactly equal. I will incorporate that change in the
patch. Again while doing so, we have to just pick up the right or left
equivalence class from a given RestrictInfo and don't need to assess it
further like pathkeys_useful_for_merging().
Added this change in the attached patch.
I'm inclined to think that we shouldn't consider pathkeys that might
be useful for merge-joining unless we're using remote estimates. It
seems more speculative than pushing down a toplevel sort.
I agree with you but let me elaborate why I agree. The pathkeys we are
generating are heauristic in nature and thus may not be useful in the same
sense as pathkeys for regular tables. If use_remote_estimate is ON, the
planner will spend time in explaining multiple queries, but it will be in
better position to cost the usage. If use_remote_estimate is OFF, we will
altogether loose chances of merge join, which doesn't look good to me. But
a speculative costing done in this case can result in choosing wrong plan
similar to the same case as toplevel sort. But then choosing a wrong join
has more serious implications than estimating wrong cost for top level join.
This patch seems rather light on test cases.
Added tests. Let me know if you have any specific scenario in mind.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachments:
pg_sort_all_pd_v3.patchtext/x-diff; charset=US-ASCII; name=pg_sort_all_pd_v3.patchDownload+398-42
Hi Ashutosh,
I reviewed your latest version of patch and over all the implementation
and other details look good to me.
Here are few cosmetic issues which I found:
1) Patch not getting applied cleanly - white space warning
2)
- List *usable_pathkeys = NIL;
+ List *useful_pathkeys_list = NIL; /* List of all pathkeys */
Code alignment is not correct with other declared variables.
3)
+ {
+ PathKey *pathkey;
+ List *pathkeys;
+
+ pathkey = make_canonical_pathkey(root, cur_ec,
+
linitial_oid(cur_ec->ec_opfamilies),
+ BTLessStrategyNumber,
+ false);
+ pathkeys = list_make1(pathkey);
+ useful_pathkeys_list = lappend(useful_pathkeys_list, pathkeys);
+ }
Code alignment need to fix at make_canonical_pathkey().
4)
I don't understand the meaning of following added testcase into
postgres_fdw.
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -171,20 +171,35 @@ SELECT COUNT(*) FROM ft1 t1;
-- join two tables
SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3,
t1.c1 OFFSET 100 LIMIT 10;
-- subquery
SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10)
ORDER BY c1;
-- subquery+MAX
SELECT * FROM ft1 t1 WHERE t1.c3 = (SELECT MAX(c3) FROM ft2 t2) ORDER BY
c1;
-- used in CTE
WITH t1 AS (SELECT * FROM ft1 WHERE c1 <= 10) SELECT t2.c1, t2.c2, t2.c3,
t2.c4 FROM t1, ft2 t2 WHERE t1.c1 = t2.c1 ORDER BY t1.c1;
-- fixed values
SELECT 'fixed', NULL FROM ft1 t1 WHERE c1 = 1;
+-- getting data sorted from the foreign table for merge join
+-- Since we are interested in merge join, disable other joins
+SET enable_hashjoin TO false;
+SET enable_nestloop TO false;
+-- inner join, expressions in the clauses appear in the equivalence class
list
+EXPLAIN (VERBOSE, COSTS false)
+ SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 =
t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C
1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- outer join, expression in the clauses do not appear in equivalence
class list
+-- but no output change as compared to the previous query
+EXPLAIN (VERBOSE, COSTS false)
+ SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1
= t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 =
t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SET enable_hashjoin TO true;
+SET enable_nestloop TO true;
Because, I removed the code changes of the patch and then I run the test
seem like it has nothing to do with the code changes. Above set of test
giving
same result with/without patch.
Am I missing something ?
Apart from this I debugged the patch for each scenario (query pathkeys and
pathkeys arising out of the equivalence classes) and so far patch looks good
to me.
Attaching update version of patch by fixing the cosmetic changes.
On Fri, Nov 20, 2015 at 9:14 PM, Ashutosh Bapat <
ashutosh.bapat@enterprisedb.com> wrote:
On Thu, Nov 19, 2015 at 7:30 AM, Robert Haas <robertmhaas@gmail.com>
wrote:On Tue, Nov 17, 2015 at 8:33 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:Although I'm usually on the side of marking things as extern whenever
we find it convenient, I'm nervous about doing that to
make_canonical_pathkey(), because it has side effects. Searching the
list of canonical pathkeys for the one we need is reasonable, but is
it really right to ever think that we might create a new one at this
stage? Maybe it is, but if so I'd like to hear a good explanation as
to why.For a foreign table no pathkeys are created before creating paths for
individual foreign table since there are no indexes. For a regulartable,
the pathkeys get created for useful indexes.
OK.
Could some portion of the second foreach loop inside
get_useful_pathkeys_for_relation be replaced by a call to
eclass_useful_for_merging? The logic looks very similar.Yes. I have incorporated this change in the patch attached.
More broadly, I'm wondering about the asymmetries between this code
and pathkeys_useful_for_merging(). The latter has a
right_merge_direction() test and a case for non-EC-derivable join
clauses that aren't present in your code.I wonder if we could just
generate pathkeys speculatively and then test which ones survive
truncate_useless_pathkeys(), or something like that. This isn't an
enormously complicated patch, but it duplicates more of the logic in
pathkeys.c than I'd like.pathkeys_useful_for_merging(), truncate_useless_pathkeys() and host of
functions in that area are crafted to assess the usefulness of given
pathkeys which for regular tables are "speculated" from indexes on that
table. Foreign tables do not have indexes and neither we have information
about the indexes (if any) on foreign server, thus pathkeys to be assessed
are not readily available. Hence we need some other way of "speculating"
pathkeys for foreign tables. We can not just generate pathkeys because
there are infinite possible expressions on which pathkeys can be built. So,
we hunt for the expressions at various places like root_pathkeys, merge
joinable clauses etc. The seeming duplication of code is because the places
where we are hunting for useful pathkeys in case of foreign table are same
as the places used for assessing usefulness for pathkeys in case of regular
table. Thus in case of foreign tables, we always generate useful pathkeys,
which do not need any further assessment. For regular tables we have set of
pathkeys which need to be assessed. Because of this difference the code
differs in details.right_merge_direction() compares whether a given pathkey (readily
available or derived from an index) has the same direction as required by
root->query_pathkeys or ascending direction. For foreign tables, the
pathkeys are themselves created from root->query_pathkeys, thus doesn't
require this assessment. The patch creates pathkeys other than those from
root->query_pathkeys in the ascending order (like other places in the code
eg. select_outer_pathkeys_for_merge()), which is what
right_merge_direction() assesses. So again we don't need to call
right_merge_direction().non-EC-derivable join is interesting. Thanks for bringing it out. These
are mergejoinable clauses in an OUTER join, where columns from inner side
can be NULL, and thus not exactly equal. I will incorporate that change in
the patch. Again while doing so, we have to just pick up the right or left
equivalence class from a given RestrictInfo and don't need to assess it
further like pathkeys_useful_for_merging().Added this change in the attached patch.
I'm inclined to think that we shouldn't consider pathkeys that might
be useful for merge-joining unless we're using remote estimates. It
seems more speculative than pushing down a toplevel sort.I agree with you but let me elaborate why I agree. The pathkeys we are
generating are heauristic in nature and thus may not be useful in the same
sense as pathkeys for regular tables. If use_remote_estimate is ON, the
planner will spend time in explaining multiple queries, but it will be in
better position to cost the usage. If use_remote_estimate is OFF, we will
altogether loose chances of merge join, which doesn't look good to me. But
a speculative costing done in this case can result in choosing wrong plan
similar to the same case as toplevel sort. But then choosing a wrong join
has more serious implications than estimating wrong cost for top level join.This patch seems rather light on test cases.
Added tests. Let me know if you have any specific scenario in mind.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Rushabh Lathia
www.EntepriseDB.com
Attachments:
pg_sort_all_pd_v4.patchapplication/x-download; name=pg_sort_all_pd_v4.patchDownload+398-42
Thanks Rushabh for your review and comments.
On Thu, Nov 26, 2015 at 5:39 PM, Rushabh Lathia <rushabh.lathia@gmail.com>
wrote:
Hi Ashutosh,
I reviewed your latest version of patch and over all the implementation
and other details look good to me.Here are few cosmetic issues which I found:
1) Patch not getting applied cleanly - white space warning
Done.
2)
- List *usable_pathkeys = NIL; + List *useful_pathkeys_list = NIL; /* List of all pathkeys */Code alignment is not correct with other declared variables.
Incorporated the change in the patch.
3)
+ { + PathKey *pathkey; + List *pathkeys; + + pathkey = make_canonical_pathkey(root, cur_ec, + linitial_oid(cur_ec->ec_opfamilies), + BTLessStrategyNumber, + false); + pathkeys = list_make1(pathkey); + useful_pathkeys_list = lappend(useful_pathkeys_list, pathkeys); + }Code alignment need to fix at make_canonical_pathkey().
Incorporated the change in the patch.
I have also removed the TODO item in the prologue of this function, since
none has objected to externalization of make_canonical_pathkeys till now
and it's not expected to be part of the final commit.
4)
I don't understand the meaning of following added testcase into
postgres_fdw.+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -171,20 +171,35 @@ SELECT COUNT(*) FROM ft1 t1; -- join two tables SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10; -- subquery SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1; -- subquery+MAX SELECT * FROM ft1 t1 WHERE t1.c3 = (SELECT MAX(c3) FROM ft2 t2) ORDER BY c1; -- used in CTE WITH t1 AS (SELECT * FROM ft1 WHERE c1 <= 10) SELECT t2.c1, t2.c2, t2.c3, t2.c4 FROM t1, ft2 t2 WHERE t1.c1 = t2.c1 ORDER BY t1.c1; -- fixed values SELECT 'fixed', NULL FROM ft1 t1 WHERE c1 = 1; +-- getting data sorted from the foreign table for merge join +-- Since we are interested in merge join, disable other joins +SET enable_hashjoin TO false; +SET enable_nestloop TO false; +-- inner join, expressions in the clauses appear in the equivalence class list +EXPLAIN (VERBOSE, COSTS false) + SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10; +SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10; +-- outer join, expression in the clauses do not appear in equivalence class list +-- but no output change as compared to the previous query +EXPLAIN (VERBOSE, COSTS false) + SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10; +SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10; +SET enable_hashjoin TO true; +SET enable_nestloop TO true;Because, I removed the code changes of the patch and then I run the test
seem like it has nothing to do with the code changes. Above set of test
giving
same result with/without patch.Am I missing something ?
Actually, the output of merge join is always ordered by the pathkeys used
for merge join. That routed through LIMIT node remains ordered. So, we
actually do not need ORDER BY t1.c1 clause in the above queries. Without
that clause, the tests will show difference output with and without patch.
I have changed the attached patch accordingly.
Apart from this I debugged the patch for each scenario (query pathkeys and
pathkeys arising out of the equivalence classes) and so far patch looks
good
to me.
Thanks.
Attaching update version of patch by fixing the cosmetic changes.
Attached version of patch contains your changes.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachments:
pg_sort_all_pd_v5.patchtext/x-diff; charset=US-ASCII; name=pg_sort_all_pd_v5.patchDownload+396-42
Thanks Ashutosh.
Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
looks good to me.
On Fri, Nov 27, 2015 at 3:02 PM, Ashutosh Bapat <
ashutosh.bapat@enterprisedb.com> wrote:
Thanks Rushabh for your review and comments.
On Thu, Nov 26, 2015 at 5:39 PM, Rushabh Lathia <rushabh.lathia@gmail.com>
wrote:Hi Ashutosh,
I reviewed your latest version of patch and over all the implementation
and other details look good to me.Here are few cosmetic issues which I found:
1) Patch not getting applied cleanly - white space warning
Done.
2)
- List *usable_pathkeys = NIL; + List *useful_pathkeys_list = NIL; /* List of all pathkeys */Code alignment is not correct with other declared variables.
Incorporated the change in the patch.
3)
+ { + PathKey *pathkey; + List *pathkeys; + + pathkey = make_canonical_pathkey(root, cur_ec, + linitial_oid(cur_ec->ec_opfamilies), + BTLessStrategyNumber, + false); + pathkeys = list_make1(pathkey); + useful_pathkeys_list = lappend(useful_pathkeys_list, pathkeys); + }Code alignment need to fix at make_canonical_pathkey().
Incorporated the change in the patch.
I have also removed the TODO item in the prologue of this function, since
none has objected to externalization of make_canonical_pathkeys till now
and it's not expected to be part of the final commit.4)
I don't understand the meaning of following added testcase into
postgres_fdw.+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -171,20 +171,35 @@ SELECT COUNT(*) FROM ft1 t1; -- join two tables SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10; -- subquery SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10) ORDER BY c1; -- subquery+MAX SELECT * FROM ft1 t1 WHERE t1.c3 = (SELECT MAX(c3) FROM ft2 t2) ORDER BY c1; -- used in CTE WITH t1 AS (SELECT * FROM ft1 WHERE c1 <= 10) SELECT t2.c1, t2.c2, t2.c3, t2.c4 FROM t1, ft2 t2 WHERE t1.c1 = t2.c1 ORDER BY t1.c1; -- fixed values SELECT 'fixed', NULL FROM ft1 t1 WHERE c1 = 1; +-- getting data sorted from the foreign table for merge join +-- Since we are interested in merge join, disable other joins +SET enable_hashjoin TO false; +SET enable_nestloop TO false; +-- inner join, expressions in the clauses appear in the equivalence class list +EXPLAIN (VERBOSE, COSTS false) + SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10; +SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10; +-- outer join, expression in the clauses do not appear in equivalence class list +-- but no output change as compared to the previous query +EXPLAIN (VERBOSE, COSTS false) + SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10; +SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10; +SET enable_hashjoin TO true; +SET enable_nestloop TO true;Because, I removed the code changes of the patch and then I run the test
seem like it has nothing to do with the code changes. Above set of test
giving
same result with/without patch.Am I missing something ?
Actually, the output of merge join is always ordered by the pathkeys used
for merge join. That routed through LIMIT node remains ordered. So, we
actually do not need ORDER BY t1.c1 clause in the above queries. Without
that clause, the tests will show difference output with and without patch.
I have changed the attached patch accordingly.Apart from this I debugged the patch for each scenario (query pathkeys and
pathkeys arising out of the equivalence classes) and so far patch looks
good
to me.Thanks.
Attaching update version of patch by fixing the cosmetic changes.
Attached version of patch contains your changes.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Rushabh Lathia
On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
Thanks Ashutosh.
Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
looks good to me.
This patch needs a rebase.
It's not going to work to say this is a patch proposed for commit when
it's still got a TODO comment in it that obviously needs to be
changed. And the formatting of that long comment is pretty weird,
too, and not consistent with other functions in that same file (e.g.
get_remote_estimate, ec_member_matches_foreign, create_cursor).
Aside from that, I think before we commit this, somebody should do
some testing that demonstrates that this is actually a good idea. Not
as part of the test case set for this patch, but just in general.
Merge joins are typically going to be relevant for large tables, but
the examples in the regression tests are necessarily tiny. I'd like
to see some sample data and some sample queries that get appreciably
faster with this code. If we can't find any, we don't need the code.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Dec 9, 2015 at 12:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia <rushabh.lathia@gmail.com>
wrote:Thanks Ashutosh.
Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
looks good to me.This patch needs a rebase.
Done.
It's not going to work to say this is a patch proposed for commit when
it's still got a TODO comment in it that obviously needs to be
changed. And the formatting of that long comment is pretty weird,
too, and not consistent with other functions in that same file (e.g.
get_remote_estimate, ec_member_matches_foreign, create_cursor).
The TODO was present in v4 but not in v5 and is not present in v6 attached
here.. Formatted comment according estimate_path_cost_size(),
convert_prep_stmt_params().
Aside from that, I think before we commit this, somebody should do
some testing that demonstrates that this is actually a good idea. Not
as part of the test case set for this patch, but just in general.
Merge joins are typically going to be relevant for large tables, but
the examples in the regression tests are necessarily tiny. I'd like
to see some sample data and some sample queries that get appreciably
faster with this code. If we can't find any, we don't need the code.
I tested the patch on my laptop with two types of queries, a join between
two foreign tables on different foreign servers (pointing to the same self
server) and a join between one foreign and one local table. The foreign
tables and servers are created using sort_pd_setup.sql attached. Foreign
tables pointed to table with index useful for join clause. Both the joining
tables had 10M rows. The execution time of query was measured for 100 runs
and average and standard deviation were calculated (using function
query_execution_stats() in script sort_pd.sql) and are presented below.
1. Query between foreign tables
SELECT ft1.val, ft2.val FROM ft1 join ft2 on (ft1.val = ft2.val)
Plan and timings without patch
EXPLAIN (VERBOSE, ANALYSE) :query ;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=508510.02..1129945.94 rows=999995 width=8) (actual
time=33803.826..82416.342 rows=10000000 loops=1)
Output: ft1.val, ft2.val
Hash Cond: (ft1.val = ft2.val)
-> Foreign Scan on public.ft1 (cost=100.00..344347.31 rows=9999977
width=4) (actual time=0.624..28531.803 rows=10000000 loops=1)
Output: ft1.val
Remote SQL: SELECT val FROM public.lt
-> Hash (cost=344347.31..344347.31 rows=9999977 width=4) (actual
time=33258.025..33258.025 rows=10000000 loops=1)
Output: ft2.val
Buckets: 131072 Batches: 256 Memory Usage: 2400kB
-> Foreign Scan on public.ft2 (cost=100.00..344347.31
rows=9999977 width=4) (actual time=22.171..28134.970 rows=10000000 loops=1)
Output: ft2.val
Remote SQL: SELECT val FROM public.lt
Planning time: 33.155 ms
Execution time: 82914.607 ms
(14 rows)
avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
78750.95487 | 2911.51825687913 | 74314.886 | 89358.464
Plan and timing with patch
EXPLAIN (VERBOSE, ANALYSE) :query ;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=200.86..1183070.86 rows=10000000 width=8) (actual
time=1.776..73140.219 rows=10000000 loops=1)
Output: ft1.val, ft2.val
Merge Cond: (ft1.val = ft2.val)
-> Foreign Scan on public.ft1 (cost=100.43..504035.43 rows=10000000
width=4) (actual time=0.937..30422.457 rows=10000000 loops=1)
Output: ft1.val, ft1.val2
Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
-> Materialize (cost=100.43..529035.43 rows=10000000 width=4) (actual
time=0.826..33448.822 rows=10000000 loops=1)
Output: ft2.val, ft2.val2
-> Foreign Scan on public.ft2 (cost=100.43..504035.43
rows=10000000 width=4) (actual time=0.818..31035.362 rows=10000000 loops=1)
Output: ft2.val, ft2.val2
Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
Planning time: 163.161 ms
Execution time: 73654.106 ms
(13 rows)
avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
71881.15916 | 819.091605498189 | 70197.312 | 74653.314
It can be observed that the with the patch, merge join strategy is used
instead of hash join and the execution time reduces by approx 9%. A desired
effect is that the deviation in the execution time has reduced heavily
(almost by 75%).
2. Join between local and foreign table
Without patch the plan and timings are
EXPLAIN (VERBOSE, ANALYSE) :query ;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=308410.66..1019846.69 rows=9999970 width=8) (actual
time=7674.681..47767.136 rows=10000000 loops=1)
Output: lt.val, ft1.val
Hash Cond: (ft1.val = lt.val)
-> Foreign Scan on public.ft1 (cost=100.00..344347.55 rows=9999985
width=4) (actual time=0.506..26679.980 rows=10000000 loops=1)
Output: ft1.val
Remote SQL: SELECT val FROM public.lt
-> Hash (cost=144247.85..144247.85 rows=9999985 width=4) (actual
time=7667.598..7667.598 rows=10000000 loops=1)
Output: lt.val
Buckets: 131072 Batches: 256 Memory Usage: 2400kB
-> Seq Scan on public.lt (cost=0.00..144247.85 rows=9999985
width=4) (actual time=0.018..2959.111 rows=10000000 loops=1)
Output: lt.val
Planning time: 8.668 ms
Execution time: 48209.365 ms
(13 rows)
SELECT avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time
FROM query_execution_stats(:'query', :num_samples);
avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
47246.46956 | 2579.42041949119 | 43603.411 | 56096.759
With the patch the plan and timings are
EXPLAIN (VERBOSE, ANALYSE) :query ;
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=155.01..957924.85 rows=9999970 width=8) (actual
time=0.592..45125.356 rows=10000000 loops=1)
Output: lt.val, ft1.val
Merge Cond: (ft1.val = lt.val)
-> Foreign Scan on public.ft1 (cost=100.43..504038.91 rows=9999985
width=4) (actual time=0.551..30526.048 rows=10000000 loops=1)
Output: ft1.val, ft1.val2
Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
-> Index Only Scan using i_lt_val on public.lt (cost=0.43..303939.21
rows=9999985 width=4) (actual time=0.032..6192.406 rows=10000000 loops=1)
Output: lt.val
Heap Fetches: 10000000
Planning time: 9.043 ms
Execution time: 45666.023 ms
(11 rows)
avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
42803.36105 | 166.874491432755 | 42321.314 | 43316.902
Again observe that with the patch, merge join is used instead of hash join
and timing reduces by approx 9%. Again the deviation in execution reduces
heavily (almost by 75%). There is increase in planning time with the patch
owing to firing EXPLAIN on the foreign server.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On Thu, Dec 17, 2015 at 3:32 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
On Wed, Dec 9, 2015 at 12:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia <rushabh.lathia@gmail.com>
wrote:Thanks Ashutosh.
Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
looks good to me.This patch needs a rebase.
Done.
Thanks.
It's not going to work to say this is a patch proposed for commit when
it's still got a TODO comment in it that obviously needs to be
changed. And the formatting of that long comment is pretty weird,
too, and not consistent with other functions in that same file (e.g.
get_remote_estimate, ec_member_matches_foreign, create_cursor).The TODO was present in v4 but not in v5 and is not present in v6 attached
here.. Formatted comment according estimate_path_cost_size(),
convert_prep_stmt_params().
Hrm, I must have been looking at the wrong version somehow. Sorry about that.
Aside from that, I think before we commit this, somebody should do
some testing that demonstrates that this is actually a good idea. Not
as part of the test case set for this patch, but just in general.
Merge joins are typically going to be relevant for large tables, but
the examples in the regression tests are necessarily tiny. I'd like
to see some sample data and some sample queries that get appreciably
faster with this code. If we can't find any, we don't need the code.I tested the patch on my laptop with two types of queries, a join between
two foreign tables on different foreign servers (pointing to the same self
server) and a join between one foreign and one local table. The foreign
tables and servers are created using sort_pd_setup.sql attached. Foreign
tables pointed to table with index useful for join clause. Both the joining
tables had 10M rows. The execution time of query was measured for 100 runs
and average and standard deviation were calculated (using function
query_execution_stats() in script sort_pd.sql) and are presented below.
OK, cool.
I went over this patch in some detail today and did a lot of cosmetic
cleanup. The results are attached. I'm fairly happy with this
version, but let me know what you think. Of course, feedback from
others is more than welcome also.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company