Partition-wise aggregation/grouping
Hi all,
Declarative partitioning is supported in PostgreSQL 10, and work is already in
progress to support partition-wise joins. Here is a proposal for partition-wise
aggregation/grouping. Our initial performance measurements show roughly a 7x
improvement when the partitions are on foreign servers and approximately a 15%
improvement when the partitions are local.
Partition-wise aggregation/grouping computes aggregates for each partition
separately. If the GROUP BY clause contains the partition key, all the rows
belonging to a given group come from one partition, so the aggregates can be
computed completely within each partition. Otherwise, partial aggregates
computed for each partition are combined across the partitions to produce the
final aggregates. This technique improves performance because:
i. When partitions are located on a foreign server, we can push down the
aggregate to the foreign server.
ii. If the hash table for each partition fits in memory, but the one for the
whole relation does not, each partition-wise aggregate can use an in-memory
hash table.
iii. Aggregation at the level of partitions can exploit properties of the
partitions, such as their indexes and storage.
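To make the two cases above concrete, here is a minimal sketch in C of
count(*) computed per partition and then combined. It is only an illustration
and not code from the patch: Partition, NGROUPS, count_partition(),
combine_counts() and groups_are_disjoint() are all hypothetical names, and
group keys are assumed to be small integers.

```c
#include <assert.h>
#include <stddef.h>

#define NGROUPS 4	/* hypothetical: group keys are ints in [0, NGROUPS) */

/* One "partition": an array of group-key values, one per row. */
typedef struct Partition
{
	const int  *keys;
	size_t		nrows;
} Partition;

/* Aggregate a single partition: counts[k] = number of rows with key k. */
static void
count_partition(const Partition *part, long counts[NGROUPS])
{
	for (int k = 0; k < NGROUPS; k++)
		counts[k] = 0;
	for (size_t i = 0; i < part->nrows; i++)
	{
		assert(part->keys[i] >= 0 && part->keys[i] < NGROUPS);
		counts[part->keys[i]]++;
	}
}

/*
 * Combine the per-partition counts into the final counts.  If the grouping
 * column is the partition key, at most one partition contributes to each
 * group, so the addition degenerates into a plain concatenation of the
 * per-partition results.  Otherwise this is the "combine partials" step.
 */
static void
combine_counts(const Partition *parts, size_t nparts, long total[NGROUPS])
{
	long		partial[NGROUPS];

	for (int k = 0; k < NGROUPS; k++)
		total[k] = 0;
	for (size_t p = 0; p < nparts; p++)
	{
		count_partition(&parts[p], partial);
		for (int k = 0; k < NGROUPS; k++)
			total[k] += partial[k];
	}
}

/* Returns 1 if no group key occurs in more than one partition, else 0. */
static int
groups_are_disjoint(const Partition *parts, size_t nparts)
{
	int			owner[NGROUPS];	/* first partition seen for key k, or -1 */

	for (int k = 0; k < NGROUPS; k++)
		owner[k] = -1;
	for (size_t p = 0; p < nparts; p++)
	{
		for (size_t i = 0; i < parts[p].nrows; i++)
		{
			int			k = parts[p].keys[i];

			if (owner[k] == -1)
				owner[k] = (int) p;
			else if (owner[k] != (int) p)
				return 0;
		}
	}
	return 1;
}
```

When groups_are_disjoint() holds, which is what grouping by the partition key
guarantees, each per-partition result is already final and no cross-partition
combine is needed.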
Attached is an experimental patch for this, based on the partition-wise join
patches posted in [1]. The patch currently implements partition-wise
aggregation only when the GROUP BY clause contains the partitioning key. The
query below, on a partitioned table with 3 partitions of 1M rows each and
producing 30 groups in total, showed a 15% improvement over
non-partition-wise aggregation. The same query showed a 7x improvement when
the partitions were located on foreign servers.
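As a rough illustration of the condition under which the patch applies, here
is a simplified sketch of the check. In the attached patch,
have_grouping_by_partkey() compares expression nodes and, judging by its
comments, requires the partition keys to be the leading GROUP BY expressions.
The sketch below mimics that with plain column names;
group_by_leads_with_partkeys() and its parameters are hypothetical names, not
part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if the first nkeys GROUP BY columns are exactly the partition
 * key columns, in order; returns 0 otherwise.
 */
static int
group_by_leads_with_partkeys(const char *const *group_cols, size_t ngroup,
							 const char *const *part_keys, size_t nkeys)
{
	size_t		matched = 0;

	assert(nkeys > 0);			/* the relation must be partitioned */

	for (size_t i = 0; i < ngroup && matched < nkeys; i++)
	{
		/* The i'th grouping column must be the next partition key. */
		if (strcmp(group_cols[i], part_keys[matched]) != 0)
			return 0;
		matched++;
	}

	/* True only if every partition key was matched. */
	return matched >= nkeys;
}
```

With a table partitioned on (a), GROUP BY a and GROUP BY a, b pass this check,
while GROUP BY b and GROUP BY b, a do not.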
Here is the sample plan:
postgres=# set enable_partition_wise_agg to true;
SET
postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Append  (cost=5100.00..61518.90 rows=30 width=12) (actual time=324.837..944.804 rows=30 loops=1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=324.837..324.838 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p1 plt1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=309.954..309.956 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p2 plt1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=310.002..310.004 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p3 plt1)
 Planning time: 0.370 ms
 Execution time: 945.384 ms
(9 rows)
postgres=# set enable_partition_wise_agg to false;
SET
postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=121518.01..121518.31 rows=30 width=12) (actual time=6498.452..6498.459 rows=30 loops=1)
   Group Key: plt1.a
   ->  Append  (cost=0.00..106518.00 rows=3000001 width=4) (actual time=0.595..5769.592 rows=3000000 loops=1)
         ->  Seq Scan on plt1  (cost=0.00..0.00 rows=1 width=4) (actual time=0.007..0.007 rows=0 loops=1)
         ->  Foreign Scan on fplt1_p1  (cost=100.00..35506.00 rows=1000000 width=4) (actual time=0.587..1844.506 rows=1000000 loops=1)
         ->  Foreign Scan on fplt1_p2  (cost=100.00..35506.00 rows=1000000 width=4) (actual time=0.384..1839.633 rows=1000000 loops=1)
         ->  Foreign Scan on fplt1_p3  (cost=100.00..35506.00 rows=1000000 width=4) (actual time=0.402..1876.505 rows=1000000 loops=1)
 Planning time: 0.251 ms
 Execution time: 6499.018 ms
(9 rows)
The patch needs a lot of improvement, including:
1. Support for partial partition-wise aggregation
2. Estimating the number of groups for every partition
3. Estimating the cost of partition-wise aggregation based on sample
partitions, similar to partition-wise join
and much more.
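On item 2: the attached patch currently divides the parent's group estimate
by the number of partitions. The sketch below shows that calculation in
isolation. clamp_estimate() is a simplified, hypothetical stand-in for the
planner's clamp_row_est() (it rounds half up and never returns less than 1),
and child_group_estimate() is likewise an illustrative name.

```c
#include <assert.h>

/* Simplified stand-in for clamp_row_est(): round, and never go below 1. */
static double
clamp_estimate(double nrows)
{
	if (nrows <= 1.0)
		return 1.0;
	return (double) (long) (nrows + 0.5);
}

/*
 * Naive per-partition group estimate: assume the parent's groups are spread
 * evenly across all partitions.
 */
static double
child_group_estimate(double parent_groups, int nparts)
{
	assert(nparts > 0);
	return clamp_estimate(parent_groups / nparts);
}
```

For the sample query, 30 groups over 3 partitions gives 10 groups per
partition, which matches rows=10 on each Foreign Scan in the plan above. The
even-spread assumption will not hold for skewed data, which is presumably why
a per-partition estimate is on the list.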
In order to support partial aggregation on foreign partitions, we need support
for fetching partially aggregated results from the foreign server. That can be
handled as a separate follow-on patch.
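To illustrate why a partially aggregated result is needed rather than the
finished aggregate, consider avg(): the per-partition averages cannot be
combined directly, but a (sum, count) state can. The sketch below is only an
illustration under that assumption; AvgState and the avg_* functions are
hypothetical names and do not reflect PostgreSQL's actual transition state
for avg().

```c
#include <assert.h>

/* Partial state for avg(): enough to combine across partitions. */
typedef struct AvgState
{
	double		sum;
	long		count;
} AvgState;

/* Build the partial state for one partition's values. */
static AvgState
avg_accumulate(const double *values, long n)
{
	AvgState	state = {0.0, 0};

	for (long i = 0; i < n; i++)
	{
		state.sum += values[i];
		state.count++;
	}
	return state;
}

/* Combine two partial states; this is what the local server would do. */
static AvgState
avg_combine(AvgState a, AvgState b)
{
	AvgState	result = {a.sum + b.sum, a.count + b.count};

	return result;
}

/* Produce the final average from a combined state. */
static double
avg_finalize(AvgState state)
{
	assert(state.count > 0);
	return state.sum / (double) state.count;
}
```

For example, partitions holding {1, 2, 3} and {10} have averages 2 and 10,
whose mean is 6, while the true average over all four rows is 4. Only the
combined (sum, count) state gives the right answer, so the foreign server has
to return that state, not the finished average.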
Though there is a lot of work still to be done, I would like to get
suggestions/opinions from hackers.
I would like to thank Ashutosh Bapat for providing a draft patch and helping
me off-list on this feature while he is busy working on the partition-wise
join feature.
[1]: /messages/by-id/CAFjFpRcbY2QN3cfeMTzVEoyF5Lfku-ijyNR=PbXj1e=9a=qMoQ@mail.gmail.com
Thanks
--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Attachments:
pg_partwise_agg_WIP.patch (application/x-download)
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 6ef1e48..8388ea7 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1064,6 +1064,7 @@ deparseFromExpr(List *quals, deparse_expr_cxt *context)
/* For upper relations, scanrel must be either a joinrel or a baserel */
Assert(context->foreignrel->reloptkind != RELOPT_UPPER_REL ||
IS_JOIN_REL(scanrel) ||
+ scanrel->reloptkind == RELOPT_OTHER_MEMBER_REL ||
scanrel->reloptkind == RELOPT_BASEREL);
/* Construct FROM clause */
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 059c5c3..d968832 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -7181,3 +7181,115 @@ AND ftoptions @> array['fetch_size=60000'];
(1 row)
ROLLBACK;
+-- Partition-wise aggregates with FDW
+CREATE TABLE plt1 (a int, b int, c text) PARTITION BY RANGE(a);
+CREATE TABLE plt1_p1 (a int, b int, c text);
+CREATE TABLE plt1_p2 (a int, b int, c text);
+CREATE TABLE plt1_p3 (a int, b int, c text);
+INSERT INTO plt1_p1 SELECT i % 30, i % 50, to_char(i/30, 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 10;
+INSERT INTO plt1_p2 SELECT i % 30, i % 50, to_char(i/30, 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 20 and (i % 30) >= 10;
+INSERT INTO plt1_p3 SELECT i % 30, i % 50, to_char(i/30, 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 30 and (i % 30) >= 20;
+-- Create foreign partitions
+CREATE FOREIGN TABLE fplt1_p1 PARTITION OF plt1 FOR VALUES FROM (0) TO (10) SERVER loopback OPTIONS (table_name 'plt1_p1');
+CREATE FOREIGN TABLE fplt1_p2 PARTITION OF plt1 FOR VALUES FROM (10) TO (20) SERVER loopback OPTIONS (table_name 'plt1_p2');;
+CREATE FOREIGN TABLE fplt1_p3 PARTITION OF plt1 FOR VALUES FROM (20) TO (30) SERVER loopback OPTIONS (table_name 'plt1_p3');;
+ANALYZE plt1;
+ANALYZE fplt1_p1;
+ANALYZE fplt1_p2;
+ANALYZE fplt1_p3;
+-- When GROUP BY clause matches with PARTITION KEY.
+-- Plan when partition-wise-agg is disabled
+SET enable_partition_wise_agg TO false;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT a, sum(b), min(b), count(*) FROM plt1 GROUP BY a ORDER BY 1;
+ QUERY PLAN
+-----------------------------------------------------------------
+ Sort
+ Output: plt1.a, (sum(plt1.b)), (min(plt1.b)), (count(*))
+ Sort Key: plt1.a
+ -> HashAggregate
+ Output: plt1.a, sum(plt1.b), min(plt1.b), count(*)
+ Group Key: plt1.a
+ -> Append
+ -> Seq Scan on public.plt1
+ Output: plt1.a, plt1.b
+ -> Foreign Scan on public.fplt1_p1
+ Output: fplt1_p1.a, fplt1_p1.b
+ Remote SQL: SELECT a, b FROM public.plt1_p1
+ -> Foreign Scan on public.fplt1_p2
+ Output: fplt1_p2.a, fplt1_p2.b
+ Remote SQL: SELECT a, b FROM public.plt1_p2
+ -> Foreign Scan on public.fplt1_p3
+ Output: fplt1_p3.a, fplt1_p3.b
+ Remote SQL: SELECT a, b FROM public.plt1_p3
+(18 rows)
+
+-- Plan when partition-wise-agg is enabled
+SET enable_partition_wise_agg TO true;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT a, sum(b), min(b), count(*) FROM plt1 GROUP BY a ORDER BY 1;
+NOTICE: partition-wise grouping is possible.
+ QUERY PLAN
+---------------------------------------------------------------------------------------------
+ Sort
+ Output: fplt1_p1.a, (sum(fplt1_p1.b)), (min(fplt1_p1.b)), (count(*))
+ Sort Key: fplt1_p1.a
+ -> Append
+ -> Foreign Scan
+ Output: fplt1_p1.a, (sum(fplt1_p1.b)), (min(fplt1_p1.b)), (count(*))
+ Relations: Aggregate on (public.fplt1_p1 plt1)
+ Remote SQL: SELECT a, sum(b), min(b), count(*) FROM public.plt1_p1 GROUP BY a
+ -> Foreign Scan
+ Output: fplt1_p2.a, (sum(fplt1_p2.b)), (min(fplt1_p2.b)), (count(*))
+ Relations: Aggregate on (public.fplt1_p2 plt1)
+ Remote SQL: SELECT a, sum(b), min(b), count(*) FROM public.plt1_p2 GROUP BY a
+ -> Foreign Scan
+ Output: fplt1_p3.a, (sum(fplt1_p3.b)), (min(fplt1_p3.b)), (count(*))
+ Relations: Aggregate on (public.fplt1_p3 plt1)
+ Remote SQL: SELECT a, sum(b), min(b), count(*) FROM public.plt1_p3 GROUP BY a
+(16 rows)
+
+SELECT a, sum(b), min(b), count(*) FROM plt1 GROUP BY a ORDER BY 1;
+NOTICE: partition-wise grouping is possible.
+ a | sum | min | count
+----+------+-----+-------
+ 0 | 2000 | 0 | 100
+ 1 | 2100 | 1 | 100
+ 2 | 2200 | 2 | 100
+ 3 | 2300 | 3 | 100
+ 4 | 2400 | 4 | 100
+ 5 | 2500 | 5 | 100
+ 6 | 2600 | 6 | 100
+ 7 | 2700 | 7 | 100
+ 8 | 2800 | 8 | 100
+ 9 | 2900 | 9 | 100
+ 10 | 2000 | 0 | 100
+ 11 | 2100 | 1 | 100
+ 12 | 2200 | 2 | 100
+ 13 | 2300 | 3 | 100
+ 14 | 2400 | 4 | 100
+ 15 | 2500 | 5 | 100
+ 16 | 2600 | 6 | 100
+ 17 | 2700 | 7 | 100
+ 18 | 2800 | 8 | 100
+ 19 | 2900 | 9 | 100
+ 20 | 2000 | 0 | 100
+ 21 | 2100 | 1 | 100
+ 22 | 2200 | 2 | 100
+ 23 | 2300 | 3 | 100
+ 24 | 2400 | 4 | 100
+ 25 | 2500 | 5 | 100
+ 26 | 2600 | 6 | 100
+ 27 | 2700 | 7 | 100
+ 28 | 2800 | 8 | 100
+ 29 | 2900 | 9 | 100
+(30 rows)
+
+-- Clean-up
+DROP FOREIGN TABLE fplt1_p3;
+DROP FOREIGN TABLE fplt1_p2;
+DROP FOREIGN TABLE fplt1_p1;
+DROP TABLE plt1_p3;
+DROP TABLE plt1_p2;
+DROP TABLE plt1_p1;
+DROP TABLE plt1;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 22acba8..630c374 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -349,7 +349,8 @@ static bool postgresRecheckForeignScan(ForeignScanState *node,
static void postgresGetForeignUpperPaths(PlannerInfo *root,
UpperRelationKind stage,
RelOptInfo *input_rel,
- RelOptInfo *output_rel);
+ RelOptInfo *output_rel,
+ PathTarget *target);
/*
* Helper functions
@@ -415,7 +416,8 @@ static void add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
Path *epq_path);
static void add_foreign_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
- RelOptInfo *grouped_rel);
+ RelOptInfo *grouped_rel,
+ PathTarget *target);
/*
@@ -2688,7 +2690,7 @@ estimate_path_cost_size(PlannerInfo *root,
else if (foreignrel->reloptkind == RELOPT_UPPER_REL)
{
PgFdwRelationInfo *ofpinfo;
- PathTarget *ptarget = root->upper_targets[UPPERREL_GROUP_AGG];
+ PathTarget *ptarget = fpinfo->grouped_target;
AggClauseCosts aggcosts;
double input_rows;
int numGroupCols;
@@ -4536,7 +4538,7 @@ foreign_grouping_ok(PlannerInfo *root, RelOptInfo *grouped_rel)
* different from those in the plan's targetlist. Use a copy of path
* target to record the new sortgrouprefs.
*/
- grouping_target = copy_pathtarget(root->upper_targets[UPPERREL_GROUP_AGG]);
+ grouping_target = copy_pathtarget(fpinfo->grouped_target);
/*
* Evaluate grouping targets and check whether they are safe to push down
@@ -4715,7 +4717,8 @@ foreign_grouping_ok(PlannerInfo *root, RelOptInfo *grouped_rel)
*/
static void
postgresGetForeignUpperPaths(PlannerInfo *root, UpperRelationKind stage,
- RelOptInfo *input_rel, RelOptInfo *output_rel)
+ RelOptInfo *input_rel, RelOptInfo *output_rel,
+ PathTarget *target)
{
PgFdwRelationInfo *fpinfo;
@@ -4735,7 +4738,7 @@ postgresGetForeignUpperPaths(PlannerInfo *root, UpperRelationKind stage,
fpinfo->pushdown_safe = false;
output_rel->fdw_private = fpinfo;
- add_foreign_grouping_paths(root, input_rel, output_rel);
+ add_foreign_grouping_paths(root, input_rel, output_rel, target);
}
/*
@@ -4747,13 +4750,12 @@ postgresGetForeignUpperPaths(PlannerInfo *root, UpperRelationKind stage,
*/
static void
add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
- RelOptInfo *grouped_rel)
+ RelOptInfo *grouped_rel, PathTarget *target)
{
Query *parse = root->parse;
PgFdwRelationInfo *ifpinfo = input_rel->fdw_private;
PgFdwRelationInfo *fpinfo = grouped_rel->fdw_private;
ForeignPath *grouppath;
- PathTarget *grouping_target;
double rows;
int width;
Cost startup_cost;
@@ -4764,7 +4766,8 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
!root->hasHavingQual)
return;
- grouping_target = root->upper_targets[UPPERREL_GROUP_AGG];
+ /* Store passed-in target in fpinfo for later use */
+ fpinfo->grouped_target = target;
/* save the input_rel as outerrel in fpinfo */
fpinfo->outerrel = input_rel;
@@ -4795,7 +4798,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
/* Create and add foreign path to the grouping relation. */
grouppath = create_foreignscan_path(root,
grouped_rel,
- grouping_target,
+ target,
rows,
startup_cost,
total_cost,
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 57dbb79..99cecb7 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -95,6 +95,7 @@ typedef struct PgFdwRelationInfo
/* Grouping information */
List *grouped_tlist;
+ PathTarget *grouped_target;
/* Subquery information */
bool make_outerrel_subquery; /* do we deparse outerrel as a
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 8f3edc1..f02ec8a 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -1706,3 +1706,47 @@ WHERE ftrelid = 'table30000'::regclass
AND ftoptions @> array['fetch_size=60000'];
ROLLBACK;
+
+
+-- Partition-wise aggregates with FDW
+CREATE TABLE plt1 (a int, b int, c text) PARTITION BY RANGE(a);
+
+CREATE TABLE plt1_p1 (a int, b int, c text);
+CREATE TABLE plt1_p2 (a int, b int, c text);
+CREATE TABLE plt1_p3 (a int, b int, c text);
+
+INSERT INTO plt1_p1 SELECT i % 30, i % 50, to_char(i/30, 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 10;
+INSERT INTO plt1_p2 SELECT i % 30, i % 50, to_char(i/30, 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 20 and (i % 30) >= 10;
+INSERT INTO plt1_p3 SELECT i % 30, i % 50, to_char(i/30, 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 30 and (i % 30) >= 20;
+
+-- Create foreign partitions
+CREATE FOREIGN TABLE fplt1_p1 PARTITION OF plt1 FOR VALUES FROM (0) TO (10) SERVER loopback OPTIONS (table_name 'plt1_p1');
+CREATE FOREIGN TABLE fplt1_p2 PARTITION OF plt1 FOR VALUES FROM (10) TO (20) SERVER loopback OPTIONS (table_name 'plt1_p2');;
+CREATE FOREIGN TABLE fplt1_p3 PARTITION OF plt1 FOR VALUES FROM (20) TO (30) SERVER loopback OPTIONS (table_name 'plt1_p3');;
+
+ANALYZE plt1;
+ANALYZE fplt1_p1;
+ANALYZE fplt1_p2;
+ANALYZE fplt1_p3;
+
+-- When GROUP BY clause matches with PARTITION KEY.
+-- Plan when partition-wise-agg is disabled
+SET enable_partition_wise_agg TO false;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT a, sum(b), min(b), count(*) FROM plt1 GROUP BY a ORDER BY 1;
+
+-- Plan when partition-wise-agg is enabled
+SET enable_partition_wise_agg TO true;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT a, sum(b), min(b), count(*) FROM plt1 GROUP BY a ORDER BY 1;
+SELECT a, sum(b), min(b), count(*) FROM plt1 GROUP BY a ORDER BY 1;
+
+
+-- Clean-up
+DROP FOREIGN TABLE fplt1_p3;
+DROP FOREIGN TABLE fplt1_p2;
+DROP FOREIGN TABLE fplt1_p1;
+DROP TABLE plt1_p3;
+DROP TABLE plt1_p2;
+DROP TABLE plt1_p1;
+DROP TABLE plt1;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 8143f80..0ab5c56 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -130,8 +130,6 @@ static void subquery_push_qual(Query *subquery,
static void recurse_push_qual(Node *setOp, Query *topquery,
RangeTblEntry *rte, Index rti, Node *qual);
static void remove_unused_subquery_outputs(Query *subquery, RelOptInfo *rel);
-static void add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
- List *live_childrels);
/*
@@ -1338,7 +1336,7 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
* parameterization or ordering. Similarly it collects partial paths from
* non-dummy children to create partial append paths.
*/
-static void
+void
add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
List *live_childrels)
{
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 220c81c..b2edaca 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -128,6 +128,7 @@ bool enable_mergejoin = true;
bool enable_hashjoin = true;
bool enable_gathermerge = true;
bool enable_partition_wise_join = false;
+bool enable_partition_wise_agg = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 0aae4ca..929e0a6 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1610,6 +1610,7 @@ create_sort_plan(PlannerInfo *root, SortPath *best_path, int flags)
{
Sort *plan;
Plan *subplan;
+ Relids relids;
/*
* We don't want any excess columns in the sorted tuples, so request a
@@ -1619,7 +1620,12 @@ create_sort_plan(PlannerInfo *root, SortPath *best_path, int flags)
subplan = create_plan_recurse(root, best_path->subpath,
flags | CP_SMALL_TLIST);
- plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys, NULL);
+ /*
+ * TODO: we need to fix something here. The "other" upper rels are not
+ * marked as "OTHER" rels and may not have relids.
+ */
+ relids = IS_OTHER_REL(best_path->subpath->parent) ? best_path->path.parent->relids : NULL;
+ plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys, relids);
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -3393,15 +3399,8 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
/* Copy foreign server OID; likewise, no need to make FDW do this */
scan_plan->fs_server = rel->serverid;
- /*
- * Likewise, copy the relids that are represented by this foreign scan. An
- * upper rel doesn't have relids set, but it covers all the base relations
- * participating in the underlying scan, so use root's all_baserels.
- */
- if (rel->reloptkind == RELOPT_UPPER_REL)
- scan_plan->fs_relids = root->all_baserels;
- else
- scan_plan->fs_relids = best_path->path.parent->relids;
+ /* Likewise, copy the relids from Path to Plan */
+ scan_plan->fs_relids = best_path->path.parent->relids;
/*
* If this is a foreign join, and to make it valid to push down we had to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 17fae4f..9f7db45 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -157,6 +157,15 @@ static PathTarget *make_sort_input_target(PlannerInfo *root,
bool *have_postponed_srfs);
static void adjust_paths_for_srfs(PlannerInfo *root, RelOptInfo *rel,
List *targets, List *targets_contain_srfs);
+static void try_partition_wise_grouping(PlannerInfo *root,
+ RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel,
+ PathTarget *target,
+ const AggClauseCosts *agg_costs,
+ List *rollup_lists,
+ List *rollup_groupclauses);
+static bool have_grouping_by_partkey(RelOptInfo *input_rel, PathTarget *target,
+ List *groupClause);
/*****************************************************************************
@@ -2067,7 +2076,8 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
if (final_rel->fdwroutine &&
final_rel->fdwroutine->GetForeignUpperPaths)
final_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_FINAL,
- current_rel, final_rel);
+ current_rel, final_rel,
+ NULL);
/* Let extensions possibly add some more paths */
if (create_upper_paths_hook)
@@ -3283,7 +3293,18 @@ create_grouping_paths(PlannerInfo *root,
ListCell *lc;
/* For now, do all work in the (GROUP_AGG, NULL) upperrel */
- grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL);
+ if (IS_OTHER_REL(input_rel))
+ {
+
+ /*
+ * TODO: We should mark these rels as "other upper" rels similar to
+ * "other" join and base relations.
+ */
+ grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG,
+ input_rel->relids);
+ }
+ else
+ grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL);
/*
* If the input relation is not parallel-safe, then the grouped relation
@@ -3303,6 +3324,9 @@ create_grouping_paths(PlannerInfo *root,
grouped_rel->useridiscurrent = input_rel->useridiscurrent;
grouped_rel->fdwroutine = input_rel->fdwroutine;
+ /* Copy input rel's relids to grouped rel */
+ grouped_rel->relids = input_rel->relids;
+
/*
* Check for degenerate grouping.
*/
@@ -3377,6 +3401,34 @@ create_grouping_paths(PlannerInfo *root,
rollup_groupclauses);
/*
+ * Number of groups estimated above is based on parent relation. However
+ * we need to estimate the number of groups for the child. For that we
+ * must know the number of partitions. Find that and devise new estimate
+ * for number of groups.
+ *
+ * FIXME: We might need to do this in get_number_of_groups() itself. But
+ * not sure at this time. Need to revise the logic.
+ */
+ if (IS_OTHER_REL(input_rel))
+ {
+ RelOptInfo *rel;
+
+ /* Find top-most parent rel */
+ if (IS_JOIN_REL(input_rel))
+ rel = find_join_rel(root, input_rel->top_parent_relids);
+ else
+ rel = find_base_rel(root,
+ bms_singleton_member(input_rel->top_parent_relids));
+
+ /*
+ * Divide estimated number of groups by number of children to get
+ * number of groups estimate for child rel.
+ */
+ if (rel->part_scheme->nparts > 0)
+ dNumGroups = clamp_row_est(dNumGroups / rel->part_scheme->nparts);
+ }
+
+ /*
* Determine whether it's possible to perform sort-based implementations
* of grouping. (Note that if groupClause is empty,
* grouping_is_sortable() is trivially true, and all the
@@ -3441,6 +3493,11 @@ create_grouping_paths(PlannerInfo *root,
/* Insufficient support for partial mode. */
try_parallel_aggregation = false;
}
+ else if (IS_OTHER_REL(input_rel))
+ {
+ /* TODO: enable parallel query for partition-wise grouping. */
+ try_parallel_aggregation = false;
+ }
else
{
/* Everything looks good. */
@@ -3855,13 +3912,21 @@ create_grouping_paths(PlannerInfo *root,
errdetail("Some of the datatypes only support hashing, while others only support sorting.")));
/*
+ * If input relation is partitioned check if we can perform partition-wise
+ * grouping and/or aggregation.
+ */
+ try_partition_wise_grouping(root, input_rel, grouped_rel, target,
+ agg_costs, rollup_lists, rollup_groupclauses);
+
+ /*
* If there is an FDW that's responsible for all baserels of the query,
* let it consider adding ForeignPaths.
*/
if (grouped_rel->fdwroutine &&
grouped_rel->fdwroutine->GetForeignUpperPaths)
grouped_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_GROUP_AGG,
- input_rel, grouped_rel);
+ input_rel, grouped_rel,
+ target);
/* Let extensions possibly add some more paths */
if (create_upper_paths_hook)
@@ -3885,6 +3950,191 @@ create_grouping_paths(PlannerInfo *root,
}
/*
+ * If the input relation is partitioned and the partition keys are leading
+ * group by clauses, each partition produces a different set of groups.
+ * Aggregates within each such group can be computed partition-wise. This
+ * might be optimal because of presence of suitable paths with pathkeys or
+ * because the hash tables for most of the partitions fit in the memory.
+ */
+static void
+try_partition_wise_grouping(PlannerInfo *root,
+ RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel,
+ PathTarget *target,
+ const AggClauseCosts *agg_costs,
+ List *rollup_lists,
+ List *rollup_groupclauses)
+{
+ Query *query = root->parse;
+ int nparts;
+ int cnt_parts;
+ PartitionScheme part_scheme = input_rel->part_scheme;
+ RelOptInfo **part_rels;
+ List *live_children = NIL;
+ PathTarget *scanjoin_target;
+ ListCell *lc;
+
+ /* Nothing to do, if user disabled partition-wise aggregation. */
+ if (!enable_partition_wise_agg)
+ return;
+
+ /* Do not handle grouping sets for now. */
+ if (rollup_groupclauses || rollup_lists)
+ return;
+
+ /* Nothing to do, if the input relation is not partitioned. */
+ if (!part_scheme)
+ return;
+
+ Assert(input_rel->part_rels);
+
+ /*
+ * TODO: for now do nothing if partition keys are not leading group by
+ * clauses. In general we may calculate partial aggregates for each
+ * partition and combine them.
+ */
+ if (!have_grouping_by_partkey(input_rel, target, query->groupClause))
+ return;
+
+ /* TODO: should be removed in final version */
+ elog(NOTICE, "partition-wise grouping is possible.");
+
+ nparts = part_scheme->nparts;
+ grouped_rel->part_scheme = input_rel->part_scheme;
+ part_rels = (RelOptInfo **) palloc(nparts * sizeof(RelOptInfo *));
+ grouped_rel->part_rels = part_rels;
+
+ /* Add paths for partition-wise aggregation/grouping. */
+ for (cnt_parts = 0; cnt_parts < nparts; cnt_parts++)
+ {
+ RelOptInfo *input_child_rel = input_rel->part_rels[cnt_parts];
+ PathTarget *child_target = copy_pathtarget(target);
+ List *appinfos = find_appinfos_by_relids(root,
+ input_child_rel->relids);
+
+ /*
+ * Now that there can be multiple grouping relations, if we have to
+ * manage those in the root, we need separate identifiers for those.
+ * What better identifier than the input relids themselves?
+ */
+ part_rels[cnt_parts] = fetch_upper_rel(root, UPPERREL_GROUP_AGG,
+ input_child_rel->relids);
+
+ /* Ignore empty children. They contribute nothing. */
+ if (IS_DUMMY_REL(input_child_rel))
+ {
+ mark_dummy_rel(part_rels[cnt_parts]);
+ continue;
+ }
+ else
+ live_children = lappend(live_children, part_rels[cnt_parts]);
+
+ /*
+ * Forcibly apply scan/join target to all the Paths for the scan/join
+ * rel.
+ *
+ * In principle we should re-run set_cheapest() here to identify the
+ * cheapest path, but it seems unlikely that adding the same tlist
+ * eval costs to all the paths would change that, so we don't bother.
+ * Instead, just assume that the cheapest-startup and cheapest-total
+ * paths remain so. (There should be no parameterized paths anymore,
+ * so we needn't worry about updating cheapest_parameterized_paths.)
+ */
+ scanjoin_target = copy_pathtarget(input_rel->cheapest_startup_path->pathtarget);
+ scanjoin_target->exprs = (List *) adjust_appendrel_attrs(root, (Node *) scanjoin_target->exprs,
+ appinfos);
+
+ foreach(lc, input_child_rel->pathlist)
+ {
+ Path *subpath = (Path *) lfirst(lc);
+ Path *path;
+
+ Assert(subpath->param_info == NULL);
+ path = apply_projection_to_path(root, input_child_rel,
+ subpath, scanjoin_target);
+ /* If we had to add a Result, path is different from subpath */
+ if (path != subpath)
+ {
+ lfirst(lc) = path;
+ if (subpath == input_child_rel->cheapest_startup_path)
+ input_child_rel->cheapest_startup_path = path;
+ if (subpath == input_child_rel->cheapest_total_path)
+ input_child_rel->cheapest_total_path = path;
+ }
+ }
+
+ /*
+ * TODO:
+ * We should somehow make this target available for FDWs, which are
+ * expected to fetch it directly from root->upper_targets. That array
+ * can hold only one target for each kind of upper rel. We will now
+ * have many such upper relations.
+ */
+ child_target->exprs = (List *) adjust_appendrel_attrs(root,
+ (Node *) target->exprs,
+ appinfos);
+
+ create_grouping_paths(root, input_child_rel, child_target, agg_costs,
+ rollup_lists, rollup_groupclauses);
+
+ }
+
+ /*
+ * add_paths_to_append_rel() sets the path target from the given relation.
+ * In this case grouped_rel doesn't have a target set. So temporarily set
+ * it.
+ * TODO: probably we should do something better than this.
+ */
+ grouped_rel->reltarget = target;
+ add_paths_to_append_rel(root, grouped_rel, live_children);
+ grouped_rel->reltarget = NULL;
+}
+
+/*
+ * Returns true if partition keys of the given relation are leading group by
+ * clauses.
+ */
+static bool
+have_grouping_by_partkey(RelOptInfo *input_rel, PathTarget *target,
+ List *groupClause)
+{
+ PartitionScheme part_scheme = input_rel->part_scheme;
+ ListCell *lc;
+ List *tlist = make_tlist_from_pathtarget(target);
+ List *group_exprs = get_sortgrouplist_exprs(groupClause, tlist);
+ int cnt_pk = 0;
+ int num_pks;
+
+ /* Input relation should be partitioned. */
+ Assert(part_scheme);
+
+ num_pks = part_scheme->partnatts;
+
+ foreach(lc, group_exprs)
+ {
+ Expr *group_expr = lfirst(lc);
+ List *pk_exprs;
+
+ /* All partition keys are present in the group clause. */
+ if (cnt_pk >= num_pks)
+ return true;
+
+ pk_exprs = input_rel->partexprs[cnt_pk];
+
+ if (!list_member(pk_exprs, group_expr))
+ return false;
+
+ cnt_pk++;
+ }
+
+ /* All partition keys are present in the group clause. */
+ if (cnt_pk >= num_pks)
+ return true;
+
+ return false;
+}
+
+/*
* create_window_paths
*
* Build a new upperrel containing Paths for window-function evaluation.
@@ -3959,7 +4209,8 @@ create_window_paths(PlannerInfo *root,
if (window_rel->fdwroutine &&
window_rel->fdwroutine->GetForeignUpperPaths)
window_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_WINDOW,
- input_rel, window_rel);
+ input_rel, window_rel,
+ NULL);
/* Let extensions possibly add some more paths */
if (create_upper_paths_hook)
@@ -4263,7 +4514,8 @@ create_distinct_paths(PlannerInfo *root,
if (distinct_rel->fdwroutine &&
distinct_rel->fdwroutine->GetForeignUpperPaths)
distinct_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_DISTINCT,
- input_rel, distinct_rel);
+ input_rel, distinct_rel,
+ NULL);
/* Let extensions possibly add some more paths */
if (create_upper_paths_hook)
@@ -4405,7 +4657,8 @@ create_ordered_paths(PlannerInfo *root,
if (ordered_rel->fdwroutine &&
ordered_rel->fdwroutine->GetForeignUpperPaths)
ordered_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_ORDERED,
- input_rel, ordered_rel);
+ input_rel, ordered_rel,
+ NULL);
/* Let extensions possibly add some more paths */
if (create_upper_paths_hook)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 7f423c9..0cce5e4 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -920,6 +920,15 @@ static struct config_bool ConfigureNamesBool[] =
false,
NULL, NULL, NULL
},
+ {
+ {"enable_partition_wise_agg", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables partition-wise aggregation and grouping."),
+ NULL
+ },
+ &enable_partition_wise_agg,
+ true,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
diff --git a/src/include/foreign/fdwapi.h b/src/include/foreign/fdwapi.h
index 6ca44f7..10fdf6d 100644
--- a/src/include/foreign/fdwapi.h
+++ b/src/include/foreign/fdwapi.h
@@ -62,7 +62,8 @@ typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
typedef void (*GetForeignUpperPaths_function) (PlannerInfo *root,
UpperRelationKind stage,
RelOptInfo *input_rel,
- RelOptInfo *output_rel);
+ RelOptInfo *output_rel,
+ PathTarget *target);
typedef void (*AddForeignUpdateTargets_function) (Query *parsetree,
RangeTblEntry *target_rte,
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index e7949d37..7cd8e2c 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -68,6 +68,7 @@ extern bool enable_mergejoin;
extern bool enable_hashjoin;
extern bool enable_gathermerge;
extern bool enable_partition_wise_join;
+extern bool enable_partition_wise_agg;
extern int constraint_exclusion;
extern double clamp_row_est(double nrows);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index f31b70e..77388cc 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -60,6 +60,8 @@ extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
Path *bitmapqual);
extern void generate_partition_wise_join_paths(PlannerInfo *root,
RelOptInfo *rel);
+extern void add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
+ List *live_childrels);
#ifdef OPTIMIZER_DEBUG
extern void debug_print_rel(PlannerInfo *root, RelOptInfo *rel);
diff --git a/src/test/regress/expected/partition_agg.out b/src/test/regress/expected/partition_agg.out
new file mode 100644
index 0000000..315d396
--- /dev/null
+++ b/src/test/regress/expected/partition_agg.out
@@ -0,0 +1,192 @@
+--
+-- PARTITION_AGG
+-- Test partition-wise aggregation on partitioned tables
+--
+-- Enable partition-wise join, which by default is disabled.
+SET enable_partition_wise_join TO true;
+--
+-- Tests for list partitioned tables.
+--
+CREATE TABLE ptab1 (a int, b int, c text) PARTITION BY LIST(c);
+CREATE TABLE ptab1_p1 PARTITION OF ptab1 FOR VALUES IN ('0000', '0003', '0004', '0010');
+CREATE TABLE ptab1_p2 PARTITION OF ptab1 FOR VALUES IN ('0001', '0005', '0002', '0009');
+CREATE TABLE ptab1_p3 PARTITION OF ptab1 FOR VALUES IN ('0006', '0007', '0008', '0011');
+INSERT INTO ptab1 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(0, 599, 2) i;
+ANALYZE ptab1;
+ANALYZE ptab1_p1;
+ANALYZE ptab1_p2;
+ANALYZE ptab1_p3;
+-- TODO: This table is created only for testing the results. Remove once
+-- results are tested.
+CREATE TABLE uptab1 AS SELECT * FROM ptab1;
+ANALYZE uptab1;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT c, sum(a), avg(b), COUNT(*) FROM ptab1 GROUP BY c ORDER BY 1, 2, 3;
+NOTICE: partition-wise grouping is possible.
+ QUERY PLAN
+------------------------------------------------------------------------------
+ Sort
+ Output: ptab1_p1.c, (sum(ptab1_p1.a)), (avg(ptab1_p1.b)), (count(*))
+ Sort Key: ptab1_p1.c, (sum(ptab1_p1.a)), (avg(ptab1_p1.b))
+ -> Append
+ -> HashAggregate
+ Output: ptab1_p1.c, sum(ptab1_p1.a), avg(ptab1_p1.b), count(*)
+ Group Key: ptab1_p1.c
+ -> Seq Scan on public.ptab1_p1
+ Output: ptab1_p1.c, ptab1_p1.a, ptab1_p1.b
+ -> HashAggregate
+ Output: ptab1_p2.c, sum(ptab1_p2.a), avg(ptab1_p2.b), count(*)
+ Group Key: ptab1_p2.c
+ -> Seq Scan on public.ptab1_p2
+ Output: ptab1_p2.c, ptab1_p2.a, ptab1_p2.b
+ -> HashAggregate
+ Output: ptab1_p3.c, sum(ptab1_p3.a), avg(ptab1_p3.b), count(*)
+ Group Key: ptab1_p3.c
+ -> Seq Scan on public.ptab1_p3
+ Output: ptab1_p3.c, ptab1_p3.a, ptab1_p3.b
+(19 rows)
+
+SELECT c, sum(a), avg(b), COUNT(*) FROM ptab1 GROUP BY c ORDER BY 1, 2, 3;
+NOTICE: partition-wise grouping is possible.
+ c | sum | avg | count
+------+-------+----------------------+-------
+ 0000 | 600 | 24.0000000000000000 | 25
+ 0001 | 1850 | 74.0000000000000000 | 25
+ 0002 | 3100 | 124.0000000000000000 | 25
+ 0003 | 4350 | 174.0000000000000000 | 25
+ 0004 | 5600 | 224.0000000000000000 | 25
+ 0005 | 6850 | 274.0000000000000000 | 25
+ 0006 | 8100 | 324.0000000000000000 | 25
+ 0007 | 9350 | 374.0000000000000000 | 25
+ 0008 | 10600 | 424.0000000000000000 | 25
+ 0009 | 11850 | 474.0000000000000000 | 25
+ 0010 | 13100 | 524.0000000000000000 | 25
+ 0011 | 14350 | 574.0000000000000000 | 25
+(12 rows)
+
+SELECT c, sum(a), avg(b), COUNT(*) FROM uptab1 GROUP BY c ORDER BY 1, 2, 3;
+ c | sum | avg | count
+------+-------+----------------------+-------
+ 0000 | 600 | 24.0000000000000000 | 25
+ 0001 | 1850 | 74.0000000000000000 | 25
+ 0002 | 3100 | 124.0000000000000000 | 25
+ 0003 | 4350 | 174.0000000000000000 | 25
+ 0004 | 5600 | 224.0000000000000000 | 25
+ 0005 | 6850 | 274.0000000000000000 | 25
+ 0006 | 8100 | 324.0000000000000000 | 25
+ 0007 | 9350 | 374.0000000000000000 | 25
+ 0008 | 10600 | 424.0000000000000000 | 25
+ 0009 | 11850 | 474.0000000000000000 | 25
+ 0010 | 13100 | 524.0000000000000000 | 25
+ 0011 | 14350 | 574.0000000000000000 | 25
+(12 rows)
+
+-- JOIN query
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.c, sum(t1.a), avg(t1.b), COUNT(*) FROM ptab1 t1, ptab1 t2 WHERE t1.c = t2.c GROUP BY t1.c ORDER BY 1, 2, 3;
+NOTICE: partition-wise grouping is possible.
+ QUERY PLAN
+------------------------------------------------------------------
+ Sort
+ Output: t1.c, (sum(t1.a)), (avg(t1.b)), (count(*))
+ Sort Key: t1.c, (sum(t1.a)), (avg(t1.b))
+ -> Append
+ -> HashAggregate
+ Output: t1.c, sum(t1.a), avg(t1.b), count(*)
+ Group Key: t1.c
+ -> Hash Join
+ Output: t1.c, t1.a, t1.b
+ Hash Cond: (t1.c = t2.c)
+ -> Seq Scan on public.ptab1_p1 t1
+ Output: t1.c, t1.a, t1.b
+ -> Hash
+ Output: t2.c
+ -> Seq Scan on public.ptab1_p1 t2
+ Output: t2.c
+ -> HashAggregate
+ Output: t1_1.c, sum(t1_1.a), avg(t1_1.b), count(*)
+ Group Key: t1_1.c
+ -> Hash Join
+ Output: t1_1.c, t1_1.a, t1_1.b
+ Hash Cond: (t1_1.c = t2_1.c)
+ -> Seq Scan on public.ptab1_p2 t1_1
+ Output: t1_1.c, t1_1.a, t1_1.b
+ -> Hash
+ Output: t2_1.c
+ -> Seq Scan on public.ptab1_p2 t2_1
+ Output: t2_1.c
+ -> HashAggregate
+ Output: t1_2.c, sum(t1_2.a), avg(t1_2.b), count(*)
+ Group Key: t1_2.c
+ -> Hash Join
+ Output: t1_2.c, t1_2.a, t1_2.b
+ Hash Cond: (t1_2.c = t2_2.c)
+ -> Seq Scan on public.ptab1_p3 t1_2
+ Output: t1_2.c, t1_2.a, t1_2.b
+ -> Hash
+ Output: t2_2.c
+ -> Seq Scan on public.ptab1_p3 t2_2
+ Output: t2_2.c
+(40 rows)
+
+SELECT t1.c, sum(t1.a), avg(t1.b), COUNT(*) FROM ptab1 t1, ptab1 t2 WHERE t1.c = t2.c GROUP BY t1.c ORDER BY 1, 2, 3;
+NOTICE: partition-wise grouping is possible.
+ c | sum | avg | count
+------+--------+----------------------+-------
+ 0000 | 15000 | 24.0000000000000000 | 625
+ 0001 | 46250 | 74.0000000000000000 | 625
+ 0002 | 77500 | 124.0000000000000000 | 625
+ 0003 | 108750 | 174.0000000000000000 | 625
+ 0004 | 140000 | 224.0000000000000000 | 625
+ 0005 | 171250 | 274.0000000000000000 | 625
+ 0006 | 202500 | 324.0000000000000000 | 625
+ 0007 | 233750 | 374.0000000000000000 | 625
+ 0008 | 265000 | 424.0000000000000000 | 625
+ 0009 | 296250 | 474.0000000000000000 | 625
+ 0010 | 327500 | 524.0000000000000000 | 625
+ 0011 | 358750 | 574.0000000000000000 | 625
+(12 rows)
+
+SELECT t1.c, sum(t1.a), avg(t1.b), COUNT(*) FROM uptab1 t1, uptab1 t2 WHERE t1.c = t2.c GROUP BY t1.c ORDER BY 1, 2, 3;
+ c | sum | avg | count
+------+--------+----------------------+-------
+ 0000 | 15000 | 24.0000000000000000 | 625
+ 0001 | 46250 | 74.0000000000000000 | 625
+ 0002 | 77500 | 124.0000000000000000 | 625
+ 0003 | 108750 | 174.0000000000000000 | 625
+ 0004 | 140000 | 224.0000000000000000 | 625
+ 0005 | 171250 | 274.0000000000000000 | 625
+ 0006 | 202500 | 324.0000000000000000 | 625
+ 0007 | 233750 | 374.0000000000000000 | 625
+ 0008 | 265000 | 424.0000000000000000 | 625
+ 0009 | 296250 | 474.0000000000000000 | 625
+ 0010 | 327500 | 524.0000000000000000 | 625
+ 0011 | 358750 | 574.0000000000000000 | 625
+(12 rows)
+
+-- Negative testcase
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM ptab1 GROUP BY a;
+ QUERY PLAN
+-----------------------------------------
+ HashAggregate
+ Output: count(*), ptab1.a
+ Group Key: ptab1.a
+ -> Append
+ -> Seq Scan on public.ptab1
+ Output: ptab1.a
+ -> Seq Scan on public.ptab1_p1
+ Output: ptab1_p1.a
+ -> Seq Scan on public.ptab1_p2
+ Output: ptab1_p2.a
+ -> Seq Scan on public.ptab1_p3
+ Output: ptab1_p3.a
+(12 rows)
+
+-- Cleanup
+DROP TABLE uptab1;
+DROP TABLE ptab1_p3;
+DROP TABLE ptab1_p2;
+DROP TABLE ptab1_p1;
+DROP TABLE ptab1;
+RESET enable_partition_wise_join;
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 27f09fa..67c2041 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -1399,44 +1399,49 @@ ANALYZE plt1_e;
-- test partition matching with N-way join
EXPLAIN (COSTS OFF)
SELECT avg(t1.a), avg(t2.b), avg(t3.a + t3.b), t1.c, t2.c, t3.c FROM plt1 t1, plt2 t2, plt1_e t3 WHERE t1.c = t2.c AND ltrim(t3.c, 'A') = t1.c GROUP BY t1.c, t2.c, t3.c ORDER BY t1.c, t2.c, t3.c;
- QUERY PLAN
---------------------------------------------------------------------------------------
+NOTICE: partition-wise grouping is possible.
+ QUERY PLAN
+--------------------------------------------------------------------------------
Sort
Sort Key: t1.c, t3.c
- -> HashAggregate
- Group Key: t1.c, t2.c, t3.c
- -> Result
- -> Append
- -> Hash Join
- Hash Cond: (t1.c = t2.c)
- -> Seq Scan on plt1_p1 t1
- -> Hash
- -> Hash Join
- Hash Cond: (t2.c = ltrim(t3.c, 'A'::text))
- -> Seq Scan on plt2_p1 t2
- -> Hash
- -> Seq Scan on plt1_e_p1 t3
- -> Hash Join
- Hash Cond: (t1_1.c = t2_1.c)
- -> Seq Scan on plt1_p2 t1_1
- -> Hash
- -> Hash Join
- Hash Cond: (t2_1.c = ltrim(t3_1.c, 'A'::text))
- -> Seq Scan on plt2_p2 t2_1
- -> Hash
- -> Seq Scan on plt1_e_p2 t3_1
- -> Hash Join
- Hash Cond: (t1_2.c = t2_2.c)
- -> Seq Scan on plt1_p3 t1_2
- -> Hash
- -> Hash Join
- Hash Cond: (t2_2.c = ltrim(t3_2.c, 'A'::text))
- -> Seq Scan on plt2_p3 t2_2
- -> Hash
- -> Seq Scan on plt1_e_p3 t3_2
-(33 rows)
+ -> Append
+ -> HashAggregate
+ Group Key: t1.c, t2.c, t3.c
+ -> Hash Join
+ Hash Cond: (t1.c = t2.c)
+ -> Seq Scan on plt1_p1 t1
+ -> Hash
+ -> Hash Join
+ Hash Cond: (t2.c = ltrim(t3.c, 'A'::text))
+ -> Seq Scan on plt2_p1 t2
+ -> Hash
+ -> Seq Scan on plt1_e_p1 t3
+ -> HashAggregate
+ Group Key: t1_1.c, t2_1.c, t3_1.c
+ -> Hash Join
+ Hash Cond: (t1_1.c = t2_1.c)
+ -> Seq Scan on plt1_p2 t1_1
+ -> Hash
+ -> Hash Join
+ Hash Cond: (t2_1.c = ltrim(t3_1.c, 'A'::text))
+ -> Seq Scan on plt2_p2 t2_1
+ -> Hash
+ -> Seq Scan on plt1_e_p2 t3_1
+ -> HashAggregate
+ Group Key: t1_2.c, t2_2.c, t3_2.c
+ -> Hash Join
+ Hash Cond: (t1_2.c = t2_2.c)
+ -> Seq Scan on plt1_p3 t1_2
+ -> Hash
+ -> Hash Join
+ Hash Cond: (t2_2.c = ltrim(t3_2.c, 'A'::text))
+ -> Seq Scan on plt2_p3 t2_2
+ -> Hash
+ -> Seq Scan on plt1_e_p3 t3_2
+(36 rows)
SELECT avg(t1.a), avg(t2.b), avg(t3.a + t3.b), t1.c, t2.c, t3.c FROM plt1 t1, plt2 t2, plt1_e t3 WHERE t1.c = t2.c AND ltrim(t3.c, 'A') = t1.c GROUP BY t1.c, t2.c, t3.c ORDER BY t1.c, t2.c, t3.c;
+NOTICE: partition-wise grouping is possible.
avg | avg | avg | c | c | c
----------------------+----------------------+-----------------------+------+------+-------
24.0000000000000000 | 24.0000000000000000 | 48.0000000000000000 | 0000 | 0000 | A0000
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index cd1f7f3..f4cd466 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -81,11 +81,12 @@ select name, setting from pg_settings where name like 'enable%';
enable_material | on
enable_mergejoin | on
enable_nestloop | on
+ enable_partition_wise_agg | on
enable_partition_wise_join | off
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(13 rows)
+(14 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 966984d..56c07d3 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -104,6 +104,10 @@ test: publication subscription
# Another group of parallel tests
# ----------
test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock json jsonb json_encoding indirect_toast equivclass partition_join multi_level_partition_join
+# TODO: should be added in parallel tests above, but before that need to make
+# sure we have unique objects to avoid any concurrency issues.
+test: partition_agg
+
# ----------
# Another group of parallel tests
# NB: temp.sql does a reconnect which transiently uses 2 connections,
diff --git a/src/test/regress/sql/partition_agg.sql b/src/test/regress/sql/partition_agg.sql
new file mode 100644
index 0000000..df6dd9c
--- /dev/null
+++ b/src/test/regress/sql/partition_agg.sql
@@ -0,0 +1,48 @@
+--
+-- PARTITION_AGG
+-- Test partition-wise aggregation on partitioned tables
+--
+
+-- Enable partition-wise join, which by default is disabled.
+SET enable_partition_wise_join TO true;
+
+--
+-- Tests for list partitioned tables.
+--
+CREATE TABLE ptab1 (a int, b int, c text) PARTITION BY LIST(c);
+CREATE TABLE ptab1_p1 PARTITION OF ptab1 FOR VALUES IN ('0000', '0003', '0004', '0010');
+CREATE TABLE ptab1_p2 PARTITION OF ptab1 FOR VALUES IN ('0001', '0005', '0002', '0009');
+CREATE TABLE ptab1_p3 PARTITION OF ptab1 FOR VALUES IN ('0006', '0007', '0008', '0011');
+INSERT INTO ptab1 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(0, 599, 2) i;
+ANALYZE ptab1;
+ANALYZE ptab1_p1;
+ANALYZE ptab1_p2;
+ANALYZE ptab1_p3;
+-- TODO: This table is created only for testing the results. Remove once
+-- results are tested.
+CREATE TABLE uptab1 AS SELECT * FROM ptab1;
+ANALYZE uptab1;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT c, sum(a), avg(b), COUNT(*) FROM ptab1 GROUP BY c ORDER BY 1, 2, 3;
+SELECT c, sum(a), avg(b), COUNT(*) FROM ptab1 GROUP BY c ORDER BY 1, 2, 3;
+SELECT c, sum(a), avg(b), COUNT(*) FROM uptab1 GROUP BY c ORDER BY 1, 2, 3;
+
+-- JOIN query
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.c, sum(t1.a), avg(t1.b), COUNT(*) FROM ptab1 t1, ptab1 t2 WHERE t1.c = t2.c GROUP BY t1.c ORDER BY 1, 2, 3;
+SELECT t1.c, sum(t1.a), avg(t1.b), COUNT(*) FROM ptab1 t1, ptab1 t2 WHERE t1.c = t2.c GROUP BY t1.c ORDER BY 1, 2, 3;
+SELECT t1.c, sum(t1.a), avg(t1.b), COUNT(*) FROM uptab1 t1, uptab1 t2 WHERE t1.c = t2.c GROUP BY t1.c ORDER BY 1, 2, 3;
+
+
+-- Negative testcase
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM ptab1 GROUP BY a;
+
+-- Cleanup
+DROP TABLE uptab1;
+DROP TABLE ptab1_p3;
+DROP TABLE ptab1_p2;
+DROP TABLE ptab1_p1;
+DROP TABLE ptab1;
+RESET enable_partition_wise_join;
Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:
Declarative partitioning is supported in PostgreSQL 10 and work is already in
progress to support partition-wise joins. Here is a proposal for partition-wise
aggregation/grouping. Our initial performance measurement has shown 7 times
performance when partitions are on foreign servers and approximately 15% when
partitions are local.
Partition-wise aggregation/grouping computes aggregates for each partition
separately. If the group clause contains the partition key, all the rows
belonging to a given group come from one partition, thus allowing aggregates
to be computed completely for each partition. Otherwise, partial aggregates
computed for each partition are combined across the partitions to produce the
final aggregates. This technique improves performance because:
i. When partitions are located on foreign server, we can push down the
aggregate to the foreign server.
ii. If hash table for each partition fits in memory, but that for the whole
relation does not, each partition-wise aggregate can use an in-memory hash
table.
iii. Aggregation at the level of partitions can exploit properties of
partitions like indexes, their storage etc.
I suspect this overlaps with
/messages/by-id/29111.1483984605@localhost
I'm working on the next version of the patch, which will be able to aggregate
the result of both base relation scans and joins. I'm trying hard to make the
next version available before an urgent vacation that I'll have to take at
a random date between today and early April. I suggest that we coordinate the
effort; it's a lot of work in any case.
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Mar 21, 2017 at 1:47 PM, Antonin Houska <ah@cybertec.at> wrote:
Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:
Declarative partitioning is supported in PostgreSQL 10 and work is already in
progress to support partition-wise joins. Here is a proposal for partition-wise
aggregation/grouping. Our initial performance measurement has shown 7 times
performance when partitions are on foreign servers and approximately 15% when
partitions are local.
Partition-wise aggregation/grouping computes aggregates for each partition
separately. If the group clause contains the partition key, all the rows
belonging to a given group come from one partition, thus allowing aggregates
to be computed completely for each partition. Otherwise, partial aggregates
computed for each partition are combined across the partitions to produce the
final aggregates. This technique improves performance because:
i. When partitions are located on foreign server, we can push down the
aggregate to the foreign server.
ii. If hash table for each partition fits in memory, but that for the whole
relation does not, each partition-wise aggregate can use an in-memory hash
table.
iii. Aggregation at the level of partitions can exploit properties of
partitions like indexes, their storage etc.
I suspect this overlaps with
/messages/by-id/29111.1483984605@localhost
I'm working on the next version of the patch, which will be able to aggregate
the result of both base relation scans and joins. I'm trying hard to make the
next version available before an urgent vacation that I'll have to take at
a random date between today and early April. I suggest that we coordinate the
effort; it's a lot of work in any case.
IIUC, it seems that you are trying to push down the aggregation into the
joining relations. So basically you are converting
Agg -> Join -> {scan1, scan2} into
FinalAgg -> Join -> {PartialAgg -> scan1, PartialAgg -> scan2}.
In addition to that, your patch pushes aggregates on a base rel down to its
children, if any.
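As an illustration only (this is not code from either patch), the following Python sketch shows why a FinalAgg over a join of partially aggregated inputs can return the same sum and count as an Agg over the plain join. The data mirrors the ptab1 self-join from the regression test earlier in the thread; the function names are hypothetical.

```python
from collections import defaultdict

# Rows mirror the regression test: (a, b, c) with c = to_char(i/50, 'FM0000').
rows = [(i, i, f"{i // 50:04d}") for i in range(0, 600, 2)]

def agg_over_join(t1, t2):
    """Agg -> Join: join on c first, then compute sum(t1.a), count(*) per t1.c."""
    out = defaultdict(lambda: [0, 0])
    for a1, _, c1 in t1:
        for _, _, c2 in t2:
            if c1 == c2:
                out[c1][0] += a1
                out[c1][1] += 1
    return {c: tuple(v) for c, v in out.items()}

def partial_agg(t):
    """PartialAgg -> scan: per-input state (sum(a), row count) for each c."""
    state = defaultdict(lambda: [0, 0])
    for a, _, c in t:
        state[c][0] += a
        state[c][1] += 1
    return state

def final_agg_over_join(t1, t2):
    """FinalAgg -> Join -> {PartialAgg -> scan1, PartialAgg -> scan2}."""
    p1, p2 = partial_agg(t1), partial_agg(t2)
    out = {}
    for c, (sum_a, n1) in p1.items():
        if c in p2:
            n2 = p2[c][1]
            # Every t1 row in group c matches n2 rows of t2, so its value
            # contributes n2 times to the sum and the join yields n1 * n2 rows.
            out[c] = (sum_a * n2, n1 * n2)
    return out

plain = agg_over_join(rows, rows)
pushed = final_agg_over_join(rows, rows)
```

For group 0000 both variants give a sum of 15000 and a count of 625, which agrees with the expected output of the JOIN query in partition_agg.out.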
Whereas what I propose here is pushing aggregation down below the Append
node, keeping the join/scan as is. So basically I am converting
Agg -> Append -> Join -> {scan1, scan2} into
Append -> Agg -> Join -> {scan1, scan2}.
This will require partition-wise join as posted in [1].
But I am planning to make this work for partitioned relations and not for
generic inheritance.
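A minimal Python sketch of this transformation, assuming the ptab1 list partitions and data from the regression test earlier in the thread (the helper names are made up for illustration and do not correspond to planner functions): because the grouping column is the partition key, aggregating each partition and appending the results matches aggregating the appended rows.

```python
from collections import defaultdict

# List partitions of ptab1 as declared in the regression test (partition key: c).
PARTITIONS = {
    "ptab1_p1": {"0000", "0003", "0004", "0010"},
    "ptab1_p2": {"0001", "0005", "0002", "0009"},
    "ptab1_p3": {"0006", "0007", "0008", "0011"},
}

rows = [(i, i, f"{i // 50:04d}") for i in range(0, 600, 2)]

def route(rows):
    """Place each row in the partition whose value list contains c."""
    parts = {name: [] for name in PARTITIONS}
    for row in rows:
        for name, values in PARTITIONS.items():
            if row[2] in values:
                parts[name].append(row)
    return parts

def aggregate(rows):
    """GROUP BY c with sum(a), avg(b), count(*)."""
    acc = defaultdict(lambda: [0, 0, 0])  # c -> [sum(a), sum(b), count]
    for a, b, c in rows:
        acc[c][0] += a
        acc[c][1] += b
        acc[c][2] += 1
    return {c: (s_a, s_b / n, n) for c, (s_a, s_b, n) in acc.items()}

parts = route(rows)

# Agg -> Append: concatenate all partitions, then aggregate once.
agg_over_append = aggregate([r for p in parts.values() for r in p])

# Append -> Agg: aggregate each partition, then concatenate the results.
append_over_agg = {}
for p in parts.values():
    per_partition = aggregate(p)
    # No group can appear in two partitions because c is the partition key.
    assert not (per_partition.keys() & append_over_agg.keys())
    append_over_agg.update(per_partition)
```

Both dictionaries hold the same 12 groups; group 0000 comes out as sum 600, avg 24 and count 25, as in the expected test output.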
I treat these two as separate strategies/paths to be considered while
planning.
Our work will overlap when we are pushing down the aggregate on partitioned
base relation to its children/partitions.
I think you should continue working on pushing down aggregates onto the
joins/scans, whereas I will continue my work on pushing down aggregates to
partitions (joins as well as single tables). Once we are done with these
tasks, we may need to find a way to integrate them.
[1]: /messages/by-id/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com
--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
The promised new version of my patch is here:
/messages/by-id/9666.1491295317@localhost
Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:
On Tue, Mar 21, 2017 at 1:47 PM, Antonin Houska <ah@cybertec.at> wrote:
Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:
IIUC, it seems that you are trying to push down the aggregation into the
joining relations. So basically you are converting
Agg -> Join -> {scan1, scan2} into
FinalAgg -> Join -> {PartialAgg -> scan1, PartialAgg -> scan2}.
In addition to that, your patch pushes aggregates on a base rel down to its
children, if any.
Whereas what I propose here is pushing aggregation down below the Append
node, keeping the join/scan as is. So basically I am converting
Agg -> Append -> Join -> {scan1, scan2} into
Append -> Agg -> Join -> {scan1, scan2}.
This will require partition-wise join as posted in [1].
But I am planning to make this work for partitioned relations and not for
generic inheritance.
I treat these two as separate strategies/paths to be considered while planning.
Our work will overlap when we are pushing down the aggregate on partitioned
base relation to its children/partitions.
I think you should continue working on pushing down aggregates onto the
joins/scans, whereas I will continue my work on pushing down aggregates to
partitions (joins as well as single tables). Once we are done with these
tasks, we may need to find a way to integrate them.
[1] /messages/by-id/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com
My patch also creates (partial) aggregation paths below the Append node,
but it only expects SeqScan as input. Please check whether your patch can be
based on this or whether there is any conflict.
(I'll probably be unable to respond before Monday 04/17.)
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
Antonin Houska <ah@cybertec.at> wrote:
Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:
Our work will overlap when we are pushing down the aggregate on partitioned
base relation to its children/partitions.
I think you should continue working on pushing down aggregates onto the
joins/scans, whereas I will continue my work on pushing down aggregates to
partitions (joins as well as single tables). Once we are done with these
tasks, we may need to find a way to integrate them.
[1] /messages/by-id/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com
My patch also creates (partial) aggregation paths below the Append node,
but it only expects SeqScan as input. Please check whether your patch can be
based on this or whether there is any conflict.
Well, I haven't imposed any explicit restriction on the kind of path to be
aggregated below the Append path. Maybe the only thing to do is to merge my
patch with the "partition-wise join" patch (which I haven't checked yet).
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
Antonin Houska <ah@cybertec.at> wrote:
Antonin Houska <ah@cybertec.at> wrote:
Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:
Our work will overlap when we are pushing down the aggregate on partitioned
base relation to its children/partitions.
I think you should continue working on pushing down aggregates onto the
joins/scans, whereas I will continue my work on pushing down aggregates to
partitions (joins as well as single tables). Once we are done with these
tasks, we may need to find a way to integrate them.
[1] /messages/by-id/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com
My patch also creates (partial) aggregation paths below the Append node,
but it only expects SeqScan as input. Please check whether your patch can be
based on this or whether there is any conflict.
Well, I haven't imposed any explicit restriction on the kind of path to be
aggregated below the Append path. Maybe the only thing to do is to merge my
patch with the "partition-wise join" patch (which I haven't checked yet).
Attached is a diff that contains both patches merged. This is just to prove my
assumption, details to be elaborated later. The scripts attached produce the
following plan in my environment:
                   QUERY PLAN
------------------------------------------------
 Parallel Finalize HashAggregate
   Group Key: b_1.j
   ->  Append
         ->  Parallel Partial HashAggregate
               Group Key: b_1.j
               ->  Hash Join
                     Hash Cond: (b_1.j = c_1.k)
                     ->  Seq Scan on b_1
                     ->  Hash
                           ->  Seq Scan on c_1
         ->  Parallel Partial HashAggregate
               Group Key: b_2.j
               ->  Hash Join
                     Hash Cond: (b_2.j = c_2.k)
                     ->  Seq Scan on b_2
                     ->  Hash
                           ->  Seq Scan on c_2
Note that I had no better idea how to enforce the plan than hard-wiring zero
costs for the partial aggregation paths. This simulates the use case of partial
aggregation performed on a remote node (postgres_fdw). Other use cases may
exist, but I only wanted to prove the concept in terms of coding so far.
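The shape of the plan above (a Finalize aggregate over an Append of Partial aggregates) can be sketched in Python as follows. This is only an illustration under assumed data, not the patch's implementation: the partitions, keys and function names are hypothetical, and avg is used because its partial state (sum, count) shows clearly why a combine step is needed.

```python
from collections import defaultdict

def partial_agg(rows):
    """Partial HashAggregate: per-group transition state (sum, count) for avg(v)."""
    state = defaultdict(lambda: [0, 0])
    for key, v in rows:
        state[key][0] += v
        state[key][1] += 1
    return state

def finalize(partials):
    """Finalize HashAggregate: combine the appended partial states per group."""
    combined = defaultdict(lambda: [0, 0])
    for state in partials:
        for key, (s, n) in state.items():
            combined[key][0] += s
            combined[key][1] += n
    return {key: s / n for key, (s, n) in combined.items()}

# Hypothetical data: two partitions whose rows share group keys, so a group
# can span partitions and per-partition results cannot simply be appended.
part_1 = [("x", 1), ("y", 10), ("x", 3)]
part_2 = [("x", 8), ("y", 20)]

two_stage = finalize([partial_agg(part_1), partial_agg(part_2)])
single_stage = finalize([partial_agg(part_1 + part_2)])
```

Averaging the per-partition averages would give 5 for group x here (2 and 8), whereas combining the (sum, count) states gives the correct 4, which is why the partial step must emit states rather than final values.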
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
Attachments:
agg_pushdown_partition_wise.diff (text/x-diff)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
new file mode 100644
index b29549a..418c59a
*** a/contrib/postgres_fdw/expected/postgres_fdw.out
--- b/contrib/postgres_fdw/expected/postgres_fdw.out
*************** AND ftoptions @> array['fetch_size=60000
*** 7219,7221 ****
--- 7219,7341 ----
(1 row)
ROLLBACK;
+ -- ===================================================================
+ -- test partition-wise-joins
+ -- ===================================================================
+ SET enable_partition_wise_join=on;
+ CREATE TABLE fprt1 (a int, b int, c varchar) PARTITION BY RANGE(a);
+ CREATE TABLE fprt1_p1 (LIKE fprt1);
+ CREATE TABLE fprt1_p2 (LIKE fprt1);
+ INSERT INTO fprt1_p1 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(0, 249, 2) i;
+ INSERT INTO fprt1_p2 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(250, 499, 2) i;
+ CREATE FOREIGN TABLE ftprt1_p1 PARTITION OF fprt1 FOR VALUES FROM (0) TO (250)
+ SERVER loopback OPTIONS (table_name 'fprt1_p1', use_remote_estimate 'true');
+ CREATE FOREIGN TABLE ftprt1_p2 PARTITION OF fprt1 FOR VALUES FROM (250) TO (500)
+ SERVER loopback OPTIONS (TABLE_NAME 'fprt1_p2');
+ ANALYZE fprt1;
+ ANALYZE fprt1_p1;
+ ANALYZE fprt1_p2;
+ CREATE TABLE fprt2 (a int, b int, c varchar) PARTITION BY RANGE(b);
+ CREATE TABLE fprt2_p1 (LIKE fprt2);
+ CREATE TABLE fprt2_p2 (LIKE fprt2);
+ INSERT INTO fprt2_p1 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(0, 249, 3) i;
+ INSERT INTO fprt2_p2 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(250, 499, 3) i;
+ CREATE FOREIGN TABLE ftprt2_p1 PARTITION OF fprt2 FOR VALUES FROM (0) TO (250)
+ SERVER loopback OPTIONS (table_name 'fprt2_p1', use_remote_estimate 'true');
+ CREATE FOREIGN TABLE ftprt2_p2 PARTITION OF fprt2 FOR VALUES FROM (250) TO (500)
+ SERVER loopback OPTIONS (table_name 'fprt2_p2', use_remote_estimate 'true');
+ ANALYZE fprt2;
+ ANALYZE fprt2_p1;
+ ANALYZE fprt2_p2;
+ -- inner join three tables
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER JOIN fprt1 t3 ON (t2.b = t3.a) WHERE t1.a % 25 =0 ORDER BY 1,2,3;
+ QUERY PLAN
+ --------------------------------------------------------------------------------------------------------------------
+ Sort
+ Sort Key: t1.a, t3.c
+ -> Append
+ -> Foreign Scan
+ Relations: ((public.ftprt1_p1 t1) INNER JOIN (public.ftprt2_p1 t2)) INNER JOIN (public.ftprt1_p1 t3)
+ -> Foreign Scan
+ Relations: ((public.ftprt1_p2 t1) INNER JOIN (public.ftprt2_p2 t2)) INNER JOIN (public.ftprt1_p2 t3)
+ (7 rows)
+
+ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER JOIN fprt1 t3 ON (t2.b = t3.a) WHERE t1.a % 25 =0 ORDER BY 1,2,3;
+ a | b | c
+ -----+-----+------
+ 0 | 0 | 0000
+ 150 | 150 | 0003
+ 250 | 250 | 0005
+ 400 | 400 | 0008
+ (4 rows)
+
+ -- left outer join + nullable clasue
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
+ QUERY PLAN
+ -----------------------------------------------------------------------------------
+ Sort
+ Sort Key: t1.a, ftprt2_p1.b, ftprt2_p1.c
+ -> Append
+ -> Foreign Scan
+ Relations: (public.ftprt1_p1 t1) LEFT JOIN (public.ftprt2_p1 fprt2)
+ (5 rows)
+
+ SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
+ a | b | c
+ ---+---+------
+ 0 | 0 | 0000
+ 2 | |
+ 4 | |
+ 6 | 6 | 0000
+ 8 | |
+ (5 rows)
+
+ -- with whole-row reference
+ EXPLAIN (COSTS OFF)
+ SELECT t1,t2 FROM fprt1 t1 JOIN fprt2 t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a % 25 =0 ORDER BY 1,2;
+ QUERY PLAN
+ ---------------------------------------------------------------------------------
+ Sort
+ Sort Key: ((t1.*)::fprt1), ((t2.*)::fprt2)
+ -> Append
+ -> Foreign Scan
+ Relations: (public.ftprt1_p1 t1) INNER JOIN (public.ftprt2_p1 t2)
+ -> Foreign Scan
+ Relations: (public.ftprt1_p2 t1) INNER JOIN (public.ftprt2_p2 t2)
+ (7 rows)
+
+ SELECT t1,t2 FROM fprt1 t1 JOIN fprt2 t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a % 25 =0 ORDER BY 1,2;
+ t1 | t2
+ ----------------+----------------
+ (0,0,0000) | (0,0,0000)
+ (150,150,0003) | (150,150,0003)
+ (250,250,0005) | (250,250,0005)
+ (400,400,0008) | (400,400,0008)
+ (4 rows)
+
+ -- join with lateral reference
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t1.b FROM fprt1 t1, LATERAL (SELECT t2.a, t2.b FROM fprt2 t2 WHERE t1.a = t2.b AND t1.b = t2.a) q WHERE t1.a%25 = 0 ORDER BY 1,2;
+ QUERY PLAN
+ ---------------------------------------------------------------------------------
+ Sort
+ Sort Key: t1.a, t1.b
+ -> Append
+ -> Foreign Scan
+ Relations: (public.ftprt1_p1 t1) INNER JOIN (public.ftprt2_p1 t2)
+ -> Foreign Scan
+ Relations: (public.ftprt1_p2 t1) INNER JOIN (public.ftprt2_p2 t2)
+ (7 rows)
+
+ SELECT t1.a,t1.b FROM fprt1 t1, LATERAL (SELECT t2.a, t2.b FROM fprt2 t2 WHERE t1.a = t2.b AND t1.b = t2.a) q WHERE t1.a%25 = 0 ORDER BY 1,2;
+ a | b
+ -----+-----
+ 0 | 0
+ 150 | 150
+ 250 | 250
+ 400 | 400
+ (4 rows)
+
+ RESET enable_partition_wise_join;
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
new file mode 100644
index 423eb02..a275f55
*** a/contrib/postgres_fdw/sql/postgres_fdw.sql
--- b/contrib/postgres_fdw/sql/postgres_fdw.sql
*************** WHERE ftrelid = 'table30000'::regclass
*** 1709,1711 ****
--- 1709,1764 ----
AND ftoptions @> array['fetch_size=60000'];
ROLLBACK;
+
+ -- ===================================================================
+ -- test partition-wise-joins
+ -- ===================================================================
+ SET enable_partition_wise_join=on;
+
+ CREATE TABLE fprt1 (a int, b int, c varchar) PARTITION BY RANGE(a);
+ CREATE TABLE fprt1_p1 (LIKE fprt1);
+ CREATE TABLE fprt1_p2 (LIKE fprt1);
+ INSERT INTO fprt1_p1 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(0, 249, 2) i;
+ INSERT INTO fprt1_p2 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(250, 499, 2) i;
+ CREATE FOREIGN TABLE ftprt1_p1 PARTITION OF fprt1 FOR VALUES FROM (0) TO (250)
+ SERVER loopback OPTIONS (table_name 'fprt1_p1', use_remote_estimate 'true');
+ CREATE FOREIGN TABLE ftprt1_p2 PARTITION OF fprt1 FOR VALUES FROM (250) TO (500)
+ SERVER loopback OPTIONS (TABLE_NAME 'fprt1_p2');
+ ANALYZE fprt1;
+ ANALYZE fprt1_p1;
+ ANALYZE fprt1_p2;
+
+ CREATE TABLE fprt2 (a int, b int, c varchar) PARTITION BY RANGE(b);
+ CREATE TABLE fprt2_p1 (LIKE fprt2);
+ CREATE TABLE fprt2_p2 (LIKE fprt2);
+ INSERT INTO fprt2_p1 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(0, 249, 3) i;
+ INSERT INTO fprt2_p2 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(250, 499, 3) i;
+ CREATE FOREIGN TABLE ftprt2_p1 PARTITION OF fprt2 FOR VALUES FROM (0) TO (250)
+ SERVER loopback OPTIONS (table_name 'fprt2_p1', use_remote_estimate 'true');
+ CREATE FOREIGN TABLE ftprt2_p2 PARTITION OF fprt2 FOR VALUES FROM (250) TO (500)
+ SERVER loopback OPTIONS (table_name 'fprt2_p2', use_remote_estimate 'true');
+ ANALYZE fprt2;
+ ANALYZE fprt2_p1;
+ ANALYZE fprt2_p2;
+
+ -- inner join three tables
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER JOIN fprt1 t3 ON (t2.b = t3.a) WHERE t1.a % 25 =0 ORDER BY 1,2,3;
+ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER JOIN fprt1 t3 ON (t2.b = t3.a) WHERE t1.a % 25 =0 ORDER BY 1,2,3;
+
+ -- left outer join + nullable clasue
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
+ SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
+
+ -- with whole-row reference
+ EXPLAIN (COSTS OFF)
+ SELECT t1,t2 FROM fprt1 t1 JOIN fprt2 t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a % 25 =0 ORDER BY 1,2;
+ SELECT t1,t2 FROM fprt1 t1 JOIN fprt2 t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a % 25 =0 ORDER BY 1,2;
+
+ -- join with lateral reference
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t1.b FROM fprt1 t1, LATERAL (SELECT t2.a, t2.b FROM fprt2 t2 WHERE t1.a = t2.b AND t1.b = t2.a) q WHERE t1.a%25 = 0 ORDER BY 1,2;
+ SELECT t1.a,t1.b FROM fprt1 t1, LATERAL (SELECT t2.a, t2.b FROM fprt2 t2 WHERE t1.a = t2.b AND t1.b = t2.a) q WHERE t1.a%25 = 0 ORDER BY 1,2;
+
+ RESET enable_partition_wise_join;
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index e02b0c8..c4d9228
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*************** ANY <replaceable class="parameter">num_s
*** 3643,3648 ****
--- 3643,3667 ----
</listitem>
</varlistentry>
+ <varlistentry id="guc-enable-partition-wise-join" xreflabel="enable_partition_wise_join">
+ <term><varname>enable_partition_wise_join</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>enable_partition_wise_join</> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Enables or disables the query planner's use of partition-wise join
+ plans. When enabled, the planner spends time creating paths for joins
+ between partitions and consumes memory constructing the expression nodes
+ used for those joins, even if partition-wise join does not result in the
+ cheapest path. The time and memory increase exponentially with the
+ number of partitioned tables being joined and linearly with the number
+ of partitions. The default is <literal>off</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-enable-seqscan" xreflabel="enable_seqscan">
<term><varname>enable_seqscan</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
new file mode 100644
index dbeaab5..ac8c2fa
*** a/doc/src/sgml/fdwhandler.sgml
--- b/doc/src/sgml/fdwhandler.sgml
*************** ShutdownForeignScan(ForeignScanState *no
*** 1270,1275 ****
--- 1270,1295 ----
</para>
</sect2>
+ <sect2 id="fdw-callbacks-reparameterize-paths">
+ <title>FDW Routines For Reparameterization of Paths</title>
+
+ <para>
+ <programlisting>
+ List *
+ ReparameterizeForeignPathByChild(PlannerInfo *root, List *fdw_private,
+ RelOptInfo *child_rel);
+ </programlisting>
+ This function is called while converting a path parameterized by the
+ top-most parent of the given child relation <literal>child_rel</> to be
+ parameterized by the child relation. The function is used to reparameterize
+ any paths or translate any expression nodes saved in the given
+ <literal>fdw_private</> member of a <structname>ForeignPath</>. The
+ callback may use <literal>reparameterize_path_by_child</>,
+ <literal>adjust_appendrel_attrs</> or
+ <literal>adjust_appendrel_attrs_multilevel</> as required.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="fdw-helpers">
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
new file mode 100644
index e0d2665..c44bb0e
*** a/src/backend/catalog/partition.c
--- b/src/backend/catalog/partition.c
*************** static List *generate_partition_qual(Rel
*** 126,140 ****
static PartitionRangeBound *make_one_range_bound(PartitionKey key, int index,
List *datums, bool lower);
! static int32 partition_rbound_cmp(PartitionKey key,
! Datum *datums1, RangeDatumContent *content1, bool lower1,
PartitionRangeBound *b2);
! static int32 partition_rbound_datum_cmp(PartitionKey key,
! Datum *rb_datums, RangeDatumContent *rb_content,
! Datum *tuple_datums);
! static int32 partition_bound_cmp(PartitionKey key,
! PartitionBoundInfo boundinfo,
int offset, void *probe, bool probe_is_bound);
static int partition_bound_bsearch(PartitionKey key,
PartitionBoundInfo boundinfo,
--- 126,141 ----
static PartitionRangeBound *make_one_range_bound(PartitionKey key, int index,
List *datums, bool lower);
! static int32 partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc,
! Oid *partcollation, Datum *datums1,
! RangeDatumContent *content1, bool lower1,
PartitionRangeBound *b2);
! static int32 partition_rbound_datum_cmp(int partnatts, FmgrInfo *partsupfunc,
! Oid *partcollation, Datum *rb_datums,
! RangeDatumContent *rb_content, Datum *tuple_datums);
! static int32 partition_bound_cmp(int partnatts, FmgrInfo *partsupfunc,
! Oid *partcollation, PartitionBoundInfo boundinfo,
int offset, void *probe, bool probe_is_bound);
static int partition_bound_bsearch(PartitionKey key,
PartitionBoundInfo boundinfo,
*************** RelationBuildPartitionDesc(Relation rel)
*** 592,598 ****
* representation of partition bounds.
*/
bool
! partition_bounds_equal(PartitionKey key,
PartitionBoundInfo b1, PartitionBoundInfo b2)
{
int i;
--- 593,599 ----
* representation of partition bounds.
*/
bool
! partition_bounds_equal(int partnatts, int16 *parttyplen, bool *parttypbyval,
PartitionBoundInfo b1, PartitionBoundInfo b2)
{
int i;
*************** partition_bounds_equal(PartitionKey key,
*** 613,619 ****
{
int j;
! for (j = 0; j < key->partnatts; j++)
{
/* For range partitions, the bounds might not be finite. */
if (b1->content != NULL)
--- 614,620 ----
{
int j;
! for (j = 0; j < partnatts; j++)
{
/* For range partitions, the bounds might not be finite. */
if (b1->content != NULL)
*************** partition_bounds_equal(PartitionKey key,
*** 642,649 ****
* context. datumIsEqual() should be simple enough to be safe.
*/
if (!datumIsEqual(b1->datums[i][j], b2->datums[i][j],
! key->parttypbyval[j],
! key->parttyplen[j]))
return false;
}
--- 643,649 ----
* context. datumIsEqual() should be simple enough to be safe.
*/
if (!datumIsEqual(b1->datums[i][j], b2->datums[i][j],
! parttypbyval[j], parttyplen[j]))
return false;
}
*************** partition_bounds_equal(PartitionKey key,
*** 652,658 ****
}
/* There are ndatums+1 indexes in case of range partitions */
! if (key->strategy == PARTITION_STRATEGY_RANGE &&
b1->indexes[i] != b2->indexes[i])
return false;
--- 652,658 ----
}
/* There are ndatums+1 indexes in case of range partitions */
! if (b1->strategy == PARTITION_STRATEGY_RANGE &&
b1->indexes[i] != b2->indexes[i])
return false;
*************** check_new_partition_bound(char *relname,
*** 734,741 ****
* First check if the resulting range would be empty with
* specified lower and upper bounds
*/
! if (partition_rbound_cmp(key, lower->datums, lower->content, true,
! upper) >= 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("cannot create range partition with empty range"),
--- 734,742 ----
* First check if the resulting range would be empty with
* specified lower and upper bounds
*/
! if (partition_rbound_cmp(key->partnatts, key->partsupfunc,
! key->partcollation, lower->datums,
! lower->content, true, upper) >= 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("cannot create range partition with empty range"),
*************** qsort_partition_rbound_cmp(const void *a
*** 1865,1871 ****
PartitionRangeBound *b2 = (*(PartitionRangeBound *const *) b);
PartitionKey key = (PartitionKey) arg;
! return partition_rbound_cmp(key, b1->datums, b1->content, b1->lower, b2);
}
/*
--- 1866,1874 ----
PartitionRangeBound *b2 = (*(PartitionRangeBound *const *) b);
PartitionKey key = (PartitionKey) arg;
! return partition_rbound_cmp(key->partnatts, key->partsupfunc,
! key->partcollation, b1->datums, b1->content,
! b1->lower, b2);
}
/*
*************** qsort_partition_rbound_cmp(const void *a
*** 1875,1881 ****
* content1, and lower1) is <=, =, >= the bound specified in *b2
*/
static int32
! partition_rbound_cmp(PartitionKey key,
Datum *datums1, RangeDatumContent *content1, bool lower1,
PartitionRangeBound *b2)
{
--- 1878,1884 ----
* content1, and lower1) is <=, =, >= the bound specified in *b2
*/
static int32
! partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc, Oid *partcollation,
Datum *datums1, RangeDatumContent *content1, bool lower1,
PartitionRangeBound *b2)
{
*************** partition_rbound_cmp(PartitionKey key,
*** 1885,1891 ****
RangeDatumContent *content2 = b2->content;
bool lower2 = b2->lower;
! for (i = 0; i < key->partnatts; i++)
{
/*
* First, handle cases involving infinity, which don't require
--- 1888,1894 ----
RangeDatumContent *content2 = b2->content;
bool lower2 = b2->lower;
! for (i = 0; i < partnatts; i++)
{
/*
* First, handle cases involving infinity, which don't require
*************** partition_rbound_cmp(PartitionKey key,
*** 1905,1912 ****
else if (content2[i] != RANGE_DATUM_FINITE)
return content2[i] == RANGE_DATUM_NEG_INF ? 1 : -1;
! cmpval = DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[i],
! key->partcollation[i],
datums1[i],
datums2[i]));
if (cmpval != 0)
--- 1908,1915 ----
else if (content2[i] != RANGE_DATUM_FINITE)
return content2[i] == RANGE_DATUM_NEG_INF ? 1 : -1;
! cmpval = DatumGetInt32(FunctionCall2Coll(&partsupfunc[i],
! partcollation[i],
datums1[i],
datums2[i]));
if (cmpval != 0)
*************** partition_rbound_cmp(PartitionKey key,
*** 1932,1951 ****
* rb_lower) <=, =, >= partition key of tuple (tuple_datums)
*/
static int32
! partition_rbound_datum_cmp(PartitionKey key,
! Datum *rb_datums, RangeDatumContent *rb_content,
! Datum *tuple_datums)
{
int i;
int32 cmpval = -1;
! for (i = 0; i < key->partnatts; i++)
{
if (rb_content[i] != RANGE_DATUM_FINITE)
return rb_content[i] == RANGE_DATUM_NEG_INF ? -1 : 1;
! cmpval = DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[i],
! key->partcollation[i],
rb_datums[i],
tuple_datums[i]));
if (cmpval != 0)
--- 1935,1954 ----
* rb_lower) <=, =, >= partition key of tuple (tuple_datums)
*/
static int32
! partition_rbound_datum_cmp(int partnatts, FmgrInfo *partsupfunc,
! Oid *partcollation, Datum *rb_datums,
! RangeDatumContent *rb_content, Datum *tuple_datums)
{
int i;
int32 cmpval = -1;
! for (i = 0; i < partnatts; i++)
{
if (rb_content[i] != RANGE_DATUM_FINITE)
return rb_content[i] == RANGE_DATUM_NEG_INF ? -1 : 1;
! cmpval = DatumGetInt32(FunctionCall2Coll(&partsupfunc[i],
! partcollation[i],
rb_datums[i],
tuple_datums[i]));
if (cmpval != 0)
*************** partition_rbound_datum_cmp(PartitionKey
*** 1962,1978 ****
* specified in *probe.
*/
static int32
! partition_bound_cmp(PartitionKey key, PartitionBoundInfo boundinfo,
! int offset, void *probe, bool probe_is_bound)
{
Datum *bound_datums = boundinfo->datums[offset];
int32 cmpval = -1;
! switch (key->strategy)
{
case PARTITION_STRATEGY_LIST:
! cmpval = DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[0],
! key->partcollation[0],
bound_datums[0],
*(Datum *) probe));
break;
--- 1965,1982 ----
* specified in *probe.
*/
static int32
! partition_bound_cmp(int partnatts, FmgrInfo *partsupfunc, Oid *partcollation,
! PartitionBoundInfo boundinfo, int offset, void *probe,
! bool probe_is_bound)
{
Datum *bound_datums = boundinfo->datums[offset];
int32 cmpval = -1;
! switch (boundinfo->strategy)
{
case PARTITION_STRATEGY_LIST:
! cmpval = DatumGetInt32(FunctionCall2Coll(&partsupfunc[0],
! partcollation[0],
bound_datums[0],
*(Datum *) probe));
break;
*************** partition_bound_cmp(PartitionKey key, Pa
*** 1990,2001 ****
*/
bool lower = boundinfo->indexes[offset] < 0;
! cmpval = partition_rbound_cmp(key,
! bound_datums, content, lower,
! (PartitionRangeBound *) probe);
}
else
! cmpval = partition_rbound_datum_cmp(key,
bound_datums, content,
(Datum *) probe);
break;
--- 1994,2007 ----
*/
bool lower = boundinfo->indexes[offset] < 0;
! cmpval = partition_rbound_cmp(partnatts, partsupfunc,
! partcollation, bound_datums,
! content, lower,
! (PartitionRangeBound *) probe);
}
else
! cmpval = partition_rbound_datum_cmp(partnatts, partsupfunc,
! partcollation,
bound_datums, content,
(Datum *) probe);
break;
*************** partition_bound_cmp(PartitionKey key, Pa
*** 2003,2009 ****
default:
elog(ERROR, "unexpected partition strategy: %d",
! (int) key->strategy);
}
return cmpval;
--- 2009,2015 ----
default:
elog(ERROR, "unexpected partition strategy: %d",
! (int) boundinfo->strategy);
}
return cmpval;
*************** partition_bound_bsearch(PartitionKey key
*** 2037,2043 ****
int32 cmpval;
mid = (lo + hi + 1) / 2;
! cmpval = partition_bound_cmp(key, boundinfo, mid, probe,
probe_is_bound);
if (cmpval <= 0)
{
--- 2043,2050 ----
int32 cmpval;
mid = (lo + hi + 1) / 2;
! cmpval = partition_bound_cmp(key->partnatts, key->partsupfunc,
! key->partcollation, boundinfo, mid, probe,
probe_is_bound);
if (cmpval <= 0)
{
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
new file mode 100644
index 5a34a46..717763d
*** a/src/backend/executor/execExpr.c
--- b/src/backend/executor/execExpr.c
*************** ExecInitExprRec(Expr *node, PlanState *p
*** 723,728 ****
--- 723,755 ----
break;
}
+ case T_GroupedVar:
+ /*
+ * GroupedVar is treated as an aggregate if it appears in the
+ * targetlist of an Agg node, but as a normal variable elsewhere.
+ */
+ if (parent && (IsA(parent, AggState)))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+
+ /*
+ * Currently GroupedVar can only represent a partial aggregate.
+ */
+ Assert(gvar->agg_partial != NULL);
+
+ ExecInitExprRec((Expr *) gvar->agg_partial, parent, state,
+ resv, resnull);
+ break;
+ }
+ else
+ {
+ /*
+ * set_plan_refs should have replaced GroupedVar in the
+ * targetlist with an ordinary Var.
+ */
+ elog(ERROR, "parent of GroupedVar is not Agg node");
+ }
+
case T_GroupingFunc:
{
GroupingFunc *grp_node = (GroupingFunc *) node;
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
new file mode 100644
index c2b8618..c4cb4c0
*** a/src/backend/executor/nodeAgg.c
--- b/src/backend/executor/nodeAgg.c
*************** find_unaggregated_cols_walker(Node *node
*** 1829,1834 ****
--- 1829,1845 ----
/* do not descend into aggregate exprs */
return false;
}
+ if (IsA(node, GroupedVar))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+
+ /*
+ * GroupedVar is currently used only for partial aggregation, so treat
+ * it like an Aggref above.
+ */
+ Assert(gvar->agg_partial != NULL);
+ return false;
+ }
return expression_tree_walker(node, find_unaggregated_cols_walker,
(void *) colnos);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index 00a0fed..7d188ea
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copyPlaceHolderVar(const PlaceHolderVar
*** 2206,2211 ****
--- 2206,2226 ----
}
/*
+ * _copyGroupedVar
+ */
+ static GroupedVar *
+ _copyGroupedVar(const GroupedVar *from)
+ {
+ GroupedVar *newnode = makeNode(GroupedVar);
+
+ COPY_NODE_FIELD(gvexpr);
+ COPY_NODE_FIELD(agg_partial);
+ COPY_SCALAR_FIELD(gvid);
+
+ return newnode;
+ }
+
+ /*
* _copySpecialJoinInfo
*/
static SpecialJoinInfo *
*************** copyObjectImpl(const void *from)
*** 4984,4989 ****
--- 4999,5007 ----
case T_PlaceHolderVar:
retval = _copyPlaceHolderVar(from);
break;
+ case T_GroupedVar:
+ retval = _copyGroupedVar(from);
+ break;
case T_SpecialJoinInfo:
retval = _copySpecialJoinInfo(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
new file mode 100644
index 46573ae..f1dacd5
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
*************** _equalPlaceHolderVar(const PlaceHolderVa
*** 874,879 ****
--- 874,887 ----
}
static bool
+ _equalGroupedVar(const GroupedVar *a, const GroupedVar *b)
+ {
+ COMPARE_SCALAR_FIELD(gvid);
+
+ return true;
+ }
+
+ static bool
_equalSpecialJoinInfo(const SpecialJoinInfo *a, const SpecialJoinInfo *b)
{
COMPARE_BITMAPSET_FIELD(min_lefthand);
*************** equal(const void *a, const void *b)
*** 3148,3153 ****
--- 3156,3164 ----
case T_PlaceHolderVar:
retval = _equalPlaceHolderVar(a, b);
break;
+ case T_GroupedVar:
+ retval = _equalGroupedVar(a, b);
+ break;
case T_SpecialJoinInfo:
retval = _equalSpecialJoinInfo(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
new file mode 100644
index 3e8189c..5c00e55
*** a/src/backend/nodes/nodeFuncs.c
--- b/src/backend/nodes/nodeFuncs.c
*************** exprType(const Node *expr)
*** 259,264 ****
--- 259,267 ----
case T_PlaceHolderVar:
type = exprType((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupedVar:
+ type = exprType((Node *) ((const GroupedVar *) expr)->agg_partial);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
type = InvalidOid; /* keep compiler quiet */
*************** exprCollation(const Node *expr)
*** 931,936 ****
--- 934,942 ----
case T_PlaceHolderVar:
coll = exprCollation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupedVar:
+ coll = exprCollation((Node *) ((const GroupedVar *) expr)->gvexpr);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
coll = InvalidOid; /* keep compiler quiet */
*************** expression_tree_walker(Node *node,
*** 2198,2203 ****
--- 2204,2211 ----
break;
case T_PlaceHolderVar:
return walker(((PlaceHolderVar *) node)->phexpr, context);
+ case T_GroupedVar:
+ return walker(((GroupedVar *) node)->gvexpr, context);
case T_InferenceElem:
return walker(((InferenceElem *) node)->expr, context);
case T_AppendRelInfo:
*************** expression_tree_mutator(Node *node,
*** 2989,2994 ****
--- 2997,3012 ----
return (Node *) newnode;
}
break;
+ case T_GroupedVar:
+ {
+ GroupedVar *gv = (GroupedVar *) node;
+ GroupedVar *newnode;
+
+ FLATCOPY(newnode, gv, GroupedVar);
+ MUTATE(newnode->gvexpr, gv->gvexpr, Expr *);
+ MUTATE(newnode->agg_partial, gv->agg_partial, Aggref *);
+ return (Node *) newnode;
+ }
case T_InferenceElem:
{
InferenceElem *inferenceelemdexpr = (InferenceElem *) node;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
new file mode 100644
index 28cef85..4b6ee30
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*************** _outPlannerInfo(StringInfo str, const Pl
*** 2186,2191 ****
--- 2186,2192 ----
WRITE_NODE_FIELD(pcinfo_list);
WRITE_NODE_FIELD(rowMarks);
WRITE_NODE_FIELD(placeholder_list);
+ WRITE_NODE_FIELD(grouped_var_list);
WRITE_NODE_FIELD(fkey_list);
WRITE_NODE_FIELD(query_pathkeys);
WRITE_NODE_FIELD(group_pathkeys);
*************** _outParamPathInfo(StringInfo str, const
*** 2408,2413 ****
--- 2409,2424 ----
}
static void
+ _outGroupedPathInfo(StringInfo str, const GroupedPathInfo *node)
+ {
+ WRITE_NODE_TYPE("GROUPEDPATHINFO");
+
+ WRITE_NODE_FIELD(target);
+ WRITE_NODE_FIELD(pathlist);
+ WRITE_NODE_FIELD(partial_pathlist);
+ }
+
+ static void
_outRestrictInfo(StringInfo str, const RestrictInfo *node)
{
WRITE_NODE_TYPE("RESTRICTINFO");
*************** _outPlaceHolderVar(StringInfo str, const
*** 2451,2456 ****
--- 2462,2477 ----
}
static void
+ _outGroupedVar(StringInfo str, const GroupedVar *node)
+ {
+ WRITE_NODE_TYPE("GROUPEDVAR");
+
+ WRITE_NODE_FIELD(gvexpr);
+ WRITE_NODE_FIELD(agg_partial);
+ WRITE_UINT_FIELD(gvid);
+ }
+
+ static void
_outSpecialJoinInfo(StringInfo str, const SpecialJoinInfo *node)
{
WRITE_NODE_TYPE("SPECIALJOININFO");
*************** outNode(StringInfo str, const void *obj)
*** 3996,4007 ****
--- 4017,4034 ----
case T_ParamPathInfo:
_outParamPathInfo(str, obj);
break;
+ case T_GroupedPathInfo:
+ _outGroupedPathInfo(str, obj);
+ break;
case T_RestrictInfo:
_outRestrictInfo(str, obj);
break;
case T_PlaceHolderVar:
_outPlaceHolderVar(str, obj);
break;
+ case T_GroupedVar:
+ _outGroupedVar(str, obj);
+ break;
case T_SpecialJoinInfo:
_outSpecialJoinInfo(str, obj);
break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
new file mode 100644
index a883220..138f71c
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
*************** _readVar(void)
*** 522,527 ****
--- 522,542 ----
}
/*
+ * _readGroupedVar
+ */
+ static GroupedVar *
+ _readGroupedVar(void)
+ {
+ READ_LOCALS(GroupedVar);
+
+ READ_NODE_FIELD(gvexpr);
+ READ_NODE_FIELD(agg_partial);
+ READ_UINT_FIELD(gvid);
+
+ READ_DONE();
+ }
+
+ /*
* _readConst
*/
static Const *
*************** parseNodeString(void)
*** 2440,2445 ****
--- 2455,2462 ----
return_value = _readTableFunc();
else if (MATCH("VAR", 3))
return_value = _readVar();
+ else if (MATCH("GROUPEDVAR", 10))
+ return_value = _readGroupedVar();
else if (MATCH("CONST", 5))
return_value = _readConst();
else if (MATCH("PARAM", 5))
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
new file mode 100644
index fc0fca4..eee093f
*** a/src/backend/optimizer/README
--- b/src/backend/optimizer/README
*************** be desirable to postpone the Gather stag
*** 1076,1078 ****
--- 1076,1105 ----
plan as possible. Expanding the range of cases in which more work can be
pushed below the Gather (and costing them accurately) is likely to keep us
busy for a long time to come.
+
+ Partition-wise joins
+ --------------------
+ A join between two similarly partitioned tables can be broken down into joins
+ between their matching partitions if there exists an equi-join condition
+ between the partition keys of the joining tables. The equi-join between
+ partition keys implies that all join partners for a given row in one
+ partitioned table must be in the corresponding partition of the other
+ partitioned table; no join partner can be found in any other partition. This
+ condition allows the join between partitioned tables to be broken into joins
+ between the matching partitions. The resultant join is partitioned in the same
+ way as the joining relations, thus allowing an N-way join between similarly
+ partitioned tables having equi-join conditions between their partition keys to
+ be broken down into N-way joins between their matching partitions. This
+ technique of breaking down a join between partitioned tables into joins between
+ their partitions is called partition-wise join. We will use the term
+ "partitioned relation" for both a partitioned table and a join between
+ partitioned tables which can use the partition-wise join technique.
+
+ Partitioning properties of a partitioned table are stored in a
+ PartitionSchemeData structure. The planner maintains a list of canonical
+ partition schemes (distinct PartitionSchemeData objects) so that any two
+ partitioned relations with the same partitioning scheme share the same
+ PartitionSchemeData object. This reduces the memory consumed by
+ PartitionSchemeData objects and makes it easy to compare the partition schemes
+ of joining relations. The RelOptInfo of a partitioned relation holds its
+ partition key expressions and the RelOptInfos of its partitions.
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
new file mode 100644
index b5cab0c..1ad910d
*** a/src/backend/optimizer/geqo/geqo_eval.c
--- b/src/backend/optimizer/geqo/geqo_eval.c
*************** merge_clump(PlannerInfo *root, List *clu
*** 264,271 ****
/* Keep searching if join order is not valid */
if (joinrel)
{
/* Create GatherPaths for any useful partial paths for rel */
! generate_gather_paths(root, joinrel);
/* Find and save the cheapest paths for this joinrel */
set_cheapest(joinrel);
--- 264,279 ----
/* Keep searching if join order is not valid */
if (joinrel)
{
+
+ /*
+ * Create "append" paths for partitioned joins. Do this before
+ * creating GatherPaths so that partial "append" paths in
+ * partitioned joins will be considered.
+ */
+ generate_partition_wise_join_paths(root, joinrel);
+
/* Create GatherPaths for any useful partial paths for rel */
! generate_gather_paths(root, joinrel, false);
/* Find and save the cheapest paths for this joinrel */
set_cheapest(joinrel);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
new file mode 100644
index b93b4fc..83a2c37
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
***************
*** 24,29 ****
--- 24,30 ----
#include "catalog/pg_operator.h"
#include "catalog/pg_proc.h"
#include "foreign/fdwapi.h"
+ #include "miscadmin.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#ifdef OPTIMIZER_DEBUG
*************** set_rel_pathlist(PlannerInfo *root, RelO
*** 486,492 ****
* we'll consider gathering partial paths for the parent appendrel.)
*/
if (rel->reloptkind == RELOPT_BASEREL)
! generate_gather_paths(root, rel);
/*
* Allow a plugin to editorialize on the set of Paths for this base
--- 487,496 ----
* we'll consider gathering partial paths for the parent appendrel.)
*/
if (rel->reloptkind == RELOPT_BASEREL)
! {
! generate_gather_paths(root, rel, false);
! generate_gather_paths(root, rel, true);
! }
/*
* Allow a plugin to editorialize on the set of Paths for this base
*************** static void
*** 686,691 ****
--- 690,696 ----
set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
{
Relids required_outer;
+ Path *seq_path;
/*
* We don't support pushing join clauses into the quals of a seqscan, but
*************** set_plain_rel_pathlist(PlannerInfo *root
*** 694,708 ****
*/
required_outer = rel->lateral_relids;
! /* Consider sequential scan */
! add_path(rel, create_seqscan_path(root, rel, required_outer, 0));
! /* If appropriate, consider parallel sequential scan */
if (rel->consider_parallel && required_outer == NULL)
create_plain_partial_paths(root, rel);
/* Consider index scans */
! create_index_paths(root, rel);
/* Consider TID scans */
create_tidscan_paths(root, rel);
--- 699,726 ----
*/
required_outer = rel->lateral_relids;
! /* Consider sequential scan, both plain and grouped. */
! seq_path = create_seqscan_path(root, rel, required_outer, 0);
! add_path(rel, seq_path, false);
! if (rel->gpi != NULL && required_outer == NULL)
! create_grouped_path(root, rel, seq_path, false, false, AGG_HASHED);
! /* If appropriate, consider parallel sequential scan (plain or grouped) */
if (rel->consider_parallel && required_outer == NULL)
create_plain_partial_paths(root, rel);
/* Consider index scans */
! create_index_paths(root, rel, false);
! if (rel->gpi != NULL)
! {
! /*
! * TODO Instead of calling the whole clause-matching machinery twice
! * (there should be no difference between plain and grouped paths from
! * this point of view), consider returning a separate list of paths
! * usable as grouped ones.
! */
! create_index_paths(root, rel, true);
! }
/* Consider TID scans */
create_tidscan_paths(root, rel);
*************** static void
*** 716,721 ****
--- 734,740 ----
create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
{
int parallel_workers;
+ Path *path;
parallel_workers = compute_parallel_worker(rel, rel->pages, -1);
*************** create_plain_partial_paths(PlannerInfo *
*** 724,730 ****
return;
/* Add an unordered partial path based on a parallel sequential scan. */
! add_partial_path(rel, create_seqscan_path(root, rel, NULL, parallel_workers));
}
/*
--- 743,850 ----
return;
/* Add an unordered partial path based on a parallel sequential scan. */
! path = create_seqscan_path(root, rel, NULL, parallel_workers);
! add_partial_path(rel, path, false);
!
! /*
! * Do partial aggregation at base relation level if the relation is
! * eligible for it.
! */
! if (rel->gpi != NULL)
! create_grouped_path(root, rel, path, false, true, AGG_HASHED);
! }
!
! /*
! * Apply partial aggregation to a subpath and add the AggPath to the
! * appropriate pathlist.
! *
! * "precheck" tells whether the aggregation path should first be checked using
! * add_path_precheck().
! *
! * If "partial" is true, the resulting path is considered partial in terms of
! * parallel execution.
! *
! * The path we create here shouldn't be parameterized because of the presumably
! * high startup cost of aggregation (whether due to building the hash table for
! * the AGG_HASHED strategy or due to the explicit sort for AGG_SORTED).
! *
! * XXX IndexPath as an input for AGG_SORTED might seem to be an exception, but
! * aggregation of its output is only beneficial if it's performed by multiple
! * workers, i.e. the resulting path is partial (Besides parallel aggregation,
! * the other use case of aggregation push-down is aggregation performed on
! * remote database, but that has nothing to do with IndexScan). And partial
! * path cannot be parameterized because it's semantically wrong to use it on
! * the inner side of NL join.
! */
! void
! create_grouped_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
! bool precheck, bool partial, AggStrategy aggstrategy)
! {
! List *group_clauses = NIL;
! List *group_exprs = NIL;
! List *agg_exprs = NIL;
! Path *agg_path;
!
! /*
! * If the AggPath should be partial, the subpath must be too, and
! * therefore the subpath is essentially parallel_safe.
! */
! Assert(subpath->parallel_safe || !partial);
!
! /*
! * Grouped path should never be parameterized, so we're not supposed to
! * receive parameterized subpath.
! */
! Assert(subpath->param_info == NULL);
!
! /*
! * Note that "partial" in the following function names refers to 2-stage
! * aggregation, not to parallel processing.
! */
! if (aggstrategy == AGG_HASHED)
! agg_path = (Path *) create_partial_agg_hashed_path(root, subpath,
! true,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! subpath->rows);
! else if (aggstrategy == AGG_SORTED)
! agg_path = (Path *) create_partial_agg_sorted_path(root, subpath,
! true,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! subpath->rows);
! else
! elog(ERROR, "unexpected strategy %d", aggstrategy);
!
! /* Add the grouped path to the list of grouped base paths. */
! if (agg_path != NULL)
! {
! if (precheck)
! {
! List *pathkeys;
!
! /* AGG_HASHED is not supposed to generate sorted output. */
! pathkeys = aggstrategy == AGG_SORTED ? subpath->pathkeys : NIL;
!
! if (!partial &&
! !add_path_precheck(rel, agg_path->startup_cost,
! agg_path->total_cost, pathkeys, NULL,
! true))
! return;
!
! if (partial &&
! !add_partial_path_precheck(rel, agg_path->total_cost, pathkeys,
! true))
! return;
! }
!
! if (!partial)
! add_path(rel, (Path *) agg_path, true);
! else
! add_partial_path(rel, (Path *) agg_path, true);
! }
}
/*
*************** set_tablesample_rel_pathlist(PlannerInfo
*** 810,816 ****
path = (Path *) create_material_path(rel, path);
}
! add_path(rel, path);
/* For the moment, at least, there are no other paths to consider */
}
--- 930,936 ----
path = (Path *) create_material_path(rel, path);
}
! add_path(rel, path, false);
/* For the moment, at least, there are no other paths to consider */
}
*************** set_append_rel_size(PlannerInfo *root, R
*** 915,926 ****
childrel = find_base_rel(root, childRTindex);
Assert(childrel->reloptkind == RELOPT_OTHER_MEMBER_REL);
/*
! * We have to copy the parent's targetlist and quals to the child,
! * with appropriate substitution of variables. However, only the
! * baserestrictinfo quals are needed before we can check for
! * constraint exclusion; so do that first and then check to see if we
! * can disregard this child.
*
* The child rel's targetlist might contain non-Var expressions, which
* means that substitution into the quals could produce opportunities
--- 1035,1100 ----
childrel = find_base_rel(root, childRTindex);
Assert(childrel->reloptkind == RELOPT_OTHER_MEMBER_REL);
+ if (rel->part_scheme)
+ {
+ AttrNumber attno;
+
+ /*
+ * For a partitioned table, individual partitions can participate
+ * in pair-wise joins. We need attr_needed data for building the
+ * targetlists of joins between partitions.
+ */
+ for (attno = rel->min_attr; attno <= rel->max_attr; attno++)
+ {
+ int index = attno - rel->min_attr;
+ Relids attr_needed = bms_copy(rel->attr_needed[index]);
+
+ /* System attributes do not need translation. */
+ if (attno <= 0)
+ {
+ Assert(rel->min_attr == childrel->min_attr);
+ childrel->attr_needed[index] = attr_needed;
+ }
+ else
+ {
+ Var *var = list_nth(appinfo->translated_vars,
+ attno - 1);
+ int child_index;
+
+ /*
+ * Parent Var for a user defined attribute translates to
+ * child Var.
+ */
+ Assert(IsA(var, Var));
+
+ child_index = var->varattno - childrel->min_attr;
+ childrel->attr_needed[child_index] = attr_needed;
+ }
+ }
+ }
+
/*
! * Copy/Modify targetlist. Even if this child is deemed empty, we need
! * its targetlist in case it falls on nullable side in a child-join
! * because of partition-wise join.
! *
! * NB: the resulting childrel->reltarget->exprs may contain arbitrary
! * expressions, which otherwise would not occur in a rel's targetlist.
! * Code that might be looking at an appendrel child must cope with
! * such. (Normally, a rel's targetlist would only include Vars and
! * PlaceHolderVars.) XXX we do not bother to update the cost or width
! * fields of childrel->reltarget; not clear if that would be useful.
! */
! childrel->reltarget->exprs = (List *)
! adjust_appendrel_attrs(root,
! (Node *) rel->reltarget->exprs,
! 1, &appinfo);
!
! /*
! * We have to copy the parent's quals to the child, with appropriate
! * substitution of variables. However, only the baserestrictinfo quals
! * are needed before we can check for constraint exclusion; so do that
! * first and then check to see if we can disregard this child.
*
* The child rel's targetlist might contain non-Var expressions, which
* means that substitution into the quals could produce opportunities
*************** set_append_rel_size(PlannerInfo *root, R
*** 941,947 ****
Assert(IsA(rinfo, RestrictInfo));
childqual = adjust_appendrel_attrs(root,
(Node *) rinfo->clause,
! appinfo);
childqual = eval_const_expressions(root, childqual);
/* check for flat-out constant */
if (childqual && IsA(childqual, Const))
--- 1115,1121 ----
Assert(IsA(rinfo, RestrictInfo));
childqual = adjust_appendrel_attrs(root,
(Node *) rinfo->clause,
! 1, &appinfo);
childqual = eval_const_expressions(root, childqual);
/* check for flat-out constant */
if (childqual && IsA(childqual, Const))
*************** set_append_rel_size(PlannerInfo *root, R
*** 1047,1070 ****
continue;
}
! /*
! * CE failed, so finish copying/modifying targetlist and join quals.
! *
! * NB: the resulting childrel->reltarget->exprs may contain arbitrary
! * expressions, which otherwise would not occur in a rel's targetlist.
! * Code that might be looking at an appendrel child must cope with
! * such. (Normally, a rel's targetlist would only include Vars and
! * PlaceHolderVars.) XXX we do not bother to update the cost or width
! * fields of childrel->reltarget; not clear if that would be useful.
! */
childrel->joininfo = (List *)
adjust_appendrel_attrs(root,
(Node *) rel->joininfo,
! appinfo);
! childrel->reltarget->exprs = (List *)
! adjust_appendrel_attrs(root,
! (Node *) rel->reltarget->exprs,
! appinfo);
/*
* We have to make child entries in the EquivalenceClass data
--- 1221,1231 ----
continue;
}
! /* CE failed, so finish copying/modifying join quals. */
childrel->joininfo = (List *)
adjust_appendrel_attrs(root,
(Node *) rel->joininfo,
! 1, &appinfo);
/*
* We have to make child entries in the EquivalenceClass data
*************** set_append_rel_size(PlannerInfo *root, R
*** 1079,1092 ****
childrel->has_eclass_joins = rel->has_eclass_joins;
/*
- * Note: we could compute appropriate attr_needed data for the child's
- * variables, by transforming the parent's attr_needed through the
- * translated_vars mapping. However, currently there's no need
- * because attr_needed is only examined for base relations not
- * otherrels. So we just leave the child's attr_needed empty.
- */
-
- /*
* If parallelism is allowable for this query in general, see whether
* it's allowable for this childrel in particular. But if we've
* already decided the appendrel is not parallel-safe as a whole,
--- 1240,1245 ----
*************** add_paths_to_append_rel(PlannerInfo *roo
*** 1281,1299 ****
bool subpaths_valid = true;
List *partial_subpaths = NIL;
bool partial_subpaths_valid = true;
List *all_child_pathkeys = NIL;
List *all_child_outers = NIL;
ListCell *l;
List *partitioned_rels = NIL;
RangeTblEntry *rte;
! rte = planner_rt_fetch(rel->relid, root);
! if (rte->relkind == RELKIND_PARTITIONED_TABLE)
{
! partitioned_rels = get_partitioned_child_rels(root, rel->relid);
! /* The root partitioned table is included as a child rel */
! Assert(list_length(partitioned_rels) >= 1);
}
/*
* For every non-dummy child, remember the cheapest path. Also, identify
--- 1434,1460 ----
bool subpaths_valid = true;
List *partial_subpaths = NIL;
bool partial_subpaths_valid = true;
+ List *grouped_subpaths = NIL;
+ bool grouped_subpaths_valid = true;
List *all_child_pathkeys = NIL;
List *all_child_outers = NIL;
ListCell *l;
List *partitioned_rels = NIL;
RangeTblEntry *rte;
! if (rel->reloptkind == RELOPT_BASEREL)
{
! rte = planner_rt_fetch(rel->relid, root);
!
! if (rte->relkind == RELKIND_PARTITIONED_TABLE)
! {
! partitioned_rels = get_partitioned_child_rels(root, rel->relid);
! /* The root partitioned table is included as a child rel */
! Assert(list_length(partitioned_rels) >= 1);
! }
}
+ else if (rel->reloptkind == RELOPT_JOINREL && rel->part_scheme)
+ partitioned_rels = get_partitioned_child_rels_for_join(root, rel);
/*
* For every non-dummy child, remember the cheapest path. Also, identify
*************** add_paths_to_append_rel(PlannerInfo *roo
*** 1324,1329 ****
--- 1485,1521 ----
partial_subpaths_valid = false;
/*
+ * For grouped paths, use only the unparameterized subpaths.
+ *
+ * XXX Consider if the parameterized subpaths should be processed
+ * below. It's probably not useful for sequential scans (due to
+ * repeated aggregation), but might be worthwhile for other child
+ * nodes.
+ */
+ if (childrel->gpi != NULL && childrel->gpi->pathlist != NIL)
+ {
+ Path *path;
+
+ path = (Path *) linitial(childrel->gpi->pathlist);
+
+ /*
+ * PoC only: Simulate remote aggregation, which seems to be the
+ * typical use case for pushing the aggregation below the Append node.
+ */
+ path->startup_cost = 0.0;
+ path->total_cost = 0.0;
+
+ if (path->param_info == NULL)
+ grouped_subpaths = accumulate_append_subpath(grouped_subpaths,
+ path);
+ else
+ grouped_subpaths_valid = false;
+ }
+ else
+ grouped_subpaths_valid = false;
+
+
+ /*
* Collect lists of all the available path orderings and
* parameterizations for all the children. We use these as a
* heuristic to indicate which sort orderings and parameterizations we
*************** add_paths_to_append_rel(PlannerInfo *roo
*** 1395,1401 ****
*/
if (subpaths_valid)
add_path(rel, (Path *) create_append_path(rel, subpaths, NULL, 0,
! partitioned_rels));
/*
* Consider an append of partial unordered, unparameterized partial paths.
--- 1587,1594 ----
*/
if (subpaths_valid)
add_path(rel, (Path *) create_append_path(rel, subpaths, NULL, 0,
! partitioned_rels),
! false);
/*
* Consider an append of partial unordered, unparameterized partial paths.
*************** add_paths_to_append_rel(PlannerInfo *roo
*** 1422,1429 ****
/* Generate a partial append path. */
appendpath = create_append_path(rel, partial_subpaths, NULL,
! parallel_workers, partitioned_rels);
! add_partial_path(rel, (Path *) appendpath);
}
/*
--- 1615,1635 ----
/* Generate a partial append path. */
appendpath = create_append_path(rel, partial_subpaths, NULL,
! parallel_workers,
! partitioned_rels);
! add_partial_path(rel, (Path *) appendpath, false);
! }
!
! /* TODO Also partial grouped paths? */
! if (grouped_subpaths_valid)
! {
! Path *path;
!
! path = (Path *) create_append_path(rel, grouped_subpaths, NULL, 0,
! partitioned_rels);
! /* pathtarget will produce the grouped relation. */
! path->pathtarget = rel->gpi->target;
! add_path(rel, path, true);
}
/*
*************** add_paths_to_append_rel(PlannerInfo *roo
*** 1476,1482 ****
if (subpaths_valid)
add_path(rel, (Path *)
create_append_path(rel, subpaths, required_outer, 0,
! partitioned_rels));
}
}
--- 1682,1689 ----
if (subpaths_valid)
add_path(rel, (Path *)
create_append_path(rel, subpaths, required_outer, 0,
! partitioned_rels),
! false);
}
}
*************** generate_mergeappend_paths(PlannerInfo *
*** 1572,1585 ****
startup_subpaths,
pathkeys,
NULL,
! partitioned_rels));
if (startup_neq_total)
add_path(rel, (Path *) create_merge_append_path(root,
rel,
total_subpaths,
pathkeys,
NULL,
! partitioned_rels));
}
}
--- 1779,1794 ----
startup_subpaths,
pathkeys,
NULL,
! partitioned_rels),
! false);
if (startup_neq_total)
add_path(rel, (Path *) create_merge_append_path(root,
rel,
total_subpaths,
pathkeys,
NULL,
! partitioned_rels),
! false);
}
}
*************** set_dummy_rel_pathlist(RelOptInfo *rel)
*** 1712,1718 ****
rel->pathlist = NIL;
rel->partial_pathlist = NIL;
! add_path(rel, (Path *) create_append_path(rel, NIL, NULL, 0, NIL));
/*
* We set the cheapest path immediately, to ensure that IS_DUMMY_REL()
--- 1921,1927 ----
rel->pathlist = NIL;
rel->partial_pathlist = NIL;
! add_path(rel, (Path *) create_append_path(rel, NIL, NULL, 0, NIL), false);
/*
* We set the cheapest path immediately, to ensure that IS_DUMMY_REL()
*************** set_subquery_pathlist(PlannerInfo *root,
*** 1926,1932 ****
/* Generate outer path using this subpath */
add_path(rel, (Path *)
create_subqueryscan_path(root, rel, subpath,
! pathkeys, required_outer));
}
}
--- 2135,2141 ----
/* Generate outer path using this subpath */
add_path(rel, (Path *)
create_subqueryscan_path(root, rel, subpath,
! pathkeys, required_outer), false);
}
}
*************** set_function_pathlist(PlannerInfo *root,
*** 1995,2001 ****
/* Generate appropriate path */
add_path(rel, create_functionscan_path(root, rel,
! pathkeys, required_outer));
}
/*
--- 2204,2210 ----
/* Generate appropriate path */
add_path(rel, create_functionscan_path(root, rel,
! pathkeys, required_outer), false);
}
/*
*************** set_values_pathlist(PlannerInfo *root, R
*** 2015,2021 ****
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_valuesscan_path(root, rel, required_outer));
}
/*
--- 2224,2230 ----
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_valuesscan_path(root, rel, required_outer), false);
}
/*
*************** set_tablefunc_pathlist(PlannerInfo *root
*** 2036,2042 ****
/* Generate appropriate path */
add_path(rel, create_tablefuncscan_path(root, rel,
! required_outer));
}
/*
--- 2245,2251 ----
/* Generate appropriate path */
add_path(rel, create_tablefuncscan_path(root, rel,
! required_outer), false);
}
/*
*************** set_cte_pathlist(PlannerInfo *root, RelO
*** 2102,2108 ****
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_ctescan_path(root, rel, required_outer));
}
/*
--- 2311,2317 ----
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_ctescan_path(root, rel, required_outer), false);
}
/*
*************** set_namedtuplestore_pathlist(PlannerInfo
*** 2129,2135 ****
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_namedtuplestorescan_path(root, rel, required_outer));
/* Select cheapest path (pretty easy in this case...) */
set_cheapest(rel);
--- 2338,2345 ----
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_namedtuplestorescan_path(root, rel, required_outer),
! false);
/* Select cheapest path (pretty easy in this case...) */
set_cheapest(rel);
*************** set_worktable_pathlist(PlannerInfo *root
*** 2182,2188 ****
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_worktablescan_path(root, rel, required_outer));
}
/*
--- 2392,2399 ----
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_worktablescan_path(root, rel, required_outer),
! false);
}
/*
*************** set_worktable_pathlist(PlannerInfo *root
*** 2195,2208 ****
* path that some GatherPath or GatherMergePath has a reference to.)
*/
void
! generate_gather_paths(PlannerInfo *root, RelOptInfo *rel)
{
Path *cheapest_partial_path;
Path *simple_gather_path;
ListCell *lc;
/* If there are no partial paths, there's nothing to do here. */
! if (rel->partial_pathlist == NIL)
return;
/*
--- 2406,2426 ----
* path that some GatherPath or GatherMergePath has a reference to.)
*/
void
! generate_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool grouped)
{
Path *cheapest_partial_path;
Path *simple_gather_path;
+ List *pathlist = NIL;
+ PathTarget *partial_target;
ListCell *lc;
+ if (!grouped)
+ pathlist = rel->partial_pathlist;
+ else if (rel->gpi != NULL)
+ pathlist = rel->gpi->partial_pathlist;
+
/* If there are no partial paths, there's nothing to do here. */
! if (pathlist == NIL)
return;
/*
*************** generate_gather_paths(PlannerInfo *root,
*** 2210,2226 ****
* path of interest: the cheapest one. That will be the one at the front
* of partial_pathlist because of the way add_partial_path works.
*/
! cheapest_partial_path = linitial(rel->partial_pathlist);
simple_gather_path = (Path *)
! create_gather_path(root, rel, cheapest_partial_path, rel->reltarget,
NULL, NULL);
! add_path(rel, simple_gather_path);
/*
* For each useful ordering, we can consider an order-preserving Gather
* Merge.
*/
! foreach (lc, rel->partial_pathlist)
{
Path *subpath = (Path *) lfirst(lc);
GatherMergePath *path;
--- 2428,2450 ----
* path of interest: the cheapest one. That will be the one at the front
* of partial_pathlist because of the way add_partial_path works.
*/
! cheapest_partial_path = linitial(pathlist);
!
! if (!grouped)
! partial_target = rel->reltarget;
! else if (rel->gpi != NULL)
! partial_target = rel->gpi->target;
!
simple_gather_path = (Path *)
! create_gather_path(root, rel, cheapest_partial_path, partial_target,
NULL, NULL);
! add_path(rel, simple_gather_path, grouped);
/*
* For each useful ordering, we can consider an order-preserving Gather
* Merge.
*/
! foreach (lc, pathlist)
{
Path *subpath = (Path *) lfirst(lc);
GatherMergePath *path;
*************** generate_gather_paths(PlannerInfo *root,
*** 2228,2236 ****
if (subpath->pathkeys == NIL)
continue;
! path = create_gather_merge_path(root, rel, subpath, rel->reltarget,
subpath->pathkeys, NULL, NULL);
! add_path(rel, &path->path);
}
}
--- 2452,2460 ----
if (subpath->pathkeys == NIL)
continue;
! path = create_gather_merge_path(root, rel, subpath, partial_target,
subpath->pathkeys, NULL, NULL);
! add_path(rel, &path->path, grouped);
}
}
*************** standard_join_search(PlannerInfo *root,
*** 2388,2402 ****
* Run generate_gather_paths() for each just-processed joinrel. We
* could not do this earlier because both regular and partial paths
* can get added to a particular joinrel at multiple times within
! * join_search_one_level. After that, we're done creating paths for
! * the joinrel, so run set_cheapest().
*/
foreach(lc, root->join_rel_level[lev])
{
rel = (RelOptInfo *) lfirst(lc);
/* Create GatherPaths for any useful partial paths for rel */
! generate_gather_paths(root, rel);
/* Find and save the cheapest paths for this rel */
set_cheapest(rel);
--- 2612,2641 ----
* Run generate_gather_paths() for each just-processed joinrel. We
* could not do this earlier because both regular and partial paths
* can get added to a particular joinrel at multiple times within
! * join_search_one_level.
! *
! * Similarly, create paths for joinrels that use the partition-wise join
! * technique. We could not do this earlier because paths can get added
! * to a particular child-join at multiple times within
! * join_search_one_level.
! *
! * After that, we're done creating paths for the joinrel, so run
! * set_cheapest().
*/
foreach(lc, root->join_rel_level[lev])
{
rel = (RelOptInfo *) lfirst(lc);
+ /*
+ * Create paths for partition-wise joins. Do this before creating
+ * GatherPaths so that partial "append" paths in partitioned joins
+ * will be considered.
+ */
+ generate_partition_wise_join_paths(root, rel);
+
/* Create GatherPaths for any useful partial paths for rel */
! generate_gather_paths(root, rel, false);
! generate_gather_paths(root, rel, true);
/* Find and save the cheapest paths for this rel */
set_cheapest(rel);
*************** create_partial_bitmap_paths(PlannerInfo
*** 3047,3053 ****
return;
add_partial_path(rel, (Path *) create_bitmap_heap_path(root, rel,
! bitmapqual, rel->lateral_relids, 1.0, parallel_workers));
}
/*
--- 3286,3292 ----
return;
add_partial_path(rel, (Path *) create_bitmap_heap_path(root, rel,
! bitmapqual, rel->lateral_relids, 1.0, parallel_workers), false);
}
/*
*************** compute_parallel_worker(RelOptInfo *rel,
*** 3142,3147 ****
--- 3381,3454 ----
return parallel_workers;
}
+ /*
+ * generate_partition_wise_join_paths
+ *
+ * Create paths representing partition-wise join for given partitioned
+ * join relation.
+ *
+ * This must not be called until after we are done adding paths for all
+ * child-joins. (Otherwise, add_path might delete a path that some "append"
+ * path has a reference to.)
+ */
+ void
+ generate_partition_wise_join_paths(PlannerInfo *root, RelOptInfo *rel)
+ {
+ List *live_children = NIL;
+ int cnt_parts;
+ int num_parts;
+ RelOptInfo **part_rels;
+
+ /* Handle only join relations. */
+ if (!IS_JOIN_REL(rel))
+ return;
+
+ /* If the relation is not partitioned or is proven dummy, nothing to do. */
+ if (!rel->part_scheme || !rel->boundinfo || IS_DUMMY_REL(rel))
+ return;
+
+ /* A partitioned join should have RelOptInfos of the child-joins. */
+ Assert(rel->part_rels && rel->nparts > 0);
+
+ /* Guard against stack overflow due to overly deep partition hierarchy. */
+ check_stack_depth();
+
+ num_parts = rel->nparts;
+ part_rels = rel->part_rels;
+
+ /* Collect non-dummy child-joins. */
+ for (cnt_parts = 0; cnt_parts < num_parts; cnt_parts++)
+ {
+ RelOptInfo *child_rel = part_rels[cnt_parts];
+
+ /* Add partition-wise join paths for partitioned child-joins. */
+ generate_partition_wise_join_paths(root, child_rel);
+
+ /* Dummy children will not be scanned, so ignore those. */
+ if (IS_DUMMY_REL(child_rel))
+ continue;
+
+ set_cheapest(child_rel);
+
+ #ifdef OPTIMIZER_DEBUG
+ debug_print_rel(root, child_rel);
+ #endif
+
+ live_children = lappend(live_children, child_rel);
+ }
+
+ /* If all child-joins are dummy, parent join is also dummy. */
+ if (!live_children)
+ {
+ mark_dummy_rel(rel);
+ return;
+ }
+
+ /* Add "append" paths containing paths from child-joins. */
+ add_paths_to_append_rel(root, rel, live_children);
+ list_free(live_children);
+ }
+
/*****************************************************************************
* DEBUG SUPPORT
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
new file mode 100644
index 52643d0..f278b77
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
*************** bool enable_material = true;
*** 127,132 ****
--- 127,133 ----
bool enable_mergejoin = true;
bool enable_hashjoin = true;
bool enable_gathermerge = true;
+ bool enable_partition_wise_join = false;
typedef struct
{
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
new file mode 100644
index 67bd760..780ea04
*** a/src/backend/optimizer/path/equivclass.c
--- b/src/backend/optimizer/path/equivclass.c
*************** generate_join_implied_equalities_broken(
*** 1329,1335 ****
if (IS_OTHER_REL(inner_rel) && result != NIL)
result = (List *) adjust_appendrel_attrs_multilevel(root,
(Node *) result,
! inner_rel);
return result;
}
--- 1329,1336 ----
if (IS_OTHER_REL(inner_rel) && result != NIL)
result = (List *) adjust_appendrel_attrs_multilevel(root,
(Node *) result,
! inner_rel->relids,
! inner_rel->top_parent_relids);
return result;
}
*************** add_child_rel_equivalences(PlannerInfo *
*** 2112,2118 ****
child_expr = (Expr *)
adjust_appendrel_attrs(root,
(Node *) cur_em->em_expr,
! appinfo);
/*
* Transform em_relids to match. Note we do *not* do
--- 2113,2119 ----
child_expr = (Expr *)
adjust_appendrel_attrs(root,
(Node *) cur_em->em_expr,
! 1, &appinfo);
/*
* Transform em_relids to match. Note we do *not* do
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
new file mode 100644
index 6e4bae8..a6fa713
*** a/src/backend/optimizer/path/indxpath.c
--- b/src/backend/optimizer/path/indxpath.c
***************
*** 32,37 ****
--- 32,38 ----
#include "optimizer/predtest.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+ #include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
*************** static bool eclass_already_used(Equivale
*** 107,119 ****
static bool bms_equal_any(Relids relids, List *relids_list);
static void get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
! List **bitindexpaths);
static List *build_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
! bool *skip_lower_saop);
static List *build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
List *clauses, List *other_clauses);
static List *generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
--- 108,121 ----
static bool bms_equal_any(Relids relids, List *relids_list);
static void get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
! List **bitindexpaths, bool grouped);
static List *build_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
! bool *skip_lower_saop,
! bool grouped);
static List *build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
List *clauses, List *other_clauses);
static List *generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
*************** static Const *string_to_const(const char
*** 229,235 ****
* as meaning "unparameterized so far as the indexquals are concerned".
*/
void
! create_index_paths(PlannerInfo *root, RelOptInfo *rel)
{
List *indexpaths;
List *bitindexpaths;
--- 231,237 ----
* as meaning "unparameterized so far as the indexquals are concerned".
*/
void
! create_index_paths(PlannerInfo *root, RelOptInfo *rel, bool grouped)
{
List *indexpaths;
List *bitindexpaths;
*************** create_index_paths(PlannerInfo *root, Re
*** 274,281 ****
* non-parameterized paths. Plain paths go directly to add_path(),
* bitmap paths are added to bitindexpaths to be handled below.
*/
! get_index_paths(root, rel, index, &rclauseset,
! &bitindexpaths);
/*
* Identify the join clauses that can match the index. For the moment
--- 276,283 ----
* non-parameterized paths. Plain paths go directly to add_path(),
* bitmap paths are added to bitindexpaths to be handled below.
*/
! get_index_paths(root, rel, index, &rclauseset, &bitindexpaths,
! grouped);
/*
* Identify the join clauses that can match the index. For the moment
*************** create_index_paths(PlannerInfo *root, Re
*** 338,344 ****
bitmapqual = choose_bitmap_and(root, rel, bitindexpaths);
bpath = create_bitmap_heap_path(root, rel, bitmapqual,
rel->lateral_relids, 1.0, 0);
! add_path(rel, (Path *) bpath);
/* create a partial bitmap heap path */
if (rel->consider_parallel && rel->lateral_relids == NULL)
--- 340,346 ----
bitmapqual = choose_bitmap_and(root, rel, bitindexpaths);
bpath = create_bitmap_heap_path(root, rel, bitmapqual,
rel->lateral_relids, 1.0, 0);
! add_path(rel, (Path *) bpath, false);
/* create a partial bitmap heap path */
if (rel->consider_parallel && rel->lateral_relids == NULL)
*************** create_index_paths(PlannerInfo *root, Re
*** 415,421 ****
loop_count = get_loop_count(root, rel->relid, required_outer);
bpath = create_bitmap_heap_path(root, rel, bitmapqual,
required_outer, loop_count, 0);
! add_path(rel, (Path *) bpath);
}
}
}
--- 417,423 ----
loop_count = get_loop_count(root, rel->relid, required_outer);
bpath = create_bitmap_heap_path(root, rel, bitmapqual,
required_outer, loop_count, 0);
! add_path(rel, (Path *) bpath, false);
}
}
}
*************** get_join_index_paths(PlannerInfo *root,
*** 667,673 ****
Assert(clauseset.nonempty);
/* Build index path(s) using the collected set of clauses */
! get_index_paths(root, rel, index, &clauseset, bitindexpaths);
/*
* Remember we considered paths for this set of relids. We use lcons not
--- 669,675 ----
Assert(clauseset.nonempty);
/* Build index path(s) using the collected set of clauses */
! get_index_paths(root, rel, index, &clauseset, bitindexpaths, false);
/*
* Remember we considered paths for this set of relids. We use lcons not
*************** bms_equal_any(Relids relids, List *relid
*** 736,742 ****
static void
get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
! List **bitindexpaths)
{
List *indexpaths;
bool skip_nonnative_saop = false;
--- 738,744 ----
static void
get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
! List **bitindexpaths, bool grouped)
{
List *indexpaths;
bool skip_nonnative_saop = false;
*************** get_index_paths(PlannerInfo *root, RelOp
*** 754,760 ****
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
! &skip_lower_saop);
/*
* If we skipped any lower-order ScalarArrayOpExprs on an index with an AM
--- 756,762 ----
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
! &skip_lower_saop, grouped);
/*
* If we skipped any lower-order ScalarArrayOpExprs on an index with an AM
*************** get_index_paths(PlannerInfo *root, RelOp
*** 769,775 ****
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
! NULL));
}
/*
--- 771,777 ----
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
! NULL, grouped));
}
/*
*************** get_index_paths(PlannerInfo *root, RelOp
*** 789,797 ****
IndexPath *ipath = (IndexPath *) lfirst(lc);
if (index->amhasgettuple)
! add_path(rel, (Path *) ipath);
! if (index->amhasgetbitmap &&
(ipath->path.pathkeys == NIL ||
ipath->indexselectivity < 1.0))
*bitindexpaths = lappend(*bitindexpaths, ipath);
--- 791,799 ----
IndexPath *ipath = (IndexPath *) lfirst(lc);
if (index->amhasgettuple)
! add_path(rel, (Path *) ipath, grouped);
! if (!grouped && index->amhasgetbitmap &&
(ipath->path.pathkeys == NIL ||
ipath->indexselectivity < 1.0))
*bitindexpaths = lappend(*bitindexpaths, ipath);
*************** get_index_paths(PlannerInfo *root, RelOp
*** 802,815 ****
* natively, generate bitmap scan paths relying on executor-managed
* ScalarArrayOpExpr.
*/
! if (skip_nonnative_saop)
{
indexpaths = build_index_paths(root, rel,
index, clauses,
false,
ST_BITMAPSCAN,
NULL,
! NULL);
*bitindexpaths = list_concat(*bitindexpaths, indexpaths);
}
}
--- 804,818 ----
* natively, generate bitmap scan paths relying on executor-managed
* ScalarArrayOpExpr.
*/
! if (!grouped && skip_nonnative_saop)
{
indexpaths = build_index_paths(root, rel,
index, clauses,
false,
ST_BITMAPSCAN,
NULL,
! NULL,
! false);
*bitindexpaths = list_concat(*bitindexpaths, indexpaths);
}
}
*************** build_index_paths(PlannerInfo *root, Rel
*** 861,867 ****
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
! bool *skip_lower_saop)
{
List *result = NIL;
IndexPath *ipath;
--- 864,870 ----
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
! bool *skip_lower_saop, bool grouped)
{
List *result = NIL;
IndexPath *ipath;
*************** build_index_paths(PlannerInfo *root, Rel
*** 878,883 ****
--- 881,890 ----
bool index_is_ordered;
bool index_only_scan;
int indexcol;
+ bool can_agg_sorted;
+ List *group_clauses, *group_exprs, *agg_exprs;
+ AggPath *agg_path;
+ double agg_input_rows;
/*
* Check that index supports the desired scan type(s)
*************** build_index_paths(PlannerInfo *root, Rel
*** 891,896 ****
--- 898,906 ----
case ST_BITMAPSCAN:
if (!index->amhasgetbitmap)
return NIL;
+
+ if (grouped)
+ return NIL;
break;
case ST_ANYSCAN:
/* either or both are OK */
*************** build_index_paths(PlannerInfo *root, Rel
*** 1032,1037 ****
--- 1042,1051 ----
* later merging or final output ordering, OR the index has a useful
* predicate, OR an index-only scan is possible.
*/
+ can_agg_sorted = true;
+ group_clauses = NIL;
+ group_exprs = NIL;
+ agg_exprs = NIL;
if (index_clauses != NIL || useful_pathkeys != NIL || useful_predicate ||
index_only_scan)
{
*************** build_index_paths(PlannerInfo *root, Rel
*** 1048,1054 ****
outer_relids,
loop_count,
false);
! result = lappend(result, ipath);
/*
* If appropriate, consider parallel index scan. We don't allow
--- 1062,1086 ----
outer_relids,
loop_count,
false);
! if (!grouped)
! result = lappend(result, ipath);
! else
! {
! /* TODO Double-check if this is the correct input value. */
! agg_input_rows = rel->rows * ipath->indexselectivity;
!
! agg_path = create_partial_agg_sorted_path(root, (Path *) ipath,
! true,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! agg_input_rows);
!
! if (agg_path != NULL)
! result = lappend(result, agg_path);
! else
! can_agg_sorted = false;
! }
/*
* If appropriate, consider parallel index scan. We don't allow
*************** build_index_paths(PlannerInfo *root, Rel
*** 1077,1083 ****
* using parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
! add_partial_path(rel, (Path *) ipath);
else
pfree(ipath);
}
--- 1109,1139 ----
* using parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
! {
! if (!grouped)
! add_partial_path(rel, (Path *) ipath, grouped);
! else if (can_agg_sorted && outer_relids == NULL)
! {
! /* TODO Double-check if this is the correct input value. */
! agg_input_rows = rel->rows * ipath->indexselectivity;
!
! agg_path = create_partial_agg_sorted_path(root,
! (Path *) ipath,
! false,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! agg_input_rows);
!
! /*
! * If create_partial_agg_sorted_path succeeded once, it should
! * always succeed.
! */
! Assert(agg_path != NULL);
!
! add_partial_path(rel, (Path *) agg_path, grouped);
! }
! }
else
pfree(ipath);
}
*************** build_index_paths(PlannerInfo *root, Rel
*** 1105,1111 ****
outer_relids,
loop_count,
false);
! result = lappend(result, ipath);
/* If appropriate, consider parallel index scan */
if (index->amcanparallel &&
--- 1161,1185 ----
outer_relids,
loop_count,
false);
!
! if (!grouped)
! result = lappend(result, ipath);
! else if (can_agg_sorted)
! {
! /* TODO Double-check if this is the correct input value. */
! agg_input_rows = rel->rows * ipath->indexselectivity;
!
! agg_path = create_partial_agg_sorted_path(root,
! (Path *) ipath,
! true,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! agg_input_rows);
!
! Assert(agg_path != NULL);
! result = lappend(result, agg_path);
! }
/* If appropriate, consider parallel index scan */
if (index->amcanparallel &&
*************** build_index_paths(PlannerInfo *root, Rel
*** 1129,1135 ****
* using parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
! add_partial_path(rel, (Path *) ipath);
else
pfree(ipath);
}
--- 1203,1227 ----
* using parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
! {
! if (!grouped)
! add_partial_path(rel, (Path *) ipath, grouped);
! else if (can_agg_sorted && outer_relids == NULL)
! {
! /* TODO Double-check if this is the correct input value. */
! agg_input_rows = rel->rows * ipath->indexselectivity;
!
! agg_path = create_partial_agg_sorted_path(root,
! (Path *) ipath,
! false,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! agg_input_rows);
! Assert(agg_path != NULL);
! add_partial_path(rel, (Path *) agg_path, grouped);
! }
! }
else
pfree(ipath);
}
*************** build_paths_for_OR(PlannerInfo *root, Re
*** 1244,1250 ****
useful_predicate,
ST_BITMAPSCAN,
NULL,
! NULL);
result = list_concat(result, indexpaths);
}
--- 1336,1343 ----
useful_predicate,
ST_BITMAPSCAN,
NULL,
! NULL,
! false);
result = list_concat(result, indexpaths);
}
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
new file mode 100644
index 5aedcd1..f25719f
*** a/src/backend/optimizer/path/joinpath.c
--- b/src/backend/optimizer/path/joinpath.c
***************
*** 22,34 ****
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
/* Hook for plugins to get control in add_paths_to_joinrel() */
set_join_pathlist_hook_type set_join_pathlist_hook = NULL;
! #define PATH_PARAM_BY_REL(path, rel) \
((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), (rel)->relids))
static void try_partial_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
Path *outer_path,
--- 22,45 ----
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
+ #include "optimizer/tlist.h"
/* Hook for plugins to get control in add_paths_to_joinrel() */
set_join_pathlist_hook_type set_join_pathlist_hook = NULL;
! /*
! * Paths parameterized by the parent can be considered to be parameterized by
! * any of its children.
! */
! #define PATH_PARAM_BY_PARENT(path, rel) \
! ((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), \
! (rel)->top_parent_relids))
! #define PATH_PARAM_BY_REL_SELF(path, rel) \
((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), (rel)->relids))
+ #define PATH_PARAM_BY_REL(path, rel) \
+ (PATH_PARAM_BY_REL_SELF(path, rel) || PATH_PARAM_BY_PARENT(path, rel))
+
static void try_partial_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
Path *outer_path,
*************** static void try_partial_mergejoin_path(P
*** 38,66 ****
List *outersortkeys,
List *innersortkeys,
JoinType jointype,
! JoinPathExtraData *extra);
static void sort_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra);
static void match_unsorted_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra);
static void consider_parallel_nestloop(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra);
static void consider_parallel_mergejoin(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
! Path *inner_cheapest_total);
static void hash_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra);
static List *select_mergejoin_clauses(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
--- 49,97 ----
List *outersortkeys,
List *innersortkeys,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate);
static void sort_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra,
! bool grouped);
! static void sort_inner_and_outer_common(PlannerInfo *root,
! RelOptInfo *joinrel,
! RelOptInfo *outerrel,
! RelOptInfo *innerrel,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped_outer,
! bool grouped_inner,
! bool do_aggregate);
static void match_unsorted_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra,
! bool grouped);
static void consider_parallel_nestloop(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped, bool do_aggregate);
static void consider_parallel_mergejoin(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
! Path *inner_cheapest_total,
! bool grouped);
static void hash_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra,
! bool grouped);
! static bool is_grouped_join_target_complete(PlannerInfo *root,
! PathTarget *jointarget,
! Path *outer_path,
! Path *inner_path);
static List *select_mergejoin_clauses(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
*************** static void generate_mergejoin_paths(Pla
*** 77,83 ****
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
! bool is_partial);
/*
--- 108,117 ----
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
! bool is_partial,
! bool grouped_outer,
! bool grouped_inner,
! bool do_aggregate);
/*
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 115,120 ****
--- 149,167 ----
JoinPathExtraData extra;
bool mergejoin_allowed = true;
ListCell *lc;
+ Relids joinrelids;
+
+ /*
+ * PlannerInfo doesn't contain the SpecialJoinInfos created for joins
+ * between child relations, even if there is a SpecialJoinInfo node for
+ * the join between the topmost parents. Hence while calculating Relids
+ * set representing the restriction, consider relids of topmost parent
+ * of partitions.
+ */
+ if (joinrel->reloptkind == RELOPT_OTHER_JOINREL)
+ joinrelids = joinrel->top_parent_relids;
+ else
+ joinrelids = joinrel->relids;
extra.restrictlist = restrictlist;
extra.mergeclause_list = NIL;
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 197,212 ****
* join has already been proven legal.) If the SJ is relevant, it
* presents constraints for joining to anything not in its RHS.
*/
! if (bms_overlap(joinrel->relids, sjinfo2->min_righthand) &&
! !bms_overlap(joinrel->relids, sjinfo2->min_lefthand))
extra.param_source_rels = bms_join(extra.param_source_rels,
bms_difference(root->all_baserels,
sjinfo2->min_righthand));
/* full joins constrain both sides symmetrically */
if (sjinfo2->jointype == JOIN_FULL &&
! bms_overlap(joinrel->relids, sjinfo2->min_lefthand) &&
! !bms_overlap(joinrel->relids, sjinfo2->min_righthand))
extra.param_source_rels = bms_join(extra.param_source_rels,
bms_difference(root->all_baserels,
sjinfo2->min_lefthand));
--- 244,259 ----
* join has already been proven legal.) If the SJ is relevant, it
* presents constraints for joining to anything not in its RHS.
*/
! if (bms_overlap(joinrelids, sjinfo2->min_righthand) &&
! !bms_overlap(joinrelids, sjinfo2->min_lefthand))
extra.param_source_rels = bms_join(extra.param_source_rels,
bms_difference(root->all_baserels,
sjinfo2->min_righthand));
/* full joins constrain both sides symmetrically */
if (sjinfo2->jointype == JOIN_FULL &&
! bms_overlap(joinrelids, sjinfo2->min_lefthand) &&
! !bms_overlap(joinrelids, sjinfo2->min_righthand))
extra.param_source_rels = bms_join(extra.param_source_rels,
bms_difference(root->all_baserels,
sjinfo2->min_lefthand));
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 227,234 ****
* sorted. Skip this if we can't mergejoin.
*/
if (mergejoin_allowed)
sort_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra);
/*
* 2. Consider paths where the outer relation need not be explicitly
--- 274,285 ----
* sorted. Skip this if we can't mergejoin.
*/
if (mergejoin_allowed)
+ {
sort_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, false);
! sort_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, true);
! }
/*
* 2. Consider paths where the outer relation need not be explicitly
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 238,245 ****
* joins at all, so it wouldn't work in the prohibited cases either.)
*/
if (mergejoin_allowed)
match_unsorted_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra);
#ifdef NOT_USED
--- 289,300 ----
* joins at all, so it wouldn't work in the prohibited cases either.)
*/
if (mergejoin_allowed)
+ {
match_unsorted_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, false);
! match_unsorted_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, true);
! }
#ifdef NOT_USED
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 265,272 ****
* joins, because there may be no other alternative.
*/
if (enable_hashjoin || jointype == JOIN_FULL)
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra);
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
--- 320,331 ----
* joins, because there may be no other alternative.
*/
if (enable_hashjoin || jointype == JOIN_FULL)
+ {
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, false);
! hash_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, true);
! }
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 304,321 ****
*/
static inline bool
allow_star_schema_join(PlannerInfo *root,
! Path *outer_path,
! Path *inner_path)
{
- Relids innerparams = PATH_REQ_OUTER(inner_path);
- Relids outerrelids = outer_path->parent->relids;
-
/*
* It's a star-schema case if the outer rel provides some but not all of
* the inner rel's parameterization.
*/
! return (bms_overlap(innerparams, outerrelids) &&
! bms_nonempty_difference(innerparams, outerrelids));
}
/*
--- 363,377 ----
*/
static inline bool
allow_star_schema_join(PlannerInfo *root,
! Relids outerrelids,
! Relids inner_paramrels)
{
/*
* It's a star-schema case if the outer rel provides some but not all of
* the inner rel's parameterization.
*/
! return (bms_overlap(inner_paramrels, outerrelids) &&
! bms_nonempty_difference(inner_paramrels, outerrelids));
}
/*
*************** try_nestloop_path(PlannerInfo *root,
*** 330,339 ****
Path *inner_path,
List *pathkeys,
JoinType jointype,
! JoinPathExtraData *extra)
{
Relids required_outer;
JoinCostWorkspace workspace;
/*
* Check to see if proposed path is still parameterized, and reject if the
--- 386,427 ----
Path *inner_path,
List *pathkeys,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
+ RelOptInfo *innerrel = inner_path->parent;
+ RelOptInfo *outerrel = outer_path->parent;
+ Relids innerrelids;
+ Relids outerrelids;
+ Relids inner_paramrels = PATH_REQ_OUTER(inner_path);
+ Relids outer_paramrels = PATH_REQ_OUTER(outer_path);
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* Caller should not request aggregation w/o grouped output. */
+ Assert(!do_aggregate || grouped);
+
+ /* GroupedPathInfo is necessary for us to produce a grouped set. */
+ Assert(joinrel->gpi != NULL || !grouped);
+
+ /*
+ * Parameterized paths in the child relations (base or join) are
+ * parameterized by top-level parent. Any paths we will create to be
+ * parameterized by the child relations, are not added to the
+ * pathlist. Hence run parameterization tests on the parent relids.
+ */
+ if (innerrel->top_parent_relids)
+ innerrelids = innerrel->top_parent_relids;
+ else
+ innerrelids = innerrel->relids;
+
+ if (outerrel->top_parent_relids)
+ outerrelids = outerrel->top_parent_relids;
+ else
+ outerrelids = outerrel->relids;
/*
* Check to see if proposed path is still parameterized, and reject if the
*************** try_nestloop_path(PlannerInfo *root,
*** 341,359 ****
* says to allow it anyway. Also, we must reject if have_dangerous_phv
* doesn't like the look of it, which could only happen if the nestloop is
* still parameterized.
*/
! required_outer = calc_nestloop_required_outer(outer_path,
! inner_path);
! if (required_outer &&
! ((!bms_overlap(required_outer, extra->param_source_rels) &&
! !allow_star_schema_join(root, outer_path, inner_path)) ||
! have_dangerous_phv(root,
! outer_path->parent->relids,
! PATH_REQ_OUTER(inner_path))))
{
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
}
/*
--- 429,452 ----
* says to allow it anyway. Also, we must reject if have_dangerous_phv
* doesn't like the look of it, which could only happen if the nestloop is
* still parameterized.
+ *
+ * Grouped path should never be parameterized.
*/
! required_outer = calc_nestloop_required_outer(outerrelids, outer_paramrels,
! innerrelids, inner_paramrels);
! if (required_outer)
{
! if (grouped ||
! (!bms_overlap(required_outer, extra->param_source_rels) &&
! !allow_star_schema_join(root, outerrelids, inner_paramrels)) ||
! have_dangerous_phv(root,
! outer_path->parent->relids,
! PATH_REQ_OUTER(inner_path)))
! {
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
! }
}
/*
*************** try_nestloop_path(PlannerInfo *root,
*** 368,388 ****
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
! if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! pathkeys, required_outer))
{
! add_path(joinrel, (Path *)
! create_nestloop_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! required_outer));
}
else
{
--- 461,522 ----
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! join_target = joinrel->gpi->target;
!
! join_path = (Path *) create_nestloop_path(root, joinrel, jointype,
! &workspace, extra,
! outer_path, inner_path,
! extra->restrictlist, pathkeys,
! required_outer, join_target);
!
! /* Do partial aggregation if needed. */
! if (do_aggregate && required_outer == NULL)
! {
! create_grouped_path(root, joinrel, join_path, true, false,
! AGG_HASHED);
! create_grouped_path(root, joinrel, join_path, true, false,
! AGG_SORTED);
! }
! else if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! pathkeys, required_outer, grouped))
{
! /*
! * Since result produced by a child is part of the result produced by
! * its topmost parent and has same properties, the parameters
! * representing that parent may be substituted by values from a child.
! * Hence expressions and hence paths using those expressions,
! * parameterized by a parent can be said to be parameterized by any of
! * its children. For a join between child relations, if the inner path is
! * parameterized by the parent of the outer relation, create a
! * nestloop join path with inner relation parameterized by the outer
! * relation by translating the inner path to be parameterized by the
! * outer child relation. The translated path should have the same costs
! * as the original path, so cost check above should still hold.
! */
! if (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent))
! {
! inner_path = reparameterize_path_by_child(root, inner_path,
! outer_path->parent);
!
! /*
! * If we could not translate the path, we can't create nest loop
! * path.
! */
! if (!inner_path)
! return;
! }
!
! add_path(joinrel, join_path, grouped);
}
else
{
*************** try_partial_nestloop_path(PlannerInfo *r
*** 403,411 ****
Path *inner_path,
List *pathkeys,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinCostWorkspace workspace;
/*
* If the inner path is parameterized, the parameterization must be fully
--- 537,553 ----
Path *inner_path,
List *pathkeys,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate)
{
JoinCostWorkspace workspace;
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* The same checks we do in try_nestloop_path. */
+ Assert(!do_aggregate || grouped);
+ Assert(joinrel->gpi != NULL || !grouped);
/*
* If the inner path is parameterized, the parameterization must be fully
*************** try_partial_nestloop_path(PlannerInfo *r
*** 428,448 ****
*/
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
! if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
return;
! /* Might be good enough to be worth trying, so let's try it. */
! add_partial_path(joinrel, (Path *)
! create_nestloop_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! NULL));
}
/*
--- 570,650 ----
*/
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
!
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! {
! Assert(joinrel->gpi != NULL);
! join_target = joinrel->gpi->target;
! }
!
! join_path = (Path *) create_nestloop_path(root, joinrel, jointype,
! &workspace, extra,
! outer_path, inner_path,
! extra->restrictlist, pathkeys,
! NULL, join_target);
!
! if (do_aggregate)
! {
! create_grouped_path(root, joinrel, join_path, true, true, AGG_HASHED);
! create_grouped_path(root, joinrel, join_path, true, true, AGG_SORTED);
! }
! else if (add_partial_path_precheck(joinrel, workspace.total_cost,
! pathkeys, grouped))
! {
! /* Might be good enough to be worth trying, so let's try it. */
! add_partial_path(joinrel, (Path *) join_path, grouped);
! }
! }
!
! static void
! try_grouped_nestloop_path(PlannerInfo *root,
! RelOptInfo *joinrel,
! Path *outer_path,
! Path *inner_path,
! List *pathkeys,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool do_aggregate,
! bool partial)
! {
! /*
! * Missing GroupedPathInfo indicates that we should not try to create a
! * grouped join.
! */
! if (joinrel->gpi == NULL)
return;
! /*
! * Reject the path if we're supposed to combine grouped and plain relation
! * but the grouped one does not evaluate all the relevant aggregates.
! */
! if (!do_aggregate &&
! !is_grouped_join_target_complete(root, joinrel->gpi->target,
! outer_path, inner_path))
! return;
!
! /*
! * As repeated aggregation doesn't seem to be attractive, make sure that
! * the resulting grouped relation is not parameterized.
! */
! if (outer_path->param_info != NULL || inner_path->param_info != NULL)
! return;
!
! if (!partial)
! try_nestloop_path(root, joinrel, outer_path, inner_path, pathkeys,
! jointype, extra, true, do_aggregate);
! else
! try_partial_nestloop_path(root, joinrel, outer_path, inner_path,
! pathkeys, jointype, extra, true,
! do_aggregate);
}
/*
*************** try_mergejoin_path(PlannerInfo *root,
*** 461,470 ****
List *innersortkeys,
JoinType jointype,
JoinPathExtraData *extra,
! bool is_partial)
{
Relids required_outer;
JoinCostWorkspace workspace;
if (is_partial)
{
--- 663,682 ----
List *innersortkeys,
JoinType jointype,
JoinPathExtraData *extra,
! bool is_partial,
! bool grouped,
! bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* Caller should not request aggregation w/o grouped output. */
+ Assert(!do_aggregate || grouped);
+
+ /* GroupedPathInfo is necessary for us to produce a grouped set. */
+ Assert(joinrel->gpi != NULL || !grouped);
if (is_partial)
{
*************** try_mergejoin_path(PlannerInfo *root,
*** 477,498 ****
outersortkeys,
innersortkeys,
jointype,
! extra);
return;
}
/*
! * Check to see if proposed path is still parameterized, and reject if the
! * parameterization wouldn't be sensible.
*/
! required_outer = calc_non_nestloop_required_outer(outer_path,
! inner_path);
! if (required_outer &&
! !bms_overlap(required_outer, extra->param_source_rels))
{
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
}
/*
--- 689,713 ----
outersortkeys,
innersortkeys,
jointype,
! extra,
! grouped,
! do_aggregate);
return;
}
/*
! * Check to see if proposed path is still parameterized, and reject if
! * it's grouped or if the parameterization wouldn't be sensible.
*/
! required_outer = calc_non_nestloop_required_outer(outer_path, inner_path);
! if (required_outer)
{
! if (grouped || !bms_overlap(required_outer, extra->param_source_rels))
! {
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
! }
}
/*
*************** try_mergejoin_path(PlannerInfo *root,
*** 511,537 ****
*/
initial_cost_mergejoin(root, &workspace, jointype, mergeclauses,
outer_path, inner_path,
! outersortkeys, innersortkeys,
! extra);
! if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! pathkeys, required_outer))
{
! add_path(joinrel, (Path *)
! create_mergejoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! required_outer,
! mergeclauses,
! outersortkeys,
! innersortkeys));
}
else
{
--- 726,773 ----
*/
initial_cost_mergejoin(root, &workspace, jointype, mergeclauses,
outer_path, inner_path,
! outersortkeys, innersortkeys, extra);
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! join_target = joinrel->gpi->target;
!
!
! join_path = (Path *) create_mergejoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! required_outer,
! mergeclauses,
! outersortkeys,
! innersortkeys,
! join_target);
!
! /* Do partial aggregation if needed. */
! if (do_aggregate)
! {
! create_grouped_path(root, joinrel, join_path, true, false,
! AGG_HASHED);
! create_grouped_path(root, joinrel, join_path, true, false,
! AGG_SORTED);
! }
! else if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! pathkeys, required_outer, grouped))
{
! add_path(joinrel, (Path *) join_path, grouped);
}
else
{
*************** try_partial_mergejoin_path(PlannerInfo *
*** 555,563 ****
List *outersortkeys,
List *innersortkeys,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinCostWorkspace workspace;
/*
* See comments in try_partial_hashjoin_path().
--- 791,807 ----
List *outersortkeys,
List *innersortkeys,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate)
{
JoinCostWorkspace workspace;
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* The same checks we do in try_mergejoin_path. */
+ Assert(!do_aggregate || grouped);
+ Assert(joinrel->gpi != NULL || !grouped);
/*
* See comments in try_partial_hashjoin_path().
*************** try_partial_mergejoin_path(PlannerInfo *
*** 587,613 ****
*/
initial_cost_mergejoin(root, &workspace, jointype, mergeclauses,
outer_path, inner_path,
! outersortkeys, innersortkeys,
! extra);
! if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
return;
! /* Might be good enough to be worth trying, so let's try it. */
! add_partial_path(joinrel, (Path *)
! create_mergejoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! NULL,
! mergeclauses,
! outersortkeys,
! innersortkeys));
}
/*
--- 831,1003 ----
*/
initial_cost_mergejoin(root, &workspace, jointype, mergeclauses,
outer_path, inner_path,
! outersortkeys, innersortkeys, extra);
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! {
! Assert(joinrel->gpi != NULL);
! join_target = joinrel->gpi->target;
! }
!
! join_path = (Path *) create_mergejoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! NULL,
! mergeclauses,
! outersortkeys,
! innersortkeys,
! join_target);
!
! if (do_aggregate)
! {
! create_grouped_path(root, joinrel, join_path, true, true, AGG_HASHED);
! create_grouped_path(root, joinrel, join_path, true, true, AGG_SORTED);
! }
! else if (add_partial_path_precheck(joinrel, workspace.total_cost,
! pathkeys, grouped))
! {
! /* Might be good enough to be worth trying, so let's try it. */
! add_partial_path(joinrel, (Path *) join_path, grouped);
! }
! }
!
! static void
! try_grouped_mergejoin_path(PlannerInfo *root,
! RelOptInfo *joinrel,
! Path *outer_path,
! Path *inner_path,
! List *pathkeys,
! List *mergeclauses,
! List *outersortkeys,
! List *innersortkeys,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool partial,
! bool do_aggregate)
! {
! /*
! * Missing GroupedPathInfo indicates that we should not try to create a
! * grouped join.
! */
! if (joinrel->gpi == NULL)
return;
! /*
! * Reject the path if we're supposed to combine grouped and plain relation
! * but the grouped one does not evaluate all the relevant aggregates.
! */
! if (!do_aggregate &&
! !is_grouped_join_target_complete(root, joinrel->gpi->target,
! outer_path, inner_path))
! return;
!
! /*
! * As repeated aggregation doesn't seem to be attractive, make sure that
! * the resulting grouped relation is not parameterized.
! */
! if (outer_path->param_info != NULL || inner_path->param_info != NULL)
! return;
!
! if (!partial)
! try_mergejoin_path(root, joinrel, outer_path, inner_path, pathkeys,
! mergeclauses, outersortkeys, innersortkeys,
! jointype, extra, false, true, do_aggregate);
! else
! try_partial_mergejoin_path(root, joinrel, outer_path, inner_path,
! pathkeys,
! mergeclauses, outersortkeys, innersortkeys,
! jointype, extra, true, do_aggregate);
! }
!
! static void
! try_mergejoin_path_common(PlannerInfo *root,
! RelOptInfo *joinrel,
! Path *outer_path,
! Path *inner_path,
! List *pathkeys,
! List *mergeclauses,
! List *outersortkeys,
! List *innersortkeys,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool partial,
! bool grouped_outer,
! bool grouped_inner,
! bool do_aggregate)
! {
! bool grouped_join;
!
! grouped_join = grouped_outer || grouped_inner || do_aggregate;
!
! /* Join of two grouped paths is not supported. */
! Assert(!(grouped_outer && grouped_inner));
!
! if (!grouped_join)
! {
! /* Only join plain paths. */
! try_mergejoin_path(root,
! joinrel,
! outer_path,
! inner_path,
! pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys,
! jointype,
! extra,
! partial,
! false, false);
! }
! else if (grouped_outer || grouped_inner)
! {
! Assert(!do_aggregate);
!
! /*
! * Exactly one of the input paths is grouped, so create a grouped join
! * path.
! */
! try_grouped_mergejoin_path(root,
! joinrel,
! outer_path,
! inner_path,
! pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys,
! jointype,
! extra,
! partial,
! false);
! }
! /* Perform explicit aggregation only if a suitable target exists. */
! else if (joinrel->gpi != NULL)
! {
! try_grouped_mergejoin_path(root,
! joinrel,
! outer_path,
! inner_path,
! pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys,
! jointype,
! extra,
! partial, true);
! }
}
/*
*************** try_hashjoin_path(PlannerInfo *root,
*** 622,668 ****
Path *inner_path,
List *hashclauses,
JoinType jointype,
! JoinPathExtraData *extra)
{
Relids required_outer;
JoinCostWorkspace workspace;
/*
! * Check to see if proposed path is still parameterized, and reject if the
! * parameterization wouldn't be sensible.
*/
! required_outer = calc_non_nestloop_required_outer(outer_path,
! inner_path);
! if (required_outer &&
! !bms_overlap(required_outer, extra->param_source_rels))
{
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
}
/*
* See comments in try_nestloop_path(). Also note that hashjoin paths
* never have any output pathkeys, per comments in create_hashjoin_path.
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra);
! if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! NIL, required_outer))
{
! add_path(joinrel, (Path *)
! create_hashjoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! required_outer,
! hashclauses));
}
else
{
--- 1012,1086 ----
Path *inner_path,
List *hashclauses,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* Caller should not request aggregation w/o grouped output. */
+ Assert(!do_aggregate || grouped);
+
+ /* GroupedPathInfo is necessary for us to produce a grouped set. */
+ Assert(joinrel->gpi != NULL || !grouped);
/*
! * Check to see if proposed path is still parameterized, and reject if
! * it's grouped or if the parameterization wouldn't be sensible.
*/
! required_outer = calc_non_nestloop_required_outer(outer_path, inner_path);
! if (required_outer)
{
! if (grouped || !bms_overlap(required_outer, extra->param_source_rels))
! {
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
! }
}
/*
* See comments in try_nestloop_path(). Also note that hashjoin paths
* never have any output pathkeys, per comments in create_hashjoin_path.
+ *
+ * TODO Need to consider aggregation here?
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra);
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! join_target = joinrel->gpi->target;
!
! join_path = (Path *) create_hashjoin_path(root, joinrel, jointype,
! &workspace,
! extra,
! outer_path, inner_path,
! extra->restrictlist,
! required_outer, hashclauses,
! join_target);
!
! /* Do partial aggregation if needed. */
! if (do_aggregate)
! {
! create_grouped_path(root, joinrel, join_path, true, false,
! AGG_HASHED);
! }
! else if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! NIL, required_outer, grouped))
{
! add_path(joinrel, (Path *) join_path, grouped);
}
else
{
*************** try_partial_hashjoin_path(PlannerInfo *r
*** 683,691 ****
Path *inner_path,
List *hashclauses,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinCostWorkspace workspace;
/*
* If the inner path is parameterized, the parameterization must be fully
--- 1101,1117 ----
Path *inner_path,
List *hashclauses,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate)
{
JoinCostWorkspace workspace;
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* The same checks we do in try_hashjoin_path. */
+ Assert(!do_aggregate || grouped);
+ Assert(joinrel->gpi != NULL || !grouped);
/*
* If the inner path is parameterized, the parameterization must be fully
*************** try_partial_hashjoin_path(PlannerInfo *r
*** 708,728 ****
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra);
! if (!add_partial_path_precheck(joinrel, workspace.total_cost, NIL))
return;
! /* Might be good enough to be worth trying, so let's try it. */
! add_partial_path(joinrel, (Path *)
! create_hashjoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! NULL,
! hashclauses));
}
/*
--- 1134,1229 ----
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra);
!
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! {
! Assert(joinrel->gpi != NULL);
! join_target = joinrel->gpi->target;
! }
!
! join_path = (Path *) create_hashjoin_path(root, joinrel, jointype,
! &workspace,
! extra,
! outer_path, inner_path,
! extra->restrictlist, NULL,
! hashclauses, join_target);
!
! /* Do partial aggregation if needed. */
! if (do_aggregate)
! {
! create_grouped_path(root, joinrel, join_path, true, true, AGG_HASHED);
! }
! else if (add_partial_path_precheck(joinrel, workspace.total_cost,
! NIL, grouped))
! {
! add_partial_path(joinrel, (Path *) join_path, grouped);
! }
! }
!
! /*
! * Create a new grouped hash join path by joining a grouped path to plain
! * (non-grouped) one, or by joining 2 plain relations and applying grouping on
! * the result.
! *
! * Joining of 2 grouped paths is not supported. If a grouped relation A was
! * joined to grouped relation B, then the grouping of B reduces the number of
! * times each group of A appears in the join output. This makes a difference
! * for some aggregates, e.g. sum().
! *
! * If do_aggregate is true, neither input rel is grouped so we need to
! * aggregate the join result explicitly.
! *
! * The 'partial' argument tells whether the join path should be considered partial.
! */
! static void
! try_grouped_hashjoin_path(PlannerInfo *root,
! RelOptInfo *joinrel,
! Path *outer_path,
! Path *inner_path,
! List *hashclauses,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool do_aggregate,
! bool partial)
! {
! /*
! * Missing GroupedPathInfo indicates that we should not try to create a
! * grouped join.
! */
! if (joinrel->gpi == NULL)
return;
! /*
! * Reject the path if we're supposed to combine a grouped and a plain
! * relation but the grouped one does not evaluate all the relevant aggregates.
! */
! if (!do_aggregate &&
! !is_grouped_join_target_complete(root, joinrel->gpi->target,
! outer_path, inner_path))
! return;
!
! /*
! * As repeated aggregation doesn't seem to be attractive, make sure that
! * the resulting grouped relation is not parameterized.
! */
! if (outer_path->param_info != NULL || inner_path->param_info != NULL)
! return;
!
! if (!partial)
! try_hashjoin_path(root, joinrel, outer_path, inner_path, hashclauses,
! jointype, extra, true, do_aggregate);
! else
! try_partial_hashjoin_path(root, joinrel, outer_path, inner_path,
! hashclauses, jointype, extra, true,
! do_aggregate);
}
/*
*************** sort_inner_and_outer(PlannerInfo *root,
*** 773,779 ****
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinType save_jointype = jointype;
Path *outer_path;
--- 1274,1313 ----
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped)
! {
! if (!grouped)
! {
! sort_inner_and_outer_common(root, joinrel, outerrel, innerrel,
! jointype, extra, false, false, false);
! }
! else
! {
! /* Use all the supported strategies to generate grouped join. */
! sort_inner_and_outer_common(root, joinrel, outerrel, innerrel,
! jointype, extra, true, false, false);
! sort_inner_and_outer_common(root, joinrel, outerrel, innerrel,
! jointype, extra, false, true, false);
! sort_inner_and_outer_common(root, joinrel, outerrel, innerrel,
! jointype, extra, false, false, true);
! }
! }
!
! /*
! * TODO As merge_pathkeys shouldn't differ across execution, use a separate
! * function to derive them and pass them here in a list.
! */
! static void
! sort_inner_and_outer_common(PlannerInfo *root,
! RelOptInfo *joinrel,
! RelOptInfo *outerrel,
! RelOptInfo *innerrel,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped_outer,
! bool grouped_inner,
! bool do_aggregate)
{
JoinType save_jointype = jointype;
Path *outer_path;
*************** sort_inner_and_outer(PlannerInfo *root,
*** 782,787 ****
--- 1316,1322 ----
Path *cheapest_safe_inner = NULL;
List *all_pathkeys;
ListCell *l;
+ bool grouped_result;
/*
* We only consider the cheapest-total-cost input paths, since we are
*************** sort_inner_and_outer(PlannerInfo *root,
*** 796,803 ****
* against mergejoins with parameterized inputs; see comments in
* src/backend/optimizer/README.
*/
! outer_path = outerrel->cheapest_total_path;
! inner_path = innerrel->cheapest_total_path;
/*
* If either cheapest-total path is parameterized by the other rel, we
--- 1331,1357 ----
* against mergejoins with parameterized inputs; see comments in
* src/backend/optimizer/README.
*/
! if (grouped_outer)
! {
! if (outerrel->gpi != NULL && outerrel->gpi->pathlist != NIL)
! outer_path = linitial(outerrel->gpi->pathlist);
! else
! return;
! }
! else
! outer_path = outerrel->cheapest_total_path;
!
! if (grouped_inner)
! {
! if (innerrel->gpi != NULL && innerrel->gpi->pathlist != NIL)
! inner_path = linitial(innerrel->gpi->pathlist);
! else
! return;
! }
! else
! inner_path = innerrel->cheapest_total_path;
!
! grouped_result = grouped_outer || grouped_inner || do_aggregate;
/*
* If either cheapest-total path is parameterized by the other rel, we
*************** sort_inner_and_outer(PlannerInfo *root,
*** 843,855 ****
outerrel->partial_pathlist != NIL &&
bms_is_empty(joinrel->lateral_relids))
{
! cheapest_partial_outer = (Path *) linitial(outerrel->partial_pathlist);
if (inner_path->parallel_safe)
cheapest_safe_inner = inner_path;
else if (save_jointype != JOIN_UNIQUE_INNER)
cheapest_safe_inner =
! get_cheapest_parallel_safe_total_inner(innerrel->pathlist);
}
/*
--- 1397,1446 ----
outerrel->partial_pathlist != NIL &&
bms_is_empty(joinrel->lateral_relids))
{
! if (grouped_outer)
! {
! if (outerrel->gpi != NULL && outerrel->gpi->partial_pathlist != NIL)
! cheapest_partial_outer = (Path *)
! linitial(outerrel->gpi->partial_pathlist);
! else
! return;
! }
! else
! cheapest_partial_outer = (Path *)
! linitial(outerrel->partial_pathlist);
!
! if (grouped_inner)
! {
! if (innerrel->gpi != NULL && innerrel->gpi->pathlist != NIL)
! inner_path = linitial(innerrel->gpi->pathlist);
! else
! return;
! }
! else
! inner_path = innerrel->cheapest_total_path;
if (inner_path->parallel_safe)
cheapest_safe_inner = inner_path;
else if (save_jointype != JOIN_UNIQUE_INNER)
+ {
+ List *inner_pathlist;
+
+ if (!grouped_inner)
+ inner_pathlist = innerrel->pathlist;
+ else
+ {
+ Assert(innerrel->gpi != NULL);
+ inner_pathlist = innerrel->gpi->pathlist;
+ }
+
+ /*
+ * All the grouped paths should be unparameterized, so the
+ * function is overly stringent in the grouped_inner case, but
+ * still useful.
+ */
cheapest_safe_inner =
! get_cheapest_parallel_safe_total_inner(inner_pathlist);
! }
}
/*
*************** sort_inner_and_outer(PlannerInfo *root,
*** 925,957 ****
* properly. try_mergejoin_path will detect that case and suppress an
* explicit sort step, so we needn't do so here.
*/
! try_mergejoin_path(root,
! joinrel,
! outer_path,
! inner_path,
! merge_pathkeys,
! cur_mergeclauses,
! outerkeys,
! innerkeys,
! jointype,
! extra,
! false);
/*
* If we have partial outer and parallel safe inner path then try
* partial mergejoin path.
*/
if (cheapest_partial_outer && cheapest_safe_inner)
! try_partial_mergejoin_path(root,
! joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! merge_pathkeys,
! cur_mergeclauses,
! outerkeys,
! innerkeys,
! jointype,
! extra);
}
}
--- 1516,1574 ----
* properly. try_mergejoin_path will detect that case and suppress an
* explicit sort step, so we needn't do so here.
*/
! if (!grouped_result)
! try_mergejoin_path(root,
! joinrel,
! outer_path,
! inner_path,
! merge_pathkeys,
! cur_mergeclauses,
! outerkeys,
! innerkeys,
! jointype,
! extra,
! false, false, false);
! else
! {
! try_mergejoin_path_common(root, joinrel, outer_path, inner_path,
! merge_pathkeys, cur_mergeclauses,
! outerkeys, innerkeys, jointype, extra,
! false,
! grouped_outer, grouped_inner,
! do_aggregate);
! }
/*
* If we have partial outer and parallel safe inner path then try
* partial mergejoin path.
*/
if (cheapest_partial_outer && cheapest_safe_inner)
! {
! if (!grouped_result)
! {
! try_partial_mergejoin_path(root,
! joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! merge_pathkeys,
! cur_mergeclauses,
! outerkeys,
! innerkeys,
! jointype,
! extra, false, false);
! }
! else
! {
! try_mergejoin_path_common(root, joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! merge_pathkeys, cur_mergeclauses,
! outerkeys, innerkeys, jointype, extra,
! true,
! grouped_outer, grouped_inner,
! do_aggregate);
! }
! }
}
}
*************** sort_inner_and_outer(PlannerInfo *root,
*** 968,973 ****
--- 1585,1598 ----
* some sort key requirements). So, we consider truncations of the
* mergeclause list as well as the full list. (Ideally we'd consider all
* subsets of the mergeclause list, but that seems way too expensive.)
+ *
+ * grouped_outer - is outerpath grouped?
+ * grouped_inner - use grouped paths of innerrel?
+ * do_aggregate - apply (partial) aggregation to the output?
+ *
+ * TODO If subsequent calls often differ only by the 3 arguments above,
+ * consider a workspace structure to share useful info (eg merge clauses)
+ * across calls.
*/
static void
generate_mergejoin_paths(PlannerInfo *root,
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 979,985 ****
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
! bool is_partial)
{
List *mergeclauses;
List *innersortkeys;
--- 1604,1613 ----
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
! bool is_partial,
! bool grouped_outer,
! bool grouped_inner,
! bool do_aggregate)
{
List *mergeclauses;
List *innersortkeys;
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 1030,1046 ****
* try_mergejoin_path will do the right thing if inner_cheapest_total is
* already correctly sorted.)
*/
! try_mergejoin_path(root,
! joinrel,
! outerpath,
! inner_cheapest_total,
! merge_pathkeys,
! mergeclauses,
! NIL,
! innersortkeys,
! jointype,
! extra,
! is_partial);
/* Can't do anything else if inner path needs to be unique'd */
if (save_jointype == JOIN_UNIQUE_INNER)
--- 1658,1675 ----
* try_mergejoin_path will do the right thing if inner_cheapest_total is
* already correctly sorted.)
*/
! try_mergejoin_path_common(root,
! joinrel,
! outerpath,
! inner_cheapest_total,
! merge_pathkeys,
! mergeclauses,
! NIL,
! innersortkeys,
! jointype,
! extra,
! is_partial,
! grouped_outer, grouped_inner, do_aggregate);
/* Can't do anything else if inner path needs to be unique'd */
if (save_jointype == JOIN_UNIQUE_INNER)
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 1096,1111 ****
for (sortkeycnt = num_sortkeys; sortkeycnt > 0; sortkeycnt--)
{
Path *innerpath;
List *newclauses = NIL;
/*
* Look for an inner path ordered well enough for the first
* 'sortkeycnt' innersortkeys. NB: trialsortkeys list is modified
* destructively, which is why we made a copy...
*/
trialsortkeys = list_truncate(trialsortkeys, sortkeycnt);
! innerpath = get_cheapest_path_for_pathkeys(innerrel->pathlist,
trialsortkeys,
NULL,
TOTAL_COST,
--- 1725,1746 ----
for (sortkeycnt = num_sortkeys; sortkeycnt > 0; sortkeycnt--)
{
+ List *inner_pathlist = NIL;
Path *innerpath;
List *newclauses = NIL;
+ if (!grouped_inner)
+ inner_pathlist = innerrel->pathlist;
+ else if (innerrel->gpi != NULL)
+ inner_pathlist = innerrel->gpi->pathlist;
+
/*
* Look for an inner path ordered well enough for the first
* 'sortkeycnt' innersortkeys. NB: trialsortkeys list is modified
* destructively, which is why we made a copy...
*/
trialsortkeys = list_truncate(trialsortkeys, sortkeycnt);
! innerpath = get_cheapest_path_for_pathkeys(inner_pathlist,
trialsortkeys,
NULL,
TOTAL_COST,
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 1128,1148 ****
}
else
newclauses = mergeclauses;
! try_mergejoin_path(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! newclauses,
! NIL,
! NIL,
! jointype,
! extra,
! is_partial);
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
! innerpath = get_cheapest_path_for_pathkeys(innerrel->pathlist,
trialsortkeys,
NULL,
STARTUP_COST,
--- 1763,1787 ----
}
else
newclauses = mergeclauses;
!
! try_mergejoin_path_common(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! newclauses,
! NIL,
! NIL,
! jointype,
! extra,
! is_partial,
! grouped_outer, grouped_inner,
! do_aggregate);
!
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
! innerpath = get_cheapest_path_for_pathkeys(inner_pathlist,
trialsortkeys,
NULL,
STARTUP_COST,
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 1173,1189 ****
else
newclauses = mergeclauses;
}
! try_mergejoin_path(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! newclauses,
! NIL,
! NIL,
! jointype,
! extra,
! is_partial);
}
cheapest_startup_inner = innerpath;
}
--- 1812,1830 ----
else
newclauses = mergeclauses;
}
! try_mergejoin_path_common(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! newclauses,
! NIL,
! NIL,
! jointype,
! extra,
! is_partial,
! grouped_outer, grouped_inner,
! do_aggregate);
}
cheapest_startup_inner = innerpath;
}
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 1218,1223 ****
--- 1859,1866 ----
* 'innerrel' is the inner join relation
* 'jointype' is the type of join to do
* 'extra' contains additional input values
+ * 'grouped' indicates that at least one relation in the join has been
+ * aggregated.
*/
static void
match_unsorted_outer(PlannerInfo *root,
*************** match_unsorted_outer(PlannerInfo *root,
*** 1225,1231 ****
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinType save_jointype = jointype;
bool nestjoinOK;
--- 1868,1875 ----
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped)
{
JoinType save_jointype = jointype;
bool nestjoinOK;
*************** match_unsorted_outer(PlannerInfo *root,
*** 1235,1240 ****
--- 1879,1906 ----
ListCell *lc1;
/*
+ * If grouped join path is requested, we ignore cases where either input
+ * path needs to be unique. For each side we should expect either grouped
+ * or plain relation, which differ quite a bit.
+ *
+ * XXX Although unique-ification of grouped path might result in too
+ * expensive input path (note that grouped input relation is not
+ * necessarily unique, regardless of the grouping keys --- one or more plain
+ * relations could already have been joined to it), we might want to
+ * unique-ify the input relation in the future at least in the case it's a
+ * plain relation.
+ *
+ * (Materialization is not involved in grouped paths for similar reasons.)
+ */
+ if (grouped &&
+ (jointype == JOIN_UNIQUE_OUTER || jointype == JOIN_UNIQUE_INNER))
+ return;
+
+ /* No grouped join w/o grouped target. */
+ if (grouped && joinrel->gpi == NULL)
+ return;
+
+ /*
* Nestloop only supports inner, left, semi, and anti joins. Also, if we
* are doing a right or full mergejoin, we must use *all* the mergeclauses
* as join clauses, else we will not have a valid plan. (Although these
*************** match_unsorted_outer(PlannerInfo *root,
*** 1290,1296 ****
create_unique_path(root, innerrel, inner_cheapest_total, extra->sjinfo);
Assert(inner_cheapest_total);
}
! else if (nestjoinOK)
{
/*
* Consider materializing the cheapest inner path, unless
--- 1956,1962 ----
create_unique_path(root, innerrel, inner_cheapest_total, extra->sjinfo);
Assert(inner_cheapest_total);
}
! else if (nestjoinOK && !grouped)
{
/*
* Consider materializing the cheapest inner path, unless
*************** match_unsorted_outer(PlannerInfo *root,
*** 1321,1326 ****
--- 1987,1994 ----
*/
if (save_jointype == JOIN_UNIQUE_OUTER)
{
+ Assert(!grouped);
+
if (outerpath != outerrel->cheapest_total_path)
continue;
outerpath = (Path *) create_unique_path(root, outerrel,
*************** match_unsorted_outer(PlannerInfo *root,
*** 1348,1354 ****
inner_cheapest_total,
merge_pathkeys,
jointype,
! extra);
}
else if (nestjoinOK)
{
--- 2016,2023 ----
inner_cheapest_total,
merge_pathkeys,
jointype,
! extra,
! false, false);
}
else if (nestjoinOK)
{
*************** match_unsorted_outer(PlannerInfo *root,
*** 1364,1387 ****
{
Path *innerpath = (Path *) lfirst(lc2);
! try_nestloop_path(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! jointype,
! extra);
}
! /* Also consider materialized form of the cheapest inner path */
! if (matpath != NULL)
try_nestloop_path(root,
joinrel,
outerpath,
matpath,
merge_pathkeys,
jointype,
! extra);
}
/* Can't do anything else if outer path needs to be unique'd */
--- 2033,2078 ----
{
Path *innerpath = (Path *) lfirst(lc2);
! if (!grouped)
! try_nestloop_path(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! jointype,
! extra, false, false);
! else
! {
! /*
! * Since both input paths are plain, request explicit
! * aggregation.
! */
! try_grouped_nestloop_path(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! jointype,
! extra,
! true,
! false);
! }
}
! /*
! * Also consider materialized form of the cheapest inner path.
! *
! * (There's no matpath for grouped join.)
! */
! if (matpath != NULL && !grouped)
try_nestloop_path(root,
joinrel,
outerpath,
matpath,
merge_pathkeys,
jointype,
! extra,
! false, false);
}
/* Can't do anything else if outer path needs to be unique'd */
*************** match_unsorted_outer(PlannerInfo *root,
*** 1396,1402 ****
generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
save_jointype, extra, useallclauses,
inner_cheapest_total, merge_pathkeys,
! false);
}
/*
--- 2087,2163 ----
generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
save_jointype, extra, useallclauses,
inner_cheapest_total, merge_pathkeys,
! false, false, false, grouped);
!
! /* Try to join the plain outer relation to grouped inner. */
! if (grouped && nestjoinOK &&
! save_jointype != JOIN_UNIQUE_OUTER &&
! save_jointype != JOIN_UNIQUE_INNER &&
! innerrel->gpi != NULL && outerrel->gpi == NULL)
! {
! Path *inner_cheapest_grouped = (Path *) linitial(innerrel->gpi->pathlist);
!
! if (PATH_PARAM_BY_REL(inner_cheapest_grouped, outerrel))
! continue;
!
! /* XXX Shouldn't Assert() be used here instead? */
! if (PATH_PARAM_BY_REL(outerpath, innerrel))
! continue;
!
! /*
! * Only outer grouped path is interesting in this case: grouped
! * path on the inner side of NL join would imply repeated
! * aggregation somewhere in the inner path.
! */
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
! save_jointype, extra, useallclauses,
! inner_cheapest_grouped, merge_pathkeys,
! false, false, true, false);
! }
! }
!
! /*
! * Combine grouped outer and plain inner paths.
! */
! if (grouped && nestjoinOK &&
! save_jointype != JOIN_UNIQUE_OUTER &&
! save_jointype != JOIN_UNIQUE_INNER)
! {
! /*
! * If the inner rel had a grouped target, its plain paths should be
! * ignored. Otherwise we could create grouped paths with different
! * targets.
! */
! if (outerrel->gpi != NULL && innerrel->gpi == NULL &&
! inner_cheapest_total != NULL)
! {
! /* Nested loop paths. */
! foreach(lc1, outerrel->gpi->pathlist)
! {
! Path *outerpath = (Path *) lfirst(lc1);
! List *merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
! outerpath->pathkeys);
!
! if (PATH_PARAM_BY_REL(outerpath, innerrel))
! continue;
!
! try_grouped_nestloop_path(root,
! joinrel,
! outerpath,
! inner_cheapest_total,
! merge_pathkeys,
! jointype,
! extra,
! false,
! false);
!
! /* Merge join paths. */
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
! save_jointype, extra, useallclauses,
! inner_cheapest_total, merge_pathkeys,
! false, true, false, false);
! }
! }
}
/*
*************** match_unsorted_outer(PlannerInfo *root,
*** 1416,1423 ****
bms_is_empty(joinrel->lateral_relids))
{
if (nestjoinOK)
! consider_parallel_nestloop(root, joinrel, outerrel, innerrel,
! save_jointype, extra);
/*
* If inner_cheapest_total is NULL or non parallel-safe then find the
--- 2177,2197 ----
bms_is_empty(joinrel->lateral_relids))
{
if (nestjoinOK)
! {
! if (!grouped)
! /* Plain partial paths. */
! consider_parallel_nestloop(root, joinrel, outerrel, innerrel,
! save_jointype, extra, false, false);
! else
! {
! /* Grouped partial paths with explicit aggregation. */
! consider_parallel_nestloop(root, joinrel, outerrel, innerrel,
! save_jointype, extra, true, true);
! /* Grouped partial paths w/o explicit aggregation. */
! consider_parallel_nestloop(root, joinrel, outerrel, innerrel,
! save_jointype, extra, true, false);
! }
! }
/*
* If inner_cheapest_total is NULL or non parallel-safe then find the
*************** match_unsorted_outer(PlannerInfo *root,
*** 1437,1443 ****
if (inner_cheapest_total)
consider_parallel_mergejoin(root, joinrel, outerrel, innerrel,
save_jointype, extra,
! inner_cheapest_total);
}
}
--- 2211,2217 ----
if (inner_cheapest_total)
consider_parallel_mergejoin(root, joinrel, outerrel, innerrel,
save_jointype, extra,
! inner_cheapest_total, grouped);
}
}
*************** consider_parallel_mergejoin(PlannerInfo
*** 1460,1469 ****
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
! Path *inner_cheapest_total)
{
ListCell *lc1;
/* generate merge join path for each partial outer path */
foreach(lc1, outerrel->partial_pathlist)
{
--- 2234,2252 ----
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
! Path *inner_cheapest_total,
! bool grouped)
{
ListCell *lc1;
+ if (grouped)
+ {
+ /* TODO Consider if these types should be supported. */
+ if (jointype == JOIN_UNIQUE_OUTER ||
+ jointype == JOIN_UNIQUE_INNER)
+ return;
+ }
+
/* generate merge join path for each partial outer path */
foreach(lc1, outerrel->partial_pathlist)
{
*************** consider_parallel_mergejoin(PlannerInfo
*** 1476,1484 ****
merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
outerpath->pathkeys);
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath, jointype,
! extra, false, inner_cheapest_total,
! merge_pathkeys, true);
}
}
--- 2259,2314 ----
merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
outerpath->pathkeys);
! if (!grouped)
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
! jointype, extra, false,
! inner_cheapest_total, merge_pathkeys,
! true,
! false, false, false);
! else
! {
! /*
! * Create grouped join by joining plain rels and aggregating the
! * result.
! */
! Assert(joinrel->gpi != NULL);
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
! jointype, extra, false,
! inner_cheapest_total, merge_pathkeys,
! true, false, false, true);
!
! /* Combine the plain outer with grouped inner one(s). */
! if (outerrel->gpi == NULL && innerrel->gpi != NULL)
! {
! Path *inner_cheapest_grouped = (Path *)
! linitial(innerrel->gpi->pathlist);
!
! if (inner_cheapest_grouped != NULL &&
! inner_cheapest_grouped->parallel_safe)
! generate_mergejoin_paths(root, joinrel, innerrel,
! outerpath, jointype, extra,
! false, inner_cheapest_grouped,
! merge_pathkeys,
! true, false, true, false);
! }
! }
! }
!
! /* In addition, try to join grouped outer to plain inner one(s). */
! if (grouped && outerrel->gpi != NULL && innerrel->gpi == NULL)
! {
! foreach(lc1, outerrel->gpi->partial_pathlist)
! {
! Path *outerpath = (Path *) lfirst(lc1);
! List *merge_pathkeys;
!
! merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
! outerpath->pathkeys);
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
! jointype, extra, false,
! inner_cheapest_total, merge_pathkeys,
! true, true, false, false);
! }
}
}
*************** consider_parallel_nestloop(PlannerInfo *
*** 1499,1513 ****
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinType save_jointype = jointype;
ListCell *lc1;
if (jointype == JOIN_UNIQUE_INNER)
jointype = JOIN_INNER;
! foreach(lc1, outerrel->partial_pathlist)
{
Path *outerpath = (Path *) lfirst(lc1);
List *pathkeys;
--- 2329,2373 ----
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped, bool do_aggregate)
{
JoinType save_jointype = jointype;
+ List *outer_pathlist;
ListCell *lc1;
+ if (grouped)
+ {
+ /* TODO Consider if these types should be supported. */
+ if (save_jointype == JOIN_UNIQUE_OUTER ||
+ save_jointype == JOIN_UNIQUE_INNER)
+ return;
+ }
+
if (jointype == JOIN_UNIQUE_INNER)
jointype = JOIN_INNER;
! if (!grouped || do_aggregate)
! {
! /*
! * If creating grouped paths by explicit aggregation, the input paths
! * must be plain.
! */
! outer_pathlist = outerrel->partial_pathlist;
! }
! else if (outerrel->gpi != NULL)
! {
! /*
! * Only the outer paths are accepted as grouped when we try to combine
! * grouped and plain ones. Grouped inner path implies repeated
! * aggregation, which doesn't sound like a good idea.
! */
! outer_pathlist = outerrel->gpi->partial_pathlist;
! }
! else
! return;
!
! foreach(lc1, outer_pathlist)
{
Path *outerpath = (Path *) lfirst(lc1);
List *pathkeys;
*************** consider_parallel_nestloop(PlannerInfo *
*** 1538,1544 ****
* inner paths, but right now create_unique_path is not on board
* with that.)
*/
! if (save_jointype == JOIN_UNIQUE_INNER)
{
if (innerpath != innerrel->cheapest_total_path)
continue;
--- 2398,2404 ----
* inner paths, but right now create_unique_path is not on board
* with that.)
*/
! if (save_jointype == JOIN_UNIQUE_INNER && !grouped)
{
if (innerpath != innerrel->cheapest_total_path)
continue;
*************** consider_parallel_nestloop(PlannerInfo *
*** 1548,1555 ****
Assert(innerpath);
}
! try_partial_nestloop_path(root, joinrel, outerpath, innerpath,
! pathkeys, jointype, extra);
}
}
}
--- 2408,2433 ----
Assert(innerpath);
}
! if (!grouped)
! try_partial_nestloop_path(root, joinrel, outerpath, innerpath,
! pathkeys, jointype, extra,
! false, false);
! else if (do_aggregate)
! {
! /* Request aggregation as both input rels are plain. */
! try_grouped_nestloop_path(root, joinrel, outerpath, innerpath,
! pathkeys, jointype, extra,
! true, true);
! }
! /*
! * Only combine the grouped outer path with the plain inner if the
! * inner relation cannot produce grouped paths. Otherwise we could
! * generate grouped paths with different targets.
! */
! else if (innerrel->gpi == NULL)
! try_grouped_nestloop_path(root, joinrel, outerpath, innerpath,
! pathkeys, jointype, extra,
! false, true);
}
}
}
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1571,1583 ****
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinType save_jointype = jointype;
bool isouterjoin = IS_OUTER_JOIN(jointype);
List *hashclauses;
ListCell *l;
/*
* We need to build only one hashclauses list for any given pair of outer
* and inner relations; all of the hashable clauses will be used as keys.
--- 2449,2466 ----
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped)
{
JoinType save_jointype = jointype;
bool isouterjoin = IS_OUTER_JOIN(jointype);
List *hashclauses;
ListCell *l;
+ /* No grouped join w/o grouped target. */
+ if (grouped && joinrel->gpi == NULL)
+ return;
+
/*
* We need to build only one hashclauses list for any given pair of outer
* and inner relations; all of the hashable clauses will be used as keys.
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1627,1632 ****
--- 2510,2518 ----
* can't use a hashjoin. (There's no use looking for alternative
* input paths, since these should already be the least-parameterized
* available paths.)
+ *
+ * (The same check should work for grouped paths, as these don't
+ * differ in parameterization.)
*/
if (PATH_PARAM_BY_REL(cheapest_total_outer, innerrel) ||
PATH_PARAM_BY_REL(cheapest_total_inner, outerrel))
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1646,1652 ****
cheapest_total_inner,
hashclauses,
jointype,
! extra);
/* no possibility of cheap startup here */
}
else if (jointype == JOIN_UNIQUE_INNER)
--- 2532,2539 ----
cheapest_total_inner,
hashclauses,
jointype,
! extra,
! false, false);
/* no possibility of cheap startup here */
}
else if (jointype == JOIN_UNIQUE_INNER)
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1662,1668 ****
cheapest_total_inner,
hashclauses,
jointype,
! extra);
if (cheapest_startup_outer != NULL &&
cheapest_startup_outer != cheapest_total_outer)
try_hashjoin_path(root,
--- 2549,2556 ----
cheapest_total_inner,
hashclauses,
jointype,
! extra,
! false, false);
if (cheapest_startup_outer != NULL &&
cheapest_startup_outer != cheapest_total_outer)
try_hashjoin_path(root,
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1671,1733 ****
cheapest_total_inner,
hashclauses,
jointype,
! extra);
}
else
{
! /*
! * For other jointypes, we consider the cheapest startup outer
! * together with the cheapest total inner, and then consider
! * pairings of cheapest-total paths including parameterized ones.
! * There is no use in generating parameterized paths on the basis
! * of possibly cheap startup cost, so this is sufficient.
! */
! ListCell *lc1;
! ListCell *lc2;
!
! if (cheapest_startup_outer != NULL)
! try_hashjoin_path(root,
! joinrel,
! cheapest_startup_outer,
! cheapest_total_inner,
! hashclauses,
! jointype,
! extra);
!
! foreach(lc1, outerrel->cheapest_parameterized_paths)
{
- Path *outerpath = (Path *) lfirst(lc1);
-
/*
! * We cannot use an outer path that is parameterized by the
! * inner rel.
*/
! if (PATH_PARAM_BY_REL(outerpath, innerrel))
! continue;
! foreach(lc2, innerrel->cheapest_parameterized_paths)
{
! Path *innerpath = (Path *) lfirst(lc2);
/*
! * We cannot use an inner path that is parameterized by
! * the outer rel, either.
*/
! if (PATH_PARAM_BY_REL(innerpath, outerrel))
continue;
! if (outerpath == cheapest_startup_outer &&
! innerpath == cheapest_total_inner)
! continue; /* already tried it */
! try_hashjoin_path(root,
! joinrel,
! outerpath,
! innerpath,
! hashclauses,
! jointype,
! extra);
}
}
}
--- 2559,2712 ----
cheapest_total_inner,
hashclauses,
jointype,
! extra,
! false, false);
}
else
{
! if (!grouped)
{
/*
! * For other jointypes, we consider the cheapest startup outer
! * together with the cheapest total inner, and then consider
! * pairings of cheapest-total paths including parameterized
! * ones. There is no use in generating parameterized paths on
! * the basis of possibly cheap startup cost, so this is
! * sufficient.
*/
! ListCell *lc1;
! if (cheapest_startup_outer != NULL)
! try_hashjoin_path(root,
! joinrel,
! cheapest_startup_outer,
! cheapest_total_inner,
! hashclauses,
! jointype,
! extra,
! false, false);
!
! foreach(lc1, outerrel->cheapest_parameterized_paths)
{
! Path *outerpath = (Path *) lfirst(lc1);
! ListCell *lc2;
/*
! * We cannot use an outer path that is parameterized by the
! * inner rel.
*/
! if (PATH_PARAM_BY_REL(outerpath, innerrel))
continue;
! foreach(lc2, innerrel->cheapest_parameterized_paths)
! {
! Path *innerpath = (Path *) lfirst(lc2);
! /*
! * We cannot use an inner path that is parameterized by
! * the outer rel, either.
! */
! if (PATH_PARAM_BY_REL(innerpath, outerrel))
! continue;
!
! if (outerpath == cheapest_startup_outer &&
! innerpath == cheapest_total_inner)
! continue; /* already tried it */
!
! try_hashjoin_path(root,
! joinrel,
! outerpath,
! innerpath,
! hashclauses,
! jointype,
! extra,
! false, false);
! }
! }
! }
! else
! {
! /* Create grouped paths if possible. */
! /*
! * TODO
! *
! * Consider processing JOIN_UNIQUE_INNER and JOIN_UNIQUE_OUTER
! * join types, ie perform grouping of the inner / outer rel if
! * it's not unique yet and if the grouping is legal.
! */
! if (jointype == JOIN_UNIQUE_OUTER ||
! jointype == JOIN_UNIQUE_INNER)
! return;
!
! /*
! * Join grouped relation to non-grouped one.
! *
! * Do not use plain path of the input rel whose target does
! * have GroupedPathInfo. For example (assuming that join of
! * two grouped rels is not supported), the only way to
! * evaluate SELECT sum(a.x), sum(b.y) ... is to join "a" and
! * "b" and aggregate the result. Otherwise the path target
! * wouldn't match joinrel->gpi->target. TODO Move this comment
! * elsewhere as it seems common to all join kinds.
! */
! /*
! * TODO Allow outer join if the grouped rel is on the
! * non-nullable side.
! */
! if (jointype == JOIN_INNER)
! {
! Path *grouped_path, *plain_path;
!
! if (outerrel->gpi != NULL &&
! outerrel->gpi->pathlist != NIL &&
! innerrel->gpi == NULL)
! {
! grouped_path = (Path *)
! linitial(outerrel->gpi->pathlist);
! plain_path = cheapest_total_inner;
! try_grouped_hashjoin_path(root, joinrel,
! grouped_path, plain_path,
! hashclauses, jointype,
! extra, false, false);
! }
! else if (innerrel->gpi != NULL &&
! innerrel->gpi->pathlist != NIL &&
! outerrel->gpi == NULL)
! {
! grouped_path = (Path *)
! linitial(innerrel->gpi->pathlist);
! plain_path = cheapest_total_outer;
! try_grouped_hashjoin_path(root, joinrel, plain_path,
! grouped_path, hashclauses,
! jointype, extra,
! false, false);
!
! if (cheapest_startup_outer != NULL &&
! cheapest_startup_outer != cheapest_total_outer)
! {
! plain_path = cheapest_startup_outer;
! try_grouped_hashjoin_path(root, joinrel,
! plain_path,
! grouped_path,
! hashclauses,
! jointype, extra,
! false, false);
! }
! }
}
+
+ /*
+ * Try to join plain relations and make a grouped rel out of
+ * the join.
+ *
+ * Since aggregation needs the whole relation, we are only
+ * interested in total costs.
+ */
+ try_grouped_hashjoin_path(root, joinrel,
+ cheapest_total_outer,
+ cheapest_total_inner,
+ hashclauses,
+ jointype, extra, true, false);
}
}
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1765,1777 ****
cheapest_safe_inner =
get_cheapest_parallel_safe_total_inner(innerrel->pathlist);
! if (cheapest_safe_inner != NULL)
! try_partial_hashjoin_path(root, joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! hashclauses, jointype, extra);
}
}
}
/*
--- 2744,2967 ----
cheapest_safe_inner =
get_cheapest_parallel_safe_total_inner(innerrel->pathlist);
! if (!grouped)
! {
! if (cheapest_safe_inner != NULL)
! try_partial_hashjoin_path(root, joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! hashclauses, jointype, extra,
! false, false);
! }
! else if (joinrel->gpi != NULL)
! {
! /*
! * Grouped partial path.
! *
! * 1. Apply aggregation to the plain partial join path.
! */
! if (cheapest_safe_inner != NULL)
! try_grouped_hashjoin_path(root, joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! hashclauses,
! jointype, extra, true, true);
!
! /*
! * 2. Join the cheapest partial grouped outer path (if one
! * exists) to cheapest_safe_inner (there's no reason to look
! * for another inner path than what we used for non-grouped
! * partial join path).
! */
! if (outerrel->gpi != NULL &&
! outerrel->gpi->partial_pathlist != NIL &&
! innerrel->gpi == NULL &&
! cheapest_safe_inner != NULL)
! {
! Path *outer_path;
!
! outer_path = (Path *)
! linitial(outerrel->gpi->partial_pathlist);
!
! try_grouped_hashjoin_path(root, joinrel, outer_path,
! cheapest_safe_inner,
! hashclauses,
! jointype, extra, false, true);
! }
!
! /*
! * 3. Join the cheapest_partial_outer path (again, no reason
! * to use different outer path than the one we used for plain
! * partial join) to the cheapest grouped inner path if the
! * latter exists and is parallel-safe.
! */
! if (innerrel->gpi != NULL &&
! innerrel->gpi->pathlist != NIL &&
! outerrel->gpi == NULL)
! {
! Path *inner_path;
!
! inner_path = (Path *) linitial(innerrel->gpi->pathlist);
!
! if (inner_path->parallel_safe)
! try_grouped_hashjoin_path(root, joinrel,
! cheapest_partial_outer,
! inner_path,
! hashclauses,
! jointype, extra,
! false, true);
! }
!
! /*
! * Other combinations seem impossible because: 1. At most 1
! * input relation of the join can be grouped, 2. the inner
! * path must not be partial.
! */
! }
! }
! }
! }
!
! /*
! * Do the input paths emit all the aggregates contained in the grouped target
! * of the join?
! *
! * The point is that one input relation might be unable to evaluate some
! * aggregate(s), so it'll only generate plain paths. It's wrong to combine
! * such plain paths with grouped ones that the other input rel might be able
! * to generate because the result would miss the aggregate(s) the first
! * relation failed to evaluate.
! *
! * TODO For better efficiency, consider storing Bitmapset of
! * GroupedVarInfo.gvid in GroupedPathInfo.
! */
! static bool
! is_grouped_join_target_complete(PlannerInfo *root, PathTarget *jointarget,
! Path *outer_path, Path *inner_path)
! {
! RelOptInfo *outer_rel = outer_path->parent;
! RelOptInfo *inner_rel = inner_path->parent;
! ListCell *l1;
!
! /*
! * Join of two grouped relations is not supported.
! *
! * This actually isn't a check of target completeness --- can it be located
! * elsewhere?
! */
! if (outer_rel->gpi != NULL && inner_rel->gpi != NULL)
! return false;
!
! foreach(l1, jointarget->exprs)
! {
! Expr *expr = (Expr *) lfirst(l1);
! GroupedVar *gvar;
! GroupedVarInfo *gvi = NULL;
! ListCell *l2;
! bool found = false;
!
! /* Only interested in aggregates. */
! if (!IsA(expr, GroupedVar))
! continue;
!
! gvar = castNode(GroupedVar, expr);
!
! /* Find the corresponding GroupedVarInfo. */
! foreach(l2, root->grouped_var_list)
! {
! GroupedVarInfo *gvi_tmp = castNode(GroupedVarInfo, lfirst(l2));
!
! if (gvi_tmp->gvid == gvar->gvid)
! {
! gvi = gvi_tmp;
! break;
! }
! }
! Assert(gvi != NULL);
!
! /*
! * If any aggregate references both input relations, something went
! * wrong during construction of one of the input targets: one input
! * rel is grouped, but no grouping target should have been created for
! * it if some aggregate required more than that input rel.
! */
! Assert(gvi->gv_eval_at == NULL ||
! !(bms_overlap(gvi->gv_eval_at, outer_rel->relids) &&
! bms_overlap(gvi->gv_eval_at, inner_rel->relids)));
!
! /*
! * If the aggregate belongs to the plain relation, it probably
! * means that a non-grouping expression made aggregation of that
! * input relation impossible. Since that expression is not
! * necessarily emitted by the current join, aggregation might be
! * possible here. On the other hand, aggregation of a join which
! * already contains a grouped relation does not seem too
! * beneficial.
! *
! * XXX The condition below is also met if the query contains both
! * "star aggregate" and a normal one. Since the former can be
! * added to any base relation, and since we don't support join of
! * 2 grouped relations, join of arbitrary 2 relations will always
! * result in a plain relation.
! *
! * XXX If we conclude that aggregation is worthwhile, only consider
! * this test failed if target usable for aggregation cannot be
! * created (i.e. the non-grouping expression is in the output of
! * the current join).
! */
! if ((outer_rel->gpi == NULL &&
! bms_overlap(gvi->gv_eval_at, outer_rel->relids))
! || (inner_rel->gpi == NULL &&
! bms_overlap(gvi->gv_eval_at, inner_rel->relids)))
! return false;
!
! /* Look for the aggregate in the input targets. */
! if (outer_rel->gpi != NULL)
! {
! /* No more than one input path should be grouped. */
! Assert(inner_rel->gpi == NULL);
!
! foreach(l2, outer_path->pathtarget->exprs)
! {
! expr = (Expr *) lfirst(l2);
!
! if (!IsA(expr, GroupedVar))
! continue;
!
! gvar = castNode(GroupedVar, expr);
! if (gvar->gvid == gvi->gvid)
! {
! found = true;
! break;
! }
! }
}
+ else if (!found && inner_rel->gpi != NULL)
+ {
+ Assert(outer_rel->gpi == NULL);
+
+ foreach(l2, inner_path->pathtarget->exprs)
+ {
+ expr = (Expr *) lfirst(l2);
+
+ if (!IsA(expr, GroupedVar))
+ continue;
+
+ gvar = castNode(GroupedVar, expr);
+ if (gvar->gvid == gvi->gvid)
+ {
+ found = true;
+ break;
+ }
+ }
+ }
+
+ /* Even a single missing aggregate causes the whole test to fail. */
+ if (!found)
+ return false;
}
+
+ return true;
}
/*
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
new file mode 100644
index 5a68de3..ea24ed9
*** a/src/backend/optimizer/path/joinrels.c
--- b/src/backend/optimizer/path/joinrels.c
***************
*** 14,23 ****
--- 14,29 ----
*/
#include "postgres.h"
+ #include "miscadmin.h"
+ #include "nodes/relation.h"
+ #include "optimizer/clauses.h"
#include "optimizer/joininfo.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
+ #include "optimizer/prep.h"
+ #include "optimizer/cost.h"
#include "utils/memutils.h"
+ #include "utils/lsyscache.h"
static void make_rels_by_clause_joins(PlannerInfo *root,
*************** static void make_rels_by_clauseless_join
*** 29,40 ****
static bool has_join_restriction(PlannerInfo *root, RelOptInfo *rel);
static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
static bool is_dummy_rel(RelOptInfo *rel);
- static void mark_dummy_rel(RelOptInfo *rel);
static bool restriction_is_constant_false(List *restrictlist,
bool only_pushed_down);
static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
RelOptInfo *rel2, RelOptInfo *joinrel,
SpecialJoinInfo *sjinfo, List *restrictlist);
/*
--- 35,53 ----
static bool has_join_restriction(PlannerInfo *root, RelOptInfo *rel);
static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
static bool is_dummy_rel(RelOptInfo *rel);
static bool restriction_is_constant_false(List *restrictlist,
bool only_pushed_down);
static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
RelOptInfo *rel2, RelOptInfo *joinrel,
SpecialJoinInfo *sjinfo, List *restrictlist);
+ static void try_partition_wise_join(PlannerInfo *root, RelOptInfo *rel1,
+ RelOptInfo *rel2, RelOptInfo *joinrel,
+ SpecialJoinInfo *parent_sjinfo,
+ List *parent_restrictlist);
+ static int match_expr_to_partition_keys(Expr *expr, RelOptInfo *rel);
+ static void build_joinrel_partition_bounds(RelOptInfo *rel1, RelOptInfo *rel2,
+ RelOptInfo *joinrel, JoinType jointype,
+ List **rel1_parts, List **rel2_parts);
/*
*************** make_join_rel(PlannerInfo *root, RelOptI
*** 731,736 ****
--- 744,752 ----
populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
restrictlist);
+ /* Apply partition-wise join technique, if possible. */
+ try_partition_wise_join(root, rel1, rel2, joinrel, sjinfo, restrictlist);
+
bms_free(joinrelids);
return joinrel;
*************** is_dummy_rel(RelOptInfo *rel)
*** 1197,1203 ****
* is that the best solution is to explicitly make the dummy path in the same
* context the given RelOptInfo is in.
*/
! static void
mark_dummy_rel(RelOptInfo *rel)
{
MemoryContext oldcontext;
--- 1213,1219 ----
* is that the best solution is to explicitly make the dummy path in the same
* context the given RelOptInfo is in.
*/
! void
mark_dummy_rel(RelOptInfo *rel)
{
MemoryContext oldcontext;
*************** mark_dummy_rel(RelOptInfo *rel)
*** 1217,1223 ****
rel->partial_pathlist = NIL;
/* Set up the dummy path */
! add_path(rel, (Path *) create_append_path(rel, NIL, NULL, 0, NIL));
/* Set or update cheapest_total_path and related fields */
set_cheapest(rel);
--- 1233,1239 ----
rel->partial_pathlist = NIL;
/* Set up the dummy path */
! add_path(rel, (Path *) create_append_path(rel, NIL, NULL, 0, NIL), false);
/* Set or update cheapest_total_path and related fields */
set_cheapest(rel);
*************** restriction_is_constant_false(List *rest
*** 1268,1270 ****
--- 1284,1712 ----
}
return false;
}
+
+ /*
+ * Assess whether join between given two partitioned relations can be broken
+ * down into joins between matching partitions; a technique called
+ * "partition-wise join"
+ *
+ * Partition-wise join is possible when a. Joining relations have same
+ * partitioning scheme b. There exists an equi-join between the partition keys
+ * of the two relations.
+ *
+ * Partition-wise join is planned as follows (details: optimizer/README.)
+ *
+ * 1. Create the RelOptInfos for joins between matching partitions i.e
+ * child-joins and add paths to those.
+ *
+ * 2. Add "append" paths to join between parent relations. The second phase is
+ * implemented by generate_partition_wise_join_paths().
+ *
+ * The RelOptInfo, SpecialJoinInfo and restrictlist for each child join are
+ * obtained by translating the respective parent join structures.
+ */
+ static void
+ try_partition_wise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelOptInfo *joinrel, SpecialJoinInfo *parent_sjinfo,
+ List *parent_restrictlist)
+ {
+ int nparts;
+ int cnt_parts;
+ ListCell *lc1;
+ ListCell *lc2;
+ List *rel1_parts;
+ List *rel2_parts;
+ bool is_strict;
+
+ /* Guard against stack overflow due to overly deep partition hierarchy. */
+ check_stack_depth();
+
+ /* Nothing to do, if the join relation is not partitioned. */
+ if (!joinrel->part_scheme)
+ return;
+
+ /*
+ * set_append_rel_pathlist() may not create paths in children of an empty
+ * partitioned table and so we can not add paths to child-joins when one
+ * of the joining relations is empty. So, deem such a join as
+ * unpartitioned.
+ */
+ if (IS_DUMMY_REL(rel1) || IS_DUMMY_REL(rel2))
+ return;
+
+ /*
+ * Since this join relation is partitioned, all the base relations
+ * participating in this join must be partitioned and so are all the
+ * intermediate join relations.
+ */
+ Assert(rel1->part_scheme && rel2->part_scheme);
+
+ /*
+ * Every pair of joining relations we see here should have an equi-join
+ * between partition keys if this join has been deemed as a partitioned
+ * join. See build_joinrel_partition_info() for reasons.
+ */
+ Assert(have_partkey_equi_join(rel1, rel2, parent_sjinfo->jointype,
+ parent_restrictlist, &is_strict));
+
+ /*
+ * The partition scheme of the join relation should match that of the
+ * joining relations.
+ */
+ Assert(joinrel->part_scheme == rel1->part_scheme &&
+ joinrel->part_scheme == rel2->part_scheme);
+
+ /* We should have RelOptInfos of the partitions available. */
+ Assert(rel1->part_rels && rel2->part_rels);
+
+ /*
+ * Calculate bounds for the join relation. If we can not come up with joint
+ * bounds, we can not use partition-wise join.
+ */
+ build_joinrel_partition_bounds(rel1, rel2, joinrel,
+ parent_sjinfo->jointype, &rel1_parts,
+ &rel2_parts);
+ if (!joinrel->boundinfo)
+ return;
+
+ Assert(list_length(rel1_parts) == list_length(rel2_parts));
+ Assert(joinrel->nparts == list_length(rel1_parts));
+ Assert(joinrel->nparts > 0);
+
+ nparts = joinrel->nparts;
+
+ elog(DEBUG3, "join between relations %s and %s is considered for partition-wise join.",
+ bmsToString(rel1->relids), bmsToString(rel2->relids));
+
+ /* Allocate space to hold child-join RelOptInfos, if not already done. */
+ if (!joinrel->part_rels)
+ joinrel->part_rels = (RelOptInfo **) palloc0(sizeof(RelOptInfo *) * nparts);
+
+ /*
+ * Create child join relations for this partitioned join, if those don't
+ * exist. Add paths to child-joins for a pair of child relations
+ * corresponding to the given pair of parent relations.
+ */
+ cnt_parts = 0;
+ forboth (lc1, rel1_parts, lc2, rel2_parts)
+ {
+ RelOptInfo *child_rel1 = lfirst(lc1);
+ RelOptInfo *child_rel2 = lfirst(lc2);
+ SpecialJoinInfo *child_sjinfo;
+ List *child_restrictlist;
+ RelOptInfo *child_joinrel;
+ Relids child_joinrelids;
+ AppendRelInfo **appinfos;
+ int nappinfos;
+
+ /* We should never try to join two overlapping sets of rels. */
+ Assert(!bms_overlap(child_rel1->relids, child_rel2->relids));
+ child_joinrelids = bms_union(child_rel1->relids, child_rel2->relids);
+ appinfos = find_appinfos_by_relids(root, child_joinrelids, &nappinfos);
+
+ /*
+ * Construct SpecialJoinInfo from parent join relation's
+ * SpecialJoinInfo.
+ */
+ child_sjinfo = build_child_join_sjinfo(root, parent_sjinfo,
+ child_rel1->relids,
+ child_rel2->relids);
+
+ /*
+ * Construct restrictions applicable to the child join from
+ * those applicable to the parent join.
+ */
+ child_restrictlist = (List *) adjust_appendrel_attrs(root,
+ (Node *) parent_restrictlist,
+ nappinfos, appinfos);
+
+ child_joinrel = joinrel->part_rels[cnt_parts];
+ if (!child_joinrel)
+ {
+ child_joinrel = build_child_join_rel(root, child_rel1, child_rel2,
+ joinrel, child_restrictlist,
+ child_sjinfo,
+ child_sjinfo->jointype);
+ joinrel->part_rels[cnt_parts] = child_joinrel;
+ }
+
+ Assert(bms_equal(child_joinrel->relids, child_joinrelids));
+
+ /* Also translate expressions that AggPath will use in its target. */
+ if (child_joinrel->gpi != NULL)
+ {
+ Assert(child_joinrel->gpi->target != NULL);
+
+ child_joinrel->gpi->target->exprs =
+ (List *) adjust_appendrel_attrs(root,
+ (Node *) child_joinrel->gpi->target->exprs,
+ nappinfos, appinfos);
+ }
+
+ populate_joinrel_with_paths(root, child_rel1, child_rel2,
+ child_joinrel, child_sjinfo,
+ child_restrictlist);
+
+ pfree(appinfos);
+
+ /*
+ * If the child relations themselves are partitioned, try partition-wise join
+ * recursively.
+ */
+ try_partition_wise_join(root, child_rel1, child_rel2, child_joinrel,
+ child_sjinfo, child_restrictlist);
+ cnt_parts++;
+ }
+ }
+
+ /*
+ * Returns true if there exists an equi-join condition for each pair of
+ * partition keys from the given relations being joined.
+ */
+ bool
+ have_partkey_equi_join(RelOptInfo *rel1, RelOptInfo *rel2, JoinType jointype,
+ List *restrictlist, bool *is_strict)
+ {
+ PartitionScheme part_scheme = rel1->part_scheme;
+ ListCell *lc;
+ int cnt_pks;
+ int num_pks;
+ bool *pk_has_clause;
+
+ *is_strict = false;
+
+ /*
+ * This function should be called when the joining relations have same
+ * partitioning scheme.
+ */
+ Assert(rel1->part_scheme == rel2->part_scheme);
+ Assert(part_scheme);
+
+ num_pks = part_scheme->partnatts;
+
+ pk_has_clause = (bool *) palloc0(sizeof(bool) * num_pks);
+
+ foreach (lc, restrictlist)
+ {
+ RestrictInfo *rinfo = lfirst(lc);
+ OpExpr *opexpr;
+ Expr *expr1;
+ Expr *expr2;
+ int ipk1;
+ int ipk2;
+
+ /* If processing an outer join, only use its own join clauses. */
+ if (IS_OUTER_JOIN(jointype) && rinfo->is_pushed_down)
+ continue;
+
+ /* Skip clauses which can not be used for a join. */
+ if (!rinfo->can_join)
+ continue;
+
+ /* Skip clauses which are not equality conditions. */
+ if (!rinfo->mergeopfamilies)
+ continue;
+
+ opexpr = (OpExpr *) rinfo->clause;
+ Assert(is_opclause(opexpr));
+
+ /*
+ * The equi-join between partition keys is strict if equi-join between
+ * at least one partition key is using a strict operator. See
+ * explanation about outer join reordering identity 3 in
+ * optimizer/README
+ */
+ *is_strict = *is_strict || op_strict(opexpr->opno);
+
+ /* Match the operands to the relation. */
+ if (bms_is_subset(rinfo->left_relids, rel1->relids) &&
+ bms_is_subset(rinfo->right_relids, rel2->relids))
+ {
+ expr1 = linitial(opexpr->args);
+ expr2 = lsecond(opexpr->args);
+ }
+ else if (bms_is_subset(rinfo->left_relids, rel2->relids) &&
+ bms_is_subset(rinfo->right_relids, rel1->relids))
+ {
+ expr1 = lsecond(opexpr->args);
+ expr2 = linitial(opexpr->args);
+ }
+ else
+ continue;
+
+ /*
+ * Only clauses referencing the partition keys are useful for
+ * partition-wise join.
+ */
+ ipk1 = match_expr_to_partition_keys(expr1, rel1);
+ if (ipk1 < 0)
+ continue;
+ ipk2 = match_expr_to_partition_keys(expr2, rel2);
+ if (ipk2 < 0)
+ continue;
+
+ /*
+ * If the clause refers to keys at different ordinal positions in the
+ * partition keys of joining relations, it can not be used for
+ * partition-wise join.
+ */
+ if (ipk1 != ipk2)
+ continue;
+
+ /*
+ * The clause allows partition-wise join only if it uses the same
+ * operator family as that specified by the partition key.
+ */
+ if (!list_member_oid(rinfo->mergeopfamilies,
+ part_scheme->partopfamily[ipk1]))
+ continue;
+
+ /* Mark the partition key as having an equi-join clause. */
+ pk_has_clause[ipk1] = true;
+ }
+
+ /* Check whether every partition key has an equi-join condition. */
+ for (cnt_pks = 0; cnt_pks < num_pks; cnt_pks++)
+ {
+ if (!pk_has_clause[cnt_pks])
+ {
+ pfree(pk_has_clause);
+ return false;
+ }
+ }
+
+ pfree(pk_has_clause);
+ return true;
+ }
+
+ /*
+ * Find the partition key from the given relation matching the given
+ * expression. If found, return the index of the partition key, else return -1.
+ */
+ static int
+ match_expr_to_partition_keys(Expr *expr, RelOptInfo *rel)
+ {
+ int cnt_pks;
+ int num_pks;
+
+ /* This function should be called only for partitioned relations. */
+ Assert(rel->part_scheme);
+
+ num_pks = rel->part_scheme->partnatts;
+
+ /* Remove the relabel decoration. */
+ while (IsA(expr, RelabelType))
+ expr = (Expr *) (castNode(RelabelType, expr))->arg;
+
+ for (cnt_pks = 0; cnt_pks < num_pks; cnt_pks++)
+ {
+ List *pkexprs = rel->partexprs[cnt_pks];
+ ListCell *lc;
+
+ foreach(lc, pkexprs)
+ {
+ Expr *pkexpr = lfirst(lc);
+ if (equal(pkexpr, expr))
+ return cnt_pks;
+ }
+ }
+
+ return -1;
+ }
+
+ /*
+ * Calculate the bounds/lists of the join relation based on partition bounds of the
+ * joining relations. Also returns the matching partitions from the joining
+ * relations.
+ *
+ * As of now, it simply checks whether the bounds/lists of the joining
+ * relations match and returns bounds/lists of the first relation. In future
+ * this function will be expanded to merge the bounds/lists from the joining
+ * relations to produce the bounds/lists of the join relation. If the function
+ * fails to merge the bounds/lists, joinrel->boundinfo stays NULL and the lists NIL.
+ *
+ * The function also returns two lists of RelOptInfos, one for each joining
+ * relation. The RelOptInfos at the same position in each of the lists give the
+ * partitions with matching bounds which can be joined to produce join relation
+ * corresponding to the merged partition bounds corresponding to that position.
+ * When there doesn't exist a matching partition on either side, corresponding
+ * RelOptInfo will be NULL.
+ */
+ static void
+ build_joinrel_partition_bounds(RelOptInfo *rel1, RelOptInfo *rel2,
+ RelOptInfo *joinrel, JoinType jointype,
+ List **rel1_parts, List **rel2_parts)
+ {
+ PartitionScheme part_scheme;
+ int cnt;
+ int nparts;
+ int16 *parttyplen;
+ bool *parttypbyval;
+
+ Assert(rel1->part_scheme == rel2->part_scheme);
+ Assert(rel1->nparts == rel2->nparts);
+ *rel1_parts = NIL;
+ *rel2_parts = NIL;
+
+ part_scheme = rel1->part_scheme;
+
+ /*
+ * Ideally, we should be able to join two relations which have different
+ * number of partitions as long as the bounds of partitions available on
+ * both the sides match. But for now, we need exact same number of
+ * partitions on both the sides.
+ */
+ if (rel1->nparts != rel2->nparts)
+ {
+ /*
+ * If this pair of joining relations did not have same number of
+ * partitions no other pair can have same number of partitions.
+ */
+ Assert(!joinrel->boundinfo && joinrel->nparts == 0);
+ return;
+ }
+
+
+ parttyplen = (int16 *) palloc(sizeof(int16) * part_scheme->partnatts);
+ parttypbyval = (bool *) palloc(sizeof(bool) * part_scheme->partnatts);
+ for (cnt = 0; cnt < part_scheme->partnatts; cnt++)
+ get_typlenbyval(part_scheme->partopcintype[cnt], &parttyplen[cnt],
+ &parttypbyval[cnt]);
+
+ if (!partition_bounds_equal(part_scheme->partnatts, parttyplen,
+ parttypbyval, rel1->boundinfo,
+ rel2->boundinfo))
+ {
+ /*
+ * If this pair of joining relations did not have same partition bounds
+ * no other pair can have same partition bounds.
+ */
+ Assert(!joinrel->boundinfo && joinrel->nparts == 0);
+ return;
+ }
+
+ nparts = rel1->nparts;
+ for (cnt = 0; cnt < nparts; cnt++)
+ {
+ *rel1_parts = lappend(*rel1_parts, rel1->part_rels[cnt]);
+ *rel2_parts = lappend(*rel2_parts, rel2->part_rels[cnt]);
+ }
+
+ /* Set the partition bounds if not already set. */
+ if (!joinrel->boundinfo)
+ {
+ joinrel->boundinfo = rel1->boundinfo;
+ joinrel->nparts = rel1->nparts;
+ }
+ else
+ {
+ /* Verify existing bounds. */
+ Assert(partition_bounds_equal(part_scheme->partnatts, parttyplen,
+ parttypbyval, joinrel->boundinfo,
+ rel1->boundinfo));
+ Assert(joinrel->nparts == rel1->nparts);
+ }
+
+ pfree(parttyplen);
+ pfree(parttypbyval);
+ }
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
new file mode 100644
index a2fe661..91d855c
*** a/src/backend/optimizer/path/tidpath.c
--- b/src/backend/optimizer/path/tidpath.c
*************** create_tidscan_paths(PlannerInfo *root,
*** 266,270 ****
if (tidquals)
add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
! required_outer));
}
--- 266,270 ----
if (tidquals)
add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
! required_outer), false);
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index 95e6eb7..3f1389f
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** static Plan *prepare_sort_from_pathkeys(
*** 252,258 ****
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
TargetEntry *tle,
Relids relids);
! static Sort *make_sort_from_pathkeys(Plan *lefttree, List *pathkeys);
static Sort *make_sort_from_groupcols(List *groupcls,
AttrNumber *grpColIdx,
Plan *lefttree);
--- 252,259 ----
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
TargetEntry *tle,
Relids relids);
! static Sort *make_sort_from_pathkeys(Plan *lefttree, List *pathkeys,
! Relids relids);
static Sort *make_sort_from_groupcols(List *groupcls,
AttrNumber *grpColIdx,
Plan *lefttree);
*************** create_sort_plan(PlannerInfo *root, Sort
*** 1650,1656 ****
subplan = create_plan_recurse(root, best_path->subpath,
flags | CP_SMALL_TLIST);
! plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys);
copy_generic_path_info(&plan->plan, (Path *) best_path);
--- 1651,1657 ----
subplan = create_plan_recurse(root, best_path->subpath,
flags | CP_SMALL_TLIST);
! plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys, NULL);
copy_generic_path_info(&plan->plan, (Path *) best_path);
*************** create_mergejoin_plan(PlannerInfo *root,
*** 3767,3772 ****
--- 3768,3775 ----
ListCell *lc;
ListCell *lop;
ListCell *lip;
+ Path *outer_path = best_path->jpath.outerjoinpath;
+ Path *inner_path = best_path->jpath.innerjoinpath;
/*
* MergeJoin can project, so we don't have to demand exact tlists from the
*************** create_mergejoin_plan(PlannerInfo *root,
*** 3830,3837 ****
*/
if (best_path->outersortkeys)
{
Sort *sort = make_sort_from_pathkeys(outer_plan,
! best_path->outersortkeys);
label_sort_with_costsize(root, sort, -1.0);
outer_plan = (Plan *) sort;
--- 3833,3842 ----
*/
if (best_path->outersortkeys)
{
+ Relids outer_relids = outer_path->parent->relids;
Sort *sort = make_sort_from_pathkeys(outer_plan,
! best_path->outersortkeys,
! outer_relids);
label_sort_with_costsize(root, sort, -1.0);
outer_plan = (Plan *) sort;
*************** create_mergejoin_plan(PlannerInfo *root,
*** 3842,3849 ****
if (best_path->innersortkeys)
{
Sort *sort = make_sort_from_pathkeys(inner_plan,
! best_path->innersortkeys);
label_sort_with_costsize(root, sort, -1.0);
inner_plan = (Plan *) sort;
--- 3847,3856 ----
if (best_path->innersortkeys)
{
+ Relids inner_relids = inner_path->parent->relids;
Sort *sort = make_sort_from_pathkeys(inner_plan,
! best_path->innersortkeys,
! inner_relids);
label_sort_with_costsize(root, sort, -1.0);
inner_plan = (Plan *) sort;
*************** prepare_sort_from_pathkeys(Plan *lefttre
*** 5687,5697 ****
continue;
/*
! * Ignore child members unless they match the rel being
* sorted.
*/
if (em->em_is_child &&
! !bms_equal(em->em_relids, relids))
continue;
sortexpr = em->em_expr;
--- 5694,5704 ----
continue;
/*
! * Ignore child members unless they belong to the rel being
* sorted.
*/
if (em->em_is_child &&
! !bms_is_subset(em->em_relids, relids))
continue;
sortexpr = em->em_expr;
*************** find_ec_member_for_tle(EquivalenceClass
*** 5803,5812 ****
continue;
/*
! * Ignore child members unless they match the rel being sorted.
*/
if (em->em_is_child &&
! !bms_equal(em->em_relids, relids))
continue;
/* Match if same expression (after stripping relabel) */
--- 5810,5819 ----
continue;
/*
! * Ignore child members unless they belong to the rel being sorted.
*/
if (em->em_is_child &&
! !bms_is_subset(em->em_relids, relids))
continue;
/* Match if same expression (after stripping relabel) */
*************** find_ec_member_for_tle(EquivalenceClass
*** 5827,5835 ****
*
* 'lefttree' is the node which yields input tuples
* 'pathkeys' is the list of pathkeys by which the result is to be sorted
*/
static Sort *
! make_sort_from_pathkeys(Plan *lefttree, List *pathkeys)
{
int numsortkeys;
AttrNumber *sortColIdx;
--- 5834,5843 ----
*
* 'lefttree' is the node which yields input tuples
* 'pathkeys' is the list of pathkeys by which the result is to be sorted
+ * 'relids' is the set of relations required by prepare_sort_from_pathkeys()
*/
static Sort *
! make_sort_from_pathkeys(Plan *lefttree, List *pathkeys, Relids relids)
{
int numsortkeys;
AttrNumber *sortColIdx;
*************** make_sort_from_pathkeys(Plan *lefttree,
*** 5839,5845 ****
/* Compute sort column info, and adjust lefttree as needed */
lefttree = prepare_sort_from_pathkeys(lefttree, pathkeys,
! NULL,
NULL,
false,
&numsortkeys,
--- 5847,5853 ----
/* Compute sort column info, and adjust lefttree as needed */
lefttree = prepare_sort_from_pathkeys(lefttree, pathkeys,
! relids,
NULL,
false,
&numsortkeys,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
new file mode 100644
index ebd442a..0313c71
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
***************
*** 14,20 ****
--- 14,22 ----
*/
#include "postgres.h"
+ #include "access/sysattr.h"
#include "catalog/pg_type.h"
+ #include "catalog/pg_class.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
***************
*** 26,31 ****
--- 28,34 ----
#include "optimizer/planner.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+ #include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "parser/analyze.h"
#include "rewrite/rewriteManip.h"
*************** typedef struct PostponedQual
*** 45,50 ****
--- 48,54 ----
} PostponedQual;
+ static void create_grouped_var_infos(PlannerInfo *root);
static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
Index rtindex);
static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
*************** add_vars_to_targetlist(PlannerInfo *root
*** 240,245 ****
--- 244,533 ----
}
}
+ /*
+ * Add GroupedVarInfo to grouped_var_list for each aggregate and set up
+ * GroupedPathInfo for each base relation that can produce grouped paths.
+ *
+ * XXX In the future we might want to create GroupedVarInfo for grouping
+ * expressions too, so that grouping key is not limited to plain Var if the
+ * grouping takes place below the top-level join.
+ *
+ * root->group_pathkeys must be set up before this function is called.
+ */
+ extern void
+ add_grouping_info_to_base_rels(PlannerInfo *root)
+ {
+ int i;
+
+ /* No grouping in the query? */
+ if (!root->parse->groupClause || root->group_pathkeys == NIL)
+ return;
+
+ /* TODO This is just for PoC. Relax the limitation later. */
+ if (root->parse->havingQual)
+ return;
+
+ /* Create GroupedVarInfo per (distinct) aggregate. */
+ create_grouped_var_infos(root);
+
+ /* Is no grouping possible below the top-level join? */
+ if (root->grouped_var_list == NIL)
+ return;
+
+ /* Process the individual base relations. */
+ for (i = 1; i < root->simple_rel_array_size; i++)
+ {
+ RelOptInfo *rel = root->simple_rel_array[i];
+
+ /*
+ * "other rels" will have their targets built later, by translation of
+ * the target of the parent rel - see set_append_rel_size. If we
+ * wanted to prepare the child rels here, we'd need another iteration
+ * over simple_rel_array.
+ */
+ if (rel != NULL && rel->reloptkind == RELOPT_BASEREL)
+ prepare_rel_for_grouping(root, rel);
+ }
+ }
+
+ /*
+ * Create GroupedVarInfo for each distinct aggregate.
+ *
+ * If any aggregate is not suitable, set root->grouped_var_list to NIL and
+ * return.
+ *
+ * TODO Include aggregates from HAVING clause.
+ */
+ static void
+ create_grouped_var_infos(PlannerInfo *root)
+ {
+ List *tlist_exprs;
+ ListCell *lc;
+
+ Assert(root->grouped_var_list == NIL);
+
+ /*
+ * TODO Check if processed_tlist contains the HAVING aggregates. If not,
+ * get them elsewhere.
+ */
+ tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+ PVC_INCLUDE_AGGREGATES);
+ if (tlist_exprs == NIL)
+ return;
+
+ /* tlist_exprs may also contain Vars, but we only need Aggrefs. */
+ foreach(lc, tlist_exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Aggref *aggref;
+ ListCell *lc2;
+ GroupedVarInfo *gvi;
+ bool exists;
+
+ if (IsA(expr, Var))
+ continue;
+
+ aggref = castNode(Aggref, expr);
+
+ /* TODO Think if (some of) these can be handled. */
+ if (aggref->aggvariadic ||
+ aggref->aggdirectargs || aggref->aggorder ||
+ aggref->aggdistinct || aggref->aggfilter)
+ {
+ /*
+ * Partial aggregation is not useful if at least one aggregate
+ * cannot be evaluated below the top-level join.
+ *
+ * XXX Is it worth freeing the GroupedVarInfos and their subtrees?
+ */
+ root->grouped_var_list = NIL;
+ break;
+ }
+
+ /* Does GroupedVarInfo for this aggregate already exist? */
+ exists = false;
+ foreach(lc2, root->grouped_var_list)
+ {
+ /* Compare the aggregate being processed, not the list element. */
+
+ gvi = castNode(GroupedVarInfo, lfirst(lc2));
+
+ if (equal(aggref, gvi->gvexpr))
+ {
+ exists = true;
+ break;
+ }
+ }
+
+ /* Construct a new GroupedVarInfo if one does not exist yet. */
+ if (!exists)
+ {
+ Relids relids;
+
+ /* TODO Initialize gv_width. */
+ gvi = makeNode(GroupedVarInfo);
+
+ gvi->gvid = list_length(root->grouped_var_list);
+ gvi->gvexpr = (Expr *) copyObject(aggref);
+ gvi->agg_partial = copyObject(aggref);
+ mark_partial_aggref(gvi->agg_partial, AGGSPLIT_INITIAL_SERIAL);
+
+ /* Find out where the aggregate should be evaluated. */
+ relids = pull_varnos((Node *) aggref);
+ if (!bms_is_empty(relids))
+ gvi->gv_eval_at = relids;
+ else
+ {
+ Assert(aggref->aggstar);
+ gvi->gv_eval_at = NULL;
+ }
+
+ root->grouped_var_list = lappend(root->grouped_var_list, gvi);
+ }
+ }
+
+ list_free(tlist_exprs);
+ }
+
+ /*
+ * Check if all the expressions of rel->reltarget can be used as grouping
+ * expressions and create target for grouped paths.
+ *
+ * If we succeed in creating the grouping target, also replace rel->reltarget
+ * with a new one that has sortgrouprefs initialized -- this is necessary for
+ * create_agg_plan to match the grouping clauses against the input target
+ * expressions.
+ *
+ * rel_agg_attrs is a set of attributes of the relation referenced by
+ * aggregate arguments. These can exist in the (plain) target without being
+ * grouping expressions.
+ *
+ * rel_agg_vars should be passed instead if rel is a join.
+ *
+ * TODO How about PHVs?
+ *
+ * TODO Make sure cost / width of both "result" and "plain" are correct.
+ */
+ PathTarget *
+ create_grouped_target(PlannerInfo *root, RelOptInfo *rel,
+ Relids rel_agg_attrs, List *rel_agg_vars)
+ {
+ PathTarget *result, *plain;
+ ListCell *lc;
+
+ /* The target to be returned. */
+ result = create_empty_pathtarget();
+ /* The one to replace rel->reltarget. */
+ plain = create_empty_pathtarget();
+
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Expr *texpr;
+ Index sortgroupref;
+ bool agg_arg_only = false;
+
+ texpr = (Expr *) lfirst(lc);
+
+ sortgroupref = get_expr_sortgroupref(root, texpr);
+ if (sortgroupref > 0)
+ {
+ /* It's o.k. to use the target expression for grouping. */
+ add_column_to_pathtarget(result, texpr, sortgroupref);
+
+ /*
+ * As for the plain target, add the original expression but set
+ * sortgroupref in addition.
+ */
+ add_column_to_pathtarget(plain, texpr, sortgroupref);
+
+ /* Process the next expression. */
+ continue;
+ }
+
+ /*
+ * It may still be o.k. if the expression is only contained in Aggref
+ * - then it's not expected in the grouped output.
+ *
+ * TODO Try to handle generic expressions, not only Vars. That might
+ * require us to create rel->reltarget of the grouping rel in
+ * parallel to that of the plain rel, and adding whole expressions
+ * instead of individual vars.
+ */
+ if (IsA(texpr, Var))
+ {
+ Var *arg_var = castNode(Var, texpr);
+
+ if (rel->relid > 0)
+ {
+ AttrNumber varattno;
+
+ /*
+ * For a single relation we only need to check attribute
+ * number.
+ *
+ * Apply the same offset that pull_varattnos() did.
+ */
+ varattno = arg_var->varattno - FirstLowInvalidHeapAttributeNumber;
+
+ if (bms_is_member(varattno, rel_agg_attrs))
+ agg_arg_only = true;
+ }
+ else
+ {
+ ListCell *lc2;
+
+ /* Join case. */
+ foreach(lc2, rel_agg_vars)
+ {
+ Var *var = castNode(Var, lfirst(lc2));
+
+ if (var->varno == arg_var->varno &&
+ var->varattno == arg_var->varattno)
+ {
+ agg_arg_only = true;
+ break;
+ }
+ }
+ }
+
+ if (agg_arg_only)
+ {
+ /*
+ * This expression is not suitable for grouping, but the
+ * aggregation input target ought to stay complete.
+ */
+ add_column_to_pathtarget(plain, texpr, 0);
+ }
+ }
+
+ /*
+ * A single mismatched expression makes the whole relation useless
+ * for grouping.
+ */
+ if (!agg_arg_only)
+ {
+ /*
+ * TODO This can apparently happen multiple times per relation,
+ * so result might be worth freeing. Implement free_pathtarget()?
+ * Or mark the relation as inappropriate for grouping?
+ */
+ /* TODO Free both result and plain. */
+ return NULL;
+ }
+ }
+
+ if (list_length(result->exprs) == 0)
+ {
+ /* TODO free_pathtarget(result); free_pathtarget(plain) */
+ result = NULL;
+ }
+
+ /* Apply the adjusted input target as the replacement is complete now. */
+ rel->reltarget = plain;
+
+ return result;
+ }
+
/*****************************************************************************
*
*************** create_lateral_join_info(PlannerInfo *ro
*** 629,639 ****
for (rti = 1; rti < root->simple_rel_array_size; rti++)
{
RelOptInfo *brel = root->simple_rel_array[rti];
! if (brel == NULL || brel->reloptkind != RELOPT_BASEREL)
continue;
! if (root->simple_rte_array[rti]->inh)
{
foreach(lc, root->append_rel_list)
{
--- 917,941 ----
for (rti = 1; rti < root->simple_rel_array_size; rti++)
{
RelOptInfo *brel = root->simple_rel_array[rti];
+ RangeTblEntry *brte = root->simple_rte_array[rti];
! if (brel == NULL)
continue;
! /*
! * If an "other rel" RTE is a "partitioned table", we must propagate
! * the lateral info inherited all the way from the root parent to its
! * children. That's because the children are not linked directly with
! * the root parent via AppendRelInfos, unlike in the case of a regular
! * inheritance set (see expand_inherited_rtentry()). Failing to
! * do this would result in those children not getting marked with the
! * appropriate lateral info.
! */
! if (brel->reloptkind != RELOPT_BASEREL &&
! brte->relkind != RELKIND_PARTITIONED_TABLE)
! continue;
!
! if (brte->inh)
{
foreach(lc, root->append_rel_list)
{
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
new file mode 100644
index 5565736..058af2c
*** a/src/backend/optimizer/plan/planagg.c
--- b/src/backend/optimizer/plan/planagg.c
*************** preprocess_minmax_aggregates(PlannerInfo
*** 223,229 ****
create_minmaxagg_path(root, grouped_rel,
create_pathtarget(root, tlist),
aggs_list,
! (List *) parse->havingQual));
}
/*
--- 223,229 ----
create_minmaxagg_path(root, grouped_rel,
create_pathtarget(root, tlist),
aggs_list,
! (List *) parse->havingQual), false);
}
/*
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
new file mode 100644
index ef0de3f..f70b445
*** a/src/backend/optimizer/plan/planmain.c
--- b/src/backend/optimizer/plan/planmain.c
*************** query_planner(PlannerInfo *root, List *t
*** 83,89 ****
add_path(final_rel, (Path *)
create_result_path(root, final_rel,
final_rel->reltarget,
! (List *) parse->jointree->quals));
/* Select cheapest path (pretty easy in this case...) */
set_cheapest(final_rel);
--- 83,89 ----
add_path(final_rel, (Path *)
create_result_path(root, final_rel,
final_rel->reltarget,
! (List *) parse->jointree->quals), false);
/* Select cheapest path (pretty easy in this case...) */
set_cheapest(final_rel);
*************** query_planner(PlannerInfo *root, List *t
*** 114,119 ****
--- 114,120 ----
root->full_join_clauses = NIL;
root->join_info_list = NIL;
root->placeholder_list = NIL;
+ root->grouped_var_list = NIL;
root->fkey_list = NIL;
root->initial_rels = NIL;
*************** query_planner(PlannerInfo *root, List *t
*** 177,182 ****
--- 178,191 ----
(*qp_callback) (root, qp_extra);
/*
+ * If the query result can be grouped, check if any grouping can be
+ * performed below the top-level join. If so, initialize GroupedPathInfo
+ * of base relations capable of doing the grouping and set up
+ * root->grouped_var_list.
+ */
+ add_grouping_info_to_base_rels(root);
+
+ /*
* Examine any "placeholder" expressions generated during subquery pullup.
* Make sure that the Vars they need are marked as needed at the relevant
* join level. This must be done before join removal because it might
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 649a233..d47f635
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** typedef struct
*** 108,117 ****
--- 108,135 ----
int *tleref_to_colnum_map;
} grouping_sets_data;
+ /* Result of a given invocation of inheritance_planner_guts() */
+ typedef struct
+ {
+ Index nominalRelation;
+ List *partitioned_rels;
+ List *resultRelations;
+ List *subpaths;
+ List *subroots;
+ List *withCheckOptionLists;
+ List *returningLists;
+ List *final_rtable;
+ List *init_plans;
+ int save_rel_array_size;
+ RelOptInfo **save_rel_array;
+ } inheritance_planner_result;
+
/* Local functions */
static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
static void inheritance_planner(PlannerInfo *root);
+ static void inheritance_planner_guts(PlannerInfo *root,
+ inheritance_planner_result *inhpres);
static void grouping_planner(PlannerInfo *root, bool inheritance_update,
double tuple_fraction);
static grouping_sets_data *preprocess_grouping_sets(PlannerInfo *root);
*************** static void standard_qp_callback(Planner
*** 130,138 ****
static double get_number_of_groups(PlannerInfo *root,
double path_rows,
grouping_sets_data *gd);
- static Size estimate_hashagg_tablesize(Path *path,
- const AggClauseCosts *agg_costs,
- double dNumGroups);
static RelOptInfo *create_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
PathTarget *target,
--- 148,153 ----
*************** preprocess_phv_expression(PlannerInfo *r
*** 1020,1044 ****
static void
inheritance_planner(PlannerInfo *root)
{
Query *parse = root->parse;
int parentRTindex = parse->resultRelation;
Bitmapset *subqueryRTindexes;
Bitmapset *modifiableARIindexes;
! int nominalRelation = -1;
! List *final_rtable = NIL;
! int save_rel_array_size = 0;
! RelOptInfo **save_rel_array = NULL;
! List *subpaths = NIL;
! List *subroots = NIL;
! List *resultRelations = NIL;
! List *withCheckOptionLists = NIL;
! List *returningLists = NIL;
! List *rowMarks;
! RelOptInfo *final_rel;
ListCell *lc;
Index rti;
RangeTblEntry *parent_rte;
- List *partitioned_rels = NIL;
Assert(parse->commandType != CMD_INSERT);
--- 1035,1139 ----
static void
inheritance_planner(PlannerInfo *root)
{
+ inheritance_planner_result inhpres;
+ Query *parse = root->parse;
+ RelOptInfo *final_rel;
+ Index rti;
+ int final_rtable_len;
+ ListCell *lc;
+ List *rowMarks;
+
+ /*
+ * Away we go... Although the inheritance hierarchy to be processed might
+ * be represented in a non-flat manner, some of the elements needed to
+ * create the final ModifyTable path are always returned in a flat list
+ * structure.
+ */
+ memset(&inhpres, 0, sizeof(inhpres));
+ inheritance_planner_guts(root, &inhpres);
+
+ /* Result path must go into outer query's FINAL upperrel */
+ final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
+
+ /*
+ * We don't currently worry about setting final_rel's consider_parallel
+ * flag in this case, nor about allowing FDWs or create_upper_paths_hook
+ * to get control here.
+ */
+
+ /*
+ * If we managed to exclude every child rel, return a dummy plan; it
+ * doesn't even need a ModifyTable node.
+ */
+ if (inhpres.subpaths == NIL)
+ {
+ set_dummy_rel_pathlist(final_rel);
+ return;
+ }
+
+ /*
+ * Put back the final adjusted rtable into the master copy of the Query.
+ * (We mustn't do this if we found no non-excluded children.)
+ */
+ parse->rtable = inhpres.final_rtable;
+ root->simple_rel_array_size = inhpres.save_rel_array_size;
+ root->simple_rel_array = inhpres.save_rel_array;
+ /* Must reconstruct master's simple_rte_array, too */
+ final_rtable_len = list_length(inhpres.final_rtable);
+ root->simple_rte_array = (RangeTblEntry **)
+ palloc0((final_rtable_len + 1) *
+ sizeof(RangeTblEntry *));
+ rti = 1;
+ foreach(lc, inhpres.final_rtable)
+ {
+ RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
+
+ root->simple_rte_array[rti++] = rte;
+ }
+
+ /*
+ * If there was a FOR [KEY] UPDATE/SHARE clause, the LockRows node will
+ * have dealt with fetching non-locked marked rows, else we need to have
+ * ModifyTable do that.
+ */
+ if (parse->rowMarks)
+ rowMarks = NIL;
+ else
+ rowMarks = root->rowMarks;
+
+ /* Create Path representing a ModifyTable to do the UPDATE/DELETE work */
+ add_path(final_rel, (Path *)
+ create_modifytable_path(root, final_rel,
+ parse->commandType,
+ parse->canSetTag,
+ inhpres.nominalRelation,
+ inhpres.partitioned_rels,
+ inhpres.resultRelations,
+ inhpres.subpaths,
+ inhpres.subroots,
+ inhpres.withCheckOptionLists,
+ inhpres.returningLists,
+ rowMarks,
+ NULL,
+ SS_assign_special_param(root)), false);
+ }
+
+ /*
+ * inheritance_planner_guts
+ * Recursive guts of inheritance_planner
+ */
+ static void
+ inheritance_planner_guts(PlannerInfo *root,
+ inheritance_planner_result *inhpres)
+ {
Query *parse = root->parse;
int parentRTindex = parse->resultRelation;
Bitmapset *subqueryRTindexes;
Bitmapset *modifiableARIindexes;
! bool nominalRelationSet = false;
ListCell *lc;
Index rti;
RangeTblEntry *parent_rte;
Assert(parse->commandType != CMD_INSERT);
*************** inheritance_planner(PlannerInfo *root)
*** 1106,1112 ****
*/
parent_rte = rt_fetch(parentRTindex, root->parse->rtable);
if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE)
! nominalRelation = parentRTindex;
/*
* And now we can get on with generating a plan for each child table.
--- 1201,1210 ----
*/
parent_rte = rt_fetch(parentRTindex, root->parse->rtable);
if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE)
! {
! inhpres->nominalRelation = parentRTindex;
! nominalRelationSet = true;
! }
/*
* And now we can get on with generating a plan for each child table.
*************** inheritance_planner(PlannerInfo *root)
*** 1115,1120 ****
--- 1213,1219 ----
{
AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
PlannerInfo *subroot;
+ Index childRTindex = appinfo->child_relid;
RangeTblEntry *child_rte;
RelOptInfo *sub_final_rel;
Path *subpath;
*************** inheritance_planner(PlannerInfo *root)
*** 1136,1152 ****
* references to the parent RTE to refer to the current child RTE,
* then fool around with subquery RTEs.
*/
! subroot->parse = (Query *)
! adjust_appendrel_attrs(root,
! (Node *) parse,
! appinfo);
/*
* If there are securityQuals attached to the parent, move them to the
* child rel (they've already been transformed properly for that).
*/
parent_rte = rt_fetch(parentRTindex, subroot->parse->rtable);
! child_rte = rt_fetch(appinfo->child_relid, subroot->parse->rtable);
child_rte->securityQuals = parent_rte->securityQuals;
parent_rte->securityQuals = NIL;
--- 1235,1249 ----
* references to the parent RTE to refer to the current child RTE,
* then fool around with subquery RTEs.
*/
! subroot->parse = (Query *) adjust_appendrel_attrs(root, (Node *) parse,
! 1, &appinfo);
/*
* If there are securityQuals attached to the parent, move them to the
* child rel (they've already been transformed properly for that).
*/
parent_rte = rt_fetch(parentRTindex, subroot->parse->rtable);
! child_rte = rt_fetch(childRTindex, subroot->parse->rtable);
child_rte->securityQuals = parent_rte->securityQuals;
parent_rte->securityQuals = NIL;
*************** inheritance_planner(PlannerInfo *root)
*** 1191,1197 ****
* These won't be referenced, so there's no need to make them very
* valid-looking.
*/
! while (list_length(subroot->parse->rtable) < list_length(final_rtable))
subroot->parse->rtable = lappend(subroot->parse->rtable,
makeNode(RangeTblEntry));
--- 1288,1295 ----
* These won't be referenced, so there's no need to make them very
* valid-looking.
*/
! while (list_length(subroot->parse->rtable) <
! list_length(inhpres->final_rtable))
subroot->parse->rtable = lappend(subroot->parse->rtable,
makeNode(RangeTblEntry));
*************** inheritance_planner(PlannerInfo *root)
*** 1203,1209 ****
* since subquery RTEs couldn't contain any references to the target
* rel.
*/
! if (final_rtable != NIL && subqueryRTindexes != NULL)
{
ListCell *lr;
--- 1301,1307 ----
* since subquery RTEs couldn't contain any references to the target
* rel.
*/
! if (inhpres->final_rtable != NIL && subqueryRTindexes != NULL)
{
ListCell *lr;
*************** inheritance_planner(PlannerInfo *root)
*** 1248,1253 ****
--- 1346,1392 ----
}
}
+ /*
+ * Recurse for a partitioned child table. We shouldn't be planning
+ * a partitioned RTE as a child member, which is what the code after
+ * this block does.
+ */
+ if (child_rte->inh)
+ {
+ inheritance_planner_result child_inhpres;
+
+ Assert(child_rte->relkind == RELKIND_PARTITIONED_TABLE);
+
+ /* During the recursive invocation, this child is the parent. */
+ subroot->parse->resultRelation = childRTindex;
+ memset(&child_inhpres, 0, sizeof(child_inhpres));
+ inheritance_planner_guts(subroot, &child_inhpres);
+
+ inhpres->partitioned_rels = list_concat(inhpres->partitioned_rels,
+ child_inhpres.partitioned_rels);
+ inhpres->resultRelations = list_concat(inhpres->resultRelations,
+ child_inhpres.resultRelations);
+ inhpres->subpaths = list_concat(inhpres->subpaths,
+ child_inhpres.subpaths);
+ inhpres->subroots = list_concat(inhpres->subroots,
+ child_inhpres.subroots);
+ inhpres->withCheckOptionLists =
+ list_concat(inhpres->withCheckOptionLists,
+ child_inhpres.withCheckOptionLists);
+ inhpres->returningLists = list_concat(inhpres->returningLists,
+ child_inhpres.returningLists);
+ if (child_inhpres.final_rtable != NIL)
+ inhpres->final_rtable = child_inhpres.final_rtable;
+ if (child_inhpres.init_plans != NIL)
+ inhpres->init_plans = child_inhpres.init_plans;
+ if (child_inhpres.save_rel_array_size != 0)
+ {
+ inhpres->save_rel_array_size = child_inhpres.save_rel_array_size;
+ inhpres->save_rel_array = child_inhpres.save_rel_array;
+ }
+ continue;
+ }
+
/* There shouldn't be any OJ info to translate, as yet */
Assert(subroot->join_info_list == NIL);
/* and we haven't created PlaceHolderInfos, either */
*************** inheritance_planner(PlannerInfo *root)
*** 1279,1286 ****
* the duplicate child RTE added for the parent does not appear
* anywhere else in the plan tree.
*/
! if (nominalRelation < 0)
! nominalRelation = appinfo->child_relid;
/*
* Select cheapest path in case there's more than one. We always run
--- 1418,1428 ----
* the duplicate child RTE added for the parent does not appear
* anywhere else in the plan tree.
*/
! if (!nominalRelationSet)
! {
! inhpres->nominalRelation = childRTindex;
! nominalRelationSet = true;
! }
/*
* Select cheapest path in case there's more than one. We always run
*************** inheritance_planner(PlannerInfo *root)
*** 1303,1314 ****
* becomes the initial contents of final_rtable; otherwise, append
* just its modified subquery RTEs to final_rtable.
*/
! if (final_rtable == NIL)
! final_rtable = subroot->parse->rtable;
else
! final_rtable = list_concat(final_rtable,
! list_copy_tail(subroot->parse->rtable,
! list_length(final_rtable)));
/*
* We need to collect all the RelOptInfos from all child plans into
--- 1445,1456 ----
* becomes the initial contents of final_rtable; otherwise, append
* just its modified subquery RTEs to final_rtable.
*/
! if (inhpres->final_rtable == NIL)
! inhpres->final_rtable = subroot->parse->rtable;
else
! inhpres->final_rtable = list_concat(inhpres->final_rtable,
! list_copy_tail(subroot->parse->rtable,
! list_length(inhpres->final_rtable)));
/*
* We need to collect all the RelOptInfos from all child plans into
*************** inheritance_planner(PlannerInfo *root)
*** 1317,1425 ****
* have to propagate forward the RelOptInfos that were already built
* in previous children.
*/
! Assert(subroot->simple_rel_array_size >= save_rel_array_size);
! for (rti = 1; rti < save_rel_array_size; rti++)
{
! RelOptInfo *brel = save_rel_array[rti];
if (brel)
subroot->simple_rel_array[rti] = brel;
}
! save_rel_array_size = subroot->simple_rel_array_size;
! save_rel_array = subroot->simple_rel_array;
/* Make sure any initplans from this rel get into the outer list */
! root->init_plans = subroot->init_plans;
/* Build list of sub-paths */
! subpaths = lappend(subpaths, subpath);
/* Build list of modified subroots, too */
! subroots = lappend(subroots, subroot);
/* Build list of target-relation RT indexes */
! resultRelations = lappend_int(resultRelations, appinfo->child_relid);
/* Build lists of per-relation WCO and RETURNING targetlists */
if (parse->withCheckOptions)
! withCheckOptionLists = lappend(withCheckOptionLists,
! subroot->parse->withCheckOptions);
if (parse->returningList)
! returningLists = lappend(returningLists,
! subroot->parse->returningList);
!
Assert(!parse->onConflict);
}
if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE)
{
! partitioned_rels = get_partitioned_child_rels(root, parentRTindex);
/* The root partitioned table is included as a child rel */
! Assert(list_length(partitioned_rels) >= 1);
! }
!
! /* Result path must go into outer query's FINAL upperrel */
! final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
!
! /*
! * We don't currently worry about setting final_rel's consider_parallel
! * flag in this case, nor about allowing FDWs or create_upper_paths_hook
! * to get control here.
! */
!
! /*
! * If we managed to exclude every child rel, return a dummy plan; it
! * doesn't even need a ModifyTable node.
! */
! if (subpaths == NIL)
! {
! set_dummy_rel_pathlist(final_rel);
! return;
! }
!
! /*
! * Put back the final adjusted rtable into the master copy of the Query.
! * (We mustn't do this if we found no non-excluded children.)
! */
! parse->rtable = final_rtable;
! root->simple_rel_array_size = save_rel_array_size;
! root->simple_rel_array = save_rel_array;
! /* Must reconstruct master's simple_rte_array, too */
! root->simple_rte_array = (RangeTblEntry **)
! palloc0((list_length(final_rtable) + 1) * sizeof(RangeTblEntry *));
! rti = 1;
! foreach(lc, final_rtable)
! {
! RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
!
! root->simple_rte_array[rti++] = rte;
}
-
- /*
- * If there was a FOR [KEY] UPDATE/SHARE clause, the LockRows node will
- * have dealt with fetching non-locked marked rows, else we need to have
- * ModifyTable do that.
- */
- if (parse->rowMarks)
- rowMarks = NIL;
- else
- rowMarks = root->rowMarks;
-
- /* Create Path representing a ModifyTable to do the UPDATE/DELETE work */
- add_path(final_rel, (Path *)
- create_modifytable_path(root, final_rel,
- parse->commandType,
- parse->canSetTag,
- nominalRelation,
- partitioned_rels,
- resultRelations,
- subpaths,
- subroots,
- withCheckOptionLists,
- returningLists,
- rowMarks,
- NULL,
- SS_assign_special_param(root)));
}
/*--------------------
--- 1459,1506 ----
* have to propagate forward the RelOptInfos that were already built
* in previous children.
*/
! Assert(subroot->simple_rel_array_size >= inhpres->save_rel_array_size);
! for (rti = 1; rti < inhpres->save_rel_array_size; rti++)
{
! RelOptInfo *brel = inhpres->save_rel_array[rti];
if (brel)
subroot->simple_rel_array[rti] = brel;
}
! inhpres->save_rel_array_size = subroot->simple_rel_array_size;
! inhpres->save_rel_array = subroot->simple_rel_array;
/* Make sure any initplans from this rel get into the outer list */
! inhpres->init_plans = subroot->init_plans;
/* Build list of sub-paths */
! inhpres->subpaths = lappend(inhpres->subpaths, subpath);
/* Build list of modified subroots, too */
! inhpres->subroots = lappend(inhpres->subroots, subroot);
/* Build list of target-relation RT indexes */
! inhpres->resultRelations = lappend_int(inhpres->resultRelations,
! childRTindex);
/* Build lists of per-relation WCO and RETURNING targetlists */
if (parse->withCheckOptions)
! inhpres->withCheckOptionLists =
! lappend(inhpres->withCheckOptionLists,
! subroot->parse->withCheckOptions);
if (parse->returningList)
! inhpres->returningLists = lappend(inhpres->returningLists,
! subroot->parse->returningList);
Assert(!parse->onConflict);
}
if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE)
{
! inhpres->partitioned_rels = get_partitioned_child_rels(root,
! parentRTindex);
/* The root partitioned table is included as a child rel */
! Assert(list_length(inhpres->partitioned_rels) >= 1);
}
}
/*--------------------
*************** grouping_planner(PlannerInfo *root, bool
*** 2040,2046 ****
}
/* And shove it into final_rel */
! add_path(final_rel, path);
}
/*
--- 2121,2127 ----
}
/* And shove it into final_rel */
! add_path(final_rel, path, false);
}
/*
*************** get_number_of_groups(PlannerInfo *root,
*** 3446,3485 ****
}
/*
- * estimate_hashagg_tablesize
- * estimate the number of bytes that a hash aggregate hashtable will
- * require based on the agg_costs, path width and dNumGroups.
- *
- * XXX this may be over-estimating the size now that hashagg knows to omit
- * unneeded columns from the hashtable. Also for mixed-mode grouping sets,
- * grouping columns not in the hashed set are counted here even though hashagg
- * won't store them. Is this a problem?
- */
- static Size
- estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
- double dNumGroups)
- {
- Size hashentrysize;
-
- /* Estimate per-hash-entry space at tuple width... */
- hashentrysize = MAXALIGN(path->pathtarget->width) +
- MAXALIGN(SizeofMinimalTupleHeader);
-
- /* plus space for pass-by-ref transition values... */
- hashentrysize += agg_costs->transitionSpace;
- /* plus the per-hash-entry overhead */
- hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
-
- /*
- * Note that this disregards the effect of fill-factor and growth policy
- * of the hash-table. That's probably ok, given default the default
- * fill-factor is relatively high. It'd be hard to meaningfully factor in
- * "double-in-size" growth policies here.
- */
- return hashentrysize * dNumGroups;
- }
-
- /*
* create_grouping_paths
*
* Build a new upperrel containing Paths for grouping and/or aggregation.
--- 3527,3532 ----
*************** create_grouping_paths(PlannerInfo *root,
*** 3600,3606 ****
(List *) parse->havingQual);
}
! add_path(grouped_rel, path);
/* No need to consider any other alternatives. */
set_cheapest(grouped_rel);
--- 3647,3653 ----
(List *) parse->havingQual);
}
! add_path(grouped_rel, path, false);
/* No need to consider any other alternatives. */
set_cheapest(grouped_rel);
*************** create_grouping_paths(PlannerInfo *root,
*** 3777,3783 ****
parse->groupClause,
NIL,
&agg_partial_costs,
! dNumPartialGroups));
else
add_partial_path(grouped_rel, (Path *)
create_group_path(root,
--- 3824,3831 ----
parse->groupClause,
NIL,
&agg_partial_costs,
! dNumPartialGroups),
! false);
else
add_partial_path(grouped_rel, (Path *)
create_group_path(root,
*************** create_grouping_paths(PlannerInfo *root,
*** 3786,3792 ****
partial_grouping_target,
parse->groupClause,
NIL,
! dNumPartialGroups));
}
}
}
--- 3834,3841 ----
partial_grouping_target,
parse->groupClause,
NIL,
! dNumPartialGroups),
! false);
}
}
}
*************** create_grouping_paths(PlannerInfo *root,
*** 3817,3823 ****
parse->groupClause,
NIL,
&agg_partial_costs,
! dNumPartialGroups));
}
}
}
--- 3866,3873 ----
parse->groupClause,
NIL,
&agg_partial_costs,
! dNumPartialGroups),
! false);
}
}
}
*************** create_grouping_paths(PlannerInfo *root,
*** 3869,3875 ****
parse->groupClause,
(List *) parse->havingQual,
agg_costs,
! dNumGroups));
}
else if (parse->groupClause)
{
--- 3919,3925 ----
parse->groupClause,
(List *) parse->havingQual,
agg_costs,
! dNumGroups), false);
}
else if (parse->groupClause)
{
*************** create_grouping_paths(PlannerInfo *root,
*** 3884,3890 ****
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups));
}
else
{
--- 3934,3940 ----
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups), false);
}
else
{
*************** create_grouping_paths(PlannerInfo *root,
*** 3933,3939 ****
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups));
else
add_path(grouped_rel, (Path *)
create_group_path(root,
--- 3983,3989 ----
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups), false);
else
add_path(grouped_rel, (Path *)
create_group_path(root,
*************** create_grouping_paths(PlannerInfo *root,
*** 3942,3948 ****
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups));
/*
* The point of using Gather Merge rather than Gather is that it
--- 3992,3998 ----
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups), false);
/*
* The point of using Gather Merge rather than Gather is that it
*************** create_grouping_paths(PlannerInfo *root,
*** 3995,4001 ****
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups));
else
add_path(grouped_rel, (Path *)
create_group_path(root,
--- 4045,4051 ----
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups), false);
else
add_path(grouped_rel, (Path *)
create_group_path(root,
*************** create_grouping_paths(PlannerInfo *root,
*** 4004,4010 ****
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups));
}
}
}
--- 4054,4060 ----
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups), false);
}
}
}
*************** create_grouping_paths(PlannerInfo *root,
*** 4049,4055 ****
parse->groupClause,
(List *) parse->havingQual,
agg_costs,
! dNumGroups));
}
}
--- 4099,4105 ----
parse->groupClause,
(List *) parse->havingQual,
agg_costs,
! dNumGroups), false);
}
}
*************** create_grouping_paths(PlannerInfo *root,
*** 4087,4095 ****
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups));
}
}
}
/* Give a helpful error if we failed to find any implementation */
--- 4137,4212 ----
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups), false);
}
}
+
+ /*
+ * If input_rel has partially aggregated partial paths, gather them
+ * and perform the final aggregation.
+ *
+ * TODO Allow havingQual - currently not supported at base relation
+ * level.
+ */
+ if (input_rel->gpi != NULL &&
+ input_rel->gpi->partial_pathlist != NIL &&
+ !parse->havingQual)
+ {
+ Path *path = (Path *) linitial(input_rel->gpi->partial_pathlist);
+ double total_groups = path->rows * path->parallel_workers;
+
+ path = (Path *) create_gather_path(root,
+ input_rel,
+ path,
+ path->pathtarget,
+ NULL,
+ &total_groups);
+
+ /*
+ * The input path is partially aggregated and the final
+ * aggregation - if the path wins - will be done below. So we're
+ * done with it for now.
+ *
+ * The top-level grouped_rel needs to receive the path into its
+ * regular pathlist, as opposed to grouped_rel->gpi->pathlist.
+ */
+ add_path(input_rel, path, false);
+ }
+
+ /*
+ * If input_rel has partially aggregated paths, perform the final
+ * aggregation.
+ *
+ * TODO Allow havingQual - currently not supported at base relation
+ * level.
+ */
+ if (input_rel->gpi != NULL && input_rel->gpi->pathlist != NIL &&
+ !parse->havingQual)
+ {
+ Path *pre_agg = (Path *) linitial(input_rel->gpi->pathlist);
+
+ dNumGroups = get_number_of_groups(root, pre_agg->rows, gd);
+
+ MemSet(&agg_final_costs, 0, sizeof(AggClauseCosts));
+ get_agg_clause_costs(root, (Node *) target->exprs,
+ AGGSPLIT_FINAL_DESERIAL,
+ &agg_final_costs);
+ get_agg_clause_costs(root, parse->havingQual,
+ AGGSPLIT_FINAL_DESERIAL,
+ &agg_final_costs);
+
+ add_path(grouped_rel,
+ (Path *) create_agg_path(root, grouped_rel,
+ pre_agg,
+ target,
+ AGG_HASHED,
+ AGGSPLIT_FINAL_DESERIAL,
+ parse->groupClause,
+ (List *) parse->havingQual,
+ &agg_final_costs,
+ dNumGroups),
+ false);
+ }
}
/* Give a helpful error if we failed to find any implementation */
*************** consider_groupingsets_paths(PlannerInfo
*** 4289,4295 ****
strat,
new_rollups,
agg_costs,
! dNumGroups));
return;
}
--- 4406,4412 ----
strat,
new_rollups,
agg_costs,
! dNumGroups), false);
return;
}
*************** consider_groupingsets_paths(PlannerInfo
*** 4447,4453 ****
AGG_MIXED,
rollups,
agg_costs,
! dNumGroups));
}
}
--- 4564,4570 ----
AGG_MIXED,
rollups,
agg_costs,
! dNumGroups), false);
}
}
*************** consider_groupingsets_paths(PlannerInfo
*** 4464,4470 ****
AGG_SORTED,
gd->rollups,
agg_costs,
! dNumGroups));
}
/*
--- 4581,4587 ----
AGG_SORTED,
gd->rollups,
agg_costs,
! dNumGroups), false);
}
/*
*************** create_one_window_path(PlannerInfo *root
*** 4649,4655 ****
window_pathkeys);
}
! add_path(window_rel, path);
}
/*
--- 4766,4772 ----
window_pathkeys);
}
! add_path(window_rel, path, false);
}
/*
*************** create_distinct_paths(PlannerInfo *root,
*** 4755,4761 ****
create_upper_unique_path(root, distinct_rel,
path,
list_length(root->distinct_pathkeys),
! numDistinctRows));
}
}
--- 4872,4878 ----
create_upper_unique_path(root, distinct_rel,
path,
list_length(root->distinct_pathkeys),
! numDistinctRows), false);
}
}
*************** create_distinct_paths(PlannerInfo *root,
*** 4782,4788 ****
create_upper_unique_path(root, distinct_rel,
path,
list_length(root->distinct_pathkeys),
! numDistinctRows));
}
/*
--- 4899,4905 ----
create_upper_unique_path(root, distinct_rel,
path,
list_length(root->distinct_pathkeys),
! numDistinctRows), false);
}
/*
*************** create_distinct_paths(PlannerInfo *root,
*** 4829,4835 ****
parse->distinctClause,
NIL,
NULL,
! numDistinctRows));
}
/* Give a helpful error if we failed to find any implementation */
--- 4946,4952 ----
parse->distinctClause,
NIL,
NULL,
! numDistinctRows), false);
}
/* Give a helpful error if we failed to find any implementation */
*************** create_ordered_paths(PlannerInfo *root,
*** 4927,4933 ****
path = apply_projection_to_path(root, ordered_rel,
path, target);
! add_path(ordered_rel, path);
}
}
--- 5044,5050 ----
path = apply_projection_to_path(root, ordered_rel,
path, target);
! add_path(ordered_rel, path, false);
}
}
*************** create_ordered_paths(PlannerInfo *root,
*** 4977,4983 ****
path = apply_projection_to_path(root, ordered_rel,
path, target);
! add_path(ordered_rel, path);
}
}
--- 5094,5100 ----
path = apply_projection_to_path(root, ordered_rel,
path, target);
! add_path(ordered_rel, path, false);
}
}
*************** get_partitioned_child_rels(PlannerInfo *
*** 6083,6085 ****
--- 6200,6230 ----
return result;
}
+
+ /*
+ * get_partitioned_child_rels_for_join
+ * Build and return a list containing the RTI of every partitioned
+ * relation which is a child of some rel included in the join.
+ *
+ * Note: Only call this function on joins between partitioned tables.
+ */
+ List *
+ get_partitioned_child_rels_for_join(PlannerInfo *root,
+ RelOptInfo *joinrel)
+ {
+ List *result = NIL;
+ ListCell *l;
+
+ foreach(l, root->pcinfo_list)
+ {
+ PartitionedChildRelInfo *pc = lfirst(l);
+
+ if (bms_is_member(pc->parent_relid, joinrel->relids))
+ result = list_concat(result, list_copy(pc->child_rels));
+ }
+
+ /* The root partitioned table is included as a child rel */
+ Assert(list_length(result) >= bms_num_members(joinrel->relids));
+
+ return result;
+ }
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
new file mode 100644
index 1278371..44c3919
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
*************** typedef struct
*** 40,46 ****
--- 40,50 ----
List *tlist; /* underlying target list */
int num_vars; /* number of plain Var tlist entries */
bool has_ph_vars; /* are there PlaceHolderVar entries? */
+ bool has_grp_vars; /* are there GroupedVar entries? */
bool has_non_vars; /* are there other entries? */
+ bool has_conv_whole_rows; /* are there ConvertRowtypeExpr entries
+ * encapsulating a whole-row Var?
+ */
tlist_vinfo vars[FLEXIBLE_ARRAY_MEMBER]; /* has num_vars entries */
} indexed_tlist;
*************** static List *set_returning_clause_refere
*** 139,144 ****
--- 143,149 ----
int rtoffset);
static bool extract_query_dependencies_walker(Node *node,
PlannerInfo *context);
+ static Var *get_wholerow_ref_from_convert_row_type(Node *node);
/*****************************************************************************
*
*************** set_upper_references(PlannerInfo *root,
*** 1725,1733 ****
--- 1730,1781 ----
indexed_tlist *subplan_itlist;
List *output_targetlist;
ListCell *l;
+ List *sub_tlist_save = NIL;
+
+ if (root->grouped_var_list != NIL)
+ {
+ if (IsA(plan, Agg))
+ {
+ Agg *agg = (Agg *) plan;
+
+ if (agg->aggsplit == AGGSPLIT_FINAL_DESERIAL)
+ {
+ /*
+ * convert_combining_aggrefs could have replaced some vars
+ * with Aggref expressions representing the partial
+ * aggregation. We need to restore the same Aggrefs in the
+ * subplan targetlist, but this would break the subplan if
+ * it's something other than the partial aggregation (i.e. the
+ * partial aggregation takes place lower in the plan tree). So
+ * we'll eventually need to restore the original list.
+ */
+ if (!IsA(subplan, Agg))
+ sub_tlist_save = subplan->targetlist;
+ #ifdef USE_ASSERT_CHECKING
+ else
+ Assert(((Agg *) subplan)->aggsplit == AGGSPLIT_INITIAL_SERIAL);
+ #endif /* USE_ASSERT_CHECKING */
+
+ /*
+ * Restore the aggregate expressions that we might have
+ * removed when planning for aggregation at base relation
+ * level.
+ */
+ subplan->targetlist =
+ restore_grouping_expressions(root, subplan->targetlist);
+ }
+ }
+ }
subplan_itlist = build_tlist_index(subplan->targetlist);
+ /*
+ * The replacement of GroupedVars by Aggrefs was only needed for the index
+ * build.
+ */
+ if (sub_tlist_save != NIL)
+ subplan->targetlist = sub_tlist_save;
+
output_targetlist = NIL;
foreach(l, plan->targetlist)
{
*************** build_tlist_index(List *tlist)
*** 1937,1943 ****
--- 1985,1993 ----
itlist->tlist = tlist;
itlist->has_ph_vars = false;
+ itlist->has_grp_vars = false;
itlist->has_non_vars = false;
+ itlist->has_conv_whole_rows = false;
/* Find the Vars and fill in the index array */
vinfo = itlist->vars;
*************** build_tlist_index(List *tlist)
*** 1956,1961 ****
--- 2006,2015 ----
}
else if (tle->expr && IsA(tle->expr, PlaceHolderVar))
itlist->has_ph_vars = true;
+ else if (tle->expr && IsA(tle->expr, GroupedVar))
+ itlist->has_grp_vars = true;
+ else if (get_wholerow_ref_from_convert_row_type((Node *) tle->expr))
+ itlist->has_conv_whole_rows = true;
else
itlist->has_non_vars = true;
}
*************** build_tlist_index(List *tlist)
*** 1971,1977 ****
* This is like build_tlist_index, but we only index tlist entries that
* are Vars belonging to some rel other than the one specified. We will set
* has_ph_vars (allowing PlaceHolderVars to be matched), but not has_non_vars
! * (so nothing other than Vars and PlaceHolderVars can be matched).
*/
static indexed_tlist *
build_tlist_index_other_vars(List *tlist, Index ignore_rel)
--- 2025,2034 ----
* This is like build_tlist_index, but we only index tlist entries that
* are Vars belonging to some rel other than the one specified. We will set
* has_ph_vars (allowing PlaceHolderVars to be matched), but not has_non_vars
! * (so nothing other than Vars and PlaceHolderVars can be matched). In case of
! * DML, where this function will be used, returning lists from child relations
! * will be appended similar to a simple append relation. That does not require
! * fixing ConvertRowtypeExpr references. So, those are not considered here.
*/
static indexed_tlist *
build_tlist_index_other_vars(List *tlist, Index ignore_rel)
*************** build_tlist_index_other_vars(List *tlist
*** 1988,1993 ****
--- 2045,2051 ----
itlist->tlist = tlist;
itlist->has_ph_vars = false;
itlist->has_non_vars = false;
+ itlist->has_conv_whole_rows = false;
/* Find the desired Vars and fill in the index array */
vinfo = itlist->vars;
*************** fix_join_expr_mutator(Node *node, fix_jo
*** 2233,2238 ****
--- 2291,2321 ----
/* No referent found for Var */
elog(ERROR, "variable not found in subplan target lists");
}
+ if (IsA(node, GroupedVar))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+
+ /* See if the GroupedVar has bubbled up from a lower plan node */
+ if (context->outer_itlist && context->outer_itlist->has_grp_vars)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) gvar,
+ context->outer_itlist,
+ OUTER_VAR);
+ if (newvar)
+ return (Node *) newvar;
+ }
+ if (context->inner_itlist && context->inner_itlist->has_grp_vars)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) gvar,
+ context->inner_itlist,
+ INNER_VAR);
+ if (newvar)
+ return (Node *) newvar;
+ }
+
+ /* No referent found for GroupedVar */
+ elog(ERROR, "grouped variable not found in subplan target lists");
+ }
if (IsA(node, PlaceHolderVar))
{
PlaceHolderVar *phv = (PlaceHolderVar *) node;
*************** fix_join_expr_mutator(Node *node, fix_jo
*** 2258,2263 ****
--- 2341,2369 ----
/* If not supplied by input plans, evaluate the contained expr */
return fix_join_expr_mutator((Node *) phv->phexpr, context);
}
+ if (get_wholerow_ref_from_convert_row_type(node))
+ {
+ if (context->outer_itlist &&
+ context->outer_itlist->has_conv_whole_rows)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) node,
+ context->outer_itlist,
+ OUTER_VAR);
+
+ if (newvar)
+ return (Node *) newvar;
+ }
+ if (context->inner_itlist &&
+ context->inner_itlist->has_conv_whole_rows)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) node,
+ context->inner_itlist,
+ INNER_VAR);
+
+ if (newvar)
+ return (Node *) newvar;
+ }
+ }
if (IsA(node, Param))
return fix_param_node(context->root, (Param *) node);
/* Try matching more complex expressions too, if tlists have any */
*************** fix_upper_expr_mutator(Node *node, fix_u
*** 2364,2369 ****
--- 2470,2486 ----
/* If not supplied by input plan, evaluate the contained expr */
return fix_upper_expr_mutator((Node *) phv->phexpr, context);
}
+ if (get_wholerow_ref_from_convert_row_type(node))
+ {
+ if (context->subplan_itlist->has_conv_whole_rows)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) node,
+ context->subplan_itlist,
+ context->newvarno);
+ if (newvar)
+ return (Node *) newvar;
+ }
+ }
if (IsA(node, Param))
return fix_param_node(context->root, (Param *) node);
if (IsA(node, Aggref))
*************** fix_upper_expr_mutator(Node *node, fix_u
*** 2389,2395 ****
/* If no match, just fall through to process it normally */
}
/* Try matching more complex expressions too, if tlist has any */
! if (context->subplan_itlist->has_non_vars)
{
newvar = search_indexed_tlist_for_non_var((Expr *) node,
context->subplan_itlist,
--- 2506,2513 ----
/* If no match, just fall through to process it normally */
}
/* Try matching more complex expressions too, if tlist has any */
! if (context->subplan_itlist->has_grp_vars ||
! context->subplan_itlist->has_non_vars)
{
newvar = search_indexed_tlist_for_non_var((Expr *) node,
context->subplan_itlist,
*************** extract_query_dependencies_walker(Node *
*** 2596,2598 ****
--- 2714,2748 ----
return expression_tree_walker(node, extract_query_dependencies_walker,
(void *) context);
}
+
+ /*
+ * get_wholerow_ref_from_convert_row_type
+ * Given a node, check if it's a ConvertRowtypeExpr encapsulating a
+ * whole-row reference as an implicit cast and return the whole-row
+ * reference Var if so. Otherwise return NULL. In case of multi-level
+ * partitioning, we will have as many nested ConvertRowtypeExprs as there
+ * are levels in the partition hierarchy.
+ */
+ static Var *
+ get_wholerow_ref_from_convert_row_type(Node *node)
+ {
+ Var *var = NULL;
+ ConvertRowtypeExpr *convexpr;
+
+ if (!node || !IsA(node, ConvertRowtypeExpr))
+ return NULL;
+
+ /* Traverse nested ConvertRowtypeExpr's. */
+ convexpr = castNode(ConvertRowtypeExpr, node);
+ while (convexpr->convertformat == COERCE_IMPLICIT_CAST &&
+ IsA(convexpr->arg, ConvertRowtypeExpr))
+ convexpr = (ConvertRowtypeExpr *) convexpr->arg;
+
+ if (IsA(convexpr->arg, Var))
+ var = castNode(Var, convexpr->arg);
+
+ if (var && var->varattno == 0)
+ return var;
+
+ return NULL;
+ }
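The unwrapping done by get_wholerow_ref_from_convert_row_type above can be tried outside the planner with a minimal sketch. Everything below is an assumption made for illustration: the Toy* types and toy_wholerow_ref are hypothetical stand-ins for Node, Var and ConvertRowtypeExpr, not PostgreSQL definitions. The sketch only mirrors the control flow: peel off nested implicit conversions, then accept the innermost argument only if it is a Var with attribute number 0 (a whole-row reference).

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for PostgreSQL node types. */
typedef enum { T_ToyVar, T_ToyConvertRowtype } ToyTag;

typedef struct ToyNode
{
	ToyTag		tag;
} ToyNode;

typedef struct ToyVar
{
	ToyNode		node;		/* must be first, so casts from ToyNode work */
	int			varattno;	/* 0 denotes a whole-row reference */
} ToyVar;

typedef struct ToyConvertRowtype
{
	ToyNode		node;		/* must be first, so casts from ToyNode work */
	ToyNode    *arg;		/* expression being converted */
	int			implicit;	/* nonzero if this is an implicit cast */
} ToyConvertRowtype;

/*
 * Return the whole-row Var wrapped by one or more nested implicit
 * conversions, or NULL if the node does not have that shape.
 */
ToyVar *
toy_wholerow_ref(ToyNode *node)
{
	ToyConvertRowtype *conv;

	if (node == NULL || node->tag != T_ToyConvertRowtype)
		return NULL;

	/* One conversion per partition level: walk down to the innermost. */
	conv = (ToyConvertRowtype *) node;
	while (conv->implicit && conv->arg != NULL &&
		   conv->arg->tag == T_ToyConvertRowtype)
		conv = (ToyConvertRowtype *) conv->arg;

	if (conv->arg != NULL && conv->arg->tag == T_ToyVar)
	{
		ToyVar	   *var = (ToyVar *) conv->arg;

		if (var->varattno == 0)
			return var;
	}

	return NULL;
}
```

With a two-level hierarchy, the expression is a conversion wrapping another conversion wrapping the whole-row Var, and the function returns that Var. A conversion wrapping an ordinary column Var (nonzero attribute number) yields NULL, which is why build_tlist_index only sets has_conv_whole_rows for the whole-row case.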
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
new file mode 100644
index a1be858..8bdaa44
*** a/src/backend/optimizer/prep/prepunion.c
--- b/src/backend/optimizer/prep/prepunion.c
***************
*** 55,61 ****
typedef struct
{
PlannerInfo *root;
! AppendRelInfo *appinfo;
} adjust_appendrel_attrs_context;
static Path *recurse_set_operations(Node *setOp, PlannerInfo *root,
--- 55,62 ----
typedef struct
{
PlannerInfo *root;
! int nappinfos;
! AppendRelInfo **appinfos;
} adjust_appendrel_attrs_context;
static Path *recurse_set_operations(Node *setOp, PlannerInfo *root,
*************** static List *generate_append_tlist(List
*** 97,103 ****
List *input_tlists,
List *refnames_tlist);
static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
! static void expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte,
Index rti);
static void make_inh_translation_list(Relation oldrelation,
Relation newrelation,
--- 98,104 ----
List *input_tlists,
List *refnames_tlist);
static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
! static List *expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte,
Index rti);
static void make_inh_translation_list(Relation oldrelation,
Relation newrelation,
*************** static Bitmapset *translate_col_privs(co
*** 107,113 ****
List *translated_vars);
static Node *adjust_appendrel_attrs_mutator(Node *node,
adjust_appendrel_attrs_context *context);
- static Relids adjust_relid_set(Relids relids, Index oldrelid, Index newrelid);
static List *adjust_inherited_tlist(List *tlist,
AppendRelInfo *context);
--- 108,113 ----
*************** plan_set_operations(PlannerInfo *root)
*** 207,213 ****
root->processed_tlist = top_tlist;
/* Add only the final path to the SETOP upperrel. */
! add_path(setop_rel, path);
/* Let extensions possibly add some more paths */
if (create_upper_paths_hook)
--- 207,213 ----
root->processed_tlist = top_tlist;
/* Add only the final path to the SETOP upperrel. */
! add_path(setop_rel, path, false);
/* Let extensions possibly add some more paths */
if (create_upper_paths_hook)
*************** expand_inherited_tables(PlannerInfo *roo
*** 1330,1348 ****
Index nrtes;
Index rti;
ListCell *rl;
/*
* expand_inherited_rtentry may add RTEs to parse->rtable; there is no
* need to scan them since they can't have inh=true. So just scan as far
* as the original end of the rtable list.
*/
! nrtes = list_length(root->parse->rtable);
! rl = list_head(root->parse->rtable);
for (rti = 1; rti <= nrtes; rti++)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl);
! expand_inherited_rtentry(root, rte, rti);
rl = lnext(rl);
}
}
--- 1330,1351 ----
Index nrtes;
Index rti;
ListCell *rl;
+ Query *parse = root->parse;
/*
* expand_inherited_rtentry may add RTEs to parse->rtable; there is no
* need to scan them since they can't have inh=true. So just scan as far
* as the original end of the rtable list.
*/
! nrtes = list_length(parse->rtable);
! rl = list_head(parse->rtable);
for (rti = 1; rti <= nrtes; rti++)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl);
+ List *appinfos;
! appinfos = expand_inherited_rtentry(root, rte, rti);
! root->append_rel_list = list_concat(root->append_rel_list, appinfos);
rl = lnext(rl);
}
}
*************** expand_inherited_tables(PlannerInfo *roo
*** 1362,1369 ****
*
* A childless table is never considered to be an inheritance set; therefore
* a parent RTE must always have at least two associated AppendRelInfos.
*/
! static void
expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
{
Query *parse = root->parse;
--- 1365,1374 ----
*
* A childless table is never considered to be an inheritance set; therefore
* a parent RTE must always have at least two associated AppendRelInfos.
+ *
+ * Returns a list of AppendRelInfos, or NIL.
*/
! static List *
expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
{
Query *parse = root->parse;
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1380,1391 ****
/* Does RT entry allow inheritance? */
if (!rte->inh)
! return;
/* Ignore any already-expanded UNION ALL nodes */
if (rte->rtekind != RTE_RELATION)
{
Assert(rte->rtekind == RTE_SUBQUERY);
! return;
}
/* Fast path for common case of childless table */
parentOID = rte->relid;
--- 1385,1396 ----
/* Does RT entry allow inheritance? */
if (!rte->inh)
! return NIL;
/* Ignore any already-expanded UNION ALL nodes */
if (rte->rtekind != RTE_RELATION)
{
Assert(rte->rtekind == RTE_SUBQUERY);
! return NIL;
}
/* Fast path for common case of childless table */
parentOID = rte->relid;
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1393,1399 ****
{
/* Clear flag before returning */
rte->inh = false;
! return;
}
/*
--- 1398,1404 ----
{
/* Clear flag before returning */
rte->inh = false;
! return NIL;
}
/*
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1417,1424 ****
else
lockmode = AccessShareLock;
! /* Scan for all members of inheritance set, acquire needed locks */
! inhOIDs = find_all_inheritors(parentOID, lockmode, NULL);
/*
* Check that there's at least one descendant, else treat as no-child
--- 1422,1440 ----
else
lockmode = AccessShareLock;
! /*
! * Expand partitioned table level-wise to help optimizations like
! * partition-wise join which match partitions at every level. Otherwise,
! * scan for all members of the inheritance set. Acquire needed locks.
! */
! if (rte->relkind == RELKIND_PARTITIONED_TABLE)
! {
! inhOIDs = list_make1_oid(parentOID);
! inhOIDs = list_concat(inhOIDs,
! find_inheritance_children(parentOID, lockmode));
! }
! else
! inhOIDs = find_all_inheritors(parentOID, lockmode, NULL);
/*
* Check that there's at least one descendant, else treat as no-child
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1429,1435 ****
{
/* Clear flag before returning */
rte->inh = false;
! return;
}
/*
--- 1445,1451 ----
{
/* Clear flag before returning */
rte->inh = false;
! return NIL;
}
/*
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1457,1462 ****
--- 1473,1484 ----
Index childRTindex;
AppendRelInfo *appinfo;
+ /*
+ * If this child is a partitioned table, this contains AppendRelInfos
+ * for its own children.
+ */
+ List *myappinfos;
+
/* Open rel if needed; we already have required locks */
if (childOID != parentOID)
newrelation = heap_open(childOID, NoLock);
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1490,1496 ****
childrte = copyObject(rte);
childrte->relid = childOID;
childrte->relkind = newrelation->rd_rel->relkind;
! childrte->inh = false;
childrte->requiredPerms = 0;
childrte->securityQuals = NIL;
parse->rtable = lappend(parse->rtable, childrte);
--- 1512,1523 ----
childrte = copyObject(rte);
childrte->relid = childOID;
childrte->relkind = newrelation->rd_rel->relkind;
! /* A partitioned child will need to be expanded further. */
! if (childOID != parentOID &&
! childrte->relkind == RELKIND_PARTITIONED_TABLE)
! childrte->inh = true;
! else
! childrte->inh = false;
childrte->requiredPerms = 0;
childrte->securityQuals = NIL;
parse->rtable = lappend(parse->rtable, childrte);
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1498,1506 ****
/*
* Build an AppendRelInfo for this parent and child, unless the child
! * is a partitioned table.
*/
! if (childrte->relkind != RELKIND_PARTITIONED_TABLE)
{
need_append = true;
appinfo = makeNode(AppendRelInfo);
--- 1525,1533 ----
/*
* Build an AppendRelInfo for this parent and child, unless the child
! * RTE simply duplicates the parent *partitioned* table.
*/
! if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)
{
need_append = true;
appinfo = makeNode(AppendRelInfo);
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1570,1575 ****
--- 1597,1610 ----
/* Close child relations, but keep locks */
if (childOID != parentOID)
heap_close(newrelation, NoLock);
+
+ /* Expand partitioned children recursively. */
+ if (childrte->inh)
+ {
+ myappinfos = expand_inherited_rtentry(root, childrte,
+ childRTindex);
+ appinfos = list_concat(appinfos, myappinfos);
+ }
}
heap_close(oldrelation, NoLock);
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1585,1591 ****
{
/* Clear flag before returning */
rte->inh = false;
! return;
}
/*
--- 1620,1626 ----
{
/* Clear flag before returning */
rte->inh = false;
! return NIL;
}
/*
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1606,1613 ****
root->pcinfo_list = lappend(root->pcinfo_list, pcinfo);
}
! /* Otherwise, OK to add to root->append_rel_list */
! root->append_rel_list = list_concat(root->append_rel_list, appinfos);
}
/*
--- 1641,1648 ----
root->pcinfo_list = lappend(root->pcinfo_list, pcinfo);
}
! /* The following will be concatenated to root->append_rel_list. */
! return appinfos;
}
/*
*************** translate_col_privs(const Bitmapset *par
*** 1767,1776 ****
/*
* adjust_appendrel_attrs
! * Copy the specified query or expression and translate Vars referring
! * to the parent rel of the specified AppendRelInfo to refer to the
! * child rel instead. We also update rtindexes appearing outside Vars,
! * such as resultRelation and jointree relids.
*
* Note: this is only applied after conversion of sublinks to subplans,
* so we don't need to cope with recursion into sub-queries.
--- 1802,1812 ----
/*
* adjust_appendrel_attrs
! * Copy the specified query or expression and translate Vars referring to
! * the parent rels of the child rels specified in the given list of
! * AppendRelInfos to refer to the corresponding child rel instead. We also
! * update rtindexes appearing outside Vars, such as resultRelation and
! * jointree relids.
*
* Note: this is only applied after conversion of sublinks to subplans,
* so we don't need to cope with recursion into sub-queries.
*************** translate_col_privs(const Bitmapset *par
*** 1779,1791 ****
* maybe we should try to fold the two routines together.
*/
Node *
! adjust_appendrel_attrs(PlannerInfo *root, Node *node, AppendRelInfo *appinfo)
{
Node *result;
adjust_appendrel_attrs_context context;
context.root = root;
! context.appinfo = appinfo;
/*
* Must be prepared to start with a Query or a bare expression tree.
--- 1815,1835 ----
* maybe we should try to fold the two routines together.
*/
Node *
! adjust_appendrel_attrs(PlannerInfo *root, Node *node, int nappinfos,
! AppendRelInfo **appinfos)
{
Node *result;
adjust_appendrel_attrs_context context;
context.root = root;
! context.nappinfos = nappinfos;
! context.appinfos = appinfos;
!
! /*
! * Catch a caller who wants to adjust expressions, but doesn't pass any
! * AppendRelInfo.
! */
! Assert(appinfos && nappinfos >= 1);
/*
* Must be prepared to start with a Query or a bare expression tree.
*************** adjust_appendrel_attrs(PlannerInfo *root
*** 1793,1812 ****
if (node && IsA(node, Query))
{
Query *newnode;
newnode = query_tree_mutator((Query *) node,
adjust_appendrel_attrs_mutator,
(void *) &context,
QTW_IGNORE_RC_SUBQUERIES);
! if (newnode->resultRelation == appinfo->parent_relid)
{
! newnode->resultRelation = appinfo->child_relid;
! /* Fix tlist resnos too, if it's inherited UPDATE */
! if (newnode->commandType == CMD_UPDATE)
! newnode->targetList =
! adjust_inherited_tlist(newnode->targetList,
! appinfo);
}
result = (Node *) newnode;
}
else
--- 1837,1864 ----
if (node && IsA(node, Query))
{
Query *newnode;
+ int cnt;
newnode = query_tree_mutator((Query *) node,
adjust_appendrel_attrs_mutator,
(void *) &context,
QTW_IGNORE_RC_SUBQUERIES);
! for (cnt = 0; cnt < nappinfos; cnt++)
{
! AppendRelInfo *appinfo = appinfos[cnt];
!
! if (newnode->resultRelation == appinfo->parent_relid)
! {
! newnode->resultRelation = appinfo->child_relid;
! /* Fix tlist resnos too, if it's inherited UPDATE */
! if (newnode->commandType == CMD_UPDATE)
! newnode->targetList =
! adjust_inherited_tlist(newnode->targetList,
! appinfo);
! break;
! }
}
+
result = (Node *) newnode;
}
else
*************** static Node *
*** 1819,1831 ****
adjust_appendrel_attrs_mutator(Node *node,
adjust_appendrel_attrs_context *context)
{
! AppendRelInfo *appinfo = context->appinfo;
if (node == NULL)
return NULL;
if (IsA(node, Var))
{
Var *var = (Var *) copyObject(node);
if (var->varlevelsup == 0 &&
var->varno == appinfo->parent_relid)
--- 1871,1900 ----
adjust_appendrel_attrs_mutator(Node *node,
adjust_appendrel_attrs_context *context)
{
! AppendRelInfo **appinfos = context->appinfos;
! int nappinfos = context->nappinfos;
! int cnt;
!
! /*
! * Catch a caller who wants to adjust expressions, but doesn't pass any
! * AppendRelInfo.
! */
! Assert(appinfos && nappinfos >= 1);
if (node == NULL)
return NULL;
if (IsA(node, Var))
{
Var *var = (Var *) copyObject(node);
+ AppendRelInfo *appinfo;
+
+ for (cnt = 0; cnt < nappinfos; cnt++)
+ {
+ appinfo = appinfos[cnt];
+
+ if (var->varno == appinfo->parent_relid)
+ break;
+ }
if (var->varlevelsup == 0 &&
var->varno == appinfo->parent_relid)
*************** adjust_appendrel_attrs_mutator(Node *nod
*** 1908,1936 ****
{
CurrentOfExpr *cexpr = (CurrentOfExpr *) copyObject(node);
! if (cexpr->cvarno == appinfo->parent_relid)
! cexpr->cvarno = appinfo->child_relid;
return (Node *) cexpr;
}
if (IsA(node, RangeTblRef))
{
RangeTblRef *rtr = (RangeTblRef *) copyObject(node);
! if (rtr->rtindex == appinfo->parent_relid)
! rtr->rtindex = appinfo->child_relid;
return (Node *) rtr;
}
if (IsA(node, JoinExpr))
{
/* Copy the JoinExpr node with correct mutation of subnodes */
JoinExpr *j;
j = (JoinExpr *) expression_tree_mutator(node,
adjust_appendrel_attrs_mutator,
(void *) context);
/* now fix JoinExpr's rtindex (probably never happens) */
! if (j->rtindex == appinfo->parent_relid)
! j->rtindex = appinfo->child_relid;
return (Node *) j;
}
if (IsA(node, PlaceHolderVar))
--- 1977,2030 ----
{
CurrentOfExpr *cexpr = (CurrentOfExpr *) copyObject(node);
! for (cnt = 0; cnt < nappinfos; cnt++)
! {
! AppendRelInfo *appinfo = appinfos[cnt];
!
! if (cexpr->cvarno == appinfo->parent_relid)
! {
! cexpr->cvarno = appinfo->child_relid;
! break;
! }
! }
return (Node *) cexpr;
}
if (IsA(node, RangeTblRef))
{
RangeTblRef *rtr = (RangeTblRef *) copyObject(node);
! for (cnt = 0; cnt < nappinfos; cnt++)
! {
! AppendRelInfo *appinfo = appinfos[cnt];
!
! if (rtr->rtindex == appinfo->parent_relid)
! {
! rtr->rtindex = appinfo->child_relid;
! break;
! }
! }
return (Node *) rtr;
}
if (IsA(node, JoinExpr))
{
/* Copy the JoinExpr node with correct mutation of subnodes */
JoinExpr *j;
+ AppendRelInfo *appinfo;
j = (JoinExpr *) expression_tree_mutator(node,
adjust_appendrel_attrs_mutator,
(void *) context);
/* now fix JoinExpr's rtindex (probably never happens) */
! for (cnt = 0; cnt < nappinfos; cnt++)
! {
! appinfo = appinfos[cnt];
!
! if (j->rtindex == appinfo->parent_relid)
! {
! j->rtindex = appinfo->child_relid;
! break;
! }
! }
return (Node *) j;
}
if (IsA(node, PlaceHolderVar))
*************** adjust_appendrel_attrs_mutator(Node *nod
*** 1943,1951 ****
(void *) context);
/* now fix PlaceHolderVar's relid sets */
if (phv->phlevelsup == 0)
! phv->phrels = adjust_relid_set(phv->phrels,
! appinfo->parent_relid,
! appinfo->child_relid);
return (Node *) phv;
}
/* Shouldn't need to handle planner auxiliary nodes here */
--- 2037,2044 ----
(void *) context);
/* now fix PlaceHolderVar's relid sets */
if (phv->phlevelsup == 0)
! phv->phrels = adjust_child_relids(phv->phrels, context->nappinfos,
! context->appinfos);
return (Node *) phv;
}
/* Shouldn't need to handle planner auxiliary nodes here */
*************** adjust_appendrel_attrs_mutator(Node *nod
*** 1976,1999 ****
adjust_appendrel_attrs_mutator((Node *) oldinfo->orclause, context);
/* adjust relid sets too */
! newinfo->clause_relids = adjust_relid_set(oldinfo->clause_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
! newinfo->required_relids = adjust_relid_set(oldinfo->required_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
! newinfo->outer_relids = adjust_relid_set(oldinfo->outer_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
! newinfo->nullable_relids = adjust_relid_set(oldinfo->nullable_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
! newinfo->left_relids = adjust_relid_set(oldinfo->left_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
! newinfo->right_relids = adjust_relid_set(oldinfo->right_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
/*
* Reset cached derivative fields, since these might need to have
--- 2069,2092 ----
adjust_appendrel_attrs_mutator((Node *) oldinfo->orclause, context);
/* adjust relid sets too */
! newinfo->clause_relids = adjust_child_relids(oldinfo->clause_relids,
! context->nappinfos,
! context->appinfos);
! newinfo->required_relids = adjust_child_relids(oldinfo->required_relids,
! context->nappinfos,
! context->appinfos);
! newinfo->outer_relids = adjust_child_relids(oldinfo->outer_relids,
! context->nappinfos,
! context->appinfos);
! newinfo->nullable_relids = adjust_child_relids(oldinfo->nullable_relids,
! context->nappinfos,
! context->appinfos);
! newinfo->left_relids = adjust_child_relids(oldinfo->left_relids,
! context->nappinfos,
! context->appinfos);
! newinfo->right_relids = adjust_child_relids(oldinfo->right_relids,
! context->nappinfos,
! context->appinfos);
/*
* Reset cached derivative fields, since these might need to have
*************** adjust_appendrel_attrs_mutator(Node *nod
*** 2025,2047 ****
}
/*
! * Substitute newrelid for oldrelid in a Relid set
*/
! static Relids
! adjust_relid_set(Relids relids, Index oldrelid, Index newrelid)
{
! if (bms_is_member(oldrelid, relids))
{
! /* Ensure we have a modifiable copy */
! relids = bms_copy(relids);
! /* Remove old, add new */
! relids = bms_del_member(relids, oldrelid);
! relids = bms_add_member(relids, newrelid);
}
return relids;
}
/*
* Adjust the targetlist entries of an inherited UPDATE operation
*
* The expressions have already been fixed, but we have to make sure that
--- 2118,2212 ----
}
/*
! * Replace parent relids by child relids in a copy of the given relid set
! * according to the given list of AppendRelInfos. If the set contains no
! * parent from that list, it is returned as is; otherwise a modified copy is
! * returned. The given relid set itself is never changed.
*/
! Relids
! adjust_child_relids(Relids relids, int nappinfos, AppendRelInfo **appinfos)
{
! Bitmapset *result = NULL;
! int cnt;
!
! for (cnt = 0; cnt < nappinfos; cnt++)
{
! AppendRelInfo *appinfo = appinfos[cnt];
!
! /* Remove parent, add child */
! if (bms_is_member(appinfo->parent_relid, relids))
! {
! /* Make a copy if we are changing the set. */
! if (!result)
! result = bms_copy(relids);
!
! result = bms_del_member(result, appinfo->parent_relid);
! result = bms_add_member(result, appinfo->child_relid);
! }
}
+
+ /* Return new set if we modified the given set. */
+ if (result)
+ return result;
+
+ /* Else return the given relids set as is. */
return relids;
}
/*
+ * Replace any relid present in top_parent_relids with its child in
+ * child_relids. Members of child_relids can be multiple levels below top
+ * parent in the partition hierarchy.
+ */
+ Relids
+ adjust_child_relids_multilevel(PlannerInfo *root, Relids relids,
+ Relids child_relids, Relids top_parent_relids)
+ {
+ AppendRelInfo **appinfos;
+ int nappinfos;
+ Relids parent_relids = NULL;
+ Relids result;
+ Relids tmp_result = NULL;
+ int cnt;
+
+ /*
+ * If the given relids set doesn't contain any of the top parent relids,
+ * it will remain unchanged.
+ */
+ if (!bms_overlap(relids, top_parent_relids))
+ return relids;
+
+ appinfos = find_appinfos_by_relids(root, child_relids, &nappinfos);
+
+ /* Construct relids set for the immediate parent of the given child. */
+ for (cnt = 0; cnt < nappinfos; cnt++)
+ {
+ AppendRelInfo *appinfo = appinfos[cnt];
+
+ parent_relids = bms_add_member(parent_relids, appinfo->parent_relid);
+ }
+
+ /* Recurse if immediate parent is not the top parent. */
+ if (!bms_equal(parent_relids, top_parent_relids))
+ {
+ tmp_result = adjust_child_relids_multilevel(root, relids,
+ parent_relids,
+ top_parent_relids);
+ relids = tmp_result;
+ }
+
+ result = adjust_child_relids(relids, nappinfos, appinfos);
+
+ /* Free memory consumed by any intermediate result. */
+ if (tmp_result)
+ bms_free(tmp_result);
+ bms_free(parent_relids);
+ pfree(appinfos);
+
+ return result;
+ }
+
+ /*
* Adjust the targetlist entries of an inherited UPDATE operation
*
* The expressions have already been fixed, but we have to make sure that
*************** adjust_inherited_tlist(List *tlist, Appe
*** 2142,2162 ****
* adjust_appendrel_attrs_multilevel
* Apply Var translations from a toplevel appendrel parent down to a child.
*
! * In some cases we need to translate expressions referencing a baserel
* to reference an appendrel child that's multiple levels removed from it.
*/
Node *
adjust_appendrel_attrs_multilevel(PlannerInfo *root, Node *node,
! RelOptInfo *child_rel)
{
! AppendRelInfo *appinfo = find_childrel_appendrelinfo(root, child_rel);
! RelOptInfo *parent_rel = find_base_rel(root, appinfo->parent_relid);
- /* If parent is also a child, first recurse to apply its translations */
- if (IS_OTHER_REL(parent_rel))
- node = adjust_appendrel_attrs_multilevel(root, node, parent_rel);
- else
- Assert(parent_rel->reloptkind == RELOPT_BASEREL);
/* Now translate for this child */
! return adjust_appendrel_attrs(root, node, appinfo);
}
--- 2307,2432 ----
* adjust_appendrel_attrs_multilevel
* Apply Var translations from a toplevel appendrel parent down to a child.
*
! * In some cases we need to translate expressions referencing a parent relation
* to reference an appendrel child that's multiple levels removed from it.
*/
Node *
adjust_appendrel_attrs_multilevel(PlannerInfo *root, Node *node,
! Relids child_relids,
! Relids top_parent_relids)
{
! AppendRelInfo **appinfos;
! Bitmapset *parent_relids = NULL;
! int nappinfos;
! int cnt;
!
! Assert(bms_num_members(child_relids) == bms_num_members(top_parent_relids));
!
! appinfos = find_appinfos_by_relids(root, child_relids, &nappinfos);
!
! /* Construct relids set for the immediate parent of given child. */
! for (cnt = 0; cnt < nappinfos; cnt++)
! {
! AppendRelInfo *appinfo = appinfos[cnt];
!
! parent_relids = bms_add_member(parent_relids, appinfo->parent_relid);
! }
!
! /* Recurse if immediate parent is not the top parent. */
! if (!bms_equal(parent_relids, top_parent_relids))
! node = adjust_appendrel_attrs_multilevel(root, node, parent_relids,
! top_parent_relids);
/* Now translate for this child */
! node = adjust_appendrel_attrs(root, node, nappinfos, appinfos);
!
! pfree(appinfos);
!
! return node;
! }
!
! /*
! * Construct the SpecialJoinInfo for a child-join by translating
! * SpecialJoinInfo for the join between parents. left_relids and right_relids
! * are the relids of left and right side of the join respectively.
! */
! SpecialJoinInfo *
! build_child_join_sjinfo(PlannerInfo *root, SpecialJoinInfo *parent_sjinfo,
! Relids left_relids, Relids right_relids)
! {
! SpecialJoinInfo *sjinfo = makeNode(SpecialJoinInfo);
! AppendRelInfo **left_appinfos;
! int left_nappinfos;
! AppendRelInfo **right_appinfos;
! int right_nappinfos;
!
! memcpy(sjinfo, parent_sjinfo, sizeof(SpecialJoinInfo));
! left_appinfos = find_appinfos_by_relids(root, left_relids,
! &left_nappinfos);
! right_appinfos = find_appinfos_by_relids(root, right_relids,
! &right_nappinfos);
!
! sjinfo->min_lefthand = adjust_child_relids(sjinfo->min_lefthand,
! left_nappinfos, left_appinfos);
! sjinfo->min_righthand = adjust_child_relids(sjinfo->min_righthand,
! right_nappinfos,
! right_appinfos);
! sjinfo->syn_lefthand = adjust_child_relids(sjinfo->syn_lefthand,
! left_nappinfos, left_appinfos);
! sjinfo->syn_righthand = adjust_child_relids(sjinfo->syn_righthand,
! right_nappinfos,
! right_appinfos);
!
! /*
! * Replace the Var nodes of parent with those of children in expressions.
! * This function may be called within a temporary context, but the
! * expressions will be shallow-copied into the plan. Hence copy those in
! * the planner's context.
! */
! sjinfo->semi_rhs_exprs = (List *) adjust_appendrel_attrs(root,
! (Node *) sjinfo->semi_rhs_exprs,
! right_nappinfos,
! right_appinfos);
!
! pfree(left_appinfos);
! pfree(right_appinfos);
!
! return sjinfo;
! }
!
! /*
! * find_appinfos_by_relids
! * Find AppendRelInfo structures for all relations specified by relids.
! *
! * The AppendRelInfos are returned in an array, which can be pfree'd by the
! * caller.
! */
! AppendRelInfo **
! find_appinfos_by_relids(PlannerInfo *root, Relids relids, int *nappinfos)
! {
! ListCell *lc;
! AppendRelInfo **appinfos;
! int cnt = 0;
!
! *nappinfos = bms_num_members(relids);
! appinfos = (AppendRelInfo **) palloc(sizeof(AppendRelInfo *) * *nappinfos);
!
! foreach (lc, root->append_rel_list)
! {
! AppendRelInfo *appinfo = lfirst(lc);
!
! if (bms_is_member(appinfo->child_relid, relids))
! {
! appinfos[cnt] = appinfo;
! cnt++;
!
! /* Stop when we have gathered all the AppendRelInfos. */
! if (cnt == *nappinfos)
! return appinfos;
! }
! }
!
! /* Should have found the entries ... */
! elog(ERROR, "Did not find one or more of requested child rels in append_rel_list");
! return NULL; /* not reached */
}
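[Note between file diffs, not part of the patch.] The relid translation done by adjust_child_relids() and adjust_child_relids_multilevel() above can be sketched outside the planner. The following is only a toy model under stated assumptions: a relid set is a plain 64-bit mask instead of a Bitmapset, relids are assumed to be below 64, and every Toy* / toy_* name is hypothetical and does not exist in PostgreSQL. It walks from the given child up to the top parent, then substitutes parent relids with child relids one level at a time on the way back down.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for Relids: bit n set means relid n is a member. */
typedef uint64_t ToyRelids;

/* Toy stand-in for AppendRelInfo: one parent -> child link. */
typedef struct ToyAppendRelInfo
{
	int			parent_relid;
	int			child_relid;
} ToyAppendRelInfo;

#define TOY_BIT(relid) ((ToyRelids) 1 << (relid))
#define TOY_MAX_APPINFOS 64

/* Replace every parent relid present in relids with its child relid. */
static ToyRelids
toy_adjust_child_relids(ToyRelids relids,
						const ToyAppendRelInfo *appinfos, int nappinfos)
{
	ToyRelids	result = relids;
	int			i;

	for (i = 0; i < nappinfos; i++)
	{
		/* Membership is tested against the input set, as in the patch. */
		if (relids & TOY_BIT(appinfos[i].parent_relid))
		{
			result &= ~TOY_BIT(appinfos[i].parent_relid);
			result |= TOY_BIT(appinfos[i].child_relid);
		}
	}
	return result;
}

/*
 * Translate relids from the top parent down to child_relids, which may be
 * several levels below it. "all" plays the role of root->append_rel_list.
 */
static ToyRelids
toy_adjust_child_relids_multilevel(const ToyAppendRelInfo *all, int nall,
								   ToyRelids relids, ToyRelids child_relids,
								   ToyRelids top_parent_relids)
{
	ToyAppendRelInfo found[TOY_MAX_APPINFOS];
	int			nfound = 0;
	ToyRelids	parent_relids = 0;
	int			i;

	/* Nothing to do if no top parent relid is mentioned. */
	if ((relids & top_parent_relids) == 0)
		return relids;

	/* Collect the links for the given children and their immediate parents. */
	for (i = 0; i < nall; i++)
	{
		if (child_relids & TOY_BIT(all[i].child_relid))
		{
			assert(nfound < TOY_MAX_APPINFOS);
			found[nfound++] = all[i];
			parent_relids |= TOY_BIT(all[i].parent_relid);
		}
	}

	/* Every child must have a link, otherwise the recursion cannot end. */
	assert(nfound > 0);

	/* Recurse first if the immediate parent is not the top parent. */
	if (parent_relids != top_parent_relids)
		relids = toy_adjust_child_relids_multilevel(all, nall, relids,
													parent_relids,
													top_parent_relids);

	/* Now translate for this level. */
	return toy_adjust_child_relids(relids, found, nfound);
}
```

For example, with links 1 -> 2 and 2 -> 3, translating the set {1, 5} for child {3} under top parent {1} first rewrites it to {2, 5} and then to {3, 5}; a set that does not mention relid 1 is returned unchanged.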
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
new file mode 100644
index 2d5caae..2bacec9
*** a/src/backend/optimizer/util/pathnode.c
--- b/src/backend/optimizer/util/pathnode.c
***************
*** 18,32 ****
--- 18,39 ----
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
+ #include "nodes/extensible.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
+ #include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+ /* TODO Remove this if get_grouping_expressions ends up in another module. */
+ #include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "parser/parsetree.h"
+ #include "foreign/fdwapi.h"
#include "utils/lsyscache.h"
+ #include "utils/memutils.h"
#include "utils/selfuncs.h"
*************** set_cheapest(RelOptInfo *parent_rel)
*** 409,416 ****
* Returns nothing, but modifies parent_rel->pathlist.
*/
void
! add_path(RelOptInfo *parent_rel, Path *new_path)
{
bool accept_new = true; /* unless we find a superior old path */
ListCell *insert_after = NULL; /* where to insert new item */
List *new_path_pathkeys;
--- 416,424 ----
* Returns nothing, but modifies parent_rel->pathlist.
*/
void
! add_path(RelOptInfo *parent_rel, Path *new_path, bool grouped)
{
+ List *pathlist;
bool accept_new = true; /* unless we find a superior old path */
ListCell *insert_after = NULL; /* where to insert new item */
List *new_path_pathkeys;
*************** add_path(RelOptInfo *parent_rel, Path *n
*** 427,432 ****
--- 435,448 ----
/* Pretend parameterized paths have no pathkeys, per comment above */
new_path_pathkeys = new_path->param_info ? NIL : new_path->pathkeys;
+ if (!grouped)
+ pathlist = parent_rel->pathlist;
+ else
+ {
+ Assert(parent_rel->gpi != NULL);
+ pathlist = parent_rel->gpi->pathlist;
+ }
+
/*
* Loop to check proposed new path against old paths. Note it is possible
* for more than one old path to be tossed out because new_path dominates
*************** add_path(RelOptInfo *parent_rel, Path *n
*** 436,442 ****
* list cell.
*/
p1_prev = NULL;
! for (p1 = list_head(parent_rel->pathlist); p1 != NULL; p1 = p1_next)
{
Path *old_path = (Path *) lfirst(p1);
bool remove_old = false; /* unless new proves superior */
--- 452,458 ----
* list cell.
*/
p1_prev = NULL;
! for (p1 = list_head(pathlist); p1 != NULL; p1 = p1_next)
{
Path *old_path = (Path *) lfirst(p1);
bool remove_old = false; /* unless new proves superior */
*************** add_path(RelOptInfo *parent_rel, Path *n
*** 582,589 ****
*/
if (remove_old)
{
! parent_rel->pathlist = list_delete_cell(parent_rel->pathlist,
! p1, p1_prev);
/*
* Delete the data pointed-to by the deleted cell, if possible
--- 598,604 ----
*/
if (remove_old)
{
! pathlist = list_delete_cell(pathlist, p1, p1_prev);
/*
* Delete the data pointed-to by the deleted cell, if possible
*************** add_path(RelOptInfo *parent_rel, Path *n
*** 614,622 ****
{
/* Accept the new path: insert it at proper place in pathlist */
if (insert_after)
! lappend_cell(parent_rel->pathlist, insert_after, new_path);
else
! parent_rel->pathlist = lcons(new_path, parent_rel->pathlist);
}
else
{
--- 629,642 ----
{
/* Accept the new path: insert it at proper place in pathlist */
if (insert_after)
! lappend_cell(pathlist, insert_after, new_path);
else
! pathlist = lcons(new_path, pathlist);
!
! if (!grouped)
! parent_rel->pathlist = pathlist;
! else
! parent_rel->gpi->pathlist = pathlist;
}
else
{
*************** add_path(RelOptInfo *parent_rel, Path *n
*** 646,653 ****
bool
add_path_precheck(RelOptInfo *parent_rel,
Cost startup_cost, Cost total_cost,
! List *pathkeys, Relids required_outer)
{
List *new_path_pathkeys;
bool consider_startup;
ListCell *p1;
--- 666,674 ----
bool
add_path_precheck(RelOptInfo *parent_rel,
Cost startup_cost, Cost total_cost,
! List *pathkeys, Relids required_outer, bool grouped)
{
+ List *pathlist;
List *new_path_pathkeys;
bool consider_startup;
ListCell *p1;
*************** add_path_precheck(RelOptInfo *parent_rel
*** 656,664 ****
new_path_pathkeys = required_outer ? NIL : pathkeys;
/* Decide whether new path's startup cost is interesting */
! consider_startup = required_outer ? parent_rel->consider_param_startup : parent_rel->consider_startup;
! foreach(p1, parent_rel->pathlist)
{
Path *old_path = (Path *) lfirst(p1);
PathKeysComparison keyscmp;
--- 677,694 ----
new_path_pathkeys = required_outer ? NIL : pathkeys;
/* Decide whether new path's startup cost is interesting */
! consider_startup = required_outer ? parent_rel->consider_param_startup :
! parent_rel->consider_startup;
! if (!grouped)
! pathlist = parent_rel->pathlist;
! else
! {
! Assert(parent_rel->gpi != NULL);
! pathlist = parent_rel->gpi->pathlist;
! }
!
! foreach(p1, pathlist)
{
Path *old_path = (Path *) lfirst(p1);
PathKeysComparison keyscmp;
*************** add_path_precheck(RelOptInfo *parent_rel
*** 749,771 ****
* referenced by partial BitmapHeapPaths.
*/
void
! add_partial_path(RelOptInfo *parent_rel, Path *new_path)
{
bool accept_new = true; /* unless we find a superior old path */
ListCell *insert_after = NULL; /* where to insert new item */
ListCell *p1;
ListCell *p1_prev;
ListCell *p1_next;
/* Check for query cancel. */
CHECK_FOR_INTERRUPTS();
/*
* As in add_path, throw out any paths which are dominated by the new
* path, but throw out the new path if some existing path dominates it.
*/
p1_prev = NULL;
! for (p1 = list_head(parent_rel->partial_pathlist); p1 != NULL;
p1 = p1_next)
{
Path *old_path = (Path *) lfirst(p1);
--- 779,810 ----
* referenced by partial BitmapHeapPaths.
*/
void
! add_partial_path(RelOptInfo *parent_rel, Path *new_path, bool grouped)
{
bool accept_new = true; /* unless we find a superior old path */
ListCell *insert_after = NULL; /* where to insert new item */
ListCell *p1;
ListCell *p1_prev;
ListCell *p1_next;
+ List *pathlist;
/* Check for query cancel. */
CHECK_FOR_INTERRUPTS();
+ if (!grouped)
+ pathlist = parent_rel->partial_pathlist;
+ else
+ {
+ Assert(parent_rel->gpi != NULL);
+ pathlist = parent_rel->gpi->partial_pathlist;
+ }
+
/*
* As in add_path, throw out any paths which are dominated by the new
* path, but throw out the new path if some existing path dominates it.
*/
p1_prev = NULL;
! for (p1 = list_head(pathlist); p1 != NULL;
p1 = p1_next)
{
Path *old_path = (Path *) lfirst(p1);
*************** add_partial_path(RelOptInfo *parent_rel,
*** 819,830 ****
}
/*
! * Remove current element from partial_pathlist if dominated by new.
*/
if (remove_old)
{
! parent_rel->partial_pathlist =
! list_delete_cell(parent_rel->partial_pathlist, p1, p1_prev);
pfree(old_path);
/* p1_prev does not advance */
}
--- 858,868 ----
}
/*
! * Remove current element from pathlist if dominated by new.
*/
if (remove_old)
{
! pathlist = list_delete_cell(pathlist, p1, p1_prev);
pfree(old_path);
/* p1_prev does not advance */
}
*************** add_partial_path(RelOptInfo *parent_rel,
*** 839,845 ****
/*
* If we found an old path that dominates new_path, we can quit
! * scanning the partial_pathlist; we will not add new_path, and we
* assume new_path cannot dominate any later path.
*/
if (!accept_new)
--- 877,883 ----
/*
* If we found an old path that dominates new_path, we can quit
! * scanning the pathlist; we will not add new_path, and we
* assume new_path cannot dominate any later path.
*/
if (!accept_new)
*************** add_partial_path(RelOptInfo *parent_rel,
*** 850,859 ****
{
/* Accept the new path: insert it at proper place */
if (insert_after)
! lappend_cell(parent_rel->partial_pathlist, insert_after, new_path);
else
! parent_rel->partial_pathlist =
! lcons(new_path, parent_rel->partial_pathlist);
}
else
{
--- 888,901 ----
{
/* Accept the new path: insert it at proper place */
if (insert_after)
! lappend_cell(pathlist, insert_after, new_path);
else
! pathlist = lcons(new_path, pathlist);
!
! if (!grouped)
! parent_rel->partial_pathlist = pathlist;
! else
! parent_rel->gpi->partial_pathlist = pathlist;
}
else
{
*************** add_partial_path(RelOptInfo *parent_rel,
*** 874,882 ****
*/
bool
add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
! List *pathkeys)
{
ListCell *p1;
/*
* Our goal here is twofold. First, we want to find out whether this path
--- 916,933 ----
*/
bool
add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
! List *pathkeys, bool grouped)
{
ListCell *p1;
+ List *pathlist;
+
+ if (!grouped)
+ pathlist = parent_rel->partial_pathlist;
+ else
+ {
+ Assert(parent_rel->gpi != NULL);
+ pathlist = parent_rel->gpi->partial_pathlist;
+ }
/*
* Our goal here is twofold. First, we want to find out whether this path
*************** add_partial_path_precheck(RelOptInfo *pa
*** 886,895 ****
* final cost computations. If so, we definitely want to consider it.
*
* Unlike add_path(), we always compare pathkeys here. This is because we
! * expect partial_pathlist to be very short, and getting a definitive
! * answer at this stage avoids the need to call add_path_precheck.
*/
! foreach(p1, parent_rel->partial_pathlist)
{
Path *old_path = (Path *) lfirst(p1);
PathKeysComparison keyscmp;
--- 937,947 ----
* final cost computations. If so, we definitely want to consider it.
*
* Unlike add_path(), we always compare pathkeys here. This is because we
! * expect partial_pathlist / grouped_pathlist to be very short, and
! * getting a definitive answer at this stage avoids the need to call
! * add_path_precheck.
*/
! foreach(p1, pathlist)
{
Path *old_path = (Path *) lfirst(p1);
PathKeysComparison keyscmp;
*************** add_partial_path_precheck(RelOptInfo *pa
*** 918,924 ****
* completion.
*/
if (!add_path_precheck(parent_rel, total_cost, total_cost, pathkeys,
! NULL))
return false;
return true;
--- 970,976 ----
* completion.
*/
if (!add_path_precheck(parent_rel, total_cost, total_cost, pathkeys,
! NULL, grouped))
return false;
return true;
*************** create_foreignscan_path(PlannerInfo *roo
*** 1994,2007 ****
* Note: result must not share storage with either input
*/
Relids
! calc_nestloop_required_outer(Path *outer_path, Path *inner_path)
{
- Relids outer_paramrels = PATH_REQ_OUTER(outer_path);
- Relids inner_paramrels = PATH_REQ_OUTER(inner_path);
Relids required_outer;
/* inner_path can require rels from outer path, but not vice versa */
! Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
/* easy case if inner path is not parameterized */
if (!inner_paramrels)
return bms_copy(outer_paramrels);
--- 2046,2060 ----
* Note: result must not share storage with either input
*/
Relids
! calc_nestloop_required_outer(Relids outerrelids,
! Relids outer_paramrels,
! Relids innerrelids,
! Relids inner_paramrels)
{
Relids required_outer;
/* inner_path can require rels from outer path, but not vice versa */
! Assert(!bms_overlap(outer_paramrels, innerrelids));
/* easy case if inner path is not parameterized */
if (!inner_paramrels)
return bms_copy(outer_paramrels);
*************** calc_nestloop_required_outer(Path *outer
*** 2009,2015 ****
required_outer = bms_union(outer_paramrels, inner_paramrels);
/* ... and remove any mention of now-satisfied outer rels */
required_outer = bms_del_members(required_outer,
! outer_path->parent->relids);
/* maintain invariant that required_outer is exactly NULL if empty */
if (bms_is_empty(required_outer))
{
--- 2062,2068 ----
required_outer = bms_union(outer_paramrels, inner_paramrels);
/* ... and remove any mention of now-satisfied outer rels */
required_outer = bms_del_members(required_outer,
! outerrelids);
/* maintain invariant that required_outer is exactly NULL if empty */
if (bms_is_empty(required_outer))
{
*************** calc_non_nestloop_required_outer(Path *o
*** 2055,2060 ****
--- 2108,2114 ----
* 'restrict_clauses' are the RestrictInfo nodes to apply at the join
* 'pathkeys' are the path keys of the new join path
* 'required_outer' is the set of required outer rels
+ * 'target' can be passed to override that of joinrel.
*
* Returns the resulting path node.
*/
*************** create_nestloop_path(PlannerInfo *root,
*** 2068,2074 ****
Path *inner_path,
List *restrict_clauses,
List *pathkeys,
! Relids required_outer)
{
NestPath *pathnode = makeNode(NestPath);
Relids inner_req_outer = PATH_REQ_OUTER(inner_path);
--- 2122,2129 ----
Path *inner_path,
List *restrict_clauses,
List *pathkeys,
! Relids required_outer,
! PathTarget *target)
{
NestPath *pathnode = makeNode(NestPath);
Relids inner_req_outer = PATH_REQ_OUTER(inner_path);
*************** create_nestloop_path(PlannerInfo *root,
*** 2101,2107 ****
pathnode->path.pathtype = T_NestLoop;
pathnode->path.parent = joinrel;
! pathnode->path.pathtarget = joinrel->reltarget;
pathnode->path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
--- 2156,2162 ----
pathnode->path.pathtype = T_NestLoop;
pathnode->path.parent = joinrel;
! pathnode->path.pathtarget = target == NULL ? joinrel->reltarget : target;
pathnode->path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
*************** create_mergejoin_path(PlannerInfo *root,
*** 2159,2171 ****
Relids required_outer,
List *mergeclauses,
List *outersortkeys,
! List *innersortkeys)
{
MergePath *pathnode = makeNode(MergePath);
pathnode->jpath.path.pathtype = T_MergeJoin;
pathnode->jpath.path.parent = joinrel;
! pathnode->jpath.path.pathtarget = joinrel->reltarget;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
--- 2214,2228 ----
Relids required_outer,
List *mergeclauses,
List *outersortkeys,
! List *innersortkeys,
! PathTarget *target)
{
MergePath *pathnode = makeNode(MergePath);
pathnode->jpath.path.pathtype = T_MergeJoin;
pathnode->jpath.path.parent = joinrel;
! pathnode->jpath.path.pathtarget = target == NULL ? joinrel->reltarget :
! target;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
*************** create_mergejoin_path(PlannerInfo *root,
*** 2210,2215 ****
--- 2267,2273 ----
* 'required_outer' is the set of required outer rels
* 'hashclauses' are the RestrictInfo nodes to use as hash clauses
* (this should be a subset of the restrict_clauses list)
+ * 'target' can be passed to override that of joinrel.
*/
HashPath *
create_hashjoin_path(PlannerInfo *root,
*************** create_hashjoin_path(PlannerInfo *root,
*** 2221,2233 ****
Path *inner_path,
List *restrict_clauses,
Relids required_outer,
! List *hashclauses)
{
HashPath *pathnode = makeNode(HashPath);
pathnode->jpath.path.pathtype = T_HashJoin;
pathnode->jpath.path.parent = joinrel;
! pathnode->jpath.path.pathtarget = joinrel->reltarget;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
--- 2279,2293 ----
Path *inner_path,
List *restrict_clauses,
Relids required_outer,
! List *hashclauses,
! PathTarget *target)
{
HashPath *pathnode = makeNode(HashPath);
pathnode->jpath.path.pathtype = T_HashJoin;
pathnode->jpath.path.parent = joinrel;
! pathnode->jpath.path.pathtarget = target == NULL ? joinrel->reltarget :
! target;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
*************** create_agg_path(PlannerInfo *root,
*** 2682,2688 ****
pathnode->path.pathtarget = target;
/* For now, assume we are above any joins, so no parameterization */
pathnode->path.param_info = NULL;
! pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
pathnode->path.parallel_workers = subpath->parallel_workers;
--- 2742,2748 ----
pathnode->path.pathtarget = target;
/* For now, assume we are above any joins, so no parameterization */
pathnode->path.param_info = NULL;
! pathnode->path.parallel_aware = true;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
pathnode->path.parallel_workers = subpath->parallel_workers;
*************** create_agg_path(PlannerInfo *root,
*** 2713,2718 ****
--- 2773,2948 ----
}
/*
+ * Apply partial AGG_SORTED aggregation path to subpath if it's suitably
+ * sorted.
+ *
+ * first_call indicates whether the function is being called for the first
+ * time for the given index. Since the target should not change, we can skip
+ * the sort-order check during subsequent calls.
+ *
+ * group_clauses, group_exprs and agg_exprs point to lists that we populate on
+ * the first call for a particular index, and that the caller passes back in
+ * on subsequent calls.
+ *
+ * NULL is returned if sorting of subpath output is not suitable.
+ */
+ AggPath *
+ create_partial_agg_sorted_path(PlannerInfo *root, Path *subpath,
+ bool first_call,
+ List **group_clauses, List **group_exprs,
+ List **agg_exprs, double input_rows)
+ {
+ RelOptInfo *rel;
+ AggClauseCosts agg_costs;
+ double dNumGroups;
+ AggPath *result = NULL;
+
+ rel = subpath->parent;
+ Assert(rel->gpi != NULL);
+
+ if (subpath->pathkeys == NIL)
+ return NULL;
+
+ if (!grouping_is_sortable(root->parse->groupClause))
+ return NULL;
+
+ if (first_call)
+ {
+ ListCell *lc1;
+ List *key_subset = NIL;
+
+ /*
+ * Find all query pathkeys that our relation does affect.
+ */
+ foreach(lc1, root->group_pathkeys)
+ {
+ PathKey *gkey = castNode(PathKey, lfirst(lc1));
+ ListCell *lc2;
+
+ foreach(lc2, subpath->pathkeys)
+ {
+ PathKey *skey = castNode(PathKey, lfirst(lc2));
+
+ if (skey == gkey)
+ {
+ key_subset = lappend(key_subset, gkey);
+ break;
+ }
+ }
+ }
+
+ if (key_subset == NIL)
+ return NULL;
+
+ /* Check if AGG_SORTED is useful for the whole query. */
+ if (!pathkeys_contained_in(key_subset, subpath->pathkeys))
+ return NULL;
+ }
+
+ if (first_call)
+ get_grouping_expressions(root, rel->gpi->target, group_clauses,
+ group_exprs, agg_exprs);
+
+ MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+ Assert(*agg_exprs != NIL);
+ get_agg_clause_costs(root, (Node *) *agg_exprs, AGGSPLIT_INITIAL_SERIAL,
+ &agg_costs);
+
+ Assert(*group_exprs != NIL);
+ dNumGroups = estimate_num_groups(root, *group_exprs, input_rows, NULL);
+
+ /* TODO HAVING qual. */
+ Assert(*group_clauses != NIL);
+ result = create_agg_path(root, rel, subpath, rel->gpi->target, AGG_SORTED,
+ AGGSPLIT_INITIAL_SERIAL, *group_clauses, NIL,
+ &agg_costs, dNumGroups);
+
+ return result;
+ }
+
+ /*
+ * Apply partial AGG_HASHED aggregation to subpath.
+ *
+ * Arguments have the same meaning as those of
+ * create_partial_agg_sorted_path.
+ */
+ AggPath *
+ create_partial_agg_hashed_path(PlannerInfo *root, Path *subpath,
+ bool first_call,
+ List **group_clauses, List **group_exprs,
+ List **agg_exprs, double input_rows)
+ {
+ RelOptInfo *rel;
+ bool can_hash;
+ AggClauseCosts agg_costs;
+ double dNumGroups;
+ Size hashaggtablesize;
+ Query *parse = root->parse;
+ AggPath *result = NULL;
+
+ rel = subpath->parent;
+ Assert(rel->gpi != NULL);
+
+ if (first_call)
+ {
+ /*
+ * Find one grouping clause per grouping column.
+ *
+ * All that create_agg_plan eventually needs of the clause is
+ * tleSortGroupRef, so we don't have to care that the clause
+ * expression might differ from texpr, in case texpr was derived from
+ * EC.
+ */
+ get_grouping_expressions(root, rel->gpi->target, group_clauses,
+ group_exprs, agg_exprs);
+ }
+
+ MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+ Assert(*agg_exprs != NIL);
+ get_agg_clause_costs(root, (Node *) *agg_exprs, AGGSPLIT_INITIAL_SERIAL,
+ &agg_costs);
+
+ can_hash = (parse->groupClause != NIL &&
+ parse->groupingSets == NIL &&
+ agg_costs.numOrderedAggs == 0 &&
+ grouping_is_hashable(parse->groupClause));
+
+ if (can_hash)
+ {
+ Assert(*group_exprs != NIL);
+ dNumGroups = estimate_num_groups(root, *group_exprs, input_rows,
+ NULL);
+
+ hashaggtablesize = estimate_hashagg_tablesize(subpath, &agg_costs,
+ dNumGroups);
+
+ if (hashaggtablesize < work_mem * 1024L)
+ {
+ /*
+ * Create the partial aggregation path.
+ */
+ Assert(*group_clauses != NIL);
+
+ result = create_agg_path(root, rel, subpath,
+ rel->gpi->target,
+ AGG_HASHED,
+ AGGSPLIT_INITIAL_SERIAL,
+ *group_clauses, NIL,
+ &agg_costs,
+ dNumGroups);
+
+ /*
+ * The agg path should require no fewer parameters than the plain
+ * one.
+ */
+ result->path.param_info = subpath->param_info;
+ }
+ }
+
+ return result;
+ }
+
+ /*
* create_groupingsets_path
* Creates a pathnode that represents performing GROUPING SETS aggregation
*
*************** reparameterize_path(PlannerInfo *root, P
*** 3426,3428 ****
--- 3656,4081 ----
}
return NULL;
}
+
+ /*
+ * reparameterize_path_by_child
+ * Given a path parameterized by the parent of the given relation,
+ * translate the path to be parameterized by the given child relation.
+ *
+ * The function creates a new path of the same type as the given path, but
+ * parameterized by the given child relation. If it can not reparameterize the
+ * path as required, it returns NULL.
+ *
+ * The cost, number of rows, width and parallel path properties depend upon
+ * path->parent, which does not change during the translation. Hence those
+ * members are copied as they are.
+ */
+
+ Path *
+ reparameterize_path_by_child(PlannerInfo *root, Path *path,
+ RelOptInfo *child_rel)
+ {
+
+ #define FLAT_COPY_PATH(newnode, node, nodetype) \
+ ( (newnode) = makeNode(nodetype), \
+ memcpy((newnode), (node), sizeof(nodetype)) )
+
+ Path *new_path;
+ ParamPathInfo *new_ppi;
+ ParamPathInfo *old_ppi;
+ Relids required_outer;
+
+ /*
+ * If the path is not parameterized by the parent of the given relation, it
+ * doesn't need reparameterization.
+ */
+ if (!path->param_info ||
+ !bms_overlap(PATH_REQ_OUTER(path), child_rel->top_parent_relids))
+ return path;
+
+ /*
+ * Make a copy of the given path and reparameterize or translate the
+ * path specific members.
+ */
+ switch (nodeTag(path))
+ {
+ case T_Path:
+ FLAT_COPY_PATH(new_path, path, Path);
+ break;
+
+ case T_IndexPath:
+ {
+ IndexPath *ipath;
+
+ FLAT_COPY_PATH(ipath, path, IndexPath);
+ ipath->indexclauses = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) ipath->indexclauses,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ ipath->indexquals = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) ipath->indexquals,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path = (Path *) ipath;
+ }
+ break;
+
+ case T_BitmapHeapPath:
+ {
+ BitmapHeapPath *bhpath;
+
+ FLAT_COPY_PATH(bhpath, path, BitmapHeapPath);
+ bhpath->bitmapqual = reparameterize_path_by_child(root,
+ bhpath->bitmapqual,
+ child_rel);
+ new_path = (Path *) bhpath;
+ }
+ break;
+
+ case T_BitmapAndPath:
+ {
+ BitmapAndPath *bapath;
+ ListCell *lc;
+ List *bitmapquals = NIL;
+
+ FLAT_COPY_PATH(bapath, path, BitmapAndPath);
+ foreach (lc, bapath->bitmapquals)
+ {
+ Path *bmqpath = lfirst(lc);
+
+ bitmapquals = lappend(bitmapquals,
+ reparameterize_path_by_child(root,
+ bmqpath,
+ child_rel));
+ }
+ bapath->bitmapquals = bitmapquals;
+ new_path = (Path *) bapath;
+ }
+ break;
+
+ case T_BitmapOrPath:
+ {
+ BitmapOrPath *bopath;
+ ListCell *lc;
+ List *bitmapquals = NIL;
+
+ FLAT_COPY_PATH(bopath, path, BitmapOrPath);
+ foreach (lc, bopath->bitmapquals)
+ {
+ Path *bmqpath = lfirst(lc);
+
+ bitmapquals = lappend(bitmapquals,
+ reparameterize_path_by_child(root,
+ bmqpath,
+ child_rel));
+ }
+ bopath->bitmapquals = bitmapquals;
+ new_path = (Path *) bopath;
+ }
+ break;
+
+ case T_TidPath:
+ {
+ TidPath *tpath;
+
+ /*
+ * TidPath contains tidquals, which do not contain any external
+ * parameters per create_tidscan_path(). So don't bother to
+ * translate those.
+ */
+ FLAT_COPY_PATH(tpath, path, TidPath);
+ new_path = (Path *) tpath;
+ }
+ break;
+
+ case T_ForeignPath:
+ {
+ ForeignPath *fpath;
+ ReparameterizeForeignPathByChild_function rfpc_func;
+
+ FLAT_COPY_PATH(fpath, path, ForeignPath);
+ if (fpath->fdw_outerpath)
+ fpath->fdw_outerpath = reparameterize_path_by_child(root,
+ fpath->fdw_outerpath,
+ child_rel);
+ rfpc_func = path->parent->fdwroutine->ReparameterizeForeignPathByChild;
+
+ /* Hand over to FDW if supported. */
+ if (rfpc_func)
+ fpath->fdw_private = rfpc_func(root, fpath->fdw_private,
+ child_rel);
+ new_path = (Path *) fpath;
+ }
+ break;
+
+ case T_CustomPath:
+ {
+ CustomPath *cpath;
+ ListCell *lc;
+ List *custompaths = NIL;
+
+ FLAT_COPY_PATH(cpath, path, CustomPath);
+
+ foreach (lc, cpath->custom_paths)
+ {
+ Path *subpath = lfirst(lc);
+
+ custompaths = lappend(custompaths,
+ reparameterize_path_by_child(root,
+ subpath,
+ child_rel));
+ }
+ cpath->custom_paths = custompaths;
+
+ if (cpath->methods &&
+ cpath->methods->ReparameterizeCustomPathByChild)
+ cpath->custom_private = cpath->methods->ReparameterizeCustomPathByChild(root,
+ cpath->custom_private,
+ child_rel);
+
+ new_path = (Path *) cpath;
+ }
+ break;
+
+ case T_NestPath:
+ {
+ JoinPath *jpath;
+
+ FLAT_COPY_PATH(jpath, path, NestPath);
+
+ jpath->outerjoinpath = reparameterize_path_by_child(root,
+ jpath->outerjoinpath,
+ child_rel);
+ jpath->innerjoinpath = reparameterize_path_by_child(root,
+ jpath->innerjoinpath,
+ child_rel);
+ jpath->joinrestrictinfo = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) jpath->joinrestrictinfo,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path = (Path *) jpath;
+ }
+ break;
+
+ case T_MergePath:
+ {
+ JoinPath *jpath;
+ MergePath *mpath;
+
+ FLAT_COPY_PATH(mpath, path, MergePath);
+
+ jpath = (JoinPath *) mpath;
+ jpath->outerjoinpath = reparameterize_path_by_child(root,
+ jpath->outerjoinpath,
+ child_rel);
+ jpath->innerjoinpath = reparameterize_path_by_child(root,
+ jpath->innerjoinpath,
+ child_rel);
+ jpath->joinrestrictinfo = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) jpath->joinrestrictinfo,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ mpath->path_mergeclauses = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) mpath->path_mergeclauses,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path = (Path *) mpath;
+ }
+ break;
+
+ case T_HashPath:
+ {
+ JoinPath *jpath;
+ HashPath *hpath;
+ FLAT_COPY_PATH(hpath, path, HashPath);
+
+ jpath = (JoinPath *) hpath;
+ jpath->outerjoinpath = reparameterize_path_by_child(root,
+ jpath->outerjoinpath,
+ child_rel);
+ jpath->innerjoinpath = reparameterize_path_by_child(root,
+ jpath->innerjoinpath,
+ child_rel);
+ jpath->joinrestrictinfo = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) jpath->joinrestrictinfo,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ hpath->path_hashclauses = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) hpath->path_hashclauses,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path = (Path *) hpath;
+ }
+ break;
+
+ case T_AppendPath:
+ {
+ AppendPath *apath;
+ List *subpaths = NIL;
+ ListCell *lc;
+
+ FLAT_COPY_PATH(apath, path, AppendPath);
+ foreach (lc, apath->subpaths)
+ subpaths = lappend(subpaths,
+ reparameterize_path_by_child(root,
+ lfirst(lc),
+ child_rel));
+ apath->subpaths = subpaths;
+ new_path = (Path *) apath;
+ }
+ break;
+
+ case T_MergeAppendPath:
+ {
+ MergeAppendPath *mapath;
+ List *subpaths = NIL;
+ ListCell *lc;
+
+ FLAT_COPY_PATH(mapath, path, MergeAppendPath);
+ foreach (lc, mapath->subpaths)
+ subpaths = lappend(subpaths,
+ reparameterize_path_by_child(root,
+ lfirst(lc),
+ child_rel));
+ mapath->subpaths = subpaths;
+ new_path = (Path *) mapath;
+ }
+ break;
+
+ case T_MaterialPath:
+ {
+ MaterialPath *mpath;
+
+ FLAT_COPY_PATH(mpath, path, MaterialPath);
+ mpath->subpath = reparameterize_path_by_child(root,
+ mpath->subpath,
+ child_rel);
+ new_path = (Path *) mpath;
+ }
+ break;
+
+ case T_UniquePath:
+ {
+ UniquePath *upath;
+
+ FLAT_COPY_PATH(upath, path, UniquePath);
+ upath->subpath = reparameterize_path_by_child(root,
+ upath->subpath,
+ child_rel);
+ upath->uniq_exprs = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) upath->uniq_exprs,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path = (Path *) upath;
+ }
+ break;
+
+ case T_GatherPath:
+ {
+ GatherPath *gpath;
+
+ FLAT_COPY_PATH(gpath, path, GatherPath);
+ gpath->subpath = reparameterize_path_by_child(root,
+ gpath->subpath,
+ child_rel);
+ new_path = (Path *) gpath;
+ }
+ break;
+
+ case T_GatherMergePath:
+ {
+ GatherMergePath *gmpath;
+
+ FLAT_COPY_PATH(gmpath, path, GatherMergePath);
+ gmpath->subpath = reparameterize_path_by_child(root,
+ gmpath->subpath,
+ child_rel);
+ new_path = (Path *) gmpath;
+ }
+ break;
+
+ case T_SubqueryScanPath:
+ /*
+ * Subqueries can't be partitioned right now, so a subquery can not
+ * participate in a partition-wise join and hence can not be seen
+ * here.
+ */
+ case T_ResultPath:
+ /*
+ * A result path can not have any parameterization, so we
+ * should never see it here.
+ */
+ default:
+ /* Other kinds of paths can not appear in a join tree. */
+ elog(ERROR, "unrecognized path node type %d", (int) nodeTag(path));
+
+ /* Keep compiler quiet about unassigned new_path */
+ return NULL;
+ }
+
+ /*
+ * Adjust the parameterization information, which refers to the topmost
+ * parent. The topmost parent can be multiple levels away from the given
+ * child, hence use multi-level expression adjustment routines.
+ */
+ old_ppi = new_path->param_info;
+ required_outer = adjust_child_relids_multilevel(root,
+ old_ppi->ppi_req_outer,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+
+ /* If we already have a PPI for this parameterization, just return it */
+ new_ppi = find_param_path_info(new_path->parent, required_outer);
+
+ /*
+ * If not, build a new one and link it to the list of PPIs. When called
+ * during GEQO join planning, we are in a short-lived memory context. We
+ * must make sure that the new PPI and its contents attached to a baserel
+ * survives the GEQO cycle, else the baserel is trashed for future GEQO
+ * cycles. On the other hand, when we are adding new PPI to a joinrel
+ * during GEQO, we don't want that to clutter the main planning context.
+ * Upshot is that the best solution is to explicitly allocate new PPI in
+ * the same context the given RelOptInfo is in.
+ */
+ if (!new_ppi)
+ {
+ MemoryContext oldcontext;
+ RelOptInfo *rel = path->parent;
+
+ oldcontext = MemoryContextSwitchTo(GetMemoryChunkContext(rel));
+
+ new_ppi = makeNode(ParamPathInfo);
+ new_ppi->ppi_req_outer = bms_copy(required_outer);
+ new_ppi->ppi_rows = old_ppi->ppi_rows;
+ new_ppi->ppi_clauses = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) old_ppi->ppi_clauses,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ rel->ppilist = lappend(rel->ppilist, new_ppi);
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+ bms_free(required_outer);
+
+ new_path->param_info = new_ppi;
+
+ /*
+ * Adjust the path target if the parent of the outer relation is referenced
+ * in the targetlist. This can happen when only the parent of outer relation is
+ * laterally referenced in this relation.
+ */
+ if (bms_overlap(path->parent->lateral_relids, child_rel->top_parent_relids))
+ {
+ List *exprs;
+
+ new_path->pathtarget = copy_pathtarget(new_path->pathtarget);
+ exprs = new_path->pathtarget->exprs;
+ exprs = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) exprs,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path->pathtarget->exprs = exprs;
+ }
+
+ return new_path;
+ }
diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c
new file mode 100644
index 698a387..6714288
*** a/src/backend/optimizer/util/placeholder.c
--- b/src/backend/optimizer/util/placeholder.c
***************
*** 20,25 ****
--- 20,26 ----
#include "optimizer/pathnode.h"
#include "optimizer/placeholder.h"
#include "optimizer/planmain.h"
+ #include "optimizer/prep.h"
#include "optimizer/var.h"
#include "utils/lsyscache.h"
*************** add_placeholders_to_joinrel(PlannerInfo
*** 414,419 ****
--- 415,424 ----
Relids relids = joinrel->relids;
ListCell *lc;
+ /* This function is called only on the parent relations. */
+ Assert(!IS_OTHER_REL(joinrel) && !IS_OTHER_REL(outer_rel) &&
+ !IS_OTHER_REL(inner_rel));
+
foreach(lc, root->placeholder_list)
{
PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc);
*************** add_placeholders_to_joinrel(PlannerInfo
*** 459,461 ****
--- 464,518 ----
}
}
}
+
+ /*
+ * add_placeholders_to_child_joinrel
+ * Translate the PHVs in parent's targetlist and add them to the child's
+ * targetlist. Also adjust the cost and width of the child's targetlist.
+ */
+ void
+ add_placeholders_to_child_joinrel(PlannerInfo *root, RelOptInfo *childrel,
+ RelOptInfo *parentrel)
+ {
+ ListCell *lc;
+ AppendRelInfo **appinfos;
+ int nappinfos;
+
+
+ Assert(IS_JOIN_REL(childrel) && IS_JOIN_REL(parentrel));
+
+ /* Ensure the child relation is really what it claims to be. */
+ Assert(IS_OTHER_REL(childrel));
+
+ appinfos = find_appinfos_by_relids(root, childrel->relids, &nappinfos);
+ foreach (lc, parentrel->reltarget->exprs)
+ {
+ PlaceHolderVar *phv = lfirst(lc);
+
+ if (IsA(phv, PlaceHolderVar))
+ {
+ /*
+ * In case the placeholder Var refers to any of the parent
+ * relations, translate it to refer to the corresponding child.
+ */
+ if (bms_overlap(phv->phrels, parentrel->relids) &&
+ childrel->reloptkind == RELOPT_OTHER_JOINREL)
+ {
+ phv = (PlaceHolderVar *) adjust_appendrel_attrs(root,
+ (Node *) phv,
+ nappinfos,
+ appinfos);
+ }
+
+ childrel->reltarget->exprs = lappend(childrel->reltarget->exprs,
+ phv);
+ }
+ }
+
+ /* Adjust the cost and width of child targetlist. */
+ childrel->reltarget->cost.startup = parentrel->reltarget->cost.startup;
+ childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
+ childrel->reltarget->width = parentrel->reltarget->width;
+
+ pfree(appinfos);
+ }
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
new file mode 100644
index 9207c8d..7e846e1
*** a/src/backend/optimizer/util/plancat.c
--- b/src/backend/optimizer/util/plancat.c
***************
*** 27,32 ****
--- 27,33 ----
#include "catalog/catalog.h"
#include "catalog/dependency.h"
#include "catalog/heap.h"
+ #include "catalog/pg_inherits_fn.h"
#include "catalog/partition.h"
#include "catalog/pg_am.h"
#include "catalog/pg_statistic_ext.h"
*************** static List *get_relation_constraints(Pl
*** 68,73 ****
--- 69,80 ----
static List *build_index_tlist(PlannerInfo *root, IndexOptInfo *index,
Relation heapRelation);
static List *get_relation_statistics(RelOptInfo *rel, Relation relation);
+ static List **build_baserel_partition_key_exprs(Relation relation,
+ Index varno);
+ static PartitionScheme find_partition_scheme(struct PlannerInfo *root,
+ Relation rel);
+ static void get_relation_partition_info(PlannerInfo *root, RelOptInfo *rel,
+ Relation relation);
/*
* get_relation_info -
*************** get_relation_info(PlannerInfo *root, Oid
*** 420,425 ****
--- 427,436 ----
/* Collect info about relation's foreign keys, if relevant */
get_relation_foreign_keys(root, rel, relation, inhparent);
+ /* Collect info about relation's partitioning scheme, if any. */
+ if (inhparent)
+ get_relation_partition_info(root, rel, relation);
+
heap_close(relation, NoLock);
/*
*************** has_row_triggers(PlannerInfo *root, Inde
*** 1801,1803 ****
--- 1812,1975 ----
heap_close(relation, NoLock);
return result;
}
+
+ /*
+ * get_relation_partition_info
+ *
+ * Retrieves partitioning information for a given relation.
+ *
+ * Partitioning scheme, partition key expressions and OIDs of partitions are
+ * added to the given RelOptInfo. A partitioned table can participate in the
+ * query as a simple relation or an inheritance parent. Only the latter can have
+ * child relations, and hence partitions. From the point of view of the query
+ * optimizer only such relations are considered to be partitioned. Hence
+ * partitioning information is set only for an inheritance parent.
+ */
+ static void
+ get_relation_partition_info(PlannerInfo *root, RelOptInfo *rel,
+ Relation relation)
+ {
+ PartitionDesc part_desc = RelationGetPartitionDesc(relation);
+
+ /* No partitioning information for an unpartitioned relation. */
+ if (relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE ||
+ !(rel->part_scheme = find_partition_scheme(root, relation)))
+ return;
+
+ Assert(part_desc);
+ rel->nparts = part_desc->nparts;
+ rel->boundinfo = part_desc->boundinfo;
+ rel->partexprs = build_baserel_partition_key_exprs(relation, rel->relid);
+ rel->part_oids = part_desc->oids;
+
+ Assert(rel->nparts > 0 && rel->boundinfo && rel->part_oids);
+ return;
+ }
+
+ /*
+ * find_partition_scheme
+ *
+ * The function returns a canonical partition scheme which exactly matches the
+ * partitioning properties of the given relation if one exists in the list of
+ * canonical partitioning schemes maintained in PlannerInfo. If none of the
+ * existing partitioning schemes match, the function creates a canonical
+ * partition scheme and adds it to the list.
+ *
+ * For an unpartitioned table or for a multi-level partitioned table it returns
+ * NULL. See comments in the function for more details.
+ */
+ static PartitionScheme
+ find_partition_scheme(PlannerInfo *root, Relation relation)
+ {
+ PartitionKey part_key = RelationGetPartitionKey(relation);
+ ListCell *lc;
+ int partnatts;
+ PartitionScheme part_scheme = NULL;
+
+ /* No partition scheme for an unpartitioned relation. */
+ if (!part_key)
+ return NULL;
+
+ partnatts = part_key->partnatts;
+
+ /* Search for a matching partition scheme and return it if one is found. */
+ foreach (lc, root->part_schemes)
+ {
+ part_scheme = lfirst(lc);
+
+ /* Match partitioning strategy and number of keys. */
+ if (part_key->strategy != part_scheme->strategy ||
+ partnatts != part_scheme->partnatts)
+ continue;
+
+ /* Match the partition key types. */
+ if (memcmp(part_key->partopfamily, part_scheme->partopfamily,
+ sizeof(Oid) * partnatts) != 0 ||
+ memcmp(part_key->partopcintype, part_scheme->partopcintype,
+ sizeof(Oid) * partnatts) != 0 ||
+ memcmp(part_key->parttypcoll, part_scheme->parttypcoll,
+ sizeof(Oid) * partnatts) != 0)
+ continue;
+
+ /* Found matching partition scheme. */
+ return part_scheme;
+ }
+
+ /* Did not find matching partition scheme. Create one. */
+ part_scheme = (PartitionScheme) palloc0(sizeof(PartitionSchemeData));
+
+ part_scheme->strategy = part_key->strategy;
+ /* Store partition key information. */
+ part_scheme->partnatts = part_key->partnatts;
+ part_scheme->partopfamily = part_key->partopfamily;
+ part_scheme->partopcintype = part_key->partopcintype;
+ part_scheme->parttypcoll = part_key->parttypcoll;
+ part_scheme->partsupfunc = part_key->partsupfunc;
+
+ /* Add the partitioning scheme to PlannerInfo. */
+ root->part_schemes = lappend(root->part_schemes, part_scheme);
+
+ return part_scheme;
+ }
+
+ /*
+ * build_baserel_partition_key_exprs
+ *
+ * Collect partition key expressions for a given base relation. The function
+ * converts any single column partition keys into corresponding Var nodes. It
+ * restamps Var nodes in partition key expressions by given varno. The
+ * partition key expressions are returned as an array of single element lists
+ * to be stored in RelOptInfo of the base relation.
+ */
+ static List **
+ build_baserel_partition_key_exprs(Relation relation, Index varno)
+ {
+ PartitionKey part_key = RelationGetPartitionKey(relation);
+ int num_pkexprs;
+ int cnt_pke;
+ List **partexprs;
+ ListCell *lc;
+
+ if (!part_key || part_key->partnatts <= 0)
+ return NULL;
+
+ num_pkexprs = part_key->partnatts;
+ partexprs = (List **) palloc(sizeof(List *) * num_pkexprs);
+ lc = list_head(part_key->partexprs);
+
+ for (cnt_pke = 0; cnt_pke < num_pkexprs; cnt_pke++)
+ {
+ AttrNumber attno = part_key->partattrs[cnt_pke];
+ Expr *pkexpr;
+
+ if (attno != InvalidAttrNumber)
+ {
+ /* Single column partition key is stored as a Var node. */
+ Form_pg_attribute att_tup;
+
+ if (attno < 0)
+ att_tup = SystemAttributeDefinition(attno,
+ relation->rd_rel->relhasoids);
+ else
+ att_tup = relation->rd_att->attrs[attno - 1];
+
+ pkexpr = (Expr *) makeVar(varno, attno, att_tup->atttypid,
+ att_tup->atttypmod,
+ att_tup->attcollation, 0);
+ }
+ else
+ {
+ if (lc == NULL)
+ elog(ERROR, "wrong number of partition key expressions");
+
+ /* Re-stamp the expression with given varno. */
+ pkexpr = (Expr *) copyObject(lfirst(lc));
+ ChangeVarNodes((Node *) pkexpr, 1, varno, 0);
+ lc = lnext(lc);
+ }
+
+ partexprs[cnt_pke] = list_make1(pkexpr);
+ }
+
+ return partexprs;
+ }
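For readers who want to see the canonicalization idea in find_partition_scheme
outside the planner, here is a minimal standalone sketch. It is not part of the
patch. SchemeSketch, known_schemes and find_or_add_scheme are hypothetical
names invented for this illustration, the struct is a cut-down stand-in for
PartitionSchemeData, and a plain linked list stands in for root->part_schemes
(new entries are prepended here, whereas the patch appends with lappend; the
order does not affect the lookup).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int Oid;

/* Hypothetical, cut-down stand-in for the patch's PartitionSchemeData. */
typedef struct SchemeSketch
{
	char		strategy;		/* partitioning strategy code */
	int			partnatts;		/* number of partition key columns */
	const Oid  *partopfamily;	/* borrowed from the caller, not copied */
	const Oid  *partopcintype;	/* borrowed from the caller, not copied */
	struct SchemeSketch *next;
} SchemeSketch;

/* Stand-in for root->part_schemes. */
static SchemeSketch *known_schemes = NULL;

/*
 * Return the already-known scheme whose properties match exactly, or
 * create a new one and remember it.
 */
static SchemeSketch *
find_or_add_scheme(char strategy, int partnatts,
				   const Oid *opfamily, const Oid *opcintype)
{
	size_t		nbytes;
	SchemeSketch *s;

	assert(partnatts > 0);
	nbytes = sizeof(Oid) * (size_t) partnatts;

	for (s = known_schemes; s != NULL; s = s->next)
	{
		/* Match partitioning strategy and number of keys. */
		if (s->strategy != strategy || s->partnatts != partnatts)
			continue;

		/* Match the per-column key properties. */
		if (memcmp(s->partopfamily, opfamily, nbytes) != 0 ||
			memcmp(s->partopcintype, opcintype, nbytes) != 0)
			continue;

		return s;				/* found a matching scheme */
	}

	/* Did not find a matching scheme. Create one and link it in. */
	s = calloc(1, sizeof(SchemeSketch));
	if (s == NULL)
		abort();
	s->strategy = strategy;
	s->partnatts = partnatts;
	s->partopfamily = opfamily;
	s->partopcintype = opcintype;
	s->next = known_schemes;
	known_schemes = s;

	return s;
}
```

Two calls that pass equal strategy, key count and per-column arrays get the
same pointer back, even when the arrays live at different addresses. The point
of keeping one canonical object per distinct scheme is presumably that later
code can test whether two relations are partitioned the same way with a plain
pointer comparison.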
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
new file mode 100644
index 342d884..308bdec
*** a/src/backend/optimizer/util/relnode.c
--- b/src/backend/optimizer/util/relnode.c
***************
*** 23,30 ****
--- 23,32 ----
#include "optimizer/paths.h"
#include "optimizer/placeholder.h"
#include "optimizer/plancat.h"
+ #include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
#include "optimizer/tlist.h"
+ #include "optimizer/var.h"
#include "utils/hsearch.h"
*************** typedef struct JoinHashEntry
*** 35,41 ****
} JoinHashEntry;
static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *input_rel);
static List *build_joinrel_restrictlist(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outer_rel,
--- 37,43 ----
} JoinHashEntry;
static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *input_rel, bool grouped);
static List *build_joinrel_restrictlist(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outer_rel,
*************** static List *subbuild_joinrel_joinlist(R
*** 52,57 ****
--- 54,64 ----
static void set_foreign_rel_properties(RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *inner_rel);
static void add_join_rel(PlannerInfo *root, RelOptInfo *joinrel);
+ extern ParamPathInfo *find_param_path_info(RelOptInfo *rel,
+ Relids required_outer);
+ static void build_joinrel_partition_info(RelOptInfo *joinrel,
+ RelOptInfo *outer_rel, RelOptInfo *inner_rel,
+ List *restrictlist, JoinType jointype);
/*
*************** build_simple_rel(PlannerInfo *root, int
*** 120,125 ****
--- 127,133 ----
rel->cheapest_parameterized_paths = NIL;
rel->direct_lateral_relids = NULL;
rel->lateral_relids = NULL;
+ rel->gpi = NULL;
rel->relid = relid;
rel->rtekind = rte->rtekind;
/* min_attr, max_attr, attr_needed, attr_widths are set below */
*************** build_simple_rel(PlannerInfo *root, int
*** 146,151 ****
--- 154,164 ----
rel->baserestrict_min_security = UINT_MAX;
rel->joininfo = NIL;
rel->has_eclass_joins = false;
+ rel->part_scheme = NULL;
+ rel->nparts = 0;
+ rel->boundinfo = NULL;
+ rel->partexprs = NULL;
+ rel->part_rels = NULL;
/*
* Pass top parent's relids down the inheritance hierarchy. If the parent
*************** build_simple_rel(PlannerInfo *root, int
*** 218,237 ****
if (rte->inh)
{
ListCell *l;
foreach(l, root->append_rel_list)
{
AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);
/* append_rel_list contains all append rels; ignore others */
if (appinfo->parent_relid != relid)
continue;
! (void) build_simple_rel(root, appinfo->child_relid,
! rel);
}
}
return rel;
}
--- 231,293 ----
if (rte->inh)
{
ListCell *l;
+ int nparts = rel->nparts;
+
+ if (nparts > 0)
+ rel->part_rels = (RelOptInfo **) palloc0(sizeof(RelOptInfo *) * nparts);
foreach(l, root->append_rel_list)
{
AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);
+ RelOptInfo *childrel;
+ int cnt_parts;
+ RangeTblEntry *childRTE;
/* append_rel_list contains all append rels; ignore others */
if (appinfo->parent_relid != relid)
continue;
! childrel = build_simple_rel(root, appinfo->child_relid,
! rel);
!
! /* Nothing more to do for an unpartitioned table. */
! if (!rel->part_scheme)
! continue;
!
! childRTE = root->simple_rte_array[appinfo->child_relid];
! /*
! * Two partitioned tables with the same partitioning scheme have
! * their partition bounds arranged in the same order. The order of
! * partition OIDs in RelOptInfo corresponds to the partition bound
! * order. Thus the OIDs of matching partitions from both the tables
! * are placed at the same position in the array of partition OIDs
! * in the respective RelOptInfos. Arranging RelOptInfos of
! * partitions in the same order as their OIDs makes it easy to find
! * the RelOptInfos of matching partitions for partition-wise join.
! */
! for (cnt_parts = 0; cnt_parts < nparts; cnt_parts++)
! {
! if (rel->part_oids[cnt_parts] == childRTE->relid)
! {
! Assert(!rel->part_rels[cnt_parts]);
! rel->part_rels[cnt_parts] = childrel;
! break;
! }
! }
}
}
+ /* Should have found all the childrels of a partitioned relation. */
+ if (rel->part_scheme)
+ {
+ int cnt_parts;
+
+ for (cnt_parts = 0; cnt_parts < rel->nparts; cnt_parts++)
+ if (!rel->part_rels[cnt_parts])
+ elog(ERROR, "could not find the RelOptInfo of a partition with oid %u",
+ rel->part_oids[cnt_parts]);
+ }
+
return rel;
}
*************** build_join_rel(PlannerInfo *root,
*** 453,458 ****
--- 509,517 ----
RelOptInfo *joinrel;
List *restrictlist;
+ /* This function should be used only for join between parents. */
+ Assert(!IS_OTHER_REL(outer_rel) && !IS_OTHER_REL(inner_rel));
+
/*
* See if we already have a joinrel for this set of base rels.
*/
*************** build_join_rel(PlannerInfo *root,
*** 497,502 ****
--- 556,562 ----
inner_rel->direct_lateral_relids);
joinrel->lateral_relids = min_join_parameterization(root, joinrel->relids,
outer_rel, inner_rel);
+ joinrel->gpi = NULL;
joinrel->relid = 0; /* indicates not a baserel */
joinrel->rtekind = RTE_JOIN;
joinrel->min_attr = 0;
*************** build_join_rel(PlannerInfo *root,
*** 527,532 ****
--- 587,597 ----
joinrel->joininfo = NIL;
joinrel->has_eclass_joins = false;
joinrel->top_parent_relids = NULL;
+ joinrel->part_scheme = NULL;
+ joinrel->nparts = 0;
+ joinrel->boundinfo = NULL;
+ joinrel->partexprs = NULL;
+ joinrel->part_rels = NULL;
/* Compute information relevant to the foreign relations. */
set_foreign_rel_properties(joinrel, outer_rel, inner_rel);
*************** build_join_rel(PlannerInfo *root,
*** 539,548 ****
* and inner rels we first try to build it from. But the contents should
* be the same regardless.
*/
! build_joinrel_tlist(root, joinrel, outer_rel);
! build_joinrel_tlist(root, joinrel, inner_rel);
add_placeholders_to_joinrel(root, joinrel, outer_rel, inner_rel);
/*
* add_placeholders_to_joinrel also took care of adding the ph_lateral
* sets of any PlaceHolderVars computed here to direct_lateral_relids, so
--- 604,620 ----
* and inner rels we first try to build it from. But the contents should
* be the same regardless.
*/
! build_joinrel_tlist(root, joinrel, outer_rel, false);
! build_joinrel_tlist(root, joinrel, inner_rel, false);
add_placeholders_to_joinrel(root, joinrel, outer_rel, inner_rel);
+ /* Try to build grouped target. */
+ /*
+ * TODO Consider if placeholders make sense here. If not, also make the
+ * related code below conditional.
+ */
+ prepare_rel_for_grouping(root, joinrel);
+
/*
* add_placeholders_to_joinrel also took care of adding the ph_lateral
* sets of any PlaceHolderVars computed here to direct_lateral_relids, so
*************** build_join_rel(PlannerInfo *root,
*** 572,577 ****
--- 644,653 ----
*/
joinrel->has_eclass_joins = has_relevant_eclass_joinclause(root, joinrel);
+ /* Store the partition information. */
+ build_joinrel_partition_info(joinrel, outer_rel, inner_rel, restrictlist,
+ sjinfo->jointype);
+
/*
* Set estimates of the joinrel's size.
*/
*************** build_join_rel(PlannerInfo *root,
*** 617,622 ****
--- 693,845 ----
return joinrel;
}
+ /*
+ * build_child_join_rel
+ * Builds RelOptInfo for joining given two child relations from RelOptInfo
+ * representing the join between their parents.
+ *
+ * 'outer_rel' and 'inner_rel' are the RelOptInfos of child relations being
+ * joined.
+ * 'parent_joinrel' is the RelOptInfo representing the join between parent
+ * relations. Most of the members of new RelOptInfo are produced by
+ * translating corresponding members of this RelOptInfo.
+ * 'sjinfo': context info for child join
+ * 'restrictlist': list of RestrictInfo nodes that apply to this particular
+ * pair of joinable relations.
+ * 'jointype': type of the join between the child relations; passed on to
+ * build_joinrel_partition_info.
+ */
+ RelOptInfo *
+ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
+ RelOptInfo *inner_rel, RelOptInfo *parent_joinrel,
+ List *restrictlist, SpecialJoinInfo *sjinfo,
+ JoinType jointype)
+ {
+ RelOptInfo *joinrel = makeNode(RelOptInfo);
+ AppendRelInfo **appinfos;
+ int nappinfos;
+
+ /* Only joins between other relations land here. */
+ Assert(IS_OTHER_REL(outer_rel) && IS_OTHER_REL(inner_rel));
+
+ joinrel->reloptkind = RELOPT_OTHER_JOINREL;
+ joinrel->relids = bms_union(outer_rel->relids, inner_rel->relids);
+ joinrel->rows = 0;
+ /* cheap startup cost is interesting iff not all tuples to be retrieved */
+ joinrel->consider_startup = (root->tuple_fraction > 0);
+ joinrel->consider_param_startup = false;
+ joinrel->consider_parallel = false;
+ joinrel->reltarget = create_empty_pathtarget();
+ joinrel->pathlist = NIL;
+ joinrel->ppilist = NIL;
+ joinrel->partial_pathlist = NIL;
+ joinrel->cheapest_startup_path = NULL;
+ joinrel->cheapest_total_path = NULL;
+ joinrel->cheapest_unique_path = NULL;
+ joinrel->cheapest_parameterized_paths = NIL;
+ joinrel->direct_lateral_relids = NULL;
+ joinrel->lateral_relids = NULL;
+ joinrel->gpi = makeNode(GroupedPathInfo);
+ if (parent_joinrel->gpi)
+ /*
+ * Translation into child varnos will take place along with other
+ * translations, see try_partition_wise_join.
+ */
+ joinrel->gpi->target = copy_pathtarget(parent_joinrel->gpi->target);
+ joinrel->relid = 0; /* indicates not a baserel */
+ joinrel->rtekind = RTE_JOIN;
+ joinrel->min_attr = 0;
+ joinrel->max_attr = 0;
+ joinrel->attr_needed = NULL;
+ joinrel->attr_widths = NULL;
+ joinrel->lateral_vars = NIL;
+ joinrel->lateral_referencers = NULL;
+ joinrel->indexlist = NIL;
+ joinrel->pages = 0;
+ joinrel->tuples = 0;
+ joinrel->allvisfrac = 0;
+ joinrel->subroot = NULL;
+ joinrel->subplan_params = NIL;
+ joinrel->serverid = InvalidOid;
+ joinrel->userid = InvalidOid;
+ joinrel->useridiscurrent = false;
+ joinrel->fdwroutine = NULL;
+ joinrel->fdw_private = NULL;
+ joinrel->baserestrictinfo = NIL;
+ joinrel->baserestrictcost.startup = 0;
+ joinrel->baserestrictcost.per_tuple = 0;
+ joinrel->joininfo = NIL;
+ joinrel->has_eclass_joins = false;
+ joinrel->top_parent_relids = NULL;
+ joinrel->part_scheme = NULL;
+ joinrel->part_rels = NULL;
+ joinrel->partexprs = NULL;
+
+ joinrel->top_parent_relids = bms_union(outer_rel->top_parent_relids,
+ inner_rel->top_parent_relids);
+
+ /* Compute information relevant to foreign relations. */
+ set_foreign_rel_properties(joinrel, outer_rel, inner_rel);
+
+ /* Build targetlist */
+ build_joinrel_tlist(root, joinrel, outer_rel, false);
+ build_joinrel_tlist(root, joinrel, inner_rel, false);
+ /* Add placeholder variables. */
+ add_placeholders_to_child_joinrel(root, joinrel, parent_joinrel);
+
+ /* Try to build grouped target. */
+ /*
+ * TODO Consider if placeholders make sense here. If not, also make the
+ * related code below conditional.
+ */
+ prepare_rel_for_grouping(root, joinrel);
+
+
+ /* Construct joininfo list. */
+ appinfos = find_appinfos_by_relids(root, joinrel->relids, &nappinfos);
+ joinrel->joininfo = (List *) adjust_appendrel_attrs(root,
+ (Node *) parent_joinrel->joininfo,
+ nappinfos,
+ appinfos);
+ pfree(appinfos);
+
+ /*
+ * Lateral relids referred to in the child join will be the same as those
+ * referred to in the parent relation. Throw away any partial result computed
+ * while building the targetlist.
+ */
+ bms_free(joinrel->direct_lateral_relids);
+ bms_free(joinrel->lateral_relids);
+ joinrel->direct_lateral_relids = (Relids) bms_copy(parent_joinrel->direct_lateral_relids);
+ joinrel->lateral_relids = (Relids) bms_copy(parent_joinrel->lateral_relids);
+
+ /*
+ * If the parent joinrel has pending equivalence classes, so does the
+ * child.
+ */
+ joinrel->has_eclass_joins = parent_joinrel->has_eclass_joins;
+
+ /* Is the join between partitions itself partitioned? */
+ build_joinrel_partition_info(joinrel, outer_rel, inner_rel, restrictlist,
+ jointype);
+
+ /* Child joinrel is parallel safe if parent is parallel safe. */
+ joinrel->consider_parallel = parent_joinrel->consider_parallel;
+
+
+ /* Set estimates of the child-joinrel's size. */
+ set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
+ sjinfo, restrictlist);
+
+ /* We build the join only once. */
+ Assert(!find_join_rel(root, joinrel->relids));
+
+ /* Add the relation to the PlannerInfo. */
+ add_join_rel(root, joinrel);
+
+ return joinrel;
+ }
+
/*
* min_join_parameterization
*
*************** min_join_parameterization(PlannerInfo *r
*** 670,679 ****
*/
static void
build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *input_rel)
{
! Relids relids = joinrel->relids;
ListCell *vars;
foreach(vars, input_rel->reltarget->exprs)
{
--- 893,932 ----
*/
static void
build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *input_rel, bool grouped)
{
! Relids relids;
! PathTarget *input_target, *result;
ListCell *vars;
+ int i = -1;
+
+ /* attr_needed refers to parent relids and not those of a child. */
+ if (joinrel->top_parent_relids)
+ relids = joinrel->top_parent_relids;
+ else
+ relids = joinrel->relids;
+
+ if (!grouped)
+ {
+ input_target = input_rel->reltarget;
+ result = joinrel->reltarget;
+ }
+ else
+ {
+ if (input_rel->gpi != NULL)
+ {
+ input_target = input_rel->gpi->target;
+ Assert(input_target != NULL);
+ }
+ else
+ input_target = input_rel->reltarget;
+
+ /* Caller should have initialized this. */
+ Assert(joinrel->gpi != NULL);
+
+ /* Default to the plain target. */
+ result = joinrel->gpi->target;
+ }
foreach(vars, input_rel->reltarget->exprs)
{
*************** build_joinrel_tlist(PlannerInfo *root, R
*** 690,713 ****
/*
* Otherwise, anything in a baserel or joinrel targetlist ought to be
! * a Var. (More general cases can only appear in appendrel child
! * rels, which will never be seen here.)
*/
! if (!IsA(var, Var))
elog(ERROR, "unexpected node type in rel targetlist: %d",
(int) nodeTag(var));
- /* Get the Var's original base rel */
- baserel = find_base_rel(root, var->varno);
-
- /* Is it still needed above this joinrel? */
- ndx = var->varattno - baserel->min_attr;
if (bms_nonempty_difference(baserel->attr_needed[ndx], relids))
{
/* Yup, add it to the output */
! joinrel->reltarget->exprs = lappend(joinrel->reltarget->exprs, var);
! /* Vars have cost zero, so no need to adjust reltarget->cost */
! joinrel->reltarget->width += baserel->attr_widths[ndx];
}
}
}
--- 943,1009 ----
/*
* Otherwise, anything in a baserel or joinrel targetlist ought to be
! * a Var or ConvertRowtypeExpr introduced while translating parent
! * targetlist to that of the child.
*/
! if (IsA(var, Var))
! {
! /* Get the Var's original base rel */
! baserel = find_base_rel(root, var->varno);
!
! /* Is it still needed above this joinrel? */
! ndx = var->varattno - baserel->min_attr;
! }
! else if (IsA(var, ConvertRowtypeExpr))
! {
! ConvertRowtypeExpr *child_expr = (ConvertRowtypeExpr *) var;
! Var *childvar = (Var *) child_expr->arg;
!
! /*
! * Child's whole-row references are converted to that of parent
! * using ConvertRowtypeExpr. There can be as many
! * ConvertRowtypeExpr decorations as the depth of partition tree.
! * The argument to deepest ConvertRowtypeExpr is expected to be a
! * whole-row reference of the child.
! */
! while (IsA(childvar, ConvertRowtypeExpr))
! {
! child_expr = (ConvertRowtypeExpr *) childvar;
! childvar = (Var *) child_expr->arg;
! }
! Assert(IsA(childvar, Var) && childvar->varattno == 0);
!
! baserel = find_base_rel(root, childvar->varno);
! ndx = 0 - baserel->min_attr;
! }
! else
elog(ERROR, "unexpected node type in rel targetlist: %d",
(int) nodeTag(var));
if (bms_nonempty_difference(baserel->attr_needed[ndx], relids))
{
+ Index sortgroupref = 0;
+
/* Yup, add it to the output */
! if (input_target->sortgrouprefs)
! sortgroupref = input_target->sortgrouprefs[i];
!
! /*
! * Even if not used for grouping in the input path (the input path
! * is not necessarily grouped), it might be useful for grouping
! * higher in the join tree.
! */
! if (sortgroupref == 0)
! sortgroupref = get_expr_sortgroupref(root, (Expr *) var);
!
! add_column_to_pathtarget(result, (Expr *) var, sortgroupref);
!
! /*
! * Vars have cost zero, so no need to adjust reltarget->cost. Even
! * if it's a ConvertRowtypeExpr, it will be computed only for the
! * base relation, costing nothing for a join.
! */
! result->width += baserel->attr_widths[ndx];
}
}
}
*************** subbuild_joinrel_joinlist(RelOptInfo *jo
*** 843,848 ****
--- 1139,1147 ----
{
ListCell *l;
+ /* Expected to be called only for join between parent relations. */
+ Assert(joinrel->reloptkind == RELOPT_JOINREL);
+
foreach(l, joininfo_list)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(l);
*************** get_baserel_parampathinfo(PlannerInfo *r
*** 1048,1059 ****
Assert(!bms_overlap(baserel->relids, required_outer));
/* If we already have a PPI for this parameterization, just return it */
! foreach(lc, baserel->ppilist)
! {
! ppi = (ParamPathInfo *) lfirst(lc);
! if (bms_equal(ppi->ppi_req_outer, required_outer))
! return ppi;
! }
/*
* Identify all joinclauses that are movable to this base rel given this
--- 1347,1354 ----
Assert(!bms_overlap(baserel->relids, required_outer));
/* If we already have a PPI for this parameterization, just return it */
! if ((ppi = find_param_path_info(baserel, required_outer)))
! return ppi;
/*
* Identify all joinclauses that are movable to this base rel given this
*************** get_baserel_parampathinfo(PlannerInfo *r
*** 1095,1100 ****
--- 1390,1545 ----
}
/*
+ * If the relation can produce grouped paths, create GroupedPathInfo for it
+ * and create target for the grouped paths.
+ */
+ void
+ prepare_rel_for_grouping(PlannerInfo *root, RelOptInfo *rel)
+ {
+ List *rel_aggregates;
+ Relids rel_agg_attrs = NULL;
+ List *rel_agg_vars = NIL;
+ bool found_higher;
+ ListCell *lc;
+ PathTarget *target_grouped;
+
+ if (rel->relid > 0)
+ {
+ RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+ /*
+ * rtekind != RTE_RELATION case is not supported yet.
+ */
+ if (rte->rtekind != RTE_RELATION)
+ return;
+ }
+
+ /* Caller should only pass base relations or joins. */
+ Assert(rel->reloptkind == RELOPT_BASEREL ||
+ rel->reloptkind == RELOPT_JOINREL ||
+ rel->reloptkind == RELOPT_OTHER_JOINREL);
+
+ /*
+ * If any outer join can set the attribute value to NULL, the aggregate
+ * would receive different input at the base rel level.
+ *
+ * TODO For RELOPT_JOINREL, do not return if all the joins that can set
+ * any entry of the grouped target (do we need to postpone this check
+ * until the grouped target is available, and should create_grouped_target
+ * take care?) of this rel to NULL are provably below rel. (It's ok if rel
+ * is one of these joins.)
+ */
+ if (bms_overlap(rel->relids, root->nullable_baserels))
+ return;
+
+ /*
+ * Check if some aggregates can be evaluated in this relation's target,
+ * and collect all vars referenced by these aggregates.
+ */
+ rel_aggregates = NIL;
+ found_higher = false;
+ foreach(lc, root->grouped_var_list)
+ {
+ GroupedVarInfo *gvi = castNode(GroupedVarInfo, lfirst(lc));
+
+ /*
+ * The subset includes gv_eval_at uninitialized, which typically means
+ * Aggref.aggstar.
+ */
+ if (bms_is_subset(gvi->gv_eval_at, rel->relids))
+ {
+ Aggref *aggref = castNode(Aggref, gvi->gvexpr);
+
+ /*
+ * Accept the aggregate.
+ *
+ * GroupedVarInfo is more convenient for further processing than
+ * Aggref, see add_aggregates_to_target.
+ */
+ rel_aggregates = lappend(rel_aggregates, gvi);
+
+ if (rel->relid > 0)
+ {
+ /*
+ * Simple relation. Collect attributes referenced by the
+ * aggregate arguments.
+ */
+ pull_varattnos((Node *) aggref, rel->relid, &rel_agg_attrs);
+ }
+ else
+ {
+ List *agg_vars;
+
+ /*
+ * Join. Collect vars referenced by the aggregate
+ * arguments.
+ */
+ /*
+ * TODO Can any argument contain PHVs? And if so, does it matter?
+ * Consider PVC_INCLUDE_PLACEHOLDERS | PVC_RECURSE_PLACEHOLDERS.
+ */
+ agg_vars = pull_var_clause((Node *) aggref,
+ PVC_RECURSE_AGGREGATES);
+ rel_agg_vars = list_concat(rel_agg_vars, agg_vars);
+ }
+ }
+ else if (bms_overlap(gvi->gv_eval_at, rel->relids))
+ {
+ /*
+ * Remember that there is at least one aggregate that needs more
+ * than this rel.
+ */
+ found_higher = true;
+ }
+ }
+
+ /*
+ * Grouping makes little sense without an aggregate function.
+ */
+ if (rel_aggregates == NIL)
+ {
+ bms_free(rel_agg_attrs);
+ return;
+ }
+
+ if (found_higher)
+ {
+ /*
+ * If some aggregates need only this rel but others need multiple
+ * relations including the current one, grouping of the current rel
+ * could steal some input variables from the "higher aggregate"
+ * (besides decreasing the number of input rows).
+ */
+ list_free(rel_aggregates);
+ bms_free(rel_agg_attrs);
+ return;
+ }
+
+ /*
+ * If rel->reltarget can be used for aggregation, mark the relation as
+ * capable of grouping.
+ */
+ Assert(rel->gpi == NULL);
+ target_grouped = create_grouped_target(root, rel, rel_agg_attrs,
+ rel_agg_vars);
+ if (target_grouped != NULL)
+ {
+ GroupedPathInfo *gpi;
+
+ gpi = makeNode(GroupedPathInfo);
+ gpi->target = copy_pathtarget(target_grouped);
+ gpi->pathlist = NIL;
+ gpi->partial_pathlist = NIL;
+ rel->gpi = gpi;
+
+ /*
+ * Add aggregates (in the form of GroupedVar) to the target.
+ */
+ add_aggregates_to_target(root, gpi->target, rel_aggregates, rel);
+ }
+ }
+
+ /*
* get_joinrel_parampathinfo
* Get the ParamPathInfo for a parameterized path for a join relation,
* constructing one if we don't have one already.
*************** get_joinrel_parampathinfo(PlannerInfo *r
*** 1290,1301 ****
*restrict_clauses = list_concat(pclauses, *restrict_clauses);
/* If we already have a PPI for this parameterization, just return it */
! foreach(lc, joinrel->ppilist)
! {
! ppi = (ParamPathInfo *) lfirst(lc);
! if (bms_equal(ppi->ppi_req_outer, required_outer))
! return ppi;
! }
/* Estimate the number of rows returned by the parameterized join */
rows = get_parameterized_joinrel_size(root, joinrel,
--- 1735,1742 ----
*restrict_clauses = list_concat(pclauses, *restrict_clauses);
/* If we already have a PPI for this parameterization, just return it */
! if ((ppi = find_param_path_info(joinrel, required_outer)))
! return ppi;
/* Estimate the number of rows returned by the parameterized join */
rows = get_parameterized_joinrel_size(root, joinrel,
*************** ParamPathInfo *
*** 1334,1340 ****
get_appendrel_parampathinfo(RelOptInfo *appendrel, Relids required_outer)
{
ParamPathInfo *ppi;
- ListCell *lc;
/* Unparameterized paths have no ParamPathInfo */
if (bms_is_empty(required_outer))
--- 1775,1780 ----
*************** get_appendrel_parampathinfo(RelOptInfo *
*** 1343,1354 ****
Assert(!bms_overlap(appendrel->relids, required_outer));
/* If we already have a PPI for this parameterization, just return it */
! foreach(lc, appendrel->ppilist)
! {
! ppi = (ParamPathInfo *) lfirst(lc);
! if (bms_equal(ppi->ppi_req_outer, required_outer))
! return ppi;
! }
/* Else build the ParamPathInfo */
ppi = makeNode(ParamPathInfo);
--- 1783,1790 ----
Assert(!bms_overlap(appendrel->relids, required_outer));
/* If we already have a PPI for this parameterization, just return it */
! if ((ppi = find_param_path_info(appendrel, required_outer)))
! return ppi;
/* Else build the ParamPathInfo */
ppi = makeNode(ParamPathInfo);
*************** get_appendrel_parampathinfo(RelOptInfo *
*** 1359,1361 ****
--- 1795,1917 ----
return ppi;
}
+
+ /*
+ * Returns a ParamPathInfo for outer relations specified by required_outer, if
+ * already available in the given rel. Returns NULL otherwise.
+ */
+ ParamPathInfo *
+ find_param_path_info(RelOptInfo *rel, Relids required_outer)
+ {
+ ListCell *lc;
+
+ foreach(lc, rel->ppilist)
+ {
+ ParamPathInfo *ppi = (ParamPathInfo *) lfirst(lc);
+ if (bms_equal(ppi->ppi_req_outer, required_outer))
+ return ppi;
+ }
+
+ return NULL;
+ }
+
+ /*
+ * build_joinrel_partition_info
+ * If the join between the given partitioned relations may itself be
+ * partitioned, set the partitioning scheme and partition key expressions
+ * for the join.
+ *
+ * If the two relations have the same partitioning scheme, their join may be
+ * partitioned and will follow the same partitioning scheme as the joining
+ * relations.
+ */
+ static void
+ build_joinrel_partition_info(RelOptInfo *joinrel, RelOptInfo *outer_rel,
+ RelOptInfo *inner_rel, List *restrictlist,
+ JoinType jointype)
+ {
+ int num_pks;
+ int cnt;
+ bool is_strict;
+
+ /* Nothing to do if partition-wise join technique is disabled. */
+ if (!enable_partition_wise_join)
+ {
+ joinrel->part_scheme = NULL;
+ return;
+ }
+
+ /*
+ * The join is not partitioned, if any of the relations being joined are
+ * not partitioned or they do not have same partitioning scheme or if there
+ * is no equi-join between partition keys.
+ *
+ * For an N-way inner join, where every syntactic inner join has equi-join
+ * between partition keys and a matching partitioning scheme, partition
+ * keys of N relations form an equivalence class, thus inducing an
+ * equi-join between any pair of joining relations.
+ *
+ * For an N-way join with outer joins, where every syntactic join has an
+ * equi-join between partition keys and a matching partitioning scheme,
+ * outer join reordering identities in optimizer/README imply that only
+ * those pairs of join are legal which have an equi-join between partition
+ * keys. Thus every pair of joining relations we see here should have an
+ * equi-join if this join has been deemed as a partitioned join.
+ */
+ if (!outer_rel->part_scheme || !inner_rel->part_scheme ||
+ outer_rel->part_scheme != inner_rel->part_scheme ||
+ !have_partkey_equi_join(outer_rel, inner_rel, jointype, restrictlist,
+ &is_strict))
+ {
+ joinrel->part_scheme = NULL;
+ return;
+ }
+
+ /*
+ * This function will be called only once for each joinrel, hence it should
+ * not have partition scheme, partition key expressions and array for
+ * storing child relations set.
+ */
+ Assert(!joinrel->part_scheme && !joinrel->partexprs &&
+ !joinrel->part_rels);
+
+ /*
+ * Join relation is partitioned using same partitioning scheme as the
+ * joining relations.
+ */
+ joinrel->part_scheme = outer_rel->part_scheme;
+ num_pks = joinrel->part_scheme->partnatts;
+
+ /*
+ * Construct partition keys for the join.
+ *
+ * An INNER join between two partitioned relations is partitioned by key
+ * expressions from both the relations. For tables A and B partitioned by a
+ * and b respectively, (A INNER JOIN B ON A.a = B.b) is partitioned by both
+ * A.a and B.b.
+ *
+ * An OUTER join like (A LEFT JOIN B ON A.a = B.b) may produce rows with
+ * B.b NULL. These rows may not fit the partitioning conditions imposed on
+ * B.b. Hence, strictly speaking, the join is not partitioned by B.b.
+ * Strictly speaking, partition keys of an OUTER join should include
+ * partition key expressions from the OUTER side only. Consider a join like
+ * (A LEFT JOIN B ON A.a = B.b) LEFT JOIN C ON B.b = C.c. If we do not
+ * include B.b as partition key expression for (AB), it prohibits us from
+ * using partition-wise join when joining (AB) with C as there is no
+ * equi-join between partition keys of joining relations. If the equality
+ * operator is strict, two NULL values are never equal and no two rows from
+ * mis-matching partitions can join. Hence if the equality operator is
+ * strict it's safe to include B.b as partition key expression for (AB),
+ * even though rows in (AB) are not strictly partitioned by B.b.
+ */
+ joinrel->partexprs = (List **) palloc0(sizeof(List *) * num_pks);
+ for (cnt = 0; cnt < num_pks; cnt++)
+ {
+ List *pkexpr = list_copy(outer_rel->partexprs[cnt]);
+
+ if (jointype == JOIN_INNER || is_strict)
+ pkexpr = list_concat(pkexpr,
+ list_copy(inner_rel->partexprs[cnt]));
+ joinrel->partexprs[cnt] = pkexpr;
+ }
+ }
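The key-construction rule at the end of build_joinrel_partition_info can be illustrated outside the planner. What follows is a minimal standalone sketch, not PostgreSQL code: ToyJoinType, ToyPartKey and toy_join_partkey are hypothetical names, and partition key expressions are reduced to plain strings. It only shows when the inner side's key expressions are carried over to the join for one partition key column.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the planner's JoinType values. */
typedef enum { TOY_JOIN_INNER, TOY_JOIN_LEFT } ToyJoinType;

#define TOY_MAX_EXPRS 8

/* Hypothetical: expressions that partition a relation on one key column. */
typedef struct
{
    const char *exprs[TOY_MAX_EXPRS];
    size_t      nexprs;
} ToyPartKey;

/*
 * Build the join's expression list for one partition key column.  The outer
 * side's expressions are always kept.  The inner side's expressions are
 * added only for an inner join, or when the equality operator is strict
 * (NULL-extended rows can then never match rows of another partition).
 */
static ToyPartKey
toy_join_partkey(const ToyPartKey *outer, const ToyPartKey *inner,
                 ToyJoinType jointype, bool is_strict)
{
    ToyPartKey  result = *outer;

    if (jointype == TOY_JOIN_INNER || is_strict)
    {
        for (size_t i = 0; i < inner->nexprs; i++)
        {
            assert(result.nexprs < TOY_MAX_EXPRS);
            result.exprs[result.nexprs++] = inner->exprs[i];
        }
    }
    return result;
}
```

With A partitioned by A.a and B by B.b, an inner join yields two key expressions, while a left join yields two only if the equality operator is strict and one otherwise.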
diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c
new file mode 100644
index 0952385..dd962b7
*** a/src/backend/optimizer/util/tlist.c
--- b/src/backend/optimizer/util/tlist.c
*************** get_sortgrouplist_exprs(List *sgClauses,
*** 408,413 ****
--- 408,487 ----
return result;
}
+ /*
+ * get_grouping_expressions
+ *
+ * Given a "grouped target" (i.e. target where each non-GroupedVar
+ * element must have sortgroupref set), build a list of the referencing
+ * SortGroupClauses, a list of the corresponding grouping expressions and
+ * a list of aggregate expressions.
+ */
+ /* TODO: Refine the function name. */
+ void
+ get_grouping_expressions(PlannerInfo *root, PathTarget *target,
+ List **grouping_clauses, List **grouping_exprs,
+ List **agg_exprs)
+ {
+ ListCell *l;
+ int i = 0;
+
+ foreach(l, target->exprs)
+ {
+ Index sortgroupref = 0;
+ SortGroupClause *cl;
+ Expr *texpr;
+
+ texpr = (Expr *) lfirst(l);
+
+ /* The target should contain at least one grouping column. */
+ Assert(target->sortgrouprefs != NULL);
+
+ if (IsA(texpr, GroupedVar))
+ {
+ /*
+ * texpr should represent the first aggregate in the targetlist.
+ */
+ break;
+ }
+
+ /*
+ * Find the clause by sortgroupref.
+ */
+ sortgroupref = target->sortgrouprefs[i++];
+
+ /*
+ * Besides aggregates, the target should contain no expressions without
+ * sortgroupref. A plain relation being joined to a grouped one can have
+ * sortgroupref equal to zero for expressions contained neither in a
+ * grouping expression nor in aggregate arguments, but if the target
+ * contains such an expression, it shouldn't be used for aggregation
+ * --- see can_aggregate field of GroupedPathInfo.
+ */
+ Assert(sortgroupref > 0);
+
+ cl = get_sortgroupref_clause(sortgroupref, root->parse->groupClause);
+ *grouping_clauses = list_append_unique(*grouping_clauses, cl);
+
+ /*
+ * Add only unique clauses because of joins (both sides of a join can
+ * point at the same grouping clause). XXX Is it worth adding a bool
+ * argument indicating that we're dealing with join right now?
+ */
+ *grouping_exprs = list_append_unique(*grouping_exprs, texpr);
+ }
+
+ /* Now collect the aggregates. */
+ while (l != NULL)
+ {
+ GroupedVar *gvar = castNode(GroupedVar, lfirst(l));
+
+ /* Currently, GroupedVarInfo can only represent an aggregate. */
+ Assert(gvar->agg_partial != NULL);
+ *agg_exprs = lappend(*agg_exprs, gvar->agg_partial);
+ l = lnext(l);
+ }
+ }
+
/*****************************************************************************
* Functions to extract data from a list of SortGroupClauses
*************** apply_pathtarget_labeling_to_tlist(List
*** 783,788 ****
--- 857,1081 ----
}
/*
+ * Replace each "grouped var" in the source targetlist with the original
+ * expression.
+ *
+ * TODO Think of more suitable name. Although "grouped var" may substitute for
+ * grouping expressions in the future, currently Aggref is the only outcome of
+ * the replacement. undo_grouped_var_substitutions?
+ */
+ List *
+ restore_grouping_expressions(PlannerInfo *root, List *src)
+ {
+ List *result = NIL;
+ ListCell *l;
+
+ foreach(l, src)
+ {
+ TargetEntry *te, *te_new;
+ Aggref *expr_new = NULL;
+
+ te = castNode(TargetEntry, lfirst(l));
+
+ if (IsA(te->expr, GroupedVar))
+ {
+ GroupedVar *gvar;
+
+ gvar = castNode(GroupedVar, te->expr);
+ expr_new = gvar->agg_partial;
+ }
+
+ if (expr_new != NULL)
+ {
+ te_new = flatCopyTargetEntry(te);
+ te_new->expr = (Expr *) expr_new;
+ }
+ else
+ te_new = te;
+ result = lappend(result, te_new);
+ }
+
+ return result;
+ }
+
+ /*
+ * For each aggregate that the relation can evaluate, add a GroupedVar to the
+ * target.
+ *
+ * If the caller passes the aggregates, they must be passed in the form of
+ * GroupedVarInfos so that we don't have to look for gvid. If NIL is passed,
+ * the function retrieves the suitable aggregates itself.
+ *
+ * The list of aggregates added is returned. This is only useful if the
+ * function had to retrieve the aggregates itself (i.e. NIL was passed for
+ * aggregates) -- the caller is expected to do extra checks in that case (and
+ * to also free the list).
+ */
+ List *
+ add_aggregates_to_target(PlannerInfo *root, PathTarget *target,
+ List *aggregates, RelOptInfo *rel)
+ {
+ ListCell *lc;
+ GroupedVarInfo *gvi;
+
+ if (aggregates == NIL)
+ {
+ /* Caller should pass the aggregates for base relation. */
+ Assert(rel->reloptkind != RELOPT_BASEREL);
+
+ /* Collect all aggregates that this rel can evaluate. */
+ foreach(lc, root->grouped_var_list)
+ {
+ gvi = castNode(GroupedVarInfo, lfirst(lc));
+
+ /*
+ * Overlap alone is no guarantee of correctness, but the caller needs
+ * to do additional checks anyway, so we're optimistic here.
+ *
+ * If gv_eval_at is NULL, the underlying Aggref should have
+ * aggstar set.
+ */
+ if (bms_overlap(gvi->gv_eval_at, rel->relids) ||
+ gvi->gv_eval_at == NULL)
+ aggregates = lappend(aggregates, gvi);
+ }
+
+ if (aggregates == NIL)
+ return NIL;
+ }
+
+ /* Create the vars and add them to the target. */
+ foreach(lc, aggregates)
+ {
+ GroupedVar *gvar;
+
+ gvi = castNode(GroupedVarInfo, lfirst(lc));
+ gvar = makeNode(GroupedVar);
+ gvar->gvid = gvi->gvid;
+ gvar->gvexpr = gvi->gvexpr;
+ gvar->agg_partial = gvi->agg_partial;
+ add_new_column_to_pathtarget(target, (Expr *) gvar);
+ }
+
+ return aggregates;
+ }
+
+ /*
+ * Return ressortgroupref of the target entry that is either equal to the
+ * expression or exists in the same equivalence class.
+ */
+ Index
+ get_expr_sortgroupref(PlannerInfo *root, Expr *expr)
+ {
+ ListCell *lc;
+ Index sortgroupref;
+
+ /*
+ * First, check if the query group clause contains exactly this
+ * expression.
+ */
+ foreach(lc, root->processed_tlist)
+ {
+ TargetEntry *te = castNode(TargetEntry, lfirst(lc));
+
+ if (equal(expr, te->expr) && te->ressortgroupref > 0)
+ return te->ressortgroupref;
+ }
+
+ /*
+ * If exactly this expression is not there, check if a grouping clause
+ * exists that belongs to the same equivalence class as the expression.
+ */
+ foreach(lc, root->group_pathkeys)
+ {
+ PathKey *pk = castNode(PathKey, lfirst(lc));
+ EquivalenceClass *ec = pk->pk_eclass;
+ ListCell *lm;
+ EquivalenceMember *em;
+ Expr *em_expr = NULL;
+ Query *query = root->parse;
+
+ /*
+ * Single-member EC cannot provide us with additional expression.
+ */
+ if (list_length(ec->ec_members) < 2)
+ continue;
+
+ /* We need equality anywhere in the join tree. */
+ if (ec->ec_below_outer_join)
+ continue;
+
+ /*
+ * TODO Reconsider this restriction. As the grouping expression is
+ * only evaluated at the relation level (and only the result will be
+ * propagated to the final targetlist), volatile function might be
+ * o.k. Need to think what volatile EC exactly means.
+ */
+ if (ec->ec_has_volatile)
+ continue;
+
+ foreach(lm, ec->ec_members)
+ {
+ em = (EquivalenceMember *) lfirst(lm);
+
+ /* The EC has !ec_below_outer_join. */
+ Assert(!em->em_nullable_relids);
+ if (equal(em->em_expr, expr))
+ {
+ em_expr = (Expr *) em->em_expr;
+ break;
+ }
+ }
+
+ if (em_expr == NULL)
+ /* Go for the next EC. */
+ continue;
+
+ /*
+ * Find the corresponding SortGroupClause, which provides us with
+ * sortgroupref. (It can belong to any EC member.)
+ */
+ sortgroupref = 0;
+ foreach(lm, ec->ec_members)
+ {
+ ListCell *lsg;
+
+ em = (EquivalenceMember *) lfirst(lm);
+ foreach(lsg, query->groupClause)
+ {
+ SortGroupClause *sgc;
+ Expr *sgc_expr;
+
+ sgc = (SortGroupClause *) lfirst(lsg);
+ sgc_expr = (Expr *) get_sortgroupclause_expr(sgc,
+ query->targetList);
+ if (equal(em->em_expr, sgc_expr))
+ {
+ Assert(sgc->tleSortGroupRef > 0);
+ sortgroupref = sgc->tleSortGroupRef;
+ break;
+ }
+ }
+
+ if (sortgroupref > 0)
+ break;
+ }
+
+ /*
+ * Since we searched in group_pathkeys, at least one EM of this EC
+ * should correspond to a SortGroupClause, otherwise the EC could
+ * not exist at all.
+ */
+ Assert(sortgroupref > 0);
+
+ return sortgroupref;
+ }
+
+ /* No EC found in group_pathkeys. */
+ return 0;
+ }
+
+ /*
* split_pathtarget_at_srfs
* Split given PathTarget into multiple levels to position SRFs safely
*
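get_grouping_expressions above relies on a layout convention: every grouping column in the target precedes the first GroupedVar, and everything from the first GroupedVar onward is an aggregate. The following is a minimal standalone sketch of that two-phase walk, assuming simplified types. ToyTargetEntry and toy_split_target are hypothetical names, and where the patch uses Assert() the sketch returns false instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified target entry. */
typedef struct
{
    const char *expr;           /* printable form of the expression */
    bool        is_agg;         /* true for a GroupedVar-like aggregate */
    unsigned    sortgroupref;   /* nonzero for grouping columns */
} ToyTargetEntry;

/*
 * Count the leading grouping columns and the trailing aggregates of a
 * grouped target.  Returns false if the layout convention is violated:
 * a grouping column without sortgroupref, or a non-aggregate entry after
 * the first aggregate.
 */
static bool
toy_split_target(const ToyTargetEntry *target, size_t n,
                 size_t *ngrouping, size_t *naggs)
{
    size_t      i = 0;

    assert(ngrouping != NULL && naggs != NULL);

    /* Phase 1: grouping columns, up to the first aggregate. */
    while (i < n && !target[i].is_agg)
    {
        if (target[i].sortgroupref == 0)
            return false;
        i++;
    }
    *ngrouping = i;

    /* Phase 2: everything that remains must be an aggregate. */
    for (; i < n; i++)
    {
        if (!target[i].is_agg)
            return false;
    }
    *naggs = n - *ngrouping;
    return true;
}
```

For a target like (a, b, count(*)) the sketch reports two grouping columns and one aggregate; a target that places an aggregate before a grouping column is rejected.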
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
new file mode 100644
index 184e5da..5e3c3b4
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
*************** get_rule_expr(Node *node, deparse_contex
*** 7559,7564 ****
--- 7559,7572 ----
get_agg_expr((Aggref *) node, context, (Aggref *) node);
break;
+ case T_GroupedVar:
+ {
+ GroupedVar *gvar = castNode(GroupedVar, node);
+
+ get_agg_expr(gvar->agg_partial, context, (Aggref *) gvar->gvexpr);
+ break;
+ }
+
case T_GroupingFunc:
{
GroupingFunc *gexpr = (GroupingFunc *) node;
*************** get_agg_combine_expr(Node *node, deparse
*** 8993,9002 ****
Aggref *aggref;
Aggref *original_aggref = private;
! if (!IsA(node, Aggref))
elog(ERROR, "combining Aggref does not point to an Aggref");
- aggref = (Aggref *) node;
get_agg_expr(aggref, context, original_aggref);
}
--- 9001,9018 ----
Aggref *aggref;
Aggref *original_aggref = private;
! if (IsA(node, Aggref))
! aggref = (Aggref *) node;
! else if (IsA(node, GroupedVar))
! {
! GroupedVar *gvar = castNode(GroupedVar, node);
!
! aggref = gvar->agg_partial;
! original_aggref = castNode(Aggref, gvar->gvexpr);
! }
! else
elog(ERROR, "combining Aggref does not point to an Aggref");
get_agg_expr(aggref, context, original_aggref);
}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
new file mode 100644
index a35b93b..78e24ea
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
***************
*** 114,119 ****
--- 114,120 ----
#include "catalog/pg_statistic_ext.h"
#include "catalog/pg_type.h"
#include "executor/executor.h"
+ #include "executor/nodeAgg.h"
#include "mb/pg_wchar.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
*************** estimate_hash_bucketsize(PlannerInfo *ro
*** 3705,3710 ****
--- 3706,3744 ----
return (Selectivity) estfract;
}
+ /*
+ * estimate_hashagg_tablesize
+ * estimate the number of bytes that a hash aggregate hashtable will
+ * require based on the agg_costs, path width and dNumGroups.
+ *
+ * XXX this may be over-estimating the size now that hashagg knows to omit
+ * unneeded columns from the hashtable. Also for mixed-mode grouping sets,
+ * grouping columns not in the hashed set are counted here even though hashagg
+ * won't store them. Is this a problem?
+ */
+ Size
+ estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
+ double dNumGroups)
+ {
+ Size hashentrysize;
+
+ /* Estimate per-hash-entry space at tuple width... */
+ hashentrysize = MAXALIGN(path->pathtarget->width) +
+ MAXALIGN(SizeofMinimalTupleHeader);
+
+ /* plus space for pass-by-ref transition values... */
+ hashentrysize += agg_costs->transitionSpace;
+ /* plus the per-hash-entry overhead */
+ hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
+
+ /*
+ * Note that this disregards the effect of fill-factor and growth policy
+ * of the hash-table. That's probably ok, given that the default
+ * fill-factor is relatively high. It'd be hard to meaningfully factor in
+ * "double-in-size" growth policies here.
+ */
+ return hashentrysize * dNumGroups;
+ }
/*-------------------------------------------------------------------------
*
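The arithmetic in estimate_hashagg_tablesize can be tried in isolation. The following is a minimal standalone sketch under stated assumptions: TOY_MAXALIGN rounds up to 8 bytes (PostgreSQL's MAXALIGN is platform dependent), and the tuple header size and per-entry overhead are passed in as plain numbers rather than taken from SizeofMinimalTupleHeader and hash_agg_entry_size(). All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed alignment boundary for this sketch. */
#define TOY_ALIGNOF 8
#define TOY_MAXALIGN(len) \
    (((size_t) (len) + (TOY_ALIGNOF - 1)) & ~((size_t) (TOY_ALIGNOF - 1)))

/*
 * Estimate hash table size in bytes: aligned tuple width plus aligned tuple
 * header, plus space for pass-by-reference transition values, plus the
 * per-entry overhead, all multiplied by the estimated number of groups.
 */
static size_t
toy_hashagg_tablesize(size_t tuple_width, size_t tuple_header,
                      size_t transition_space, size_t entry_overhead,
                      double num_groups)
{
    size_t      entry;

    assert(num_groups >= 0);

    entry = TOY_MAXALIGN(tuple_width) + TOY_MAXALIGN(tuple_header);
    entry += transition_space;
    entry += entry_overhead;

    return (size_t) ((double) entry * num_groups);
}
```

With made-up inputs of a 36-byte tuple, a 10-byte header, no transition space, 24 bytes of overhead and 30 groups, each entry is 40 + 16 + 0 + 24 = 80 bytes and the estimate is 2400 bytes. As in the function above, fill factor and growth policy are ignored.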
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
new file mode 100644
index 85c6b61..cf94ccc
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
*************** equalPartitionDescs(PartitionKey key, Pa
*** 1204,1210 ****
if (partdesc2->boundinfo == NULL)
return false;
! if (!partition_bounds_equal(key, partdesc1->boundinfo,
partdesc2->boundinfo))
return false;
}
--- 1204,1212 ----
if (partdesc2->boundinfo == NULL)
return false;
! if (!partition_bounds_equal(key->partnatts, key->parttyplen,
! key->parttypbyval,
! partdesc1->boundinfo,
partdesc2->boundinfo))
return false;
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
new file mode 100644
index a414fb2..343986d
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** static struct config_bool ConfigureNames
*** 914,919 ****
--- 914,928 ----
true,
NULL, NULL, NULL
},
+ {
+ {"enable_partition_wise_join", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables partition-wise join."),
+ NULL
+ },
+ &enable_partition_wise_join,
+ false,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
new file mode 100644
index 421644c..e51bca1
*** a/src/include/catalog/partition.h
--- b/src/include/catalog/partition.h
*************** typedef struct PartitionDispatchData
*** 71,78 ****
typedef struct PartitionDispatchData *PartitionDispatch;
extern void RelationBuildPartitionDesc(Relation relation);
! extern bool partition_bounds_equal(PartitionKey key,
! PartitionBoundInfo p1, PartitionBoundInfo p2);
extern void check_new_partition_bound(char *relname, Relation parent, Node *bound);
extern Oid get_partition_parent(Oid relid);
--- 71,79 ----
typedef struct PartitionDispatchData *PartitionDispatch;
extern void RelationBuildPartitionDesc(Relation relation);
! extern bool partition_bounds_equal(int partnatts, int16 *parttyplen,
! bool *parttypbyval, PartitionBoundInfo b1,
! PartitionBoundInfo b2);
extern void check_new_partition_bound(char *relname, Relation parent, Node *bound);
extern Oid get_partition_parent(Oid relid);
diff --git a/src/include/foreign/fdwapi.h b/src/include/foreign/fdwapi.h
new file mode 100644
index 6ca44f7..c57ff7b
*** a/src/include/foreign/fdwapi.h
--- b/src/include/foreign/fdwapi.h
*************** typedef void (*ShutdownForeignScan_funct
*** 155,160 ****
--- 155,163 ----
typedef bool (*IsForeignScanParallelSafe_function) (PlannerInfo *root,
RelOptInfo *rel,
RangeTblEntry *rte);
+ typedef List *(*ReparameterizeForeignPathByChild_function) (PlannerInfo *root,
+ List *fdw_private,
+ RelOptInfo *child_rel);
/*
* FdwRoutine is the struct returned by a foreign-data wrapper's handler
*************** typedef struct FdwRoutine
*** 226,231 ****
--- 229,237 ----
InitializeDSMForeignScan_function InitializeDSMForeignScan;
InitializeWorkerForeignScan_function InitializeWorkerForeignScan;
ShutdownForeignScan_function ShutdownForeignScan;
+
+ /* Support functions for path reparameterization. */
+ ReparameterizeForeignPathByChild_function ReparameterizeForeignPathByChild;
} FdwRoutine;
diff --git a/src/include/nodes/extensible.h b/src/include/nodes/extensible.h
new file mode 100644
index 0b02cc1..1c802ad
*** a/src/include/nodes/extensible.h
--- b/src/include/nodes/extensible.h
*************** typedef struct CustomPathMethods
*** 96,101 ****
--- 96,104 ----
List *tlist,
List *clauses,
List *custom_plans);
+ struct List *(*ReparameterizeCustomPathByChild) (PlannerInfo *root,
+ List *custom_private,
+ RelOptInfo *child_rel);
} CustomPathMethods;
/*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
new file mode 100644
index f59d719..ba1eac8
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
*************** typedef enum NodeTag
*** 218,223 ****
--- 218,224 ----
T_IndexOptInfo,
T_ForeignKeyOptInfo,
T_ParamPathInfo,
+ T_GroupedPathInfo,
T_Path,
T_IndexPath,
T_BitmapHeapPath,
*************** typedef enum NodeTag
*** 258,267 ****
--- 259,270 ----
T_PathTarget,
T_RestrictInfo,
T_PlaceHolderVar,
+ T_GroupedVar,
T_SpecialJoinInfo,
T_AppendRelInfo,
T_PartitionedChildRelInfo,
T_PlaceHolderInfo,
+ T_GroupedVarInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
T_RollupData,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
new file mode 100644
index 7a8e2fd..b576dd5
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
***************
*** 15,20 ****
--- 15,21 ----
#define RELATION_H
#include "access/sdir.h"
+ #include "catalog/partition.h"
#include "lib/stringinfo.h"
#include "nodes/params.h"
#include "nodes/parsenodes.h"
*************** typedef struct PlannerInfo
*** 256,261 ****
--- 257,264 ----
List *placeholder_list; /* list of PlaceHolderInfos */
+ List *grouped_var_list; /* List of GroupedVarInfos. */
+
List *fkey_list; /* list of ForeignKeyOptInfos */
List *query_pathkeys; /* desired pathkeys for query_planner() */
*************** typedef struct PlannerInfo
*** 265,270 ****
--- 268,276 ----
List *distinct_pathkeys; /* distinctClause pathkeys, if any */
List *sort_pathkeys; /* sortClause pathkeys, if any */
+ List *part_schemes; /* Canonicalised partition schemes
+ * used in the query. */
+
List *initial_rels; /* RelOptInfos we are now trying to join */
/* Use fetch_upper_rel() to get any particular upper rel */
*************** typedef struct PlannerInfo
*** 325,330 ****
--- 331,362 ----
((root)->simple_rte_array ? (root)->simple_rte_array[rti] : \
rt_fetch(rti, (root)->parse->rtable))
+ /*
+ * Partitioning scheme
+ * Structure to hold partitioning scheme for a given relation.
+ *
+ * Multiple relations may be partitioned in the same way. The relations
+ * resulting from joining such relations may be partitioned in the same way as
+ * the joining relations. Similarly, relations derived from such relations by
+ * grouping or sorting may be partitioned in the same way as the underlying
+ * scan relations. All such relations partitioned in the same way share the
+ * partitioning scheme.
+ *
+ * PlannerInfo stores a list of distinct "canonical" partitioning schemes.
+ * RelOptInfo of a partitioned relation holds the pointer to "canonical"
+ * partitioning scheme.
+ */
+ typedef struct PartitionSchemeData
+ {
+ char strategy; /* partition strategy */
+ int16 partnatts; /* number of partition attributes */
+ Oid *partopfamily; /* OIDs of operator families */
+ Oid *partopcintype; /* OIDs of opclass declared input data types */
+ FmgrInfo *partsupfunc; /* lookup info for support funcs */
+ Oid *parttypcoll; /* OIDs of collations of partition keys. */
+ } PartitionSchemeData;
+
+ typedef struct PartitionSchemeData *PartitionScheme;
/*----------
* RelOptInfo
*************** typedef struct PlannerInfo
*** 359,364 ****
--- 391,401 ----
* handling join alias Vars. Currently this is not needed because all join
* alias Vars are expanded to non-aliased form during preprocess_expression.
*
+ * We also have relations representing joins between child relations of
+ * different partitioned tables. These relations are not added to
+ * join_rel_level lists as they are not joined directly by the dynamic
+ * programming algorithm.
+ *
* There is also a RelOptKind for "upper" relations, which are RelOptInfos
* that describe post-scan/join processing steps, such as aggregation.
* Many of the fields in these RelOptInfos are meaningless, but their Path
*************** typedef struct PlannerInfo
*** 401,406 ****
--- 438,445 ----
* direct_lateral_relids - rels this rel has direct LATERAL references to
* lateral_relids - required outer rels for LATERAL, as a Relids set
* (includes both direct and indirect lateral references)
+ * gpi - GroupedPathInfo if the relation can produce grouped paths, NULL
+ * otherwise.
*
* If the relation is a base relation it will have these fields set:
*
*************** typedef struct PlannerInfo
*** 486,491 ****
--- 525,543 ----
* We store baserestrictcost in the RelOptInfo (for base relations) because
* we know we will need it at least once (to price the sequential scan)
* and may need it multiple times to price index scans.
+ *
+ * If the relation is partitioned, these fields will be set:
+ * part_scheme - Partitioning scheme of the relation
+ * nparts - Number of partitions
+ * boundinfo - Partition bounds/lists
+ * part_rels - RelOptInfos of the partition relations
+ * partexprs - Partition key expressions
+ *
+ * Note: A base relation will always have only one set of partition keys. But a
+ * join relation is partitioned by the partition keys of joining relations.
+ * Partition keys are stored as an array of partition key expressions, with
+ * each array element containing a list of one (for a base relation) or more
+ * (as many as the number of joining relations) expressions.
*----------
*/
typedef enum RelOptKind
*************** typedef enum RelOptKind
*** 493,498 ****
--- 545,551 ----
RELOPT_BASEREL,
RELOPT_JOINREL,
RELOPT_OTHER_MEMBER_REL,
+ RELOPT_OTHER_JOINREL,
RELOPT_UPPER_REL,
RELOPT_DEADREL
} RelOptKind;
*************** typedef enum RelOptKind
*** 506,518 ****
(rel)->reloptkind == RELOPT_OTHER_MEMBER_REL)
/* Is the given relation a join relation? */
! #define IS_JOIN_REL(rel) ((rel)->reloptkind == RELOPT_JOINREL)
/* Is the given relation an upper relation? */
#define IS_UPPER_REL(rel) ((rel)->reloptkind == RELOPT_UPPER_REL)
/* Is the given relation an "other" relation? */
! #define IS_OTHER_REL(rel) ((rel)->reloptkind == RELOPT_OTHER_MEMBER_REL)
typedef struct RelOptInfo
{
--- 559,575 ----
(rel)->reloptkind == RELOPT_OTHER_MEMBER_REL)
/* Is the given relation a join relation? */
! #define IS_JOIN_REL(rel) \
! ((rel)->reloptkind == RELOPT_JOINREL || \
! (rel)->reloptkind == RELOPT_OTHER_JOINREL)
/* Is the given relation an upper relation? */
#define IS_UPPER_REL(rel) ((rel)->reloptkind == RELOPT_UPPER_REL)
/* Is the given relation an "other" relation? */
! #define IS_OTHER_REL(rel) \
! ((rel)->reloptkind == RELOPT_OTHER_MEMBER_REL || \
! (rel)->reloptkind == RELOPT_OTHER_JOINREL)
typedef struct RelOptInfo
{
*************** typedef struct RelOptInfo
*** 548,553 ****
--- 605,613 ----
Relids direct_lateral_relids; /* rels directly laterally referenced */
Relids lateral_relids; /* minimum parameterization of rel */
+ /* Information needed to produce grouped paths. */
+ struct GroupedPathInfo *gpi;
+
/* information about a base rel (not set for join rels!) */
Index relid;
Oid reltablespace; /* containing tablespace */
*************** typedef struct RelOptInfo
*** 566,571 ****
--- 626,632 ----
PlannerInfo *subroot; /* if subquery */
List *subplan_params; /* if subquery */
int rel_parallel_workers; /* wanted number of parallel workers */
+ Oid *part_oids; /* OIDs of partitions */
/* Information about foreign tables and foreign joins */
Oid serverid; /* identifies server for the table or join */
*************** typedef struct RelOptInfo
*** 591,596 ****
--- 652,673 ----
/* used by "other" relations */
Relids top_parent_relids; /* Relids of topmost parents */
+
+ /* For all the partitioned relations. */
+ PartitionScheme part_scheme; /* Partitioning scheme. */
+ int nparts; /* number of partitions */
+ PartitionBoundInfo boundinfo; /* Partition bounds/lists */
+ struct RelOptInfo **part_rels; /* Array of RelOptInfos of partitions,
+ * stored in the same order as bounds
+ * or lists in PartitionScheme.
+ */
+ List **partexprs; /* Array of list of partition key
+ * expressions. For base relations
+ * these are one element lists. For
+ * join there may be as many elements
+ * as the number of joining
+ * relations.
+ */
} RelOptInfo;
/*
*************** typedef struct ParamPathInfo
*** 913,918 ****
--- 990,1017 ----
List *ppi_clauses; /* join clauses available from outer rels */
} ParamPathInfo;
+ /*
+ * GroupedPathInfo
+ *
+ * If RelOptInfo points to this structure, grouped paths can be created for
+ * it.
+ *
+ * "target" will be used as pathtarget of grouped paths produced by this
+ * relation. Grouped path is either a result of aggregation of the relation
+ * that owns this structure or, if the owning relation is a join, a join path
+ * whose one side is a grouped path and the other is a plain (i.e. not
+ * grouped) one. (Two grouped paths cannot be joined in general because
+ * grouping of one side of the join essentially reduces occurrence of groups
+ * of the other side in the input of the final aggregation.)
+ */
+ typedef struct GroupedPathInfo
+ {
+ NodeTag type;
+
+ PathTarget *target; /* output of grouped paths. */
+ List *pathlist; /* List of grouped paths. */
+ List *partial_pathlist; /* List of partial grouped paths. */
+ } GroupedPathInfo;
/*
* Type "Path" is used as-is for sequential-scan paths, as well as some other
*************** typedef struct PlaceHolderVar
*** 1852,1857 ****
--- 1951,1989 ----
Index phlevelsup; /* > 0 if PHV belongs to outer query */
} PlaceHolderVar;
+
+ /*
+ * Similar to the concept of PlaceHolderVar, we treat aggregates and grouping
+ * columns as special variables if grouping is possible below the top-level
+ * join. The reason is that aggregates having star as the argument can be
+ * evaluated at various places in the join tree (i.e. cannot be assigned to
+ * target list of exactly one relation). Also this concept seems to be less
+ * invasive than adding the grouped vars to reltarget (in which case
+ * attr_needed and attr_widths arrays of RelOptInfo would also need
+ * additional changes).
+ *
+ * gvexpr is a pointer to the gvexpr field of the corresponding
+ * GroupedVarInfo instance. It's there for the sake of exprType(),
+ * exprCollation(), etc.
+ *
+ * agg_partial also points to the corresponding field of GroupedVarInfo if the
+ * GroupedVar is in the target of a parent relation (RELOPT_BASEREL). However
+ * within a child relation's (RELOPT_OTHER_MEMBER_REL) target it points to a
+ * copy which has argument expressions translated, so they no longer reference
+ * the parent.
+ *
+ * XXX Currently we only create GroupedVar for aggregates, but sometime we can
+ * do it for grouping keys as well. That would allow grouping below the
+ * top-level join by keys other than plain Var.
+ */
+ typedef struct GroupedVar
+ {
+ Expr xpr;
+ Expr *gvexpr; /* the represented expression */
+ Aggref *agg_partial; /* partial aggregate if gvexpr is aggregate */
+ Index gvid; /* GroupedVarInfo */
+ } GroupedVar;
+
/*
* "Special join" info.
*
*************** typedef struct PlaceHolderInfo
*** 2067,2072 ****
--- 2199,2220 ----
} PlaceHolderInfo;
/*
+ * Likewise, GroupedVarInfo exists for each distinct GroupedVar.
+ */
+ typedef struct GroupedVarInfo
+ {
+ NodeTag type;
+
+ Index gvid; /* GroupedVar.gvid */
+ Expr *gvexpr; /* the represented expression. */
+ Aggref *agg_partial; /* if gvexpr is aggregate, agg_partial is
+ * the corresponding partial aggregate */
+ Relids gv_eval_at; /* lowest level we can evaluate the expression
+ * at or NULL if it can happen anywhere. */
+ int32 gv_width; /* estimated width of the expression */
+ } GroupedVarInfo;
+
+ /*
* This struct describes one potentially index-optimizable MIN/MAX aggregate
* function. MinMaxAggPath contains a list of these, and if we accept that
* path, the list is stored into root->minmax_aggs for use during setrefs.c.
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
new file mode 100644
index ed70def..ca06455
*** a/src/include/optimizer/cost.h
--- b/src/include/optimizer/cost.h
*************** extern bool enable_material;
*** 67,72 ****
--- 67,73 ----
extern bool enable_mergejoin;
extern bool enable_hashjoin;
extern bool enable_gathermerge;
+ extern bool enable_partition_wise_join;
extern int constraint_exclusion;
extern double clamp_row_est(double nrows);
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
new file mode 100644
index 77bc770..4a0d845
*** a/src/include/optimizer/pathnode.h
--- b/src/include/optimizer/pathnode.h
*************** extern int compare_path_costs(Path *path
*** 25,37 ****
extern int compare_fractional_path_costs(Path *path1, Path *path2,
double fraction);
extern void set_cheapest(RelOptInfo *parent_rel);
! extern void add_path(RelOptInfo *parent_rel, Path *new_path);
extern bool add_path_precheck(RelOptInfo *parent_rel,
Cost startup_cost, Cost total_cost,
! List *pathkeys, Relids required_outer);
! extern void add_partial_path(RelOptInfo *parent_rel, Path *new_path);
extern bool add_partial_path_precheck(RelOptInfo *parent_rel,
! Cost total_cost, List *pathkeys);
extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
Relids required_outer, int parallel_workers);
--- 25,39 ----
extern int compare_fractional_path_costs(Path *path1, Path *path2,
double fraction);
extern void set_cheapest(RelOptInfo *parent_rel);
! extern void add_path(RelOptInfo *parent_rel, Path *new_path, bool grouped);
extern bool add_path_precheck(RelOptInfo *parent_rel,
Cost startup_cost, Cost total_cost,
! List *pathkeys, Relids required_outer, bool grouped);
! extern void add_partial_path(RelOptInfo *parent_rel, Path *new_path,
! bool grouped);
extern bool add_partial_path_precheck(RelOptInfo *parent_rel,
! Cost total_cost, List *pathkeys,
! bool grouped);
extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
Relids required_outer, int parallel_workers);
*************** extern ForeignPath *create_foreignscan_p
*** 112,118 ****
Path *fdw_outerpath,
List *fdw_private);
! extern Relids calc_nestloop_required_outer(Path *outer_path, Path *inner_path);
extern Relids calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path);
extern NestPath *create_nestloop_path(PlannerInfo *root,
--- 114,123 ----
Path *fdw_outerpath,
List *fdw_private);
! extern Relids calc_nestloop_required_outer(Relids outerrelids,
! Relids outer_paramrels,
! Relids innerrelids,
! Relids inner_paramrels);
extern Relids calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path);
extern NestPath *create_nestloop_path(PlannerInfo *root,
*************** extern NestPath *create_nestloop_path(Pl
*** 124,130 ****
Path *inner_path,
List *restrict_clauses,
List *pathkeys,
! Relids required_outer);
extern MergePath *create_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
--- 129,136 ----
Path *inner_path,
List *restrict_clauses,
List *pathkeys,
! Relids required_outer,
! PathTarget *target);
extern MergePath *create_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
*************** extern MergePath *create_mergejoin_path(
*** 138,144 ****
Relids required_outer,
List *mergeclauses,
List *outersortkeys,
! List *innersortkeys);
extern HashPath *create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
--- 144,151 ----
Relids required_outer,
List *mergeclauses,
List *outersortkeys,
! List *innersortkeys,
! PathTarget *target);
extern HashPath *create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
*************** extern HashPath *create_hashjoin_path(Pl
*** 149,155 ****
Path *inner_path,
List *restrict_clauses,
Relids required_outer,
! List *hashclauses);
extern ProjectionPath *create_projection_path(PlannerInfo *root,
RelOptInfo *rel,
--- 156,163 ----
Path *inner_path,
List *restrict_clauses,
Relids required_outer,
! List *hashclauses,
! PathTarget *target);
extern ProjectionPath *create_projection_path(PlannerInfo *root,
RelOptInfo *rel,
*************** extern AggPath *create_agg_path(PlannerI
*** 190,195 ****
--- 198,217 ----
List *qual,
const AggClauseCosts *aggcosts,
double numGroups);
+ extern AggPath *create_partial_agg_sorted_path(PlannerInfo *root,
+ Path *subpath,
+ bool first_call,
+ List **group_clauses,
+ List **group_exprs,
+ List **agg_exprs,
+ double input_rows);
+ extern AggPath *create_partial_agg_hashed_path(PlannerInfo *root,
+ Path *subpath,
+ bool first_call,
+ List **group_clauses,
+ List **group_exprs,
+ List **agg_exprs,
+ double input_rows);
extern GroupingSetsPath *create_groupingsets_path(PlannerInfo *root,
RelOptInfo *rel,
Path *subpath,
*************** extern LimitPath *create_limit_path(Plan
*** 248,253 ****
--- 270,277 ----
extern Path *reparameterize_path(PlannerInfo *root, Path *path,
Relids required_outer,
double loop_count);
+ extern Path *reparameterize_path_by_child(PlannerInfo *root, Path *path,
+ RelOptInfo *child_rel);
/*
* prototypes for relnode.c
*************** extern ParamPathInfo *get_joinrel_paramp
*** 285,289 ****
--- 309,320 ----
List **restrict_clauses);
extern ParamPathInfo *get_appendrel_parampathinfo(RelOptInfo *appendrel,
Relids required_outer);
+ extern ParamPathInfo *find_param_path_info(RelOptInfo *rel,
+ Relids required_outer);
+ extern void prepare_rel_for_grouping(PlannerInfo *root, RelOptInfo *rel);
+ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
+ RelOptInfo *outer_rel, RelOptInfo *inner_rel,
+ RelOptInfo *parent_joinrel, List *restrictlist,
+ SpecialJoinInfo *sjinfo, JoinType jointype);
#endif /* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
new file mode 100644
index 25fe78c..8dd4efd
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** extern void set_dummy_rel_pathlist(RelOp
*** 53,63 ****
extern RelOptInfo *standard_join_search(PlannerInfo *root, int levels_needed,
List *initial_rels);
! extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel);
extern int compute_parallel_worker(RelOptInfo *rel, double heap_pages,
double index_pages);
extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
Path *bitmapqual);
#ifdef OPTIMIZER_DEBUG
extern void debug_print_rel(PlannerInfo *root, RelOptInfo *rel);
--- 53,69 ----
extern RelOptInfo *standard_join_search(PlannerInfo *root, int levels_needed,
List *initial_rels);
! extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
! bool grouped);
! extern void create_grouped_path(PlannerInfo *root, RelOptInfo *rel,
! Path *subpath, bool precheck, bool partial,
! AggStrategy aggstrategy);
extern int compute_parallel_worker(RelOptInfo *rel, double heap_pages,
double index_pages);
extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
Path *bitmapqual);
+ extern void generate_partition_wise_join_paths(PlannerInfo *root,
+ RelOptInfo *rel);
#ifdef OPTIMIZER_DEBUG
extern void debug_print_rel(PlannerInfo *root, RelOptInfo *rel);
*************** extern void debug_print_rel(PlannerInfo
*** 67,73 ****
* indxpath.c
* routines to generate index paths
*/
! extern void create_index_paths(PlannerInfo *root, RelOptInfo *rel);
extern bool relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
List *restrictlist,
List *exprlist, List *oprlist);
--- 73,80 ----
* indxpath.c
* routines to generate index paths
*/
! extern void create_index_paths(PlannerInfo *root, RelOptInfo *rel,
! bool grouped);
extern bool relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
List *restrictlist,
List *exprlist, List *oprlist);
*************** extern bool have_join_order_restriction(
*** 111,116 ****
--- 118,126 ----
RelOptInfo *rel1, RelOptInfo *rel2);
extern bool have_dangerous_phv(PlannerInfo *root,
Relids outer_relids, Relids inner_params);
+ extern void mark_dummy_rel(RelOptInfo *rel);
+ extern bool have_partkey_equi_join(RelOptInfo *rel1, RelOptInfo *rel2,
+ JoinType jointype, List *restrictlist, bool *is_strict);
/*
* equivclass.c
diff --git a/src/include/optimizer/placeholder.h b/src/include/optimizer/placeholder.h
new file mode 100644
index 11e6403..8598268
*** a/src/include/optimizer/placeholder.h
--- b/src/include/optimizer/placeholder.h
*************** extern void fix_placeholder_input_needed
*** 28,32 ****
--- 28,34 ----
extern void add_placeholders_to_base_rels(PlannerInfo *root);
extern void add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *inner_rel);
+ extern void add_placeholders_to_child_joinrel(PlannerInfo *root,
+ RelOptInfo *childrel, RelOptInfo *parentrel);
#endif /* PLACEHOLDER_H */
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
new file mode 100644
index 5df68a2..07bc4c0
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
*************** extern int join_collapse_limit;
*** 74,80 ****
extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
! Relids where_needed, bool create_new_ph);
extern void find_lateral_references(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
extern List *deconstruct_jointree(PlannerInfo *root);
--- 74,82 ----
extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
! Relids where_needed, bool create_new_ph);
! extern void add_grouping_info_to_base_rels(PlannerInfo *root);
! extern void add_grouped_vars_to_rels(PlannerInfo *root);
extern void find_lateral_references(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
extern List *deconstruct_jointree(PlannerInfo *root);
diff --git a/src/include/optimizer/planner.h b/src/include/optimizer/planner.h
new file mode 100644
index f3aaa23..4a550bb
*** a/src/include/optimizer/planner.h
--- b/src/include/optimizer/planner.h
*************** extern Expr *preprocess_phv_expression(P
*** 58,62 ****
--- 58,64 ----
extern bool plan_cluster_use_sort(Oid tableOid, Oid indexOid);
extern List *get_partitioned_child_rels(PlannerInfo *root, Index rti);
+ extern List *get_partitioned_child_rels_for_join(PlannerInfo *root,
+ RelOptInfo *joinrel);
#endif /* PLANNER_H */
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
new file mode 100644
index 2b20b36..95802c9
*** a/src/include/optimizer/prep.h
--- b/src/include/optimizer/prep.h
*************** extern RelOptInfo *plan_set_operations(P
*** 53,61 ****
extern void expand_inherited_tables(PlannerInfo *root);
extern Node *adjust_appendrel_attrs(PlannerInfo *root, Node *node,
! AppendRelInfo *appinfo);
extern Node *adjust_appendrel_attrs_multilevel(PlannerInfo *root, Node *node,
! RelOptInfo *child_rel);
#endif /* PREP_H */
--- 53,74 ----
extern void expand_inherited_tables(PlannerInfo *root);
extern Node *adjust_appendrel_attrs(PlannerInfo *root, Node *node,
! int nappinfos, AppendRelInfo **appinfos);
extern Node *adjust_appendrel_attrs_multilevel(PlannerInfo *root, Node *node,
! Relids child_relids,
! Relids top_parent_relids);
!
! extern Relids adjust_child_relids(Relids relids, int nappinfos,
! AppendRelInfo **appinfos);
!
! extern AppendRelInfo **find_appinfos_by_relids(PlannerInfo *root,
! Relids relids, int *nappinfos);
!
! extern SpecialJoinInfo *build_child_join_sjinfo(PlannerInfo *root,
! SpecialJoinInfo *parent_sjinfo,
! Relids left_relids, Relids right_relids);
! extern Relids adjust_child_relids_multilevel(PlannerInfo *root, Relids relids,
! Relids child_relids, Relids top_parent_relids);
#endif /* PREP_H */
diff --git a/src/include/optimizer/tlist.h b/src/include/optimizer/tlist.h
new file mode 100644
index ccb93d8..ddea03c
*** a/src/include/optimizer/tlist.h
--- b/src/include/optimizer/tlist.h
*************** extern Node *get_sortgroupclause_expr(So
*** 41,46 ****
--- 41,49 ----
List *targetList);
extern List *get_sortgrouplist_exprs(List *sgClauses,
List *targetList);
+ extern void get_grouping_expressions(PlannerInfo *root, PathTarget *target,
+ List **grouping_clauses,
+ List **grouping_exprs, List **agg_exprs);
extern SortGroupClause *get_sortgroupref_clause(Index sortref,
List *clauses);
*************** extern void split_pathtarget_at_srfs(Pla
*** 65,70 ****
--- 68,84 ----
PathTarget *target, PathTarget *input_target,
List **targets, List **targets_contain_srfs);
+ /* TODO Find the best location (position and in some cases even file) for the
+ * following ones. */
+ extern List *restore_grouping_expressions(PlannerInfo *root, List *src);
+ extern List *add_aggregates_to_target(PlannerInfo *root, PathTarget *target,
+ List *aggregates, RelOptInfo *rel);
+ extern Index get_expr_sortgroupref(PlannerInfo *root, Expr *expr);
+ /* TODO Move definition from initsplan.c to tlist.c. */
+ extern PathTarget *create_grouped_target(PlannerInfo *root, RelOptInfo *rel,
+ Relids rel_agg_attrs,
+ List *rel_agg_vars);
+
/* Convenience macro to get a PathTarget with valid cost/width fields */
#define create_pathtarget(root, tlist) \
set_pathtarget_cost_width(root, make_pathtarget_from_tlist(tlist))
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
new file mode 100644
index 9f9d2dc..e05e6f6
*** a/src/include/utils/selfuncs.h
--- b/src/include/utils/selfuncs.h
*************** extern double estimate_num_groups(Planne
*** 206,211 ****
--- 206,214 ----
extern Selectivity estimate_hash_bucketsize(PlannerInfo *root, Node *hashkey,
double nbuckets);
+ extern Size estimate_hashagg_tablesize(Path *path,
+ const AggClauseCosts *agg_costs,
+ double dNumGroups);
extern List *deconstruct_indexquals(IndexPath *path);
extern void genericcostestimate(PlannerInfo *root, IndexPath *path,
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
new file mode 100644
index 6163ed8..7a969f2
*** a/src/test/regress/expected/inherit.out
--- b/src/test/regress/expected/inherit.out
*************** select tableoid::regclass::text as relna
*** 625,630 ****
--- 625,652 ----
(3 rows)
drop table parted_tab;
+ -- Check UPDATE with *multi-level partitioned* inherited target
+ create table mlparted_tab (a int, b char, c text) partition by list (a);
+ create table mlparted_tab_part1 partition of mlparted_tab for values in (1);
+ create table mlparted_tab_part2 partition of mlparted_tab for values in (2) partition by list (b);
+ create table mlparted_tab_part3 partition of mlparted_tab for values in (3);
+ create table mlparted_tab_part2a partition of mlparted_tab_part2 for values in ('a');
+ create table mlparted_tab_part2b partition of mlparted_tab_part2 for values in ('b');
+ insert into mlparted_tab values (1, 'a'), (2, 'a'), (2, 'b'), (3, 'a');
+ update mlparted_tab mlp set c = 'xxx'
+ from
+ (select a from some_tab union all select a+1 from some_tab) ss (a)
+ where (mlp.a = ss.a and mlp.b = 'b') or mlp.a = 3;
+ select tableoid::regclass::text as relname, mlparted_tab.* from mlparted_tab order by 1,2;
+ relname | a | b | c
+ ---------------------+---+---+-----
+ mlparted_tab_part1 | 1 | a |
+ mlparted_tab_part2a | 2 | a |
+ mlparted_tab_part2b | 2 | b | xxx
+ mlparted_tab_part3 | 3 | a | xxx
+ (4 rows)
+
+ drop table mlparted_tab;
drop table some_tab cascade;
NOTICE: drop cascades to table some_tab_child
/* Test multiple inheritance of column defaults */
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
new file mode 100644
index 568b783..cd1f7f3
*** a/src/test/regress/expected/sysviews.out
--- b/src/test/regress/expected/sysviews.out
*************** select count(*) >= 0 as ok from pg_prepa
*** 70,90 ****
-- This is to record the prevailing planner enable_foo settings during
-- a regression test run.
select name, setting from pg_settings where name like 'enable%';
! name | setting
! ----------------------+---------
! enable_bitmapscan | on
! enable_gathermerge | on
! enable_hashagg | on
! enable_hashjoin | on
! enable_indexonlyscan | on
! enable_indexscan | on
! enable_material | on
! enable_mergejoin | on
! enable_nestloop | on
! enable_seqscan | on
! enable_sort | on
! enable_tidscan | on
! (12 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
--- 70,91 ----
-- This is to record the prevailing planner enable_foo settings during
-- a regression test run.
select name, setting from pg_settings where name like 'enable%';
! name | setting
! ----------------------------+---------
! enable_bitmapscan | on
! enable_gathermerge | on
! enable_hashagg | on
! enable_hashjoin | on
! enable_indexonlyscan | on
! enable_indexscan | on
! enable_material | on
! enable_mergejoin | on
! enable_nestloop | on
! enable_partition_wise_join | off
! enable_seqscan | on
! enable_sort | on
! enable_tidscan | on
! (13 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
new file mode 100644
index 1f8f098..2d14885
*** a/src/test/regress/parallel_schedule
--- b/src/test/regress/parallel_schedule
*************** test: publication subscription
*** 103,109 ****
# ----------
# Another group of parallel tests
# ----------
! test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock json jsonb json_encoding indirect_toast equivclass
# ----------
# Another group of parallel tests
# NB: temp.sql does a reconnect which transiently uses 2 connections,
--- 103,109 ----
# ----------
# Another group of parallel tests
# ----------
! test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock json jsonb json_encoding indirect_toast equivclass partition_join multi_level_partition_join
# ----------
# Another group of parallel tests
# NB: temp.sql does a reconnect which transiently uses 2 connections,
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
new file mode 100644
index 04206c3..9ac24dd
*** a/src/test/regress/serial_schedule
--- b/src/test/regress/serial_schedule
*************** test: with
*** 179,181 ****
--- 179,183 ----
test: xml
test: event_trigger
test: stats
+ test: partition_join
+ test: multi_level_partition_join
diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql
new file mode 100644
index d43b75c..b814a4c
*** a/src/test/regress/sql/inherit.sql
--- b/src/test/regress/sql/inherit.sql
*************** where parted_tab.a = ss.a;
*** 154,159 ****
--- 154,176 ----
select tableoid::regclass::text as relname, parted_tab.* from parted_tab order by 1,2;
drop table parted_tab;
+
+ -- Check UPDATE with *multi-level partitioned* inherited target
+ create table mlparted_tab (a int, b char, c text) partition by list (a);
+ create table mlparted_tab_part1 partition of mlparted_tab for values in (1);
+ create table mlparted_tab_part2 partition of mlparted_tab for values in (2) partition by list (b);
+ create table mlparted_tab_part3 partition of mlparted_tab for values in (3);
+ create table mlparted_tab_part2a partition of mlparted_tab_part2 for values in ('a');
+ create table mlparted_tab_part2b partition of mlparted_tab_part2 for values in ('b');
+ insert into mlparted_tab values (1, 'a'), (2, 'a'), (2, 'b'), (3, 'a');
+
+ update mlparted_tab mlp set c = 'xxx'
+ from
+ (select a from some_tab union all select a+1 from some_tab) ss (a)
+ where (mlp.a = ss.a and mlp.b = 'b') or mlp.a = 3;
+ select tableoid::regclass::text as relname, mlparted_tab.* from mlparted_tab order by 1,2;
+
+ drop table mlparted_tab;
drop table some_tab cascade;
/* Test multiple inheritance of column defaults */
On Wed, Apr 26, 2017 at 6:28 AM, Antonin Houska <ah@cybertec.at> wrote:
Attached is a diff that contains both patches merged. This is just to prove my
assumption, details to be elaborated later. The scripts attached produce the
following plan in my environment:
QUERY PLAN
------------------------------------------------
Parallel Finalize HashAggregate
Group Key: b_1.j
-> Append
-> Parallel Partial HashAggregate
Group Key: b_1.j
-> Hash Join
Hash Cond: (b_1.j = c_1.k)
-> Seq Scan on b_1
-> Hash
-> Seq Scan on c_1
-> Parallel Partial HashAggregate
Group Key: b_2.j
-> Hash Join
Hash Cond: (b_2.j = c_2.k)
-> Seq Scan on b_2
-> Hash
-> Seq Scan on c_2
Well, I'm confused. I see that there's a relationship between what
Antonin is trying to do and what Jeevan is trying to do, but I can't
figure out whether one is a subset of the other, whether they're both
orthogonal, or something else. This plan looks similar to what I
would expect Jeevan's patch to produce, except I have no idea what
"Parallel" would mean in a plan that contains no Gather node.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Apr 26, 2017 at 6:28 AM, Antonin Houska <ah@cybertec.at> wrote:
Attached is a diff that contains both patches merged. This is just to prove my
assumption, details to be elaborated later. The scripts attached produce the
following plan in my environment:
QUERY PLAN
------------------------------------------------
Parallel Finalize HashAggregate
Group Key: b_1.j
-> Append
-> Parallel Partial HashAggregate
Group Key: b_1.j
-> Hash Join
Hash Cond: (b_1.j = c_1.k)
-> Seq Scan on b_1
-> Hash
-> Seq Scan on c_1
-> Parallel Partial HashAggregate
Group Key: b_2.j
-> Hash Join
Hash Cond: (b_2.j = c_2.k)
-> Seq Scan on b_2
-> Hash
-> Seq Scan on c_2
Well, I'm confused. I see that there's a relationship between what
Antonin is trying to do and what Jeevan is trying to do, but I can't
figure out whether one is a subset of the other, whether they're both
orthogonal, or something else. This plan looks similar to what I
would expect Jeevan's patch to produce,
The point is that the patch Jeevan wanted to work on is actually a subset of
[1]: /messages/by-id/9666.1491295317@localhost
except I have no idea what "Parallel" would mean in a plan that contains no
Gather node.
The parallel_aware field was mistakenly set on the AggPath. A fixed patch is
attached below, producing this plan:
QUERY PLAN
------------------------------------------------
Finalize HashAggregate
Group Key: b_1.j
-> Append
-> Partial HashAggregate
Group Key: b_1.j
-> Hash Join
Hash Cond: (b_1.j = c_1.k)
-> Seq Scan on b_1
-> Hash
-> Seq Scan on c_1
-> Partial HashAggregate
Group Key: b_2.j
-> Hash Join
Hash Cond: (b_2.j = c_2.k)
-> Seq Scan on b_2
-> Hash
-> Seq Scan on c_2
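
To make the shape of this plan concrete, here is a minimal standalone sketch of the partial/finalize split for count(*). It is not code from the patch: the names PartialCounts, partial_count, finalize_count and the fixed NGROUPS key domain are hypothetical, chosen only for illustration. Each partition is aggregated on its own (the "Partial HashAggregate" nodes), and because the same group key may show up in more than one partition, the per-partition states are combined afterwards (the "Finalize HashAggregate" above the Append).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical assumption: group keys are small ints in 0..NGROUPS-1. */
#define NGROUPS 4

/* Partial state for count(*): one running count per group key. */
typedef struct PartialCounts
{
	long		count[NGROUPS];
} PartialCounts;

/* Plays the role of "Partial HashAggregate": aggregate one partition only. */
static void
partial_count(const int *keys, size_t nrows, PartialCounts *out)
{
	memset(out, 0, sizeof(*out));
	for (size_t i = 0; i < nrows; i++)
	{
		assert(keys[i] >= 0 && keys[i] < NGROUPS);
		out->count[keys[i]]++;
	}
}

/*
 * Plays the role of "Finalize HashAggregate": combine the partial states
 * that arrive from all partitions (the Append node in the plan).
 */
static void
finalize_count(const PartialCounts *parts, size_t nparts, PartialCounts *out)
{
	memset(out, 0, sizeof(*out));
	for (size_t p = 0; p < nparts; p++)
		for (int g = 0; g < NGROUPS; g++)
			out->count[g] += parts[p].count[g];
}
```

A caller would run partial_count() once per partition and pass the resulting array of states to finalize_count(). If the grouping key were the partition key, no group could span partitions and the finalize step would reduce to concatenating the per-partition results.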
[1]: /messages/by-id/9666.1491295317@localhost
[2]: https://commitfest.postgresql.org/14/994/
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
Attachments:
agg_pushdown_partition_wise_v2.diff (text/x-diff)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
new file mode 100644
index d1bc5b0..0f782dc
*** a/contrib/postgres_fdw/expected/postgres_fdw.out
--- b/contrib/postgres_fdw/expected/postgres_fdw.out
*************** AND ftoptions @> array['fetch_size=60000
*** 7248,7250 ****
--- 7248,7370 ----
(1 row)
ROLLBACK;
+ -- ===================================================================
+ -- test partition-wise-joins
+ -- ===================================================================
+ SET enable_partition_wise_join=on;
+ CREATE TABLE fprt1 (a int, b int, c varchar) PARTITION BY RANGE(a);
+ CREATE TABLE fprt1_p1 (LIKE fprt1);
+ CREATE TABLE fprt1_p2 (LIKE fprt1);
+ INSERT INTO fprt1_p1 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(0, 249, 2) i;
+ INSERT INTO fprt1_p2 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(250, 499, 2) i;
+ CREATE FOREIGN TABLE ftprt1_p1 PARTITION OF fprt1 FOR VALUES FROM (0) TO (250)
+ SERVER loopback OPTIONS (table_name 'fprt1_p1', use_remote_estimate 'true');
+ CREATE FOREIGN TABLE ftprt1_p2 PARTITION OF fprt1 FOR VALUES FROM (250) TO (500)
+ SERVER loopback OPTIONS (TABLE_NAME 'fprt1_p2');
+ ANALYZE fprt1;
+ ANALYZE fprt1_p1;
+ ANALYZE fprt1_p2;
+ CREATE TABLE fprt2 (a int, b int, c varchar) PARTITION BY RANGE(b);
+ CREATE TABLE fprt2_p1 (LIKE fprt2);
+ CREATE TABLE fprt2_p2 (LIKE fprt2);
+ INSERT INTO fprt2_p1 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(0, 249, 3) i;
+ INSERT INTO fprt2_p2 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(250, 499, 3) i;
+ CREATE FOREIGN TABLE ftprt2_p1 PARTITION OF fprt2 FOR VALUES FROM (0) TO (250)
+ SERVER loopback OPTIONS (table_name 'fprt2_p1', use_remote_estimate 'true');
+ CREATE FOREIGN TABLE ftprt2_p2 PARTITION OF fprt2 FOR VALUES FROM (250) TO (500)
+ SERVER loopback OPTIONS (table_name 'fprt2_p2', use_remote_estimate 'true');
+ ANALYZE fprt2;
+ ANALYZE fprt2_p1;
+ ANALYZE fprt2_p2;
+ -- inner join three tables
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER JOIN fprt1 t3 ON (t2.b = t3.a) WHERE t1.a % 25 =0 ORDER BY 1,2,3;
+ QUERY PLAN
+ --------------------------------------------------------------------------------------------------------------------
+ Sort
+ Sort Key: t1.a, t3.c
+ -> Append
+ -> Foreign Scan
+ Relations: ((public.ftprt1_p1 t1) INNER JOIN (public.ftprt2_p1 t2)) INNER JOIN (public.ftprt1_p1 t3)
+ -> Foreign Scan
+ Relations: ((public.ftprt1_p2 t1) INNER JOIN (public.ftprt2_p2 t2)) INNER JOIN (public.ftprt1_p2 t3)
+ (7 rows)
+
+ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER JOIN fprt1 t3 ON (t2.b = t3.a) WHERE t1.a % 25 =0 ORDER BY 1,2,3;
+ a | b | c
+ -----+-----+------
+ 0 | 0 | 0000
+ 150 | 150 | 0003
+ 250 | 250 | 0005
+ 400 | 400 | 0008
+ (4 rows)
+
+ -- left outer join + nullable clasue
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
+ QUERY PLAN
+ -----------------------------------------------------------------------------------
+ Sort
+ Sort Key: t1.a, ftprt2_p1.b, ftprt2_p1.c
+ -> Append
+ -> Foreign Scan
+ Relations: (public.ftprt1_p1 t1) LEFT JOIN (public.ftprt2_p1 fprt2)
+ (5 rows)
+
+ SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
+ a | b | c
+ ---+---+------
+ 0 | 0 | 0000
+ 2 | |
+ 4 | |
+ 6 | 6 | 0000
+ 8 | |
+ (5 rows)
+
+ -- with whole-row reference
+ EXPLAIN (COSTS OFF)
+ SELECT t1,t2 FROM fprt1 t1 JOIN fprt2 t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a % 25 =0 ORDER BY 1,2;
+ QUERY PLAN
+ ---------------------------------------------------------------------------------
+ Sort
+ Sort Key: ((t1.*)::fprt1), ((t2.*)::fprt2)
+ -> Append
+ -> Foreign Scan
+ Relations: (public.ftprt1_p1 t1) INNER JOIN (public.ftprt2_p1 t2)
+ -> Foreign Scan
+ Relations: (public.ftprt1_p2 t1) INNER JOIN (public.ftprt2_p2 t2)
+ (7 rows)
+
+ SELECT t1,t2 FROM fprt1 t1 JOIN fprt2 t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a % 25 =0 ORDER BY 1,2;
+ t1 | t2
+ ----------------+----------------
+ (0,0,0000) | (0,0,0000)
+ (150,150,0003) | (150,150,0003)
+ (250,250,0005) | (250,250,0005)
+ (400,400,0008) | (400,400,0008)
+ (4 rows)
+
+ -- join with lateral reference
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t1.b FROM fprt1 t1, LATERAL (SELECT t2.a, t2.b FROM fprt2 t2 WHERE t1.a = t2.b AND t1.b = t2.a) q WHERE t1.a%25 = 0 ORDER BY 1,2;
+ QUERY PLAN
+ ---------------------------------------------------------------------------------
+ Sort
+ Sort Key: t1.a, t1.b
+ -> Append
+ -> Foreign Scan
+ Relations: (public.ftprt1_p1 t1) INNER JOIN (public.ftprt2_p1 t2)
+ -> Foreign Scan
+ Relations: (public.ftprt1_p2 t1) INNER JOIN (public.ftprt2_p2 t2)
+ (7 rows)
+
+ SELECT t1.a,t1.b FROM fprt1 t1, LATERAL (SELECT t2.a, t2.b FROM fprt2 t2 WHERE t1.a = t2.b AND t1.b = t2.a) q WHERE t1.a%25 = 0 ORDER BY 1,2;
+ a | b
+ -----+-----
+ 0 | 0
+ 150 | 150
+ 250 | 250
+ 400 | 400
+ (4 rows)
+
+ RESET enable_partition_wise_join;
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
new file mode 100644
index 509bb54..76a0551
*** a/contrib/postgres_fdw/sql/postgres_fdw.sql
--- b/contrib/postgres_fdw/sql/postgres_fdw.sql
*************** WHERE ftrelid = 'table30000'::regclass
*** 1717,1719 ****
--- 1717,1772 ----
AND ftoptions @> array['fetch_size=60000'];
ROLLBACK;
+
+ -- ===================================================================
+ -- test partition-wise-joins
+ -- ===================================================================
+ SET enable_partition_wise_join=on;
+
+ CREATE TABLE fprt1 (a int, b int, c varchar) PARTITION BY RANGE(a);
+ CREATE TABLE fprt1_p1 (LIKE fprt1);
+ CREATE TABLE fprt1_p2 (LIKE fprt1);
+ INSERT INTO fprt1_p1 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(0, 249, 2) i;
+ INSERT INTO fprt1_p2 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(250, 499, 2) i;
+ CREATE FOREIGN TABLE ftprt1_p1 PARTITION OF fprt1 FOR VALUES FROM (0) TO (250)
+ SERVER loopback OPTIONS (table_name 'fprt1_p1', use_remote_estimate 'true');
+ CREATE FOREIGN TABLE ftprt1_p2 PARTITION OF fprt1 FOR VALUES FROM (250) TO (500)
+ SERVER loopback OPTIONS (TABLE_NAME 'fprt1_p2');
+ ANALYZE fprt1;
+ ANALYZE fprt1_p1;
+ ANALYZE fprt1_p2;
+
+ CREATE TABLE fprt2 (a int, b int, c varchar) PARTITION BY RANGE(b);
+ CREATE TABLE fprt2_p1 (LIKE fprt2);
+ CREATE TABLE fprt2_p2 (LIKE fprt2);
+ INSERT INTO fprt2_p1 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(0, 249, 3) i;
+ INSERT INTO fprt2_p2 SELECT i, i, to_char(i/50, 'FM0000') FROM generate_series(250, 499, 3) i;
+ CREATE FOREIGN TABLE ftprt2_p1 PARTITION OF fprt2 FOR VALUES FROM (0) TO (250)
+ SERVER loopback OPTIONS (table_name 'fprt2_p1', use_remote_estimate 'true');
+ CREATE FOREIGN TABLE ftprt2_p2 PARTITION OF fprt2 FOR VALUES FROM (250) TO (500)
+ SERVER loopback OPTIONS (table_name 'fprt2_p2', use_remote_estimate 'true');
+ ANALYZE fprt2;
+ ANALYZE fprt2_p1;
+ ANALYZE fprt2_p2;
+
+ -- inner join three tables
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER JOIN fprt1 t3 ON (t2.b = t3.a) WHERE t1.a % 25 =0 ORDER BY 1,2,3;
+ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER JOIN fprt1 t3 ON (t2.b = t3.a) WHERE t1.a % 25 =0 ORDER BY 1,2,3;
+
+ -- left outer join + nullable clasue
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
+ SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
+
+ -- with whole-row reference
+ EXPLAIN (COSTS OFF)
+ SELECT t1,t2 FROM fprt1 t1 JOIN fprt2 t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a % 25 =0 ORDER BY 1,2;
+ SELECT t1,t2 FROM fprt1 t1 JOIN fprt2 t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a % 25 =0 ORDER BY 1,2;
+
+ -- join with lateral reference
+ EXPLAIN (COSTS OFF)
+ SELECT t1.a,t1.b FROM fprt1 t1, LATERAL (SELECT t2.a, t2.b FROM fprt2 t2 WHERE t1.a = t2.b AND t1.b = t2.a) q WHERE t1.a%25 = 0 ORDER BY 1,2;
+ SELECT t1.a,t1.b FROM fprt1 t1, LATERAL (SELECT t2.a, t2.b FROM fprt2 t2 WHERE t1.a = t2.b AND t1.b = t2.a) q WHERE t1.a%25 = 0 ORDER BY 1,2;
+
+ RESET enable_partition_wise_join;
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index e02b0c8..c4d9228
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*************** ANY <replaceable class="parameter">num_s
*** 3643,3648 ****
--- 3643,3667 ----
</listitem>
</varlistentry>
+ <varlistentry id="guc-enable-partition-wise-join" xreflabel="enable_partition_wise_join">
+ <term><varname>enable_partition_wise_join</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>enable_partition_wise_join</> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Enables or disables the query planner's use of partition-wise join
+ plans. When enabled, the planner spends time creating paths for joins
+ between partitions and consumes memory constructing expression nodes
+ for those joins, even if partition-wise join does not result in the
+ cheapest path. The time and memory increase exponentially with the
+ number of partitioned tables being joined and linearly with the
+ number of partitions. The default is <literal>off</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-enable-seqscan" xreflabel="enable_seqscan">
<term><varname>enable_seqscan</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
new file mode 100644
index dbeaab5..ac8c2fa
*** a/doc/src/sgml/fdwhandler.sgml
--- b/doc/src/sgml/fdwhandler.sgml
*************** ShutdownForeignScan(ForeignScanState *no
*** 1270,1275 ****
--- 1270,1295 ----
</para>
</sect2>
+ <sect2 id="fdw-callbacks-reparameterize-paths">
+ <title>FDW Routines For Reparameterization of Paths</title>
+
+ <para>
+ <programlisting>
+ List *
+ ReparameterizeForeignPathByChild(PlannerInfo *root, List *fdw_private,
+ RelOptInfo *child_rel);
+ </programlisting>
+ This function is called while converting a path parameterized by the
+ top-most parent of the given child relation <literal>child_rel</> to be
+ parameterized by the child relation. The function is used to reparameterize
+ any paths or translate any expression nodes saved in the given
+ <literal>fdw_private</> member of a <structname>ForeignPath</>. The
+ callback may use <literal>reparameterize_path_by_child</>,
+ <literal>adjust_appendrel_attrs</> or
+ <literal>adjust_appendrel_attrs_multilevel</> as required.
+ </para>
+ </sect2>
+
</sect1>
<sect1 id="fdw-helpers">
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
new file mode 100644
index e0d2665..c44bb0e
*** a/src/backend/catalog/partition.c
--- b/src/backend/catalog/partition.c
*************** static List *generate_partition_qual(Rel
*** 126,140 ****
static PartitionRangeBound *make_one_range_bound(PartitionKey key, int index,
List *datums, bool lower);
! static int32 partition_rbound_cmp(PartitionKey key,
! Datum *datums1, RangeDatumContent *content1, bool lower1,
PartitionRangeBound *b2);
! static int32 partition_rbound_datum_cmp(PartitionKey key,
! Datum *rb_datums, RangeDatumContent *rb_content,
! Datum *tuple_datums);
! static int32 partition_bound_cmp(PartitionKey key,
! PartitionBoundInfo boundinfo,
int offset, void *probe, bool probe_is_bound);
static int partition_bound_bsearch(PartitionKey key,
PartitionBoundInfo boundinfo,
--- 126,141 ----
static PartitionRangeBound *make_one_range_bound(PartitionKey key, int index,
List *datums, bool lower);
! static int32 partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc,
! Oid *partcollation, Datum *datums1,
! RangeDatumContent *content1, bool lower1,
PartitionRangeBound *b2);
! static int32 partition_rbound_datum_cmp(int partnatts, FmgrInfo *partsupfunc,
! Oid *partcollation, Datum *rb_datums,
! RangeDatumContent *rb_content, Datum *tuple_datums);
! static int32 partition_bound_cmp(int partnatts, FmgrInfo *partsupfunc,
! Oid *partcollation, PartitionBoundInfo boundinfo,
int offset, void *probe, bool probe_is_bound);
static int partition_bound_bsearch(PartitionKey key,
PartitionBoundInfo boundinfo,
*************** RelationBuildPartitionDesc(Relation rel)
*** 592,598 ****
* representation of partition bounds.
*/
bool
! partition_bounds_equal(PartitionKey key,
PartitionBoundInfo b1, PartitionBoundInfo b2)
{
int i;
--- 593,599 ----
* representation of partition bounds.
*/
bool
! partition_bounds_equal(int partnatts, int16 *parttyplen, bool *parttypbyval,
PartitionBoundInfo b1, PartitionBoundInfo b2)
{
int i;
*************** partition_bounds_equal(PartitionKey key,
*** 613,619 ****
{
int j;
! for (j = 0; j < key->partnatts; j++)
{
/* For range partitions, the bounds might not be finite. */
if (b1->content != NULL)
--- 614,620 ----
{
int j;
! for (j = 0; j < partnatts; j++)
{
/* For range partitions, the bounds might not be finite. */
if (b1->content != NULL)
*************** partition_bounds_equal(PartitionKey key,
*** 642,649 ****
* context. datumIsEqual() should be simple enough to be safe.
*/
if (!datumIsEqual(b1->datums[i][j], b2->datums[i][j],
! key->parttypbyval[j],
! key->parttyplen[j]))
return false;
}
--- 643,649 ----
* context. datumIsEqual() should be simple enough to be safe.
*/
if (!datumIsEqual(b1->datums[i][j], b2->datums[i][j],
! parttypbyval[j], parttyplen[j]))
return false;
}
*************** partition_bounds_equal(PartitionKey key,
*** 652,658 ****
}
/* There are ndatums+1 indexes in case of range partitions */
! if (key->strategy == PARTITION_STRATEGY_RANGE &&
b1->indexes[i] != b2->indexes[i])
return false;
--- 652,658 ----
}
/* There are ndatums+1 indexes in case of range partitions */
! if (b1->strategy == PARTITION_STRATEGY_RANGE &&
b1->indexes[i] != b2->indexes[i])
return false;
*************** check_new_partition_bound(char *relname,
*** 734,741 ****
* First check if the resulting range would be empty with
* specified lower and upper bounds
*/
! if (partition_rbound_cmp(key, lower->datums, lower->content, true,
! upper) >= 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("cannot create range partition with empty range"),
--- 734,742 ----
* First check if the resulting range would be empty with
* specified lower and upper bounds
*/
! if (partition_rbound_cmp(key->partnatts, key->partsupfunc,
! key->partcollation, lower->datums,
! lower->content, true, upper) >= 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("cannot create range partition with empty range"),
*************** qsort_partition_rbound_cmp(const void *a
*** 1865,1871 ****
PartitionRangeBound *b2 = (*(PartitionRangeBound *const *) b);
PartitionKey key = (PartitionKey) arg;
! return partition_rbound_cmp(key, b1->datums, b1->content, b1->lower, b2);
}
/*
--- 1866,1874 ----
PartitionRangeBound *b2 = (*(PartitionRangeBound *const *) b);
PartitionKey key = (PartitionKey) arg;
! return partition_rbound_cmp(key->partnatts, key->partsupfunc,
! key->partcollation, b1->datums, b1->content,
! b1->lower, b2);
}
/*
*************** qsort_partition_rbound_cmp(const void *a
*** 1875,1881 ****
* content1, and lower1) is <=, =, >= the bound specified in *b2
*/
static int32
! partition_rbound_cmp(PartitionKey key,
Datum *datums1, RangeDatumContent *content1, bool lower1,
PartitionRangeBound *b2)
{
--- 1878,1884 ----
* content1, and lower1) is <=, =, >= the bound specified in *b2
*/
static int32
! partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc, Oid *partcollation,
Datum *datums1, RangeDatumContent *content1, bool lower1,
PartitionRangeBound *b2)
{
*************** partition_rbound_cmp(PartitionKey key,
*** 1885,1891 ****
RangeDatumContent *content2 = b2->content;
bool lower2 = b2->lower;
! for (i = 0; i < key->partnatts; i++)
{
/*
* First, handle cases involving infinity, which don't require
--- 1888,1894 ----
RangeDatumContent *content2 = b2->content;
bool lower2 = b2->lower;
! for (i = 0; i < partnatts; i++)
{
/*
* First, handle cases involving infinity, which don't require
*************** partition_rbound_cmp(PartitionKey key,
*** 1905,1912 ****
else if (content2[i] != RANGE_DATUM_FINITE)
return content2[i] == RANGE_DATUM_NEG_INF ? 1 : -1;
! cmpval = DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[i],
! key->partcollation[i],
datums1[i],
datums2[i]));
if (cmpval != 0)
--- 1908,1915 ----
else if (content2[i] != RANGE_DATUM_FINITE)
return content2[i] == RANGE_DATUM_NEG_INF ? 1 : -1;
! cmpval = DatumGetInt32(FunctionCall2Coll(&partsupfunc[i],
! partcollation[i],
datums1[i],
datums2[i]));
if (cmpval != 0)
*************** partition_rbound_cmp(PartitionKey key,
*** 1932,1951 ****
* rb_lower) <=, =, >= partition key of tuple (tuple_datums)
*/
static int32
! partition_rbound_datum_cmp(PartitionKey key,
! Datum *rb_datums, RangeDatumContent *rb_content,
! Datum *tuple_datums)
{
int i;
int32 cmpval = -1;
! for (i = 0; i < key->partnatts; i++)
{
if (rb_content[i] != RANGE_DATUM_FINITE)
return rb_content[i] == RANGE_DATUM_NEG_INF ? -1 : 1;
! cmpval = DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[i],
! key->partcollation[i],
rb_datums[i],
tuple_datums[i]));
if (cmpval != 0)
--- 1935,1954 ----
* rb_lower) <=, =, >= partition key of tuple (tuple_datums)
*/
static int32
! partition_rbound_datum_cmp(int partnatts, FmgrInfo *partsupfunc,
! Oid *partcollation, Datum *rb_datums,
! RangeDatumContent *rb_content, Datum *tuple_datums)
{
int i;
int32 cmpval = -1;
! for (i = 0; i < partnatts; i++)
{
if (rb_content[i] != RANGE_DATUM_FINITE)
return rb_content[i] == RANGE_DATUM_NEG_INF ? -1 : 1;
! cmpval = DatumGetInt32(FunctionCall2Coll(&partsupfunc[i],
! partcollation[i],
rb_datums[i],
tuple_datums[i]));
if (cmpval != 0)
*************** partition_rbound_datum_cmp(PartitionKey
*** 1962,1978 ****
* specified in *probe.
*/
static int32
! partition_bound_cmp(PartitionKey key, PartitionBoundInfo boundinfo,
! int offset, void *probe, bool probe_is_bound)
{
Datum *bound_datums = boundinfo->datums[offset];
int32 cmpval = -1;
! switch (key->strategy)
{
case PARTITION_STRATEGY_LIST:
! cmpval = DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[0],
! key->partcollation[0],
bound_datums[0],
*(Datum *) probe));
break;
--- 1965,1982 ----
* specified in *probe.
*/
static int32
! partition_bound_cmp(int partnatts, FmgrInfo *partsupfunc, Oid *partcollation,
! PartitionBoundInfo boundinfo, int offset, void *probe,
! bool probe_is_bound)
{
Datum *bound_datums = boundinfo->datums[offset];
int32 cmpval = -1;
! switch (boundinfo->strategy)
{
case PARTITION_STRATEGY_LIST:
! cmpval = DatumGetInt32(FunctionCall2Coll(&partsupfunc[0],
! partcollation[0],
bound_datums[0],
*(Datum *) probe));
break;
*************** partition_bound_cmp(PartitionKey key, Pa
*** 1990,2001 ****
*/
bool lower = boundinfo->indexes[offset] < 0;
! cmpval = partition_rbound_cmp(key,
! bound_datums, content, lower,
! (PartitionRangeBound *) probe);
}
else
! cmpval = partition_rbound_datum_cmp(key,
bound_datums, content,
(Datum *) probe);
break;
--- 1994,2007 ----
*/
bool lower = boundinfo->indexes[offset] < 0;
! cmpval = partition_rbound_cmp(partnatts, partsupfunc,
! partcollation, bound_datums,
! content, lower,
! (PartitionRangeBound *) probe);
}
else
! cmpval = partition_rbound_datum_cmp(partnatts, partsupfunc,
! partcollation,
bound_datums, content,
(Datum *) probe);
break;
*************** partition_bound_cmp(PartitionKey key, Pa
*** 2003,2009 ****
default:
elog(ERROR, "unexpected partition strategy: %d",
! (int) key->strategy);
}
return cmpval;
--- 2009,2015 ----
default:
elog(ERROR, "unexpected partition strategy: %d",
! (int) boundinfo->strategy);
}
return cmpval;
*************** partition_bound_bsearch(PartitionKey key
*** 2037,2043 ****
int32 cmpval;
mid = (lo + hi + 1) / 2;
! cmpval = partition_bound_cmp(key, boundinfo, mid, probe,
probe_is_bound);
if (cmpval <= 0)
{
--- 2043,2050 ----
int32 cmpval;
mid = (lo + hi + 1) / 2;
! cmpval = partition_bound_cmp(key->partnatts, key->partsupfunc,
! key->partcollation, boundinfo, mid, probe,
probe_is_bound);
if (cmpval <= 0)
{
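For readers skimming the partition.c hunks above: the comparators stop taking a PartitionKey and instead receive the key column count, the support functions and the collations as separate arguments, presumably so that callers holding those pieces outside a PartitionKey can use them too. The following is only a minimal standalone sketch of that calling convention, modelled on partition_rbound_datum_cmp. The names BoundContent, cmp_fn, cmp_long and rbound_datum_cmp are made up for illustration, plain long values stand in for Datum, and collations are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PostgreSQL types; not the real definitions. */
typedef enum { BOUND_FINITE, BOUND_NEG_INF, BOUND_POS_INF } BoundContent;
typedef int (*cmp_fn) (long a, long b);

/* Three-way comparison of two longs: returns -1, 0 or 1. */
static int
cmp_long(long a, long b)
{
	return (a > b) - (a < b);
}

/*
 * Compare a range bound with a tuple's partition key values, column by
 * column.  Everything the comparison needs is passed explicitly instead of
 * being read from a key structure.  Returns <0, 0 or >0 when the bound is
 * less than, equal to or greater than the tuple's key.
 */
static int
rbound_datum_cmp(int partnatts, const cmp_fn *partsupfunc,
				 const long *rb_datums, const BoundContent *rb_content,
				 const long *tuple_datums)
{
	int			i;
	int			cmpval = -1;

	/* This sketch assumes at least one key column. */
	assert(partnatts > 0);

	for (i = 0; i < partnatts; i++)
	{
		/* An infinite bound column decides the result without a call. */
		if (rb_content[i] != BOUND_FINITE)
			return rb_content[i] == BOUND_NEG_INF ? -1 : 1;

		cmpval = partsupfunc[i] (rb_datums[i], tuple_datums[i]);
		if (cmpval != 0)
			break;
	}

	return cmpval;
}
```

A caller that still has a key structure just unpacks it at the call site, which is what the updated callers in the hunks above do with key->partnatts, key->partsupfunc and key->partcollation.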
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
new file mode 100644
index 5a34a46..717763d
*** a/src/backend/executor/execExpr.c
--- b/src/backend/executor/execExpr.c
*************** ExecInitExprRec(Expr *node, PlanState *p
*** 723,728 ****
--- 723,755 ----
break;
}
+ case T_GroupedVar:
+ /*
+ * GroupedVar is treated as an aggregate if it appears in the
+ * targetlist of Agg node, but as a normal variable elsewhere.
+ */
+ if (parent && (IsA(parent, AggState)))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+
+ /*
+ * Currently GroupedVar can only represent partial aggregate.
+ */
+ Assert(gvar->agg_partial != NULL);
+
+ ExecInitExprRec((Expr *) gvar->agg_partial, parent, state,
+ resv, resnull);
+ break;
+ }
+ else
+ {
+ /*
+ * set_plan_refs should have replaced GroupedVar in the
+ * targetlist with an ordinary Var.
+ */
+ elog(ERROR, "parent of GroupedVar is not Agg node");
+ }
+
case T_GroupingFunc:
{
GroupingFunc *grp_node = (GroupingFunc *) node;
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
new file mode 100644
index c2b8618..c4cb4c0
*** a/src/backend/executor/nodeAgg.c
--- b/src/backend/executor/nodeAgg.c
*************** find_unaggregated_cols_walker(Node *node
*** 1829,1834 ****
--- 1829,1845 ----
/* do not descend into aggregate exprs */
return false;
}
+ if (IsA(node, GroupedVar))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+
+ /*
+ * GroupedVar is currently used only for partial aggregation, so treat
+ * it like an Aggref above.
+ */
+ Assert(gvar->agg_partial != NULL);
+ return false;
+ }
return expression_tree_walker(node, find_unaggregated_cols_walker,
(void *) colnos);
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index 00a0fed..7d188ea
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copyPlaceHolderVar(const PlaceHolderVar
*** 2206,2211 ****
--- 2206,2226 ----
}
/*
+ * _copyGroupedVar
+ */
+ static GroupedVar *
+ _copyGroupedVar(const GroupedVar *from)
+ {
+ GroupedVar *newnode = makeNode(GroupedVar);
+
+ COPY_NODE_FIELD(gvexpr);
+ COPY_NODE_FIELD(agg_partial);
+ COPY_SCALAR_FIELD(gvid);
+
+ return newnode;
+ }
+
+ /*
* _copySpecialJoinInfo
*/
static SpecialJoinInfo *
*************** copyObjectImpl(const void *from)
*** 4984,4989 ****
--- 4999,5007 ----
case T_PlaceHolderVar:
retval = _copyPlaceHolderVar(from);
break;
+ case T_GroupedVar:
+ retval = _copyGroupedVar(from);
+ break;
case T_SpecialJoinInfo:
retval = _copySpecialJoinInfo(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
new file mode 100644
index 46573ae..f1dacd5
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
*************** _equalPlaceHolderVar(const PlaceHolderVa
*** 874,879 ****
--- 874,887 ----
}
static bool
+ _equalGroupedVar(const GroupedVar *a, const GroupedVar *b)
+ {
+ COMPARE_SCALAR_FIELD(gvid);
+
+ return true;
+ }
+
+ static bool
_equalSpecialJoinInfo(const SpecialJoinInfo *a, const SpecialJoinInfo *b)
{
COMPARE_BITMAPSET_FIELD(min_lefthand);
*************** equal(const void *a, const void *b)
*** 3148,3153 ****
--- 3156,3164 ----
case T_PlaceHolderVar:
retval = _equalPlaceHolderVar(a, b);
break;
+ case T_GroupedVar:
+ retval = _equalGroupedVar(a, b);
+ break;
case T_SpecialJoinInfo:
retval = _equalSpecialJoinInfo(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
new file mode 100644
index 3e8189c..5c00e55
*** a/src/backend/nodes/nodeFuncs.c
--- b/src/backend/nodes/nodeFuncs.c
*************** exprType(const Node *expr)
*** 259,264 ****
--- 259,267 ----
case T_PlaceHolderVar:
type = exprType((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupedVar:
+ type = exprType((Node *) ((const GroupedVar *) expr)->agg_partial);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
type = InvalidOid; /* keep compiler quiet */
*************** exprCollation(const Node *expr)
*** 931,936 ****
--- 934,942 ----
case T_PlaceHolderVar:
coll = exprCollation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupedVar:
+ coll = exprCollation((Node *) ((const GroupedVar *) expr)->gvexpr);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
coll = InvalidOid; /* keep compiler quiet */
*************** expression_tree_walker(Node *node,
*** 2198,2203 ****
--- 2204,2211 ----
break;
case T_PlaceHolderVar:
return walker(((PlaceHolderVar *) node)->phexpr, context);
+ case T_GroupedVar:
+ return walker(((GroupedVar *) node)->gvexpr, context);
case T_InferenceElem:
return walker(((InferenceElem *) node)->expr, context);
case T_AppendRelInfo:
*************** expression_tree_mutator(Node *node,
*** 2989,2994 ****
--- 2997,3012 ----
return (Node *) newnode;
}
break;
+ case T_GroupedVar:
+ {
+ GroupedVar *gv = (GroupedVar *) node;
+ GroupedVar *newnode;
+
+ FLATCOPY(newnode, gv, GroupedVar);
+ MUTATE(newnode->gvexpr, gv->gvexpr, Expr *);
+ MUTATE(newnode->agg_partial, gv->agg_partial, Aggref *);
+ return (Node *) newnode;
+ }
case T_InferenceElem:
{
InferenceElem *inferenceelemdexpr = (InferenceElem *) node;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
new file mode 100644
index 28cef85..4b6ee30
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*************** _outPlannerInfo(StringInfo str, const Pl
*** 2186,2191 ****
--- 2186,2192 ----
WRITE_NODE_FIELD(pcinfo_list);
WRITE_NODE_FIELD(rowMarks);
WRITE_NODE_FIELD(placeholder_list);
+ WRITE_NODE_FIELD(grouped_var_list);
WRITE_NODE_FIELD(fkey_list);
WRITE_NODE_FIELD(query_pathkeys);
WRITE_NODE_FIELD(group_pathkeys);
*************** _outParamPathInfo(StringInfo str, const
*** 2408,2413 ****
--- 2409,2424 ----
}
static void
+ _outGroupedPathInfo(StringInfo str, const GroupedPathInfo *node)
+ {
+ WRITE_NODE_TYPE("GROUPEDPATHINFO");
+
+ WRITE_NODE_FIELD(target);
+ WRITE_NODE_FIELD(pathlist);
+ WRITE_NODE_FIELD(partial_pathlist);
+ }
+
+ static void
_outRestrictInfo(StringInfo str, const RestrictInfo *node)
{
WRITE_NODE_TYPE("RESTRICTINFO");
*************** _outPlaceHolderVar(StringInfo str, const
*** 2451,2456 ****
--- 2462,2477 ----
}
static void
+ _outGroupedVar(StringInfo str, const GroupedVar *node)
+ {
+ WRITE_NODE_TYPE("GROUPEDVAR");
+
+ WRITE_NODE_FIELD(gvexpr);
+ WRITE_NODE_FIELD(agg_partial);
+ WRITE_UINT_FIELD(gvid);
+ }
+
+ static void
_outSpecialJoinInfo(StringInfo str, const SpecialJoinInfo *node)
{
WRITE_NODE_TYPE("SPECIALJOININFO");
*************** outNode(StringInfo str, const void *obj)
*** 3996,4007 ****
--- 4017,4034 ----
case T_ParamPathInfo:
_outParamPathInfo(str, obj);
break;
+ case T_GroupedPathInfo:
+ _outGroupedPathInfo(str, obj);
+ break;
case T_RestrictInfo:
_outRestrictInfo(str, obj);
break;
case T_PlaceHolderVar:
_outPlaceHolderVar(str, obj);
break;
+ case T_GroupedVar:
+ _outGroupedVar(str, obj);
+ break;
case T_SpecialJoinInfo:
_outSpecialJoinInfo(str, obj);
break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
new file mode 100644
index a883220..138f71c
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
*************** _readVar(void)
*** 522,527 ****
--- 522,542 ----
}
/*
+ * _readGroupedVar
+ */
+ static GroupedVar *
+ _readGroupedVar(void)
+ {
+ READ_LOCALS(GroupedVar);
+
+ READ_NODE_FIELD(gvexpr);
+ READ_NODE_FIELD(agg_partial);
+ READ_UINT_FIELD(gvid);
+
+ READ_DONE();
+ }
+
+ /*
* _readConst
*/
static Const *
*************** parseNodeString(void)
*** 2440,2445 ****
--- 2455,2462 ----
return_value = _readTableFunc();
else if (MATCH("VAR", 3))
return_value = _readVar();
+ else if (MATCH("GROUPEDVAR", 10))
+ return_value = _readGroupedVar();
else if (MATCH("CONST", 5))
return_value = _readConst();
else if (MATCH("PARAM", 5))
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
new file mode 100644
index fc0fca4..eee093f
*** a/src/backend/optimizer/README
--- b/src/backend/optimizer/README
*************** be desirable to postpone the Gather stag
*** 1076,1078 ****
--- 1076,1105 ----
plan as possible. Expanding the range of cases in which more work can be
pushed below the Gather (and costing them accurately) is likely to keep us
busy for a long time to come.
+
+ Partition-wise joins
+ --------------------
+ A join between two similarly partitioned tables can be broken down into joins
+ between their matching partitions if there exists an equi-join condition
+ between the partition keys of the joining tables. The equi-join between
+ partition keys implies that all join partners for a given row in one
+ partitioned table must be in the corresponding partition of the other
+ partitioned table. The join partners cannot be found in other partitions. This
+ condition allows the join between partitioned tables to be broken into joins
+ between the matching partitions. The resultant join is partitioned in the same
+ way as the joining relations, thus allowing an N-way join between similarly
+ partitioned tables having equi-join condition between their partition keys to
+ be broken down into N-way joins between their matching partitions. This
+ technique of breaking down a join between partitioned tables into joins between
+ their partitions is called partition-wise join. We use the term "partitioned
+ relation" for both a partitioned table and a join between partitioned tables
+ that can use the partition-wise join technique.
+
+ Partitioning properties of a partitioned table are stored in a
+ PartitionSchemeData structure. The planner maintains a list of canonical
+ partition schemes (distinct PartitionSchemeData objects) so that any two
+ partitioned relations with the same partitioning scheme share the same
+ PartitionSchemeData object. This reduces the memory consumed by
+ PartitionSchemeData objects and makes it easy to compare the partition schemes
+ of joining relations. The RelOptInfo of a partitioned relation holds its
+ partition key expressions and the RelOptInfos of its partitions.
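To illustrate the canonical partition scheme list described in the README text above, here is a rough find-or-create sketch. It is not the patch's code: PartSchemeSketch, its fields, MAX_PART_KEYS and both functions are hypothetical, and the real PartitionSchemeData may record different properties (this excerpt does not show its definition). The point is only that once equal schemes share one object, comparing the schemes of two joining relations reduces to a pointer comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PART_KEYS 4			/* hypothetical limit for this sketch */

/* Hypothetical, simplified partition scheme; fields are assumptions. */
typedef struct PartSchemeSketch
{
	char		strategy;		/* 'r' for range, 'l' for list */
	int			nparts;			/* number of partitions */
	int			partnatts;		/* number of partition key columns */
	unsigned	key_types[MAX_PART_KEYS];	/* stand-in for key type OIDs */
	struct PartSchemeSketch *next;	/* next entry in the canonical list */
} PartSchemeSketch;

/*
 * Return the canonical scheme with the given properties, adding a new entry
 * to the list only if no existing entry matches.  Returns NULL on invalid
 * input or allocation failure.
 */
static PartSchemeSketch *
find_or_create_scheme(PartSchemeSketch **canon_list, char strategy,
					  int nparts, int partnatts, const unsigned *key_types)
{
	PartSchemeSketch *s;
	size_t		keybytes;

	assert(canon_list != NULL);
	if (partnatts < 1 || partnatts > MAX_PART_KEYS || key_types == NULL)
		return NULL;
	keybytes = (size_t) partnatts * sizeof(unsigned);

	/* Reuse an existing scheme when all recorded properties match. */
	for (s = *canon_list; s != NULL; s = s->next)
	{
		if (s->strategy == strategy && s->nparts == nparts &&
			s->partnatts == partnatts &&
			memcmp(s->key_types, key_types, keybytes) == 0)
			return s;
	}

	/* Otherwise create a new canonical entry at the head of the list. */
	s = calloc(1, sizeof(*s));
	if (s == NULL)
		return NULL;
	s->strategy = strategy;
	s->nparts = nparts;
	s->partnatts = partnatts;
	memcpy(s->key_types, key_types, keybytes);
	s->next = *canon_list;
	*canon_list = s;
	return s;
}

/* With canonical schemes, "same partitioning" is a pointer comparison. */
static bool
same_scheme(const PartSchemeSketch *a, const PartSchemeSketch *b)
{
	return a != NULL && a == b;
}
```

The linear search is acceptable under the assumption that a query involves only a handful of distinct schemes. The entries are never freed here, whereas planner code would allocate them in a memory context that is released as a whole.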
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
new file mode 100644
index b5cab0c..1ad910d
*** a/src/backend/optimizer/geqo/geqo_eval.c
--- b/src/backend/optimizer/geqo/geqo_eval.c
*************** merge_clump(PlannerInfo *root, List *clu
*** 264,271 ****
/* Keep searching if join order is not valid */
if (joinrel)
{
/* Create GatherPaths for any useful partial paths for rel */
! generate_gather_paths(root, joinrel);
/* Find and save the cheapest paths for this joinrel */
set_cheapest(joinrel);
--- 264,279 ----
/* Keep searching if join order is not valid */
if (joinrel)
{
+
+ /*
+ * Create "append" paths for partitioned joins. Do this before
+ * creating GatherPaths so that partial "append" paths in
+ * partitioned joins will be considered.
+ */
+ generate_partition_wise_join_paths(root, joinrel);
+
/* Create GatherPaths for any useful partial paths for rel */
! generate_gather_paths(root, joinrel, false);
/* Find and save the cheapest paths for this joinrel */
set_cheapest(joinrel);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
new file mode 100644
index b93b4fc..83a2c37
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
***************
*** 24,29 ****
--- 24,30 ----
#include "catalog/pg_operator.h"
#include "catalog/pg_proc.h"
#include "foreign/fdwapi.h"
+ #include "miscadmin.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#ifdef OPTIMIZER_DEBUG
*************** set_rel_pathlist(PlannerInfo *root, RelO
*** 486,492 ****
* we'll consider gathering partial paths for the parent appendrel.)
*/
if (rel->reloptkind == RELOPT_BASEREL)
! generate_gather_paths(root, rel);
/*
* Allow a plugin to editorialize on the set of Paths for this base
--- 487,496 ----
* we'll consider gathering partial paths for the parent appendrel.)
*/
if (rel->reloptkind == RELOPT_BASEREL)
! {
! generate_gather_paths(root, rel, false);
! generate_gather_paths(root, rel, true);
! }
/*
* Allow a plugin to editorialize on the set of Paths for this base
*************** static void
*** 686,691 ****
--- 690,696 ----
set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
{
Relids required_outer;
+ Path *seq_path;
/*
* We don't support pushing join clauses into the quals of a seqscan, but
*************** set_plain_rel_pathlist(PlannerInfo *root
*** 694,708 ****
*/
required_outer = rel->lateral_relids;
! /* Consider sequential scan */
! add_path(rel, create_seqscan_path(root, rel, required_outer, 0));
! /* If appropriate, consider parallel sequential scan */
if (rel->consider_parallel && required_outer == NULL)
create_plain_partial_paths(root, rel);
/* Consider index scans */
! create_index_paths(root, rel);
/* Consider TID scans */
create_tidscan_paths(root, rel);
--- 699,726 ----
*/
required_outer = rel->lateral_relids;
! /* Consider sequential scan, both plain and grouped. */
! seq_path = create_seqscan_path(root, rel, required_outer, 0);
! add_path(rel, seq_path, false);
! if (rel->gpi != NULL && required_outer == NULL)
! create_grouped_path(root, rel, seq_path, false, false, AGG_HASHED);
! /* If appropriate, consider parallel sequential scan (plain or grouped) */
if (rel->consider_parallel && required_outer == NULL)
create_plain_partial_paths(root, rel);
/* Consider index scans */
! create_index_paths(root, rel, false);
! if (rel->gpi != NULL)
! {
! /*
! * TODO Instead of calling the whole clause-matching machinery twice
! * (there should be no difference between plain and grouped paths from
! * this point of view), consider returning a separate list of paths
! * usable as grouped ones.
! */
! create_index_paths(root, rel, true);
! }
/* Consider TID scans */
create_tidscan_paths(root, rel);
*************** static void
*** 716,721 ****
--- 734,740 ----
create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
{
int parallel_workers;
+ Path *path;
parallel_workers = compute_parallel_worker(rel, rel->pages, -1);
*************** create_plain_partial_paths(PlannerInfo *
*** 724,730 ****
return;
/* Add an unordered partial path based on a parallel sequential scan. */
! add_partial_path(rel, create_seqscan_path(root, rel, NULL, parallel_workers));
}
/*
--- 743,850 ----
return;
/* Add an unordered partial path based on a parallel sequential scan. */
! path = create_seqscan_path(root, rel, NULL, parallel_workers);
! add_partial_path(rel, path, false);
!
! /*
! * Do partial aggregation at base relation level if the relation is
! * eligible for it.
! */
! if (rel->gpi != NULL)
! create_grouped_path(root, rel, path, false, true, AGG_HASHED);
! }
!
! /*
! * Apply partial aggregation to a subpath and add the AggPath to the
! * appropriate pathlist.
! *
! * "precheck" tells whether the aggregation path should first be checked using
! * add_path_precheck().
! *
! * If "partial" is true, the resulting path is considered partial in terms of
! * parallel execution.
! *
! * The path we create here shouldn't be parameterized because of supposedly
! * high startup cost of aggregation (whether due to build of hash table for
! * AGG_HASHED strategy or due to explicit sort for AGG_SORTED).
! *
! * XXX IndexPath as an input for AGG_SORTED might seem to be an exception, but
! * aggregation of its output is only beneficial if it's performed by multiple
! * workers, i.e. the resulting path is partial (Besides parallel aggregation,
! * the other use case of aggregation push-down is aggregation performed on
! * remote database, but that has nothing to do with IndexScan). And partial
! * path cannot be parameterized because it's semantically wrong to use it on
! * the inner side of NL join.
! */
! void
! create_grouped_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
! bool precheck, bool partial, AggStrategy aggstrategy)
! {
! List *group_clauses = NIL;
! List *group_exprs = NIL;
! List *agg_exprs = NIL;
! Path *agg_path;
!
! /*
! * If the AggPath should be partial, the subpath must be too, and
! * therefore the subpath is essentially parallel_safe.
! */
! Assert(subpath->parallel_safe || !partial);
!
! /*
! * Grouped path should never be parameterized, so we're not supposed to
! * receive parameterized subpath.
! */
! Assert(subpath->param_info == NULL);
!
! /*
! * Note that "partial" in the following function names refers to 2-stage
! * aggregation, not to parallel processing.
! */
! if (aggstrategy == AGG_HASHED)
! agg_path = (Path *) create_partial_agg_hashed_path(root, subpath,
! true,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! subpath->rows);
! else if (aggstrategy == AGG_SORTED)
! agg_path = (Path *) create_partial_agg_sorted_path(root, subpath,
! true,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! subpath->rows);
! else
! elog(ERROR, "unexpected strategy %d", aggstrategy);
!
! /* Add the grouped path to the list of grouped base paths. */
! if (agg_path != NULL)
! {
! if (precheck)
! {
! List *pathkeys;
!
! /* AGG_HASHED is not supposed to generate sorted output. */
! pathkeys = aggstrategy == AGG_SORTED ? subpath->pathkeys : NIL;
!
! if (!partial &&
! !add_path_precheck(rel, agg_path->startup_cost,
! agg_path->total_cost, pathkeys, NULL,
! true))
! return;
!
! if (partial &&
! !add_partial_path_precheck(rel, agg_path->total_cost, pathkeys,
! true))
! return;
! }
!
! if (!partial)
! add_path(rel, (Path *) agg_path, true);
! else
! add_partial_path(rel, (Path *) agg_path, true);
! }
}
/*
*************** set_tablesample_rel_pathlist(PlannerInfo
*** 810,816 ****
path = (Path *) create_material_path(rel, path);
}
! add_path(rel, path);
/* For the moment, at least, there are no other paths to consider */
}
--- 930,936 ----
path = (Path *) create_material_path(rel, path);
}
! add_path(rel, path, false);
/* For the moment, at least, there are no other paths to consider */
}
*************** set_append_rel_size(PlannerInfo *root, R
*** 915,926 ****
childrel = find_base_rel(root, childRTindex);
Assert(childrel->reloptkind == RELOPT_OTHER_MEMBER_REL);
/*
! * We have to copy the parent's targetlist and quals to the child,
! * with appropriate substitution of variables. However, only the
! * baserestrictinfo quals are needed before we can check for
! * constraint exclusion; so do that first and then check to see if we
! * can disregard this child.
*
* The child rel's targetlist might contain non-Var expressions, which
* means that substitution into the quals could produce opportunities
--- 1035,1100 ----
childrel = find_base_rel(root, childRTindex);
Assert(childrel->reloptkind == RELOPT_OTHER_MEMBER_REL);
+ if (rel->part_scheme)
+ {
+ AttrNumber attno;
+
+ /*
+ * For a partitioned table, individual partitions can participate
+ * in the pair-wise joins. We need attr_needed data for building
+ * targetlists of joins between partitions.
+ */
+ for (attno = rel->min_attr; attno <= rel->max_attr; attno++)
+ {
+ int index = attno - rel->min_attr;
+ Relids attr_needed = bms_copy(rel->attr_needed[index]);
+
+ /* System attributes do not need translation. */
+ if (attno <= 0)
+ {
+ Assert(rel->min_attr == childrel->min_attr);
+ childrel->attr_needed[index] = attr_needed;
+ }
+ else
+ {
+ Var *var = list_nth(appinfo->translated_vars,
+ attno - 1);
+ int child_index;
+
+ /*
+ * Parent Var for a user defined attribute translates to
+ * child Var.
+ */
+ Assert(IsA(var, Var));
+
+ child_index = var->varattno - childrel->min_attr;
+ childrel->attr_needed[child_index] = attr_needed;
+ }
+ }
+ }
+
/*
! * Copy/Modify targetlist. Even if this child is deemed empty, we need
! * its targetlist in case it falls on nullable side in a child-join
! * because of partition-wise join.
! *
! * NB: the resulting childrel->reltarget->exprs may contain arbitrary
! * expressions, which otherwise would not occur in a rel's targetlist.
! * Code that might be looking at an appendrel child must cope with
! * such. (Normally, a rel's targetlist would only include Vars and
! * PlaceHolderVars.) XXX we do not bother to update the cost or width
! * fields of childrel->reltarget; not clear if that would be useful.
! */
! childrel->reltarget->exprs = (List *)
! adjust_appendrel_attrs(root,
! (Node *) rel->reltarget->exprs,
! 1, &appinfo);
!
! /*
! * We have to copy the parent's quals to the child, with appropriate
! * substitution of variables. However, only the baserestrictinfo quals
! * are needed before we can check for constraint exclusion; so do that
! * first and then check to see if we can disregard this child.
*
* The child rel's targetlist might contain non-Var expressions, which
* means that substitution into the quals could produce opportunities
*************** set_append_rel_size(PlannerInfo *root, R
*** 941,947 ****
Assert(IsA(rinfo, RestrictInfo));
childqual = adjust_appendrel_attrs(root,
(Node *) rinfo->clause,
! appinfo);
childqual = eval_const_expressions(root, childqual);
/* check for flat-out constant */
if (childqual && IsA(childqual, Const))
--- 1115,1121 ----
Assert(IsA(rinfo, RestrictInfo));
childqual = adjust_appendrel_attrs(root,
(Node *) rinfo->clause,
! 1, &appinfo);
childqual = eval_const_expressions(root, childqual);
/* check for flat-out constant */
if (childqual && IsA(childqual, Const))
*************** set_append_rel_size(PlannerInfo *root, R
*** 1047,1070 ****
continue;
}
! /*
! * CE failed, so finish copying/modifying targetlist and join quals.
! *
! * NB: the resulting childrel->reltarget->exprs may contain arbitrary
! * expressions, which otherwise would not occur in a rel's targetlist.
! * Code that might be looking at an appendrel child must cope with
! * such. (Normally, a rel's targetlist would only include Vars and
! * PlaceHolderVars.) XXX we do not bother to update the cost or width
! * fields of childrel->reltarget; not clear if that would be useful.
! */
childrel->joininfo = (List *)
adjust_appendrel_attrs(root,
(Node *) rel->joininfo,
! appinfo);
! childrel->reltarget->exprs = (List *)
! adjust_appendrel_attrs(root,
! (Node *) rel->reltarget->exprs,
! appinfo);
/*
* We have to make child entries in the EquivalenceClass data
--- 1221,1231 ----
continue;
}
! /* CE failed, so finish copying/modifying join quals. */
childrel->joininfo = (List *)
adjust_appendrel_attrs(root,
(Node *) rel->joininfo,
! 1, &appinfo);
/*
* We have to make child entries in the EquivalenceClass data
*************** set_append_rel_size(PlannerInfo *root, R
*** 1079,1092 ****
childrel->has_eclass_joins = rel->has_eclass_joins;
/*
- * Note: we could compute appropriate attr_needed data for the child's
- * variables, by transforming the parent's attr_needed through the
- * translated_vars mapping. However, currently there's no need
- * because attr_needed is only examined for base relations not
- * otherrels. So we just leave the child's attr_needed empty.
- */
-
- /*
* If parallelism is allowable for this query in general, see whether
* it's allowable for this childrel in particular. But if we've
* already decided the appendrel is not parallel-safe as a whole,
--- 1240,1245 ----
*************** add_paths_to_append_rel(PlannerInfo *roo
*** 1281,1299 ****
bool subpaths_valid = true;
List *partial_subpaths = NIL;
bool partial_subpaths_valid = true;
List *all_child_pathkeys = NIL;
List *all_child_outers = NIL;
ListCell *l;
List *partitioned_rels = NIL;
RangeTblEntry *rte;
! rte = planner_rt_fetch(rel->relid, root);
! if (rte->relkind == RELKIND_PARTITIONED_TABLE)
{
! partitioned_rels = get_partitioned_child_rels(root, rel->relid);
! /* The root partitioned table is included as a child rel */
! Assert(list_length(partitioned_rels) >= 1);
}
/*
* For every non-dummy child, remember the cheapest path. Also, identify
--- 1434,1460 ----
bool subpaths_valid = true;
List *partial_subpaths = NIL;
bool partial_subpaths_valid = true;
+ List *grouped_subpaths = NIL;
+ bool grouped_subpaths_valid = true;
List *all_child_pathkeys = NIL;
List *all_child_outers = NIL;
ListCell *l;
List *partitioned_rels = NIL;
RangeTblEntry *rte;
! if (rel->reloptkind == RELOPT_BASEREL)
{
! rte = planner_rt_fetch(rel->relid, root);
!
! if (rte->relkind == RELKIND_PARTITIONED_TABLE)
! {
! partitioned_rels = get_partitioned_child_rels(root, rel->relid);
! /* The root partitioned table is included as a child rel */
! Assert(list_length(partitioned_rels) >= 1);
! }
}
+ else if (rel->reloptkind == RELOPT_JOINREL && rel->part_scheme)
+ partitioned_rels = get_partitioned_child_rels_for_join(root, rel);
/*
* For every non-dummy child, remember the cheapest path. Also, identify
*************** add_paths_to_append_rel(PlannerInfo *roo
*** 1324,1329 ****
--- 1485,1521 ----
partial_subpaths_valid = false;
/*
+ * For grouped paths, use only the unparameterized subpaths.
+ *
+ * XXX Consider if the parameterized subpaths should be processed
+ * below. It's probably not useful for sequential scans (due to
+ * repeated aggregation), but might be worthwhile for other child
+ * nodes.
+ */
+ if (childrel->gpi != NULL && childrel->gpi->pathlist != NIL)
+ {
+ Path *path;
+
+ path = (Path *) linitial(childrel->gpi->pathlist);
+
+ /*
+ * PoC only: Simulate remote aggregation, which seems to be the
+ * typical use case for pushing the aggregation below Append node.
+ */
+ path->startup_cost = 0.0;
+ path->total_cost = 0.0;
+
+ if (path->param_info == NULL)
+ grouped_subpaths = accumulate_append_subpath(grouped_subpaths,
+ path);
+ else
+ grouped_subpaths_valid = false;
+ }
+ else
+ grouped_subpaths_valid = false;
+
+
+ /*
* Collect lists of all the available path orderings and
* parameterizations for all the children. We use these as a
* heuristic to indicate which sort orderings and parameterizations we
*************** add_paths_to_append_rel(PlannerInfo *roo
*** 1395,1401 ****
*/
if (subpaths_valid)
add_path(rel, (Path *) create_append_path(rel, subpaths, NULL, 0,
! partitioned_rels));
/*
* Consider an append of partial unordered, unparameterized partial paths.
--- 1587,1594 ----
*/
if (subpaths_valid)
add_path(rel, (Path *) create_append_path(rel, subpaths, NULL, 0,
! partitioned_rels),
! false);
/*
* Consider an append of partial unordered, unparameterized partial paths.
*************** add_paths_to_append_rel(PlannerInfo *roo
*** 1422,1429 ****
/* Generate a partial append path. */
appendpath = create_append_path(rel, partial_subpaths, NULL,
! parallel_workers, partitioned_rels);
! add_partial_path(rel, (Path *) appendpath);
}
/*
--- 1615,1635 ----
/* Generate a partial append path. */
appendpath = create_append_path(rel, partial_subpaths, NULL,
! parallel_workers,
! partitioned_rels);
! add_partial_path(rel, (Path *) appendpath, false);
! }
!
! /* TODO Also partial grouped paths? */
! if (grouped_subpaths_valid)
! {
! Path *path;
!
! path = (Path *) create_append_path(rel, grouped_subpaths, NULL, 0,
! partitioned_rels);
! /* pathtarget will produce the grouped relation. */
! path->pathtarget = rel->gpi->target;
! add_path(rel, path, true);
}
/*
*************** add_paths_to_append_rel(PlannerInfo *roo
*** 1476,1482 ****
if (subpaths_valid)
add_path(rel, (Path *)
create_append_path(rel, subpaths, required_outer, 0,
! partitioned_rels));
}
}
--- 1682,1689 ----
if (subpaths_valid)
add_path(rel, (Path *)
create_append_path(rel, subpaths, required_outer, 0,
! partitioned_rels),
! false);
}
}
*************** generate_mergeappend_paths(PlannerInfo *
*** 1572,1585 ****
startup_subpaths,
pathkeys,
NULL,
! partitioned_rels));
if (startup_neq_total)
add_path(rel, (Path *) create_merge_append_path(root,
rel,
total_subpaths,
pathkeys,
NULL,
! partitioned_rels));
}
}
--- 1779,1794 ----
startup_subpaths,
pathkeys,
NULL,
! partitioned_rels),
! false);
if (startup_neq_total)
add_path(rel, (Path *) create_merge_append_path(root,
rel,
total_subpaths,
pathkeys,
NULL,
! partitioned_rels),
! false);
}
}
*************** set_dummy_rel_pathlist(RelOptInfo *rel)
*** 1712,1718 ****
rel->pathlist = NIL;
rel->partial_pathlist = NIL;
! add_path(rel, (Path *) create_append_path(rel, NIL, NULL, 0, NIL));
/*
* We set the cheapest path immediately, to ensure that IS_DUMMY_REL()
--- 1921,1927 ----
rel->pathlist = NIL;
rel->partial_pathlist = NIL;
! add_path(rel, (Path *) create_append_path(rel, NIL, NULL, 0, NIL), false);
/*
* We set the cheapest path immediately, to ensure that IS_DUMMY_REL()
*************** set_subquery_pathlist(PlannerInfo *root,
*** 1926,1932 ****
/* Generate outer path using this subpath */
add_path(rel, (Path *)
create_subqueryscan_path(root, rel, subpath,
! pathkeys, required_outer));
}
}
--- 2135,2141 ----
/* Generate outer path using this subpath */
add_path(rel, (Path *)
create_subqueryscan_path(root, rel, subpath,
! pathkeys, required_outer), false);
}
}
*************** set_function_pathlist(PlannerInfo *root,
*** 1995,2001 ****
/* Generate appropriate path */
add_path(rel, create_functionscan_path(root, rel,
! pathkeys, required_outer));
}
/*
--- 2204,2210 ----
/* Generate appropriate path */
add_path(rel, create_functionscan_path(root, rel,
! pathkeys, required_outer), false);
}
/*
*************** set_values_pathlist(PlannerInfo *root, R
*** 2015,2021 ****
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_valuesscan_path(root, rel, required_outer));
}
/*
--- 2224,2230 ----
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_valuesscan_path(root, rel, required_outer), false);
}
/*
*************** set_tablefunc_pathlist(PlannerInfo *root
*** 2036,2042 ****
/* Generate appropriate path */
add_path(rel, create_tablefuncscan_path(root, rel,
! required_outer));
}
/*
--- 2245,2251 ----
/* Generate appropriate path */
add_path(rel, create_tablefuncscan_path(root, rel,
! required_outer), false);
}
/*
*************** set_cte_pathlist(PlannerInfo *root, RelO
*** 2102,2108 ****
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_ctescan_path(root, rel, required_outer));
}
/*
--- 2311,2317 ----
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_ctescan_path(root, rel, required_outer), false);
}
/*
*************** set_namedtuplestore_pathlist(PlannerInfo
*** 2129,2135 ****
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_namedtuplestorescan_path(root, rel, required_outer));
/* Select cheapest path (pretty easy in this case...) */
set_cheapest(rel);
--- 2338,2345 ----
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_namedtuplestorescan_path(root, rel, required_outer),
! false);
/* Select cheapest path (pretty easy in this case...) */
set_cheapest(rel);
*************** set_worktable_pathlist(PlannerInfo *root
*** 2182,2188 ****
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_worktablescan_path(root, rel, required_outer));
}
/*
--- 2392,2399 ----
required_outer = rel->lateral_relids;
/* Generate appropriate path */
! add_path(rel, create_worktablescan_path(root, rel, required_outer),
! false);
}
/*
*************** set_worktable_pathlist(PlannerInfo *root
*** 2195,2208 ****
* path that some GatherPath or GatherMergePath has a reference to.)
*/
void
! generate_gather_paths(PlannerInfo *root, RelOptInfo *rel)
{
Path *cheapest_partial_path;
Path *simple_gather_path;
ListCell *lc;
/* If there are no partial paths, there's nothing to do here. */
! if (rel->partial_pathlist == NIL)
return;
/*
--- 2406,2426 ----
* path that some GatherPath or GatherMergePath has a reference to.)
*/
void
! generate_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool grouped)
{
Path *cheapest_partial_path;
Path *simple_gather_path;
+ List *pathlist = NIL;
+ PathTarget *partial_target;
ListCell *lc;
+ if (!grouped)
+ pathlist = rel->partial_pathlist;
+ else if (rel->gpi != NULL)
+ pathlist = rel->gpi->partial_pathlist;
+
/* If there are no partial paths, there's nothing to do here. */
! if (pathlist == NIL)
return;
/*
*************** generate_gather_paths(PlannerInfo *root,
*** 2210,2226 ****
* path of interest: the cheapest one. That will be the one at the front
* of partial_pathlist because of the way add_partial_path works.
*/
! cheapest_partial_path = linitial(rel->partial_pathlist);
simple_gather_path = (Path *)
! create_gather_path(root, rel, cheapest_partial_path, rel->reltarget,
NULL, NULL);
! add_path(rel, simple_gather_path);
/*
* For each useful ordering, we can consider an order-preserving Gather
* Merge.
*/
! foreach (lc, rel->partial_pathlist)
{
Path *subpath = (Path *) lfirst(lc);
GatherMergePath *path;
--- 2428,2450 ----
* path of interest: the cheapest one. That will be the one at the front
* of partial_pathlist because of the way add_partial_path works.
*/
! cheapest_partial_path = linitial(pathlist);
!
! if (!grouped)
! partial_target = rel->reltarget;
! else if (rel->gpi != NULL)
! partial_target = rel->gpi->target;
!
simple_gather_path = (Path *)
! create_gather_path(root, rel, cheapest_partial_path, partial_target,
NULL, NULL);
! add_path(rel, simple_gather_path, grouped);
/*
* For each useful ordering, we can consider an order-preserving Gather
* Merge.
*/
! foreach (lc, pathlist)
{
Path *subpath = (Path *) lfirst(lc);
GatherMergePath *path;
*************** generate_gather_paths(PlannerInfo *root,
*** 2228,2236 ****
if (subpath->pathkeys == NIL)
continue;
! path = create_gather_merge_path(root, rel, subpath, rel->reltarget,
subpath->pathkeys, NULL, NULL);
! add_path(rel, &path->path);
}
}
--- 2452,2460 ----
if (subpath->pathkeys == NIL)
continue;
! path = create_gather_merge_path(root, rel, subpath, partial_target,
subpath->pathkeys, NULL, NULL);
! add_path(rel, &path->path, grouped);
}
}
*************** standard_join_search(PlannerInfo *root,
*** 2388,2402 ****
* Run generate_gather_paths() for each just-processed joinrel. We
* could not do this earlier because both regular and partial paths
* can get added to a particular joinrel at multiple times within
! * join_search_one_level. After that, we're done creating paths for
! * the joinrel, so run set_cheapest().
*/
foreach(lc, root->join_rel_level[lev])
{
rel = (RelOptInfo *) lfirst(lc);
/* Create GatherPaths for any useful partial paths for rel */
! generate_gather_paths(root, rel);
/* Find and save the cheapest paths for this rel */
set_cheapest(rel);
--- 2612,2641 ----
* Run generate_gather_paths() for each just-processed joinrel. We
* could not do this earlier because both regular and partial paths
* can get added to a particular joinrel at multiple times within
! * join_search_one_level.
! *
! * Similarly, create paths for joinrels which used partition-wise join
! * technique. We could not do this earlier because paths can get added
! * to a particular child-join at multiple times within
! * join_search_one_level.
! *
! * After that, we're done creating paths for the joinrel, so run
! * set_cheapest().
*/
foreach(lc, root->join_rel_level[lev])
{
rel = (RelOptInfo *) lfirst(lc);
+ /*
+ * Create paths for partition-wise joins. Do this before creating
+ * GatherPaths so that partial "append" paths in partitioned joins
+ * will be considered.
+ */
+ generate_partition_wise_join_paths(root, rel);
+
/* Create GatherPaths for any useful partial paths for rel */
! generate_gather_paths(root, rel, false);
! generate_gather_paths(root, rel, true);
/* Find and save the cheapest paths for this rel */
set_cheapest(rel);
*************** create_partial_bitmap_paths(PlannerInfo
*** 3047,3053 ****
return;
add_partial_path(rel, (Path *) create_bitmap_heap_path(root, rel,
! bitmapqual, rel->lateral_relids, 1.0, parallel_workers));
}
/*
--- 3286,3292 ----
return;
add_partial_path(rel, (Path *) create_bitmap_heap_path(root, rel,
! bitmapqual, rel->lateral_relids, 1.0, parallel_workers), false);
}
/*
*************** compute_parallel_worker(RelOptInfo *rel,
*** 3142,3147 ****
--- 3381,3454 ----
return parallel_workers;
}
+ /*
+ * generate_partition_wise_join_paths
+ *
+ * Create paths representing partition-wise join for given partitioned
+ * join relation.
+ *
+ * This must not be called until after we are done adding paths for all
+ * child-joins. (Otherwise, add_path might delete a path that some "append"
+ * path has a reference to.)
+ */
+ void
+ generate_partition_wise_join_paths(PlannerInfo *root, RelOptInfo *rel)
+ {
+ List *live_children = NIL;
+ int cnt_parts;
+ int num_parts;
+ RelOptInfo **part_rels;
+
+ /* Handle only join relations. */
+ if (!IS_JOIN_REL(rel))
+ return;
+
+ /* If the relation is not partitioned or is proven dummy, nothing to do. */
+ if (!rel->part_scheme || !rel->boundinfo || IS_DUMMY_REL(rel))
+ return;
+
+ /* A partitioned join should have RelOptInfos of the child-joins. */
+ Assert(rel->part_rels && rel->nparts > 0);
+
+ /* Guard against stack overflow due to overly deep partition hierarchy. */
+ check_stack_depth();
+
+ num_parts = rel->nparts;
+ part_rels = rel->part_rels;
+
+ /* Collect non-dummy child-joins. */
+ for (cnt_parts = 0; cnt_parts < num_parts; cnt_parts++)
+ {
+ RelOptInfo *child_rel = part_rels[cnt_parts];
+
+ /* Add partition-wise join paths for partitioned child-joins. */
+ generate_partition_wise_join_paths(root, child_rel);
+
+ /* Dummy children will not be scanned, so ignore those. */
+ if (IS_DUMMY_REL(child_rel))
+ continue;
+
+ set_cheapest(child_rel);
+
+ #ifdef OPTIMIZER_DEBUG
+ debug_print_rel(root, child_rel);
+ #endif
+
+ live_children = lappend(live_children, child_rel);
+ }
+
+ /* If all child-joins are dummy, parent join is also dummy. */
+ if (!live_children)
+ {
+ mark_dummy_rel(rel);
+ return;
+ }
+
+ /* Add "append" paths containing paths from child-joins. */
+ add_paths_to_append_rel(root, rel, live_children);
+ list_free(live_children);
+ }
+
/*****************************************************************************
* DEBUG SUPPORT
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
new file mode 100644
index 52643d0..f278b77
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
*************** bool enable_material = true;
*** 127,132 ****
--- 127,133 ----
bool enable_mergejoin = true;
bool enable_hashjoin = true;
bool enable_gathermerge = true;
+ bool enable_partition_wise_join = false;
typedef struct
{
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
new file mode 100644
index 67bd760..780ea04
*** a/src/backend/optimizer/path/equivclass.c
--- b/src/backend/optimizer/path/equivclass.c
*************** generate_join_implied_equalities_broken(
*** 1329,1335 ****
if (IS_OTHER_REL(inner_rel) && result != NIL)
result = (List *) adjust_appendrel_attrs_multilevel(root,
(Node *) result,
! inner_rel);
return result;
}
--- 1329,1336 ----
if (IS_OTHER_REL(inner_rel) && result != NIL)
result = (List *) adjust_appendrel_attrs_multilevel(root,
(Node *) result,
! inner_rel->relids,
! inner_rel->top_parent_relids);
return result;
}
*************** add_child_rel_equivalences(PlannerInfo *
*** 2112,2118 ****
child_expr = (Expr *)
adjust_appendrel_attrs(root,
(Node *) cur_em->em_expr,
! appinfo);
/*
* Transform em_relids to match. Note we do *not* do
--- 2113,2119 ----
child_expr = (Expr *)
adjust_appendrel_attrs(root,
(Node *) cur_em->em_expr,
! 1, &appinfo);
/*
* Transform em_relids to match. Note we do *not* do
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
new file mode 100644
index 6e4bae8..a6fa713
*** a/src/backend/optimizer/path/indxpath.c
--- b/src/backend/optimizer/path/indxpath.c
***************
*** 32,37 ****
--- 32,38 ----
#include "optimizer/predtest.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+ #include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
*************** static bool eclass_already_used(Equivale
*** 107,119 ****
static bool bms_equal_any(Relids relids, List *relids_list);
static void get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
! List **bitindexpaths);
static List *build_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
! bool *skip_lower_saop);
static List *build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
List *clauses, List *other_clauses);
static List *generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
--- 108,121 ----
static bool bms_equal_any(Relids relids, List *relids_list);
static void get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
! List **bitindexpaths, bool grouped);
static List *build_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
! bool *skip_lower_saop,
! bool grouped);
static List *build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
List *clauses, List *other_clauses);
static List *generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
*************** static Const *string_to_const(const char
*** 229,235 ****
* as meaning "unparameterized so far as the indexquals are concerned".
*/
void
! create_index_paths(PlannerInfo *root, RelOptInfo *rel)
{
List *indexpaths;
List *bitindexpaths;
--- 231,237 ----
* as meaning "unparameterized so far as the indexquals are concerned".
*/
void
! create_index_paths(PlannerInfo *root, RelOptInfo *rel, bool grouped)
{
List *indexpaths;
List *bitindexpaths;
*************** create_index_paths(PlannerInfo *root, Re
*** 274,281 ****
* non-parameterized paths. Plain paths go directly to add_path(),
* bitmap paths are added to bitindexpaths to be handled below.
*/
! get_index_paths(root, rel, index, &rclauseset,
! &bitindexpaths);
/*
* Identify the join clauses that can match the index. For the moment
--- 276,283 ----
* non-parameterized paths. Plain paths go directly to add_path(),
* bitmap paths are added to bitindexpaths to be handled below.
*/
! get_index_paths(root, rel, index, &rclauseset, &bitindexpaths,
! grouped);
/*
* Identify the join clauses that can match the index. For the moment
*************** create_index_paths(PlannerInfo *root, Re
*** 338,344 ****
bitmapqual = choose_bitmap_and(root, rel, bitindexpaths);
bpath = create_bitmap_heap_path(root, rel, bitmapqual,
rel->lateral_relids, 1.0, 0);
! add_path(rel, (Path *) bpath);
/* create a partial bitmap heap path */
if (rel->consider_parallel && rel->lateral_relids == NULL)
--- 340,346 ----
bitmapqual = choose_bitmap_and(root, rel, bitindexpaths);
bpath = create_bitmap_heap_path(root, rel, bitmapqual,
rel->lateral_relids, 1.0, 0);
! add_path(rel, (Path *) bpath, false);
/* create a partial bitmap heap path */
if (rel->consider_parallel && rel->lateral_relids == NULL)
*************** create_index_paths(PlannerInfo *root, Re
*** 415,421 ****
loop_count = get_loop_count(root, rel->relid, required_outer);
bpath = create_bitmap_heap_path(root, rel, bitmapqual,
required_outer, loop_count, 0);
! add_path(rel, (Path *) bpath);
}
}
}
--- 417,423 ----
loop_count = get_loop_count(root, rel->relid, required_outer);
bpath = create_bitmap_heap_path(root, rel, bitmapqual,
required_outer, loop_count, 0);
! add_path(rel, (Path *) bpath, false);
}
}
}
*************** get_join_index_paths(PlannerInfo *root,
*** 667,673 ****
Assert(clauseset.nonempty);
/* Build index path(s) using the collected set of clauses */
! get_index_paths(root, rel, index, &clauseset, bitindexpaths);
/*
* Remember we considered paths for this set of relids. We use lcons not
--- 669,675 ----
Assert(clauseset.nonempty);
/* Build index path(s) using the collected set of clauses */
! get_index_paths(root, rel, index, &clauseset, bitindexpaths, false);
/*
* Remember we considered paths for this set of relids. We use lcons not
*************** bms_equal_any(Relids relids, List *relid
*** 736,742 ****
static void
get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
! List **bitindexpaths)
{
List *indexpaths;
bool skip_nonnative_saop = false;
--- 738,744 ----
static void
get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
! List **bitindexpaths, bool grouped)
{
List *indexpaths;
bool skip_nonnative_saop = false;
*************** get_index_paths(PlannerInfo *root, RelOp
*** 754,760 ****
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
! &skip_lower_saop);
/*
* If we skipped any lower-order ScalarArrayOpExprs on an index with an AM
--- 756,762 ----
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
! &skip_lower_saop, grouped);
/*
* If we skipped any lower-order ScalarArrayOpExprs on an index with an AM
*************** get_index_paths(PlannerInfo *root, RelOp
*** 769,775 ****
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
! NULL));
}
/*
--- 771,777 ----
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
! NULL, grouped));
}
/*
*************** get_index_paths(PlannerInfo *root, RelOp
*** 789,797 ****
IndexPath *ipath = (IndexPath *) lfirst(lc);
if (index->amhasgettuple)
! add_path(rel, (Path *) ipath);
! if (index->amhasgetbitmap &&
(ipath->path.pathkeys == NIL ||
ipath->indexselectivity < 1.0))
*bitindexpaths = lappend(*bitindexpaths, ipath);
--- 791,799 ----
IndexPath *ipath = (IndexPath *) lfirst(lc);
if (index->amhasgettuple)
! add_path(rel, (Path *) ipath, grouped);
! if (!grouped && index->amhasgetbitmap &&
(ipath->path.pathkeys == NIL ||
ipath->indexselectivity < 1.0))
*bitindexpaths = lappend(*bitindexpaths, ipath);
*************** get_index_paths(PlannerInfo *root, RelOp
*** 802,815 ****
* natively, generate bitmap scan paths relying on executor-managed
* ScalarArrayOpExpr.
*/
! if (skip_nonnative_saop)
{
indexpaths = build_index_paths(root, rel,
index, clauses,
false,
ST_BITMAPSCAN,
NULL,
! NULL);
*bitindexpaths = list_concat(*bitindexpaths, indexpaths);
}
}
--- 804,818 ----
* natively, generate bitmap scan paths relying on executor-managed
* ScalarArrayOpExpr.
*/
! if (!grouped && skip_nonnative_saop)
{
indexpaths = build_index_paths(root, rel,
index, clauses,
false,
ST_BITMAPSCAN,
NULL,
! NULL,
! false);
*bitindexpaths = list_concat(*bitindexpaths, indexpaths);
}
}
*************** build_index_paths(PlannerInfo *root, Rel
*** 861,867 ****
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
! bool *skip_lower_saop)
{
List *result = NIL;
IndexPath *ipath;
--- 864,870 ----
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
! bool *skip_lower_saop, bool grouped)
{
List *result = NIL;
IndexPath *ipath;
*************** build_index_paths(PlannerInfo *root, Rel
*** 878,883 ****
--- 881,890 ----
bool index_is_ordered;
bool index_only_scan;
int indexcol;
+ bool can_agg_sorted;
+ List *group_clauses, *group_exprs, *agg_exprs;
+ AggPath *agg_path;
+ double agg_input_rows;
/*
* Check that index supports the desired scan type(s)
*************** build_index_paths(PlannerInfo *root, Rel
*** 891,896 ****
--- 898,906 ----
case ST_BITMAPSCAN:
if (!index->amhasgetbitmap)
return NIL;
+
+ if (grouped)
+ return NIL;
break;
case ST_ANYSCAN:
/* either or both are OK */
*************** build_index_paths(PlannerInfo *root, Rel
*** 1032,1037 ****
--- 1042,1051 ----
* later merging or final output ordering, OR the index has a useful
* predicate, OR an index-only scan is possible.
*/
+ can_agg_sorted = true;
+ group_clauses = NIL;
+ group_exprs = NIL;
+ agg_exprs = NIL;
if (index_clauses != NIL || useful_pathkeys != NIL || useful_predicate ||
index_only_scan)
{
*************** build_index_paths(PlannerInfo *root, Rel
*** 1048,1054 ****
outer_relids,
loop_count,
false);
! result = lappend(result, ipath);
/*
* If appropriate, consider parallel index scan. We don't allow
--- 1062,1086 ----
outer_relids,
loop_count,
false);
! if (!grouped)
! result = lappend(result, ipath);
! else
! {
! /* TODO Double-check if this is the correct input value. */
! agg_input_rows = rel->rows * ipath->indexselectivity;
!
! agg_path = create_partial_agg_sorted_path(root, (Path *) ipath,
! true,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! agg_input_rows);
!
! if (agg_path != NULL)
! result = lappend(result, agg_path);
! else
! can_agg_sorted = false;
! }
/*
* If appropriate, consider parallel index scan. We don't allow
*************** build_index_paths(PlannerInfo *root, Rel
*** 1077,1083 ****
* using parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
! add_partial_path(rel, (Path *) ipath);
else
pfree(ipath);
}
--- 1109,1139 ----
* using parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
! {
! if (!grouped)
! add_partial_path(rel, (Path *) ipath, grouped);
! else if (can_agg_sorted && outer_relids == NULL)
! {
! /* TODO Double-check if this is the correct input value. */
! agg_input_rows = rel->rows * ipath->indexselectivity;
!
! agg_path = create_partial_agg_sorted_path(root,
! (Path *) ipath,
! false,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! agg_input_rows);
!
! /*
! * If create_partial_agg_sorted_path succeeded once, it should
! * always succeed.
! */
! Assert(agg_path != NULL);
!
! add_partial_path(rel, (Path *) agg_path, grouped);
! }
! }
else
pfree(ipath);
}
*************** build_index_paths(PlannerInfo *root, Rel
*** 1105,1111 ****
outer_relids,
loop_count,
false);
! result = lappend(result, ipath);
/* If appropriate, consider parallel index scan */
if (index->amcanparallel &&
--- 1161,1185 ----
outer_relids,
loop_count,
false);
!
! if (!grouped)
! result = lappend(result, ipath);
! else if (can_agg_sorted)
! {
! /* TODO Double-check if this is the correct input value. */
! agg_input_rows = rel->rows * ipath->indexselectivity;
!
! agg_path = create_partial_agg_sorted_path(root,
! (Path *) ipath,
! true,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! agg_input_rows);
!
! Assert(agg_path != NULL);
! result = lappend(result, agg_path);
! }
/* If appropriate, consider parallel index scan */
if (index->amcanparallel &&
*************** build_index_paths(PlannerInfo *root, Rel
*** 1129,1135 ****
* using parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
! add_partial_path(rel, (Path *) ipath);
else
pfree(ipath);
}
--- 1203,1227 ----
* using parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
! {
! if (!grouped)
! add_partial_path(rel, (Path *) ipath, grouped);
! else if (can_agg_sorted && outer_relids == NULL)
! {
! /* TODO Double-check if this is the correct input value. */
! agg_input_rows = rel->rows * ipath->indexselectivity;
!
! agg_path = create_partial_agg_sorted_path(root,
! (Path *) ipath,
! false,
! &group_clauses,
! &group_exprs,
! &agg_exprs,
! agg_input_rows);
! Assert(agg_path != NULL);
! add_partial_path(rel, (Path *) agg_path, grouped);
! }
! }
else
pfree(ipath);
}
*************** build_paths_for_OR(PlannerInfo *root, Re
*** 1244,1250 ****
useful_predicate,
ST_BITMAPSCAN,
NULL,
! NULL);
result = list_concat(result, indexpaths);
}
--- 1336,1343 ----
useful_predicate,
ST_BITMAPSCAN,
NULL,
! NULL,
! false);
result = list_concat(result, indexpaths);
}
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
new file mode 100644
index 5aedcd1..f25719f
*** a/src/backend/optimizer/path/joinpath.c
--- b/src/backend/optimizer/path/joinpath.c
***************
*** 22,34 ****
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
/* Hook for plugins to get control in add_paths_to_joinrel() */
set_join_pathlist_hook_type set_join_pathlist_hook = NULL;
! #define PATH_PARAM_BY_REL(path, rel) \
((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), (rel)->relids))
static void try_partial_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
Path *outer_path,
--- 22,45 ----
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
+ #include "optimizer/tlist.h"
/* Hook for plugins to get control in add_paths_to_joinrel() */
set_join_pathlist_hook_type set_join_pathlist_hook = NULL;
! /*
! * Paths parameterized by the parent can be considered to be parameterized by
! * any of its children.
! */
! #define PATH_PARAM_BY_PARENT(path, rel) \
! ((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), \
! (rel)->top_parent_relids))
! #define PATH_PARAM_BY_REL_SELF(path, rel) \
((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), (rel)->relids))
+ #define PATH_PARAM_BY_REL(path, rel) \
+ (PATH_PARAM_BY_REL_SELF(path, rel) || PATH_PARAM_BY_PARENT(path, rel))
+
static void try_partial_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
Path *outer_path,
*************** static void try_partial_mergejoin_path(P
*** 38,66 ****
List *outersortkeys,
List *innersortkeys,
JoinType jointype,
! JoinPathExtraData *extra);
static void sort_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra);
static void match_unsorted_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra);
static void consider_parallel_nestloop(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra);
static void consider_parallel_mergejoin(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
! Path *inner_cheapest_total);
static void hash_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra);
static List *select_mergejoin_clauses(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
--- 49,97 ----
List *outersortkeys,
List *innersortkeys,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate);
static void sort_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra,
! bool grouped);
! static void sort_inner_and_outer_common(PlannerInfo *root,
! RelOptInfo *joinrel,
! RelOptInfo *outerrel,
! RelOptInfo *innerrel,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped_outer,
! bool grouped_inner,
! bool do_aggregate);
static void match_unsorted_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra,
! bool grouped);
static void consider_parallel_nestloop(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped, bool do_aggregate);
static void consider_parallel_mergejoin(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
! Path *inner_cheapest_total,
! bool grouped);
static void hash_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
! JoinType jointype, JoinPathExtraData *extra,
! bool grouped);
! static bool is_grouped_join_target_complete(PlannerInfo *root,
! PathTarget *jointarget,
! Path *outer_path,
! Path *inner_path);
static List *select_mergejoin_clauses(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
*************** static void generate_mergejoin_paths(Pla
*** 77,83 ****
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
! bool is_partial);
/*
--- 108,117 ----
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
! bool is_partial,
! bool grouped_outer,
! bool grouped_inner,
! bool do_aggregate);
/*
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 115,120 ****
--- 149,167 ----
JoinPathExtraData extra;
bool mergejoin_allowed = true;
ListCell *lc;
+ Relids joinrelids;
+
+ /*
+ * PlannerInfo doesn't contain the SpecialJoinInfos created for joins
+ * between child relations, even if there is a SpecialJoinInfo node for
+ * the join between the topmost parents. Hence while calculating Relids
+ * set representing the restriction, consider relids of topmost parent
+ * of partitions.
+ */
+ if (joinrel->reloptkind == RELOPT_OTHER_JOINREL)
+ joinrelids = joinrel->top_parent_relids;
+ else
+ joinrelids = joinrel->relids;
extra.restrictlist = restrictlist;
extra.mergeclause_list = NIL;
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 197,212 ****
* join has already been proven legal.) If the SJ is relevant, it
* presents constraints for joining to anything not in its RHS.
*/
! if (bms_overlap(joinrel->relids, sjinfo2->min_righthand) &&
! !bms_overlap(joinrel->relids, sjinfo2->min_lefthand))
extra.param_source_rels = bms_join(extra.param_source_rels,
bms_difference(root->all_baserels,
sjinfo2->min_righthand));
/* full joins constrain both sides symmetrically */
if (sjinfo2->jointype == JOIN_FULL &&
! bms_overlap(joinrel->relids, sjinfo2->min_lefthand) &&
! !bms_overlap(joinrel->relids, sjinfo2->min_righthand))
extra.param_source_rels = bms_join(extra.param_source_rels,
bms_difference(root->all_baserels,
sjinfo2->min_lefthand));
--- 244,259 ----
* join has already been proven legal.) If the SJ is relevant, it
* presents constraints for joining to anything not in its RHS.
*/
! if (bms_overlap(joinrelids, sjinfo2->min_righthand) &&
! !bms_overlap(joinrelids, sjinfo2->min_lefthand))
extra.param_source_rels = bms_join(extra.param_source_rels,
bms_difference(root->all_baserels,
sjinfo2->min_righthand));
/* full joins constrain both sides symmetrically */
if (sjinfo2->jointype == JOIN_FULL &&
! bms_overlap(joinrelids, sjinfo2->min_lefthand) &&
! !bms_overlap(joinrelids, sjinfo2->min_righthand))
extra.param_source_rels = bms_join(extra.param_source_rels,
bms_difference(root->all_baserels,
sjinfo2->min_lefthand));
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 227,234 ****
* sorted. Skip this if we can't mergejoin.
*/
if (mergejoin_allowed)
sort_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra);
/*
* 2. Consider paths where the outer relation need not be explicitly
--- 274,285 ----
* sorted. Skip this if we can't mergejoin.
*/
if (mergejoin_allowed)
+ {
sort_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, false);
! sort_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, true);
! }
/*
* 2. Consider paths where the outer relation need not be explicitly
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 238,245 ****
* joins at all, so it wouldn't work in the prohibited cases either.)
*/
if (mergejoin_allowed)
match_unsorted_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra);
#ifdef NOT_USED
--- 289,300 ----
* joins at all, so it wouldn't work in the prohibited cases either.)
*/
if (mergejoin_allowed)
+ {
match_unsorted_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, false);
! match_unsorted_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, true);
! }
#ifdef NOT_USED
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 265,272 ****
* joins, because there may be no other alternative.
*/
if (enable_hashjoin || jointype == JOIN_FULL)
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra);
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
--- 320,331 ----
* joins, because there may be no other alternative.
*/
if (enable_hashjoin || jointype == JOIN_FULL)
+ {
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, false);
! hash_inner_and_outer(root, joinrel, outerrel, innerrel,
! jointype, &extra, true);
! }
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
*************** add_paths_to_joinrel(PlannerInfo *root,
*** 304,321 ****
*/
static inline bool
allow_star_schema_join(PlannerInfo *root,
! Path *outer_path,
! Path *inner_path)
{
- Relids innerparams = PATH_REQ_OUTER(inner_path);
- Relids outerrelids = outer_path->parent->relids;
-
/*
* It's a star-schema case if the outer rel provides some but not all of
* the inner rel's parameterization.
*/
! return (bms_overlap(innerparams, outerrelids) &&
! bms_nonempty_difference(innerparams, outerrelids));
}
/*
--- 363,377 ----
*/
static inline bool
allow_star_schema_join(PlannerInfo *root,
! Relids outerrelids,
! Relids inner_paramrels)
{
/*
* It's a star-schema case if the outer rel provides some but not all of
* the inner rel's parameterization.
*/
! return (bms_overlap(inner_paramrels, outerrelids) &&
! bms_nonempty_difference(inner_paramrels, outerrelids));
}
/*
*************** try_nestloop_path(PlannerInfo *root,
*** 330,339 ****
Path *inner_path,
List *pathkeys,
JoinType jointype,
! JoinPathExtraData *extra)
{
Relids required_outer;
JoinCostWorkspace workspace;
/*
* Check to see if proposed path is still parameterized, and reject if the
--- 386,427 ----
Path *inner_path,
List *pathkeys,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
+ RelOptInfo *innerrel = inner_path->parent;
+ RelOptInfo *outerrel = outer_path->parent;
+ Relids innerrelids;
+ Relids outerrelids;
+ Relids inner_paramrels = PATH_REQ_OUTER(inner_path);
+ Relids outer_paramrels = PATH_REQ_OUTER(outer_path);
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* Caller should not request aggregation w/o grouped output. */
+ Assert(!do_aggregate || grouped);
+
+ /* GroupedPathInfo is necessary for us to produce a grouped set. */
+ Assert(joinrel->gpi != NULL || !grouped);
+
+ /*
+ * Parameterized paths in the child relations (base or join) are
+ * parameterized by the top-level parent. Any paths we would create that are
+ * parameterized by the child relations are not added to the
+ * pathlist. Hence run parameterization tests on the parent relids.
+ */
+ if (innerrel->top_parent_relids)
+ innerrelids = innerrel->top_parent_relids;
+ else
+ innerrelids = innerrel->relids;
+
+ if (outerrel->top_parent_relids)
+ outerrelids = outerrel->top_parent_relids;
+ else
+ outerrelids = outerrel->relids;
/*
* Check to see if proposed path is still parameterized, and reject if the
*************** try_nestloop_path(PlannerInfo *root,
*** 341,359 ****
* says to allow it anyway. Also, we must reject if have_dangerous_phv
* doesn't like the look of it, which could only happen if the nestloop is
* still parameterized.
*/
! required_outer = calc_nestloop_required_outer(outer_path,
! inner_path);
! if (required_outer &&
! ((!bms_overlap(required_outer, extra->param_source_rels) &&
! !allow_star_schema_join(root, outer_path, inner_path)) ||
! have_dangerous_phv(root,
! outer_path->parent->relids,
! PATH_REQ_OUTER(inner_path))))
{
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
}
/*
--- 429,452 ----
* says to allow it anyway. Also, we must reject if have_dangerous_phv
* doesn't like the look of it, which could only happen if the nestloop is
* still parameterized.
+ *
+ * Grouped path should never be parameterized.
*/
! required_outer = calc_nestloop_required_outer(outerrelids, outer_paramrels,
! innerrelids, inner_paramrels);
! if (required_outer)
{
! if (grouped ||
! (!bms_overlap(required_outer, extra->param_source_rels) &&
! !allow_star_schema_join(root, outerrelids, inner_paramrels)) ||
! have_dangerous_phv(root,
! outer_path->parent->relids,
! PATH_REQ_OUTER(inner_path)))
! {
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
! }
}
/*
*************** try_nestloop_path(PlannerInfo *root,
*** 368,388 ****
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
! if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! pathkeys, required_outer))
{
! add_path(joinrel, (Path *)
! create_nestloop_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! required_outer));
}
else
{
--- 461,522 ----
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! join_target = joinrel->gpi->target;
!
! join_path = (Path *) create_nestloop_path(root, joinrel, jointype,
! &workspace, extra,
! outer_path, inner_path,
! extra->restrictlist, pathkeys,
! required_outer, join_target);
!
! /* Do partial aggregation if needed. */
! if (do_aggregate && required_outer == NULL)
! {
! create_grouped_path(root, joinrel, join_path, true, false,
! AGG_HASHED);
! create_grouped_path(root, joinrel, join_path, true, false,
! AGG_SORTED);
! }
! else if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! pathkeys, required_outer, grouped))
{
! /*
! * Since result produced by a child is part of the result produced by
! * its topmost parent and has same properties, the parameters
! * representing that parent may be substituted by values from a child.
! * Hence expressions and hence paths using those expressions,
! * parameterized by a parent can be said to be parameterized by any of
! * its children. For a join between child relations, if the inner path is
! * parameterized by the parent of the outer relation, create a
! * nestloop join path with inner relation parameterized by the outer
! * relation by translating the inner path to be parameterized by the
! * outer child relation. The translated path should have the same costs
! * as the original path, so cost check above should still hold.
! */
! if (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent))
! {
! inner_path = reparameterize_path_by_child(root, inner_path,
! outer_path->parent);
!
! /*
! * If we could not translate the path, we can't create nest loop
! * path.
! */
! if (!inner_path)
! return;
! }
!
! add_path(joinrel, join_path, grouped);
}
else
{
*************** try_partial_nestloop_path(PlannerInfo *r
*** 403,411 ****
Path *inner_path,
List *pathkeys,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinCostWorkspace workspace;
/*
* If the inner path is parameterized, the parameterization must be fully
--- 537,553 ----
Path *inner_path,
List *pathkeys,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate)
{
JoinCostWorkspace workspace;
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* The same checks we do in try_nestloop_path. */
+ Assert(!do_aggregate || grouped);
+ Assert(joinrel->gpi != NULL || !grouped);
/*
* If the inner path is parameterized, the parameterization must be fully
*************** try_partial_nestloop_path(PlannerInfo *r
*** 428,448 ****
*/
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
! if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
return;
! /* Might be good enough to be worth trying, so let's try it. */
! add_partial_path(joinrel, (Path *)
! create_nestloop_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! NULL));
}
/*
--- 570,650 ----
*/
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
!
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! {
! Assert(joinrel->gpi != NULL);
! join_target = joinrel->gpi->target;
! }
!
! join_path = (Path *) create_nestloop_path(root, joinrel, jointype,
! &workspace, extra,
! outer_path, inner_path,
! extra->restrictlist, pathkeys,
! NULL, join_target);
!
! if (do_aggregate)
! {
! create_grouped_path(root, joinrel, join_path, true, true, AGG_HASHED);
! create_grouped_path(root, joinrel, join_path, true, true, AGG_SORTED);
! }
! else if (add_partial_path_precheck(joinrel, workspace.total_cost,
! pathkeys, grouped))
! {
! /* Might be good enough to be worth trying, so let's try it. */
! add_partial_path(joinrel, (Path *) join_path, grouped);
! }
! }
!
! static void
! try_grouped_nestloop_path(PlannerInfo *root,
! RelOptInfo *joinrel,
! Path *outer_path,
! Path *inner_path,
! List *pathkeys,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool do_aggregate,
! bool partial)
! {
! /*
! * Missing GroupedPathInfo indicates that we should not try to create a
! * grouped join.
! */
! if (joinrel->gpi == NULL)
return;
! /*
! * Reject the path if we're supposed to combine grouped and plain relation
! * but the grouped one does not evaluate all the relevant aggregates.
! */
! if (!do_aggregate &&
! !is_grouped_join_target_complete(root, joinrel->gpi->target,
! outer_path, inner_path))
! return;
!
! /*
! * As repeated aggregation doesn't seem to be attractive, make sure that
! * the resulting grouped relation is not parameterized.
! */
! if (outer_path->param_info != NULL || inner_path->param_info != NULL)
! return;
!
! if (!partial)
! try_nestloop_path(root, joinrel, outer_path, inner_path, pathkeys,
! jointype, extra, true, do_aggregate);
! else
! try_partial_nestloop_path(root, joinrel, outer_path, inner_path,
! pathkeys, jointype, extra, true,
! do_aggregate);
}
/*
*************** try_mergejoin_path(PlannerInfo *root,
*** 461,470 ****
List *innersortkeys,
JoinType jointype,
JoinPathExtraData *extra,
! bool is_partial)
{
Relids required_outer;
JoinCostWorkspace workspace;
if (is_partial)
{
--- 663,682 ----
List *innersortkeys,
JoinType jointype,
JoinPathExtraData *extra,
! bool is_partial,
! bool grouped,
! bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* Caller should not request aggregation w/o grouped output. */
+ Assert(!do_aggregate || grouped);
+
+ /* GroupedPathInfo is necessary for us to produce a grouped set. */
+ Assert(joinrel->gpi != NULL || !grouped);
if (is_partial)
{
*************** try_mergejoin_path(PlannerInfo *root,
*** 477,498 ****
outersortkeys,
innersortkeys,
jointype,
! extra);
return;
}
/*
! * Check to see if proposed path is still parameterized, and reject if the
! * parameterization wouldn't be sensible.
*/
! required_outer = calc_non_nestloop_required_outer(outer_path,
! inner_path);
! if (required_outer &&
! !bms_overlap(required_outer, extra->param_source_rels))
{
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
}
/*
--- 689,713 ----
outersortkeys,
innersortkeys,
jointype,
! extra,
! grouped,
! do_aggregate);
return;
}
/*
! * Check to see if proposed path is still parameterized, and reject if
! * it's grouped or if the parameterization wouldn't be sensible.
*/
! required_outer = calc_non_nestloop_required_outer(outer_path, inner_path);
! if (required_outer)
{
! if (grouped || !bms_overlap(required_outer, extra->param_source_rels))
! {
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
! }
}
/*
*************** try_mergejoin_path(PlannerInfo *root,
*** 511,537 ****
*/
initial_cost_mergejoin(root, &workspace, jointype, mergeclauses,
outer_path, inner_path,
! outersortkeys, innersortkeys,
! extra);
! if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! pathkeys, required_outer))
{
! add_path(joinrel, (Path *)
! create_mergejoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! required_outer,
! mergeclauses,
! outersortkeys,
! innersortkeys));
}
else
{
--- 726,773 ----
*/
initial_cost_mergejoin(root, &workspace, jointype, mergeclauses,
outer_path, inner_path,
! outersortkeys, innersortkeys, extra);
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! join_target = joinrel->gpi->target;
!
!
! join_path = (Path *) create_mergejoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! required_outer,
! mergeclauses,
! outersortkeys,
! innersortkeys,
! join_target);
!
! /* Do partial aggregation if needed. */
! if (do_aggregate)
! {
! create_grouped_path(root, joinrel, join_path, true, false,
! AGG_HASHED);
! create_grouped_path(root, joinrel, join_path, true, false,
! AGG_SORTED);
! }
! else if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! pathkeys, required_outer, grouped))
{
! add_path(joinrel, (Path *) join_path, grouped);
}
else
{
*************** try_partial_mergejoin_path(PlannerInfo *
*** 555,563 ****
List *outersortkeys,
List *innersortkeys,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinCostWorkspace workspace;
/*
* See comments in try_partial_hashjoin_path().
--- 791,807 ----
List *outersortkeys,
List *innersortkeys,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate)
{
JoinCostWorkspace workspace;
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* The same checks we do in try_mergejoin_path. */
+ Assert(!do_aggregate || grouped);
+ Assert(joinrel->gpi != NULL || !grouped);
/*
* See comments in try_partial_hashjoin_path().
*************** try_partial_mergejoin_path(PlannerInfo *
*** 587,613 ****
*/
initial_cost_mergejoin(root, &workspace, jointype, mergeclauses,
outer_path, inner_path,
! outersortkeys, innersortkeys,
! extra);
! if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
return;
! /* Might be good enough to be worth trying, so let's try it. */
! add_partial_path(joinrel, (Path *)
! create_mergejoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! NULL,
! mergeclauses,
! outersortkeys,
! innersortkeys));
}
/*
--- 831,1003 ----
*/
initial_cost_mergejoin(root, &workspace, jointype, mergeclauses,
outer_path, inner_path,
! outersortkeys, innersortkeys, extra);
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! {
! Assert(joinrel->gpi != NULL);
! join_target = joinrel->gpi->target;
! }
!
! join_path = (Path *) create_mergejoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! pathkeys,
! NULL,
! mergeclauses,
! outersortkeys,
! innersortkeys,
! join_target);
!
! if (do_aggregate)
! {
! create_grouped_path(root, joinrel, join_path, true, true, AGG_HASHED);
! create_grouped_path(root, joinrel, join_path, true, true, AGG_SORTED);
! }
! else if (add_partial_path_precheck(joinrel, workspace.total_cost,
! pathkeys, grouped))
! {
! /* Might be good enough to be worth trying, so let's try it. */
! add_partial_path(joinrel, (Path *) join_path, grouped);
! }
! }
!
! static void
! try_grouped_mergejoin_path(PlannerInfo *root,
! RelOptInfo *joinrel,
! Path *outer_path,
! Path *inner_path,
! List *pathkeys,
! List *mergeclauses,
! List *outersortkeys,
! List *innersortkeys,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool partial,
! bool do_aggregate)
! {
! /*
! * Missing GroupedPathInfo indicates that we should not try to create a
! * grouped join.
! */
! if (joinrel->gpi == NULL)
return;
! /*
! * Reject the path if we're supposed to combine grouped and plain relation
! * but the grouped one does not evaluate all the relevant aggregates.
! */
! if (!do_aggregate &&
! !is_grouped_join_target_complete(root, joinrel->gpi->target,
! outer_path, inner_path))
! return;
!
! /*
! * As repeated aggregation doesn't seem to be attractive, make sure that
! * the resulting grouped relation is not parameterized.
! */
! if (outer_path->param_info != NULL || inner_path->param_info != NULL)
! return;
!
! if (!partial)
! try_mergejoin_path(root, joinrel, outer_path, inner_path, pathkeys,
! mergeclauses, outersortkeys, innersortkeys,
! jointype, extra, false, true, do_aggregate);
! else
! try_partial_mergejoin_path(root, joinrel, outer_path, inner_path,
! pathkeys,
! mergeclauses, outersortkeys, innersortkeys,
! jointype, extra, true, do_aggregate);
! }
!
! static void
! try_mergejoin_path_common(PlannerInfo *root,
! RelOptInfo *joinrel,
! Path *outer_path,
! Path *inner_path,
! List *pathkeys,
! List *mergeclauses,
! List *outersortkeys,
! List *innersortkeys,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool partial,
! bool grouped_outer,
! bool grouped_inner,
! bool do_aggregate)
! {
! bool grouped_join;
!
! grouped_join = grouped_outer || grouped_inner || do_aggregate;
!
! /* Join of two grouped paths is not supported. */
! Assert(!(grouped_outer && grouped_inner));
!
! if (!grouped_join)
! {
! /* Only join plain paths. */
! try_mergejoin_path(root,
! joinrel,
! outer_path,
! inner_path,
! pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys,
! jointype,
! extra,
! partial,
! false, false);
! }
! else if (grouped_outer || grouped_inner)
! {
! Assert(!do_aggregate);
!
! /*
! * Exactly one of the input paths is grouped, so create a grouped join
! * path.
! */
! try_grouped_mergejoin_path(root,
! joinrel,
! outer_path,
! inner_path,
! pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys,
! jointype,
! extra,
! partial,
! false);
! }
! /* Perform explicit aggregation only if suitable target exists. */
! else if (joinrel->gpi != NULL)
! {
! try_grouped_mergejoin_path(root,
! joinrel,
! outer_path,
! inner_path,
! pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys,
! jointype,
! extra,
! partial, true);
! }
}
/*
*************** try_hashjoin_path(PlannerInfo *root,
*** 622,668 ****
Path *inner_path,
List *hashclauses,
JoinType jointype,
! JoinPathExtraData *extra)
{
Relids required_outer;
JoinCostWorkspace workspace;
/*
! * Check to see if proposed path is still parameterized, and reject if the
! * parameterization wouldn't be sensible.
*/
! required_outer = calc_non_nestloop_required_outer(outer_path,
! inner_path);
! if (required_outer &&
! !bms_overlap(required_outer, extra->param_source_rels))
{
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
}
/*
* See comments in try_nestloop_path(). Also note that hashjoin paths
* never have any output pathkeys, per comments in create_hashjoin_path.
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra);
! if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! NIL, required_outer))
{
! add_path(joinrel, (Path *)
! create_hashjoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! required_outer,
! hashclauses));
}
else
{
--- 1012,1086 ----
Path *inner_path,
List *hashclauses,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* Caller should not request aggregation w/o grouped output. */
+ Assert(!do_aggregate || grouped);
+
+ /* GroupedPathInfo is necessary for us to produce a grouped set. */
+ Assert(joinrel->gpi != NULL || !grouped);
/*
! * Check to see if proposed path is still parameterized, and reject if
! * it's grouped or if the parameterization wouldn't be sensible.
*/
! required_outer = calc_non_nestloop_required_outer(outer_path, inner_path);
! if (required_outer)
{
! if (grouped || !bms_overlap(required_outer, extra->param_source_rels))
! {
! /* Waste no memory when we reject a path here */
! bms_free(required_outer);
! return;
! }
}
/*
* See comments in try_nestloop_path(). Also note that hashjoin paths
* never have any output pathkeys, per comments in create_hashjoin_path.
+ *
+ * TODO Need to consider aggregation here?
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra);
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! join_target = joinrel->gpi->target;
!
! join_path = (Path *) create_hashjoin_path(root, joinrel, jointype,
! &workspace,
! extra,
! outer_path, inner_path,
! extra->restrictlist,
! required_outer, hashclauses,
! join_target);
!
! /* Do partial aggregation if needed. */
! if (do_aggregate)
! {
! create_grouped_path(root, joinrel, join_path, true, false,
! AGG_HASHED);
! }
! else if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
! NIL, required_outer, grouped))
{
! add_path(joinrel, (Path *) join_path, grouped);
}
else
{
*************** try_partial_hashjoin_path(PlannerInfo *r
*** 683,691 ****
Path *inner_path,
List *hashclauses,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinCostWorkspace workspace;
/*
* If the inner path is parameterized, the parameterization must be fully
--- 1101,1117 ----
Path *inner_path,
List *hashclauses,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped,
! bool do_aggregate)
{
JoinCostWorkspace workspace;
+ Path *join_path;
+ PathTarget *join_target;
+
+ /* The same checks we do in try_hashjoin_path. */
+ Assert(!do_aggregate || grouped);
+ Assert(joinrel->gpi != NULL || !grouped);
/*
* If the inner path is parameterized, the parameterization must be fully
*************** try_partial_hashjoin_path(PlannerInfo *r
*** 708,728 ****
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra);
! if (!add_partial_path_precheck(joinrel, workspace.total_cost, NIL))
return;
! /* Might be good enough to be worth trying, so let's try it. */
! add_partial_path(joinrel, (Path *)
! create_hashjoin_path(root,
! joinrel,
! jointype,
! &workspace,
! extra,
! outer_path,
! inner_path,
! extra->restrictlist,
! NULL,
! hashclauses));
}
/*
--- 1134,1229 ----
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra);
!
! /*
! * Determine which target the join should produce.
! *
! * In the case of explicit aggregation, output of the join itself is
! * plain.
! */
! if (!grouped || do_aggregate)
! join_target = joinrel->reltarget;
! else
! {
! Assert(joinrel->gpi != NULL);
! join_target = joinrel->gpi->target;
! }
!
! join_path = (Path *) create_hashjoin_path(root, joinrel, jointype,
! &workspace,
! extra,
! outer_path, inner_path,
! extra->restrictlist, NULL,
! hashclauses, join_target);
!
! /* Do partial aggregation if needed. */
! if (do_aggregate)
! {
! create_grouped_path(root, joinrel, join_path, true, true, AGG_HASHED);
! }
! else if (add_partial_path_precheck(joinrel, workspace.total_cost,
! NIL, grouped))
! {
! add_partial_path(joinrel, (Path *) join_path, grouped);
! }
! }
!
! /*
! * Create a new grouped hash join path by joining a grouped path to plain
! * (non-grouped) one, or by joining 2 plain relations and applying grouping on
! * the result.
! *
! * Joining of 2 grouped paths is not supported. If a grouped relation A was
! * joined to grouped relation B, then the grouping of B reduces the number of
! * times each group of A appears in the join output. This makes a difference
! * for some aggregates, e.g. sum().
! *
! * If do_aggregate is true, neither input rel is grouped so we need to
! * aggregate the join result explicitly.
! *
! * partial argument tells whether the join path should be considered partial.
! */
! static void
! try_grouped_hashjoin_path(PlannerInfo *root,
! RelOptInfo *joinrel,
! Path *outer_path,
! Path *inner_path,
! List *hashclauses,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool do_aggregate,
! bool partial)
! {
! /*
! * Missing GroupedPathInfo indicates that we should not try to create a
! * grouped join.
! */
! if (joinrel->gpi == NULL)
return;
! /*
! * Reject the path if we're supposed to combine grouped and plain relation
! * but the grouped one does not evaluate all the relevant aggregates.
! */
! if (!do_aggregate &&
! !is_grouped_join_target_complete(root, joinrel->gpi->target,
! outer_path, inner_path))
! return;
!
! /*
! * As repeated aggregation doesn't seem to be attractive, make sure that
! * the resulting grouped relation is not parameterized.
! */
! if (outer_path->param_info != NULL || inner_path->param_info != NULL)
! return;
!
! if (!partial)
! try_hashjoin_path(root, joinrel, outer_path, inner_path, hashclauses,
! jointype, extra, true, do_aggregate);
! else
! try_partial_hashjoin_path(root, joinrel, outer_path, inner_path,
! hashclauses, jointype, extra, true,
! do_aggregate);
}
/*
*************** sort_inner_and_outer(PlannerInfo *root,
*** 773,779 ****
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinType save_jointype = jointype;
Path *outer_path;
--- 1274,1313 ----
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped)
! {
! if (!grouped)
! {
! sort_inner_and_outer_common(root, joinrel, outerrel, innerrel,
! jointype, extra, false, false, false);
! }
! else
! {
! /* Use all the supported strategies to generate grouped join. */
! sort_inner_and_outer_common(root, joinrel, outerrel, innerrel,
! jointype, extra, true, false, false);
! sort_inner_and_outer_common(root, joinrel, outerrel, innerrel,
! jointype, extra, false, true, false);
! sort_inner_and_outer_common(root, joinrel, outerrel, innerrel,
! jointype, extra, false, false, true);
! }
! }
!
! /*
! * TODO As merge_pathkeys shouldn't differ across execution, use a separate
! * function to derive them and pass them here in a list.
! */
! static void
! sort_inner_and_outer_common(PlannerInfo *root,
! RelOptInfo *joinrel,
! RelOptInfo *outerrel,
! RelOptInfo *innerrel,
! JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped_outer,
! bool grouped_inner,
! bool do_aggregate)
{
JoinType save_jointype = jointype;
Path *outer_path;
*************** sort_inner_and_outer(PlannerInfo *root,
*** 782,787 ****
--- 1316,1322 ----
Path *cheapest_safe_inner = NULL;
List *all_pathkeys;
ListCell *l;
+ bool grouped_result;
/*
* We only consider the cheapest-total-cost input paths, since we are
*************** sort_inner_and_outer(PlannerInfo *root,
*** 796,803 ****
* against mergejoins with parameterized inputs; see comments in
* src/backend/optimizer/README.
*/
! outer_path = outerrel->cheapest_total_path;
! inner_path = innerrel->cheapest_total_path;
/*
* If either cheapest-total path is parameterized by the other rel, we
--- 1331,1357 ----
* against mergejoins with parameterized inputs; see comments in
* src/backend/optimizer/README.
*/
! if (grouped_outer)
! {
! if (outerrel->gpi != NULL && outerrel->gpi->pathlist != NIL)
! outer_path = linitial(outerrel->gpi->pathlist);
! else
! return;
! }
! else
! outer_path = outerrel->cheapest_total_path;
!
! if (grouped_inner)
! {
! if (innerrel->gpi != NULL && innerrel->gpi->pathlist != NIL)
! inner_path = linitial(innerrel->gpi->pathlist);
! else
! return;
! }
! else
! inner_path = innerrel->cheapest_total_path;
!
! grouped_result = grouped_outer || grouped_inner || do_aggregate;
/*
* If either cheapest-total path is parameterized by the other rel, we
*************** sort_inner_and_outer(PlannerInfo *root,
*** 843,855 ****
outerrel->partial_pathlist != NIL &&
bms_is_empty(joinrel->lateral_relids))
{
! cheapest_partial_outer = (Path *) linitial(outerrel->partial_pathlist);
if (inner_path->parallel_safe)
cheapest_safe_inner = inner_path;
else if (save_jointype != JOIN_UNIQUE_INNER)
cheapest_safe_inner =
! get_cheapest_parallel_safe_total_inner(innerrel->pathlist);
}
/*
--- 1397,1446 ----
outerrel->partial_pathlist != NIL &&
bms_is_empty(joinrel->lateral_relids))
{
! if (grouped_outer)
! {
! if (outerrel->gpi != NULL && outerrel->gpi->partial_pathlist != NIL)
! cheapest_partial_outer = (Path *)
! linitial(outerrel->gpi->partial_pathlist);
! else
! return;
! }
! else
! cheapest_partial_outer = (Path *)
! linitial(outerrel->partial_pathlist);
!
! if (grouped_inner)
! {
! if (innerrel->gpi != NULL && innerrel->gpi->pathlist != NIL)
! inner_path = linitial(innerrel->gpi->pathlist);
! else
! return;
! }
! else
! inner_path = innerrel->cheapest_total_path;
if (inner_path->parallel_safe)
cheapest_safe_inner = inner_path;
else if (save_jointype != JOIN_UNIQUE_INNER)
+ {
+ List *inner_pathlist;
+
+ if (!grouped_inner)
+ inner_pathlist = innerrel->pathlist;
+ else
+ {
+ Assert(innerrel->gpi != NULL);
+ inner_pathlist = innerrel->gpi->pathlist;
+ }
+
+ /*
+ * All the grouped paths should be unparameterized, so the
+ * function is overly stringent in the grouped_inner case, but
+ * still useful.
+ */
cheapest_safe_inner =
! get_cheapest_parallel_safe_total_inner(inner_pathlist);
! }
}
/*
*************** sort_inner_and_outer(PlannerInfo *root,
*** 925,957 ****
* properly. try_mergejoin_path will detect that case and suppress an
* explicit sort step, so we needn't do so here.
*/
! try_mergejoin_path(root,
! joinrel,
! outer_path,
! inner_path,
! merge_pathkeys,
! cur_mergeclauses,
! outerkeys,
! innerkeys,
! jointype,
! extra,
! false);
/*
* If we have partial outer and parallel safe inner path then try
* partial mergejoin path.
*/
if (cheapest_partial_outer && cheapest_safe_inner)
! try_partial_mergejoin_path(root,
! joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! merge_pathkeys,
! cur_mergeclauses,
! outerkeys,
! innerkeys,
! jointype,
! extra);
}
}
--- 1516,1574 ----
* properly. try_mergejoin_path will detect that case and suppress an
* explicit sort step, so we needn't do so here.
*/
! if (!grouped_result)
! try_mergejoin_path(root,
! joinrel,
! outer_path,
! inner_path,
! merge_pathkeys,
! cur_mergeclauses,
! outerkeys,
! innerkeys,
! jointype,
! extra,
! false, false, false);
! else
! {
! try_mergejoin_path_common(root, joinrel, outer_path, inner_path,
! merge_pathkeys, cur_mergeclauses,
! outerkeys, innerkeys, jointype, extra,
! false,
! grouped_outer, grouped_inner,
! do_aggregate);
! }
/*
* If we have partial outer and parallel safe inner path then try
* partial mergejoin path.
*/
if (cheapest_partial_outer && cheapest_safe_inner)
! {
! if (!grouped_result)
! {
! try_partial_mergejoin_path(root,
! joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! merge_pathkeys,
! cur_mergeclauses,
! outerkeys,
! innerkeys,
! jointype,
! extra, false, false);
! }
! else
! {
! try_mergejoin_path_common(root, joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! merge_pathkeys, cur_mergeclauses,
! outerkeys, innerkeys, jointype, extra,
! true,
! grouped_outer, grouped_inner,
! do_aggregate);
! }
! }
}
}
*************** sort_inner_and_outer(PlannerInfo *root,
*** 968,973 ****
--- 1585,1598 ----
* some sort key requirements). So, we consider truncations of the
* mergeclause list as well as the full list. (Ideally we'd consider all
* subsets of the mergeclause list, but that seems way too expensive.)
+ *
+ * grouped_outer - is outerpath grouped?
+ * grouped_inner - use grouped paths of innerrel?
+ * do_aggregate - apply (partial) aggregation to the output?
+ *
+ * TODO If subsequent calls often differ only by the 3 arguments above,
+ * consider a workspace structure to share useful info (eg merge clauses)
+ * across calls.
*/
static void
generate_mergejoin_paths(PlannerInfo *root,
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 979,985 ****
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
! bool is_partial)
{
List *mergeclauses;
List *innersortkeys;
--- 1604,1613 ----
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
! bool is_partial,
! bool grouped_outer,
! bool grouped_inner,
! bool do_aggregate)
{
List *mergeclauses;
List *innersortkeys;
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 1030,1046 ****
* try_mergejoin_path will do the right thing if inner_cheapest_total is
* already correctly sorted.)
*/
! try_mergejoin_path(root,
! joinrel,
! outerpath,
! inner_cheapest_total,
! merge_pathkeys,
! mergeclauses,
! NIL,
! innersortkeys,
! jointype,
! extra,
! is_partial);
/* Can't do anything else if inner path needs to be unique'd */
if (save_jointype == JOIN_UNIQUE_INNER)
--- 1658,1675 ----
* try_mergejoin_path will do the right thing if inner_cheapest_total is
* already correctly sorted.)
*/
! try_mergejoin_path_common(root,
! joinrel,
! outerpath,
! inner_cheapest_total,
! merge_pathkeys,
! mergeclauses,
! NIL,
! innersortkeys,
! jointype,
! extra,
! is_partial,
! grouped_outer, grouped_inner, do_aggregate);
/* Can't do anything else if inner path needs to be unique'd */
if (save_jointype == JOIN_UNIQUE_INNER)
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 1096,1111 ****
for (sortkeycnt = num_sortkeys; sortkeycnt > 0; sortkeycnt--)
{
Path *innerpath;
List *newclauses = NIL;
/*
* Look for an inner path ordered well enough for the first
* 'sortkeycnt' innersortkeys. NB: trialsortkeys list is modified
* destructively, which is why we made a copy...
*/
trialsortkeys = list_truncate(trialsortkeys, sortkeycnt);
! innerpath = get_cheapest_path_for_pathkeys(innerrel->pathlist,
trialsortkeys,
NULL,
TOTAL_COST,
--- 1725,1746 ----
for (sortkeycnt = num_sortkeys; sortkeycnt > 0; sortkeycnt--)
{
+ List *inner_pathlist = NIL;
Path *innerpath;
List *newclauses = NIL;
+ if (!grouped_inner)
+ inner_pathlist = innerrel->pathlist;
+ else if (innerrel->gpi != NULL)
+ inner_pathlist = innerrel->gpi->pathlist;
+
/*
* Look for an inner path ordered well enough for the first
* 'sortkeycnt' innersortkeys. NB: trialsortkeys list is modified
* destructively, which is why we made a copy...
*/
trialsortkeys = list_truncate(trialsortkeys, sortkeycnt);
! innerpath = get_cheapest_path_for_pathkeys(inner_pathlist,
trialsortkeys,
NULL,
TOTAL_COST,
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 1128,1148 ****
}
else
newclauses = mergeclauses;
! try_mergejoin_path(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! newclauses,
! NIL,
! NIL,
! jointype,
! extra,
! is_partial);
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
! innerpath = get_cheapest_path_for_pathkeys(innerrel->pathlist,
trialsortkeys,
NULL,
STARTUP_COST,
--- 1763,1787 ----
}
else
newclauses = mergeclauses;
!
! try_mergejoin_path_common(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! newclauses,
! NIL,
! NIL,
! jointype,
! extra,
! is_partial,
! grouped_outer, grouped_inner,
! do_aggregate);
!
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
! innerpath = get_cheapest_path_for_pathkeys(inner_pathlist,
trialsortkeys,
NULL,
STARTUP_COST,
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 1173,1189 ****
else
newclauses = mergeclauses;
}
! try_mergejoin_path(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! newclauses,
! NIL,
! NIL,
! jointype,
! extra,
! is_partial);
}
cheapest_startup_inner = innerpath;
}
--- 1812,1830 ----
else
newclauses = mergeclauses;
}
! try_mergejoin_path_common(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! newclauses,
! NIL,
! NIL,
! jointype,
! extra,
! is_partial,
! grouped_outer, grouped_inner,
! do_aggregate);
}
cheapest_startup_inner = innerpath;
}
*************** generate_mergejoin_paths(PlannerInfo *ro
*** 1218,1223 ****
--- 1859,1866 ----
* 'innerrel' is the inner join relation
* 'jointype' is the type of join to do
* 'extra' contains additional input values
+ * 'grouped' indicates that at least one relation in the join has been
+ * aggregated.
*/
static void
match_unsorted_outer(PlannerInfo *root,
*************** match_unsorted_outer(PlannerInfo *root,
*** 1225,1231 ****
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinType save_jointype = jointype;
bool nestjoinOK;
--- 1868,1875 ----
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped)
{
JoinType save_jointype = jointype;
bool nestjoinOK;
*************** match_unsorted_outer(PlannerInfo *root,
*** 1235,1240 ****
--- 1879,1906 ----
ListCell *lc1;
/*
+ * If grouped join path is requested, we ignore cases where either input
+ * path needs to be unique. For each side we should expect either grouped
+ * or plain relation, which differ quite a bit.
+ *
+ * XXX Although unique-ification of grouped path might result in too
+ * expensive input path (note that grouped input relation is not
+ * necessarily unique, regardless of the grouping keys --- one or more plain
+ * relation could already have been joined to it), we might want to
+ * unique-ify the input relation in the future at least in the case it's a
+ * plain relation.
+ *
+ * (Materialization is not involved in grouped paths for similar reasons.)
+ */
+ if (grouped &&
+ (jointype == JOIN_UNIQUE_OUTER || jointype == JOIN_UNIQUE_INNER))
+ return;
+
+ /* No grouped join w/o grouped target. */
+ if (grouped && joinrel->gpi == NULL)
+ return;
+
+ /*
* Nestloop only supports inner, left, semi, and anti joins. Also, if we
* are doing a right or full mergejoin, we must use *all* the mergeclauses
* as join clauses, else we will not have a valid plan. (Although these
*************** match_unsorted_outer(PlannerInfo *root,
*** 1290,1296 ****
create_unique_path(root, innerrel, inner_cheapest_total, extra->sjinfo);
Assert(inner_cheapest_total);
}
! else if (nestjoinOK)
{
/*
* Consider materializing the cheapest inner path, unless
--- 1956,1962 ----
create_unique_path(root, innerrel, inner_cheapest_total, extra->sjinfo);
Assert(inner_cheapest_total);
}
! else if (nestjoinOK && !grouped)
{
/*
* Consider materializing the cheapest inner path, unless
*************** match_unsorted_outer(PlannerInfo *root,
*** 1321,1326 ****
--- 1987,1994 ----
*/
if (save_jointype == JOIN_UNIQUE_OUTER)
{
+ Assert(!grouped);
+
if (outerpath != outerrel->cheapest_total_path)
continue;
outerpath = (Path *) create_unique_path(root, outerrel,
*************** match_unsorted_outer(PlannerInfo *root,
*** 1348,1354 ****
inner_cheapest_total,
merge_pathkeys,
jointype,
! extra);
}
else if (nestjoinOK)
{
--- 2016,2023 ----
inner_cheapest_total,
merge_pathkeys,
jointype,
! extra,
! false, false);
}
else if (nestjoinOK)
{
*************** match_unsorted_outer(PlannerInfo *root,
*** 1364,1387 ****
{
Path *innerpath = (Path *) lfirst(lc2);
! try_nestloop_path(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! jointype,
! extra);
}
! /* Also consider materialized form of the cheapest inner path */
! if (matpath != NULL)
try_nestloop_path(root,
joinrel,
outerpath,
matpath,
merge_pathkeys,
jointype,
! extra);
}
/* Can't do anything else if outer path needs to be unique'd */
--- 2033,2078 ----
{
Path *innerpath = (Path *) lfirst(lc2);
! if (!grouped)
! try_nestloop_path(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! jointype,
! extra, false, false);
! else
! {
! /*
! * Since both input paths are plain, request explicit
! * aggregation.
! */
! try_grouped_nestloop_path(root,
! joinrel,
! outerpath,
! innerpath,
! merge_pathkeys,
! jointype,
! extra,
! true,
! false);
! }
}
! /*
! * Also consider materialized form of the cheapest inner path.
! *
! * (There's no matpath for grouped join.)
! */
! if (matpath != NULL && !grouped)
try_nestloop_path(root,
joinrel,
outerpath,
matpath,
merge_pathkeys,
jointype,
! extra,
! false, false);
}
/* Can't do anything else if outer path needs to be unique'd */
*************** match_unsorted_outer(PlannerInfo *root,
*** 1396,1402 ****
generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
save_jointype, extra, useallclauses,
inner_cheapest_total, merge_pathkeys,
! false);
}
/*
--- 2087,2163 ----
generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
save_jointype, extra, useallclauses,
inner_cheapest_total, merge_pathkeys,
! false, false, false, grouped);
!
! /* Try to join the plain outer relation to grouped inner. */
! if (grouped && nestjoinOK &&
! save_jointype != JOIN_UNIQUE_OUTER &&
! save_jointype != JOIN_UNIQUE_INNER &&
! innerrel->gpi != NULL && outerrel->gpi == NULL)
! {
! Path *inner_cheapest_grouped = (Path *) linitial(innerrel->gpi->pathlist);
!
! if (PATH_PARAM_BY_REL(inner_cheapest_grouped, outerrel))
! continue;
!
! /* XXX Shouldn't Assert() be used here instead? */
! if (PATH_PARAM_BY_REL(outerpath, innerrel))
! continue;
!
! /*
! * Only outer grouped path is interesting in this case: grouped
! * path on the inner side of NL join would imply repeated
! * aggregation somewhere in the inner path.
! */
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
! save_jointype, extra, useallclauses,
! inner_cheapest_grouped, merge_pathkeys,
! false, false, true, false);
! }
! }
!
! /*
! * Combine grouped outer and plain inner paths.
! */
! if (grouped && nestjoinOK &&
! save_jointype != JOIN_UNIQUE_OUTER &&
! save_jointype != JOIN_UNIQUE_INNER)
! {
! /*
! * If the inner rel had a grouped target, its plain paths should be
! * ignored. Otherwise we could create grouped paths with different
! * targets.
! */
! if (outerrel->gpi != NULL && innerrel->gpi == NULL &&
! inner_cheapest_total != NULL)
! {
! /* Nested loop paths. */
! foreach(lc1, outerrel->gpi->pathlist)
! {
! Path *outerpath = (Path *) lfirst(lc1);
! List *merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
! outerpath->pathkeys);
!
! if (PATH_PARAM_BY_REL(outerpath, innerrel))
! continue;
!
! try_grouped_nestloop_path(root,
! joinrel,
! outerpath,
! inner_cheapest_total,
! merge_pathkeys,
! jointype,
! extra,
! false,
! false);
!
! /* Merge join paths. */
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
! save_jointype, extra, useallclauses,
! inner_cheapest_total, merge_pathkeys,
! false, true, false, false);
! }
! }
}
/*
*************** match_unsorted_outer(PlannerInfo *root,
*** 1416,1423 ****
bms_is_empty(joinrel->lateral_relids))
{
if (nestjoinOK)
! consider_parallel_nestloop(root, joinrel, outerrel, innerrel,
! save_jointype, extra);
/*
* If inner_cheapest_total is NULL or non parallel-safe then find the
--- 2177,2197 ----
bms_is_empty(joinrel->lateral_relids))
{
if (nestjoinOK)
! {
! if (!grouped)
! /* Plain partial paths. */
! consider_parallel_nestloop(root, joinrel, outerrel, innerrel,
! save_jointype, extra, false, false);
! else
! {
! /* Grouped partial paths with explicit aggregation. */
! consider_parallel_nestloop(root, joinrel, outerrel, innerrel,
! save_jointype, extra, true, true);
! /* Grouped partial paths w/o explicit aggregation. */
! consider_parallel_nestloop(root, joinrel, outerrel, innerrel,
! save_jointype, extra, true, false);
! }
! }
/*
* If inner_cheapest_total is NULL or non parallel-safe then find the
*************** match_unsorted_outer(PlannerInfo *root,
*** 1437,1443 ****
if (inner_cheapest_total)
consider_parallel_mergejoin(root, joinrel, outerrel, innerrel,
save_jointype, extra,
! inner_cheapest_total);
}
}
--- 2211,2217 ----
if (inner_cheapest_total)
consider_parallel_mergejoin(root, joinrel, outerrel, innerrel,
save_jointype, extra,
! inner_cheapest_total, grouped);
}
}
*************** consider_parallel_mergejoin(PlannerInfo
*** 1460,1469 ****
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
! Path *inner_cheapest_total)
{
ListCell *lc1;
/* generate merge join path for each partial outer path */
foreach(lc1, outerrel->partial_pathlist)
{
--- 2234,2252 ----
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
! Path *inner_cheapest_total,
! bool grouped)
{
ListCell *lc1;
+ if (grouped)
+ {
+ /* TODO Consider if these types should be supported. */
+ if (jointype == JOIN_UNIQUE_OUTER ||
+ jointype == JOIN_UNIQUE_INNER)
+ return;
+ }
+
/* generate merge join path for each partial outer path */
foreach(lc1, outerrel->partial_pathlist)
{
*************** consider_parallel_mergejoin(PlannerInfo
*** 1476,1484 ****
merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
outerpath->pathkeys);
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath, jointype,
! extra, false, inner_cheapest_total,
! merge_pathkeys, true);
}
}
--- 2259,2314 ----
merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
outerpath->pathkeys);
! if (!grouped)
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
! jointype, extra, false,
! inner_cheapest_total, merge_pathkeys,
! true,
! false, false, false);
! else
! {
! /*
! * Create grouped join by joining plain rels and aggregating the
! * result.
! */
! Assert(joinrel->gpi != NULL);
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
! jointype, extra, false,
! inner_cheapest_total, merge_pathkeys,
! true, false, false, true);
!
! /* Combine the plain outer with grouped inner one(s). */
! if (outerrel->gpi == NULL && innerrel->gpi != NULL)
! {
! Path *inner_cheapest_grouped = (Path *)
! linitial(innerrel->gpi->pathlist);
!
! if (inner_cheapest_grouped != NULL &&
! inner_cheapest_grouped->parallel_safe)
! generate_mergejoin_paths(root, joinrel, innerrel,
! outerpath, jointype, extra,
! false, inner_cheapest_grouped,
! merge_pathkeys,
! true, false, true, false);
! }
! }
! }
!
! /* In addition, try to join grouped outer to plain inner one(s). */
! if (grouped && outerrel->gpi != NULL && innerrel->gpi == NULL)
! {
! foreach(lc1, outerrel->gpi->partial_pathlist)
! {
! Path *outerpath = (Path *) lfirst(lc1);
! List *merge_pathkeys;
!
! merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
! outerpath->pathkeys);
! generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
! jointype, extra, false,
! inner_cheapest_total, merge_pathkeys,
! true, true, false, false);
! }
}
}
*************** consider_parallel_nestloop(PlannerInfo *
*** 1499,1513 ****
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinType save_jointype = jointype;
ListCell *lc1;
if (jointype == JOIN_UNIQUE_INNER)
jointype = JOIN_INNER;
! foreach(lc1, outerrel->partial_pathlist)
{
Path *outerpath = (Path *) lfirst(lc1);
List *pathkeys;
--- 2329,2373 ----
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped, bool do_aggregate)
{
JoinType save_jointype = jointype;
+ List *outer_pathlist;
ListCell *lc1;
+ if (grouped)
+ {
+ /* TODO Consider if these types should be supported. */
+ if (save_jointype == JOIN_UNIQUE_OUTER ||
+ save_jointype == JOIN_UNIQUE_INNER)
+ return;
+ }
+
if (jointype == JOIN_UNIQUE_INNER)
jointype = JOIN_INNER;
! if (!grouped || do_aggregate)
! {
! /*
! * If creating grouped paths by explicit aggregation, the input paths
! * must be plain.
! */
! outer_pathlist = outerrel->partial_pathlist;
! }
! else if (outerrel->gpi != NULL)
! {
! /*
! * Only the outer paths are accepted as grouped when we try to combine
! * grouped and plain ones. Grouped inner path implies repeated
! * aggregation, which doesn't sound like a good idea.
! */
! outer_pathlist = outerrel->gpi->partial_pathlist;
! }
! else
! return;
!
! foreach(lc1, outer_pathlist)
{
Path *outerpath = (Path *) lfirst(lc1);
List *pathkeys;
*************** consider_parallel_nestloop(PlannerInfo *
*** 1538,1544 ****
* inner paths, but right now create_unique_path is not on board
* with that.)
*/
! if (save_jointype == JOIN_UNIQUE_INNER)
{
if (innerpath != innerrel->cheapest_total_path)
continue;
--- 2398,2404 ----
* inner paths, but right now create_unique_path is not on board
* with that.)
*/
! if (save_jointype == JOIN_UNIQUE_INNER && !grouped)
{
if (innerpath != innerrel->cheapest_total_path)
continue;
*************** consider_parallel_nestloop(PlannerInfo *
*** 1548,1555 ****
Assert(innerpath);
}
! try_partial_nestloop_path(root, joinrel, outerpath, innerpath,
! pathkeys, jointype, extra);
}
}
}
--- 2408,2433 ----
Assert(innerpath);
}
! if (!grouped)
! try_partial_nestloop_path(root, joinrel, outerpath, innerpath,
! pathkeys, jointype, extra,
! false, false);
! else if (do_aggregate)
! {
! /* Request aggregation as both input rels are plain. */
! try_grouped_nestloop_path(root, joinrel, outerpath, innerpath,
! pathkeys, jointype, extra,
! true, true);
! }
! /*
! * Only combine the grouped outer path with the plain inner if the
! * inner relation cannot produce grouped paths. Otherwise we could
! * generate grouped paths with different targets.
! */
! else if (innerrel->gpi == NULL)
! try_grouped_nestloop_path(root, joinrel, outerpath, innerpath,
! pathkeys, jointype, extra,
! false, true);
}
}
}
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1571,1583 ****
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra)
{
JoinType save_jointype = jointype;
bool isouterjoin = IS_OUTER_JOIN(jointype);
List *hashclauses;
ListCell *l;
/*
* We need to build only one hashclauses list for any given pair of outer
* and inner relations; all of the hashable clauses will be used as keys.
--- 2449,2466 ----
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
! JoinPathExtraData *extra,
! bool grouped)
{
JoinType save_jointype = jointype;
bool isouterjoin = IS_OUTER_JOIN(jointype);
List *hashclauses;
ListCell *l;
+ /* No grouped join w/o grouped target. */
+ if (grouped && joinrel->gpi == NULL)
+ return;
+
/*
* We need to build only one hashclauses list for any given pair of outer
* and inner relations; all of the hashable clauses will be used as keys.
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1627,1632 ****
--- 2510,2518 ----
* can't use a hashjoin. (There's no use looking for alternative
* input paths, since these should already be the least-parameterized
* available paths.)
+ *
+ * (The same check should work for grouped paths, as these don't
+ * differ in parameterization.)
*/
if (PATH_PARAM_BY_REL(cheapest_total_outer, innerrel) ||
PATH_PARAM_BY_REL(cheapest_total_inner, outerrel))
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1646,1652 ****
cheapest_total_inner,
hashclauses,
jointype,
! extra);
/* no possibility of cheap startup here */
}
else if (jointype == JOIN_UNIQUE_INNER)
--- 2532,2539 ----
cheapest_total_inner,
hashclauses,
jointype,
! extra,
! false, false);
/* no possibility of cheap startup here */
}
else if (jointype == JOIN_UNIQUE_INNER)
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1662,1668 ****
cheapest_total_inner,
hashclauses,
jointype,
! extra);
if (cheapest_startup_outer != NULL &&
cheapest_startup_outer != cheapest_total_outer)
try_hashjoin_path(root,
--- 2549,2556 ----
cheapest_total_inner,
hashclauses,
jointype,
! extra,
! false, false);
if (cheapest_startup_outer != NULL &&
cheapest_startup_outer != cheapest_total_outer)
try_hashjoin_path(root,
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1671,1733 ****
cheapest_total_inner,
hashclauses,
jointype,
! extra);
}
else
{
! /*
! * For other jointypes, we consider the cheapest startup outer
! * together with the cheapest total inner, and then consider
! * pairings of cheapest-total paths including parameterized ones.
! * There is no use in generating parameterized paths on the basis
! * of possibly cheap startup cost, so this is sufficient.
! */
! ListCell *lc1;
! ListCell *lc2;
!
! if (cheapest_startup_outer != NULL)
! try_hashjoin_path(root,
! joinrel,
! cheapest_startup_outer,
! cheapest_total_inner,
! hashclauses,
! jointype,
! extra);
!
! foreach(lc1, outerrel->cheapest_parameterized_paths)
{
- Path *outerpath = (Path *) lfirst(lc1);
-
/*
! * We cannot use an outer path that is parameterized by the
! * inner rel.
*/
! if (PATH_PARAM_BY_REL(outerpath, innerrel))
! continue;
! foreach(lc2, innerrel->cheapest_parameterized_paths)
{
! Path *innerpath = (Path *) lfirst(lc2);
/*
! * We cannot use an inner path that is parameterized by
! * the outer rel, either.
*/
! if (PATH_PARAM_BY_REL(innerpath, outerrel))
continue;
! if (outerpath == cheapest_startup_outer &&
! innerpath == cheapest_total_inner)
! continue; /* already tried it */
! try_hashjoin_path(root,
! joinrel,
! outerpath,
! innerpath,
! hashclauses,
! jointype,
! extra);
}
}
}
--- 2559,2712 ----
cheapest_total_inner,
hashclauses,
jointype,
! extra,
! false, false);
}
else
{
! if (!grouped)
{
/*
! * For other jointypes, we consider the cheapest startup outer
! * together with the cheapest total inner, and then consider
! * pairings of cheapest-total paths including parameterized
! * ones. There is no use in generating parameterized paths on
! * the basis of possibly cheap startup cost, so this is
! * sufficient.
*/
! ListCell *lc1;
! if (cheapest_startup_outer != NULL)
! try_hashjoin_path(root,
! joinrel,
! cheapest_startup_outer,
! cheapest_total_inner,
! hashclauses,
! jointype,
! extra,
! false, false);
!
! foreach(lc1, outerrel->cheapest_parameterized_paths)
{
! Path *outerpath = (Path *) lfirst(lc1);
! ListCell *lc2;
/*
! * We cannot use an outer path that is parameterized by the
! * inner rel.
*/
! if (PATH_PARAM_BY_REL(outerpath, innerrel))
continue;
! foreach(lc2, innerrel->cheapest_parameterized_paths)
! {
! Path *innerpath = (Path *) lfirst(lc2);
! /*
! * We cannot use an inner path that is parameterized by
! * the outer rel, either.
! */
! if (PATH_PARAM_BY_REL(innerpath, outerrel))
! continue;
!
! if (outerpath == cheapest_startup_outer &&
! innerpath == cheapest_total_inner)
! continue; /* already tried it */
!
! try_hashjoin_path(root,
! joinrel,
! outerpath,
! innerpath,
! hashclauses,
! jointype,
! extra,
! false, false);
! }
! }
! }
! else
! {
! /* Create grouped paths if possible. */
! /*
! * TODO
! *
! * Consider processing JOIN_UNIQUE_INNER and JOIN_UNIQUE_OUTER
! * join types, ie perform grouping of the inner / outer rel if
! * it's not unique yet and if the grouping is legal.
! */
! if (jointype == JOIN_UNIQUE_OUTER ||
! jointype == JOIN_UNIQUE_INNER)
! return;
!
! /*
! * Join grouped relation to non-grouped one.
! *
! * Do not use plain path of the input rel whose target does
! * have GroupedPathInfo. For example (assuming that join of
! * two grouped rels is not supported), the only way to
! * evaluate SELECT sum(a.x), sum(b.y) ... is to join "a" and
! * "b" and aggregate the result. Otherwise the path target
! * wouldn't match joinrel->gpi->target. TODO Move this comment
! * elsewhere as it seems common to all join kinds.
! */
! /*
! * TODO Allow outer join if the grouped rel is on the
! * non-nullable side.
! */
! if (jointype == JOIN_INNER)
! {
! Path *grouped_path, *plain_path;
!
! if (outerrel->gpi != NULL &&
! outerrel->gpi->pathlist != NIL &&
! innerrel->gpi == NULL)
! {
! grouped_path = (Path *)
! linitial(outerrel->gpi->pathlist);
! plain_path = cheapest_total_inner;
! try_grouped_hashjoin_path(root, joinrel,
! grouped_path, plain_path,
! hashclauses, jointype,
! extra, false, false);
! }
! else if (innerrel->gpi != NULL &&
! innerrel->gpi->pathlist != NIL &&
! outerrel->gpi == NULL)
! {
! grouped_path = (Path *)
! linitial(innerrel->gpi->pathlist);
! plain_path = cheapest_total_outer;
! try_grouped_hashjoin_path(root, joinrel, plain_path,
! grouped_path, hashclauses,
! jointype, extra,
! false, false);
!
! if (cheapest_startup_outer != NULL &&
! cheapest_startup_outer != cheapest_total_outer)
! {
! plain_path = cheapest_startup_outer;
! try_grouped_hashjoin_path(root, joinrel,
! plain_path,
! grouped_path,
! hashclauses,
! jointype, extra,
! false, false);
! }
! }
}
+
+ /*
+ * Try to join plain relations and make a grouped rel out of
+ * the join.
+ *
+ * Since aggregation needs the whole relation, we are only
+ * interested in total costs.
+ */
+ try_grouped_hashjoin_path(root, joinrel,
+ cheapest_total_outer,
+ cheapest_total_inner,
+ hashclauses,
+ jointype, extra, true, false);
}
}
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1765,1777 ****
cheapest_safe_inner =
get_cheapest_parallel_safe_total_inner(innerrel->pathlist);
! if (cheapest_safe_inner != NULL)
! try_partial_hashjoin_path(root, joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! hashclauses, jointype, extra);
}
}
}
/*
--- 2744,2967 ----
cheapest_safe_inner =
get_cheapest_parallel_safe_total_inner(innerrel->pathlist);
! if (!grouped)
! {
! if (cheapest_safe_inner != NULL)
! try_partial_hashjoin_path(root, joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! hashclauses, jointype, extra,
! false, false);
! }
! else if (joinrel->gpi != NULL)
! {
! /*
! * Grouped partial path.
! *
! * 1. Apply aggregation to the plain partial join path.
! */
! if (cheapest_safe_inner != NULL)
! try_grouped_hashjoin_path(root, joinrel,
! cheapest_partial_outer,
! cheapest_safe_inner,
! hashclauses,
! jointype, extra, true, true);
!
! /*
! * 2. Join the cheapest partial grouped outer path (if one
! * exists) to cheapest_safe_inner (there's no reason to look
! * for another inner path than what we used for non-grouped
! * partial join path).
! */
! if (outerrel->gpi != NULL &&
! outerrel->gpi->partial_pathlist != NIL &&
! innerrel->gpi == NULL &&
! cheapest_safe_inner != NULL)
! {
! Path *outer_path;
!
! outer_path = (Path *)
! linitial(outerrel->gpi->partial_pathlist);
!
! try_grouped_hashjoin_path(root, joinrel, outer_path,
! cheapest_safe_inner,
! hashclauses,
! jointype, extra, false, true);
! }
!
! /*
! * 3. Join the cheapest_partial_outer path (again, no reason
! * to use different outer path than the one we used for plain
! * partial join) to the cheapest grouped inner path if the
! * latter exists and is parallel-safe.
! */
! if (innerrel->gpi != NULL &&
! innerrel->gpi->pathlist != NIL &&
! outerrel->gpi == NULL)
! {
! Path *inner_path;
!
! inner_path = (Path *) linitial(innerrel->gpi->pathlist);
!
! if (inner_path->parallel_safe)
! try_grouped_hashjoin_path(root, joinrel,
! cheapest_partial_outer,
! inner_path,
! hashclauses,
! jointype, extra,
! false, true);
! }
!
! /*
! * Other combinations seem impossible because: 1. At most 1
! * input relation of the join can be grouped, 2. the inner
! * path must not be partial.
! */
! }
! }
! }
! }
!
! /*
! * Do the input paths emit all the aggregates contained in the grouped target
! * of the join?
! *
! * The point is that one input relation might be unable to evaluate some
! * aggregate(s), so it'll only generate plain paths. It's wrong to combine
! * such plain paths with grouped ones that the other input rel might be able
! * to generate because the result would miss the aggregate(s) the first
! * relation failed to evaluate.
! *
! * TODO For better efficiency, consider storing Bitmapset of
! * GroupedVarInfo.gvid in GroupedPathInfo.
! */
! static bool
! is_grouped_join_target_complete(PlannerInfo *root, PathTarget *jointarget,
! Path *outer_path, Path *inner_path)
! {
! RelOptInfo *outer_rel = outer_path->parent;
! RelOptInfo *inner_rel = inner_path->parent;
! ListCell *l1;
!
! /*
! * Join of two grouped relations is not supported.
! *
! * This actually isn't a check of target completeness --- can it be located
! * elsewhere?
! */
! if (outer_rel->gpi != NULL && inner_rel->gpi != NULL)
! return false;
!
! foreach(l1, jointarget->exprs)
! {
! Expr *expr = (Expr *) lfirst(l1);
! GroupedVar *gvar;
! GroupedVarInfo *gvi = NULL;
! ListCell *l2;
! bool found = false;
!
! /* Only interested in aggregates. */
! if (!IsA(expr, GroupedVar))
! continue;
!
! gvar = castNode(GroupedVar, expr);
!
! /* Find the corresponding GroupedVarInfo. */
! foreach(l2, root->grouped_var_list)
! {
! GroupedVarInfo *gvi_tmp = castNode(GroupedVarInfo, lfirst(l2));
!
! if (gvi_tmp->gvid == gvar->gvid)
! {
! gvi = gvi_tmp;
! break;
! }
! }
! Assert(gvi != NULL);
!
! /*
! * If any aggregate references both input relations, something went
! * wrong during construction of one of the input targets: one input
! * rel is grouped, but no grouping target should have been created for
! * it if some aggregate required more than that input rel.
! */
! Assert(gvi->gv_eval_at == NULL ||
! !(bms_overlap(gvi->gv_eval_at, outer_rel->relids) &&
! bms_overlap(gvi->gv_eval_at, inner_rel->relids)));
!
! /*
! * If the aggregate belongs to the plain relation, it probably
! * means that non-grouping expression made aggregation of that
! * input relation impossible. Since that expression is not
! * necessarily emitted by the current join, aggregation might be
! * possible here. On the other hand, aggregation of a join which
! * already contains a grouped relation does not seem too
! * beneficial.
! *
! * XXX The condition below is also met if the query contains both
! * "star aggregate" and a normal one. Since the former can be
! * added to any base relation, and since we don't support join of
! * 2 grouped relations, join of arbitrary 2 relations will always
! * result in a plain relation.
! *
! * XXX If we conclude that aggregation is worthwhile, only consider
! * this test failed if target usable for aggregation cannot be
! * created (i.e. the non-grouping expression is in the output of
! * the current join).
! */
! if ((outer_rel->gpi == NULL &&
! bms_overlap(gvi->gv_eval_at, outer_rel->relids))
! || (inner_rel->gpi == NULL &&
! bms_overlap(gvi->gv_eval_at, inner_rel->relids)))
! return false;
!
! /* Look for the aggregate in the input targets. */
! if (outer_rel->gpi != NULL)
! {
! /* No more than one input path should be grouped. */
! Assert(inner_rel->gpi == NULL);
!
! foreach(l2, outer_path->pathtarget->exprs)
! {
! expr = (Expr *) lfirst(l2);
!
! if (!IsA(expr, GroupedVar))
! continue;
!
! gvar = castNode(GroupedVar, expr);
! if (gvar->gvid == gvi->gvid)
! {
! found = true;
! break;
! }
! }
}
+ else if (!found && inner_rel->gpi != NULL)
+ {
+ Assert(outer_rel->gpi == NULL);
+
+ foreach(l2, inner_path->pathtarget->exprs)
+ {
+ expr = (Expr *) lfirst(l2);
+
+ if (!IsA(expr, GroupedVar))
+ continue;
+
+ gvar = castNode(GroupedVar, expr);
+ if (gvar->gvid == gvi->gvid)
+ {
+ found = true;
+ break;
+ }
+ }
+ }
+
+ /* Even a single missing aggregate causes the whole test to fail. */
+ if (!found)
+ return false;
}
+
+ return true;
}
/*
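As a reading aid for is_grouped_join_target_complete() above, here is a minimal standalone sketch of the core test: every aggregate listed in the join's grouped target must be emitted by the one grouped input path. The types and function names (SketchTarget, sketch_join_target_complete, and so on) are hypothetical stand-ins, not part of the patch; aggregates are reduced to plain integer ids in place of GroupedVar.gvid, and the gv_eval_at checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a path target: ids of the aggregates it emits. */
typedef struct SketchTarget
{
	const int  *agg_ids;
	size_t		nagg;
} SketchTarget;

/* Does the target emit the aggregate with the given id? */
bool
sketch_target_has_agg(const SketchTarget *target, int agg_id)
{
	for (size_t i = 0; i < target->nagg; i++)
	{
		if (target->agg_ids[i] == agg_id)
			return true;
	}
	return false;
}

/*
 * Return true only if every aggregate required by the join target is
 * present in the target of the (single) grouped input path.
 */
bool
sketch_join_target_complete(const SketchTarget *join_target,
							const SketchTarget *grouped_input)
{
	assert(join_target != NULL && grouped_input != NULL);

	for (size_t i = 0; i < join_target->nagg; i++)
	{
		/* Even a single missing aggregate makes the whole test fail. */
		if (!sketch_target_has_agg(grouped_input, join_target->agg_ids[i]))
			return false;
	}
	return true;
}
```

The nested linear scan mirrors the patch's own loops; the TODO in the patch about keeping a Bitmapset of gvids would replace the inner scan with a membership test.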
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
new file mode 100644
index 5a68de3..ea24ed9
*** a/src/backend/optimizer/path/joinrels.c
--- b/src/backend/optimizer/path/joinrels.c
***************
*** 14,23 ****
--- 14,29 ----
*/
#include "postgres.h"
+ #include "miscadmin.h"
+ #include "nodes/relation.h"
+ #include "optimizer/clauses.h"
#include "optimizer/joininfo.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
+ #include "optimizer/prep.h"
+ #include "optimizer/cost.h"
#include "utils/memutils.h"
+ #include "utils/lsyscache.h"
static void make_rels_by_clause_joins(PlannerInfo *root,
*************** static void make_rels_by_clauseless_join
*** 29,40 ****
static bool has_join_restriction(PlannerInfo *root, RelOptInfo *rel);
static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
static bool is_dummy_rel(RelOptInfo *rel);
- static void mark_dummy_rel(RelOptInfo *rel);
static bool restriction_is_constant_false(List *restrictlist,
bool only_pushed_down);
static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
RelOptInfo *rel2, RelOptInfo *joinrel,
SpecialJoinInfo *sjinfo, List *restrictlist);
/*
--- 35,53 ----
static bool has_join_restriction(PlannerInfo *root, RelOptInfo *rel);
static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
static bool is_dummy_rel(RelOptInfo *rel);
static bool restriction_is_constant_false(List *restrictlist,
bool only_pushed_down);
static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
RelOptInfo *rel2, RelOptInfo *joinrel,
SpecialJoinInfo *sjinfo, List *restrictlist);
+ static void try_partition_wise_join(PlannerInfo *root, RelOptInfo *rel1,
+ RelOptInfo *rel2, RelOptInfo *joinrel,
+ SpecialJoinInfo *parent_sjinfo,
+ List *parent_restrictlist);
+ static int match_expr_to_partition_keys(Expr *expr, RelOptInfo *rel);
+ static void build_joinrel_partition_bounds(RelOptInfo *rel1, RelOptInfo *rel2,
+ RelOptInfo *joinrel, JoinType jointype,
+ List **rel1_parts, List **rel2_parts);
/*
*************** make_join_rel(PlannerInfo *root, RelOptI
*** 731,736 ****
--- 744,752 ----
populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
restrictlist);
+ /* Apply partition-wise join technique, if possible. */
+ try_partition_wise_join(root, rel1, rel2, joinrel, sjinfo, restrictlist);
+
bms_free(joinrelids);
return joinrel;
*************** is_dummy_rel(RelOptInfo *rel)
*** 1197,1203 ****
* is that the best solution is to explicitly make the dummy path in the same
* context the given RelOptInfo is in.
*/
! static void
mark_dummy_rel(RelOptInfo *rel)
{
MemoryContext oldcontext;
--- 1213,1219 ----
* is that the best solution is to explicitly make the dummy path in the same
* context the given RelOptInfo is in.
*/
! void
mark_dummy_rel(RelOptInfo *rel)
{
MemoryContext oldcontext;
*************** mark_dummy_rel(RelOptInfo *rel)
*** 1217,1223 ****
rel->partial_pathlist = NIL;
/* Set up the dummy path */
! add_path(rel, (Path *) create_append_path(rel, NIL, NULL, 0, NIL));
/* Set or update cheapest_total_path and related fields */
set_cheapest(rel);
--- 1233,1239 ----
rel->partial_pathlist = NIL;
/* Set up the dummy path */
! add_path(rel, (Path *) create_append_path(rel, NIL, NULL, 0, NIL), false);
/* Set or update cheapest_total_path and related fields */
set_cheapest(rel);
*************** restriction_is_constant_false(List *rest
*** 1268,1270 ****
--- 1284,1712 ----
}
return false;
}
+
+ /*
+ * Assess whether join between given two partitioned relations can be broken
+ * down into joins between matching partitions; a technique called
+ * "partition-wise join"
+ *
+ * Partition-wise join is possible when a. Joining relations have same
+ * partitioning scheme b. There exists an equi-join between the partition keys
+ * of the two relations.
+ *
+ * Partition-wise join is planned as follows (details: optimizer/README.)
+ *
+ * 1. Create the RelOptInfos for joins between matching partitions i.e.
+ * child-joins and add paths to those.
+ *
+ * 2. Add "append" paths to join between parent relations. The second phase is
+ * implemented by generate_partition_wise_join_paths().
+ *
+ * The RelOptInfo, SpecialJoinInfo and restrictlist for each child join are
+ * obtained by translating the respective parent join structures.
+ */
+ static void
+ try_partition_wise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelOptInfo *joinrel, SpecialJoinInfo *parent_sjinfo,
+ List *parent_restrictlist)
+ {
+ int nparts;
+ int cnt_parts;
+ ListCell *lc1;
+ ListCell *lc2;
+ List *rel1_parts;
+ List *rel2_parts;
+ bool is_strict;
+
+ /* Guard against stack overflow due to overly deep partition hierarchy. */
+ check_stack_depth();
+
+ /* Nothing to do, if the join relation is not partitioned. */
+ if (!joinrel->part_scheme)
+ return;
+
+ /*
+ * set_append_rel_pathlist() may not create paths in children of an empty
+ * partitioned table and so we can not add paths to child-joins when one
+ * of the joining relations is empty. So, deem such a join as
+ * unpartitioned.
+ */
+ if (IS_DUMMY_REL(rel1) || IS_DUMMY_REL(rel2))
+ return;
+
+ /*
+ * Since this join relation is partitioned, all the base relations
+ * participating in this join must be partitioned and so are all the
+ * intermediate join relations.
+ */
+ Assert(rel1->part_scheme && rel2->part_scheme);
+
+ /*
+ * Every pair of joining relations we see here should have an equi-join
+ * between partition keys if this join has been deemed as a partitioned
+ * join. See build_joinrel_partition_info() for reasons.
+ */
+ Assert(have_partkey_equi_join(rel1, rel2, parent_sjinfo->jointype,
+ parent_restrictlist, &is_strict));
+
+ /*
+ * The partition scheme of the join relation should match that of the
+ * joining relations.
+ */
+ Assert(joinrel->part_scheme == rel1->part_scheme &&
+ joinrel->part_scheme == rel2->part_scheme);
+
+ /* We should have RelOptInfos of the partitions available. */
+ Assert(rel1->part_rels && rel2->part_rels);
+
+ /*
+ * Calculate bounds for the join relation. If we can not come up with joint
+ * bounds, we can not use partition-wise join.
+ */
+ build_joinrel_partition_bounds(rel1, rel2, joinrel,
+ parent_sjinfo->jointype, &rel1_parts,
+ &rel2_parts);
+ if (!joinrel->boundinfo)
+ return;
+
+ Assert(list_length(rel1_parts) == list_length(rel2_parts));
+ Assert(joinrel->nparts == list_length(rel1_parts));
+ Assert(joinrel->nparts > 0);
+
+ nparts = joinrel->nparts;
+
+ elog(DEBUG3, "join between relations %s and %s is considered for partition-wise join.",
+ bmsToString(rel1->relids), bmsToString(rel2->relids));
+
+ /* Allocate space to hold child-join RelOptInfos, if not already done. */
+ if (!joinrel->part_rels)
+ joinrel->part_rels = (RelOptInfo **) palloc0(sizeof(RelOptInfo *) * nparts);
+
+ /*
+ * Create child join relations for this partitioned join, if those don't
+ * exist. Add paths to child-joins for a pair of child relations
+ * corresponding to the given pair of parent relations.
+ */
+ cnt_parts = 0;
+ forboth (lc1, rel1_parts, lc2, rel2_parts)
+ {
+ RelOptInfo *child_rel1 = lfirst(lc1);
+ RelOptInfo *child_rel2 = lfirst(lc2);
+ SpecialJoinInfo *child_sjinfo;
+ List *child_restrictlist;
+ RelOptInfo *child_joinrel;
+ Relids child_joinrelids;
+ AppendRelInfo **appinfos;
+ int nappinfos;
+
+ /* We should never try to join two overlapping sets of rels. */
+ Assert(!bms_overlap(child_rel1->relids, child_rel2->relids));
+ child_joinrelids = bms_union(child_rel1->relids, child_rel2->relids);
+ appinfos = find_appinfos_by_relids(root, child_joinrelids, &nappinfos);
+
+ /*
+ * Construct SpecialJoinInfo from parent join relation's
+ * SpecialJoinInfo.
+ */
+ child_sjinfo = build_child_join_sjinfo(root, parent_sjinfo,
+ child_rel1->relids,
+ child_rel2->relids);
+
+ /*
+ * Construct restrictions applicable to the child join from
+ * those applicable to the parent join.
+ */
+ child_restrictlist = (List *) adjust_appendrel_attrs(root,
+ (Node *) parent_restrictlist,
+ nappinfos, appinfos);
+
+ child_joinrel = joinrel->part_rels[cnt_parts];
+ if (!child_joinrel)
+ {
+ child_joinrel = build_child_join_rel(root, child_rel1, child_rel2,
+ joinrel, child_restrictlist,
+ child_sjinfo,
+ child_sjinfo->jointype);
+ joinrel->part_rels[cnt_parts] = child_joinrel;
+ }
+
+ Assert(bms_equal(child_joinrel->relids, child_joinrelids));
+
+ /* Also translate expressions that AggPath will use in its target. */
+ if (child_joinrel->gpi != NULL)
+ {
+ Assert(child_joinrel->gpi->target != NULL);
+
+ child_joinrel->gpi->target->exprs =
+ (List *) adjust_appendrel_attrs(root,
+ (Node *) child_joinrel->gpi->target->exprs,
+ nappinfos, appinfos);
+ }
+
+ populate_joinrel_with_paths(root, child_rel1, child_rel2,
+ child_joinrel, child_sjinfo,
+ child_restrictlist);
+
+ pfree(appinfos);
+
+ /*
+ * If the child relations themselves are partitioned, try partition-wise join
+ * recursively.
+ */
+ try_partition_wise_join(root, child_rel1, child_rel2, child_joinrel,
+ child_sjinfo, child_restrictlist);
+ cnt_parts++;
+ }
+ }
+
+ /*
+ * Returns true if there exists an equi-join condition for each pair of
+ * partition keys from the given relations being joined.
+ */
+ bool
+ have_partkey_equi_join(RelOptInfo *rel1, RelOptInfo *rel2, JoinType jointype,
+ List *restrictlist, bool *is_strict)
+ {
+ PartitionScheme part_scheme = rel1->part_scheme;
+ ListCell *lc;
+ int cnt_pks;
+ int num_pks;
+ bool *pk_has_clause;
+
+ *is_strict = false;
+
+ /*
+ * This function should be called when the joining relations have same
+ * partitioning scheme.
+ */
+ Assert(rel1->part_scheme == rel2->part_scheme);
+ Assert(part_scheme);
+
+ num_pks = part_scheme->partnatts;
+
+ pk_has_clause = (bool *) palloc0(sizeof(bool) * num_pks);
+
+ foreach (lc, restrictlist)
+ {
+ RestrictInfo *rinfo = lfirst(lc);
+ OpExpr *opexpr;
+ Expr *expr1;
+ Expr *expr2;
+ int ipk1;
+ int ipk2;
+
+ /* If processing an outer join, only use its own join clauses. */
+ if (IS_OUTER_JOIN(jointype) && rinfo->is_pushed_down)
+ continue;
+
+ /* Skip clauses which can not be used for a join. */
+ if (!rinfo->can_join)
+ continue;
+
+ /* Skip clauses which are not equality conditions. */
+ if (!rinfo->mergeopfamilies)
+ continue;
+
+ opexpr = (OpExpr *) rinfo->clause;
+ Assert(is_opclause(opexpr));
+
+ /*
+ * The equi-join between partition keys is strict if equi-join between
+ * at least one partition key is using a strict operator. See
+ * explanation about outer join reordering identity 3 in
+ * optimizer/README
+ */
+ *is_strict = *is_strict || op_strict(opexpr->opno);
+
+ /* Match the operands to the relation. */
+ if (bms_is_subset(rinfo->left_relids, rel1->relids) &&
+ bms_is_subset(rinfo->right_relids, rel2->relids))
+ {
+ expr1 = linitial(opexpr->args);
+ expr2 = lsecond(opexpr->args);
+ }
+ else if (bms_is_subset(rinfo->left_relids, rel2->relids) &&
+ bms_is_subset(rinfo->right_relids, rel1->relids))
+ {
+ expr1 = lsecond(opexpr->args);
+ expr2 = linitial(opexpr->args);
+ }
+ else
+ continue;
+
+ /*
+ * Only clauses referencing the partition keys are useful for
+ * partition-wise join.
+ */
+ ipk1 = match_expr_to_partition_keys(expr1, rel1);
+ if (ipk1 < 0)
+ continue;
+ ipk2 = match_expr_to_partition_keys(expr2, rel2);
+ if (ipk2 < 0)
+ continue;
+
+ /*
+ * If the clause refers to keys at different cardinal positions in the
+ * partition keys of joining relations, it can not be used for
+ * partition-wise join.
+ */
+ if (ipk1 != ipk2)
+ continue;
+
+ /*
+ * The clause allows partition-wise join only if it uses the same
+ * operator family as that specified by the partition key.
+ */
+ if (!list_member_oid(rinfo->mergeopfamilies,
+ part_scheme->partopfamily[ipk1]))
+ continue;
+
+ /* Mark the partition key as having an equi-join clause. */
+ pk_has_clause[ipk1] = true;
+ }
+
+ /* Check whether every partition key has an equi-join condition. */
+ for (cnt_pks = 0; cnt_pks < num_pks; cnt_pks++)
+ {
+ if (!pk_has_clause[cnt_pks])
+ {
+ pfree(pk_has_clause);
+ return false;
+ }
+ }
+
+ pfree(pk_has_clause);
+ return true;
+ }
+
+ /*
+ * Find the partition key from the given relation matching the given
+ * expression. If found, return the index of the partition key, else return -1.
+ */
+ static int
+ match_expr_to_partition_keys(Expr *expr, RelOptInfo *rel)
+ {
+ int cnt_pks;
+ int num_pks;
+
+ /* This function should be called only for partitioned relations. */
+ Assert(rel->part_scheme);
+
+ num_pks = rel->part_scheme->partnatts;
+
+ /* Remove the relabel decoration. */
+ while (IsA(expr, RelabelType))
+ expr = (Expr *) (castNode(RelabelType, expr))->arg;
+
+ for (cnt_pks = 0; cnt_pks < num_pks; cnt_pks++)
+ {
+ List *pkexprs = rel->partexprs[cnt_pks];
+ ListCell *lc;
+
+ foreach(lc, pkexprs)
+ {
+ Expr *pkexpr = lfirst(lc);
+ if (equal(pkexpr, expr))
+ return cnt_pks;
+ }
+ }
+
+ return -1;
+ }
+
+ /*
+ * Calculate the bounds/lists of the join relation based on partition bounds of the
+ * joining relations. Also returns the matching partitions from the joining
+ * relations.
+ *
+ * As of now, it simply checks whether the bounds/lists of the joining
+ * relations match and returns bounds/lists of the first relation. In future
+ * this function will be expanded to merge the bounds/lists from the joining
+ * relations to produce the bounds/lists of the join relation. If the function
+ * fails to merge the bounds/lists, joinrel->boundinfo stays NULL and the lists are NIL.
+ *
+ * The function also returns two lists of RelOptInfos, one for each joining
+ * relation. The RelOptInfos at the same position in each of the lists give the
+ * partitions with matching bounds which can be joined to produce join relation
+ * corresponding to the merged partition bounds corresponding to that position.
+ * When there doesn't exist a matching partition on either side, corresponding
+ * RelOptInfo will be NULL.
+ */
+ static void
+ build_joinrel_partition_bounds(RelOptInfo *rel1, RelOptInfo *rel2,
+ RelOptInfo *joinrel, JoinType jointype,
+ List **rel1_parts, List **rel2_parts)
+ {
+ PartitionScheme part_scheme;
+ int cnt;
+ int nparts;
+ int16 *parttyplen;
+ bool *parttypbyval;
+
+ Assert(rel1->part_scheme == rel2->part_scheme);
+ Assert(rel1->nparts == rel2->nparts);
+ *rel1_parts = NIL;
+ *rel2_parts = NIL;
+
+ part_scheme = rel1->part_scheme;
+
+ /*
+ * Ideally, we should be able to join two relations which have different
+ * number of partitions as long as the bounds of partitions available on
+ * both the sides match. But for now, we need exact same number of
+ * partitions on both the sides.
+ */
+ if (rel1->nparts != rel2->nparts)
+ {
+ /*
+ * If this pair of joining relations did not have same number of
+ * partitions no other pair can have same number of partitions.
+ */
+ Assert(!joinrel->boundinfo && joinrel->nparts == 0);
+ return;
+ }
+
+
+ parttyplen = (int16 *) palloc(sizeof(int16) * part_scheme->partnatts);
+ parttypbyval = (bool *) palloc(sizeof(bool) * part_scheme->partnatts);
+ for (cnt = 0; cnt < part_scheme->partnatts; cnt++)
+ get_typlenbyval(part_scheme->partopcintype[cnt], &parttyplen[cnt],
+ &parttypbyval[cnt]);
+
+ if (!partition_bounds_equal(part_scheme->partnatts, parttyplen,
+ parttypbyval, rel1->boundinfo,
+ rel2->boundinfo))
+ {
+ /*
+ * If this pair of joining relations did not have same partition bounds
+ * no other pair can have same partition bounds.
+ */
+ Assert(!joinrel->boundinfo && joinrel->nparts == 0);
+ return;
+ }
+
+ nparts = rel1->nparts;
+ for (cnt = 0; cnt < nparts; cnt++)
+ {
+ *rel1_parts = lappend(*rel1_parts, rel1->part_rels[cnt]);
+ *rel2_parts = lappend(*rel2_parts, rel2->part_rels[cnt]);
+ }
+
+ /* Set the partition bounds if not already set. */
+ if (!joinrel->boundinfo)
+ {
+ joinrel->boundinfo = rel1->boundinfo;
+ joinrel->nparts = rel1->nparts;
+ }
+ else
+ {
+ /* Verify existing bounds. */
+ Assert(partition_bounds_equal(part_scheme->partnatts, parttyplen,
+ parttypbyval, joinrel->boundinfo,
+ rel1->boundinfo));
+ Assert(joinrel->nparts == rel1->nparts);
+ }
+
+ pfree(parttyplen);
+ pfree(parttypbyval);
+ }
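For readers skimming the joinrels.c changes above, the bookkeeping at the heart of have_partkey_equi_join() can be sketched on its own: a clause counts only when both operands match a partition key at the same position, and the join qualifies only when every key position ends up covered. Everything named sketch_* or SKETCH_* below is hypothetical and exists only for illustration. The sketch assumes the operand-to-key matching has already been done (with -1 meaning "no match", as match_expr_to_partition_keys() returns) and omits the operator-family and outer-join filters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary limit chosen for this sketch only. */
#define SKETCH_MAX_KEYS 32

/*
 * Hypothetical stand-in for an equality join clause: the position of the
 * partition key each operand matched, or -1 if it matched none.
 */
typedef struct SketchClause
{
	int			left_key;
	int			right_key;
} SketchClause;

/*
 * Return true if every partition key position 0..num_keys-1 is covered by
 * at least one clause equating the keys at that same position.
 */
bool
sketch_have_partkey_equi_join(const SketchClause *clauses, size_t nclauses,
							  int num_keys)
{
	bool		key_has_clause[SKETCH_MAX_KEYS] = {false};

	assert(num_keys > 0 && num_keys <= SKETCH_MAX_KEYS);

	for (size_t i = 0; i < nclauses; i++)
	{
		int			k1 = clauses[i].left_key;
		int			k2 = clauses[i].right_key;

		/* Only clauses referencing partition keys on both sides are useful. */
		if (k1 < 0 || k2 < 0 || k1 >= num_keys || k2 >= num_keys)
			continue;

		/* Keys at different positions cannot drive a partition-wise join. */
		if (k1 != k2)
			continue;

		key_has_clause[k1] = true;
	}

	for (int k = 0; k < num_keys; k++)
	{
		if (!key_has_clause[k])
			return false;
	}
	return true;
}
```

With a two-column partition key, clauses equating position 0 with 0 and position 1 with 1 qualify, while a pair that crosses positions (0 with 1, 1 with 0) does not, even though both keys are mentioned.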
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
new file mode 100644
index a2fe661..91d855c
*** a/src/backend/optimizer/path/tidpath.c
--- b/src/backend/optimizer/path/tidpath.c
*************** create_tidscan_paths(PlannerInfo *root,
*** 266,270 ****
if (tidquals)
add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
! required_outer));
}
--- 266,270 ----
if (tidquals)
add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
! required_outer), false);
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index 95e6eb7..3f1389f
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** static Plan *prepare_sort_from_pathkeys(
*** 252,258 ****
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
TargetEntry *tle,
Relids relids);
! static Sort *make_sort_from_pathkeys(Plan *lefttree, List *pathkeys);
static Sort *make_sort_from_groupcols(List *groupcls,
AttrNumber *grpColIdx,
Plan *lefttree);
--- 252,259 ----
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
TargetEntry *tle,
Relids relids);
! static Sort *make_sort_from_pathkeys(Plan *lefttree, List *pathkeys,
! Relids relids);
static Sort *make_sort_from_groupcols(List *groupcls,
AttrNumber *grpColIdx,
Plan *lefttree);
*************** create_sort_plan(PlannerInfo *root, Sort
*** 1650,1656 ****
subplan = create_plan_recurse(root, best_path->subpath,
flags | CP_SMALL_TLIST);
! plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys);
copy_generic_path_info(&plan->plan, (Path *) best_path);
--- 1651,1657 ----
subplan = create_plan_recurse(root, best_path->subpath,
flags | CP_SMALL_TLIST);
! plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys, NULL);
copy_generic_path_info(&plan->plan, (Path *) best_path);
*************** create_mergejoin_plan(PlannerInfo *root,
*** 3767,3772 ****
--- 3768,3775 ----
ListCell *lc;
ListCell *lop;
ListCell *lip;
+ Path *outer_path = best_path->jpath.outerjoinpath;
+ Path *inner_path = best_path->jpath.innerjoinpath;
/*
* MergeJoin can project, so we don't have to demand exact tlists from the
*************** create_mergejoin_plan(PlannerInfo *root,
*** 3830,3837 ****
*/
if (best_path->outersortkeys)
{
Sort *sort = make_sort_from_pathkeys(outer_plan,
! best_path->outersortkeys);
label_sort_with_costsize(root, sort, -1.0);
outer_plan = (Plan *) sort;
--- 3833,3842 ----
*/
if (best_path->outersortkeys)
{
+ Relids outer_relids = outer_path->parent->relids;
Sort *sort = make_sort_from_pathkeys(outer_plan,
! best_path->outersortkeys,
! outer_relids);
label_sort_with_costsize(root, sort, -1.0);
outer_plan = (Plan *) sort;
*************** create_mergejoin_plan(PlannerInfo *root,
*** 3842,3849 ****
if (best_path->innersortkeys)
{
Sort *sort = make_sort_from_pathkeys(inner_plan,
! best_path->innersortkeys);
label_sort_with_costsize(root, sort, -1.0);
inner_plan = (Plan *) sort;
--- 3847,3856 ----
if (best_path->innersortkeys)
{
+ Relids inner_relids = inner_path->parent->relids;
Sort *sort = make_sort_from_pathkeys(inner_plan,
! best_path->innersortkeys,
! inner_relids);
label_sort_with_costsize(root, sort, -1.0);
inner_plan = (Plan *) sort;
*************** prepare_sort_from_pathkeys(Plan *lefttre
*** 5687,5697 ****
continue;
/*
! * Ignore child members unless they match the rel being
* sorted.
*/
if (em->em_is_child &&
! !bms_equal(em->em_relids, relids))
continue;
sortexpr = em->em_expr;
--- 5694,5704 ----
continue;
/*
! * Ignore child members unless they belong to the rel being
* sorted.
*/
if (em->em_is_child &&
! !bms_is_subset(em->em_relids, relids))
continue;
sortexpr = em->em_expr;
*************** find_ec_member_for_tle(EquivalenceClass
*** 5803,5812 ****
continue;
/*
! * Ignore child members unless they match the rel being sorted.
*/
if (em->em_is_child &&
! !bms_equal(em->em_relids, relids))
continue;
/* Match if same expression (after stripping relabel) */
--- 5810,5819 ----
continue;
/*
! * Ignore child members unless they belong to the rel being sorted.
*/
if (em->em_is_child &&
! !bms_is_subset(em->em_relids, relids))
continue;
/* Match if same expression (after stripping relabel) */
*************** find_ec_member_for_tle(EquivalenceClass
*** 5827,5835 ****
*
* 'lefttree' is the node which yields input tuples
* 'pathkeys' is the list of pathkeys by which the result is to be sorted
*/
static Sort *
! make_sort_from_pathkeys(Plan *lefttree, List *pathkeys)
{
int numsortkeys;
AttrNumber *sortColIdx;
--- 5834,5843 ----
*
* 'lefttree' is the node which yields input tuples
* 'pathkeys' is the list of pathkeys by which the result is to be sorted
+ * 'relids' is the set of relations required by prepare_sort_from_pathkeys()
*/
static Sort *
! make_sort_from_pathkeys(Plan *lefttree, List *pathkeys, Relids relids)
{
int numsortkeys;
AttrNumber *sortColIdx;
*************** make_sort_from_pathkeys(Plan *lefttree,
*** 5839,5845 ****
/* Compute sort column info, and adjust lefttree as needed */
lefttree = prepare_sort_from_pathkeys(lefttree, pathkeys,
! NULL,
NULL,
false,
&numsortkeys,
--- 5847,5853 ----
/* Compute sort column info, and adjust lefttree as needed */
lefttree = prepare_sort_from_pathkeys(lefttree, pathkeys,
! relids,
NULL,
false,
&numsortkeys,
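The createplan.c hunks above replace bms_equal() with bms_is_subset() when deciding whether a child equivalence-class member may be used for sorting. The effect can be illustrated with a small standalone sketch that models relid sets as 32-bit masks. SketchRelids and the sketch_* functions are hypothetical, and the scenario (a sort over a child join whose relids are {4, 5}, with a child member belonging to relation 4 only) is my reading of why the test was relaxed, not something stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Relids: bit n set means relation n is a member. */
typedef uint32_t SketchRelids;

/* True if every relation in a is also in b (same idea as bms_is_subset). */
bool
sketch_relids_is_subset(SketchRelids a, SketchRelids b)
{
	return (a & ~b) == 0;
}

/* Old rule: a child member is usable only if it matches the sorted rel exactly. */
bool
sketch_child_member_usable_old(SketchRelids em_relids, SketchRelids sorted_relids)
{
	assert(sorted_relids != 0);
	return em_relids == sorted_relids;
}

/* New rule: a child member is usable if it belongs to the sorted rel. */
bool
sketch_child_member_usable_new(SketchRelids em_relids, SketchRelids sorted_relids)
{
	assert(sorted_relids != 0);
	return sketch_relids_is_subset(em_relids, sorted_relids);
}
```

For a single child relation the two rules agree; they differ only once the rel being sorted is itself a join of children, which is the case the new relids argument to make_sort_from_pathkeys() appears to serve.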
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
new file mode 100644
index ebd442a..0313c71
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
***************
*** 14,20 ****
--- 14,22 ----
*/
#include "postgres.h"
+ #include "access/sysattr.h"
#include "catalog/pg_type.h"
+ #include "catalog/pg_class.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
***************
*** 26,31 ****
--- 28,34 ----
#include "optimizer/planner.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+ #include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "parser/analyze.h"
#include "rewrite/rewriteManip.h"
*************** typedef struct PostponedQual
*** 45,50 ****
--- 48,54 ----
} PostponedQual;
+ static void create_grouped_var_infos(PlannerInfo *root);
static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
Index rtindex);
static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
*************** add_vars_to_targetlist(PlannerInfo *root
*** 240,245 ****
--- 244,533 ----
}
}
+ /*
+ * Add GroupedVarInfo to grouped_var_list for each aggregate and set up
+ * GroupedPathInfo for each base relation that can produce grouped paths.
+ *
+ * XXX In the future we might want to create GroupedVarInfo for grouping
+ * expressions too, so that grouping key is not limited to plain Var if the
+ * grouping takes place below the top-level join.
+ *
+ * root->group_pathkeys must be set up before this function is called.
+ */
+ extern void
+ add_grouping_info_to_base_rels(PlannerInfo *root)
+ {
+ int i;
+
+ /* No grouping in the query? */
+ if (!root->parse->groupClause || root->group_pathkeys == NIL)
+ return;
+
+ /* TODO This is just for PoC. Relax the limitation later. */
+ if (root->parse->havingQual)
+ return;
+
+ /* Create GroupedVarInfo per (distinct) aggregate. */
+ create_grouped_var_infos(root);
+
+ /* Is no grouping possible below the top-level join? */
+ if (root->grouped_var_list == NIL)
+ return;
+
+ /* Process the individual base relations. */
+ for (i = 1; i < root->simple_rel_array_size; i++)
+ {
+ RelOptInfo *rel = root->simple_rel_array[i];
+
+ /*
+ * "other rels" will have their targets built later, by translation of
+ * the target of the parent rel - see set_append_rel_size. If we
+ * wanted to prepare the child rels here, we'd need another iteration
+ * of simple_rel_array_size.
+ */
+ if (rel != NULL && rel->reloptkind == RELOPT_BASEREL)
+ prepare_rel_for_grouping(root, rel);
+ }
+ }
+
+ /*
+ * Create GroupedVarInfo for each distinct aggregate.
+ *
+ * If any aggregate is not suitable, set root->grouped_var_list to NIL and
+ * return.
+ *
+ * TODO Include aggregates from HAVING clause.
+ */
+ static void
+ create_grouped_var_infos(PlannerInfo *root)
+ {
+ List *tlist_exprs;
+ ListCell *lc;
+
+ Assert(root->grouped_var_list == NIL);
+
+ /*
+ * TODO Check if processed_tlist contains the HAVING aggregates. If not,
+ * get them elsewhere.
+ */
+ tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+ PVC_INCLUDE_AGGREGATES);
+ if (tlist_exprs == NIL)
+ return;
+
+ /* tlist_exprs may also contain Vars, but we only need Aggrefs. */
+ foreach(lc, tlist_exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Aggref *aggref;
+ ListCell *lc2;
+ GroupedVarInfo *gvi;
+ bool exists;
+
+ if (IsA(expr, Var))
+ continue;
+
+ aggref = castNode(Aggref, expr);
+
+ /* TODO Think if (some of) these can be handled. */
+ if (aggref->aggvariadic ||
+ aggref->aggdirectargs || aggref->aggorder ||
+ aggref->aggdistinct || aggref->aggfilter)
+ {
+ /*
+ * Partial aggregation is not useful if at least one aggregate
+ * cannot be evaluated below the top-level join.
+ *
+ * XXX Is it worth freeing the GroupedVarInfos and their subtrees?
+ */
+ root->grouped_var_list = NIL;
+ break;
+ }
+
+ /* Does GroupedVarInfo for this aggregate already exist? */
+ exists = false;
+ foreach(lc2, root->grouped_var_list)
+ {
+ Node *node = (Node *) lfirst(lc2);
+
+ gvi = castNode(GroupedVarInfo, node);
+
+ if (equal(aggref, gvi->gvexpr))
+ {
+ exists = true;
+ break;
+ }
+ }
+
+ /* Construct a new GroupedVarInfo if it does not exist yet. */
+ if (!exists)
+ {
+ Relids relids;
+
+ /* TODO Initialize gv_width. */
+ gvi = makeNode(GroupedVarInfo);
+
+ gvi->gvid = list_length(root->grouped_var_list);
+ gvi->gvexpr = (Expr *) copyObject(aggref);
+ gvi->agg_partial = copyObject(aggref);
+ mark_partial_aggref(gvi->agg_partial, AGGSPLIT_INITIAL_SERIAL);
+
+ /* Find out where the aggregate should be evaluated. */
+ relids = pull_varnos((Node *) aggref);
+ if (!bms_is_empty(relids))
+ gvi->gv_eval_at = relids;
+ else
+ {
+ Assert(aggref->aggstar);
+ gvi->gv_eval_at = NULL;
+ }
+
+ root->grouped_var_list = lappend(root->grouped_var_list, gvi);
+ }
+ }
+
+ list_free(tlist_exprs);
+ }
+
+ /*
+ * Check if all the expressions of rel->reltarget can be used as grouping
+ * expressions and create target for grouped paths.
+ *
+ * If we succeed in creating the grouping target, also replace rel->reltarget
+ * with a new one that has sortgrouprefs initialized -- this is necessary for
+ * create_agg_plan to match the grouping clauses against the input target
+ * expressions.
+ *
+ * rel_agg_attrs is a set of attributes of the relation referenced by aggregate
+ * arguments. These can exist in the (plain) target without being grouping
+ * expressions.
+ *
+ * rel_agg_vars should be passed instead if rel is a join.
+ *
+ * TODO How about PHVs?
+ *
+ * TODO Make sure cost / width of both "result" and "plain" are correct.
+ */
+ PathTarget *
+ create_grouped_target(PlannerInfo *root, RelOptInfo *rel,
+ Relids rel_agg_attrs, List *rel_agg_vars)
+ {
+ PathTarget *result, *plain;
+ ListCell *lc;
+
+ /* The target to be returned. */
+ result = create_empty_pathtarget();
+ /* The one to replace rel->reltarget. */
+ plain = create_empty_pathtarget();
+
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Expr *texpr;
+ Index sortgroupref;
+ bool agg_arg_only = false;
+
+ texpr = (Expr *) lfirst(lc);
+
+ sortgroupref = get_expr_sortgroupref(root, texpr);
+ if (sortgroupref > 0)
+ {
+ /* It's o.k. to use the target expression for grouping. */
+ add_column_to_pathtarget(result, texpr, sortgroupref);
+
+ /*
+ * As for the plain target, add the original expression but set
+ * sortgroupref in addition.
+ */
+ add_column_to_pathtarget(plain, texpr, sortgroupref);
+
+ /* Process the next expression. */
+ continue;
+ }
+
+ /*
+ * It may still be o.k. if the expression is only contained in Aggref
+ * - then it's not expected in the grouped output.
+ *
+ * TODO Try to handle generic expression, not only Var. That might
+ * require us to create rel->reltarget of the grouping rel in
+ * parallel to that of the plain rel, and adding whole expressions
+ * instead of individual vars.
+ */
+ if (IsA(texpr, Var))
+ {
+ Var *arg_var = castNode(Var, texpr);
+
+ if (rel->relid > 0)
+ {
+ AttrNumber varattno;
+
+ /*
+ * For a single relation we only need to check attribute
+ * number.
+ *
+ * Apply the same offset that pull_varattnos() did.
+ */
+ varattno = arg_var->varattno - FirstLowInvalidHeapAttributeNumber;
+
+ if (bms_is_member(varattno, rel_agg_attrs))
+ agg_arg_only = true;
+ }
+ else
+ {
+ ListCell *lc2;
+
+ /* Join case. */
+ foreach(lc2, rel_agg_vars)
+ {
+ Var *var = castNode(Var, lfirst(lc2));
+
+ if (var->varno == arg_var->varno &&
+ var->varattno == arg_var->varattno)
+ {
+ agg_arg_only = true;
+ break;
+ }
+ }
+ }
+
+ if (agg_arg_only)
+ {
+ /*
+ * This expression is not suitable for grouping, but the
+ * aggregation input target ought to stay complete.
+ */
+ add_column_to_pathtarget(plain, texpr, 0);
+ }
+ }
+
+ /*
+ * A single mismatched expression makes the whole relation useless
+ * for grouping.
+ */
+ if (!agg_arg_only)
+ {
+ /*
+ * TODO This seems possible to happen multiple times per relation,
+ * so result might be worth freeing. Implement free_pathtarget()?
+ * Or mark the relation as inappropriate for grouping?
+ */
+ /* TODO Free both result and plain. */
+ return NULL;
+ }
+ }
+
+ if (list_length(result->exprs) == 0)
+ {
+ /* TODO free_pathtarget(result); free_pathtarget(plain) */
+ result = NULL;
+ }
+
+ /* Apply the adjusted input target as the replacement is complete now. */
+ rel->reltarget = plain;
+
+ return result;
+ }
+
/*****************************************************************************
*
*************** create_lateral_join_info(PlannerInfo *ro
*** 629,639 ****
for (rti = 1; rti < root->simple_rel_array_size; rti++)
{
RelOptInfo *brel = root->simple_rel_array[rti];
! if (brel == NULL || brel->reloptkind != RELOPT_BASEREL)
continue;
! if (root->simple_rte_array[rti]->inh)
{
foreach(lc, root->append_rel_list)
{
--- 917,941 ----
for (rti = 1; rti < root->simple_rel_array_size; rti++)
{
RelOptInfo *brel = root->simple_rel_array[rti];
+ RangeTblEntry *brte = root->simple_rte_array[rti];
! if (brel == NULL)
continue;
! /*
! * If an "other rel" RTE is a "partitioned table", we must propagate
! * the lateral info inherited all the way from the root parent to its
! * children. That's because the children are not linked directly with
! * the root parent via AppendRelInfo's unlike in case of a regular
! * inheritance set (see expand_inherited_rtentry()). Failing to
! * do this would result in those children not getting marked with the
! * appropriate lateral info.
! */
! if (brel->reloptkind != RELOPT_BASEREL &&
! brte->relkind != RELKIND_PARTITIONED_TABLE)
! continue;
!
! if (brte->inh)
{
foreach(lc, root->append_rel_list)
{
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
new file mode 100644
index 5565736..058af2c
*** a/src/backend/optimizer/plan/planagg.c
--- b/src/backend/optimizer/plan/planagg.c
*************** preprocess_minmax_aggregates(PlannerInfo
*** 223,229 ****
create_minmaxagg_path(root, grouped_rel,
create_pathtarget(root, tlist),
aggs_list,
! (List *) parse->havingQual));
}
/*
--- 223,229 ----
create_minmaxagg_path(root, grouped_rel,
create_pathtarget(root, tlist),
aggs_list,
! (List *) parse->havingQual), false);
}
/*
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
new file mode 100644
index ef0de3f..f70b445
*** a/src/backend/optimizer/plan/planmain.c
--- b/src/backend/optimizer/plan/planmain.c
*************** query_planner(PlannerInfo *root, List *t
*** 83,89 ****
add_path(final_rel, (Path *)
create_result_path(root, final_rel,
final_rel->reltarget,
! (List *) parse->jointree->quals));
/* Select cheapest path (pretty easy in this case...) */
set_cheapest(final_rel);
--- 83,89 ----
add_path(final_rel, (Path *)
create_result_path(root, final_rel,
final_rel->reltarget,
! (List *) parse->jointree->quals), false);
/* Select cheapest path (pretty easy in this case...) */
set_cheapest(final_rel);
*************** query_planner(PlannerInfo *root, List *t
*** 114,119 ****
--- 114,120 ----
root->full_join_clauses = NIL;
root->join_info_list = NIL;
root->placeholder_list = NIL;
+ root->grouped_var_list = NIL;
root->fkey_list = NIL;
root->initial_rels = NIL;
*************** query_planner(PlannerInfo *root, List *t
*** 177,182 ****
--- 178,191 ----
(*qp_callback) (root, qp_extra);
/*
+ * If the query result can be grouped, check if any grouping can be
+ * performed below the top-level join. If so, initialize GroupedPathInfo
+ * of base relations capable of doing the grouping and set up
+ * root->grouped_var_list.
+ */
+ add_grouping_info_to_base_rels(root);
+
+ /*
* Examine any "placeholder" expressions generated during subquery pullup.
* Make sure that the Vars they need are marked as needed at the relevant
* join level. This must be done before join removal because it might
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 649a233..d47f635
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** typedef struct
*** 108,117 ****
--- 108,135 ----
int *tleref_to_colnum_map;
} grouping_sets_data;
+ /* Result of a given invocation of inheritance_planner_guts() */
+ typedef struct
+ {
+ Index nominalRelation;
+ List *partitioned_rels;
+ List *resultRelations;
+ List *subpaths;
+ List *subroots;
+ List *withCheckOptionLists;
+ List *returningLists;
+ List *final_rtable;
+ List *init_plans;
+ int save_rel_array_size;
+ RelOptInfo **save_rel_array;
+ } inheritance_planner_result;
+
/* Local functions */
static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
static void inheritance_planner(PlannerInfo *root);
+ static void inheritance_planner_guts(PlannerInfo *root,
+ inheritance_planner_result *inhpres);
static void grouping_planner(PlannerInfo *root, bool inheritance_update,
double tuple_fraction);
static grouping_sets_data *preprocess_grouping_sets(PlannerInfo *root);
*************** static void standard_qp_callback(Planner
*** 130,138 ****
static double get_number_of_groups(PlannerInfo *root,
double path_rows,
grouping_sets_data *gd);
- static Size estimate_hashagg_tablesize(Path *path,
- const AggClauseCosts *agg_costs,
- double dNumGroups);
static RelOptInfo *create_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
PathTarget *target,
--- 148,153 ----
*************** preprocess_phv_expression(PlannerInfo *r
*** 1020,1044 ****
static void
inheritance_planner(PlannerInfo *root)
{
Query *parse = root->parse;
int parentRTindex = parse->resultRelation;
Bitmapset *subqueryRTindexes;
Bitmapset *modifiableARIindexes;
! int nominalRelation = -1;
! List *final_rtable = NIL;
! int save_rel_array_size = 0;
! RelOptInfo **save_rel_array = NULL;
! List *subpaths = NIL;
! List *subroots = NIL;
! List *resultRelations = NIL;
! List *withCheckOptionLists = NIL;
! List *returningLists = NIL;
! List *rowMarks;
! RelOptInfo *final_rel;
ListCell *lc;
Index rti;
RangeTblEntry *parent_rte;
- List *partitioned_rels = NIL;
Assert(parse->commandType != CMD_INSERT);
--- 1035,1139 ----
static void
inheritance_planner(PlannerInfo *root)
{
+ inheritance_planner_result inhpres;
+ Query *parse = root->parse;
+ RelOptInfo *final_rel;
+ Index rti;
+ int final_rtable_len;
+ ListCell *lc;
+ List *rowMarks;
+
+ /*
+ * Away we go... Although the inheritance hierarchy to be processed might
+ * be represented in a non-flat manner, some of the elements needed to
+ * create the final ModifyTable path are always returned in a flat list
+ * structure.
+ */
+ memset(&inhpres, 0, sizeof(inhpres));
+ inheritance_planner_guts(root, &inhpres);
+
+ /* Result path must go into outer query's FINAL upperrel */
+ final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
+
+ /*
+ * We don't currently worry about setting final_rel's consider_parallel
+ * flag in this case, nor about allowing FDWs or create_upper_paths_hook
+ * to get control here.
+ */
+
+ /*
+ * If we managed to exclude every child rel, return a dummy plan; it
+ * doesn't even need a ModifyTable node.
+ */
+ if (inhpres.subpaths == NIL)
+ {
+ set_dummy_rel_pathlist(final_rel);
+ return;
+ }
+
+ /*
+ * Put back the final adjusted rtable into the master copy of the Query.
+ * (We mustn't do this if we found no non-excluded children.)
+ */
+ parse->rtable = inhpres.final_rtable;
+ root->simple_rel_array_size = inhpres.save_rel_array_size;
+ root->simple_rel_array = inhpres.save_rel_array;
+ /* Must reconstruct master's simple_rte_array, too */
+ final_rtable_len = list_length(inhpres.final_rtable);
+ root->simple_rte_array = (RangeTblEntry **)
+ palloc0((final_rtable_len + 1) *
+ sizeof(RangeTblEntry *));
+ rti = 1;
+ foreach(lc, inhpres.final_rtable)
+ {
+ RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
+
+ root->simple_rte_array[rti++] = rte;
+ }
+
+ /*
+ * If there was a FOR [KEY] UPDATE/SHARE clause, the LockRows node will
+ * have dealt with fetching non-locked marked rows, else we need to have
+ * ModifyTable do that.
+ */
+ if (parse->rowMarks)
+ rowMarks = NIL;
+ else
+ rowMarks = root->rowMarks;
+
+ /* Create Path representing a ModifyTable to do the UPDATE/DELETE work */
+ add_path(final_rel, (Path *)
+ create_modifytable_path(root, final_rel,
+ parse->commandType,
+ parse->canSetTag,
+ inhpres.nominalRelation,
+ inhpres.partitioned_rels,
+ inhpres.resultRelations,
+ inhpres.subpaths,
+ inhpres.subroots,
+ inhpres.withCheckOptionLists,
+ inhpres.returningLists,
+ rowMarks,
+ NULL,
+ SS_assign_special_param(root)), false);
+ }
+
+ /*
+ * inheritance_planner_guts
+ * Recursive guts of inheritance_planner
+ */
+ static void
+ inheritance_planner_guts(PlannerInfo *root,
+ inheritance_planner_result *inhpres)
+ {
Query *parse = root->parse;
int parentRTindex = parse->resultRelation;
Bitmapset *subqueryRTindexes;
Bitmapset *modifiableARIindexes;
! bool nominalRelationSet = false;
ListCell *lc;
Index rti;
RangeTblEntry *parent_rte;
Assert(parse->commandType != CMD_INSERT);
*************** inheritance_planner(PlannerInfo *root)
*** 1106,1112 ****
*/
parent_rte = rt_fetch(parentRTindex, root->parse->rtable);
if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE)
! nominalRelation = parentRTindex;
/*
* And now we can get on with generating a plan for each child table.
--- 1201,1210 ----
*/
parent_rte = rt_fetch(parentRTindex, root->parse->rtable);
if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE)
! {
! inhpres->nominalRelation = parentRTindex;
! nominalRelationSet = true;
! }
/*
* And now we can get on with generating a plan for each child table.
*************** inheritance_planner(PlannerInfo *root)
*** 1115,1120 ****
--- 1213,1219 ----
{
AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
PlannerInfo *subroot;
+ Index childRTindex = appinfo->child_relid;
RangeTblEntry *child_rte;
RelOptInfo *sub_final_rel;
Path *subpath;
*************** inheritance_planner(PlannerInfo *root)
*** 1136,1152 ****
* references to the parent RTE to refer to the current child RTE,
* then fool around with subquery RTEs.
*/
! subroot->parse = (Query *)
! adjust_appendrel_attrs(root,
! (Node *) parse,
! appinfo);
/*
* If there are securityQuals attached to the parent, move them to the
* child rel (they've already been transformed properly for that).
*/
parent_rte = rt_fetch(parentRTindex, subroot->parse->rtable);
! child_rte = rt_fetch(appinfo->child_relid, subroot->parse->rtable);
child_rte->securityQuals = parent_rte->securityQuals;
parent_rte->securityQuals = NIL;
--- 1235,1249 ----
* references to the parent RTE to refer to the current child RTE,
* then fool around with subquery RTEs.
*/
! subroot->parse = (Query *) adjust_appendrel_attrs(root, (Node *) parse,
! 1, &appinfo);
/*
* If there are securityQuals attached to the parent, move them to the
* child rel (they've already been transformed properly for that).
*/
parent_rte = rt_fetch(parentRTindex, subroot->parse->rtable);
! child_rte = rt_fetch(childRTindex, subroot->parse->rtable);
child_rte->securityQuals = parent_rte->securityQuals;
parent_rte->securityQuals = NIL;
*************** inheritance_planner(PlannerInfo *root)
*** 1191,1197 ****
* These won't be referenced, so there's no need to make them very
* valid-looking.
*/
! while (list_length(subroot->parse->rtable) < list_length(final_rtable))
subroot->parse->rtable = lappend(subroot->parse->rtable,
makeNode(RangeTblEntry));
--- 1288,1295 ----
* These won't be referenced, so there's no need to make them very
* valid-looking.
*/
! while (list_length(subroot->parse->rtable) <
! list_length(inhpres->final_rtable))
subroot->parse->rtable = lappend(subroot->parse->rtable,
makeNode(RangeTblEntry));
*************** inheritance_planner(PlannerInfo *root)
*** 1203,1209 ****
* since subquery RTEs couldn't contain any references to the target
* rel.
*/
! if (final_rtable != NIL && subqueryRTindexes != NULL)
{
ListCell *lr;
--- 1301,1307 ----
* since subquery RTEs couldn't contain any references to the target
* rel.
*/
! if (inhpres->final_rtable != NIL && subqueryRTindexes != NULL)
{
ListCell *lr;
*************** inheritance_planner(PlannerInfo *root)
*** 1248,1253 ****
--- 1346,1392 ----
}
}
+ /*
+ * Recurse for a partitioned child table. We shouldn't be planning
+ * a partitioned RTE as a child member, which is what the code after
+ * this block does.
+ */
+ if (child_rte->inh)
+ {
+ inheritance_planner_result child_inhpres;
+
+ Assert(child_rte->relkind == RELKIND_PARTITIONED_TABLE);
+
+ /* During the recursive invocation, this child is the parent. */
+ subroot->parse->resultRelation = childRTindex;
+ memset(&child_inhpres, 0, sizeof(child_inhpres));
+ inheritance_planner_guts(subroot, &child_inhpres);
+
+ inhpres->partitioned_rels = list_concat(inhpres->partitioned_rels,
+ child_inhpres.partitioned_rels);
+ inhpres->resultRelations = list_concat(inhpres->resultRelations,
+ child_inhpres.resultRelations);
+ inhpres->subpaths = list_concat(inhpres->subpaths,
+ child_inhpres.subpaths);
+ inhpres->subroots = list_concat(inhpres->subroots,
+ child_inhpres.subroots);
+ inhpres->withCheckOptionLists =
+ list_concat(inhpres->withCheckOptionLists,
+ child_inhpres.withCheckOptionLists);
+ inhpres->returningLists = list_concat(inhpres->returningLists,
+ child_inhpres.returningLists);
+ if (child_inhpres.final_rtable != NIL)
+ inhpres->final_rtable = child_inhpres.final_rtable;
+ if (child_inhpres.init_plans != NIL)
+ inhpres->init_plans = child_inhpres.init_plans;
+ if (child_inhpres.save_rel_array_size != 0)
+ {
+ inhpres->save_rel_array_size = child_inhpres.save_rel_array_size;
+ inhpres->save_rel_array = child_inhpres.save_rel_array;
+ }
+ continue;
+ }
+
/* There shouldn't be any OJ info to translate, as yet */
Assert(subroot->join_info_list == NIL);
/* and we haven't created PlaceHolderInfos, either */
*************** inheritance_planner(PlannerInfo *root)
*** 1279,1286 ****
* the duplicate child RTE added for the parent does not appear
* anywhere else in the plan tree.
*/
! if (nominalRelation < 0)
! nominalRelation = appinfo->child_relid;
/*
* Select cheapest path in case there's more than one. We always run
--- 1418,1428 ----
* the duplicate child RTE added for the parent does not appear
* anywhere else in the plan tree.
*/
! if (!nominalRelationSet)
! {
! inhpres->nominalRelation = childRTindex;
! nominalRelationSet = true;
! }
/*
* Select cheapest path in case there's more than one. We always run
*************** inheritance_planner(PlannerInfo *root)
*** 1303,1314 ****
* becomes the initial contents of final_rtable; otherwise, append
* just its modified subquery RTEs to final_rtable.
*/
! if (final_rtable == NIL)
! final_rtable = subroot->parse->rtable;
else
! final_rtable = list_concat(final_rtable,
! list_copy_tail(subroot->parse->rtable,
! list_length(final_rtable)));
/*
* We need to collect all the RelOptInfos from all child plans into
--- 1445,1456 ----
* becomes the initial contents of final_rtable; otherwise, append
* just its modified subquery RTEs to final_rtable.
*/
! if (inhpres->final_rtable == NIL)
! inhpres->final_rtable = subroot->parse->rtable;
else
! inhpres->final_rtable = list_concat(inhpres->final_rtable,
! list_copy_tail(subroot->parse->rtable,
! list_length(inhpres->final_rtable)));
/*
* We need to collect all the RelOptInfos from all child plans into
*************** inheritance_planner(PlannerInfo *root)
*** 1317,1425 ****
* have to propagate forward the RelOptInfos that were already built
* in previous children.
*/
! Assert(subroot->simple_rel_array_size >= save_rel_array_size);
! for (rti = 1; rti < save_rel_array_size; rti++)
{
! RelOptInfo *brel = save_rel_array[rti];
if (brel)
subroot->simple_rel_array[rti] = brel;
}
! save_rel_array_size = subroot->simple_rel_array_size;
! save_rel_array = subroot->simple_rel_array;
/* Make sure any initplans from this rel get into the outer list */
! root->init_plans = subroot->init_plans;
/* Build list of sub-paths */
! subpaths = lappend(subpaths, subpath);
/* Build list of modified subroots, too */
! subroots = lappend(subroots, subroot);
/* Build list of target-relation RT indexes */
! resultRelations = lappend_int(resultRelations, appinfo->child_relid);
/* Build lists of per-relation WCO and RETURNING targetlists */
if (parse->withCheckOptions)
! withCheckOptionLists = lappend(withCheckOptionLists,
! subroot->parse->withCheckOptions);
if (parse->returningList)
! returningLists = lappend(returningLists,
! subroot->parse->returningList);
!
Assert(!parse->onConflict);
}
if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE)
{
! partitioned_rels = get_partitioned_child_rels(root, parentRTindex);
/* The root partitioned table is included as a child rel */
! Assert(list_length(partitioned_rels) >= 1);
! }
!
! /* Result path must go into outer query's FINAL upperrel */
! final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
!
! /*
! * We don't currently worry about setting final_rel's consider_parallel
! * flag in this case, nor about allowing FDWs or create_upper_paths_hook
! * to get control here.
! */
!
! /*
! * If we managed to exclude every child rel, return a dummy plan; it
! * doesn't even need a ModifyTable node.
! */
! if (subpaths == NIL)
! {
! set_dummy_rel_pathlist(final_rel);
! return;
! }
!
! /*
! * Put back the final adjusted rtable into the master copy of the Query.
! * (We mustn't do this if we found no non-excluded children.)
! */
! parse->rtable = final_rtable;
! root->simple_rel_array_size = save_rel_array_size;
! root->simple_rel_array = save_rel_array;
! /* Must reconstruct master's simple_rte_array, too */
! root->simple_rte_array = (RangeTblEntry **)
! palloc0((list_length(final_rtable) + 1) * sizeof(RangeTblEntry *));
! rti = 1;
! foreach(lc, final_rtable)
! {
! RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
!
! root->simple_rte_array[rti++] = rte;
}
-
- /*
- * If there was a FOR [KEY] UPDATE/SHARE clause, the LockRows node will
- * have dealt with fetching non-locked marked rows, else we need to have
- * ModifyTable do that.
- */
- if (parse->rowMarks)
- rowMarks = NIL;
- else
- rowMarks = root->rowMarks;
-
- /* Create Path representing a ModifyTable to do the UPDATE/DELETE work */
- add_path(final_rel, (Path *)
- create_modifytable_path(root, final_rel,
- parse->commandType,
- parse->canSetTag,
- nominalRelation,
- partitioned_rels,
- resultRelations,
- subpaths,
- subroots,
- withCheckOptionLists,
- returningLists,
- rowMarks,
- NULL,
- SS_assign_special_param(root)));
}
/*--------------------
--- 1459,1506 ----
* have to propagate forward the RelOptInfos that were already built
* in previous children.
*/
! Assert(subroot->simple_rel_array_size >= inhpres->save_rel_array_size);
! for (rti = 1; rti < inhpres->save_rel_array_size; rti++)
{
! RelOptInfo *brel = inhpres->save_rel_array[rti];
if (brel)
subroot->simple_rel_array[rti] = brel;
}
! inhpres->save_rel_array_size = subroot->simple_rel_array_size;
! inhpres->save_rel_array = subroot->simple_rel_array;
/* Make sure any initplans from this rel get into the outer list */
! inhpres->init_plans = subroot->init_plans;
/* Build list of sub-paths */
! inhpres->subpaths = lappend(inhpres->subpaths, subpath);
/* Build list of modified subroots, too */
! inhpres->subroots = lappend(inhpres->subroots, subroot);
/* Build list of target-relation RT indexes */
! inhpres->resultRelations = lappend_int(inhpres->resultRelations,
! childRTindex);
/* Build lists of per-relation WCO and RETURNING targetlists */
if (parse->withCheckOptions)
! inhpres->withCheckOptionLists =
! lappend(inhpres->withCheckOptionLists,
! subroot->parse->withCheckOptions);
if (parse->returningList)
! inhpres->returningLists = lappend(inhpres->returningLists,
! subroot->parse->returningList);
Assert(!parse->onConflict);
}
if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE)
{
! inhpres->partitioned_rels = get_partitioned_child_rels(root,
! parentRTindex);
/* The root partitioned table is included as a child rel */
! Assert(list_length(inhpres->partitioned_rels) >= 1);
}
}
/*--------------------
*************** grouping_planner(PlannerInfo *root, bool
*** 2040,2046 ****
}
/* And shove it into final_rel */
! add_path(final_rel, path);
}
/*
--- 2121,2127 ----
}
/* And shove it into final_rel */
! add_path(final_rel, path, false);
}
/*
*************** get_number_of_groups(PlannerInfo *root,
*** 3446,3485 ****
}
/*
- * estimate_hashagg_tablesize
- * estimate the number of bytes that a hash aggregate hashtable will
- * require based on the agg_costs, path width and dNumGroups.
- *
- * XXX this may be over-estimating the size now that hashagg knows to omit
- * unneeded columns from the hashtable. Also for mixed-mode grouping sets,
- * grouping columns not in the hashed set are counted here even though hashagg
- * won't store them. Is this a problem?
- */
- static Size
- estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
- double dNumGroups)
- {
- Size hashentrysize;
-
- /* Estimate per-hash-entry space at tuple width... */
- hashentrysize = MAXALIGN(path->pathtarget->width) +
- MAXALIGN(SizeofMinimalTupleHeader);
-
- /* plus space for pass-by-ref transition values... */
- hashentrysize += agg_costs->transitionSpace;
- /* plus the per-hash-entry overhead */
- hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
-
- /*
- * Note that this disregards the effect of fill-factor and growth policy
- * of the hash-table. That's probably ok, given default the default
- * fill-factor is relatively high. It'd be hard to meaningfully factor in
- * "double-in-size" growth policies here.
- */
- return hashentrysize * dNumGroups;
- }
-
- /*
* create_grouping_paths
*
* Build a new upperrel containing Paths for grouping and/or aggregation.
--- 3527,3532 ----
*************** create_grouping_paths(PlannerInfo *root,
*** 3600,3606 ****
(List *) parse->havingQual);
}
! add_path(grouped_rel, path);
/* No need to consider any other alternatives. */
set_cheapest(grouped_rel);
--- 3647,3653 ----
(List *) parse->havingQual);
}
! add_path(grouped_rel, path, false);
/* No need to consider any other alternatives. */
set_cheapest(grouped_rel);
*************** create_grouping_paths(PlannerInfo *root,
*** 3777,3783 ****
parse->groupClause,
NIL,
&agg_partial_costs,
! dNumPartialGroups));
else
add_partial_path(grouped_rel, (Path *)
create_group_path(root,
--- 3824,3831 ----
parse->groupClause,
NIL,
&agg_partial_costs,
! dNumPartialGroups),
! false);
else
add_partial_path(grouped_rel, (Path *)
create_group_path(root,
*************** create_grouping_paths(PlannerInfo *root,
*** 3786,3792 ****
partial_grouping_target,
parse->groupClause,
NIL,
! dNumPartialGroups));
}
}
}
--- 3834,3841 ----
partial_grouping_target,
parse->groupClause,
NIL,
! dNumPartialGroups),
! false);
}
}
}
*************** create_grouping_paths(PlannerInfo *root,
*** 3817,3823 ****
parse->groupClause,
NIL,
&agg_partial_costs,
! dNumPartialGroups));
}
}
}
--- 3866,3873 ----
parse->groupClause,
NIL,
&agg_partial_costs,
! dNumPartialGroups),
! false);
}
}
}
*************** create_grouping_paths(PlannerInfo *root,
*** 3869,3875 ****
parse->groupClause,
(List *) parse->havingQual,
agg_costs,
! dNumGroups));
}
else if (parse->groupClause)
{
--- 3919,3925 ----
parse->groupClause,
(List *) parse->havingQual,
agg_costs,
! dNumGroups), false);
}
else if (parse->groupClause)
{
*************** create_grouping_paths(PlannerInfo *root,
*** 3884,3890 ****
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups));
}
else
{
--- 3934,3940 ----
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups), false);
}
else
{
*************** create_grouping_paths(PlannerInfo *root,
*** 3933,3939 ****
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups));
else
add_path(grouped_rel, (Path *)
create_group_path(root,
--- 3983,3989 ----
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups), false);
else
add_path(grouped_rel, (Path *)
create_group_path(root,
*************** create_grouping_paths(PlannerInfo *root,
*** 3942,3948 ****
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups));
/*
* The point of using Gather Merge rather than Gather is that it
--- 3992,3998 ----
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups), false);
/*
* The point of using Gather Merge rather than Gather is that it
*************** create_grouping_paths(PlannerInfo *root,
*** 3995,4001 ****
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups));
else
add_path(grouped_rel, (Path *)
create_group_path(root,
--- 4045,4051 ----
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups), false);
else
add_path(grouped_rel, (Path *)
create_group_path(root,
*************** create_grouping_paths(PlannerInfo *root,
*** 4004,4010 ****
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups));
}
}
}
--- 4054,4060 ----
target,
parse->groupClause,
(List *) parse->havingQual,
! dNumGroups), false);
}
}
}
*************** create_grouping_paths(PlannerInfo *root,
*** 4049,4055 ****
parse->groupClause,
(List *) parse->havingQual,
agg_costs,
! dNumGroups));
}
}
--- 4099,4105 ----
parse->groupClause,
(List *) parse->havingQual,
agg_costs,
! dNumGroups), false);
}
}
*************** create_grouping_paths(PlannerInfo *root,
*** 4087,4095 ****
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups));
}
}
}
/* Give a helpful error if we failed to find any implementation */
--- 4137,4212 ----
parse->groupClause,
(List *) parse->havingQual,
&agg_final_costs,
! dNumGroups), false);
}
}
+
+ /*
+ * If input_rel has partially aggregated partial paths, gather them
+ * and perform the final aggregation.
+ *
+ * TODO Allow havingQual - currently not supported at base relation
+ * level.
+ */
+ if (input_rel->gpi != NULL &&
+ input_rel->gpi->partial_pathlist != NIL &&
+ !parse->havingQual)
+ {
+ Path *path = (Path *) linitial(input_rel->gpi->partial_pathlist);
+ double total_groups = path->rows * path->parallel_workers;
+
+ path = (Path *) create_gather_path(root,
+ input_rel,
+ path,
+ path->pathtarget,
+ NULL,
+ &total_groups);
+
+ /*
+ * The input path is partially aggregated and the final
+ * aggregation - if the path wins - will be done below. So we're
+ * done with it for now.
+ *
+ * The top-level grouped_rel needs to receive the path into
+ * regular pathlist, as opposed to grouped_rel->gpi->pathlist.
+ */
+ add_path(input_rel, path, false);
+ }
+
+ /*
+ * If input_rel has partially aggregated paths, perform the final
+ * aggregation.
+ *
+ * TODO Allow havingQual - currently not supported at base relation
+ * level.
+ */
+ if (input_rel->gpi != NULL && input_rel->gpi->pathlist != NIL &&
+ !parse->havingQual)
+ {
+ Path *pre_agg = (Path *) linitial(input_rel->gpi->pathlist);
+
+ dNumGroups = get_number_of_groups(root, pre_agg->rows, gd);
+
+ MemSet(&agg_final_costs, 0, sizeof(AggClauseCosts));
+ get_agg_clause_costs(root, (Node *) target->exprs,
+ AGGSPLIT_FINAL_DESERIAL,
+ &agg_final_costs);
+ get_agg_clause_costs(root, parse->havingQual,
+ AGGSPLIT_FINAL_DESERIAL,
+ &agg_final_costs);
+
+ add_path(grouped_rel,
+ (Path *) create_agg_path(root, grouped_rel,
+ pre_agg,
+ target,
+ AGG_HASHED,
+ AGGSPLIT_FINAL_DESERIAL,
+ parse->groupClause,
+ (List *) parse->havingQual,
+ &agg_final_costs,
+ dNumGroups),
+ false);
+ }
}
/* Give a helpful error if we failed to find any implementation */
*************** consider_groupingsets_paths(PlannerInfo
*** 4289,4295 ****
strat,
new_rollups,
agg_costs,
! dNumGroups));
return;
}
--- 4406,4412 ----
strat,
new_rollups,
agg_costs,
! dNumGroups), false);
return;
}
*************** consider_groupingsets_paths(PlannerInfo
*** 4447,4453 ****
AGG_MIXED,
rollups,
agg_costs,
! dNumGroups));
}
}
--- 4564,4570 ----
AGG_MIXED,
rollups,
agg_costs,
! dNumGroups), false);
}
}
*************** consider_groupingsets_paths(PlannerInfo
*** 4464,4470 ****
AGG_SORTED,
gd->rollups,
agg_costs,
! dNumGroups));
}
/*
--- 4581,4587 ----
AGG_SORTED,
gd->rollups,
agg_costs,
! dNumGroups), false);
}
/*
*************** create_one_window_path(PlannerInfo *root
*** 4649,4655 ****
window_pathkeys);
}
! add_path(window_rel, path);
}
/*
--- 4766,4772 ----
window_pathkeys);
}
! add_path(window_rel, path, false);
}
/*
*************** create_distinct_paths(PlannerInfo *root,
*** 4755,4761 ****
create_upper_unique_path(root, distinct_rel,
path,
list_length(root->distinct_pathkeys),
! numDistinctRows));
}
}
--- 4872,4878 ----
create_upper_unique_path(root, distinct_rel,
path,
list_length(root->distinct_pathkeys),
! numDistinctRows), false);
}
}
*************** create_distinct_paths(PlannerInfo *root,
*** 4782,4788 ****
create_upper_unique_path(root, distinct_rel,
path,
list_length(root->distinct_pathkeys),
! numDistinctRows));
}
/*
--- 4899,4905 ----
create_upper_unique_path(root, distinct_rel,
path,
list_length(root->distinct_pathkeys),
! numDistinctRows), false);
}
/*
*************** create_distinct_paths(PlannerInfo *root,
*** 4829,4835 ****
parse->distinctClause,
NIL,
NULL,
! numDistinctRows));
}
/* Give a helpful error if we failed to find any implementation */
--- 4946,4952 ----
parse->distinctClause,
NIL,
NULL,
! numDistinctRows), false);
}
/* Give a helpful error if we failed to find any implementation */
*************** create_ordered_paths(PlannerInfo *root,
*** 4927,4933 ****
path = apply_projection_to_path(root, ordered_rel,
path, target);
! add_path(ordered_rel, path);
}
}
--- 5044,5050 ----
path = apply_projection_to_path(root, ordered_rel,
path, target);
! add_path(ordered_rel, path, false);
}
}
*************** create_ordered_paths(PlannerInfo *root,
*** 4977,4983 ****
path = apply_projection_to_path(root, ordered_rel,
path, target);
! add_path(ordered_rel, path);
}
}
--- 5094,5100 ----
path = apply_projection_to_path(root, ordered_rel,
path, target);
! add_path(ordered_rel, path, false);
}
}
*************** get_partitioned_child_rels(PlannerInfo *
*** 6083,6085 ****
--- 6200,6230 ----
return result;
}
+
+ /*
+ * get_partitioned_child_rels_for_join
+ * Build and return a list containing the RTI of every partitioned
+ * relation which is a child of some rel included in the join.
+ *
+ * Note: Only call this function on joins between partitioned tables.
+ */
+ List *
+ get_partitioned_child_rels_for_join(PlannerInfo *root,
+ RelOptInfo *joinrel)
+ {
+ List *result = NIL;
+ ListCell *l;
+
+ foreach(l, root->pcinfo_list)
+ {
+ PartitionedChildRelInfo *pc = lfirst(l);
+
+ if (bms_is_member(pc->parent_relid, joinrel->relids))
+ result = list_concat(result, list_copy(pc->child_rels));
+ }
+
+ /* The root partitioned table is included as a child rel */
+ Assert(list_length(result) >= bms_num_members(joinrel->relids));
+
+ return result;
+ }
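
Note on get_partitioned_child_rels_for_join above: the following is a minimal,
self-contained sketch of the same collection step, not the planner code. It
assumes a relid set can be modelled as a 32-bit mask and a
PartitionedChildRelInfo as a plain struct; DemoChildRelInfo, demo_is_member and
demo_child_rels_for_join are hypothetical names introduced only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PartitionedChildRelInfo (one per parent). */
typedef struct
{
	unsigned	parent_relid;	/* range-table index of the parent */
	const unsigned *child_rels; /* RTIs of its partitioned children */
	size_t		nchildren;
} DemoChildRelInfo;

/* Relid set modelled as a bit mask: bit N set means RTI N is a member. */
int
demo_is_member(unsigned relid, uint32_t relids)
{
	return relid < 32 && (relids & (UINT32_C(1) << relid)) != 0;
}

/*
 * Copy into out[] the child RTIs of every parent that takes part in the
 * join.  Returns the number of entries written, at most out_cap.
 */
size_t
demo_child_rels_for_join(const DemoChildRelInfo *infos, size_t ninfos,
						 uint32_t join_relids,
						 unsigned *out, size_t out_cap)
{
	size_t		n = 0;

	assert(out != NULL || out_cap == 0);

	for (size_t i = 0; i < ninfos; i++)
	{
		/* Skip parents that are not part of this join. */
		if (!demo_is_member(infos[i].parent_relid, join_relids))
			continue;

		for (size_t j = 0; j < infos[i].nchildren && n < out_cap; j++)
			out[n++] = infos[i].child_rels[j];
	}
	return n;
}
```

As in the patch, each parent's own RTI is expected to appear in its child list,
so the result has at least as many entries as the join has member relations.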
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
new file mode 100644
index 1278371..44c3919
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
*************** typedef struct
*** 40,46 ****
--- 40,50 ----
List *tlist; /* underlying target list */
int num_vars; /* number of plain Var tlist entries */
bool has_ph_vars; /* are there PlaceHolderVar entries? */
+ bool has_grp_vars; /* are there GroupedVar entries? */
bool has_non_vars; /* are there other entries? */
+ bool has_conv_whole_rows; /* are there ConvertRowtypeExpr entries
+ * encapsulating a whole-row Var?
+ */
tlist_vinfo vars[FLEXIBLE_ARRAY_MEMBER]; /* has num_vars entries */
} indexed_tlist;
*************** static List *set_returning_clause_refere
*** 139,144 ****
--- 143,149 ----
int rtoffset);
static bool extract_query_dependencies_walker(Node *node,
PlannerInfo *context);
+ static Var *get_wholerow_ref_from_convert_row_type(Node *node);
/*****************************************************************************
*
*************** set_upper_references(PlannerInfo *root,
*** 1725,1733 ****
--- 1730,1781 ----
indexed_tlist *subplan_itlist;
List *output_targetlist;
ListCell *l;
+ List *sub_tlist_save = NIL;
+
+ if (root->grouped_var_list != NIL)
+ {
+ if (IsA(plan, Agg))
+ {
+ Agg *agg = (Agg *) plan;
+
+ if (agg->aggsplit == AGGSPLIT_FINAL_DESERIAL)
+ {
+ /*
+ * convert_combining_aggrefs could have replaced some vars
+ * with Aggref expressions representing the partial
+ * aggregation. We need to restore the same Aggrefs in the
+ * subplan targetlist, but this would break the subplan if
+ * it's something other than the partial aggregation (i.e. the
+ * partial aggregation takes place lower in the plan tree). So
+ * we'll eventually need to restore the original list.
+ */
+ if (!IsA(subplan, Agg))
+ sub_tlist_save = subplan->targetlist;
+ #ifdef USE_ASSERT_CHECKING
+ else
+ Assert(((Agg *) subplan)->aggsplit == AGGSPLIT_INITIAL_SERIAL);
+ #endif /* USE_ASSERT_CHECKING */
+
+ /*
+ * Restore the aggregate expressions that we might have
+ * removed when planning for aggregation at base relation
+ * level.
+ */
+ subplan->targetlist =
+ restore_grouping_expressions(root, subplan->targetlist);
+ }
+ }
+ }
subplan_itlist = build_tlist_index(subplan->targetlist);
+ /*
+ * The replacement of GroupedVars by Aggrefs was only needed for the index
+ * build.
+ */
+ if (sub_tlist_save != NIL)
+ subplan->targetlist = sub_tlist_save;
+
output_targetlist = NIL;
foreach(l, plan->targetlist)
{
*************** build_tlist_index(List *tlist)
*** 1937,1943 ****
--- 1985,1993 ----
itlist->tlist = tlist;
itlist->has_ph_vars = false;
+ itlist->has_grp_vars = false;
itlist->has_non_vars = false;
+ itlist->has_conv_whole_rows = false;
/* Find the Vars and fill in the index array */
vinfo = itlist->vars;
*************** build_tlist_index(List *tlist)
*** 1956,1961 ****
--- 2006,2015 ----
}
else if (tle->expr && IsA(tle->expr, PlaceHolderVar))
itlist->has_ph_vars = true;
+ else if (tle->expr && IsA(tle->expr, GroupedVar))
+ itlist->has_grp_vars = true;
+ else if (get_wholerow_ref_from_convert_row_type((Node *) tle->expr))
+ itlist->has_conv_whole_rows = true;
else
itlist->has_non_vars = true;
}
*************** build_tlist_index(List *tlist)
*** 1971,1977 ****
* This is like build_tlist_index, but we only index tlist entries that
* are Vars belonging to some rel other than the one specified. We will set
* has_ph_vars (allowing PlaceHolderVars to be matched), but not has_non_vars
! * (so nothing other than Vars and PlaceHolderVars can be matched).
*/
static indexed_tlist *
build_tlist_index_other_vars(List *tlist, Index ignore_rel)
--- 2025,2034 ----
* This is like build_tlist_index, but we only index tlist entries that
* are Vars belonging to some rel other than the one specified. We will set
* has_ph_vars (allowing PlaceHolderVars to be matched), but not has_non_vars
! * (so nothing other than Vars and PlaceHolderVars can be matched). In case of
! * DML, where this function will be used, returning lists from child relations
! * will be appended similar to a simple append relation. That does not require
! * fixing ConvertRowtypeExpr references. So, those are not considered here.
*/
static indexed_tlist *
build_tlist_index_other_vars(List *tlist, Index ignore_rel)
*************** build_tlist_index_other_vars(List *tlist
*** 1988,1993 ****
--- 2045,2051 ----
itlist->tlist = tlist;
itlist->has_ph_vars = false;
itlist->has_non_vars = false;
+ itlist->has_conv_whole_rows = false;
/* Find the desired Vars and fill in the index array */
vinfo = itlist->vars;
*************** fix_join_expr_mutator(Node *node, fix_jo
*** 2233,2238 ****
--- 2291,2321 ----
/* No referent found for Var */
elog(ERROR, "variable not found in subplan target lists");
}
+ if (IsA(node, GroupedVar))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+
+ /* See if the GroupedVar has bubbled up from a lower plan node */
+ if (context->outer_itlist && context->outer_itlist->has_grp_vars)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) gvar,
+ context->outer_itlist,
+ OUTER_VAR);
+ if (newvar)
+ return (Node *) newvar;
+ }
+ if (context->inner_itlist && context->inner_itlist->has_grp_vars)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) gvar,
+ context->inner_itlist,
+ INNER_VAR);
+ if (newvar)
+ return (Node *) newvar;
+ }
+
+ /* No referent found for GroupedVar */
+ elog(ERROR, "grouped variable not found in subplan target lists");
+ }
if (IsA(node, PlaceHolderVar))
{
PlaceHolderVar *phv = (PlaceHolderVar *) node;
*************** fix_join_expr_mutator(Node *node, fix_jo
*** 2258,2263 ****
--- 2341,2369 ----
/* If not supplied by input plans, evaluate the contained expr */
return fix_join_expr_mutator((Node *) phv->phexpr, context);
}
+ if (get_wholerow_ref_from_convert_row_type(node))
+ {
+ if (context->outer_itlist &&
+ context->outer_itlist->has_conv_whole_rows)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) node,
+ context->outer_itlist,
+ OUTER_VAR);
+
+ if (newvar)
+ return (Node *) newvar;
+ }
+ if (context->inner_itlist &&
+ context->inner_itlist->has_conv_whole_rows)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) node,
+ context->inner_itlist,
+ INNER_VAR);
+
+ if (newvar)
+ return (Node *) newvar;
+ }
+ }
if (IsA(node, Param))
return fix_param_node(context->root, (Param *) node);
/* Try matching more complex expressions too, if tlists have any */
*************** fix_upper_expr_mutator(Node *node, fix_u
*** 2364,2369 ****
--- 2470,2486 ----
/* If not supplied by input plan, evaluate the contained expr */
return fix_upper_expr_mutator((Node *) phv->phexpr, context);
}
+ if (get_wholerow_ref_from_convert_row_type(node))
+ {
+ if (context->subplan_itlist->has_conv_whole_rows)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) node,
+ context->subplan_itlist,
+ context->newvarno);
+ if (newvar)
+ return (Node *) newvar;
+ }
+ }
if (IsA(node, Param))
return fix_param_node(context->root, (Param *) node);
if (IsA(node, Aggref))
*************** fix_upper_expr_mutator(Node *node, fix_u
*** 2389,2395 ****
/* If no match, just fall through to process it normally */
}
/* Try matching more complex expressions too, if tlist has any */
! if (context->subplan_itlist->has_non_vars)
{
newvar = search_indexed_tlist_for_non_var((Expr *) node,
context->subplan_itlist,
--- 2506,2513 ----
/* If no match, just fall through to process it normally */
}
/* Try matching more complex expressions too, if tlist has any */
! if (context->subplan_itlist->has_grp_vars ||
! context->subplan_itlist->has_non_vars)
{
newvar = search_indexed_tlist_for_non_var((Expr *) node,
context->subplan_itlist,
*************** extract_query_dependencies_walker(Node *
*** 2596,2598 ****
--- 2714,2748 ----
return expression_tree_walker(node, extract_query_dependencies_walker,
(void *) context);
}
+
+ /*
+ * get_wholerow_ref_from_convert_row_type
+ * Given a node, check if it's a ConvertRowtypeExpr encapsulating a
+ * whole-row reference as implicit cast and return the whole-row
+ * reference Var if so. Otherwise return NULL. In case of multi-level
+ * partitioning, we will have as many nested ConvertRowtypeExpr as there
+ * are levels in partition hierarchy.
+ */
+ static Var *
+ get_wholerow_ref_from_convert_row_type(Node *node)
+ {
+ Var *var = NULL;
+ ConvertRowtypeExpr *convexpr;
+
+ if (!node || !IsA(node, ConvertRowtypeExpr))
+ return NULL;
+
+ /* Traverse nested ConvertRowtypeExpr's. */
+ convexpr = castNode(ConvertRowtypeExpr, node);
+ while (convexpr->convertformat == COERCE_IMPLICIT_CAST &&
+ IsA(convexpr->arg, ConvertRowtypeExpr))
+ convexpr = (ConvertRowtypeExpr *) convexpr->arg;
+
+ if (IsA(convexpr->arg, Var))
+ var = castNode(Var, convexpr->arg);
+
+ if (var && var->varattno == 0)
+ return var;
+
+ return NULL;
+ }
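
Note on get_wholerow_ref_from_convert_row_type above: the unwrapping loop can
be tried in isolation with the sketch below. It is a simplified model under
stated assumptions, not the setrefs.c code: DemoNode, DemoTag and
demo_wholerow_from_convert are hypothetical names, the "implicit" flag stands
in for convertformat == COERCE_IMPLICIT_CAST, and varattno == 0 marks a
whole-row reference as in the patch.

```c
#include <assert.h>
#include <stddef.h>

typedef enum
{
	DEMO_VAR,					/* stands in for Var */
	DEMO_CONVERT,				/* stands in for ConvertRowtypeExpr */
	DEMO_OTHER
} DemoTag;

typedef struct DemoNode
{
	DemoTag		tag;
	int			varattno;		/* DEMO_VAR only: 0 means whole-row */
	int			implicit;		/* DEMO_CONVERT only: implicit cast? */
	struct DemoNode *arg;		/* DEMO_CONVERT only: wrapped node */
} DemoNode;

/*
 * Return the whole-row variable wrapped by one or more conversion nodes,
 * or NULL if the node does not have that shape.
 */
const DemoNode *
demo_wholerow_from_convert(const DemoNode *node)
{
	if (node == NULL || node->tag != DEMO_CONVERT)
		return NULL;

	/* One nested conversion per level of the partition hierarchy. */
	while (node->implicit && node->arg != NULL &&
		   node->arg->tag == DEMO_CONVERT)
		node = node->arg;

	if (node->arg != NULL && node->arg->tag == DEMO_VAR &&
		node->arg->varattno == 0)
		return node->arg;

	return NULL;
}
```

A two-level wrapper around a whole-row variable yields that variable, while a
wrapper around an ordinary column, or a bare variable, yields NULL.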
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
new file mode 100644
index a1be858..8bdaa44
*** a/src/backend/optimizer/prep/prepunion.c
--- b/src/backend/optimizer/prep/prepunion.c
***************
*** 55,61 ****
typedef struct
{
PlannerInfo *root;
! AppendRelInfo *appinfo;
} adjust_appendrel_attrs_context;
static Path *recurse_set_operations(Node *setOp, PlannerInfo *root,
--- 55,62 ----
typedef struct
{
PlannerInfo *root;
! int nappinfos;
! AppendRelInfo **appinfos;
} adjust_appendrel_attrs_context;
static Path *recurse_set_operations(Node *setOp, PlannerInfo *root,
*************** static List *generate_append_tlist(List
*** 97,103 ****
List *input_tlists,
List *refnames_tlist);
static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
! static void expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte,
Index rti);
static void make_inh_translation_list(Relation oldrelation,
Relation newrelation,
--- 98,104 ----
List *input_tlists,
List *refnames_tlist);
static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
! static List *expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte,
Index rti);
static void make_inh_translation_list(Relation oldrelation,
Relation newrelation,
*************** static Bitmapset *translate_col_privs(co
*** 107,113 ****
List *translated_vars);
static Node *adjust_appendrel_attrs_mutator(Node *node,
adjust_appendrel_attrs_context *context);
- static Relids adjust_relid_set(Relids relids, Index oldrelid, Index newrelid);
static List *adjust_inherited_tlist(List *tlist,
AppendRelInfo *context);
--- 108,113 ----
*************** plan_set_operations(PlannerInfo *root)
*** 207,213 ****
root->processed_tlist = top_tlist;
/* Add only the final path to the SETOP upperrel. */
! add_path(setop_rel, path);
/* Let extensions possibly add some more paths */
if (create_upper_paths_hook)
--- 207,213 ----
root->processed_tlist = top_tlist;
/* Add only the final path to the SETOP upperrel. */
! add_path(setop_rel, path, false);
/* Let extensions possibly add some more paths */
if (create_upper_paths_hook)
*************** expand_inherited_tables(PlannerInfo *roo
*** 1330,1348 ****
Index nrtes;
Index rti;
ListCell *rl;
/*
* expand_inherited_rtentry may add RTEs to parse->rtable; there is no
* need to scan them since they can't have inh=true. So just scan as far
* as the original end of the rtable list.
*/
! nrtes = list_length(root->parse->rtable);
! rl = list_head(root->parse->rtable);
for (rti = 1; rti <= nrtes; rti++)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl);
! expand_inherited_rtentry(root, rte, rti);
rl = lnext(rl);
}
}
--- 1330,1351 ----
Index nrtes;
Index rti;
ListCell *rl;
+ Query *parse = root->parse;
/*
* expand_inherited_rtentry may add RTEs to parse->rtable; there is no
* need to scan them since they can't have inh=true. So just scan as far
* as the original end of the rtable list.
*/
! nrtes = list_length(parse->rtable);
! rl = list_head(parse->rtable);
for (rti = 1; rti <= nrtes; rti++)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl);
+ List *appinfos;
! appinfos = expand_inherited_rtentry(root, rte, rti);
! root->append_rel_list = list_concat(root->append_rel_list, appinfos);
rl = lnext(rl);
}
}
*************** expand_inherited_tables(PlannerInfo *roo
*** 1362,1369 ****
*
* A childless table is never considered to be an inheritance set; therefore
* a parent RTE must always have at least two associated AppendRelInfos.
*/
! static void
expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
{
Query *parse = root->parse;
--- 1365,1374 ----
*
* A childless table is never considered to be an inheritance set; therefore
* a parent RTE must always have at least two associated AppendRelInfos.
+ *
+ * Returns a list of AppendRelInfos, or NIL.
*/
! static List *
expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
{
Query *parse = root->parse;
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1380,1391 ****
/* Does RT entry allow inheritance? */
if (!rte->inh)
! return;
/* Ignore any already-expanded UNION ALL nodes */
if (rte->rtekind != RTE_RELATION)
{
Assert(rte->rtekind == RTE_SUBQUERY);
! return;
}
/* Fast path for common case of childless table */
parentOID = rte->relid;
--- 1385,1396 ----
/* Does RT entry allow inheritance? */
if (!rte->inh)
! return NIL;
/* Ignore any already-expanded UNION ALL nodes */
if (rte->rtekind != RTE_RELATION)
{
Assert(rte->rtekind == RTE_SUBQUERY);
! return NIL;
}
/* Fast path for common case of childless table */
parentOID = rte->relid;
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1393,1399 ****
{
/* Clear flag before returning */
rte->inh = false;
! return;
}
/*
--- 1398,1404 ----
{
/* Clear flag before returning */
rte->inh = false;
! return NIL;
}
/*
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1417,1424 ****
else
lockmode = AccessShareLock;
! /* Scan for all members of inheritance set, acquire needed locks */
! inhOIDs = find_all_inheritors(parentOID, lockmode, NULL);
/*
* Check that there's at least one descendant, else treat as no-child
--- 1422,1440 ----
else
lockmode = AccessShareLock;
! /*
! * Expand partitioned table level-wise to help optimizations like
! * partition-wise join which match partitions at every level. Otherwise,
! * scan for all members of inheritance set. Acquire needed locks.
! */
! if (rte->relkind == RELKIND_PARTITIONED_TABLE)
! {
! inhOIDs = list_make1_oid(parentOID);
! inhOIDs = list_concat(inhOIDs,
! find_inheritance_children(parentOID, lockmode));
! }
! else
! inhOIDs = find_all_inheritors(parentOID, lockmode, NULL);
/*
* Check that there's at least one descendant, else treat as no-child
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1429,1435 ****
{
/* Clear flag before returning */
rte->inh = false;
! return;
}
/*
--- 1445,1451 ----
{
/* Clear flag before returning */
rte->inh = false;
! return NIL;
}
/*
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1457,1462 ****
--- 1473,1484 ----
Index childRTindex;
AppendRelInfo *appinfo;
+ /*
+ * If this child is a partitioned table, this contains AppendRelInfos
+ * for its own children.
+ */
+ List *myappinfos;
+
/* Open rel if needed; we already have required locks */
if (childOID != parentOID)
newrelation = heap_open(childOID, NoLock);
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1490,1496 ****
childrte = copyObject(rte);
childrte->relid = childOID;
childrte->relkind = newrelation->rd_rel->relkind;
! childrte->inh = false;
childrte->requiredPerms = 0;
childrte->securityQuals = NIL;
parse->rtable = lappend(parse->rtable, childrte);
--- 1512,1523 ----
childrte = copyObject(rte);
childrte->relid = childOID;
childrte->relkind = newrelation->rd_rel->relkind;
! /* A partitioned child will need to be expanded further. */
! if (childOID != parentOID &&
! childrte->relkind == RELKIND_PARTITIONED_TABLE)
! childrte->inh = true;
! else
! childrte->inh = false;
childrte->requiredPerms = 0;
childrte->securityQuals = NIL;
parse->rtable = lappend(parse->rtable, childrte);
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1498,1506 ****
/*
* Build an AppendRelInfo for this parent and child, unless the child
! * is a partitioned table.
*/
! if (childrte->relkind != RELKIND_PARTITIONED_TABLE)
{
need_append = true;
appinfo = makeNode(AppendRelInfo);
--- 1525,1533 ----
/*
* Build an AppendRelInfo for this parent and child, unless the child
! * RTE simply duplicates the parent *partitioned* table.
*/
! if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)
{
need_append = true;
appinfo = makeNode(AppendRelInfo);
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1570,1575 ****
--- 1597,1610 ----
/* Close child relations, but keep locks */
if (childOID != parentOID)
heap_close(newrelation, NoLock);
+
+ /* Expand partitioned children recursively. */
+ if (childrte->inh)
+ {
+ myappinfos = expand_inherited_rtentry(root, childrte,
+ childRTindex);
+ appinfos = list_concat(appinfos, myappinfos);
+ }
}
heap_close(oldrelation, NoLock);
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1585,1591 ****
{
/* Clear flag before returning */
rte->inh = false;
! return;
}
/*
--- 1620,1626 ----
{
/* Clear flag before returning */
rte->inh = false;
! return NIL;
}
/*
*************** expand_inherited_rtentry(PlannerInfo *ro
*** 1606,1613 ****
root->pcinfo_list = lappend(root->pcinfo_list, pcinfo);
}
! /* Otherwise, OK to add to root->append_rel_list */
! root->append_rel_list = list_concat(root->append_rel_list, appinfos);
}
/*
--- 1641,1648 ----
root->pcinfo_list = lappend(root->pcinfo_list, pcinfo);
}
! /* The following will be concatenated to root->append_rel_list. */
! return appinfos;
}
/*
*************** translate_col_privs(const Bitmapset *par
*** 1767,1776 ****
/*
* adjust_appendrel_attrs
! * Copy the specified query or expression and translate Vars referring
! * to the parent rel of the specified AppendRelInfo to refer to the
! * child rel instead. We also update rtindexes appearing outside Vars,
! * such as resultRelation and jointree relids.
*
* Note: this is only applied after conversion of sublinks to subplans,
* so we don't need to cope with recursion into sub-queries.
--- 1802,1812 ----
/*
* adjust_appendrel_attrs
! * Copy the specified query or expression and translate Vars referring to
! * the parent rels of the child rels specified in the given list of
! * AppendRelInfos to refer to the corresponding child rel instead. We also
! * update rtindexes appearing outside Vars, such as resultRelation and
! * jointree relids.
*
* Note: this is only applied after conversion of sublinks to subplans,
* so we don't need to cope with recursion into sub-queries.
*************** translate_col_privs(const Bitmapset *par
*** 1779,1791 ****
* maybe we should try to fold the two routines together.
*/
Node *
! adjust_appendrel_attrs(PlannerInfo *root, Node *node, AppendRelInfo *appinfo)
{
Node *result;
adjust_appendrel_attrs_context context;
context.root = root;
! context.appinfo = appinfo;
/*
* Must be prepared to start with a Query or a bare expression tree.
--- 1815,1835 ----
* maybe we should try to fold the two routines together.
*/
Node *
! adjust_appendrel_attrs(PlannerInfo *root, Node *node, int nappinfos,
! AppendRelInfo **appinfos)
{
Node *result;
adjust_appendrel_attrs_context context;
context.root = root;
! context.nappinfos = nappinfos;
! context.appinfos = appinfos;
!
! /*
! * Catch a caller who wants to adjust expressions, but doesn't pass any
! * AppendRelInfo.
! */
! Assert(appinfos && nappinfos >= 1);
/*
* Must be prepared to start with a Query or a bare expression tree.
*************** adjust_appendrel_attrs(PlannerInfo *root
*** 1793,1812 ****
if (node && IsA(node, Query))
{
Query *newnode;
newnode = query_tree_mutator((Query *) node,
adjust_appendrel_attrs_mutator,
(void *) &context,
QTW_IGNORE_RC_SUBQUERIES);
! if (newnode->resultRelation == appinfo->parent_relid)
{
! newnode->resultRelation = appinfo->child_relid;
! /* Fix tlist resnos too, if it's inherited UPDATE */
! if (newnode->commandType == CMD_UPDATE)
! newnode->targetList =
! adjust_inherited_tlist(newnode->targetList,
! appinfo);
}
result = (Node *) newnode;
}
else
--- 1837,1864 ----
if (node && IsA(node, Query))
{
Query *newnode;
+ int cnt;
newnode = query_tree_mutator((Query *) node,
adjust_appendrel_attrs_mutator,
(void *) &context,
QTW_IGNORE_RC_SUBQUERIES);
! for (cnt = 0; cnt < nappinfos; cnt++)
{
! AppendRelInfo *appinfo = appinfos[cnt];
!
! if (newnode->resultRelation == appinfo->parent_relid)
! {
! newnode->resultRelation = appinfo->child_relid;
! /* Fix tlist resnos too, if it's inherited UPDATE */
! if (newnode->commandType == CMD_UPDATE)
! newnode->targetList =
! adjust_inherited_tlist(newnode->targetList,
! appinfo);
! break;
! }
}
+
result = (Node *) newnode;
}
else
*************** static Node *
*** 1819,1831 ****
adjust_appendrel_attrs_mutator(Node *node,
adjust_appendrel_attrs_context *context)
{
! AppendRelInfo *appinfo = context->appinfo;
if (node == NULL)
return NULL;
if (IsA(node, Var))
{
Var *var = (Var *) copyObject(node);
if (var->varlevelsup == 0 &&
var->varno == appinfo->parent_relid)
--- 1871,1900 ----
adjust_appendrel_attrs_mutator(Node *node,
adjust_appendrel_attrs_context *context)
{
! AppendRelInfo **appinfos = context->appinfos;
! int nappinfos = context->nappinfos;
! int cnt;
!
! /*
! * Catch a caller who wants to adjust expressions, but doesn't pass any
! * AppendRelInfo.
! */
! Assert(appinfos && nappinfos >= 1);
if (node == NULL)
return NULL;
if (IsA(node, Var))
{
Var *var = (Var *) copyObject(node);
+ AppendRelInfo *appinfo = NULL;
+
+ for (cnt = 0; cnt < nappinfos; cnt++)
+ {
+ appinfo = appinfos[cnt];
+
+ if (var->varno == appinfo->parent_relid)
+ break;
+ }
if (var->varlevelsup == 0 &&
var->varno == appinfo->parent_relid)
*************** adjust_appendrel_attrs_mutator(Node *nod
*** 1908,1936 ****
{
CurrentOfExpr *cexpr = (CurrentOfExpr *) copyObject(node);
! if (cexpr->cvarno == appinfo->parent_relid)
! cexpr->cvarno = appinfo->child_relid;
return (Node *) cexpr;
}
if (IsA(node, RangeTblRef))
{
RangeTblRef *rtr = (RangeTblRef *) copyObject(node);
! if (rtr->rtindex == appinfo->parent_relid)
! rtr->rtindex = appinfo->child_relid;
return (Node *) rtr;
}
if (IsA(node, JoinExpr))
{
/* Copy the JoinExpr node with correct mutation of subnodes */
JoinExpr *j;
j = (JoinExpr *) expression_tree_mutator(node,
adjust_appendrel_attrs_mutator,
(void *) context);
/* now fix JoinExpr's rtindex (probably never happens) */
! if (j->rtindex == appinfo->parent_relid)
! j->rtindex = appinfo->child_relid;
return (Node *) j;
}
if (IsA(node, PlaceHolderVar))
--- 1977,2030 ----
{
CurrentOfExpr *cexpr = (CurrentOfExpr *) copyObject(node);
! for (cnt = 0; cnt < nappinfos; cnt++)
! {
! AppendRelInfo *appinfo = appinfos[cnt];
!
! if (cexpr->cvarno == appinfo->parent_relid)
! {
! cexpr->cvarno = appinfo->child_relid;
! break;
! }
! }
return (Node *) cexpr;
}
if (IsA(node, RangeTblRef))
{
RangeTblRef *rtr = (RangeTblRef *) copyObject(node);
! for (cnt = 0; cnt < nappinfos; cnt++)
! {
! AppendRelInfo *appinfo = appinfos[cnt];
!
! if (rtr->rtindex == appinfo->parent_relid)
! {
! rtr->rtindex = appinfo->child_relid;
! break;
! }
! }
return (Node *) rtr;
}
if (IsA(node, JoinExpr))
{
/* Copy the JoinExpr node with correct mutation of subnodes */
JoinExpr *j;
+ AppendRelInfo *appinfo;
j = (JoinExpr *) expression_tree_mutator(node,
adjust_appendrel_attrs_mutator,
(void *) context);
/* now fix JoinExpr's rtindex (probably never happens) */
! for (cnt = 0; cnt < nappinfos; cnt++)
! {
! appinfo = appinfos[cnt];
!
! if (j->rtindex == appinfo->parent_relid)
! {
! j->rtindex = appinfo->child_relid;
! break;
! }
! }
return (Node *) j;
}
if (IsA(node, PlaceHolderVar))
*************** adjust_appendrel_attrs_mutator(Node *nod
*** 1943,1951 ****
(void *) context);
/* now fix PlaceHolderVar's relid sets */
if (phv->phlevelsup == 0)
! phv->phrels = adjust_relid_set(phv->phrels,
! appinfo->parent_relid,
! appinfo->child_relid);
return (Node *) phv;
}
/* Shouldn't need to handle planner auxiliary nodes here */
--- 2037,2044 ----
(void *) context);
/* now fix PlaceHolderVar's relid sets */
if (phv->phlevelsup == 0)
! phv->phrels = adjust_child_relids(phv->phrels, context->nappinfos,
! context->appinfos);
return (Node *) phv;
}
/* Shouldn't need to handle planner auxiliary nodes here */
*************** adjust_appendrel_attrs_mutator(Node *nod
*** 1976,1999 ****
adjust_appendrel_attrs_mutator((Node *) oldinfo->orclause, context);
/* adjust relid sets too */
! newinfo->clause_relids = adjust_relid_set(oldinfo->clause_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
! newinfo->required_relids = adjust_relid_set(oldinfo->required_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
! newinfo->outer_relids = adjust_relid_set(oldinfo->outer_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
! newinfo->nullable_relids = adjust_relid_set(oldinfo->nullable_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
! newinfo->left_relids = adjust_relid_set(oldinfo->left_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
! newinfo->right_relids = adjust_relid_set(oldinfo->right_relids,
! appinfo->parent_relid,
! appinfo->child_relid);
/*
* Reset cached derivative fields, since these might need to have
--- 2069,2092 ----
adjust_appendrel_attrs_mutator((Node *) oldinfo->orclause, context);
/* adjust relid sets too */
! newinfo->clause_relids = adjust_child_relids(oldinfo->clause_relids,
! context->nappinfos,
! context->appinfos);
! newinfo->required_relids = adjust_child_relids(oldinfo->required_relids,
! context->nappinfos,
! context->appinfos);
! newinfo->outer_relids = adjust_child_relids(oldinfo->outer_relids,
! context->nappinfos,
! context->appinfos);
! newinfo->nullable_relids = adjust_child_relids(oldinfo->nullable_relids,
! context->nappinfos,
! context->appinfos);
! newinfo->left_relids = adjust_child_relids(oldinfo->left_relids,
! context->nappinfos,
! context->appinfos);
! newinfo->right_relids = adjust_child_relids(oldinfo->right_relids,
! context->nappinfos,
! context->appinfos);
/*
* Reset cached derivative fields, since these might need to have
*************** adjust_appendrel_attrs_mutator(Node *nod
*** 2025,2047 ****
}
/*
! * Substitute newrelid for oldrelid in a Relid set
*/
! static Relids
! adjust_relid_set(Relids relids, Index oldrelid, Index newrelid)
{
! if (bms_is_member(oldrelid, relids))
{
! /* Ensure we have a modifiable copy */
! relids = bms_copy(relids);
! /* Remove old, add new */
! relids = bms_del_member(relids, oldrelid);
! relids = bms_add_member(relids, newrelid);
}
return relids;
}
/*
* Adjust the targetlist entries of an inherited UPDATE operation
*
* The expressions have already been fixed, but we have to make sure that
--- 2118,2212 ----
}
/*
! * Replace parent relids with child relids in a copy of the given relid set,
! * according to the given array of AppendRelInfos. If the set contains none of
! * the parents, it is returned as is; otherwise a modified copy is returned and
! * the given set is left unchanged.
*/
! Relids
! adjust_child_relids(Relids relids, int nappinfos, AppendRelInfo **appinfos)
{
! Bitmapset *result = NULL;
! int cnt;
!
! for (cnt = 0; cnt < nappinfos; cnt++)
{
! AppendRelInfo *appinfo = appinfos[cnt];
!
! /* Remove parent, add child */
! if (bms_is_member(appinfo->parent_relid, relids))
! {
! /* Make a copy if we are changing the set. */
! if (!result)
! result = bms_copy(relids);
!
! result = bms_del_member(result, appinfo->parent_relid);
! result = bms_add_member(result, appinfo->child_relid);
! }
}
+
+ /* Return new set if we modified the given set. */
+ if (result)
+ return result;
+
+ /* Else return the given relids set as is. */
return relids;
}
/*
+ * Replace any relid present in top_parent_relids with its child in
+ * child_relids. Members of child_relids can be multiple levels below top
+ * parent in the partition hierarchy.
+ */
+ Relids
+ adjust_child_relids_multilevel(PlannerInfo *root, Relids relids,
+ Relids child_relids, Relids top_parent_relids)
+ {
+ AppendRelInfo **appinfos;
+ int nappinfos;
+ Relids parent_relids = NULL;
+ Relids result;
+ Relids tmp_result = NULL;
+ int cnt;
+
+ /*
+ * If the given relids set doesn't contain any of the top parent relids,
+ * it will remain unchanged.
+ */
+ if (!bms_overlap(relids, top_parent_relids))
+ return relids;
+
+ appinfos = find_appinfos_by_relids(root, child_relids, &nappinfos);
+
+ /* Construct relids set for the immediate parent of the given child. */
+ for (cnt = 0; cnt < nappinfos; cnt++)
+ {
+ AppendRelInfo *appinfo = appinfos[cnt];
+
+ parent_relids = bms_add_member(parent_relids, appinfo->parent_relid);
+ }
+
+ /* Recurse if immediate parent is not the top parent. */
+ if (!bms_equal(parent_relids, top_parent_relids))
+ {
+ tmp_result = adjust_child_relids_multilevel(root, relids,
+ parent_relids,
+ top_parent_relids);
+ relids = tmp_result;
+ }
+
+ result = adjust_child_relids(relids, nappinfos, appinfos);
+
+ /* Free memory consumed by any intermediate result. */
+ if (tmp_result)
+ bms_free(tmp_result);
+ bms_free(parent_relids);
+ pfree(appinfos);
+
+ return result;
+ }
+
+ /*
* Adjust the targetlist entries of an inherited UPDATE operation
*
* The expressions have already been fixed, but we have to make sure that
*************** adjust_inherited_tlist(List *tlist, Appe
*** 2142,2162 ****
* adjust_appendrel_attrs_multilevel
* Apply Var translations from a toplevel appendrel parent down to a child.
*
! * In some cases we need to translate expressions referencing a baserel
* to reference an appendrel child that's multiple levels removed from it.
*/
Node *
adjust_appendrel_attrs_multilevel(PlannerInfo *root, Node *node,
! RelOptInfo *child_rel)
{
! AppendRelInfo *appinfo = find_childrel_appendrelinfo(root, child_rel);
! RelOptInfo *parent_rel = find_base_rel(root, appinfo->parent_relid);
- /* If parent is also a child, first recurse to apply its translations */
- if (IS_OTHER_REL(parent_rel))
- node = adjust_appendrel_attrs_multilevel(root, node, parent_rel);
- else
- Assert(parent_rel->reloptkind == RELOPT_BASEREL);
/* Now translate for this child */
! return adjust_appendrel_attrs(root, node, appinfo);
}
--- 2307,2432 ----
* adjust_appendrel_attrs_multilevel
* Apply Var translations from a toplevel appendrel parent down to a child.
*
! * In some cases we need to translate expressions referencing a parent relation
* to reference an appendrel child that's multiple levels removed from it.
*/
Node *
adjust_appendrel_attrs_multilevel(PlannerInfo *root, Node *node,
! Relids child_relids,
! Relids top_parent_relids)
{
! AppendRelInfo **appinfos;
! Bitmapset *parent_relids = NULL;
! int nappinfos;
! int cnt;
!
! Assert(bms_num_members(child_relids) == bms_num_members(top_parent_relids));
!
! appinfos = find_appinfos_by_relids(root, child_relids, &nappinfos);
!
! /* Construct relids set for the immediate parent of given child. */
! for (cnt = 0; cnt < nappinfos; cnt++)
! {
! AppendRelInfo *appinfo = appinfos[cnt];
!
! parent_relids = bms_add_member(parent_relids, appinfo->parent_relid);
! }
!
! /* Recurse if immediate parent is not the top parent. */
! if (!bms_equal(parent_relids, top_parent_relids))
! node = adjust_appendrel_attrs_multilevel(root, node, parent_relids,
! top_parent_relids);
/* Now translate for this child */
! node = adjust_appendrel_attrs(root, node, nappinfos, appinfos);
!
! pfree(appinfos);
!
! return node;
! }
!
! /*
! * Construct the SpecialJoinInfo for a child-join by translating
! * SpecialJoinInfo for the join between parents. left_relids and right_relids
! * are the relids of left and right side of the join respectively.
! */
! SpecialJoinInfo *
! build_child_join_sjinfo(PlannerInfo *root, SpecialJoinInfo *parent_sjinfo,
! Relids left_relids, Relids right_relids)
! {
! SpecialJoinInfo *sjinfo = makeNode(SpecialJoinInfo);
! AppendRelInfo **left_appinfos;
! int left_nappinfos;
! AppendRelInfo **right_appinfos;
! int right_nappinfos;
!
! memcpy(sjinfo, parent_sjinfo, sizeof(SpecialJoinInfo));
! left_appinfos = find_appinfos_by_relids(root, left_relids,
! &left_nappinfos);
! right_appinfos = find_appinfos_by_relids(root, right_relids,
! &right_nappinfos);
!
! sjinfo->min_lefthand = adjust_child_relids(sjinfo->min_lefthand,
! left_nappinfos, left_appinfos);
! sjinfo->min_righthand = adjust_child_relids(sjinfo->min_righthand,
! right_nappinfos,
! right_appinfos);
! sjinfo->syn_lefthand = adjust_child_relids(sjinfo->syn_lefthand,
! left_nappinfos, left_appinfos);
! sjinfo->syn_righthand = adjust_child_relids(sjinfo->syn_righthand,
! right_nappinfos,
! right_appinfos);
!
! /*
! * Replace the Var nodes of parent with those of children in expressions.
! * This function may be called within a temporary context, but the
! * expressions will be shallow-copied into the plan. Hence copy those in
! * the planner's context.
! */
! sjinfo->semi_rhs_exprs = (List *) adjust_appendrel_attrs(root,
! (Node *) sjinfo->semi_rhs_exprs,
! right_nappinfos,
! right_appinfos);
!
! pfree(left_appinfos);
! pfree(right_appinfos);
!
! return sjinfo;
! }
!
! /*
! * find_appinfos_by_relids
! * Find AppendRelInfo structures for all relations specified by relids.
! *
! * The AppendRelInfos are returned in an array, which can be pfree'd by the
! * caller.
! */
! AppendRelInfo **
! find_appinfos_by_relids(PlannerInfo *root, Relids relids, int *nappinfos)
! {
! ListCell *lc;
! AppendRelInfo **appinfos;
! int cnt = 0;
!
! *nappinfos = bms_num_members(relids);
! appinfos = (AppendRelInfo **) palloc(sizeof(AppendRelInfo *) * *nappinfos);
!
! foreach (lc, root->append_rel_list)
! {
! AppendRelInfo *appinfo = lfirst(lc);
!
! if (bms_is_member(appinfo->child_relid, relids))
! {
! appinfos[cnt] = appinfo;
! cnt++;
!
! /* Stop when we have gathered all the AppendRelInfos. */
! if (cnt == *nappinfos)
! return appinfos;
! }
! }
!
! /* Should have found the entries ... */
! elog(ERROR, "Did not find one or more of requested child rels in append_rel_list");
! return NULL; /* not reached */
}
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
new file mode 100644
index 2d5caae..79a000b
*** a/src/backend/optimizer/util/pathnode.c
--- b/src/backend/optimizer/util/pathnode.c
***************
*** 18,32 ****
--- 18,39 ----
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
+ #include "nodes/extensible.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
+ #include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+ #include "optimizer/tlist.h"
+ /* TODO Remove this if get_grouping_expressions ends up in another module. */
+ #include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "parser/parsetree.h"
+ #include "foreign/fdwapi.h"
#include "utils/lsyscache.h"
+ #include "utils/memutils.h"
#include "utils/selfuncs.h"
*************** set_cheapest(RelOptInfo *parent_rel)
*** 409,416 ****
* Returns nothing, but modifies parent_rel->pathlist.
*/
void
! add_path(RelOptInfo *parent_rel, Path *new_path)
{
bool accept_new = true; /* unless we find a superior old path */
ListCell *insert_after = NULL; /* where to insert new item */
List *new_path_pathkeys;
--- 416,424 ----
* Returns nothing, but modifies parent_rel->pathlist.
*/
void
! add_path(RelOptInfo *parent_rel, Path *new_path, bool grouped)
{
+ List *pathlist;
bool accept_new = true; /* unless we find a superior old path */
ListCell *insert_after = NULL; /* where to insert new item */
List *new_path_pathkeys;
*************** add_path(RelOptInfo *parent_rel, Path *n
*** 427,432 ****
--- 435,448 ----
/* Pretend parameterized paths have no pathkeys, per comment above */
new_path_pathkeys = new_path->param_info ? NIL : new_path->pathkeys;
+ if (!grouped)
+ pathlist = parent_rel->pathlist;
+ else
+ {
+ Assert(parent_rel->gpi != NULL);
+ pathlist = parent_rel->gpi->pathlist;
+ }
+
/*
* Loop to check proposed new path against old paths. Note it is possible
* for more than one old path to be tossed out because new_path dominates
*************** add_path(RelOptInfo *parent_rel, Path *n
*** 436,442 ****
* list cell.
*/
p1_prev = NULL;
! for (p1 = list_head(parent_rel->pathlist); p1 != NULL; p1 = p1_next)
{
Path *old_path = (Path *) lfirst(p1);
bool remove_old = false; /* unless new proves superior */
--- 452,458 ----
* list cell.
*/
p1_prev = NULL;
! for (p1 = list_head(pathlist); p1 != NULL; p1 = p1_next)
{
Path *old_path = (Path *) lfirst(p1);
bool remove_old = false; /* unless new proves superior */
*************** add_path(RelOptInfo *parent_rel, Path *n
*** 582,589 ****
*/
if (remove_old)
{
! parent_rel->pathlist = list_delete_cell(parent_rel->pathlist,
! p1, p1_prev);
/*
* Delete the data pointed-to by the deleted cell, if possible
--- 598,604 ----
*/
if (remove_old)
{
! pathlist = list_delete_cell(pathlist, p1, p1_prev);
/*
* Delete the data pointed-to by the deleted cell, if possible
*************** add_path(RelOptInfo *parent_rel, Path *n
*** 614,622 ****
{
/* Accept the new path: insert it at proper place in pathlist */
if (insert_after)
! lappend_cell(parent_rel->pathlist, insert_after, new_path);
else
! parent_rel->pathlist = lcons(new_path, parent_rel->pathlist);
}
else
{
--- 629,642 ----
{
/* Accept the new path: insert it at proper place in pathlist */
if (insert_after)
! lappend_cell(pathlist, insert_after, new_path);
else
! pathlist = lcons(new_path, pathlist);
!
! if (!grouped)
! parent_rel->pathlist = pathlist;
! else
! parent_rel->gpi->pathlist = pathlist;
}
else
{
*************** add_path(RelOptInfo *parent_rel, Path *n
*** 646,653 ****
bool
add_path_precheck(RelOptInfo *parent_rel,
Cost startup_cost, Cost total_cost,
! List *pathkeys, Relids required_outer)
{
List *new_path_pathkeys;
bool consider_startup;
ListCell *p1;
--- 666,674 ----
bool
add_path_precheck(RelOptInfo *parent_rel,
Cost startup_cost, Cost total_cost,
! List *pathkeys, Relids required_outer, bool grouped)
{
+ List *pathlist;
List *new_path_pathkeys;
bool consider_startup;
ListCell *p1;
*************** add_path_precheck(RelOptInfo *parent_rel
*** 656,664 ****
new_path_pathkeys = required_outer ? NIL : pathkeys;
/* Decide whether new path's startup cost is interesting */
! consider_startup = required_outer ? parent_rel->consider_param_startup : parent_rel->consider_startup;
! foreach(p1, parent_rel->pathlist)
{
Path *old_path = (Path *) lfirst(p1);
PathKeysComparison keyscmp;
--- 677,694 ----
new_path_pathkeys = required_outer ? NIL : pathkeys;
/* Decide whether new path's startup cost is interesting */
! consider_startup = required_outer ? parent_rel->consider_param_startup :
! parent_rel->consider_startup;
! if (!grouped)
! pathlist = parent_rel->pathlist;
! else
! {
! Assert(parent_rel->gpi != NULL);
! pathlist = parent_rel->gpi->pathlist;
! }
!
! foreach(p1, pathlist)
{
Path *old_path = (Path *) lfirst(p1);
PathKeysComparison keyscmp;
*************** add_path_precheck(RelOptInfo *parent_rel
*** 749,771 ****
* referenced by partial BitmapHeapPaths.
*/
void
! add_partial_path(RelOptInfo *parent_rel, Path *new_path)
{
bool accept_new = true; /* unless we find a superior old path */
ListCell *insert_after = NULL; /* where to insert new item */
ListCell *p1;
ListCell *p1_prev;
ListCell *p1_next;
/* Check for query cancel. */
CHECK_FOR_INTERRUPTS();
/*
* As in add_path, throw out any paths which are dominated by the new
* path, but throw out the new path if some existing path dominates it.
*/
p1_prev = NULL;
! for (p1 = list_head(parent_rel->partial_pathlist); p1 != NULL;
p1 = p1_next)
{
Path *old_path = (Path *) lfirst(p1);
--- 779,810 ----
* referenced by partial BitmapHeapPaths.
*/
void
! add_partial_path(RelOptInfo *parent_rel, Path *new_path, bool grouped)
{
bool accept_new = true; /* unless we find a superior old path */
ListCell *insert_after = NULL; /* where to insert new item */
ListCell *p1;
ListCell *p1_prev;
ListCell *p1_next;
+ List *pathlist;
/* Check for query cancel. */
CHECK_FOR_INTERRUPTS();
+ if (!grouped)
+ pathlist = parent_rel->partial_pathlist;
+ else
+ {
+ Assert(parent_rel->gpi != NULL);
+ pathlist = parent_rel->gpi->partial_pathlist;
+ }
+
/*
* As in add_path, throw out any paths which are dominated by the new
* path, but throw out the new path if some existing path dominates it.
*/
p1_prev = NULL;
! for (p1 = list_head(pathlist); p1 != NULL;
p1 = p1_next)
{
Path *old_path = (Path *) lfirst(p1);
*************** add_partial_path(RelOptInfo *parent_rel,
*** 819,830 ****
}
/*
! * Remove current element from partial_pathlist if dominated by new.
*/
if (remove_old)
{
! parent_rel->partial_pathlist =
! list_delete_cell(parent_rel->partial_pathlist, p1, p1_prev);
pfree(old_path);
/* p1_prev does not advance */
}
--- 858,868 ----
}
/*
! * Remove current element from pathlist if dominated by new.
*/
if (remove_old)
{
! pathlist = list_delete_cell(pathlist, p1, p1_prev);
pfree(old_path);
/* p1_prev does not advance */
}
*************** add_partial_path(RelOptInfo *parent_rel,
*** 839,845 ****
/*
* If we found an old path that dominates new_path, we can quit
! * scanning the partial_pathlist; we will not add new_path, and we
* assume new_path cannot dominate any later path.
*/
if (!accept_new)
--- 877,883 ----
/*
* If we found an old path that dominates new_path, we can quit
! * scanning the pathlist; we will not add new_path, and we
* assume new_path cannot dominate any later path.
*/
if (!accept_new)
*************** add_partial_path(RelOptInfo *parent_rel,
*** 850,859 ****
{
/* Accept the new path: insert it at proper place */
if (insert_after)
! lappend_cell(parent_rel->partial_pathlist, insert_after, new_path);
else
! parent_rel->partial_pathlist =
! lcons(new_path, parent_rel->partial_pathlist);
}
else
{
--- 888,901 ----
{
/* Accept the new path: insert it at proper place */
if (insert_after)
! lappend_cell(pathlist, insert_after, new_path);
else
! pathlist = lcons(new_path, pathlist);
!
! if (!grouped)
! parent_rel->partial_pathlist = pathlist;
! else
! parent_rel->gpi->partial_pathlist = pathlist;
}
else
{
*************** add_partial_path(RelOptInfo *parent_rel,
*** 874,882 ****
*/
bool
add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
! List *pathkeys)
{
ListCell *p1;
/*
* Our goal here is twofold. First, we want to find out whether this path
--- 916,933 ----
*/
bool
add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
! List *pathkeys, bool grouped)
{
ListCell *p1;
+ List *pathlist;
+
+ if (!grouped)
+ pathlist = parent_rel->partial_pathlist;
+ else
+ {
+ Assert(parent_rel->gpi != NULL);
+ pathlist = parent_rel->gpi->partial_pathlist;
+ }
/*
* Our goal here is twofold. First, we want to find out whether this path
*************** add_partial_path_precheck(RelOptInfo *pa
*** 886,895 ****
* final cost computations. If so, we definitely want to consider it.
*
* Unlike add_path(), we always compare pathkeys here. This is because we
! * expect partial_pathlist to be very short, and getting a definitive
! * answer at this stage avoids the need to call add_path_precheck.
*/
! foreach(p1, parent_rel->partial_pathlist)
{
Path *old_path = (Path *) lfirst(p1);
PathKeysComparison keyscmp;
--- 937,947 ----
* final cost computations. If so, we definitely want to consider it.
*
* Unlike add_path(), we always compare pathkeys here. This is because we
! * expect partial_pathlist / grouped_pathlist to be very short, and
! * getting a definitive answer at this stage avoids the need to call
! * add_path_precheck.
*/
! foreach(p1, pathlist)
{
Path *old_path = (Path *) lfirst(p1);
PathKeysComparison keyscmp;
*************** add_partial_path_precheck(RelOptInfo *pa
*** 918,924 ****
* completion.
*/
if (!add_path_precheck(parent_rel, total_cost, total_cost, pathkeys,
! NULL))
return false;
return true;
--- 970,976 ----
* completion.
*/
if (!add_path_precheck(parent_rel, total_cost, total_cost, pathkeys,
! NULL, grouped))
return false;
return true;
*************** create_foreignscan_path(PlannerInfo *roo
*** 1994,2007 ****
* Note: result must not share storage with either input
*/
Relids
! calc_nestloop_required_outer(Path *outer_path, Path *inner_path)
{
- Relids outer_paramrels = PATH_REQ_OUTER(outer_path);
- Relids inner_paramrels = PATH_REQ_OUTER(inner_path);
Relids required_outer;
/* inner_path can require rels from outer path, but not vice versa */
! Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
/* easy case if inner path is not parameterized */
if (!inner_paramrels)
return bms_copy(outer_paramrels);
--- 2046,2060 ----
* Note: result must not share storage with either input
*/
Relids
! calc_nestloop_required_outer(Relids outerrelids,
! Relids outer_paramrels,
! Relids innerrelids,
! Relids inner_paramrels)
{
Relids required_outer;
/* inner_path can require rels from outer path, but not vice versa */
! Assert(!bms_overlap(outer_paramrels, innerrelids));
/* easy case if inner path is not parameterized */
if (!inner_paramrels)
return bms_copy(outer_paramrels);
*************** calc_nestloop_required_outer(Path *outer
*** 2009,2015 ****
required_outer = bms_union(outer_paramrels, inner_paramrels);
/* ... and remove any mention of now-satisfied outer rels */
required_outer = bms_del_members(required_outer,
! outer_path->parent->relids);
/* maintain invariant that required_outer is exactly NULL if empty */
if (bms_is_empty(required_outer))
{
--- 2062,2068 ----
required_outer = bms_union(outer_paramrels, inner_paramrels);
/* ... and remove any mention of now-satisfied outer rels */
required_outer = bms_del_members(required_outer,
! outerrelids);
/* maintain invariant that required_outer is exactly NULL if empty */
if (bms_is_empty(required_outer))
{
*************** calc_non_nestloop_required_outer(Path *o
*** 2055,2060 ****
--- 2108,2114 ----
* 'restrict_clauses' are the RestrictInfo nodes to apply at the join
* 'pathkeys' are the path keys of the new join path
* 'required_outer' is the set of required outer rels
+ * 'target' can be passed to override that of joinrel.
*
* Returns the resulting path node.
*/
*************** create_nestloop_path(PlannerInfo *root,
*** 2068,2074 ****
Path *inner_path,
List *restrict_clauses,
List *pathkeys,
! Relids required_outer)
{
NestPath *pathnode = makeNode(NestPath);
Relids inner_req_outer = PATH_REQ_OUTER(inner_path);
--- 2122,2129 ----
Path *inner_path,
List *restrict_clauses,
List *pathkeys,
! Relids required_outer,
! PathTarget *target)
{
NestPath *pathnode = makeNode(NestPath);
Relids inner_req_outer = PATH_REQ_OUTER(inner_path);
*************** create_nestloop_path(PlannerInfo *root,
*** 2101,2107 ****
pathnode->path.pathtype = T_NestLoop;
pathnode->path.parent = joinrel;
! pathnode->path.pathtarget = joinrel->reltarget;
pathnode->path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
--- 2156,2162 ----
pathnode->path.pathtype = T_NestLoop;
pathnode->path.parent = joinrel;
! pathnode->path.pathtarget = target == NULL ? joinrel->reltarget : target;
pathnode->path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
*************** create_mergejoin_path(PlannerInfo *root,
*** 2159,2171 ****
Relids required_outer,
List *mergeclauses,
List *outersortkeys,
! List *innersortkeys)
{
MergePath *pathnode = makeNode(MergePath);
pathnode->jpath.path.pathtype = T_MergeJoin;
pathnode->jpath.path.parent = joinrel;
! pathnode->jpath.path.pathtarget = joinrel->reltarget;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
--- 2214,2228 ----
Relids required_outer,
List *mergeclauses,
List *outersortkeys,
! List *innersortkeys,
! PathTarget *target)
{
MergePath *pathnode = makeNode(MergePath);
pathnode->jpath.path.pathtype = T_MergeJoin;
pathnode->jpath.path.parent = joinrel;
! pathnode->jpath.path.pathtarget = target == NULL ? joinrel->reltarget :
! target;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
*************** create_mergejoin_path(PlannerInfo *root,
*** 2210,2215 ****
--- 2267,2273 ----
* 'required_outer' is the set of required outer rels
* 'hashclauses' are the RestrictInfo nodes to use as hash clauses
* (this should be a subset of the restrict_clauses list)
+ * 'target' can be passed to override that of joinrel.
*/
HashPath *
create_hashjoin_path(PlannerInfo *root,
*************** create_hashjoin_path(PlannerInfo *root,
*** 2221,2233 ****
Path *inner_path,
List *restrict_clauses,
Relids required_outer,
! List *hashclauses)
{
HashPath *pathnode = makeNode(HashPath);
pathnode->jpath.path.pathtype = T_HashJoin;
pathnode->jpath.path.parent = joinrel;
! pathnode->jpath.path.pathtarget = joinrel->reltarget;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
--- 2279,2293 ----
Path *inner_path,
List *restrict_clauses,
Relids required_outer,
! List *hashclauses,
! PathTarget *target)
{
HashPath *pathnode = makeNode(HashPath);
pathnode->jpath.path.pathtype = T_HashJoin;
pathnode->jpath.path.parent = joinrel;
! pathnode->jpath.path.pathtarget = target == NULL ? joinrel->reltarget :
! target;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
*************** create_agg_path(PlannerInfo *root,
*** 2713,2718 ****
--- 2773,2948 ----
}
/*
+ * Apply partial AGG_SORTED aggregation path to subpath if it's suitably
+ * sorted.
+ *
+ * first_call indicates whether the function is being called first time for
+ * given index --- since the target should not change, we can skip the check
+ * of sorting during subsequent calls.
+ *
+ * group_clauses, group_exprs and agg_exprs are pointers to lists we populate
+ * when called first time for particular index, and that user passes for
+ * subsequent calls.
+ *
+ * NULL is returned if sorting of subpath output is not suitable.
+ */
+ AggPath *
+ create_partial_agg_sorted_path(PlannerInfo *root, Path *subpath,
+ bool first_call,
+ List **group_clauses, List **group_exprs,
+ List **agg_exprs, double input_rows)
+ {
+ RelOptInfo *rel;
+ AggClauseCosts agg_costs;
+ double dNumGroups;
+ AggPath *result = NULL;
+
+ rel = subpath->parent;
+ Assert(rel->gpi != NULL);
+
+ if (subpath->pathkeys == NIL)
+ return NULL;
+
+ if (!grouping_is_sortable(root->parse->groupClause))
+ return NULL;
+
+ if (first_call)
+ {
+ ListCell *lc1;
+ List *key_subset = NIL;
+
+ /*
+ * Find all query pathkeys that our relation does affect.
+ */
+ foreach(lc1, root->group_pathkeys)
+ {
+ PathKey *gkey = castNode(PathKey, lfirst(lc1));
+ ListCell *lc2;
+
+ foreach(lc2, subpath->pathkeys)
+ {
+ PathKey *skey = castNode(PathKey, lfirst(lc2));
+
+ if (skey == gkey)
+ {
+ key_subset = lappend(key_subset, gkey);
+ break;
+ }
+ }
+ }
+
+ if (key_subset == NIL)
+ return NULL;
+
+ /* Check if AGG_SORTED is useful for the whole query. */
+ if (!pathkeys_contained_in(key_subset, subpath->pathkeys))
+ return NULL;
+ }
+
+ if (first_call)
+ get_grouping_expressions(root, rel->gpi->target, group_clauses,
+ group_exprs, agg_exprs);
+
+ MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+ Assert(*agg_exprs != NIL);
+ get_agg_clause_costs(root, (Node *) *agg_exprs, AGGSPLIT_INITIAL_SERIAL,
+ &agg_costs);
+
+ Assert(*group_exprs != NIL);
+ dNumGroups = estimate_num_groups(root, *group_exprs, input_rows, NULL);
+
+ /* TODO HAVING qual. */
+ Assert(*group_clauses != NIL);
+ result = create_agg_path(root, rel, subpath, rel->gpi->target, AGG_SORTED,
+ AGGSPLIT_INITIAL_SERIAL, *group_clauses, NIL,
+ &agg_costs, dNumGroups);
+
+ return result;
+ }
+
+ /*
+ * Apply partial AGG_HASHED aggregation to subpath.
+ *
+ * Arguments have the same meaning as those of create_partial_agg_sorted_path.
+ *
+ */
+ AggPath *
+ create_partial_agg_hashed_path(PlannerInfo *root, Path *subpath,
+ bool first_call,
+ List **group_clauses, List **group_exprs,
+ List **agg_exprs, double input_rows)
+ {
+ RelOptInfo *rel;
+ bool can_hash;
+ AggClauseCosts agg_costs;
+ double dNumGroups;
+ Size hashaggtablesize;
+ Query *parse = root->parse;
+ AggPath *result = NULL;
+
+ rel = subpath->parent;
+ Assert(rel->gpi != NULL);
+
+ if (first_call)
+ {
+ /*
+ * Find one grouping clause per grouping column.
+ *
+ * All that create_agg_plan eventually needs of the clause is
+ * tleSortGroupRef, so we don't have to care that the clause
+ * expression might differ from texpr, in case texpr was derived from
+ * EC.
+ */
+ get_grouping_expressions(root, rel->gpi->target, group_clauses,
+ group_exprs, agg_exprs);
+ }
+
+ MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+ Assert(*agg_exprs != NIL);
+ get_agg_clause_costs(root, (Node *) *agg_exprs, AGGSPLIT_INITIAL_SERIAL,
+ &agg_costs);
+
+ can_hash = (parse->groupClause != NIL &&
+ parse->groupingSets == NIL &&
+ agg_costs.numOrderedAggs == 0 &&
+ grouping_is_hashable(parse->groupClause));
+
+ if (can_hash)
+ {
+ Assert(*group_exprs != NIL);
+ dNumGroups = estimate_num_groups(root, *group_exprs, input_rows,
+ NULL);
+
+ hashaggtablesize = estimate_hashagg_tablesize(subpath, &agg_costs,
+ dNumGroups);
+
+ if (hashaggtablesize < work_mem * 1024L)
+ {
+ /*
+ * Create the partial aggregation path.
+ */
+ Assert(*group_clauses != NIL);
+
+ result = create_agg_path(root, rel, subpath,
+ rel->gpi->target,
+ AGG_HASHED,
+ AGGSPLIT_INITIAL_SERIAL,
+ *group_clauses, NIL,
+ &agg_costs,
+ dNumGroups);
+
+ /*
+ * The agg path should require no fewer parameters than the plain
+ * one.
+ */
+ result->path.param_info = subpath->param_info;
+ }
+ }
+
+ return result;
+ }
+
+ /*
* create_groupingsets_path
* Creates a pathnode that represents performing GROUPING SETS aggregation
*
*************** reparameterize_path(PlannerInfo *root, P
*** 3426,3428 ****
--- 3656,4081 ----
}
return NULL;
}
+
+ /*
+ * reparameterize_path_by_child
+ * Given a path parameterized by the parent of the given relation,
+ * translate the path to be parameterized by the given child relation.
+ *
+ * The function creates a new path of the same type as the given path, but
+ * parameterized by the given child relation. If it cannot reparameterize the
+ * path as required, it returns NULL.
+ *
+ * The cost, number of rows, width and parallel path properties depend upon
+ * path->parent, which does not change during the translation. Hence those
+ * members are copied as they are.
+ */
+
+ Path *
+ reparameterize_path_by_child(PlannerInfo *root, Path *path,
+ RelOptInfo *child_rel)
+ {
+
+ #define FLAT_COPY_PATH(newnode, node, nodetype) \
+ ( (newnode) = makeNode(nodetype), \
+ memcpy((newnode), (node), sizeof(nodetype)) )
+
+ Path *new_path;
+ ParamPathInfo *new_ppi;
+ ParamPathInfo *old_ppi;
+ Relids required_outer;
+
+ /*
+ * If the path is not parameterized by the parent of the given relation, it
+ * doesn't need reparameterization.
+ */
+ if (!path->param_info ||
+ !bms_overlap(PATH_REQ_OUTER(path), child_rel->top_parent_relids))
+ return path;
+
+ /*
+ * Make a copy of the given path and reparameterize or translate the
+ * path specific members.
+ */
+ switch (nodeTag(path))
+ {
+ case T_Path:
+ FLAT_COPY_PATH(new_path, path, Path);
+ break;
+
+ case T_IndexPath:
+ {
+ IndexPath *ipath;
+
+ FLAT_COPY_PATH(ipath, path, IndexPath);
+ ipath->indexclauses = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) ipath->indexclauses,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ ipath->indexquals = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) ipath->indexquals,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path = (Path *) ipath;
+ }
+ break;
+
+ case T_BitmapHeapPath:
+ {
+ BitmapHeapPath *bhpath;
+
+ FLAT_COPY_PATH(bhpath, path, BitmapHeapPath);
+ bhpath->bitmapqual = reparameterize_path_by_child(root,
+ bhpath->bitmapqual,
+ child_rel);
+ new_path = (Path *) bhpath;
+ }
+ break;
+
+ case T_BitmapAndPath:
+ {
+ BitmapAndPath *bapath;
+ ListCell *lc;
+ List *bitmapquals = NIL;
+
+ FLAT_COPY_PATH(bapath, path, BitmapAndPath);
+ foreach (lc, bapath->bitmapquals)
+ {
+ Path *bmqpath = lfirst(lc);
+
+ bitmapquals = lappend(bitmapquals,
+ reparameterize_path_by_child(root,
+ bmqpath,
+ child_rel));
+ }
+ bapath->bitmapquals = bitmapquals;
+ new_path = (Path *) bapath;
+ }
+ break;
+
+ case T_BitmapOrPath:
+ {
+ BitmapOrPath *bopath;
+ ListCell *lc;
+ List *bitmapquals = NIL;
+
+ FLAT_COPY_PATH(bopath, path, BitmapOrPath);
+ foreach (lc, bopath->bitmapquals)
+ {
+ Path *bmqpath = lfirst(lc);
+
+ bitmapquals = lappend(bitmapquals,
+ reparameterize_path_by_child(root,
+ bmqpath,
+ child_rel));
+ }
+ bopath->bitmapquals = bitmapquals;
+ new_path = (Path *) bopath;
+ }
+ break;
+
+ case T_TidPath:
+ {
+ TidPath *tpath;
+
+ /*
+ * TidPath contains tidquals, which do not contain any external
+ * parameters per create_tidscan_path(). So don't bother to
+ * translate those.
+ */
+ FLAT_COPY_PATH(tpath, path, TidPath);
+ new_path = (Path *) tpath;
+ }
+ break;
+
+ case T_ForeignPath:
+ {
+ ForeignPath *fpath;
+ ReparameterizeForeignPathByChild_function rfpc_func;
+
+ FLAT_COPY_PATH(fpath, path, ForeignPath);
+ if (fpath->fdw_outerpath)
+ fpath->fdw_outerpath = reparameterize_path_by_child(root,
+ fpath->fdw_outerpath,
+ child_rel);
+ rfpc_func = path->parent->fdwroutine->ReparameterizeForeignPathByChild;
+
+ /* Hand over to FDW if supported. */
+ if (rfpc_func)
+ fpath->fdw_private = rfpc_func(root, fpath->fdw_private,
+ child_rel);
+ new_path = (Path *) fpath;
+ }
+ break;
+
+ case T_CustomPath:
+ {
+ CustomPath *cpath;
+ ListCell *lc;
+ List *custompaths = NIL;
+
+ FLAT_COPY_PATH(cpath, path, CustomPath);
+
+ foreach (lc, cpath->custom_paths)
+ {
+ Path *subpath = lfirst(lc);
+
+ custompaths = lappend(custompaths,
+ reparameterize_path_by_child(root,
+ subpath,
+ child_rel));
+ }
+ cpath->custom_paths = custompaths;
+
+ if (cpath->methods &&
+ cpath->methods->ReparameterizeCustomPathByChild)
+ cpath->custom_private = cpath->methods->ReparameterizeCustomPathByChild(root,
+ cpath->custom_private,
+ child_rel);
+
+ new_path = (Path *) cpath;
+ }
+ break;
+
+ case T_NestPath:
+ {
+ JoinPath *jpath;
+
+ FLAT_COPY_PATH(jpath, path, NestPath);
+
+ jpath->outerjoinpath = reparameterize_path_by_child(root,
+ jpath->outerjoinpath,
+ child_rel);
+ jpath->innerjoinpath = reparameterize_path_by_child(root,
+ jpath->innerjoinpath,
+ child_rel);
+ jpath->joinrestrictinfo = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) jpath->joinrestrictinfo,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path = (Path *) jpath;
+ }
+ break;
+
+ case T_MergePath:
+ {
+ JoinPath *jpath;
+ MergePath *mpath;
+
+ FLAT_COPY_PATH(mpath, path, MergePath);
+
+ jpath = (JoinPath *) mpath;
+ jpath->outerjoinpath = reparameterize_path_by_child(root,
+ jpath->outerjoinpath,
+ child_rel);
+ jpath->innerjoinpath = reparameterize_path_by_child(root,
+ jpath->innerjoinpath,
+ child_rel);
+ jpath->joinrestrictinfo = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) jpath->joinrestrictinfo,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ mpath->path_mergeclauses = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) mpath->path_mergeclauses,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path = (Path *) mpath;
+ }
+ break;
+
+ case T_HashPath:
+ {
+ JoinPath *jpath;
+ HashPath *hpath;
+ FLAT_COPY_PATH(hpath, path, HashPath);
+
+ jpath = (JoinPath *) hpath;
+ jpath->outerjoinpath = reparameterize_path_by_child(root,
+ jpath->outerjoinpath,
+ child_rel);
+ jpath->innerjoinpath = reparameterize_path_by_child(root,
+ jpath->innerjoinpath,
+ child_rel);
+ jpath->joinrestrictinfo = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) jpath->joinrestrictinfo,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ hpath->path_hashclauses = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) hpath->path_hashclauses,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path = (Path *) hpath;
+ }
+ break;
+
+ case T_AppendPath:
+ {
+ AppendPath *apath;
+ List *subpaths = NIL;
+ ListCell *lc;
+
+ FLAT_COPY_PATH(apath, path, AppendPath);
+ foreach (lc, apath->subpaths)
+ subpaths = lappend(subpaths,
+ reparameterize_path_by_child(root,
+ lfirst(lc),
+ child_rel));
+ apath->subpaths = subpaths;
+ new_path = (Path *) apath;
+ }
+ break;
+
+ case T_MergeAppendPath:
+ {
+ MergeAppendPath *mapath;
+ List *subpaths = NIL;
+ ListCell *lc;
+
+ FLAT_COPY_PATH(mapath, path, MergeAppendPath);
+ foreach (lc, mapath->subpaths)
+ subpaths = lappend(subpaths,
+ reparameterize_path_by_child(root,
+ lfirst(lc),
+ child_rel));
+ mapath->subpaths = subpaths;
+ new_path = (Path *) mapath;
+ }
+ break;
+
+ case T_MaterialPath:
+ {
+ MaterialPath *mpath;
+
+ FLAT_COPY_PATH(mpath, path, MaterialPath);
+ mpath->subpath = reparameterize_path_by_child(root,
+ mpath->subpath,
+ child_rel);
+ new_path = (Path *) mpath;
+ }
+ break;
+
+ case T_UniquePath:
+ {
+ UniquePath *upath;
+
+ FLAT_COPY_PATH(upath, path, UniquePath);
+ upath->subpath = reparameterize_path_by_child(root,
+ upath->subpath,
+ child_rel);
+ upath->uniq_exprs = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) upath->uniq_exprs,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path = (Path *) upath;
+ }
+ break;
+
+ case T_GatherPath:
+ {
+ GatherPath *gpath;
+
+ FLAT_COPY_PATH(gpath, path, GatherPath);
+ gpath->subpath = reparameterize_path_by_child(root,
+ gpath->subpath,
+ child_rel);
+ new_path = (Path *) gpath;
+ }
+ break;
+
+ case T_GatherMergePath:
+ {
+ GatherMergePath *gmpath;
+
+ FLAT_COPY_PATH(gmpath, path, GatherMergePath);
+ gmpath->subpath = reparameterize_path_by_child(root,
+ gmpath->subpath,
+ child_rel);
+ new_path = (Path *) gmpath;
+ }
+ break;
+
+ case T_SubqueryScanPath:
+ /*
+ * Subqueries can't be partitioned right now, so a subquery can not
+ * participate in a partition-wise join and hence can not be seen
+ * here.
+ */
+ case T_ResultPath:
+ /*
+ * A result path can not have any parameterization, so we
+ * should never see it here.
+ */
+ default:
+ /* Other kinds of paths can not appear in a join tree. */
+ elog(ERROR, "unrecognized path node type %d", (int) nodeTag(path));
+
+ /* Keep compiler quiet about unassigned new_path */
+ return NULL;
+ }
+
+ /*
+ * Adjust the parameterization information, which refers to the topmost
+ * parent. The topmost parent can be multiple levels away from the given
+ * child, hence use multi-level expression adjustment routines.
+ */
+ old_ppi = new_path->param_info;
+ required_outer = adjust_child_relids_multilevel(root,
+ old_ppi->ppi_req_outer,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+
+ /* If we already have a PPI for this parameterization, just return it */
+ new_ppi = find_param_path_info(new_path->parent, required_outer);
+
+ /*
+ * If not, build a new one and link it to the list of PPIs. When called
+ * during GEQO join planning, we are in a short-lived memory context. We
+ * must make sure that the new PPI and its contents attached to a baserel
+ * survive the GEQO cycle, else the baserel is trashed for future GEQO
+ * cycles. On the other hand, when we are adding new PPI to a joinrel
+ * during GEQO, we don't want that to clutter the main planning context.
+ * Upshot is that the best solution is to explicitly allocate new PPI in
+ * the same context the given RelOptInfo is in.
+ */
+ if (!new_ppi)
+ {
+ MemoryContext oldcontext;
+ RelOptInfo *rel = path->parent;
+
+ oldcontext = MemoryContextSwitchTo(GetMemoryChunkContext(rel));
+
+ new_ppi = makeNode(ParamPathInfo);
+ new_ppi->ppi_req_outer = bms_copy(required_outer);
+ new_ppi->ppi_rows = old_ppi->ppi_rows;
+ new_ppi->ppi_clauses = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) old_ppi->ppi_clauses,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ rel->ppilist = lappend(rel->ppilist, new_ppi);
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+ bms_free(required_outer);
+
+ new_path->param_info = new_ppi;
+
+ /*
+ * Adjust the path target if the parent of the outer relation is referenced
+ * in the targetlist. This can happen when only the parent of outer relation is
+ * laterally referenced in this relation.
+ */
+ if (bms_overlap(path->parent->lateral_relids, child_rel->top_parent_relids))
+ {
+ List *exprs;
+
+ new_path->pathtarget = copy_pathtarget(new_path->pathtarget);
+ exprs = new_path->pathtarget->exprs;
+ exprs = (List *) adjust_appendrel_attrs_multilevel(root,
+ (Node *) exprs,
+ child_rel->relids,
+ child_rel->top_parent_relids);
+ new_path->pathtarget->exprs = exprs;
+ }
+
+ return new_path;
+ }
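For readers following the function above: every case uses the same shape, which is to flat-copy the path node, recursively rebuild its subpaths, and leave the original path untouched. The standalone sketch below shows only that shape. The Node type and the reparameterize() helper are hypothetical simplifications made up for illustration; they are not part of the patch and do not use the planner's data structures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much simplified stand-in for a Path node. */
typedef struct Node
{
	int			relid;		/* relation this node is parameterized by */
	struct Node *subpath;	/* optional child, NULL for a leaf */
} Node;

/*
 * Return a copy of 'path' in which every reference to parent_relid is
 * replaced by child_relid.  The input tree is never modified.
 */
static Node *
reparameterize(const Node *path, int parent_relid, int child_relid)
{
	Node	   *copy;

	if (path == NULL)
		return NULL;

	copy = malloc(sizeof(Node));
	if (copy == NULL)
		abort();

	/* Flat copy first, so untouched fields carry over as they are. */
	memcpy(copy, path, sizeof(Node));

	/* Translate this node's own reference to the parent. */
	if (copy->relid == parent_relid)
		copy->relid = child_relid;

	/* Rebuild the subpath from the original, not from the copy. */
	copy->subpath = reparameterize(path->subpath, parent_relid, child_relid);

	return copy;
}
```

Copying before translating means a path that is shared by several join candidates stays valid for its other users, which appears to be why the patch flat-copies each node instead of editing it in place.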
diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c
new file mode 100644
index 698a387..6714288
*** a/src/backend/optimizer/util/placeholder.c
--- b/src/backend/optimizer/util/placeholder.c
***************
*** 20,25 ****
--- 20,26 ----
#include "optimizer/pathnode.h"
#include "optimizer/placeholder.h"
#include "optimizer/planmain.h"
+ #include "optimizer/prep.h"
#include "optimizer/var.h"
#include "utils/lsyscache.h"
*************** add_placeholders_to_joinrel(PlannerInfo
*** 414,419 ****
--- 415,424 ----
Relids relids = joinrel->relids;
ListCell *lc;
+ /* This function is called only on the parent relations. */
+ Assert(!IS_OTHER_REL(joinrel) && !IS_OTHER_REL(outer_rel) &&
+ !IS_OTHER_REL(inner_rel));
+
foreach(lc, root->placeholder_list)
{
PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc);
*************** add_placeholders_to_joinrel(PlannerInfo
*** 459,461 ****
--- 464,518 ----
}
}
}
+
+ /*
+ * add_placeholders_to_child_joinrel
+ * Translate the PHVs in parent's targetlist and add them to the child's
+ * targetlist. Also adjust the cost and width of the child's targetlist.
+ */
+ void
+ add_placeholders_to_child_joinrel(PlannerInfo *root, RelOptInfo *childrel,
+ RelOptInfo *parentrel)
+ {
+ ListCell *lc;
+ AppendRelInfo **appinfos;
+ int nappinfos;
+
+
+ Assert(IS_JOIN_REL(childrel) && IS_JOIN_REL(parentrel));
+
+ /* Ensure the child relation is really what it claims to be. */
+ Assert(IS_OTHER_REL(childrel));
+
+ appinfos = find_appinfos_by_relids(root, childrel->relids, &nappinfos);
+ foreach (lc, parentrel->reltarget->exprs)
+ {
+ PlaceHolderVar *phv = lfirst(lc);
+
+ if (IsA(phv, PlaceHolderVar))
+ {
+ /*
+ * In case the placeholder Var refers to any of the parent
+ * relations, translate it to refer to the corresponding child.
+ */
+ if (bms_overlap(phv->phrels, parentrel->relids) &&
+ childrel->reloptkind == RELOPT_OTHER_JOINREL)
+ {
+ phv = (PlaceHolderVar *) adjust_appendrel_attrs(root,
+ (Node *) phv,
+ nappinfos,
+ appinfos);
+ }
+
+ childrel->reltarget->exprs = lappend(childrel->reltarget->exprs,
+ phv);
+ }
+ }
+
+ /* Adjust the cost and width of child targetlist. */
+ childrel->reltarget->cost.startup = parentrel->reltarget->cost.startup;
+ childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
+ childrel->reltarget->width = parentrel->reltarget->width;
+
+ pfree(appinfos);
+ }
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
new file mode 100644
index 9207c8d..7e846e1
*** a/src/backend/optimizer/util/plancat.c
--- b/src/backend/optimizer/util/plancat.c
***************
*** 27,32 ****
--- 27,33 ----
#include "catalog/catalog.h"
#include "catalog/dependency.h"
#include "catalog/heap.h"
+ #include "catalog/pg_inherits_fn.h"
#include "catalog/partition.h"
#include "catalog/pg_am.h"
#include "catalog/pg_statistic_ext.h"
*************** static List *get_relation_constraints(Pl
*** 68,73 ****
--- 69,80 ----
static List *build_index_tlist(PlannerInfo *root, IndexOptInfo *index,
Relation heapRelation);
static List *get_relation_statistics(RelOptInfo *rel, Relation relation);
+ static List **build_baserel_partition_key_exprs(Relation relation,
+ Index varno);
+ static PartitionScheme find_partition_scheme(struct PlannerInfo *root,
+ Relation rel);
+ static void get_relation_partition_info(PlannerInfo *root, RelOptInfo *rel,
+ Relation relation);
/*
* get_relation_info -
*************** get_relation_info(PlannerInfo *root, Oid
*** 420,425 ****
--- 427,436 ----
/* Collect info about relation's foreign keys, if relevant */
get_relation_foreign_keys(root, rel, relation, inhparent);
+ /* Collect info about relation's partitioning scheme, if any. */
+ if (inhparent)
+ get_relation_partition_info(root, rel, relation);
+
heap_close(relation, NoLock);
/*
*************** has_row_triggers(PlannerInfo *root, Inde
*** 1801,1803 ****
--- 1812,1975 ----
heap_close(relation, NoLock);
return result;
}
+
+ /*
+ * get_relation_partition_info
+ *
+ * Retrieves partitioning information for a given relation.
+ *
+ * Partitioning scheme, partition key expressions and OIDs of partitions are
+ * added to the given RelOptInfo. A partitioned table can participate in the
+ * query as a simple relation or an inheritance parent. Only the latter can have
+ * child relations, and hence partitions. From the point of view of the query
+ * optimizer only such relations are considered to be partitioned. Hence
+ * partitioning information is set only for an inheritance parent.
+ */
+ static void
+ get_relation_partition_info(PlannerInfo *root, RelOptInfo *rel,
+ Relation relation)
+ {
+ PartitionDesc part_desc = RelationGetPartitionDesc(relation);
+
+ /* No partitioning information for an unpartitioned relation. */
+ if (relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE ||
+ !(rel->part_scheme = find_partition_scheme(root, relation)))
+ return;
+
+ Assert(part_desc);
+ rel->nparts = part_desc->nparts;
+ rel->boundinfo = part_desc->boundinfo;
+ rel->partexprs = build_baserel_partition_key_exprs(relation, rel->relid);
+ rel->part_oids = part_desc->oids;
+
+ Assert(rel->nparts > 0 && rel->boundinfo && rel->part_oids);
+ return;
+ }
+
+ /*
+ * find_partition_scheme
+ *
+ * The function returns a canonical partition scheme which exactly matches the
+ * partitioning properties of the given relation if one exists in the list of
+ * canonical partitioning schemes maintained in PlannerInfo. If none of the
+ * existing partitioning schemes match, the function creates a canonical
+ * partition scheme and adds it to the list.
+ *
+ * For an unpartitioned table or for a multi-level partitioned table it returns
+ * NULL. See comments in the function for more details.
+ */
+ static PartitionScheme
+ find_partition_scheme(PlannerInfo *root, Relation relation)
+ {
+ PartitionKey part_key = RelationGetPartitionKey(relation);
+ ListCell *lc;
+ int partnatts;
+ PartitionScheme part_scheme = NULL;
+
+ /* No partition scheme for an unpartitioned relation. */
+ if (!part_key)
+ return NULL;
+
+ partnatts = part_key->partnatts;
+
+ /* Search for a matching partition scheme and return it if one is found. */
+ foreach (lc, root->part_schemes)
+ {
+ part_scheme = lfirst(lc);
+
+ /* Match partitioning strategy and number of keys. */
+ if (part_key->strategy != part_scheme->strategy ||
+ partnatts != part_scheme->partnatts)
+ continue;
+
+ /* Match the partition key types. */
+ if (memcmp(part_key->partopfamily, part_scheme->partopfamily,
+ sizeof(Oid) * partnatts) != 0 ||
+ memcmp(part_key->partopcintype, part_scheme->partopcintype,
+ sizeof(Oid) * partnatts) != 0 ||
+ memcmp(part_key->parttypcoll, part_scheme->parttypcoll,
+ sizeof(Oid) * partnatts) != 0)
+ continue;
+
+ /* Found matching partition scheme. */
+ return part_scheme;
+ }
+
+ /* Did not find matching partition scheme. Create one. */
+ part_scheme = (PartitionScheme) palloc0(sizeof(PartitionSchemeData));
+
+ part_scheme->strategy = part_key->strategy;
+ /* Store partition key information. */
+ part_scheme->partnatts = part_key->partnatts;
+ part_scheme->partopfamily = part_key->partopfamily;
+ part_scheme->partopcintype = part_key->partopcintype;
+ part_scheme->parttypcoll = part_key->parttypcoll;
+ part_scheme->partsupfunc = part_key->partsupfunc;
+
+ /* Add the partitioning scheme to PlannerInfo. */
+ root->part_schemes = lappend(root->part_schemes, part_scheme);
+
+ return part_scheme;
+ }
+
+ /*
+ * build_baserel_partition_key_exprs
+ *
+ * Collect partition key expressions for a given base relation. The function
+ * converts any single column partition keys into corresponding Var nodes. It
+ * restamps Var nodes in partition key expressions by given varno. The
+ * partition key expressions are returned as an array of single element lists
+ * to be stored in RelOptInfo of the base relation.
+ */
+ static List **
+ build_baserel_partition_key_exprs(Relation relation, Index varno)
+ {
+ PartitionKey part_key = RelationGetPartitionKey(relation);
+ int num_pkexprs;
+ int cnt_pke;
+ List **partexprs;
+ ListCell *lc;
+
+ if (!part_key || part_key->partnatts <= 0)
+ return NULL;
+
+ num_pkexprs = part_key->partnatts;
+ partexprs = (List **) palloc(sizeof(List *) * num_pkexprs);
+ lc = list_head(part_key->partexprs);
+
+ for (cnt_pke = 0; cnt_pke < num_pkexprs; cnt_pke++)
+ {
+ AttrNumber attno = part_key->partattrs[cnt_pke];
+ Expr *pkexpr;
+
+ if (attno != InvalidAttrNumber)
+ {
+ /* Single column partition key is stored as a Var node. */
+ Form_pg_attribute att_tup;
+
+ if (attno < 0)
+ att_tup = SystemAttributeDefinition(attno,
+ relation->rd_rel->relhasoids);
+ else
+ att_tup = relation->rd_att->attrs[attno - 1];
+
+ pkexpr = (Expr *) makeVar(varno, attno, att_tup->atttypid,
+ att_tup->atttypmod,
+ att_tup->attcollation, 0);
+ }
+ else
+ {
+ if (lc == NULL)
+ elog(ERROR, "wrong number of partition key expressions");
+
+ /* Re-stamp the expression with given varno. */
+ pkexpr = (Expr *) copyObject(lfirst(lc));
+ ChangeVarNodes((Node *) pkexpr, 1, varno, 0);
+ lc = lnext(lc);
+ }
+
+ partexprs[cnt_pke] = list_make1(pkexpr);
+ }
+
+ return partexprs;
+ }
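The find-or-create logic in find_partition_scheme above can be illustrated in isolation. The sketch below is a simplified, standalone model: the Scheme struct, the global list, and find_or_create_scheme() are hypothetical names invented for this example, and it compares only the operator families, whereas the patch also compares partopcintype and parttypcoll.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int Oid;

/* Hypothetical, simplified stand-in for PartitionSchemeData. */
typedef struct Scheme
{
	char		strategy;		/* e.g. 'r' for range, 'l' for list */
	int			partnatts;		/* number of partition key columns */
	const Oid  *partopfamily;	/* one operator family per key column */
	struct Scheme *next;
} Scheme;

/* Plays the role of root->part_schemes in this sketch. */
static Scheme *schemes = NULL;

/*
 * Return the canonical scheme matching the given properties, creating and
 * registering a new one if none matches.
 */
static Scheme *
find_or_create_scheme(char strategy, int partnatts, const Oid *opfamily)
{
	Scheme	   *s;

	assert(partnatts > 0);

	for (s = schemes; s != NULL; s = s->next)
	{
		/* Match partitioning strategy and number of keys. */
		if (s->strategy != strategy || s->partnatts != partnatts)
			continue;

		/* Match the per-column operator families. */
		if (memcmp(s->partopfamily, opfamily, sizeof(Oid) * partnatts) != 0)
			continue;

		return s;
	}

	/* No match: create a new canonical entry and add it to the list. */
	s = calloc(1, sizeof(Scheme));
	if (s == NULL)
		abort();
	s->strategy = strategy;
	s->partnatts = partnatts;
	s->partopfamily = opfamily;
	s->next = schemes;
	schemes = s;

	return s;
}
```

In this model, two tables whose strategy, key count, and operator families agree receive the same Scheme pointer, so a later compatibility check can presumably be a plain pointer comparison instead of a field-by-field match.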
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
new file mode 100644
index 342d884..308bdec
*** a/src/backend/optimizer/util/relnode.c
--- b/src/backend/optimizer/util/relnode.c
***************
*** 23,30 ****
--- 23,32 ----
#include "optimizer/paths.h"
#include "optimizer/placeholder.h"
#include "optimizer/plancat.h"
+ #include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
#include "optimizer/tlist.h"
+ #include "optimizer/var.h"
#include "utils/hsearch.h"
*************** typedef struct JoinHashEntry
*** 35,41 ****
} JoinHashEntry;
static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *input_rel);
static List *build_joinrel_restrictlist(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outer_rel,
--- 37,43 ----
} JoinHashEntry;
static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *input_rel, bool grouped);
static List *build_joinrel_restrictlist(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outer_rel,
*************** static List *subbuild_joinrel_joinlist(R
*** 52,57 ****
--- 54,64 ----
static void set_foreign_rel_properties(RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *inner_rel);
static void add_join_rel(PlannerInfo *root, RelOptInfo *joinrel);
+ extern ParamPathInfo *find_param_path_info(RelOptInfo *rel,
+ Relids required_outer);
+ static void build_joinrel_partition_info(RelOptInfo *joinrel,
+ RelOptInfo *outer_rel, RelOptInfo *inner_rel,
+ List *restrictlist, JoinType jointype);
/*
*************** build_simple_rel(PlannerInfo *root, int
*** 120,125 ****
--- 127,133 ----
rel->cheapest_parameterized_paths = NIL;
rel->direct_lateral_relids = NULL;
rel->lateral_relids = NULL;
+ rel->gpi = NULL;
rel->relid = relid;
rel->rtekind = rte->rtekind;
/* min_attr, max_attr, attr_needed, attr_widths are set below */
*************** build_simple_rel(PlannerInfo *root, int
*** 146,151 ****
--- 154,164 ----
rel->baserestrict_min_security = UINT_MAX;
rel->joininfo = NIL;
rel->has_eclass_joins = false;
+ rel->part_scheme = NULL;
+ rel->nparts = 0;
+ rel->boundinfo = NULL;
+ rel->partexprs = NULL;
+ rel->part_rels = NULL;
/*
* Pass top parent's relids down the inheritance hierarchy. If the parent
*************** build_simple_rel(PlannerInfo *root, int
*** 218,237 ****
if (rte->inh)
{
ListCell *l;
foreach(l, root->append_rel_list)
{
AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);
/* append_rel_list contains all append rels; ignore others */
if (appinfo->parent_relid != relid)
continue;
! (void) build_simple_rel(root, appinfo->child_relid,
! rel);
}
}
return rel;
}
--- 231,293 ----
if (rte->inh)
{
ListCell *l;
+ int nparts = rel->nparts;
+
+ if (nparts > 0)
+ rel->part_rels = (RelOptInfo **) palloc0(sizeof(RelOptInfo *) * nparts);
foreach(l, root->append_rel_list)
{
AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);
+ RelOptInfo *childrel;
+ int cnt_parts;
+ RangeTblEntry *childRTE;
/* append_rel_list contains all append rels; ignore others */
if (appinfo->parent_relid != relid)
continue;
! childrel = build_simple_rel(root, appinfo->child_relid,
! rel);
!
! /* Nothing more to do for an unpartitioned table. */
! if (!rel->part_scheme)
! continue;
!
! childRTE = root->simple_rte_array[appinfo->child_relid];
! /*
! * Two partitioned tables with the same partitioning scheme have
! * their partition bounds arranged in the same order. The order of
! * partition OIDs in RelOptInfo corresponds to the partition bound
! * order. Thus the OIDs of matching partitions from both the tables
! * are placed at the same position in the array of partition OIDs
! * in the respective RelOptInfos. Arranging RelOptInfos of
! * partitions in the same order as their OIDs makes it easy to find
! * the RelOptInfos of matching partitions for partition-wise join.
! */
! for (cnt_parts = 0; cnt_parts < nparts; cnt_parts++)
! {
! if (rel->part_oids[cnt_parts] == childRTE->relid)
! {
! Assert(!rel->part_rels[cnt_parts]);
! rel->part_rels[cnt_parts] = childrel;
! break;
! }
! }
}
}
+ /* Should have found all the childrels of a partitioned relation. */
+ if (rel->part_scheme)
+ {
+ int cnt_parts;
+
+ for (cnt_parts = 0; cnt_parts < rel->nparts; cnt_parts++)
+ if (!rel->part_rels[cnt_parts])
+ elog(ERROR, "could not find the RelOptInfo of a partition with oid %u",
+ rel->part_oids[cnt_parts]);
+ }
+
return rel;
}
*************** build_join_rel(PlannerInfo *root,
*** 453,458 ****
--- 509,517 ----
RelOptInfo *joinrel;
List *restrictlist;
+ /* This function should be used only for join between parents. */
+ Assert(!IS_OTHER_REL(outer_rel) && !IS_OTHER_REL(inner_rel));
+
/*
* See if we already have a joinrel for this set of base rels.
*/
*************** build_join_rel(PlannerInfo *root,
*** 497,502 ****
--- 556,562 ----
inner_rel->direct_lateral_relids);
joinrel->lateral_relids = min_join_parameterization(root, joinrel->relids,
outer_rel, inner_rel);
+ joinrel->gpi = NULL;
joinrel->relid = 0; /* indicates not a baserel */
joinrel->rtekind = RTE_JOIN;
joinrel->min_attr = 0;
*************** build_join_rel(PlannerInfo *root,
*** 527,532 ****
--- 587,597 ----
joinrel->joininfo = NIL;
joinrel->has_eclass_joins = false;
joinrel->top_parent_relids = NULL;
+ joinrel->part_scheme = NULL;
+ joinrel->nparts = 0;
+ joinrel->boundinfo = NULL;
+ joinrel->partexprs = NULL;
+ joinrel->part_rels = NULL;
/* Compute information relevant to the foreign relations. */
set_foreign_rel_properties(joinrel, outer_rel, inner_rel);
*************** build_join_rel(PlannerInfo *root,
*** 539,548 ****
* and inner rels we first try to build it from. But the contents should
* be the same regardless.
*/
! build_joinrel_tlist(root, joinrel, outer_rel);
! build_joinrel_tlist(root, joinrel, inner_rel);
add_placeholders_to_joinrel(root, joinrel, outer_rel, inner_rel);
/*
* add_placeholders_to_joinrel also took care of adding the ph_lateral
* sets of any PlaceHolderVars computed here to direct_lateral_relids, so
--- 604,620 ----
* and inner rels we first try to build it from. But the contents should
* be the same regardless.
*/
! build_joinrel_tlist(root, joinrel, outer_rel, false);
! build_joinrel_tlist(root, joinrel, inner_rel, false);
add_placeholders_to_joinrel(root, joinrel, outer_rel, inner_rel);
+ /* Try to build grouped target. */
+ /*
+ * TODO Consider if placeholders make sense here. If not, also make the
+ * related code below conditional.
+ */
+ prepare_rel_for_grouping(root, joinrel);
+
/*
* add_placeholders_to_joinrel also took care of adding the ph_lateral
* sets of any PlaceHolderVars computed here to direct_lateral_relids, so
*************** build_join_rel(PlannerInfo *root,
*** 572,577 ****
--- 644,653 ----
*/
joinrel->has_eclass_joins = has_relevant_eclass_joinclause(root, joinrel);
+ /* Store the partition information. */
+ build_joinrel_partition_info(joinrel, outer_rel, inner_rel, restrictlist,
+ sjinfo->jointype);
+
/*
* Set estimates of the joinrel's size.
*/
*************** build_join_rel(PlannerInfo *root,
*** 617,622 ****
--- 693,845 ----
return joinrel;
}
+ /*
+ * build_child_join_rel
+ * Builds RelOptInfo for joining given two child relations from RelOptInfo
+ * representing the join between their parents.
+ *
+ * 'outer_rel' and 'inner_rel' are the RelOptInfos of child relations being
+ * joined.
+ * 'parent_joinrel' is the RelOptInfo representing the join between parent
+ * relations. Most of the members of new RelOptInfo are produced by
+ * translating corresponding members of this RelOptInfo.
+ * 'sjinfo': context info for child join
+ * 'restrictlist': list of RestrictInfo nodes that apply to this particular
+ * pair of joinable relations.
+ * 'jointype': the join type, passed on to build_joinrel_partition_info when
+ * deciding whether the child join is itself partitioned.
+ */
+ RelOptInfo *
+ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
+ RelOptInfo *inner_rel, RelOptInfo *parent_joinrel,
+ List *restrictlist, SpecialJoinInfo *sjinfo,
+ JoinType jointype)
+ {
+ RelOptInfo *joinrel = makeNode(RelOptInfo);
+ AppendRelInfo **appinfos;
+ int nappinfos;
+
+ /* Only joins between other relations land here. */
+ Assert(IS_OTHER_REL(outer_rel) && IS_OTHER_REL(inner_rel));
+
+ joinrel->reloptkind = RELOPT_OTHER_JOINREL;
+ joinrel->relids = bms_union(outer_rel->relids, inner_rel->relids);
+ joinrel->rows = 0;
+ /* cheap startup cost is interesting iff not all tuples to be retrieved */
+ joinrel->consider_startup = (root->tuple_fraction > 0);
+ joinrel->consider_param_startup = false;
+ joinrel->consider_parallel = false;
+ joinrel->reltarget = create_empty_pathtarget();
+ joinrel->pathlist = NIL;
+ joinrel->ppilist = NIL;
+ joinrel->partial_pathlist = NIL;
+ joinrel->cheapest_startup_path = NULL;
+ joinrel->cheapest_total_path = NULL;
+ joinrel->cheapest_unique_path = NULL;
+ joinrel->cheapest_parameterized_paths = NIL;
+ joinrel->direct_lateral_relids = NULL;
+ joinrel->lateral_relids = NULL;
+ joinrel->gpi = makeNode(GroupedPathInfo);
+ if (parent_joinrel->gpi)
+ /*
+ * Translation into child varnos will take place along with other
+ * translations, see try_partition_wise_join.
+ */
+ joinrel->gpi->target = copy_pathtarget(parent_joinrel->gpi->target);
+ joinrel->relid = 0; /* indicates not a baserel */
+ joinrel->rtekind = RTE_JOIN;
+ joinrel->min_attr = 0;
+ joinrel->max_attr = 0;
+ joinrel->attr_needed = NULL;
+ joinrel->attr_widths = NULL;
+ joinrel->lateral_vars = NIL;
+ joinrel->lateral_referencers = NULL;
+ joinrel->indexlist = NIL;
+ joinrel->pages = 0;
+ joinrel->tuples = 0;
+ joinrel->allvisfrac = 0;
+ joinrel->subroot = NULL;
+ joinrel->subplan_params = NIL;
+ joinrel->serverid = InvalidOid;
+ joinrel->userid = InvalidOid;
+ joinrel->useridiscurrent = false;
+ joinrel->fdwroutine = NULL;
+ joinrel->fdw_private = NULL;
+ joinrel->baserestrictinfo = NIL;
+ joinrel->baserestrictcost.startup = 0;
+ joinrel->baserestrictcost.per_tuple = 0;
+ joinrel->joininfo = NIL;
+ joinrel->has_eclass_joins = false;
+ joinrel->top_parent_relids = NULL;
+ joinrel->part_scheme = NULL;
+ joinrel->part_rels = NULL;
+ joinrel->partexprs = NULL;
+
+ joinrel->top_parent_relids = bms_union(outer_rel->top_parent_relids,
+ inner_rel->top_parent_relids);
+
+ /* Compute information relevant to foreign relations. */
+ set_foreign_rel_properties(joinrel, outer_rel, inner_rel);
+
+ /* Build targetlist */
+ build_joinrel_tlist(root, joinrel, outer_rel, false);
+ build_joinrel_tlist(root, joinrel, inner_rel, false);
+ /* Add placeholder variables. */
+ add_placeholders_to_child_joinrel(root, joinrel, parent_joinrel);
+
+ /* Try to build grouped target. */
+ /*
+ * TODO Consider if placeholders make sense here. If not, also make the
+ * related code below conditional.
+ */
+ prepare_rel_for_grouping(root, joinrel);
+
+
+ /* Construct joininfo list. */
+ appinfos = find_appinfos_by_relids(root, joinrel->relids, &nappinfos);
+ joinrel->joininfo = (List *) adjust_appendrel_attrs(root,
+ (Node *) parent_joinrel->joininfo,
+ nappinfos,
+ appinfos);
+ pfree(appinfos);
+
+ /*
+ * Lateral relids referenced in the child join are the same as those
+ * referenced in the parent relation. Throw away any partial result computed
+ * while building the targetlist.
+ */
+ bms_free(joinrel->direct_lateral_relids);
+ bms_free(joinrel->lateral_relids);
+ joinrel->direct_lateral_relids = (Relids) bms_copy(parent_joinrel->direct_lateral_relids);
+ joinrel->lateral_relids = (Relids) bms_copy(parent_joinrel->lateral_relids);
+
+ /*
+ * If the parent joinrel has pending equivalence classes, so does the
+ * child.
+ */
+ joinrel->has_eclass_joins = parent_joinrel->has_eclass_joins;
+
+ /* Is the join between partitions itself partitioned? */
+ build_joinrel_partition_info(joinrel, outer_rel, inner_rel, restrictlist,
+ jointype);
+
+ /* Child joinrel is parallel safe if parent is parallel safe. */
+ joinrel->consider_parallel = parent_joinrel->consider_parallel;
+
+
+ /* Set estimates of the child-joinrel's size. */
+ set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
+ sjinfo, restrictlist);
+
+ /* We build the join only once. */
+ Assert(!find_join_rel(root, joinrel->relids));
+
+ /* Add the relation to the PlannerInfo. */
+ add_join_rel(root, joinrel);
+
+ return joinrel;
+ }
+
/*
* min_join_parameterization
*
*************** min_join_parameterization(PlannerInfo *r
*** 670,679 ****
*/
static void
build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *input_rel)
{
! Relids relids = joinrel->relids;
ListCell *vars;
foreach(vars, input_rel->reltarget->exprs)
{
--- 893,932 ----
*/
static void
build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
! RelOptInfo *input_rel, bool grouped)
{
! Relids relids;
! PathTarget *input_target, *result;
ListCell *vars;
+ int i = -1;
+
+ /* attrs_needed refers to parent relids and not those of a child. */
+ if (joinrel->top_parent_relids)
+ relids = joinrel->top_parent_relids;
+ else
+ relids = joinrel->relids;
+
+ if (!grouped)
+ {
+ input_target = input_rel->reltarget;
+ result = joinrel->reltarget;
+ }
+ else
+ {
+ if (input_rel->gpi != NULL)
+ {
+ input_target = input_rel->gpi->target;
+ Assert(input_target != NULL);
+ }
+ else
+ input_target = input_rel->reltarget;
+
+ /* Caller should have initialized this. */
+ Assert(joinrel->gpi != NULL);
+
+ /* Default to the plain target. */
+ result = joinrel->gpi->target;
+ }
foreach(vars, input_rel->reltarget->exprs)
{
*************** build_joinrel_tlist(PlannerInfo *root, R
*** 690,713 ****
/*
* Otherwise, anything in a baserel or joinrel targetlist ought to be
! * a Var. (More general cases can only appear in appendrel child
! * rels, which will never be seen here.)
*/
! if (!IsA(var, Var))
elog(ERROR, "unexpected node type in rel targetlist: %d",
(int) nodeTag(var));
- /* Get the Var's original base rel */
- baserel = find_base_rel(root, var->varno);
-
- /* Is it still needed above this joinrel? */
- ndx = var->varattno - baserel->min_attr;
if (bms_nonempty_difference(baserel->attr_needed[ndx], relids))
{
/* Yup, add it to the output */
! joinrel->reltarget->exprs = lappend(joinrel->reltarget->exprs, var);
! /* Vars have cost zero, so no need to adjust reltarget->cost */
! joinrel->reltarget->width += baserel->attr_widths[ndx];
}
}
}
--- 943,1009 ----
/*
* Otherwise, anything in a baserel or joinrel targetlist ought to be
! * a Var or ConvertRowtypeExpr introduced while translating parent
! * targetlist to that of the child.
*/
! if (IsA(var, Var))
! {
! /* Get the Var's original base rel */
! baserel = find_base_rel(root, var->varno);
!
! /* Is it still needed above this joinrel? */
! ndx = var->varattno - baserel->min_attr;
! }
! else if (IsA(var, ConvertRowtypeExpr))
! {
! ConvertRowtypeExpr *child_expr = (ConvertRowtypeExpr *) var;
! Var *childvar = (Var *) child_expr->arg;
!
! /*
! * Child's whole-row references are converted to that of parent
! * using ConvertRowtypeExpr. There can be as many
! * ConvertRowtypeExpr decorations as the depth of partition tree.
! * The argument to deepest ConvertRowtypeExpr is expected to be a
! * whole-row reference of the child.
! */
! while (IsA(childvar, ConvertRowtypeExpr))
! {
! child_expr = (ConvertRowtypeExpr *) childvar;
! childvar = (Var *) child_expr->arg;
! }
! Assert(IsA(childvar, Var) && childvar->varattno == 0);
!
! baserel = find_base_rel(root, childvar->varno);
! ndx = 0 - baserel->min_attr;
! }
! else
elog(ERROR, "unexpected node type in rel targetlist: %d",
(int) nodeTag(var));
if (bms_nonempty_difference(baserel->attr_needed[ndx], relids))
{
+ Index sortgroupref = 0;
+
/* Yup, add it to the output */
! if (input_target->sortgrouprefs)
! sortgroupref = input_target->sortgrouprefs[i];
!
! /*
! * Even if not used for grouping in the input path (the input path
! * is not necessarily grouped), it might be useful for grouping
! * higher in the join tree.
! */
! if (sortgroupref == 0)
! sortgroupref = get_expr_sortgroupref(root, (Expr *) var);
!
! add_column_to_pathtarget(result, (Expr *) var, sortgroupref);
!
! /*
! * Vars have cost zero, so no need to adjust reltarget->cost. Even
! * if it's a ConvertRowtypeExpr, it will be computed only for the
! * base relation, costing nothing for a join.
! */
! result->width += baserel->attr_widths[ndx];
}
}
}
*************** subbuild_joinrel_joinlist(RelOptInfo *jo
*** 843,848 ****
--- 1139,1147 ----
{
ListCell *l;
+ /* Expected to be called only for join between parent relations. */
+ Assert(joinrel->reloptkind == RELOPT_JOINREL);
+
foreach(l, joininfo_list)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(l);
*************** get_baserel_parampathinfo(PlannerInfo *r
*** 1048,1059 ****
Assert(!bms_overlap(baserel->relids, required_outer));
/* If we already have a PPI for this parameterization, just return it */
! foreach(lc, baserel->ppilist)
! {
! ppi = (ParamPathInfo *) lfirst(lc);
! if (bms_equal(ppi->ppi_req_outer, required_outer))
! return ppi;
! }
/*
* Identify all joinclauses that are movable to this base rel given this
--- 1347,1354 ----
Assert(!bms_overlap(baserel->relids, required_outer));
/* If we already have a PPI for this parameterization, just return it */
! if ((ppi = find_param_path_info(baserel, required_outer)))
! return ppi;
/*
* Identify all joinclauses that are movable to this base rel given this
*************** get_baserel_parampathinfo(PlannerInfo *r
*** 1095,1100 ****
--- 1390,1545 ----
}
/*
+ * If the relation can produce grouped paths, create GroupedPathInfo for it
+ * and create target for the grouped paths.
+ */
+ void
+ prepare_rel_for_grouping(PlannerInfo *root, RelOptInfo *rel)
+ {
+ List *rel_aggregates;
+ Relids rel_agg_attrs = NULL;
+ List *rel_agg_vars = NIL;
+ bool found_higher;
+ ListCell *lc;
+ PathTarget *target_grouped;
+
+ if (rel->relid > 0)
+ {
+ RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+ /*
+ * rtekind != RTE_RELATION case is not supported yet.
+ */
+ if (rte->rtekind != RTE_RELATION)
+ return;
+ }
+
+ /* Caller should only pass base relations or joins. */
+ Assert(rel->reloptkind == RELOPT_BASEREL ||
+ rel->reloptkind == RELOPT_JOINREL ||
+ rel->reloptkind == RELOPT_OTHER_JOINREL);
+
+ /*
+ * If any outer join can set the attribute value to NULL, the aggregate
+ * would receive different input at the base rel level.
+ *
+ * TODO For RELOPT_JOINREL, do not return if all the joins that can set
+ * any entry of the grouped target (do we need to postpone this check
+ * until the grouped target is available, and should create_grouped_target
+ * take care?) of this rel to NULL are provably below rel. (It's ok if rel
+ * is one of these joins.)
+ */
+ if (bms_overlap(rel->relids, root->nullable_baserels))
+ return;
+
+ /*
+ * Check if some aggregates can be evaluated in this relation's target,
+ * and collect all vars referenced by these aggregates.
+ */
+ rel_aggregates = NIL;
+ found_higher = false;
+ foreach(lc, root->grouped_var_list)
+ {
+ GroupedVarInfo *gvi = castNode(GroupedVarInfo, lfirst(lc));
+
+ /*
+ * An empty (uninitialized) gv_eval_at is a subset of any relids set; this
+ * typically means Aggref.aggstar.
+ */
+ if (bms_is_subset(gvi->gv_eval_at, rel->relids))
+ {
+ Aggref *aggref = castNode(Aggref, gvi->gvexpr);
+
+ /*
+ * Accept the aggregate.
+ *
+ * GroupedVarInfo is more convenient for the next processing than
+ * Aggref, see add_aggregates_to_grouped_target.
+ */
+ rel_aggregates = lappend(rel_aggregates, gvi);
+
+ if (rel->relid > 0)
+ {
+ /*
+ * Simple relation. Collect attributes referenced by the
+ * aggregate arguments.
+ */
+ pull_varattnos((Node *) aggref, rel->relid, &rel_agg_attrs);
+ }
+ else
+ {
+ List *agg_vars;
+
+ /*
+ * Join. Collect vars referenced by the aggregate
+ * arguments.
+ */
+ /*
+ * TODO Can any argument contain PHVs? And if so, does it matter?
+ * Consider PVC_INCLUDE_PLACEHOLDERS | PVC_RECURSE_PLACEHOLDERS.
+ */
+ agg_vars = pull_var_clause((Node *) aggref,
+ PVC_RECURSE_AGGREGATES);
+ rel_agg_vars = list_concat(rel_agg_vars, agg_vars);
+ }
+ }
+ else if (bms_overlap(gvi->gv_eval_at, rel->relids))
+ {
+ /*
+ * Remember that there is at least one aggregate that needs more
+ * than this rel.
+ */
+ found_higher = true;
+ }
+ }
+
+ /*
+ * Grouping makes little sense w/o aggregate function.
+ */
+ if (rel_aggregates == NIL)
+ {
+ bms_free(rel_agg_attrs);
+ return;
+ }
+
+ if (found_higher)
+ {
+ /*
+ * If some aggregate(s) need only this rel but some others need
+ * multiple relations including the current one, grouping of the
+ * current rel could steal some input variables from the "higher
+ * aggregate" (besides decreasing the number of input rows).
+ */
+ list_free(rel_aggregates);
+ bms_free(rel_agg_attrs);
+ return;
+ }
+
+ /*
+ * If rel->reltarget can be used for aggregation, mark the relation as
+ * capable of grouping.
+ */
+ Assert(rel->gpi == NULL);
+ target_grouped = create_grouped_target(root, rel, rel_agg_attrs,
+ rel_agg_vars);
+ if (target_grouped != NULL)
+ {
+ GroupedPathInfo *gpi;
+
+ gpi = makeNode(GroupedPathInfo);
+ gpi->target = copy_pathtarget(target_grouped);
+ gpi->pathlist = NIL;
+ gpi->partial_pathlist = NIL;
+ rel->gpi = gpi;
+
+ /*
+ * Add aggregates (in the form of GroupedVar) to the target.
+ */
+ add_aggregates_to_target(root, gpi->target, rel_aggregates, rel);
+ }
+ }
+
+ /*
* get_joinrel_parampathinfo
* Get the ParamPathInfo for a parameterized path for a join relation,
* constructing one if we don't have one already.
*************** get_joinrel_parampathinfo(PlannerInfo *r
*** 1290,1301 ****
*restrict_clauses = list_concat(pclauses, *restrict_clauses);
/* If we already have a PPI for this parameterization, just return it */
! foreach(lc, joinrel->ppilist)
! {
! ppi = (ParamPathInfo *) lfirst(lc);
! if (bms_equal(ppi->ppi_req_outer, required_outer))
! return ppi;
! }
/* Estimate the number of rows returned by the parameterized join */
rows = get_parameterized_joinrel_size(root, joinrel,
--- 1735,1742 ----
*restrict_clauses = list_concat(pclauses, *restrict_clauses);
/* If we already have a PPI for this parameterization, just return it */
! if ((ppi = find_param_path_info(joinrel, required_outer)))
! return ppi;
/* Estimate the number of rows returned by the parameterized join */
rows = get_parameterized_joinrel_size(root, joinrel,
*************** ParamPathInfo *
*** 1334,1340 ****
get_appendrel_parampathinfo(RelOptInfo *appendrel, Relids required_outer)
{
ParamPathInfo *ppi;
- ListCell *lc;
/* Unparameterized paths have no ParamPathInfo */
if (bms_is_empty(required_outer))
--- 1775,1780 ----
*************** get_appendrel_parampathinfo(RelOptInfo *
*** 1343,1354 ****
Assert(!bms_overlap(appendrel->relids, required_outer));
/* If we already have a PPI for this parameterization, just return it */
! foreach(lc, appendrel->ppilist)
! {
! ppi = (ParamPathInfo *) lfirst(lc);
! if (bms_equal(ppi->ppi_req_outer, required_outer))
! return ppi;
! }
/* Else build the ParamPathInfo */
ppi = makeNode(ParamPathInfo);
--- 1783,1790 ----
Assert(!bms_overlap(appendrel->relids, required_outer));
/* If we already have a PPI for this parameterization, just return it */
! if ((ppi = find_param_path_info(appendrel, required_outer)))
! return ppi;
/* Else build the ParamPathInfo */
ppi = makeNode(ParamPathInfo);
*************** get_appendrel_parampathinfo(RelOptInfo *
*** 1359,1361 ****
--- 1795,1917 ----
return ppi;
}
+
+ /*
+ * Returns a ParamPathInfo for outer relations specified by required_outer, if
+ * already available in the given rel. Returns NULL otherwise.
+ */
+ ParamPathInfo *
+ find_param_path_info(RelOptInfo *rel, Relids required_outer)
+ {
+ ListCell *lc;
+
+ foreach(lc, rel->ppilist)
+ {
+ ParamPathInfo *ppi = (ParamPathInfo *) lfirst(lc);
+ if (bms_equal(ppi->ppi_req_outer, required_outer))
+ return ppi;
+ }
+
+ return NULL;
+ }
+
+ /*
+ * build_joinrel_partition_info
+ * If the join between given partitioned relations is possibly partitioned
+ * set the partitioning scheme and partition keys expressions for the
+ * join.
+ *
+ * If the two relations have same partitioning scheme, their join may be
+ * partitioned and will follow the same partitioning scheme as the joining
+ * relations.
+ */
+ static void
+ build_joinrel_partition_info(RelOptInfo *joinrel, RelOptInfo *outer_rel,
+ RelOptInfo *inner_rel, List *restrictlist,
+ JoinType jointype)
+ {
+ int num_pks;
+ int cnt;
+ bool is_strict;
+
+ /* Nothing to do if partition-wise join technique is disabled. */
+ if (!enable_partition_wise_join)
+ {
+ joinrel->part_scheme = NULL;
+ return;
+ }
+
+ /*
+ * The join is not partitioned, if any of the relations being joined are
+ * not partitioned or they do not have same partitioning scheme or if there
+ * is no equi-join between partition keys.
+ *
+ * For an N-way inner join, where every syntactic inner join has equi-join
+ * between partition keys and a matching partitioning scheme, partition
+ * keys of N relations form an equivalence class, thus inducing an
+ * equi-join between any pair of joining relations.
+ *
+ * For an N-way join with outer joins, where every syntactic join has an
+ * equi-join between partition keys and a matching partitioning scheme,
+ * outer join reordering identities in optimizer/README imply that only
+ * those pairs of join are legal which have an equi-join between partition
+ * keys. Thus every pair of joining relations we see here should have an
+ * equi-join if this join has been deemed as a partitioned join.
+ */
+ if (!outer_rel->part_scheme || !inner_rel->part_scheme ||
+ outer_rel->part_scheme != inner_rel->part_scheme ||
+ !have_partkey_equi_join(outer_rel, inner_rel, jointype, restrictlist,
+ &is_strict))
+ {
+ joinrel->part_scheme = NULL;
+ return;
+ }
+
+ /*
+ * This function will be called only once for each joinrel, hence it should
+ * not have partition scheme, partition key expressions and array for
+ * storing child relations set.
+ */
+ Assert(!joinrel->part_scheme && !joinrel->partexprs &&
+ !joinrel->part_rels);
+
+ /*
+ * Join relation is partitioned using same partitioning scheme as the
+ * joining relations.
+ */
+ joinrel->part_scheme = outer_rel->part_scheme;
+ num_pks = joinrel->part_scheme->partnatts;
+
+ /*
+ * Construct partition keys for the join.
+ *
+ * An INNER join between two partitioned relations is partitioned by key
+ * expressions from both the relations. For tables A and B partitioned by a
+ * and b respectively, (A INNER JOIN B ON A.a = B.b) is partitioned by both
+ * A.a and B.b.
+ *
+ * An OUTER join like (A LEFT JOIN B ON A.a = B.b) may produce rows with
+ * B.b NULL. These rows may not fit the partitioning conditions imposed on
+ * B.b. Hence, strictly speaking, the join is not partitioned by B.b.
+ * Strictly speaking, partition keys of an OUTER join should include
+ * partition key expressions from the OUTER side only. Consider a join like
+ * (A LEFT JOIN B ON A.a = B.b) LEFT JOIN C ON B.b = C.c. If we do not
+ * include B.b as partition key expression for (AB), it prohibits us from
+ * using partition-wise join when joining (AB) with C as there is no
+ * equi-join between partition keys of joining relations. If the equality
+ * operator is strict, two NULL values are never equal and no two rows from
+ * mis-matching partitions can join. Hence if the equality operator is
+ * strict it's safe to include B.b as partition key expression for (AB),
+ * even though rows in (AB) are not strictly partitioned by B.b.
+ */
+ joinrel->partexprs = (List **) palloc0(sizeof(List *) * num_pks);
+ for (cnt = 0; cnt < num_pks; cnt++)
+ {
+ List *pkexpr = list_copy(outer_rel->partexprs[cnt]);
+
+ if (jointype == JOIN_INNER || is_strict)
+ pkexpr = list_concat(pkexpr,
+ list_copy(inner_rel->partexprs[cnt]));
+ joinrel->partexprs[cnt] = pkexpr;
+ }
+ }
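A note on the partition key rule in build_joinrel_partition_info() above: the inner side's key expressions are kept only for an INNER join or when the equality operator is strict. Below is a minimal standalone sketch of that rule, not part of the patch. All Sketch* names, the fixed-size array and the string stand-ins for expressions are assumptions made for illustration; the patch itself works on List nodes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for JoinType; not the PostgreSQL definitions. */
typedef enum
{
	SKETCH_JOIN_INNER,
	SKETCH_JOIN_LEFT
} SketchJoinType;

#define SKETCH_MAX_EXPRS 8

/* Key expressions for one partition key column, shown here as strings. */
typedef struct
{
	const char *exprs[SKETCH_MAX_EXPRS];
	int			nexprs;
} SketchKeyExprs;

/*
 * Build the key expressions of the join for one partition key column.
 * The outer side is always included; the inner side is appended only for
 * an inner join or when the equality operator is strict.
 */
static SketchKeyExprs
sketch_join_key_exprs(const SketchKeyExprs *outer,
					  const SketchKeyExprs *inner,
					  SketchJoinType jointype, bool is_strict)
{
	SketchKeyExprs result = *outer;
	int			i;

	if (jointype == SKETCH_JOIN_INNER || is_strict)
	{
		assert(result.nexprs + inner->nexprs <= SKETCH_MAX_EXPRS);
		for (i = 0; i < inner->nexprs; i++)
			result.exprs[result.nexprs++] = inner->exprs[i];
	}

	return result;
}
```

With A partitioned by A.a and B by B.b, the inner join yields two key expressions, a left join with a non-strict operator yields only A.a, and a left join with a strict operator yields both again.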
diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c
new file mode 100644
index 0952385..dd962b7
*** a/src/backend/optimizer/util/tlist.c
--- b/src/backend/optimizer/util/tlist.c
*************** get_sortgrouplist_exprs(List *sgClauses,
*** 408,413 ****
--- 408,487 ----
return result;
}
+ /*
+ * get_grouping_expressions
+ *
+ * Given a "grouped target" (i.e. target where each non-GroupedVar
+ * element must have sortgroupref set), build a list of the referencing
+ * SortGroupClauses, a list of the corresponding grouping expressions and
+ * a list of aggregate expressions.
+ */
+ /* Refine the function name. */
+ void
+ get_grouping_expressions(PlannerInfo *root, PathTarget *target,
+ List **grouping_clauses, List **grouping_exprs,
+ List **agg_exprs)
+ {
+ ListCell *l;
+ int i = 0;
+
+ foreach(l, target->exprs)
+ {
+ Index sortgroupref = 0;
+ SortGroupClause *cl;
+ Expr *texpr;
+
+ texpr = (Expr *) lfirst(l);
+
+ /* The target should contain at least one grouping column. */
+ Assert(target->sortgrouprefs != NULL);
+
+ if (IsA(texpr, GroupedVar))
+ {
+ /*
+ * texpr should represent the first aggregate in the targetlist.
+ */
+ break;
+ }
+
+ /*
+ * Find the clause by sortgroupref.
+ */
+ sortgroupref = target->sortgrouprefs[i++];
+
+ /*
+ * Besides aggregates, the target should contain no expressions w/o
+ * sortgroupref. Plain relation being joined to grouped can have
+ * sortgroupref equal to zero for expressions contained neither in
+ * grouping expression nor in aggregate arguments, but if the target
+ * contains such an expression, it shouldn't be used for aggregation
+ * --- see can_aggregate field of GroupedPathInfo.
+ */
+ Assert(sortgroupref > 0);
+
+ cl = get_sortgroupref_clause(sortgroupref, root->parse->groupClause);
+ *grouping_clauses = list_append_unique(*grouping_clauses, cl);
+
+ /*
+ * Add only unique clauses because of joins (both sides of a join can
+ * point at the same grouping clause). XXX Is it worth adding a bool
+ * argument indicating that we're dealing with join right now?
+ */
+ *grouping_exprs = list_append_unique(*grouping_exprs, texpr);
+ }
+
+ /* Now collect the aggregates. */
+ while (l != NULL)
+ {
+ GroupedVar *gvar = castNode(GroupedVar, lfirst(l));
+
+ /* Currently, GroupedVarInfo can only represent aggregate. */
+ Assert(gvar->agg_partial != NULL);
+ *agg_exprs = lappend(*agg_exprs, gvar->agg_partial);
+ l = lnext(l);
+ }
+ }
+
/*****************************************************************************
* Functions to extract data from a list of SortGroupClauses
*************** apply_pathtarget_labeling_to_tlist(List
*** 783,788 ****
--- 857,1081 ----
}
/*
+ * Replace each "grouped var" in the source targetlist with the original
+ * expression.
+ *
+ * TODO Think of more suitable name. Although "grouped var" may substitute for
+ * grouping expressions in the future, currently Aggref is the only outcome of
+ * the replacement. undo_grouped_var_substitutions?
+ */
+ List *
+ restore_grouping_expressions(PlannerInfo *root, List *src)
+ {
+ List *result = NIL;
+ ListCell *l;
+
+ foreach(l, src)
+ {
+ TargetEntry *te, *te_new;
+ Aggref *expr_new = NULL;
+
+ te = castNode(TargetEntry, lfirst(l));
+
+ if (IsA(te->expr, GroupedVar))
+ {
+ GroupedVar *gvar;
+
+ gvar = castNode(GroupedVar, te->expr);
+ expr_new = gvar->agg_partial;
+ }
+
+ if (expr_new != NULL)
+ {
+ te_new = flatCopyTargetEntry(te);
+ te_new->expr = (Expr *) expr_new;
+ }
+ else
+ te_new = te;
+ result = lappend(result, te_new);
+ }
+
+ return result;
+ }
+
+ /*
+ * For each aggregate, add a GroupedVar (which carries the partial Aggref) to
+ * the target.
+ *
+ * If the caller passes the aggregates, it must do so in the form of
+ * GroupedVarInfos so that we don't have to look for gvid. If NIL is passed,
+ * the function retrieves the suitable aggregates itself.
+ *
+ * List of the aggregates added is returned. This is only useful if the
+ * function had to retrieve the aggregates itself (i.e. NIL was passed for
+ * aggregates) -- caller is expected to do extra checks in that case (and to
+ * also free the list).
+ */
+ List *
+ add_aggregates_to_target(PlannerInfo *root, PathTarget *target,
+ List *aggregates, RelOptInfo *rel)
+ {
+ ListCell *lc;
+ GroupedVarInfo *gvi;
+
+ if (aggregates == NIL)
+ {
+ /* Caller should pass the aggregates for base relation. */
+ Assert(rel->reloptkind != RELOPT_BASEREL);
+
+ /* Collect all aggregates that this rel can evaluate. */
+ foreach(lc, root->grouped_var_list)
+ {
+ gvi = castNode(GroupedVarInfo, lfirst(lc));
+
+ /*
+ * Overlap alone is not a guarantee of correctness, but the caller needs
+ * to do additional checks, so we're optimistic here.
+ *
+ * If gv_eval_at is NULL, the underlying Aggref should have
+ * aggstar set.
+ */
+ if (bms_overlap(gvi->gv_eval_at, rel->relids) ||
+ gvi->gv_eval_at == NULL)
+ aggregates = lappend(aggregates, gvi);
+ }
+
+ if (aggregates == NIL)
+ return NIL;
+ }
+
+ /* Create the vars and add them to the target. */
+ foreach(lc, aggregates)
+ {
+ GroupedVar *gvar;
+
+ gvi = castNode(GroupedVarInfo, lfirst(lc));
+ gvar = makeNode(GroupedVar);
+ gvar->gvid = gvi->gvid;
+ gvar->gvexpr = gvi->gvexpr;
+ gvar->agg_partial = gvi->agg_partial;
+ add_new_column_to_pathtarget(target, (Expr *) gvar);
+ }
+
+ return aggregates;
+ }
+
+ /*
+ * Return ressortgroupref of the target entry that is either equal to the
+ * expression or exists in the same equivalence class.
+ */
+ Index
+ get_expr_sortgroupref(PlannerInfo *root, Expr *expr)
+ {
+ ListCell *lc;
+ Index sortgroupref;
+
+ /*
+ * First, check if the query group clause contains exactly this
+ * expression.
+ */
+ foreach(lc, root->processed_tlist)
+ {
+ TargetEntry *te = castNode(TargetEntry, lfirst(lc));
+
+ if (equal(expr, te->expr) && te->ressortgroupref > 0)
+ return te->ressortgroupref;
+ }
+
+ /*
+ * If exactly this expression is not there, check if a grouping clause
+ * exists that belongs to the same equivalence class as the expression.
+ */
+ foreach(lc, root->group_pathkeys)
+ {
+ PathKey *pk = castNode(PathKey, lfirst(lc));
+ EquivalenceClass *ec = pk->pk_eclass;
+ ListCell *lm;
+ EquivalenceMember *em;
+ Expr *em_expr = NULL;
+ Query *query = root->parse;
+
+ /*
+ * Single-member EC cannot provide us with additional expression.
+ */
+ if (list_length(ec->ec_members) < 2)
+ continue;
+
+ /* We need equality anywhere in the join tree. */
+ if (ec->ec_below_outer_join)
+ continue;
+
+ /*
+ * TODO Reconsider this restriction. As the grouping expression is
+ * only evaluated at the relation level (and only the result will be
+ * propagated to the final targetlist), volatile function might be
+ * o.k. Need to think what volatile EC exactly means.
+ */
+ if (ec->ec_has_volatile)
+ continue;
+
+ foreach(lm, ec->ec_members)
+ {
+ em = (EquivalenceMember *) lfirst(lm);
+
+ /* The EC has !ec_below_outer_join. */
+ Assert(!em->em_nullable_relids);
+ if (equal(em->em_expr, expr))
+ {
+ em_expr = (Expr *) em->em_expr;
+ break;
+ }
+ }
+
+ if (em_expr == NULL)
+ /* Go for the next EC. */
+ continue;
+
+ /*
+ * Find the corresponding SortGroupClause, which provides us with
+ * sortgroupref. (It can belong to any EC member.)
+ */
+ sortgroupref = 0;
+ foreach(lm, ec->ec_members)
+ {
+ ListCell *lsg;
+
+ em = (EquivalenceMember *) lfirst(lm);
+ foreach(lsg, query->groupClause)
+ {
+ SortGroupClause *sgc;
+ Expr *sgc_expr;
+
+ sgc = (SortGroupClause *) lfirst(lsg);
+ sgc_expr = (Expr *) get_sortgroupclause_expr(sgc,
+ query->targetList);
+ if (equal(em->em_expr, sgc_expr))
+ {
+ Assert(sgc->tleSortGroupRef > 0);
+ sortgroupref = sgc->tleSortGroupRef;
+ break;
+ }
+ }
+
+ if (sortgroupref > 0)
+ break;
+ }
+
+ /*
+ * Since we searched in group_pathkeys, at least one EM of this EC
+ * should correspond to a SortGroupClause, otherwise the EC could
+ * not exist at all.
+ */
+ Assert(sortgroupref > 0);
+
+ return sortgroupref;
+ }
+
+ /* No EC found in group_pathkeys. */
+ return 0;
+ }
+
+ /*
* split_pathtarget_at_srfs
* Split given PathTarget into multiple levels to position SRFs safely
*
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
new file mode 100644
index 184e5da..5e3c3b4
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
*************** get_rule_expr(Node *node, deparse_contex
*** 7559,7564 ****
--- 7559,7572 ----
get_agg_expr((Aggref *) node, context, (Aggref *) node);
break;
+ case T_GroupedVar:
+ {
+ GroupedVar *gvar = castNode(GroupedVar, node);
+
+ get_agg_expr(gvar->agg_partial, context, (Aggref *) gvar->gvexpr);
+ break;
+ }
+
case T_GroupingFunc:
{
GroupingFunc *gexpr = (GroupingFunc *) node;
*************** get_agg_combine_expr(Node *node, deparse
*** 8993,9002 ****
Aggref *aggref;
Aggref *original_aggref = private;
! if (!IsA(node, Aggref))
elog(ERROR, "combining Aggref does not point to an Aggref");
- aggref = (Aggref *) node;
get_agg_expr(aggref, context, original_aggref);
}
--- 9001,9018 ----
Aggref *aggref;
Aggref *original_aggref = private;
! if (IsA(node, Aggref))
! aggref = (Aggref *) node;
! else if (IsA(node, GroupedVar))
! {
! GroupedVar *gvar = castNode(GroupedVar, node);
!
! aggref = gvar->agg_partial;
! original_aggref = castNode(Aggref, gvar->gvexpr);
! }
! else
elog(ERROR, "combining Aggref does not point to an Aggref");
get_agg_expr(aggref, context, original_aggref);
}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
new file mode 100644
index a35b93b..78e24ea
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
***************
*** 114,119 ****
--- 114,120 ----
#include "catalog/pg_statistic_ext.h"
#include "catalog/pg_type.h"
#include "executor/executor.h"
+ #include "executor/nodeAgg.h"
#include "mb/pg_wchar.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
*************** estimate_hash_bucketsize(PlannerInfo *ro
*** 3705,3710 ****
--- 3706,3744 ----
return (Selectivity) estfract;
}
+ /*
+ * estimate_hashagg_tablesize
+ * estimate the number of bytes that a hash aggregate hashtable will
+ * require based on the agg_costs, path width and dNumGroups.
+ *
+ * XXX this may be over-estimating the size now that hashagg knows to omit
+ * unneeded columns from the hashtable. Also for mixed-mode grouping sets,
+ * grouping columns not in the hashed set are counted here even though hashagg
+ * won't store them. Is this a problem?
+ */
+ Size
+ estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
+ double dNumGroups)
+ {
+ Size hashentrysize;
+
+ /* Estimate per-hash-entry space at tuple width... */
+ hashentrysize = MAXALIGN(path->pathtarget->width) +
+ MAXALIGN(SizeofMinimalTupleHeader);
+
+ /* plus space for pass-by-ref transition values... */
+ hashentrysize += agg_costs->transitionSpace;
+ /* plus the per-hash-entry overhead */
+ hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
+
+ /*
+ * Note that this disregards the effect of fill-factor and growth policy
+ * of the hash-table. That's probably ok, given the default
+ * fill-factor is relatively high. It'd be hard to meaningfully factor in
+ * "double-in-size" growth policies here.
+ */
+ return hashentrysize * dNumGroups;
+ }
/*-------------------------------------------------------------------------
*
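For readers who want to check the arithmetic in estimate_hashagg_tablesize() above, here is a minimal standalone sketch, not part of the patch. The alignment and tuple header constants are assumed values for illustration only (the real ones come from the PostgreSQL headers), and the per-entry overhead is passed in as a plain number rather than computed by hash_agg_entry_size().

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; the real ones are platform-dependent. */
#define SKETCH_ALIGNOF		8	/* assumed maximum alignment */
#define SKETCH_TUPLE_HEADER	16	/* assumed minimal tuple header size */

/* Round len up to the next multiple of SKETCH_ALIGNOF (a power of two). */
static size_t
sketch_maxalign(size_t len)
{
	return (len + (SKETCH_ALIGNOF - 1)) & ~((size_t) (SKETCH_ALIGNOF - 1));
}

/*
 * Per-entry size is the aligned tuple width plus the aligned tuple header,
 * plus space for pass-by-ref transition values, plus per-entry overhead.
 * The table estimate is that size multiplied by the number of groups.
 */
static size_t
sketch_hashagg_tablesize(size_t tuple_width, size_t transition_space,
						 size_t entry_overhead, double num_groups)
{
	size_t		entry;

	assert(num_groups >= 0);

	entry = sketch_maxalign(tuple_width) +
		sketch_maxalign(SKETCH_TUPLE_HEADER);
	entry += transition_space;
	entry += entry_overhead;

	return (size_t) (entry * num_groups);
}
```

Under these assumed constants, a 13-byte tuple with no transition space and 24 bytes of per-entry overhead gives 16 + 16 + 0 + 24 = 56 bytes per entry, so 30 groups come to 1680 bytes. As the comment in the patch says, fill factor and growth policy are ignored.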
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
new file mode 100644
index 85c6b61..cf94ccc
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
*************** equalPartitionDescs(PartitionKey key, Pa
*** 1204,1210 ****
if (partdesc2->boundinfo == NULL)
return false;
! if (!partition_bounds_equal(key, partdesc1->boundinfo,
partdesc2->boundinfo))
return false;
}
--- 1204,1212 ----
if (partdesc2->boundinfo == NULL)
return false;
! if (!partition_bounds_equal(key->partnatts, key->parttyplen,
! key->parttypbyval,
! partdesc1->boundinfo,
partdesc2->boundinfo))
return false;
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
new file mode 100644
index a414fb2..343986d
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** static struct config_bool ConfigureNames
*** 914,919 ****
--- 914,928 ----
true,
NULL, NULL, NULL
},
+ {
+ {"enable_partition_wise_join", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables partition-wise join."),
+ NULL
+ },
+ &enable_partition_wise_join,
+ false,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
new file mode 100644
index 421644c..e51bca1
*** a/src/include/catalog/partition.h
--- b/src/include/catalog/partition.h
*************** typedef struct PartitionDispatchData
*** 71,78 ****
typedef struct PartitionDispatchData *PartitionDispatch;
extern void RelationBuildPartitionDesc(Relation relation);
! extern bool partition_bounds_equal(PartitionKey key,
! PartitionBoundInfo p1, PartitionBoundInfo p2);
extern void check_new_partition_bound(char *relname, Relation parent, Node *bound);
extern Oid get_partition_parent(Oid relid);
--- 71,79 ----
typedef struct PartitionDispatchData *PartitionDispatch;
extern void RelationBuildPartitionDesc(Relation relation);
! extern bool partition_bounds_equal(int partnatts, int16 *parttyplen,
! bool *parttypbyval, PartitionBoundInfo b1,
! PartitionBoundInfo b2);
extern void check_new_partition_bound(char *relname, Relation parent, Node *bound);
extern Oid get_partition_parent(Oid relid);
diff --git a/src/include/foreign/fdwapi.h b/src/include/foreign/fdwapi.h
new file mode 100644
index 6ca44f7..c57ff7b
*** a/src/include/foreign/fdwapi.h
--- b/src/include/foreign/fdwapi.h
*************** typedef void (*ShutdownForeignScan_funct
*** 155,160 ****
--- 155,163 ----
typedef bool (*IsForeignScanParallelSafe_function) (PlannerInfo *root,
RelOptInfo *rel,
RangeTblEntry *rte);
+ typedef List *(*ReparameterizeForeignPathByChild_function) (PlannerInfo *root,
+ List *fdw_private,
+ RelOptInfo *child_rel);
/*
* FdwRoutine is the struct returned by a foreign-data wrapper's handler
*************** typedef struct FdwRoutine
*** 226,231 ****
--- 229,237 ----
InitializeDSMForeignScan_function InitializeDSMForeignScan;
InitializeWorkerForeignScan_function InitializeWorkerForeignScan;
ShutdownForeignScan_function ShutdownForeignScan;
+
+ /* Support functions for path reparameterization. */
+ ReparameterizeForeignPathByChild_function ReparameterizeForeignPathByChild;
} FdwRoutine;
diff --git a/src/include/nodes/extensible.h b/src/include/nodes/extensible.h
new file mode 100644
index 0b02cc1..1c802ad
*** a/src/include/nodes/extensible.h
--- b/src/include/nodes/extensible.h
*************** typedef struct CustomPathMethods
*** 96,101 ****
--- 96,104 ----
List *tlist,
List *clauses,
List *custom_plans);
+ struct List *(*ReparameterizeCustomPathByChild) (PlannerInfo *root,
+ List *custom_private,
+ RelOptInfo *child_rel);
} CustomPathMethods;
/*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
new file mode 100644
index f59d719..ba1eac8
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
*************** typedef enum NodeTag
*** 218,223 ****
--- 218,224 ----
T_IndexOptInfo,
T_ForeignKeyOptInfo,
T_ParamPathInfo,
+ T_GroupedPathInfo,
T_Path,
T_IndexPath,
T_BitmapHeapPath,
*************** typedef enum NodeTag
*** 258,267 ****
--- 259,270 ----
T_PathTarget,
T_RestrictInfo,
T_PlaceHolderVar,
+ T_GroupedVar,
T_SpecialJoinInfo,
T_AppendRelInfo,
T_PartitionedChildRelInfo,
T_PlaceHolderInfo,
+ T_GroupedVarInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
T_RollupData,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
new file mode 100644
index 7a8e2fd..b576dd5
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
***************
*** 15,20 ****
--- 15,21 ----
#define RELATION_H
#include "access/sdir.h"
+ #include "catalog/partition.h"
#include "lib/stringinfo.h"
#include "nodes/params.h"
#include "nodes/parsenodes.h"
*************** typedef struct PlannerInfo
*** 256,261 ****
--- 257,264 ----
List *placeholder_list; /* list of PlaceHolderInfos */
+ List *grouped_var_list; /* List of GroupedVarInfos. */
+
List *fkey_list; /* list of ForeignKeyOptInfos */
List *query_pathkeys; /* desired pathkeys for query_planner() */
*************** typedef struct PlannerInfo
*** 265,270 ****
--- 268,276 ----
List *distinct_pathkeys; /* distinctClause pathkeys, if any */
List *sort_pathkeys; /* sortClause pathkeys, if any */
+ List *part_schemes; /* Canonicalised partition schemes
+ * used in the query. */
+
List *initial_rels; /* RelOptInfos we are now trying to join */
/* Use fetch_upper_rel() to get any particular upper rel */
*************** typedef struct PlannerInfo
*** 325,330 ****
--- 331,362 ----
((root)->simple_rte_array ? (root)->simple_rte_array[rti] : \
rt_fetch(rti, (root)->parse->rtable))
+ /*
+ * Partitioning scheme
+ * Structure to hold partitioning scheme for a given relation.
+ *
+ * Multiple relations may be partitioned in the same way. The relations
+ * resulting from joining such relations may be partitioned in the same way as
+ * the joining relations. Similarly, relations derived from such relations by
+ * grouping, sorting may be partitioned in the same way as the underlying
+ * scan relations. All such relations partitioned in the same way share the
+ * partitioning scheme.
+ *
+ * PlannerInfo stores a list of distinct "canonical" partitioning schemes.
+ * RelOptInfo of a partitioned relation holds the pointer to "canonical"
+ * partitioning scheme.
+ */
+ typedef struct PartitionSchemeData
+ {
+ char strategy; /* partition strategy */
+ int16 partnatts; /* number of partition attributes */
+ Oid *partopfamily; /* OIDs of operator families */
+ Oid *partopcintype; /* OIDs of opclass declared input data types */
+ FmgrInfo *partsupfunc; /* lookup info for support funcs */
+ Oid *parttypcoll; /* OIDs of collations of partition keys. */
+ } PartitionSchemeData;
+
+ typedef struct PartitionSchemeData *PartitionScheme;
/*----------
* RelOptInfo
*************** typedef struct PlannerInfo
*** 359,364 ****
--- 391,401 ----
* handling join alias Vars. Currently this is not needed because all join
* alias Vars are expanded to non-aliased form during preprocess_expression.
*
+ * We also have relations representing joins between child relations of
+ * different partitioned tables. These relations are not added to
+ * join_rel_level lists as they are not joined directly by the dynamic
+ * programming algorithm.
+ *
* There is also a RelOptKind for "upper" relations, which are RelOptInfos
* that describe post-scan/join processing steps, such as aggregation.
* Many of the fields in these RelOptInfos are meaningless, but their Path
*************** typedef struct PlannerInfo
*** 401,406 ****
--- 438,445 ----
* direct_lateral_relids - rels this rel has direct LATERAL references to
* lateral_relids - required outer rels for LATERAL, as a Relids set
* (includes both direct and indirect lateral references)
+ * gpi - GroupedPathInfo if the relation can produce grouped paths, NULL
+ * otherwise.
*
* If the relation is a base relation it will have these fields set:
*
*************** typedef struct PlannerInfo
*** 486,491 ****
--- 525,543 ----
* We store baserestrictcost in the RelOptInfo (for base relations) because
* we know we will need it at least once (to price the sequential scan)
* and may need it multiple times to price index scans.
+ *
+ * If the relation is partitioned, these fields will be set:
+ * part_scheme - Partitioning scheme of the relation
+ * nparts - Number of partitions
+ * boundinfo - Partition bounds/lists
+ * part_rels - RelOptInfos of the partition relations
+ * partexprs - Partition key expressions
+ *
+ * Note: A base relation will always have only one set of partition keys. But a
+ * join relation is partitioned by the partition keys of joining relations.
+ * Partition keys are stored as an array of partition key expressions, with
+ * each array element containing a list of one (for a base relation) or more
+ * (as many as the number of joining relations) expressions.
*----------
*/
typedef enum RelOptKind
*************** typedef enum RelOptKind
*** 493,498 ****
--- 545,551 ----
RELOPT_BASEREL,
RELOPT_JOINREL,
RELOPT_OTHER_MEMBER_REL,
+ RELOPT_OTHER_JOINREL,
RELOPT_UPPER_REL,
RELOPT_DEADREL
} RelOptKind;
*************** typedef enum RelOptKind
*** 506,518 ****
(rel)->reloptkind == RELOPT_OTHER_MEMBER_REL)
/* Is the given relation a join relation? */
! #define IS_JOIN_REL(rel) ((rel)->reloptkind == RELOPT_JOINREL)
/* Is the given relation an upper relation? */
#define IS_UPPER_REL(rel) ((rel)->reloptkind == RELOPT_UPPER_REL)
/* Is the given relation an "other" relation? */
! #define IS_OTHER_REL(rel) ((rel)->reloptkind == RELOPT_OTHER_MEMBER_REL)
typedef struct RelOptInfo
{
--- 559,575 ----
(rel)->reloptkind == RELOPT_OTHER_MEMBER_REL)
/* Is the given relation a join relation? */
! #define IS_JOIN_REL(rel) \
! ((rel)->reloptkind == RELOPT_JOINREL || \
! (rel)->reloptkind == RELOPT_OTHER_JOINREL)
/* Is the given relation an upper relation? */
#define IS_UPPER_REL(rel) ((rel)->reloptkind == RELOPT_UPPER_REL)
/* Is the given relation an "other" relation? */
! #define IS_OTHER_REL(rel) \
! ((rel)->reloptkind == RELOPT_OTHER_MEMBER_REL || \
! (rel)->reloptkind == RELOPT_OTHER_JOINREL)
typedef struct RelOptInfo
{
*************** typedef struct RelOptInfo
*** 548,553 ****
--- 605,613 ----
Relids direct_lateral_relids; /* rels directly laterally referenced */
Relids lateral_relids; /* minimum parameterization of rel */
+ /* Information needed to produce grouped paths. */
+ struct GroupedPathInfo *gpi;
+
/* information about a base rel (not set for join rels!) */
Index relid;
Oid reltablespace; /* containing tablespace */
*************** typedef struct RelOptInfo
*** 566,571 ****
--- 626,632 ----
PlannerInfo *subroot; /* if subquery */
List *subplan_params; /* if subquery */
int rel_parallel_workers; /* wanted number of parallel workers */
+ Oid *part_oids; /* OIDs of partitions */
/* Information about foreign tables and foreign joins */
Oid serverid; /* identifies server for the table or join */
*************** typedef struct RelOptInfo
*** 591,596 ****
--- 652,673 ----
/* used by "other" relations */
Relids top_parent_relids; /* Relids of topmost parents */
+
+ /* For all the partitioned relations. */
+ PartitionScheme part_scheme; /* Partitioning scheme. */
+ int nparts; /* number of partitions */
+ PartitionBoundInfo boundinfo; /* Partition bounds/lists */
+ struct RelOptInfo **part_rels; /* Array of RelOptInfos of partitions,
+ * stored in the same order as bounds
+ * or lists in PartitionScheme.
+ */
+ List **partexprs; /* Array of list of partition key
+ * expressions. For base relations
+ * these are one element lists. For
+ * join there may be as many elements
+ * as the number of joining
+ * relations.
+ */
} RelOptInfo;
/*
*************** typedef struct ParamPathInfo
*** 913,918 ****
--- 990,1017 ----
List *ppi_clauses; /* join clauses available from outer rels */
} ParamPathInfo;
+ /*
+ * GroupedPathInfo
+ *
+ * If RelOptInfo points to this structure, grouped paths can be created for
+ * it.
+ *
+ * "target" will be used as pathtarget of grouped paths produced by this
+ * relation. Grouped path is either a result of aggregation of the relation
+ * that owns this structure or, if the owning relation is a join, a join path
+ * whose one side is a grouped path and the other is a plain (i.e. not
+ * grouped) one. (Two grouped paths cannot be joined in general because
+ * grouping of one side of the join essentially reduces occurrence of groups
+ * of the other side in the input of the final aggregation.)
+ */
+ typedef struct GroupedPathInfo
+ {
+ NodeTag type;
+
+ PathTarget *target; /* output of grouped paths. */
+ List *pathlist; /* List of grouped paths. */
+ List *partial_pathlist; /* List of partial grouped paths. */
+ } GroupedPathInfo;
/*
* Type "Path" is used as-is for sequential-scan paths, as well as some other
*************** typedef struct PlaceHolderVar
*** 1852,1857 ****
--- 1951,1989 ----
Index phlevelsup; /* > 0 if PHV belongs to outer query */
} PlaceHolderVar;
+
+ /*
+ * Similar to the concept of PlaceHolderVar, we treat aggregates and grouping
+ * columns as special variables if grouping is possible below the top-level
+ * join. The reason is that aggregates having star as the argument can be
+ * evaluated at various places in the join tree (i.e. cannot be assigned to
+ * target list of exactly one relation). Also this concept seems to be less
+ * invasive than adding the grouped vars to reltarget (in which case
+ * attr_needed and attr_widths arrays of RelOptInfo would also need
+ * additional changes).
+ *
+ * gvexpr is a pointer to gvexpr field of the corresponding instance
+ * GroupedVarInfo. It's there for the sake of exprType(), exprCollation(),
+ * etc.
+ *
+ * agg_partial also points to the corresponding field of GroupedVarInfo if the
+ * GroupedVar is in the target of a parent relation (RELOPT_BASEREL). However
+ * within a child relation's (RELOPT_OTHER_MEMBER_REL) target it points to a
+ * copy which has argument expressions translated, so they no longer reference
+ * the parent.
+ *
+ * XXX Currently we only create GroupedVar for aggregates, but sometime we can
+ * do it for grouping keys as well. That would allow grouping below the
+ * top-level join by keys other than plain Var.
+ */
+ typedef struct GroupedVar
+ {
+ Expr xpr;
+ Expr *gvexpr; /* the represented expression */
+ Aggref *agg_partial; /* partial aggregate if gvexpr is aggregate */
+ Index gvid; /* GroupedVarInfo */
+ } GroupedVar;
+
/*
* "Special join" info.
*
*************** typedef struct PlaceHolderInfo
*** 2067,2072 ****
--- 2199,2220 ----
} PlaceHolderInfo;
/*
+ * Likewise, GroupedVarInfo exists for each distinct GroupedVar.
+ */
+ typedef struct GroupedVarInfo
+ {
+ NodeTag type;
+
+ Index gvid; /* GroupedVar.gvid */
+ Expr *gvexpr; /* the represented expression. */
+ Aggref *agg_partial; /* if gvexpr is aggregate, agg_partial is
+ * the corresponding partial aggregate */
+ Relids gv_eval_at; /* lowest level we can evaluate the expression
+ * at or NULL if it can happen anywhere. */
+ int32 gv_width; /* estimated width of the expression */
+ } GroupedVarInfo;
+
+ /*
* This struct describes one potentially index-optimizable MIN/MAX aggregate
* function. MinMaxAggPath contains a list of these, and if we accept that
* path, the list is stored into root->minmax_aggs for use during setrefs.c.
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
new file mode 100644
index ed70def..ca06455
*** a/src/include/optimizer/cost.h
--- b/src/include/optimizer/cost.h
*************** extern bool enable_material;
*** 67,72 ****
--- 67,73 ----
extern bool enable_mergejoin;
extern bool enable_hashjoin;
extern bool enable_gathermerge;
+ extern bool enable_partition_wise_join;
extern int constraint_exclusion;
extern double clamp_row_est(double nrows);
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
new file mode 100644
index 77bc770..4a0d845
*** a/src/include/optimizer/pathnode.h
--- b/src/include/optimizer/pathnode.h
*************** extern int compare_path_costs(Path *path
*** 25,37 ****
extern int compare_fractional_path_costs(Path *path1, Path *path2,
double fraction);
extern void set_cheapest(RelOptInfo *parent_rel);
! extern void add_path(RelOptInfo *parent_rel, Path *new_path);
extern bool add_path_precheck(RelOptInfo *parent_rel,
Cost startup_cost, Cost total_cost,
! List *pathkeys, Relids required_outer);
! extern void add_partial_path(RelOptInfo *parent_rel, Path *new_path);
extern bool add_partial_path_precheck(RelOptInfo *parent_rel,
! Cost total_cost, List *pathkeys);
extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
Relids required_outer, int parallel_workers);
--- 25,39 ----
extern int compare_fractional_path_costs(Path *path1, Path *path2,
double fraction);
extern void set_cheapest(RelOptInfo *parent_rel);
! extern void add_path(RelOptInfo *parent_rel, Path *new_path, bool grouped);
extern bool add_path_precheck(RelOptInfo *parent_rel,
Cost startup_cost, Cost total_cost,
! List *pathkeys, Relids required_outer, bool grouped);
! extern void add_partial_path(RelOptInfo *parent_rel, Path *new_path,
! bool grouped);
extern bool add_partial_path_precheck(RelOptInfo *parent_rel,
! Cost total_cost, List *pathkeys,
! bool grouped);
extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
Relids required_outer, int parallel_workers);
*************** extern ForeignPath *create_foreignscan_p
*** 112,118 ****
Path *fdw_outerpath,
List *fdw_private);
! extern Relids calc_nestloop_required_outer(Path *outer_path, Path *inner_path);
extern Relids calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path);
extern NestPath *create_nestloop_path(PlannerInfo *root,
--- 114,123 ----
Path *fdw_outerpath,
List *fdw_private);
! extern Relids calc_nestloop_required_outer(Relids outerrelids,
! Relids outer_paramrels,
! Relids innerrelids,
! Relids inner_paramrels);
extern Relids calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path);
extern NestPath *create_nestloop_path(PlannerInfo *root,
*************** extern NestPath *create_nestloop_path(Pl
*** 124,130 ****
Path *inner_path,
List *restrict_clauses,
List *pathkeys,
! Relids required_outer);
extern MergePath *create_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
--- 129,136 ----
Path *inner_path,
List *restrict_clauses,
List *pathkeys,
! Relids required_outer,
! PathTarget *target);
extern MergePath *create_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
*************** extern MergePath *create_mergejoin_path(
*** 138,144 ****
Relids required_outer,
List *mergeclauses,
List *outersortkeys,
! List *innersortkeys);
extern HashPath *create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
--- 144,151 ----
Relids required_outer,
List *mergeclauses,
List *outersortkeys,
! List *innersortkeys,
! PathTarget *target);
extern HashPath *create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
*************** extern HashPath *create_hashjoin_path(Pl
*** 149,155 ****
Path *inner_path,
List *restrict_clauses,
Relids required_outer,
! List *hashclauses);
extern ProjectionPath *create_projection_path(PlannerInfo *root,
RelOptInfo *rel,
--- 156,163 ----
Path *inner_path,
List *restrict_clauses,
Relids required_outer,
! List *hashclauses,
! PathTarget *target);
extern ProjectionPath *create_projection_path(PlannerInfo *root,
RelOptInfo *rel,
*************** extern AggPath *create_agg_path(PlannerI
*** 190,195 ****
--- 198,217 ----
List *qual,
const AggClauseCosts *aggcosts,
double numGroups);
+ extern AggPath *create_partial_agg_sorted_path(PlannerInfo *root,
+ Path *subpath,
+ bool first_call,
+ List **group_clauses,
+ List **group_exprs,
+ List **agg_exprs,
+ double input_rows);
+ extern AggPath *create_partial_agg_hashed_path(PlannerInfo *root,
+ Path *subpath,
+ bool first_call,
+ List **group_clauses,
+ List **group_exprs,
+ List **agg_exprs,
+ double input_rows);
extern GroupingSetsPath *create_groupingsets_path(PlannerInfo *root,
RelOptInfo *rel,
Path *subpath,
*************** extern LimitPath *create_limit_path(Plan
*** 248,253 ****
--- 270,277 ----
extern Path *reparameterize_path(PlannerInfo *root, Path *path,
Relids required_outer,
double loop_count);
+ extern Path *reparameterize_path_by_child(PlannerInfo *root, Path *path,
+ RelOptInfo *child_rel);
/*
* prototypes for relnode.c
*************** extern ParamPathInfo *get_joinrel_paramp
*** 285,289 ****
--- 309,320 ----
List **restrict_clauses);
extern ParamPathInfo *get_appendrel_parampathinfo(RelOptInfo *appendrel,
Relids required_outer);
+ extern ParamPathInfo *find_param_path_info(RelOptInfo *rel,
+ Relids required_outer);
+ extern void prepare_rel_for_grouping(PlannerInfo *root, RelOptInfo *rel);
+ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
+ RelOptInfo *outer_rel, RelOptInfo *inner_rel,
+ RelOptInfo *parent_joinrel, List *restrictlist,
+ SpecialJoinInfo *sjinfo, JoinType jointype);
#endif /* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
new file mode 100644
index 25fe78c..8dd4efd
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** extern void set_dummy_rel_pathlist(RelOp
*** 53,63 ****
extern RelOptInfo *standard_join_search(PlannerInfo *root, int levels_needed,
List *initial_rels);
! extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel);
extern int compute_parallel_worker(RelOptInfo *rel, double heap_pages,
double index_pages);
extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
Path *bitmapqual);
#ifdef OPTIMIZER_DEBUG
extern void debug_print_rel(PlannerInfo *root, RelOptInfo *rel);
--- 53,69 ----
extern RelOptInfo *standard_join_search(PlannerInfo *root, int levels_needed,
List *initial_rels);
! extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
! bool grouped);
! extern void create_grouped_path(PlannerInfo *root, RelOptInfo *rel,
! Path *subpath, bool precheck, bool partial,
! AggStrategy aggstrategy);
extern int compute_parallel_worker(RelOptInfo *rel, double heap_pages,
double index_pages);
extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
Path *bitmapqual);
+ extern void generate_partition_wise_join_paths(PlannerInfo *root,
+ RelOptInfo *rel);
#ifdef OPTIMIZER_DEBUG
extern void debug_print_rel(PlannerInfo *root, RelOptInfo *rel);
*************** extern void debug_print_rel(PlannerInfo
*** 67,73 ****
* indxpath.c
* routines to generate index paths
*/
! extern void create_index_paths(PlannerInfo *root, RelOptInfo *rel);
extern bool relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
List *restrictlist,
List *exprlist, List *oprlist);
--- 73,80 ----
* indxpath.c
* routines to generate index paths
*/
! extern void create_index_paths(PlannerInfo *root, RelOptInfo *rel,
! bool grouped);
extern bool relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
List *restrictlist,
List *exprlist, List *oprlist);
*************** extern bool have_join_order_restriction(
*** 111,116 ****
--- 118,126 ----
RelOptInfo *rel1, RelOptInfo *rel2);
extern bool have_dangerous_phv(PlannerInfo *root,
Relids outer_relids, Relids inner_params);
+ extern void mark_dummy_rel(RelOptInfo *rel);
+ extern bool have_partkey_equi_join(RelOptInfo *rel1, RelOptInfo *rel2,
+ JoinType jointype, List *restrictlist, bool *is_strict);
/*
* equivclass.c
diff --git a/src/include/optimizer/placeholder.h b/src/include/optimizer/placeholder.h
new file mode 100644
index 11e6403..8598268
*** a/src/include/optimizer/placeholder.h
--- b/src/include/optimizer/placeholder.h
*************** extern void fix_placeholder_input_needed
*** 28,32 ****
--- 28,34 ----
extern void add_placeholders_to_base_rels(PlannerInfo *root);
extern void add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *inner_rel);
+ extern void add_placeholders_to_child_joinrel(PlannerInfo *root,
+ RelOptInfo *childrel, RelOptInfo *parentrel);
#endif /* PLACEHOLDER_H */
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
new file mode 100644
index 5df68a2..07bc4c0
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
*************** extern int join_collapse_limit;
*** 74,80 ****
extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
! Relids where_needed, bool create_new_ph);
extern void find_lateral_references(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
extern List *deconstruct_jointree(PlannerInfo *root);
--- 74,82 ----
extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
! Relids where_needed, bool create_new_ph);
! extern void add_grouping_info_to_base_rels(PlannerInfo *root);
! extern void add_grouped_vars_to_rels(PlannerInfo *root);
extern void find_lateral_references(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
extern List *deconstruct_jointree(PlannerInfo *root);
diff --git a/src/include/optimizer/planner.h b/src/include/optimizer/planner.h
new file mode 100644
index f3aaa23..4a550bb
*** a/src/include/optimizer/planner.h
--- b/src/include/optimizer/planner.h
*************** extern Expr *preprocess_phv_expression(P
*** 58,62 ****
--- 58,64 ----
extern bool plan_cluster_use_sort(Oid tableOid, Oid indexOid);
extern List *get_partitioned_child_rels(PlannerInfo *root, Index rti);
+ extern List *get_partitioned_child_rels_for_join(PlannerInfo *root,
+ RelOptInfo *joinrel);
#endif /* PLANNER_H */
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
new file mode 100644
index 2b20b36..95802c9
*** a/src/include/optimizer/prep.h
--- b/src/include/optimizer/prep.h
*************** extern RelOptInfo *plan_set_operations(P
*** 53,61 ****
extern void expand_inherited_tables(PlannerInfo *root);
extern Node *adjust_appendrel_attrs(PlannerInfo *root, Node *node,
! AppendRelInfo *appinfo);
extern Node *adjust_appendrel_attrs_multilevel(PlannerInfo *root, Node *node,
! RelOptInfo *child_rel);
#endif /* PREP_H */
--- 53,74 ----
extern void expand_inherited_tables(PlannerInfo *root);
extern Node *adjust_appendrel_attrs(PlannerInfo *root, Node *node,
! int nappinfos, AppendRelInfo **appinfos);
extern Node *adjust_appendrel_attrs_multilevel(PlannerInfo *root, Node *node,
! Relids child_relids,
! Relids top_parent_relids);
!
! extern Relids adjust_child_relids(Relids relids, int nappinfos,
! AppendRelInfo **appinfos);
!
! extern AppendRelInfo **find_appinfos_by_relids(PlannerInfo *root,
! Relids relids, int *nappinfos);
!
! extern SpecialJoinInfo *build_child_join_sjinfo(PlannerInfo *root,
! SpecialJoinInfo *parent_sjinfo,
! Relids left_relids, Relids right_relids);
! extern Relids adjust_child_relids_multilevel(PlannerInfo *root, Relids relids,
! Relids child_relids, Relids top_parent_relids);
#endif /* PREP_H */
diff --git a/src/include/optimizer/tlist.h b/src/include/optimizer/tlist.h
new file mode 100644
index ccb93d8..ddea03c
*** a/src/include/optimizer/tlist.h
--- b/src/include/optimizer/tlist.h
*************** extern Node *get_sortgroupclause_expr(So
*** 41,46 ****
--- 41,49 ----
List *targetList);
extern List *get_sortgrouplist_exprs(List *sgClauses,
List *targetList);
+ extern void get_grouping_expressions(PlannerInfo *root, PathTarget *target,
+ List **grouping_clauses,
+ List **grouping_exprs, List **agg_exprs);
extern SortGroupClause *get_sortgroupref_clause(Index sortref,
List *clauses);
*************** extern void split_pathtarget_at_srfs(Pla
*** 65,70 ****
--- 68,84 ----
PathTarget *target, PathTarget *input_target,
List **targets, List **targets_contain_srfs);
+ /* TODO Find the best location (position and in some cases even file) for the
+ * following ones. */
+ extern List *restore_grouping_expressions(PlannerInfo *root, List *src);
+ extern List *add_aggregates_to_target(PlannerInfo *root, PathTarget *target,
+ List *aggregates, RelOptInfo *rel);
+ extern Index get_expr_sortgroupref(PlannerInfo *root, Expr *expr);
+ /* TODO Move definition from initsplan.c to tlist.c. */
+ extern PathTarget *create_grouped_target(PlannerInfo *root, RelOptInfo *rel,
+ Relids rel_agg_attrs,
+ List *rel_agg_vars);
+
/* Convenience macro to get a PathTarget with valid cost/width fields */
#define create_pathtarget(root, tlist) \
set_pathtarget_cost_width(root, make_pathtarget_from_tlist(tlist))
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
new file mode 100644
index 9f9d2dc..e05e6f6
*** a/src/include/utils/selfuncs.h
--- b/src/include/utils/selfuncs.h
*************** extern double estimate_num_groups(Planne
*** 206,211 ****
--- 206,214 ----
extern Selectivity estimate_hash_bucketsize(PlannerInfo *root, Node *hashkey,
double nbuckets);
+ extern Size estimate_hashagg_tablesize(Path *path,
+ const AggClauseCosts *agg_costs,
+ double dNumGroups);
extern List *deconstruct_indexquals(IndexPath *path);
extern void genericcostestimate(PlannerInfo *root, IndexPath *path,
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
new file mode 100644
index 6163ed8..7a969f2
*** a/src/test/regress/expected/inherit.out
--- b/src/test/regress/expected/inherit.out
*************** select tableoid::regclass::text as relna
*** 625,630 ****
--- 625,652 ----
(3 rows)
drop table parted_tab;
+ -- Check UPDATE with *multi-level partitioned* inherited target
+ create table mlparted_tab (a int, b char, c text) partition by list (a);
+ create table mlparted_tab_part1 partition of mlparted_tab for values in (1);
+ create table mlparted_tab_part2 partition of mlparted_tab for values in (2) partition by list (b);
+ create table mlparted_tab_part3 partition of mlparted_tab for values in (3);
+ create table mlparted_tab_part2a partition of mlparted_tab_part2 for values in ('a');
+ create table mlparted_tab_part2b partition of mlparted_tab_part2 for values in ('b');
+ insert into mlparted_tab values (1, 'a'), (2, 'a'), (2, 'b'), (3, 'a');
+ update mlparted_tab mlp set c = 'xxx'
+ from
+ (select a from some_tab union all select a+1 from some_tab) ss (a)
+ where (mlp.a = ss.a and mlp.b = 'b') or mlp.a = 3;
+ select tableoid::regclass::text as relname, mlparted_tab.* from mlparted_tab order by 1,2;
+ relname | a | b | c
+ ---------------------+---+---+-----
+ mlparted_tab_part1 | 1 | a |
+ mlparted_tab_part2a | 2 | a |
+ mlparted_tab_part2b | 2 | b | xxx
+ mlparted_tab_part3 | 3 | a | xxx
+ (4 rows)
+
+ drop table mlparted_tab;
drop table some_tab cascade;
NOTICE: drop cascades to table some_tab_child
/* Test multiple inheritance of column defaults */
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
new file mode 100644
index 568b783..cd1f7f3
*** a/src/test/regress/expected/sysviews.out
--- b/src/test/regress/expected/sysviews.out
*************** select count(*) >= 0 as ok from pg_prepa
*** 70,90 ****
-- This is to record the prevailing planner enable_foo settings during
-- a regression test run.
select name, setting from pg_settings where name like 'enable%';
! name | setting
! ----------------------+---------
! enable_bitmapscan | on
! enable_gathermerge | on
! enable_hashagg | on
! enable_hashjoin | on
! enable_indexonlyscan | on
! enable_indexscan | on
! enable_material | on
! enable_mergejoin | on
! enable_nestloop | on
! enable_seqscan | on
! enable_sort | on
! enable_tidscan | on
! (12 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
--- 70,91 ----
-- This is to record the prevailing planner enable_foo settings during
-- a regression test run.
select name, setting from pg_settings where name like 'enable%';
! name | setting
! ----------------------------+---------
! enable_bitmapscan | on
! enable_gathermerge | on
! enable_hashagg | on
! enable_hashjoin | on
! enable_indexonlyscan | on
! enable_indexscan | on
! enable_material | on
! enable_mergejoin | on
! enable_nestloop | on
! enable_partition_wise_join | off
! enable_seqscan | on
! enable_sort | on
! enable_tidscan | on
! (13 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
new file mode 100644
index 1f8f098..2d14885
*** a/src/test/regress/parallel_schedule
--- b/src/test/regress/parallel_schedule
*************** test: publication subscription
*** 103,109 ****
# ----------
# Another group of parallel tests
# ----------
! test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock json jsonb json_encoding indirect_toast equivclass
# ----------
# Another group of parallel tests
# NB: temp.sql does a reconnect which transiently uses 2 connections,
--- 103,109 ----
# ----------
# Another group of parallel tests
# ----------
! test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock json jsonb json_encoding indirect_toast equivclass partition_join multi_level_partition_join
# ----------
# Another group of parallel tests
# NB: temp.sql does a reconnect which transiently uses 2 connections,
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
new file mode 100644
index 04206c3..9ac24dd
*** a/src/test/regress/serial_schedule
--- b/src/test/regress/serial_schedule
*************** test: with
*** 179,181 ****
--- 179,183 ----
test: xml
test: event_trigger
test: stats
+ test: partition_join
+ test: multi_level_partition_join
diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql
new file mode 100644
index d43b75c..b814a4c
*** a/src/test/regress/sql/inherit.sql
--- b/src/test/regress/sql/inherit.sql
*************** where parted_tab.a = ss.a;
*** 154,159 ****
--- 154,176 ----
select tableoid::regclass::text as relname, parted_tab.* from parted_tab order by 1,2;
drop table parted_tab;
+
+ -- Check UPDATE with *multi-level partitioned* inherited target
+ create table mlparted_tab (a int, b char, c text) partition by list (a);
+ create table mlparted_tab_part1 partition of mlparted_tab for values in (1);
+ create table mlparted_tab_part2 partition of mlparted_tab for values in (2) partition by list (b);
+ create table mlparted_tab_part3 partition of mlparted_tab for values in (3);
+ create table mlparted_tab_part2a partition of mlparted_tab_part2 for values in ('a');
+ create table mlparted_tab_part2b partition of mlparted_tab_part2 for values in ('b');
+ insert into mlparted_tab values (1, 'a'), (2, 'a'), (2, 'b'), (3, 'a');
+
+ update mlparted_tab mlp set c = 'xxx'
+ from
+ (select a from some_tab union all select a+1 from some_tab) ss (a)
+ where (mlp.a = ss.a and mlp.b = 'b') or mlp.a = 3;
+ select tableoid::regclass::text as relname, mlparted_tab.* from mlparted_tab order by 1,2;
+
+ drop table mlparted_tab;
drop table some_tab cascade;
/* Test multiple inheritance of column defaults */
On Thu, Apr 27, 2017 at 4:53 PM, Antonin Houska <ah@cybertec.at> wrote:
Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Apr 26, 2017 at 6:28 AM, Antonin Houska <ah@cybertec.at> wrote:
Attached is a diff that contains both patches merged. This is just to prove my
assumption, details to be elaborated later. The scripts attached produce the
following plan in my environment:
QUERY PLAN
------------------------------------------------
Parallel Finalize HashAggregate
Group Key: b_1.j
-> Append
-> Parallel Partial HashAggregate
Group Key: b_1.j
-> Hash Join
Hash Cond: (b_1.j = c_1.k)
-> Seq Scan on b_1
-> Hash
-> Seq Scan on c_1
-> Parallel Partial HashAggregate
Group Key: b_2.j
-> Hash Join
Hash Cond: (b_2.j = c_2.k)
-> Seq Scan on b_2
-> Hash
-> Seq Scan on c_2
Well, I'm confused. I see that there's a relationship between what
Antonin is trying to do and what Jeevan is trying to do, but I can't
figure out whether one is a subset of the other, whether they're both
orthogonal, or something else. This plan looks similar to what I
would expect Jeevan's patch to produce,
The point is that the patch Jeevan wanted to work on is actually a subset of
[1] combined with [2].
It seems so, as you are targeting every relation whether or not it is
partitioned, whereas I am targeting only partitioned relations in my
patch.
except I have no idea what "Parallel" would mean in a plan that contains no
Gather node.
parallel_aware field was set mistakenly on the AggPath. Fixed patch is
attached below, producing this plan:
QUERY PLAN
------------------------------------------------
Finalize HashAggregate
Group Key: b_1.j
-> Append
-> Partial HashAggregate
Group Key: b_1.j
-> Hash Join
Hash Cond: (b_1.j = c_1.k)
-> Seq Scan on b_1
-> Hash
-> Seq Scan on c_1
-> Partial HashAggregate
Group Key: b_2.j
-> Hash Join
Hash Cond: (b_2.j = c_2.k)
-> Seq Scan on b_2
-> Hash
-> Seq Scan on c_2
With my patch, I get the following plan, where the entire aggregation is
pushed below the Append.
QUERY PLAN
------------------------------------------
Append
-> HashAggregate
Group Key: b_1.j
-> Hash Join
Hash Cond: (b_1.j = c_1.k)
-> Seq Scan on b_1
-> Hash
-> Seq Scan on c_1
-> HashAggregate
Group Key: b_2.j
-> Hash Join
Hash Cond: (b_2.j = c_2.k)
-> Seq Scan on b_2
-> Hash
-> Seq Scan on c_2
(15 rows)
Antonin, I tried applying your patch on master, but it doesn't apply.
Can you please tell me which HEAD it is based on, and post any other
changes that must be applied first?
What does the plan look like when the GROUP BY key does not match the
partitioning key, i.e. GROUP BY b.v?
[1] /messages/by-id/9666.1491295317@localhost
[2] https://commitfest.postgresql.org/14/994/
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 66449694
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:
On Thu, Apr 27, 2017 at 4:53 PM, Antonin Houska <ah@cybertec.at> wrote:
Robert Haas <robertmhaas@gmail.com> wrote:
Well, I'm confused. I see that there's a relationship between what
Antonin is trying to do and what Jeevan is trying to do, but I can't
figure out whether one is a subset of the other, whether they're both
orthogonal, or something else. This plan looks similar to what I
would expect Jeevan's patch to produce,
The point is that the patch Jeevan wanted to work on is actually a subset of
[1] combined with [2].
Seems like, as you are targeting every relation whether or not it is
partitioned.
Yes.
With my patch, I get the following plan, where the entire aggregation is
pushed below the Append.
QUERY PLAN
------------------------------------------
Append
-> HashAggregate
Group Key: b_1.j
-> Hash Join
Hash Cond: (b_1.j = c_1.k)
-> Seq Scan on b_1
-> Hash
-> Seq Scan on c_1
-> HashAggregate
Group Key: b_2.j
-> Hash Join
Hash Cond: (b_2.j = c_2.k)
-> Seq Scan on b_2
-> Hash
-> Seq Scan on c_2
(15 rows)
I think this is not generic enough because the result of the Append plan can
be joined to another relation. As such a join can duplicate the
already-aggregated values, the aggregates should not be finalized below the
top-level plan.
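A toy illustration of the duplication concern, not taken from either patch. It assumes a query like SELECT b.j, sum(b.v) FROM b JOIN c ON b.j = c.k GROUP BY b.j; the row types and function names are hypothetical and are not PostgreSQL structures. It compares the sum computed over the join with a partial sum computed below the join and combined above it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy rows; these are not PostgreSQL data structures. */
typedef struct { int j; long v; } BRow;   /* row of b: join key j, value v */
typedef struct { int k; } CRow;           /* row of c: join key k */

/* Reference result: sum(b.v) over b JOIN c ON b.j = c.k for one group key. */
long
sum_over_join(const BRow *b, size_t nb, const CRow *c, size_t nc, int key)
{
    long total = 0;

    assert(b != NULL && c != NULL);
    for (size_t i = 0; i < nb; i++)
        for (size_t m = 0; m < nc; m++)
            if (b[i].j == key && b[i].j == c[m].k)
                total += b[i].v;
    return total;
}

/* Aggregate b below the join: the partial sum for one group key. */
long
partial_sum_b(const BRow *b, size_t nb, int key)
{
    long partial = 0;

    assert(b != NULL);
    for (size_t i = 0; i < nb; i++)
        if (b[i].j == key)
            partial += b[i].v;
    return partial;
}

/*
 * Join the grouped row to c and combine the partial states above the join.
 * Every matching row of c repeats the partial state once, and the combine
 * step adds the repeats together.
 */
long
combine_above_join(const BRow *b, size_t nb, const CRow *c, size_t nc, int key)
{
    long partial = partial_sum_b(b, nb, key);
    long total = 0;

    assert(c != NULL);
    for (size_t m = 0; m < nc; m++)
        if (c[m].k == key)
            total += partial;
    return total;
}
```

When a group of b matches more than one row of c, partial_sum_b() alone differs from sum_over_join(), while combine_above_join() agrees with it. In this toy model, that is why the value computed below the join cannot be treated as final.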
Antonin, I tried applying your patch on master, but it doesn't apply.
Can you please tell me which HEAD it is based on, and post any other
changes that must be applied first?
I've lost that information. I'll post a new version to the [1] thread asap.
What does the plan look like when the GROUP BY key does not match the
partitioning key, i.e. GROUP BY b.v?
EXPLAIN (COSTS false)
SELECT b.v, avg(b.v + c.v)
FROM b
JOIN
c ON b.j = c.k
GROUP BY b.v;
QUERY PLAN
------------------------------------------------
Finalize HashAggregate
Group Key: b_1.v
-> Append
-> Partial HashAggregate
Group Key: b_1.v
-> Hash Join
Hash Cond: (b_1.j = c_1.k)
-> Seq Scan on b_1
-> Hash
-> Seq Scan on c_1
-> Partial HashAggregate
Group Key: b_2.v
-> Hash Join
Hash Cond: (b_2.j = c_2.k)
-> Seq Scan on b_2
-> Hash
-> Seq Scan on c_2
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
On Fri, Apr 28, 2017 at 3:03 AM, Antonin Houska <ah@cybertec.at> wrote:
I think this is not generic enough because the result of the Append plan can
be joined to another relation. As such a join can duplicate the
already-aggregated values, the aggregates should not be finalized below the
top-level plan.
If the grouping key matches the partition key, then it's correct to
push the entire aggregate down, and there's probably a large
performance advantage from avoiding aggregating twice. If the two
don't match, then pushing the aggregate down necessarily involves a
"partial" and a "finalize" stage, which may or may not be cheaper than
doing the aggregation all at once. If you have lots of 2-row groups
with 1 row in the first branch of the append and 1 row in the second
branch of the append, breaking the aggregate into two steps is
probably going to be a loser. If the overall number of groups is
small, it's probably going to win. But when the grouping key matches
the partition key, so that two-stage aggregation isn't required, I
suspect the pushdown should almost always win.
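To make the trade-off concrete, here is a minimal Python sketch of two-stage aggregation for avg(). It is only an illustration, not the executor's code: the function names partial_avg and finalize_avg, and the (sum, count) state layout, are assumptions made for this example. It counts how many rows the partial stage hands to the Append in the two cases described above.

```python
from collections import defaultdict

def partial_avg(rows):
    """Partial stage: build a (sum, count) state per group key for one partition."""
    state = defaultdict(lambda: [0, 0])
    for key, value in rows:
        state[key][0] += value
        state[key][1] += 1
    return {k: (s, c) for k, (s, c) in state.items()}

def finalize_avg(partials):
    """Finalize stage: combine per-partition states, then compute the average."""
    combined = defaultdict(lambda: [0, 0])
    for part in partials:
        for key, (s, c) in part.items():
            combined[key][0] += s
            combined[key][1] += c
    return {k: s / c for k, (s, c) in combined.items()}

# Case 1: many 2-row groups, one row in each branch of the append.
# The partial stage emits as many rows as it reads, so it saves nothing.
p1 = [(k, 1) for k in range(1000)]
p2 = [(k, 3) for k in range(1000)]
many_groups = [partial_avg(p1), partial_avg(p2)]
rows_many = sum(len(p) for p in many_groups)
avg_many = finalize_avg(many_groups)

# Case 2: few groups overall. Each partition collapses to a handful of rows
# before the finalize stage sees them.
q1 = [(k % 3, 1) for k in range(1000)]
q2 = [(k % 3, 3) for k in range(1000)]
few_groups = [partial_avg(q1), partial_avg(q2)]
rows_few = sum(len(p) for p in few_groups)
avg_few = finalize_avg(few_groups)
```

In the first case the finalize stage still has to process every input row's worth of state, on top of the work already done by the partial stage. In the second case it only combines a few states per partition.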
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
Attached is the patch to implement partition-wise aggregation/grouping.
As explained earlier, we produce a full aggregation for each partition when
the partition keys are the leading GROUP BY clauses, and then append the
results. Otherwise, we do a partial aggregation on each partition, append the
results, and then add a finalization step on top.
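The rule above can be sketched as a small Python function. This is a literal reading of the sentence, not the patch's actual check: the name aggregation_mode and the list-of-column-names representation are hypothetical, chosen only for illustration.

```python
def aggregation_mode(partition_keys, group_by):
    """Return 'full' when the partition keys are the leading GROUP BY clauses,
    otherwise 'partial' (partial aggregate per partition plus a finalize step)."""
    n = len(partition_keys)
    leading = list(group_by[:n])
    if n > 0 and leading == list(partition_keys):
        return "full"
    return "partial"

# Partitioned on a, grouped by a and b: every group lives in one partition.
mode_a = aggregation_mode(["a"], ["a", "b"])
# Partitioned on a, grouped by b: groups can span partitions.
mode_b = aggregation_mode(["a"], ["b"])
```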
I have observed that the estimated cost of partition-wise aggregation and the
cost of plans without it are almost the same. However, execution time shows
significant improvement (as explained in my very first email) with
partition-wise aggregates. The planner chooses a plan according to the costs,
and thus most of the time the plan without partition-wise aggregation is
chosen. Hence, to force partition-wise plans and for the regression runs, I
have added a GUC named partition_wise_agg_cost_factor to adjust the costings.
This feature is only used when enable_partition_wise_agg GUC is set to on.
Here are the details of the patches in the patch-set:
0001 - Refactors sort and hash final grouping paths into separate functions.
Since partition-wise aggregation builds paths in the same way as
create_grouping_paths(), path creation for sort and hash aggregation has been
split into separate functions. These functions are later used by the main
partition-wise aggregation/grouping patch.
0002 - Passes targetlist to get_number_of_groups().
We need to estimate groups for individual child relations and thus need to
pass targetlist corresponding to the child rel.
0003 - Adds enable_partition_wise_agg and partition_wise_agg_cost_factor
GUCs.
0004 - Implements partition-wise aggregation.
0005 - Adds test-cases.
0006 - postgres_fdw changes which enable pushing aggregation for other upper
relations.
Since this patch is highly dependent on partition-wise join [1], one needs to
apply all those patches on HEAD (my repository head was at:
66ed3829df959adb47f71d7c903ac59f0670f3e1) before applying these patches in
order.
Suggestions / feedback / inputs ?
[1]: /messages/by-id/CAFjFpRd9Vqh_=-Ldv-XqWY006d07TJ+VXuhXCbdj=P1jukYBrw@mail.gmail.com
--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Attachments:
On Wed, Aug 23, 2017 at 4:43 PM, Jeevan Chalke <
jeevan.chalke@enterprisedb.com> wrote:
Hi,
Attached is the patch to implement partition-wise aggregation/grouping.
As explained earlier, we produce a full aggregation for each partition when
the partition keys are the leading GROUP BY clauses, and then append the
results. Otherwise, we do a partial aggregation on each partition, append the
results, and then add a finalization step on top.
I have observed that the estimated cost of partition-wise aggregation and the
cost of plans without it are almost the same. However, execution time shows
significant improvement (as explained in my very first email) with
partition-wise aggregates. The planner chooses a plan according to the costs,
and thus most of the time the plan without partition-wise aggregation is
chosen. Hence, to force partition-wise plans and for the regression runs, I
have added a GUC named partition_wise_agg_cost_factor to adjust the costings.
This feature is only used when enable_partition_wise_agg GUC is set to on.
Here are the details of the patches in the patch-set:
Here is the new patch-set, rebased on HEAD (f0a0c17) and the
latest partition-wise join (v29) patches.
0001 - Refactors sort and hash final grouping paths into separate functions.
Since partition-wise aggregation builds paths in the same way as
create_grouping_paths(), path creation for sort and hash aggregation has been
split into separate functions. These functions are later used by the main
partition-wise aggregation/grouping patch.
0002 - Passes targetlist to get_number_of_groups().
We need to estimate groups for individual child relations and thus need to
pass targetlist corresponding to the child rel.
0003 - Adds enable_partition_wise_agg and partition_wise_agg_cost_factor GUCs.
0004 - Implements partition-wise aggregation.
0005 - Adds test-cases.
0006 - postgres_fdw changes which enable pushing aggregation for other upper
relations.
0007 - Provides infrastructure to allow partial aggregation.
This will allow us to push partial aggregation over the FDW.
With this, one can write SUM(PARTIAL x) to get a partial sum
result. Since PARTIAL is used in the syntax, I need to move it
to the reserved keywords category. This is a PoC patch
and needs input on the approach and the way it is implemented.
0008 - Teaches postgres_fdw to push partial aggregation.
With this, we can push the aggregate to the remote server even when the
GROUP BY key does not match the PARTITION key.
Since this patch is highly dependent on partition-wise join [1], one needs to
apply all those patches on HEAD (my repository head was at:
66ed3829df959adb47f71d7c903ac59f0670f3e1) before applying these patches in
order.
Suggestions / feedback / inputs ?
[1] /messages/by-id/CAFjFpRd9Vqh_=-Ldv-XqWY006d07TJ+VXuhXCbdj=P1jukYBrw@mail.gmail.com
--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Attachments:
partition-wise-agg-v2.tar.gz (application/x-gzip)