Push down Aggregates below joins
Currently, the planner always first decides the scan/join order, and
adds Group/Agg nodes on top of the joins. Sometimes it would be legal,
and beneficial, to perform the aggregation below a join. I've been
hacking on a patch to allow that.
For example:
create temp table a (id int4 primary key);
create temp table b (id int4);
insert into a select g from generate_series(1, 1000) g;
insert into b select g/10 from generate_series(1, 10000) g;
analyze a,b;
explain select b.id from a, b where a.id = b.id group by b.id;
Currently, you get a plan like this:
                               QUERY PLAN
-----------------------------------------------------------------------
 HashAggregate  (cost=323.64..333.65 rows=1001 width=4)
   Group Key: b.id
   ->  Hash Join  (cost=27.50..298.66 rows=9990 width=4)
         Hash Cond: (b.id = a.id)
         ->  Seq Scan on b  (cost=0.00..145.00 rows=10000 width=4)
         ->  Hash  (cost=15.00..15.00 rows=1000 width=4)
               ->  Seq Scan on a  (cost=0.00..15.00 rows=1000 width=4)
(7 rows)
With the patch, you get a plan like this:
                                QUERY PLAN
-------------------------------------------------------------------------
 Hash Join  (cost=192.52..221.27 rows=9990 width=4)
   Hash Cond: (a.id = b.id)
   ->  Seq Scan on a  (cost=0.00..15.00 rows=1000 width=4)
   ->  Hash  (cost=180.01..180.01 rows=1001 width=4)
         ->  HashAggregate  (cost=170.00..180.01 rows=1001 width=4)
               Group Key: b.id
               ->  Seq Scan on b  (cost=0.00..145.00 rows=10000 width=4)
(7 rows)
This is faster, because fewer rows need to be joined: the join now
processes the ~1000 aggregated rows from b instead of all 10000. The
transformation is legal here because the join is on the grouping column
and a.id is unique, so the join can only filter out whole groups, never
duplicate or split them. (The speedup is not reflected in the cost
estimates above; cost estimation is not done properly in this prototype
yet.)
Implementation
--------------
Move the responsibility of planning aggregates from the "upper stages",
in grouping_planner(), into scan/join planning, in query_planner().
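
In outline, the flow in grouping_planner() becomes the following (a
simplified sketch of patch 0001 below, not the verbatim code):

/* 1. Build base rels, restrict/join quals, and equivalence classes. */
process_jointree(root, tlist);

/* 2. The caller computes query_pathkeys itself, replacing the old
 *    qp_callback mechanism. */
compute_pathkeys(root, tlist, activeWindows, groupClause);

/* 3. Generate the scan/join Paths. */
current_rel = query_planner(root);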
In query_planner(), after building the RelOptInfo for each scan or join
rel, also build a grouped RelOptInfo to shadow it (if aggregation can be
done at that rel). The grouped RelOptInfo is stored in a new
'grouped_rel' field of the parent RelOptInfo.
A grouped rel holds Paths where the grouping/aggregation has already
been performed at that node, or below it. For a base rel, it represents
performing the aggregation on top of the scan, i.e. its Paths look like
Agg(Scan). For a grouped join rel, the Paths look like Agg(Join(A, B))
or Join(Agg(A), B).
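
To illustrate the bookkeeping, here's a minimal compilable sketch; the
names mirror the description above, but this is not the patch's actual
declaration (see the nodes/relation.h changes in the patches for that):

/* Illustrative sketch only, not the real RelOptInfo. */
typedef struct List List;       /* PostgreSQL's generic list type */

typedef struct RelOptInfo
{
    /* ... the existing scan/join fields, e.g.: */
    List       *pathlist;       /* ordinary (ungrouped) Paths */

    /*
     * Shadow rel holding Paths that perform the grouping/aggregation
     * at or below this rel: Agg(Scan) for a base rel, Agg(Join(A, B))
     * or Join(Agg(A), B) for a join rel. NULL if aggregation cannot
     * be done at this rel.
     */
    struct RelOptInfo *grouped_rel;
} RelOptInfo;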
The first three of the attached patches just move existing code around.
The fourth patch contains the actual feature.
This is still a rough prototype, but any thoughts on the general approach?
- Heikki
Attachments:
0001-Refactor-query_planner-into-two-parts.patch (text/x-patch)
From a3038daaa5d691e4a29d8804528668f80cdd0758 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 15 Jun 2018 15:13:41 +0300
Subject: [PATCH 1/4] Refactor query_planner() into two parts.
Commit db9f0e1d9a introduced a callback to query_planner(), which gave the
caller a chance to build query_pathkeys, in the middle of query_planner()
steps. That feels a bit awkward. Let's split query_planner() into two
functions instead, one to process the jointree, building the equivalence
classes, and another to produce the Paths. The caller can build the
query_pathkeys between the two function calls.
---
src/backend/optimizer/plan/planagg.c | 51 ++++++-------
src/backend/optimizer/plan/planmain.c | 125 ++++++++++++++++++-------------
src/backend/optimizer/plan/planner.c | 45 +++++------
src/backend/optimizer/util/placeholder.c | 2 +-
src/include/nodes/relation.h | 9 ++-
src/include/optimizer/planmain.h | 7 +-
6 files changed, 131 insertions(+), 108 deletions(-)
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
index 95cbffbd69..07f0734559 100644
--- a/src/backend/optimizer/plan/planagg.c
+++ b/src/backend/optimizer/plan/planagg.c
@@ -50,7 +50,6 @@
static bool find_minmax_aggs_walker(Node *node, List **context);
static bool build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
Oid eqop, Oid sortop, bool nulls_first);
-static void minmax_qp_callback(PlannerInfo *root, void *extra);
static Oid fetch_agg_sort_op(Oid aggfnoid);
@@ -63,9 +62,9 @@ static Oid fetch_agg_sort_op(Oid aggfnoid);
* the (UPPERREL_GROUP_AGG, NULL) upperrel.
*
* This should be called by grouping_planner() just before it's ready to call
- * query_planner(), because we generate indexscan paths by cloning the
- * planner's state and invoking query_planner() on a modified version of
- * the query parsetree. Thus, all preprocessing needed before query_planner()
+ * process_jointree(), because we generate indexscan paths by cloning the
+ * planner's state and invoking the scan/join planner on a modified version of
+ * the query parsetree. Thus, all preprocessing needed before process_jointree()
* must already be done.
*
* Note: we are passed the preprocessed targetlist separately, because it's
@@ -435,13 +434,33 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
FLOAT8PASSBYVAL);
/*
- * Generate the best paths for this query, telling query_planner that we
- * have LIMIT 1.
+ * Build base relation and equivalence classes, knowing that we have
+ * LIMIT 1.
*/
subroot->tuple_fraction = 1.0;
subroot->limit_tuples = 1.0;
- final_rel = query_planner(subroot, tlist, minmax_qp_callback, NULL);
+ process_jointree(subroot, tlist);
+
+ /*
+ * Compute query_pathkeys to represent the desired order. There is no
+ * GROUP BY, window functions, or DISTINCT in the generated query.
+ */
+ subroot->group_pathkeys = NIL;
+ subroot->window_pathkeys = NIL;
+ subroot->distinct_pathkeys = NIL;
+
+ subroot->sort_pathkeys =
+ make_pathkeys_for_sortclauses(subroot,
+ subroot->parse->sortClause,
+ subroot->parse->targetList);
+
+ subroot->query_pathkeys = subroot->sort_pathkeys;
+
+ /*
+ * Generate the best paths for the generated query.
+ */
+ final_rel = query_planner(subroot);
/*
* Since we didn't go through subquery_planner() to handle the subquery,
@@ -495,24 +514,6 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
}
/*
- * Compute query_pathkeys and other pathkeys during query_planner()
- */
-static void
-minmax_qp_callback(PlannerInfo *root, void *extra)
-{
- root->group_pathkeys = NIL;
- root->window_pathkeys = NIL;
- root->distinct_pathkeys = NIL;
-
- root->sort_pathkeys =
- make_pathkeys_for_sortclauses(root,
- root->parse->sortClause,
- root->parse->targetList);
-
- root->query_pathkeys = root->sort_pathkeys;
-}
-
-/*
* Get the OID of the sort operator, if any, associated with an aggregate.
* Returns InvalidOid if there is no such operator.
*/
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 7a34abca04..dc5cc110a9 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -29,73 +29,35 @@
/*
- * query_planner
- * Generate a path (that is, a simplified plan) for a basic query,
- * which may involve joins but not any fancier features.
+ * process_jointree
+ * Analyze the jointree of a query.
*
- * Since query_planner does not handle the toplevel processing (grouping,
- * sorting, etc) it cannot select the best path by itself. Instead, it
- * returns the RelOptInfo for the top level of joining, and the caller
- * (grouping_planner) can choose among the surviving paths for the rel.
+ * This builds the base relations, restrict/join quals, and equivalence
+ * classes.
*
* root describes the query to plan
* tlist is the target list the query should produce
* (this is NOT necessarily root->parse->targetList!)
- * qp_callback is a function to compute query_pathkeys once it's safe to do so
- * qp_extra is optional extra data to pass to qp_callback
- *
- * Note: the PlannerInfo node also includes a query_pathkeys field, which
- * tells query_planner the sort order that is desired in the final output
- * plan. This value is *not* available at call time, but is computed by
- * qp_callback once we have completed merging the query's equivalence classes.
- * (We cannot construct canonical pathkeys until that's done.)
*/
-RelOptInfo *
-query_planner(PlannerInfo *root, List *tlist,
- query_pathkeys_callback qp_callback, void *qp_extra)
+void
+process_jointree(PlannerInfo *root, List *tlist)
{
Query *parse = root->parse;
- List *joinlist;
- RelOptInfo *final_rel;
Index rti;
double total_pages;
+ List *joinlist;
/*
- * If the query has an empty join tree, then it's something easy like
- * "SELECT 2+2;" or "INSERT ... VALUES()". Fall through quickly.
+ * If the query has an empty join tree, fall through quickly.
*/
if (parse->jointree->fromlist == NIL)
{
- /* We need a dummy joinrel to describe the empty set of baserels */
- final_rel = build_empty_join_rel(root);
-
- /*
- * If query allows parallelism in general, check whether the quals are
- * parallel-restricted. (We need not check final_rel->reltarget
- * because it's empty at this point. Anything parallel-restricted in
- * the query tlist will be dealt with later.)
- */
- if (root->glob->parallelModeOK)
- final_rel->consider_parallel =
- is_parallel_safe(root, parse->jointree->quals);
-
- /* The only path for it is a trivial Result path */
- add_path(final_rel, (Path *)
- create_result_path(root, final_rel,
- final_rel->reltarget,
- (List *) parse->jointree->quals));
-
- /* Select cheapest path (pretty easy in this case...) */
- set_cheapest(final_rel);
-
/*
- * We still are required to call qp_callback, in case it's something
- * like "SELECT 2+2 ORDER BY 1".
+ * Initialize canon_pathkeys, in case it's something like
+ * "SELECT 2+2 ORDER BY 1".
*/
root->canon_pathkeys = NIL;
- (*qp_callback) (root, qp_extra);
-
- return final_rel;
+ return;
}
/*
@@ -171,10 +133,11 @@ query_planner(PlannerInfo *root, List *tlist,
/*
* We have completed merging equivalence sets, so it's now possible to
- * generate pathkeys in canonical form; so compute query_pathkeys and
- * other pathkeys fields in PlannerInfo.
+ * generate pathkeys in canonical form. (We don't do that here, though.
+ * The caller will compute query_pathkeys and other pathkeys fields in
+ * PlannerInfo, based on the "upper" parts of the query, like GROUP BY
+ * and ORDER BY.)
*/
- (*qp_callback) (root, qp_extra);
/*
* Examine any "placeholder" expressions generated during subquery pullup.
@@ -190,7 +153,7 @@ query_planner(PlannerInfo *root, List *tlist,
* jointree preprocessing, but the necessary information isn't available
* until we've built baserel data structures and classified qual clauses.
*/
- joinlist = remove_useless_joins(root, joinlist);
+ root->join_subproblem_list = remove_useless_joins(root, joinlist);
/*
* Also, reduce any semijoins with unique inner rels to plain inner joins.
@@ -252,11 +215,65 @@ query_planner(PlannerInfo *root, List *tlist,
total_pages += (double) brel->pages;
}
root->total_table_pages = total_pages;
+}
+
+/*
+ * query_planner
+ * Generate paths (that is, simplified plans) for a basic query,
+ * which may involve joins but not any fancier features.
+ *
+ * Since query_planner does not handle the toplevel processing (grouping,
+ * sorting, etc) it cannot select the best path by itself. Instead, it
+ * returns the RelOptInfo for the top level of joining, and the caller
+ * (grouping_planner) can choose among the surviving paths for the rel.
+ *
+ * The PlannerInfo node also includes a query_pathkeys field, which tells
+ * query_planner the sort order that is desired in the final output plan.
+ * The pathkeys must be in canonical form, therefore they can only be
+ * computed after we have completed merging the query's equivalence classes,
+ * ie. after process_jointree().
+ */
+RelOptInfo *
+query_planner(PlannerInfo *root)
+{
+ Query *parse = root->parse;
+ RelOptInfo *final_rel;
+
+ /*
+ * If the query has an empty join tree, then it's something easy like
+ * "SELECT 2+2;" or "INSERT ... VALUES()". Fall through quickly.
+ */
+ if (parse->jointree->fromlist == NIL)
+ {
+ /* We need a dummy joinrel to describe the empty set of baserels */
+ final_rel = build_empty_join_rel(root);
+
+ /*
+ * If query allows parallelism in general, check whether the quals are
+ * parallel-restricted. (We need not check final_rel->reltarget
+ * because it's empty at this point. Anything parallel-restricted in
+ * the query tlist will be dealt with later.)
+ */
+ if (root->glob->parallelModeOK)
+ final_rel->consider_parallel =
+ is_parallel_safe(root, parse->jointree->quals);
+
+ /* The only path for it is a trivial Result path */
+ add_path(final_rel, (Path *)
+ create_result_path(root, final_rel,
+ final_rel->reltarget,
+ (List *) parse->jointree->quals));
+
+ /* Select cheapest path (pretty easy in this case...) */
+ set_cheapest(final_rel);
+
+ return final_rel;
+ }
/*
* Ready to do the primary planning.
*/
- final_rel = make_one_rel(root, joinlist);
+ final_rel = make_one_rel(root, root->join_subproblem_list);
/* Check that we got at least one usable path */
if (!final_rel || !final_rel->cheapest_total_path ||
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 67a2c7a581..e73007f3ba 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -116,6 +116,7 @@ static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
static void inheritance_planner(PlannerInfo *root);
static void grouping_planner(PlannerInfo *root, bool inheritance_update,
double tuple_fraction);
+static void compute_pathkeys(PlannerInfo *root, List *tlist, List *activeWindows, List *groupClause);
static grouping_sets_data *preprocess_grouping_sets(PlannerInfo *root);
static List *remap_to_groupclause_idx(List *groupClause, List *gsets,
int *tleref_to_colnum_map);
@@ -128,7 +129,6 @@ static void remove_useless_groupby_columns(PlannerInfo *root);
static List *preprocess_groupclause(PlannerInfo *root, List *force);
static List *extract_rollup_sets(List *groupingSets);
static List *reorder_grouping_sets(List *groupingSets, List *sortclause);
-static void standard_qp_callback(PlannerInfo *root, void *extra);
static double get_number_of_groups(PlannerInfo *root,
double path_rows,
grouping_sets_data *gd,
@@ -1787,7 +1787,6 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
WindowFuncLists *wflists = NULL;
List *activeWindows = NIL;
grouping_sets_data *gset_data = NULL;
- standard_qp_extra qp_extra;
/* A recursive query should always have setOperations */
Assert(!root->hasRecursion);
@@ -1880,29 +1879,34 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
else
root->limit_tuples = limit_tuples;
- /* Set up data needed by standard_qp_callback */
- qp_extra.tlist = tlist;
- qp_extra.activeWindows = activeWindows;
- qp_extra.groupClause = (gset_data
- ? (gset_data->rollups ? linitial_node(RollupData, gset_data->rollups)->groupClause : NIL)
- : parse->groupClause);
+ /*
+ * Build the base relations and equivalence classes, based on the
+ * scan/join portion of this query, ie the FROM/WHERE clauses.
+ */
+ process_jointree(root, tlist);
+
+ /*
+ * Generate pathkey representations of the query's sort clause,
+ * distinct clause, etc.
+ */
+ compute_pathkeys(root, tlist, activeWindows,
+ gset_data
+ ? (gset_data->rollups ? linitial_node(RollupData, gset_data->rollups)->groupClause : NIL)
+ : parse->groupClause);
/*
* Generate the best unsorted and presorted paths for the scan/join
* portion of this Query, ie the processing represented by the
* FROM/WHERE clauses. (Note there may not be any presorted paths.)
- * We also generate (in standard_qp_callback) pathkey representations
- * of the query's sort clause, distinct clause, etc.
*/
- current_rel = query_planner(root, tlist,
- standard_qp_callback, &qp_extra);
+ current_rel = query_planner(root);
/*
* Convert the query's result tlist into PathTarget format.
*
- * Note: it's desirable to not do this till after query_planner(),
+ * Note: it's desirable to not do this till after process_jointree(), (FIXME: or query_planner()?)
* because the target width estimates can use per-Var width numbers
- * that were obtained within query_planner().
+ * that were obtained within process_jointree().
*/
final_target = create_pathtarget(root, tlist);
final_target_parallel_safe =
@@ -3440,26 +3444,23 @@ reorder_grouping_sets(List *groupingsets, List *sortclause)
}
/*
- * Compute query_pathkeys and other pathkeys during plan generation
+ * Compute query_pathkeys and other pathkeys, to tell query_planner() which
+ * orderings would be useful for the later planner stages.
*/
static void
-standard_qp_callback(PlannerInfo *root, void *extra)
+compute_pathkeys(PlannerInfo *root, List *tlist, List *activeWindows, List *groupClause)
{
Query *parse = root->parse;
- standard_qp_extra *qp_extra = (standard_qp_extra *) extra;
- List *tlist = qp_extra->tlist;
- List *activeWindows = qp_extra->activeWindows;
/*
* Calculate pathkeys that represent grouping/ordering requirements. The
* sortClause is certainly sort-able, but GROUP BY and DISTINCT might not
* be, in which case we just leave their pathkeys empty.
*/
- if (qp_extra->groupClause &&
- grouping_is_sortable(qp_extra->groupClause))
+ if (groupClause && grouping_is_sortable(groupClause))
root->group_pathkeys =
make_pathkeys_for_sortclauses(root,
- qp_extra->groupClause,
+ groupClause,
tlist);
else
root->group_pathkeys = NIL;
diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c
index c79d0f25d4..1b393aa936 100644
--- a/src/backend/optimizer/util/placeholder.c
+++ b/src/backend/optimizer/util/placeholder.c
@@ -62,7 +62,7 @@ make_placeholder_expr(PlannerInfo *root, Expr *expr, Relids phrels)
* We build PlaceHolderInfos only for PHVs that are still present in the
* simplified query passed to query_planner().
*
- * Note: this should only be called after query_planner() has started. Also,
+ * Note: this should only be called after process_jointree() (FIXME: or query_planner()?). Also,
* create_new_ph must not be true after deconstruct_jointree begins, because
* make_outerjoininfo assumes that we already know about all placeholders.
*/
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 5af484024a..6bf9e84c4a 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -218,6 +218,13 @@ typedef struct PlannerInfo
Relids nullable_baserels;
/*
+ * join_subproblem_list represents the join tree, as a tree of join order
+ * decisions that need to be made by make_one_rel(). See
+ * deconstruct_jointree().
+ */
+ List *join_subproblem_list;
+
+ /*
* join_rel_list is a list of all join-relation RelOptInfos we have
* considered in this planning run. For small problems we just scan the
* list to do lookups, but when there are many join relations we build a
@@ -339,7 +346,7 @@ typedef struct PlannerInfo
/*
* In places where it's known that simple_rte_array[] must have been prepared
* already, we just index into it to fetch RTEs. In code that might be
- * executed before or after entering query_planner(), use this macro.
+ * executed before or after entering process_jointree(), use this macro.
*/
#define planner_rt_fetch(rti, root) \
((root)->simple_rte_array ? (root)->simple_rte_array[rti] : \
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index c8ab0280d2..4e61dff241 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -31,14 +31,11 @@ extern double cursor_tuple_fraction;
extern int force_parallel_mode;
extern bool parallel_leader_participation;
-/* query_planner callback to compute query_pathkeys */
-typedef void (*query_pathkeys_callback) (PlannerInfo *root, void *extra);
-
/*
* prototypes for plan/planmain.c
*/
-extern RelOptInfo *query_planner(PlannerInfo *root, List *tlist,
- query_pathkeys_callback qp_callback, void *qp_extra);
+extern void process_jointree(PlannerInfo *root, List *tlist);
+extern RelOptInfo *query_planner(PlannerInfo *root);
/*
* prototypes for plan/planagg.c
--
2.11.0
0002-Move-GROUP-BY-planning-code-from-planner.c-to-separa.patch (text/x-patch)
From 2a545e9e949ca1daad6f724ceaaae20ce281d5a6 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Tue, 12 Jun 2018 15:31:39 +0300
Subject: [PATCH 2/4] Move GROUP BY planning code from planner.c to separate
file, aggpath.c
---
src/backend/optimizer/path/Makefile | 4 +-
src/backend/optimizer/path/aggpath.c | 1967 ++++++++++++
src/backend/optimizer/plan/Makefile | 4 +-
src/backend/optimizer/plan/planner.c | 5420 +++++++++++-----------------------
src/include/nodes/relation.h | 19 +
src/include/optimizer/paths.h | 13 +
src/include/optimizer/planmain.h | 1 +
src/include/optimizer/planner.h | 2 +
8 files changed, 3741 insertions(+), 3689 deletions(-)
create mode 100644 src/backend/optimizer/path/aggpath.c
diff --git a/src/backend/optimizer/path/Makefile b/src/backend/optimizer/path/Makefile
index 6864a62132..21216a621d 100644
--- a/src/backend/optimizer/path/Makefile
+++ b/src/backend/optimizer/path/Makefile
@@ -12,7 +12,7 @@ subdir = src/backend/optimizer/path
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = allpaths.o clausesel.o costsize.o equivclass.o indxpath.o \
- joinpath.o joinrels.o pathkeys.o tidpath.o
+OBJS = aggpath.o allpaths.o clausesel.o costsize.o equivclass.o \
+ indxpath.o joinpath.o joinrels.o pathkeys.o tidpath.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/optimizer/path/aggpath.c b/src/backend/optimizer/path/aggpath.c
new file mode 100644
index 0000000000..618171b148
--- /dev/null
+++ b/src/backend/optimizer/path/aggpath.c
@@ -0,0 +1,1967 @@
+/*-------------------------------------------------------------------------
+ *
+ * aggpath.c
+ * Routines to generate paths for processing GROUP BY and aggregates.
+ *
+ * XXX
+ *
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/optimizer/path/aggpath.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_aggregate.h"
+#include "catalog/pg_type.h"
+#include "executor/nodeAgg.h"
+#include "foreign/fdwapi.h"
+#include "lib/knapsack.h"
+#include "miscadmin.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/clauses.h"
+#include "optimizer/cost.h"
+#include "optimizer/pathnode.h"
+#include "optimizer/paths.h"
+#include "optimizer/planmain.h"
+#include "optimizer/planner.h"
+#include "optimizer/prep.h"
+#include "optimizer/tlist.h"
+#include "optimizer/var.h"
+#include "parser/parsetree.h"
+#include "parser/parse_agg.h"
+#include "parser/parse_clause.h"
+#include "rewrite/rewriteManip.h"
+#include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/syscache.h"
+
+static RelOptInfo *make_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
+ PathTarget *target, bool target_parallel_safe,
+ Node *havingQual);
+static bool is_degenerate_grouping(PlannerInfo *root);
+static void create_degenerate_grouping_paths(PlannerInfo *root,
+ RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel);
+static void create_ordinary_grouping_paths(PlannerInfo *root,
+ RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel,
+ const AggClauseCosts *agg_costs,
+ grouping_sets_data *gd,
+ GroupPathExtraData *extra,
+ RelOptInfo **partially_grouped_rel_p);
+static bool can_partial_agg(PlannerInfo *root,
+ const AggClauseCosts *agg_costs);
+static void create_partitionwise_grouping_paths(PlannerInfo *root,
+ RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel,
+ RelOptInfo *partially_grouped_rel,
+ const AggClauseCosts *agg_costs,
+ grouping_sets_data *gd,
+ PartitionwiseAggregateType patype,
+ GroupPathExtraData *extra);
+static bool group_by_has_partkey(RelOptInfo *input_rel,
+ List *targetList,
+ List *groupClause);
+static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
+ RelOptInfo *grouped_rel,
+ RelOptInfo *input_rel,
+ grouping_sets_data *gd,
+ GroupPathExtraData *extra,
+ bool force_rel_creation);
+static void gather_grouping_paths(PlannerInfo *root, RelOptInfo *rel);
+static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel,
+ RelOptInfo *partially_grouped_rel,
+ const AggClauseCosts *agg_costs,
+ grouping_sets_data *gd,
+ double dNumGroups,
+ GroupPathExtraData *extra);
+static PathTarget *make_partial_grouping_target(PlannerInfo *root,
+ PathTarget *grouping_target,
+ Node *havingQual);
+static void consider_groupingsets_paths(PlannerInfo *root,
+ RelOptInfo *grouped_rel,
+ Path *path,
+ bool is_sorted,
+ bool can_hash,
+ grouping_sets_data *gd,
+ const AggClauseCosts *agg_costs,
+ double dNumGroups);
+
+/*
+ * Estimate number of groups produced by grouping clauses (1 if not grouping)
+ *
+ * path_rows: number of output rows from scan/join step
+ * gd: grouping sets data including list of grouping sets and their clauses
+ * target_list: target list containing group clause references
+ *
+ * If doing grouping sets, we also annotate the gsets data with the estimates
+ * for each set and each individual rollup list, with a view to later
+ * determining whether some combination of them could be hashed instead.
+ */
+static double
+get_number_of_groups(PlannerInfo *root,
+ double path_rows,
+ grouping_sets_data *gd,
+ List *target_list)
+{
+ Query *parse = root->parse;
+ double dNumGroups;
+
+ if (parse->groupClause)
+ {
+ List *groupExprs;
+
+ if (parse->groupingSets)
+ {
+ /* Add up the estimates for each grouping set */
+ ListCell *lc;
+ ListCell *lc2;
+
+ Assert(gd); /* keep Coverity happy */
+
+ dNumGroups = 0;
+
+ foreach(lc, gd->rollups)
+ {
+ RollupData *rollup = lfirst_node(RollupData, lc);
+ ListCell *lc;
+
+ groupExprs = get_sortgrouplist_exprs(rollup->groupClause,
+ target_list);
+
+ rollup->numGroups = 0.0;
+
+ forboth(lc, rollup->gsets, lc2, rollup->gsets_data)
+ {
+ List *gset = (List *) lfirst(lc);
+ GroupingSetData *gs = lfirst_node(GroupingSetData, lc2);
+ double numGroups = estimate_num_groups(root,
+ groupExprs,
+ path_rows,
+ &gset);
+
+ gs->numGroups = numGroups;
+ rollup->numGroups += numGroups;
+ }
+
+ dNumGroups += rollup->numGroups;
+ }
+
+ if (gd->hash_sets_idx)
+ {
+ ListCell *lc;
+
+ gd->dNumHashGroups = 0;
+
+ groupExprs = get_sortgrouplist_exprs(parse->groupClause,
+ target_list);
+
+ forboth(lc, gd->hash_sets_idx, lc2, gd->unsortable_sets)
+ {
+ List *gset = (List *) lfirst(lc);
+ GroupingSetData *gs = lfirst_node(GroupingSetData, lc2);
+ double numGroups = estimate_num_groups(root,
+ groupExprs,
+ path_rows,
+ &gset);
+
+ gs->numGroups = numGroups;
+ gd->dNumHashGroups += numGroups;
+ }
+
+ dNumGroups += gd->dNumHashGroups;
+ }
+ }
+ else
+ {
+ /* Plain GROUP BY */
+ groupExprs = get_sortgrouplist_exprs(parse->groupClause,
+ target_list);
+
+ dNumGroups = estimate_num_groups(root, groupExprs, path_rows,
+ NULL);
+ }
+ }
+ else if (parse->groupingSets)
+ {
+ /* Empty grouping sets ... one result row for each one */
+ dNumGroups = list_length(parse->groupingSets);
+ }
+ else if (parse->hasAggs || root->hasHavingQual)
+ {
+ /* Plain aggregation, one result row */
+ dNumGroups = 1;
+ }
+ else
+ {
+ /* Not grouping */
+ dNumGroups = 1;
+ }
+
+ return dNumGroups;
+}
+
+/*
+ * estimate_hashagg_tablesize
+ * estimate the number of bytes that a hash aggregate hashtable will
+ * require based on the agg_costs, path width and dNumGroups.
+ *
+ * XXX this may be over-estimating the size now that hashagg knows to omit
+ * unneeded columns from the hashtable. Also for mixed-mode grouping sets,
+ * grouping columns not in the hashed set are counted here even though hashagg
+ * won't store them. Is this a problem?
+ */
+static Size
+estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
+ double dNumGroups)
+{
+ Size hashentrysize;
+
+ /* Estimate per-hash-entry space at tuple width... */
+ hashentrysize = MAXALIGN(path->pathtarget->width) +
+ MAXALIGN(SizeofMinimalTupleHeader);
+
+ /* plus space for pass-by-ref transition values... */
+ hashentrysize += agg_costs->transitionSpace;
+ /* plus the per-hash-entry overhead */
+ hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
+
+ /*
+ * Note that this disregards the effect of fill-factor and growth policy
+ * of the hash-table. That's probably ok, given default the default
+ * fill-factor is relatively high. It'd be hard to meaningfully factor in
+ * "double-in-size" growth policies here.
+ */
+ return hashentrysize * dNumGroups;
+}
+
+/*
+ * create_grouping_paths
+ *
+ * Build a new upperrel containing Paths for grouping and/or aggregation.
+ * Along the way, we also build an upperrel for Paths which are partially
+ * grouped and/or aggregated. A partially grouped and/or aggregated path
+ * needs a FinalizeAggregate node to complete the aggregation. Currently,
+ * the only partially grouped paths we build are also partial paths; that
+ * is, they need a Gather and then a FinalizeAggregate.
+ *
+ * input_rel: contains the source-data Paths
+ * target: the pathtarget for the result Paths to compute
+ * agg_costs: cost info about all aggregates in query (in AGGSPLIT_SIMPLE mode)
+ * gd: grouping sets data including list of grouping sets and their clauses
+ *
+ * Note: all Paths in input_rel are expected to return the target computed
+ * by make_group_input_target.
+ */
+RelOptInfo *
+create_grouping_paths(PlannerInfo *root,
+ RelOptInfo *input_rel,
+ PathTarget *target,
+ bool target_parallel_safe,
+ const AggClauseCosts *agg_costs,
+ grouping_sets_data *gd)
+{
+ Query *parse = root->parse;
+ RelOptInfo *grouped_rel;
+ RelOptInfo *partially_grouped_rel;
+
+ /*
+ * Create grouping relation to hold fully aggregated grouping and/or
+ * aggregation paths.
+ */
+ grouped_rel = make_grouping_rel(root, input_rel, target,
+ target_parallel_safe, parse->havingQual);
+
+ /*
+ * Create either paths for a degenerate grouping or paths for ordinary
+ * grouping, as appropriate.
+ */
+ if (is_degenerate_grouping(root))
+ create_degenerate_grouping_paths(root, input_rel, grouped_rel);
+ else
+ {
+ int flags = 0;
+ GroupPathExtraData extra;
+
+ /*
+ * Determine whether it's possible to perform sort-based
+ * implementations of grouping. (Note that if groupClause is empty,
+ * grouping_is_sortable() is trivially true, and all the
+ * pathkeys_contained_in() tests will succeed too, so that we'll
+ * consider every surviving input path.)
+ *
+ * If we have grouping sets, we might be able to sort some but not all
+ * of them; in this case, we need can_sort to be true as long as we
+ * must consider any sorted-input plan.
+ */
+ if ((gd && gd->rollups != NIL)
+ || grouping_is_sortable(parse->groupClause))
+ flags |= GROUPING_CAN_USE_SORT;
+
+ /*
+ * Determine whether we should consider hash-based implementations of
+ * grouping.
+ *
+ * Hashed aggregation only applies if we're grouping. If we have
+ * grouping sets, some groups might be hashable but others not; in
+ * this case we set can_hash true as long as there is nothing globally
+ * preventing us from hashing (and we should therefore consider plans
+ * with hashes).
+ *
+ * Executor doesn't support hashed aggregation with DISTINCT or ORDER
+ * BY aggregates. (Doing so would imply storing *all* the input
+ * values in the hash table, and/or running many sorts in parallel,
+ * either of which seems like a certain loser.) We similarly don't
+ * support ordered-set aggregates in hashed aggregation, but that case
+ * is also included in the numOrderedAggs count.
+ *
+ * Note: grouping_is_hashable() is much more expensive to check than
+ * the other gating conditions, so we want to do it last.
+ */
+ if ((parse->groupClause != NIL &&
+ agg_costs->numOrderedAggs == 0 &&
+ (gd ? gd->any_hashable : grouping_is_hashable(parse->groupClause))))
+ flags |= GROUPING_CAN_USE_HASH;
+
+ /*
+ * Determine whether partial aggregation is possible.
+ */
+ if (can_partial_agg(root, agg_costs))
+ flags |= GROUPING_CAN_PARTIAL_AGG;
+
+ extra.flags = flags;
+ extra.target_parallel_safe = target_parallel_safe;
+ extra.havingQual = parse->havingQual;
+ extra.targetList = parse->targetList;
+ extra.partial_costs_set = false;
+
+ /*
+ * Determine whether partitionwise aggregation is in theory possible.
+ * It can be disabled by the user, and for now, we don't try to
+ * support grouping sets. create_ordinary_grouping_paths() will check
+ * additional conditions, such as whether input_rel is partitioned.
+ */
+ if (enable_partitionwise_aggregate && !parse->groupingSets)
+ extra.patype = PARTITIONWISE_AGGREGATE_FULL;
+ else
+ extra.patype = PARTITIONWISE_AGGREGATE_NONE;
+
+ create_ordinary_grouping_paths(root, input_rel, grouped_rel,
+ agg_costs, gd, &extra,
+ &partially_grouped_rel);
+ }
+
+ set_cheapest(grouped_rel);
+ return grouped_rel;
+}
+
+/*
+ * make_grouping_rel
+ *
+ * Create a new grouping rel and set basic properties.
+ *
+ * input_rel represents the underlying scan/join relation.
+ * target is the output expected from the grouping relation.
+ */
+static RelOptInfo *
+make_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
+ PathTarget *target, bool target_parallel_safe,
+ Node *havingQual)
+{
+ RelOptInfo *grouped_rel;
+
+ if (IS_OTHER_REL(input_rel))
+ {
+ grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG,
+ input_rel->relids);
+ grouped_rel->reloptkind = RELOPT_OTHER_UPPER_REL;
+ }
+ else
+ {
+ /*
+ * By tradition, the relids set for the main grouping relation is
+ * NULL. (This could be changed, but might require adjustments
+ * elsewhere.)
+ */
+ grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL);
+ }
+
+ /* Set target. */
+ grouped_rel->reltarget = target;
+
+ /*
+ * If the input relation is not parallel-safe, then the grouped relation
+ * can't be parallel-safe, either. Otherwise, it's parallel-safe if the
+ * target list and HAVING quals are parallel-safe.
+ */
+ if (input_rel->consider_parallel && target_parallel_safe &&
+ is_parallel_safe(root, (Node *) havingQual))
+ grouped_rel->consider_parallel = true;
+
+ /*
+ * If the input rel belongs to a single FDW, so does the grouped rel.
+ */
+ grouped_rel->serverid = input_rel->serverid;
+ grouped_rel->userid = input_rel->userid;
+ grouped_rel->useridiscurrent = input_rel->useridiscurrent;
+ grouped_rel->fdwroutine = input_rel->fdwroutine;
+
+ return grouped_rel;
+}
+
+/*
+ * is_degenerate_grouping
+ *
+ * A degenerate grouping is one in which the query has a HAVING qual and/or
+ * grouping sets, but no aggregates and no GROUP BY (which implies that the
+ * grouping sets are all empty).
+ */
+static bool
+is_degenerate_grouping(PlannerInfo *root)
+{
+ Query *parse = root->parse;
+
+ return (root->hasHavingQual || parse->groupingSets) &&
+ !parse->hasAggs && parse->groupClause == NIL;
+}
+
+/*
+ * create_degenerate_grouping_paths
+ *
+ * When the grouping is degenerate (see is_degenerate_grouping), we are
+ * supposed to emit either zero or one row for each grouping set depending on
+ * whether HAVING succeeds. Furthermore, there cannot be any variables in
+ * either HAVING or the targetlist, so we actually do not need the FROM table
+ * at all! We can just throw away the plan-so-far and generate a Result node.
+ * This is a sufficiently unusual corner case that it's not worth contorting
+ * the structure of this module to avoid having to generate the earlier paths
+ * in the first place.
+ */
+static void
+create_degenerate_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel)
+{
+ Query *parse = root->parse;
+ int nrows;
+ Path *path;
+
+ nrows = list_length(parse->groupingSets);
+ if (nrows > 1)
+ {
+ /*
+ * Doesn't seem worthwhile writing code to cons up a generate_series
+ * or a values scan to emit multiple rows. Instead just make N clones
+ * and append them. (With a volatile HAVING clause, this means you
+ * might get between 0 and N output rows. Offhand I think that's
+ * desired.)
+ */
+ List *paths = NIL;
+
+ while (--nrows >= 0)
+ {
+ path = (Path *)
+ create_result_path(root, grouped_rel,
+ grouped_rel->reltarget,
+ (List *) parse->havingQual);
+ paths = lappend(paths, path);
+ }
+ path = (Path *)
+ create_append_path(root,
+ grouped_rel,
+ paths,
+ NIL,
+ NULL,
+ 0,
+ false,
+ NIL,
+ -1);
+ }
+ else
+ {
+ /* No grouping sets, or just one, so one output row */
+ path = (Path *)
+ create_result_path(root, grouped_rel,
+ grouped_rel->reltarget,
+ (List *) parse->havingQual);
+ }
+
+ add_path(grouped_rel, path);
+}
+
+/*
+ * create_ordinary_grouping_paths
+ *
+ * Create grouping paths for the ordinary (that is, non-degenerate) case.
+ *
+ * We need to consider sorted and hashed aggregation in the same function,
+ * because otherwise (1) it would be harder to throw an appropriate error
+ * message if neither way works, and (2) we should not allow hashtable size
+ * considerations to dissuade us from using hashing if sorting is not possible.
+ *
+ * *partially_grouped_rel_p will be set to the partially grouped rel which this
+ * function creates, or to NULL if it doesn't create one.
+ */
+static void
+create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel,
+ const AggClauseCosts *agg_costs,
+ grouping_sets_data *gd,
+ GroupPathExtraData *extra,
+ RelOptInfo **partially_grouped_rel_p)
+{
+ Path *cheapest_path = input_rel->cheapest_total_path;
+ RelOptInfo *partially_grouped_rel = NULL;
+ double dNumGroups;
+ PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
+
+ /*
+ * If this is the topmost grouping relation or if the parent relation is
+ * doing some form of partitionwise aggregation, then we may be able to do
+ * it at this level also. However, if the input relation is not
+ * partitioned, partitionwise aggregate is impossible, and if it is dummy,
+ * partitionwise aggregate is pointless.
+ */
+ if (extra->patype != PARTITIONWISE_AGGREGATE_NONE &&
+ input_rel->part_scheme && input_rel->part_rels &&
+ !IS_DUMMY_REL(input_rel))
+ {
+ /*
+ * If this is the topmost relation or if the parent relation is doing
+ * full partitionwise aggregation, then we can do full partitionwise
+ * aggregation provided that the GROUP BY clause contains all of the
+ * partitioning columns at this level. Otherwise, we can do at most
+ * partial partitionwise aggregation. But if partial aggregation is
+ * not supported in general then we can't use it for partitionwise
+ * aggregation either.
+ */
+ if (extra->patype == PARTITIONWISE_AGGREGATE_FULL &&
+ group_by_has_partkey(input_rel, extra->targetList,
+ root->parse->groupClause))
+ patype = PARTITIONWISE_AGGREGATE_FULL;
+ else if ((extra->flags & GROUPING_CAN_PARTIAL_AGG) != 0)
+ patype = PARTITIONWISE_AGGREGATE_PARTIAL;
+ else
+ patype = PARTITIONWISE_AGGREGATE_NONE;
+ }
+
+ /*
+ * Before generating paths for grouped_rel, we first generate any possible
+ * partially grouped paths; that way, later code can easily consider both
+ * parallel and non-parallel approaches to grouping.
+ */
+ if ((extra->flags & GROUPING_CAN_PARTIAL_AGG) != 0)
+ {
+ bool force_rel_creation;
+
+ /*
+ * If we're doing partitionwise aggregation at this level, force
+ * creation of a partially_grouped_rel so we can add partitionwise
+ * paths to it.
+ */
+ force_rel_creation = (patype == PARTITIONWISE_AGGREGATE_PARTIAL);
+
+ partially_grouped_rel =
+ create_partial_grouping_paths(root,
+ grouped_rel,
+ input_rel,
+ gd,
+ extra,
+ force_rel_creation);
+ }
+
+ /* Set out parameter. */
+ *partially_grouped_rel_p = partially_grouped_rel;
+
+ /* Apply partitionwise aggregation technique, if possible. */
+ if (patype != PARTITIONWISE_AGGREGATE_NONE)
+ create_partitionwise_grouping_paths(root, input_rel, grouped_rel,
+ partially_grouped_rel, agg_costs,
+ gd, patype, extra);
+
+ /* If we are doing partial aggregation only, return. */
+ if (extra->patype == PARTITIONWISE_AGGREGATE_PARTIAL)
+ {
+ Assert(partially_grouped_rel);
+
+ if (partially_grouped_rel->pathlist)
+ set_cheapest(partially_grouped_rel);
+
+ return;
+ }
+
+ /* Gather any partially grouped partial paths. */
+ if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
+ {
+ gather_grouping_paths(root, partially_grouped_rel);
+ set_cheapest(partially_grouped_rel);
+ }
+
+ /*
+ * Estimate number of groups.
+ */
+ dNumGroups = get_number_of_groups(root,
+ cheapest_path->rows,
+ gd,
+ extra->targetList);
+
+ /* Build final grouping paths */
+ add_paths_to_grouping_rel(root, input_rel, grouped_rel,
+ partially_grouped_rel, agg_costs, gd,
+ dNumGroups, extra);
+
+ /* Give a helpful error if we failed to find any implementation */
+ if (grouped_rel->pathlist == NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("could not implement GROUP BY"),
+ errdetail("Some of the datatypes only support hashing, while others only support sorting.")));
+
+ /*
+ * If there is an FDW that's responsible for all baserels of the query,
+ * let it consider adding ForeignPaths.
+ */
+ if (grouped_rel->fdwroutine &&
+ grouped_rel->fdwroutine->GetForeignUpperPaths)
+ grouped_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_GROUP_AGG,
+ input_rel, grouped_rel,
+ extra);
+
+ /* Let extensions possibly add some more paths */
+ if (create_upper_paths_hook)
+ (*create_upper_paths_hook) (root, UPPERREL_GROUP_AGG,
+ input_rel, grouped_rel,
+ extra);
+}
+
+/*
+ * add_paths_to_grouping_rel
+ *
+ * Add non-partial paths to grouping relation.
+ */
+static void
+add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel,
+ RelOptInfo *partially_grouped_rel,
+ const AggClauseCosts *agg_costs,
+ grouping_sets_data *gd, double dNumGroups,
+ GroupPathExtraData *extra)
+{
+ Query *parse = root->parse;
+ Path *cheapest_path = input_rel->cheapest_total_path;
+ ListCell *lc;
+ bool can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
+ bool can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
+ List *havingQual = (List *) extra->havingQual;
+ AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+
+ if (can_sort)
+ {
+ /*
+ * Use any available suitably-sorted path as input, and also consider
+ * sorting the cheapest-total path.
+ */
+ foreach(lc, input_rel->pathlist)
+ {
+ Path *path = (Path *) lfirst(lc);
+ bool is_sorted;
+
+ is_sorted = pathkeys_contained_in(root->group_pathkeys,
+ path->pathkeys);
+ if (path == cheapest_path || is_sorted)
+ {
+ /* Sort the cheapest-total path if it isn't already sorted */
+ if (!is_sorted)
+ path = (Path *) create_sort_path(root,
+ grouped_rel,
+ path,
+ root->group_pathkeys,
+ -1.0);
+
+ /* Now decide what to stick atop it */
+ if (parse->groupingSets)
+ {
+ consider_groupingsets_paths(root, grouped_rel,
+ path, true, can_hash,
+ gd, agg_costs, dNumGroups);
+ }
+ else if (parse->hasAggs)
+ {
+ /*
+ * We have aggregation, possibly with plain GROUP BY. Make
+ * an AggPath.
+ */
+ add_path(grouped_rel, (Path *)
+ create_agg_path(root,
+ grouped_rel,
+ path,
+ grouped_rel->reltarget,
+ parse->groupClause ? AGG_SORTED : AGG_PLAIN,
+ AGGSPLIT_SIMPLE,
+ parse->groupClause,
+ havingQual,
+ agg_costs,
+ dNumGroups));
+ }
+ else if (parse->groupClause)
+ {
+ /*
+ * We have GROUP BY without aggregation or grouping sets.
+ * Make a GroupPath.
+ */
+ add_path(grouped_rel, (Path *)
+ create_group_path(root,
+ grouped_rel,
+ path,
+ parse->groupClause,
+ havingQual,
+ dNumGroups));
+ }
+ else
+ {
+ /* Other cases should have been handled above */
+ Assert(false);
+ }
+ }
+ }
+
+ /*
+ * Instead of operating directly on the input relation, we can
+ * consider finalizing a partially aggregated path.
+ */
+ if (partially_grouped_rel != NULL)
+ {
+ foreach(lc, partially_grouped_rel->pathlist)
+ {
+ Path *path = (Path *) lfirst(lc);
+
+ /*
+ * Insert a Sort node, if required. But there's no point in
+ * sorting anything but the cheapest path.
+ */
+ if (!pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
+ {
+ if (path != partially_grouped_rel->cheapest_total_path)
+ continue;
+ path = (Path *) create_sort_path(root,
+ grouped_rel,
+ path,
+ root->group_pathkeys,
+ -1.0);
+ }
+
+ if (parse->hasAggs)
+ add_path(grouped_rel, (Path *)
+ create_agg_path(root,
+ grouped_rel,
+ path,
+ grouped_rel->reltarget,
+ parse->groupClause ? AGG_SORTED : AGG_PLAIN,
+ AGGSPLIT_FINAL_DESERIAL,
+ parse->groupClause,
+ havingQual,
+ agg_final_costs,
+ dNumGroups));
+ else
+ add_path(grouped_rel, (Path *)
+ create_group_path(root,
+ grouped_rel,
+ path,
+ parse->groupClause,
+ havingQual,
+ dNumGroups));
+ }
+ }
+ }
+
+ if (can_hash)
+ {
+ Size hashaggtablesize;
+
+ if (parse->groupingSets)
+ {
+ /*
+ * Try for a hash-only groupingsets path over unsorted input.
+ */
+ consider_groupingsets_paths(root, grouped_rel,
+ cheapest_path, false, true,
+ gd, agg_costs, dNumGroups);
+ }
+ else
+ {
+ hashaggtablesize = estimate_hashagg_tablesize(cheapest_path,
+ agg_costs,
+ dNumGroups);
+
+ /*
+ * Provided that the estimated size of the hashtable does not
+ * exceed work_mem, we'll generate a HashAgg Path, although if we
+ * were unable to sort above, then we'd better generate a Path, so
+ * that we at least have one.
+ */
+ if (hashaggtablesize < work_mem * 1024L ||
+ grouped_rel->pathlist == NIL)
+ {
+ /*
+ * We just need an Agg over the cheapest-total input path,
+ * since input order won't matter.
+ */
+ add_path(grouped_rel, (Path *)
+ create_agg_path(root, grouped_rel,
+ cheapest_path,
+ grouped_rel->reltarget,
+ AGG_HASHED,
+ AGGSPLIT_SIMPLE,
+ parse->groupClause,
+ havingQual,
+ agg_costs,
+ dNumGroups));
+ }
+ }
+
+ /*
+ * Generate a Finalize HashAgg Path atop of the cheapest partially
+ * grouped path, assuming there is one. Once again, we'll only do this
+ * if it looks as though the hash table won't exceed work_mem.
+ */
+ if (partially_grouped_rel && partially_grouped_rel->pathlist)
+ {
+ Path *path = partially_grouped_rel->cheapest_total_path;
+
+ hashaggtablesize = estimate_hashagg_tablesize(path,
+ agg_final_costs,
+ dNumGroups);
+
+ if (hashaggtablesize < work_mem * 1024L)
+ add_path(grouped_rel, (Path *)
+ create_agg_path(root,
+ grouped_rel,
+ path,
+ grouped_rel->reltarget,
+ AGG_HASHED,
+ AGGSPLIT_FINAL_DESERIAL,
+ parse->groupClause,
+ havingQual,
+ agg_final_costs,
+ dNumGroups));
+ }
+ }
+
+ /*
+ * When partitionwise aggregate is used, we might have fully aggregated
+ * paths in the partial pathlist, because add_paths_to_append_rel() will
+ * consider a path for grouped_rel consisting of a Parallel Append of
+ * non-partial paths from each child.
+ */
+ if (grouped_rel->partial_pathlist != NIL)
+ gather_grouping_paths(root, grouped_rel);
+}
+
+/*
+ * can_partial_agg
+ *
+ * Determines whether or not partial grouping and/or aggregation is possible.
+ * Returns true when possible, false otherwise.
+ */
+static bool
+can_partial_agg(PlannerInfo *root, const AggClauseCosts *agg_costs)
+{
+ Query *parse = root->parse;
+
+ if (!parse->hasAggs && parse->groupClause == NIL)
+ {
+ /*
+ * We don't know how to do parallel aggregation unless we have either
+ * some aggregates or a grouping clause.
+ */
+ return false;
+ }
+ else if (parse->groupingSets)
+ {
+ /* We don't know how to do grouping sets in parallel. */
+ return false;
+ }
+ else if (agg_costs->hasNonPartial || agg_costs->hasNonSerial)
+ {
+ /* Insufficient support for partial mode. */
+ return false;
+ }
+
+ /* Everything looks good. */
+ return true;
+}
+
+
+/*
+ * make_partial_grouping_target
+ * Generate appropriate PathTarget for output of partial aggregate
+ * (or partial grouping, if there are no aggregates) nodes.
+ *
+ * A partial aggregation node needs to emit all the same aggregates that
+ * a regular aggregation node would, plus any aggregates used in HAVING;
+ * except that the Aggref nodes should be marked as partial aggregates.
+ *
+ * In addition, we'd better emit any Vars and PlaceholderVars that are
+ * used outside of Aggrefs in the aggregation tlist and HAVING. (Presumably,
+ * these would be Vars that are grouped by or used in grouping expressions.)
+ *
+ * grouping_target is the tlist to be emitted by the topmost aggregation step.
+ * havingQual represents the HAVING clause.
+ */
+static PathTarget *
+make_partial_grouping_target(PlannerInfo *root,
+ PathTarget *grouping_target,
+ Node *havingQual)
+{
+ Query *parse = root->parse;
+ PathTarget *partial_target;
+ List *non_group_cols;
+ List *non_group_exprs;
+ int i;
+ ListCell *lc;
+
+ partial_target = create_empty_pathtarget();
+ non_group_cols = NIL;
+
+ i = 0;
+ foreach(lc, grouping_target->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Index sgref = get_pathtarget_sortgroupref(grouping_target, i);
+
+ if (sgref && parse->groupClause &&
+ get_sortgroupref_clause_noerr(sgref, parse->groupClause) != NULL)
+ {
+ /*
+ * It's a grouping column, so add it to the partial_target as-is.
+ * (This allows the upper agg step to repeat the grouping calcs.)
+ */
+ add_column_to_pathtarget(partial_target, expr, sgref);
+ }
+ else
+ {
+ /*
+ * Non-grouping column, so just remember the expression for later
+ * call to pull_var_clause.
+ */
+ non_group_cols = lappend(non_group_cols, expr);
+ }
+
+ i++;
+ }
+
+ /*
+ * If there's a HAVING clause, we'll need the Vars/Aggrefs it uses, too.
+ */
+ if (havingQual)
+ non_group_cols = lappend(non_group_cols, havingQual);
+
+ /*
+ * Pull out all the Vars, PlaceHolderVars, and Aggrefs mentioned in
+ * non-group cols (plus HAVING), and add them to the partial_target if not
+ * already present. (An expression used directly as a GROUP BY item will
+ * be present already.) Note this includes Vars used in resjunk items, so
+ * we are covering the needs of ORDER BY and window specifications.
+ */
+ non_group_exprs = pull_var_clause((Node *) non_group_cols,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_INCLUDE_PLACEHOLDERS);
+
+ add_new_columns_to_pathtarget(partial_target, non_group_exprs);
+
+ /*
+ * Adjust Aggrefs to put them in partial mode. At this point all Aggrefs
+ * are at the top level of the target list, so we can just scan the list
+ * rather than recursing through the expression trees.
+ */
+ foreach(lc, partial_target->exprs)
+ {
+ Aggref *aggref = (Aggref *) lfirst(lc);
+
+ if (IsA(aggref, Aggref))
+ {
+ Aggref *newaggref;
+
+ /*
+ * We shouldn't need to copy the substructure of the Aggref node,
+ * but flat-copy the node itself to avoid damaging other trees.
+ */
+ newaggref = makeNode(Aggref);
+ memcpy(newaggref, aggref, sizeof(Aggref));
+
+ /* For now, assume serialization is required */
+ mark_partial_aggref(newaggref, AGGSPLIT_INITIAL_SERIAL);
+
+ lfirst(lc) = newaggref;
+ }
+ }
+
+ /* clean up cruft */
+ list_free(non_group_exprs);
+ list_free(non_group_cols);
+
+ /* XXX this causes some redundant cost calculation ... */
+ return set_pathtarget_cost_width(root, partial_target);
+}
+
+/*
+ * create_partial_grouping_paths
+ *
+ * Create a new upper relation representing the result of partial aggregation
+ * and populate it with appropriate paths. Note that we don't finalize the
+ * lists of paths here, so the caller can add additional partial or non-partial
+ * paths and must afterward call gather_grouping_paths and set_cheapest on
+ * the returned upper relation.
+ *
+ * All paths for this new upper relation -- both partial and non-partial --
+ * have been partially aggregated but require a subsequent FinalizeAggregate
+ * step.
+ *
+ * NB: This function is allowed to return NULL if it determines that there is
+ * no real need to create a new RelOptInfo.
+ */
+static RelOptInfo *
+create_partial_grouping_paths(PlannerInfo *root,
+ RelOptInfo *grouped_rel,
+ RelOptInfo *input_rel,
+ grouping_sets_data *gd,
+ GroupPathExtraData *extra,
+ bool force_rel_creation)
+{
+ Query *parse = root->parse;
+ RelOptInfo *partially_grouped_rel;
+ AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
+ AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+ Path *cheapest_partial_path = NULL;
+ Path *cheapest_total_path = NULL;
+ double dNumPartialGroups = 0;
+ double dNumPartialPartialGroups = 0;
+ ListCell *lc;
+ bool can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
+ bool can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
+
+ /*
+ * Consider whether we should generate partially aggregated non-partial
+ * paths. We can only do this if we have a non-partial path, and only if
+ * the parent of the input rel is performing partial partitionwise
+ * aggregation. (Note that extra->patype is the type of partitionwise
+ * aggregation being used at the parent level, not this level.)
+ */
+ if (input_rel->pathlist != NIL &&
+ extra->patype == PARTITIONWISE_AGGREGATE_PARTIAL)
+ cheapest_total_path = input_rel->cheapest_total_path;
+
+ /*
+ * If parallelism is possible for grouped_rel, then we should consider
+ * generating partially-grouped partial paths. However, if the input rel
+ * has no partial paths, then we can't.
+ */
+ if (grouped_rel->consider_parallel && input_rel->partial_pathlist != NIL)
+ cheapest_partial_path = linitial(input_rel->partial_pathlist);
+
+ /*
+ * If we can't partially aggregate partial paths, and we can't partially
+ * aggregate non-partial paths, then don't bother creating the new
+ * RelOptInfo at all, unless the caller specified force_rel_creation.
+ */
+ if (cheapest_total_path == NULL &&
+ cheapest_partial_path == NULL &&
+ !force_rel_creation)
+ return NULL;
+
+ /*
+ * Build a new upper relation to represent the result of partially
+ * aggregating the rows from the input relation.
+ */
+ partially_grouped_rel = fetch_upper_rel(root,
+ UPPERREL_PARTIAL_GROUP_AGG,
+ grouped_rel->relids);
+ partially_grouped_rel->consider_parallel =
+ grouped_rel->consider_parallel;
+ partially_grouped_rel->reloptkind = grouped_rel->reloptkind;
+ partially_grouped_rel->serverid = grouped_rel->serverid;
+ partially_grouped_rel->userid = grouped_rel->userid;
+ partially_grouped_rel->useridiscurrent = grouped_rel->useridiscurrent;
+ partially_grouped_rel->fdwroutine = grouped_rel->fdwroutine;
+
+ /*
+ * Build target list for partial aggregate paths. These paths cannot just
+ * emit the same tlist as regular aggregate paths, because (1) we must
+ * include Vars and Aggrefs needed in HAVING, which might not appear in
+ * the result tlist, and (2) the Aggrefs must be set in partial mode.
+ */
+ partially_grouped_rel->reltarget =
+ make_partial_grouping_target(root, grouped_rel->reltarget,
+ extra->havingQual);
+
+ if (!extra->partial_costs_set)
+ {
+ /*
+ * Collect statistics about aggregates for estimating costs of
+ * performing aggregation in parallel.
+ */
+ MemSet(agg_partial_costs, 0, sizeof(AggClauseCosts));
+ MemSet(agg_final_costs, 0, sizeof(AggClauseCosts));
+ if (parse->hasAggs)
+ {
+ List *partial_target_exprs;
+
+ /* partial phase */
+ partial_target_exprs = partially_grouped_rel->reltarget->exprs;
+ get_agg_clause_costs(root, (Node *) partial_target_exprs,
+ AGGSPLIT_INITIAL_SERIAL,
+ agg_partial_costs);
+
+ /* final phase */
+ get_agg_clause_costs(root, (Node *) grouped_rel->reltarget->exprs,
+ AGGSPLIT_FINAL_DESERIAL,
+ agg_final_costs);
+ get_agg_clause_costs(root, extra->havingQual,
+ AGGSPLIT_FINAL_DESERIAL,
+ agg_final_costs);
+ }
+
+ extra->partial_costs_set = true;
+ }
+
+ /* Estimate number of partial groups. */
+ if (cheapest_total_path != NULL)
+ dNumPartialGroups =
+ get_number_of_groups(root,
+ cheapest_total_path->rows,
+ gd,
+ extra->targetList);
+ if (cheapest_partial_path != NULL)
+ dNumPartialPartialGroups =
+ get_number_of_groups(root,
+ cheapest_partial_path->rows,
+ gd,
+ extra->targetList);
+
+ if (can_sort && cheapest_total_path != NULL)
+ {
+ /* This should have been checked previously */
+ Assert(parse->hasAggs || parse->groupClause);
+
+ /*
+ * Use any available suitably-sorted path as input, and also consider
+ * sorting the cheapest partial path.
+ */
+ foreach(lc, input_rel->pathlist)
+ {
+ Path *path = (Path *) lfirst(lc);
+ bool is_sorted;
+
+ is_sorted = pathkeys_contained_in(root->group_pathkeys,
+ path->pathkeys);
+ if (path == cheapest_total_path || is_sorted)
+ {
+ /* Sort the cheapest partial path, if it isn't already */
+ if (!is_sorted)
+ path = (Path *) create_sort_path(root,
+ partially_grouped_rel,
+ path,
+ root->group_pathkeys,
+ -1.0);
+
+ if (parse->hasAggs)
+ add_path(partially_grouped_rel, (Path *)
+ create_agg_path(root,
+ partially_grouped_rel,
+ path,
+ partially_grouped_rel->reltarget,
+ parse->groupClause ? AGG_SORTED : AGG_PLAIN,
+ AGGSPLIT_INITIAL_SERIAL,
+ parse->groupClause,
+ NIL,
+ agg_partial_costs,
+ dNumPartialGroups));
+ else
+ add_path(partially_grouped_rel, (Path *)
+ create_group_path(root,
+ partially_grouped_rel,
+ path,
+ parse->groupClause,
+ NIL,
+ dNumPartialGroups));
+ }
+ }
+ }
+
+ if (can_sort && cheapest_partial_path != NULL)
+ {
+ /* Similar to above logic, but for partial paths. */
+ foreach(lc, input_rel->partial_pathlist)
+ {
+ Path *path = (Path *) lfirst(lc);
+ bool is_sorted;
+
+ is_sorted = pathkeys_contained_in(root->group_pathkeys,
+ path->pathkeys);
+ if (path == cheapest_partial_path || is_sorted)
+ {
+ /* Sort the cheapest partial path, if it isn't already */
+ if (!is_sorted)
+ path = (Path *) create_sort_path(root,
+ partially_grouped_rel,
+ path,
+ root->group_pathkeys,
+ -1.0);
+
+ if (parse->hasAggs)
+ add_partial_path(partially_grouped_rel, (Path *)
+ create_agg_path(root,
+ partially_grouped_rel,
+ path,
+ partially_grouped_rel->reltarget,
+ parse->groupClause ? AGG_SORTED : AGG_PLAIN,
+ AGGSPLIT_INITIAL_SERIAL,
+ parse->groupClause,
+ NIL,
+ agg_partial_costs,
+ dNumPartialPartialGroups));
+ else
+ add_partial_path(partially_grouped_rel, (Path *)
+ create_group_path(root,
+ partially_grouped_rel,
+ path,
+ parse->groupClause,
+ NIL,
+ dNumPartialPartialGroups));
+ }
+ }
+ }
+
+ if (can_hash && cheapest_total_path != NULL)
+ {
+ Size hashaggtablesize;
+
+ /* Checked above */
+ Assert(parse->hasAggs || parse->groupClause);
+
+ hashaggtablesize =
+ estimate_hashagg_tablesize(cheapest_total_path,
+ agg_partial_costs,
+ dNumPartialGroups);
+
+ /*
+		 * Tentatively produce a partial HashAgg Path if it looks like the
+		 * hash table will fit in work_mem.
+ */
+		if (hashaggtablesize < work_mem * 1024L)
+ {
+ add_path(partially_grouped_rel, (Path *)
+ create_agg_path(root,
+ partially_grouped_rel,
+ cheapest_total_path,
+ partially_grouped_rel->reltarget,
+ AGG_HASHED,
+ AGGSPLIT_INITIAL_SERIAL,
+ parse->groupClause,
+ NIL,
+ agg_partial_costs,
+ dNumPartialGroups));
+ }
+ }
+
+ if (can_hash && cheapest_partial_path != NULL)
+ {
+ Size hashaggtablesize;
+
+ hashaggtablesize =
+ estimate_hashagg_tablesize(cheapest_partial_path,
+ agg_partial_costs,
+ dNumPartialPartialGroups);
+
+ /* Do the same for partial paths. */
+		if (hashaggtablesize < work_mem * 1024L)
+ {
+ add_partial_path(partially_grouped_rel, (Path *)
+ create_agg_path(root,
+ partially_grouped_rel,
+ cheapest_partial_path,
+ partially_grouped_rel->reltarget,
+ AGG_HASHED,
+ AGGSPLIT_INITIAL_SERIAL,
+ parse->groupClause,
+ NIL,
+ agg_partial_costs,
+ dNumPartialPartialGroups));
+ }
+ }
+
+ /*
+ * If there is an FDW that's responsible for all baserels of the query,
+ * let it consider adding partially grouped ForeignPaths.
+ */
+ if (partially_grouped_rel->fdwroutine &&
+ partially_grouped_rel->fdwroutine->GetForeignUpperPaths)
+ {
+ FdwRoutine *fdwroutine = partially_grouped_rel->fdwroutine;
+
+ fdwroutine->GetForeignUpperPaths(root,
+ UPPERREL_PARTIAL_GROUP_AGG,
+ input_rel, partially_grouped_rel,
+ extra);
+ }
+
+ return partially_grouped_rel;
+}
+
+/*
+ * Generate Gather and Gather Merge paths for a grouping relation or partial
+ * grouping relation.
+ *
+ * generate_gather_paths does most of the work, but we also consider a special
+ * case: we could try sorting the data by the group_pathkeys and then applying
+ * Gather Merge.
+ *
+ * NB: This function shouldn't be used for anything other than a grouped or
+ * partially grouped relation, not only because it explicitly references
+ * group_pathkeys, but also because it passes "true" as the third argument
+ * to generate_gather_paths().
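+ *
+ * For illustration (a sketch of the expected plan shape, not output from
+ * this patch): the special case yields Gather Merge -> Sort -> cheapest
+ * partial path, which a Finalize Aggregate step above can consume without
+ * any further sorting.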
+ */
+static void
+gather_grouping_paths(PlannerInfo *root, RelOptInfo *rel)
+{
+ Path *cheapest_partial_path;
+
+ /* Try Gather for unordered paths and Gather Merge for ordered ones. */
+ generate_gather_paths(root, rel, true);
+
+ /* Try cheapest partial path + explicit Sort + Gather Merge. */
+ cheapest_partial_path = linitial(rel->partial_pathlist);
+ if (!pathkeys_contained_in(root->group_pathkeys,
+ cheapest_partial_path->pathkeys))
+ {
+ Path *path;
+ double total_groups;
+
+ total_groups =
+ cheapest_partial_path->rows * cheapest_partial_path->parallel_workers;
+ path = (Path *) create_sort_path(root, rel, cheapest_partial_path,
+ root->group_pathkeys,
+ -1.0);
+ path = (Path *)
+ create_gather_merge_path(root,
+ rel,
+ path,
+ rel->reltarget,
+ root->group_pathkeys,
+ NULL,
+ &total_groups);
+
+ add_path(rel, path);
+ }
+}
+
+/*
+ * create_partitionwise_grouping_paths
+ *
+ * If the partition keys of the input relation are part of the GROUP BY clause, all
+ * the rows belonging to a given group come from a single partition. This
+ * allows aggregation/grouping over a partitioned relation to be broken down
+ * into aggregation/grouping on each partition. This should be no worse, and
+ * often better, than the normal approach.
+ *
+ * However, if the GROUP BY clause does not contain all the partition keys,
+ * rows from a given group may be spread across multiple partitions. In that
+ * case, we perform partial aggregation for each group, append the results,
+ * and then finalize aggregation. This is less certain to win than the
+ * previous case. It may win if the PartialAggregate stage greatly reduces
+ * the number of groups, because fewer rows will pass through the Append node.
+ * It may lose if we have lots of small groups.
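+ *
+ * For example (hypothetical table and columns): if t is partitioned by
+ * (region), then GROUP BY region, city contains the partition key and
+ * qualifies for full partitionwise aggregation, whereas GROUP BY city
+ * alone must use the partial-aggregate / Append / finalize scheme.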
+ */
+static void
+create_partitionwise_grouping_paths(PlannerInfo *root,
+ RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel,
+ RelOptInfo *partially_grouped_rel,
+ const AggClauseCosts *agg_costs,
+ grouping_sets_data *gd,
+ PartitionwiseAggregateType patype,
+ GroupPathExtraData *extra)
+{
+ int nparts = input_rel->nparts;
+ int cnt_parts;
+ List *grouped_live_children = NIL;
+ List *partially_grouped_live_children = NIL;
+ PathTarget *target = grouped_rel->reltarget;
+
+ Assert(patype != PARTITIONWISE_AGGREGATE_NONE);
+ Assert(patype != PARTITIONWISE_AGGREGATE_PARTIAL ||
+ partially_grouped_rel != NULL);
+
+ /* Add paths for partitionwise aggregation/grouping. */
+ for (cnt_parts = 0; cnt_parts < nparts; cnt_parts++)
+ {
+ RelOptInfo *child_input_rel = input_rel->part_rels[cnt_parts];
+ PathTarget *child_target = copy_pathtarget(target);
+ AppendRelInfo **appinfos;
+ int nappinfos;
+ GroupPathExtraData child_extra;
+ RelOptInfo *child_grouped_rel;
+ RelOptInfo *child_partially_grouped_rel;
+
+ /* Input child rel must have a path */
+ Assert(child_input_rel->pathlist != NIL);
+
+ /*
+ * Copy the given "extra" structure as is and then override the
+ * members specific to this child.
+ */
+ memcpy(&child_extra, extra, sizeof(child_extra));
+
+ appinfos = find_appinfos_by_relids(root, child_input_rel->relids,
+ &nappinfos);
+
+ child_target->exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) target->exprs,
+ nappinfos, appinfos);
+
+ /* Translate havingQual and targetList. */
+ child_extra.havingQual = (Node *)
+ adjust_appendrel_attrs(root,
+ extra->havingQual,
+ nappinfos, appinfos);
+ child_extra.targetList = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) extra->targetList,
+ nappinfos, appinfos);
+
+ /*
+ * extra->patype was the value computed for our parent rel; patype is
+ * the value for this relation. For the child, our value is its
+ * parent rel's value.
+ */
+ child_extra.patype = patype;
+
+ /*
+ * Create grouping relation to hold fully aggregated grouping and/or
+ * aggregation paths for the child.
+ */
+ child_grouped_rel = make_grouping_rel(root, child_input_rel,
+ child_target,
+ extra->target_parallel_safe,
+ child_extra.havingQual);
+
+ /* Ignore empty children. They contribute nothing. */
+ if (IS_DUMMY_REL(child_input_rel))
+ {
+ mark_dummy_rel(child_grouped_rel);
+
+ continue;
+ }
+
+ /* Create grouping paths for this child relation. */
+ create_ordinary_grouping_paths(root, child_input_rel,
+ child_grouped_rel,
+ agg_costs, gd, &child_extra,
+ &child_partially_grouped_rel);
+
+ if (child_partially_grouped_rel)
+ {
+ partially_grouped_live_children =
+ lappend(partially_grouped_live_children,
+ child_partially_grouped_rel);
+ }
+
+ if (patype == PARTITIONWISE_AGGREGATE_FULL)
+ {
+ set_cheapest(child_grouped_rel);
+ grouped_live_children = lappend(grouped_live_children,
+ child_grouped_rel);
+ }
+
+ pfree(appinfos);
+ }
+
+ /*
+	 * The children can't all be dummy at this point; if they were, the
+	 * parent would have been marked dummy too.
+ */
+ Assert(grouped_live_children != NIL ||
+ partially_grouped_live_children != NIL);
+
+ /*
+ * Try to create append paths for partially grouped children. For full
+ * partitionwise aggregation, we might have paths in the partial_pathlist
+ * if parallel aggregation is possible. For partial partitionwise
+ * aggregation, we may have paths in both pathlist and partial_pathlist.
+ */
+ if (partially_grouped_rel)
+ {
+ add_paths_to_append_rel(root, partially_grouped_rel,
+ partially_grouped_live_children);
+
+ /*
+		 * We need to call set_cheapest(), since the finalization step will
+		 * use the cheapest path from the rel.
+ */
+ if (partially_grouped_rel->pathlist)
+ set_cheapest(partially_grouped_rel);
+ }
+
+ /* If possible, create append paths for fully grouped children. */
+ if (patype == PARTITIONWISE_AGGREGATE_FULL)
+ add_paths_to_append_rel(root, grouped_rel, grouped_live_children);
+}
+
+/*
+ * group_by_has_partkey
+ *
+ * Returns true if all the partition keys of the given relation are part of
+ * the GROUP BY clause, false otherwise.
+ */
+static bool
+group_by_has_partkey(RelOptInfo *input_rel,
+ List *targetList,
+ List *groupClause)
+{
+ List *groupexprs = get_sortgrouplist_exprs(groupClause, targetList);
+ int cnt = 0;
+ int partnatts;
+
+ /* Input relation should be partitioned. */
+ Assert(input_rel->part_scheme);
+
+	/* Rule it out early if no partition keys are present. */
+ if (!input_rel->partexprs)
+ return false;
+
+ partnatts = input_rel->part_scheme->partnatts;
+
+ for (cnt = 0; cnt < partnatts; cnt++)
+ {
+ List *partexprs = input_rel->partexprs[cnt];
+ ListCell *lc;
+ bool found = false;
+
+ foreach(lc, partexprs)
+ {
+ Expr *partexpr = lfirst(lc);
+
+ if (list_member(groupexprs, partexpr))
+ {
+ found = true;
+ break;
+ }
+ }
+
+ /*
+	 * If none of the partition key expressions matches any of the GROUP BY
+	 * expressions, return false.
+ */
+ if (!found)
+ return false;
+ }
+
+ return true;
+}
+
+
+/*
+ * For a given input path, consider the possible ways of doing grouping sets on
+ * it, by combinations of hashing and sorting. This can be called multiple
+ * times, so it's important that it not scribble on input. No result is
+ * returned, but any generated paths are added to grouped_rel.
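+ *
+ * For example (an illustrative case): given GROUPING SETS ((a), (b)) with
+ * input sorted by a, the (a) rollup can be aggregated in sorted order
+ * while (b) is hashed in the same pass, producing an AGG_MIXED path.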
+ */
+static void
+consider_groupingsets_paths(PlannerInfo *root,
+ RelOptInfo *grouped_rel,
+ Path *path,
+ bool is_sorted,
+ bool can_hash,
+ grouping_sets_data *gd,
+ const AggClauseCosts *agg_costs,
+ double dNumGroups)
+{
+ Query *parse = root->parse;
+
+ /*
+ * If we're not being offered sorted input, then only consider plans that
+ * can be done entirely by hashing.
+ *
+ * We can hash everything if it looks like it'll fit in work_mem. But if
+ * the input is actually sorted despite not being advertised as such, we
+ * prefer to make use of that in order to use less memory.
+ *
+ * If none of the grouping sets are sortable, then ignore the work_mem
+ * limit and generate a path anyway, since otherwise we'll just fail.
+ */
+ if (!is_sorted)
+ {
+ List *new_rollups = NIL;
+ RollupData *unhashed_rollup = NULL;
+ List *sets_data;
+ List *empty_sets_data = NIL;
+ List *empty_sets = NIL;
+ ListCell *lc;
+ ListCell *l_start = list_head(gd->rollups);
+ AggStrategy strat = AGG_HASHED;
+ Size hashsize;
+ double exclude_groups = 0.0;
+
+ Assert(can_hash);
+
+ /*
+ * If the input is coincidentally sorted usefully (which can happen
+ * even if is_sorted is false, since that only means that our caller
+ * has set up the sorting for us), then save some hashtable space by
+ * making use of that. But we need to watch out for degenerate cases:
+ *
+ * 1) If there are any empty grouping sets, then group_pathkeys might
+ * be NIL if all non-empty grouping sets are unsortable. In this case,
+ * there will be a rollup containing only empty groups, and the
+ * pathkeys_contained_in test is vacuously true; this is ok.
+ *
+ * XXX: the above relies on the fact that group_pathkeys is generated
+ * from the first rollup. If we add the ability to consider multiple
+ * sort orders for grouping input, this assumption might fail.
+ *
+ * 2) If there are no empty sets and only unsortable sets, then the
+ * rollups list will be empty (and thus l_start == NULL), and
+ * group_pathkeys will be NIL; we must ensure that the vacuously-true
+		 * pathkeys_contained_in test doesn't cause us to crash.
+ */
+ if (l_start != NULL &&
+ pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
+ {
+ unhashed_rollup = lfirst_node(RollupData, l_start);
+ exclude_groups = unhashed_rollup->numGroups;
+ l_start = lnext(l_start);
+ }
+
+ hashsize = estimate_hashagg_tablesize(path,
+ agg_costs,
+ dNumGroups - exclude_groups);
+
+ /*
+ * gd->rollups is empty if we have only unsortable columns to work
+ * with. Override work_mem in that case; otherwise, we'll rely on the
+ * sorted-input case to generate usable mixed paths.
+ */
+ if (hashsize > work_mem * 1024L && gd->rollups)
+ return; /* nope, won't fit */
+
+ /*
+ * We need to burst the existing rollups list into individual grouping
+ * sets and recompute a groupClause for each set.
+ */
+ sets_data = list_copy(gd->unsortable_sets);
+
+ for_each_cell(lc, l_start)
+ {
+ RollupData *rollup = lfirst_node(RollupData, lc);
+
+ /*
+ * If we find an unhashable rollup that's not been skipped by the
+ * "actually sorted" check above, we can't cope; we'd need sorted
+ * input (with a different sort order) but we can't get that here.
+ * So bail out; we'll get a valid path from the is_sorted case
+ * instead.
+ *
+ * The mere presence of empty grouping sets doesn't make a rollup
+			 * unhashable (see preprocess_grouping_sets); we handle those
+ * specially below.
+ */
+ if (!rollup->hashable)
+ return;
+ else
+ sets_data = list_concat(sets_data, list_copy(rollup->gsets_data));
+ }
+ foreach(lc, sets_data)
+ {
+ GroupingSetData *gs = lfirst_node(GroupingSetData, lc);
+ List *gset = gs->set;
+ RollupData *rollup;
+
+ if (gset == NIL)
+ {
+ /* Empty grouping sets can't be hashed. */
+ empty_sets_data = lappend(empty_sets_data, gs);
+ empty_sets = lappend(empty_sets, NIL);
+ }
+ else
+ {
+ rollup = makeNode(RollupData);
+
+ rollup->groupClause = preprocess_groupclause(root, gset);
+ rollup->gsets_data = list_make1(gs);
+ rollup->gsets = remap_to_groupclause_idx(rollup->groupClause,
+ rollup->gsets_data,
+ gd->tleref_to_colnum_map);
+ rollup->numGroups = gs->numGroups;
+ rollup->hashable = true;
+ rollup->is_hashed = true;
+ new_rollups = lappend(new_rollups, rollup);
+ }
+ }
+
+ /*
+ * If we didn't find anything nonempty to hash, then bail. We'll
+ * generate a path from the is_sorted case.
+ */
+ if (new_rollups == NIL)
+ return;
+
+ /*
+ * If there were empty grouping sets they should have been in the
+ * first rollup.
+ */
+ Assert(!unhashed_rollup || !empty_sets);
+
+ if (unhashed_rollup)
+ {
+ new_rollups = lappend(new_rollups, unhashed_rollup);
+ strat = AGG_MIXED;
+ }
+ else if (empty_sets)
+ {
+ RollupData *rollup = makeNode(RollupData);
+
+ rollup->groupClause = NIL;
+ rollup->gsets_data = empty_sets_data;
+ rollup->gsets = empty_sets;
+ rollup->numGroups = list_length(empty_sets);
+ rollup->hashable = false;
+ rollup->is_hashed = false;
+ new_rollups = lappend(new_rollups, rollup);
+ strat = AGG_MIXED;
+ }
+
+ add_path(grouped_rel, (Path *)
+ create_groupingsets_path(root,
+ grouped_rel,
+ path,
+ (List *) parse->havingQual,
+ strat,
+ new_rollups,
+ agg_costs,
+ dNumGroups));
+ return;
+ }
+
+ /*
+ * If we have sorted input but nothing we can do with it, bail.
+ */
+ if (list_length(gd->rollups) == 0)
+ return;
+
+ /*
+	 * Given sorted input, we try to make two paths: one sorted and one mixed
+ * sort/hash. (We need to try both because hashagg might be disabled, or
+ * some columns might not be sortable.)
+ *
+ * can_hash is passed in as false if some obstacle elsewhere (such as
+ * ordered aggs) means that we shouldn't consider hashing at all.
+ */
+ if (can_hash && gd->any_hashable)
+ {
+ List *rollups = NIL;
+ List *hash_sets = list_copy(gd->unsortable_sets);
+ double availspace = (work_mem * 1024.0);
+ ListCell *lc;
+
+ /*
+ * Account first for space needed for groups we can't sort at all.
+ */
+ availspace -= (double) estimate_hashagg_tablesize(path,
+ agg_costs,
+ gd->dNumHashGroups);
+
+ if (availspace > 0 && list_length(gd->rollups) > 1)
+ {
+ double scale;
+ int num_rollups = list_length(gd->rollups);
+ int k_capacity;
+ int *k_weights = palloc(num_rollups * sizeof(int));
+ Bitmapset *hash_items = NULL;
+ int i;
+
+ /*
+ * We treat this as a knapsack problem: the knapsack capacity
+ * represents work_mem, the item weights are the estimated memory
+ * usage of the hashtables needed to implement a single rollup,
+ * and we really ought to use the cost saving as the item value;
+ * however, currently the costs assigned to sort nodes don't
+ * reflect the comparison costs well, and so we treat all items as
+ * of equal value (each rollup we hash instead saves us one sort).
+ *
+ * To use the discrete knapsack, we need to scale the values to a
+ * reasonably small bounded range. We choose to allow a 5% error
+ * margin; we have no more than 4096 rollups in the worst possible
+ * case, which with a 5% error margin will require a bit over 42MB
+ * of workspace. (Anyone wanting to plan queries that complex had
+ * better have the memory for it. In more reasonable cases, with
+ * no more than a couple of dozen rollups, the memory usage will
+ * be negligible.)
+ *
+ * k_capacity is naturally bounded, but we clamp the values for
+ * scale and weight (below) to avoid overflows or underflows (or
+ * uselessly trying to use a scale factor less than 1 byte).
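+			 *
+			 * Worked example (illustrative numbers only): with availspace =
+			 * 4194304 bytes and 20 rollups, scale = Max(4194304 / 400.0, 1.0)
+			 * = 10485.76, so k_capacity = floor(4194304 / 10485.76) = 400;
+			 * each rollup's weight is then expressed in units of roughly 10kB.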
+ */
+ scale = Max(availspace / (20.0 * num_rollups), 1.0);
+ k_capacity = (int) floor(availspace / scale);
+
+ /*
+ * We leave the first rollup out of consideration since it's the
+ * one that matches the input sort order. We assign indexes "i"
+ * to only those entries considered for hashing; the second loop,
+ * below, must use the same condition.
+ */
+ i = 0;
+ for_each_cell(lc, lnext(list_head(gd->rollups)))
+ {
+ RollupData *rollup = lfirst_node(RollupData, lc);
+
+ if (rollup->hashable)
+ {
+ double sz = estimate_hashagg_tablesize(path,
+ agg_costs,
+ rollup->numGroups);
+
+ /*
+ * If sz is enormous, but work_mem (and hence scale) is
+ * small, avoid integer overflow here.
+ */
+ k_weights[i] = (int) Min(floor(sz / scale),
+ k_capacity + 1.0);
+ ++i;
+ }
+ }
+
+ /*
+ * Apply knapsack algorithm; compute the set of items which
+ * maximizes the value stored (in this case the number of sorts
+ * saved) while keeping the total size (approximately) within
+ * capacity.
+ */
+ if (i > 0)
+ hash_items = DiscreteKnapsack(k_capacity, i, k_weights, NULL);
+
+ if (!bms_is_empty(hash_items))
+ {
+ rollups = list_make1(linitial(gd->rollups));
+
+ i = 0;
+ for_each_cell(lc, lnext(list_head(gd->rollups)))
+ {
+ RollupData *rollup = lfirst_node(RollupData, lc);
+
+ if (rollup->hashable)
+ {
+ if (bms_is_member(i, hash_items))
+ hash_sets = list_concat(hash_sets,
+ list_copy(rollup->gsets_data));
+ else
+ rollups = lappend(rollups, rollup);
+ ++i;
+ }
+ else
+ rollups = lappend(rollups, rollup);
+ }
+ }
+ }
+
+ if (!rollups && hash_sets)
+ rollups = list_copy(gd->rollups);
+
+ foreach(lc, hash_sets)
+ {
+ GroupingSetData *gs = lfirst_node(GroupingSetData, lc);
+ RollupData *rollup = makeNode(RollupData);
+
+ Assert(gs->set != NIL);
+
+ rollup->groupClause = preprocess_groupclause(root, gs->set);
+ rollup->gsets_data = list_make1(gs);
+ rollup->gsets = remap_to_groupclause_idx(rollup->groupClause,
+ rollup->gsets_data,
+ gd->tleref_to_colnum_map);
+ rollup->numGroups = gs->numGroups;
+ rollup->hashable = true;
+ rollup->is_hashed = true;
+ rollups = lcons(rollup, rollups);
+ }
+
+ if (rollups)
+ {
+ add_path(grouped_rel, (Path *)
+ create_groupingsets_path(root,
+ grouped_rel,
+ path,
+ (List *) parse->havingQual,
+ AGG_MIXED,
+ rollups,
+ agg_costs,
+ dNumGroups));
+ }
+ }
+
+ /*
+ * Now try the simple sorted case.
+ */
+ if (!gd->unsortable_sets)
+ add_path(grouped_rel, (Path *)
+ create_groupingsets_path(root,
+ grouped_rel,
+ path,
+ (List *) parse->havingQual,
+ AGG_SORTED,
+ gd->rollups,
+ agg_costs,
+ dNumGroups));
+}
+
+
+/*
+ * Given a groupclause and a list of GroupingSetData, return equivalent sets
+ * (without annotation) mapped to indexes into the given groupclause.
+ */
+List *
+remap_to_groupclause_idx(List *groupClause,
+ List *gsets,
+ int *tleref_to_colnum_map)
+{
+ int ref = 0;
+ List *result = NIL;
+ ListCell *lc;
+
+ foreach(lc, groupClause)
+ {
+ SortGroupClause *gc = lfirst_node(SortGroupClause, lc);
+
+ tleref_to_colnum_map[gc->tleSortGroupRef] = ref++;
+ }
+
+ foreach(lc, gsets)
+ {
+ List *set = NIL;
+ ListCell *lc2;
+ GroupingSetData *gs = lfirst_node(GroupingSetData, lc);
+
+ foreach(lc2, gs->set)
+ {
+ set = lappend_int(set, tleref_to_colnum_map[lfirst_int(lc2)]);
+ }
+
+ result = lappend(result, set);
+ }
+
+ return result;
+}
diff --git a/src/backend/optimizer/plan/Makefile b/src/backend/optimizer/plan/Makefile
index 88a9f7ff8c..e3c96b704e 100644
--- a/src/backend/optimizer/plan/Makefile
+++ b/src/backend/optimizer/plan/Makefile
@@ -12,7 +12,7 @@ subdir = src/backend/optimizer/plan
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = analyzejoins.o createplan.o initsplan.o planagg.o planmain.o planner.o \
- setrefs.o subselect.o
+OBJS = analyzejoins.o createplan.o initsplan.o planagg.o planmain.o \
+ planner.o setrefs.o subselect.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index e73007f3ba..0d31eded37 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -28,10 +28,9 @@
#include "executor/executor.h"
#include "executor/nodeAgg.h"
#include "foreign/fdwapi.h"
+#include "lib/bipartite_match.h"
#include "miscadmin.h"
#include "jit/jit.h"
-#include "lib/bipartite_match.h"
-#include "lib/knapsack.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#ifdef OPTIMIZER_DEBUG
@@ -94,22 +93,6 @@ typedef struct
List *groupClause; /* overrides parse->groupClause */
} standard_qp_extra;
-/*
- * Data specific to grouping sets
- */
-
-typedef struct
-{
- List *rollups;
- List *hash_sets_idx;
- double dNumHashGroups;
- bool any_hashable;
- Bitmapset *unsortable_refs;
- Bitmapset *unhashable_refs;
- List *unsortable_sets;
- int *tleref_to_colnum_map;
-} grouping_sets_data;
-
/* Local functions */
static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
@@ -117,53 +100,12 @@ static void inheritance_planner(PlannerInfo *root);
static void grouping_planner(PlannerInfo *root, bool inheritance_update,
double tuple_fraction);
static void compute_pathkeys(PlannerInfo *root, List *tlist, List *activeWindows, List *groupClause);
-static grouping_sets_data *preprocess_grouping_sets(PlannerInfo *root);
-static List *remap_to_groupclause_idx(List *groupClause, List *gsets,
- int *tleref_to_colnum_map);
static void preprocess_rowmarks(PlannerInfo *root);
static double preprocess_limit(PlannerInfo *root,
double tuple_fraction,
int64 *offset_est, int64 *count_est);
static bool limit_needed(Query *parse);
static void remove_useless_groupby_columns(PlannerInfo *root);
-static List *preprocess_groupclause(PlannerInfo *root, List *force);
-static List *extract_rollup_sets(List *groupingSets);
-static List *reorder_grouping_sets(List *groupingSets, List *sortclause);
-static double get_number_of_groups(PlannerInfo *root,
- double path_rows,
- grouping_sets_data *gd,
- List *target_list);
-static Size estimate_hashagg_tablesize(Path *path,
- const AggClauseCosts *agg_costs,
- double dNumGroups);
-static RelOptInfo *create_grouping_paths(PlannerInfo *root,
- RelOptInfo *input_rel,
- PathTarget *target,
- bool target_parallel_safe,
- const AggClauseCosts *agg_costs,
- grouping_sets_data *gd);
-static bool is_degenerate_grouping(PlannerInfo *root);
-static void create_degenerate_grouping_paths(PlannerInfo *root,
- RelOptInfo *input_rel,
- RelOptInfo *grouped_rel);
-static RelOptInfo *make_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
- PathTarget *target, bool target_parallel_safe,
- Node *havingQual);
-static void create_ordinary_grouping_paths(PlannerInfo *root,
- RelOptInfo *input_rel,
- RelOptInfo *grouped_rel,
- const AggClauseCosts *agg_costs,
- grouping_sets_data *gd,
- GroupPathExtraData *extra,
- RelOptInfo **partially_grouped_rel_p);
-static void consider_groupingsets_paths(PlannerInfo *root,
- RelOptInfo *grouped_rel,
- Path *path,
- bool is_sorted,
- bool can_hash,
- grouping_sets_data *gd,
- const AggClauseCosts *agg_costs,
- double dNumGroups);
static RelOptInfo *create_window_paths(PlannerInfo *root,
RelOptInfo *input_rel,
PathTarget *input_target,
@@ -189,9 +131,6 @@ static RelOptInfo *create_ordered_paths(PlannerInfo *root,
double limit_tuples);
static PathTarget *make_group_input_target(PlannerInfo *root,
PathTarget *final_target);
-static PathTarget *make_partial_grouping_target(PlannerInfo *root,
- PathTarget *grouping_target,
- Node *havingQual);
static List *postprocess_setop_tlist(List *new_tlist, List *orig_tlist);
static List *select_active_windows(PlannerInfo *root, WindowFuncLists *wflists);
static PathTarget *make_window_input_target(PlannerInfo *root,
@@ -204,39 +143,14 @@ static PathTarget *make_sort_input_target(PlannerInfo *root,
bool *have_postponed_srfs);
static void adjust_paths_for_srfs(PlannerInfo *root, RelOptInfo *rel,
List *targets, List *targets_contain_srfs);
-static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
- RelOptInfo *grouped_rel,
- RelOptInfo *partially_grouped_rel,
- const AggClauseCosts *agg_costs,
- grouping_sets_data *gd,
- double dNumGroups,
- GroupPathExtraData *extra);
-static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
- RelOptInfo *grouped_rel,
- RelOptInfo *input_rel,
- grouping_sets_data *gd,
- GroupPathExtraData *extra,
- bool force_rel_creation);
-static void gather_grouping_paths(PlannerInfo *root, RelOptInfo *rel);
-static bool can_partial_agg(PlannerInfo *root,
- const AggClauseCosts *agg_costs);
static void apply_scanjoin_target_to_paths(PlannerInfo *root,
RelOptInfo *rel,
List *scanjoin_targets,
List *scanjoin_targets_contain_srfs,
bool scanjoin_target_parallel_safe,
bool tlist_same_exprs);
-static void create_partitionwise_grouping_paths(PlannerInfo *root,
- RelOptInfo *input_rel,
- RelOptInfo *grouped_rel,
- RelOptInfo *partially_grouped_rel,
- const AggClauseCosts *agg_costs,
- grouping_sets_data *gd,
- PartitionwiseAggregateType patype,
- GroupPathExtraData *extra);
-static bool group_by_has_partkey(RelOptInfo *input_rel,
- List *targetList,
- List *groupClause);
+static List *extract_rollup_sets(List *groupingSets);
+static List *reorder_grouping_sets(List *groupingSets, List *sortclause);
/*****************************************************************************
@@ -2252,13 +2166,121 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
/* Note: currently, we leave it to callers to do set_cheapest() */
}
+
+/*
+ * preprocess_groupclause - do preparatory work on GROUP BY clause
+ *
+ * The idea here is to adjust the ordering of the GROUP BY elements
+ * (which in itself is semantically insignificant) to match ORDER BY,
+ * thereby allowing a single sort operation to both implement the ORDER BY
+ * requirement and set up for a Unique step that implements GROUP BY.
+ *
+ * In principle it might be interesting to consider other orderings of the
+ * GROUP BY elements, which could match the sort ordering of other
+ * possible plans (eg an indexscan) and thereby reduce cost. We don't
+ * bother with that, though. Hashed grouping will frequently win anyway.
+ *
+ * Note: we need no comparable processing of the distinctClause because
+ * the parser already enforced that that matches ORDER BY.
+ *
+ * For grouping sets, the order of items is instead forced to agree with that
+ * of the grouping set (and items not in the grouping set are skipped). The
+ * work of sorting the order of grouping set elements to match the ORDER BY if
+ * possible is done elsewhere.
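+ *
+ * For example, given GROUP BY b, a and ORDER BY a, b, we rearrange the
+ * GROUP BY list to (a, b) so that a single sort can satisfy both clauses.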
+ */
+List *
+preprocess_groupclause(PlannerInfo *root, List *force)
+{
+ Query *parse = root->parse;
+ List *new_groupclause = NIL;
+ bool partial_match;
+ ListCell *sl;
+ ListCell *gl;
+
+ /* For grouping sets, we need to force the ordering */
+ if (force)
+ {
+ foreach(sl, force)
+ {
+ Index ref = lfirst_int(sl);
+ SortGroupClause *cl = get_sortgroupref_clause(ref, parse->groupClause);
+
+ new_groupclause = lappend(new_groupclause, cl);
+ }
+
+ return new_groupclause;
+ }
+
+ /* If no ORDER BY, nothing useful to do here */
+ if (parse->sortClause == NIL)
+ return parse->groupClause;
+
+ /*
+ * Scan the ORDER BY clause and construct a list of matching GROUP BY
+ * items, but only as far as we can make a matching prefix.
+ *
+ * This code assumes that the sortClause contains no duplicate items.
+ */
+ foreach(sl, parse->sortClause)
+ {
+ SortGroupClause *sc = lfirst_node(SortGroupClause, sl);
+
+ foreach(gl, parse->groupClause)
+ {
+ SortGroupClause *gc = lfirst_node(SortGroupClause, gl);
+
+ if (equal(gc, sc))
+ {
+ new_groupclause = lappend(new_groupclause, gc);
+ break;
+ }
+ }
+ if (gl == NULL)
+ break; /* no match, so stop scanning */
+ }
+
+ /* Did we match all of the ORDER BY list, or just some of it? */
+ partial_match = (sl != NULL);
+
+ /* If no match at all, no point in reordering GROUP BY */
+ if (new_groupclause == NIL)
+ return parse->groupClause;
+
+ /*
+ * Add any remaining GROUP BY items to the new list, but only if we were
+ * able to make a complete match. In other words, we only rearrange the
+ * GROUP BY list if the result is that one list is a prefix of the other
+ * --- otherwise there's no possibility of a common sort. Also, give up
+ * if there are any non-sortable GROUP BY items, since then there's no
+ * hope anyway.
+ */
+ foreach(gl, parse->groupClause)
+ {
+ SortGroupClause *gc = lfirst_node(SortGroupClause, gl);
+
+ if (list_member_ptr(new_groupclause, gc))
+ continue; /* it matched an ORDER BY item */
+ if (partial_match)
+ return parse->groupClause; /* give up, no common sort possible */
+ if (!OidIsValid(gc->sortop))
+ return parse->groupClause; /* give up, GROUP BY can't be sorted */
+ new_groupclause = lappend(new_groupclause, gc);
+ }
+
+ /* Success --- install the rearranged GROUP BY list */
+ Assert(list_length(parse->groupClause) == list_length(new_groupclause));
+ return new_groupclause;
+}
+
/*
* Do preprocessing for groupingSets clause and related data. This handles the
* preliminary steps of expanding the grouping sets, organizing them into lists
* of rollups, and preparing annotations which will later be filled in with
* size estimates.
*/
-static grouping_sets_data *
+grouping_sets_data *
preprocess_grouping_sets(PlannerInfo *root)
{
Query *parse = root->parse;
@@ -2430,65 +2452,307 @@ preprocess_grouping_sets(PlannerInfo *root)
}
/*
- * Given a groupclause and a list of GroupingSetData, return equivalent sets
- * (without annotation) mapped to indexes into the given groupclause.
+ * Extract lists of grouping sets that can be implemented using a single
+ * rollup-type aggregate pass each. Returns a list of lists of grouping sets.
+ *
+ * Input must be sorted with smallest sets first. Result has each sublist
+ * sorted with smallest sets first.
+ *
+ * We want to produce the absolute minimum possible number of lists here to
+ * avoid excess sorts. Fortunately, there is an algorithm for this; the problem
+ * of finding the minimal partition of a partially-ordered set into chains
+ * (which is what we need, taking the list of grouping sets as a poset ordered
+ * by set inclusion) can be mapped to the problem of finding the maximum
+ * cardinality matching on a bipartite graph, which is solvable in polynomial
+ * time, with a worst case no worse than O(n^2.5) and usually much
+ * better. Since our N is at most 4096, we don't need to consider fallbacks to
+ * heuristic or approximate methods. (Planning time for a 12-d cube is under
+ * half a second on my modest system even with optimization off and assertions
+ * on.)
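+ *
+ * For example, the sets {a}, {b}, {a,b}, {a,b,c} (smallest first) need
+ * two chains, e.g. [{a}, {a,b}, {a,b,c}] and [{b}]: {a} and {b} cannot
+ * share a chain because neither is a subset of the other, so two rollup
+ * passes are the minimum.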
*/
static List *
-remap_to_groupclause_idx(List *groupClause,
- List *gsets,
- int *tleref_to_colnum_map)
+extract_rollup_sets(List *groupingSets)
{
- int ref = 0;
+ int num_sets_raw = list_length(groupingSets);
+ int num_empty = 0;
+ int num_sets = 0; /* distinct sets */
+ int num_chains = 0;
List *result = NIL;
+ List **results;
+ List **orig_sets;
+ Bitmapset **set_masks;
+ int *chains;
+ short **adjacency;
+ short *adjacency_buf;
+ BipartiteMatchState *state;
+ int i;
+ int j;
+ int j_size;
+ ListCell *lc1 = list_head(groupingSets);
ListCell *lc;
- foreach(lc, groupClause)
+ /*
+ * Start by stripping out empty sets. The algorithm doesn't require this,
+ * but the planner currently needs all empty sets to be returned in the
+ * first list, so we strip them here and add them back after.
+ */
+ while (lc1 && lfirst(lc1) == NIL)
{
- SortGroupClause *gc = lfirst_node(SortGroupClause, lc);
-
- tleref_to_colnum_map[gc->tleSortGroupRef] = ref++;
+ ++num_empty;
+ lc1 = lnext(lc1);
}
- foreach(lc, gsets)
+ /* bail out now if it turns out that all we had were empty sets. */
+ if (!lc1)
+ return list_make1(groupingSets);
+
+ /*----------
+ * We don't strictly need to remove duplicate sets here, but if we don't,
+ * they tend to become scattered through the result, which is a bit
+ * confusing (and irritating if we ever decide to optimize them out).
+ * So we remove them here and add them back after.
+ *
+ * For each non-duplicate set, we fill in the following:
+ *
+ * orig_sets[i] = list of the original set lists
+ * set_masks[i] = bitmapset for testing inclusion
+ * adjacency[i] = array [n, v1, v2, ... vn] of adjacency indices
+ *
+ * chains[i] will be the result group this set is assigned to.
+ *
+ * We index all of these from 1 rather than 0 because it is convenient
+ * to leave 0 free for the NIL node in the graph algorithm.
+ *----------
+ */
+ orig_sets = palloc0((num_sets_raw + 1) * sizeof(List *));
+ set_masks = palloc0((num_sets_raw + 1) * sizeof(Bitmapset *));
+ adjacency = palloc0((num_sets_raw + 1) * sizeof(short *));
+ adjacency_buf = palloc((num_sets_raw + 1) * sizeof(short));
+
+ j_size = 0;
+ j = 0;
+ i = 1;
+
+ for_each_cell(lc, lc1)
{
- List *set = NIL;
+ List *candidate = (List *) lfirst(lc);
+ Bitmapset *candidate_set = NULL;
ListCell *lc2;
- GroupingSetData *gs = lfirst_node(GroupingSetData, lc);
+ int dup_of = 0;
- foreach(lc2, gs->set)
+ foreach(lc2, candidate)
{
- set = lappend_int(set, tleref_to_colnum_map[lfirst_int(lc2)]);
+ candidate_set = bms_add_member(candidate_set, lfirst_int(lc2));
}
- result = lappend(result, set);
- }
+ /* we can only be a dup if we're the same length as a previous set */
+ if (j_size == list_length(candidate))
+ {
+ int k;
- return result;
-}
+ for (k = j; k < i; ++k)
+ {
+ if (bms_equal(set_masks[k], candidate_set))
+ {
+ dup_of = k;
+ break;
+ }
+ }
+ }
+ else if (j_size < list_length(candidate))
+ {
+ j_size = list_length(candidate);
+ j = i;
+ }
+ if (dup_of > 0)
+ {
+ orig_sets[dup_of] = lappend(orig_sets[dup_of], candidate);
+ bms_free(candidate_set);
+ }
+ else
+ {
+ int k;
+ int n_adj = 0;
+ orig_sets[i] = list_make1(candidate);
+ set_masks[i] = candidate_set;
-/*
- * Detect whether a plan node is a "dummy" plan created when a relation
- * is deemed not to need scanning due to constraint exclusion.
- *
- * Currently, such dummy plans are Result nodes with constant FALSE
- * filter quals (see set_dummy_rel_pathlist and create_append_plan).
- *
- * XXX this probably ought to be somewhere else, but not clear where.
- */
-bool
-is_dummy_plan(Plan *plan)
-{
- if (IsA(plan, Result))
- {
- List *rcqual = (List *) ((Result *) plan)->resconstantqual;
+ /* fill in adjacency list; no need to compare equal-size sets */
- if (list_length(rcqual) == 1)
- {
- Const *constqual = (Const *) linitial(rcqual);
+ for (k = j - 1; k > 0; --k)
+ {
+ if (bms_is_subset(set_masks[k], candidate_set))
+ adjacency_buf[++n_adj] = k;
+ }
- if (constqual && IsA(constqual, Const))
+ if (n_adj > 0)
+ {
+ adjacency_buf[0] = n_adj;
+ adjacency[i] = palloc((n_adj + 1) * sizeof(short));
+ memcpy(adjacency[i], adjacency_buf, (n_adj + 1) * sizeof(short));
+ }
+ else
+ adjacency[i] = NULL;
+
+ ++i;
+ }
+ }
+
+ num_sets = i - 1;
+
+ /*
+ * Apply the graph matching algorithm to do the work.
+ */
+ state = BipartiteMatch(num_sets, num_sets, adjacency);
+
+ /*
+ * Now, the state->pair* fields have the info we need to assign sets to
+ * chains. Two sets (u,v) belong to the same chain if pair_uv[u] = v or
+ * pair_vu[v] = u (both will be true, but we check both so that we can do
+	 * it in one pass).
+ */
+ chains = palloc0((num_sets + 1) * sizeof(int));
+
+ for (i = 1; i <= num_sets; ++i)
+ {
+ int u = state->pair_vu[i];
+ int v = state->pair_uv[i];
+
+ if (u > 0 && u < i)
+ chains[i] = chains[u];
+ else if (v > 0 && v < i)
+ chains[i] = chains[v];
+ else
+ chains[i] = ++num_chains;
+ }
+
+ /* build result lists. */
+ results = palloc0((num_chains + 1) * sizeof(List *));
+
+ for (i = 1; i <= num_sets; ++i)
+ {
+ int c = chains[i];
+
+ Assert(c > 0);
+
+ results[c] = list_concat(results[c], orig_sets[i]);
+ }
+
+ /* push any empty sets back on the first list. */
+ while (num_empty-- > 0)
+ results[1] = lcons(NIL, results[1]);
+
+ /* make result list */
+ for (i = 1; i <= num_chains; ++i)
+ result = lappend(result, results[i]);
+
+ /*
+ * Free all the things.
+ *
+ * (This is over-fussy for small sets but for large sets we could have
+ * tied up a nontrivial amount of memory.)
+ */
+ BipartiteMatchFree(state);
+ pfree(results);
+ pfree(chains);
+ for (i = 1; i <= num_sets; ++i)
+ if (adjacency[i])
+ pfree(adjacency[i]);
+ pfree(adjacency);
+ pfree(adjacency_buf);
+ pfree(orig_sets);
+ for (i = 1; i <= num_sets; ++i)
+ bms_free(set_masks[i]);
+ pfree(set_masks);
+
+ return result;
+}
+
+/*
+ * Reorder the elements of a list of grouping sets such that they have correct
+ * prefix relationships. Also inserts the GroupingSetData annotations.
+ *
+ * The input must be ordered with smallest sets first; the result is returned
+ * with largest sets first. Note that the result shares no list substructure
+ * with the input, so it's safe for the caller to modify it later.
+ *
+ * If we're passed in a sortclause, we follow its order of columns to the
+ * extent possible, to minimize the chance that we add unnecessary sorts.
+ * (We're trying here to ensure that GROUPING SETS ((a,b,c),(c)) ORDER BY c,b,a
+ * gets implemented in one pass.)
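+ *
+ * For example, a single grouping set written as (b, a) with sortclause
+ * a, b is emitted as (a, b), so the sort that implements the grouping
+ * also satisfies the ORDER BY.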
+ */
+static List *
+reorder_grouping_sets(List *groupingsets, List *sortclause)
+{
+ ListCell *lc;
+ ListCell *lc2;
+ List *previous = NIL;
+ List *result = NIL;
+
+ foreach(lc, groupingsets)
+ {
+ List *candidate = (List *) lfirst(lc);
+ List *new_elems = list_difference_int(candidate, previous);
+ GroupingSetData *gs = makeNode(GroupingSetData);
+
+ if (list_length(new_elems) > 0)
+ {
+ while (list_length(sortclause) > list_length(previous))
+ {
+ SortGroupClause *sc = list_nth(sortclause, list_length(previous));
+ int ref = sc->tleSortGroupRef;
+
+ if (list_member_int(new_elems, ref))
+ {
+ previous = lappend_int(previous, ref);
+ new_elems = list_delete_int(new_elems, ref);
+ }
+ else
+ {
+ /* diverged from the sortclause; give up on it */
+ sortclause = NIL;
+ break;
+ }
+ }
+
+ foreach(lc2, new_elems)
+ {
+ previous = lappend_int(previous, lfirst_int(lc2));
+ }
+ }
+
+ gs->set = list_copy(previous);
+ result = lcons(gs, result);
+ list_free(new_elems);
+ }
+
+ list_free(previous);
+
+ return result;
+}
+
+/*
+ * Detect whether a plan node is a "dummy" plan created when a relation
+ * is deemed not to need scanning due to constraint exclusion.
+ *
+ * Currently, such dummy plans are Result nodes with constant FALSE
+ * filter quals (see set_dummy_rel_pathlist and create_append_plan).
+ *
+ * XXX this probably ought to be somewhere else, but not clear where.
+ */
+bool
+is_dummy_plan(Plan *plan)
+{
+ if (IsA(plan, Result))
+ {
+ List *rcqual = (List *) ((Result *) plan)->resconstantqual;
+
+ if (list_length(rcqual) == 1)
+ {
+ Const *constqual = (Const *) linitial(rcqual);
+
+ if (constqual && IsA(constqual, Const))
{
if (!constqual->constisnull &&
!DatumGetBool(constqual->constvalue))
@@ -3058,3708 +3322,1699 @@ remove_useless_groupby_columns(PlannerInfo *root)
}
/*
- * preprocess_groupclause - do preparatory work on GROUP BY clause
- *
- * The idea here is to adjust the ordering of the GROUP BY elements
- * (which in itself is semantically insignificant) to match ORDER BY,
- * thereby allowing a single sort operation to both implement the ORDER BY
- * requirement and set up for a Unique step that implements GROUP BY.
- *
- * In principle it might be interesting to consider other orderings of the
- * GROUP BY elements, which could match the sort ordering of other
- * possible plans (eg an indexscan) and thereby reduce cost. We don't
- * bother with that, though. Hashed grouping will frequently win anyway.
- *
- * Note: we need no comparable processing of the distinctClause because
- * the parser already enforced that that matches ORDER BY.
- *
- * For grouping sets, the order of items is instead forced to agree with that
- * of the grouping set (and items not in the grouping set are skipped). The
- * work of sorting the order of grouping set elements to match the ORDER BY if
- * possible is done elsewhere.
+ * Compute query_pathkeys and other pathkeys, to tell query_planner() which
+ * orderings would be useful for the later planner stages.
*/
-static List *
-preprocess_groupclause(PlannerInfo *root, List *force)
+static void
+compute_pathkeys(PlannerInfo *root, List *tlist, List *activeWindows, List *groupClause)
{
Query *parse = root->parse;
- List *new_groupclause = NIL;
- bool partial_match;
- ListCell *sl;
- ListCell *gl;
- /* For grouping sets, we need to force the ordering */
- if (force)
- {
- foreach(sl, force)
- {
- Index ref = lfirst_int(sl);
- SortGroupClause *cl = get_sortgroupref_clause(ref, parse->groupClause);
+ /*
+ * Calculate pathkeys that represent grouping/ordering requirements. The
+	 * sortClause is certainly sortable, but GROUP BY and DISTINCT might not
+ * be, in which case we just leave their pathkeys empty.
+ */
+ if (groupClause && grouping_is_sortable(groupClause))
+ root->group_pathkeys =
+ make_pathkeys_for_sortclauses(root,
+ groupClause,
+ tlist);
+ else
+ root->group_pathkeys = NIL;
- new_groupclause = lappend(new_groupclause, cl);
- }
+ /* We consider only the first (bottom) window in pathkeys logic */
+ if (activeWindows != NIL)
+ {
+ WindowClause *wc = linitial_node(WindowClause, activeWindows);
- return new_groupclause;
+ root->window_pathkeys = make_pathkeys_for_window(root,
+ wc,
+ tlist);
}
+ else
+ root->window_pathkeys = NIL;
- /* If no ORDER BY, nothing useful to do here */
- if (parse->sortClause == NIL)
- return parse->groupClause;
+ if (parse->distinctClause &&
+ grouping_is_sortable(parse->distinctClause))
+ root->distinct_pathkeys =
+ make_pathkeys_for_sortclauses(root,
+ parse->distinctClause,
+ tlist);
+ else
+ root->distinct_pathkeys = NIL;
+
+ root->sort_pathkeys =
+ make_pathkeys_for_sortclauses(root,
+ parse->sortClause,
+ tlist);
/*
- * Scan the ORDER BY clause and construct a list of matching GROUP BY
- * items, but only as far as we can make a matching prefix.
+ * Figure out whether we want a sorted result from query_planner.
*
- * This code assumes that the sortClause contains no duplicate items.
- */
- foreach(sl, parse->sortClause)
- {
- SortGroupClause *sc = lfirst_node(SortGroupClause, sl);
-
- foreach(gl, parse->groupClause)
- {
- SortGroupClause *gc = lfirst_node(SortGroupClause, gl);
-
- if (equal(gc, sc))
- {
- new_groupclause = lappend(new_groupclause, gc);
- break;
- }
- }
- if (gl == NULL)
- break; /* no match, so stop scanning */
- }
-
- /* Did we match all of the ORDER BY list, or just some of it? */
- partial_match = (sl != NULL);
-
- /* If no match at all, no point in reordering GROUP BY */
- if (new_groupclause == NIL)
- return parse->groupClause;
-
- /*
- * Add any remaining GROUP BY items to the new list, but only if we were
- * able to make a complete match. In other words, we only rearrange the
- * GROUP BY list if the result is that one list is a prefix of the other
- * --- otherwise there's no possibility of a common sort. Also, give up
- * if there are any non-sortable GROUP BY items, since then there's no
- * hope anyway.
- */
- foreach(gl, parse->groupClause)
- {
- SortGroupClause *gc = lfirst_node(SortGroupClause, gl);
-
- if (list_member_ptr(new_groupclause, gc))
- continue; /* it matched an ORDER BY item */
- if (partial_match)
- return parse->groupClause; /* give up, no common sort possible */
- if (!OidIsValid(gc->sortop))
- return parse->groupClause; /* give up, GROUP BY can't be sorted */
- new_groupclause = lappend(new_groupclause, gc);
- }
-
- /* Success --- install the rearranged GROUP BY list */
- Assert(list_length(parse->groupClause) == list_length(new_groupclause));
- return new_groupclause;
-}
-
-/*
- * Extract lists of grouping sets that can be implemented using a single
- * rollup-type aggregate pass each. Returns a list of lists of grouping sets.
- *
- * Input must be sorted with smallest sets first. Result has each sublist
- * sorted with smallest sets first.
- *
- * We want to produce the absolute minimum possible number of lists here to
- * avoid excess sorts. Fortunately, there is an algorithm for this; the problem
- * of finding the minimal partition of a partially-ordered set into chains
- * (which is what we need, taking the list of grouping sets as a poset ordered
- * by set inclusion) can be mapped to the problem of finding the maximum
- * cardinality matching on a bipartite graph, which is solvable in polynomial
- * time with a worst case of no worse than O(n^2.5) and usually much
- * better. Since our N is at most 4096, we don't need to consider fallbacks to
- * heuristic or approximate methods. (Planning time for a 12-d cube is under
- * half a second on my modest system even with optimization off and assertions
- * on.)
- */
-static List *
-extract_rollup_sets(List *groupingSets)
-{
- int num_sets_raw = list_length(groupingSets);
- int num_empty = 0;
- int num_sets = 0; /* distinct sets */
- int num_chains = 0;
- List *result = NIL;
- List **results;
- List **orig_sets;
- Bitmapset **set_masks;
- int *chains;
- short **adjacency;
- short *adjacency_buf;
- BipartiteMatchState *state;
- int i;
- int j;
- int j_size;
- ListCell *lc1 = list_head(groupingSets);
- ListCell *lc;
-
- /*
- * Start by stripping out empty sets. The algorithm doesn't require this,
- * but the planner currently needs all empty sets to be returned in the
- * first list, so we strip them here and add them back after.
- */
- while (lc1 && lfirst(lc1) == NIL)
- {
- ++num_empty;
- lc1 = lnext(lc1);
- }
-
- /* bail out now if it turns out that all we had were empty sets. */
- if (!lc1)
- return list_make1(groupingSets);
-
- /*----------
- * We don't strictly need to remove duplicate sets here, but if we don't,
- * they tend to become scattered through the result, which is a bit
- * confusing (and irritating if we ever decide to optimize them out).
- * So we remove them here and add them back after.
- *
- * For each non-duplicate set, we fill in the following:
- *
- * orig_sets[i] = list of the original set lists
- * set_masks[i] = bitmapset for testing inclusion
- * adjacency[i] = array [n, v1, v2, ... vn] of adjacency indices
- *
- * chains[i] will be the result group this set is assigned to.
- *
- * We index all of these from 1 rather than 0 because it is convenient
- * to leave 0 free for the NIL node in the graph algorithm.
- *----------
- */
- orig_sets = palloc0((num_sets_raw + 1) * sizeof(List *));
- set_masks = palloc0((num_sets_raw + 1) * sizeof(Bitmapset *));
- adjacency = palloc0((num_sets_raw + 1) * sizeof(short *));
- adjacency_buf = palloc((num_sets_raw + 1) * sizeof(short));
-
- j_size = 0;
- j = 0;
- i = 1;
-
- for_each_cell(lc, lc1)
- {
- List *candidate = (List *) lfirst(lc);
- Bitmapset *candidate_set = NULL;
- ListCell *lc2;
- int dup_of = 0;
-
- foreach(lc2, candidate)
- {
- candidate_set = bms_add_member(candidate_set, lfirst_int(lc2));
- }
-
- /* we can only be a dup if we're the same length as a previous set */
- if (j_size == list_length(candidate))
- {
- int k;
-
- for (k = j; k < i; ++k)
- {
- if (bms_equal(set_masks[k], candidate_set))
- {
- dup_of = k;
- break;
- }
- }
- }
- else if (j_size < list_length(candidate))
- {
- j_size = list_length(candidate);
- j = i;
- }
-
- if (dup_of > 0)
- {
- orig_sets[dup_of] = lappend(orig_sets[dup_of], candidate);
- bms_free(candidate_set);
- }
- else
- {
- int k;
- int n_adj = 0;
-
- orig_sets[i] = list_make1(candidate);
- set_masks[i] = candidate_set;
-
- /* fill in adjacency list; no need to compare equal-size sets */
-
- for (k = j - 1; k > 0; --k)
- {
- if (bms_is_subset(set_masks[k], candidate_set))
- adjacency_buf[++n_adj] = k;
- }
-
- if (n_adj > 0)
- {
- adjacency_buf[0] = n_adj;
- adjacency[i] = palloc((n_adj + 1) * sizeof(short));
- memcpy(adjacency[i], adjacency_buf, (n_adj + 1) * sizeof(short));
- }
- else
- adjacency[i] = NULL;
-
- ++i;
- }
- }
-
- num_sets = i - 1;
-
- /*
- * Apply the graph matching algorithm to do the work.
- */
- state = BipartiteMatch(num_sets, num_sets, adjacency);
-
- /*
- * Now, the state->pair* fields have the info we need to assign sets to
- * chains. Two sets (u,v) belong to the same chain if pair_uv[u] = v or
- * pair_vu[v] = u (both will be true, but we check both so that we can do
- * it in one pass)
- */
- chains = palloc0((num_sets + 1) * sizeof(int));
-
- for (i = 1; i <= num_sets; ++i)
- {
- int u = state->pair_vu[i];
- int v = state->pair_uv[i];
-
- if (u > 0 && u < i)
- chains[i] = chains[u];
- else if (v > 0 && v < i)
- chains[i] = chains[v];
- else
- chains[i] = ++num_chains;
- }
-
- /* build result lists. */
- results = palloc0((num_chains + 1) * sizeof(List *));
-
- for (i = 1; i <= num_sets; ++i)
- {
- int c = chains[i];
-
- Assert(c > 0);
-
- results[c] = list_concat(results[c], orig_sets[i]);
- }
-
- /* push any empty sets back on the first list. */
- while (num_empty-- > 0)
- results[1] = lcons(NIL, results[1]);
-
- /* make result list */
- for (i = 1; i <= num_chains; ++i)
- result = lappend(result, results[i]);
-
- /*
- * Free all the things.
- *
- * (This is over-fussy for small sets but for large sets we could have
- * tied up a nontrivial amount of memory.)
- */
- BipartiteMatchFree(state);
- pfree(results);
- pfree(chains);
- for (i = 1; i <= num_sets; ++i)
- if (adjacency[i])
- pfree(adjacency[i]);
- pfree(adjacency);
- pfree(adjacency_buf);
- pfree(orig_sets);
- for (i = 1; i <= num_sets; ++i)
- bms_free(set_masks[i]);
- pfree(set_masks);
-
- return result;
-}
-
-/*
- * Reorder the elements of a list of grouping sets such that they have correct
- * prefix relationships. Also inserts the GroupingSetData annotations.
- *
- * The input must be ordered with smallest sets first; the result is returned
- * with largest sets first. Note that the result shares no list substructure
- * with the input, so it's safe for the caller to modify it later.
- *
- * If we're passed in a sortclause, we follow its order of columns to the
- * extent possible, to minimize the chance that we add unnecessary sorts.
- * (We're trying here to ensure that GROUPING SETS ((a,b,c),(c)) ORDER BY c,b,a
- * gets implemented in one pass.)
- */
-static List *
-reorder_grouping_sets(List *groupingsets, List *sortclause)
-{
- ListCell *lc;
- ListCell *lc2;
- List *previous = NIL;
- List *result = NIL;
-
- foreach(lc, groupingsets)
- {
- List *candidate = (List *) lfirst(lc);
- List *new_elems = list_difference_int(candidate, previous);
- GroupingSetData *gs = makeNode(GroupingSetData);
-
- if (list_length(new_elems) > 0)
- {
- while (list_length(sortclause) > list_length(previous))
- {
- SortGroupClause *sc = list_nth(sortclause, list_length(previous));
- int ref = sc->tleSortGroupRef;
-
- if (list_member_int(new_elems, ref))
- {
- previous = lappend_int(previous, ref);
- new_elems = list_delete_int(new_elems, ref);
- }
- else
- {
- /* diverged from the sortclause; give up on it */
- sortclause = NIL;
- break;
- }
- }
-
- foreach(lc2, new_elems)
- {
- previous = lappend_int(previous, lfirst_int(lc2));
- }
- }
-
- gs->set = list_copy(previous);
- result = lcons(gs, result);
- list_free(new_elems);
- }
-
- list_free(previous);
-
- return result;
-}
-
-/*
- * Compute query_pathkeys and other pathkeys, to tell query_planner() which
- * orderings would be useful for the later planner stages.
- */
-static void
-compute_pathkeys(PlannerInfo *root, List *tlist, List *activeWindows, List *groupClause)
-{
- Query *parse = root->parse;
-
- /*
- * Calculate pathkeys that represent grouping/ordering requirements. The
- * sortClause is certainly sort-able, but GROUP BY and DISTINCT might not
- * be, in which case we just leave their pathkeys empty.
- */
- if (groupClause && grouping_is_sortable(groupClause))
- root->group_pathkeys =
- make_pathkeys_for_sortclauses(root,
- groupClause,
- tlist);
- else
- root->group_pathkeys = NIL;
-
- /* We consider only the first (bottom) window in pathkeys logic */
- if (activeWindows != NIL)
- {
- WindowClause *wc = linitial_node(WindowClause, activeWindows);
-
- root->window_pathkeys = make_pathkeys_for_window(root,
- wc,
- tlist);
- }
- else
- root->window_pathkeys = NIL;
-
- if (parse->distinctClause &&
- grouping_is_sortable(parse->distinctClause))
- root->distinct_pathkeys =
- make_pathkeys_for_sortclauses(root,
- parse->distinctClause,
- tlist);
- else
- root->distinct_pathkeys = NIL;
-
- root->sort_pathkeys =
- make_pathkeys_for_sortclauses(root,
- parse->sortClause,
- tlist);
-
- /*
- * Figure out whether we want a sorted result from query_planner.
- *
- * If we have a sortable GROUP BY clause, then we want a result sorted
- * properly for grouping. Otherwise, if we have window functions to
- * evaluate, we try to sort for the first window. Otherwise, if there's a
- * sortable DISTINCT clause that's more rigorous than the ORDER BY clause,
- * we try to produce output that's sufficiently well sorted for the
- * DISTINCT. Otherwise, if there is an ORDER BY clause, we want to sort
- * by the ORDER BY clause.
- *
- * Note: if we have both ORDER BY and GROUP BY, and ORDER BY is a superset
- * of GROUP BY, it would be tempting to request sort by ORDER BY --- but
- * that might just leave us failing to exploit an available sort order at
- * all. Needs more thought. The choice for DISTINCT versus ORDER BY is
- * much easier, since we know that the parser ensured that one is a
- * superset of the other.
- */
- if (root->group_pathkeys)
- root->query_pathkeys = root->group_pathkeys;
- else if (root->window_pathkeys)
- root->query_pathkeys = root->window_pathkeys;
- else if (list_length(root->distinct_pathkeys) >
- list_length(root->sort_pathkeys))
- root->query_pathkeys = root->distinct_pathkeys;
- else if (root->sort_pathkeys)
- root->query_pathkeys = root->sort_pathkeys;
- else
- root->query_pathkeys = NIL;
-}
-
-/*
- * Estimate number of groups produced by grouping clauses (1 if not grouping)
- *
- * path_rows: number of output rows from scan/join step
- * gd: grouping sets data including list of grouping sets and their clauses
- * target_list: target list containing group clause references
- *
- * If doing grouping sets, we also annotate the gsets data with the estimates
- * for each set and each individual rollup list, with a view to later
- * determining whether some combination of them could be hashed instead.
- */
-static double
-get_number_of_groups(PlannerInfo *root,
- double path_rows,
- grouping_sets_data *gd,
- List *target_list)
-{
- Query *parse = root->parse;
- double dNumGroups;
-
- if (parse->groupClause)
- {
- List *groupExprs;
-
- if (parse->groupingSets)
- {
- /* Add up the estimates for each grouping set */
- ListCell *lc;
- ListCell *lc2;
-
- Assert(gd); /* keep Coverity happy */
-
- dNumGroups = 0;
-
- foreach(lc, gd->rollups)
- {
- RollupData *rollup = lfirst_node(RollupData, lc);
- ListCell *lc;
-
- groupExprs = get_sortgrouplist_exprs(rollup->groupClause,
- target_list);
-
- rollup->numGroups = 0.0;
-
- forboth(lc, rollup->gsets, lc2, rollup->gsets_data)
- {
- List *gset = (List *) lfirst(lc);
- GroupingSetData *gs = lfirst_node(GroupingSetData, lc2);
- double numGroups = estimate_num_groups(root,
- groupExprs,
- path_rows,
- &gset);
-
- gs->numGroups = numGroups;
- rollup->numGroups += numGroups;
- }
-
- dNumGroups += rollup->numGroups;
- }
-
- if (gd->hash_sets_idx)
- {
- ListCell *lc;
-
- gd->dNumHashGroups = 0;
-
- groupExprs = get_sortgrouplist_exprs(parse->groupClause,
- target_list);
-
- forboth(lc, gd->hash_sets_idx, lc2, gd->unsortable_sets)
- {
- List *gset = (List *) lfirst(lc);
- GroupingSetData *gs = lfirst_node(GroupingSetData, lc2);
- double numGroups = estimate_num_groups(root,
- groupExprs,
- path_rows,
- &gset);
-
- gs->numGroups = numGroups;
- gd->dNumHashGroups += numGroups;
- }
-
- dNumGroups += gd->dNumHashGroups;
- }
- }
- else
- {
- /* Plain GROUP BY */
- groupExprs = get_sortgrouplist_exprs(parse->groupClause,
- target_list);
-
- dNumGroups = estimate_num_groups(root, groupExprs, path_rows,
- NULL);
- }
- }
- else if (parse->groupingSets)
- {
- /* Empty grouping sets ... one result row for each one */
- dNumGroups = list_length(parse->groupingSets);
- }
- else if (parse->hasAggs || root->hasHavingQual)
- {
- /* Plain aggregation, one result row */
- dNumGroups = 1;
- }
- else
- {
- /* Not grouping */
- dNumGroups = 1;
- }
-
- return dNumGroups;
-}
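/*
 * A rough illustration of the arithmetic above, with made-up numbers
 * (nothing here comes from real statistics): for
 *
 *     SELECT a, b, count(*) FROM t GROUP BY GROUPING SETS ((a), (b), ());
 *
 * each nonempty grouping set gets its own estimate_num_groups() estimate
 * and the empty set contributes exactly one row, so if the planner
 * guesses 100 distinct values of a and 50 of b, dNumGroups comes out as
 * 100 + 50 + 1 = 151.
 */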
-
-/*
- * estimate_hashagg_tablesize
- * estimate the number of bytes that a hash aggregate hashtable will
- * require based on the agg_costs, path width and dNumGroups.
- *
- * XXX this may be over-estimating the size now that hashagg knows to omit
- * unneeded columns from the hashtable. Also for mixed-mode grouping sets,
- * grouping columns not in the hashed set are counted here even though hashagg
- * won't store them. Is this a problem?
- */
-static Size
-estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
- double dNumGroups)
-{
- Size hashentrysize;
-
- /* Estimate per-hash-entry space at tuple width... */
- hashentrysize = MAXALIGN(path->pathtarget->width) +
- MAXALIGN(SizeofMinimalTupleHeader);
-
- /* plus space for pass-by-ref transition values... */
- hashentrysize += agg_costs->transitionSpace;
- /* plus the per-hash-entry overhead */
- hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
-
- /*
- * Note that this disregards the effect of fill-factor and growth policy
- * of the hash-table. That's probably ok, given that the default
- * fill-factor is relatively high. It'd be hard to meaningfully factor in
- * "double-in-size" growth policies here.
- */
- return hashentrysize * dNumGroups;
-}
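/*
 * As a ballpark example, assuming a 64-bit build where MAXALIGN rounds
 * up to a multiple of 8: grouping on a single int4 key with one count(*)
 * aggregate gives MAXALIGN(4) = 8 bytes of tuple width, roughly 16 bytes
 * of minimal tuple header, zero pass-by-ref transition space, plus the
 * hash_agg_entry_size() overhead -- some tens of bytes per entry, so a
 * million groups would be estimated at tens of megabytes.
 */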
-
-/*
- * create_grouping_paths
- *
- * Build a new upperrel containing Paths for grouping and/or aggregation.
- * Along the way, we also build an upperrel for Paths which are partially
- * grouped and/or aggregated. A partially grouped and/or aggregated path
- * needs a FinalizeAggregate node to complete the aggregation. Currently,
- * the only partially grouped paths we build are also partial paths; that
- * is, they need a Gather and then a FinalizeAggregate.
- *
- * input_rel: contains the source-data Paths
- * target: the pathtarget for the result Paths to compute
- * agg_costs: cost info about all aggregates in query (in AGGSPLIT_SIMPLE mode)
- * gd: grouping sets data including list of grouping sets and their clauses
- *
- * Note: all Paths in input_rel are expected to return the target computed
- * by make_group_input_target.
- */
-static RelOptInfo *
-create_grouping_paths(PlannerInfo *root,
- RelOptInfo *input_rel,
- PathTarget *target,
- bool target_parallel_safe,
- const AggClauseCosts *agg_costs,
- grouping_sets_data *gd)
-{
- Query *parse = root->parse;
- RelOptInfo *grouped_rel;
- RelOptInfo *partially_grouped_rel;
-
- /*
- * Create grouping relation to hold fully aggregated grouping and/or
- * aggregation paths.
- */
- grouped_rel = make_grouping_rel(root, input_rel, target,
- target_parallel_safe, parse->havingQual);
-
- /*
- * Create either paths for a degenerate grouping or paths for ordinary
- * grouping, as appropriate.
- */
- if (is_degenerate_grouping(root))
- create_degenerate_grouping_paths(root, input_rel, grouped_rel);
- else
- {
- int flags = 0;
- GroupPathExtraData extra;
-
- /*
- * Determine whether it's possible to perform sort-based
- * implementations of grouping. (Note that if groupClause is empty,
- * grouping_is_sortable() is trivially true, and all the
- * pathkeys_contained_in() tests will succeed too, so that we'll
- * consider every surviving input path.)
- *
- * If we have grouping sets, we might be able to sort some but not all
- * of them; in this case, we need can_sort to be true as long as we
- * must consider any sorted-input plan.
- */
- if ((gd && gd->rollups != NIL)
- || grouping_is_sortable(parse->groupClause))
- flags |= GROUPING_CAN_USE_SORT;
-
- /*
- * Determine whether we should consider hash-based implementations of
- * grouping.
- *
- * Hashed aggregation only applies if we're grouping. If we have
- * grouping sets, some groups might be hashable but others not; in
- * this case we set can_hash true as long as there is nothing globally
- * preventing us from hashing (and we should therefore consider plans
- * with hashes).
- *
- * Executor doesn't support hashed aggregation with DISTINCT or ORDER
- * BY aggregates. (Doing so would imply storing *all* the input
- * values in the hash table, and/or running many sorts in parallel,
- * either of which seems like a certain loser.) We similarly don't
- * support ordered-set aggregates in hashed aggregation, but that case
- * is also included in the numOrderedAggs count.
- *
- * Note: grouping_is_hashable() is much more expensive to check than
- * the other gating conditions, so we want to do it last.
- */
- if ((parse->groupClause != NIL &&
- agg_costs->numOrderedAggs == 0 &&
- (gd ? gd->any_hashable : grouping_is_hashable(parse->groupClause))))
- flags |= GROUPING_CAN_USE_HASH;
-
- /*
- * Determine whether partial aggregation is possible.
- */
- if (can_partial_agg(root, agg_costs))
- flags |= GROUPING_CAN_PARTIAL_AGG;
-
- extra.flags = flags;
- extra.target_parallel_safe = target_parallel_safe;
- extra.havingQual = parse->havingQual;
- extra.targetList = parse->targetList;
- extra.partial_costs_set = false;
-
- /*
- * Determine whether partitionwise aggregation is in theory possible.
- * It can be disabled by the user, and for now, we don't try to
- * support grouping sets. create_ordinary_grouping_paths() will check
- * additional conditions, such as whether input_rel is partitioned.
- */
- if (enable_partitionwise_aggregate && !parse->groupingSets)
- extra.patype = PARTITIONWISE_AGGREGATE_FULL;
- else
- extra.patype = PARTITIONWISE_AGGREGATE_NONE;
-
- create_ordinary_grouping_paths(root, input_rel, grouped_rel,
- agg_costs, gd, &extra,
- &partially_grouped_rel);
- }
-
- set_cheapest(grouped_rel);
- return grouped_rel;
-}
-
-/*
- * make_grouping_rel
- *
- * Create a new grouping rel and set basic properties.
- *
- * input_rel represents the underlying scan/join relation.
- * target is the output expected from the grouping relation.
- */
-static RelOptInfo *
-make_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
- PathTarget *target, bool target_parallel_safe,
- Node *havingQual)
-{
- RelOptInfo *grouped_rel;
-
- if (IS_OTHER_REL(input_rel))
- {
- grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG,
- input_rel->relids);
- grouped_rel->reloptkind = RELOPT_OTHER_UPPER_REL;
- }
- else
- {
- /*
- * By tradition, the relids set for the main grouping relation is
- * NULL. (This could be changed, but might require adjustments
- * elsewhere.)
- */
- grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL);
- }
-
- /* Set target. */
- grouped_rel->reltarget = target;
-
- /*
- * If the input relation is not parallel-safe, then the grouped relation
- * can't be parallel-safe, either. Otherwise, it's parallel-safe if the
- * target list and HAVING quals are parallel-safe.
- */
- if (input_rel->consider_parallel && target_parallel_safe &&
- is_parallel_safe(root, (Node *) havingQual))
- grouped_rel->consider_parallel = true;
-
- /*
- * If the input rel belongs to a single FDW, so does the grouped rel.
- */
- grouped_rel->serverid = input_rel->serverid;
- grouped_rel->userid = input_rel->userid;
- grouped_rel->useridiscurrent = input_rel->useridiscurrent;
- grouped_rel->fdwroutine = input_rel->fdwroutine;
-
- return grouped_rel;
-}
-
-/*
- * is_degenerate_grouping
- *
- * A degenerate grouping is one in which the query has a HAVING qual and/or
- * grouping sets, but no aggregates and no GROUP BY (which implies that the
- * grouping sets are all empty).
- */
-static bool
-is_degenerate_grouping(PlannerInfo *root)
-{
- Query *parse = root->parse;
-
- return (root->hasHavingQual || parse->groupingSets) &&
- !parse->hasAggs && parse->groupClause == NIL;
-}
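/*
 * For example, with a hypothetical table t, a query such as
 *
 *     SELECT 1 FROM t HAVING 2 > 1;
 *
 * is degenerate: it has a HAVING qual but no aggregates and no GROUP BY,
 * so it produces exactly zero or one row no matter what t contains.
 */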
-
-/*
- * create_degenerate_grouping_paths
- *
- * When the grouping is degenerate (see is_degenerate_grouping), we are
- * supposed to emit either zero or one row for each grouping set depending on
- * whether HAVING succeeds. Furthermore, there cannot be any variables in
- * either HAVING or the targetlist, so we actually do not need the FROM table
- * at all! We can just throw away the plan-so-far and generate a Result node.
- * This is a sufficiently unusual corner case that it's not worth contorting
- * the structure of this module to avoid having to generate the earlier paths
- * in the first place.
- */
-static void
-create_degenerate_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
- RelOptInfo *grouped_rel)
-{
- Query *parse = root->parse;
- int nrows;
- Path *path;
-
- nrows = list_length(parse->groupingSets);
- if (nrows > 1)
- {
- /*
- * Doesn't seem worthwhile writing code to cons up a generate_series
- * or a values scan to emit multiple rows. Instead just make N clones
- * and append them. (With a volatile HAVING clause, this means you
- * might get between 0 and N output rows. Offhand I think that's
- * desired.)
- */
- List *paths = NIL;
-
- while (--nrows >= 0)
- {
- path = (Path *)
- create_result_path(root, grouped_rel,
- grouped_rel->reltarget,
- (List *) parse->havingQual);
- paths = lappend(paths, path);
- }
- path = (Path *)
- create_append_path(root,
- grouped_rel,
- paths,
- NIL,
- NULL,
- 0,
- false,
- NIL,
- -1);
- }
- else
- {
- /* No grouping sets, or just one, so one output row */
- path = (Path *)
- create_result_path(root, grouped_rel,
- grouped_rel->reltarget,
- (List *) parse->havingQual);
- }
-
- add_path(grouped_rel, path);
-}
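/*
 * Continuing the hypothetical example above: for
 *
 *     SELECT 1 FROM t GROUP BY GROUPING SETS ((), ()) HAVING 2 > 1;
 *
 * nrows is 2, so two Result paths are built and combined with an Append,
 * yielding up to one output row per empty grouping set.
 */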
-
-/*
- * create_ordinary_grouping_paths
- *
- * Create grouping paths for the ordinary (that is, non-degenerate) case.
- *
- * We need to consider sorted and hashed aggregation in the same function,
- * because otherwise (1) it would be harder to throw an appropriate error
- * message if neither way works, and (2) we should not allow hashtable size
- * considerations to dissuade us from using hashing if sorting is not possible.
- *
- * *partially_grouped_rel_p will be set to the partially grouped rel which this
- * function creates, or to NULL if it doesn't create one.
- */
-static void
-create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
- RelOptInfo *grouped_rel,
- const AggClauseCosts *agg_costs,
- grouping_sets_data *gd,
- GroupPathExtraData *extra,
- RelOptInfo **partially_grouped_rel_p)
-{
- Path *cheapest_path = input_rel->cheapest_total_path;
- RelOptInfo *partially_grouped_rel = NULL;
- double dNumGroups;
- PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
-
- /*
- * If this is the topmost grouping relation or if the parent relation is
- * doing some form of partitionwise aggregation, then we may be able to do
- * it at this level also. However, if the input relation is not
- * partitioned, partitionwise aggregate is impossible, and if it is dummy,
- * partitionwise aggregate is pointless.
- */
- if (extra->patype != PARTITIONWISE_AGGREGATE_NONE &&
- input_rel->part_scheme && input_rel->part_rels &&
- !IS_DUMMY_REL(input_rel))
- {
- /*
- * If this is the topmost relation or if the parent relation is doing
- * full partitionwise aggregation, then we can do full partitionwise
- * aggregation provided that the GROUP BY clause contains all of the
- * partitioning columns at this level. Otherwise, we can do at most
- * partial partitionwise aggregation. But if partial aggregation is
- * not supported in general then we can't use it for partitionwise
- * aggregation either.
- */
- if (extra->patype == PARTITIONWISE_AGGREGATE_FULL &&
- group_by_has_partkey(input_rel, extra->targetList,
- root->parse->groupClause))
- patype = PARTITIONWISE_AGGREGATE_FULL;
- else if ((extra->flags & GROUPING_CAN_PARTIAL_AGG) != 0)
- patype = PARTITIONWISE_AGGREGATE_PARTIAL;
- else
- patype = PARTITIONWISE_AGGREGATE_NONE;
- }
-
- /*
- * Before generating paths for grouped_rel, we first generate any possible
- * partially grouped paths; that way, later code can easily consider both
- * parallel and non-parallel approaches to grouping.
- */
- if ((extra->flags & GROUPING_CAN_PARTIAL_AGG) != 0)
- {
- bool force_rel_creation;
-
- /*
- * If we're doing partitionwise aggregation at this level, force
- * creation of a partially_grouped_rel so we can add partitionwise
- * paths to it.
- */
- force_rel_creation = (patype == PARTITIONWISE_AGGREGATE_PARTIAL);
-
- partially_grouped_rel =
- create_partial_grouping_paths(root,
- grouped_rel,
- input_rel,
- gd,
- extra,
- force_rel_creation);
- }
-
- /* Set out parameter. */
- *partially_grouped_rel_p = partially_grouped_rel;
-
- /* Apply partitionwise aggregation technique, if possible. */
- if (patype != PARTITIONWISE_AGGREGATE_NONE)
- create_partitionwise_grouping_paths(root, input_rel, grouped_rel,
- partially_grouped_rel, agg_costs,
- gd, patype, extra);
-
- /* If we are doing partial aggregation only, return. */
- if (extra->patype == PARTITIONWISE_AGGREGATE_PARTIAL)
- {
- Assert(partially_grouped_rel);
-
- if (partially_grouped_rel->pathlist)
- set_cheapest(partially_grouped_rel);
-
- return;
- }
-
- /* Gather any partially grouped partial paths. */
- if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
- {
- gather_grouping_paths(root, partially_grouped_rel);
- set_cheapest(partially_grouped_rel);
- }
-
- /*
- * Estimate number of groups.
- */
- dNumGroups = get_number_of_groups(root,
- cheapest_path->rows,
- gd,
- extra->targetList);
-
- /* Build final grouping paths */
- add_paths_to_grouping_rel(root, input_rel, grouped_rel,
- partially_grouped_rel, agg_costs, gd,
- dNumGroups, extra);
-
- /* Give a helpful error if we failed to find any implementation */
- if (grouped_rel->pathlist == NIL)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("could not implement GROUP BY"),
- errdetail("Some of the datatypes only support hashing, while others only support sorting.")));
-
- /*
- * If there is an FDW that's responsible for all baserels of the query,
- * let it consider adding ForeignPaths.
- */
- if (grouped_rel->fdwroutine &&
- grouped_rel->fdwroutine->GetForeignUpperPaths)
- grouped_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_GROUP_AGG,
- input_rel, grouped_rel,
- extra);
-
- /* Let extensions possibly add some more paths */
- if (create_upper_paths_hook)
- (*create_upper_paths_hook) (root, UPPERREL_GROUP_AGG,
- input_rel, grouped_rel,
- extra);
-}
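/*
 * To illustrate the partitionwise choice above with a hypothetical
 * schema: if the input rel is partitioned by (a), then GROUP BY a
 * permits full partitionwise aggregation, since each partition holds
 * disjoint groups; GROUP BY b can at best be aggregated partially per
 * partition and finalized above the Append.
 */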
-
-/*
- * For a given input path, consider the possible ways of doing grouping sets on
- * it, by combinations of hashing and sorting. This can be called multiple
- * times, so it's important that it not scribble on input. No result is
- * returned, but any generated paths are added to grouped_rel.
- */
-static void
-consider_groupingsets_paths(PlannerInfo *root,
- RelOptInfo *grouped_rel,
- Path *path,
- bool is_sorted,
- bool can_hash,
- grouping_sets_data *gd,
- const AggClauseCosts *agg_costs,
- double dNumGroups)
-{
- Query *parse = root->parse;
-
- /*
- * If we're not being offered sorted input, then only consider plans that
- * can be done entirely by hashing.
- *
- * We can hash everything if it looks like it'll fit in work_mem. But if
- * the input is actually sorted despite not being advertised as such, we
- * prefer to make use of that in order to use less memory.
- *
- * If none of the grouping sets are sortable, then ignore the work_mem
- * limit and generate a path anyway, since otherwise we'll just fail.
- */
- if (!is_sorted)
- {
- List *new_rollups = NIL;
- RollupData *unhashed_rollup = NULL;
- List *sets_data;
- List *empty_sets_data = NIL;
- List *empty_sets = NIL;
- ListCell *lc;
- ListCell *l_start = list_head(gd->rollups);
- AggStrategy strat = AGG_HASHED;
- Size hashsize;
- double exclude_groups = 0.0;
-
- Assert(can_hash);
-
- /*
- * If the input is coincidentally sorted usefully (which can happen
- * even if is_sorted is false, since that only means that our caller
- * has set up the sorting for us), then save some hashtable space by
- * making use of that. But we need to watch out for degenerate cases:
- *
- * 1) If there are any empty grouping sets, then group_pathkeys might
- * be NIL if all non-empty grouping sets are unsortable. In this case,
- * there will be a rollup containing only empty groups, and the
- * pathkeys_contained_in test is vacuously true; this is ok.
- *
- * XXX: the above relies on the fact that group_pathkeys is generated
- * from the first rollup. If we add the ability to consider multiple
- * sort orders for grouping input, this assumption might fail.
- *
- * 2) If there are no empty sets and only unsortable sets, then the
- * rollups list will be empty (and thus l_start == NULL), and
- * group_pathkeys will be NIL; we must ensure that the vacuously-true
- * pathkeys_contained_in test doesn't cause us to crash.
- */
- if (l_start != NULL &&
- pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
- {
- unhashed_rollup = lfirst_node(RollupData, l_start);
- exclude_groups = unhashed_rollup->numGroups;
- l_start = lnext(l_start);
- }
-
- hashsize = estimate_hashagg_tablesize(path,
- agg_costs,
- dNumGroups - exclude_groups);
-
- /*
- * gd->rollups is empty if we have only unsortable columns to work
- * with. Override work_mem in that case; otherwise, we'll rely on the
- * sorted-input case to generate usable mixed paths.
- */
- if (hashsize > work_mem * 1024L && gd->rollups)
- return; /* nope, won't fit */
-
- /*
- * We need to burst the existing rollups list into individual grouping
- * sets and recompute a groupClause for each set.
- */
- sets_data = list_copy(gd->unsortable_sets);
-
- for_each_cell(lc, l_start)
- {
- RollupData *rollup = lfirst_node(RollupData, lc);
-
- /*
- * If we find an unhashable rollup that's not been skipped by the
- * "actually sorted" check above, we can't cope; we'd need sorted
- * input (with a different sort order) but we can't get that here.
- * So bail out; we'll get a valid path from the is_sorted case
- * instead.
- *
- * The mere presence of empty grouping sets doesn't make a rollup
- * unhashable (see preprocess_grouping_sets); we handle those
- * specially below.
- */
- if (!rollup->hashable)
- return;
- else
- sets_data = list_concat(sets_data, list_copy(rollup->gsets_data));
- }
- foreach(lc, sets_data)
- {
- GroupingSetData *gs = lfirst_node(GroupingSetData, lc);
- List *gset = gs->set;
- RollupData *rollup;
-
- if (gset == NIL)
- {
- /* Empty grouping sets can't be hashed. */
- empty_sets_data = lappend(empty_sets_data, gs);
- empty_sets = lappend(empty_sets, NIL);
- }
- else
- {
- rollup = makeNode(RollupData);
-
- rollup->groupClause = preprocess_groupclause(root, gset);
- rollup->gsets_data = list_make1(gs);
- rollup->gsets = remap_to_groupclause_idx(rollup->groupClause,
- rollup->gsets_data,
- gd->tleref_to_colnum_map);
- rollup->numGroups = gs->numGroups;
- rollup->hashable = true;
- rollup->is_hashed = true;
- new_rollups = lappend(new_rollups, rollup);
- }
- }
-
- /*
- * If we didn't find anything nonempty to hash, then bail. We'll
- * generate a path from the is_sorted case.
- */
- if (new_rollups == NIL)
- return;
-
- /*
- * If there were empty grouping sets they should have been in the
- * first rollup.
- */
- Assert(!unhashed_rollup || !empty_sets);
-
- if (unhashed_rollup)
- {
- new_rollups = lappend(new_rollups, unhashed_rollup);
- strat = AGG_MIXED;
- }
- else if (empty_sets)
- {
- RollupData *rollup = makeNode(RollupData);
-
- rollup->groupClause = NIL;
- rollup->gsets_data = empty_sets_data;
- rollup->gsets = empty_sets;
- rollup->numGroups = list_length(empty_sets);
- rollup->hashable = false;
- rollup->is_hashed = false;
- new_rollups = lappend(new_rollups, rollup);
- strat = AGG_MIXED;
- }
-
- add_path(grouped_rel, (Path *)
- create_groupingsets_path(root,
- grouped_rel,
- path,
- (List *) parse->havingQual,
- strat,
- new_rollups,
- agg_costs,
- dNumGroups));
- return;
- }
-
- /*
- * If we have sorted input but nothing we can do with it, bail.
- */
- if (list_length(gd->rollups) == 0)
- return;
-
- /*
- * Given sorted input, we try to make two paths: one sorted and one mixed
- * sort/hash. (We need to try both because hashagg might be disabled, or
- * some columns might not be sortable.)
- *
- * can_hash is passed in as false if some obstacle elsewhere (such as
- * ordered aggs) means that we shouldn't consider hashing at all.
- */
- if (can_hash && gd->any_hashable)
- {
- List *rollups = NIL;
- List *hash_sets = list_copy(gd->unsortable_sets);
- double availspace = (work_mem * 1024.0);
- ListCell *lc;
-
- /*
- * Account first for space needed for groups we can't sort at all.
- */
- availspace -= (double) estimate_hashagg_tablesize(path,
- agg_costs,
- gd->dNumHashGroups);
-
- if (availspace > 0 && list_length(gd->rollups) > 1)
- {
- double scale;
- int num_rollups = list_length(gd->rollups);
- int k_capacity;
- int *k_weights = palloc(num_rollups * sizeof(int));
- Bitmapset *hash_items = NULL;
- int i;
-
- /*
- * We treat this as a knapsack problem: the knapsack capacity
- * represents work_mem, the item weights are the estimated memory
- * usage of the hashtables needed to implement a single rollup,
- * and we really ought to use the cost saving as the item value;
- * however, currently the costs assigned to sort nodes don't
- * reflect the comparison costs well, and so we treat all items as
- * of equal value (each rollup we hash instead saves us one sort).
- *
- * To use the discrete knapsack, we need to scale the values to a
- * reasonably small bounded range. We choose to allow a 5% error
- * margin; we have no more than 4096 rollups in the worst possible
- * case, which with a 5% error margin will require a bit over 42MB
- * of workspace. (Anyone wanting to plan queries that complex had
- * better have the memory for it. In more reasonable cases, with
- * no more than a couple of dozen rollups, the memory usage will
- * be negligible.)
- *
- * k_capacity is naturally bounded, but we clamp the values for
- * scale and weight (below) to avoid overflows or underflows (or
- * uselessly trying to use a scale factor less than 1 byte).
- */
- scale = Max(availspace / (20.0 * num_rollups), 1.0);
- k_capacity = (int) floor(availspace / scale);
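/*
 * Illustrative numbers only: with 4MB of remaining hash space and 10
 * candidate rollups, scale is Max(4194304 / 200, 1), about 21kB, and
 * k_capacity comes out as 200 -- 20 capacity units per rollup, which
 * is where the 5% rounding granularity above comes from.
 */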
-
- /*
- * We leave the first rollup out of consideration since it's the
- * one that matches the input sort order. We assign indexes "i"
- * to only those entries considered for hashing; the second loop,
- * below, must use the same condition.
- */
- i = 0;
- for_each_cell(lc, lnext(list_head(gd->rollups)))
- {
- RollupData *rollup = lfirst_node(RollupData, lc);
-
- if (rollup->hashable)
- {
- double sz = estimate_hashagg_tablesize(path,
- agg_costs,
- rollup->numGroups);
-
- /*
- * If sz is enormous, but work_mem (and hence scale) is
- * small, avoid integer overflow here.
- */
- k_weights[i] = (int) Min(floor(sz / scale),
- k_capacity + 1.0);
- ++i;
- }
- }
-
- /*
- * Apply knapsack algorithm; compute the set of items which
- * maximizes the value stored (in this case the number of sorts
- * saved) while keeping the total size (approximately) within
- * capacity.
- */
- if (i > 0)
- hash_items = DiscreteKnapsack(k_capacity, i, k_weights, NULL);
-
- if (!bms_is_empty(hash_items))
- {
- rollups = list_make1(linitial(gd->rollups));
-
- i = 0;
- for_each_cell(lc, lnext(list_head(gd->rollups)))
- {
- RollupData *rollup = lfirst_node(RollupData, lc);
-
- if (rollup->hashable)
- {
- if (bms_is_member(i, hash_items))
- hash_sets = list_concat(hash_sets,
- list_copy(rollup->gsets_data));
- else
- rollups = lappend(rollups, rollup);
- ++i;
- }
- else
- rollups = lappend(rollups, rollup);
- }
- }
- }
-
- if (!rollups && hash_sets)
- rollups = list_copy(gd->rollups);
-
- foreach(lc, hash_sets)
- {
- GroupingSetData *gs = lfirst_node(GroupingSetData, lc);
- RollupData *rollup = makeNode(RollupData);
-
- Assert(gs->set != NIL);
-
- rollup->groupClause = preprocess_groupclause(root, gs->set);
- rollup->gsets_data = list_make1(gs);
- rollup->gsets = remap_to_groupclause_idx(rollup->groupClause,
- rollup->gsets_data,
- gd->tleref_to_colnum_map);
- rollup->numGroups = gs->numGroups;
- rollup->hashable = true;
- rollup->is_hashed = true;
- rollups = lcons(rollup, rollups);
- }
-
- if (rollups)
- {
- add_path(grouped_rel, (Path *)
- create_groupingsets_path(root,
- grouped_rel,
- path,
- (List *) parse->havingQual,
- AGG_MIXED,
- rollups,
- agg_costs,
- dNumGroups));
- }
- }
-
- /*
- * Now try the simple sorted case.
- */
- if (!gd->unsortable_sets)
- add_path(grouped_rel, (Path *)
- create_groupingsets_path(root,
- grouped_rel,
- path,
- (List *) parse->havingQual,
- AGG_SORTED,
- gd->rollups,
- agg_costs,
- dNumGroups));
-}
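/*
 * A sketch of one possible outcome of the logic above: for GROUPING
 * SETS ((a), (b)) where only (b) is hashable and the input happens to
 * arrive sorted on a, the (a) rollup is consumed in its input order
 * while (b) is hashed, producing a single AGG_MIXED path.
 */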
-
-/*
- * create_window_paths
- *
- * Build a new upperrel containing Paths for window-function evaluation.
- *
- * input_rel: contains the source-data Paths
- * input_target: result of make_window_input_target
- * output_target: what the topmost WindowAggPath should return
- * tlist: query's target list (needed to look up pathkeys)
- * wflists: result of find_window_functions
- * activeWindows: result of select_active_windows
- *
- * Note: all Paths in input_rel are expected to return input_target.
- */
-static RelOptInfo *
-create_window_paths(PlannerInfo *root,
- RelOptInfo *input_rel,
- PathTarget *input_target,
- PathTarget *output_target,
- bool output_target_parallel_safe,
- List *tlist,
- WindowFuncLists *wflists,
- List *activeWindows)
-{
- RelOptInfo *window_rel;
- ListCell *lc;
-
- /* For now, do all work in the (WINDOW, NULL) upperrel */
- window_rel = fetch_upper_rel(root, UPPERREL_WINDOW, NULL);
-
- /*
- * If the input relation is not parallel-safe, then the window relation
- * can't be parallel-safe, either. Otherwise, we need to examine the
- * target list and active windows for non-parallel-safe constructs.
- */
- if (input_rel->consider_parallel && output_target_parallel_safe &&
- is_parallel_safe(root, (Node *) activeWindows))
- window_rel->consider_parallel = true;
-
- /*
- * If the input rel belongs to a single FDW, so does the window rel.
- */
- window_rel->serverid = input_rel->serverid;
- window_rel->userid = input_rel->userid;
- window_rel->useridiscurrent = input_rel->useridiscurrent;
- window_rel->fdwroutine = input_rel->fdwroutine;
-
- /*
- * Consider computing window functions starting from the existing
- * cheapest-total path (which will likely require a sort) as well as any
- * existing paths that satisfy root->window_pathkeys (which won't).
- */
- foreach(lc, input_rel->pathlist)
- {
- Path *path = (Path *) lfirst(lc);
-
- if (path == input_rel->cheapest_total_path ||
- pathkeys_contained_in(root->window_pathkeys, path->pathkeys))
- create_one_window_path(root,
- window_rel,
- path,
- input_target,
- output_target,
- tlist,
- wflists,
- activeWindows);
- }
-
- /*
- * If there is an FDW that's responsible for all baserels of the query,
- * let it consider adding ForeignPaths.
- */
- if (window_rel->fdwroutine &&
- window_rel->fdwroutine->GetForeignUpperPaths)
- window_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_WINDOW,
- input_rel, window_rel,
- NULL);
-
- /* Let extensions possibly add some more paths */
- if (create_upper_paths_hook)
- (*create_upper_paths_hook) (root, UPPERREL_WINDOW,
- input_rel, window_rel, NULL);
-
- /* Now choose the best path(s) */
- set_cheapest(window_rel);
-
- return window_rel;
-}
-
-/*
- * Stack window-function implementation steps atop the given Path, and
- * add the result to window_rel.
- *
- * window_rel: upperrel to contain result
- * path: input Path to use (must return input_target)
- * input_target: result of make_window_input_target
- * output_target: what the topmost WindowAggPath should return
- * tlist: query's target list (needed to look up pathkeys)
- * wflists: result of find_window_functions
- * activeWindows: result of select_active_windows
- */
-static void
-create_one_window_path(PlannerInfo *root,
- RelOptInfo *window_rel,
- Path *path,
- PathTarget *input_target,
- PathTarget *output_target,
- List *tlist,
- WindowFuncLists *wflists,
- List *activeWindows)
-{
- PathTarget *window_target;
- ListCell *l;
-
- /*
- * Since each window clause could require a different sort order, we stack
- * up a WindowAgg node for each clause, with sort steps between them as
- * needed. (We assume that select_active_windows chose a good order for
- * executing the clauses in.)
- *
- * input_target should contain all Vars and Aggs needed for the result.
- * (In some cases we wouldn't need to propagate all of these all the way
- * to the top, since they might only be needed as inputs to WindowFuncs.
- * It's probably not worth trying to optimize that though.) It must also
- * contain all window partitioning and sorting expressions, to ensure
- * they're computed only once at the bottom of the stack (that's critical
- * for volatile functions). As we climb up the stack, we'll add outputs
- * for the WindowFuncs computed at each level.
- */
- window_target = input_target;
-
- foreach(l, activeWindows)
- {
- WindowClause *wc = lfirst_node(WindowClause, l);
- List *window_pathkeys;
-
- window_pathkeys = make_pathkeys_for_window(root,
- wc,
- tlist);
-
- /* Sort if necessary */
- if (!pathkeys_contained_in(window_pathkeys, path->pathkeys))
- {
- path = (Path *) create_sort_path(root, window_rel,
- path,
- window_pathkeys,
- -1.0);
- }
-
- if (lnext(l))
- {
- /*
- * Add the current WindowFuncs to the output target for this
- * intermediate WindowAggPath. We must copy window_target to
- * avoid changing the previous path's target.
- *
- * Note: a WindowFunc adds nothing to the target's eval costs; but
- * we do need to account for the increase in tlist width.
- */
- ListCell *lc2;
-
- window_target = copy_pathtarget(window_target);
- foreach(lc2, wflists->windowFuncs[wc->winref])
- {
- WindowFunc *wfunc = lfirst_node(WindowFunc, lc2);
-
- add_column_to_pathtarget(window_target, (Expr *) wfunc, 0);
- window_target->width += get_typavgwidth(wfunc->wintype, -1);
- }
- }
- else
- {
- /* Install the goal target in the topmost WindowAgg */
- window_target = output_target;
- }
-
- path = (Path *)
- create_windowagg_path(root, window_rel, path, window_target,
- wflists->windowFuncs[wc->winref],
- wc,
- window_pathkeys);
- }
-
- add_path(window_rel, path);
-}
-
-/*
- * create_distinct_paths
- *
- * Build a new upperrel containing Paths for SELECT DISTINCT evaluation.
- *
- * input_rel: contains the source-data Paths
- *
- * Note: input paths should already compute the desired pathtarget, since
- * Sort/Unique won't project anything.
- */
-static RelOptInfo *
-create_distinct_paths(PlannerInfo *root,
- RelOptInfo *input_rel)
-{
- Query *parse = root->parse;
- Path *cheapest_input_path = input_rel->cheapest_total_path;
- RelOptInfo *distinct_rel;
- double numDistinctRows;
- bool allow_hash;
- Path *path;
- ListCell *lc;
-
- /* For now, do all work in the (DISTINCT, NULL) upperrel */
- distinct_rel = fetch_upper_rel(root, UPPERREL_DISTINCT, NULL);
-
- /*
- * We don't compute anything at this level, so distinct_rel will be
- * parallel-safe if the input rel is parallel-safe. In particular, if
- * there is a DISTINCT ON (...) clause, any path for the input_rel will
- * output those expressions, and will not be parallel-safe unless those
- * expressions are parallel-safe.
- */
- distinct_rel->consider_parallel = input_rel->consider_parallel;
-
- /*
- * If the input rel belongs to a single FDW, so does the distinct_rel.
- */
- distinct_rel->serverid = input_rel->serverid;
- distinct_rel->userid = input_rel->userid;
- distinct_rel->useridiscurrent = input_rel->useridiscurrent;
- distinct_rel->fdwroutine = input_rel->fdwroutine;
-
- /* Estimate number of distinct rows there will be */
- if (parse->groupClause || parse->groupingSets || parse->hasAggs ||
- root->hasHavingQual)
- {
- /*
- * If there was grouping or aggregation, use the number of input rows
- * as the estimated number of DISTINCT rows (ie, assume the input is
- * already mostly unique).
- */
- numDistinctRows = cheapest_input_path->rows;
- }
- else
- {
- /*
- * Otherwise, the UNIQUE filter has effects comparable to GROUP BY.
- */
- List *distinctExprs;
-
- distinctExprs = get_sortgrouplist_exprs(parse->distinctClause,
- parse->targetList);
- numDistinctRows = estimate_num_groups(root, distinctExprs,
- cheapest_input_path->rows,
- NULL);
- }
-
- /*
- * Consider sort-based implementations of DISTINCT, if possible.
- */
- if (grouping_is_sortable(parse->distinctClause))
- {
- /*
- * First, if we have any adequately-presorted paths, just stick a
- * Unique node on those. Then consider doing an explicit sort of the
- * cheapest input path and Unique'ing that.
- *
- * When we have DISTINCT ON, we must sort by the more rigorous of
- * DISTINCT and ORDER BY, else it won't have the desired behavior.
- * Also, if we do have to do an explicit sort, we might as well use
- * the more rigorous ordering to avoid a second sort later. (Note
- * that the parser will have ensured that one clause is a prefix of
- * the other.)
- */
- List *needed_pathkeys;
-
- if (parse->hasDistinctOn &&
- list_length(root->distinct_pathkeys) <
- list_length(root->sort_pathkeys))
- needed_pathkeys = root->sort_pathkeys;
- else
- needed_pathkeys = root->distinct_pathkeys;
-
- foreach(lc, input_rel->pathlist)
- {
- Path *path = (Path *) lfirst(lc);
-
- if (pathkeys_contained_in(needed_pathkeys, path->pathkeys))
- {
- add_path(distinct_rel, (Path *)
- create_upper_unique_path(root, distinct_rel,
- path,
- list_length(root->distinct_pathkeys),
- numDistinctRows));
- }
- }
-
- /* For explicit-sort case, always use the more rigorous clause */
- if (list_length(root->distinct_pathkeys) <
- list_length(root->sort_pathkeys))
- {
- needed_pathkeys = root->sort_pathkeys;
- /* Assert checks that parser didn't mess up... */
- Assert(pathkeys_contained_in(root->distinct_pathkeys,
- needed_pathkeys));
- }
- else
- needed_pathkeys = root->distinct_pathkeys;
-
- path = cheapest_input_path;
- if (!pathkeys_contained_in(needed_pathkeys, path->pathkeys))
- path = (Path *) create_sort_path(root, distinct_rel,
- path,
- needed_pathkeys,
- -1.0);
-
- add_path(distinct_rel, (Path *)
- create_upper_unique_path(root, distinct_rel,
- path,
- list_length(root->distinct_pathkeys),
- numDistinctRows));
- }
-
- /*
- * Consider hash-based implementations of DISTINCT, if possible.
- *
- * If we were not able to make any other types of path, we *must* hash or
- * die trying. If we do have other choices, there are several things that
- * should prevent selection of hashing: if the query uses DISTINCT ON
- * (because it won't really have the expected behavior if we hash), or if
- * enable_hashagg is off, or if it looks like the hashtable will exceed
- * work_mem.
- *
- * Note: grouping_is_hashable() is much more expensive to check than the
- * other gating conditions, so we want to do it last.
- */
- if (distinct_rel->pathlist == NIL)
- allow_hash = true; /* we have no alternatives */
- else if (parse->hasDistinctOn || !enable_hashagg)
- allow_hash = false; /* policy-based decision not to hash */
- else
- {
- Size hashentrysize;
-
- /* Estimate per-hash-entry space at tuple width... */
- hashentrysize = MAXALIGN(cheapest_input_path->pathtarget->width) +
- MAXALIGN(SizeofMinimalTupleHeader);
- /* plus the per-hash-entry overhead */
- hashentrysize += hash_agg_entry_size(0);
-
- /* Allow hashing only if hashtable is predicted to fit in work_mem */
- allow_hash = (hashentrysize * numDistinctRows <= work_mem * 1024L);
- }
-
- if (allow_hash && grouping_is_hashable(parse->distinctClause))
- {
- /* Generate hashed aggregate path --- no sort needed */
- add_path(distinct_rel, (Path *)
- create_agg_path(root,
- distinct_rel,
- cheapest_input_path,
- cheapest_input_path->pathtarget,
- AGG_HASHED,
- AGGSPLIT_SIMPLE,
- parse->distinctClause,
- NIL,
- NULL,
- numDistinctRows));
- }
-
- /* Give a helpful error if we failed to find any implementation */
- if (distinct_rel->pathlist == NIL)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("could not implement DISTINCT"),
- errdetail("Some of the datatypes only support hashing, while others only support sorting.")));
-
- /*
- * If there is an FDW that's responsible for all baserels of the query,
- * let it consider adding ForeignPaths.
+ * If we have a sortable GROUP BY clause, then we want a result sorted
+ * properly for grouping. Otherwise, if we have window functions to
+ * evaluate, we try to sort for the first window. Otherwise, if there's a
+ * sortable DISTINCT clause that's more rigorous than the ORDER BY clause,
+ * we try to produce output that's sufficiently well sorted for the
+ * DISTINCT. Otherwise, if there is an ORDER BY clause, we want to sort
+ * by the ORDER BY clause.
+ *
+ * Note: if we have both ORDER BY and GROUP BY, and ORDER BY is a superset
+ * of GROUP BY, it would be tempting to request sort by ORDER BY --- but
+ * that might just leave us failing to exploit an available sort order at
+ * all. Needs more thought. The choice for DISTINCT versus ORDER BY is
+ * much easier, since we know that the parser ensured that one is a
+ * superset of the other.
*/
- if (distinct_rel->fdwroutine &&
- distinct_rel->fdwroutine->GetForeignUpperPaths)
- distinct_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_DISTINCT,
- input_rel, distinct_rel,
- NULL);
-
- /* Let extensions possibly add some more paths */
- if (create_upper_paths_hook)
- (*create_upper_paths_hook) (root, UPPERREL_DISTINCT,
- input_rel, distinct_rel, NULL);
-
- /* Now choose the best path(s) */
- set_cheapest(distinct_rel);
-
- return distinct_rel;
+ if (root->group_pathkeys)
+ root->query_pathkeys = root->group_pathkeys;
+ else if (root->window_pathkeys)
+ root->query_pathkeys = root->window_pathkeys;
+ else if (list_length(root->distinct_pathkeys) >
+ list_length(root->sort_pathkeys))
+ root->query_pathkeys = root->distinct_pathkeys;
+ else if (root->sort_pathkeys)
+ root->query_pathkeys = root->sort_pathkeys;
+ else
+ root->query_pathkeys = NIL;
}
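/*
 * For instance, in a hypothetical query such as
 *
 *     SELECT x, count(*) FROM t GROUP BY x ORDER BY x;
 *
 * the sortable GROUP BY wins the chain above, so query_pathkeys is set
 * to group_pathkeys, and a single sort on x can then serve both the
 * grouping step and the final ORDER BY.
 */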
/*
- * create_ordered_paths
- *
- * Build a new upperrel containing Paths for ORDER BY evaluation.
+ * create_window_paths
*
- * All paths in the result must satisfy the ORDER BY ordering.
- * The only new path we need consider is an explicit sort on the
- * cheapest-total existing path.
+ * Build a new upperrel containing Paths for window-function evaluation.
*
* input_rel: contains the source-data Paths
- * target: the output tlist the result Paths must emit
- * limit_tuples: estimated bound on the number of output tuples,
- * or -1 if no LIMIT or couldn't estimate
+ * input_target: result of make_window_input_target
+ * output_target: what the topmost WindowAggPath should return
+ * tlist: query's target list (needed to look up pathkeys)
+ * wflists: result of find_window_functions
+ * activeWindows: result of select_active_windows
+ *
+ * Note: all Paths in input_rel are expected to return input_target.
*/
static RelOptInfo *
-create_ordered_paths(PlannerInfo *root,
- RelOptInfo *input_rel,
- PathTarget *target,
- bool target_parallel_safe,
- double limit_tuples)
+create_window_paths(PlannerInfo *root,
+ RelOptInfo *input_rel,
+ PathTarget *input_target,
+ PathTarget *output_target,
+ bool output_target_parallel_safe,
+ List *tlist,
+ WindowFuncLists *wflists,
+ List *activeWindows)
{
- Path *cheapest_input_path = input_rel->cheapest_total_path;
- RelOptInfo *ordered_rel;
+ RelOptInfo *window_rel;
ListCell *lc;
- /* For now, do all work in the (ORDERED, NULL) upperrel */
- ordered_rel = fetch_upper_rel(root, UPPERREL_ORDERED, NULL);
+ /* For now, do all work in the (WINDOW, NULL) upperrel */
+ window_rel = fetch_upper_rel(root, UPPERREL_WINDOW, NULL);
/*
- * If the input relation is not parallel-safe, then the ordered relation
- * can't be parallel-safe, either. Otherwise, it's parallel-safe if the
- * target list is parallel-safe.
+ * If the input relation is not parallel-safe, then the window relation
+ * can't be parallel-safe, either. Otherwise, we need to examine the
+ * target list and active windows for non-parallel-safe constructs.
*/
- if (input_rel->consider_parallel && target_parallel_safe)
- ordered_rel->consider_parallel = true;
+ if (input_rel->consider_parallel && output_target_parallel_safe &&
+ is_parallel_safe(root, (Node *) activeWindows))
+ window_rel->consider_parallel = true;
/*
- * If the input rel belongs to a single FDW, so does the ordered_rel.
+ * If the input rel belongs to a single FDW, so does the window rel.
*/
- ordered_rel->serverid = input_rel->serverid;
- ordered_rel->userid = input_rel->userid;
- ordered_rel->useridiscurrent = input_rel->useridiscurrent;
- ordered_rel->fdwroutine = input_rel->fdwroutine;
-
- foreach(lc, input_rel->pathlist)
- {
- Path *path = (Path *) lfirst(lc);
- bool is_sorted;
-
- is_sorted = pathkeys_contained_in(root->sort_pathkeys,
- path->pathkeys);
- if (path == cheapest_input_path || is_sorted)
- {
- if (!is_sorted)
- {
- /* An explicit sort here can take advantage of LIMIT */
- path = (Path *) create_sort_path(root,
- ordered_rel,
- path,
- root->sort_pathkeys,
- limit_tuples);
- }
-
- /* Add projection step if needed */
- if (path->pathtarget != target)
- path = apply_projection_to_path(root, ordered_rel,
- path, target);
-
- add_path(ordered_rel, path);
- }
- }
+ window_rel->serverid = input_rel->serverid;
+ window_rel->userid = input_rel->userid;
+ window_rel->useridiscurrent = input_rel->useridiscurrent;
+ window_rel->fdwroutine = input_rel->fdwroutine;
/*
- * generate_gather_paths() will have already generated a simple Gather
- * path for the best parallel path, if any, and the loop above will have
- * considered sorting it. Similarly, generate_gather_paths() will also
- * have generated order-preserving Gather Merge plans which can be used
- * without sorting if they happen to match the sort_pathkeys, and the loop
- * above will have handled those as well. However, there's one more
- * possibility: it may make sense to sort the cheapest partial path
- * according to the required output order and then use Gather Merge.
+ * Consider computing window functions starting from the existing
+ * cheapest-total path (which will likely require a sort) as well as any
+ * existing paths that satisfy root->window_pathkeys (which won't).
*/
- if (ordered_rel->consider_parallel && root->sort_pathkeys != NIL &&
- input_rel->partial_pathlist != NIL)
+ foreach(lc, input_rel->pathlist)
{
- Path *cheapest_partial_path;
-
- cheapest_partial_path = linitial(input_rel->partial_pathlist);
-
- /*
- * If cheapest partial path doesn't need a sort, this is redundant
- * with what's already been tried.
- */
- if (!pathkeys_contained_in(root->sort_pathkeys,
- cheapest_partial_path->pathkeys))
- {
- Path *path;
- double total_groups;
-
- path = (Path *) create_sort_path(root,
- ordered_rel,
- cheapest_partial_path,
- root->sort_pathkeys,
- limit_tuples);
-
- total_groups = cheapest_partial_path->rows *
- cheapest_partial_path->parallel_workers;
- path = (Path *)
- create_gather_merge_path(root, ordered_rel,
- path,
- path->pathtarget,
- root->sort_pathkeys, NULL,
- &total_groups);
-
- /* Add projection step if needed */
- if (path->pathtarget != target)
- path = apply_projection_to_path(root, ordered_rel,
- path, target);
+ Path *path = (Path *) lfirst(lc);
- add_path(ordered_rel, path);
- }
+ if (path == input_rel->cheapest_total_path ||
+ pathkeys_contained_in(root->window_pathkeys, path->pathkeys))
+ create_one_window_path(root,
+ window_rel,
+ path,
+ input_target,
+ output_target,
+ tlist,
+ wflists,
+ activeWindows);
}
/*
* If there is an FDW that's responsible for all baserels of the query,
* let it consider adding ForeignPaths.
*/
- if (ordered_rel->fdwroutine &&
- ordered_rel->fdwroutine->GetForeignUpperPaths)
- ordered_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_ORDERED,
- input_rel, ordered_rel,
- NULL);
+ if (window_rel->fdwroutine &&
+ window_rel->fdwroutine->GetForeignUpperPaths)
+ window_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_WINDOW,
+ input_rel, window_rel,
+ NULL);
/* Let extensions possibly add some more paths */
if (create_upper_paths_hook)
- (*create_upper_paths_hook) (root, UPPERREL_ORDERED,
- input_rel, ordered_rel, NULL);
-
- /*
- * No need to bother with set_cheapest here; grouping_planner does not
- * need us to do it.
- */
- Assert(ordered_rel->pathlist != NIL);
-
- return ordered_rel;
-}
-
-
-/*
- * make_group_input_target
- * Generate appropriate PathTarget for initial input to grouping nodes.
- *
- * If there is grouping or aggregation, the scan/join subplan cannot emit
- * the query's final targetlist; for example, it certainly can't emit any
- * aggregate function calls. This routine generates the correct target
- * for the scan/join subplan.
- *
- * The query target list passed from the parser already contains entries
- * for all ORDER BY and GROUP BY expressions, but it will not have entries
- * for variables used only in HAVING clauses; so we need to add those
- * variables to the subplan target list. Also, we flatten all expressions
- * except GROUP BY items into their component variables; other expressions
- * will be computed by the upper plan nodes rather than by the subplan.
- * For example, given a query like
- * SELECT a+b,SUM(c+d) FROM table GROUP BY a+b;
- * we want to pass this targetlist to the subplan:
- * a+b,c,d
- * where the a+b target will be used by the Sort/Group steps, and the
- * other targets will be used for computing the final results.
- *
- * 'final_target' is the query's final target list (in PathTarget form)
+ (*create_upper_paths_hook) (root, UPPERREL_WINDOW,
+ input_rel, window_rel, NULL);
+
+ /* Now choose the best path(s) */
+ set_cheapest(window_rel);
+
+ return window_rel;
+}
+
+/*
+ * Stack window-function implementation steps atop the given Path, and
+ * add the result to window_rel.
*
- * The result is the PathTarget to be computed by the Paths returned from
- * query_planner().
+ * window_rel: upperrel to contain result
+ * path: input Path to use (must return input_target)
+ * input_target: result of make_window_input_target
+ * output_target: what the topmost WindowAggPath should return
+ * tlist: query's target list (needed to look up pathkeys)
+ * wflists: result of find_window_functions
+ * activeWindows: result of select_active_windows
*/
-static PathTarget *
-make_group_input_target(PlannerInfo *root, PathTarget *final_target)
+static void
+create_one_window_path(PlannerInfo *root,
+ RelOptInfo *window_rel,
+ Path *path,
+ PathTarget *input_target,
+ PathTarget *output_target,
+ List *tlist,
+ WindowFuncLists *wflists,
+ List *activeWindows)
{
- Query *parse = root->parse;
- PathTarget *input_target;
- List *non_group_cols;
- List *non_group_vars;
- int i;
- ListCell *lc;
+ PathTarget *window_target;
+ ListCell *l;
/*
- * We must build a target containing all grouping columns, plus any other
- * Vars mentioned in the query's targetlist and HAVING qual.
+ * Since each window clause could require a different sort order, we stack
+ * up a WindowAgg node for each clause, with sort steps between them as
+ * needed. (We assume that select_active_windows chose a good order for
+ * executing the clauses in.)
+ *
+ * input_target should contain all Vars and Aggs needed for the result.
+ * (In some cases we wouldn't need to propagate all of these all the way
+ * to the top, since they might only be needed as inputs to WindowFuncs.
+ * It's probably not worth trying to optimize that though.) It must also
+ * contain all window partitioning and sorting expressions, to ensure
+ * they're computed only once at the bottom of the stack (that's critical
+ * for volatile functions). As we climb up the stack, we'll add outputs
+ * for the WindowFuncs computed at each level.
*/
- input_target = create_empty_pathtarget();
- non_group_cols = NIL;
+ window_target = input_target;
- i = 0;
- foreach(lc, final_target->exprs)
+ foreach(l, activeWindows)
{
- Expr *expr = (Expr *) lfirst(lc);
- Index sgref = get_pathtarget_sortgroupref(final_target, i);
+ WindowClause *wc = lfirst_node(WindowClause, l);
+ List *window_pathkeys;
- if (sgref && parse->groupClause &&
- get_sortgroupref_clause_noerr(sgref, parse->groupClause) != NULL)
+ window_pathkeys = make_pathkeys_for_window(root,
+ wc,
+ tlist);
+
+ /* Sort if necessary */
+ if (!pathkeys_contained_in(window_pathkeys, path->pathkeys))
+ {
+ path = (Path *) create_sort_path(root, window_rel,
+ path,
+ window_pathkeys,
+ -1.0);
+ }
+
+ if (lnext(l))
{
/*
- * It's a grouping column, so add it to the input target as-is.
+ * Add the current WindowFuncs to the output target for this
+ * intermediate WindowAggPath. We must copy window_target to
+ * avoid changing the previous path's target.
+ *
+ * Note: a WindowFunc adds nothing to the target's eval costs; but
+ * we do need to account for the increase in tlist width.
*/
- add_column_to_pathtarget(input_target, expr, sgref);
+ ListCell *lc2;
+
+ window_target = copy_pathtarget(window_target);
+ foreach(lc2, wflists->windowFuncs[wc->winref])
+ {
+ WindowFunc *wfunc = lfirst_node(WindowFunc, lc2);
+
+ add_column_to_pathtarget(window_target, (Expr *) wfunc, 0);
+ window_target->width += get_typavgwidth(wfunc->wintype, -1);
+ }
}
else
{
- /*
- * Non-grouping column, so just remember the expression for later
- * call to pull_var_clause.
- */
- non_group_cols = lappend(non_group_cols, expr);
+ /* Install the goal target in the topmost WindowAgg */
+ window_target = output_target;
}
- i++;
+ path = (Path *)
+ create_windowagg_path(root, window_rel, path, window_target,
+ wflists->windowFuncs[wc->winref],
+ wc,
+ window_pathkeys);
}
- /*
- * If there's a HAVING clause, we'll need the Vars it uses, too.
- */
- if (parse->havingQual)
- non_group_cols = lappend(non_group_cols, parse->havingQual);
-
- /*
- * Pull out all the Vars mentioned in non-group cols (plus HAVING), and
- * add them to the input target if not already present. (A Var used
- * directly as a GROUP BY item will be present already.) Note this
- * includes Vars used in resjunk items, so we are covering the needs of
- * ORDER BY and window specifications. Vars used within Aggrefs and
- * WindowFuncs will be pulled out here, too.
- */
- non_group_vars = pull_var_clause((Node *) non_group_cols,
- PVC_RECURSE_AGGREGATES |
- PVC_RECURSE_WINDOWFUNCS |
- PVC_INCLUDE_PLACEHOLDERS);
- add_new_columns_to_pathtarget(input_target, non_group_vars);
-
- /* clean up cruft */
- list_free(non_group_vars);
- list_free(non_group_cols);
-
- /* XXX this causes some redundant cost calculation ... */
- return set_pathtarget_cost_width(root, input_target);
+ add_path(window_rel, path);
}
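/*
 * For example, with hypothetical clauses OVER (PARTITION BY a ORDER BY b)
 * and OVER (ORDER BY c), the loop above produces something like
 * WindowAgg(Sort(c, WindowAgg(Sort(a, b, Scan)))) -- one WindowAgg per
 * clause, with a Sort inserted wherever the incoming pathkeys don't
 * already match.
 */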
/*
- * make_partial_grouping_target
- * Generate appropriate PathTarget for output of partial aggregate
- * (or partial grouping, if there are no aggregates) nodes.
+ * create_distinct_paths
*
- * A partial aggregation node needs to emit all the same aggregates that
- * a regular aggregation node would, plus any aggregates used in HAVING;
- * except that the Aggref nodes should be marked as partial aggregates.
+ * Build a new upperrel containing Paths for SELECT DISTINCT evaluation.
*
- * In addition, we'd better emit any Vars and PlaceholderVars that are
- * used outside of Aggrefs in the aggregation tlist and HAVING. (Presumably,
- * these would be Vars that are grouped by or used in grouping expressions.)
+ * input_rel: contains the source-data Paths
*
- * grouping_target is the tlist to be emitted by the topmost aggregation step.
- * havingQual represents the HAVING clause.
+ * Note: input paths should already compute the desired pathtarget, since
+ * Sort/Unique won't project anything.
*/
-static PathTarget *
-make_partial_grouping_target(PlannerInfo *root,
- PathTarget *grouping_target,
- Node *havingQual)
+static RelOptInfo *
+create_distinct_paths(PlannerInfo *root,
+ RelOptInfo *input_rel)
{
Query *parse = root->parse;
- PathTarget *partial_target;
- List *non_group_cols;
- List *non_group_exprs;
- int i;
+ Path *cheapest_input_path = input_rel->cheapest_total_path;
+ RelOptInfo *distinct_rel;
+ double numDistinctRows;
+ bool allow_hash;
+ Path *path;
ListCell *lc;
- partial_target = create_empty_pathtarget();
- non_group_cols = NIL;
-
- i = 0;
- foreach(lc, grouping_target->exprs)
- {
- Expr *expr = (Expr *) lfirst(lc);
- Index sgref = get_pathtarget_sortgroupref(grouping_target, i);
-
- if (sgref && parse->groupClause &&
- get_sortgroupref_clause_noerr(sgref, parse->groupClause) != NULL)
- {
- /*
- * It's a grouping column, so add it to the partial_target as-is.
- * (This allows the upper agg step to repeat the grouping calcs.)
- */
- add_column_to_pathtarget(partial_target, expr, sgref);
- }
- else
- {
- /*
- * Non-grouping column, so just remember the expression for later
- * call to pull_var_clause.
- */
- non_group_cols = lappend(non_group_cols, expr);
- }
-
- i++;
- }
+ /* For now, do all work in the (DISTINCT, NULL) upperrel */
+ distinct_rel = fetch_upper_rel(root, UPPERREL_DISTINCT, NULL);
/*
- * If there's a HAVING clause, we'll need the Vars/Aggrefs it uses, too.
+ * We don't compute anything at this level, so distinct_rel will be
+ * parallel-safe if the input rel is parallel-safe. In particular, if
+ * there is a DISTINCT ON (...) clause, any path for the input_rel will
+ * output those expressions, and will not be parallel-safe unless those
+ * expressions are parallel-safe.
*/
- if (havingQual)
- non_group_cols = lappend(non_group_cols, havingQual);
+ distinct_rel->consider_parallel = input_rel->consider_parallel;
/*
- * Pull out all the Vars, PlaceHolderVars, and Aggrefs mentioned in
- * non-group cols (plus HAVING), and add them to the partial_target if not
- * already present. (An expression used directly as a GROUP BY item will
- * be present already.) Note this includes Vars used in resjunk items, so
- * we are covering the needs of ORDER BY and window specifications.
+ * If the input rel belongs to a single FDW, so does the distinct_rel.
*/
- non_group_exprs = pull_var_clause((Node *) non_group_cols,
- PVC_INCLUDE_AGGREGATES |
- PVC_RECURSE_WINDOWFUNCS |
- PVC_INCLUDE_PLACEHOLDERS);
+ distinct_rel->serverid = input_rel->serverid;
+ distinct_rel->userid = input_rel->userid;
+ distinct_rel->useridiscurrent = input_rel->useridiscurrent;
+ distinct_rel->fdwroutine = input_rel->fdwroutine;
+
+ /* Estimate number of distinct rows there will be */
+ if (parse->groupClause || parse->groupingSets || parse->hasAggs ||
+ root->hasHavingQual)
+ {
+ /*
+ * If there was grouping or aggregation, use the number of input rows
+ * as the estimated number of DISTINCT rows (ie, assume the input is
+ * already mostly unique).
+ */
+ numDistinctRows = cheapest_input_path->rows;
+ }
+ else
+ {
+ /*
+ * Otherwise, the UNIQUE filter has effects comparable to GROUP BY.
+ */
+ List *distinctExprs;
- add_new_columns_to_pathtarget(partial_target, non_group_exprs);
+ distinctExprs = get_sortgrouplist_exprs(parse->distinctClause,
+ parse->targetList);
+ numDistinctRows = estimate_num_groups(root, distinctExprs,
+ cheapest_input_path->rows,
+ NULL);
+ }
/*
- * Adjust Aggrefs to put them in partial mode. At this point all Aggrefs
- * are at the top level of the target list, so we can just scan the list
- * rather than recursing through the expression trees.
+ * Consider sort-based implementations of DISTINCT, if possible.
*/
- foreach(lc, partial_target->exprs)
+ if (grouping_is_sortable(parse->distinctClause))
{
- Aggref *aggref = (Aggref *) lfirst(lc);
-
- if (IsA(aggref, Aggref))
- {
- Aggref *newaggref;
-
- /*
- * We shouldn't need to copy the substructure of the Aggref node,
- * but flat-copy the node itself to avoid damaging other trees.
- */
- newaggref = makeNode(Aggref);
- memcpy(newaggref, aggref, sizeof(Aggref));
+ /*
+ * First, if we have any adequately-presorted paths, just stick a
+ * Unique node on those. Then consider doing an explicit sort of the
+ * cheapest input path and Unique'ing that.
+ *
+ * When we have DISTINCT ON, we must sort by the more rigorous of
+ * DISTINCT and ORDER BY, else it won't have the desired behavior.
+ * Also, if we do have to do an explicit sort, we might as well use
+ * the more rigorous ordering to avoid a second sort later. (Note
+ * that the parser will have ensured that one clause is a prefix of
+ * the other.)
+ */
+ List *needed_pathkeys;
- /* For now, assume serialization is required */
- mark_partial_aggref(newaggref, AGGSPLIT_INITIAL_SERIAL);
+ if (parse->hasDistinctOn &&
+ list_length(root->distinct_pathkeys) <
+ list_length(root->sort_pathkeys))
+ needed_pathkeys = root->sort_pathkeys;
+ else
+ needed_pathkeys = root->distinct_pathkeys;
- lfirst(lc) = newaggref;
- }
- }
+ foreach(lc, input_rel->pathlist)
+ {
+ Path *path = (Path *) lfirst(lc);
- /* clean up cruft */
- list_free(non_group_exprs);
- list_free(non_group_cols);
+ if (pathkeys_contained_in(needed_pathkeys, path->pathkeys))
+ {
+ add_path(distinct_rel, (Path *)
+ create_upper_unique_path(root, distinct_rel,
+ path,
+ list_length(root->distinct_pathkeys),
+ numDistinctRows));
+ }
+ }
- /* XXX this causes some redundant cost calculation ... */
- return set_pathtarget_cost_width(root, partial_target);
-}
+ /* For explicit-sort case, always use the more rigorous clause */
+ if (list_length(root->distinct_pathkeys) <
+ list_length(root->sort_pathkeys))
+ {
+ needed_pathkeys = root->sort_pathkeys;
+ /* Assert checks that parser didn't mess up... */
+ Assert(pathkeys_contained_in(root->distinct_pathkeys,
+ needed_pathkeys));
+ }
+ else
+ needed_pathkeys = root->distinct_pathkeys;
-/*
- * mark_partial_aggref
- * Adjust an Aggref to make it represent a partial-aggregation step.
- *
- * The Aggref node is modified in-place; caller must do any copying required.
- */
-void
-mark_partial_aggref(Aggref *agg, AggSplit aggsplit)
-{
- /* aggtranstype should be computed by this point */
- Assert(OidIsValid(agg->aggtranstype));
- /* ... but aggsplit should still be as the parser left it */
- Assert(agg->aggsplit == AGGSPLIT_SIMPLE);
+ path = cheapest_input_path;
+ if (!pathkeys_contained_in(needed_pathkeys, path->pathkeys))
+ path = (Path *) create_sort_path(root, distinct_rel,
+ path,
+ needed_pathkeys,
+ -1.0);
- /* Mark the Aggref with the intended partial-aggregation mode */
- agg->aggsplit = aggsplit;
+ add_path(distinct_rel, (Path *)
+ create_upper_unique_path(root, distinct_rel,
+ path,
+ list_length(root->distinct_pathkeys),
+ numDistinctRows));
+ }
/*
- * Adjust result type if needed. Normally, a partial aggregate returns
- * the aggregate's transition type; but if that's INTERNAL and we're
- * serializing, it returns BYTEA instead.
+ * Consider hash-based implementations of DISTINCT, if possible.
+ *
+ * If we were not able to make any other types of path, we *must* hash or
+ * die trying. If we do have other choices, there are several things that
+ * should prevent selection of hashing: if the query uses DISTINCT ON
+ * (because it won't really have the expected behavior if we hash), or if
+ * enable_hashagg is off, or if it looks like the hashtable will exceed
+ * work_mem.
+ *
+ * Note: grouping_is_hashable() is much more expensive to check than the
+ * other gating conditions, so we want to do it last.
*/
- if (DO_AGGSPLIT_SKIPFINAL(aggsplit))
- {
- if (agg->aggtranstype == INTERNALOID && DO_AGGSPLIT_SERIALIZE(aggsplit))
- agg->aggtype = BYTEAOID;
- else
- agg->aggtype = agg->aggtranstype;
- }
-}
-
-/*
- * postprocess_setop_tlist
- * Fix up targetlist returned by plan_set_operations().
- *
- * We need to transpose sort key info from the orig_tlist into new_tlist.
- * NOTE: this would not be good enough if we supported resjunk sort keys
- * for results of set operations --- then, we'd need to project a whole
- * new tlist to evaluate the resjunk columns. For now, just ereport if we
- * find any resjunk columns in orig_tlist.
- */
-static List *
-postprocess_setop_tlist(List *new_tlist, List *orig_tlist)
-{
- ListCell *l;
- ListCell *orig_tlist_item = list_head(orig_tlist);
-
- foreach(l, new_tlist)
+ if (distinct_rel->pathlist == NIL)
+ allow_hash = true; /* we have no alternatives */
+ else if (parse->hasDistinctOn || !enable_hashagg)
+ allow_hash = false; /* policy-based decision not to hash */
+ else
{
- TargetEntry *new_tle = lfirst_node(TargetEntry, l);
- TargetEntry *orig_tle;
+ Size hashentrysize;
- /* ignore resjunk columns in setop result */
- if (new_tle->resjunk)
- continue;
+ /* Estimate per-hash-entry space at tuple width... */
+ hashentrysize = MAXALIGN(cheapest_input_path->pathtarget->width) +
+ MAXALIGN(SizeofMinimalTupleHeader);
+ /* plus the per-hash-entry overhead */
+ hashentrysize += hash_agg_entry_size(0);
- Assert(orig_tlist_item != NULL);
- orig_tle = lfirst_node(TargetEntry, orig_tlist_item);
- orig_tlist_item = lnext(orig_tlist_item);
- if (orig_tle->resjunk) /* should not happen */
- elog(ERROR, "resjunk output columns are not implemented");
- Assert(new_tle->resno == orig_tle->resno);
- new_tle->ressortgroupref = orig_tle->ressortgroupref;
+ /* Allow hashing only if hashtable is predicted to fit in work_mem */
+ allow_hash = (hashentrysize * numDistinctRows <= work_mem * 1024L);
}
- if (orig_tlist_item != NULL)
- elog(ERROR, "resjunk output columns are not implemented");
- return new_tlist;
-}
-
-/*
- * select_active_windows
- * Create a list of the "active" window clauses (ie, those referenced
- * by non-deleted WindowFuncs) in the order they are to be executed.
- */
-static List *
-select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
-{
- List *result;
- List *actives;
- ListCell *lc;
- /* First, make a list of the active windows */
- actives = NIL;
- foreach(lc, root->parse->windowClause)
+ if (allow_hash && grouping_is_hashable(parse->distinctClause))
{
- WindowClause *wc = lfirst_node(WindowClause, lc);
-
- /* It's only active if wflists shows some related WindowFuncs */
- Assert(wc->winref <= wflists->maxWinRef);
- if (wflists->windowFuncs[wc->winref] != NIL)
- actives = lappend(actives, wc);
+ /* Generate hashed aggregate path --- no sort needed */
+ add_path(distinct_rel, (Path *)
+ create_agg_path(root,
+ distinct_rel,
+ cheapest_input_path,
+ cheapest_input_path->pathtarget,
+ AGG_HASHED,
+ AGGSPLIT_SIMPLE,
+ parse->distinctClause,
+ NIL,
+ NULL,
+ numDistinctRows));
}
+ /* Give a helpful error if we failed to find any implementation */
+ if (distinct_rel->pathlist == NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("could not implement DISTINCT"),
+ errdetail("Some of the datatypes only support hashing, while others only support sorting.")));
+
/*
- * Now, ensure that windows with identical partitioning/ordering clauses
- * are adjacent in the list. This is required by the SQL standard, which
- * says that only one sort is to be used for such windows, even if they
- * are otherwise distinct (eg, different names or framing clauses).
- *
- * There is room to be much smarter here, for example detecting whether
- * one window's sort keys are a prefix of another's (so that sorting for
- * the latter would do for the former), or putting windows first that
- * match a sort order available for the underlying query. For the moment
- * we are content with meeting the spec.
+ * If there is an FDW that's responsible for all baserels of the query,
+ * let it consider adding ForeignPaths.
*/
- result = NIL;
- while (actives != NIL)
- {
- WindowClause *wc = linitial_node(WindowClause, actives);
- ListCell *prev;
- ListCell *next;
-
- /* Move wc from actives to result */
- actives = list_delete_first(actives);
- result = lappend(result, wc);
+ if (distinct_rel->fdwroutine &&
+ distinct_rel->fdwroutine->GetForeignUpperPaths)
+ distinct_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_DISTINCT,
+ input_rel, distinct_rel,
+ NULL);
- /* Now move any matching windows from actives to result */
- prev = NULL;
- for (lc = list_head(actives); lc; lc = next)
- {
- WindowClause *wc2 = lfirst_node(WindowClause, lc);
+ /* Let extensions possibly add some more paths */
+ if (create_upper_paths_hook)
+ (*create_upper_paths_hook) (root, UPPERREL_DISTINCT,
+ input_rel, distinct_rel, NULL);
- next = lnext(lc);
- /* framing options are NOT to be compared here! */
- if (equal(wc->partitionClause, wc2->partitionClause) &&
- equal(wc->orderClause, wc2->orderClause))
- {
- actives = list_delete_cell(actives, lc, prev);
- result = lappend(result, wc2);
- }
- else
- prev = lc;
- }
- }
+ /* Now choose the best path(s) */
+ set_cheapest(distinct_rel);
- return result;
+ return distinct_rel;
}
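(The sort-vs-hash decision this function makes is easy to observe with the tables from the example above; which plan wins depends on enable_hashagg and work_mem:

explain select distinct id from b;
-- typically a HashAggregate, since the ~1000 groups easily fit in work_mem
set enable_hashagg = off;
explain select distinct id from b;
-- now a Sort (or an already-sorted path) with a Unique on top
reset enable_hashagg;
)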
/*
- * make_window_input_target
- * Generate appropriate PathTarget for initial input to WindowAgg nodes.
- *
- * When the query has window functions, this function computes the desired
- * target to be computed by the node just below the first WindowAgg.
- * This tlist must contain all values needed to evaluate the window functions,
- * compute the final target list, and perform any required final sort step.
- * If multiple WindowAggs are needed, each intermediate one adds its window
- * function results onto this base tlist; only the topmost WindowAgg computes
- * the actual desired target list.
- *
- * This function is much like make_group_input_target, though not quite enough
- * like it to share code. As in that function, we flatten most expressions
- * into their component variables. But we do not want to flatten window
- * PARTITION BY/ORDER BY clauses, since that might result in multiple
- * evaluations of them, which would be bad (possibly even resulting in
- * inconsistent answers, if they contain volatile functions).
- * Also, we must not flatten GROUP BY clauses that were left unflattened by
- * make_group_input_target, because we may no longer have access to the
- * individual Vars in them.
+ * create_ordered_paths
*
- * Another key difference from make_group_input_target is that we don't
- * flatten Aggref expressions, since those are to be computed below the
- * window functions and just referenced like Vars above that.
+ * Build a new upperrel containing Paths for ORDER BY evaluation.
*
- * 'final_target' is the query's final target list (in PathTarget form)
- * 'activeWindows' is the list of active windows previously identified by
- * select_active_windows.
+ * All paths in the result must satisfy the ORDER BY ordering.
+ * The only new path we need consider is an explicit sort on the
+ * cheapest-total existing path.
*
- * The result is the PathTarget to be computed by the plan node immediately
- * below the first WindowAgg node.
+ * input_rel: contains the source-data Paths
+ * target: the output tlist the result Paths must emit
+ * limit_tuples: estimated bound on the number of output tuples,
+ * or -1 if no LIMIT or couldn't estimate
*/
-static PathTarget *
-make_window_input_target(PlannerInfo *root,
- PathTarget *final_target,
- List *activeWindows)
+static RelOptInfo *
+create_ordered_paths(PlannerInfo *root,
+ RelOptInfo *input_rel,
+ PathTarget *target,
+ bool target_parallel_safe,
+ double limit_tuples)
{
- Query *parse = root->parse;
- PathTarget *input_target;
- Bitmapset *sgrefs;
- List *flattenable_cols;
- List *flattenable_vars;
- int i;
+ Path *cheapest_input_path = input_rel->cheapest_total_path;
+ RelOptInfo *ordered_rel;
ListCell *lc;
- Assert(parse->hasWindowFuncs);
+ /* For now, do all work in the (ORDERED, NULL) upperrel */
+ ordered_rel = fetch_upper_rel(root, UPPERREL_ORDERED, NULL);
/*
- * Collect the sortgroupref numbers of window PARTITION/ORDER BY clauses
- * into a bitmapset for convenient reference below.
+ * If the input relation is not parallel-safe, then the ordered relation
+ * can't be parallel-safe, either. Otherwise, it's parallel-safe if the
+ * target list is parallel-safe.
*/
- sgrefs = NULL;
- foreach(lc, activeWindows)
+ if (input_rel->consider_parallel && target_parallel_safe)
+ ordered_rel->consider_parallel = true;
+
+ /*
+ * If the input rel belongs to a single FDW, so does the ordered_rel.
+ */
+ ordered_rel->serverid = input_rel->serverid;
+ ordered_rel->userid = input_rel->userid;
+ ordered_rel->useridiscurrent = input_rel->useridiscurrent;
+ ordered_rel->fdwroutine = input_rel->fdwroutine;
+
+ foreach(lc, input_rel->pathlist)
{
- WindowClause *wc = lfirst_node(WindowClause, lc);
- ListCell *lc2;
+ Path *path = (Path *) lfirst(lc);
+ bool is_sorted;
- foreach(lc2, wc->partitionClause)
+ is_sorted = pathkeys_contained_in(root->sort_pathkeys,
+ path->pathkeys);
+ if (path == cheapest_input_path || is_sorted)
{
- SortGroupClause *sortcl = lfirst_node(SortGroupClause, lc2);
+ if (!is_sorted)
+ {
+ /* An explicit sort here can take advantage of LIMIT */
+ path = (Path *) create_sort_path(root,
+ ordered_rel,
+ path,
+ root->sort_pathkeys,
+ limit_tuples);
+ }
- sgrefs = bms_add_member(sgrefs, sortcl->tleSortGroupRef);
- }
- foreach(lc2, wc->orderClause)
- {
- SortGroupClause *sortcl = lfirst_node(SortGroupClause, lc2);
+ /* Add projection step if needed */
+ if (path->pathtarget != target)
+ path = apply_projection_to_path(root, ordered_rel,
+ path, target);
- sgrefs = bms_add_member(sgrefs, sortcl->tleSortGroupRef);
+ add_path(ordered_rel, path);
}
}
- /* Add in sortgroupref numbers of GROUP BY clauses, too */
- foreach(lc, parse->groupClause)
- {
- SortGroupClause *grpcl = lfirst_node(SortGroupClause, lc);
-
- sgrefs = bms_add_member(sgrefs, grpcl->tleSortGroupRef);
- }
-
/*
- * Construct a target containing all the non-flattenable targetlist items,
- * and save aside the others for a moment.
+ * generate_gather_paths() will have already generated a simple Gather
+ * path for the best parallel path, if any, and the loop above will have
+ * considered sorting it. Similarly, generate_gather_paths() will also
+ * have generated order-preserving Gather Merge plans which can be used
+ * without sorting if they happen to match the sort_pathkeys, and the loop
+ * above will have handled those as well. However, there's one more
+ * possibility: it may make sense to sort the cheapest partial path
+ * according to the required output order and then use Gather Merge.
*/
- input_target = create_empty_pathtarget();
- flattenable_cols = NIL;
-
- i = 0;
- foreach(lc, final_target->exprs)
+ if (ordered_rel->consider_parallel && root->sort_pathkeys != NIL &&
+ input_rel->partial_pathlist != NIL)
{
- Expr *expr = (Expr *) lfirst(lc);
- Index sgref = get_pathtarget_sortgroupref(final_target, i);
+ Path *cheapest_partial_path;
+
+ cheapest_partial_path = linitial(input_rel->partial_pathlist);
/*
- * Don't want to deconstruct window clauses or GROUP BY items. (Note
- * that such items can't contain window functions, so it's okay to
- * compute them below the WindowAgg nodes.)
+ * If cheapest partial path doesn't need a sort, this is redundant
+ * with what's already been tried.
*/
- if (sgref != 0 && bms_is_member(sgref, sgrefs))
- {
- /*
- * Don't want to deconstruct this value, so add it to the input
- * target as-is.
- */
- add_column_to_pathtarget(input_target, expr, sgref);
- }
- else
+ if (!pathkeys_contained_in(root->sort_pathkeys,
+ cheapest_partial_path->pathkeys))
{
- /*
- * Column is to be flattened, so just remember the expression for
- * later call to pull_var_clause.
- */
- flattenable_cols = lappend(flattenable_cols, expr);
- }
+ Path *path;
+ double total_groups;
- i++;
+ path = (Path *) create_sort_path(root,
+ ordered_rel,
+ cheapest_partial_path,
+ root->sort_pathkeys,
+ limit_tuples);
+
+ total_groups = cheapest_partial_path->rows *
+ cheapest_partial_path->parallel_workers;
+ path = (Path *)
+ create_gather_merge_path(root, ordered_rel,
+ path,
+ path->pathtarget,
+ root->sort_pathkeys, NULL,
+ &total_groups);
+
+ /* Add projection step if needed */
+ if (path->pathtarget != target)
+ path = apply_projection_to_path(root, ordered_rel,
+ path, target);
+
+ add_path(ordered_rel, path);
+ }
}
/*
- * Pull out all the Vars and Aggrefs mentioned in flattenable columns, and
- * add them to the input target if not already present. (Some might be
- * there already because they're used directly as window/group clauses.)
- *
- * Note: it's essential to use PVC_INCLUDE_AGGREGATES here, so that any
- * Aggrefs are placed in the Agg node's tlist and not left to be computed
- * at higher levels. On the other hand, we should recurse into
- * WindowFuncs to make sure their input expressions are available.
+ * If there is an FDW that's responsible for all baserels of the query,
+ * let it consider adding ForeignPaths.
*/
- flattenable_vars = pull_var_clause((Node *) flattenable_cols,
- PVC_INCLUDE_AGGREGATES |
- PVC_RECURSE_WINDOWFUNCS |
- PVC_INCLUDE_PLACEHOLDERS);
- add_new_columns_to_pathtarget(input_target, flattenable_vars);
-
- /* clean up cruft */
- list_free(flattenable_vars);
- list_free(flattenable_cols);
-
- /* XXX this causes some redundant cost calculation ... */
- return set_pathtarget_cost_width(root, input_target);
-}
+ if (ordered_rel->fdwroutine &&
+ ordered_rel->fdwroutine->GetForeignUpperPaths)
+ ordered_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_ORDERED,
+ input_rel, ordered_rel,
+ NULL);
-/*
- * make_pathkeys_for_window
- * Create a pathkeys list describing the required input ordering
- * for the given WindowClause.
- *
- * The required ordering is first the PARTITION keys, then the ORDER keys.
- * In the future we might try to implement windowing using hashing, in which
- * case the ordering could be relaxed, but for now we always sort.
- *
- * Caution: if you change this, see createplan.c's get_column_info_for_window!
- */
-static List *
-make_pathkeys_for_window(PlannerInfo *root, WindowClause *wc,
- List *tlist)
-{
- List *window_pathkeys;
- List *window_sortclauses;
+ /* Let extensions possibly add some more paths */
+ if (create_upper_paths_hook)
+ (*create_upper_paths_hook) (root, UPPERREL_ORDERED,
+ input_rel, ordered_rel, NULL);
- /* Throw error if can't sort */
- if (!grouping_is_sortable(wc->partitionClause))
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("could not implement window PARTITION BY"),
- errdetail("Window partitioning columns must be of sortable datatypes.")));
- if (!grouping_is_sortable(wc->orderClause))
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("could not implement window ORDER BY"),
- errdetail("Window ordering columns must be of sortable datatypes.")));
+ /*
+ * No need to bother with set_cheapest here; grouping_planner does not
+ * need us to do it.
+ */
+ Assert(ordered_rel->pathlist != NIL);
- /* Okay, make the combined pathkeys */
- window_sortclauses = list_concat(list_copy(wc->partitionClause),
- list_copy(wc->orderClause));
- window_pathkeys = make_pathkeys_for_sortclauses(root,
- window_sortclauses,
- tlist);
- list_free(window_sortclauses);
- return window_pathkeys;
+ return ordered_rel;
}
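(The last consideration above, sorting the cheapest partial path and then using Gather Merge, can be coaxed out of the small example tables by making parallelism artificially cheap; whether it actually wins the cost comparison depends on these settings:

set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
set min_parallel_table_scan_size = 0;
explain select * from b order by id;
-- should consider Gather Merge over a Sort of a Parallel Seq Scan on b
)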
+
/*
- * make_sort_input_target
- * Generate appropriate PathTarget for initial input to Sort step.
- *
- * If the query has ORDER BY, this function chooses the target to be computed
- * by the node just below the Sort (and DISTINCT, if any, since Unique can't
- * project) steps. This might or might not be identical to the query's final
- * output target.
- *
- * The main argument for keeping the sort-input tlist the same as the final
- * is that we avoid a separate projection node (which will be needed if
- * they're different, because Sort can't project). However, there are also
- * advantages to postponing tlist evaluation till after the Sort: it ensures
- * a consistent order of evaluation for any volatile functions in the tlist,
- * and if there's also a LIMIT, we can stop the query without ever computing
- * tlist functions for later rows, which is beneficial for both volatile and
- * expensive functions.
- *
- * Our current policy is to postpone volatile expressions till after the sort
- * unconditionally (assuming that that's possible, ie they are in plain tlist
- * columns and not ORDER BY/GROUP BY/DISTINCT columns). We also prefer to
- * postpone set-returning expressions, because running them beforehand would
- * bloat the sort dataset, and because it might cause unexpected output order
- * if the sort isn't stable. However there's a constraint on that: all SRFs
- * in the tlist should be evaluated at the same plan step, so that they can
- * run in sync in nodeProjectSet. So if any SRFs are in sort columns, we
- * mustn't postpone any SRFs. (Note that in principle that policy should
- * probably get applied to the group/window input targetlists too, but we
- * have not done that historically.) Lastly, expensive expressions are
- * postponed if there is a LIMIT, or if root->tuple_fraction shows that
- * partial evaluation of the query is possible (if neither is true, we expect
- * to have to evaluate the expressions for every row anyway), or if there are
- * any volatile or set-returning expressions (since once we've put in a
- * projection at all, it won't cost any more to postpone more stuff).
- *
- * Another issue that could potentially be considered here is that
- * evaluating tlist expressions could result in data that's either wider
- * or narrower than the input Vars, thus changing the volume of data that
- * has to go through the Sort. However, we usually have only a very bad
- * idea of the output width of any expression more complex than a Var,
- * so for now it seems too risky to try to optimize on that basis.
+ * make_group_input_target
+ * Generate appropriate PathTarget for initial input to grouping nodes.
*
- * Note that if we do produce a modified sort-input target, and then the
- * query ends up not using an explicit Sort, no particular harm is done:
- * we'll initially use the modified target for the preceding path nodes,
- * but then change them to the final target with apply_projection_to_path.
- * Moreover, in such a case the guarantees about evaluation order of
- * volatile functions still hold, since the rows are sorted already.
+ * If there is grouping or aggregation, the scan/join subplan cannot emit
+ * the query's final targetlist; for example, it certainly can't emit any
+ * aggregate function calls. This routine generates the correct target
+ * for the scan/join subplan.
*
- * This function has some things in common with make_group_input_target and
- * make_window_input_target, though the detailed rules for what to do are
- * different. We never flatten/postpone any grouping or ordering columns;
- * those are needed before the sort. If we do flatten a particular
- * expression, we leave Aggref and WindowFunc nodes alone, since those were
- * computed earlier.
+ * The query target list passed from the parser already contains entries
+ * for all ORDER BY and GROUP BY expressions, but it will not have entries
+ * for variables used only in HAVING clauses; so we need to add those
+ * variables to the subplan target list. Also, we flatten all expressions
+ * except GROUP BY items into their component variables; other expressions
+ * will be computed by the upper plan nodes rather than by the subplan.
+ * For example, given a query like
+ * SELECT a+b,SUM(c+d) FROM table GROUP BY a+b;
+ * we want to pass this targetlist to the subplan:
+ * a+b,c,d
+ * where the a+b target will be used by the Sort/Group steps, and the
+ * other targets will be used for computing the final results.
*
* 'final_target' is the query's final target list (in PathTarget form)
- * 'have_postponed_srfs' is an output argument, see below
- *
- * The result is the PathTarget to be computed by the plan node immediately
- * below the Sort step (and the Distinct step, if any). This will be
- * exactly final_target if we decide a projection step wouldn't be helpful.
*
- * In addition, *have_postponed_srfs is set to true if we choose to postpone
- * any set-returning functions to after the Sort.
+ * The result is the PathTarget to be computed by the Paths returned from
+ * query_planner().
*/
static PathTarget *
-make_sort_input_target(PlannerInfo *root,
- PathTarget *final_target,
- bool *have_postponed_srfs)
+make_group_input_target(PlannerInfo *root, PathTarget *final_target)
{
Query *parse = root->parse;
PathTarget *input_target;
- int ncols;
- bool *col_is_srf;
- bool *postpone_col;
- bool have_srf;
- bool have_volatile;
- bool have_expensive;
- bool have_srf_sortcols;
- bool postpone_srfs;
- List *postponable_cols;
- List *postponable_vars;
+ List *non_group_cols;
+ List *non_group_vars;
int i;
ListCell *lc;
- /* Shouldn't get here unless query has ORDER BY */
- Assert(parse->sortClause);
-
- *have_postponed_srfs = false; /* default result */
-
- /* Inspect tlist and collect per-column information */
- ncols = list_length(final_target->exprs);
- col_is_srf = (bool *) palloc0(ncols * sizeof(bool));
- postpone_col = (bool *) palloc0(ncols * sizeof(bool));
- have_srf = have_volatile = have_expensive = have_srf_sortcols = false;
+ /*
+ * We must build a target containing all grouping columns, plus any other
+ * Vars mentioned in the query's targetlist and HAVING qual.
+ */
+ input_target = create_empty_pathtarget();
+ non_group_cols = NIL;
i = 0;
foreach(lc, final_target->exprs)
{
Expr *expr = (Expr *) lfirst(lc);
+ Index sgref = get_pathtarget_sortgroupref(final_target, i);
- /*
- * If the column has a sortgroupref, assume it has to be evaluated
- * before sorting. Generally such columns would be ORDER BY, GROUP
- * BY, etc targets. One exception is columns that were removed from
- * GROUP BY by remove_useless_groupby_columns() ... but those would
- * only be Vars anyway. There don't seem to be any cases where it
- * would be worth the trouble to double-check.
- */
- if (get_pathtarget_sortgroupref(final_target, i) == 0)
+ if (sgref && parse->groupClause &&
+ get_sortgroupref_clause_noerr(sgref, parse->groupClause) != NULL)
{
/*
- * Check for SRF or volatile functions. Check the SRF case first
- * because we must know whether we have any postponed SRFs.
+ * It's a grouping column, so add it to the input target as-is.
*/
- if (parse->hasTargetSRFs &&
- expression_returns_set((Node *) expr))
- {
- /* We'll decide below whether these are postponable */
- col_is_srf[i] = true;
- have_srf = true;
- }
- else if (contain_volatile_functions((Node *) expr))
- {
- /* Unconditionally postpone */
- postpone_col[i] = true;
- have_volatile = true;
- }
- else
- {
- /*
- * Else check the cost. XXX it's annoying to have to do this
- * when set_pathtarget_cost_width() just did it. Refactor to
- * allow sharing the work?
- */
- QualCost cost;
-
- cost_qual_eval_node(&cost, (Node *) expr, root);
-
- /*
- * We arbitrarily define "expensive" as "more than 10X
- * cpu_operator_cost". Note this will take in any PL function
- * with default cost.
- */
- if (cost.per_tuple > 10 * cpu_operator_cost)
- {
- postpone_col[i] = true;
- have_expensive = true;
- }
- }
+ add_column_to_pathtarget(input_target, expr, sgref);
}
else
{
- /* For sortgroupref cols, just check if any contain SRFs */
- if (!have_srf_sortcols &&
- parse->hasTargetSRFs &&
- expression_returns_set((Node *) expr))
- have_srf_sortcols = true;
+ /*
+ * Non-grouping column, so just remember the expression for later
+ * call to pull_var_clause.
+ */
+ non_group_cols = lappend(non_group_cols, expr);
}
i++;
}
/*
- * We can postpone SRFs if we have some but none are in sortgroupref cols.
- */
- postpone_srfs = (have_srf && !have_srf_sortcols);
-
- /*
- * If we don't need a post-sort projection, just return final_target.
+ * If there's a HAVING clause, we'll need the Vars it uses, too.
*/
- if (!(postpone_srfs || have_volatile ||
- (have_expensive &&
- (parse->limitCount || root->tuple_fraction > 0))))
- return final_target;
+ if (parse->havingQual)
+ non_group_cols = lappend(non_group_cols, parse->havingQual);
/*
- * Report whether the post-sort projection will contain set-returning
- * functions. This is important because it affects whether the Sort can
- * rely on the query's LIMIT (if any) to bound the number of rows it needs
- * to return.
+ * Pull out all the Vars mentioned in non-group cols (plus HAVING), and
+ * add them to the input target if not already present. (A Var used
+ * directly as a GROUP BY item will be present already.) Note this
+ * includes Vars used in resjunk items, so we are covering the needs of
+ * ORDER BY and window specifications. Vars used within Aggrefs and
+ * WindowFuncs will be pulled out here, too.
*/
- *have_postponed_srfs = postpone_srfs;
+ non_group_vars = pull_var_clause((Node *) non_group_cols,
+ PVC_RECURSE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_INCLUDE_PLACEHOLDERS);
+ add_new_columns_to_pathtarget(input_target, non_group_vars);
- /*
- * Construct the sort-input target, taking all non-postponable columns and
- * then adding Vars, PlaceHolderVars, Aggrefs, and WindowFuncs found in
- * the postponable ones.
- */
- input_target = create_empty_pathtarget();
- postponable_cols = NIL;
+ /* clean up cruft */
+ list_free(non_group_vars);
+ list_free(non_group_cols);
- i = 0;
- foreach(lc, final_target->exprs)
- {
- Expr *expr = (Expr *) lfirst(lc);
+ /* XXX this causes some redundant cost calculation ... */
+ return set_pathtarget_cost_width(root, input_target);
+}
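(With a hypothetical table t(a int, b int, c int, d int) matching the example in the comment above, EXPLAIN VERBOSE shows the flattened target that the scan/join level is asked to emit:

explain (verbose, costs off)
select a+b, sum(c+d) from t group by a+b;
-- the Seq Scan's Output list should be (a + b), c, d; the final a+b and
-- sum(c+d) are computed at the grouping level above it
)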
- if (postpone_col[i] || (postpone_srfs && col_is_srf[i]))
- postponable_cols = lappend(postponable_cols, expr);
- else
- add_column_to_pathtarget(input_target, expr,
- get_pathtarget_sortgroupref(final_target, i));
+/*
+ * mark_partial_aggref
+ * Adjust an Aggref to make it represent a partial-aggregation step.
+ *
+ * The Aggref node is modified in-place; caller must do any copying required.
+ */
+void
+mark_partial_aggref(Aggref *agg, AggSplit aggsplit)
+{
+ /* aggtranstype should be computed by this point */
+ Assert(OidIsValid(agg->aggtranstype));
+ /* ... but aggsplit should still be as the parser left it */
+ Assert(agg->aggsplit == AGGSPLIT_SIMPLE);
- i++;
- }
+ /* Mark the Aggref with the intended partial-aggregation mode */
+ agg->aggsplit = aggsplit;
/*
- * Pull out all the Vars, Aggrefs, and WindowFuncs mentioned in
- * postponable columns, and add them to the sort-input target if not
- * already present. (Some might be there already.) We mustn't
- * deconstruct Aggrefs or WindowFuncs here, since the projection node
- * would be unable to recompute them.
+ * Adjust result type if needed. Normally, a partial aggregate returns
+ * the aggregate's transition type; but if that's INTERNAL and we're
+ * serializing, it returns BYTEA instead.
*/
- postponable_vars = pull_var_clause((Node *) postponable_cols,
- PVC_INCLUDE_AGGREGATES |
- PVC_INCLUDE_WINDOWFUNCS |
- PVC_INCLUDE_PLACEHOLDERS);
- add_new_columns_to_pathtarget(input_target, postponable_vars);
-
- /* clean up cruft */
- list_free(postponable_vars);
- list_free(postponable_cols);
-
- /* XXX this represents even more redundant cost calculation ... */
- return set_pathtarget_cost_width(root, input_target);
+ if (DO_AGGSPLIT_SKIPFINAL(aggsplit))
+ {
+ if (agg->aggtranstype == INTERNALOID && DO_AGGSPLIT_SERIALIZE(aggsplit))
+ agg->aggtype = BYTEAOID;
+ else
+ agg->aggtype = agg->aggtranstype;
+ }
}
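(The effect is visible in any parallel aggregate plan: the Aggrefs below the Gather are the ones this function has marked partial. On the example tables, forcing parallelism since they are small:

set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
set min_parallel_table_scan_size = 0;
explain select sum(id) from b;
-- Finalize Aggregate -> Gather -> Partial Aggregate -> Parallel Seq Scan on b
)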
/*
- * get_cheapest_fractional_path
- * Find the cheapest path for retrieving a specified fraction of all
- * the tuples expected to be returned by the given relation.
- *
- * We interpret tuple_fraction the same way as grouping_planner.
+ * postprocess_setop_tlist
+ * Fix up targetlist returned by plan_set_operations().
*
- * We assume set_cheapest() has been run on the given rel.
+ * We need to transpose sort key info from the orig_tlist into new_tlist.
+ * NOTE: this would not be good enough if we supported resjunk sort keys
+ * for results of set operations --- then, we'd need to project a whole
+ * new tlist to evaluate the resjunk columns. For now, just ereport if we
+ * find any resjunk columns in orig_tlist.
+ */
+static List *
+postprocess_setop_tlist(List *new_tlist, List *orig_tlist)
+{
+ ListCell *l;
+ ListCell *orig_tlist_item = list_head(orig_tlist);
+
+ foreach(l, new_tlist)
+ {
+ TargetEntry *new_tle = lfirst_node(TargetEntry, l);
+ TargetEntry *orig_tle;
+
+ /* ignore resjunk columns in setop result */
+ if (new_tle->resjunk)
+ continue;
+
+ Assert(orig_tlist_item != NULL);
+ orig_tle = lfirst_node(TargetEntry, orig_tlist_item);
+ orig_tlist_item = lnext(orig_tlist_item);
+ if (orig_tle->resjunk) /* should not happen */
+ elog(ERROR, "resjunk output columns are not implemented");
+ Assert(new_tle->resno == orig_tle->resno);
+ new_tle->ressortgroupref = orig_tle->ressortgroupref;
+ }
+ if (orig_tlist_item != NULL)
+ elog(ERROR, "resjunk output columns are not implemented");
+ return new_tlist;
+}
+
+/*
+ * select_active_windows
+ * Create a list of the "active" window clauses (ie, those referenced
+ * by non-deleted WindowFuncs) in the order they are to be executed.
*/
-Path *
-get_cheapest_fractional_path(RelOptInfo *rel, double tuple_fraction)
+static List *
+select_active_windows(PlannerInfo *root, WindowFuncLists *wflists)
{
- Path *best_path = rel->cheapest_total_path;
- ListCell *l;
+ List *result;
+ List *actives;
+ ListCell *lc;
- /* If all tuples will be retrieved, just return the cheapest-total path */
- if (tuple_fraction <= 0.0)
- return best_path;
+ /* First, make a list of the active windows */
+ actives = NIL;
+ foreach(lc, root->parse->windowClause)
+ {
+ WindowClause *wc = lfirst_node(WindowClause, lc);
- /* Convert absolute # of tuples to a fraction; no need to clamp to 0..1 */
- if (tuple_fraction >= 1.0 && best_path->rows > 0)
- tuple_fraction /= best_path->rows;
+ /* It's only active if wflists shows some related WindowFuncs */
+ Assert(wc->winref <= wflists->maxWinRef);
+ if (wflists->windowFuncs[wc->winref] != NIL)
+ actives = lappend(actives, wc);
+ }
- foreach(l, rel->pathlist)
+ /*
+ * Now, ensure that windows with identical partitioning/ordering clauses
+ * are adjacent in the list. This is required by the SQL standard, which
+ * says that only one sort is to be used for such windows, even if they
+ * are otherwise distinct (eg, different names or framing clauses).
+ *
+ * There is room to be much smarter here, for example detecting whether
+ * one window's sort keys are a prefix of another's (so that sorting for
+ * the latter would do for the former), or putting windows first that
+ * match a sort order available for the underlying query. For the moment
+ * we are content with meeting the spec.
+ */
+ result = NIL;
+ while (actives != NIL)
{
- Path *path = (Path *) lfirst(l);
+ WindowClause *wc = linitial_node(WindowClause, actives);
+ ListCell *prev;
+ ListCell *next;
- if (path == rel->cheapest_total_path ||
- compare_fractional_path_costs(best_path, path, tuple_fraction) <= 0)
- continue;
+ /* Move wc from actives to result */
+ actives = list_delete_first(actives);
+ result = lappend(result, wc);
- best_path = path;
+ /* Now move any matching windows from actives to result */
+ prev = NULL;
+ for (lc = list_head(actives); lc; lc = next)
+ {
+ WindowClause *wc2 = lfirst_node(WindowClause, lc);
+
+ next = lnext(lc);
+ /* framing options are NOT to be compared here! */
+ if (equal(wc->partitionClause, wc2->partitionClause) &&
+ equal(wc->orderClause, wc2->orderClause))
+ {
+ actives = list_delete_cell(actives, lc, prev);
+ result = lappend(result, wc2);
+ }
+ else
+ prev = lc;
+ }
}
- return best_path;
+ return result;
}
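(For example, these two windows differ only in framing, so after this function they are adjacent and can share one sort:

explain (costs off)
select sum(id) over (order by id),
       sum(id) over (order by id rows between 1 preceding and current row)
from b;
-- two WindowAgg nodes, but only one Sort below them
)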
/*
- * adjust_paths_for_srfs
- * Fix up the Paths of the given upperrel to handle tSRFs properly.
+ * make_window_input_target
+ * Generate appropriate PathTarget for initial input to WindowAgg nodes.
*
- * The executor can only handle set-returning functions that appear at the
- * top level of the targetlist of a ProjectSet plan node. If we have any SRFs
- * that are not at top level, we need to split up the evaluation into multiple
- * plan levels in which each level satisfies this constraint. This function
- * modifies each Path of an upperrel that (might) compute any SRFs in its
- * output tlist to insert appropriate projection steps.
+ * When the query has window functions, this function computes the desired
+ * target to be computed by the node just below the first WindowAgg.
+ * This tlist must contain all values needed to evaluate the window functions,
+ * compute the final target list, and perform any required final sort step.
+ * If multiple WindowAggs are needed, each intermediate one adds its window
+ * function results onto this base tlist; only the topmost WindowAgg computes
+ * the actual desired target list.
*
- * The given targets and targets_contain_srfs lists are from
- * split_pathtarget_at_srfs(). We assume the existing Paths emit the first
- * target in targets.
+ * This function is much like make_group_input_target, though not quite enough
+ * like it to share code. As in that function, we flatten most expressions
+ * into their component variables. But we do not want to flatten window
+ * PARTITION BY/ORDER BY clauses, since that might result in multiple
+ * evaluations of them, which would be bad (possibly even resulting in
+ * inconsistent answers, if they contain volatile functions).
+ * Also, we must not flatten GROUP BY clauses that were left unflattened by
+ * make_group_input_target, because we may no longer have access to the
+ * individual Vars in them.
+ *
+ * Another key difference from make_group_input_target is that we don't
+ * flatten Aggref expressions, since those are to be computed below the
+ * window functions and just referenced like Vars above that.
+ *
+ * 'final_target' is the query's final target list (in PathTarget form)
+ * 'activeWindows' is the list of active windows previously identified by
+ * select_active_windows.
+ *
+ * The result is the PathTarget to be computed by the plan node immediately
+ * below the first WindowAgg node.
*/
-static void
-adjust_paths_for_srfs(PlannerInfo *root, RelOptInfo *rel,
- List *targets, List *targets_contain_srfs)
+static PathTarget *
+make_window_input_target(PlannerInfo *root,
+ PathTarget *final_target,
+ List *activeWindows)
{
+ Query *parse = root->parse;
+ PathTarget *input_target;
+ Bitmapset *sgrefs;
+ List *flattenable_cols;
+ List *flattenable_vars;
+ int i;
ListCell *lc;
- Assert(list_length(targets) == list_length(targets_contain_srfs));
- Assert(!linitial_int(targets_contain_srfs));
-
- /* If no SRFs appear at this plan level, nothing to do */
- if (list_length(targets) == 1)
- return;
+ Assert(parse->hasWindowFuncs);
/*
- * Stack SRF-evaluation nodes atop each path for the rel.
- *
- * In principle we should re-run set_cheapest() here to identify the
- * cheapest path, but it seems unlikely that adding the same tlist eval
- * costs to all the paths would change that, so we don't bother. Instead,
- * just assume that the cheapest-startup and cheapest-total paths remain
- * so. (There should be no parameterized paths anymore, so we needn't
- * worry about updating cheapest_parameterized_paths.)
+ * Collect the sortgroupref numbers of window PARTITION/ORDER BY clauses
+ * into a bitmapset for convenient reference below.
*/
- foreach(lc, rel->pathlist)
+ sgrefs = NULL;
+ foreach(lc, activeWindows)
{
- Path *subpath = (Path *) lfirst(lc);
- Path *newpath = subpath;
- ListCell *lc1,
- *lc2;
+ WindowClause *wc = lfirst_node(WindowClause, lc);
+ ListCell *lc2;
- Assert(subpath->param_info == NULL);
- forboth(lc1, targets, lc2, targets_contain_srfs)
+ foreach(lc2, wc->partitionClause)
{
- PathTarget *thistarget = lfirst_node(PathTarget, lc1);
- bool contains_srfs = (bool) lfirst_int(lc2);
+ SortGroupClause *sortcl = lfirst_node(SortGroupClause, lc2);
- /* If this level doesn't contain SRFs, do regular projection */
- if (contains_srfs)
- newpath = (Path *) create_set_projection_path(root,
- rel,
- newpath,
- thistarget);
- else
- newpath = (Path *) apply_projection_to_path(root,
- rel,
- newpath,
- thistarget);
+ sgrefs = bms_add_member(sgrefs, sortcl->tleSortGroupRef);
+ }
+ foreach(lc2, wc->orderClause)
+ {
+ SortGroupClause *sortcl = lfirst_node(SortGroupClause, lc2);
+
+ sgrefs = bms_add_member(sgrefs, sortcl->tleSortGroupRef);
}
- lfirst(lc) = newpath;
- if (subpath == rel->cheapest_startup_path)
- rel->cheapest_startup_path = newpath;
- if (subpath == rel->cheapest_total_path)
- rel->cheapest_total_path = newpath;
}
- /* Likewise for partial paths, if any */
- foreach(lc, rel->partial_pathlist)
+ /* Add in sortgroupref numbers of GROUP BY clauses, too */
+ foreach(lc, parse->groupClause)
{
- Path *subpath = (Path *) lfirst(lc);
- Path *newpath = subpath;
- ListCell *lc1,
- *lc2;
+ SortGroupClause *grpcl = lfirst_node(SortGroupClause, lc);
- Assert(subpath->param_info == NULL);
- forboth(lc1, targets, lc2, targets_contain_srfs)
- {
- PathTarget *thistarget = lfirst_node(PathTarget, lc1);
- bool contains_srfs = (bool) lfirst_int(lc2);
+ sgrefs = bms_add_member(sgrefs, grpcl->tleSortGroupRef);
+ }
- /* If this level doesn't contain SRFs, do regular projection */
- if (contains_srfs)
- newpath = (Path *) create_set_projection_path(root,
- rel,
- newpath,
- thistarget);
- else
- {
- /* avoid apply_projection_to_path, in case of multiple refs */
- newpath = (Path *) create_projection_path(root,
- rel,
- newpath,
- thistarget);
- }
+ /*
+ * Construct a target containing all the non-flattenable targetlist items,
+ * and save aside the others for a moment.
+ */
+ input_target = create_empty_pathtarget();
+ flattenable_cols = NIL;
+
+ i = 0;
+ foreach(lc, final_target->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Index sgref = get_pathtarget_sortgroupref(final_target, i);
+
+ /*
+ * Don't want to deconstruct window clauses or GROUP BY items. (Note
+ * that such items can't contain window functions, so it's okay to
+ * compute them below the WindowAgg nodes.)
+ */
+ if (sgref != 0 && bms_is_member(sgref, sgrefs))
+ {
+ /*
+ * Don't want to deconstruct this value, so add it to the input
+ * target as-is.
+ */
+ add_column_to_pathtarget(input_target, expr, sgref);
}
- lfirst(lc) = newpath;
+ else
+ {
+ /*
+ * Column is to be flattened, so just remember the expression for
+ * later call to pull_var_clause.
+ */
+ flattenable_cols = lappend(flattenable_cols, expr);
+ }
+
+ i++;
}
+
+ /*
+ * Pull out all the Vars and Aggrefs mentioned in flattenable columns, and
+ * add them to the input target if not already present. (Some might be
+ * there already because they're used directly as window/group clauses.)
+ *
+ * Note: it's essential to use PVC_INCLUDE_AGGREGATES here, so that any
+ * Aggrefs are placed in the Agg node's tlist and not left to be computed
+ * at higher levels. On the other hand, we should recurse into
+ * WindowFuncs to make sure their input expressions are available.
+ */
+ flattenable_vars = pull_var_clause((Node *) flattenable_cols,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_INCLUDE_PLACEHOLDERS);
+ add_new_columns_to_pathtarget(input_target, flattenable_vars);
+
+ /* clean up cruft */
+ list_free(flattenable_vars);
+ list_free(flattenable_cols);
+
+ /* XXX this causes some redundant cost calculation ... */
+ return set_pathtarget_cost_width(root, input_target);
}
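(As in the grouping case, EXPLAIN VERBOSE shows the effect: the window PARTITION BY/ORDER BY expressions are passed up unflattened, everything else as bare Vars. On the example tables:

explain (verbose, costs off)
select sum(a.id + 1) over (partition by b.id) from a, b where a.id = b.id;
-- the node below the WindowAgg should emit b.id and a.id; the (a.id + 1)
-- inside the window function is evaluated at the WindowAgg itself
)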
/*
- * expression_planner
- * Perform planner's transformations on a standalone expression.
- *
- * Various utility commands need to evaluate expressions that are not part
- * of a plannable query. They can do so using the executor's regular
- * expression-execution machinery, but first the expression has to be fed
- * through here to transform it from parser output to something executable.
+ * make_pathkeys_for_window
+ * Create a pathkeys list describing the required input ordering
+ * for the given WindowClause.
*
- * Currently, we disallow sublinks in standalone expressions, so there's no
- * real "planning" involved here. (That might not always be true though.)
- * What we must do is run eval_const_expressions to ensure that any function
- * calls are converted to positional notation and function default arguments
- * get inserted. The fact that constant subexpressions get simplified is a
- * side-effect that is useful when the expression will get evaluated more than
- * once. Also, we must fix operator function IDs.
+ * The required ordering is first the PARTITION keys, then the ORDER keys.
+ * In the future we might try to implement windowing using hashing, in which
+ * case the ordering could be relaxed, but for now we always sort.
*
- * Note: this must not make any damaging changes to the passed-in expression
- * tree. (It would actually be okay to apply fix_opfuncids to it, but since
- * we first do an expression_tree_mutator-based walk, what is returned will
- * be a new node tree.)
+ * Caution: if you change this, see createplan.c's get_column_info_for_window!
*/
-Expr *
-expression_planner(Expr *expr)
+static List *
+make_pathkeys_for_window(PlannerInfo *root, WindowClause *wc,
+ List *tlist)
{
- Node *result;
-
- /*
- * Convert named-argument function calls, insert default arguments and
- * simplify constant subexprs
- */
- result = eval_const_expressions(NULL, (Node *) expr);
+ List *window_pathkeys;
+ List *window_sortclauses;
- /* Fill in opfuncid values if missing */
- fix_opfuncids(result);
+ /* Throw error if can't sort */
+ if (!grouping_is_sortable(wc->partitionClause))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("could not implement window PARTITION BY"),
+ errdetail("Window partitioning columns must be of sortable datatypes.")));
+ if (!grouping_is_sortable(wc->orderClause))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("could not implement window ORDER BY"),
+ errdetail("Window ordering columns must be of sortable datatypes.")));
- return (Expr *) result;
+ /* Okay, make the combined pathkeys */
+ window_sortclauses = list_concat(list_copy(wc->partitionClause),
+ list_copy(wc->orderClause));
+ window_pathkeys = make_pathkeys_for_sortclauses(root,
+ window_sortclauses,
+ tlist);
+ list_free(window_sortclauses);
+ return window_pathkeys;
}
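(The resulting ordering, PARTITION keys before ORDER keys, shows up directly in the sort:

explain (costs off)
select count(*) over (partition by id/100 order by id) from b;
-- the Sort key should list the PARTITION BY expression (id/100) first, then id
)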
-
/*
- * plan_cluster_use_sort
- * Use the planner to decide how CLUSTER should implement sorting
+ * make_sort_input_target
+ * Generate appropriate PathTarget for initial input to Sort step.
*
- * tableOid is the OID of a table to be clustered on its index indexOid
- * (which is already known to be a btree index). Decide whether it's
- * cheaper to do an indexscan or a seqscan-plus-sort to execute the CLUSTER.
- * Return true to use sorting, false to use an indexscan.
+ * If the query has ORDER BY, this function chooses the target to be computed
+ * by the node just below the Sort (and DISTINCT, if any, since Unique can't
+ * project) steps. This might or might not be identical to the query's final
+ * output target.
*
- * Note: caller had better already hold some type of lock on the table.
+ * The main argument for keeping the sort-input tlist the same as the final
+ * is that we avoid a separate projection node (which will be needed if
+ * they're different, because Sort can't project). However, there are also
+ * advantages to postponing tlist evaluation till after the Sort: it ensures
+ * a consistent order of evaluation for any volatile functions in the tlist,
+ * and if there's also a LIMIT, we can stop the query without ever computing
+ * tlist functions for later rows, which is beneficial for both volatile and
+ * expensive functions.
+ *
+ * Our current policy is to postpone volatile expressions till after the sort
+ * unconditionally (assuming that that's possible, ie they are in plain tlist
+ * columns and not ORDER BY/GROUP BY/DISTINCT columns). We also prefer to
+ * postpone set-returning expressions, because running them beforehand would
+ * bloat the sort dataset, and because it might cause unexpected output order
+ * if the sort isn't stable. However there's a constraint on that: all SRFs
+ * in the tlist should be evaluated at the same plan step, so that they can
+ * run in sync in nodeProjectSet. So if any SRFs are in sort columns, we
+ * mustn't postpone any SRFs. (Note that in principle that policy should
+ * probably get applied to the group/window input targetlists too, but we
+ * have not done that historically.) Lastly, expensive expressions are
+ * postponed if there is a LIMIT, or if root->tuple_fraction shows that
+ * partial evaluation of the query is possible (if neither is true, we expect
+ * to have to evaluate the expressions for every row anyway), or if there are
+ * any volatile or set-returning expressions (since once we've put in a
+ * projection at all, it won't cost any more to postpone more stuff).
+ *
+ * Another issue that could potentially be considered here is that
+ * evaluating tlist expressions could result in data that's either wider
+ * or narrower than the input Vars, thus changing the volume of data that
+ * has to go through the Sort. However, we usually have only a very bad
+ * idea of the output width of any expression more complex than a Var,
+ * so for now it seems too risky to try to optimize on that basis.
+ *
+ * Note that if we do produce a modified sort-input target, and then the
+ * query ends up not using an explicit Sort, no particular harm is done:
+ * we'll initially use the modified target for the preceding path nodes,
+ * but then change them to the final target with apply_projection_to_path.
+ * Moreover, in such a case the guarantees about evaluation order of
+ * volatile functions still hold, since the rows are sorted already.
+ *
+ * This function has some things in common with make_group_input_target and
+ * make_window_input_target, though the detailed rules for what to do are
+ * different. We never flatten/postpone any grouping or ordering columns;
+ * those are needed before the sort. If we do flatten a particular
+ * expression, we leave Aggref and WindowFunc nodes alone, since those were
+ * computed earlier.
+ *
+ * 'final_target' is the query's final target list (in PathTarget form)
+ * 'have_postponed_srfs' is an output argument, see below
+ *
+ * The result is the PathTarget to be computed by the plan node immediately
+ * below the Sort step (and the Distinct step, if any). This will be
+ * exactly final_target if we decide a projection step wouldn't be helpful.
+ *
+ * In addition, *have_postponed_srfs is set to true if we choose to postpone
+ * any set-returning functions to after the Sort.
*/
-bool
-plan_cluster_use_sort(Oid tableOid, Oid indexOid)
+static PathTarget *
+make_sort_input_target(PlannerInfo *root,
+ PathTarget *final_target,
+ bool *have_postponed_srfs)
{
- PlannerInfo *root;
- Query *query;
- PlannerGlobal *glob;
- RangeTblEntry *rte;
- RelOptInfo *rel;
- IndexOptInfo *indexInfo;
- QualCost indexExprCost;
- Cost comparisonCost;
- Path *seqScanPath;
- Path seqScanAndSortPath;
- IndexPath *indexScanPath;
+ Query *parse = root->parse;
+ PathTarget *input_target;
+ int ncols;
+ bool *col_is_srf;
+ bool *postpone_col;
+ bool have_srf;
+ bool have_volatile;
+ bool have_expensive;
+ bool have_srf_sortcols;
+ bool postpone_srfs;
+ List *postponable_cols;
+ List *postponable_vars;
+ int i;
ListCell *lc;
- /* We can short-circuit the cost comparison if indexscans are disabled */
- if (!enable_indexscan)
- return true; /* use sort */
+ /* Shouldn't get here unless query has ORDER BY */
+ Assert(parse->sortClause);
- /* Set up mostly-dummy planner state */
- query = makeNode(Query);
- query->commandType = CMD_SELECT;
+ *have_postponed_srfs = false; /* default result */
- glob = makeNode(PlannerGlobal);
+ /* Inspect tlist and collect per-column information */
+ ncols = list_length(final_target->exprs);
+ col_is_srf = (bool *) palloc0(ncols * sizeof(bool));
+ postpone_col = (bool *) palloc0(ncols * sizeof(bool));
+ have_srf = have_volatile = have_expensive = have_srf_sortcols = false;
- root = makeNode(PlannerInfo);
- root->parse = query;
- root->glob = glob;
- root->query_level = 1;
- root->planner_cxt = CurrentMemoryContext;
- root->wt_param_id = -1;
+ i = 0;
+ foreach(lc, final_target->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
- /* Build a minimal RTE for the rel */
- rte = makeNode(RangeTblEntry);
- rte->rtekind = RTE_RELATION;
- rte->relid = tableOid;
- rte->relkind = RELKIND_RELATION; /* Don't be too picky. */
- rte->lateral = false;
- rte->inh = false;
- rte->inFromCl = true;
- query->rtable = list_make1(rte);
+ /*
+ * If the column has a sortgroupref, assume it has to be evaluated
+ * before sorting. Generally such columns would be ORDER BY, GROUP
+ * BY, etc targets. One exception is columns that were removed from
+ * GROUP BY by remove_useless_groupby_columns() ... but those would
+ * only be Vars anyway. There don't seem to be any cases where it
+ * would be worth the trouble to double-check.
+ */
+ if (get_pathtarget_sortgroupref(final_target, i) == 0)
+ {
+ /*
+ * Check for SRF or volatile functions. Check the SRF case first
+ * because we must know whether we have any postponed SRFs.
+ */
+ if (parse->hasTargetSRFs &&
+ expression_returns_set((Node *) expr))
+ {
+ /* We'll decide below whether these are postponable */
+ col_is_srf[i] = true;
+ have_srf = true;
+ }
+ else if (contain_volatile_functions((Node *) expr))
+ {
+ /* Unconditionally postpone */
+ postpone_col[i] = true;
+ have_volatile = true;
+ }
+ else
+ {
+ /*
+ * Else check the cost. XXX it's annoying to have to do this
+ * when set_pathtarget_cost_width() just did it. Refactor to
+ * allow sharing the work?
+ */
+ QualCost cost;
- /* Set up RTE/RelOptInfo arrays */
- setup_simple_rel_arrays(root);
+ cost_qual_eval_node(&cost, (Node *) expr, root);
- /* Build RelOptInfo */
- rel = build_simple_rel(root, 1, NULL);
+ /*
+ * We arbitrarily define "expensive" as "more than 10X
+ * cpu_operator_cost". Note this will take in any PL function
+ * with default cost.
+ */
+ if (cost.per_tuple > 10 * cpu_operator_cost)
+ {
+ postpone_col[i] = true;
+ have_expensive = true;
+ }
+ }
+ }
+ else
+ {
+ /* For sortgroupref cols, just check if any contain SRFs */
+ if (!have_srf_sortcols &&
+ parse->hasTargetSRFs &&
+ expression_returns_set((Node *) expr))
+ have_srf_sortcols = true;
+ }
- /* Locate IndexOptInfo for the target index */
- indexInfo = NULL;
- foreach(lc, rel->indexlist)
- {
- indexInfo = lfirst_node(IndexOptInfo, lc);
- if (indexInfo->indexoid == indexOid)
- break;
+ i++;
}
/*
- * It's possible that get_relation_info did not generate an IndexOptInfo
- * for the desired index; this could happen if it's not yet reached its
- * indcheckxmin usability horizon, or if it's a system index and we're
- * ignoring system indexes. In such cases we should tell CLUSTER to not
- * trust the index contents but use seqscan-and-sort.
+ * We can postpone SRFs if we have some but none are in sortgroupref cols.
*/
- if (lc == NULL) /* not in the list? */
- return true; /* use sort */
+ postpone_srfs = (have_srf && !have_srf_sortcols);
/*
- * Rather than doing all the pushups that would be needed to use
- * set_baserel_size_estimates, just do a quick hack for rows and width.
+ * If we don't need a post-sort projection, just return final_target.
*/
- rel->rows = rel->tuples;
- rel->reltarget->width = get_relation_data_width(tableOid, NULL);
+ if (!(postpone_srfs || have_volatile ||
+ (have_expensive &&
+ (parse->limitCount || root->tuple_fraction > 0))))
+ return final_target;
- root->total_table_pages = rel->pages;
+ /*
+ * Report whether the post-sort projection will contain set-returning
+ * functions. This is important because it affects whether the Sort can
+ * rely on the query's LIMIT (if any) to bound the number of rows it needs
+ * to return.
+ */
+ *have_postponed_srfs = postpone_srfs;
/*
- * Determine eval cost of the index expressions, if any. We need to
- * charge twice that amount for each tuple comparison that happens during
- * the sort, since tuplesort.c will have to re-evaluate the index
- * expressions each time. (XXX that's pretty inefficient...)
+ * Construct the sort-input target, taking all non-postponable columns and
+ * then adding Vars, PlaceHolderVars, Aggrefs, and WindowFuncs found in
+ * the postponable ones.
*/
- cost_qual_eval(&indexExprCost, indexInfo->indexprs, root);
- comparisonCost = 2.0 * (indexExprCost.startup + indexExprCost.per_tuple);
+ input_target = create_empty_pathtarget();
+ postponable_cols = NIL;
- /* Estimate the cost of seq scan + sort */
- seqScanPath = create_seqscan_path(root, rel, NULL, 0);
- cost_sort(&seqScanAndSortPath, root, NIL,
- seqScanPath->total_cost, rel->tuples, rel->reltarget->width,
- comparisonCost, maintenance_work_mem, -1.0);
+ i = 0;
+ foreach(lc, final_target->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
- /* Estimate the cost of index scan */
- indexScanPath = create_index_path(root, indexInfo,
- NIL, NIL, NIL, NIL, NIL,
- ForwardScanDirection, false,
- NULL, 1.0, false);
+ if (postpone_col[i] || (postpone_srfs && col_is_srf[i]))
+ postponable_cols = lappend(postponable_cols, expr);
+ else
+ add_column_to_pathtarget(input_target, expr,
+ get_pathtarget_sortgroupref(final_target, i));
- return (seqScanAndSortPath.total_cost < indexScanPath->path.total_cost);
+ i++;
+ }
+
+ /*
+ * Pull out all the Vars, Aggrefs, and WindowFuncs mentioned in
+ * postponable columns, and add them to the sort-input target if not
+ * already present. (Some might be there already.) We mustn't
+ * deconstruct Aggrefs or WindowFuncs here, since the projection node
+ * would be unable to recompute them.
+ */
+ postponable_vars = pull_var_clause((Node *) postponable_cols,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_INCLUDE_WINDOWFUNCS |
+ PVC_INCLUDE_PLACEHOLDERS);
+ add_new_columns_to_pathtarget(input_target, postponable_vars);
+
+ /* clean up cruft */
+ list_free(postponable_vars);
+ list_free(postponable_cols);
+
+ /* XXX this represents even more redundant cost calculation ... */
+ return set_pathtarget_cost_width(root, input_target);
}
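/*
 * Illustration (a hypothetical example, not part of the patch): given an
 * expensive function f(), in
 *
 *     SELECT x, f(x) FROM tab ORDER BY x LIMIT 10;
 *
 * f(x) is postponed out of the sort-input target, so the Sort orders the
 * rows by x alone and f() is evaluated in a projection above the Sort,
 * for only the ten rows that survive the LIMIT.
 */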
/*
- * plan_create_index_workers
- * Use the planner to decide how many parallel worker processes
- * CREATE INDEX should request for use
- *
- * tableOid is the table on which the index is to be built. indexOid is the
- * OID of an index to be created or reindexed (which must be a btree index).
+ * get_cheapest_fractional_path
+ * Find the cheapest path for retrieving a specified fraction of all
+ * the tuples expected to be returned by the given relation.
*
- * Return value is the number of parallel worker processes to request. It
- * may be unsafe to proceed if this is 0. Note that this does not include the
- * leader participating as a worker (value is always a number of parallel
- * worker processes).
+ * We interpret tuple_fraction the same way as grouping_planner.
*
- * Note: caller had better already hold some type of lock on the table and
- * index.
+ * We assume set_cheapest() has been run on the given rel.
*/
-int
-plan_create_index_workers(Oid tableOid, Oid indexOid)
+Path *
+get_cheapest_fractional_path(RelOptInfo *rel, double tuple_fraction)
{
- PlannerInfo *root;
- Query *query;
- PlannerGlobal *glob;
- RangeTblEntry *rte;
- Relation heap;
- Relation index;
- RelOptInfo *rel;
- int parallel_workers;
- BlockNumber heap_blocks;
- double reltuples;
- double allvisfrac;
-
- /* Return immediately when parallelism disabled */
- if (dynamic_shared_memory_type == DSM_IMPL_NONE ||
- max_parallel_maintenance_workers == 0)
- return 0;
-
- /* Set up largely-dummy planner state */
- query = makeNode(Query);
- query->commandType = CMD_SELECT;
-
- glob = makeNode(PlannerGlobal);
-
- root = makeNode(PlannerInfo);
- root->parse = query;
- root->glob = glob;
- root->query_level = 1;
- root->planner_cxt = CurrentMemoryContext;
- root->wt_param_id = -1;
-
- /*
- * Build a minimal RTE.
- *
- * Set the target's table to be an inheritance parent. This is a kludge
- * that prevents problems within get_relation_info(), which does not
- * expect that any IndexOptInfo is currently undergoing REINDEX.
- */
- rte = makeNode(RangeTblEntry);
- rte->rtekind = RTE_RELATION;
- rte->relid = tableOid;
- rte->relkind = RELKIND_RELATION; /* Don't be too picky. */
- rte->lateral = false;
- rte->inh = true;
- rte->inFromCl = true;
- query->rtable = list_make1(rte);
-
- /* Set up RTE/RelOptInfo arrays */
- setup_simple_rel_arrays(root);
-
- /* Build RelOptInfo */
- rel = build_simple_rel(root, 1, NULL);
+ Path *best_path = rel->cheapest_total_path;
+ ListCell *l;
- heap = heap_open(tableOid, NoLock);
- index = index_open(indexOid, NoLock);
+ /* If all tuples will be retrieved, just return the cheapest-total path */
+ if (tuple_fraction <= 0.0)
+ return best_path;
- /*
- * Determine if it's safe to proceed.
- *
- * Currently, parallel workers can't access the leader's temporary tables.
- * Furthermore, any index predicate or index expressions must be parallel
- * safe.
- */
- if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
- !is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index)) ||
- !is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index)))
- {
- parallel_workers = 0;
- goto done;
- }
+ /* Convert absolute # of tuples to a fraction; no need to clamp to 0..1 */
+ if (tuple_fraction >= 1.0 && best_path->rows > 0)
+ tuple_fraction /= best_path->rows;
- /*
- * If parallel_workers storage parameter is set for the table, accept that
- * as the number of parallel worker processes to launch (though still cap
- * at max_parallel_maintenance_workers). Note that we deliberately do not
- * consider any other factor when parallel_workers is set. (e.g., memory
- * use by workers.)
- */
- if (rel->rel_parallel_workers != -1)
+ foreach(l, rel->pathlist)
{
- parallel_workers = Min(rel->rel_parallel_workers,
- max_parallel_maintenance_workers);
- goto done;
- }
-
- /*
- * Estimate heap relation size ourselves, since rel->pages cannot be
- * trusted (heap RTE was marked as inheritance parent)
- */
- estimate_rel_size(heap, NULL, &heap_blocks, &reltuples, &allvisfrac);
-
- /*
- * Determine number of workers to scan the heap relation using generic
- * model
- */
- parallel_workers = compute_parallel_worker(rel, heap_blocks, -1,
- max_parallel_maintenance_workers);
+ Path *path = (Path *) lfirst(l);
- /*
- * Cap workers based on available maintenance_work_mem as needed.
- *
- * Note that each tuplesort participant receives an even share of the
- * total maintenance_work_mem budget. Aim to leave participants
- * (including the leader as a participant) with no less than 32MB of
- * memory. This leaves cases where maintenance_work_mem is set to 64MB
- * immediately past the threshold of being capable of launching a single
- * parallel worker to sort.
- */
- while (parallel_workers > 0 &&
- maintenance_work_mem / (parallel_workers + 1) < 32768L)
- parallel_workers--;
+ if (path == rel->cheapest_total_path ||
+ compare_fractional_path_costs(best_path, path, tuple_fraction) <= 0)
+ continue;
-done:
- index_close(index, NoLock);
- heap_close(heap, NoLock);
+ best_path = path;
+ }
- return parallel_workers;
+ return best_path;
}
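/*
 * Worked example (hypothetical, not part of the patch): for a query like
 *
 *     SELECT * FROM tab ORDER BY indexed_col LIMIT 10;
 *
 * where the result is estimated at 1000 rows, tuple_fraction arrives as
 * the absolute count 10.0 and is converted above to 10/1000 = 0.01; the
 * loop then lets compare_fractional_path_costs() prefer a fast-startup
 * path, such as an index scan, over the cheapest-total path.
 */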
/*
- * add_paths_to_grouping_rel
+ * adjust_paths_for_srfs
+ * Fix up the Paths of the given upperrel to handle tSRFs properly.
+ *
+ * The executor can only handle set-returning functions that appear at the
+ * top level of the targetlist of a ProjectSet plan node. If we have any SRFs
+ * that are not at top level, we need to split up the evaluation into multiple
+ * plan levels in which each level satisfies this constraint. This function
+ * modifies each Path of an upperrel that (might) compute any SRFs in its
+ * output tlist to insert appropriate projection steps.
*
- * Add non-partial paths to grouping relation.
+ * The given targets and targets_contain_srfs lists are from
+ * split_pathtarget_at_srfs(). We assume the existing Paths emit the first
+ * target in targets.
*/
static void
-add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
- RelOptInfo *grouped_rel,
- RelOptInfo *partially_grouped_rel,
- const AggClauseCosts *agg_costs,
- grouping_sets_data *gd, double dNumGroups,
- GroupPathExtraData *extra)
+adjust_paths_for_srfs(PlannerInfo *root, RelOptInfo *rel,
+ List *targets, List *targets_contain_srfs)
{
- Query *parse = root->parse;
- Path *cheapest_path = input_rel->cheapest_total_path;
ListCell *lc;
- bool can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
- bool can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
- List *havingQual = (List *) extra->havingQual;
- AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
- if (can_sort)
- {
- /*
- * Use any available suitably-sorted path as input, and also consider
- * sorting the cheapest-total path.
- */
- foreach(lc, input_rel->pathlist)
- {
- Path *path = (Path *) lfirst(lc);
- bool is_sorted;
+ Assert(list_length(targets) == list_length(targets_contain_srfs));
+ Assert(!linitial_int(targets_contain_srfs));
- is_sorted = pathkeys_contained_in(root->group_pathkeys,
- path->pathkeys);
- if (path == cheapest_path || is_sorted)
- {
- /* Sort the cheapest-total path if it isn't already sorted */
- if (!is_sorted)
- path = (Path *) create_sort_path(root,
- grouped_rel,
- path,
- root->group_pathkeys,
- -1.0);
-
- /* Now decide what to stick atop it */
- if (parse->groupingSets)
- {
- consider_groupingsets_paths(root, grouped_rel,
- path, true, can_hash,
- gd, agg_costs, dNumGroups);
- }
- else if (parse->hasAggs)
- {
- /*
- * We have aggregation, possibly with plain GROUP BY. Make
- * an AggPath.
- */
- add_path(grouped_rel, (Path *)
- create_agg_path(root,
- grouped_rel,
- path,
- grouped_rel->reltarget,
- parse->groupClause ? AGG_SORTED : AGG_PLAIN,
- AGGSPLIT_SIMPLE,
- parse->groupClause,
- havingQual,
- agg_costs,
- dNumGroups));
- }
- else if (parse->groupClause)
- {
- /*
- * We have GROUP BY without aggregation or grouping sets.
- * Make a GroupPath.
- */
- add_path(grouped_rel, (Path *)
- create_group_path(root,
- grouped_rel,
- path,
- parse->groupClause,
- havingQual,
- dNumGroups));
- }
- else
- {
- /* Other cases should have been handled above */
- Assert(false);
- }
- }
- }
+ /* If no SRFs appear at this plan level, nothing to do */
+ if (list_length(targets) == 1)
+ return;
- /*
- * Instead of operating directly on the input relation, we can
- * consider finalizing a partially aggregated path.
- */
- if (partially_grouped_rel != NULL)
- {
- foreach(lc, partially_grouped_rel->pathlist)
- {
- Path *path = (Path *) lfirst(lc);
+ /*
+ * Stack SRF-evaluation nodes atop each path for the rel.
+ *
+ * In principle we should re-run set_cheapest() here to identify the
+ * cheapest path, but it seems unlikely that adding the same tlist eval
+ * costs to all the paths would change that, so we don't bother. Instead,
+ * just assume that the cheapest-startup and cheapest-total paths remain
+ * so. (There should be no parameterized paths anymore, so we needn't
+ * worry about updating cheapest_parameterized_paths.)
+ */
+ foreach(lc, rel->pathlist)
+ {
+ Path *subpath = (Path *) lfirst(lc);
+ Path *newpath = subpath;
+ ListCell *lc1,
+ *lc2;
- /*
- * Insert a Sort node, if required. But there's no point in
- * sorting anything but the cheapest path.
- */
- if (!pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
- {
- if (path != partially_grouped_rel->cheapest_total_path)
- continue;
- path = (Path *) create_sort_path(root,
- grouped_rel,
- path,
- root->group_pathkeys,
- -1.0);
- }
+ Assert(subpath->param_info == NULL);
+ forboth(lc1, targets, lc2, targets_contain_srfs)
+ {
+ PathTarget *thistarget = lfirst_node(PathTarget, lc1);
+ bool contains_srfs = (bool) lfirst_int(lc2);
- if (parse->hasAggs)
- add_path(grouped_rel, (Path *)
- create_agg_path(root,
- grouped_rel,
- path,
- grouped_rel->reltarget,
- parse->groupClause ? AGG_SORTED : AGG_PLAIN,
- AGGSPLIT_FINAL_DESERIAL,
- parse->groupClause,
- havingQual,
- agg_final_costs,
- dNumGroups));
- else
- add_path(grouped_rel, (Path *)
- create_group_path(root,
- grouped_rel,
- path,
- parse->groupClause,
- havingQual,
- dNumGroups));
- }
+ /* If this level doesn't contain SRFs, do regular projection */
+ if (contains_srfs)
+ newpath = (Path *) create_set_projection_path(root,
+ rel,
+ newpath,
+ thistarget);
+ else
+ newpath = (Path *) apply_projection_to_path(root,
+ rel,
+ newpath,
+ thistarget);
}
+ lfirst(lc) = newpath;
+ if (subpath == rel->cheapest_startup_path)
+ rel->cheapest_startup_path = newpath;
+ if (subpath == rel->cheapest_total_path)
+ rel->cheapest_total_path = newpath;
}
- if (can_hash)
+ /* Likewise for partial paths, if any */
+ foreach(lc, rel->partial_pathlist)
{
- Size hashaggtablesize;
+ Path *subpath = (Path *) lfirst(lc);
+ Path *newpath = subpath;
+ ListCell *lc1,
+ *lc2;
- if (parse->groupingSets)
- {
- /*
- * Try for a hash-only groupingsets path over unsorted input.
- */
- consider_groupingsets_paths(root, grouped_rel,
- cheapest_path, false, true,
- gd, agg_costs, dNumGroups);
- }
- else
+ Assert(subpath->param_info == NULL);
+ forboth(lc1, targets, lc2, targets_contain_srfs)
{
- hashaggtablesize = estimate_hashagg_tablesize(cheapest_path,
- agg_costs,
- dNumGroups);
+ PathTarget *thistarget = lfirst_node(PathTarget, lc1);
+ bool contains_srfs = (bool) lfirst_int(lc2);
- /*
- * Provided that the estimated size of the hashtable does not
- * exceed work_mem, we'll generate a HashAgg Path, although if we
- * were unable to sort above, then we'd better generate a Path, so
- * that we at least have one.
- */
- if (hashaggtablesize < work_mem * 1024L ||
- grouped_rel->pathlist == NIL)
+ /* If this level doesn't contain SRFs, do regular projection */
+ if (contains_srfs)
+ newpath = (Path *) create_set_projection_path(root,
+ rel,
+ newpath,
+ thistarget);
+ else
{
- /*
- * We just need an Agg over the cheapest-total input path,
- * since input order won't matter.
- */
- add_path(grouped_rel, (Path *)
- create_agg_path(root, grouped_rel,
- cheapest_path,
- grouped_rel->reltarget,
- AGG_HASHED,
- AGGSPLIT_SIMPLE,
- parse->groupClause,
- havingQual,
- agg_costs,
- dNumGroups));
+ /* avoid apply_projection_to_path, in case of multiple refs */
+ newpath = (Path *) create_projection_path(root,
+ rel,
+ newpath,
+ thistarget);
}
}
-
- /*
- * Generate a Finalize HashAgg Path atop of the cheapest partially
- * grouped path, assuming there is one. Once again, we'll only do this
- * if it looks as though the hash table won't exceed work_mem.
- */
- if (partially_grouped_rel && partially_grouped_rel->pathlist)
- {
- Path *path = partially_grouped_rel->cheapest_total_path;
-
- hashaggtablesize = estimate_hashagg_tablesize(path,
- agg_final_costs,
- dNumGroups);
-
- if (hashaggtablesize < work_mem * 1024L)
- add_path(grouped_rel, (Path *)
- create_agg_path(root,
- grouped_rel,
- path,
- grouped_rel->reltarget,
- AGG_HASHED,
- AGGSPLIT_FINAL_DESERIAL,
- parse->groupClause,
- havingQual,
- agg_final_costs,
- dNumGroups));
- }
+ lfirst(lc) = newpath;
}
-
- /*
- * When partitionwise aggregate is used, we might have fully aggregated
- * paths in the partial pathlist, because add_paths_to_append_rel() will
- * consider a path for grouped_rel consisting of a Parallel Append of
- * non-partial paths from each child.
- */
- if (grouped_rel->partial_pathlist != NIL)
- gather_grouping_paths(root, grouped_rel);
}
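/*
 * Illustration (hypothetical, not part of the patch): nested SRFs are
 * what force multiple ProjectSet levels, e.g.
 *
 *     SELECT generate_series(1, generate_series(1, 3));
 *
 * must evaluate the inner generate_series() in one ProjectSet node and
 * the outer one in another ProjectSet stacked on top of it, since the
 * executor only handles SRFs at the top level of a ProjectSet tlist.
 */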
/*
- * create_partial_grouping_paths
+ * expression_planner
+ * Perform planner's transformations on a standalone expression.
*
- * Create a new upper relation representing the result of partial aggregation
- * and populate it with appropriate paths. Note that we don't finalize the
- * lists of paths here, so the caller can add additional partial or non-partial
- * paths and must afterward call gather_grouping_paths and set_cheapest on
- * the returned upper relation.
+ * Various utility commands need to evaluate expressions that are not part
+ * of a plannable query. They can do so using the executor's regular
+ * expression-execution machinery, but first the expression has to be fed
+ * through here to transform it from parser output to something executable.
*
- * All paths for this new upper relation -- both partial and non-partial --
- * have been partially aggregated but require a subsequent FinalizeAggregate
- * step.
+ * Currently, we disallow sublinks in standalone expressions, so there's no
+ * real "planning" involved here. (That might not always be true though.)
+ * What we must do is run eval_const_expressions to ensure that any function
+ * calls are converted to positional notation and function default arguments
+ * get inserted. The fact that constant subexpressions get simplified is a
+ * side-effect that is useful when the expression will get evaluated more than
+ * once. Also, we must fix operator function IDs.
*
- * NB: This function is allowed to return NULL if it determines that there is
- * no real need to create a new RelOptInfo.
+ * Note: this must not make any damaging changes to the passed-in expression
+ * tree. (It would actually be okay to apply fix_opfuncids to it, but since
+ * we first do an expression_tree_mutator-based walk, what is returned will
+ * be a new node tree.)
*/
-static RelOptInfo *
-create_partial_grouping_paths(PlannerInfo *root,
- RelOptInfo *grouped_rel,
- RelOptInfo *input_rel,
- grouping_sets_data *gd,
- GroupPathExtraData *extra,
- bool force_rel_creation)
+Expr *
+expression_planner(Expr *expr)
{
- Query *parse = root->parse;
- RelOptInfo *partially_grouped_rel;
- AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
- AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
- Path *cheapest_partial_path = NULL;
- Path *cheapest_total_path = NULL;
- double dNumPartialGroups = 0;
- double dNumPartialPartialGroups = 0;
- ListCell *lc;
- bool can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
- bool can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
-
- /*
- * Consider whether we should generate partially aggregated non-partial
- * paths. We can only do this if we have a non-partial path, and only if
- * the parent of the input rel is performing partial partitionwise
- * aggregation. (Note that extra->patype is the type of partitionwise
- * aggregation being used at the parent level, not this level.)
- */
- if (input_rel->pathlist != NIL &&
- extra->patype == PARTITIONWISE_AGGREGATE_PARTIAL)
- cheapest_total_path = input_rel->cheapest_total_path;
-
- /*
- * If parallelism is possible for grouped_rel, then we should consider
- * generating partially-grouped partial paths. However, if the input rel
- * has no partial paths, then we can't.
- */
- if (grouped_rel->consider_parallel && input_rel->partial_pathlist != NIL)
- cheapest_partial_path = linitial(input_rel->partial_pathlist);
-
- /*
- * If we can't partially aggregate partial paths, and we can't partially
- * aggregate non-partial paths, then don't bother creating the new
- * RelOptInfo at all, unless the caller specified force_rel_creation.
- */
- if (cheapest_total_path == NULL &&
- cheapest_partial_path == NULL &&
- !force_rel_creation)
- return NULL;
-
- /*
- * Build a new upper relation to represent the result of partially
- * aggregating the rows from the input relation.
- */
- partially_grouped_rel = fetch_upper_rel(root,
- UPPERREL_PARTIAL_GROUP_AGG,
- grouped_rel->relids);
- partially_grouped_rel->consider_parallel =
- grouped_rel->consider_parallel;
- partially_grouped_rel->reloptkind = grouped_rel->reloptkind;
- partially_grouped_rel->serverid = grouped_rel->serverid;
- partially_grouped_rel->userid = grouped_rel->userid;
- partially_grouped_rel->useridiscurrent = grouped_rel->useridiscurrent;
- partially_grouped_rel->fdwroutine = grouped_rel->fdwroutine;
+ Node *result;
/*
- * Build target list for partial aggregate paths. These paths cannot just
- * emit the same tlist as regular aggregate paths, because (1) we must
- * include Vars and Aggrefs needed in HAVING, which might not appear in
- * the result tlist, and (2) the Aggrefs must be set in partial mode.
+ * Convert named-argument function calls, insert default arguments and
+ * simplify constant subexprs
*/
- partially_grouped_rel->reltarget =
- make_partial_grouping_target(root, grouped_rel->reltarget,
- extra->havingQual);
+ result = eval_const_expressions(NULL, (Node *) expr);
- if (!extra->partial_costs_set)
- {
- /*
- * Collect statistics about aggregates for estimating costs of
- * performing aggregation in parallel.
- */
- MemSet(agg_partial_costs, 0, sizeof(AggClauseCosts));
- MemSet(agg_final_costs, 0, sizeof(AggClauseCosts));
- if (parse->hasAggs)
- {
- List *partial_target_exprs;
-
- /* partial phase */
- partial_target_exprs = partially_grouped_rel->reltarget->exprs;
- get_agg_clause_costs(root, (Node *) partial_target_exprs,
- AGGSPLIT_INITIAL_SERIAL,
- agg_partial_costs);
-
- /* final phase */
- get_agg_clause_costs(root, (Node *) grouped_rel->reltarget->exprs,
- AGGSPLIT_FINAL_DESERIAL,
- agg_final_costs);
- get_agg_clause_costs(root, extra->havingQual,
- AGGSPLIT_FINAL_DESERIAL,
- agg_final_costs);
- }
+ /* Fill in opfuncid values if missing */
+ fix_opfuncids(result);
- extra->partial_costs_set = true;
- }
+ return (Expr *) result;
+}
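/*
 * Illustration (a hypothetical example, not part of the patch): a utility
 * command such as
 *
 *     ALTER TABLE tab ADD COLUMN c int DEFAULT 2 + 2;
 *
 * feeds the default expression through expression_planner() before
 * evaluating it for existing rows, so that eval_const_expressions()
 * folds 2 + 2 to 4 once, rather than per row in the executor.
 */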
- /* Estimate number of partial groups. */
- if (cheapest_total_path != NULL)
- dNumPartialGroups =
- get_number_of_groups(root,
- cheapest_total_path->rows,
- gd,
- extra->targetList);
- if (cheapest_partial_path != NULL)
- dNumPartialPartialGroups =
- get_number_of_groups(root,
- cheapest_partial_path->rows,
- gd,
- extra->targetList);
-
- if (can_sort && cheapest_total_path != NULL)
- {
- /* This should have been checked previously */
- Assert(parse->hasAggs || parse->groupClause);
- /*
- * Use any available suitably-sorted path as input, and also consider
- * sorting the cheapest partial path.
- */
- foreach(lc, input_rel->pathlist)
- {
- Path *path = (Path *) lfirst(lc);
- bool is_sorted;
+/*
+ * plan_cluster_use_sort
+ * Use the planner to decide how CLUSTER should implement sorting
+ *
+ * tableOid is the OID of a table to be clustered on its index indexOid
+ * (which is already known to be a btree index). Decide whether it's
+ * cheaper to do an indexscan or a seqscan-plus-sort to execute the CLUSTER.
+ * Return true to use sorting, false to use an indexscan.
+ *
+ * Note: caller had better already hold some type of lock on the table.
+ */
+bool
+plan_cluster_use_sort(Oid tableOid, Oid indexOid)
+{
+ PlannerInfo *root;
+ Query *query;
+ PlannerGlobal *glob;
+ RangeTblEntry *rte;
+ RelOptInfo *rel;
+ IndexOptInfo *indexInfo;
+ QualCost indexExprCost;
+ Cost comparisonCost;
+ Path *seqScanPath;
+ Path seqScanAndSortPath;
+ IndexPath *indexScanPath;
+ ListCell *lc;
- is_sorted = pathkeys_contained_in(root->group_pathkeys,
- path->pathkeys);
- if (path == cheapest_total_path || is_sorted)
- {
- /* Sort the cheapest partial path, if it isn't already */
- if (!is_sorted)
- path = (Path *) create_sort_path(root,
- partially_grouped_rel,
- path,
- root->group_pathkeys,
- -1.0);
-
- if (parse->hasAggs)
- add_path(partially_grouped_rel, (Path *)
- create_agg_path(root,
- partially_grouped_rel,
- path,
- partially_grouped_rel->reltarget,
- parse->groupClause ? AGG_SORTED : AGG_PLAIN,
- AGGSPLIT_INITIAL_SERIAL,
- parse->groupClause,
- NIL,
- agg_partial_costs,
- dNumPartialGroups));
- else
- add_path(partially_grouped_rel, (Path *)
- create_group_path(root,
- partially_grouped_rel,
- path,
- parse->groupClause,
- NIL,
- dNumPartialGroups));
- }
- }
- }
+ /* We can short-circuit the cost comparison if indexscans are disabled */
+ if (!enable_indexscan)
+ return true; /* use sort */
- if (can_sort && cheapest_partial_path != NULL)
- {
- /* Similar to above logic, but for partial paths. */
- foreach(lc, input_rel->partial_pathlist)
- {
- Path *path = (Path *) lfirst(lc);
- bool is_sorted;
+ /* Set up mostly-dummy planner state */
+ query = makeNode(Query);
+ query->commandType = CMD_SELECT;
- is_sorted = pathkeys_contained_in(root->group_pathkeys,
- path->pathkeys);
- if (path == cheapest_partial_path || is_sorted)
- {
- /* Sort the cheapest partial path, if it isn't already */
- if (!is_sorted)
- path = (Path *) create_sort_path(root,
- partially_grouped_rel,
- path,
- root->group_pathkeys,
- -1.0);
-
- if (parse->hasAggs)
- add_partial_path(partially_grouped_rel, (Path *)
- create_agg_path(root,
- partially_grouped_rel,
- path,
- partially_grouped_rel->reltarget,
- parse->groupClause ? AGG_SORTED : AGG_PLAIN,
- AGGSPLIT_INITIAL_SERIAL,
- parse->groupClause,
- NIL,
- agg_partial_costs,
- dNumPartialPartialGroups));
- else
- add_partial_path(partially_grouped_rel, (Path *)
- create_group_path(root,
- partially_grouped_rel,
- path,
- parse->groupClause,
- NIL,
- dNumPartialPartialGroups));
- }
- }
- }
+ glob = makeNode(PlannerGlobal);
- if (can_hash && cheapest_total_path != NULL)
- {
- Size hashaggtablesize;
+ root = makeNode(PlannerInfo);
+ root->parse = query;
+ root->glob = glob;
+ root->query_level = 1;
+ root->planner_cxt = CurrentMemoryContext;
+ root->wt_param_id = -1;
- /* Checked above */
- Assert(parse->hasAggs || parse->groupClause);
+ /* Build a minimal RTE for the rel */
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = tableOid;
+ rte->relkind = RELKIND_RELATION; /* Don't be too picky. */
+ rte->lateral = false;
+ rte->inh = false;
+ rte->inFromCl = true;
+ query->rtable = list_make1(rte);
- hashaggtablesize =
- estimate_hashagg_tablesize(cheapest_total_path,
- agg_partial_costs,
- dNumPartialGroups);
+ /* Set up RTE/RelOptInfo arrays */
+ setup_simple_rel_arrays(root);
- /*
- * Tentatively produce a partial HashAgg Path, depending on if it
- * looks as if the hash table will fit in work_mem.
- */
- if (hashaggtablesize < work_mem * 1024L &&
- cheapest_total_path != NULL)
- {
- add_path(partially_grouped_rel, (Path *)
- create_agg_path(root,
- partially_grouped_rel,
- cheapest_total_path,
- partially_grouped_rel->reltarget,
- AGG_HASHED,
- AGGSPLIT_INITIAL_SERIAL,
- parse->groupClause,
- NIL,
- agg_partial_costs,
- dNumPartialGroups));
- }
- }
+ /* Build RelOptInfo */
+ rel = build_simple_rel(root, 1, NULL);
- if (can_hash && cheapest_partial_path != NULL)
+ /* Locate IndexOptInfo for the target index */
+ indexInfo = NULL;
+ foreach(lc, rel->indexlist)
{
- Size hashaggtablesize;
+ indexInfo = lfirst_node(IndexOptInfo, lc);
+ if (indexInfo->indexoid == indexOid)
+ break;
+ }
+
+ /*
+ * It's possible that get_relation_info did not generate an IndexOptInfo
+ * for the desired index; this could happen if it's not yet reached its
+ * indcheckxmin usability horizon, or if it's a system index and we're
+ * ignoring system indexes. In such cases we should tell CLUSTER to not
+ * trust the index contents but use seqscan-and-sort.
+ */
+ if (lc == NULL) /* not in the list? */
+ return true; /* use sort */
- hashaggtablesize =
- estimate_hashagg_tablesize(cheapest_partial_path,
- agg_partial_costs,
- dNumPartialPartialGroups);
+ /*
+ * Rather than doing all the pushups that would be needed to use
+ * set_baserel_size_estimates, just do a quick hack for rows and width.
+ */
+ rel->rows = rel->tuples;
+ rel->reltarget->width = get_relation_data_width(tableOid, NULL);
- /* Do the same for partial paths. */
- if (hashaggtablesize < work_mem * 1024L &&
- cheapest_partial_path != NULL)
- {
- add_partial_path(partially_grouped_rel, (Path *)
- create_agg_path(root,
- partially_grouped_rel,
- cheapest_partial_path,
- partially_grouped_rel->reltarget,
- AGG_HASHED,
- AGGSPLIT_INITIAL_SERIAL,
- parse->groupClause,
- NIL,
- agg_partial_costs,
- dNumPartialPartialGroups));
- }
- }
+ root->total_table_pages = rel->pages;
/*
- * If there is an FDW that's responsible for all baserels of the query,
- * let it consider adding partially grouped ForeignPaths.
+ * Determine eval cost of the index expressions, if any. We need to
+ * charge twice that amount for each tuple comparison that happens during
+ * the sort, since tuplesort.c will have to re-evaluate the index
+ * expressions each time. (XXX that's pretty inefficient...)
*/
- if (partially_grouped_rel->fdwroutine &&
- partially_grouped_rel->fdwroutine->GetForeignUpperPaths)
- {
- FdwRoutine *fdwroutine = partially_grouped_rel->fdwroutine;
+ cost_qual_eval(&indexExprCost, indexInfo->indexprs, root);
+ comparisonCost = 2.0 * (indexExprCost.startup + indexExprCost.per_tuple);
- fdwroutine->GetForeignUpperPaths(root,
- UPPERREL_PARTIAL_GROUP_AGG,
- input_rel, partially_grouped_rel,
- extra);
- }
+ /* Estimate the cost of seq scan + sort */
+ seqScanPath = create_seqscan_path(root, rel, NULL, 0);
+ cost_sort(&seqScanAndSortPath, root, NIL,
+ seqScanPath->total_cost, rel->tuples, rel->reltarget->width,
+ comparisonCost, maintenance_work_mem, -1.0);
+
+ /* Estimate the cost of index scan */
+ indexScanPath = create_index_path(root, indexInfo,
+ NIL, NIL, NIL, NIL, NIL,
+ ForwardScanDirection, false,
+ NULL, 1.0, false);
- return partially_grouped_rel;
+ return (seqScanAndSortPath.total_cost < indexScanPath->path.total_cost);
}
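/*
 * In other words (hypothetical illustration, not part of the patch):
 *
 *     CLUSTER tab USING tab_pkey;
 *
 * compares the estimated cost of Seq Scan + Sort against a full-table
 * Index Scan in index order, and rewrites the heap with whichever
 * strategy this function reports as cheaper.
 */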
/*
- * Generate Gather and Gather Merge paths for a grouping relation or partial
- * grouping relation.
+ * plan_create_index_workers
+ * Use the planner to decide how many parallel worker processes
+ * CREATE INDEX should request for use
+ *
+ * tableOid is the table on which the index is to be built. indexOid is the
+ * OID of an index to be created or reindexed (which must be a btree index).
*
- * generate_gather_paths does most of the work, but we also consider a special
- * case: we could try sorting the data by the group_pathkeys and then applying
- * Gather Merge.
+ * Return value is the number of parallel worker processes to request. It
+ * may be unsafe to proceed if this is 0. Note that this does not include the
+ * leader participating as a worker (value is always a number of parallel
+ * worker processes).
*
- * NB: This function shouldn't be used for anything other than a grouped or
- * partially grouped relation not only because of the fact that it explicitly
- * references group_pathkeys but we pass "true" as the third argument to
- * generate_gather_paths().
+ * Note: caller had better already hold some type of lock on the table and
+ * index.
*/
-static void
-gather_grouping_paths(PlannerInfo *root, RelOptInfo *rel)
+int
+plan_create_index_workers(Oid tableOid, Oid indexOid)
{
- Path *cheapest_partial_path;
+ PlannerInfo *root;
+ Query *query;
+ PlannerGlobal *glob;
+ RangeTblEntry *rte;
+ Relation heap;
+ Relation index;
+ RelOptInfo *rel;
+ int parallel_workers;
+ BlockNumber heap_blocks;
+ double reltuples;
+ double allvisfrac;
- /* Try Gather for unordered paths and Gather Merge for ordered ones. */
- generate_gather_paths(root, rel, true);
+ /* Return immediately when parallelism disabled */
+ if (dynamic_shared_memory_type == DSM_IMPL_NONE ||
+ max_parallel_maintenance_workers == 0)
+ return 0;
- /* Try cheapest partial path + explicit Sort + Gather Merge. */
- cheapest_partial_path = linitial(rel->partial_pathlist);
- if (!pathkeys_contained_in(root->group_pathkeys,
- cheapest_partial_path->pathkeys))
- {
- Path *path;
- double total_groups;
-
- total_groups =
- cheapest_partial_path->rows * cheapest_partial_path->parallel_workers;
- path = (Path *) create_sort_path(root, rel, cheapest_partial_path,
- root->group_pathkeys,
- -1.0);
- path = (Path *)
- create_gather_merge_path(root,
- rel,
- path,
- rel->reltarget,
- root->group_pathkeys,
- NULL,
- &total_groups);
+ /* Set up largely-dummy planner state */
+ query = makeNode(Query);
+ query->commandType = CMD_SELECT;
- add_path(rel, path);
- }
-}
+ glob = makeNode(PlannerGlobal);
-/*
- * can_partial_agg
- *
- * Determines whether or not partial grouping and/or aggregation is possible.
- * Returns true when possible, false otherwise.
- */
-static bool
-can_partial_agg(PlannerInfo *root, const AggClauseCosts *agg_costs)
-{
- Query *parse = root->parse;
+ root = makeNode(PlannerInfo);
+ root->parse = query;
+ root->glob = glob;
+ root->query_level = 1;
+ root->planner_cxt = CurrentMemoryContext;
+ root->wt_param_id = -1;
- if (!parse->hasAggs && parse->groupClause == NIL)
- {
- /*
- * We don't know how to do parallel aggregation unless we have either
- * some aggregates or a grouping clause.
- */
- return false;
- }
- else if (parse->groupingSets)
+ /*
+ * Build a minimal RTE.
+ *
+ * Set the target's table to be an inheritance parent. This is a kludge
+ * that prevents problems within get_relation_info(), which does not
+ * expect that any IndexOptInfo is currently undergoing REINDEX.
+ */
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = tableOid;
+ rte->relkind = RELKIND_RELATION; /* Don't be too picky. */
+ rte->lateral = false;
+ rte->inh = true;
+ rte->inFromCl = true;
+ query->rtable = list_make1(rte);
+
+ /* Set up RTE/RelOptInfo arrays */
+ setup_simple_rel_arrays(root);
+
+ /* Build RelOptInfo */
+ rel = build_simple_rel(root, 1, NULL);
+
+ heap = heap_open(tableOid, NoLock);
+ index = index_open(indexOid, NoLock);
+
+ /*
+ * Determine if it's safe to proceed.
+ *
+ * Currently, parallel workers can't access the leader's temporary tables.
+ * Furthermore, any index predicate or index expressions must be parallel
+ * safe.
+ */
+ if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
+ !is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index)) ||
+ !is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index)))
{
- /* We don't know how to do grouping sets in parallel. */
- return false;
+ parallel_workers = 0;
+ goto done;
}
- else if (agg_costs->hasNonPartial || agg_costs->hasNonSerial)
+
+ /*
+ * If parallel_workers storage parameter is set for the table, accept that
+ * as the number of parallel worker processes to launch (though still cap
+ * at max_parallel_maintenance_workers). Note that we deliberately do not
+ * consider any other factor when parallel_workers is set. (e.g., memory
+ * use by workers.)
+ */
+ if (rel->rel_parallel_workers != -1)
{
- /* Insufficient support for partial mode. */
- return false;
+ parallel_workers = Min(rel->rel_parallel_workers,
+ max_parallel_maintenance_workers);
+ goto done;
}
- /* Everything looks good. */
- return true;
+ /*
+ * Estimate heap relation size ourselves, since rel->pages cannot be
+ * trusted (heap RTE was marked as inheritance parent)
+ */
+ estimate_rel_size(heap, NULL, &heap_blocks, &reltuples, &allvisfrac);
+
+ /*
+ * Determine number of workers to scan the heap relation using generic
+ * model
+ */
+ parallel_workers = compute_parallel_worker(rel, heap_blocks, -1,
+ max_parallel_maintenance_workers);
+
+ /*
+ * Cap workers based on available maintenance_work_mem as needed.
+ *
+ * Note that each tuplesort participant receives an even share of the
+ * total maintenance_work_mem budget. Aim to leave participants
+ * (including the leader as a participant) with no less than 32MB of
+ * memory. This leaves cases where maintenance_work_mem is set to 64MB
+ * immediately past the threshold of being capable of launching a single
+ * parallel worker to sort.
+ */
+ while (parallel_workers > 0 &&
+ maintenance_work_mem / (parallel_workers + 1) < 32768L)
+ parallel_workers--;
+
+done:
+ index_close(index, NoLock);
+ heap_close(heap, NoLock);
+
+ return parallel_workers;
}
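/*
 * Worked example of the cap above (not part of the patch): with
 *
 *     SET maintenance_work_mem = '64MB';
 *
 * one worker plus the leader gives each of the two participants
 * 65536 kB / 2 = 32768 kB, exactly at the 32MB floor, so one worker is
 * allowed; with two workers each participant would get only ~21845 kB,
 * so the loop trims the worker count back down to one.
 */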
/*
@@ -6980,208 +5235,3 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
*/
set_cheapest(rel);
}
-
-/*
- * create_partitionwise_grouping_paths
- *
- * If the partition keys of input relation are part of the GROUP BY clause, all
- * the rows belonging to a given group come from a single partition. This
- * allows aggregation/grouping over a partitioned relation to be broken down
- * into aggregation/grouping on each partition. This should be no worse, and
- * often better, than the normal approach.
- *
- * However, if the GROUP BY clause does not contain all the partition keys,
- * rows from a given group may be spread across multiple partitions. In that
- * case, we perform partial aggregation for each group, append the results,
- * and then finalize aggregation. This is less certain to win than the
- * previous case. It may win if the PartialAggregate stage greatly reduces
- * the number of groups, because fewer rows will pass through the Append node.
- * It may lose if we have lots of small groups.
- */
-static void
-create_partitionwise_grouping_paths(PlannerInfo *root,
- RelOptInfo *input_rel,
- RelOptInfo *grouped_rel,
- RelOptInfo *partially_grouped_rel,
- const AggClauseCosts *agg_costs,
- grouping_sets_data *gd,
- PartitionwiseAggregateType patype,
- GroupPathExtraData *extra)
-{
- int nparts = input_rel->nparts;
- int cnt_parts;
- List *grouped_live_children = NIL;
- List *partially_grouped_live_children = NIL;
- PathTarget *target = grouped_rel->reltarget;
-
- Assert(patype != PARTITIONWISE_AGGREGATE_NONE);
- Assert(patype != PARTITIONWISE_AGGREGATE_PARTIAL ||
- partially_grouped_rel != NULL);
-
- /* Add paths for partitionwise aggregation/grouping. */
- for (cnt_parts = 0; cnt_parts < nparts; cnt_parts++)
- {
- RelOptInfo *child_input_rel = input_rel->part_rels[cnt_parts];
- PathTarget *child_target = copy_pathtarget(target);
- AppendRelInfo **appinfos;
- int nappinfos;
- GroupPathExtraData child_extra;
- RelOptInfo *child_grouped_rel;
- RelOptInfo *child_partially_grouped_rel;
-
- /* Input child rel must have a path */
- Assert(child_input_rel->pathlist != NIL);
-
- /*
- * Copy the given "extra" structure as is and then override the
- * members specific to this child.
- */
- memcpy(&child_extra, extra, sizeof(child_extra));
-
- appinfos = find_appinfos_by_relids(root, child_input_rel->relids,
- &nappinfos);
-
- child_target->exprs = (List *)
- adjust_appendrel_attrs(root,
- (Node *) target->exprs,
- nappinfos, appinfos);
-
- /* Translate havingQual and targetList. */
- child_extra.havingQual = (Node *)
- adjust_appendrel_attrs(root,
- extra->havingQual,
- nappinfos, appinfos);
- child_extra.targetList = (List *)
- adjust_appendrel_attrs(root,
- (Node *) extra->targetList,
- nappinfos, appinfos);
-
- /*
- * extra->patype was the value computed for our parent rel; patype is
- * the value for this relation. For the child, our value is its
- * parent rel's value.
- */
- child_extra.patype = patype;
-
- /*
- * Create grouping relation to hold fully aggregated grouping and/or
- * aggregation paths for the child.
- */
- child_grouped_rel = make_grouping_rel(root, child_input_rel,
- child_target,
- extra->target_parallel_safe,
- child_extra.havingQual);
-
- /* Ignore empty children. They contribute nothing. */
- if (IS_DUMMY_REL(child_input_rel))
- {
- mark_dummy_rel(child_grouped_rel);
-
- continue;
- }
-
- /* Create grouping paths for this child relation. */
- create_ordinary_grouping_paths(root, child_input_rel,
- child_grouped_rel,
- agg_costs, gd, &child_extra,
- &child_partially_grouped_rel);
-
- if (child_partially_grouped_rel)
- {
- partially_grouped_live_children =
- lappend(partially_grouped_live_children,
- child_partially_grouped_rel);
- }
-
- if (patype == PARTITIONWISE_AGGREGATE_FULL)
- {
- set_cheapest(child_grouped_rel);
- grouped_live_children = lappend(grouped_live_children,
- child_grouped_rel);
- }
-
- pfree(appinfos);
- }
-
- /*
- * All children can't be dummy at this point. If they are, then the parent
- * too marked as dummy.
- */
- Assert(grouped_live_children != NIL ||
- partially_grouped_live_children != NIL);
-
- /*
- * Try to create append paths for partially grouped children. For full
- * partitionwise aggregation, we might have paths in the partial_pathlist
- * if parallel aggregation is possible. For partial partitionwise
- * aggregation, we may have paths in both pathlist and partial_pathlist.
- */
- if (partially_grouped_rel)
- {
- add_paths_to_append_rel(root, partially_grouped_rel,
- partially_grouped_live_children);
-
- /*
- * We need call set_cheapest, since the finalization step will use the
- * cheapest path from the rel.
- */
- if (partially_grouped_rel->pathlist)
- set_cheapest(partially_grouped_rel);
- }
-
- /* If possible, create append paths for fully grouped children. */
- if (patype == PARTITIONWISE_AGGREGATE_FULL)
- add_paths_to_append_rel(root, grouped_rel, grouped_live_children);
-}
-
-/*
- * group_by_has_partkey
- *
- * Returns true, if all the partition keys of the given relation are part of
- * the GROUP BY clauses, false otherwise.
- */
-static bool
-group_by_has_partkey(RelOptInfo *input_rel,
- List *targetList,
- List *groupClause)
-{
- List *groupexprs = get_sortgrouplist_exprs(groupClause, targetList);
- int cnt = 0;
- int partnatts;
-
- /* Input relation should be partitioned. */
- Assert(input_rel->part_scheme);
-
- /* Rule out early, if there are no partition keys present. */
- if (!input_rel->partexprs)
- return false;
-
- partnatts = input_rel->part_scheme->partnatts;
-
- for (cnt = 0; cnt < partnatts; cnt++)
- {
- List *partexprs = input_rel->partexprs[cnt];
- ListCell *lc;
- bool found = false;
-
- foreach(lc, partexprs)
- {
- Expr *partexpr = lfirst(lc);
-
- if (list_member(groupexprs, partexpr))
- {
- found = true;
- break;
- }
- }
-
- /*
- * If none of the partition key expressions match with any of the
- * GROUP BY expression, return false.
- */
- if (!found)
- return false;
- }
-
- return true;
-}
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 6bf9e84c4a..ad77416050 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -2424,4 +2424,23 @@ typedef struct JoinCostWorkspace
double inner_rows_total;
} JoinCostWorkspace;
+
+
+/*
+ * Data specific to grouping sets
+ */
+
+typedef struct
+{
+ List *rollups;
+ List *hash_sets_idx;
+ double dNumHashGroups;
+ bool any_hashable;
+ Bitmapset *unsortable_refs;
+ Bitmapset *unhashable_refs;
+ List *unsortable_sets;
+ int *tleref_to_colnum_map;
+} grouping_sets_data;
+
+
#endif /* RELATION_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index cafde307ad..89835ad259 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -120,6 +120,19 @@ extern bool have_partkey_equi_join(RelOptInfo *joinrel,
JoinType jointype, List *restrictlist);
/*
+ * aggpath.c
+ * routines to create grouping paths
+ */
+extern RelOptInfo *create_grouping_paths(PlannerInfo *root,
+ RelOptInfo *input_rel,
+ PathTarget *target,
+ bool target_parallel_safe,
+ const AggClauseCosts *agg_costs,
+ grouping_sets_data *gd);
+extern List *remap_to_groupclause_idx(List *groupClause, List *gsets,
+ int *tleref_to_colnum_map);
+
+/*
* equivclass.c
* routines for managing EquivalenceClasses
*/
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 4e61dff241..dd6e912373 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -42,6 +42,7 @@ extern RelOptInfo *query_planner(PlannerInfo *root);
*/
extern void preprocess_minmax_aggregates(PlannerInfo *root, List *tlist);
+
/*
* prototypes for plan/createplan.c
*/
diff --git a/src/include/optimizer/planner.h b/src/include/optimizer/planner.h
index c090396e13..497a8c0581 100644
--- a/src/include/optimizer/planner.h
+++ b/src/include/optimizer/planner.h
@@ -55,6 +55,8 @@ extern Path *get_cheapest_fractional_path(RelOptInfo *rel,
extern Expr *expression_planner(Expr *expr);
extern Expr *preprocess_phv_expression(PlannerInfo *root, Expr *expr);
+extern List *preprocess_groupclause(PlannerInfo *root, List *force);
+extern grouping_sets_data *preprocess_grouping_sets(PlannerInfo *root);
extern bool plan_cluster_use_sort(Oid tableOid, Oid indexOid);
extern int plan_create_index_workers(Oid tableOid, Oid indexOid);
--
2.11.0
Attachment: 0003-Move-find_em_expr_for_rel-into-the-backend.patch (text/x-patch)
From 231479b128c08b845fbc78e682e6c54e9e7e2141 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Tue, 19 Jun 2018 13:34:31 +0300
Subject: [PATCH 3/4] Move find_em_expr_for_rel() into the backend.
---
contrib/postgres_fdw/deparse.c | 1 +
contrib/postgres_fdw/postgres_fdw.c | 28 ----------------------------
contrib/postgres_fdw/postgres_fdw.h | 1 -
src/backend/optimizer/path/equivclass.c | 29 +++++++++++++++++++++++++++++
src/include/optimizer/paths.h | 1 +
5 files changed, 31 insertions(+), 29 deletions(-)
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d272719ff4..6b3178350f 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -49,6 +49,7 @@
#include "nodes/nodeFuncs.h"
#include "nodes/plannodes.h"
#include "optimizer/clauses.h"
+#include "optimizer/paths.h"
#include "optimizer/prep.h"
#include "optimizer/tlist.h"
#include "optimizer/var.h"
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 78b0f43ca8..7906ade920 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -5767,31 +5767,3 @@ conversion_error_callback(void *arg)
errcontext("column \"%s\" of foreign table \"%s\"", attname, relname);
}
}
-
-/*
- * Find an equivalence class member expression, all of whose Vars, come from
- * the indicated relation.
- */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
-{
- ListCell *lc_em;
-
- foreach(lc_em, ec->ec_members)
- {
- EquivalenceMember *em = lfirst(lc_em);
-
- if (bms_is_subset(em->em_relids, rel->relids))
- {
- /*
- * If there is more than one equivalence member whose Vars are
- * taken entirely from this relation, we'll be content to choose
- * any one of those.
- */
- return em->em_expr;
- }
- }
-
- /* We didn't find any suitable equivalence class expression */
- return NULL;
-}
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index a5d4011e8d..16965fc820 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -172,7 +172,6 @@ extern void deparseAnalyzeSizeSql(StringInfo buf, Relation rel);
extern void deparseAnalyzeSql(StringInfo buf, Relation rel,
List **retrieved_attrs);
extern void deparseStringLiteral(StringInfo buf, const char *val);
-extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root,
RelOptInfo *foreignrel, List *tlist,
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b22b36ec0e..b8a0a2e29b 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -2511,3 +2511,32 @@ is_redundant_derived_clause(RestrictInfo *rinfo, List *clauselist)
return false;
}
+
+/*
+ * find_em_expr_for_rel
+ * Find an equivalence class member expression, all of whose Vars, come
+ * from the indicated relation.
+ */
+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+ ListCell *lc_em;
+
+ foreach(lc_em, ec->ec_members)
+ {
+ EquivalenceMember *em = lfirst(lc_em);
+
+ if (bms_is_subset(em->em_relids, rel->relids))
+ {
+ /*
+ * If there is more than one equivalence member whose Vars are
+ * taken entirely from this relation, we'll be content to choose
+ * any one of those.
+ */
+ return em->em_expr;
+ }
+ }
+
+ /* We didn't find any suitable equivalence class expression */
+ return NULL;
+}
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 89835ad259..39e3e8f85c 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -188,6 +188,7 @@ extern bool eclass_useful_for_merging(PlannerInfo *root,
EquivalenceClass *eclass,
RelOptInfo *rel);
extern bool is_redundant_derived_clause(RestrictInfo *rinfo, List *clauselist);
+extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
/*
* pathkeys.c
--
2.11.0
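A quick illustration of what the moved function is used for (hypothetical
tables, not from the patches): in

    select * from s, t where s.k = t.k order by s.k;

the equivalence class {s.k, t.k} allows find_em_expr_for_rel() to return
t.k when asked for an expression computable from t alone; postgres_fdw uses
this to decide whether a requested sort order can be evaluated remotely.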
Attachment: 0004-Plan-aggregation-in-query_planner-to-allow-aggregati.patch (text/x-patch)
From 457becbf1ff49670b4608841654b84c5c75a252e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Tue, 12 Jun 2018 18:15:22 +0300
Subject: [PATCH 4/4] Plan aggregation in query_planner, to allow aggregating
below joins.
Consider performing Grouping/Aggregation below Join nodes, if it's legal
to do so. To do this, move the responsibility of planning grouping from
the "upper stages", in grouping_planner, into scan/join planning, in
query_planner().
In query_planner(), after building the RelOptInfo for a scan or join rel,
also build a grouped RelOptInfo to shadow each RelOptInfo (if aggregation
can be done at that rel). The grouped RelOptInfo is stored in a new
'grouped_rel' field in the parent RelOptInfo.
The grouped rel holds Paths where the grouping/aggregation is already
performed at that node. For a base rel, it represents performing the
aggregation on top of the scan, i.e. the Paths contain Agg(Scan). For a
grouped join rel, the paths look like Agg(Join(A, B)) or Join(Agg(A), B).
This is still a prototype, with a bunch of stuff broken or in need of
cleanup:
- Clarify what all this should mean to the FDW API. I did some hacking
around in postgres_fdw, to make it work partially, but still lots of
regression failures there. Need to figure out the details of how
an FDW should plan grouping nodes in the new world.
- Still one regression failure, in 'partition_aggregate' test.
- grouping_planner() is a bit of a misnomer now, as grouping is now planned
as part of query_planner().
---
contrib/postgres_fdw/deparse.c | 22 +-
contrib/postgres_fdw/postgres_fdw.c | 8 +-
src/backend/optimizer/README | 63 ++-
src/backend/optimizer/geqo/geqo_eval.c | 3 +
src/backend/optimizer/path/aggpath.c | 574 +++++++++++++++++++++++----
src/backend/optimizer/path/allpaths.c | 165 ++++++++
src/backend/optimizer/path/joinrels.c | 130 ++++++
src/backend/optimizer/path/pathkeys.c | 2 +-
src/backend/optimizer/plan/initsplan.c | 139 ++++++-
src/backend/optimizer/plan/planmain.c | 40 +-
src/backend/optimizer/plan/planner.c | 272 ++++---------
src/backend/optimizer/util/pathnode.c | 12 +-
src/backend/optimizer/util/relnode.c | 46 ++-
src/include/nodes/relation.h | 36 +-
src/include/optimizer/pathnode.h | 1 +
src/include/optimizer/paths.h | 24 +-
src/include/optimizer/planmain.h | 1 +
src/include/optimizer/planner.h | 8 +
src/test/regress/expected/aggregates.out | 21 +-
src/test/regress/expected/partition_join.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/sql/aggregate_pushdown.sql | 70 ++++
22 files changed, 1311 insertions(+), 332 deletions(-)
create mode 100644 src/test/regress/sql/aggregate_pushdown.sql
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 6b3178350f..f7aa8b9f0e 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -682,7 +682,7 @@ foreign_expr_walker(Node *node,
ListCell *lc;
/* Not safe to pushdown when not in grouping context */
- if (!IS_UPPER_REL(glob_cxt->foreignrel))
+ if (!IS_GROUPED_REL(glob_cxt->foreignrel))
return false;
/* Only non-split aggregates are pushable. */
@@ -882,7 +882,7 @@ build_tlist_to_deparse(RelOptInfo *foreignrel)
* For an upper relation, we have already built the target list while
* checking shippability, so just return that.
*/
- if (IS_UPPER_REL(foreignrel))
+ if (IS_GROUPED_REL(foreignrel))
return fpinfo->grouped_tlist;
/*
@@ -959,7 +959,7 @@ deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel,
* conditions of the underlying scan relation; otherwise, we can use the
* supplied list of remote conditions directly.
*/
- if (IS_UPPER_REL(rel))
+ if (IS_UPPER_REL(rel) || IS_GROUPED_REL(rel))
{
PgFdwRelationInfo *ofpinfo;
@@ -972,7 +972,7 @@ deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel,
/* Construct FROM and WHERE clauses */
deparseFromExpr(quals, &context);
- if (IS_UPPER_REL(rel))
+ if (IS_GROUPED_REL(rel))
{
/* Append GROUP BY clause */
appendGroupByClause(tlist, &context);
@@ -1029,7 +1029,7 @@ deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs,
*/
deparseSubqueryTargetList(context);
}
- else if (IS_JOIN_REL(foreignrel) || IS_UPPER_REL(foreignrel))
+ else if (IS_JOIN_REL(foreignrel) || IS_UPPER_REL(foreignrel) || IS_GROUPED_REL(foreignrel))
{
/*
* For a join or upper relation the input tlist gives the list of
@@ -1444,6 +1444,18 @@ deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
bool outerrel_is_target = false;
bool innerrel_is_target = false;
+ /*
+ * In a grouped join rel, fpinfo->outerrel points to the non-grouped
+ * parent rel, and fpinfo->innerrel is NULL. Dig out the inner and
+ * outer sides from the parent rel.
+ */
+ if (IS_GROUPED_REL(foreignrel))
+ {
+ fpinfo = (PgFdwRelationInfo *) fpinfo->outerrel->fdw_private;
+ outerrel = fpinfo->outerrel;
+ innerrel = fpinfo->innerrel;
+ }
+
if (ignore_rel > 0 && bms_is_member(ignore_rel, foreignrel->relids))
{
/*
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 7906ade920..acdbeb28df 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1242,6 +1242,8 @@ postgresGetForeignPlan(PlannerInfo *root,
* Right now, we only consider grouping and aggregation beyond
* joins. Queries involving aggregates or grouping do not require
* EPQ mechanism, hence should not have an outer plan here.
+ *
+ * FIXME: is this comment still valid?
*/
Assert(!IS_UPPER_REL(foreignrel));
@@ -2672,7 +2674,7 @@ estimate_path_cost_size(PlannerInfo *root,
&remote_param_join_conds, &local_param_join_conds);
/* Build the list of columns to be fetched from the foreign server. */
- if (IS_JOIN_REL(foreignrel) || IS_UPPER_REL(foreignrel))
+ if (IS_JOIN_REL(foreignrel) || IS_UPPER_REL(foreignrel) || IS_GROUPED_REL(foreignrel))
fdw_scan_tlist = build_tlist_to_deparse(foreignrel);
else
fdw_scan_tlist = NIL;
@@ -2753,7 +2755,7 @@ estimate_path_cost_size(PlannerInfo *root,
startup_cost = fpinfo->rel_startup_cost;
run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost;
}
- else if (IS_JOIN_REL(foreignrel))
+ else if (IS_JOIN_REL(foreignrel) && !IS_GROUPED_REL(foreignrel))
{
PgFdwRelationInfo *fpinfo_i;
PgFdwRelationInfo *fpinfo_o;
@@ -2819,7 +2821,7 @@ estimate_path_cost_size(PlannerInfo *root,
run_cost += nrows * remote_conds_cost.per_tuple;
run_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
}
- else if (IS_UPPER_REL(foreignrel))
+ else if (IS_GROUPED_REL(foreignrel))
{
PgFdwRelationInfo *ofpinfo;
PathTarget *ptarget = foreignrel->reltarget;
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 15af9ceff5..c0a4a64e37 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -313,8 +313,7 @@ set up for recursive handling of subqueries
convert Vars of outer query levels into Params
--grouping_planner()
preprocess target list for non-SELECT queries
- handle UNION/INTERSECT/EXCEPT, GROUP BY, HAVING, aggregates,
- ORDER BY, DISTINCT, LIMIT
+ handle UNION/INTERSECT/EXCEPT, ORDER BY, DISTINCT, LIMIT
--query_planner()
make list of base relations used in query
split up the qual into restrictions (a=1) and joins (b=c)
@@ -334,9 +333,10 @@ set up for recursive handling of subqueries
Back at standard_join_search(), generate gather paths if needed for
each newly constructed joinrel, then apply set_cheapest() to extract
the cheapest path for it.
+ If the query has a GROUP BY or aggregates, generate grouping paths for
+ each new joinrel where grouping is possible.
Loop back if this wasn't the top join level.
Back at grouping_planner:
- do grouping (GROUP BY) and aggregation
do window functions
make unique (DISTINCT)
do sorting (ORDER BY)
@@ -981,16 +981,57 @@ security-barrier views be flattened into the parent query, allowing more
flexibility of planning while still preserving required ordering of qual
evaluation. But that will come later.
+Grouping
+--------
+
+If the query involves GROUP BY or aggregates, grouping is considered together
+with scans and joins. The straightforward way is to perform the aggregation
+after all the joins, but sometimes it can be done earlier. For example:
+
+SELECT t1.a, count(*) FROM t1, t2 WHERE t1.a = t2.x
+GROUP BY t1.a
+
+The straightforward plan might look like this:
+
+ Aggregate
+ -> Join
+ -> Scan on t1
+ -> Scan on t2
+
+But under some circumstances, the Aggregate can be performed before the join,
+like this:
+
+ Join
+ -> Aggregate
+ -> Scan on t1
+ -> Scan on t2
+
+This transformation is only valid under certain conditions: all the GROUP BY
+expressions must be computable at the lower plan node, the join clauses must
+refer to the GROUP BY columns, and the Join above the Aggregate mustn't
+introduce any "new" rows that would need to be passed to the aggregate
+functions (an example of this last condition follows below).
+
+In the dynamic programming algorithm, when we build the join relations, we also
+consider performing Aggregation at each join relation. However, it is not always
+a win to perform the Aggregation at the lowest possible join. Performing a join
+first might eliminate a lot of rows, avoiding the cost of doing the aggregation
+for the eliminated rows. Therefore, we build both aggregated and non-aggregated
+Paths for each join relation, and choose the cheapest aggregated Path of the final
+join relation.
+
+This approach is based on the paper "Including Group-By in Query Optimization"
+by Surajit Chaudhuri and Kyuseok Shim:
+ https://pdfs.semanticscholar.org/3079/5447cec18753254edbbd7839f0afa58b2a39.pdf
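An illustration of the "no new rows" condition (hypothetical, not part of the
patch's README text): in the example above, the plan with the Aggregate below
the Join is only correct if each aggregated t1 group matches at most one t2
row, e.g. because t2.x is unique:

    CREATE TABLE t2 (x int PRIMARY KEY);

If t2.x could repeat, a group row produced by the Aggregate would be emitted
once per matching t2 row, duplicating its count(*).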
-Post scan/join planning
------------------------
+Post scan/join/group planning
+-----------------------------
-So far we have discussed only scan/join planning, that is, implementation
-of the FROM and WHERE clauses of a SQL query. But the planner must also
-determine how to deal with GROUP BY, aggregation, and other higher-level
-features of queries; and in many cases there are multiple ways to do these
+So far we have discussed only scan/join/group planning, that is, implementation
+of the FROM and WHERE clauses of a SQL query, as well as GROUP BY and
+aggregation. But the planner must also determine how to deal with higher-level
+features of queries, like window functions and ORDER BY; and in many cases
+there are multiple ways to do these
steps and thus opportunities for optimization choices. These steps, like
-scan/join planning, are handled by constructing Paths representing the
+scan/join/group planning, are handled by constructing Paths representing the
different ways to do a step, then choosing the cheapest Path.
Since all Paths require a RelOptInfo as "parent", we create RelOptInfos
@@ -1000,7 +1041,7 @@ considered useful for each step. Currently, we may create these types of
additional RelOptInfos during upper-level planning:
UPPERREL_SETOP result of UNION/INTERSECT/EXCEPT, if any
-UPPERREL_PARTIAL_GROUP_AGG result of partial grouping/aggregation, if any
+UPPERREL_PARTIAL_GROUP_AGG result of partial grouping/aggregation, if any XXX: do we still create these?
UPPERREL_GROUP_AGG result of grouping/aggregation, if any
UPPERREL_WINDOW result of window functions, if any
UPPERREL_DISTINCT result of "SELECT DISTINCT", if any
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index 3ef7d7d8aa..7724726027 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -268,6 +268,9 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
/* Create paths for partitionwise joins. */
generate_partitionwise_join_paths(root, joinrel);
+ /* Create paths for groupings. */
+ generate_grouped_join_paths(root, joinrel);
+
/*
* Except for the topmost scan/join rel, consider gathering
* partial paths. We'll do the same for the topmost scan/join
diff --git a/src/backend/optimizer/path/aggpath.c b/src/backend/optimizer/path/aggpath.c
index 618171b148..edb6f24e05 100644
--- a/src/backend/optimizer/path/aggpath.c
+++ b/src/backend/optimizer/path/aggpath.c
@@ -45,9 +45,6 @@
#include "utils/selfuncs.h"
#include "utils/syscache.h"
-static RelOptInfo *make_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
- PathTarget *target, bool target_parallel_safe,
- Node *havingQual);
static bool is_degenerate_grouping(PlannerInfo *root);
static void create_degenerate_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
@@ -264,26 +261,19 @@ estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
* Note: all Paths in input_rel are expected to return the target computed
* by make_group_input_target.
*/
-RelOptInfo *
+void
create_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel,
PathTarget *target,
bool target_parallel_safe,
const AggClauseCosts *agg_costs,
grouping_sets_data *gd)
{
Query *parse = root->parse;
- RelOptInfo *grouped_rel;
RelOptInfo *partially_grouped_rel;
/*
- * Create grouping relation to hold fully aggregated grouping and/or
- * aggregation paths.
- */
- grouped_rel = make_grouping_rel(root, input_rel, target,
- target_parallel_safe, parse->havingQual);
-
- /*
* Create either paths for a degenerate grouping or paths for ordinary
* grouping, as appropriate.
*/
@@ -363,61 +353,6 @@ create_grouping_paths(PlannerInfo *root,
}
set_cheapest(grouped_rel);
- return grouped_rel;
-}
-
-/*
- * make_grouping_rel
- *
- * Create a new grouping rel and set basic properties.
- *
- * input_rel represents the underlying scan/join relation.
- * target is the output expected from the grouping relation.
- */
-static RelOptInfo *
-make_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
- PathTarget *target, bool target_parallel_safe,
- Node *havingQual)
-{
- RelOptInfo *grouped_rel;
-
- if (IS_OTHER_REL(input_rel))
- {
- grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG,
- input_rel->relids);
- grouped_rel->reloptkind = RELOPT_OTHER_UPPER_REL;
- }
- else
- {
- /*
- * By tradition, the relids set for the main grouping relation is
- * NULL. (This could be changed, but might require adjustments
- * elsewhere.)
- */
- grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL);
- }
-
- /* Set target. */
- grouped_rel->reltarget = target;
-
- /*
- * If the input relation is not parallel-safe, then the grouped relation
- * can't be parallel-safe, either. Otherwise, it's parallel-safe if the
- * target list and HAVING quals are parallel-safe.
- */
- if (input_rel->consider_parallel && target_parallel_safe &&
- is_parallel_safe(root, (Node *) havingQual))
- grouped_rel->consider_parallel = true;
-
- /*
- * If the input rel belongs to a single FDW, so does the grouped rel.
- */
- grouped_rel->serverid = input_rel->serverid;
- grouped_rel->userid = input_rel->userid;
- grouped_rel->useridiscurrent = input_rel->useridiscurrent;
- grouped_rel->fdwroutine = input_rel->fdwroutine;
-
- return grouped_rel;
}
/*
@@ -1040,6 +975,8 @@ create_partial_grouping_paths(PlannerInfo *root,
bool force_rel_creation)
{
Query *parse = root->parse;
+ PathTarget *partially_grouped_target;
+ bool partially_grouped_target_parallel_safe;
RelOptInfo *partially_grouped_rel;
AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
@@ -1081,12 +1018,25 @@ create_partial_grouping_paths(PlannerInfo *root,
return NULL;
/*
- * Build a new upper relation to represent the result of partially
+ * Build target list for partial aggregate paths. These paths cannot just
+ * emit the same tlist as regular aggregate paths, because (1) we must
+ * include Vars and Aggrefs needed in HAVING, which might not appear in
+ * the result tlist, and (2) the Aggrefs must be set in partial mode.
+ */
+ partially_grouped_target =
+ make_partial_grouping_target(root, grouped_rel->reltarget,
+ extra->havingQual);
+ partially_grouped_target_parallel_safe =
+ is_parallel_safe(root, (Node *) partially_grouped_target->exprs);
+
+ /*
+ * Build a new grouped relation to represent the result of partially
* aggregating the rows from the input relation.
*/
- partially_grouped_rel = fetch_upper_rel(root,
- UPPERREL_PARTIAL_GROUP_AGG,
- grouped_rel->relids);
+ partially_grouped_rel = make_grouping_rel(root, input_rel,
+ partially_grouped_target,
+ partially_grouped_target_parallel_safe,
+ root->parse->havingQual);
partially_grouped_rel->consider_parallel =
grouped_rel->consider_parallel;
partially_grouped_rel->reloptkind = grouped_rel->reloptkind;
@@ -1095,16 +1045,6 @@ create_partial_grouping_paths(PlannerInfo *root,
partially_grouped_rel->useridiscurrent = grouped_rel->useridiscurrent;
partially_grouped_rel->fdwroutine = grouped_rel->fdwroutine;
- /*
- * Build target list for partial aggregate paths. These paths cannot just
- * emit the same tlist as regular aggregate paths, because (1) we must
- * include Vars and Aggrefs needed in HAVING, which might not appear in
- * the result tlist, and (2) the Aggrefs must be set in partial mode.
- */
- partially_grouped_rel->reltarget =
- make_partial_grouping_target(root, grouped_rel->reltarget,
- extra->havingQual);
-
if (!extra->partial_costs_set)
{
/*
@@ -1454,10 +1394,12 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
* Create grouping relation to hold fully aggregated grouping and/or
* aggregation paths for the child.
*/
+ Assert(!child_input_rel->grouped_rel);
child_grouped_rel = make_grouping_rel(root, child_input_rel,
child_target,
extra->target_parallel_safe,
child_extra.havingQual);
+ child_input_rel->grouped_rel = child_grouped_rel;
/* Ignore empty children. They contribute nothing. */
if (IS_DUMMY_REL(child_input_rel))
@@ -1965,3 +1907,473 @@ remap_to_groupclause_idx(List *groupClause,
return result;
}
+
+/*
+ * make_group_input_target
+ * Generate appropriate PathTarget for initial input to grouping nodes.
+ *
+ * If there is grouping or aggregation, the scan/join subplan cannot emit
+ * the query's final targetlist; for example, it certainly can't emit any
+ * aggregate function calls. This routine generates the correct target
+ * for the scan/join subplan.
+ *
+ * The input must contain:
+ * - GROUP BY expressions.
+ * - inputs to Aggregates
+ * - other Vars needed in the final target list. (i.e. columns that
+ * are functionally dependent on the GROUP BY expressions, and
+ * therefore didn't need to be listed in GROUP BY.)
+ *
+ * Compared to the scan/join input's target list, this must
+ * - Add GROUP BY expressions
+ * - Remove aggregated columns, i.e. columns that are not directly listed
+ * in GROUP BY.
+ *
+ * The query target list passed from the parser already contains entries
+ * for all ORDER BY and GROUP BY expressions, but it will not have entries
+ * for variables used only in HAVING clauses; so we need to add those
+ * variables to the subplan target list. Also, we flatten all expressions
+ * except GROUP BY items into their component variables; other expressions
+ * will be computed by the upper plan nodes rather than by the subplan.
+ * For example, given a query like
+ * SELECT a+b,SUM(c+d) FROM table GROUP BY a+b;
+ * we want to pass this targetlist to the subplan:
+ * a+b,c,d
+ * where the a+b target will be used by the Sort/Group steps, and the
+ * other targets will be used for computing the final results.
+ *
+ * 'root->final_target' is the query's final target list (in PathTarget form)
+ */
+PathTarget *
+make_group_input_target(PlannerInfo *root, RelOptInfo *rel)
+{
+ Query *parse = root->parse;
+ PathTarget *final_target = root->final_target;
+ PathTarget *input_target;
+ List *non_group_vars;
+ ListCell *lc;
+ ListCell *lec;
+ ListCell *lsortref;
+
+ input_target = create_empty_pathtarget();
+
+ /*
+ * Begin by adding all grouping columns.
+ *
+ * If the grouping is pushed down below a join, we might not have the
+ * original expression listed in the GROUP BY available, but we can use
+ * another expression that's known to be equal, because of a qual.
+ * Hence search the equivalence classes, rather than parse->groupClause
+ * directly.
+ */
+ forboth(lec, root->group_ecs, lsortref, root->group_sortrefs)
+ {
+ EquivalenceClass *eclass = lfirst_node(EquivalenceClass, lec);
+ int sgref = lfirst_int(lsortref);
+ Expr *expr;
+
+ if (eclass)
+ {
+ expr = find_em_expr_for_rel(eclass, rel);
+ if (!expr)
+ elog(ERROR, "could not find equivalence class member for given relations");
+ }
+ else
+ {
+ expr = get_sortgroupref_tle(sgref, root->processed_tlist)->expr;
+ }
+
+ add_column_to_pathtarget(input_target, expr, sgref);
+ }
+
+ /*
+ * Pull out all the Vars mentioned in non-group cols, and
+ * add them to the input target if not already present. (A Var used
+ * directly as a GROUP BY item will be present already.) Note this
+ * includes Vars used in resjunk items, so we are covering the needs of
+ * ORDER BY and window specifications. Vars used within Aggrefs and
+ * WindowFuncs will be pulled out here, too.
+ */
+ non_group_vars = pull_var_clause((Node *) final_target->exprs,
+ PVC_RECURSE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_INCLUDE_PLACEHOLDERS);
+
+ /* Only add columns that we have available here. */
+ foreach (lc, non_group_vars)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+
+ if (bms_is_subset(pull_varnos((Node *) expr), rel->relids))
+ add_new_column_to_pathtarget(input_target, expr);
+ }
+ list_free(non_group_vars);
+
+ /*
+ * If there's a HAVING clause, we'll need the Vars it uses, too.
+ *
+ * All the vars referenced in HAVING should be available at this
+ * scan/join rel.
+ */
+ if (parse->havingQual)
+ {
+ non_group_vars = pull_var_clause(parse->havingQual,
+ PVC_RECURSE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_INCLUDE_PLACEHOLDERS);
+ add_new_columns_to_pathtarget(input_target, non_group_vars);
+ list_free(non_group_vars);
+ }
+
+ /* XXX this causes some redundant cost calculation ... */
+ return set_pathtarget_cost_width(root, input_target);
+}
+
+/*
+ * make_grouping_target
+ * Generate appropriate PathTarget for the output of an Agg/Group node.
+ *
+ * FIXME: this function is quite a mess and needs a rewrite. I think it
+ * would make sense to merge this with make_group_input_target(), so that
+ * both target lists are created together; there is some duplicate work
+ * between them. Or can we move some of this processing to be done earlier,
+ * in process_targetlist()?
+ */
+PathTarget *
+make_grouping_target(PlannerInfo *root, RelOptInfo *rel,
+ PathTarget *input_target, PathTarget *final_target)
+{
+ /*
+ * The output target must contain:
+ * - grouping columns
+ * - aggregates
+ * - columns needed for joins above this node
+ */
+ Query *parse = root->parse;
+ PathTarget *grouping_target;
+ List *non_group_cols;
+ List *non_group_vars;
+ int i;
+ ListCell *lc;
+ Relids relids;
+ Bitmapset *group_col_sortrefs = NULL;
+
+ /*
+ * We must build a target containing all grouping columns, plus any other
+ * Vars mentioned in the query's targetlist. We can ignore HAVING here;
+ * it's evaluated at the Agg itself, and doesn't need to be propagated
+ * above it.
+ */
+ grouping_target = create_empty_pathtarget();
+ non_group_cols = NIL;
+
+ /*
+ * First, add all GROUP BY columns that appear in the final target list.
+ * Everything else is set aside in non_group_cols, for the later
+ * pull_var_clause() pass that also picks up the Aggrefs.
+ */
+ i = 0;
+ foreach(lc, final_target->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Index sgref = get_pathtarget_sortgroupref(final_target, i);
+
+ if (bms_is_subset(pull_varnos((Node *) expr), rel->relids))
+ {
+ if (sgref && parse->groupClause &&
+ get_sortgroupref_clause_noerr(sgref, parse->groupClause) != NULL)
+ {
+ /*
+ * It's a grouping column, so add it to the input target as-is.
+ */
+ group_col_sortrefs = bms_add_member(group_col_sortrefs, sgref);
+ add_column_to_pathtarget(grouping_target, expr, sgref);
+ }
+ else
+ {
+ /*
+ * Non-grouping column, so just remember the expression for later
+ * call to pull_var_clause.
+ */
+ non_group_cols = lappend(non_group_cols, expr);
+ }
+ }
+
+ i++;
+ }
+
+ /* attrs_needed refers to parent relids and not those of a child. */
+ if (rel->top_parent_relids)
+ relids = rel->top_parent_relids;
+ else
+ relids = rel->relids;
+
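+ /*
+ * Treat NEEDED_IN_GROUPING as satisfied at this node, so that Vars that
+ * were needed only as input to the aggregates are not propagated above
+ * the Agg node.
+ */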
+ relids = bms_add_member(bms_copy(relids), NEEDED_IN_GROUPING);
+
+ i = 0;
+ foreach(lc, input_target->exprs)
+ {
+ Var *var = (Var *) lfirst(lc);
+ RelOptInfo *baserel;
+ int ndx;
+ Index sgref = get_pathtarget_sortgroupref(input_target, i);
+
+ /* this is similar to build_joinrel_tlist. */
+
+ if (sgref && bms_is_member(sgref, group_col_sortrefs))
+ {
+ i++;
+ continue;
+ }
+
+ /*
+ * Ignore PlaceHolderVars in the input tlists; we'll make our own
+ * decisions about whether to copy them.
+ */
+ if (IsA(var, PlaceHolderVar))
+ {
+ i++;
+ continue;
+ }
+
+ /*
+ * Otherwise, anything in a baserel or joinrel targetlist ought to be
+ * a Var. Children of a partitioned table may have ConvertRowtypeExpr
+ * translating whole-row Var of a child to that of the parent.
+ * Children of an inherited table or subquery child rels cannot directly
+ * participate in a join, so we don't expect other kinds of nodes here.
+ */
+ if (IsA(var, Var))
+ {
+ baserel = find_base_rel(root, var->varno);
+ ndx = var->varattno - baserel->min_attr;
+ }
+ else if (IsA(var, ConvertRowtypeExpr))
+ {
+ ConvertRowtypeExpr *child_expr = (ConvertRowtypeExpr *) var;
+ Var *childvar = (Var *) child_expr->arg;
+
+ /*
+ * Child's whole-row references are converted to look like those
+ * of parent using ConvertRowtypeExpr. There can be as many
+ * ConvertRowtypeExpr decorations as the depth of partition tree.
+ * The argument to the deepest ConvertRowtypeExpr is expected to
+ * be a whole-row reference of the child.
+ */
+ while (IsA(childvar, ConvertRowtypeExpr))
+ {
+ child_expr = (ConvertRowtypeExpr *) childvar;
+ childvar = (Var *) child_expr->arg;
+ }
+ Assert(IsA(childvar, Var) && childvar->varattno == 0);
+
+ baserel = find_base_rel(root, childvar->varno);
+ ndx = 0 - baserel->min_attr;
+ }
+ else
+ {
+ /*
+ * If this rel is above grouping, then we can have Aggrefs
+ * and grouping column expressions in the target list. Carry
+ * them up to the join rel. They will surely be needed at
+ * the top of the join tree. (Unless they're only used in
+ * HAVING?)
+ */
+#if 0
+ elog(ERROR, "unexpected node type in rel targetlist: %d",
+ (int) nodeTag(var));
+#endif
+ baserel = NULL;
+ }
+
+ /* Is the target expression still needed above this joinrel? */
+ if (baserel == NULL || bms_nonempty_difference(baserel->attr_needed[ndx], relids))
+ {
+ /* Yup, add it to the output */
+ add_column_to_pathtarget(grouping_target, (Expr *) var, sgref);
+ }
+ i++;
+ }
+
+ /*
+ * Pull out all the Vars mentioned in non-group cols, and
+ * add them to the grouping target if not already present. (A Var used
+ * directly as a GROUP BY item will be present already.) Note this
+ * includes Vars used in resjunk items, so we are covering the needs of
+ * ORDER BY and window specifications. Vars used within
+ * WindowFuncs will be pulled out here, too. Aggrefs will be included
+ * as is.
+ */
+ non_group_vars = pull_var_clause((Node *) non_group_cols,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_INCLUDE_PLACEHOLDERS);
+ foreach (lc, non_group_vars)
+ {
+ Node *n = lfirst(lc);
+
+ if (IsA(n, Aggref))
+ add_new_column_to_pathtarget(grouping_target, (Expr *) n);
+ }
+
+ add_new_columns_to_pathtarget(grouping_target, non_group_vars);
+
+ /* XXX: Add anything needed to evaluate Aggs here, i.e. agg arguments */
+
+ /* clean up cruft */
+ list_free(non_group_vars);
+ list_free(non_group_cols);
+
+ /* XXX this causes some redundant cost calculation ... */
+ return set_pathtarget_cost_width(root, grouping_target);
+}
+
+/*
+ * Is the query's GROUP BY computable at the given relation?
+ *
+ * From "Including Group-By in Query Optimization" paper:
+ *
+ * Definition 3.1: A node n of a given left-deep tree has the invariant
+ * grouping property if the following conditions are true:
+ *
+ * 1. Every aggregating column of the query is a candidate aggregating
+ * column of n.
+ *
+ * 2. Every join column of n is also a grouping column of the query.
+ *
+ * 3. For every join-node that is an ancestor of n, the join is an
+ * equijoin predicate on a foreign key column of n.
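+ *
+ * Below, condition 1 corresponds to the root->agg_relids check (plus
+ * requiring that every grouping column is computable at this rel),
+ * condition 2 to checking the implied join quals against the grouping
+ * equivalence classes, and condition 3 to the innerrel_is_unique() test.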
+ */
+bool
+is_grouping_computable_at_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+ ListCell *lc;
+ ListCell *lec;
+ Relids other_relids;
+ int x;
+
+ /*
+ * If this is the final, top, join node, then surely the grouping
+ * can be done here.
+ */
+ if (bms_is_subset(root->all_baserels, rel->relids))
+ return true;
+
+ /*
+ * Currently, give up on SRFs in target list. It gets too complicated to
+ * evaluate them in the middle of the join tree. (Note that we check for
+ * this after checking if this is the final rel, so we still produce
+ * grouping plans with SRFs at the top.)
+ *
+ * FIXME: I'm not sure there's any fundamental reason this wouldn't work.
+ */
+ if (root->parse->hasTargetSRFs)
+ return false;
+
+ /*
+ * 1. Every aggregating column of the query is a candidate aggregating
+ * column of n.
+ *
+ * What this means is that we must be able to compute the aggregates
+ * at this relation. For example, "AVG(tbl.col)" can only be computed
+ * if 'tbl' is part of this join relation.
+ */
+ if (!bms_is_subset(root->agg_relids, rel->relids))
+ return false;
+
+ /*
+ * We must also be able to compute each grouping column here.
+ */
+ foreach (lec, root->group_ecs)
+ {
+ EquivalenceClass *ec = lfirst_node(EquivalenceClass, lec);
+
+ if (!find_em_expr_for_rel(ec, rel))
+ return false;
+ }
+
+ /*
+ * Non-equijoins can only be evaluated correctly before grouping.
+ *
+ * XXX: A parameterized path, for use in the inner side of a nested
+ * loop join, where all the vars are available as Params, would be
+ * acceptable, though.
+ */
+ foreach (lc, rel->joininfo)
+ {
+ RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc);
+
+ if (!bms_is_subset(rinfo->required_relids, rel->relids))
+ return false;
+ }
+
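+ /*
+ * Check conditions 2 and 3 against each relation that is not part of
+ * this rel: every one of them will be joined to us somewhere above the
+ * Aggregate.
+ */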
+ x = -1;
+ other_relids = bms_difference(root->all_baserels, rel->relids);
+ while ((x = bms_next_member(other_relids, x)) >= 0)
+ {
+ List *joinquals;
+ Relids joinrelids;
+ Relids outer_relids;
+ RelOptInfo *other_rel;
+
+ other_rel = find_base_rel(root, x);
+
+ outer_relids = bms_make_singleton(x);
+ joinrelids = bms_add_members(bms_make_singleton(x), rel->relids);
+
+ joinquals = generate_join_implied_equalities(root,
+ joinrelids,
+ outer_relids,
+ other_rel);
+
+ /*
+ * Check condition 2: the join column must be in GROUP BY.
+ */
+ foreach(lc, joinquals)
+ {
+ RestrictInfo *joinqual = lfirst_node(RestrictInfo, lc);
+
+ if (!joinqual->can_join)
+ {
+ /* Not a joinable binary opclause */
+ return false;
+ }
+
+ foreach (lec, root->group_ecs)
+ {
+ EquivalenceClass *ec = lfirst_node(EquivalenceClass, lec);
+
+ /* XXX: are left_ec/right_ec guaranteed to be valid here? */
+ if (ec == joinqual->left_ec ||
+ ec == joinqual->right_ec)
+ {
+ break;
+ }
+ }
+ if (lec == NULL)
+ {
+ /* This join qual was not in GROUP BY */
+ return false;
+ }
+ }
+
+ /*
+ * Check condition 3: performing the remaining joins on top of this
+ * mustn't "add" any more rows.
+ */
+ if (!innerrel_is_unique(root,
+ joinrelids, /* joinrelids */
+ rel->relids, /* outerrelids */
+ other_rel, /* innerrel */
+ JOIN_INNER, /* FIXME */
+ joinquals,
+ false)) /* force_cache */
+ {
+ return false;
+ }
+ }
+
+ return true;
+}
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 3ada379f8b..74765d0d53 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -36,6 +36,7 @@
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/plancat.h"
+#include "optimizer/planmain.h"
#include "optimizer/planner.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
@@ -72,6 +73,7 @@ join_search_hook_type join_search_hook = NULL;
static void set_base_rel_consider_startup(PlannerInfo *root);
static void set_base_rel_sizes(PlannerInfo *root);
static void set_base_rel_pathlists(PlannerInfo *root);
+static void set_base_rel_groupings(PlannerInfo *root);
static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
Index rti, RangeTblEntry *rte);
static void set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
@@ -180,6 +182,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
set_base_rel_pathlists(root);
/*
+ * Generate paths that implement grouping directly on top of base rels.
+ */
+ set_base_rel_groupings(root);
+
+ /*
* Generate access paths for the entire join tree.
*/
rel = make_rel_from_joinlist(root, joinlist);
@@ -312,6 +319,33 @@ set_base_rel_pathlists(PlannerInfo *root)
}
/*
+ * set_base_rel_groupings
+ * Generate paths to perform grouping for each base relation.
+ */
+static void
+set_base_rel_groupings(PlannerInfo *root)
+{
+ Index rti;
+
+ for (rti = 1; rti < root->simple_rel_array_size; rti++)
+ {
+ RelOptInfo *rel = root->simple_rel_array[rti];
+
+ /* there may be empty slots corresponding to non-baserel RTEs */
+ if (rel == NULL)
+ continue;
+
+ Assert(rel->relid == rti); /* sanity check on array */
+
+ /* ignore RTEs that are "other rels" */
+ if (rel->reloptkind != RELOPT_BASEREL)
+ continue;
+
+ set_rel_grouping(root, rel);
+ }
+}
+
+/*
* set_rel_size
* Set size estimates for a base relation
*/
@@ -2679,6 +2713,106 @@ make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
}
/*
+ * set_rel_grouping
+ * Generate grouping paths for a relation
+ */
+void
+set_rel_grouping(PlannerInfo *root, RelOptInfo *rel)
+{
+ PathTarget *final_target;
+ PathTarget *scanjoin_target;
+ List *scanjoin_targets;
+ List *scanjoin_targets_contain_srfs;
+ bool scanjoin_target_parallel_safe;
+ bool scanjoin_target_same_exprs;
+ PathTarget *grouping_target;
+ bool grouping_target_parallel_safe;
+
+ /* If there's no GROUP BY or aggregates, nothing to do. */
+ if (!root->have_grouping)
+ return;
+
+ /*
+ * For testing: if enabled, this always does the grouping at the lowest
+ * available level, i.e. if grouped paths were already created at a lower
+ * level, we just keep those and don't consider grouping here again.
+ */
+#if 0
+ if (rel->grouped_rel)
+ return;
+#endif
+
+ /*
+ * Can we apply grouping at this rel?
+ */
+ if (!is_grouping_computable_at_rel(root, rel))
+ return;
+
+ /*
+ * Construct a target list for the scan/join below the grouping,
+ * as well as for the grouping rel itself.
+ */
+ final_target = create_pathtarget(root, root->processed_tlist);
+ scanjoin_target = make_group_input_target(root, rel);
+ scanjoin_target_parallel_safe =
+ is_parallel_safe(root, (Node *) scanjoin_target->exprs);
+
+ grouping_target = make_grouping_target(root, rel, scanjoin_target, final_target);
+ grouping_target_parallel_safe =
+ is_parallel_safe(root, (Node *) grouping_target->exprs);
+
+ if (root->parse->hasTargetSRFs)
+ {
+ /* scanjoin_target will not have any SRFs precomputed for it */
+ split_pathtarget_at_srfs(root, scanjoin_target, NULL,
+ &scanjoin_targets,
+ &scanjoin_targets_contain_srfs);
+ scanjoin_target = linitial_node(PathTarget, scanjoin_targets);
+ Assert(!linitial_int(scanjoin_targets_contain_srfs));
+ }
+ else
+ {
+ scanjoin_targets = list_make1(scanjoin_target);
+ scanjoin_targets_contain_srfs = NIL;
+ }
+
+ /* XXX: this changes the target list of the non-grouped paths. Is that bad? */
+ scanjoin_target_same_exprs = list_length(scanjoin_targets) == 1
+ && equal(scanjoin_target->exprs, rel->reltarget->exprs);
+ apply_scanjoin_target_to_paths(root,
+ rel,
+ scanjoin_targets,
+ scanjoin_targets_contain_srfs,
+ scanjoin_target_parallel_safe,
+ scanjoin_target_same_exprs);
+
+ /*
+ * Create grouping relation to hold fully aggregated grouping and/or
+ * aggregation paths.
+ */
+ if (!rel->grouped_rel)
+ {
+ rel->grouped_rel = make_grouping_rel(root, rel, grouping_target,
+ grouping_target_parallel_safe,
+ root->parse->havingQual);
+ }
+
+ /* We can apply grouping here. Construct grouped Paths on top of this rel. */
+ create_grouping_paths(root,
+ rel,
+ rel->grouped_rel,
+ grouping_target,
+ grouping_target_parallel_safe,
+ root->agg_costs,
+ root->gd);
+
+#if 0
+ elog(NOTICE, "input: %s", nodeToString(scanjoin_target));
+ elog(NOTICE, "group: %s", nodeToString(grouping_target));
+#endif
+}
+
+/*
* standard_join_search
* Find possible joinpaths for a query by successively finding ways
* to join component relations into join relations.
@@ -2761,6 +2895,9 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
/* Create paths for partitionwise joins. */
generate_partitionwise_join_paths(root, rel);
+ /* Create paths for groupings. */
+ generate_grouped_join_paths(root, rel);
+
/*
* Except for the topmost scan/join rel, consider gathering
* partial paths. We'll do the same for the topmost scan/join rel
@@ -3585,6 +3722,27 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
}
+/*
+ * generate_grouped_join_paths
+ * Create paths representing GROUP BY / aggregation over join rel.
+ */
+void
+generate_grouped_join_paths(PlannerInfo *root, RelOptInfo *rel)
+{
+ if (!root->have_grouping)
+ return;
+
+ /* Handle only join relations here. */
+ if (!IS_JOIN_REL(rel))
+ return;
+
+ /* Guard against stack overflow. */
+ check_stack_depth();
+
+ set_rel_grouping(root, rel);
+}
+
+
/*****************************************************************************
* DEBUG SUPPORT
*****************************************************************************/
@@ -3880,6 +4038,13 @@ debug_print_rel(PlannerInfo *root, RelOptInfo *rel)
print_path(root, rel->cheapest_total_path, 1);
}
printf("\n");
+
+ if (rel->grouped_rel)
+ {
+ printf("GROUPED ");
+ debug_print_rel(root, rel->grouped_rel);
+ }
+
fflush(stdout);
}
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 7008e1318e..ae87ba487f 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -15,11 +15,13 @@
#include "postgres.h"
#include "miscadmin.h"
+#include "foreign/fdwapi.h"
#include "optimizer/clauses.h"
#include "optimizer/joininfo.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/prep.h"
+#include "optimizer/tlist.h"
#include "partitioning/partbounds.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -46,6 +48,8 @@ static void try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1,
List *parent_restrictlist);
static int match_expr_to_partition_keys(Expr *expr, RelOptInfo *rel,
bool strict_op);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2, RelOptInfo *joinrel,
+ SpecialJoinInfo *sjinfo, List *restrictlist);
/*
@@ -742,6 +746,17 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
restrictlist);
+ /*
+ * If we have grouped paths for either side of the join, create a
+ * grouped join relation. (Paths where the grouping is done at this join
+ * relation itself are considered later, in generate_grouped_join_paths,
+ * after building partition-wise join paths.)
+ */
+ if (rel1->grouped_rel || rel2->grouped_rel)
+ {
+ make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo, restrictlist);
+ }
+
bms_free(joinrelids);
return joinrel;
@@ -908,6 +923,121 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
try_partitionwise_join(root, rel1, rel2, joinrel, sjinfo, restrictlist);
}
+/*
+ * make_grouping_rel
+ *
+ * Create a new grouping rel and set basic properties.
+ *
+ * input_rel represents the underlying scan/join relation.
+ * target is the output expected from the grouping relation.
+ */
+RelOptInfo *
+make_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
+ PathTarget *target, bool target_parallel_safe,
+ Node *havingQual)
+{
+ RelOptInfo *grouped_rel;
+
+ grouped_rel = build_group_rel(root, input_rel);
+
+ /* Set target. */
+ grouped_rel->reltarget = target;
+
+ /*
+ * If the input relation is not parallel-safe, then the grouped relation
+ * can't be parallel-safe, either. Otherwise, it's parallel-safe if the
+ * target list and HAVING quals are parallel-safe.
+ */
+ if (input_rel->consider_parallel && target_parallel_safe &&
+ is_parallel_safe(root, (Node *) havingQual))
+ grouped_rel->consider_parallel = true;
+
+ /*
+ * If the input rel belongs to a single FDW, so does the grouped rel.
+ */
+ grouped_rel->serverid = input_rel->serverid;
+ grouped_rel->userid = input_rel->userid;
+ grouped_rel->useridiscurrent = input_rel->useridiscurrent;
+ grouped_rel->fdwroutine = input_rel->fdwroutine;
+
+ /* XXX: what other fields might be needed? */
+ grouped_rel->relid = input_rel->relid;
+
+ return grouped_rel;
+}
+
+/*
+ * make_grouped_join_rel
+ *
+ * Create grouped paths for a join rel, where one side of the join already
+ * has grouped paths.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2, RelOptInfo *joinrel,
+ SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+ RelOptInfo *grouped_rel;
+
+ /*
+ * At least one side of the join should have grouped paths, or we have
+ * nothing to do.
+ */
+ Assert(rel1->grouped_rel || rel2->grouped_rel);
+
+ /*
+ * Also build join rels for grouped children.
+ *
+ * The paths for grouping at this join rel are generated later, see
+ * generate_grouped_join_paths.
+ */
+ Assert(!joinrel->grouped_rel);
+ grouped_rel = build_group_rel(root, joinrel);
+ joinrel->grouped_rel = grouped_rel;
+
+ /*
+ * XXX: How many of the fields in RelOptInfo should we copy from the
+ * parent joinrel? Perhaps we should call build_join_rel() here instead?
+ * Or do it in build_group_rel()?
+ */
+
+ /*
+ * FIXME: the row estimate here surely isn't right, as the aggregation
+ * below this join should have eliminated some rows. In fact, a more
+ * sensible estimate is pretty important here: without one we don't "see"
+ * the benefit of doing the aggregation at the lower level, and are
+ * unlikely to choose this plan. The main benefit of aggregating early is
+ * precisely that it reduces the number of rows that need to be joined.
+ */
+ grouped_rel->rows = joinrel->rows;
+
+ /*
+ * It's possible for *both* sides of a join to have grouped paths.
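+ * In that case, we add paths of both the Join(Agg(rel1), rel2) shape and
+ * the Join(rel1, Agg(rel2)) shape to the same grouped join rel.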
+ */
+ if (rel1->grouped_rel)
+ {
+ add_new_columns_to_pathtarget(grouped_rel->reltarget,
+ rel1->grouped_rel->reltarget->exprs);
+ add_new_columns_to_pathtarget(grouped_rel->reltarget,
+ rel2->reltarget->exprs);
+ populate_joinrel_with_paths(root,
+ rel1->grouped_rel,
+ rel2,
+ grouped_rel,
+ sjinfo,
+ restrictlist);
+ }
+ if (rel2->grouped_rel)
+ {
+ add_new_columns_to_pathtarget(grouped_rel->reltarget,
+ rel1->reltarget->exprs);
+ add_new_columns_to_pathtarget(grouped_rel->reltarget,
+ rel2->grouped_rel->reltarget->exprs);
+ populate_joinrel_with_paths(root,
+ rel1,
+ rel2->grouped_rel,
+ grouped_rel,
+ sjinfo,
+ restrictlist);
+ }
+}
+
+
/*
* have_join_order_restriction
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index ec66cb9c3c..7e919ac63f 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -225,7 +225,7 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
* This should eventually go away, but we need to restructure SortGroupClause
* first.
*/
-static PathKey *
+PathKey *
make_pathkey_from_sortop(PlannerInfo *root,
Expr *expr,
Relids nullable_relids,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 01335db511..c9a7bef603 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -27,6 +27,7 @@
#include "optimizer/planner.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+#include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "parser/analyze.h"
#include "rewrite/rewriteManip.h"
@@ -144,20 +145,57 @@ add_base_rels_to_query(PlannerInfo *root, Node *jtnode)
* Add targetlist entries for each var needed in the query's final tlist
* (and HAVING clause, if any) to the appropriate base relations.
*
- * We mark such vars as needed by "relation 0" to ensure that they will
- * propagate up through all join plan steps.
+ * Vars that are needed in the final target list are marked as needed by
+ * "relation 0", to ensure that they will propagate up through all join and
+ * grouping plan steps. Vars that are needed in aggregate functions are
+ * marked as needed by the NEEDED_IN_GROUPING magic relation.
*/
void
build_base_rel_tlists(PlannerInfo *root, List *final_tlist)
{
- List *tlist_vars = pull_var_clause((Node *) final_tlist,
- PVC_RECURSE_AGGREGATES |
- PVC_RECURSE_WINDOWFUNCS |
- PVC_INCLUDE_PLACEHOLDERS);
+ List *tlist_vars;
+ ListCell *lc;
+ ListCell *lc2;
- if (tlist_vars != NIL)
+ foreach (lc, final_tlist)
{
- add_vars_to_targetlist(root, tlist_vars, bms_make_singleton(0), true);
+ TargetEntry *tle = lfirst_node(TargetEntry, lc);
+
+ if (tle->resjunk)
+ {
+ tlist_vars = pull_var_clause((Node *) tle->expr,
+ PVC_RECURSE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_INCLUDE_PLACEHOLDERS);
+ add_vars_to_targetlist(root, tlist_vars,
+ bms_make_singleton(NEEDED_IN_GROUPING), true);
+ }
+ else
+ {
+ tlist_vars = pull_var_clause((Node *) tle->expr,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_INCLUDE_PLACEHOLDERS);
+ foreach(lc2, tlist_vars)
+ {
+ Expr *expr = (Expr *) lfirst(lc2);
+
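+ /*
+ * Vars used inside aggregates and GROUPING() calls are consumed
+ * by the Agg node itself, so mark them NEEDED_IN_GROUPING;
+ * everything else must survive up to the final output, i.e.
+ * "relation 0".
+ */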
+ if (IsA(expr, Aggref) || IsA(expr, GroupingFunc))
+ {
+ List *aggref_vars;
+
+ aggref_vars = pull_var_clause((Node *) expr,
+ PVC_RECURSE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_INCLUDE_PLACEHOLDERS);
+ add_vars_to_targetlist(root, aggref_vars,
+ bms_make_singleton(NEEDED_IN_GROUPING), true);
+ }
+ else
+ add_vars_to_targetlist(root, list_make1(expr), bms_make_singleton(0), true);
+
+ }
+ }
list_free(tlist_vars);
}
@@ -174,7 +212,7 @@ build_base_rel_tlists(PlannerInfo *root, List *final_tlist)
if (having_vars != NIL)
{
add_vars_to_targetlist(root, having_vars,
- bms_make_singleton(0), true);
+ bms_make_singleton(NEEDED_IN_GROUPING), true);
list_free(having_vars);
}
}
@@ -187,6 +225,9 @@ build_base_rel_tlists(PlannerInfo *root, List *final_tlist)
* as being needed for the indicated join (or for final output if
* where_needed includes "relation 0").
*
+ * where_needed can also include NEEDED_IN_GROUPING, to mean that it's
+ * needed to compute aggregates, but possibly not in the final output.
+ *
* The list may also contain PlaceHolderVars. These don't necessarily
* have a single owning relation; we keep their attr_needed info in
* root->placeholder_list instead. If create_new_ph is true, it's OK
@@ -241,6 +282,86 @@ add_vars_to_targetlist(PlannerInfo *root, List *vars,
}
}
+/*
+ * process_targetlist
+ * Extract information needed in query_planner() from target list.
+ *
+ * This needs to run after building equivalence classes.
+ */
+void
+process_targetlist(PlannerInfo *root, List *tlist)
+{
+ Query *parse = root->parse;
+ PathTarget *final_target;
+ ListCell *lc;
+ Bitmapset *agg_relids;
+ List *group_ecs = NIL;
+ List *group_sortrefs = NIL;
+ List *tlist_varnos;
+
+ /*
+ * Make a copy of the final target list, in PathTarget format, for the
+ * convenience of other routines in query_planner().
+ */
+ final_target = create_pathtarget(root, tlist);
+ set_pathtarget_cost_width(root, final_target);
+ root->final_target = final_target;
+
+ /*
+ * Build equivalence classes to represent each GROUP BY column. In most
+ * cases, we already created these when we built PathKeys for them, but
+ * that's not guaranteed; for example, there might be non-sortable GROUP
+ * BY expressions.
+ */
+ foreach(lc, parse->groupClause)
+ {
+ SortGroupClause *sortcl = (SortGroupClause *) lfirst(lc);
+ Expr *sortkey;
+ PathKey *pathkey;
+ EquivalenceClass *eclass;
+
+ sortkey = (Expr *) get_sortgroupclause_expr(sortcl, tlist);
+
+ if (OidIsValid(sortcl->sortop))
+ {
+ pathkey = make_pathkey_from_sortop(root,
+ sortkey,
+ root->nullable_baserels,
+ sortcl->sortop,
+ sortcl->nulls_first,
+ sortcl->tleSortGroupRef,
+ true);
+ eclass = pathkey->pk_eclass;
+ }
+ else
+ {
+ /* XXX: Can there be an equivalence class for a non-sortable expression? */
+ eclass = NULL;
+ }
+
+ group_ecs = lappend(group_ecs, eclass);
+ group_sortrefs = lappend_int(group_sortrefs, sortcl->tleSortGroupRef);
+ }
+ root->group_ecs = group_ecs;
+ root->group_sortrefs = group_sortrefs;
+
+ /*
+ * Compute the minimum set of relations needed to compute aggregates.
+ */
+ agg_relids = NULL;
+ tlist_varnos = pull_var_clause((Node *) tlist,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_RECURSE_PLACEHOLDERS);
+ foreach(lc, tlist_varnos)
+ {
+ Node *e = (Node *) lfirst(lc);
+
+ if (IsA(e, Aggref))
+ agg_relids = bms_join(agg_relids, pull_varnos(e));
+ }
+ root->agg_relids = agg_relids;
+}
/*****************************************************************************
*
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index dc5cc110a9..09defb0363 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -53,6 +53,12 @@ process_jointree(PlannerInfo *root, List *tlist)
if (parse->jointree->fromlist == NIL)
{
/*
+ * Construct a PathTarget to represent the final target list, and extract
+ * information about aggregates.
+ */
+ process_targetlist(root, tlist);
+
+ /*
* Initialize canon_pathkeys, in case it's something like
* "SELECT 2+2 ORDER BY 1".
*/
@@ -189,6 +195,12 @@ process_jointree(PlannerInfo *root, List *tlist)
extract_restriction_or_clauses(root);
/*
+ * Construct a PathTarget to represent the final target list, and extract
+ * information about aggregates.
+ */
+ process_targetlist(root, tlist);
+
+ /*
* We should now have size estimates for every actual table involved in
* the query, and we also know which if any have been deleted from the
* query by join removal; so we can compute total_table_pages.
@@ -220,10 +232,10 @@ process_jointree(PlannerInfo *root, List *tlist)
/*
* query_planner
* Generate paths (that is, simplified plans) for a basic query,
- * which may involve joins but not any fancier features.
+ * which may involve joins and grouping, but not any fancier features.
*
- * Since query_planner does not handle the toplevel processing (grouping,
- * sorting, etc) it cannot select the best path by itself. Instead, it
+ * Since query_planner does not handle the toplevel processing (sorting,
+ * etc.), it cannot select the best path by itself. Instead, it
* returns the RelOptInfo for the top level of joining, and the caller
* (grouping_planner) can choose among the surviving paths for the rel.
*
@@ -267,6 +279,16 @@ query_planner(PlannerInfo *root)
/* Select cheapest path (pretty easy in this case...) */
set_cheapest(final_rel);
+ /*
+ * If there is grouping/aggregation (i.e. something like "SELECT COUNT(*);"),
+ * add an Agg node on top of the Result.
+ */
+ if (root->have_grouping)
+ {
+ set_rel_grouping(root, final_rel);
+ final_rel = final_rel->grouped_rel;
+ }
+
return final_rel;
}
@@ -280,5 +302,17 @@ query_planner(PlannerInfo *root)
final_rel->cheapest_total_path->param_info != NULL)
elog(ERROR, "failed to construct the join relation");
+ /*
+ * If there is grouping and/or aggregation involved, return the
+ * grouped plan.
+ */
+ if (root->have_grouping)
+ {
+ if (!final_rel->grouped_rel)
+ elog(ERROR, "grouping query, but no grouping node was created as part of scan/join planning");
+
+ final_rel = final_rel->grouped_rel;
+ }
+
return final_rel;
}
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 0d31eded37..68ee0c09a2 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -129,8 +129,6 @@ static RelOptInfo *create_ordered_paths(PlannerInfo *root,
PathTarget *target,
bool target_parallel_safe,
double limit_tuples);
-static PathTarget *make_group_input_target(PlannerInfo *root,
- PathTarget *final_target);
static List *postprocess_setop_tlist(List *new_tlist, List *orig_tlist);
static List *select_active_windows(PlannerInfo *root, WindowFuncLists *wflists);
static PathTarget *make_window_input_target(PlannerInfo *root,
@@ -143,12 +141,6 @@ static PathTarget *make_sort_input_target(PlannerInfo *root,
bool *have_postponed_srfs);
static void adjust_paths_for_srfs(PlannerInfo *root, RelOptInfo *rel,
List *targets, List *targets_contain_srfs);
-static void apply_scanjoin_target_to_paths(PlannerInfo *root,
- RelOptInfo *rel,
- List *scanjoin_targets,
- List *scanjoin_targets_contain_srfs,
- bool scanjoin_target_parallel_safe,
- bool tlist_same_exprs);
static List *extract_rollup_sets(List *groupingSets);
static List *reorder_grouping_sets(List *groupingSets, List *sortclause);
@@ -1687,15 +1679,11 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
List *sort_input_targets;
List *sort_input_targets_contain_srfs;
bool sort_input_target_parallel_safe;
- PathTarget *grouping_target;
- List *grouping_targets;
- List *grouping_targets_contain_srfs;
- bool grouping_target_parallel_safe;
- PathTarget *scanjoin_target;
- List *scanjoin_targets;
- List *scanjoin_targets_contain_srfs;
- bool scanjoin_target_parallel_safe;
- bool scanjoin_target_same_exprs;
+ PathTarget *scanjoingrouping_target;
+ List *scanjoingrouping_targets;
+ List *scanjoingrouping_targets_contain_srfs;
+ bool scanjoingrouping_target_parallel_safe;
+ bool scanjoingrouping_target_same_exprs;
bool have_grouping;
AggClauseCosts agg_costs;
WindowFuncLists *wflists = NULL;
@@ -1716,6 +1704,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
if (parse->groupClause)
parse->groupClause = preprocess_groupclause(root, NIL);
}
+ root->gd = gset_data;
/* Preprocess targetlist */
tlist = preprocess_targetlist(root);
@@ -1750,6 +1739,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
get_agg_clause_costs(root, parse->havingQual, AGGSPLIT_SIMPLE,
&agg_costs);
}
+ root->agg_costs = &agg_costs;
/*
* Locate any window functions in the tlist. (We don't need to look
@@ -1808,6 +1798,10 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
? (gset_data->rollups ? linitial_node(RollupData, gset_data->rollups)->groupClause : NIL)
: parse->groupClause);
+ have_grouping = (parse->groupClause || parse->groupingSets ||
+ parse->hasAggs || root->hasHavingQual);
+ root->have_grouping = have_grouping;
+
/*
* Generate the best unsorted and presorted paths for the scan/join
* portion of this Query, ie the processing represented by the
@@ -1852,35 +1846,16 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
*/
if (activeWindows)
{
- grouping_target = make_window_input_target(root,
- final_target,
- activeWindows);
- grouping_target_parallel_safe =
- is_parallel_safe(root, (Node *) grouping_target->exprs);
+ scanjoingrouping_target = make_window_input_target(root,
+ final_target,
+ activeWindows);
+ scanjoingrouping_target_parallel_safe =
+ is_parallel_safe(root, (Node *) scanjoingrouping_target->exprs);
}
else
{
- grouping_target = sort_input_target;
- grouping_target_parallel_safe = sort_input_target_parallel_safe;
- }
-
- /*
- * If we have grouping or aggregation to do, the topmost scan/join
- * plan node must emit what the grouping step wants; otherwise, it
- * should emit grouping_target.
- */
- have_grouping = (parse->groupClause || parse->groupingSets ||
- parse->hasAggs || root->hasHavingQual);
- if (have_grouping)
- {
- scanjoin_target = make_group_input_target(root, final_target);
- scanjoin_target_parallel_safe =
- is_parallel_safe(root, (Node *) grouping_target->exprs);
- }
- else
- {
- scanjoin_target = grouping_target;
- scanjoin_target_parallel_safe = grouping_target_parallel_safe;
+ scanjoingrouping_target = sort_input_target;
+ scanjoingrouping_target_parallel_safe = sort_input_target_parallel_safe;
}
/*
@@ -1898,41 +1873,74 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
final_target = linitial_node(PathTarget, final_targets);
Assert(!linitial_int(final_targets_contain_srfs));
/* likewise for sort_input_target vs. grouping_target */
- split_pathtarget_at_srfs(root, sort_input_target, grouping_target,
+ split_pathtarget_at_srfs(root, sort_input_target, scanjoingrouping_target,
&sort_input_targets,
&sort_input_targets_contain_srfs);
sort_input_target = linitial_node(PathTarget, sort_input_targets);
Assert(!linitial_int(sort_input_targets_contain_srfs));
/* likewise for grouping_target vs. scanjoin_target */
- split_pathtarget_at_srfs(root, grouping_target, scanjoin_target,
- &grouping_targets,
- &grouping_targets_contain_srfs);
- grouping_target = linitial_node(PathTarget, grouping_targets);
- Assert(!linitial_int(grouping_targets_contain_srfs));
- /* scanjoin_target will not have any SRFs precomputed for it */
- split_pathtarget_at_srfs(root, scanjoin_target, NULL,
- &scanjoin_targets,
- &scanjoin_targets_contain_srfs);
- scanjoin_target = linitial_node(PathTarget, scanjoin_targets);
- Assert(!linitial_int(scanjoin_targets_contain_srfs));
+ split_pathtarget_at_srfs(root, scanjoingrouping_target, current_rel->reltarget,
+ &scanjoingrouping_targets,
+ &scanjoingrouping_targets_contain_srfs);
+ scanjoingrouping_target = linitial_node(PathTarget, scanjoingrouping_targets);
+ Assert(!linitial_int(scanjoingrouping_targets_contain_srfs));
}
else
{
/* initialize lists; for most of these, dummy values are OK */
final_targets = final_targets_contain_srfs = NIL;
sort_input_targets = sort_input_targets_contain_srfs = NIL;
- grouping_targets = grouping_targets_contain_srfs = NIL;
- scanjoin_targets = list_make1(scanjoin_target);
- scanjoin_targets_contain_srfs = NIL;
+ scanjoingrouping_targets = list_make1(scanjoingrouping_target);
+ scanjoingrouping_targets_contain_srfs = NIL;
+ }
+
+ /*
+ * We might have MIN/MAX paths stashed in UPPERREL_GROUP_AGG.
+ * Merge them with the paths in current rel.
+ *
+ * XXX: what we really should do is to use the proper upper-rel
+ * in the first place. Instead of always building a new grouped-rel
+ * in query_planner(), it should call fetch_upper_rel with the right
+ * relids. But when I tried doing that, GEQO became unhappy: it
+ * allocates join rels in a temp memory context, and the caching
+ * in fetch_upper_rel() didn't work with that.
+ */
+ if (have_grouping)
+ {
+ RelOptInfo *grouped_rel;
+
+ /* Copy the paths to 'grouped_rel' */
+ grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL);
+ grouped_rel->is_grouped_rel = true;
+
+ /*
+ * Update FDW information in the new upper rel.
+ */
+ grouped_rel->serverid = current_rel->serverid;
+ grouped_rel->userid = current_rel->userid;
+ grouped_rel->useridiscurrent = current_rel->useridiscurrent;
+ grouped_rel->fdwroutine = current_rel->fdwroutine;
+ grouped_rel->fdw_private = current_rel->fdw_private;
+
+ foreach(lc, current_rel->pathlist)
+ {
+ Path *path = (Path *) lfirst(lc);
+
+ path->parent = grouped_rel;
+ add_path(grouped_rel, path);
+ }
+
+ set_cheapest(grouped_rel);
+ current_rel = grouped_rel;
}
- /* Apply scan/join target. */
- scanjoin_target_same_exprs = list_length(scanjoin_targets) == 1
- && equal(scanjoin_target->exprs, current_rel->reltarget->exprs);
- apply_scanjoin_target_to_paths(root, current_rel, scanjoin_targets,
- scanjoin_targets_contain_srfs,
- scanjoin_target_parallel_safe,
- scanjoin_target_same_exprs);
+ /* Apply scan/join/grouping target. */
+ scanjoingrouping_target_same_exprs = list_length(scanjoingrouping_targets) == 1
+ && equal(scanjoingrouping_target->exprs, current_rel->reltarget->exprs);
+ apply_scanjoin_target_to_paths(root, current_rel, scanjoingrouping_targets,
+ scanjoingrouping_targets_contain_srfs,
+ scanjoingrouping_target_parallel_safe,
+ scanjoingrouping_target_same_exprs);
/*
* Save the various upper-rel PathTargets we just computed into
@@ -1943,27 +1951,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
*/
root->upper_targets[UPPERREL_FINAL] = final_target;
root->upper_targets[UPPERREL_WINDOW] = sort_input_target;
- root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target;
-
- /*
- * If we have grouping and/or aggregation, consider ways to implement
- * that. We build a new upperrel representing the output of this
- * phase.
- */
- if (have_grouping)
- {
- current_rel = create_grouping_paths(root,
- current_rel,
- grouping_target,
- grouping_target_parallel_safe,
- &agg_costs,
- gset_data);
- /* Fix things up if grouping_target contains SRFs */
- if (parse->hasTargetSRFs)
- adjust_paths_for_srfs(root, current_rel,
- grouping_targets,
- grouping_targets_contain_srfs);
- }
+ root->upper_targets[UPPERREL_GROUP_AGG] = scanjoingrouping_target;
/*
* If we have window functions, consider ways to implement those. We
@@ -1973,7 +1961,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
{
current_rel = create_window_paths(root,
current_rel,
- grouping_target,
+ scanjoingrouping_target,
sort_input_target,
sort_input_target_parallel_safe,
tlist,
@@ -3941,105 +3929,6 @@ create_ordered_paths(PlannerInfo *root,
return ordered_rel;
}
-
-/*
- * make_group_input_target
- * Generate appropriate PathTarget for initial input to grouping nodes.
- *
- * If there is grouping or aggregation, the scan/join subplan cannot emit
- * the query's final targetlist; for example, it certainly can't emit any
- * aggregate function calls. This routine generates the correct target
- * for the scan/join subplan.
- *
- * The query target list passed from the parser already contains entries
- * for all ORDER BY and GROUP BY expressions, but it will not have entries
- * for variables used only in HAVING clauses; so we need to add those
- * variables to the subplan target list. Also, we flatten all expressions
- * except GROUP BY items into their component variables; other expressions
- * will be computed by the upper plan nodes rather than by the subplan.
- * For example, given a query like
- * SELECT a+b,SUM(c+d) FROM table GROUP BY a+b;
- * we want to pass this targetlist to the subplan:
- * a+b,c,d
- * where the a+b target will be used by the Sort/Group steps, and the
- * other targets will be used for computing the final results.
- *
- * 'final_target' is the query's final target list (in PathTarget form)
- *
- * The result is the PathTarget to be computed by the Paths returned from
- * query_planner().
- */
-static PathTarget *
-make_group_input_target(PlannerInfo *root, PathTarget *final_target)
-{
- Query *parse = root->parse;
- PathTarget *input_target;
- List *non_group_cols;
- List *non_group_vars;
- int i;
- ListCell *lc;
-
- /*
- * We must build a target containing all grouping columns, plus any other
- * Vars mentioned in the query's targetlist and HAVING qual.
- */
- input_target = create_empty_pathtarget();
- non_group_cols = NIL;
-
- i = 0;
- foreach(lc, final_target->exprs)
- {
- Expr *expr = (Expr *) lfirst(lc);
- Index sgref = get_pathtarget_sortgroupref(final_target, i);
-
- if (sgref && parse->groupClause &&
- get_sortgroupref_clause_noerr(sgref, parse->groupClause) != NULL)
- {
- /*
- * It's a grouping column, so add it to the input target as-is.
- */
- add_column_to_pathtarget(input_target, expr, sgref);
- }
- else
- {
- /*
- * Non-grouping column, so just remember the expression for later
- * call to pull_var_clause.
- */
- non_group_cols = lappend(non_group_cols, expr);
- }
-
- i++;
- }
-
- /*
- * If there's a HAVING clause, we'll need the Vars it uses, too.
- */
- if (parse->havingQual)
- non_group_cols = lappend(non_group_cols, parse->havingQual);
-
- /*
- * Pull out all the Vars mentioned in non-group cols (plus HAVING), and
- * add them to the input target if not already present. (A Var used
- * directly as a GROUP BY item will be present already.) Note this
- * includes Vars used in resjunk items, so we are covering the needs of
- * ORDER BY and window specifications. Vars used within Aggrefs and
- * WindowFuncs will be pulled out here, too.
- */
- non_group_vars = pull_var_clause((Node *) non_group_cols,
- PVC_RECURSE_AGGREGATES |
- PVC_RECURSE_WINDOWFUNCS |
- PVC_INCLUDE_PLACEHOLDERS);
- add_new_columns_to_pathtarget(input_target, non_group_vars);
-
- /* clean up cruft */
- list_free(non_group_vars);
- list_free(non_group_cols);
-
- /* XXX this causes some redundant cost calculation ... */
- return set_pathtarget_cost_width(root, input_target);
-}
-
/*
* mark_partial_aggref
* Adjust an Aggref to make it represent a partial-aggregation step.
@@ -5020,7 +4909,7 @@ done:
/*
* apply_scanjoin_target_to_paths
*
- * Adjust the final scan/join relation, and recursively all of its children,
+ * Adjust the final scan/join relation, and recursively all of its children (if partitioned),
* to generate the final scan/join target. It would be more correct to model
* this as a separate planning step with a new RelOptInfo at the toplevel and
* for each child relation, but doing it this way is noticeably cheaper.
@@ -5031,7 +4920,7 @@ done:
* appropriate sortgroupref information. By avoiding the creation of
* projection paths we save effort both immediately and at plan creation time.
*/
-static void
+void
apply_scanjoin_target_to_paths(PlannerInfo *root,
RelOptInfo *rel,
List *scanjoin_targets,
@@ -5123,9 +5012,10 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
Path *subpath = (Path *) lfirst(lc);
Path *newpath;
- Assert(subpath->param_info == NULL);
+ /* Assert(subpath->param_info == NULL); */
- if (tlist_same_exprs)
+ if (tlist_same_exprs &&
+ equal(scanjoin_target->exprs, subpath->pathtarget->exprs))
subpath->pathtarget->sortgrouprefs =
scanjoin_target->sortgrouprefs;
else
@@ -5143,7 +5033,7 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
Path *newpath;
/* Shouldn't have any parameterized paths anymore */
- Assert(subpath->param_info == NULL);
+ /* Assert(subpath->param_info == NULL); */
if (tlist_same_exprs)
subpath->pathtarget->sortgrouprefs =
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index e190ad49d1..0a3c5bcfad 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2392,8 +2392,7 @@ create_projection_path(PlannerInfo *root,
pathnode->path.pathtype = T_Result;
pathnode->path.parent = rel;
pathnode->path.pathtarget = target;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
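+ /* We may now be below a join, so keep the subpath's parameterization */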
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe &&
@@ -2640,8 +2639,7 @@ create_sort_path(PlannerInfo *root,
pathnode->path.parent = rel;
/* Sort doesn't project, so use source path's pathtarget */
pathnode->path.pathtarget = subpath->pathtarget;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
@@ -2685,8 +2683,7 @@ create_group_path(PlannerInfo *root,
pathnode->path.pathtype = T_Group;
pathnode->path.parent = rel;
pathnode->path.pathtarget = target;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
@@ -2799,8 +2796,7 @@ create_agg_path(PlannerInfo *root,
pathnode->path.pathtype = T_Agg;
pathnode->path.parent = rel;
pathnode->path.pathtarget = target;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 82b78420e7..6716a10ee4 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -915,12 +915,23 @@ build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
ndx = 0 - baserel->min_attr;
}
else
+ {
+ /*
+ * If this rel is above grouping, then we can have Aggrefs
+ * and grouping column expressions in the target list. Carry
+ * them up to the join rel. They will surely be needed at
+ * the top of the join tree. (Unless they're only used in
+ * HAVING?)
+ */
+#if 0
elog(ERROR, "unexpected node type in rel targetlist: %d",
(int) nodeTag(var));
-
+#endif
+ baserel = NULL;
+ }
/* Is the target expression still needed above this joinrel? */
- if (bms_nonempty_difference(baserel->attr_needed[ndx], relids))
+ if (baserel == NULL || bms_nonempty_difference(baserel->attr_needed[ndx], relids))
{
/* Yup, add it to the output */
joinrel->reltarget->exprs = lappend(joinrel->reltarget->exprs, var);
@@ -930,7 +941,9 @@ build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
* if it's a ConvertRowtypeExpr, it will be computed only for the
* base relation, costing nothing for a join.
*/
- joinrel->reltarget->width += baserel->attr_widths[ndx];
+ /* XXX: where to get estimate here if !baserel? */
+ if (baserel)
+ joinrel->reltarget->width += baserel->attr_widths[ndx];
}
}
}
@@ -1760,3 +1773,30 @@ build_joinrel_partition_info(RelOptInfo *joinrel, RelOptInfo *outer_rel,
joinrel->nullable_partexprs[cnt] = nullable_partexpr;
}
}
+
+RelOptInfo *
+build_group_rel(PlannerInfo *root, RelOptInfo *parent)
+{
+ RelOptInfo *grouped_rel;
+
+ grouped_rel = makeNode(RelOptInfo);
+ grouped_rel->reloptkind = parent->reloptkind;
+ grouped_rel->is_grouped_rel = true;
+ grouped_rel->relids = bms_copy(parent->relids);
+ grouped_rel->relids = bms_add_member(grouped_rel->relids, NEEDED_IN_GROUPING);
+
+ /* cheap startup cost is interesting iff not all tuples to be retrieved */
+ grouped_rel->consider_startup = (root->tuple_fraction > 0);
+ grouped_rel->consider_param_startup = false;
+ grouped_rel->consider_parallel = false; /* might get changed later */
+ grouped_rel->reltarget = create_empty_pathtarget();
+ grouped_rel->pathlist = NIL;
+ grouped_rel->cheapest_startup_path = NULL;
+ grouped_rel->cheapest_total_path = NULL;
+ grouped_rel->cheapest_unique_path = NULL;
+ grouped_rel->cheapest_parameterized_paths = NIL;
+
+ build_joinrel_tlist(root, grouped_rel, parent);
+
+ return grouped_rel;
+}
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index ad77416050..149d198554 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -236,6 +236,9 @@ typedef struct PlannerInfo
List *join_rel_list; /* list of join-relation RelOptInfos */
struct HTAB *join_rel_hash; /* optional hashtable for join relations */
+ struct PathTarget *final_target;
+ Relids agg_relids;
+
/*
* When doing a dynamic-programming-style join search, join_rel_level[k]
* is a list of all join-relation RelOptInfos of level k, and
@@ -281,6 +284,10 @@ typedef struct PlannerInfo
List *query_pathkeys; /* desired pathkeys for query_planner() */
List *group_pathkeys; /* groupClause pathkeys, if any */
+
+ List *group_ecs; /* groupClause equivalence classes, one for each groupClause item */
+ List *group_sortrefs; /* sortrefs, for each entry in group_ecs */
+
List *window_pathkeys; /* pathkeys of bottom window, if any */
List *distinct_pathkeys; /* distinctClause pathkeys, if any */
List *sort_pathkeys; /* sortClause pathkeys, if any */
@@ -290,6 +297,11 @@ typedef struct PlannerInfo
List *initial_rels; /* RelOptInfos we are now trying to join */
+ AggClauseCosts *agg_costs;
+ struct grouping_sets_data *gd;
+
+ bool have_grouping;
+
/* Use fetch_upper_rel() to get any particular upper rel */
List *upper_rels[UPPERREL_FINAL + 1]; /* upper-rel RelOptInfos */
@@ -609,11 +621,30 @@ typedef enum RelOptKind
(rel)->reloptkind == RELOPT_OTHER_JOINREL || \
(rel)->reloptkind == RELOPT_OTHER_UPPER_REL)
+/* Is the given relation a grouped relation? */
+#define IS_GROUPED_REL(rel) \
+ ((rel)->is_grouped_rel)
+
+/*
+ * NEEDED_IN_GROUPING is a magic range table index used in 'attr_needed',
+ * to indicate that an attribute is needed to compute an Aggregate, ie.
+ * is used as an argument to an aggregate function. It's similar to the
+ * magic '0' value used to indicate that an attribute is needed by the
+ * final target list.
+ *
+ * XXX: obviously 1000 won't work when you have 1000 relations in a query.
+ * I think we'll need to offset the bitmaps by some constant, to make room
+ * for this magic value, e.g. make it 1. (too bad that Bitmapset doesn't
+ * support negative values)
+ */
+#define NEEDED_IN_GROUPING 1000
+
typedef struct RelOptInfo
{
NodeTag type;
RelOptKind reloptkind;
+ bool is_grouped_rel;
/* all relations included in this RelOptInfo */
Relids relids; /* set of base relids (rangetable indexes) */
@@ -638,6 +669,9 @@ typedef struct RelOptInfo
struct Path *cheapest_unique_path;
List *cheapest_parameterized_paths;
+ /* version of this relation that includes the effects of GROUP BY and aggregates */
+ struct RelOptInfo *grouped_rel;
+
/* parameterization information needed for both base rels and join rels */
/* (see also lateral_vars and lateral_referencers) */
Relids direct_lateral_relids; /* rels directly laterally referenced */
@@ -2430,7 +2464,7 @@ typedef struct JoinCostWorkspace
* Data specific to grouping sets
*/
-typedef struct
+typedef struct grouping_sets_data
{
List *rollups;
List *hash_sets_idx;
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index e99ae36bef..d294450caa 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -299,5 +299,6 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
RelOptInfo *outer_rel, RelOptInfo *inner_rel,
RelOptInfo *parent_joinrel, List *restrictlist,
SpecialJoinInfo *sjinfo, JoinType jointype);
+extern RelOptInfo *build_group_rel(PlannerInfo *root, RelOptInfo *parent);
#endif /* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 39e3e8f85c..6c274ac309 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -61,11 +61,15 @@ extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
Path *bitmapqual);
extern void generate_partitionwise_join_paths(PlannerInfo *root,
RelOptInfo *rel);
+extern void generate_grouped_join_paths(PlannerInfo *root,
+ RelOptInfo *rel);
#ifdef OPTIMIZER_DEBUG
extern void debug_print_rel(PlannerInfo *root, RelOptInfo *rel);
#endif
+extern void set_rel_grouping(PlannerInfo *root, RelOptInfo *rel);
+
/*
* indxpath.c
* routines to generate index paths
@@ -110,6 +114,9 @@ extern void add_paths_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
extern void join_search_one_level(PlannerInfo *root, int level);
extern RelOptInfo *make_join_rel(PlannerInfo *root,
RelOptInfo *rel1, RelOptInfo *rel2);
+extern RelOptInfo *make_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
+ PathTarget *target, bool target_parallel_safe,
+ Node *havingQual);
extern bool have_join_order_restriction(PlannerInfo *root,
RelOptInfo *rel1, RelOptInfo *rel2);
extern bool have_dangerous_phv(PlannerInfo *root,
@@ -123,12 +130,17 @@ extern bool have_partkey_equi_join(RelOptInfo *joinrel,
* aggpath.c
* routines to create grouping paths
*/
-extern RelOptInfo *create_grouping_paths(PlannerInfo *root,
+extern void create_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
+ RelOptInfo *grouped_rel,
PathTarget *target,
bool target_parallel_safe,
const AggClauseCosts *agg_costs,
- grouping_sets_data *gd);
+ struct grouping_sets_data *gd);
+extern PathTarget *make_group_input_target(PlannerInfo *root, RelOptInfo *rel);
+extern PathTarget *make_grouping_target(PlannerInfo *root, RelOptInfo *rel, PathTarget *input_target, PathTarget *final_target);
+extern bool is_grouping_computable_at_rel(PlannerInfo *root, RelOptInfo *rel);
+
extern List *remap_to_groupclause_idx(List *groupClause, List *gsets,
int *tleref_to_colnum_map);
@@ -228,6 +240,14 @@ extern List *build_join_pathkeys(PlannerInfo *root,
extern List *make_pathkeys_for_sortclauses(PlannerInfo *root,
List *sortclauses,
List *tlist);
+extern PathKey *make_pathkey_from_sortop(PlannerInfo *root,
+ Expr *expr,
+ Relids nullable_relids,
+ Oid ordering_op,
+ bool nulls_first,
+ Index sortref,
+ bool create_it);
+
extern void initialize_mergeclause_eclasses(PlannerInfo *root,
RestrictInfo *restrictinfo);
extern void update_mergeclause_eclasses(PlannerInfo *root,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index dd6e912373..19b6b7ae9f 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
Relids where_needed, bool create_new_ph);
extern void find_lateral_references(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
+extern void process_targetlist(PlannerInfo *root, List *tlist);
extern List *deconstruct_jointree(PlannerInfo *root);
extern void distribute_restrictinfo_to_rels(PlannerInfo *root,
RestrictInfo *restrictinfo);
diff --git a/src/include/optimizer/planner.h b/src/include/optimizer/planner.h
index 497a8c0581..fb214c371f 100644
--- a/src/include/optimizer/planner.h
+++ b/src/include/optimizer/planner.h
@@ -61,4 +61,12 @@ extern grouping_sets_data *preprocess_grouping_sets(PlannerInfo *root);
extern bool plan_cluster_use_sort(Oid tableOid, Oid indexOid);
extern int plan_create_index_workers(Oid tableOid, Oid indexOid);
+extern void
+apply_scanjoin_target_to_paths(PlannerInfo *root,
+ RelOptInfo *rel,
+ List *scanjoin_targets,
+ List *scanjoin_targets_contain_srfs,
+ bool scanjoin_target_parallel_safe,
+ bool tlist_same_exprs);
+
#endif /* PLANNER_H */
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index f85e913850..70a2c5e271 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -982,16 +982,15 @@ explain (costs off) select a,c from t1 group by a,c,d;
explain (costs off) select *
from t1 inner join t2 on t1.a = t2.x and t1.b = t2.y
group by t1.a,t1.b,t1.c,t1.d,t2.x,t2.y,t2.z;
- QUERY PLAN
-------------------------------------------------------
- HashAggregate
- Group Key: t1.a, t1.b, t2.x, t2.y
- -> Hash Join
- Hash Cond: ((t2.x = t1.a) AND (t2.y = t1.b))
- -> Seq Scan on t2
- -> Hash
- -> Seq Scan on t1
-(7 rows)
+ QUERY PLAN
+-------------------------------------------------
+ Nested Loop
+ -> HashAggregate
+ Group Key: t1.a, t1.b, t1.a, t1.b
+ -> Seq Scan on t1
+ -> Index Scan using t2_pkey on t2
+ Index Cond: ((x = t1.a) AND (y = t1.b))
+(6 rows)
-- Test case where t1 can be optimized but not t2
explain (costs off) select t1.*,t2.x,t2.z
@@ -1000,7 +999,7 @@ group by t1.a,t1.b,t1.c,t1.d,t2.x,t2.z;
QUERY PLAN
------------------------------------------------------
HashAggregate
- Group Key: t1.a, t1.b, t2.x, t2.z
+ Group Key: t1.a, t1.b, t1.a, t2.z
-> Hash Join
Hash Cond: ((t2.x = t1.a) AND (t2.y = t1.b))
-> Seq Scan on t2
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index b983f9c506..69199d8ac2 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -1140,7 +1140,7 @@ SELECT avg(t1.a), avg(t2.b), avg(t3.a + t3.b), t1.c, t2.c, t3.c FROM plt1 t1, pl
QUERY PLAN
--------------------------------------------------------------------------------
GroupAggregate
- Group Key: t1.c, t2.c, t3.c
+ Group Key: t1.c, t1.c, t3.c
-> Sort
Sort Key: t1.c, t3.c
-> Append
@@ -1284,7 +1284,7 @@ SELECT avg(t1.a), avg(t2.b), avg(t3.a + t3.b), t1.c, t2.c, t3.c FROM pht1 t1, ph
QUERY PLAN
--------------------------------------------------------------------------------
GroupAggregate
- Group Key: t1.c, t2.c, t3.c
+ Group Key: t1.c, t1.c, t3.c
-> Sort
Sort Key: t1.c, t3.c
-> Append
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 16f979c8d9..f6ed965ebe 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -116,7 +116,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare without_oid c
# ----------
# Another group of parallel tests
# ----------
-test: identity partition_join partition_prune reloptions hash_part indexing partition_aggregate
+test: identity partition_join partition_prune reloptions hash_part indexing partition_aggregate aggregate_pushdown
# event triggers cannot run concurrently with any test that runs DDL
test: event_trigger
diff --git a/src/test/regress/sql/aggregate_pushdown.sql b/src/test/regress/sql/aggregate_pushdown.sql
new file mode 100644
index 0000000000..afb385f844
--- /dev/null
+++ b/src/test/regress/sql/aggregate_pushdown.sql
@@ -0,0 +1,70 @@
+-- Test cases for "pushing down" Agg/Group node below joins .
+
+create temp table a (id int4 primary key);
+create temp table b (id int4);
+create index on b (id);
+
+insert into a values (1);
+insert into b select g/10 from generate_series(1, 1000) g;
+analyze a,b;
+
+-- Here, the "normal" plan with Aggregate on top wins.
+explain (costs off)
+select b.id from a, b where a.id = b.id group by b.id;
+select b.id from a, b where a.id = b.id group by b.id;
+
+-- With different data, pushing down the Aggregate below the join looks more
+-- attractive
+truncate a;
+insert into a select g from generate_series(1, 10) g;
+analyze a;
+
+explain (costs off)
+select b.id from a, b where a.id = b.id group by b.id;
+select b.id from a, b where a.id = b.id group by b.id;
+
+-- TODO: Test a nested loop join, with a GROUP BY parameterized path
+
+
+
+-- This Grouping can be applied on top of either t1 or t2.
+
+create temp table t1 (a int, b int, primary key (a, b));
+create temp table t2 (x int, y int, primary key (x, y));
+
+explain (costs off)
+select t1.a, count(*) from t1, t2 WHERE t1.a = t2.x AND t1.b = t2.y GROUP BY t1.a, t1.b;
+
+-- This is the same, but because of avg(t1.a), the aggregation cannot be done on 't2'
+explain (costs off)
+select t1.a, avg(t1.a) from t1, t2 WHERE t1.a = t2.x AND t1.b = t2.y GROUP BY t1.a, t1.b;
+
+-- This is the same, but because of avg(t2.x), the aggregation cannot be done on 't1'
+explain (costs off)
+select t1.a, avg(t2.x) from t1, t2 WHERE t1.a = t2.x AND t1.b = t2.y GROUP BY t1.a, t1.b;
+
+-- With both avg(t1.a) and avg(t2.x), aggregation needs to be done after the join.
+explain (costs off)
+select t1.a, avg(t1.a), avg(t2.x) from t1, t2 WHERE t1.a = t2.x AND t1.b = t2.y GROUP BY t1.a, t1.b;
+
+
+drop table a;
+drop table b;
+
+create temp table a (id int4 primary key);
+create temp table b (id int4, v numeric);
+create index on b (id);
+
+insert into a select g from generate_series(1, 10) g;
+insert into b select g / 10, g::numeric / 10.0 from generate_series(1, 1000) g;
+analyze a,b;
+
+-- This grouping can be performed on top of 'b'
+explain (costs off)
+select b.id, avg(b.v) from a, b where a.id = b.id group by b.id;
+select b.id, avg(b.v) from a, b where a.id = b.id group by b.id;
+
+-- But this can not (Except in a nested loop join, with b.id being used as a Param)
+explain (costs off)
+select b.id, avg(b.v) from a, b where a.id = b.id and a.id / 10 + b.v < 1.5 group by b.id;
+select b.id, avg(b.v) from a, b where a.id = b.id and a.id / 10 + b.v < 1.5 group by b.id;
--
2.11.0
On 06/20/2018 10:12 PM, Heikki Linnakangas wrote:
Currently, the planner always first decides the scan/join order, and
adds Group/Agg nodes on top of the joins. Sometimes it would be legal,
and beneficial, to perform the aggregation below a join. I've been
hacking on a patch to allow that.
There was a patch [1] from Antonin Houska aiming to achieve something
similar. IIRC it aimed to push the aggregate down in more cases,
leveraging the partial aggregation stuff. I suppose your patch only aims
to do the pushdown when the two-phase aggregation is not needed?
[1]: /messages/by-id/9666.1491295317@localhost
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 06/20/2018 10:12 PM, Heikki Linnakangas wrote:
Currently, the planner always first decides the scan/join order, and
adds Group/Agg nodes on top of the joins. Sometimes it would be legal,
and beneficial, to perform the aggregation below a join. I've been
hacking on a patch to allow that.
There was a patch [1] from Antonin Houska aiming to achieve something
similar. IIRC it aimed to push the aggregate down in more cases,
leveraging the partial aggregation stuff.
Yes, I interrupted the work when it became clear that it wouldn't find its way
into v11. I'm about to get back to it next week, and at least rebase it so it
can be applied to the current master branch.
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com
On 21/06/18 09:11, Antonin Houska wrote:
Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 06/20/2018 10:12 PM, Heikki Linnakangas wrote:
Currently, the planner always first decides the scan/join order, and
adds Group/Agg nodes on top of the joins. Sometimes it would be legal,
and beneficial, to perform the aggregation below a join. I've been
hacking on a patch to allow that.
There was a patch [1] from Antonin Houska aiming to achieve something
similar. IIRC it aimed to push the aggregate down in more cases,
leveraging the partial aggregation stuff.
Yes, I interrupted the work when it became clear that it wouldn't find its way
into v11. I'm about to get back to it next week, and at least rebase it so it
can be applied to the current master branch.
Ah, cool! I missed that thread earlier. Yes, seems like we've been
hacking on the same feature. Let's compare!
I've been using this paper as a guide:
"Including Group-By in Query Optimization", by Surajit Chaudhuri and
Kyuseok Shim:
https://pdfs.semanticscholar.org/3079/5447cec18753254edbbd7839f0afa58b2a39.pdf
Using the terms from that paper, my patch does only "Invariant Grouping
transormation", while yours can do "Simple Coalescing Grouping", which
is more general. In layman terms, my patch can push the Aggregate below
a join, while your patch can also split an Aggregate so that you do a
partial aggregate below the join, and a final stage above it. My
thinking was to start with the simpler Invariant Grouping transformation
first, and do the more advanced splitting into partial aggregates later,
as a separate patch.
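To put the two transformations side by side as plan shapes (hand-drawn
sketches, not actual EXPLAIN output):
Invariant Grouping, i.e. pushing the whole aggregate below the join:
  Join
    -> Agg
         -> Scan on b
    -> Scan on a
Simple Coalescing Grouping, i.e. splitting into a partial and a final stage:
  Finalize Agg
    -> Join
         -> Partial Agg
              -> Scan on b
         -> Scan on a
The second shape is the more general one: the final stage can still combine
partial states, so the grouping pushed below the join does not have to
produce exactly the final groups.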
Doing partial aggregation actually made your patch simpler in some ways,
though. I had to make some changes to grouping_planner(), because in my
patch, it is no longer responsible for aggregation. In your patch, it's
still responsible for doing the final aggregation.
There's some difference in the way we're dealing with the grouped
RelOptInfos. You're storing them completely separately, in PlannerInfo's
new 'simple_grouped_rel_array' array and 'join_grouped_rel_list/hash'.
I'm attaching each grouped RelOptInfo to its parent.
I got away with much less code churn in allpaths.c, indxpath.c,
joinpath.c. You're adding new 'do_aggregate' flags to many functions.
I'm not sure if you needed that because you do partial aggregation and I
don't, but it would be nice to avoid it.
You're introducing a new GroupedVar expression to represent Aggrefs
between the partial and final aggregates, while I'm just using Aggref
directly, above the aggregate node. I'm not thrilled about introducing
a new Var-like expression. We already have PlaceHolderVars, and I'm
always confused on how those work. But maybe that's necessary to support
partial aggregation?
In the other thread, Robert Haas wrote:
Concretely, in your test query "SELECT p.i, avg(c1.v) FROM
agg_pushdown_parent AS p JOIN agg_pushdown_child1 AS c1 ON c1.parent =
p.i GROUP BY p.i" you assume that it's OK to do a Partial
HashAggregate over c1.parent rather than p.i. This will be false if,
say, c1.parent is of type citext and p.i is of type text; this will
get grouped together that shouldn't. It will also be false if the
grouping expression is something like GROUP BY length(p.i::text),
because one value could be '0'::numeric and the other '0.00'::numeric.
I can't think of a reason why it would be false if the grouping
expressions are both simple Vars of the same underlying data type, but
I'm a little nervous that I might be wrong even about that case.
Maybe you've handled all of this somehow, but it's not obvious to me
that it has been considered.
Ah, I made the same mistake. I did consider the "GROUP BY
length(o.i::text)", but not the cross-datatype case. I think we should
punt on that for now, and only do the substitution for simple Vars of
the same datatype. That seems safe to me.
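For illustration, the type-dependent notion of equality is easy to see in
psql (this assumes the citext extension is available):
  create extension if not exists citext;
  select 'Foo'::citext = 'foo'::citext as citext_eq,
         'Foo'::text   = 'foo'::text   as text_eq;
   citext_eq | text_eq
  -----------+---------
   t         | f
So grouping on a citext column below the join can merge rows that a GROUP BY
on a text column holding the same values must keep apart.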
Overall, your patch is in a more polished state than my prototype. For
easier review, though, I think we should try to get something smaller
committed first, and build on that. Let's punt on the Var substitution.
And I'd suggest adopting my approach of attaching the grouped
RelOptInfos to the parent RelOptInfo; that seems simpler. And if we punt
on the partial aggregation, and only allow pushing down the whole
Aggregate, how much smaller would your patch get?
(I'll be on vacation for the next two weeks, but I'll be following this
thread. After that, I plan to focus on this feature, as time from
reviewing patches in the commitfest permits.)
- Heikki
Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Ah, cool! I missed that thread earlier. Yes, seems like we've been hacking on
the same feature. Let's compare!
I've been using this paper as a guide:
"Including Group-By in Query Optimization", by Surajit Chaudhuri and Kyuseok
Shim:
https://pdfs.semanticscholar.org/3079/5447cec18753254edbbd7839f0afa58b2a39.pdf
Using the terms from that paper, my patch does only "Invariant Grouping
transormation", while yours can do "Simple Coalescing Grouping", which is more
general. In layman terms, my patch can push the Aggregate below a join, while
your patch can also split an Aggregate so that you do a partial aggregate
below the join, and a final stage above it.
Thanks for the link. I've just checked the two approaches briefly so far.
My thinking was to start with the simpler Invariant Grouping transformation
first, and do the more advanced splitting into partial aggregates later, as
a separate patch.
I think for this you need to make sure that no join duplicates already
processed groups. I tried to implement PoC of "unique keys" in v5 of my patch
[1], see 09_avoid_agg_finalization.diff. The point is that the final
aggregation can be replaced by calling pg_aggregate(aggfinalfn) function if
the final join generates only unique values of the GROUP BY expression.
Eventually I considered this an additional optimization of my approach and
postponed work on this to later, but I think something like this would be
necessary for your approach. As soon as you find out that a grouped relation
is joined to another relation in a way that duplicates the grouping
expression, you cannot proceed in creating grouped path.
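To make that concrete, here is a small hand-rolled example (tables invented
for illustration) where the join duplicates a group, so naively pushing the
whole count(*) below the join gives a wrong result:
  create temp table p (i int);          -- note: i is not unique
  create temp table c (parent int);
  insert into p values (1), (1);
  insert into c values (1), (1);
  select p.i, count(*) from p join c on c.parent = p.i group by p.i;
The correct result is one group (i = 1) with count 4. If count(*) is first
computed on 'c' alone, we get a single row (parent = 1, count = 2); joining
that to 'p' then yields two rows (1, 2), which is wrong in both the number
of groups and the counts.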
There's some difference in the way we're dealing with the grouped
RelOptInfos. You're storing them completely separately, in PlannerInfo's new
'simple_grouped_rel_array' array and 'join_grouped_rel_list/hash'. I'm
attaching each grouped RelOptInfo to its parent.
In the first version of my patch I added several fields to RelOptInfo
(reltarget for the grouped paths, separate pathlist, etc.) but some people
didn't like it. In the later versions I introduced a separate RelOptInfo for
grouped relations, but stored it in a separate list. Your approach might make
the patch a bit less invasive, i.e. we don't have to add those RelOptInfo
lists / arrays to standard_join_search() and subroutines.
I got away with much less code churn in allpaths.c, indxpath.c,
joinpath.c. You're adding new 'do_aggregate' flags to many functions. I'm not
sure if you needed that because you do partial aggregation and I don't, but it
would be nice to avoid it.
IIRC, the do_aggregate argument determines how the grouped join should be
created. If it's false, the planner joins a grouped relation (if it exists) to
non-grouped one. If it's true, it joins two non-grouped relations and applies
(partial) aggregation to the result.
You're introducing a new GroupedVar expression to represent Aggrefs between
the partial and final aggregates, while I'm just using Aggref directly, above
the aggregate node. I'm not thrilled about introducing a new Var-like
expression. We already have PlaceHolderVars, and I'm always confused on how
those work. But maybe that's necessary to support partial aggregation?
The similarity of GroupedVar and PlaceHolderVar is that they are evaluated at
some place in the join tree and the result is only passed to the joins above
and eventually to the query target, w/o being evaluated again. In contrast,
generic expressions are evaluated in the query target (only the input Vars get
propagated from lower nodes), but that's not what we want for 2-stage
aggregation.
In my patch GroupedVar represents either the result of partial aggregation
(the value of the transient state) or a grouping expression which is more
complex than a plain column reference (Var expression).
In the other thread, Robert Haas wrote:
Concretely, in your test query "SELECT p.i, avg(c1.v) FROM
agg_pushdown_parent AS p JOIN agg_pushdown_child1 AS c1 ON c1.parent =
p.i GROUP BY p.i" you assume that it's OK to do a Partial
HashAggregate over c1.parent rather than p.i. This will be false if,
say, c1.parent is of type citext and p.i is of type text; this will
get grouped together that shouldn't. It will also be false if the
grouping expression is something like GROUP BY length(p.i::text),
because one value could be '0'::numeric and the other '0.00'::numeric.
I can't think of a reason why it would be false if the grouping
expressions are both simple Vars of the same underlying data type, but
I'm a little nervous that I might be wrong even about that case.
Maybe you've handled all of this somehow, but it's not obvious to me
that it has been considered.
Ah, I made the same mistake. I did consider the "GROUP BY length(p.i::text)",
but not the cross-datatype case. I think we should punt on that for now, and
only do the substitution for simple Vars of the same datatype. That seems safe
to me.
Yes, I reached the same conclusion. I'll add this restriction to the next
version of the patch.
Overall, your patch is in a more polished state than my prototype.
Probably I spent much more time on it.
For easier review, though, I think we should try to get something smaller
committed first, and build on that. Let's punt on the Var substitution.
As mentioned above, I think we can live without the Var substitution (in
other words, without 2-stage aggregation) only if we can check the uniqueness of
grouping keys of any path. So the question is how much effort this check
requires.
And I'd suggest adopting my approach of attaching the grouped RelOptInfos to
the parent RelOptInfo; that seems simpler.
o.k., I'll try this in the next version.
And if we punt on the partial aggregation, and only allow pushing down the
whole Aggregate, how much smaller would your patch get?
I can't tell now, need to spend some time looking at the code.
(I'll be on vacation for the next two weeks, but I'll be following this
thread. After that, I plan to focus on this feature, as time from reviewing
patches in the commitfest permits.)
Likewise, I'll be off from July 5th to 22nd. I'll post what I have before I
leave and will see what you could make out of it :-)
It's cool that you intend to work on this feature too!
[1]: /messages/by-id/18007.1513957437@localhost
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com
On 20.06.18 22:12, Heikki Linnakangas wrote:
Currently, the planner always first decides the scan/join order, and
adds Group/Agg nodes on top of the joins. Sometimes it would be legal,
and beneficial, to perform the aggregation below a join. I've been
hacking on a patch to allow that.
Because this patch moves a lot of code around, there are nontrivial
conflicts now. I was able to apply it on top of
fb6accd27b99f5f91a7e9e5bd32b98a53fc6d6b8 based on the date.
With that, I'm getting test failures in partition_aggregate, like this:
Sort
Sort Key: t2.y, (sum(t1.y)), (count(*))
- -> Append
- -> HashAggregate
- Group Key: t2.y
...
+ -> Result
+ -> Append
+ -> HashAggregate
+ Group Key: t1.x
...
And there is apparently no expected/aggregate_pushdown.out file in the
patch set.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
This is what I managed to hack so far. Now the patch can also handle the
AGGSPLIT_SIMPLE aggregates.
Antonin Houska <ah@cybertec.at> wrote:
Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Ah, cool! I missed that thread earlier. Yes, seems like we've been hacking on
the same feature. Let's compare!
I've been using this paper as a guide:
"Including Group-By in Query Optimization", by Surajit Chaudhuri and Kyuseok
Shim:
https://pdfs.semanticscholar.org/3079/5447cec18753254edbbd7839f0afa58b2a39.pdf
Using the terms from that paper, my patch does only "Invariant Grouping
transformation", while yours can do "Simple Coalescing Grouping", which is more
general. In layman's terms, my patch can push the Aggregate below a join, while
your patch can also split an Aggregate so that you do a partial aggregate
below the join, and a final stage above it.
Thanks for the link. I've just checked the two approaches briefly so far.
My thinking was to start with the simpler Invariant Grouping transformation
first, and do the more advanced splitting into partial aggregates later, as
a separate patch.
I think for this you need to make sure that no join duplicates already
processed groups. I tried to implement PoC of "unique keys" in v5 of my patch
[1], see 09_avoid_agg_finalization.diff. The point is that the final
aggregation can be replaced by calling pg_aggregate(aggfinalfn) function if
the final join generates only unique values of the GROUP BY expression.
Eventually I considered this an additional optimization of my approach and
postponed work on this to later, but I think something like this would be
necessary for your approach. As soon as you find out that a grouped relation
is joined to another relation in a way that duplicates the grouping
expression, you cannot proceed in creating grouped path.
The current patch version does not check the uniqueness of grouping keys, so
it actually does not produce the plans you wanted to see. For experimental
purposes, you can comment out the (agg_kind == REL_AGG_KIND_SIMPLE) branch
near the bottom of make_join_rel_common_grouped() and see that the
agg_pushdown regression test will generate plans with AGGSPLIT_SIMPLE pushed
down. For example
Finalize HashAggregate
Group Key: p.i
-> Hash Join
Hash Cond: (p.i = c1.parent)
-> Seq Scan on agg_pushdown_parent p
-> Hash
-> Partial HashAggregate
Group Key: c1.parent
-> Seq Scan on agg_pushdown_child1 c1
will become
Hash Join
Hash Cond: (p.i = c1.parent)
-> Seq Scan on agg_pushdown_parent p
-> Hash
-> HashAggregate
Group Key: c1.parent
-> Seq Scan on agg_pushdown_child1 c1
The plan will look correct but it can generate duplicate grouping keys.
There's some difference in the way we're dealing with the grouped
RelOptInfos. You're storing them completely separately, in PlannerInfo's new
'simple_grouped_rel_array' array and 'join_grouped_rel_list/hash'. I'm
attaching each grouped RelOptInfo to its parent.
In the first version of my patch I added several fields to RelOptInfo
(reltarget for the grouped paths, separate pathlist, etc.) but some people
didn't like it. In the later versions I introduced a separate RelOptInfo for
grouped relations, but stored it in a separate list. Your approach might make
the patch a bit less invasive, i.e. we don't have to add those RelOptInfo
lists / arrays to standard_join_search() and subroutines.
Done. I think this concept might eventually lead to simplification of
create_grouping_paths(), especially in the special cases related to
partitioning. I just think so because it's more generic, but haven't tried
yet.
You're introducing a new GroupedVar expression to represent Aggrefs between
the partial and final aggregates, while I'm just using Aggref directly, above
the aggregate node. I'm not thrilled about introducing a new Var-like
expression. We already have PlaceHolderVars, and I'm always confused on how
those work. But maybe that's necessary to support partial aggregation?
The similarity of GroupedVar and PlaceHolderVar is that they are evaluated at
some place in the join tree and the result is only passed to the joins above
and eventually to the query target, w/o being evaluated again. In contrast,
generic expressions are evaluated in the query target (only the input Vars get
propagated from lower nodes), but that's not what we want for 2-stage
aggregation.
In my patch GroupedVar represents either the result of partial aggregation
(the value of the transient state) or a grouping expression which is more
complex than a plain column reference (Var expression).
I'm still not convinced that GroupedVar should be removed. First, RelOptInfo
can currently have either Var or PlaceHolderVar in its reltarget, so I prefer
to add no more than one kind of expression (GroupedVar can represent either
Aggref or a generic (non-Var) grouping expression). Second, GroupedVar
indicates during cost estimation that the value has been evaluated at lower
node of the join tree and thus the higher nodes should not account for the
evaluation again, see set_pathtarget_cost_width().
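For readers following along, the shape of the new node can be read off the
copy/read functions in the attached patch; roughly, as a reconstruction
rather than the exact header:
  typedef struct GroupedVar
  {
      Expr        xpr;          /* expression node header */
      Expr       *gvexpr;       /* the Aggref or grouping expression */
      Aggref     *agg_partial;  /* partial-aggregate form, for the 2-stage case */
      Index       sortgroupref; /* sortgroupref of the grouping expression */
      Index       gvid;         /* ID linking to the matching GroupedVarInfo */
      int32       width;        /* width estimate, used in costing */
  } GroupedVar;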
In the other thread, Robert Haas wrote:
Concretely, in your test query "SELECT p.i, avg(c1.v) FROM
agg_pushdown_parent AS p JOIN agg_pushdown_child1 AS c1 ON c1.parent =
p.i GROUP BY p.i" you assume that it's OK to do a Partial
HashAggregate over c1.parent rather than p.i. This will be false if,
say, c1.parent is of type citext and p.i is of type text; this will
get grouped together that shouldn't. It will also be false if the
grouping expression is something like GROUP BY length(p.i::text),
because one value could be '0'::numeric and the other '0.00'::numeric.
I can't think of a reason why it would be false if the grouping
expressions are both simple Vars of the same underlying data type, but
I'm a little nervous that I might be wrong even about that case.
Maybe you've handled all of this somehow, but it's not obvious to me
that it has been considered.
Ah, I made the same mistake. I did consider the "GROUP BY length(p.i::text)",
but not the cross-datatype case. I think we should punt on that for now, and
only do the substitution for simple Vars of the same datatype. That seems safe
to me.
Yes, I reached the same conclusion. I'll add this restriction to the next
version of the patch.
Done.
A related problem is "binary equality" of grouping keys. We need to avoid
aggregation push-down if it can discard information that a JOIN/ON or WHERE
clause needs. The patch does not solve the problem yet. This is where the
discussion ended up: [1]
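The numeric case quoted earlier illustrates this well: two values can be
equal under the type's "=" operator while an expression on top of them can
still tell them apart:
  select '0'::numeric = '0.00'::numeric as num_eq,
         length('0'::numeric::text)     as len_a,
         length('0.00'::numeric::text)  as len_b;
   num_eq | len_a | len_b
  --------+-------+-------
   t      |     1 |     4
Merging '0' and '0.00' while grouping below the join would discard exactly
the difference that a clause like length(p.i::text) above the join still
needs.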
Overall, your patch is in a more polished state than my prototype.
Probably I spent much more time on it.
For easier review, though, I think we should try to get something smaller
committed first, and build on that. Let's punt on the Var substitution.
And if we punt on the partial aggregation, and only allow pushing down the
whole Aggregate, how much smaller would your patch get?
I can't tell now, need to spend some time looking at the code.
I didn't have enough time to separate "your functionality" and can do it when
I'm back from vacation. If you're curious and want to try it yourself, this is
what you need to do to eliminate "my functionality":
* Remove the REL_AGG_KIND_PARTIAL constant (or the whole RelAggKind
enumeration)
* Remove the needs_final_agg field of the RelOptGrouped structure (or the
whole structure and make RelOptInfo->grouped point directly to the RelOptInfo
that no_final_agg currently points to).
* Remove code paths "if (GroupedVar.agg_partial != NULL) ..."
* Revert my changes of create_ordinary_grouping_paths()
* Revert the related part of my changes of set_upper_references() (see
comments)
And then try to adjust the code so it can compile.
Of course the patch needs quite some polishing, but first we need to
achieve consensus on the concepts.
(I'll be on vacation for the next two weeks, but I'll be following this
thread. After that, I plan to focus on this feature, as time from reviewing
patches in the commitfest permits.)
Likewise, I'll be off from July 5th to 22nd. I'll post what I have before I
leave and will see what you could make out of it :-)
It's cool that you intend to work on this feature too!
A few more notes:
* I didn't have time to check if all the regression tests succeed. Besides the
tests that the patch adds I only ran the partition_join test with
enable_agg_pushdown enabled, to see that the patch does push aggregation down
to partitions.
* Older version of my patch contains the postgres_fdw part. I can adjust it
when I'm back.
[1]: /messages/by-id/11966.1530875502@localhost
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com
Attachments:
agg_pushdown_v7.patch (text/x-diff)
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index e284fd71d7..fa71cfb010 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -798,6 +798,45 @@ ExecInitExprRec(Expr *node, ExprState *state,
break;
}
+ case T_GroupedVar:
+
+ /*
+ * If GroupedVar appears in targetlist of Agg node, it can
+ * represent either Aggref or grouping expression.
+ *
+ * TODO Consider doing this expansion earlier, e.g. in setrefs.c.
+ */
+ if (state->parent && (IsA(state->parent, AggState)))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+
+ if (IsA(gvar->gvexpr, Aggref))
+ {
+ if (gvar->agg_partial)
+ ExecInitExprRec((Expr *) gvar->agg_partial, state,
+ resv, resnull);
+ else
+ ExecInitExprRec((Expr *) gvar->gvexpr, state,
+ resv, resnull);
+ }
+ else
+ ExecInitExprRec((Expr *) gvar->gvexpr, state,
+ resv, resnull);
+ break;
+ }
+ else
+ {
+ /*
+ * set_plan_refs should have replaced GroupedVar in the
+ * targetlist with an ordinary Var.
+ *
+ * XXX Should we error out here? There's at least one legal
+ * case here which we'd have to check: a Result plan with no
+ * outer plan which represents an empty Append plan.
+ */
+ break;
+ }
+
case T_GroupingFunc:
{
GroupingFunc *grp_node = (GroupingFunc *) node;
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 1c12075b01..c03e7592b3 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1421,6 +1421,7 @@ _copyAggref(const Aggref *from)
COPY_SCALAR_FIELD(aggcollid);
COPY_SCALAR_FIELD(inputcollid);
COPY_SCALAR_FIELD(aggtranstype);
+ COPY_SCALAR_FIELD(aggcombinefn);
COPY_NODE_FIELD(aggargtypes);
COPY_NODE_FIELD(aggdirectargs);
COPY_NODE_FIELD(args);
@@ -2273,6 +2274,23 @@ _copyPlaceHolderVar(const PlaceHolderVar *from)
}
/*
+ * _copyGroupedVar
+ */
+static GroupedVar *
+_copyGroupedVar(const GroupedVar *from)
+{
+ GroupedVar *newnode = makeNode(GroupedVar);
+
+ COPY_NODE_FIELD(gvexpr);
+ COPY_NODE_FIELD(agg_partial);
+ COPY_SCALAR_FIELD(sortgroupref);
+ COPY_SCALAR_FIELD(gvid);
+ COPY_SCALAR_FIELD(width);
+
+ return newnode;
+}
+
+/*
* _copySpecialJoinInfo
*/
static SpecialJoinInfo *
@@ -2331,6 +2349,21 @@ _copyPlaceHolderInfo(const PlaceHolderInfo *from)
return newnode;
}
+static GroupedVarInfo *
+_copyGroupedVarInfo(const GroupedVarInfo *from)
+{
+ GroupedVarInfo *newnode = makeNode(GroupedVarInfo);
+
+ COPY_SCALAR_FIELD(gvid);
+ COPY_NODE_FIELD(gvexpr);
+ COPY_NODE_FIELD(agg_partial);
+ COPY_SCALAR_FIELD(sortgroupref);
+ COPY_SCALAR_FIELD(gv_eval_at);
+ COPY_SCALAR_FIELD(derived);
+
+ return newnode;
+}
+
/* ****************************************************************
* parsenodes.h copy functions
* ****************************************************************
@@ -5086,6 +5119,9 @@ copyObjectImpl(const void *from)
case T_PlaceHolderVar:
retval = _copyPlaceHolderVar(from);
break;
+ case T_GroupedVar:
+ retval = _copyGroupedVar(from);
+ break;
case T_SpecialJoinInfo:
retval = _copySpecialJoinInfo(from);
break;
@@ -5095,6 +5131,9 @@ copyObjectImpl(const void *from)
case T_PlaceHolderInfo:
retval = _copyPlaceHolderInfo(from);
break;
+ case T_GroupedVarInfo:
+ retval = _copyGroupedVarInfo(from);
+ break;
/*
* VALUE NODES
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 6a971d0141..8cd4051e74 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -873,6 +873,14 @@ _equalPlaceHolderVar(const PlaceHolderVar *a, const PlaceHolderVar *b)
}
static bool
+_equalGroupedVar(const GroupedVar *a, const GroupedVar *b)
+{
+ COMPARE_SCALAR_FIELD(gvid);
+
+ return true;
+}
+
+static bool
_equalSpecialJoinInfo(const SpecialJoinInfo *a, const SpecialJoinInfo *b)
{
COMPARE_BITMAPSET_FIELD(min_lefthand);
@@ -3173,6 +3181,9 @@ equal(const void *a, const void *b)
case T_PlaceHolderVar:
retval = _equalPlaceHolderVar(a, b);
break;
+ case T_GroupedVar:
+ retval = _equalGroupedVar(a, b);
+ break;
case T_SpecialJoinInfo:
retval = _equalSpecialJoinInfo(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index a10014f755..8ea1f212a8 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -259,6 +259,17 @@ exprType(const Node *expr)
case T_PlaceHolderVar:
type = exprType((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupedVar:
+ if (IsA(((const GroupedVar *) expr)->gvexpr, Aggref))
+ {
+ if (((const GroupedVar *) expr)->agg_partial)
+ type = exprType((Node *) ((const GroupedVar *) expr)->agg_partial);
+ else
+ type = exprType((Node *) ((const GroupedVar *) expr)->gvexpr);
+ }
+ else
+ type = exprType((Node *) ((const GroupedVar *) expr)->gvexpr);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
type = InvalidOid; /* keep compiler quiet */
@@ -492,6 +503,16 @@ exprTypmod(const Node *expr)
return ((const SetToDefault *) expr)->typeMod;
case T_PlaceHolderVar:
return exprTypmod((Node *) ((const PlaceHolderVar *) expr)->phexpr);
+ case T_GroupedVar:
+ if (IsA(((const GroupedVar *) expr)->gvexpr, Aggref))
+ {
+ if (((const GroupedVar *) expr)->agg_partial)
+ return exprTypmod((Node *) ((const GroupedVar *) expr)->agg_partial);
+ else
+ return exprTypmod((Node *) ((const GroupedVar *) expr)->gvexpr);
+ }
+ else
+ return exprTypmod((Node *) ((const GroupedVar *) expr)->gvexpr);
default:
break;
}
@@ -903,6 +924,12 @@ exprCollation(const Node *expr)
case T_PlaceHolderVar:
coll = exprCollation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupedVar:
+ if (IsA(((const GroupedVar *) expr)->gvexpr, Aggref))
+ coll = exprCollation((Node *) ((const GroupedVar *) expr)->agg_partial);
+ else
+ coll = exprCollation((Node *) ((const GroupedVar *) expr)->gvexpr);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
coll = InvalidOid; /* keep compiler quiet */
@@ -2187,6 +2214,8 @@ expression_tree_walker(Node *node,
break;
case T_PlaceHolderVar:
return walker(((PlaceHolderVar *) node)->phexpr, context);
+ case T_GroupedVar:
+ return walker(((GroupedVar *) node)->gvexpr, context);
case T_InferenceElem:
return walker(((InferenceElem *) node)->expr, context);
case T_AppendRelInfo:
@@ -2993,6 +3022,16 @@ expression_tree_mutator(Node *node,
return (Node *) newnode;
}
break;
+ case T_GroupedVar:
+ {
+ GroupedVar *gv = (GroupedVar *) node;
+ GroupedVar *newnode;
+
+ FLATCOPY(newnode, gv, GroupedVar);
+ MUTATE(newnode->gvexpr, gv->gvexpr, Expr *);
+ MUTATE(newnode->agg_partial, gv->agg_partial, Aggref *);
+ return (Node *) newnode;
+ }
case T_InferenceElem:
{
InferenceElem *inferenceelemdexpr = (InferenceElem *) node;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 979d523e00..7a8e52614d 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1196,6 +1196,7 @@ _outAggref(StringInfo str, const Aggref *node)
WRITE_OID_FIELD(aggcollid);
WRITE_OID_FIELD(inputcollid);
WRITE_OID_FIELD(aggtranstype);
+ WRITE_OID_FIELD(aggcombinefn);
WRITE_NODE_FIELD(aggargtypes);
WRITE_NODE_FIELD(aggdirectargs);
WRITE_NODE_FIELD(args);
@@ -2287,6 +2288,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_NODE_FIELD(append_rel_list);
WRITE_NODE_FIELD(rowMarks);
WRITE_NODE_FIELD(placeholder_list);
+ WRITE_NODE_FIELD(grouped_var_list);
WRITE_NODE_FIELD(fkey_list);
WRITE_NODE_FIELD(query_pathkeys);
WRITE_NODE_FIELD(group_pathkeys);
@@ -2294,6 +2296,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_NODE_FIELD(distinct_pathkeys);
WRITE_NODE_FIELD(sort_pathkeys);
WRITE_NODE_FIELD(processed_tlist);
+ WRITE_INT_FIELD(max_sortgroupref);
WRITE_NODE_FIELD(minmax_aggs);
WRITE_FLOAT_FIELD(total_table_pages, "%.0f");
WRITE_FLOAT_FIELD(tuple_fraction, "%.4f");
@@ -2334,6 +2337,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
WRITE_NODE_FIELD(cheapest_parameterized_paths);
WRITE_BITMAPSET_FIELD(direct_lateral_relids);
WRITE_BITMAPSET_FIELD(lateral_relids);
+ WRITE_NODE_FIELD(agg_info);
WRITE_UINT_FIELD(relid);
WRITE_OID_FIELD(reltablespace);
WRITE_ENUM_FIELD(rtekind, RTEKind);
@@ -2511,6 +2515,20 @@ _outParamPathInfo(StringInfo str, const ParamPathInfo *node)
}
static void
+_outRelAggInfo(StringInfo str, const RelAggInfo *node)
+{
+ WRITE_NODE_TYPE("RELAGGINFO");
+
+ WRITE_NODE_FIELD(target_simple);
+ WRITE_NODE_FIELD(target_partial);
+ WRITE_NODE_FIELD(input);
+ WRITE_NODE_FIELD(group_clauses);
+ WRITE_NODE_FIELD(group_exprs);
+ WRITE_NODE_FIELD(agg_exprs_simple);
+ WRITE_NODE_FIELD(agg_exprs_partial);
+}
+
+static void
_outRestrictInfo(StringInfo str, const RestrictInfo *node)
{
WRITE_NODE_TYPE("RESTRICTINFO");
@@ -2554,6 +2572,18 @@ _outPlaceHolderVar(StringInfo str, const PlaceHolderVar *node)
}
static void
+_outGroupedVar(StringInfo str, const GroupedVar *node)
+{
+ WRITE_NODE_TYPE("GROUPEDVAR");
+
+ WRITE_NODE_FIELD(gvexpr);
+ WRITE_NODE_FIELD(agg_partial);
+ WRITE_UINT_FIELD(sortgroupref);
+ WRITE_UINT_FIELD(gvid);
+ WRITE_INT_FIELD(width);
+}
+
+static void
_outSpecialJoinInfo(StringInfo str, const SpecialJoinInfo *node)
{
WRITE_NODE_TYPE("SPECIALJOININFO");
@@ -2598,6 +2628,19 @@ _outPlaceHolderInfo(StringInfo str, const PlaceHolderInfo *node)
}
static void
+_outGroupedVarInfo(StringInfo str, const GroupedVarInfo *node)
+{
+ WRITE_NODE_TYPE("GROUPEDVARINFO");
+
+ WRITE_UINT_FIELD(gvid);
+ WRITE_NODE_FIELD(gvexpr);
+ WRITE_NODE_FIELD(agg_partial);
+ WRITE_UINT_FIELD(sortgroupref);
+ WRITE_BITMAPSET_FIELD(gv_eval_at);
+ WRITE_BOOL_FIELD(derived);
+}
+
+static void
_outMinMaxAggInfo(StringInfo str, const MinMaxAggInfo *node)
{
WRITE_NODE_TYPE("MINMAXAGGINFO");
@@ -4121,12 +4164,18 @@ outNode(StringInfo str, const void *obj)
case T_ParamPathInfo:
_outParamPathInfo(str, obj);
break;
+ case T_RelAggInfo:
+ _outRelAggInfo(str, obj);
+ break;
case T_RestrictInfo:
_outRestrictInfo(str, obj);
break;
case T_PlaceHolderVar:
_outPlaceHolderVar(str, obj);
break;
+ case T_GroupedVar:
+ _outGroupedVar(str, obj);
+ break;
case T_SpecialJoinInfo:
_outSpecialJoinInfo(str, obj);
break;
@@ -4136,6 +4185,9 @@ outNode(StringInfo str, const void *obj)
case T_PlaceHolderInfo:
_outPlaceHolderInfo(str, obj);
break;
+ case T_GroupedVarInfo:
+ _outGroupedVarInfo(str, obj);
+ break;
case T_MinMaxAggInfo:
_outMinMaxAggInfo(str, obj);
break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 42aff7f57a..9a54f768c3 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -534,6 +534,23 @@ _readVar(void)
}
/*
+ * _readGroupedVar
+ */
+static GroupedVar *
+_readGroupedVar(void)
+{
+ READ_LOCALS(GroupedVar);
+
+ READ_NODE_FIELD(gvexpr);
+ READ_NODE_FIELD(agg_partial);
+ READ_UINT_FIELD(sortgroupref);
+ READ_UINT_FIELD(gvid);
+ READ_INT_FIELD(width);
+
+ READ_DONE();
+}
+
+/*
* _readConst
*/
static Const *
@@ -589,6 +606,7 @@ _readAggref(void)
READ_OID_FIELD(aggcollid);
READ_OID_FIELD(inputcollid);
READ_OID_FIELD(aggtranstype);
+ READ_OID_FIELD(aggcombinefn);
READ_NODE_FIELD(aggargtypes);
READ_NODE_FIELD(aggdirectargs);
READ_NODE_FIELD(args);
@@ -2535,6 +2553,8 @@ parseNodeString(void)
return_value = _readTableFunc();
else if (MATCH("VAR", 3))
return_value = _readVar();
+ else if (MATCH("GROUPEDVAR", 10))
+ return_value = _readGroupedVar();
else if (MATCH("CONST", 5))
return_value = _readConst();
else if (MATCH("PARAM", 5))
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index 3ef7d7d8aa..30b5b6ad63 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -266,7 +266,8 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
if (joinrel)
{
/* Create paths for partitionwise joins. */
- generate_partitionwise_join_paths(root, joinrel);
+ generate_partitionwise_join_paths(root, joinrel,
+ REL_AGG_KIND_NONE);
/*
* Except for the topmost scan/join rel, consider gathering
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 3ada379f8b..70a97689b5 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -58,6 +58,7 @@ typedef struct pushdown_safety_info
/* These parameters are set by GUC */
bool enable_geqo = false; /* just in case GUC doesn't set it */
+bool enable_agg_pushdown;
int geqo_threshold;
int min_parallel_table_scan_size;
int min_parallel_index_scan_size;
@@ -73,16 +74,18 @@ static void set_base_rel_consider_startup(PlannerInfo *root);
static void set_base_rel_sizes(PlannerInfo *root);
static void set_base_rel_pathlists(PlannerInfo *root);
static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte);
+ Index rti, RangeTblEntry *rte, RelAggKind agg_kind);
static void set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte);
+ Index rti, RangeTblEntry *rte,
+ RelAggKind agg_kind);
static void set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
-static void create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel);
+static void create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel,
+ RelAggKind agg_kind);
static void set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static void set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
- RangeTblEntry *rte);
+ RangeTblEntry *rte, RelAggKind agg_kind);
static void set_tablesample_rel_size(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static void set_tablesample_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
@@ -92,16 +95,19 @@ static void set_foreign_size(PlannerInfo *root, RelOptInfo *rel,
static void set_foreign_pathlist(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte);
+ Index rti, RangeTblEntry *rte,
+ RelAggKind agg_kind);
static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte);
+ Index rti, RangeTblEntry *rte,
+ RelAggKind agg_kind);
static void generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
List *live_childrels,
List *all_child_pathkeys,
List *partitioned_rels);
static Path *get_cheapest_parameterized_child_path(PlannerInfo *root,
RelOptInfo *rel,
- Relids required_outer);
+ Relids required_outer,
+ RelAggKind agg_kind);
static void accumulate_append_subpath(Path *path,
List **subpaths, List **special_subpaths);
static void set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
@@ -118,7 +124,8 @@ static void set_namedtuplestore_pathlist(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static void set_worktable_pathlist(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
-static RelOptInfo *make_rel_from_joinlist(PlannerInfo *root, List *joinlist);
+static RelOptInfo *make_rel_from_joinlist(PlannerInfo *root,
+ List *joinlist);
static bool subquery_is_pushdown_safe(Query *subquery, Query *topquery,
pushdown_safety_info *safetyInfo);
static bool recurse_pushdown_safe(Node *setOp, Query *topquery,
@@ -140,7 +147,8 @@ static void remove_unused_subquery_outputs(Query *subquery, RelOptInfo *rel);
/*
* make_one_rel
* Finds all possible access paths for executing a query, returning a
- * single rel that represents the join of all base rels in the query.
+ * single rel that represents the join of all base rels in the query. If
+ * possible, also return a join that contains partial aggregate(s).
*/
RelOptInfo *
make_one_rel(PlannerInfo *root, List *joinlist)
@@ -169,12 +177,16 @@ make_one_rel(PlannerInfo *root, List *joinlist)
root->all_baserels = bms_add_member(root->all_baserels, brel->relid);
}
- /* Mark base rels as to whether we care about fast-start plans */
+ /*
+ * Mark base rels as to whether we care about fast-start plans. XXX We
+ * deliberately do not mark grouped rels --- see the comment on
+ * consider_startup in build_simple_rel().
+ */
set_base_rel_consider_startup(root);
/*
- * Compute size estimates and consider_parallel flags for each base rel,
- * then generate access paths.
+ * Compute size estimates and consider_parallel flags for each plain and
+ * each grouped base rel, then generate access paths.
*/
set_base_rel_sizes(root);
set_base_rel_pathlists(root);
@@ -231,6 +243,21 @@ set_base_rel_consider_startup(PlannerInfo *root)
RelOptInfo *rel = find_base_rel(root, varno);
rel->consider_param_startup = true;
+
+ if (rel->grouped)
+ {
+ /*
+ * As for grouped relations, paths differ substantially by the
+ * AggStrategy. Paths that use AGG_HASHED should not be
+ * parameterized (because creation of hashtable would have to
+ * be repeated for different parameters) but paths using
+ * AGG_SORTED can be. The latter seems to justify considering
+ * the startup cost for grouped relation in general.
+ */
+ rel->grouped->needs_final_agg->consider_param_startup = true;
+ if (rel->grouped->no_final_agg)
+ rel->grouped->no_final_agg->consider_param_startup = true;
+ }
}
}
}
@@ -253,6 +280,7 @@ set_base_rel_sizes(PlannerInfo *root)
for (rti = 1; rti < root->simple_rel_array_size; rti++)
{
RelOptInfo *rel = root->simple_rel_array[rti];
+ RelOptGrouped *rels_grouped;
RangeTblEntry *rte;
/* there may be empty slots corresponding to non-baserel RTEs */
@@ -266,6 +294,7 @@ set_base_rel_sizes(PlannerInfo *root)
continue;
rte = root->simple_rte_array[rti];
+ rels_grouped = rel->grouped;
/*
* If parallelism is allowable for this query in general, see whether
@@ -276,9 +305,31 @@ set_base_rel_sizes(PlannerInfo *root)
* goes ahead and makes paths immediately.
*/
if (root->glob->parallelModeOK)
+ {
set_rel_consider_parallel(root, rel, rte);
- set_rel_size(root, rel, rti, rte);
+ /*
+ * The grouped rel should not need this field (the owning plain
+ * relation controls whether the aggregation takes place in a
+ * parallel worker) but let's set it for consistency.
+ *
+ * TODO Either do the same for no_final_agg or remove this setting
+ * altogether.
+ */
+ if (rels_grouped)
+ rels_grouped->needs_final_agg->consider_parallel =
+ rel->consider_parallel;
+ }
+
+ set_rel_size(root, rel, rti, rte, REL_AGG_KIND_NONE);
+ if (rels_grouped)
+ {
+ set_rel_size(root, rels_grouped->needs_final_agg, rti, rte,
+ REL_AGG_KIND_PARTIAL);
+ if (rels_grouped->no_final_agg)
+ set_rel_size(root, rels_grouped->no_final_agg, rti, rte,
+ REL_AGG_KIND_SIMPLE);
+ }
}
}
@@ -297,7 +348,9 @@ set_base_rel_pathlists(PlannerInfo *root)
{
RelOptInfo *rel = root->simple_rel_array[rti];
- /* there may be empty slots corresponding to non-baserel RTEs */
+ /*
+ * there may be empty slots corresponding to non-baserel RTEs
+ */
if (rel == NULL)
continue;
@@ -307,7 +360,31 @@ set_base_rel_pathlists(PlannerInfo *root)
if (rel->reloptkind != RELOPT_BASEREL)
continue;
- set_rel_pathlist(root, rel, rti, root->simple_rte_array[rti]);
+ set_rel_pathlist(root, rel, rti, root->simple_rte_array[rti],
+ REL_AGG_KIND_NONE);
+
+ /*
+ * Create grouped paths for grouped relation if it exists.
+ */
+ if (rel->grouped)
+ {
+ Assert(rel->grouped->needs_final_agg->agg_info != NULL);
+ Assert(rel->grouped->needs_final_agg->grouped == NULL);
+
+ set_rel_pathlist(root, rel, rti,
+ root->simple_rte_array[rti],
+ REL_AGG_KIND_PARTIAL);
+
+ if (rel->grouped->no_final_agg)
+ {
+ Assert(rel->grouped->no_final_agg->agg_info != NULL);
+ Assert(rel->grouped->no_final_agg->grouped == NULL);
+
+ set_rel_pathlist(root, rel, rti,
+ root->simple_rte_array[rti],
+ REL_AGG_KIND_SIMPLE);
+ }
+ }
}
}
@@ -317,8 +394,16 @@ set_base_rel_pathlists(PlannerInfo *root)
*/
static void
set_rel_size(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte)
+ Index rti, RangeTblEntry *rte, RelAggKind agg_kind)
{
+ bool grouped = rel->agg_info != NULL;
+
+ /*
+ * build_simple_rel() should not have created rels that do not match this
+ * condition.
+ */
+ Assert(!grouped || rte->rtekind == RTE_RELATION);
+
if (rel->reloptkind == RELOPT_BASEREL &&
relation_excluded_by_constraints(root, rel, rte))
{
@@ -338,7 +423,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
else if (rte->inh)
{
/* It's an "append relation", process accordingly */
- set_append_rel_size(root, rel, rti, rte);
+ set_append_rel_size(root, rel, rti, rte, agg_kind);
}
else
{
@@ -348,6 +433,8 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
if (rte->relkind == RELKIND_FOREIGN_TABLE)
{
/* Foreign table */
+ /* Not supported yet, see build_simple_rel(). */
+ Assert(!grouped);
set_foreign_size(root, rel, rte);
}
else if (rte->relkind == RELKIND_PARTITIONED_TABLE)
@@ -361,6 +448,8 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
else if (rte->tablesample != NULL)
{
/* Sampled relation */
+ /* Not supported yet, see build_simple_rel(). */
+ Assert(!grouped);
set_tablesample_rel_size(root, rel, rte);
}
else
@@ -420,8 +509,16 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
*/
static void
set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte)
+ Index rti, RangeTblEntry *rte, RelAggKind agg_kind)
{
+ bool grouped = rel->agg_info != NULL;
+
+ /*
+ * add_grouped_base_rels_to_query() should not have created rels that do
+ * not match this condition.
+ */
+ Assert(!grouped || rte->rtekind == RTE_RELATION);
+
if (IS_DUMMY_REL(rel))
{
/* We already proved the relation empty, so nothing more to do */
@@ -429,7 +526,7 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
else if (rte->inh)
{
/* It's an "append relation", process accordingly */
- set_append_rel_pathlist(root, rel, rti, rte);
+ set_append_rel_pathlist(root, rel, rti, rte, agg_kind);
}
else
{
@@ -439,17 +536,21 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
if (rte->relkind == RELKIND_FOREIGN_TABLE)
{
/* Foreign table */
+ /* Not supported yet, see build_simple_rel(). */
+ Assert(!grouped);
set_foreign_pathlist(root, rel, rte);
}
else if (rte->tablesample != NULL)
{
/* Sampled relation */
+ /* Not supported yet, see build_simple_rel(). */
+ Assert(!grouped);
set_tablesample_rel_pathlist(root, rel, rte);
}
else
{
/* Plain relation */
- set_plain_rel_pathlist(root, rel, rte);
+ set_plain_rel_pathlist(root, rel, rte, agg_kind);
}
break;
case RTE_SUBQUERY:
@@ -479,6 +580,11 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
}
}
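+
+ /*
+ * From here on, operate on the grouped rel where requested, so that
+ * the Gather paths considered below are built on top of the grouped
+ * partial paths.
+ */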
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ rel = rel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ rel = rel->grouped->no_final_agg;
+
/*
* If this is a baserel, we should normally consider gathering any partial
* paths we may have created for it.
@@ -692,9 +798,17 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
* Build access paths for a plain relation (no subquery, no inheritance)
*/
static void
-set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
+set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte,
+ RelAggKind agg_kind)
{
Relids required_outer;
+ Path *seq_path;
+ RelOptInfo *rel_plain = rel;
+
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ rel = rel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ rel = rel->grouped->no_final_agg;
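+
+ /*
+ * "rel" now points to the grouped rel where requested; "rel_plain"
+ * still refers to the plain rel, which create_plain_partial_paths()
+ * below expects.
+ */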
/*
* We don't support pushing join clauses into the quals of a seqscan, but
@@ -703,18 +817,37 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
*/
required_outer = rel->lateral_relids;
- /* Consider sequential scan */
- add_path(rel, create_seqscan_path(root, rel, required_outer, 0));
+ /* Consider sequential scan, both plain and grouped. */
+ seq_path = create_seqscan_path(root, rel, required_outer, 0);
- /* If appropriate, consider parallel sequential scan */
+ /*
+ * It's probably not a good idea to repeat hashed aggregation with
+ * different parameter values, so only create a grouped path when there
+ * are no parameters.
+ */
+ if (agg_kind == REL_AGG_KIND_NONE)
+ add_path(rel, seq_path);
+ else if (required_outer == NULL)
+ {
+ /*
+ * Only AGG_HASHED is suitable here as it does not expect the input
+ * set to be sorted.
+ */
+ create_grouped_path(root, rel, seq_path, false, false, AGG_HASHED,
+ agg_kind);
+ }
+
+ /* If appropriate, consider parallel sequential scan (plain or grouped) */
if (rel->consider_parallel && required_outer == NULL)
- create_plain_partial_paths(root, rel);
+ create_plain_partial_paths(root, rel_plain, agg_kind);
- /* Consider index scans */
- create_index_paths(root, rel);
+ /*
+ * Consider index scans.
+ */
+ create_index_paths(root, rel, agg_kind);
/* Consider TID scans */
- create_tidscan_paths(root, rel);
+ /* TODO Regression test for these paths. */
+ create_tidscan_paths(root, rel, agg_kind);
}
/*
@@ -722,19 +855,143 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
* Build partial access paths for parallel scan of a plain relation
*/
static void
-create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
+create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel,
+ RelAggKind agg_kind)
{
int parallel_workers;
+ Path *path;
- parallel_workers = compute_parallel_worker(rel, rel->pages, -1,
- max_parallel_workers_per_gather);
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ rel = rel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ rel = rel->grouped->no_final_agg;
+
+ /*
+ * See the no_final_agg field of RelOptGrouped for an explanation.
+ */
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ return;
+
+ parallel_workers = compute_parallel_worker(rel, rel->pages, -1,
+ max_parallel_workers_per_gather);
/* If any limit was set to zero, the user doesn't want a parallel scan. */
if (parallel_workers <= 0)
return;
/* Add an unordered partial path based on a parallel sequential scan. */
- add_partial_path(rel, create_seqscan_path(root, rel, NULL, parallel_workers));
+ path = create_seqscan_path(root, rel, NULL, parallel_workers);
+
+ if (agg_kind == REL_AGG_KIND_NONE)
+ add_partial_path(rel, path);
+ else
+ {
+ /*
+ * Do partial aggregation at base relation level if the relation is
+ * eligible for it. Only AGG_HASHED is suitable here as it does not
+ * expect the input set to be sorted.
+ */
+ create_grouped_path(root, rel, path, false, true, AGG_HASHED,
+ agg_kind);
+ }
+}
+
+/*
+ * Apply aggregation to a subpath and add the AggPath to the pathlist.
+ *
+ * "precheck" tells whether the aggregation path should first be checked using
+ * add_path_precheck() / add_partial_path_precheck().
+ *
+ * If "partial" is true, the aggregation path is considered partial in terms
+ * of parallel execution.
+ *
+ * "agg_kind" tells whether the aggregation should be partial (in terms of
+ * 2-stage aggregation) or simple (i.e. 1-stage aggregation).
+ *
+ * Caution: Since only a grouped relation makes sense as input for this
+ * function, "rel" is already the grouped relation even though "agg_kind" is
+ * passed too. This differs from other functions that receive "agg_kind" and
+ * use it to fetch the grouped relation themselves.
+ *
+ * The return value tells whether the path was added to the pathlist.
+ */
+bool
+create_grouped_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
+ bool precheck, bool partial, AggStrategy aggstrategy,
+ RelAggKind agg_kind)
+{
+ Path *agg_path;
+ RelAggInfo *agg_info = rel->agg_info;
+
+ Assert(agg_kind != REL_AGG_KIND_NONE);
+ Assert(agg_info != NULL);
+
+ /*
+ * REL_AGG_KIND_SIMPLE causes finalization of aggregates. A parallel
+ * path could only be correct if each worker produced a distinct set of
+ * grouping keys, and we cannot prove that special case. So refuse to
+ * create a partial path here.
+ */
+ if (agg_kind == REL_AGG_KIND_SIMPLE && partial)
+ return false;
+
+ /*
+ * If the AggPath should be partial, the subpath must be too, and a
+ * partial subpath is necessarily parallel_safe.
+ */
+ Assert(subpath->parallel_safe || !partial);
+
+ /*
+ * Repeated creation of the hash table does not sound like a good idea.
+ * The caller should avoid asking us to do so.
+ */
+ Assert(subpath->param_info == NULL || aggstrategy != AGG_HASHED);
+
+ /*
+ * Note that "partial" in the following function names refers to 2-stage
+ * aggregation, not to parallel processing.
+ */
+ if (aggstrategy == AGG_HASHED)
+ agg_path = (Path *) create_agg_hashed_path(root, subpath,
+ subpath->rows,
+ agg_kind);
+ else if (aggstrategy == AGG_SORTED)
+ agg_path = (Path *) create_agg_sorted_path(root, subpath,
+ true,
+ subpath->rows,
+ agg_kind);
+ else
+ elog(ERROR, "unexpected strategy %d", aggstrategy);
+
+ /* Add the grouped path to the list of grouped base paths. */
+ if (agg_path != NULL)
+ {
+ if (precheck)
+ {
+ List *pathkeys;
+
+ /* AGG_HASHED is not supposed to generate sorted output. */
+ pathkeys = aggstrategy == AGG_SORTED ? subpath->pathkeys : NIL;
+
+ if (!partial &&
+ !add_path_precheck(rel, agg_path->startup_cost,
+ agg_path->total_cost, pathkeys, NULL))
+ return false;
+
+ if (partial &&
+ !add_partial_path_precheck(rel, agg_path->total_cost,
+ pathkeys))
+ return false;
+ }
+
+ if (!partial)
+ add_path(rel, (Path *) agg_path);
+ else
+ add_partial_path(rel, (Path *) agg_path);
+
+ return true;
+ }
+
+ return false;
}
/*
@@ -866,7 +1123,7 @@ set_foreign_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
*/
static void
set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte)
+ Index rti, RangeTblEntry *rte, RelAggKind agg_kind)
{
int parentRTindex = rti;
bool has_live_children;
@@ -877,6 +1134,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
ListCell *l;
Relids live_children = NULL;
bool did_pruning = false;
+ bool grouped = rel->agg_info != NULL;
/* Guard against stack overflow due to overly deep inheritance tree. */
check_stack_depth();
@@ -1016,10 +1274,46 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
* PlaceHolderVars.) XXX we do not bother to update the cost or width
* fields of childrel->reltarget; not clear if that would be useful.
*/
- childrel->reltarget->exprs = (List *)
- adjust_appendrel_attrs(root,
- (Node *) rel->reltarget->exprs,
- 1, &appinfo);
+ if (grouped)
+ {
+ RelOptInfo *childrel_grouped;
+
+ Assert(childrel->grouped != NULL);
+
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ childrel_grouped = childrel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ childrel_grouped = childrel->grouped->no_final_agg;
+ else
+ Assert(false);
+
+ /*
+ * Special attention is needed in the grouped case.
+ *
+ * copy_simple_rel() didn't create an empty target because it's
+ * better to start by copying one from the parent rel.
+ */
+ Assert(childrel_grouped->reltarget == NULL &&
+ childrel_grouped->agg_info == NULL);
+ Assert(rel->reltarget != NULL && rel->agg_info != NULL);
+
+ /*
+ * Translate the targets and grouping expressions so they match
+ * this child.
+ */
+ childrel_grouped->agg_info = translate_rel_agg_info(root, rel->agg_info,
+ &appinfo, 1);
+
+ /*
+ * The relation paths will generate input for partial aggregation.
+ */
+ childrel_grouped->reltarget = childrel_grouped->agg_info->input;
+ }
+ else
+ childrel->reltarget->exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) rel->reltarget->exprs,
+ 1, &appinfo);
/*
* We have to make child entries in the EquivalenceClass data
@@ -1181,19 +1475,42 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
1, &appinfo);
/*
+ * We have to make child entries in the EquivalenceClass data
+ * structures as well. This is needed either if the parent
+ * participates in some eclass joins (because we will want to consider
+ * inner-indexscan joins on the individual children) or if the parent
+ * has useful pathkeys (because we should try to build MergeAppend
+ * paths that produce those sort orderings).
+ */
+ if (rel->has_eclass_joins || has_useful_pathkeys(root, rel))
+ add_child_rel_equivalences(root, appinfo, rel, childrel);
+ childrel->has_eclass_joins = rel->has_eclass_joins;
+
+ /*
+ * Note: we could compute appropriate attr_needed data for the child's
+ * variables, by transforming the parent's attr_needed through the
+ * translated_vars mapping. However, currently there's no need
+ * because attr_needed is only examined for base relations not
+ * otherrels. So we just leave the child's attr_needed empty.
+ */
+
+ /*
* If parallelism is allowable for this query in general, see whether
* it's allowable for this childrel in particular. But if we've
* already decided the appendrel is not parallel-safe as a whole,
* there's no point in considering parallelism for this child. For
* consistency, do this before calling set_rel_size() for the child.
+ *
+ * The aggregated relations do not use the consider_parallel flag.
*/
- if (root->glob->parallelModeOK && rel->consider_parallel)
+ if (root->glob->parallelModeOK && rel->consider_parallel &&
+ agg_kind == REL_AGG_KIND_NONE)
set_rel_consider_parallel(root, childrel, childRTE);
/*
* Compute the child's size.
*/
- set_rel_size(root, childrel, childRTindex, childRTE);
+ set_rel_size(root, childrel, childRTindex, childRTE, agg_kind);
/*
* It is possible that constraint exclusion detected a contradiction
@@ -1299,13 +1616,20 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
*/
static void
set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte)
+ Index rti, RangeTblEntry *rte, RelAggKind agg_kind)
{
int parentRTindex = rti;
List *live_childrels = NIL;
ListCell *l;
/*
+ * TODO Only allow per-child AGGSPLIT_SIMPLE if the partitioning allows
+ * it, i.e. each partition generates a distinct set of grouping keys.
+ */
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ return;
+
+ /*
* Generate access paths for each member relation, and remember the
* non-dummy children.
*/
@@ -1323,7 +1647,7 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
/* Re-locate the child RTE and RelOptInfo */
childRTindex = appinfo->child_relid;
childRTE = root->simple_rte_array[childRTindex];
- childrel = root->simple_rel_array[childRTindex];
+ childrel = find_base_rel(root, childRTindex);
/*
* If set_append_rel_size() decided the parent appendrel was
@@ -1337,7 +1661,7 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
/*
* Compute the child's access paths.
*/
- set_rel_pathlist(root, childrel, childRTindex, childRTE);
+ set_rel_pathlist(root, childrel, childRTindex, childRTE, agg_kind);
/*
* If child is dummy, ignore it.
@@ -1353,12 +1677,16 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
/*
* Child is live, so add it to the live_childrels list for use below.
+ *
+ * If we added the paths to the grouped child rel, add that grouped
+ * rel to the list instead.
*/
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ childrel = childrel->grouped->needs_final_agg;
live_childrels = lappend(live_childrels, childrel);
}
- /* Add paths to the append relation. */
- add_paths_to_append_rel(root, rel, live_childrels);
+ add_paths_to_append_rel(root, rel, live_childrels, agg_kind);
}
@@ -1375,7 +1703,7 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
*/
void
add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
- List *live_childrels)
+ List *live_childrels, RelAggKind agg_kind)
{
List *subpaths = NIL;
bool subpaths_valid = true;
@@ -1390,6 +1718,21 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
List *partitioned_rels = NIL;
bool build_partitioned_rels = false;
double partial_rows = -1;
+ RelOptInfo *rel_target;
+
+ /*
+ * TODO Only allow per-child AGGSPLIT_SIMPLE if the partitioning allows
+ * it, i.e. each partition generates a distinct set of grouping keys.
+ */
+ Assert(agg_kind != REL_AGG_KIND_SIMPLE);
+
+ /*
+ * Determine which rel add_path() should be called on.
+ */
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ rel_target = rel->grouped->needs_final_agg;
+ else
+ rel_target = rel;
/* If appropriate, consider parallel append */
pa_subpaths_valid = enable_parallel_append && rel->consider_parallel;
@@ -1609,9 +1952,10 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
* if we have zero or one live subpath due to constraint exclusion.)
*/
if (subpaths_valid)
- add_path(rel, (Path *) create_append_path(root, rel, subpaths, NIL,
- NULL, 0, false,
- partitioned_rels, -1));
+ add_path(rel_target, (Path *) create_append_path(root, rel, subpaths, NIL,
+ NULL, 0, false,
+ partitioned_rels, -1,
+ agg_kind));
/*
* Consider an append of unordered, unparameterized partial paths. Make
@@ -1654,7 +1998,7 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
appendpath = create_append_path(root, rel, NIL, partial_subpaths,
NULL, parallel_workers,
enable_parallel_append,
- partitioned_rels, -1);
+ partitioned_rels, -1, agg_kind);
/*
* Make sure any subsequent partial paths use the same row count
@@ -1703,7 +2047,8 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
appendpath = create_append_path(root, rel, pa_nonpartial_subpaths,
pa_partial_subpaths,
NULL, parallel_workers, true,
- partitioned_rels, partial_rows);
+ partitioned_rels, partial_rows,
+ agg_kind);
add_partial_path(rel, (Path *) appendpath);
}
@@ -1742,6 +2087,11 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr);
Path *subpath;
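+
+ /* Use the grouped child rels when assembling a grouped append path. */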
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ childrel = childrel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ childrel = childrel->grouped->no_final_agg;
+
if (childrel->pathlist == NIL)
{
/* failed to make a suitable path for this child */
@@ -1751,7 +2101,8 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
subpath = get_cheapest_parameterized_child_path(root,
childrel,
- required_outer);
+ required_outer,
+ agg_kind);
if (subpath == NULL)
{
/* failed to make a suitable path for this child */
@@ -1762,10 +2113,10 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
}
if (subpaths_valid)
- add_path(rel, (Path *)
+ add_path(rel_target, (Path *)
create_append_path(root, rel, subpaths, NIL,
required_outer, 0, false,
- partitioned_rels, -1));
+ partitioned_rels, -1, agg_kind));
}
}
@@ -1799,6 +2150,7 @@ generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
List *partitioned_rels)
{
ListCell *lcp;
+ PathTarget *target = NULL;
foreach(lcp, all_child_pathkeys)
{
@@ -1807,23 +2159,25 @@ generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
List *total_subpaths = NIL;
bool startup_neq_total = false;
ListCell *lcr;
+ Path *path;
/* Select the child paths for this ordering... */
foreach(lcr, live_childrels)
{
RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr);
+ List *pathlist = childrel->pathlist;
Path *cheapest_startup,
*cheapest_total;
/* Locate the right paths, if they are available. */
cheapest_startup =
- get_cheapest_path_for_pathkeys(childrel->pathlist,
+ get_cheapest_path_for_pathkeys(pathlist,
pathkeys,
NULL,
STARTUP_COST,
false);
cheapest_total =
- get_cheapest_path_for_pathkeys(childrel->pathlist,
+ get_cheapest_path_for_pathkeys(pathlist,
pathkeys,
NULL,
TOTAL_COST,
@@ -1856,19 +2210,28 @@ generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
}
/* ... and build the MergeAppend paths */
- add_path(rel, (Path *) create_merge_append_path(root,
- rel,
- startup_subpaths,
- pathkeys,
- NULL,
- partitioned_rels));
+ path = (Path *) create_merge_append_path(root,
+ rel,
+ target,
+ startup_subpaths,
+ pathkeys,
+ NULL,
+ partitioned_rels);
+
+ add_path(rel, path);
+
if (startup_neq_total)
- add_path(rel, (Path *) create_merge_append_path(root,
- rel,
- total_subpaths,
- pathkeys,
- NULL,
- partitioned_rels));
+ {
+ path = (Path *) create_merge_append_path(root,
+ rel,
+ target,
+ total_subpaths,
+ pathkeys,
+ NULL,
+ partitioned_rels);
+ add_path(rel, path);
+ }
+
}
}
@@ -1881,7 +2244,8 @@ generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
*/
static Path *
get_cheapest_parameterized_child_path(PlannerInfo *root, RelOptInfo *rel,
- Relids required_outer)
+ Relids required_outer,
+ RelAggKind agg_kind)
{
Path *cheapest;
ListCell *lc;
@@ -1928,7 +2292,8 @@ get_cheapest_parameterized_child_path(PlannerInfo *root, RelOptInfo *rel,
/* Reparameterize if needed, then recheck cost */
if (!bms_equal(PATH_REQ_OUTER(path), required_outer))
{
- path = reparameterize_path(root, path, required_outer, 1.0);
+ path = reparameterize_path(root, path, required_outer, 1.0,
+ agg_kind);
if (path == NULL)
continue; /* failed to reparameterize this one */
Assert(bms_equal(PATH_REQ_OUTER(path), required_outer));
@@ -2030,7 +2395,8 @@ set_dummy_rel_pathlist(RelOptInfo *rel)
rel->partial_pathlist = NIL;
add_path(rel, (Path *) create_append_path(NULL, rel, NIL, NIL, NULL,
- 0, false, NIL, -1));
+ 0, false, NIL, -1,
+ REL_AGG_KIND_NONE));
/*
* We set the cheapest path immediately, to ensure that IS_DUMMY_REL()
@@ -2670,11 +3036,22 @@ make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
root->initial_rels = initial_rels;
if (join_search_hook)
- return (*join_search_hook) (root, levels_needed, initial_rels);
+ return (*join_search_hook) (root, levels_needed,
+ initial_rels);
else if (enable_geqo && levels_needed >= geqo_threshold)
+ {
+ /*
+ * TODO Teach GEQO about grouped relations. Don't forget that
+ * pathlist can be NIL before set_cheapest() gets called.
+ *
+ * This processing makes no distinction between plain and grouped
+ * rels, so process them in the same loop.
+ */
return geqo(root, levels_needed, initial_rels);
+ }
else
- return standard_join_search(root, levels_needed, initial_rels);
+ return standard_join_search(root, levels_needed,
+ initial_rels);
}
}
@@ -2759,7 +3136,15 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
rel = (RelOptInfo *) lfirst(lc);
/* Create paths for partitionwise joins. */
- generate_partitionwise_join_paths(root, rel);
+ generate_partitionwise_join_paths(root, rel, REL_AGG_KIND_NONE);
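+
+ /*
+ * Create partitionwise join paths for the shadow grouped rels as
+ * well, for both 2-stage (partial) and 1-stage (simple) aggregation.
+ */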
+ if (rel->grouped)
+ {
+ generate_partitionwise_join_paths(root, rel,
+ REL_AGG_KIND_PARTIAL);
+
+ generate_partitionwise_join_paths(root, rel,
+ REL_AGG_KIND_SIMPLE);
+ }
/*
* Except for the topmost scan/join rel, consider gathering
@@ -2771,6 +3156,12 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
/* Find and save the cheapest paths for this rel */
set_cheapest(rel);
+ if (rel->grouped)
+ {
+ set_cheapest(rel->grouped->needs_final_agg);
+ if (rel->grouped->no_final_agg)
+ set_cheapest(rel->grouped->no_final_agg);
+ }
#ifdef OPTIMIZER_DEBUG
debug_print_rel(root, rel);
@@ -3409,6 +3800,7 @@ create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
{
int parallel_workers;
double pages_fetched;
+ Path *bmhpath;
/* Compute heap pages for bitmap heap scan */
pages_fetched = compute_bitmap_pages(root, rel, bitmapqual, 1.0,
@@ -3420,8 +3812,21 @@ create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
if (parallel_workers <= 0)
return;
- add_partial_path(rel, (Path *) create_bitmap_heap_path(root, rel,
- bitmapqual, rel->lateral_relids, 1.0, parallel_workers));
+ bmhpath = (Path *) create_bitmap_heap_path(root, rel, bitmapqual,
+ rel->lateral_relids, 1.0,
+ parallel_workers);
+
+ if (rel->agg_info == NULL)
+ add_partial_path(rel, bmhpath);
+ else
+ {
+ /*
+ * Only AGG_HASHED is suitable here as it does not expect the input
+ * set to be sorted.
+ */
+ create_grouped_path(root, rel, (Path *) bmhpath, false, true,
+ AGG_HASHED, REL_AGG_KIND_PARTIAL);
+ }
}
/*
@@ -3528,13 +3933,21 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages,
* generated here has a reference.
*/
void
-generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
+generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel,
+ RelAggKind agg_kind)
{
List *live_children = NIL;
int cnt_parts;
int num_parts;
RelOptInfo **part_rels;
+ /*
+ * TODO Only allow per-child AGGSPLIT_SIMPLE if the partitioning allows
+ * it, i.e. each partition generates a distinct set of grouping keys.
+ */
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ return;
+
/* Handle only join relations here. */
if (!IS_JOIN_REL(rel))
return;
@@ -3557,12 +3970,17 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
Assert(child_rel != NULL);
/* Add partitionwise join paths for partitioned child-joins. */
- generate_partitionwise_join_paths(root, child_rel);
+ generate_partitionwise_join_paths(root, child_rel, agg_kind);
/* Dummy children will not be scanned, so ignore those. */
if (IS_DUMMY_REL(child_rel))
continue;
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ child_rel = child_rel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ child_rel = child_rel->grouped->no_final_agg;
+
set_cheapest(child_rel);
#ifdef OPTIMIZER_DEBUG
@@ -3575,12 +3993,17 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
/* If all child-joins are dummy, parent join is also dummy. */
if (!live_children)
{
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ rel = rel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ rel = rel->grouped->no_final_agg;
+
mark_dummy_rel(rel);
return;
}
/* Build additional paths for this rel from child-join paths. */
- add_paths_to_append_rel(root, rel, live_children);
+ add_paths_to_append_rel(root, rel, live_children, agg_kind);
list_free(live_children);
}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index a2a7e0c520..f87a2d52ed 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -91,6 +91,7 @@
#include "optimizer/plancat.h"
#include "optimizer/planmain.h"
#include "optimizer/restrictinfo.h"
+#include "optimizer/var.h"
#include "parser/parsetree.h"
#include "utils/lsyscache.h"
#include "utils/selfuncs.h"
@@ -1068,6 +1069,17 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
*cost = path->total_cost;
*selec = ((BitmapOrPath *) path)->bitmapselectivity;
}
+ else if (IsA(path, AggPath))
+ {
+ /*
+ * If partial aggregation was already applied, use only the input
+ * path.
+ *
+ * TODO Take the aggregation into account, both its cost and its effect
+ * on selectivity (i.e. how it reduces the number of rows).
+ */
+ cost_bitmap_tree_node(((AggPath *) path)->subpath, cost, selec);
+ }
else
{
elog(ERROR, "unrecognized node type: %d", nodeTag(path));
@@ -2290,6 +2302,41 @@ cost_group(Path *path, PlannerInfo *root,
path->total_cost = total_cost;
}
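+/*
+ * Set the row estimate of a join path. For a grouped join (agg_info is
+ * not NULL), the estimate is the number of groups rather than the number
+ * of joined rows.
+ */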
+static void
+estimate_join_rows(PlannerInfo *root, Path *path, RelAggInfo *agg_info)
+{
+ bool grouped = agg_info != NULL;
+
+ if (path->param_info)
+ {
+ double nrows;
+
+ path->rows = path->param_info->ppi_rows;
+ if (grouped)
+ {
+ nrows = estimate_num_groups(root, agg_info->group_exprs,
+ path->rows, NULL);
+ path->rows = clamp_row_est(nrows);
+ }
+ }
+ else
+ {
+ if (!grouped)
+ path->rows = path->parent->rows;
+ else
+ {
+ /*
+ * XXX agg_info->rows is an estimate of the output rows if we join
+ * the non-grouped rels and aggregate the output. However, the
+ * figure can be different if an already grouped rel is joined to a
+ * non-grouped one. Is this worth adding a new field to
+ * agg_info?
+ */
+ path->rows = agg_info->rows;
+ }
+ }
+}
+
/*
* initial_cost_nestloop
* Preliminary estimate of the cost of a nestloop join path.
@@ -2411,10 +2458,7 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
inner_path_rows = 1;
/* Mark the path with the correct row estimate */
- if (path->path.param_info)
- path->path.rows = path->path.param_info->ppi_rows;
- else
- path->path.rows = path->path.parent->rows;
+ estimate_join_rows(root, (Path *) path, path->path.parent->agg_info);
/* For partial paths, scale row estimate. */
if (path->path.parallel_workers > 0)
@@ -2857,10 +2901,8 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
inner_path_rows = 1;
/* Mark the path with the correct row estimate */
- if (path->jpath.path.param_info)
- path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
- else
- path->jpath.path.rows = path->jpath.path.parent->rows;
+ estimate_join_rows(root, (Path *) path,
+ path->jpath.path.parent->agg_info);
/* For partial paths, scale row estimate. */
if (path->jpath.path.parallel_workers > 0)
@@ -3282,10 +3324,8 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
ListCell *hcl;
/* Mark the path with the correct row estimate */
- if (path->jpath.path.param_info)
- path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
- else
- path->jpath.path.rows = path->jpath.path.parent->rows;
+ estimate_join_rows(root, (Path *) path,
+ path->jpath.path.parent->agg_info);
/* For partial paths, scale row estimate. */
if (path->jpath.path.parallel_workers > 0)
@@ -3808,8 +3848,9 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
* estimated execution cost given by pg_proc.procost (remember to multiply
* this by cpu_operator_cost).
*
- * Vars and Consts are charged zero, and so are boolean operators (AND,
- * OR, NOT). Simplistic, but a lot better than no model at all.
+ * Vars, GroupedVars and Consts are charged zero, and so are boolean
+ * operators (AND, OR, NOT). Simplistic, but a lot better than no model at
+ * all.
*
* Should we try to account for the possibility of short-circuit
* evaluation of AND/OR? Probably *not*, because that would make the
@@ -4290,11 +4331,13 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
* restriction clauses).
* width: the estimated average output tuple width in bytes.
* baserestrictcost: estimated cost of evaluating baserestrictinfo clauses.
+ * If the rel is grouped (rel->agg_info is set), agg_info->rows is set
+ * as well.
*/
void
set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
{
double nrows;
+ bool grouped = rel->agg_info != NULL;
/* Should only be applied to base relations */
Assert(rel->relid > 0);
@@ -4305,12 +4348,31 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
0,
JOIN_INNER,
NULL);
-
rel->rows = clamp_row_est(nrows);
+ /*
+ * Grouping changes the number of output rows, so estimate the number of
+ * groups as well.
+ */
+ if (grouped)
+ {
+ nrows = estimate_num_groups(root,
+ rel->agg_info->group_exprs, nrows,
+ NULL);
+ rel->agg_info->rows = clamp_row_est(nrows);
+ }
+
cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo, root);
- set_rel_width(root, rel);
+ /*
+ * The grouped target has its cost and width set immediately on
+ * creation; see create_rel_agg_info().
+ */
+ if (!grouped)
+ set_rel_width(root, rel);
+#ifdef USE_ASSERT_CHECKING
+ else
+ Assert(rel->reltarget->width > 0);
+#endif
}
/*
@@ -4378,12 +4440,23 @@ set_joinrel_size_estimates(PlannerInfo *root, RelOptInfo *rel,
SpecialJoinInfo *sjinfo,
List *restrictlist)
{
+ double outer_rows,
+ inner_rows;
+
+ /*
+ * Take grouping of the input rels into account.
+ */
+ outer_rows = outer_rel->agg_info ? outer_rel->agg_info->rows :
+ outer_rel->rows;
+ inner_rows = inner_rel->agg_info ? inner_rel->agg_info->rows :
+ inner_rel->rows;
+
rel->rows = calc_joinrel_size_estimate(root,
rel,
outer_rel,
inner_rel,
- outer_rel->rows,
- inner_rel->rows,
+ outer_rows,
+ inner_rows,
sjinfo,
restrictlist);
}
@@ -5260,11 +5333,11 @@ set_pathtarget_cost_width(PlannerInfo *root, PathTarget *target)
foreach(lc, target->exprs)
{
Node *node = (Node *) lfirst(lc);
+ int32 item_width;
if (IsA(node, Var))
{
Var *var = (Var *) node;
- int32 item_width;
/* We should not see any upper-level Vars here */
Assert(var->varlevelsup == 0);
@@ -5295,6 +5368,33 @@ set_pathtarget_cost_width(PlannerInfo *root, PathTarget *target)
Assert(item_width > 0);
tuple_width += item_width;
}
+ else if (IsA(node, GroupedVar))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+ Node *expr;
+
+ /*
+ * A GroupedVar is evaluated either by an AggPath (if it's an
+ * aggregate) or by the AggPath's input path (if it's a generic
+ * grouping expression). In the other cases the GroupedVar we see
+ * here only bubbled up from a lower AggPath, so it does not add any
+ * cost to the path that owns this target.
+ *
+ * XXX Is the value worth caching in GroupedVar?
+ */
+ if (gvar->agg_partial != NULL)
+ {
+ Assert(IsA(gvar->gvexpr, Aggref));
+
+ expr = (Node *) gvar->agg_partial;
+ }
+ else
+ expr = (Node *) gvar->gvexpr;
+
+ item_width = get_typavgwidth(exprType(expr), exprTypmod(expr));
+ Assert(item_width > 0);
+ tuple_width += item_width;
+ }
else
{
/*
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b22b36ec0e..921e6f405b 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -65,6 +65,19 @@ static bool reconsider_outer_join_clause(PlannerInfo *root,
static bool reconsider_full_join_clause(PlannerInfo *root,
RestrictInfo *rinfo);
+typedef struct translate_expr_context
+{
+ Var **keys; /* Dictionary keys. */
+ Var **values; /* Dictionary values */
+ int nitems; /* Number of dictionary items. */
+ Relids *gv_eval_at_p; /* See GroupedVarInfo. */
+ Index relid; /* Translate into this relid. */
+} translate_expr_context;
+
+static Node *translate_expression_to_rels_mutator(Node *node,
+ translate_expr_context *context);
+static int var_dictionary_comparator(const void *a, const void *b);
+
/*
* process_equivalence
@@ -2511,3 +2524,329 @@ is_redundant_derived_clause(RestrictInfo *rinfo, List *clauselist)
return false;
}
+
+/*
+ * translate_expression_to_rels
+ * If the appropriate equivalence classes exist, replace vars in
+ * gvi->gvexpr with vars whose varno is equal to relid. Returns the
+ * translated GroupedVarInfo, or NULL if translation is not possible.
+ */
+GroupedVarInfo *
+translate_expression_to_rels(PlannerInfo *root, GroupedVarInfo *gvi,
+ Index relid)
+{
+ List *vars;
+ ListCell *l1;
+ int i,
+ j;
+ int nkeys,
+ nkeys_resolved;
+ Var **keys,
+ **values,
+ **keys_tmp;
+ Var *key,
+ *key_prev;
+ translate_expr_context context;
+ GroupedVarInfo *result;
+
+ /* Can't do anything w/o equivalence classes. */
+ if (root->eq_classes == NIL)
+ return NULL;
+
+ /*
+ * Before actually trying to modify the expression tree, find out if all
+ * vars can be translated.
+ */
+ vars = pull_var_clause((Node *) gvi->gvexpr, PVC_RECURSE_AGGREGATES);
+
+ /* No vars to translate? */
+ if (vars == NIL)
+ return NULL;
+
+ /*
+ * Both the search for individual replacement vars and the actual
+ * expression translation will be more efficient if we use a dictionary
+ * whose keys (i.e. the "source vars") are unique and sorted.
+ */
+ nkeys = list_length(vars);
+ keys = (Var **) palloc(nkeys * sizeof(Var *));
+ i = 0;
+ foreach(l1, vars)
+ {
+ key = lfirst_node(Var, l1);
+ keys[i++] = key;
+ }
+
+ /*
+ * Sort the keys by varno, with varattno as the tie-breaker.
+ */
+ if (nkeys > 1)
+ pg_qsort(keys, nkeys, sizeof(Var *), var_dictionary_comparator);
+
+ /*
+ * Keep only unique keys, and drop the vars that need no translation.
+ */
+ keys_tmp = (Var **) palloc(nkeys * sizeof(Var *));
+ key_prev = NULL;
+ j = 0;
+ for (i = 0; i < nkeys; i++)
+ {
+ key = keys[i];
+
+ /* Keys are sorted, so a key is new if it differs from its predecessor. */
+ if ((key_prev == NULL || key->varno != key_prev->varno ||
+ key->varattno != key_prev->varattno) &&
+ key->varno != relid)
+ keys_tmp[j++] = key;
+
+ key_prev = key;
+ }
+ pfree(keys);
+ keys = keys_tmp;
+ nkeys = j;
+
+ /*
+ * Is there actually nothing to be translated?
+ */
+ if (nkeys == 0)
+ {
+ pfree(keys);
+ return NULL;
+ }
+
+ nkeys_resolved = 0;
+
+ /*
+ * Find the replacement vars.
+ */
+ values = (Var **) palloc0(nkeys * sizeof(Var *));
+ foreach(l1, root->eq_classes)
+ {
+ EquivalenceClass *ec = lfirst_node(EquivalenceClass, l1);
+ Relids ec_var_relids;
+ Var **ec_vars;
+ int ec_nvars;
+ ListCell *l2;
+
+ /* TODO Re-check if any other EC kind should be ignored. */
+ if (ec->ec_has_volatile || ec->ec_below_outer_join || ec->ec_broken)
+ continue;
+
+ /* Single-element EC can hardly help in translations. */
+ if (list_length(ec->ec_members) == 1)
+ continue;
+
+ /*
+ * Collect all vars of this EC and their varnos.
+ *
+ * ec->ec_relids does not help because we're only interested in a
+ * subset of EC members.
+ */
+ ec_vars = (Var **) palloc(list_length(ec->ec_members) * sizeof(Var *));
+ ec_nvars = 0;
+ ec_var_relids = NULL;
+ foreach(l2, ec->ec_members)
+ {
+ EquivalenceMember *em = lfirst_node(EquivalenceMember, l2);
+ Var *ec_var;
+
+ if (!IsA(em->em_expr, Var))
+ continue;
+
+ ec_var = castNode(Var, em->em_expr);
+ ec_vars[ec_nvars++] = ec_var;
+ ec_var_relids = bms_add_member(ec_var_relids, ec_var->varno);
+ }
+
+ /*
+ * At least two vars are needed so that the EC is usable for
+ * translation.
+ */
+ if (ec_nvars <= 1)
+ {
+ pfree(ec_vars);
+ bms_free(ec_var_relids);
+ continue;
+ }
+
+ /*
+ * Now check where this EC can help.
+ */
+ for (i = 0; i < nkeys; i++)
+ {
+ Relids ec_rest;
+ bool relid_ok,
+ key_found;
+ Var *key = keys[i];
+ Var *value = values[i];
+
+ /* Skip this item if it's already resolved. */
+ if (value != NULL)
+ continue;
+
+ /*
+ * Can't translate if the EC does not mention key->varno.
+ */
+ if (!bms_is_member(key->varno, ec_var_relids))
+ continue;
+
+ /*
+ * Besides key, at least one EC member must belong to the relation
+ * we're translating our expression to.
+ */
+ ec_rest = bms_copy(ec_var_relids);
+ ec_rest = bms_del_member(ec_rest, key->varno);
+ relid_ok = bms_is_member(relid, ec_rest);
+ bms_free(ec_rest);
+ if (!relid_ok)
+ continue;
+
+ /*
+ * The preliminary checks passed, so try to find the exact vars.
+ */
+ key_found = false;
+ for (j = 0; j < ec_nvars; j++)
+ {
+ Var *ec_var = ec_vars[j];
+
+ if (!key_found && key->varno == ec_var->varno &&
+ key->varattno == ec_var->varattno)
+ key_found = true;
+
+ /*
+ * Is this Var useful for our dictionary?
+ *
+ * XXX Shouldn't ec_var be copied?
+ */
+ if (value == NULL && ec_var->varno == relid)
+ value = ec_var;
+
+ if (key_found && value != NULL)
+ break;
+ }
+
+ /*
+ * The replacement Var must have the same data type, otherwise the
+ * values are not guaranteed to be grouped in the same way as
+ * values of the original Var.
+ */
+ if (key_found && value != NULL &&
+ key->vartype == value->vartype)
+ {
+ values[i] = value;
+ nkeys_resolved++;
+
+ if (nkeys_resolved == nkeys)
+ break;
+ }
+ }
+
+ pfree(ec_vars);
+ bms_free(ec_var_relids);
+
+ /* Don't need to check the remaining ECs? */
+ if (nkeys_resolved == nkeys)
+ break;
+ }
+
+ /* Couldn't compose usable dictionary? */
+ if (nkeys_resolved < nkeys)
+ {
+ pfree(keys);
+ pfree(values);
+ return NULL;
+ }
+
+ result = makeNode(GroupedVarInfo);
+ memcpy(result, gvi, sizeof(GroupedVarInfo));
+
+ /*
+ * translate_expression_to_rels_mutator updates gv_eval_at.
+ */
+ result->gv_eval_at = bms_copy(result->gv_eval_at);
+
+ /* The dictionary is ready, so perform the translation. */
+ context.keys = keys;
+ context.values = values;
+ context.nitems = nkeys;
+ context.gv_eval_at_p = &result->gv_eval_at;
+ context.relid = relid;
+ result->gvexpr = (Expr *)
+ translate_expression_to_rels_mutator((Node *) gvi->gvexpr, &context);
+ result->derived = true;
+
+ pfree(keys);
+ pfree(values);
+ return result;
+}
+
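+/*
+ * Replace each Var in the expression according to the dictionary in the
+ * context, updating gv_eval_at as varnos change.
+ */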
+static Node *
+translate_expression_to_rels_mutator(Node *node,
+ translate_expr_context *context)
+{
+ if (node == NULL)
+ return NULL;
+
+ if (IsA(node, Var))
+ {
+ Var *var = castNode(Var, node);
+ Var **key_p;
+ Var *value;
+ int index;
+
+ /*
+ * Simply return the existing variable if it already belongs to the
+ * relation we're adjusting the expression to.
+ */
+ if (var->varno == context->relid)
+ return (Node *) var;
+
+ key_p = bsearch(&var, context->keys, context->nitems, sizeof(Var *),
+ var_dictionary_comparator);
+
+ /* We shouldn't have omitted any var from the dictionary. */
+ Assert(key_p != NULL);
+
+ index = key_p - context->keys;
+ Assert(index >= 0 && index < context->nitems);
+ value = context->values[index];
+
+ /* All values should be present in the dictionary. */
+ Assert(value != NULL);
+
+ /* Update gv_eval_at accordingly. */
+ *context->gv_eval_at_p = bms_del_member(*context->gv_eval_at_p,
+ var->varno);
+ *context->gv_eval_at_p = bms_add_member(*context->gv_eval_at_p,
+ value->varno);
+
+ return (Node *) value;
+ }
+
+ return expression_tree_mutator(node, translate_expression_to_rels_mutator,
+ (void *) context);
+}
+
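+/*
+ * qsort/bsearch comparator: order Vars by (varno, varattno).
+ */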
+static int
+var_dictionary_comparator(const void *a, const void *b)
+{
+ Var **var1_p,
+ **var2_p;
+ Var *var1,
+ *var2;
+
+ var1_p = (Var **) a;
+ var1 = castNode(Var, *var1_p);
+ var2_p = (Var **) b;
+ var2 = castNode(Var, *var2_p);
+
+ if (var1->varno < var2->varno)
+ return -1;
+ else if (var1->varno > var2->varno)
+ return 1;
+
+ if (var1->varattno < var2->varattno)
+ return -1;
+ else if (var1->varattno > var2->varattno)
+ return 1;
+
+ return 0;
+}
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index f295558f76..43e638d53a 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -32,6 +32,7 @@
#include "optimizer/predtest.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+#include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
@@ -76,13 +77,13 @@ typedef struct
int indexcol; /* index column we want to match to */
} ec_member_matches_arg;
-
static void consider_index_join_clauses(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index,
IndexClauseSet *rclauseset,
IndexClauseSet *jclauseset,
IndexClauseSet *eclauseset,
- List **bitindexpaths);
+ List **bitindexpaths,
+ RelAggKind agg_kind);
static void consider_index_join_outer_rels(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index,
IndexClauseSet *rclauseset,
@@ -91,7 +92,8 @@ static void consider_index_join_outer_rels(PlannerInfo *root, RelOptInfo *rel,
List **bitindexpaths,
List *indexjoinclauses,
int considered_clauses,
- List **considered_relids);
+ List **considered_relids,
+ RelAggKind agg_kind);
static void get_join_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index,
IndexClauseSet *rclauseset,
@@ -99,23 +101,28 @@ static void get_join_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexClauseSet *eclauseset,
List **bitindexpaths,
Relids relids,
- List **considered_relids);
+ List **considered_relids,
+ RelAggKind agg_kind);
static bool eclass_already_used(EquivalenceClass *parent_ec, Relids oldrelids,
List *indexjoinclauses);
static bool bms_equal_any(Relids relids, List *relids_list);
static void get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
- List **bitindexpaths);
+ List **bitindexpaths,
+ RelAggKind agg_kind);
static List *build_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
- bool *skip_lower_saop);
+ bool *skip_lower_saop,
+ RelAggKind agg_kind);
static List *build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
- List *clauses, List *other_clauses);
+ List *clauses, List *other_clauses,
+ RelAggKind agg_kind);
static List *generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
- List *clauses, List *other_clauses);
+ List *clauses, List *other_clauses,
+ RelAggKind agg_kind);
static Path *choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel,
List *paths);
static int path_usage_comparator(const void *a, const void *b);
@@ -225,9 +232,11 @@ static Const *string_to_const(const char *str, Oid datatype);
* index quals ... but for now, it doesn't seem worth troubling over.
* In particular, comments below about "unparameterized" paths should be read
* as meaning "unparameterized so far as the indexquals are concerned".
+ *
+ * If agg_kind is not REL_AGG_KIND_NONE, grouped paths are generated too.
*/
void
-create_index_paths(PlannerInfo *root, RelOptInfo *rel)
+create_index_paths(PlannerInfo *root, RelOptInfo *rel, RelAggKind agg_kind)
{
List *indexpaths;
List *bitindexpaths;
@@ -272,8 +281,8 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel)
* non-parameterized paths. Plain paths go directly to add_path(),
* bitmap paths are added to bitindexpaths to be handled below.
*/
- get_index_paths(root, rel, index, &rclauseset,
- &bitindexpaths);
+ get_index_paths(root, rel, index, &rclauseset, &bitindexpaths,
+ agg_kind);
/*
* Identify the join clauses that can match the index. For the moment
@@ -302,15 +311,25 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel)
&rclauseset,
&jclauseset,
&eclauseset,
- &bitjoinpaths);
+ &bitjoinpaths,
+ agg_kind);
}
+
+ /*
+ * Bitmap paths are currently not aggregated: AggPath does not accept the
+ * TID bitmap as input, and even if it did, it'd seem weird to aggregate
+ * the individual paths and then AND them together.
+ */
+ if (rel->agg_info != NULL)
+ return;
+
/*
* Generate BitmapOrPaths for any suitable OR-clauses present in the
* restriction list. Add these to bitindexpaths.
*/
- indexpaths = generate_bitmap_or_paths(root, rel,
- rel->baserestrictinfo, NIL);
+ indexpaths = generate_bitmap_or_paths(root, rel, rel->baserestrictinfo,
+ NIL, agg_kind);
bitindexpaths = list_concat(bitindexpaths, indexpaths);
/*
@@ -318,7 +337,8 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel)
* the joinclause list. Add these to bitjoinpaths.
*/
indexpaths = generate_bitmap_or_paths(root, rel,
- joinorclauses, rel->baserestrictinfo);
+ joinorclauses, rel->baserestrictinfo,
+ agg_kind);
bitjoinpaths = list_concat(bitjoinpaths, indexpaths);
/*
@@ -439,7 +459,8 @@ consider_index_join_clauses(PlannerInfo *root, RelOptInfo *rel,
IndexClauseSet *rclauseset,
IndexClauseSet *jclauseset,
IndexClauseSet *eclauseset,
- List **bitindexpaths)
+ List **bitindexpaths,
+ RelAggKind agg_kind)
{
int considered_clauses = 0;
List *considered_relids = NIL;
@@ -475,7 +496,8 @@ consider_index_join_clauses(PlannerInfo *root, RelOptInfo *rel,
bitindexpaths,
jclauseset->indexclauses[indexcol],
considered_clauses,
- &considered_relids);
+ &considered_relids,
+ agg_kind);
/* Consider each applicable eclass join clause */
considered_clauses += list_length(eclauseset->indexclauses[indexcol]);
consider_index_join_outer_rels(root, rel, index,
@@ -483,7 +505,8 @@ consider_index_join_clauses(PlannerInfo *root, RelOptInfo *rel,
bitindexpaths,
eclauseset->indexclauses[indexcol],
considered_clauses,
- &considered_relids);
+ &considered_relids,
+ agg_kind);
}
}
@@ -508,7 +531,8 @@ consider_index_join_outer_rels(PlannerInfo *root, RelOptInfo *rel,
List **bitindexpaths,
List *indexjoinclauses,
int considered_clauses,
- List **considered_relids)
+ List **considered_relids,
+ RelAggKind agg_kind)
{
ListCell *lc;
@@ -575,7 +599,8 @@ consider_index_join_outer_rels(PlannerInfo *root, RelOptInfo *rel,
rclauseset, jclauseset, eclauseset,
bitindexpaths,
bms_union(clause_relids, oldrelids),
- considered_relids);
+ considered_relids,
+ agg_kind);
}
/* Also try this set of relids by itself */
@@ -583,7 +608,8 @@ consider_index_join_outer_rels(PlannerInfo *root, RelOptInfo *rel,
rclauseset, jclauseset, eclauseset,
bitindexpaths,
clause_relids,
- considered_relids);
+ considered_relids,
+ agg_kind);
}
}
@@ -608,7 +634,8 @@ get_join_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexClauseSet *eclauseset,
List **bitindexpaths,
Relids relids,
- List **considered_relids)
+ List **considered_relids,
+ RelAggKind agg_kind)
{
IndexClauseSet clauseset;
int indexcol;
@@ -665,7 +692,8 @@ get_join_index_paths(PlannerInfo *root, RelOptInfo *rel,
Assert(clauseset.nonempty);
/* Build index path(s) using the collected set of clauses */
- get_index_paths(root, rel, index, &clauseset, bitindexpaths);
+ get_index_paths(root, rel, index, &clauseset, bitindexpaths,
+ agg_kind);
/*
* Remember we considered paths for this set of relids. We use lcons not
@@ -715,7 +743,6 @@ bms_equal_any(Relids relids, List *relids_list)
return false;
}
-
/*
* get_index_paths
* Given an index and a set of index clauses for it, construct IndexPaths.
@@ -734,7 +761,7 @@ bms_equal_any(Relids relids, List *relids_list)
static void
get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
- List **bitindexpaths)
+ List **bitindexpaths, RelAggKind agg_kind)
{
List *indexpaths;
bool skip_nonnative_saop = false;
@@ -746,18 +773,26 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
* clauses only if the index AM supports them natively, and skip any such
* clauses for index columns after the first (so that we produce ordered
* paths if possible).
+ *
+ * These paths are good candidates for AGG_SORTED because they can
+ * produce sorted output; AGG_HASHED should be applied to paths with no
+ * pathkeys.
*/
indexpaths = build_index_paths(root, rel,
index, clauses,
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
- &skip_lower_saop);
+ &skip_lower_saop,
+ agg_kind);
/*
* If we skipped any lower-order ScalarArrayOpExprs on an index with an AM
* that supports them, then try again including those clauses. This will
* produce paths with more selectivity but no ordering.
+ *
+ * As for the grouping paths, only AGG_HASHED is considered due to the
+ * missing ordering.
*/
if (skip_lower_saop)
{
@@ -767,7 +802,8 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
- NULL));
+ NULL,
+ agg_kind));
}
/*
@@ -799,6 +835,9 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
* If there were ScalarArrayOpExpr clauses that the index can't handle
* natively, generate bitmap scan paths relying on executor-managed
* ScalarArrayOpExpr.
+ *
+ * As for grouping, only AGG_HASHED is possible here, again because
+ * there's no ordering.
*/
if (skip_nonnative_saop)
{
@@ -807,7 +846,8 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
false,
ST_BITMAPSCAN,
NULL,
- NULL);
+ NULL,
+ agg_kind);
*bitindexpaths = list_concat(*bitindexpaths, indexpaths);
}
}
@@ -845,13 +885,18 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
* NULL, we do not ignore non-first ScalarArrayOpExpr clauses, but they will
* result in considering the scan's output to be unordered.
*
+ * If 'agg_kind' is not REL_AGG_KIND_NONE, AGG_SORTED and / or AGG_HASHED
+ * aggregation is applied to the index path (as long as the index path is
+ * appropriate) and the resulting grouped path is returned instead.
+ *
* 'rel' is the index's heap relation
* 'index' is the index for which we want to generate paths
* 'clauses' is the collection of indexable clauses (RestrictInfo nodes)
* 'useful_predicate' indicates whether the index has a useful predicate
* 'scantype' indicates whether we need plain or bitmap scan support
* 'skip_nonnative_saop' indicates whether to accept SAOP if index AM doesn't
- * 'skip_lower_saop' indicates whether to accept non-first-column SAOP
+ * 'skip_lower_saop' indicates whether to accept non-first-column SAOP
+ * 'agg_kind' indicates whether (and how) grouped paths should be built
*/
static List *
build_index_paths(PlannerInfo *root, RelOptInfo *rel,
@@ -859,7 +904,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
- bool *skip_lower_saop)
+ bool *skip_lower_saop,
+ RelAggKind agg_kind)
{
List *result = NIL;
IndexPath *ipath;
@@ -876,6 +922,12 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
bool index_is_ordered;
bool index_only_scan;
int indexcol;
+ bool grouped;
+ bool can_agg_sorted,
+ can_agg_hashed;
+ AggPath *agg_path;
+
+ grouped = rel->agg_info != NULL;
/*
* Check that index supports the desired scan type(s)
@@ -1029,7 +1081,12 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
* in the current clauses, OR the index ordering is potentially useful for
* later merging or final output ordering, OR the index has a useful
* predicate, OR an index-only scan is possible.
+ *
+ * This is where grouped paths start to be considered.
*/
+ can_agg_sorted = true;
+ can_agg_hashed = true;
+
if (index_clauses != NIL || useful_pathkeys != NIL || useful_predicate ||
index_only_scan)
{
@@ -1046,7 +1103,65 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
outer_relids,
loop_count,
false);
- result = lappend(result, ipath);
+
+ if (!grouped)
+ result = lappend(result, ipath);
+ else
+ {
+ /*
+ * Try to create the grouped paths if caller is interested in
+ * them.
+ */
+ if (useful_pathkeys != NIL)
+ {
+ agg_path = create_agg_sorted_path(root,
+ (Path *) ipath,
+ true,
+ ipath->path.rows,
+ agg_kind);
+
+ if (agg_path != NULL)
+ result = lappend(result, agg_path);
+ else
+ {
+ /*
+ * If ipath could not be used as a source for AGG_SORTED
+ * partial aggregation, it probably does not have the
+ * appropriate pathkeys. Avoid trying to apply AGG_SORTED
+ * to the next index paths because those will have the
+ * same pathkeys.
+ */
+ can_agg_sorted = false;
+ }
+ }
+ else
+ can_agg_sorted = false;
+
+ /*
+ * Hashed aggregation should not be parameterized: the cost of
+ * repeatedly rebuilding the hash table (for different parameter
+ * values) is probably not worth it.
+ */
+ if (outer_relids == NULL)
+ {
+ agg_path = create_agg_hashed_path(root,
+ (Path *) ipath,
+ ipath->path.rows,
+ agg_kind);
+
+ if (agg_path != NULL)
+ result = lappend(result, agg_path);
+ else
+ {
+ /*
+ * If ipath could not be used as a source for AGG_HASHED,
+ * we should not expect any other path of the same index
+ * to succeed. Avoid wasting the effort next time.
+ */
+ can_agg_hashed = false;
+ }
+ }
+ }
/*
* If appropriate, consider parallel index scan. We don't allow
@@ -1075,7 +1190,48 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
* parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
- add_partial_path(rel, (Path *) ipath);
+ {
+ if (!grouped)
+ add_partial_path(rel, (Path *) ipath);
+ else
+ {
+ if (useful_pathkeys != NIL && can_agg_sorted)
+ {
+ /*
+ * No need to check the pathkeys again.
+ */
+ agg_path = create_agg_sorted_path(root,
+ (Path *) ipath,
+ false,
+ ipath->path.rows,
+ agg_kind);
+
+ /*
+ * If create_agg_sorted_path succeeded once, it should
+ * always succeed.
+ */
+ Assert(agg_path != NULL);
+
+ add_partial_path(rel, (Path *) agg_path);
+ }
+
+ if (can_agg_hashed && outer_relids == NULL)
+ {
+ agg_path = create_agg_hashed_path(root,
+ (Path *) ipath,
+ ipath->path.rows,
+ agg_kind);
+
+ /*
+ * If create_agg_hashed_path succeeded once, it should
+ * always do.
+ */
+ Assert(agg_path != NULL);
+
+ add_partial_path(rel, (Path *) agg_path);
+ }
+ }
+ }
else
pfree(ipath);
}
@@ -1103,7 +1259,33 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
outer_relids,
loop_count,
false);
- result = lappend(result, ipath);
+
+ if (!grouped)
+ result = lappend(result, ipath);
+ else
+ {
+ /*
+ * As the input set ordering does not matter to AGG_HASHED,
+ * only AGG_SORTED makes sense here. (The AGG_HASHED path we'd
+ * create here should already exist.)
+ *
+ * The existing value of can_agg_sorted is not up-to-date for
+ * the new pathkeys.
+ */
+ can_agg_sorted = true;
+
+ /* pathkeys are new, so check them. */
+ agg_path = create_agg_sorted_path(root,
+ (Path *) ipath,
+ true,
+ ipath->path.rows,
+ agg_kind);
+
+ if (agg_path != NULL)
+ result = lappend(result, agg_path);
+ else
+ can_agg_sorted = false;
+ }
/* If appropriate, consider parallel index scan */
if (index->amcanparallel &&
@@ -1127,7 +1309,27 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
* using parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
- add_partial_path(rel, (Path *) ipath);
+ {
+ if (!grouped)
+ add_partial_path(rel, (Path *) ipath);
+ else
+ {
+ if (can_agg_sorted)
+ {
+ /*
+ * The non-partial path above should have been
+ * created, so no need to check pathkeys.
+ */
+ agg_path = create_agg_sorted_path(root,
+ (Path *) ipath,
+ false,
+ ipath->path.rows,
+ agg_kind);
+ Assert(agg_path != NULL);
+ add_partial_path(rel, (Path *) agg_path);
+ }
+ }
+ }
else
pfree(ipath);
}
@@ -1162,10 +1364,12 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
* 'rel' is the relation for which we want to generate index paths
* 'clauses' is the current list of clauses (RestrictInfo nodes)
* 'other_clauses' is the list of additional upper-level clauses
+ * 'agg_kind' indicates whether (and how) grouped paths should be created.
*/
static List *
build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
- List *clauses, List *other_clauses)
+ List *clauses, List *other_clauses,
+ RelAggKind agg_kind)
{
List *result = NIL;
List *all_clauses = NIL; /* not computed till needed */
@@ -1235,14 +1439,16 @@ build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
match_clauses_to_index(index, other_clauses, &clauseset);
/*
- * Construct paths if possible.
+ * Construct paths if possible. Forbid partial aggregation even if the
+ * relation is grouped --- it'll be applied to the bitmap heap path.
*/
indexpaths = build_index_paths(root, rel,
index, &clauseset,
useful_predicate,
ST_BITMAPSCAN,
NULL,
- NULL);
+ NULL,
+ agg_kind);
result = list_concat(result, indexpaths);
}
@@ -1261,7 +1467,8 @@ build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
*/
static List *
generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
- List *clauses, List *other_clauses)
+ List *clauses, List *other_clauses,
+ RelAggKind agg_kind)
{
List *result = NIL;
List *all_clauses;
@@ -1301,13 +1508,15 @@ generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
indlist = build_paths_for_OR(root, rel,
andargs,
- all_clauses);
+ all_clauses,
+ agg_kind);
/* Recurse in case there are sub-ORs */
indlist = list_concat(indlist,
generate_bitmap_or_paths(root, rel,
andargs,
- all_clauses));
+ all_clauses,
+ agg_kind));
}
else
{
@@ -1319,7 +1528,8 @@ generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
indlist = build_paths_for_OR(root, rel,
orargs,
- all_clauses);
+ all_clauses,
+ agg_kind);
}
/*
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 642f951093..d5880e42a0 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -48,29 +48,38 @@ static void try_partial_mergejoin_path(PlannerInfo *root,
List *outersortkeys,
List *innersortkeys,
JoinType jointype,
- JoinPathExtraData *extra);
+ JoinPathExtraData *extra,
+ RelAggKind agg_kind,
+ bool do_aggregate);
static void sort_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
- JoinType jointype, JoinPathExtraData *extra);
+ JoinType jointype, JoinPathExtraData *extra,
+ RelAggKind agg_kind, bool do_aggregate);
static void match_unsorted_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
- JoinType jointype, JoinPathExtraData *extra);
+ JoinType jointype, JoinPathExtraData *extra,
+ RelAggKind agg_kind, bool do_aggregate);
static void consider_parallel_nestloop(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
- JoinPathExtraData *extra);
+ JoinPathExtraData *extra,
+ RelAggKind agg_kind,
+ bool do_aggregate);
static void consider_parallel_mergejoin(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
- Path *inner_cheapest_total);
+ Path *inner_cheapest_total,
+ RelAggKind agg_kind,
+ bool do_aggregate);
static void hash_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
- JoinType jointype, JoinPathExtraData *extra);
+ JoinType jointype, JoinPathExtraData *extra,
+ RelAggKind agg_kind, bool do_aggregate);
static List *select_mergejoin_clauses(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
@@ -87,7 +96,9 @@ static void generate_mergejoin_paths(PlannerInfo *root,
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
- bool is_partial);
+ bool is_partial,
+ RelAggKind agg_kind,
+ bool do_aggregate);
/*
@@ -112,6 +123,9 @@ static void generate_mergejoin_paths(PlannerInfo *root,
* however. Path cost estimation code may need to recognize that it's
* dealing with such a case --- the combination of nominal jointype INNER
* with sjinfo->jointype == JOIN_SEMI indicates that.
+ *
+ * 'agg_kind' and 'do_aggregate' tell whether, and how, partial aggregation
+ * should be applied to the join paths.
*/
void
add_paths_to_joinrel(PlannerInfo *root,
@@ -120,7 +134,9 @@ add_paths_to_joinrel(PlannerInfo *root,
RelOptInfo *innerrel,
JoinType jointype,
SpecialJoinInfo *sjinfo,
- List *restrictlist)
+ List *restrictlist,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
JoinPathExtraData extra;
bool mergejoin_allowed = true;
@@ -267,7 +283,7 @@ add_paths_to_joinrel(PlannerInfo *root,
*/
if (mergejoin_allowed)
sort_inner_and_outer(root, joinrel, outerrel, innerrel,
- jointype, &extra);
+ jointype, &extra, agg_kind, do_aggregate);
/*
* 2. Consider paths where the outer relation need not be explicitly
@@ -278,7 +294,7 @@ add_paths_to_joinrel(PlannerInfo *root,
*/
if (mergejoin_allowed)
match_unsorted_outer(root, joinrel, outerrel, innerrel,
- jointype, &extra);
+ jointype, &extra, agg_kind, do_aggregate);
#ifdef NOT_USED
@@ -305,7 +321,7 @@ add_paths_to_joinrel(PlannerInfo *root,
*/
if (enable_hashjoin || jointype == JOIN_FULL)
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
- jointype, &extra);
+ jointype, &extra, agg_kind, do_aggregate);
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
@@ -366,7 +382,9 @@ try_nestloop_path(PlannerInfo *root,
Path *inner_path,
List *pathkeys,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
@@ -376,6 +394,12 @@ try_nestloop_path(PlannerInfo *root,
Relids outerrelids;
Relids inner_paramrels = PATH_REQ_OUTER(inner_path);
Relids outer_paramrels = PATH_REQ_OUTER(outer_path);
+ bool success = false;
+
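+ /*
+ * Fetch the relation to which we'll add the paths.
+ */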
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ joinrel = joinrel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ joinrel = joinrel->grouped->no_final_agg;
/*
* Paths are parameterized by top-level parents, so run parameterization
@@ -422,10 +446,31 @@ try_nestloop_path(PlannerInfo *root,
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
- if (add_path_precheck(joinrel,
- workspace.startup_cost, workspace.total_cost,
- pathkeys, required_outer))
+ /*
+ * If the join output should be (partially) aggregated, the precheck
+ * includes the aggregation and is postponed to create_grouped_path().
+ */
+ if ((!do_aggregate &&
+ add_path_precheck(joinrel,
+ workspace.startup_cost, workspace.total_cost,
+ pathkeys, required_outer)) ||
+ do_aggregate)
{
+ Path *path;
+ PathTarget *target;
+
+ /*
+ * If the join output is subject to partial aggregation, the path must
+ * have the appropriate target.
+ */
+ if (!do_aggregate)
+ target = joinrel->reltarget;
+ else
+ {
+ Assert(joinrel->agg_info != NULL);
+ target = joinrel->agg_info->input;
+ }
+
/*
* If the inner path is parameterized, it is parameterized by the
* topmost parent of the outer rel, not the outer rel itself. Fix
@@ -447,21 +492,58 @@ try_nestloop_path(PlannerInfo *root,
}
}
- add_path(joinrel, (Path *)
- create_nestloop_path(root,
- joinrel,
- jointype,
- &workspace,
- extra,
- outer_path,
- inner_path,
- extra->restrictlist,
- pathkeys,
- required_outer));
+ path = (Path *) create_nestloop_path(root,
+ joinrel,
+ target,
+ jointype,
+ &workspace,
+ extra,
+ outer_path,
+ inner_path,
+ extra->restrictlist,
+ pathkeys,
+ required_outer);
+ if (!do_aggregate)
+ {
+ add_path(joinrel, path);
+ success = true;
+ }
+ else
+ {
+ /*
+ * Try both AGG_HASHED and AGG_SORTED partial aggregation.
+ *
+ * AGG_HASHED should not be parameterized because we don't want to
+ * create the hashtable again for each set of parameters.
+ */
+ if (required_outer == NULL)
+ success = create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ false,
+ AGG_HASHED,
+ agg_kind);
+
+ /*
+ * Don't try AGG_SORTED if create_grouped_path() would reject it
+ * anyway.
+ */
+ if (pathkeys != NIL)
+ success = success ||
+ create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ false,
+ AGG_SORTED,
+ agg_kind);
+ }
}
- else
+
+ if (!success)
{
- /* Waste no memory when we reject a path here */
+ /* Waste no memory when we reject path(s) here */
bms_free(required_outer);
}
}
@@ -478,9 +560,28 @@ try_partial_nestloop_path(PlannerInfo *root,
Path *inner_path,
List *pathkeys,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
JoinCostWorkspace workspace;
+ Path *path;
+ PathTarget *target;
+
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ /*
+ * See create_grouped_path() for explanation why parallel grouping
+ * paths are not useful w/o final aggregation.
+ */
+ return;
+ }
+
+ /*
+ * Fetch the relation to which we'll add the paths.
+ */
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ joinrel = joinrel->grouped->needs_final_agg;
/*
* If the inner path is parameterized, the parameterization must be fully
@@ -515,7 +616,13 @@ try_partial_nestloop_path(PlannerInfo *root,
*/
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
- if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+
+ /*
+ * If the join output should be (partially) aggregated, the precheck
+ * includes the aggregation and is postponed to create_grouped_path().
+ */
+ if (!do_aggregate &&
+ !add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
return;
/*
@@ -534,18 +641,56 @@ try_partial_nestloop_path(PlannerInfo *root,
return;
}
+ /*
+ * If the join output is subject to partial aggregation, the path must
+ * have the appropriate target.
+ */
+ if (!do_aggregate)
+ target = joinrel->reltarget;
+ else
+ {
+ Assert(joinrel->agg_info != NULL);
+ target = joinrel->agg_info->input;
+ }
+
/* Might be good enough to be worth trying, so let's try it. */
- add_partial_path(joinrel, (Path *)
- create_nestloop_path(root,
- joinrel,
- jointype,
- &workspace,
- extra,
- outer_path,
- inner_path,
- extra->restrictlist,
- pathkeys,
- NULL));
+ path = (Path *) create_nestloop_path(root,
+ joinrel,
+ target,
+ jointype,
+ &workspace,
+ extra,
+ outer_path,
+ inner_path,
+ extra->restrictlist,
+ pathkeys,
+ NULL);
+
+ if (!do_aggregate)
+ add_partial_path(joinrel, path);
+ else
+ {
+ create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ true,
+ AGG_HASHED,
+ agg_kind);
+
+ /*
+ * Don't try AGG_SORTED if create_grouped_path() would reject it
+ * anyway.
+ */
+ if (pathkeys != NIL)
+ create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ true,
+ AGG_SORTED,
+ agg_kind);
+ }
}
/*
@@ -564,15 +709,24 @@ try_mergejoin_path(PlannerInfo *root,
List *innersortkeys,
JoinType jointype,
JoinPathExtraData *extra,
- bool is_partial)
+ bool is_partial,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
+ bool success = false;
+ RelOptInfo *joinrel_plain = joinrel;
+
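+ /*
+ * Fetch the relation to which we'll add the paths.
+ */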
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ joinrel = joinrel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ joinrel = joinrel->grouped->no_final_agg;
if (is_partial)
{
try_partial_mergejoin_path(root,
- joinrel,
+ joinrel_plain,
outer_path,
inner_path,
pathkeys,
@@ -580,7 +734,9 @@ try_mergejoin_path(PlannerInfo *root,
outersortkeys,
innersortkeys,
jointype,
- extra);
+ extra,
+ agg_kind,
+ do_aggregate);
return;
}
@@ -617,26 +773,70 @@ try_mergejoin_path(PlannerInfo *root,
outersortkeys, innersortkeys,
extra);
- if (add_path_precheck(joinrel,
- workspace.startup_cost, workspace.total_cost,
- pathkeys, required_outer))
+ if ((!do_aggregate &&
+ add_path_precheck(joinrel,
+ workspace.startup_cost, workspace.total_cost,
+ pathkeys, required_outer)) ||
+ do_aggregate)
{
- add_path(joinrel, (Path *)
- create_mergejoin_path(root,
- joinrel,
- jointype,
- &workspace,
- extra,
- outer_path,
- inner_path,
- extra->restrictlist,
- pathkeys,
- required_outer,
- mergeclauses,
- outersortkeys,
- innersortkeys));
+ Path *path;
+ PathTarget *target;
+
+ /*
+ * If the join output is subject to partial aggregation, the path must
+ * have the appropriate target.
+ */
+ if (!do_aggregate)
+ target = joinrel->reltarget;
+ else
+ {
+ Assert(joinrel->agg_info != NULL);
+ target = joinrel->agg_info->input;
+ }
+
+ path = (Path *) create_mergejoin_path(root,
+ joinrel,
+ target,
+ jointype,
+ &workspace,
+ extra,
+ outer_path,
+ inner_path,
+ extra->restrictlist,
+ pathkeys,
+ required_outer,
+ mergeclauses,
+ outersortkeys,
+ innersortkeys);
+ if (!do_aggregate)
+ {
+ add_path(joinrel, path);
+ success = true;
+ }
+ else
+ {
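+ /*
+ * As in try_nestloop_path(), try AGG_HASHED only for unparameterized
+ * paths, and AGG_SORTED only if there are pathkeys for
+ * create_grouped_path() to exploit.
+ */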
+ if (required_outer == NULL)
+ success = create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ false,
+ AGG_HASHED,
+ agg_kind);
+
+ if (pathkeys != NIL)
+ success = success ||
+ create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ false,
+ AGG_SORTED,
+ agg_kind);
+ }
}
- else
+
+ if (!success)
{
/* Waste no memory when we reject a path here */
bms_free(required_outer);
@@ -658,9 +858,28 @@ try_partial_mergejoin_path(PlannerInfo *root,
List *outersortkeys,
List *innersortkeys,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
JoinCostWorkspace workspace;
+ Path *path;
+ PathTarget *target;
+
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ /*
+ * See create_grouped_path() for explanation why parallel grouping
+ * paths are not useful w/o final aggregation.
+ */
+ return;
+ }
+
+ /*
+ * Fetch the relation to which we'll add the paths.
+ */
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ joinrel = joinrel->grouped->needs_final_agg;
/*
* See comments in try_partial_hashjoin_path().
@@ -693,24 +912,59 @@ try_partial_mergejoin_path(PlannerInfo *root,
outersortkeys, innersortkeys,
extra);
- if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+ if (!do_aggregate &&
+ !add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
return;
+ /*
+ * If the join output is subject to partial aggregation, the path must
+ * have the appropriate target.
+ */
+ if (!do_aggregate)
+ target = joinrel->reltarget;
+ else
+ {
+ Assert(joinrel->agg_info != NULL);
+ target = joinrel->agg_info->input;
+ }
+
/* Might be good enough to be worth trying, so let's try it. */
- add_partial_path(joinrel, (Path *)
- create_mergejoin_path(root,
- joinrel,
- jointype,
- &workspace,
- extra,
- outer_path,
- inner_path,
- extra->restrictlist,
- pathkeys,
- NULL,
- mergeclauses,
- outersortkeys,
- innersortkeys));
+ path = (Path *) create_mergejoin_path(root,
+ joinrel,
+ target,
+ jointype,
+ &workspace,
+ extra,
+ outer_path,
+ inner_path,
+ extra->restrictlist,
+ pathkeys,
+ NULL,
+ mergeclauses,
+ outersortkeys,
+ innersortkeys);
+
+ if (!do_aggregate)
+ add_partial_path(joinrel, path);
+ else
+ {
+ create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ true,
+ AGG_HASHED,
+ agg_kind);
+
+ if (pathkeys != NIL)
+ create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ true,
+ AGG_SORTED,
+ agg_kind);
+ }
}
/*
@@ -725,10 +979,19 @@ try_hashjoin_path(PlannerInfo *root,
Path *inner_path,
List *hashclauses,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
+ Path *path = NULL;
+ bool success = false;
+
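+ /*
+ * Fetch the relation to which we'll add the paths.
+ */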
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ joinrel = joinrel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ joinrel = joinrel->grouped->no_final_agg;
/*
* Check to see if proposed path is still parameterized, and reject if the
@@ -745,30 +1008,81 @@ try_hashjoin_path(PlannerInfo *root,
}
/*
+ * Parameterized execution of a grouped path would mean repeatedly hashing
+ * the hashjoin output, so forget about AGG_HASHED if there are any
+ * parameters. And AGG_SORTED makes no sense because the hash join output
+ * is not sorted.
+ */
+ if (required_outer && joinrel->agg_info)
+ return;
+
+ /*
* See comments in try_nestloop_path(). Also note that hashjoin paths
* never have any output pathkeys, per comments in create_hashjoin_path.
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra, false);
- if (add_path_precheck(joinrel,
- workspace.startup_cost, workspace.total_cost,
- NIL, required_outer))
+ /*
+ * If the join output should be (partially) aggregated, the precheck
+ * includes the aggregation and is postponed to create_grouped_path().
+ */
+ if ((!do_aggregate &&
+ add_path_precheck(joinrel,
+ workspace.startup_cost, workspace.total_cost,
+ NIL, required_outer)) ||
+ do_aggregate)
{
- add_path(joinrel, (Path *)
- create_hashjoin_path(root,
- joinrel,
- jointype,
- &workspace,
- extra,
- outer_path,
- inner_path,
- false, /* parallel_hash */
- extra->restrictlist,
- required_outer,
- hashclauses));
+ PathTarget *target;
+
+ /*
+ * If the join output is subject to partial aggregation, the path must
+ * have the appropriate target.
+ */
+ if (!do_aggregate)
+ target = joinrel->reltarget;
+ else
+ {
+ Assert(joinrel->agg_info != NULL);
+ target = joinrel->agg_info->input;
+ }
+
+ path = (Path *) create_hashjoin_path(root,
+ joinrel,
+ target,
+ jointype,
+ &workspace,
+ extra,
+ outer_path,
+ inner_path,
+ false, /* parallel_hash */
+ extra->restrictlist,
+ required_outer,
+ hashclauses);
+
+ if (!do_aggregate)
+ {
+ add_path(joinrel, path);
+ success = true;
+ }
+ else
+ {
+ /*
+ * As the hashjoin path is not sorted, only try AGG_HASHED.
+ */
+ if (create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ false,
+ AGG_HASHED,
+ agg_kind))
+ success = true;
+ }
}
- else
+
+ if (!success)
{
/* Waste no memory when we reject a path here */
bms_free(required_outer);
@@ -792,9 +1106,28 @@ try_partial_hashjoin_path(PlannerInfo *root,
List *hashclauses,
JoinType jointype,
JoinPathExtraData *extra,
- bool parallel_hash)
+ bool parallel_hash,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
JoinCostWorkspace workspace;
+ Path *path;
+ PathTarget *target;
+
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ /*
+ * See create_grouped_path() for explanation why parallel grouping
+ * paths are not useful w/o final aggregation.
+ */
+ return;
+ }
+
+ /*
+ * Fetch the relation to which we'll add the paths.
+ */
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ joinrel = joinrel->grouped->needs_final_agg;
/*
* If the inner path is parameterized, the parameterization must be fully
@@ -816,23 +1149,55 @@ try_partial_hashjoin_path(PlannerInfo *root,
* cost. Bail out right away if it looks terrible.
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
- outer_path, inner_path, extra, parallel_hash);
- if (!add_partial_path_precheck(joinrel, workspace.total_cost, NIL))
+ outer_path, inner_path, extra, parallel_hash);
+
+ /*
+ * If the join output should be (partially) aggregated, the precheck
+ * includes the aggregation and is postponed to create_grouped_path().
+ */
+ if (!do_aggregate &&
+ !add_partial_path_precheck(joinrel, workspace.total_cost, NIL))
return;
- /* Might be good enough to be worth trying, so let's try it. */
- add_partial_path(joinrel, (Path *)
- create_hashjoin_path(root,
- joinrel,
- jointype,
- &workspace,
- extra,
- outer_path,
- inner_path,
- parallel_hash,
- extra->restrictlist,
- NULL,
- hashclauses));
+ /*
+ * If the join output is subject to partial aggregation, the path must
+ * have the appropriate target.
+ */
+ if (!do_aggregate)
+ target = joinrel->reltarget;
+ else
+ {
+ Assert(joinrel->agg_info != NULL);
+ target = joinrel->agg_info->input;
+ }
+
+ path = (Path *) create_hashjoin_path(root,
+ joinrel,
+ target,
+ jointype,
+ &workspace,
+ extra,
+ outer_path,
+ inner_path,
+ parallel_hash,
+ extra->restrictlist,
+ NULL,
+ hashclauses);
+ if (!do_aggregate)
+ add_partial_path(joinrel, path);
+ else
+ {
+ /*
+ * Only AGG_HASHED is useful, see comments in try_hashjoin_path().
+ */
+ create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ true,
+ AGG_HASHED,
+ agg_kind);
+ }
}
/*
@@ -876,6 +1241,7 @@ clause_sides_match_join(RestrictInfo *rinfo, RelOptInfo *outerrel,
* 'innerrel' is the inner join relation
* 'jointype' is the type of join to do
* 'extra' contains additional input values
+ * 'agg_kind' and 'do_aggregate' tell if/how to apply partial aggregation to the output.
*/
static void
sort_inner_and_outer(PlannerInfo *root,
@@ -883,7 +1249,9 @@ sort_inner_and_outer(PlannerInfo *root,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
JoinType save_jointype = jointype;
Path *outer_path;
@@ -1045,7 +1413,9 @@ sort_inner_and_outer(PlannerInfo *root,
innerkeys,
jointype,
extra,
- false);
+ false,
+ agg_kind,
+ do_aggregate);
/*
* If we have partial outer and parallel safe inner path then try
@@ -1061,7 +1431,9 @@ sort_inner_and_outer(PlannerInfo *root,
outerkeys,
innerkeys,
jointype,
- extra);
+ extra,
+ agg_kind,
+ do_aggregate);
}
}
@@ -1089,7 +1461,9 @@ generate_mergejoin_paths(PlannerInfo *root,
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
- bool is_partial)
+ bool is_partial,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
List *mergeclauses;
List *innersortkeys;
@@ -1150,7 +1524,9 @@ generate_mergejoin_paths(PlannerInfo *root,
innersortkeys,
jointype,
extra,
- is_partial);
+ is_partial,
+ agg_kind,
+ do_aggregate);
/* Can't do anything else if inner path needs to be unique'd */
if (save_jointype == JOIN_UNIQUE_INNER)
@@ -1247,7 +1623,9 @@ generate_mergejoin_paths(PlannerInfo *root,
NIL,
jointype,
extra,
- is_partial);
+ is_partial,
+ agg_kind,
+ do_aggregate);
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
@@ -1291,7 +1669,9 @@ generate_mergejoin_paths(PlannerInfo *root,
NIL,
jointype,
extra,
- is_partial);
+ is_partial,
+ agg_kind,
+ do_aggregate);
}
cheapest_startup_inner = innerpath;
}
@@ -1333,7 +1713,9 @@ match_unsorted_outer(PlannerInfo *root,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
JoinType save_jointype = jointype;
bool nestjoinOK;
@@ -1456,7 +1838,9 @@ match_unsorted_outer(PlannerInfo *root,
inner_cheapest_total,
merge_pathkeys,
jointype,
- extra);
+ extra,
+ agg_kind,
+ do_aggregate);
}
else if (nestjoinOK)
{
@@ -1478,7 +1862,9 @@ match_unsorted_outer(PlannerInfo *root,
innerpath,
merge_pathkeys,
jointype,
- extra);
+ extra,
+ agg_kind,
+ do_aggregate);
}
/* Also consider materialized form of the cheapest inner path */
@@ -1489,7 +1875,9 @@ match_unsorted_outer(PlannerInfo *root,
matpath,
merge_pathkeys,
jointype,
- extra);
+ extra,
+ agg_kind,
+ do_aggregate);
}
/* Can't do anything else if outer path needs to be unique'd */
@@ -1504,7 +1892,7 @@ match_unsorted_outer(PlannerInfo *root,
generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
save_jointype, extra, useallclauses,
inner_cheapest_total, merge_pathkeys,
- false);
+ false, agg_kind, do_aggregate);
}
/*
@@ -1525,7 +1913,8 @@ match_unsorted_outer(PlannerInfo *root,
{
if (nestjoinOK)
consider_parallel_nestloop(root, joinrel, outerrel, innerrel,
- save_jointype, extra);
+ save_jointype, extra, agg_kind,
+ do_aggregate);
/*
* If inner_cheapest_total is NULL or non parallel-safe then find the
@@ -1545,7 +1934,9 @@ match_unsorted_outer(PlannerInfo *root,
if (inner_cheapest_total)
consider_parallel_mergejoin(root, joinrel, outerrel, innerrel,
save_jointype, extra,
- inner_cheapest_total);
+ inner_cheapest_total,
+ agg_kind,
+ do_aggregate);
}
}
@@ -1568,7 +1959,9 @@ consider_parallel_mergejoin(PlannerInfo *root,
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
- Path *inner_cheapest_total)
+ Path *inner_cheapest_total,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
ListCell *lc1;
@@ -1586,7 +1979,8 @@ consider_parallel_mergejoin(PlannerInfo *root,
generate_mergejoin_paths(root, joinrel, innerrel, outerpath, jointype,
extra, false, inner_cheapest_total,
- merge_pathkeys, true);
+ merge_pathkeys, true, agg_kind,
+ do_aggregate);
}
}
@@ -1607,7 +2001,9 @@ consider_parallel_nestloop(PlannerInfo *root,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
JoinType save_jointype = jointype;
ListCell *lc1;
@@ -1657,7 +2053,8 @@ consider_parallel_nestloop(PlannerInfo *root,
}
try_partial_nestloop_path(root, joinrel, outerpath, innerpath,
- pathkeys, jointype, extra);
+ pathkeys, jointype, extra, agg_kind,
+ do_aggregate);
}
}
}
@@ -1672,6 +2069,7 @@ consider_parallel_nestloop(PlannerInfo *root,
* 'innerrel' is the inner join relation
* 'jointype' is the type of join to do
* 'extra' contains additional input values
+ * 'agg_kind' and 'do_aggregate' tell if/how to apply partial aggregation to the output.
*/
static void
hash_inner_and_outer(PlannerInfo *root,
@@ -1679,7 +2077,9 @@ hash_inner_and_outer(PlannerInfo *root,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ RelAggKind agg_kind,
+ bool do_aggregate)
{
JoinType save_jointype = jointype;
bool isouterjoin = IS_OUTER_JOIN(jointype);
@@ -1754,7 +2154,9 @@ hash_inner_and_outer(PlannerInfo *root,
cheapest_total_inner,
hashclauses,
jointype,
- extra);
+ extra,
+ agg_kind,
+ do_aggregate);
/* no possibility of cheap startup here */
}
else if (jointype == JOIN_UNIQUE_INNER)
@@ -1770,7 +2172,9 @@ hash_inner_and_outer(PlannerInfo *root,
cheapest_total_inner,
hashclauses,
jointype,
- extra);
+ extra,
+ agg_kind,
+ do_aggregate);
if (cheapest_startup_outer != NULL &&
cheapest_startup_outer != cheapest_total_outer)
try_hashjoin_path(root,
@@ -1779,7 +2183,9 @@ hash_inner_and_outer(PlannerInfo *root,
cheapest_total_inner,
hashclauses,
jointype,
- extra);
+ extra,
+ agg_kind,
+ do_aggregate);
}
else
{
@@ -1800,7 +2206,9 @@ hash_inner_and_outer(PlannerInfo *root,
cheapest_total_inner,
hashclauses,
jointype,
- extra);
+ extra,
+ agg_kind,
+ do_aggregate);
foreach(lc1, outerrel->cheapest_parameterized_paths)
{
@@ -1834,7 +2242,9 @@ hash_inner_and_outer(PlannerInfo *root,
innerpath,
hashclauses,
jointype,
- extra);
+ extra,
+ agg_kind,
+ do_aggregate);
}
}
}
@@ -1877,7 +2287,9 @@ hash_inner_and_outer(PlannerInfo *root,
cheapest_partial_outer,
cheapest_partial_inner,
hashclauses, jointype, extra,
- true /* parallel_hash */ );
+ true /* parallel_hash */ ,
+ agg_kind,
+ do_aggregate);
}
/*
@@ -1898,7 +2310,9 @@ hash_inner_and_outer(PlannerInfo *root,
cheapest_partial_outer,
cheapest_safe_inner,
hashclauses, jointype, extra,
- false /* parallel_hash */ );
+ false /* parallel_hash */ ,
+ agg_kind,
+ do_aggregate);
}
}
}
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 7008e1318e..78b1950a84 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,13 +16,16 @@
#include "miscadmin.h"
#include "optimizer/clauses.h"
+#include "optimizer/cost.h"
#include "optimizer/joininfo.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/prep.h"
+#include "optimizer/tlist.h"
#include "partitioning/partbounds.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/selfuncs.h"
static void make_rels_by_clause_joins(PlannerInfo *root,
@@ -31,23 +34,36 @@ static void make_rels_by_clause_joins(PlannerInfo *root,
static void make_rels_by_clauseless_joins(PlannerInfo *root,
RelOptInfo *old_rel,
ListCell *other_rels);
+static void set_grouped_joinrel_target(PlannerInfo *root, RelOptInfo *joinrel,
+ RelOptInfo *rel1, RelOptInfo *rel2,
+ SpecialJoinInfo *sjinfo, List *restrictlist,
+ RelAggInfo *agg_info, RelAggKind agg_kind);
static bool has_join_restriction(PlannerInfo *root, RelOptInfo *rel);
static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
static bool is_dummy_rel(RelOptInfo *rel);
static bool restriction_is_constant_false(List *restrictlist,
RelOptInfo *joinrel,
bool only_pushed_down);
+static RelOptInfo *make_join_rel_common(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelAggInfo *agg_info, RelAggKind agg_kind,
+ bool do_aggregate);
+static void make_join_rel_common_grouped(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelAggInfo *agg_info, RelAggKind agg_kind,
+ bool do_aggregate);
static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
RelOptInfo *rel2, RelOptInfo *joinrel,
- SpecialJoinInfo *sjinfo, List *restrictlist);
-static void try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1,
- RelOptInfo *rel2, RelOptInfo *joinrel,
- SpecialJoinInfo *parent_sjinfo,
- List *parent_restrictlist);
+ SpecialJoinInfo *sjinfo, List *restrictlist,
+ RelAggKind agg_kind,
+ bool do_aggregate);
+static void try_partition_wise_join(PlannerInfo *root, RelOptInfo *rel1,
+ RelOptInfo *rel2, RelOptInfo *joinrel,
+ SpecialJoinInfo *parent_sjinfo,
+ List *parent_restrictlist,
+ RelAggKind agg_kind,
+ bool do_aggregate);
static int match_expr_to_partition_keys(Expr *expr, RelOptInfo *rel,
bool strict_op);
-
/*
* join_search_one_level
* Consider ways to produce join relations containing exactly 'level'
@@ -322,6 +338,63 @@ make_rels_by_clauseless_joins(PlannerInfo *root,
}
}
+/*
+ * Set joinrel's reltarget according to agg_info and estimate the number of
+ * rows.
+ */
+static void
+set_grouped_joinrel_target(PlannerInfo *root, RelOptInfo *joinrel,
+ RelOptInfo *rel1, RelOptInfo *rel2,
+ SpecialJoinInfo *sjinfo, List *restrictlist,
+ RelAggInfo *agg_info, RelAggKind agg_kind)
+{
+ PathTarget *target = NULL;
+
+ Assert(agg_info != NULL);
+
+ /*
+ * build_join_rel() / build_child_join_rel() does not create the target
+ * for a grouped relation.
+ */
+ Assert(joinrel->reltarget == NULL);
+ Assert(joinrel->agg_info == NULL);
+
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ target = agg_info->target_simple;
+ else if (agg_kind == REL_AGG_KIND_PARTIAL)
+ target = agg_info->target_partial;
+ else
+ Assert(false);
+
+ /*
+ * The output will actually be grouped, i.e. partially aggregated. No
+ * additional processing needed.
+ */
+ joinrel->reltarget = copy_pathtarget(target);
+
+ /*
+ * The rest of agg_info will be needed at aggregation time.
+ */
+ joinrel->agg_info = agg_info;
+
+ /*
+ * Now that we have the target, compute the estimates.
+ */
+ set_joinrel_size_estimates(root, joinrel, rel1, rel2, sjinfo,
+ restrictlist);
+
+ /*
+ * Grouping essentially changes the number of rows.
+ *
+ * XXX We do not distinguish whether two plain rels are joined and the
+ * result is partially aggregated, or the partial aggregation has already
+ * been applied to one of the input rels. Is this worth the extra effort,
+ * e.g. maintaining a separate RelOptInfo for each case (one difficulty
+ * that would introduce is construction of AppendPath)?
+ */
+ joinrel->rows = estimate_num_groups(root, joinrel->agg_info->group_exprs,
+ joinrel->rows, NULL);
+}
/*
* join_is_legal
@@ -651,32 +724,46 @@ join_is_legal(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
return true;
}
-
/*
- * make_join_rel
+ * make_join_rel_common
* Find or create a join RelOptInfo that represents the join of
* the two given rels, and add to it path information for paths
* created with the two rels as outer and inner rel.
* (The join rel may already contain paths generated from other
* pairs of rels that add up to the same set of base rels.)
*
- * NB: will return NULL if attempted join is not valid. This can happen
- * when working with outer joins, or with IN or EXISTS clauses that have been
- * turned into joins.
+ * 'agg_info' contains the reltarget of the grouped relation and everything
+ * we need to aggregate the join result. If NULL, the join relation should
+ * not be grouped.
+ *
+ * 'do_aggregate' says that the two non-grouped rels should be joined and
+ * partial aggregation applied to all the resulting join paths.
+ *
+ * NB: will return NULL if attempted join is not valid. This can happen when
+ * working with outer joins, or with IN or EXISTS clauses that have been
+ * turned into joins. NULL is also returned if caller is interested in a
+ * grouped relation but there's no useful grouped input relation.
*/
-RelOptInfo *
-make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
+static RelOptInfo *
+make_join_rel_common(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelAggInfo *agg_info, RelAggKind agg_kind,
+ bool do_aggregate)
{
Relids joinrelids;
SpecialJoinInfo *sjinfo;
bool reversed;
SpecialJoinInfo sjinfo_data;
- RelOptInfo *joinrel;
+ RelOptInfo *joinrel,
+ *joinrel_plain;
List *restrictlist;
+ bool grouped = agg_info != NULL;
/* We should never try to join two overlapping sets of rels. */
Assert(!bms_overlap(rel1->relids, rel2->relids));
+ /* Any agg_kind other than REL_AGG_KIND_NONE implies grouped output. */
+ Assert(agg_kind == REL_AGG_KIND_NONE || grouped);
+
/* Construct Relids set that identifies the joinrel. */
joinrelids = bms_union(rel1->relids, rel2->relids);
@@ -725,8 +812,68 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
* Find or build the join RelOptInfo, and compute the restrictlist that
* goes with this particular joining.
*/
- joinrel = build_join_rel(root, joinrelids, rel1, rel2, sjinfo,
- &restrictlist);
+ joinrel = joinrel_plain = build_join_rel(root, joinrelids, rel1, rel2, sjinfo,
+ &restrictlist, false);
+
+ if (grouped)
+ {
+ /*
+ * Make sure there's a grouped join relation.
+ */
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ {
+ if (joinrel->grouped->needs_final_agg == NULL)
+ joinrel->grouped->needs_final_agg = build_join_rel(root,
+ joinrelids,
+ rel1,
+ rel2,
+ sjinfo,
+ &restrictlist,
+ true);
+
+ /*
+ * The grouped join is what we need to return.
+ */
+ joinrel = joinrel->grouped->needs_final_agg;
+ }
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ if (joinrel->grouped->no_final_agg == NULL)
+ joinrel->grouped->no_final_agg = build_join_rel(root,
+ joinrelids,
+ rel1,
+ rel2,
+ sjinfo,
+ &restrictlist,
+ true);
+
+ /*
+ * The grouped join is what we need to return.
+ */
+ joinrel = joinrel->grouped->no_final_agg;
+ }
+ else
+ Assert(false);
+
+ /*
+ * Make sure the grouped joinrel has reltarget initialized. Caller
+ * should supply the target for the grouped relation, so build_join_rel()
+ * should have omitted its creation.
+ *
+ * The target can already be there if we already applied another
+ * strategy to create grouped join.
+ */
+ if (joinrel->reltarget == NULL)
+ {
+ set_grouped_joinrel_target(root, joinrel, rel1, rel2, sjinfo,
+ restrictlist, agg_info, agg_kind);
+
+ if (rel1->consider_parallel && rel2->consider_parallel &&
+ is_parallel_safe(root, (Node *) restrictlist) &&
+ is_parallel_safe(root, (Node *) joinrel->reltarget->exprs))
+ joinrel->consider_parallel = true;
+ }
+ }
/*
* If we've already proven this join is empty, we needn't consider any
@@ -738,15 +885,222 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
return joinrel;
}
- /* Add paths to the join relation. */
- populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
- restrictlist);
+ /*
+ * Add paths to the join relation.
+ *
+ * Pass joinrel_plain and agg_kind instead of joinrel, since the function
+ * needs agg_kind anyway.
+ */
+ populate_joinrel_with_paths(root, rel1, rel2, joinrel_plain, sjinfo,
+ restrictlist, agg_kind, do_aggregate);
bms_free(joinrelids);
return joinrel;
}
+static void
+make_join_rel_common_grouped(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelAggInfo *agg_info, RelAggKind agg_kind,
+ bool do_aggregate)
+{
+ RelOptInfo *rel1_grouped = NULL;
+ RelOptInfo *rel2_grouped = NULL;
+ bool rel1_grouped_useful = false;
+ bool rel2_grouped_useful = false;
+
+ /*
+ * Retrieve the grouped relations.
+ *
+ * A dummy rel here indicates a relation that could generate grouped paths
+ * in principle (i.e. it has a valid agg_info), but for which the paths
+ * actually could not be created (e.g. only the AGG_HASHED strategy was
+ * possible, but work_mem was not sufficient for the hash table).
+ */
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ {
+ if (rel1->grouped && rel1->grouped->needs_final_agg)
+ rel1_grouped = rel1->grouped->needs_final_agg;
+
+ if (rel2->grouped && rel2->grouped->needs_final_agg)
+ rel2_grouped = rel2->grouped->needs_final_agg;
+ }
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ if (rel1->grouped && rel1->grouped->no_final_agg)
+ rel1_grouped = rel1->grouped->no_final_agg;
+
+ if (rel2->grouped && rel2->grouped->no_final_agg)
+ rel2_grouped = rel2->grouped->no_final_agg;
+ }
+ else
+ Assert(false);
+
+ rel1_grouped_useful = rel1_grouped != NULL && !IS_DUMMY_REL(rel1_grouped);
+ rel2_grouped_useful = rel2_grouped != NULL && !IS_DUMMY_REL(rel2_grouped);
+
+ /*
+ * Nothing else to do?
+ */
+ if (!rel1_grouped_useful && !rel2_grouped_useful)
+ return;
+
+ /*
+ * At most one input rel can be grouped (here we don't care whether any rel
+ * is eventually dummy; the existence of a grouped rel indicates that
+ * aggregates can be pushed down to it). If both were grouped, grouping one
+ * side would change how often the other side's aggregate transition states
+ * appear in the input of the final aggregation. This could be handled by
+ * adjusting the transition states, but it's not worth
+ * the effort because it's hard to find a use case for this kind of join.
+ *
+ * XXX If the join of two grouped rels is implemented someday, note that
+ * both rels can have aggregates, so it'd be hard to join a grouped rel to
+ * a non-grouped one here: 1) such a "mixed join" would require a special
+ * target, 2) both AGGSPLIT_FINAL_DESERIAL and AGGSPLIT_SIMPLE aggregates
+ * could appear in the target of the final aggregation node, originating
+ * from the grouped and the non-grouped input rel respectively.
+ */
+ if (rel1_grouped && rel2_grouped)
+ return;
+
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ /*
+ * TODO return only if the join can duplicate values of grouping key
+ * generated by the grouped relation.
+ */
+ return;
+ }
+
+ if (rel1_grouped_useful)
+ make_join_rel_common(root, rel1_grouped, rel2, agg_info, agg_kind,
+ do_aggregate);
+ else if (rel2_grouped_useful)
+ make_join_rel_common(root, rel1, rel2_grouped, agg_info, agg_kind,
+ do_aggregate);
+}
+
+/*
+ * Front-end to make_join_rel_common(). Generates plain (non-grouped) join and
+ * then uses all the possible strategies to generate the grouped one.
+ */
+RelOptInfo *
+make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
+{
+ Relids joinrelids;
+ RelAggInfo *agg_info;
+ RelOptInfo *joinrel;
+ double nrows_plain;
+ RelOptInfo *result;
+
+ /* 1) form the plain join. */
+ result = make_join_rel_common(root, rel1, rel2, NULL, REL_AGG_KIND_NONE,
+ false);
+
+ if (result == NULL)
+ return result;
+
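+ /*
+ * Remember the plain join's row estimate; the grouped relation's size
+ * estimate below is derived from it.
+ */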
+ nrows_plain = result->rows;
+
+ /*
+ * We're done if there are no grouping expressions or aggregates.
+ */
+ if (root->grouped_var_list == NIL)
+ return result;
+
+ /*
+ * If the same joinrel was already formed, just with the base rels divided
+ * between rel1 and rel2 in a different way, we might already have the
+ * matching agg_info.
+ */
+ joinrelids = bms_union(rel1->relids, rel2->relids);
+ joinrel = find_join_rel(root, joinrelids);
+
+ /*
+ * At this point we know the non-grouped join exists, so it should have
+ * been fetched.
+ */
+ Assert(joinrel != NULL);
+
+ if (joinrel->grouped != NULL)
+ {
+ /*
+ * RelOptGrouped should always have valid needs_final_agg.
+ *
+ * XXX Should RelOptGrouped also have the agg_info pointer, to make
+ * access to it more straightforward?
+ */
+ Assert(joinrel->grouped->needs_final_agg != NULL);
+ Assert(joinrel->grouped->needs_final_agg->agg_info != NULL);
+
+ agg_info = joinrel->grouped->needs_final_agg->agg_info;
+ }
+ else
+ {
+ double nrows;
+
+ /*
+ * agg_info must be created from scratch.
+ */
+ agg_info = create_rel_agg_info(root, result);
+
+ /*
+ * Grouping essentially changes the number of rows.
+ */
+ if (agg_info != NULL)
+ {
+ nrows = estimate_num_groups(root,
+ agg_info->group_exprs,
+ nrows_plain,
+ NULL);
+ agg_info->rows = clamp_row_est(nrows);
+ }
+ }
+
+ /*
+ * If no grouped join can be built, we're done.
+ */
+ if (agg_info == NULL)
+ return result;
+
+ /*
+ * 2) join two plain rels and aggregate the join paths.
+ */
+ result->grouped = (RelOptGrouped *) palloc0(sizeof(RelOptGrouped));
+ result->grouped->needs_final_agg = make_join_rel_common(root, rel1, rel2,
+ agg_info,
+ REL_AGG_KIND_PARTIAL,
+ true);
+
+ /*
+ * If the non-grouped join relation could be built, its aggregated form
+ * should exist too.
+ */
+ Assert(result->grouped->needs_final_agg != NULL);
+
+ /*
+ * Similarly for no_final_agg.
+ */
+ result->grouped->no_final_agg = make_join_rel_common(root, rel1, rel2,
+ agg_info,
+ REL_AGG_KIND_SIMPLE,
+ true);
+ Assert(result->grouped->no_final_agg != NULL);
+
+ /*
+ * 3) combine plain and grouped relations in order to create both
+ * needs_final_agg and no_final_agg join relations.
+ */
+ make_join_rel_common_grouped(root, rel1, rel2, agg_info,
+ REL_AGG_KIND_PARTIAL, false);
+ make_join_rel_common_grouped(root, rel1, rel2, agg_info,
+ REL_AGG_KIND_SIMPLE, false);
+
+ return result;
+}
+
/*
* populate_joinrel_with_paths
* Add paths to the given joinrel for given pair of joining relations. The
@@ -757,8 +1111,26 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
static void
populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
RelOptInfo *rel2, RelOptInfo *joinrel,
- SpecialJoinInfo *sjinfo, List *restrictlist)
+ SpecialJoinInfo *sjinfo, List *restrictlist,
+ RelAggKind agg_kind, bool do_aggregate)
{
+ RelOptInfo *joinrel_plain;
+
+ /*
+ * joinrel_plain and agg_kind are passed to add_paths_to_joinrel() since it
+ * needs agg_kind anyway.
+ *
+ * TODO As for the other uses, find out where joinrel can be used safely
+ * instead of joinrel_plain, i.e. check that even grouped joinrel has all
+ * the information needed.
+ */
+ joinrel_plain = joinrel;
+
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ joinrel = joinrel->grouped->needs_final_agg;
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ joinrel = joinrel->grouped->no_final_agg;
+
/*
* Consider paths using each rel as both outer and inner. Depending on
* the join type, a provably empty outer or inner rel might mean the join
@@ -781,17 +1153,17 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
{
case JOIN_INNER:
if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
- restriction_is_constant_false(restrictlist, joinrel, false))
+ restriction_is_constant_false(restrictlist, joinrel_plain, false))
{
mark_dummy_rel(joinrel);
break;
}
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_INNER, sjinfo,
- restrictlist);
- add_paths_to_joinrel(root, joinrel, rel2, rel1,
+ restrictlist, agg_kind, do_aggregate);
+ add_paths_to_joinrel(root, joinrel_plain, rel2, rel1,
JOIN_INNER, sjinfo,
- restrictlist);
+ restrictlist, agg_kind, do_aggregate);
break;
case JOIN_LEFT:
if (is_dummy_rel(rel1) ||
@@ -800,29 +1172,29 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
mark_dummy_rel(joinrel);
break;
}
- if (restriction_is_constant_false(restrictlist, joinrel, false) &&
+ if (restriction_is_constant_false(restrictlist, joinrel_plain, false) &&
bms_is_subset(rel2->relids, sjinfo->syn_righthand))
mark_dummy_rel(rel2);
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_LEFT, sjinfo,
- restrictlist);
- add_paths_to_joinrel(root, joinrel, rel2, rel1,
+ restrictlist, agg_kind, do_aggregate);
+ add_paths_to_joinrel(root, joinrel_plain, rel2, rel1,
JOIN_RIGHT, sjinfo,
- restrictlist);
+ restrictlist, agg_kind, do_aggregate);
break;
case JOIN_FULL:
if ((is_dummy_rel(rel1) && is_dummy_rel(rel2)) ||
- restriction_is_constant_false(restrictlist, joinrel, true))
+ restriction_is_constant_false(restrictlist, joinrel_plain, true))
{
mark_dummy_rel(joinrel);
break;
}
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_FULL, sjinfo,
- restrictlist);
- add_paths_to_joinrel(root, joinrel, rel2, rel1,
+ restrictlist, agg_kind, do_aggregate);
+ add_paths_to_joinrel(root, joinrel_plain, rel2, rel1,
JOIN_FULL, sjinfo,
- restrictlist);
+ restrictlist, agg_kind, do_aggregate);
/*
* If there are join quals that aren't mergeable or hashable, we
@@ -848,14 +1220,14 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
bms_is_subset(sjinfo->min_righthand, rel2->relids))
{
if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
- restriction_is_constant_false(restrictlist, joinrel, false))
+ restriction_is_constant_false(restrictlist, joinrel_plain, false))
{
mark_dummy_rel(joinrel);
break;
}
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_SEMI, sjinfo,
- restrictlist);
+ restrictlist, agg_kind, do_aggregate);
}
/*
@@ -871,32 +1243,32 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
sjinfo) != NULL)
{
if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
- restriction_is_constant_false(restrictlist, joinrel, false))
+ restriction_is_constant_false(restrictlist, joinrel_plain, false))
{
mark_dummy_rel(joinrel);
break;
}
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_UNIQUE_INNER, sjinfo,
- restrictlist);
- add_paths_to_joinrel(root, joinrel, rel2, rel1,
+ restrictlist, agg_kind, do_aggregate);
+ add_paths_to_joinrel(root, joinrel_plain, rel2, rel1,
JOIN_UNIQUE_OUTER, sjinfo,
- restrictlist);
+ restrictlist, agg_kind, do_aggregate);
}
break;
case JOIN_ANTI:
if (is_dummy_rel(rel1) ||
- restriction_is_constant_false(restrictlist, joinrel, true))
+ restriction_is_constant_false(restrictlist, joinrel_plain, true))
{
mark_dummy_rel(joinrel);
break;
}
- if (restriction_is_constant_false(restrictlist, joinrel, false) &&
+ if (restriction_is_constant_false(restrictlist, joinrel_plain, false) &&
bms_is_subset(rel2->relids, sjinfo->syn_righthand))
mark_dummy_rel(rel2);
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_ANTI, sjinfo,
- restrictlist);
+ restrictlist, agg_kind, do_aggregate);
break;
default:
/* other values not expected here */
@@ -904,8 +1276,16 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
break;
}
- /* Apply partitionwise join technique, if possible. */
- try_partitionwise_join(root, rel1, rel2, joinrel, sjinfo, restrictlist);
+ /*
+ * TODO Only allow per-child AGGSPLIT_SIMPLE if the partitioning allows
+ * it, i.e. each partition generates a distinct set of grouping keys.
+ */
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ return;
+
+ /* Apply partition-wise join technique, if possible. */
+ try_partition_wise_join(root, rel1, rel2, joinrel_plain, sjinfo, restrictlist,
+ agg_kind, do_aggregate);
}
@@ -1232,7 +1612,8 @@ mark_dummy_rel(RelOptInfo *rel)
/* Set up the dummy path */
add_path(rel, (Path *) create_append_path(NULL, rel, NIL, NIL, NULL,
- 0, false, NIL, -1));
+ 0, false, NIL, -1,
+ REL_AGG_KIND_NONE));
/* Set or update cheapest_total_path and related fields */
set_cheapest(rel);
@@ -1308,16 +1689,16 @@ restriction_is_constant_false(List *restrictlist,
* obtained by translating the respective parent join structures.
*/
static void
-try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
- RelOptInfo *joinrel, SpecialJoinInfo *parent_sjinfo,
- List *parent_restrictlist)
+try_partition_wise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelOptInfo *joinrel, SpecialJoinInfo *parent_sjinfo,
+ List *parent_restrictlist, RelAggKind agg_kind,
+ bool do_aggregate)
{
int nparts;
int cnt_parts;
/* Guard against stack overflow due to overly deep partition hierarchy. */
check_stack_depth();
-
/* Nothing to do, if the join relation is not partitioned. */
if (!IS_PARTITIONED_REL(joinrel))
return;
@@ -1390,23 +1771,124 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
(List *) adjust_appendrel_attrs(root,
(Node *) parent_restrictlist,
nappinfos, appinfos);
- pfree(appinfos);
child_joinrel = joinrel->part_rels[cnt_parts];
if (!child_joinrel)
{
- child_joinrel = build_child_join_rel(root, child_rel1, child_rel2,
- joinrel, child_restrictlist,
- child_sjinfo,
- child_sjinfo->jointype);
- joinrel->part_rels[cnt_parts] = child_joinrel;
+ if (agg_kind == REL_AGG_KIND_NONE)
+ child_joinrel = build_child_join_rel(root, child_rel1, child_rel2,
+ joinrel,
+ child_restrictlist,
+ child_sjinfo,
+ child_sjinfo->jointype,
+ false);
+ else
+ {
+ /*
+ * The join should have been created when we were called with
+ * REL_AGG_KIND_NONE.
+ */
+ child_joinrel = find_join_rel(root, bms_union(child_rel1->relids,
+ child_rel2->relids));
+ Assert(child_joinrel);
+ }
}
+ if (agg_kind != REL_AGG_KIND_NONE)
+ {
+ RelOptInfo *joinrel_grouped,
+ *child_joinrel_grouped;
+ RelAggInfo *child_agg_info;
+
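+ /*
+ * Create the RelOptGrouped container for the child join if it does
+ * not exist yet.
+ */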
+ if (child_joinrel->grouped == NULL)
+ child_joinrel->grouped = (RelOptGrouped *) palloc0(sizeof(RelOptGrouped));
+
+ /*
+ * Make sure there's a grouped join relation.
+ */
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ {
+ joinrel_grouped = joinrel->grouped->needs_final_agg;
+
+ if (child_joinrel->grouped->needs_final_agg == NULL)
+ child_joinrel->grouped->needs_final_agg =
+ build_child_join_rel(root,
+ child_rel1,
+ child_rel2,
+ joinrel_grouped,
+ child_restrictlist,
+ child_sjinfo,
+ child_sjinfo->jointype,
+ true);
+
+ /*
+ * The grouped join is what we need till the end of the
+ * function.
+ */
+ child_joinrel_grouped = child_joinrel->grouped->needs_final_agg;
+ }
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ joinrel_grouped = joinrel->grouped->no_final_agg;
+
+ if (child_joinrel->grouped->no_final_agg == NULL)
+ child_joinrel->grouped->no_final_agg =
+ build_child_join_rel(root, child_rel1, child_rel2,
+ joinrel_grouped,
+ child_restrictlist,
+ child_sjinfo,
+ child_sjinfo->jointype,
+ true);
+
+ /*
+ * The grouped join is what we need till the end of the
+ * function.
+ */
+ child_joinrel_grouped = child_joinrel->grouped->no_final_agg;
+ }
+ else
+ Assert(false);
+
+ /*
+ * Build the child join's agg_info by translating the parent's agg_info.
+ *
+ * Although build_child_join_rel() creates reltarget for each
+ * child join from scratch as opposed to translating the parent
+ * reltarget (XXX set_append_rel_size() uses the translation ---
+ * is this inconsistency justified?), we just translate the parent
+ * reltarget here. A per-child call of create_rel_agg_info() would
+ * introduce too much duplicate work because it needs the *parent*
+ * target as a source, and that one is identical for all the child
+ * joins.
+ */
+ child_agg_info = translate_rel_agg_info(root,
+ joinrel_grouped->agg_info,
+ appinfos, nappinfos);
+
+ /*
+ * Make sure the child joinrel has reltarget initialized.
+ */
+ if (child_joinrel_grouped->reltarget == NULL)
+ {
+ set_grouped_joinrel_target(root, child_joinrel_grouped, rel1, rel2,
+ child_sjinfo, child_restrictlist,
+ child_agg_info, agg_kind);
+ }
+
+ joinrel_grouped->part_rels[cnt_parts] = child_joinrel_grouped;
+ }
+ else
+ joinrel->part_rels[cnt_parts] = child_joinrel;
+
+ pfree(appinfos);
+
Assert(bms_equal(child_joinrel->relids, child_joinrelids));
populate_joinrel_with_paths(root, child_rel1, child_rel2,
child_joinrel, child_sjinfo,
- child_restrictlist);
+ child_restrictlist,
+ agg_kind,
+ do_aggregate);
}
}
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
index 3bb5b8def6..0a0d22d427 100644
--- a/src/backend/optimizer/path/tidpath.c
+++ b/src/backend/optimizer/path/tidpath.c
@@ -250,10 +250,11 @@ TidQualFromBaseRestrictinfo(RelOptInfo *rel)
* Candidate paths are added to the rel's pathlist (using add_path).
*/
void
-create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
+create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel, RelAggKind agg_kind)
{
Relids required_outer;
List *tidquals;
+ Path *tidpath;
/*
* We don't support pushing join clauses into the quals of a tidscan, but
@@ -263,8 +264,21 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
required_outer = rel->lateral_relids;
tidquals = TidQualFromBaseRestrictinfo(rel);
+ if (!tidquals)
+ return;
- if (tidquals)
- add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
- required_outer));
+ tidpath = (Path *) create_tidscan_path(root, rel, tidquals,
+ required_outer);
+
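+ /*
+ * For a grouped rel, apply partial aggregation on top of the tidscan
+ * instead of adding the plain path.
+ */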
+ if (agg_kind == REL_AGG_KIND_NONE)
+ add_path(rel, tidpath);
+ else if (required_outer == NULL)
+ {
+ /*
+ * Only AGG_HASHED is suitable here as it does not expect the input
+ * set to be sorted.
+ */
+ create_grouped_path(root, rel, tidpath, false, false, AGG_HASHED,
+ agg_kind);
+ }
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index cf82b7052d..26dec922d8 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -831,6 +831,12 @@ use_physical_tlist(PlannerInfo *root, Path *path, int flags)
return false;
/*
+ * A grouped relation's target list contains GroupedVars.
+ */
+ if (rel->agg_info != NULL)
+ return false;
+
+ /*
* If a bitmap scan's tlist is empty, keep it as-is. This may allow the
* executor to skip heap page fetches, and in any case, the benefit of
* using a physical tlist instead would be minimal.
@@ -1639,7 +1645,8 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
* therefore can't predict whether it will require an exact tlist. For
* both of these reasons, we have to recheck here.
*/
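+	/*
+	 * force_result (introduced elsewhere in this patch series) presumably
+	 * marks projection paths whose Result node must not be elided; skip
+	 * both shortcuts below in that case.
+	 */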
- if (use_physical_tlist(root, &best_path->path, flags))
+ if (!best_path->force_result &&
+ use_physical_tlist(root, &best_path->path, flags))
{
/*
* Our caller doesn't really care what tlist we return, so we don't
@@ -1652,7 +1659,8 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
apply_pathtarget_labeling_to_tlist(tlist,
best_path->path.pathtarget);
}
- else if (is_projection_capable_path(best_path->subpath))
+ else if (!best_path->force_result &&
+ is_projection_capable_path(best_path->subpath))
{
/*
* Our caller requires that we return the exact tlist, but no separate
@@ -5929,6 +5937,21 @@ find_ec_member_for_tle(EquivalenceClass *ec,
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
+ /*
+ * A GroupedVar can contain either a non-Var grouping expression or an
+ * aggregate. The grouping expression might be useful for sorting; however,
+ * aggregates shouldn't currently appear among pathkeys.
+ */
+ if (IsA(tlexpr, GroupedVar))
+ {
+ GroupedVar *gvar = castNode(GroupedVar, tlexpr);
+
+ if (!IsA(gvar->gvexpr, Aggref))
+ tlexpr = gvar->gvexpr;
+ else
+ return NULL;
+ }
+
foreach(lc, ec->ec_members)
{
EquivalenceMember *em = (EquivalenceMember *) lfirst(lc);
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 01335db511..0740e3f18d 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
#include "catalog/pg_type.h"
#include "catalog/pg_class.h"
#include "nodes/nodeFuncs.h"
@@ -27,6 +28,7 @@
#include "optimizer/planner.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+#include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "parser/analyze.h"
#include "rewrite/rewriteManip.h"
@@ -46,6 +48,10 @@ typedef struct PostponedQual
} PostponedQual;
+static void create_aggregate_grouped_var_infos(PlannerInfo *root);
+static void create_grouping_expr_grouped_var_infos(PlannerInfo *root);
+static RelOptInfo *copy_simple_rel(PlannerInfo *root, RelOptInfo *rel,
+ RelAggKind agg_kind);
static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
Index rtindex);
static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -96,10 +102,9 @@ static void check_hashjoinable(RestrictInfo *restrictinfo);
* jtnode. Internally, the function recurses through the jointree.
*
* At the end of this process, there should be one baserel RelOptInfo for
- * every non-join RTE that is used in the query. Therefore, this routine
- * is the only place that should call build_simple_rel with reloptkind
- * RELOPT_BASEREL. (Note: build_simple_rel recurses internally to build
- * "other rel" RelOptInfos for the members of any appendrels we find here.)
+ * every non-grouped non-join RTE that is used in the query. (Note:
+ * build_simple_rel recurses internally to build "other rel" RelOptInfos for
+ * the members of any appendrels we find here.)
*/
void
add_base_rels_to_query(PlannerInfo *root, Node *jtnode)
@@ -241,6 +246,456 @@ add_vars_to_targetlist(PlannerInfo *root, List *vars,
}
}
+/*
+ * Add a GroupedVarInfo to grouped_var_list for each aggregate as well as for
+ * each possible grouping expression, and set up a RelOptInfo for each base or
+ * 'other' relation that can produce grouped paths.
+ *
+ * Note that targets of the 'other' relations are not set here ---
+ * set_append_rel_size() will create them by translating the targets of the
+ * base rel.
+ *
+ * root->group_pathkeys must be set up before this function is called.
+ */
+void
+add_grouped_base_rels_to_query(PlannerInfo *root)
+{
+ int i;
+ ListCell *lc;
+
+ /*
+	 * Do nothing if the user has disabled the aggregate push-down feature.
+ */
+ if (!enable_agg_pushdown)
+ return;
+
+ /* No grouping in the query? */
+ if (!root->parse->groupClause)
+ return;
+
+ /*
+ * Grouping sets require multiple different groupings but the base
+ * relation can only generate one.
+ */
+ if (root->parse->groupingSets)
+ return;
+
+ /*
+	 * SRFs are not allowed in aggregate arguments and we don't even want them
+	 * in the GROUP BY clause, so forbid them in general. It remains to be
+	 * analyzed whether evaluating a GROUP BY clause containing an SRF below
+	 * the query targetlist would be correct. Currently it does not seem to be
+	 * an important use case.
+ */
+ if (root->parse->hasTargetSRFs)
+ return;
+
+ /*
+ * TODO Consider if this is a real limitation.
+ */
+ if (root->parse->hasWindowFuncs)
+ return;
+
+ /* Create GroupedVarInfo per (distinct) aggregate. */
+ create_aggregate_grouped_var_infos(root);
+
+	/* No aggregates to push down? */
+ if (root->grouped_var_list == NIL)
+ return;
+
+ /* Create GroupedVarInfo per grouping expression. */
+ create_grouping_expr_grouped_var_infos(root);
+
+ /*
+ * Are all the aggregates AGGSPLIT_SIMPLE?
+ */
+ if (root->grouped_var_list == NIL)
+ return;
+
+ /*
+ * Now that we know that grouping can be pushed down, search for the
+ * maximum sortgroupref. The base relations may need it if extra grouping
+ * expressions get added to them.
+ */
+ Assert(root->max_sortgroupref == 0);
+ foreach(lc, root->processed_tlist)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc);
+
+ if (te->ressortgroupref > root->max_sortgroupref)
+ root->max_sortgroupref = te->ressortgroupref;
+ }
+
+ /* Process the individual base relations. */
+ for (i = 1; i < root->simple_rel_array_size; i++)
+ {
+ RelOptInfo *rel = root->simple_rel_array[i];
+ RangeTblEntry *rte;
+ RelAggInfo *agg_info;
+
+ /* NULL should mean a join relation. */
+ if (rel == NULL)
+ continue;
+
+ /*
+ * Not all RTE kinds are supported when grouping is considered.
+ *
+ * TODO Consider relaxing some of these restrictions.
+ */
+ rte = root->simple_rte_array[rel->relid];
+ if (rte->rtekind != RTE_RELATION ||
+ rte->relkind == RELKIND_FOREIGN_TABLE ||
+ rte->tablesample != NULL)
+ return;
+
+ /*
+ * Grouped "other member rels" should not be created until we know
+ * whether the parent can be grouped, i.e. until the parent has
+ * rel->agg_info initialized.
+ */
+ if (rel->reloptkind != RELOPT_BASEREL)
+ continue;
+
+ /*
+ * Retrieve the information we need for aggregation of the rel
+ * contents.
+ */
+ Assert(rel->agg_info == NULL);
+ agg_info = create_rel_agg_info(root, rel);
+ if (agg_info == NULL)
+ continue;
+
+ /*
+		 * Create the grouped counterpart of "rel". This may include the
+		 * "other member rels" skipped above, if they're children of this
+ * rel. (The child rels will have their ->target and ->agg_info
+ * initialized later by set_append_rel_size()).
+ */
+ Assert(rel->agg_info == NULL);
+ Assert(rel->grouped == NULL);
+ rel->grouped = (RelOptGrouped *) palloc0(sizeof(RelOptGrouped));
+ rel->grouped->needs_final_agg = copy_simple_rel(root, rel,
+ REL_AGG_KIND_PARTIAL);
+ rel->grouped->no_final_agg = copy_simple_rel(root, rel,
+ REL_AGG_KIND_SIMPLE);
+
+ /*
+ * Assign it the aggregation-specific info.
+ *
+ * The aggregation paths will get their input target from agg_info, so
+ * store it too.
+ */
+ rel->grouped->needs_final_agg->reltarget = agg_info->target_partial;
+ rel->grouped->needs_final_agg->agg_info = agg_info;
+
+ rel->grouped->no_final_agg->reltarget = agg_info->target_simple;
+ rel->grouped->no_final_agg->agg_info = agg_info;
+ }
+}
+
+/*
+ * Create GroupedVarInfo for each distinct aggregate.
+ *
+ * If any aggregate is not suitable, set root->grouped_var_list to NIL and
+ * return.
+ */
+static void
+create_aggregate_grouped_var_infos(PlannerInfo *root)
+{
+ List *tlist_exprs;
+ ListCell *lc;
+
+ Assert(root->grouped_var_list == NIL);
+
+ tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+ PVC_INCLUDE_AGGREGATES);
+
+ /*
+	 * Although GroupingFunc is related to root->parse->groupingSets, that
+	 * field does not necessarily reflect the presence of GroupingFunc nodes,
+	 * so check the targetlist explicitly.
+ */
+ foreach(lc, tlist_exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+
+ if (IsA(expr, GroupingFunc))
+ return;
+ }
+
+ /*
+ * Aggregates within the HAVING clause need to be processed in the same
+ * way as those in the main targetlist.
+ */
+ if (root->parse->havingQual != NULL)
+ {
+ List *having_exprs;
+
+ having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+ PVC_INCLUDE_AGGREGATES);
+ if (having_exprs != NIL)
+ tlist_exprs = list_concat(tlist_exprs, having_exprs);
+ }
+
+ if (tlist_exprs == NIL)
+ return;
+
+ /* tlist_exprs may also contain Vars, but we only need Aggrefs. */
+ foreach(lc, tlist_exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Aggref *aggref;
+ ListCell *lc2;
+ GroupedVarInfo *gvi;
+ bool exists;
+
+ if (IsA(expr, Var))
+ continue;
+
+ aggref = castNode(Aggref, expr);
+
+		/* TODO Consider whether (some of) these can be handled. */
+ if (aggref->aggvariadic ||
+ aggref->aggdirectargs || aggref->aggorder ||
+ aggref->aggdistinct || aggref->aggfilter)
+ {
+ /*
+ * Partial aggregation is not useful if at least one aggregate
+ * cannot be evaluated below the top-level join.
+ *
+ * XXX Is it worth freeing the GroupedVarInfos and their subtrees?
+ */
+ root->grouped_var_list = NIL;
+ break;
+ }
+
+ /*
+		 * Aggregation push-down does not work without aggcombinefn. That
+		 * field is optional, so check whether this particular aggregate
+		 * supports partial aggregation.
+ */
+ if (!OidIsValid(aggref->aggcombinefn))
+ {
+ root->grouped_var_list = NIL;
+ break;
+ }
+
+ /* Does GroupedVarInfo for this aggregate already exist? */
+ exists = false;
+ foreach(lc2, root->grouped_var_list)
+ {
+ gvi = lfirst_node(GroupedVarInfo, lc2);
+
+ if (equal(expr, gvi->gvexpr))
+ {
+ exists = true;
+ break;
+ }
+ }
+
+		/* Construct a new GroupedVarInfo if one does not exist yet. */
+ if (!exists)
+ {
+ Relids relids;
+
+ gvi = makeNode(GroupedVarInfo);
+ gvi->gvid = list_length(root->grouped_var_list);
+ gvi->gvexpr = (Expr *) copyObject(aggref);
+ gvi->agg_partial = copyObject(aggref);
+ mark_partial_aggref(gvi->agg_partial, AGGSPLIT_INITIAL_SERIAL);
+
+ /* Find out where the aggregate should be evaluated. */
+ relids = pull_varnos((Node *) aggref);
+ if (!bms_is_empty(relids))
+ gvi->gv_eval_at = relids;
+ else
+ gvi->gv_eval_at = NULL;
+
+ root->grouped_var_list = lappend(root->grouped_var_list, gvi);
+ }
+ }
+
+ list_free(tlist_exprs);
+}
+
+/*
+ * Create GroupedVarInfo for each expression usable as grouping key.
+ *
+ * In addition to the expressions of the query targetlist, group_pathkeys is
+ * also considered a source of grouping expressions. That increases the
+ * chance of getting the relation output grouped.
+ */
+static void
+create_grouping_expr_grouped_var_infos(PlannerInfo *root)
+{
+ ListCell *l1,
+ *l2;
+ List *exprs = NIL;
+ List *sortgrouprefs = NIL;
+
+ /*
+ * Make sure GroupedVarInfo exists for each expression usable as grouping
+ * key.
+ */
+ foreach(l1, root->parse->groupClause)
+ {
+ SortGroupClause *sgClause;
+ TargetEntry *te;
+ Index sortgroupref;
+
+ sgClause = lfirst_node(SortGroupClause, l1);
+ te = get_sortgroupclause_tle(sgClause, root->processed_tlist);
+ sortgroupref = te->ressortgroupref;
+
+ if (sortgroupref == 0)
+ continue;
+
+ /*
+		 * A non-zero sortgroupref does not necessarily imply a grouping
+		 * expression: data can also be sorted by an aggregate.
+ */
+ if (IsA(te->expr, Aggref))
+ continue;
+
+ exprs = lappend(exprs, te->expr);
+ sortgrouprefs = lappend_int(sortgrouprefs, sortgroupref);
+ }
+
+ /*
+ * Construct GroupedVarInfo for each expression.
+ */
+ forboth(l1, exprs, l2, sortgrouprefs)
+ {
+ Expr *expr = (Expr *) lfirst(l1);
+ int sortgroupref = lfirst_int(l2);
+ GroupedVarInfo *gvi = makeNode(GroupedVarInfo);
+
+ gvi->gvid = list_length(root->grouped_var_list);
+ gvi->gvexpr = (Expr *) copyObject(expr);
+ gvi->sortgroupref = sortgroupref;
+
+ /* Find out where the expression should be evaluated. */
+ gvi->gv_eval_at = pull_varnos((Node *) expr);
+
+ root->grouped_var_list = lappend(root->grouped_var_list, gvi);
+ }
+}
+
+/*
+ * Take a flat copy of an already initialized RelOptInfo and process child rels
+ * recursively.
+ *
+ * Flat copy ensures that we do not miss any information that the non-grouped
+ * rel already contains. XXX Do we need to copy any Node field?
+ *
+ * Two calls are expected per relation: the first with agg_kind equal to
+ * REL_AGG_KIND_PARTIAL, the second with REL_AGG_KIND_SIMPLE.
+ *
+ * TODO The function only produces grouped rels; the name should reflect it
+ * (create_grouped_rel() ?).
+ */
+static RelOptInfo *
+copy_simple_rel(PlannerInfo *root, RelOptInfo *rel, RelAggKind agg_kind)
+{
+ Index relid = rel->relid;
+ RangeTblEntry *rte;
+ ListCell *l;
+ List *indexlist = NIL;
+ RelOptInfo *result;
+
+ result = makeNode(RelOptInfo);
+ memcpy(result, rel, sizeof(RelOptInfo));
+
+ /*
+ * The new relation is grouped itself.
+ */
+ result->grouped = NULL;
+
+ /*
+ * The target to generate aggregation input will be initialized later.
+ */
+ result->reltarget = NULL;
+
+ /*
+ * Make sure that index paths have access to the parent rel's agg_info,
+ * which is used to indicate that the rel should produce grouped paths.
+ */
+ foreach(l, result->indexlist)
+ {
+ IndexOptInfo *src,
+ *dst;
+
+ src = lfirst_node(IndexOptInfo, l);
+ dst = makeNode(IndexOptInfo);
+ memcpy(dst, src, sizeof(IndexOptInfo));
+
+ dst->rel = result;
+ indexlist = lappend(indexlist, dst);
+ }
+ result->indexlist = indexlist;
+
+ /*
+ * This is very similar to child rel processing in build_simple_rel().
+ */
+ rte = root->simple_rte_array[relid];
+ if (rte->inh)
+ {
+ int nparts = rel->nparts;
+ int cnt_parts = 0;
+
+ if (nparts > 0)
+ result->part_rels = (RelOptInfo **)
+ palloc(sizeof(RelOptInfo *) * nparts);
+
+ foreach(l, root->append_rel_list)
+ {
+ AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);
+ RelOptInfo *childrel;
+
+ /* append_rel_list contains all append rels; ignore others */
+ if (appinfo->parent_relid != relid)
+ continue;
+
+ /*
+ * The non-grouped child rel must already exist.
+ */
+ childrel = root->simple_rel_array[appinfo->child_relid];
+ Assert(childrel != NULL);
+
+ /*
+ * Create the copies.
+ */
+ Assert(childrel->agg_info == NULL);
+ if (agg_kind == REL_AGG_KIND_PARTIAL)
+ {
+ Assert(childrel->grouped == NULL);
+
+ childrel->grouped = (RelOptGrouped *) palloc0(sizeof(RelOptGrouped));
+ childrel->grouped->needs_final_agg = copy_simple_rel(root, childrel, agg_kind);
+ }
+ else if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ Assert(childrel->grouped != NULL);
+ Assert(childrel->grouped->no_final_agg == NULL);
+ childrel->grouped->no_final_agg = copy_simple_rel(root, childrel, agg_kind);
+ }
+ else
+ Assert(false);
+
+ /* Nothing more to do for an unpartitioned table. */
+ if (!rel->part_scheme)
+ continue;
+
+ Assert(cnt_parts < nparts);
+ result->part_rels[cnt_parts] = childrel;
+ cnt_parts++;
+ }
+
+ /* We should have seen all the child partitions. */
+ Assert(cnt_parts == nparts);
+ }
+
+ return result;
+}
/*****************************************************************************
*
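
The initsplan.c additions above rely on three new planner data structures
whose definitions live in the header parts of the patch series, not shown in
this excerpt. Sketches with just the fields the code above actually touches
(assumed layouts, reconstructed from the call sites):

typedef enum RelAggKind
{
    REL_AGG_KIND_NONE,          /* no aggregation pushed down */
    REL_AGG_KIND_PARTIAL,       /* pushed-down Agg emits partial results;
                                 * a final Agg is still needed above
                                 * (AGGSPLIT_INITIAL_SERIAL) */
    REL_AGG_KIND_SIMPLE         /* pushed-down Agg is already complete
                                 * (AGGSPLIT_SIMPLE) */
} RelAggKind;

typedef struct RelOptGrouped
{
    RelOptInfo *needs_final_agg;    /* grouped rel for REL_AGG_KIND_PARTIAL */
    RelOptInfo *no_final_agg;       /* grouped rel for REL_AGG_KIND_SIMPLE */
} RelOptGrouped;

typedef struct GroupedVarInfo
{
    NodeTag     type;
    Index       gvid;           /* ordinal position in grouped_var_list */
    Expr       *gvexpr;         /* the Aggref or grouping expression */
    Aggref     *agg_partial;    /* partially-aggregated Aggref; set for
                                 * aggregates only */
    Index       sortgroupref;   /* set for grouping expressions only */
    Relids      gv_eval_at;     /* relids at which the expression can be
                                 * evaluated */
} GroupedVarInfo;

The AGGSPLIT correspondence follows from create_agg_sorted_path() and
create_agg_hashed_path() later in this patch.
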
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index b05adc70c4..0ca5d6ea0b 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -43,6 +43,8 @@
* (this is NOT necessarily root->parse->targetList!)
* qp_callback is a function to compute query_pathkeys once it's safe to do so
* qp_extra is optional extra data to pass to qp_callback
+ * *partially_grouped may receive a relation that contains a partial
+ * aggregate anywhere in the join tree.
*
* Note: the PlannerInfo node also includes a query_pathkeys field, which
* tells query_planner the sort order that is desired in the final output
@@ -66,6 +68,8 @@ query_planner(PlannerInfo *root, List *tlist,
*/
if (parse->jointree->fromlist == NIL)
{
+ RelOptInfo *final_rel;
+
/* We need a dummy joinrel to describe the empty set of baserels */
final_rel = build_empty_join_rel(root);
@@ -114,6 +118,7 @@ query_planner(PlannerInfo *root, List *tlist,
root->full_join_clauses = NIL;
root->join_info_list = NIL;
root->placeholder_list = NIL;
+ root->grouped_var_list = NIL;
root->fkey_list = NIL;
root->initial_rels = NIL;
@@ -232,6 +237,16 @@ query_planner(PlannerInfo *root, List *tlist,
extract_restriction_or_clauses(root);
/*
+ * If the query result can be grouped, check if any grouping can be
+	 * performed below the top-level join. If so, set up root->grouped_var_list
+	 * and create RelOptInfos for the base relations capable of doing the
+	 * grouping.
+ *
+ * The base relations should be fully initialized now, so that we have
+ * enough info to decide whether grouping is possible.
+ */
+ add_grouped_base_rels_to_query(root);
+
+ /*
* We should now have size estimates for every actual table involved in
* the query, and we also know which if any have been deleted from the
* query by join removal; so we can compute total_table_pages.
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index fd45c9767d..da8ac3c2d1 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -133,9 +133,6 @@ static double get_number_of_groups(PlannerInfo *root,
double path_rows,
grouping_sets_data *gd,
List *target_list);
-static Size estimate_hashagg_tablesize(Path *path,
- const AggClauseCosts *agg_costs,
- double dNumGroups);
static RelOptInfo *create_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
PathTarget *target,
@@ -2044,6 +2041,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
grouping_target_parallel_safe,
&agg_costs,
gset_data);
+
/* Fix things up if grouping_target contains SRFs */
if (parse->hasTargetSRFs)
adjust_paths_for_srfs(root, current_rel,
@@ -3640,40 +3638,6 @@ get_number_of_groups(PlannerInfo *root,
}
/*
- * estimate_hashagg_tablesize
- * estimate the number of bytes that a hash aggregate hashtable will
- * require based on the agg_costs, path width and dNumGroups.
- *
- * XXX this may be over-estimating the size now that hashagg knows to omit
- * unneeded columns from the hashtable. Also for mixed-mode grouping sets,
- * grouping columns not in the hashed set are counted here even though hashagg
- * won't store them. Is this a problem?
- */
-static Size
-estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
- double dNumGroups)
-{
- Size hashentrysize;
-
- /* Estimate per-hash-entry space at tuple width... */
- hashentrysize = MAXALIGN(path->pathtarget->width) +
- MAXALIGN(SizeofMinimalTupleHeader);
-
- /* plus space for pass-by-ref transition values... */
- hashentrysize += agg_costs->transitionSpace;
- /* plus the per-hash-entry overhead */
- hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
-
- /*
- * Note that this disregards the effect of fill-factor and growth policy
- * of the hash-table. That's probably ok, given default the default
- * fill-factor is relatively high. It'd be hard to meaningfully factor in
- * "double-in-size" growth policies here.
- */
- return hashentrysize * dNumGroups;
-}
-
-/*
* create_grouping_paths
*
* Build a new upperrel containing Paths for grouping and/or aggregation.
@@ -3720,6 +3684,7 @@ create_grouping_paths(PlannerInfo *root,
{
int flags = 0;
GroupPathExtraData extra;
+ List *agg_pushdown_paths = NIL;
/*
* Determine whether it's possible to perform sort-based
@@ -3787,6 +3752,39 @@ create_grouping_paths(PlannerInfo *root,
create_ordinary_grouping_paths(root, input_rel, grouped_rel,
agg_costs, gd, &extra,
&partially_grouped_rel);
+
+ /*
+		 * Process paths generated by the aggregation push-down feature. These
+		 * were produced with REL_AGG_KIND_SIMPLE.
+ */
+ if (input_rel->grouped && input_rel->grouped->no_final_agg)
+ {
+ RelOptInfo *agg_pushdown_rel;
+ ListCell *lc;
+
+ agg_pushdown_rel = input_rel->grouped->no_final_agg;
+ agg_pushdown_paths = agg_pushdown_rel->pathlist;
+
+ /*
+ * See create_grouped_path().
+ */
+ Assert(agg_pushdown_rel->partial_pathlist == NIL);
+
+ foreach(lc, agg_pushdown_paths)
+ {
+ Path *path = (Path *) lfirst(lc);
+
+ /*
+				 * The REL_AGG_KIND_SIMPLE strategy currently turns an append
+				 * rel into a dummy rel; see the comment in
+				 * set_append_rel_pathlist().
+ * XXX Can we eliminate this situation earlier?
+ */
+ if (IS_DUMMY_PATH(path))
+ continue;
+
+ add_path(grouped_rel, path);
+ }
+ }
}
set_cheapest(grouped_rel);
@@ -3912,7 +3910,8 @@ create_degenerate_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
0,
false,
NIL,
- -1);
+ -1,
+ REL_AGG_KIND_NONE);
}
else
{
@@ -3951,6 +3950,8 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
RelOptInfo *partially_grouped_rel = NULL;
double dNumGroups;
PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
+ RelOptInfo *grouped_input_rel = NULL;
+ bool agg_push_down_paths = false;
/*
* If this is the topmost grouping relation or if the parent relation is
@@ -3983,20 +3984,39 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
}
/*
+	 * Retrieve the grouped rel created by the aggregation push-down feature.
+	 * The REL_AGG_KIND_PARTIAL option is responsible for this one; its paths
+	 * still need a final aggregation step above.
+ */
+ if (input_rel->grouped)
+ grouped_input_rel = input_rel->grouped->needs_final_agg;
+
+ /*
* Before generating paths for grouped_rel, we first generate any possible
* partially grouped paths; that way, later code can easily consider both
* parallel and non-parallel approaches to grouping.
+ *
+ * Partially grouped paths may also result from aggregation push-down.
*/
+ if (grouped_input_rel != NULL)
+ {
+ Assert(enable_agg_pushdown);
+
+ if (grouped_input_rel->partial_pathlist != NIL ||
+ grouped_input_rel->pathlist != NIL)
+ agg_push_down_paths = true;
+ }
+
if ((extra->flags & GROUPING_CAN_PARTIAL_AGG) != 0)
{
bool force_rel_creation;
/*
- * If we're doing partitionwise aggregation at this level, force
- * creation of a partially_grouped_rel so we can add partitionwise
- * paths to it.
+ * If we're doing partitionwise aggregation at this level or if
+ * aggregation push-down took place, force creation of a
+ * partially_grouped_rel so we can add the related paths to it.
*/
- force_rel_creation = (patype == PARTITIONWISE_AGGREGATE_PARTIAL);
+ force_rel_creation = (patype == PARTITIONWISE_AGGREGATE_PARTIAL ||
+ agg_push_down_paths);
partially_grouped_rel =
create_partial_grouping_paths(root,
@@ -4005,6 +4025,44 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
gd,
extra,
force_rel_creation);
+
+ /*
+ * Process paths resulting from aggregate push-down if there are some.
+ *
+		 * This works independently of the "partitionwise features".
+ */
+ if (agg_push_down_paths &&
+ extra->patype == PARTITIONWISE_AGGREGATE_NONE)
+ {
+ ListCell *lc;
+
+ /*
+ * Gather the partial paths resulting from aggregation push-down
+			 * separately because they have a different target: aggregates are
+ * represented there by GroupedVars. The targets of Gather /
+ * GatherMerge paths must take this into account.
+ */
+ if (grouped_input_rel->partial_pathlist != NIL)
+ gather_grouping_paths(root, grouped_input_rel);
+
+ /*
+			 * If non-partial paths were generated above and/or the
+ * aggregate push-down resulted in non-partial paths, just add
+ * them all to partially_grouped_rel for common processing.
+ *
+ * The only difference is that the paths we add here have
+ * GroupedVars in their pathtarget, while ones to be added to
+ * pathlist of partially_grouped_rel above have Aggrefs. This
+ * difference will be handled later by set_upper_references().
+ */
+ foreach(lc, grouped_input_rel->pathlist)
+ {
+ Path *path = (Path *) lfirst(lc);
+
+ add_path(partially_grouped_rel, path);
+ }
+ }
}
/* Set out parameter. */
@@ -4029,10 +4087,14 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
/* Gather any partially grouped partial paths. */
if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
- {
gather_grouping_paths(root, partially_grouped_rel);
+
+ /*
+ * The non-partial paths can come either from the Gather above or from
+ * aggregate push-down.
+ */
+ if (partially_grouped_rel && partially_grouped_rel->pathlist)
set_cheapest(partially_grouped_rel);
- }
/*
* Estimate number of groups.
@@ -6839,7 +6901,7 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
*/
rel->pathlist = list_make1(create_append_path(root, rel, NIL, NIL,
NULL, 0, false, NIL,
- -1));
+ -1, REL_AGG_KIND_NONE));
rel->partial_pathlist = NIL;
set_cheapest(rel);
Assert(IS_DUMMY_REL(rel));
@@ -6963,7 +7025,8 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
/* Build new paths for this relation by appending child paths. */
if (live_children != NIL)
- add_paths_to_append_rel(root, rel, live_children);
+ add_paths_to_append_rel(root, rel, live_children,
+ REL_AGG_KIND_NONE);
}
/*
@@ -7122,8 +7185,15 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
{
Assert(partially_grouped_live_children != NIL);
+ /*
+		 * This grouping is independent of the aggregate push-down feature,
+ * which is the reason we pass REL_AGG_KIND_NONE.
+ */
+ Assert(partially_grouped_rel->agg_info == NULL);
+
add_paths_to_append_rel(root, partially_grouped_rel,
- partially_grouped_live_children);
+ partially_grouped_live_children,
+ REL_AGG_KIND_NONE);
/*
* We need call set_cheapest, since the finalization step will use the
@@ -7138,7 +7208,12 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
{
Assert(grouped_live_children != NIL);
- add_paths_to_append_rel(root, grouped_rel, grouped_live_children);
+ /*
+		 * This grouping is independent of the aggregate push-down feature,
+ * which is the reason we pass REL_AGG_KIND_NONE.
+ */
+ add_paths_to_append_rel(root, grouped_rel, grouped_live_children,
+ REL_AGG_KIND_NONE);
}
}
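
To visualize what the planner.c wiring above produces for the
REL_AGG_KIND_PARTIAL case: the pushed-down partial Agg paths land in
partially_grouped_rel and the existing finalization code adds a final Agg on
top, so the resulting plan has roughly this shape (a hand-drawn schematic,
not actual EXPLAIN output):

Finalize HashAggregate
  ->  Hash Join
        ->  Seq Scan on a
        ->  Hash
              ->  Partial HashAggregate
                    ->  Seq Scan on b

The REL_AGG_KIND_SIMPLE variant instead adds its already-complete Agg paths
straight to grouped_rel, so no aggregation node appears above the join at all.
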
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 69dd327f0c..5f2105a682 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -40,6 +40,7 @@ typedef struct
List *tlist; /* underlying target list */
int num_vars; /* number of plain Var tlist entries */
bool has_ph_vars; /* are there PlaceHolderVar entries? */
+ bool has_grp_vars; /* are there GroupedVar entries? */
bool has_non_vars; /* are there other entries? */
bool has_conv_whole_rows; /* are there ConvertRowtypeExpr
* entries encapsulating a whole-row
@@ -1739,9 +1740,74 @@ set_upper_references(PlannerInfo *root, Plan *plan, int rtoffset)
indexed_tlist *subplan_itlist;
List *output_targetlist;
ListCell *l;
+ List *sub_tlist_save = NIL;
+
+ if (root->grouped_var_list != NIL)
+ {
+ if (IsA(plan, Agg))
+ {
+ Agg *agg = (Agg *) plan;
+
+ if (agg->aggsplit == AGGSPLIT_FINAL_DESERIAL)
+ {
+ /*
+ * convert_combining_aggrefs could have replaced some vars
+ * with Aggref expressions representing the partial
+ * aggregation. We need to restore the same Aggrefs in the
+ * subplan targetlist, but this would break the subplan if
+				 * it's something other than the partial aggregation (i.e. the
+ * partial aggregation takes place lower in the plan tree). So
+ * we'll eventually need to restore the current
+ * subplan->targetlist.
+ */
+ if (!IsA(subplan, Agg))
+ sub_tlist_save = subplan->targetlist;
+#ifdef USE_ASSERT_CHECKING
+ else
+ Assert(((Agg *) subplan)->aggsplit == AGGSPLIT_INITIAL_SERIAL);
+#endif /* USE_ASSERT_CHECKING */
+
+ /*
+ * Restore the aggregate expressions that we might have
+ * removed when planning for aggregation at base relation
+ * level.
+ */
+ subplan->targetlist =
+ replace_grouped_vars_with_aggrefs(root, subplan->targetlist);
+ }
+ else if (agg->aggsplit == AGGSPLIT_SIMPLE)
+ {
+ /*
+ * Similarly, process paths generated due to
+ * REL_AGG_KIND_SIMPLE.
+ */
+ Assert(!IsA(subplan, Agg));
+
+ sub_tlist_save = subplan->targetlist;
+ subplan->targetlist =
+ replace_grouped_vars_with_aggrefs(root, subplan->targetlist);
+ }
+ }
+ else if (IsA(plan, Result))
+ {
+ /*
+ * Result can contain Aggrefs that we need to convert.
+ */
+ sub_tlist_save = subplan->targetlist;
+ subplan->targetlist =
+ replace_grouped_vars_with_aggrefs(root, subplan->targetlist);
+ }
+ }
subplan_itlist = build_tlist_index(subplan->targetlist);
+ /*
+	 * The replacement of GroupedVars by Aggrefs was only needed for the index
+ * build.
+ */
+ if (sub_tlist_save != NIL)
+ subplan->targetlist = sub_tlist_save;
+
output_targetlist = NIL;
foreach(l, plan->targetlist)
{
@@ -1996,6 +2062,7 @@ build_tlist_index(List *tlist)
itlist->tlist = tlist;
itlist->has_ph_vars = false;
+ itlist->has_grp_vars = false;
itlist->has_non_vars = false;
itlist->has_conv_whole_rows = false;
@@ -2016,6 +2083,8 @@ build_tlist_index(List *tlist)
}
else if (tle->expr && IsA(tle->expr, PlaceHolderVar))
itlist->has_ph_vars = true;
+ else if (tle->expr && IsA(tle->expr, GroupedVar))
+ itlist->has_grp_vars = true;
else if (is_converted_whole_row_reference((Node *) tle->expr))
itlist->has_conv_whole_rows = true;
else
@@ -2299,6 +2368,31 @@ fix_join_expr_mutator(Node *node, fix_join_expr_context *context)
/* No referent found for Var */
elog(ERROR, "variable not found in subplan target lists");
}
+ if (IsA(node, GroupedVar))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+
+ /* See if the GroupedVar has bubbled up from a lower plan node */
+ if (context->outer_itlist && context->outer_itlist->has_grp_vars)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) gvar,
+ context->outer_itlist,
+ OUTER_VAR);
+ if (newvar)
+ return (Node *) newvar;
+ }
+ if (context->inner_itlist && context->inner_itlist->has_grp_vars)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) gvar,
+ context->inner_itlist,
+ INNER_VAR);
+ if (newvar)
+ return (Node *) newvar;
+ }
+
+ /* No referent found for GroupedVar */
+ elog(ERROR, "grouped variable not found in subplan target lists");
+ }
if (IsA(node, PlaceHolderVar))
{
PlaceHolderVar *phv = (PlaceHolderVar *) node;
@@ -2461,7 +2555,8 @@ fix_upper_expr_mutator(Node *node, fix_upper_expr_context *context)
/* If no match, just fall through to process it normally */
}
/* Try matching more complex expressions too, if tlist has any */
- if (context->subplan_itlist->has_non_vars ||
+ if (context->subplan_itlist->has_grp_vars ||
+ context->subplan_itlist->has_non_vars ||
(context->subplan_itlist->has_conv_whole_rows &&
is_converted_whole_row_reference(node)))
{
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index c3f46a26c3..daf3118810 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -911,6 +911,7 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
memset(subroot->upper_rels, 0, sizeof(subroot->upper_rels));
memset(subroot->upper_targets, 0, sizeof(subroot->upper_targets));
subroot->processed_tlist = NIL;
+ subroot->max_sortgroupref = 0;
subroot->grouping_map = NULL;
subroot->minmax_aggs = NIL;
subroot->qual_security_level = 0;
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 7d75e1eda9..1d4452b57f 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -656,7 +656,8 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
* Append the child results together.
*/
path = (Path *) create_append_path(root, result_rel, pathlist, NIL,
- NULL, 0, false, NIL, -1);
+ NULL, 0, false, NIL, -1,
+ REL_AGG_KIND_NONE);
/*
* For UNION ALL, we just need the Append path. For UNION, need to add
@@ -712,7 +713,7 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
ppath = (Path *)
create_append_path(root, result_rel, NIL, partial_pathlist,
NULL, parallel_workers, enable_parallel_append,
- NIL, -1);
+ NIL, -1, REL_AGG_KIND_NONE);
ppath = (Path *)
create_gather_path(root, result_rel, ppath,
result_rel->reltarget, NULL, NULL);
@@ -822,7 +823,8 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
* Append the child results together.
*/
path = (Path *) create_append_path(root, result_rel, pathlist, NIL,
- NULL, 0, false, NIL, -1);
+ NULL, 0, false, NIL, -1,
+ REL_AGG_KIND_NONE);
/* Identify the grouping semantics */
groupList = generate_setop_grouplist(op, tlist);
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index dbf9adcdac..1b813ccc0a 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -27,6 +27,7 @@
#include "optimizer/planmain.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+/* TODO Remove this if create_grouped_path ends up in another module. */
#include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "parser/parsetree.h"
@@ -57,7 +58,6 @@ static List *reparameterize_pathlist_by_child(PlannerInfo *root,
List *pathlist,
RelOptInfo *child_rel);
-
/*****************************************************************************
* MISC. PATH UTILITIES
*****************************************************************************/
@@ -243,6 +243,7 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
void
set_cheapest(RelOptInfo *parent_rel)
{
+ bool grouped = parent_rel->agg_info != NULL;
Path *cheapest_startup_path;
Path *cheapest_total_path;
Path *best_param_path;
@@ -252,7 +253,22 @@ set_cheapest(RelOptInfo *parent_rel)
Assert(IsA(parent_rel, RelOptInfo));
if (parent_rel->pathlist == NIL)
- elog(ERROR, "could not devise a query plan for the given query");
+ {
+ if (!grouped)
+ elog(ERROR, "could not devise a query plan for the given query");
+ else
+ {
+ /*
+ * Creation of grouped paths is not guaranteed. Currently this
+			 * happens if REL_AGG_KIND_SIMPLE is applied to an append relation.
+ */
+ if (IS_SIMPLE_REL(parent_rel) || IS_JOIN_REL(parent_rel))
+ mark_dummy_rel(parent_rel);
+ else
+ Assert(false);
+ return;
+ }
+ }
cheapest_startup_path = cheapest_total_path = best_param_path = NULL;
parameterized_paths = NIL;
@@ -955,10 +971,15 @@ create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
Relids required_outer, int parallel_workers)
{
Path *pathnode = makeNode(Path);
+ bool grouped = rel->agg_info != NULL;
pathnode->pathtype = T_SeqScan;
pathnode->parent = rel;
- pathnode->pathtarget = rel->reltarget;
+	/* For a grouped relation, generate only the aggregation input. */
+ if (!grouped)
+ pathnode->pathtarget = rel->reltarget;
+ else
+ pathnode->pathtarget = rel->agg_info->input;
pathnode->param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->parallel_aware = parallel_workers > 0 ? true : false;
@@ -1038,10 +1059,15 @@ create_index_path(PlannerInfo *root,
RelOptInfo *rel = index->rel;
List *indexquals,
*indexqualcols;
+ bool grouped = rel->agg_info != NULL;
pathnode->path.pathtype = indexonly ? T_IndexOnlyScan : T_IndexScan;
pathnode->path.parent = rel;
- pathnode->path.pathtarget = rel->reltarget;
+	/* For a grouped relation, generate only the aggregation input. */
+ if (!grouped)
+ pathnode->path.pathtarget = rel->reltarget;
+ else
+ pathnode->path.pathtarget = rel->agg_info->input;
pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->path.parallel_aware = false;
@@ -1189,10 +1215,15 @@ create_tidscan_path(PlannerInfo *root, RelOptInfo *rel, List *tidquals,
Relids required_outer)
{
TidPath *pathnode = makeNode(TidPath);
+ bool grouped = rel->agg_info != NULL;
pathnode->path.pathtype = T_TidScan;
pathnode->path.parent = rel;
- pathnode->path.pathtarget = rel->reltarget;
+	/* For a grouped relation, generate only the aggregation input. */
+ if (!grouped)
+ pathnode->path.pathtarget = rel->reltarget;
+ else
+ pathnode->path.pathtarget = rel->agg_info->input;
pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->path.parallel_aware = false;
@@ -1221,7 +1252,8 @@ create_append_path(PlannerInfo *root,
List *subpaths, List *partial_subpaths,
Relids required_outer,
int parallel_workers, bool parallel_aware,
- List *partitioned_rels, double rows)
+ List *partitioned_rels, double rows,
+ RelAggKind agg_kind)
{
AppendPath *pathnode = makeNode(AppendPath);
ListCell *l;
@@ -1229,8 +1261,24 @@ create_append_path(PlannerInfo *root,
Assert(!parallel_aware || parallel_workers > 0);
pathnode->path.pathtype = T_Append;
+
+ if (agg_kind == REL_AGG_KIND_NONE)
+ pathnode->path.pathtarget = rel->reltarget;
+ else
+ {
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ rel = rel->grouped->no_final_agg;
+ pathnode->path.pathtarget = rel->agg_info->target_simple;
+ }
+ else if (agg_kind == REL_AGG_KIND_PARTIAL)
+ {
+ rel = rel->grouped->needs_final_agg;
+ pathnode->path.pathtarget = rel->agg_info->target_partial;
+ }
+ }
+
pathnode->path.parent = rel;
- pathnode->path.pathtarget = rel->reltarget;
/*
* When generating an Append path for a partitioned table, there may be
@@ -1341,11 +1389,13 @@ append_startup_cost_compare(const void *a, const void *b)
/*
* create_merge_append_path
* Creates a path corresponding to a MergeAppend plan, returning the
- * pathnode.
+ * pathnode. The target can be supplied by the caller; if NULL is passed,
+ * the field is set to rel->reltarget.
*/
MergeAppendPath *
create_merge_append_path(PlannerInfo *root,
RelOptInfo *rel,
+ PathTarget *target,
List *subpaths,
List *pathkeys,
Relids required_outer,
@@ -1358,7 +1408,7 @@ create_merge_append_path(PlannerInfo *root,
pathnode->path.pathtype = T_MergeAppend;
pathnode->path.parent = rel;
- pathnode->path.pathtarget = rel->reltarget;
+ pathnode->path.pathtarget = target ? target : rel->reltarget;
pathnode->path.param_info = get_appendrel_parampathinfo(rel,
required_outer);
pathnode->path.parallel_aware = false;
@@ -1528,7 +1578,9 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
MemoryContext oldcontext;
int numCols;
- /* Caller made a mistake if subpath isn't cheapest_total ... */
+ /*
+ * Caller made a mistake if subpath isn't cheapest_total.
+ */
Assert(subpath == rel->cheapest_total_path);
Assert(subpath->parent == rel);
/* ... or if SpecialJoinInfo is the wrong one */
@@ -2149,6 +2201,7 @@ calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path)
* relations.
*
* 'joinrel' is the join relation.
+ * 'target' is the join path target
* 'jointype' is the type of join required
* 'workspace' is the result from initial_cost_nestloop
* 'extra' contains various information about the join
@@ -2163,6 +2216,7 @@ calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path)
NestPath *
create_nestloop_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -2203,7 +2257,7 @@ create_nestloop_path(PlannerInfo *root,
pathnode->path.pathtype = T_NestLoop;
pathnode->path.parent = joinrel;
- pathnode->path.pathtarget = joinrel->reltarget;
+ pathnode->path.pathtarget = target;
pathnode->path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
@@ -2235,6 +2289,7 @@ create_nestloop_path(PlannerInfo *root,
* two relations
*
* 'joinrel' is the join relation
+ * 'target' is the join path target
* 'jointype' is the type of join required
* 'workspace' is the result from initial_cost_mergejoin
* 'extra' contains various information about the join
@@ -2251,6 +2306,7 @@ create_nestloop_path(PlannerInfo *root,
MergePath *
create_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -2267,7 +2323,7 @@ create_mergejoin_path(PlannerInfo *root,
pathnode->jpath.path.pathtype = T_MergeJoin;
pathnode->jpath.path.parent = joinrel;
- pathnode->jpath.path.pathtarget = joinrel->reltarget;
+ pathnode->jpath.path.pathtarget = target;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
@@ -2303,6 +2359,7 @@ create_mergejoin_path(PlannerInfo *root,
* Creates a pathnode corresponding to a hash join between two relations.
*
* 'joinrel' is the join relation
+ * 'target' is the join path target
* 'jointype' is the type of join required
* 'workspace' is the result from initial_cost_hashjoin
* 'extra' contains various information about the join
@@ -2317,6 +2374,7 @@ create_mergejoin_path(PlannerInfo *root,
HashPath *
create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -2331,7 +2389,7 @@ create_hashjoin_path(PlannerInfo *root,
pathnode->jpath.path.pathtype = T_HashJoin;
pathnode->jpath.path.parent = joinrel;
- pathnode->jpath.path.pathtarget = joinrel->reltarget;
+ pathnode->jpath.path.pathtarget = target;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
@@ -2413,8 +2471,8 @@ create_projection_path(PlannerInfo *root,
* Note: in the latter case, create_projection_plan has to recheck our
* conclusion; see comments therein.
*/
- if (is_projection_capable_path(subpath) ||
- equal(oldtarget->exprs, target->exprs))
+ if ((is_projection_capable_path(subpath) ||
+ equal(oldtarget->exprs, target->exprs)))
{
/* No separate Result node needed */
pathnode->dummypp = true;
@@ -2799,8 +2857,7 @@ create_agg_path(PlannerInfo *root,
pathnode->path.pathtype = T_Agg;
pathnode->path.parent = rel;
pathnode->path.pathtarget = target;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
@@ -2833,6 +2890,188 @@ create_agg_path(PlannerInfo *root,
}
/*
+ * Apply AGG_SORTED aggregation to subpath if its output is suitably sorted.
+ *
+ * check_pathkeys can be passed FALSE if the function was already called for
+ * the given index --- since the target should not change, we can skip the
+ * sortedness check during subsequent calls.
+ *
+ * agg_info contains both aggregate and grouping expressions.
+ *
+ * NULL is returned if sorting of subpath output is not suitable.
+ */
+AggPath *
+create_agg_sorted_path(PlannerInfo *root, Path *subpath,
+ bool check_pathkeys, double input_rows,
+ RelAggKind agg_kind)
+{
+ RelOptInfo *rel;
+ Node *agg_exprs;
+ AggSplit aggsplit;
+ AggClauseCosts agg_costs;
+ PathTarget *target;
+ double dNumGroups;
+ AggPath *result = NULL;
+ RelAggInfo *agg_info;
+
+ rel = subpath->parent;
+ agg_info = rel->agg_info;
+ Assert(agg_info != NULL);
+
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ aggsplit = AGGSPLIT_SIMPLE;
+ agg_exprs = (Node *) agg_info->agg_exprs_simple;
+ target = agg_info->target_simple;
+ }
+ else if (agg_kind == REL_AGG_KIND_PARTIAL)
+ {
+ aggsplit = AGGSPLIT_INITIAL_SERIAL;
+ agg_exprs = (Node *) agg_info->agg_exprs_partial;
+ target = agg_info->target_partial;
+ }
+ else
+ Assert(false);
+
+ if (subpath->pathkeys == NIL)
+ return NULL;
+
+ if (!grouping_is_sortable(root->parse->groupClause))
+ return NULL;
+
+ if (check_pathkeys)
+ {
+ ListCell *lc1;
+ List *key_subset = NIL;
+
+ /*
+ * Find all query pathkeys that our relation does affect.
+ */
+ foreach(lc1, root->group_pathkeys)
+ {
+ PathKey *gkey = castNode(PathKey, lfirst(lc1));
+ ListCell *lc2;
+
+ foreach(lc2, subpath->pathkeys)
+ {
+ PathKey *skey = castNode(PathKey, lfirst(lc2));
+
+ if (skey == gkey)
+ {
+ key_subset = lappend(key_subset, gkey);
+ break;
+ }
+ }
+ }
+
+ if (key_subset == NIL)
+ return NULL;
+
+ /* Check if AGG_SORTED is useful for the whole query. */
+ if (!pathkeys_contained_in(key_subset, subpath->pathkeys))
+ return NULL;
+ }
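+
+	/*
+	 * Worked example (hypothetical names, not part of the patch): given
+	 * GROUP BY b.id, a.x and a subpath of rel "b" sorted on b.id, key_subset
+	 * ends up holding only the b.id pathkey --- the only group pathkey this
+	 * rel affects --- and the pathkeys_contained_in() test passes, so
+	 * AGG_SORTED is usable for "b" even though it cannot evaluate a.x.
+	 */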
+
+ MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+ get_agg_clause_costs(root, (Node *) agg_exprs, aggsplit, &agg_costs);
+
+ Assert(agg_info->group_exprs != NIL);
+ dNumGroups = estimate_num_groups(root, agg_info->group_exprs,
+ input_rows, NULL);
+
+ Assert(agg_info->group_clauses != NIL);
+ result = create_agg_path(root, rel, subpath, target,
+ AGG_SORTED, aggsplit,
+ agg_info->group_clauses, NIL, &agg_costs,
+ dNumGroups);
+
+ return result;
+}
+
+/*
+ * Apply AGG_HASHED aggregation to subpath.
+ *
+ * Arguments have the same meaning as those of create_agg_sorted_path.
+ */
+AggPath *
+create_agg_hashed_path(PlannerInfo *root, Path *subpath,
+ double input_rows, RelAggKind agg_kind)
+{
+ RelOptInfo *rel;
+ bool can_hash;
+ Node *agg_exprs;
+ AggSplit aggsplit;
+ AggClauseCosts agg_costs;
+ PathTarget *target;
+ double dNumGroups;
+ Size hashaggtablesize;
+ Query *parse = root->parse;
+ AggPath *result = NULL;
+ RelAggInfo *agg_info;
+
+ rel = subpath->parent;
+ agg_info = rel->agg_info;
+ Assert(agg_info != NULL);
+
+ if (agg_kind == REL_AGG_KIND_SIMPLE)
+ {
+ aggsplit = AGGSPLIT_SIMPLE;
+ agg_exprs = (Node *) agg_info->agg_exprs_simple;
+ target = agg_info->target_simple;
+ }
+ else if (agg_kind == REL_AGG_KIND_PARTIAL)
+ {
+ aggsplit = AGGSPLIT_INITIAL_SERIAL;
+ agg_exprs = (Node *) agg_info->agg_exprs_partial;
+ target = agg_info->target_partial;
+ }
+ else
+ Assert(false);
+
+ MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+ get_agg_clause_costs(root, agg_exprs, aggsplit, &agg_costs);
+
+ can_hash = (parse->groupClause != NIL &&
+ parse->groupingSets == NIL &&
+ agg_costs.numOrderedAggs == 0 &&
+ grouping_is_hashable(parse->groupClause));
+
+ if (can_hash)
+ {
+ Assert(agg_info->group_exprs != NIL);
+ dNumGroups = estimate_num_groups(root, agg_info->group_exprs,
+ input_rows, NULL);
+
+ hashaggtablesize = estimate_hashagg_tablesize(subpath, &agg_costs,
+ dNumGroups);
+
+ if (hashaggtablesize < work_mem * 1024L)
+ {
+ /*
+ * Create the partial aggregation path.
+ */
+ Assert(agg_info->group_clauses != NIL);
+
+ result = create_agg_path(root, rel, subpath,
+ target,
+ AGG_HASHED,
+ aggsplit,
+ agg_info->group_clauses, NIL,
+ &agg_costs,
+ dNumGroups);
+
+ /*
+ * The agg path should require no fewer parameters than the plain
+ * one.
+ */
+ result->path.param_info = subpath->param_info;
+ }
+ }
+
+ return result;
+}
+
+/*
* create_groupingsets_path
* Creates a pathnode that represents performing GROUPING SETS aggregation
*
@@ -3512,7 +3751,7 @@ create_limit_path(PlannerInfo *root, RelOptInfo *rel,
Path *
reparameterize_path(PlannerInfo *root, Path *path,
Relids required_outer,
- double loop_count)
+ double loop_count, RelAggKind agg_kind)
{
RelOptInfo *rel = path->parent;
@@ -3580,7 +3819,8 @@ reparameterize_path(PlannerInfo *root, Path *path,
spath = reparameterize_path(root, spath,
required_outer,
- loop_count);
+ loop_count,
+ agg_kind);
if (spath == NULL)
return NULL;
/* We have to re-split the regular and partial paths */
@@ -3596,7 +3836,8 @@ reparameterize_path(PlannerInfo *root, Path *path,
apath->path.parallel_workers,
apath->path.parallel_aware,
apath->partitioned_rels,
- -1);
+ -1,
+ agg_kind);
}
default:
break;
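
No caller of the two new Agg-path constructors appears in this excerpt, so
here is a hypothetical call-site sketch, just to illustrate the intended
usage; everything except create_agg_sorted_path(), create_agg_hashed_path()
and add_path() is made up:

static void
add_grouped_agg_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
                      Path *subpath, RelAggKind agg_kind)
{
    AggPath    *agg_path;

    /* Sorted aggregation, if the subpath's ordering permits it. */
    agg_path = create_agg_sorted_path(root, subpath, true,
                                      subpath->rows, agg_kind);
    if (agg_path != NULL)
        add_path(grouped_rel, (Path *) agg_path);

    /* Hashed aggregation, if the grouping clause is hashable. */
    agg_path = create_agg_hashed_path(root, subpath,
                                      subpath->rows, agg_kind);
    if (agg_path != NULL)
        add_path(grouped_rel, (Path *) agg_path);
}

Here grouped_rel would be subpath->parent, i.e. the grouped shadow rel whose
agg_info both constructors consult.
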
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index c69740eda6..114a3445db 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -17,6 +17,7 @@
#include <limits.h>
#include "miscadmin.h"
+#include "catalog/pg_constraint.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
@@ -26,6 +27,8 @@
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
#include "optimizer/tlist.h"
+#include "optimizer/var.h"
+#include "parser/parse_oper.h"
#include "partitioning/partbounds.h"
#include "utils/hsearch.h"
@@ -57,6 +60,9 @@ static void add_join_rel(PlannerInfo *root, RelOptInfo *joinrel);
static void build_joinrel_partition_info(RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *inner_rel,
List *restrictlist, JoinType jointype);
+static void init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+ PathTarget *target, PathTarget *agg_input,
+ List *gvis, List **group_exprs_extra_p);
/*
@@ -72,7 +78,10 @@ setup_simple_rel_arrays(PlannerInfo *root)
/* Arrays are accessed using RT indexes (1..N) */
root->simple_rel_array_size = list_length(root->parse->rtable) + 1;
- /* simple_rel_array is initialized to all NULLs */
+ /*
+ * simple_rel_array / simple_grouped_rel_array are both initialized to all
+ * NULLs
+ */
root->simple_rel_array = (RelOptInfo **)
palloc0(root->simple_rel_array_size * sizeof(RelOptInfo *));
@@ -148,7 +157,14 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
rel->reloptkind = parent ? RELOPT_OTHER_MEMBER_REL : RELOPT_BASEREL;
rel->relids = bms_make_singleton(relid);
rel->rows = 0;
- /* cheap startup cost is interesting iff not all tuples to be retrieved */
+
+ /*
+ * Cheap startup cost is interesting iff not all tuples to be retrieved.
+	 * XXX As for a grouped relation, the startup cost might be interesting for
+ * AGG_SORTED (if it can produce the ordering that matches
+ * root->query_pathkeys) but not in general (other kinds of aggregation
+ * need the whole relation). Yet it seems worth trying.
+ */
rel->consider_startup = (root->tuple_fraction > 0);
rel->consider_param_startup = false; /* might get changed later */
rel->consider_parallel = false; /* might get changed later */
@@ -162,6 +178,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
rel->cheapest_parameterized_paths = NIL;
rel->direct_lateral_relids = NULL;
rel->lateral_relids = NULL;
+ rel->agg_info = NULL;
+ rel->grouped = NULL;
rel->relid = relid;
rel->rtekind = rte->rtekind;
/* min_attr, max_attr, attr_needed, attr_widths are set below */
@@ -380,13 +398,23 @@ build_join_rel_hash(PlannerInfo *root)
RelOptInfo *
find_join_rel(PlannerInfo *root, Relids relids)
{
+ HTAB *join_rel_hash;
+ List *join_rel_list;
+
+ join_rel_hash = root->join_rel_hash;
+ join_rel_list = root->join_rel_list;
+
/*
* Switch to using hash lookup when list grows "too long". The threshold
* is arbitrary and is known only here.
*/
- if (!root->join_rel_hash && list_length(root->join_rel_list) > 32)
+ if (!join_rel_hash && list_length(join_rel_list) > 32)
+ {
build_join_rel_hash(root);
+ join_rel_hash = root->join_rel_hash;
+ }
+
/*
* Use either hashtable lookup or linear search, as appropriate.
*
@@ -395,12 +423,12 @@ find_join_rel(PlannerInfo *root, Relids relids)
* so would force relids out of a register and thus probably slow down the
* list-search case.
*/
- if (root->join_rel_hash)
+ if (join_rel_hash)
{
Relids hashkey = relids;
JoinHashEntry *hentry;
- hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
+ hentry = (JoinHashEntry *) hash_search(join_rel_hash,
&hashkey,
HASH_FIND,
NULL);
@@ -411,7 +439,7 @@ find_join_rel(PlannerInfo *root, Relids relids)
{
ListCell *l;
- foreach(l, root->join_rel_list)
+ foreach(l, join_rel_list)
{
RelOptInfo *rel = (RelOptInfo *) lfirst(l);
@@ -481,7 +509,9 @@ set_foreign_rel_properties(RelOptInfo *joinrel, RelOptInfo *outer_rel,
static void
add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
{
- /* GEQO requires us to append the new joinrel to the end of the list! */
+ /*
+ * GEQO requires us to append the new joinrel to the end of the list!
+ */
root->join_rel_list = lappend(root->join_rel_list, joinrel);
/* store it into the auxiliary hashtable if there is one. */
@@ -511,6 +541,9 @@ add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
* 'restrictlist_ptr': result variable. If not NULL, *restrictlist_ptr
* receives the list of RestrictInfo nodes that apply to this
* particular pair of joinable relations.
+ * 'grouped' forces creation of a "standalone" object, i.e. without searching
+ * the join list and without adding the result to it. The caller is
+ * responsible for setting up reltarget in that case.
*
* restrictlist_ptr makes the routine's API a little grotty, but it saves
* duplicated calculation of the restrictlist...
@@ -521,10 +554,12 @@ build_join_rel(PlannerInfo *root,
RelOptInfo *outer_rel,
RelOptInfo *inner_rel,
SpecialJoinInfo *sjinfo,
- List **restrictlist_ptr)
+ List **restrictlist_ptr,
+ bool grouped)
{
- RelOptInfo *joinrel;
+ RelOptInfo *joinrel = NULL;
List *restrictlist;
+ bool create_target = !grouped;
/* This function should be used only for join between parents. */
Assert(!IS_OTHER_REL(outer_rel) && !IS_OTHER_REL(inner_rel));
@@ -532,7 +567,8 @@ build_join_rel(PlannerInfo *root,
/*
* See if we already have a joinrel for this set of base rels.
*/
- joinrel = find_join_rel(root, joinrelids);
+ if (!grouped)
+ joinrel = find_join_rel(root, joinrelids);
if (joinrel)
{
@@ -555,11 +591,11 @@ build_join_rel(PlannerInfo *root,
joinrel->reloptkind = RELOPT_JOINREL;
joinrel->relids = bms_copy(joinrelids);
joinrel->rows = 0;
- /* cheap startup cost is interesting iff not all tuples to be retrieved */
+ /* See the comment in build_simple_rel(). */
joinrel->consider_startup = (root->tuple_fraction > 0);
joinrel->consider_param_startup = false;
joinrel->consider_parallel = false;
- joinrel->reltarget = create_empty_pathtarget();
+ joinrel->reltarget = NULL;
joinrel->pathlist = NIL;
joinrel->ppilist = NIL;
joinrel->partial_pathlist = NIL;
@@ -573,6 +609,8 @@ build_join_rel(PlannerInfo *root,
inner_rel->direct_lateral_relids);
joinrel->lateral_relids = min_join_parameterization(root, joinrel->relids,
outer_rel, inner_rel);
+ joinrel->agg_info = NULL;
+ joinrel->grouped = NULL;
joinrel->relid = 0; /* indicates not a baserel */
joinrel->rtekind = RTE_JOIN;
joinrel->min_attr = 0;
@@ -623,9 +661,13 @@ build_join_rel(PlannerInfo *root,
* and inner rels we first try to build it from. But the contents should
* be the same regardless.
*/
- build_joinrel_tlist(root, joinrel, outer_rel);
- build_joinrel_tlist(root, joinrel, inner_rel);
- add_placeholders_to_joinrel(root, joinrel, outer_rel, inner_rel);
+ if (create_target)
+ {
+ joinrel->reltarget = create_empty_pathtarget();
+ build_joinrel_tlist(root, joinrel, outer_rel);
+ build_joinrel_tlist(root, joinrel, inner_rel);
+ add_placeholders_to_joinrel(root, joinrel, outer_rel, inner_rel);
+ }
/*
* add_placeholders_to_joinrel also took care of adding the ph_lateral
@@ -662,31 +704,39 @@ build_join_rel(PlannerInfo *root,
/*
* Set estimates of the joinrel's size.
- */
- set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
- sjinfo, restrictlist);
-
- /*
- * Set the consider_parallel flag if this joinrel could potentially be
- * scanned within a parallel worker. If this flag is false for either
- * inner_rel or outer_rel, then it must be false for the joinrel also.
- * Even if both are true, there might be parallel-restricted expressions
- * in the targetlist or quals.
*
- * Note that if there are more than two rels in this relation, they could
- * be divided between inner_rel and outer_rel in any arbitrary way. We
- * assume this doesn't matter, because we should hit all the same baserels
- * and joinclauses while building up to this joinrel no matter which we
- * take; therefore, we should make the same decision here however we get
- * here.
+ * XXX The function claims to need reltarget but it does not seem to
+ * actually use it. Should we call it unconditionally so that callers of
+ * build_join_rel() do not have to care?
*/
- if (inner_rel->consider_parallel && outer_rel->consider_parallel &&
- is_parallel_safe(root, (Node *) restrictlist) &&
- is_parallel_safe(root, (Node *) joinrel->reltarget->exprs))
- joinrel->consider_parallel = true;
+ if (create_target)
+ {
+ set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
+ sjinfo, restrictlist);
+
+ /*
+ * Set the consider_parallel flag if this joinrel could potentially be
+ * scanned within a parallel worker. If this flag is false for either
+ * inner_rel or outer_rel, then it must be false for the joinrel also.
+ * Even if both are true, there might be parallel-restricted
+ * expressions in the targetlist or quals.
+ *
+ * Note that if there are more than two rels in this relation, they
+ * could be divided between inner_rel and outer_rel in any arbitrary
+ * way. We assume this doesn't matter, because we should hit all the
+ * same baserels and joinclauses while building up to this joinrel no
+ * matter which we take; therefore, we should make the same decision
+ * here however we get here.
+ */
+ if (inner_rel->consider_parallel && outer_rel->consider_parallel &&
+ is_parallel_safe(root, (Node *) restrictlist) &&
+ is_parallel_safe(root, (Node *) joinrel->reltarget->exprs))
+ joinrel->consider_parallel = true;
+ }
/* Add the joinrel to the PlannerInfo. */
- add_join_rel(root, joinrel);
+ if (!grouped)
+ add_join_rel(root, joinrel);
/*
* Also, if dynamic-programming join search is active, add the new joinrel
@@ -694,7 +744,7 @@ build_join_rel(PlannerInfo *root,
* of members should be for equality, but some of the level 1 rels might
* have been joinrels already, so we can only assert <=.
*/
- if (root->join_rel_level)
+ if (root->join_rel_level && !grouped)
{
Assert(root->join_cur_level > 0);
Assert(root->join_cur_level <= bms_num_members(joinrel->relids));
@@ -718,16 +768,19 @@ build_join_rel(PlannerInfo *root,
* 'restrictlist': list of RestrictInfo nodes that apply to this particular
* pair of joinable relations
* 'jointype' is the join type (inner, left, full, etc)
+ * 'grouped': does the join contain a partial aggregate? (If it does, the
+ * caller is responsible for setting up reltarget.)
*/
RelOptInfo *
build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
RelOptInfo *inner_rel, RelOptInfo *parent_joinrel,
List *restrictlist, SpecialJoinInfo *sjinfo,
- JoinType jointype)
+ JoinType jointype, bool grouped)
{
RelOptInfo *joinrel = makeNode(RelOptInfo);
AppendRelInfo **appinfos;
int nappinfos;
+ bool create_target = !grouped;
/* Only joins between "other" relations land here. */
Assert(IS_OTHER_REL(outer_rel) && IS_OTHER_REL(inner_rel));
@@ -735,11 +788,11 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
joinrel->reloptkind = RELOPT_OTHER_JOINREL;
joinrel->relids = bms_union(outer_rel->relids, inner_rel->relids);
joinrel->rows = 0;
- /* cheap startup cost is interesting iff not all tuples to be retrieved */
+ /* See the comment in build_simple_rel(). */
joinrel->consider_startup = (root->tuple_fraction > 0);
joinrel->consider_param_startup = false;
joinrel->consider_parallel = false;
- joinrel->reltarget = create_empty_pathtarget();
+ joinrel->reltarget = NULL;
joinrel->pathlist = NIL;
joinrel->ppilist = NIL;
joinrel->partial_pathlist = NIL;
@@ -749,6 +802,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
joinrel->cheapest_parameterized_paths = NIL;
joinrel->direct_lateral_relids = NULL;
joinrel->lateral_relids = NULL;
+ joinrel->agg_info = NULL;
+ joinrel->grouped = NULL;
joinrel->relid = 0; /* indicates not a baserel */
joinrel->rtekind = RTE_JOIN;
joinrel->min_attr = 0;
@@ -789,11 +844,15 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
/* Compute information relevant to foreign relations. */
set_foreign_rel_properties(joinrel, outer_rel, inner_rel);
- /* Build targetlist */
- build_joinrel_tlist(root, joinrel, outer_rel);
- build_joinrel_tlist(root, joinrel, inner_rel);
- /* Add placeholder variables. */
- add_placeholders_to_child_joinrel(root, joinrel, parent_joinrel);
+ if (create_target)
+ {
+ /* Build targetlist */
+ joinrel->reltarget = create_empty_pathtarget();
+ build_joinrel_tlist(root, joinrel, outer_rel);
+ build_joinrel_tlist(root, joinrel, inner_rel);
+ /* Add placeholder variables. */
+ add_placeholders_to_child_joinrel(root, joinrel, parent_joinrel);
+ }
/* Construct joininfo list. */
appinfos = find_appinfos_by_relids(root, joinrel->relids, &nappinfos);
@@ -801,7 +860,6 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
(Node *) parent_joinrel->joininfo,
nappinfos,
appinfos);
- pfree(appinfos);
/*
* Lateral relids referred in child join will be same as that referred in
@@ -828,14 +886,22 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
/* Set estimates of the child-joinrel's size. */
- set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
- sjinfo, restrictlist);
+ /* XXX See the corresponding comment in build_join_rel(). */
+ if (create_target)
+ set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
+ sjinfo, restrictlist);
- /* We build the join only once. */
- Assert(!find_join_rel(root, joinrel->relids));
+ /*
+ * We build the join only once. (Grouped joins should not exist in the
+ * list.)
+ */
+ Assert(!find_join_rel(root, joinrel->relids) || grouped);
/* Add the relation to the PlannerInfo. */
- add_join_rel(root, joinrel);
+ if (!grouped)
+ add_join_rel(root, joinrel);
+
+ pfree(appinfos);
return joinrel;
}
@@ -1768,3 +1834,662 @@ build_joinrel_partition_info(RelOptInfo *joinrel, RelOptInfo *outer_rel,
joinrel->nullable_partexprs[cnt] = nullable_partexpr;
}
}
+
+/*
+ * Check whether the relation can produce grouped paths and, if so, return
+ * the information needed to create them. The passed relation is the
+ * non-grouped one, whose reltarget has already been constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+ List *gvis;
+ List *aggregates = NIL;
+ List *grp_exprs = NIL;
+ bool found_higher_agg;
+ ListCell *lc;
+ RelAggInfo *result;
+ PathTarget *target_partial,
+ *target_simple,
+ *agg_input;
+ List *exprs_tmp;
+ List *grp_exprs_extra = NIL;
+ int i;
+ List *sortgroupclauses = NIL;
+
+ /*
+ * The function shouldn't have been called if there's no opportunity for
+ * aggregation push-down.
+ */
+ Assert(root->grouped_var_list != NIL);
+
+ /*
+ * The source relation has nothing to do with grouping.
+ */
+ Assert(rel->agg_info == NULL);
+
+ /*
+ * The current implementation of aggregation push-down cannot handle
+ * PlaceHolderVar (PHV).
+ *
+ * If we knew that the PHV should be evaluated in this target (and of
+ * course, if its expression matched some grouping expression or Aggref
+ * argument), we'd just let init_grouping_targets create GroupedVar for
+ * the corresponding expression (phexpr). On the other hand, if we knew
+ * that the PHV is evaluated below the current rel, we'd ignore it because
+ * the referencing GroupedVar would take care of propagation of the value
+ * to upper joins. (PHV whose ph_eval_at is above the current rel make the
+ * aggregation push-down impossible in any case because the partial
+ * aggregation would receive wrong input if we ignored the ph_eval_at.)
+ *
+ * The problem is that the same PHV can be evaluated in the target of the
+ * current rel or in that of lower rel --- depending on the input paths.
+ * For example, consider rel->relids = {A, B, C} and if ph_eval_at = {B,
+ * C}. Path "A JOIN (B JOIN C)" implies that the PHV is evaluated by the
+ * "(B JOIN C)", while path "(A JOIN B) JOIN C" evaluates the PHV itself.
+ */
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Expr *expr = lfirst(lc);
+
+ if (IsA(expr, PlaceHolderVar))
+ return NULL;
+ }
+
+ if (IS_SIMPLE_REL(rel))
+ {
+ RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+ /*
+ * rtekind != RTE_RELATION case is not supported yet.
+ */
+ if (rte->rtekind != RTE_RELATION)
+ return NULL;
+ }
+
+ /* Caller should only pass base relations or joins. */
+ Assert(rel->reloptkind == RELOPT_BASEREL ||
+ rel->reloptkind == RELOPT_JOINREL);
+
+ /*
+ * If any outer join can set the attribute value to NULL, the Agg plan
+ * would receive different input at the base rel level.
+ *
+ * XXX For RELOPT_JOINREL, do not return if all the joins that can set any
+ * entry of this rel's grouped target to NULL are provably below rel. (It's
+ * ok if rel is one of these joins.) Do we need to postpone this check until
+ * the grouped target is available, and let init_grouping_targets take care
+ * of it?
+ */
+ if (bms_overlap(rel->relids, root->nullable_baserels))
+ return NULL;
+
+ /*
+ * Use equivalence classes to generate additional grouping expressions for
+ * the current rel. Without these we might not be able to apply
+ * aggregation to the relation result set.
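+ *
+ * For example, given "GROUP BY x.c1" and a join clause "x.c1 = y.c2", the
+ * equivalence class allows y.c2 to act as a grouping expression for rel y.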
+ *
+ * It's important that create_grouping_expr_grouped_var_infos has
+ * processed the explicit grouping columns by now. If the grouping clause
+ * contains multiple expressions belonging to the same EC, the original
+ * (i.e. not derived) one should be preferred when we build grouping
+ * target for a relation. Otherwise we have a problem when trying to match
+ * target entries to grouping clauses during plan creation, see
+ * get_grouping_expression().
+ */
+ gvis = list_copy(root->grouped_var_list);
+ foreach(lc, root->grouped_var_list)
+ {
+ GroupedVarInfo *gvi = lfirst_node(GroupedVarInfo, lc);
+ int relid = -1;
+
+ /* Only interested in grouping expressions. */
+ if (IsA(gvi->gvexpr, Aggref))
+ continue;
+
+ while ((relid = bms_next_member(rel->relids, relid)) >= 0)
+ {
+ GroupedVarInfo *gvi_trans;
+
+ gvi_trans = translate_expression_to_rels(root, gvi, relid);
+ if (gvi_trans != NULL)
+ gvis = lappend(gvis, gvi_trans);
+ }
+ }
+
+ /*
+ * Check if some aggregates or grouping expressions can be evaluated in
+ * this relation's target, and collect all vars referenced by these
+ * aggregates / grouping expressions.
+ */
+ found_higher_agg = false;
+ foreach(lc, gvis)
+ {
+ GroupedVarInfo *gvi = lfirst_node(GroupedVarInfo, lc);
+
+ /*
+ * Note that an uninitialized (NULL) gv_eval_at passes this subset test
+ * too; that is the case for Aggref.aggstar.
+ */
+ if (bms_is_subset(gvi->gv_eval_at, rel->relids))
+ {
+ /*
+ * init_grouping_targets will handle plain Var grouping
+ * expressions because it needs to look them up in
+ * grouped_var_list anyway.
+ *
+ * XXX A plain Var could actually be handled w/o GroupedVar, but
+ * thus init_grouping_targets would have to spend extra effort
+ * looking for the EC-related vars, instead of relying on
+ * create_grouping_expr_grouped_var_infos. (Processing of
+ * particular expression would look different, so we could hardly
+ * reuse the same piece of code.)
+ */
+ if (IsA(gvi->gvexpr, Var))
+ continue;
+
+ /*
+ * The derived grouping expressions should not be referenced by
+ * the query targetlist, so do not add them if we're at the top of
+ * the join tree.
+ */
+ if (gvi->derived && bms_equal(rel->relids, root->all_baserels))
+ continue;
+
+ /*
+ * Accept the aggregate / grouping expression.
+ *
+ * (GroupedVarInfo is more convenient for the next processing than
+ * Aggref, see add_aggregates_to_grouped_target.)
+ */
+ if (IsA(gvi->gvexpr, Aggref))
+ aggregates = lappend(aggregates, gvi);
+ else
+ grp_exprs = lappend(grp_exprs, gvi);
+ }
+ else if (bms_overlap(gvi->gv_eval_at, rel->relids) &&
+ IsA(gvi->gvexpr, Aggref))
+ {
+ /*
+ * Remember that there is at least one aggregate expression that
+ * needs more than this rel.
+ */
+ found_higher_agg = true;
+ }
+ }
+
+ /*
+ * Aggregation push-down makes little sense w/o an aggregate function.
+ */
+ if (aggregates == NIL)
+ {
+ list_free(gvis);
+ return NULL;
+ }
+
+ /*
+ * Give up if some other aggregate(s) need multiple relations including
+ * the current one. The problem is that grouping of the current relation
+ * could make some input variables unavailable for the "higher aggregate",
+ * and it'd also decrease the number of input rows the "higher aggregate"
+ * receives.
+ *
+ * In contrast, grp_exprs is only supposed to contain generic grouping
+ * expressions, so it may still be NIL. If all the grouping keys are just
+ * plain Vars, init_grouping_targets will take care of them.
+ */
+ if (found_higher_agg)
+ {
+ list_free(gvis);
+ return NULL;
+ }
+
+ /*
+ * Create target for grouped paths as well as one for the input paths of
+ * the aggregation paths.
+ */
+ target_partial = create_empty_pathtarget();
+ agg_input = create_empty_pathtarget();
+ init_grouping_targets(root, rel, target_partial, agg_input, gvis,
+ &grp_exprs_extra);
+ list_free(gvis);
+
+ /*
+ * Add (non-Var) grouping expressions (in the form of GroupedVar) to
+ * target_partial.
+ *
+ * Follow the convention that the grouping expressions should precede
+ * aggregates.
+ */
+ add_grouped_vars_to_target(root, target_partial, grp_exprs);
+
+ /*
+ * Aggregation push-down makes no sense w/o grouping expressions.
+ */
+ if (list_length(target_partial->exprs) == 0)
+ return NULL;
+
+ /*
+ * If the aggregation target should have extra grouping expressions, add
+ * them now. This step includes assignment of tleSortGroupRef's which we
+ * can generate now (the "ordinary" grouping expressions are present in
+ * the target by now).
+ */
+ if (list_length(grp_exprs_extra) > 0)
+ {
+ Index sortgroupref;
+
+ /*
+ * Always start at root->max_sortgroupref. The extra grouping
+ * expressions aren't used during the final aggregation, so the
+ * sortgroupref values don't need to be unique across the query. Thus
+ * we don't have to increase root->max_sortgroupref, which makes
+ * recognition of the extra grouping expressions pretty easy.
+ */
+ sortgroupref = root->max_sortgroupref;
+
+ /*
+ * Generate the SortGroupClause's and add the expressions to the
+ * target.
+ */
+ foreach(lc, grp_exprs_extra)
+ {
+ Var *var = lfirst_node(Var, lc);
+ SortGroupClause *cl = makeNode(SortGroupClause);
+ int i = 0;
+ ListCell *lc2;
+
+ /*
+ * TODO Verify that these fields are sufficient for this special
+ * SortGroupClause.
+ */
+ cl->tleSortGroupRef = ++sortgroupref;
+ get_sort_group_operators(var->vartype,
+ false, true, false,
+ NULL, &cl->eqop, NULL,
+ &cl->hashable);
+ sortgroupclauses = lappend(sortgroupclauses, cl);
+ add_column_to_pathtarget(target_partial, (Expr *) var,
+ cl->tleSortGroupRef);
+
+ /*
+ * The aggregation input target must emit this var too. It can
+ * already be there, so avoid adding it again.
+ */
+ foreach(lc2, agg_input->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc2);
+
+ if (equal(expr, var))
+ {
+ /*
+ * The fact that the var is in agg_input does not imply
+ * that it has sortgroupref set. For example, the reason
+ * that it's there can be that a generic grouping
+ * expression references it, so grouping by the var alone
+ * hasn't been considered so far.
+ */
+ if (agg_input->sortgrouprefs == NULL)
+ {
+ agg_input->sortgrouprefs = (Index *)
+ palloc0(list_length(agg_input->exprs) *
+ sizeof(Index));
+ }
+ if (agg_input->sortgrouprefs[i] == 0)
+ agg_input->sortgrouprefs[i] = cl->tleSortGroupRef;
+
+ break;
+ }
+
+ i++;
+ }
+ if (lc2 != NULL)
+ continue;
+
+ /*
+ * Add the var if it's not in the target yet.
+ */
+ add_column_to_pathtarget(agg_input, (Expr *) var,
+ cl->tleSortGroupRef);
+ }
+ }
+
+ /*
+ * Add aggregates (in the form of GroupedVar) to the grouping target.
+ */
+ add_grouped_vars_to_target(root, target_partial, aggregates);
+
+ /*
+ * Make sure that the paths generating input data for partial aggregation
+ * include non-Var grouping expressions.
+ */
+ foreach(lc, grp_exprs)
+ {
+ GroupedVarInfo *gvi;
+
+ gvi = lfirst_node(GroupedVarInfo, lc);
+ add_column_to_pathtarget(agg_input, gvi->gvexpr, gvi->sortgroupref);
+ }
+
+ /*
+ * Since neither target_partial nor agg_input is supposed to be identical to
+ * source reltarget, compute the width and cost again.
+ */
+ set_pathtarget_cost_width(root, target_partial);
+ set_pathtarget_cost_width(root, agg_input);
+
+ /*
+ * Setup a target for 1-stage aggregation (REL_AGG_KIND_SIMPLE).
+ */
+ target_simple = copy_pathtarget(target_partial);
+ exprs_tmp = NIL;
+ foreach(lc, target_simple->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+
+ /*
+ * The difference from target_partial is that the contained
+ * GroupedVars do not have agg_partial set.
+ */
+ if (IsA(expr, GroupedVar))
+ {
+ GroupedVar *gvar_new = makeNode(GroupedVar);
+
+ memcpy(gvar_new, expr, sizeof(GroupedVar));
+ gvar_new->agg_partial = NULL;
+ expr = (Expr *) gvar_new;
+ }
+ exprs_tmp = lappend(exprs_tmp, expr);
+ }
+ target_simple->exprs = exprs_tmp;
+ set_pathtarget_cost_width(root, target_simple);
+
+ result = makeNode(RelAggInfo);
+ result->target_partial = target_partial;
+ result->target_simple = target_simple;
+ result->input = agg_input;
+
+ /*
+ * Build a list of grouping expressions and a list of the corresponding
+ * SortGroupClauses.
+ */
+ i = 0;
+ foreach(lc, target_partial->exprs)
+ {
+ Index sortgroupref = 0;
+ SortGroupClause *cl;
+ Expr *texpr;
+
+ texpr = (Expr *) lfirst(lc);
+
+ if (IsA(texpr, GroupedVar) &&
+ IsA(((GroupedVar *) texpr)->gvexpr, Aggref))
+ {
+ /*
+ * texpr should represent the first aggregate in the targetlist.
+ */
+ break;
+ }
+
+ /*
+ * Find the clause by sortgroupref.
+ */
+ sortgroupref = target_partial->sortgrouprefs[i++];
+
+ /*
+ * Besides being an aggregate, the only other reason for a target
+ * expression to lack a sortgroupref is that it's a column of a relation
+ * functionally dependent on the GROUP BY clause. So it's not actually a
+ * grouping column.
+ */
+ if (sortgroupref == 0)
+ continue;
+
+ cl = get_sortgroupref_clause_noerr(sortgroupref,
+ root->parse->groupClause);
+
+ /*
+ * If query does not have this clause, it must be target-specific.
+ */
+ if (cl == NULL)
+ cl = get_sortgroupref_clause(sortgroupref, sortgroupclauses);
+
+ result->group_clauses = list_append_unique(result->group_clauses,
+ cl);
+
+ /*
+ * Add only unique clauses because of joins (both sides of a join can
+ * point at the same grouping clause). XXX Is it worth adding a bool
+ * argument indicating that we're dealing with a join right now?
+ */
+ result->group_exprs = list_append_unique(result->group_exprs,
+ texpr);
+ }
+
+ /* Finally collect the aggregates. */
+ while (lc != NULL)
+ {
+ GroupedVar *gvar = castNode(GroupedVar, lfirst(lc));
+
+ Assert(IsA(gvar->gvexpr, Aggref));
+ result->agg_exprs_simple = lappend(result->agg_exprs_simple,
+ gvar->gvexpr);
+
+ Assert(gvar->agg_partial != NULL);
+ result->agg_exprs_partial = lappend(result->agg_exprs_partial,
+ gvar->agg_partial);
+ lc = lnext(lc);
+ }
+
+ return result;
+}
+
+/*
+ * Initialize target for grouped paths (target) as well as a target for paths
+ * that generate input for partial aggregation (agg_input).
+ *
+ * 'gvis' is a list of GroupedVarInfos possibly useful for rel.
+ *
+ * The *group_exprs_extra_p list may receive additional grouping expressions
+ * that the query does not have. These can make the aggregation of base
+ * relation / join less efficient, but can allow for join of the grouped
+ * relation that wouldn't be possible otherwise.
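+ *
+ * For example, a var referenced only by upper join clauses, and not itself
+ * a grouping key, must be added as an extra grouping column so that the
+ * grouped rel can still emit it for those joins.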
+ */
+static void
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+ PathTarget *target, PathTarget *agg_input,
+ List *gvis, List **group_exprs_extra_p)
+{
+ ListCell *lc;
+ List *vars_unresolved = NIL;
+
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Var *tvar;
+ GroupedVar *gvar;
+
+ /*
+ * Given that PlaceHolderVar currently prevents us from doing
+ * aggregation push-down, the source target cannot contain anything
+ * more complex than a Var. (As for generic grouping expressions,
+ * add_grouped_vars_to_target will retrieve them from the query
+ * targetlist and add them to "target" outside this function.)
+ */
+ tvar = lfirst_node(Var, lc);
+
+ gvar = get_grouping_expression(gvis, (Expr *) tvar);
+ if (gvar != NULL)
+ {
+ /*
+ * It's o.k. to use the target expression for grouping.
+ *
+ * The actual Var is added to the target. If we used the
+ * containing GroupedVar, references from various clauses (e.g.
+ * join quals) wouldn't work.
+ */
+ add_column_to_pathtarget(target, gvar->gvexpr,
+ gvar->sortgroupref);
+
+ /*
+ * As for agg_input, add the original expression but set
+ * sortgroupref in addition.
+ */
+ add_column_to_pathtarget(agg_input, gvar->gvexpr,
+ gvar->sortgroupref);
+
+ /* Process the next expression. */
+ continue;
+ }
+
+ /*
+ * Further investigation involves dependency check, for which we need
+ * to have all the plain-var grouping expressions gathered. So far
+ * only store the var in a list.
+ */
+ vars_unresolved = lappend(vars_unresolved, tvar);
+ }
+
+ /*
+ * Check for other possible reasons for the var to be in the plain target.
+ */
+ foreach(lc, vars_unresolved)
+ {
+ Var *var;
+ RangeTblEntry *rte;
+ List *deps = NIL;
+ Relids relids_subtract;
+ int ndx;
+ RelOptInfo *baserel;
+
+ var = lfirst_node(Var, lc);
+ rte = root->simple_rte_array[var->varno];
+
+ /*
+ * A functionally dependent var is treated almost the same as one that
+ * has a sortgroupref.
+ */
+ if (check_functional_grouping(rte->relid, var->varno,
+ var->varlevelsup,
+ target->exprs, &deps))
+ {
+
+ Index sortgroupref = 0;
+
+ add_column_to_pathtarget(target, (Expr *) var, sortgroupref);
+
+ /*
+ * The var shouldn't be actually used as a grouping key (instead,
+ * the one this depends on will be), so sortgroupref should not be
+ * important. But once we have it ...
+ */
+ add_column_to_pathtarget(agg_input, (Expr *) var, sortgroupref);
+
+ /*
+ * The var may or may not be present in generic grouping
+ * expression(s) or aggregate arguments, but we already have it in
+ * the targets, so don't care.
+ */
+ continue;
+ }
+
+ /*
+ * Isn't the expression needed by joins above the current rel?
+ *
+ * The relids we're not interested in do include 0, which stands for the
+ * top-level targetlist. The only reason for the relids to contain 0
+ * should be that the var is referenced either by an aggregate or by a
+ * grouping expression, but right now we're interested in the *other*
+ * reasons. (As soon as GroupedVars are installed, the top level
+ * aggregates / grouping expressions no longer need a direct reference
+ * to the var anyway.)
+ */
+ relids_subtract = bms_copy(rel->relids);
+ bms_add_member(relids_subtract, 0);
+
+ baserel = find_base_rel(root, var->varno);
+ ndx = var->varattno - baserel->min_attr;
+ if (bms_nonempty_difference(baserel->attr_needed[ndx],
+ relids_subtract))
+ {
+ /*
+ * The variable is needed by upper join. This includes one that is
+ * referenced by a generic grouping expression but couldn't be
+ * recognized as grouping expression on its own at the top of the
+ * loop.
+ *
+ * The only way to bring this var to the aggregation output is to
+ * add it to the grouping expressions too.
+ *
+ * Since root->parse->groupClause is not supposed to contain this
+ * expression, we need to construct a special SortGroupClause. Its
+ * tleSortGroupRef needs to be unique within "target", so postpone
+ * creation of the SortGroupRefs until we're done with the
+ * iteration of rel->reltarget->exprs.
+ */
+ *group_exprs_extra_p = lappend(*group_exprs_extra_p, var);
+ }
+ else
+ {
+ /*
+ * As long as the query is semantically correct, arriving here
+ * means that the var is referenced either by aggregate argument
+ * or by generic grouping expression. The per-relation aggregation
+ * target should not contain it, as it only provides input for the
+ * final aggregation.
+ */
+ }
+
+ /*
+ * The var is not suitable for grouping, but agg_input ought to stay
+ * complete.
+ */
+ add_column_to_pathtarget(agg_input, (Expr *) var, 0);
+ }
+}
+
+
+/*
+ * Translate RelAggInfo of parent relation so it matches given child relation.
+ */
+RelAggInfo *
+translate_rel_agg_info(PlannerInfo *root, RelAggInfo *parent,
+ AppendRelInfo **appinfos, int nappinfos)
+{
+ RelAggInfo *result;
+
+ result = makeNode(RelAggInfo);
+
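+ /* Copy each target and translate its expressions to reference the child rel. */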
+ result->target_simple = copy_pathtarget(parent->target_simple);
+ result->target_simple->exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) result->target_simple->exprs,
+ nappinfos, appinfos);
+ result->target_partial = copy_pathtarget(parent->target_partial);
+ result->target_partial->exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) result->target_partial->exprs,
+ nappinfos, appinfos);
+
+ result->input = copy_pathtarget(parent->input);
+ result->input->exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) result->input->exprs,
+ nappinfos, appinfos);
+
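+ /* SortGroupClauses contain no Vars, so the parent's list can be shared. */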
+ result->group_clauses = parent->group_clauses;
+
+ result->group_exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) parent->group_exprs,
+ nappinfos, appinfos);
+
+ result->agg_exprs_simple = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) parent->agg_exprs_simple,
+ nappinfos, appinfos);
+ result->agg_exprs_partial = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) parent->agg_exprs_partial,
+ nappinfos, appinfos);
+ return result;
+}
diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c
index 5500f33e63..b09fddeb32 100644
--- a/src/backend/optimizer/util/tlist.c
+++ b/src/backend/optimizer/util/tlist.c
@@ -426,7 +426,6 @@ get_sortgrouplist_exprs(List *sgClauses, List *targetList)
return result;
}
-
/*****************************************************************************
* Functions to extract data from a list of SortGroupClauses
*
@@ -801,6 +800,133 @@ apply_pathtarget_labeling_to_tlist(List *tlist, PathTarget *target)
}
/*
+ * Replace each GroupedVar in the source targetlist with the original
+ * expression --- either Aggref or a non-Var grouping expression.
+ *
+ * Even if the query targetlist has the Aggref wrapped in a generic
+ * expression, any subplan should emit the corresponding GroupedVar
+ * alone. (Aggregate finalization is needed before the aggregate result can be
+ * used for any purpose, and that happens at the top level of the query.)
+ * Therefore we do not have to recurse into the target expressions here.
+ */
+List *
+replace_grouped_vars_with_aggrefs(PlannerInfo *root, List *src)
+{
+ List *result = NIL;
+ ListCell *l;
+
+ foreach(l, src)
+ {
+ TargetEntry *te,
+ *te_new;
+ Expr *expr_new = NULL;
+
+ te = lfirst_node(TargetEntry, l);
+
+ if (IsA(te->expr, GroupedVar))
+ {
+ GroupedVar *gvar;
+
+ gvar = castNode(GroupedVar, te->expr);
+ if (IsA(gvar->gvexpr, Aggref))
+ {
+ if (gvar->agg_partial)
+ {
+ /*
+ * Partial aggregate should appear in the targetlist so
+ * that it looks as if convert_combining_aggrefs arranged
+ * it.
+ */
+ expr_new = (Expr *) gvar->agg_partial;
+ }
+ else
+ {
+ /*
+ * Restore the original aggregate. This is typical for the
+ * REL_AGG_KIND_SIMPLE kind of aggregate push-down.
+ */
+ Assert(IsA(gvar->gvexpr, Aggref));
+
+ expr_new = (Expr *) gvar->gvexpr;
+ }
+ }
+ else
+ expr_new = gvar->gvexpr;
+ }
+
+ if (expr_new != NULL)
+ {
+ te_new = flatCopyTargetEntry(te);
+ te_new->expr = (Expr *) expr_new;
+ }
+ else
+ te_new = te;
+ result = lappend(result, te_new);
+ }
+
+ return result;
+}
+
+/*
+ * For each aggregate add GroupedVar to the grouped target.
+ *
+ * Caller passes the aggregates in the form of GroupedVarInfos so that we
+ * don't have to look for gvid.
+ */
+void
+add_grouped_vars_to_target(PlannerInfo *root, PathTarget *target,
+ List *expressions)
+{
+ ListCell *lc;
+
+ /* Create the vars and add them to the target. */
+ foreach(lc, expressions)
+ {
+ GroupedVarInfo *gvi;
+ GroupedVar *gvar;
+
+ gvi = lfirst_node(GroupedVarInfo, lc);
+ gvar = makeNode(GroupedVar);
+ gvar->gvid = gvi->gvid;
+ gvar->gvexpr = gvi->gvexpr;
+ gvar->agg_partial = gvi->agg_partial;
+ add_column_to_pathtarget(target, (Expr *) gvar, gvi->sortgroupref);
+ }
+}
+
+/*
+ * Return GroupedVar containing the passed-in expression if one exists, or
+ * NULL if the expression cannot be used as grouping key.
+ */
+GroupedVar *
+get_grouping_expression(List *gvis, Expr *expr)
+{
+ ListCell *lc;
+
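+ /* Only grouping expressions can match; skip the aggregates. */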
+ foreach(lc, gvis)
+ {
+ GroupedVarInfo *gvi = lfirst_node(GroupedVarInfo, lc);
+
+ if (IsA(gvi->gvexpr, Aggref))
+ continue;
+
+ if (equal(gvi->gvexpr, expr))
+ {
+ GroupedVar *result = makeNode(GroupedVar);
+
+ Assert(gvi->sortgroupref > 0);
+ result->gvexpr = gvi->gvexpr;
+ result->gvid = gvi->gvid;
+ result->sortgroupref = gvi->sortgroupref;
+ return result;
+ }
+ }
+
+ /* The expression cannot be used as grouping key. */
+ return NULL;
+}
+
+/*
* split_pathtarget_at_srfs
* Split given PathTarget into multiple levels to position SRFs safely
*
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index b16b1e4656..459dc3087c 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -840,3 +840,25 @@ alias_relid_set(PlannerInfo *root, Relids relids)
}
return result;
}
+
+/*
+ * Return GroupedVarInfo for given GroupedVar.
+ *
+ * XXX Consider better location of this routine.
+ */
+GroupedVarInfo *
+find_grouped_var_info(PlannerInfo *root, GroupedVar *gvar)
+{
+ ListCell *l;
+
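+ /* A linear search; grouped_var_list is presumably short. */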
+ foreach(l, root->grouped_var_list)
+ {
+ GroupedVarInfo *gvi = lfirst_node(GroupedVarInfo, l);
+
+ if (gvi->gvid == gvar->gvid)
+ return gvi;
+ }
+
+ elog(ERROR, "GroupedVarInfo not found");
+ return NULL; /* keep compiler quiet */
+}
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index abe1dbc521..3671f8dda3 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -104,6 +104,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
Oid vatype;
FuncDetailCode fdresult;
char aggkind = 0;
+ Oid aggcombinefn = InvalidOid;
ParseCallbackState pcbstate;
/*
@@ -360,6 +361,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
elog(ERROR, "cache lookup failed for aggregate %u", funcid);
classForm = (Form_pg_aggregate) GETSTRUCT(tup);
aggkind = classForm->aggkind;
+ aggcombinefn = classForm->aggcombinefn;
catDirectArgs = classForm->aggnumdirectargs;
ReleaseSysCache(tup);
@@ -740,6 +742,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
aggref->aggstar = agg_star;
aggref->aggvariadic = func_variadic;
aggref->aggkind = aggkind;
+ aggref->aggcombinefn = aggcombinefn;
/* agglevelsup will be set by transformAggregateCall */
aggref->aggsplit = AGGSPLIT_SIMPLE; /* planner might change this */
aggref->location = location;
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 065238b0fe..c17ef5edba 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -7723,6 +7723,23 @@ get_rule_expr(Node *node, deparse_context *context,
get_agg_expr((Aggref *) node, context, (Aggref *) node);
break;
+ case T_GroupedVar:
+ {
+ GroupedVar *gvar = castNode(GroupedVar, node);
+ Expr *expr = gvar->gvexpr;
+
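+ /* Deparse the expression that the GroupedVar represents. */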
+ if (IsA(expr, Aggref))
+ get_agg_expr(gvar->agg_partial, context, (Aggref *) gvar->gvexpr);
+ else if (IsA(expr, Var))
+ (void) get_variable((Var *) expr, 0, false, context);
+ else
+ {
+ Assert(IsA(gvar->gvexpr, OpExpr));
+ get_oper_expr((OpExpr *) expr, context);
+ }
+ break;
+ }
+
case T_GroupingFunc:
{
GroupingFunc *gexpr = (GroupingFunc *) node;
@@ -9208,10 +9225,18 @@ get_agg_combine_expr(Node *node, deparse_context *context, void *private)
Aggref *aggref;
Aggref *original_aggref = private;
- if (!IsA(node, Aggref))
+ if (IsA(node, Aggref))
+ aggref = (Aggref *) node;
+ else if (IsA(node, GroupedVar))
+ {
+ GroupedVar *gvar = castNode(GroupedVar, node);
+
+ aggref = gvar->agg_partial;
+ original_aggref = castNode(Aggref, gvar->gvexpr);
+ }
+ else
elog(ERROR, "combining Aggref does not point to an Aggref");
- aggref = (Aggref *) node;
get_agg_expr(aggref, context, original_aggref);
}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 4b08cdb721..eb02d1801c 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -114,6 +114,7 @@
#include "catalog/pg_statistic_ext.h"
#include "catalog/pg_type.h"
#include "executor/executor.h"
+#include "executor/nodeAgg.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -3884,6 +3885,39 @@ estimate_hash_bucket_stats(PlannerInfo *root, Node *hashkey, double nbuckets,
ReleaseVariableStats(vardata);
}
+/*
+ * estimate_hashagg_tablesize
+ * estimate the number of bytes that a hash aggregate hashtable will
+ * require based on the agg_costs, path width and dNumGroups.
+ *
+ * XXX this may be over-estimating the size now that hashagg knows to omit
+ * unneeded columns from the hashtable. Also for mixed-mode grouping sets,
+ * grouping columns not in the hashed set are counted here even though hashagg
+ * won't store them. Is this a problem?
+ */
+Size
+estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
+ double dNumGroups)
+{
+ Size hashentrysize;
+
+ /* Estimate per-hash-entry space at tuple width... */
+ hashentrysize = MAXALIGN(path->pathtarget->width) +
+ MAXALIGN(SizeofMinimalTupleHeader);
+
+ /* plus space for pass-by-ref transition values... */
+ hashentrysize += agg_costs->transitionSpace;
+ /* plus the per-hash-entry overhead */
+ hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
+
+ /*
+ * Note that this disregards the effect of fill-factor and growth policy
+ * of the hash-table. That's probably ok, given that the default
+ * fill-factor is relatively high. It'd be hard to meaningfully factor in
+ * "double-in-size" growth policies here.
+ */
+ return hashentrysize * dNumGroups;
+}
/*-------------------------------------------------------------------------
*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index b05fb209bb..bc335be32d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -944,6 +944,15 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
{
+ {"enable_agg_pushdown", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables aggregation push-down."),
+ NULL
+ },
+ &enable_agg_pushdown,
+ false,
+ NULL, NULL, NULL
+ },
+ {
{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of parallel append plans."),
NULL
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 43f1552241..f76fa9d532 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -222,6 +222,7 @@ typedef enum NodeTag
T_IndexOptInfo,
T_ForeignKeyOptInfo,
T_ParamPathInfo,
+ T_RelAggInfo,
T_Path,
T_IndexPath,
T_BitmapHeapPath,
@@ -262,9 +263,11 @@ typedef enum NodeTag
T_PathTarget,
T_RestrictInfo,
T_PlaceHolderVar,
+ T_GroupedVar,
T_SpecialJoinInfo,
T_AppendRelInfo,
T_PlaceHolderInfo,
+ T_GroupedVarInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
T_RollupData,
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1b4b0d75af..6af31f2722 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -296,6 +296,7 @@ typedef struct Aggref
Oid aggcollid; /* OID of collation of result */
Oid inputcollid; /* OID of collation that function should use */
Oid aggtranstype; /* type Oid of aggregate's transition value */
+ Oid aggcombinefn; /* combine function (see pg_aggregate.h) */
List *aggargtypes; /* type Oids of direct and aggregated args */
List *aggdirectargs; /* direct arguments, if an ordered-set agg */
List *args; /* aggregated arguments and sort expressions */
@@ -306,6 +307,7 @@ typedef struct Aggref
bool aggvariadic; /* true if variadic arguments have been
* combined into an array last argument */
char aggkind; /* aggregate kind (see pg_aggregate.h) */
+
Index agglevelsup; /* > 0 if agg belongs to outer query */
AggSplit aggsplit; /* expected agg-splitting mode of parent Agg */
int location; /* token location, or -1 if unknown */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 7cae3fcfb5..d3a7a97672 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -193,7 +193,8 @@ typedef struct PlannerInfo
* unreferenced view RTE; or if the RelOptInfo hasn't been made yet.
*/
struct RelOptInfo **simple_rel_array; /* All 1-rel RelOptInfos */
- int simple_rel_array_size; /* allocated size of array */
+
+ int simple_rel_array_size; /* allocated size of the arrays above */
/*
* simple_rte_array is the same length as simple_rel_array and holds
@@ -247,6 +248,7 @@ typedef struct PlannerInfo
* join_rel_level is NULL if not in use.
*/
List **join_rel_level; /* lists of join-relation RelOptInfos */
+
int join_cur_level; /* index of list being extended */
List *init_plans; /* init SubPlans for query */
@@ -279,6 +281,8 @@ typedef struct PlannerInfo
List *placeholder_list; /* list of PlaceHolderInfos */
+ List *grouped_var_list; /* List of GroupedVarInfos. */
+
List *fkey_list; /* list of ForeignKeyOptInfos */
List *query_pathkeys; /* desired pathkeys for query_planner() */
@@ -305,6 +309,12 @@ typedef struct PlannerInfo
*/
List *processed_tlist;
+ /*
+ * The maximum ressortgroupref among target entries in processed_tlist.
+ * Useful when adding extra grouping expressions for partial aggregation.
+ */
+ int max_sortgroupref;
+
/* Fields filled during create_plan() for use in setrefs.c */
AttrNumber *grouping_map; /* for GroupingFunc fixup */
List *minmax_aggs; /* List of MinMaxAggInfos */
@@ -387,6 +397,36 @@ typedef struct PartitionSchemeData
typedef struct PartitionSchemeData *PartitionScheme;
+/*
+ * Grouped paths created at relation level are added to the relations stored
+ * in this structure.
+ */
+typedef struct RelOptGrouped
+{
+ /*
+ * Paths belonging to this relation need additional processing by
+ * create_grouping_paths() and subroutines.
+ *
+ * This field should always be set.
+ */
+ struct RelOptInfo *needs_final_agg;
+
+ /*
+ * Paths belonging to this relation do not need create_grouping_paths().
+ * They are ready for the next upper rel processing, e.g.
+ * create_ordered_paths().
+ *
+ * This relation should not contain any partial paths. XXX Consider if
+ * there are special cases where we can apply AGGSPLIT_SIMPLE aggregates
+ * to partitions and process the result using parallel Append w/o getting
+ * duplicate groups.
+ *
+ * RelOptGrouped may have this field NULL, e.g. for a partitioned table
+ * (because partitions can generate duplicate values of the grouping key).
+ */
+ struct RelOptInfo *no_final_agg;
+} RelOptGrouped;
+
/*----------
* RelOptInfo
* Per-relation information for planning/optimization
@@ -467,6 +507,8 @@ typedef struct PartitionSchemeData *PartitionScheme;
* direct_lateral_relids - rels this rel has direct LATERAL references to
* lateral_relids - required outer rels for LATERAL, as a Relids set
* (includes both direct and indirect lateral references)
+ * agg_info - RelAggInfo if the relation can produce grouped paths, NULL
+ * otherwise.
*
* If the relation is a base relation it will have these fields set:
*
@@ -646,6 +688,16 @@ typedef struct RelOptInfo
Relids direct_lateral_relids; /* rels directly laterally referenced */
Relids lateral_relids; /* minimum parameterization of rel */
+ /* Information needed to apply partial aggregation to this rel's paths. */
+ struct RelAggInfo *agg_info;
+
+ /*
+ * If the relation can produce grouped paths, store them here.
+ *
+ * If "grouped" is valid then "agg_info" must be NULL and vice versa.
+ */
+ struct RelOptGrouped *grouped;
+
/* information about a base rel (not set for join rels!) */
Index relid;
Oid reltablespace; /* containing tablespace */
@@ -1051,6 +1103,79 @@ typedef struct ParamPathInfo
/*
+ * What kind of aggregation should be applied to base relation or join?
+ */
+typedef enum
+{
+ REL_AGG_KIND_NONE, /* No aggregation. */
+ REL_AGG_KIND_SIMPLE, /* AGGSPLIT_SIMPLE */
+ REL_AGG_KIND_PARTIAL /* AGGSPLIT_INITIAL_SERIAL */
+} RelAggKind;
+
+/*
+ * RelAggInfo
+ *
+ * RelOptInfo needs information contained here if its paths should be
+ * aggregated.
+ *
+ * "target_simple" or "target_partial" will be used as pathtarget for
+ * REL_AGG_KIND_SIMPLE and REL_AGG_KIND_PARTIAL aggregation respectively, if
+ * "explicit aggregation" is applied to base relation or join. The same target
+ * will will also --- if the relation is a join --- be used to joinin grouped
+ * path to a non-grouped one.
+ *
+ * These targets contain plain-Var grouping expressions, generic grouping
+ * expressions wrapped in GroupedVar structure, or Aggrefs which are also
+ * wrapped in GroupedVar. Once GroupedVar is evaluated, its value is passed to
+ * the upper paths w/o being evaluated again. If final aggregation appears to
+ * be necessary above the final join, the contained Aggrefs are supposed to
+ * provide the final aggregation plan with input values, i.e. the aggregate
+ * transient state.
+ *
+ * Note: There's a convention that GroupedVars that contain Aggref expressions
+ * are supposed to follow the other expressions of the target. Iterations of
+ * ->exprs may rely on this arrangement.
+ *
+ * "input" contains Vars used either as grouping expressions or aggregate
+ * arguments, plus those used in grouping expressions which are not plain Vars
+ * themselves. Paths providing the aggregation plan with input data should use
+ * this target.
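+ *
+ * For example, with "GROUP BY x" and an aggregate sum(y) pushed down to a
+ * base relation, "input" would contain {x, y}, while the aggregated targets
+ * would contain x plus a GroupedVar wrapping sum(y).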
+ *
+ * "group_clauses" and "group_exprs" are lists of SortGroupClause and the
+ * corresponding grouping expressions respectively.
+ *
+ * "agg_exprs_simple" and "agg_exprs_partial" are lists of Aggref nodes for
+ * the "simple" and partial aggregation respectively, to be evaluated by the
+ * relation.
+ *
+ * "rows" is the estimated number of result tuples produced by grouped
+ * paths.
+ */
+typedef struct RelAggInfo
+{
+ NodeTag type;
+
+ PathTarget *target_simple; /* Target for REL_AGG_KIND_SIMPLE. */
+ PathTarget *target_partial; /* Target for REL_AGG_KIND_PARTIAL. */
+
+ PathTarget *input; /* pathtarget of paths that generate input for
+ * aggregation paths. */
+
+ List *group_clauses;
+ List *group_exprs;
+
+ /*
+ * TODO Consider removing these fields and creating the Aggref, partial or
+ * simple, when needed, but avoid creating it multiple times (e.g. once
+ * for hash grouping, other times for sorted grouping).
+ */
+ List *agg_exprs_simple; /* Expressions for REL_AGG_KIND_SIMPLE */
+ List *agg_exprs_partial; /* Expressions for REL_AGG_KIND_PARTIAL */
+
+ double rows;
+} RelAggInfo;
+
+/*
* Type "Path" is used as-is for sequential-scan paths, as well as some other
* simple plan types that we don't need any extra information in the path for.
* For other path types it is the first component of a larger struct.
@@ -1526,12 +1651,16 @@ typedef struct HashPath
* ProjectionPath node, which is marked dummy to indicate that we intend to
* assign the work to the input plan node. The estimated cost for the
* ProjectionPath node will account for whether a Result will be used or not.
+ *
+ * The force_result field indicates that a Result node must be used for some
+ * reason even though the subpath could normally handle the projection.
*/
typedef struct ProjectionPath
{
Path path;
Path *subpath; /* path representing input source */
bool dummypp; /* true if no separate Result is needed */
+ bool force_result; /* Is Result node required? */
} ProjectionPath;
/*
@@ -2012,6 +2141,44 @@ typedef struct PlaceHolderVar
Index phlevelsup; /* > 0 if PHV belongs to outer query */
} PlaceHolderVar;
+
+/*
+ * Similar to the concept of PlaceHolderVar, we treat aggregates and grouping
+ * columns as special variables if grouping is possible below the top-level
+ * join. Like a PlaceHolderVar, the variable is evaluated below the query
+ * targetlist (in particular, in the targetlist of an AGGSPLIT_INITIAL_SERIAL
+ * aggregation node which has a base relation or a join as its input) and
+ * bubbles up through the join tree until it reaches the
+ * AGGSPLIT_FINAL_DESERIAL aggregation node.
+ *
+ * gvexpr is either Aggref or a generic (non-Var) grouping expression. (If a
+ * simple Var, we don't replace it with GroupedVar.)
+ *
+ * agg_partial also points to the corresponding field of GroupedVarInfo if
+ * gvexpr is Aggref.
+ */
+typedef struct GroupedVar
+{
+ Expr xpr;
+ Expr *gvexpr; /* the represented expression */
+
+ /*
+ * TODO
+ *
+ * Do we need to cache the partial aggregate? (The simple aggregate should
+ * be in gvexpr.) If not, make sure translation of the GroupedVar to child
+ * rels works.
+ *
+ */
+ Aggref *agg_partial; /* partial aggregate if gvexpr is an aggregate
+ * and if it's used in a target of partial
+ * aggregation. */
+
+ Index sortgroupref; /* SortGroupClause.tleSortGroupRef if gvexpr
+ * is grouping expression. */
+ Index gvid; /* GroupedVarInfo */
+ int32 width; /* Expression width. */
+} GroupedVar;
+
/*
* "Special join" info.
*
@@ -2208,6 +2375,26 @@ typedef struct PlaceHolderInfo
} PlaceHolderInfo;
/*
+ * Likewise, GroupedVarInfo exists for each distinct GroupedVar.
+ */
+typedef struct GroupedVarInfo
+{
+ NodeTag type;
+
+ Index gvid; /* GroupedVar.gvid */
+ Expr *gvexpr; /* the represented expression. */
+ Aggref *agg_partial; /* if gvexpr is aggregate, agg_partial is the
+ * corresponding partial aggregate */
+ Index sortgroupref; /* If gvexpr is a grouping expression, this is
+ * the tleSortGroupRef of the corresponding
+ * SortGroupClause. */
+ Relids gv_eval_at; /* lowest level we can evaluate the expression
+ * at or NULL if it can happen anywhere. */
+ bool derived; /* derived from another GroupedVarInfo using
+ * equivalence classes? */
+} GroupedVarInfo;
+
+/*
* This struct describes one potentially index-optimizable MIN/MAX aggregate
* function. MinMaxAggPath contains a list of these, and if we accept that
* path, the list is stored into root->minmax_aggs for use during setrefs.c.
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index ed854fdd40..f9f3d14b0b 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -88,4 +88,6 @@ extern Query *inline_set_returning_function(PlannerInfo *root,
extern List *expand_function_arguments(List *args, Oid result_type,
HeapTuple func_tuple);
+extern GroupedVarInfo *translate_expression_to_rels(PlannerInfo *root,
+ GroupedVarInfo *gvi, Index relid);
#endif /* CLAUSES_H */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 77ca7ff837..bb6ec0f4e1 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -72,6 +72,7 @@ extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
extern PGDLLIMPORT bool enable_partition_pruning;
+extern PGDLLIMPORT bool enable_agg_pushdown;
extern PGDLLIMPORT int constraint_exclusion;
extern double clamp_row_est(double nrows);
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 4ba358e72d..b2f51fa119 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -68,9 +68,11 @@ extern AppendPath *create_append_path(PlannerInfo *root, RelOptInfo *rel,
List *subpaths, List *partial_subpaths,
Relids required_outer,
int parallel_workers, bool parallel_aware,
- List *partitioned_rels, double rows);
+ List *partitioned_rels, double rows,
+ RelAggKind agg_kind);
extern MergeAppendPath *create_merge_append_path(PlannerInfo *root,
RelOptInfo *rel,
+ PathTarget *target,
List *subpaths,
List *pathkeys,
Relids required_outer,
@@ -123,6 +125,7 @@ extern Relids calc_non_nestloop_required_outer(Path *outer_path, Path *inner_pat
extern NestPath *create_nestloop_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -134,6 +137,7 @@ extern NestPath *create_nestloop_path(PlannerInfo *root,
extern MergePath *create_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -148,6 +152,7 @@ extern MergePath *create_mergejoin_path(PlannerInfo *root,
extern HashPath *create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -196,6 +201,14 @@ extern AggPath *create_agg_path(PlannerInfo *root,
List *qual,
const AggClauseCosts *aggcosts,
double numGroups);
+extern AggPath *create_agg_sorted_path(PlannerInfo *root,
+ Path *subpath,
+ bool check_pathkeys,
+ double input_rows,
+ RelAggKind agg_kind);
+extern AggPath *create_agg_hashed_path(PlannerInfo *root,
+ Path *subpath,
+ double input_rows, RelAggKind agg_kind);
extern GroupingSetsPath *create_groupingsets_path(PlannerInfo *root,
RelOptInfo *rel,
Path *subpath,
@@ -253,7 +266,8 @@ extern LimitPath *create_limit_path(PlannerInfo *root, RelOptInfo *rel,
extern Path *reparameterize_path(PlannerInfo *root, Path *path,
Relids required_outer,
- double loop_count);
+ double loop_count,
+ RelAggKind agg_kind);
extern Path *reparameterize_path_by_child(PlannerInfo *root, Path *path,
RelOptInfo *child_rel);
@@ -271,7 +285,8 @@ extern RelOptInfo *build_join_rel(PlannerInfo *root,
RelOptInfo *outer_rel,
RelOptInfo *inner_rel,
SpecialJoinInfo *sjinfo,
- List **restrictlist_ptr);
+ List **restrictlist_ptr,
+ bool grouped);
extern Relids min_join_parameterization(PlannerInfo *root,
Relids joinrelids,
RelOptInfo *outer_rel,
@@ -297,6 +312,11 @@ extern ParamPathInfo *find_param_path_info(RelOptInfo *rel,
extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
RelOptInfo *outer_rel, RelOptInfo *inner_rel,
RelOptInfo *parent_joinrel, List *restrictlist,
- SpecialJoinInfo *sjinfo, JoinType jointype);
-
+ SpecialJoinInfo *sjinfo, JoinType jointype,
+ bool grouped);
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
+extern RelAggInfo *translate_rel_agg_info(PlannerInfo *root,
+ RelAggInfo *agg_info,
+ AppendRelInfo **appinfos,
+ int nappinfos);
#endif /* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index cafde307ad..760673d591 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,6 +21,7 @@
* allpaths.c
*/
extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_agg_pushdown;
extern PGDLLIMPORT int geqo_threshold;
extern PGDLLIMPORT int min_parallel_table_scan_size;
extern PGDLLIMPORT int min_parallel_index_scan_size;
@@ -50,17 +51,23 @@ extern PGDLLIMPORT join_search_hook_type join_search_hook;
extern RelOptInfo *make_one_rel(PlannerInfo *root, List *joinlist);
extern void set_dummy_rel_pathlist(RelOptInfo *rel);
-extern RelOptInfo *standard_join_search(PlannerInfo *root, int levels_needed,
+extern RelOptInfo *standard_join_search(PlannerInfo *root,
+ int levels_needed,
List *initial_rels);
extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
bool override_rows);
+
+extern bool create_grouped_path(PlannerInfo *root, RelOptInfo *rel,
+ Path *subpath, bool precheck,
+ bool partial, AggStrategy aggstrategy, RelAggKind agg_kind);
extern int compute_parallel_worker(RelOptInfo *rel, double heap_pages,
double index_pages, int max_workers);
extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
Path *bitmapqual);
extern void generate_partitionwise_join_paths(PlannerInfo *root,
- RelOptInfo *rel);
+ RelOptInfo *rel,
+ RelAggKind agg_kind);
#ifdef OPTIMIZER_DEBUG
extern void debug_print_rel(PlannerInfo *root, RelOptInfo *rel);
@@ -70,7 +77,8 @@ extern void debug_print_rel(PlannerInfo *root, RelOptInfo *rel);
* indxpath.c
* routines to generate index paths
*/
-extern void create_index_paths(PlannerInfo *root, RelOptInfo *rel);
+extern void create_index_paths(PlannerInfo *root, RelOptInfo *rel,
+ RelAggKind agg_kind);
extern bool relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
List *restrictlist,
List *exprlist, List *oprlist);
@@ -92,7 +100,8 @@ extern Expr *adjust_rowcompare_for_index(RowCompareExpr *clause,
* tidpath.h
* routines to generate tid paths
*/
-extern void create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel);
+extern void create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel,
+ RelAggKind agg_kind);
/*
* joinpath.c
@@ -101,7 +110,8 @@ extern void create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel);
extern void add_paths_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
JoinType jointype, SpecialJoinInfo *sjinfo,
- List *restrictlist);
+ List *restrictlist,
+ RelAggKind agg_kind, bool do_aggregate);
/*
* joinrels.c
@@ -238,6 +248,7 @@ extern PathKey *make_canonical_pathkey(PlannerInfo *root,
EquivalenceClass *eclass, Oid opfamily,
int strategy, bool nulls_first);
extern void add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
- List *live_childrels);
+ List *live_childrels,
+ RelAggKind agg_kind);
#endif /* PATHS_H */
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index c8ab0280d2..ac76375b31 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,8 @@ extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
Relids where_needed, bool create_new_ph);
+extern void add_grouped_base_rels_to_query(PlannerInfo *root);
+extern void add_grouped_vars_to_rels(PlannerInfo *root);
extern void find_lateral_references(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
extern List *deconstruct_jointree(PlannerInfo *root);
diff --git a/src/include/optimizer/tlist.h b/src/include/optimizer/tlist.h
index 9fa52e1278..68c32e1caa 100644
--- a/src/include/optimizer/tlist.h
+++ b/src/include/optimizer/tlist.h
@@ -16,7 +16,6 @@
#include "nodes/relation.h"
-
extern TargetEntry *tlist_member(Expr *node, List *targetlist);
extern TargetEntry *tlist_member_ignore_relabel(Expr *node, List *targetlist);
@@ -41,7 +40,6 @@ extern Node *get_sortgroupclause_expr(SortGroupClause *sgClause,
List *targetList);
extern List *get_sortgrouplist_exprs(List *sgClauses,
List *targetList);
-
extern SortGroupClause *get_sortgroupref_clause(Index sortref,
List *clauses);
extern SortGroupClause *get_sortgroupref_clause_noerr(Index sortref,
@@ -65,6 +63,13 @@ extern void split_pathtarget_at_srfs(PlannerInfo *root,
PathTarget *target, PathTarget *input_target,
List **targets, List **targets_contain_srfs);
+/* TODO Find the best location (position and in some cases even file) for the
+ * following ones. */
+extern List *replace_grouped_vars_with_aggrefs(PlannerInfo *root, List *src);
+extern void add_grouped_vars_to_target(PlannerInfo *root, PathTarget *target,
+ List *expressions);
+extern GroupedVar *get_grouping_expression(List *gvis, Expr *expr);
+
/* Convenience macro to get a PathTarget with valid cost/width fields */
#define create_pathtarget(root, tlist) \
set_pathtarget_cost_width(root, make_pathtarget_from_tlist(tlist))
diff --git a/src/include/optimizer/var.h b/src/include/optimizer/var.h
index 43c53b5344..5a795c3231 100644
--- a/src/include/optimizer/var.h
+++ b/src/include/optimizer/var.h
@@ -36,5 +36,7 @@ extern bool contain_vars_of_level(Node *node, int levelsup);
extern int locate_var_of_level(Node *node, int levelsup);
extern List *pull_var_clause(Node *node, int flags);
extern Node *flatten_join_alias_vars(PlannerInfo *root, Node *node);
+extern GroupedVarInfo *find_grouped_var_info(PlannerInfo *root,
+ GroupedVar *gvar);
#endif /* VAR_H */
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index 95e44280c4..3a14fc6036 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -213,6 +213,9 @@ extern void estimate_hash_bucket_stats(PlannerInfo *root,
Node *hashkey, double nbuckets,
Selectivity *mcv_freq,
Selectivity *bucketsize_frac);
+extern Size estimate_hashagg_tablesize(Path *path,
+ const AggClauseCosts *agg_costs,
+ double dNumGroups);
extern List *deconstruct_indexquals(IndexPath *path);
extern void genericcostestimate(PlannerInfo *root, IndexPath *path,
diff --git a/src/test/regress/expected/agg_pushdown.out b/src/test/regress/expected/agg_pushdown.out
new file mode 100644
index 0000000000..09b380d21f
--- /dev/null
+++ b/src/test/regress/expected/agg_pushdown.out
@@ -0,0 +1,316 @@
+BEGIN;
+CREATE TABLE agg_pushdown_parent (
+ i int primary key);
+CREATE TABLE agg_pushdown_child1 (
+ j int primary key,
+ parent int references agg_pushdown_parent,
+ v double precision);
+CREATE INDEX ON agg_pushdown_child1(parent);
+CREATE TABLE agg_pushdown_child2 (
+ k int primary key,
+ parent int references agg_pushdown_parent,
+ v double precision);
+INSERT INTO agg_pushdown_parent(i)
+SELECT n
+FROM generate_series(0, 7) AS s(n);
+INSERT INTO agg_pushdown_child1(j, parent, v)
+SELECT 64 * i + n, i, random()
+FROM generate_series(0, 63) AS s(n), agg_pushdown_parent;
+INSERT INTO agg_pushdown_child2(k, parent, v)
+SELECT 64 * i + n, i, random()
+FROM generate_series(0, 63) AS s(n), agg_pushdown_parent;
+ANALYZE;
+SET enable_agg_pushdown TO on;
+-- Perform scan of a table and partially aggregate the result.
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v) FROM agg_pushdown_parent AS p JOIN agg_pushdown_child1
+AS c1 ON c1.parent = p.i GROUP BY p.i;
+ QUERY PLAN
+------------------------------------------------------------
+ Finalize HashAggregate
+ Group Key: p.i
+ -> Hash Join
+ Hash Cond: (p.i = c1.parent)
+ -> Seq Scan on agg_pushdown_parent p
+ -> Hash
+ -> Partial HashAggregate
+ Group Key: c1.parent
+ -> Seq Scan on agg_pushdown_child1 c1
+(9 rows)
+
+-- Scan index on agg_pushdown_child1(parent) column and partially aggregate
+-- the result using AGG_SORTED strategy.
+SET enable_seqscan TO off;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v) FROM agg_pushdown_parent AS p JOIN agg_pushdown_child1
+AS c1 ON c1.parent = p.i GROUP BY p.i;
+ QUERY PLAN
+---------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Group Key: p.i
+ -> Nested Loop
+ -> Partial GroupAggregate
+ Group Key: c1.parent
+ -> Index Scan using agg_pushdown_child1_parent_idx on agg_pushdown_child1 c1
+ -> Index Only Scan using agg_pushdown_parent_pkey on agg_pushdown_parent p
+ Index Cond: (i = c1.parent)
+(8 rows)
+
+SET enable_seqscan TO on;
+-- Perform nestloop join between agg_pushdown_child1 and agg_pushdown_child2
+-- and partially aggregate the result.
+SET enable_nestloop TO on;
+SET enable_hashjoin TO off;
+SET enable_mergejoin TO off;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+ QUERY PLAN
+---------------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Group Key: p.i
+ -> Sort
+ Sort Key: p.i
+ -> Nested Loop
+ -> Partial HashAggregate
+ Group Key: c1.parent
+ -> Nested Loop
+ -> Seq Scan on agg_pushdown_child1 c1
+ -> Index Scan using agg_pushdown_child2_pkey on agg_pushdown_child2 c2
+ Index Cond: (k = c1.j)
+ Filter: (c1.parent = parent)
+ -> Index Only Scan using agg_pushdown_parent_pkey on agg_pushdown_parent p
+ Index Cond: (i = c1.parent)
+(14 rows)
+
+-- The same for hash join.
+SET enable_nestloop TO off;
+SET enable_hashjoin TO on;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+ QUERY PLAN
+----------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Group Key: p.i
+ -> Sort
+ Sort Key: p.i
+ -> Hash Join
+ Hash Cond: (p.i = c1.parent)
+ -> Seq Scan on agg_pushdown_parent p
+ -> Hash
+ -> Partial HashAggregate
+ Group Key: c1.parent
+ -> Hash Join
+ Hash Cond: ((c1.parent = c2.parent) AND (c1.j = c2.k))
+ -> Seq Scan on agg_pushdown_child1 c1
+ -> Hash
+ -> Seq Scan on agg_pushdown_child2 c2
+(15 rows)
+
+-- The same for merge join.
+SET enable_hashjoin TO off;
+SET enable_mergejoin TO on;
+SET enable_seqscan TO off;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+ QUERY PLAN
+---------------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Group Key: p.i
+ -> Merge Join
+ Merge Cond: (c1.parent = p.i)
+ -> Sort
+ Sort Key: c1.parent
+ -> Partial HashAggregate
+ Group Key: c1.parent
+ -> Merge Join
+ Merge Cond: (c1.j = c2.k)
+ Join Filter: (c1.parent = c2.parent)
+ -> Index Scan using agg_pushdown_child1_pkey on agg_pushdown_child1 c1
+ -> Index Scan using agg_pushdown_child2_pkey on agg_pushdown_child2 c2
+ -> Index Only Scan using agg_pushdown_parent_pkey on agg_pushdown_parent p
+(14 rows)
+
+-- Generic grouping expression.
+EXPLAIN (COSTS off)
+SELECT p.i / 2, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i / 2;
+ QUERY PLAN
+---------------------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Group Key: (((c1.parent / 2)))
+ -> Sort
+ Sort Key: (((c1.parent / 2)))
+ -> Merge Join
+ Merge Cond: (c1.parent = p.i)
+ -> Sort
+ Sort Key: c1.parent
+ -> Partial HashAggregate
+ Group Key: (c1.parent / 2), c1.parent, c2.parent
+ -> Merge Join
+ Merge Cond: (c1.j = c2.k)
+ Join Filter: (c1.parent = c2.parent)
+ -> Index Scan using agg_pushdown_child1_pkey on agg_pushdown_child1 c1
+ -> Index Scan using agg_pushdown_child2_pkey on agg_pushdown_child2 c2
+ -> Index Only Scan using agg_pushdown_parent_pkey on agg_pushdown_parent p
+(16 rows)
+
+-- The same tests for parallel plans.
+RESET ALL;
+SET parallel_setup_cost TO 0;
+SET parallel_tuple_cost TO 0;
+SET min_parallel_table_scan_size TO 0;
+SET min_parallel_index_scan_size TO 0;
+SET max_parallel_workers_per_gather TO 4;
+SET enable_agg_pushdown TO on;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v) FROM agg_pushdown_parent AS p JOIN agg_pushdown_child1
+AS c1 ON c1.parent = p.i GROUP BY p.i;
+ QUERY PLAN
+---------------------------------------------------------------------
+ Finalize HashAggregate
+ Group Key: p.i
+ -> Gather
+ Workers Planned: 2
+ -> Parallel Hash Join
+ Hash Cond: (c1.parent = p.i)
+ -> Partial HashAggregate
+ Group Key: c1.parent
+ -> Parallel Seq Scan on agg_pushdown_child1 c1
+ -> Parallel Hash
+ -> Parallel Seq Scan on agg_pushdown_parent p
+(11 rows)
+
+SET enable_seqscan TO off;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v) FROM agg_pushdown_parent AS p JOIN agg_pushdown_child1
+AS c1 ON c1.parent = p.i GROUP BY p.i;
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Group Key: p.i
+ -> Gather Merge
+ Workers Planned: 2
+ -> Nested Loop
+ -> Partial GroupAggregate
+ Group Key: c1.parent
+ -> Parallel Index Scan using agg_pushdown_child1_parent_idx on agg_pushdown_child1 c1
+ -> Index Only Scan using agg_pushdown_parent_pkey on agg_pushdown_parent p
+ Index Cond: (i = c1.parent)
+(10 rows)
+
+SET enable_seqscan TO on;
+SET enable_nestloop TO on;
+SET enable_hashjoin TO off;
+SET enable_mergejoin TO off;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+ QUERY PLAN
+---------------------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Group Key: p.i
+ -> Gather Merge
+ Workers Planned: 2
+ -> Sort
+ Sort Key: p.i
+ -> Nested Loop
+ -> Partial HashAggregate
+ Group Key: c1.parent
+ -> Nested Loop
+ -> Parallel Seq Scan on agg_pushdown_child1 c1
+ -> Index Scan using agg_pushdown_child2_pkey on agg_pushdown_child2 c2
+ Index Cond: (k = c1.j)
+ Filter: (c1.parent = parent)
+ -> Index Only Scan using agg_pushdown_parent_pkey on agg_pushdown_parent p
+ Index Cond: (i = c1.parent)
+(16 rows)
+
+SET enable_nestloop TO off;
+SET enable_hashjoin TO on;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+ QUERY PLAN
+----------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Group Key: p.i
+ -> Sort
+ Sort Key: p.i
+ -> Gather
+ Workers Planned: 1
+ -> Parallel Hash Join
+ Hash Cond: (p.i = c1.parent)
+ -> Parallel Seq Scan on agg_pushdown_parent p
+ -> Parallel Hash
+ -> Partial HashAggregate
+ Group Key: c1.parent
+ -> Parallel Hash Join
+ Hash Cond: ((c1.parent = c2.parent) AND (c1.j = c2.k))
+ -> Parallel Seq Scan on agg_pushdown_child1 c1
+ -> Parallel Hash
+ -> Parallel Seq Scan on agg_pushdown_child2 c2
+(17 rows)
+
+SET enable_hashjoin TO off;
+SET enable_mergejoin TO on;
+SET enable_seqscan TO off;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Group Key: p.i
+ -> Gather Merge
+ Workers Planned: 2
+ -> Merge Join
+ Merge Cond: (c1.parent = p.i)
+ -> Sort
+ Sort Key: c1.parent
+ -> Partial HashAggregate
+ Group Key: c1.parent
+ -> Merge Join
+ Merge Cond: (c1.j = c2.k)
+ Join Filter: (c1.parent = c2.parent)
+ -> Parallel Index Scan using agg_pushdown_child1_pkey on agg_pushdown_child1 c1
+ -> Index Scan using agg_pushdown_child2_pkey on agg_pushdown_child2 c2
+ -> Index Only Scan using agg_pushdown_parent_pkey on agg_pushdown_parent p
+(16 rows)
+
+EXPLAIN (COSTS off)
+SELECT p.i / 2, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i / 2;
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Group Key: (((c1.parent / 2)))
+ -> Sort
+ Sort Key: (((c1.parent / 2)))
+ -> Gather
+ Workers Planned: 2
+ -> Merge Join
+ Merge Cond: (c1.parent = p.i)
+ -> Sort
+ Sort Key: c1.parent
+ -> Partial HashAggregate
+ Group Key: (c1.parent / 2), c1.parent, c2.parent
+ -> Merge Join
+ Merge Cond: (c1.j = c2.k)
+ Join Filter: (c1.parent = c2.parent)
+ -> Parallel Index Scan using agg_pushdown_child1_pkey on agg_pushdown_child1 c1
+ -> Index Scan using agg_pushdown_child2_pkey on agg_pushdown_child2 c2
+ -> Index Only Scan using agg_pushdown_parent_pkey on agg_pushdown_parent p
+(18 rows)
+
+ROLLBACK;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 16f979c8d9..6d406c65cc 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -98,6 +98,9 @@ test: rules psql_crosstab amutils
test: select_parallel
test: write_parallel
+# this one runs parallel workers too
+test: agg_pushdown
+
# no relation related tests can be put in this group
test: publication subscription
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 42632be675..f480c7aaa0 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -139,6 +139,7 @@ test: rules
test: psql_crosstab
test: select_parallel
test: write_parallel
+test: agg_pushdown
test: publication
test: subscription
test: amutils
diff --git a/src/test/regress/sql/agg_pushdown.sql b/src/test/regress/sql/agg_pushdown.sql
new file mode 100644
index 0000000000..05e2f5504f
--- /dev/null
+++ b/src/test/regress/sql/agg_pushdown.sql
@@ -0,0 +1,137 @@
+BEGIN;
+
+CREATE TABLE agg_pushdown_parent (
+ i int primary key);
+
+CREATE TABLE agg_pushdown_child1 (
+ j int primary key,
+ parent int references agg_pushdown_parent,
+ v double precision);
+
+CREATE INDEX ON agg_pushdown_child1(parent);
+
+CREATE TABLE agg_pushdown_child2 (
+ k int primary key,
+ parent int references agg_pushdown_parent,
+ v double precision);
+
+INSERT INTO agg_pushdown_parent(i)
+SELECT n
+FROM generate_series(0, 7) AS s(n);
+
+INSERT INTO agg_pushdown_child1(j, parent, v)
+SELECT 64 * i + n, i, random()
+FROM generate_series(0, 63) AS s(n), agg_pushdown_parent;
+
+INSERT INTO agg_pushdown_child2(k, parent, v)
+SELECT 64 * i + n, i, random()
+FROM generate_series(0, 63) AS s(n), agg_pushdown_parent;
+
+ANALYZE;
+
+SET enable_agg_pushdown TO on;
+
+-- Perform scan of a table and partially aggregate the result.
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v) FROM agg_pushdown_parent AS p JOIN agg_pushdown_child1
+AS c1 ON c1.parent = p.i GROUP BY p.i;
+
+-- Scan index on agg_pushdown_child1(parent) column and partially aggregate
+-- the result using AGG_SORTED strategy.
+SET enable_seqscan TO off;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v) FROM agg_pushdown_parent AS p JOIN agg_pushdown_child1
+AS c1 ON c1.parent = p.i GROUP BY p.i;
+
+SET enable_seqscan TO on;
+
+-- Perform nestloop join between agg_pushdown_child1 and agg_pushdown_child2
+-- and partially aggregate the result.
+SET enable_nestloop TO on;
+SET enable_hashjoin TO off;
+SET enable_mergejoin TO off;
+
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+
+-- The same for hash join.
+SET enable_nestloop TO off;
+SET enable_hashjoin TO on;
+
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+
+-- The same for merge join.
+SET enable_hashjoin TO off;
+SET enable_mergejoin TO on;
+SET enable_seqscan TO off;
+
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+
+-- Generic grouping expression.
+EXPLAIN (COSTS off)
+SELECT p.i / 2, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i / 2;
+
+-- The same tests for parallel plans.
+RESET ALL;
+
+SET parallel_setup_cost TO 0;
+SET parallel_tuple_cost TO 0;
+SET min_parallel_table_scan_size TO 0;
+SET min_parallel_index_scan_size TO 0;
+SET max_parallel_workers_per_gather TO 4;
+
+SET enable_agg_pushdown TO on;
+
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v) FROM agg_pushdown_parent AS p JOIN agg_pushdown_child1
+AS c1 ON c1.parent = p.i GROUP BY p.i;
+
+SET enable_seqscan TO off;
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v) FROM agg_pushdown_parent AS p JOIN agg_pushdown_child1
+AS c1 ON c1.parent = p.i GROUP BY p.i;
+
+SET enable_seqscan TO on;
+
+SET enable_nestloop TO on;
+SET enable_hashjoin TO off;
+SET enable_mergejoin TO off;
+
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+
+SET enable_nestloop TO off;
+SET enable_hashjoin TO on;
+
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+
+SET enable_hashjoin TO off;
+SET enable_mergejoin TO on;
+SET enable_seqscan TO off;
+
+EXPLAIN (COSTS off)
+SELECT p.i, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i;
+
+EXPLAIN (COSTS off)
+SELECT p.i / 2, avg(c1.v + c2.v) FROM agg_pushdown_parent AS p JOIN
+agg_pushdown_child1 AS c1 ON c1.parent = p.i JOIN agg_pushdown_child2 AS c2 ON
+c2.parent = p.i WHERE c1.j = c2.k GROUP BY p.i / 2;
+
+ROLLBACK;
Antonin Houska <ah@cybertec.at> wrote:
> I didn't have enough time to separate "your functionality" and can do it when
> I'm back from vacation.
So I've separated out the code that does not use 2-stage aggregation (and
therefore this part of the feature is not involved in parallel queries).
Note on coding: so far, most of the functions to which I added the "grouped"
argument could get the same information from rel->agg_info (it is set iff the
relation is grouped). So the argument could be removed for this "simple
aggregation" case, but the next (more generic) part of the patch will have to
add some argument anyway, to indicate whether AGGSPLIT_SIMPLE,
AGGSPLIT_INITIAL_SERIAL, or no aggregation at all should be performed.
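As a minimal sketch of what that next step could look like (the enum and its
values below are invented for illustration, they are not part of the attached
patch, whose comments only hint at a prospective "agg_kind" argument):

typedef enum RelAggKind
{
	REL_AGG_NONE,		/* plain scan/join paths, no aggregation */
	REL_AGG_SIMPLE,		/* AGGSPLIT_SIMPLE: complete aggregation */
	REL_AGG_PARTIAL		/* AGGSPLIT_INITIAL_SERIAL: partial aggregation,
				 * to be finalized above a join or Gather */
} RelAggKind;

/* e.g. the signature from the attached patch would then become: */
static void set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
				   RangeTblEntry *rte, RelAggKind agg_kind);

A single argument of that kind would cover the "no aggregation", 1-stage and
2-stage cases without a separate boolean.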
Besides splitting the code, I worked on determining whether a join duplicates
grouping keys or not. Currently it still fails to combine the "uniquekeys" of
the joined relations in some (important) cases where the needed information is
actually available. I think equivalence classes should be used here, as they
are for pathkeys.
So currently, if you want the patch to at least produce interesting EXPLAIN
output, you should comment out this code in
pathnode.c:make_uniquekeys_for_join():

    if (!match_uniquekeys_to_group_pathkeys(root, result, target))
        *keys_ok = false;
One example:
CREATE TABLE a(i int primary key);
CREATE TABLE b(j int primary key, k int);
SET enable_agg_pushdown TO true;
EXPLAIN
SELECT j, sum(k)
FROM a, b
WHERE i = j
GROUP BY j;
QUERY PLAN
-----------------------------------------------------------------------
Hash Join (cost=94.75..162.43 rows=2260 width=12)
Hash Cond: (a.i = b.j)
-> Seq Scan on a (cost=0.00..35.50 rows=2550 width=4)
-> Hash (cost=66.50..66.50 rows=2260 width=12)
-> HashAggregate (cost=43.90..66.50 rows=2260 width=12)
Group Key: b.j
-> Seq Scan on b (cost=0.00..32.60 rows=2260 width=8)
However, there are cases like this one, which currently does not work:
EXPLAIN
SELECT i, sum(k)
FROM a, b
WHERE i = j
GROUP BY i;
The reason is that the column b.j, which is not in the GROUP BY clause, needs
to be included in the output of the grouped "b" relation; otherwise the join
condition cannot be evaluated.
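(For comparison, a workaround sketch: because the inner join equates i and j,
grouping by b's own column is equivalent here, keeps the join key in the
grouped output of "b", and yields the pushed-down plan shown above:

EXPLAIN
SELECT j, sum(k)
FROM a, b
WHERE i = j
GROUP BY j;

Of course the planner should eventually derive this itself via the
equivalence class.)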
While separating out the code that only uses 1-stage aggregation, I removed
the code that adds such extra grouping keys to the per-relation AggPath,
because in general that is only safe if a final aggregation is performed and
the final aggregation uses none of the added columns. However, I forgot that
grouping keys can be added in cases like the one shown above, i.e. where the
grouping expression b.j is derived from the GROUP BY clause via an equivalence
class.
I'll fix this (and various other problems) asap. I believe it's worth at least
showing the current code; I'm curious whether it's something we can build on,
or whether another rework will be needed.
(I'll be off next week.)
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com
Attachments:
agg_pushdown_simple.patch (text/x-diff)
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index e284fd71d7..c6ef340f03 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -798,6 +798,41 @@ ExecInitExprRec(Expr *node, ExprState *state,
break;
}
+ case T_GroupedVar:
+
+ /*
+ * If GroupedVar appears in targetlist of Agg node, it can
+ * represent either Aggref or grouping expression.
+ *
+ * TODO Consider doing this expansion earlier, e.g. in setrefs.c.
+ */
+ if (state->parent && (IsA(state->parent, AggState)))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+
+ if (IsA(gvar->gvexpr, Aggref))
+ {
+ ExecInitExprRec((Expr *) gvar->gvexpr, state,
+ resv, resnull);
+ }
+ else
+ ExecInitExprRec((Expr *) gvar->gvexpr, state,
+ resv, resnull);
+ break;
+ }
+ else
+ {
+ /*
+ * set_plan_refs should have replaced GroupedVar in the
+ * targetlist with an ordinary Var.
+ *
+ * XXX Should we error out here? There's at least one legal
+ * case here which we'd have to check: a Result plan with no
+ * outer plan which represents an empty Append plan.
+ */
+ break;
+ }
+
case T_GroupingFunc:
{
GroupingFunc *grp_node = (GroupingFunc *) node;
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 7c8220cf65..520557d2ac 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1433,6 +1433,7 @@ _copyAggref(const Aggref *from)
COPY_SCALAR_FIELD(aggcollid);
COPY_SCALAR_FIELD(inputcollid);
COPY_SCALAR_FIELD(aggtranstype);
+ COPY_SCALAR_FIELD(aggcombinefn);
COPY_NODE_FIELD(aggargtypes);
COPY_NODE_FIELD(aggdirectargs);
COPY_NODE_FIELD(args);
@@ -2285,6 +2286,22 @@ _copyPlaceHolderVar(const PlaceHolderVar *from)
}
/*
+ * _copyGroupedVar
+ */
+static GroupedVar *
+_copyGroupedVar(const GroupedVar *from)
+{
+ GroupedVar *newnode = makeNode(GroupedVar);
+
+ COPY_NODE_FIELD(gvexpr);
+ COPY_SCALAR_FIELD(sortgroupref);
+ COPY_SCALAR_FIELD(gvid);
+ COPY_SCALAR_FIELD(width);
+
+ return newnode;
+}
+
+/*
* _copySpecialJoinInfo
*/
static SpecialJoinInfo *
@@ -2343,6 +2360,20 @@ _copyPlaceHolderInfo(const PlaceHolderInfo *from)
return newnode;
}
+static GroupedVarInfo *
+_copyGroupedVarInfo(const GroupedVarInfo *from)
+{
+ GroupedVarInfo *newnode = makeNode(GroupedVarInfo);
+
+ COPY_SCALAR_FIELD(gvid);
+ COPY_NODE_FIELD(gvexpr);
+ COPY_SCALAR_FIELD(sortgroupref);
+ COPY_SCALAR_FIELD(gv_eval_at);
+ COPY_SCALAR_FIELD(derived);
+
+ return newnode;
+}
+
/* ****************************************************************
* parsenodes.h copy functions
* ****************************************************************
@@ -5101,6 +5132,9 @@ copyObjectImpl(const void *from)
case T_PlaceHolderVar:
retval = _copyPlaceHolderVar(from);
break;
+ case T_GroupedVar:
+ retval = _copyGroupedVar(from);
+ break;
case T_SpecialJoinInfo:
retval = _copySpecialJoinInfo(from);
break;
@@ -5110,6 +5144,9 @@ copyObjectImpl(const void *from)
case T_PlaceHolderInfo:
retval = _copyPlaceHolderInfo(from);
break;
+ case T_GroupedVarInfo:
+ retval = _copyGroupedVarInfo(from);
+ break;
/*
* VALUE NODES
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 378f2facb8..9d990ba828 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -873,6 +873,14 @@ _equalPlaceHolderVar(const PlaceHolderVar *a, const PlaceHolderVar *b)
}
static bool
+_equalGroupedVar(const GroupedVar *a, const GroupedVar *b)
+{
+ COMPARE_SCALAR_FIELD(gvid);
+
+ return true;
+}
+
+static bool
_equalSpecialJoinInfo(const SpecialJoinInfo *a, const SpecialJoinInfo *b)
{
COMPARE_BITMAPSET_FIELD(min_lefthand);
@@ -3173,6 +3181,9 @@ equal(const void *a, const void *b)
case T_PlaceHolderVar:
retval = _equalPlaceHolderVar(a, b);
break;
+ case T_GroupedVar:
+ retval = _equalGroupedVar(a, b);
+ break;
case T_SpecialJoinInfo:
retval = _equalSpecialJoinInfo(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index a10014f755..54d20f811f 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -259,6 +259,9 @@ exprType(const Node *expr)
case T_PlaceHolderVar:
type = exprType((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupedVar:
+ type = exprType((Node *) ((const GroupedVar *) expr)->gvexpr);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
type = InvalidOid; /* keep compiler quiet */
@@ -492,6 +495,8 @@ exprTypmod(const Node *expr)
return ((const SetToDefault *) expr)->typeMod;
case T_PlaceHolderVar:
return exprTypmod((Node *) ((const PlaceHolderVar *) expr)->phexpr);
+ case T_GroupedVar:
+ return exprTypmod((Node *) ((const GroupedVar *) expr)->gvexpr);
default:
break;
}
@@ -903,6 +908,9 @@ exprCollation(const Node *expr)
case T_PlaceHolderVar:
coll = exprCollation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_GroupedVar:
+ coll = exprCollation((Node *) ((const GroupedVar *) expr)->gvexpr);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
coll = InvalidOid; /* keep compiler quiet */
@@ -2187,6 +2195,8 @@ expression_tree_walker(Node *node,
break;
case T_PlaceHolderVar:
return walker(((PlaceHolderVar *) node)->phexpr, context);
+ case T_GroupedVar:
+ return walker(((GroupedVar *) node)->gvexpr, context);
case T_InferenceElem:
return walker(((InferenceElem *) node)->expr, context);
case T_AppendRelInfo:
@@ -2993,6 +3003,15 @@ expression_tree_mutator(Node *node,
return (Node *) newnode;
}
break;
+ case T_GroupedVar:
+ {
+ GroupedVar *gv = (GroupedVar *) node;
+ GroupedVar *newnode;
+
+ FLATCOPY(newnode, gv, GroupedVar);
+ MUTATE(newnode->gvexpr, gv->gvexpr, Expr *);
+ return (Node *) newnode;
+ }
case T_InferenceElem:
{
InferenceElem *inferenceelemdexpr = (InferenceElem *) node;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6269f474d2..25ca3fb57b 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1207,6 +1207,7 @@ _outAggref(StringInfo str, const Aggref *node)
WRITE_OID_FIELD(aggcollid);
WRITE_OID_FIELD(inputcollid);
WRITE_OID_FIELD(aggtranstype);
+ WRITE_OID_FIELD(aggcombinefn);
WRITE_NODE_FIELD(aggargtypes);
WRITE_NODE_FIELD(aggdirectargs);
WRITE_NODE_FIELD(args);
@@ -2297,6 +2298,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_NODE_FIELD(append_rel_list);
WRITE_NODE_FIELD(rowMarks);
WRITE_NODE_FIELD(placeholder_list);
+ WRITE_NODE_FIELD(grouped_var_list);
WRITE_NODE_FIELD(fkey_list);
WRITE_NODE_FIELD(query_pathkeys);
WRITE_NODE_FIELD(group_pathkeys);
@@ -2344,6 +2346,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
WRITE_NODE_FIELD(cheapest_parameterized_paths);
WRITE_BITMAPSET_FIELD(direct_lateral_relids);
WRITE_BITMAPSET_FIELD(lateral_relids);
+ WRITE_NODE_FIELD(agg_info);
WRITE_UINT_FIELD(relid);
WRITE_OID_FIELD(reltablespace);
WRITE_ENUM_FIELD(rtekind, RTEKind);
@@ -2521,6 +2524,18 @@ _outParamPathInfo(StringInfo str, const ParamPathInfo *node)
}
static void
+_outRelAggInfo(StringInfo str, const RelAggInfo *node)
+{
+ WRITE_NODE_TYPE("RELAGGINFO");
+
+ WRITE_NODE_FIELD(target);
+ WRITE_NODE_FIELD(input);
+ WRITE_NODE_FIELD(group_clauses);
+ WRITE_NODE_FIELD(group_exprs);
+ WRITE_NODE_FIELD(agg_exprs);
+}
+
+static void
_outRestrictInfo(StringInfo str, const RestrictInfo *node)
{
WRITE_NODE_TYPE("RESTRICTINFO");
@@ -2564,6 +2579,17 @@ _outPlaceHolderVar(StringInfo str, const PlaceHolderVar *node)
}
static void
+_outGroupedVar(StringInfo str, const GroupedVar *node)
+{
+ WRITE_NODE_TYPE("GROUPEDVAR");
+
+ WRITE_NODE_FIELD(gvexpr);
+ WRITE_UINT_FIELD(sortgroupref);
+ WRITE_UINT_FIELD(gvid);
+ WRITE_INT_FIELD(width);
+}
+
+static void
_outSpecialJoinInfo(StringInfo str, const SpecialJoinInfo *node)
{
WRITE_NODE_TYPE("SPECIALJOININFO");
@@ -2608,6 +2634,18 @@ _outPlaceHolderInfo(StringInfo str, const PlaceHolderInfo *node)
}
static void
+_outGroupedVarInfo(StringInfo str, const GroupedVarInfo *node)
+{
+ WRITE_NODE_TYPE("GROUPEDVARINFO");
+
+ WRITE_UINT_FIELD(gvid);
+ WRITE_NODE_FIELD(gvexpr);
+ WRITE_UINT_FIELD(sortgroupref);
+ WRITE_BITMAPSET_FIELD(gv_eval_at);
+ WRITE_BOOL_FIELD(derived);
+}
+
+static void
_outMinMaxAggInfo(StringInfo str, const MinMaxAggInfo *node)
{
WRITE_NODE_TYPE("MINMAXAGGINFO");
@@ -4134,12 +4172,18 @@ outNode(StringInfo str, const void *obj)
case T_ParamPathInfo:
_outParamPathInfo(str, obj);
break;
+ case T_RelAggInfo:
+ _outRelAggInfo(str, obj);
+ break;
case T_RestrictInfo:
_outRestrictInfo(str, obj);
break;
case T_PlaceHolderVar:
_outPlaceHolderVar(str, obj);
break;
+ case T_GroupedVar:
+ _outGroupedVar(str, obj);
+ break;
case T_SpecialJoinInfo:
_outSpecialJoinInfo(str, obj);
break;
@@ -4149,6 +4193,9 @@ outNode(StringInfo str, const void *obj)
case T_PlaceHolderInfo:
_outPlaceHolderInfo(str, obj);
break;
+ case T_GroupedVarInfo:
+ _outGroupedVarInfo(str, obj);
+ break;
case T_MinMaxAggInfo:
_outMinMaxAggInfo(str, obj);
break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3254524223..dd9573a0ef 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -534,6 +534,22 @@ _readVar(void)
}
/*
+ * _readGroupedVar
+ */
+static GroupedVar *
+_readGroupedVar(void)
+{
+ READ_LOCALS(GroupedVar);
+
+ READ_NODE_FIELD(gvexpr);
+ READ_UINT_FIELD(sortgroupref);
+ READ_UINT_FIELD(gvid);
+ READ_INT_FIELD(width);
+
+ READ_DONE();
+}
+
+/*
* _readConst
*/
static Const *
@@ -589,6 +605,7 @@ _readAggref(void)
READ_OID_FIELD(aggcollid);
READ_OID_FIELD(inputcollid);
READ_OID_FIELD(aggtranstype);
+ READ_OID_FIELD(aggcombinefn);
READ_NODE_FIELD(aggargtypes);
READ_NODE_FIELD(aggdirectargs);
READ_NODE_FIELD(args);
@@ -2547,6 +2564,8 @@ parseNodeString(void)
return_value = _readTableFunc();
else if (MATCH("VAR", 3))
return_value = _readVar();
+ else if (MATCH("GROUPEDVAR", 10))
+ return_value = _readGroupedVar();
else if (MATCH("CONST", 5))
return_value = _readConst();
else if (MATCH("PARAM", 5))
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 0e80aeb65c..98485f5298 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -58,6 +58,7 @@ typedef struct pushdown_safety_info
/* These parameters are set by GUC */
bool enable_geqo = false; /* just in case GUC doesn't set it */
+bool enable_agg_pushdown;
int geqo_threshold;
int min_parallel_table_scan_size;
int min_parallel_index_scan_size;
@@ -73,16 +74,17 @@ static void set_base_rel_consider_startup(PlannerInfo *root);
static void set_base_rel_sizes(PlannerInfo *root);
static void set_base_rel_pathlists(PlannerInfo *root);
static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte);
+ Index rti, RangeTblEntry *rte, bool grouped);
static void set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte);
+ Index rti, RangeTblEntry *rte,
+ bool grouped);
static void set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel,
- RangeTblEntry *rte);
+ RangeTblEntry *rte, bool grouped);
static void create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel);
static void set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static void set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
- RangeTblEntry *rte);
+ RangeTblEntry *rte, bool grouped);
static void set_tablesample_rel_size(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static void set_tablesample_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
@@ -92,9 +94,11 @@ static void set_foreign_size(PlannerInfo *root, RelOptInfo *rel,
static void set_foreign_pathlist(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte);
+ Index rti, RangeTblEntry *rte,
+ bool grouped);
static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte);
+ Index rti, RangeTblEntry *rte,
+ bool grouped);
static void generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
List *live_childrels,
List *all_child_pathkeys,
@@ -118,7 +122,8 @@ static void set_namedtuplestore_pathlist(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static void set_worktable_pathlist(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
-static RelOptInfo *make_rel_from_joinlist(PlannerInfo *root, List *joinlist);
+static RelOptInfo *make_rel_from_joinlist(PlannerInfo *root,
+ List *joinlist);
static bool subquery_is_pushdown_safe(Query *subquery, Query *topquery,
pushdown_safety_info *safetyInfo);
static bool recurse_pushdown_safe(Node *setOp, Query *topquery,
@@ -140,7 +145,8 @@ static void remove_unused_subquery_outputs(Query *subquery, RelOptInfo *rel);
/*
* make_one_rel
* Finds all possible access paths for executing a query, returning a
- * single rel that represents the join of all base rels in the query.
+ * single rel that represents the join of all base rels in the query. If
+ * possible, also return a join that contains partial aggregate(s).
*/
RelOptInfo *
make_one_rel(PlannerInfo *root, List *joinlist)
@@ -169,12 +175,16 @@ make_one_rel(PlannerInfo *root, List *joinlist)
root->all_baserels = bms_add_member(root->all_baserels, brel->relid);
}
- /* Mark base rels as to whether we care about fast-start plans */
+ /*
+ * Mark base rels as to whether we care about fast-start plans. XXX We
+ * deliberately do not mark grouped rels --- see the comment on
+ * consider_startup in build_simple_rel().
+ */
set_base_rel_consider_startup(root);
/*
- * Compute size estimates and consider_parallel flags for each base rel,
- * then generate access paths.
+ * Compute size estimates and consider_parallel flags for each plain and
+ * each grouped base rel, then generate access paths.
*/
set_base_rel_sizes(root);
set_base_rel_pathlists(root);
@@ -231,6 +241,19 @@ set_base_rel_consider_startup(PlannerInfo *root)
RelOptInfo *rel = find_base_rel(root, varno);
rel->consider_param_startup = true;
+
+ if (rel->grouped)
+ {
+ /*
+ * For grouped relations, paths differ substantially depending on
+ * the AggStrategy. Paths that use AGG_HASHED should not be
+ * parameterized (because creation of the hash table would have to
+ * be repeated for different parameters) but paths using
+ * AGG_SORTED can be. The latter seems to justify considering
+ * the startup cost for grouped relations in general.
+ */
+ rel->grouped->consider_param_startup = true;
+ }
}
}
}
@@ -278,7 +301,9 @@ set_base_rel_sizes(PlannerInfo *root)
if (root->glob->parallelModeOK)
set_rel_consider_parallel(root, rel, rte);
- set_rel_size(root, rel, rti, rte);
+ set_rel_size(root, rel, rti, rte, false);
+ if (rel->grouped)
+ set_rel_size(root, rel, rti, rte, true);
}
}
@@ -297,7 +322,9 @@ set_base_rel_pathlists(PlannerInfo *root)
{
RelOptInfo *rel = root->simple_rel_array[rti];
- /* there may be empty slots corresponding to non-baserel RTEs */
+ /*
+ * there may be empty slots corresponding to non-baserel RTEs
+ */
if (rel == NULL)
continue;
@@ -307,7 +334,20 @@ set_base_rel_pathlists(PlannerInfo *root)
if (rel->reloptkind != RELOPT_BASEREL)
continue;
- set_rel_pathlist(root, rel, rti, root->simple_rte_array[rti]);
+ set_rel_pathlist(root, rel, rti, root->simple_rte_array[rti], false);
+
+ /*
+ * Create grouped paths for grouped relation if it exists.
+ */
+ if (rel->grouped)
+ {
+ Assert(rel->grouped->agg_info != NULL);
+ Assert(rel->grouped->grouped == NULL);
+
+ set_rel_pathlist(root, rel, rti,
+ root->simple_rte_array[rti],
+ true);
+ }
}
}
@@ -317,8 +357,14 @@ set_base_rel_pathlists(PlannerInfo *root)
*/
static void
set_rel_size(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte)
+ Index rti, RangeTblEntry *rte, bool grouped)
{
+ /*
+ * build_simple_rel() should not have created rels that do not match this
+ * condition.
+ */
+ Assert(!grouped || rte->rtekind == RTE_RELATION);
+
if (rel->reloptkind == RELOPT_BASEREL &&
relation_excluded_by_constraints(root, rel, rte))
{
@@ -338,7 +384,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
else if (rte->inh)
{
/* It's an "append relation", process accordingly */
- set_append_rel_size(root, rel, rti, rte);
+ set_append_rel_size(root, rel, rti, rte, grouped);
}
else
{
@@ -348,6 +394,8 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
if (rte->relkind == RELKIND_FOREIGN_TABLE)
{
/* Foreign table */
+ /* Not supported yet, see build_simple_rel(). */
+ Assert(!grouped);
set_foreign_size(root, rel, rte);
}
else if (rte->relkind == RELKIND_PARTITIONED_TABLE)
@@ -356,17 +404,22 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
* A partitioned table without any partitions is marked as
* a dummy rel.
*/
+ if (grouped)
+ rel = rel->grouped;
+
set_dummy_rel_pathlist(rel);
}
else if (rte->tablesample != NULL)
{
/* Sampled relation */
+ /* Not supported yet, see build_simple_rel(). */
+ Assert(!grouped);
set_tablesample_rel_size(root, rel, rte);
}
else
{
/* Plain relation */
- set_plain_rel_size(root, rel, rte);
+ set_plain_rel_size(root, rel, rte, grouped);
}
break;
case RTE_SUBQUERY:
@@ -420,8 +473,16 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
*/
static void
set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte)
+ Index rti, RangeTblEntry *rte, bool grouped)
{
+ RelOptInfo *rel_plain = rel; /* non-grouped relation */
+
+ /*
+ * add_grouped_base_rels_to_query() should not have created rels that do
+ * not match this condition.
+ */
+ Assert(!grouped || rte->rtekind == RTE_RELATION);
+
if (IS_DUMMY_REL(rel))
{
/* We already proved the relation empty, so nothing more to do */
@@ -429,7 +490,7 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
else if (rte->inh)
{
/* It's an "append relation", process accordingly */
- set_append_rel_pathlist(root, rel, rti, rte);
+ set_append_rel_pathlist(root, rel, rti, rte, grouped);
}
else
{
@@ -439,17 +500,21 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
if (rte->relkind == RELKIND_FOREIGN_TABLE)
{
/* Foreign table */
+ /* Not supported yet, see build_simple_rel(). */
+ Assert(!grouped);
set_foreign_pathlist(root, rel, rte);
}
else if (rte->tablesample != NULL)
{
/* Sampled relation */
+ /* Not supported yet, see build_simple_rel(). */
+ Assert(!grouped);
set_tablesample_rel_pathlist(root, rel, rte);
}
else
{
/* Plain relation */
- set_plain_rel_pathlist(root, rel, rte);
+ set_plain_rel_pathlist(root, rel, rte, grouped);
}
break;
case RTE_SUBQUERY:
@@ -479,6 +544,9 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
}
}
+ if (grouped)
+ rel = rel->grouped;
+
/*
* If this is a baserel, we should normally consider gathering any partial
* paths we may have created for it.
@@ -491,9 +559,13 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
* Also, if this is the topmost scan/join rel (that is, the only baserel),
* we postpone this until the final scan/join targelist is available (see
* grouping_planner).
+ *
+ * Note on aggregation push-down: parallel paths are not supported until
+ * we implement the feature using 2-stage aggregation.
*/
if (rel->reloptkind == RELOPT_BASEREL &&
- bms_membership(root->all_baserels) != BMS_SINGLETON)
+ bms_membership(root->all_baserels) != BMS_SINGLETON &&
+ !grouped)
generate_gather_paths(root, rel, false);
/*
@@ -504,6 +576,22 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
if (set_rel_pathlist_hook)
(*set_rel_pathlist_hook) (root, rel, rti, rte);
+ /*
+ * Get rid of the grouped relations which have no paths (and to which
+ * generate_gather_paths() won't add any).
+ */
+ if (grouped && rel->pathlist == NIL)
+ {
+ /*
+ * This grouped rel should not contain any partial paths.
+ */
+ Assert(rel->partial_pathlist == NIL);
+
+ pfree(rel_plain->grouped);
+ rel_plain->grouped = NULL;
+ return;
+ }
+
/* Now find the cheapest of the paths for this rel */
set_cheapest(rel);
@@ -517,8 +605,12 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
* Set size estimates for a plain relation (no subquery, no inheritance)
*/
static void
-set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
+set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte,
+ bool grouped)
{
+ if (grouped)
+ rel = rel->grouped;
+
/*
* Test any partial indexes of rel for applicability. We must do this
* first since partial unique indexes can affect size estimates.
@@ -692,9 +784,15 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
* Build access paths for a plain relation (no subquery, no inheritance)
*/
static void
-set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
+set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte,
+ bool grouped)
{
Relids required_outer;
+ Path *seq_path;
+ RelOptInfo *rel_plain = rel;
+
+ if (grouped)
+ rel = rel->grouped;
/*
* We don't support pushing join clauses into the quals of a seqscan, but
@@ -703,18 +801,43 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
*/
required_outer = rel->lateral_relids;
- /* Consider sequential scan */
- add_path(rel, create_seqscan_path(root, rel, required_outer, 0));
+ /* Consider sequential scan, both plain and grouped. */
+ seq_path = create_seqscan_path(root, rel, required_outer, 0);
+
+
+ /*
+ * It's probably not a good idea to repeat hashed aggregation with different
+ * parameters, so check that there are no parameters.
+ */
+ if (!grouped)
+ {
+ /* Try to compute unique keys. */
+ make_uniquekeys(root, seq_path);
+
+ add_path(rel, seq_path);
+ }
+ else if (required_outer == NULL)
+ {
+ /*
+ * Only AGG_HASHED is suitable here as it does not expect the input
+ * set to be sorted.
+ */
+ create_grouped_path(root, rel, seq_path, false, false, AGG_HASHED);
+ }
- /* If appropriate, consider parallel sequential scan */
- if (rel->consider_parallel && required_outer == NULL)
- create_plain_partial_paths(root, rel);
+ /* If appropriate, consider parallel sequential scan (plain or grouped) */
+ if (rel->consider_parallel && required_outer == NULL && !grouped)
+ create_plain_partial_paths(root, rel_plain);
- /* Consider index scans */
- create_index_paths(root, rel);
+ /*
+ * Consider index scans, possibly including the grouped and grouped
+ * partial paths.
+ */
+ create_index_paths(root, rel, grouped);
/* Consider TID scans */
- create_tidscan_paths(root, rel);
+ /* TODO Regression test for these paths. */
+ create_tidscan_paths(root, rel, grouped);
}
/*
@@ -726,8 +849,7 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
{
int parallel_workers;
- parallel_workers = compute_parallel_worker(rel, rel->pages, -1,
- max_parallel_workers_per_gather);
+ parallel_workers = compute_parallel_worker(rel, rel->pages, -1, max_parallel_workers_per_gather);
/* If any limit was set to zero, the user doesn't want a parallel scan. */
if (parallel_workers <= 0)
@@ -738,6 +860,100 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
}
/*
+ * Apply aggregation to a subpath and add the AggPath to the pathlist.
+ *
+ * "precheck" tells whether the aggregation path should first be checked using
+ * add_path_precheck() / add_partial_path_precheck().
+ *
+ * If "parallel" is true, the aggregation path is considered partial in terms
+ * of parallel execution.
+ *
+ * Caution: Since only a grouped relation makes sense as an input for this
+ * function, "rel" is the grouped relation even though "agg_kind" is passed
+ * too. This is different from other functions that receive "agg_kind" and use
+ * it to fetch the grouped relation themselves.
+ *
+ * The return value tells whether the path was added to the pathlist.
+ *
+ * TODO Pass the plain rel and use agg_kind to retrieve the grouped one.
+ */
+bool
+create_grouped_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
+ bool precheck, bool parallel, AggStrategy aggstrategy)
+{
+ Path *agg_path;
+ RelAggInfo *agg_info = rel->agg_info;
+
+ Assert(agg_info != NULL);
+
+ /*
+ * We can only support parallel paths if each worker produced a distinct
+ * set of grouping keys, but such a special case is not known. So this
+ * list should be empty.
+ */
+ if (parallel)
+ return false;
+
+ /*
+ * If the AggPath should be partial, the subpath must be too, and
+ * therefore the subpath is essentially parallel_safe.
+ */
+ Assert(subpath->parallel_safe || !parallel);
+
+ /*
+ * Repeated creation of hash table does not sound like a good idea. Caller
+ * should avoid asking us to do so.
+ */
+ Assert(subpath->param_info == NULL || aggstrategy != AGG_HASHED);
+
+ if (aggstrategy == AGG_HASHED)
+ agg_path = (Path *) create_agg_hashed_path(root, subpath,
+ subpath->rows);
+ else if (aggstrategy == AGG_SORTED)
+ agg_path = (Path *) create_agg_sorted_path(root, subpath,
+ true,
+ subpath->rows);
+ else
+ elog(ERROR, "unexpected strategy %d", aggstrategy);
+
+ /* Add the grouped path to the list of grouped base paths. */
+ if (agg_path != NULL)
+ {
+ if (precheck)
+ {
+ List *pathkeys;
+
+ /* AGG_HASHED is not supposed to generate sorted output. */
+ pathkeys = aggstrategy == AGG_SORTED ? subpath->pathkeys : NIL;
+
+ if (!parallel &&
+ !add_path_precheck(rel, agg_path->startup_cost,
+ agg_path->total_cost, pathkeys, NULL))
+ return false;
+
+ if (parallel &&
+ !add_partial_path_precheck(rel, agg_path->total_cost,
+ pathkeys))
+ return false;
+ }
+
+ if (!parallel)
+ {
+ /* Try to compute unique keys. */
+ make_uniquekeys(root, (Path *) agg_path);
+
+ add_path(rel, (Path *) agg_path);
+ }
+ else
+ add_partial_path(rel, (Path *) agg_path);
+
+ return true;
+ }
+
+ return false;
+}
+
+/*
* set_tablesample_rel_size
* Set size estimates for a sampled relation
*/
@@ -866,7 +1082,7 @@ set_foreign_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
*/
static void
set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte)
+ Index rti, RangeTblEntry *rte, bool grouped)
{
int parentRTindex = rti;
bool has_live_children;
@@ -1016,10 +1232,50 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
* PlaceHolderVars.) XXX we do not bother to update the cost or width
* fields of childrel->reltarget; not clear if that would be useful.
*/
- childrel->reltarget->exprs = (List *)
- adjust_appendrel_attrs(root,
- (Node *) rel->reltarget->exprs,
- 1, &appinfo);
+ if (grouped)
+ {
+ RelOptInfo *rel_grouped,
+ *childrel_grouped;
+
+ Assert(childrel->grouped != NULL);
+
+ childrel_grouped = childrel->grouped;
+ rel_grouped = rel->grouped;
+
+ /*
+ * Special attention is needed in the grouped case.
+ *
+ * copy_simple_rel() didn't create empty target because it's
+ * better to start with copying one from the parent rel.
+ */
+ Assert(childrel_grouped->reltarget == NULL &&
+ childrel_grouped->agg_info == NULL);
+
+ /*
+ * The parent rel should already have the info that we're setting
+ * up now for the child.
+ */
+ Assert(rel_grouped->reltarget != NULL &&
+ rel_grouped->agg_info != NULL);
+
+ /*
+ * Translate the targets and grouping expressions so they match
+ * this child.
+ */
+ childrel_grouped->agg_info = translate_rel_agg_info(root,
+ rel_grouped->agg_info,
+ &appinfo, 1);
+
+ /*
+ * The relation paths will generate input for partial aggregation.
+ */
+ childrel_grouped->reltarget = childrel_grouped->agg_info->input;
+ }
+ else
+ childrel->reltarget->exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) rel->reltarget->exprs,
+ 1, &appinfo);
/*
* We have to make child entries in the EquivalenceClass data
@@ -1181,19 +1437,42 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
1, &appinfo);
/*
+ * We have to make child entries in the EquivalenceClass data
+ * structures as well. This is needed either if the parent
+ * participates in some eclass joins (because we will want to consider
+ * inner-indexscan joins on the individual children) or if the parent
+ * has useful pathkeys (because we should try to build MergeAppend
+ * paths that produce those sort orderings).
+ */
+ if (rel->has_eclass_joins || has_useful_pathkeys(root, rel))
+ add_child_rel_equivalences(root, appinfo, rel, childrel);
+ childrel->has_eclass_joins = rel->has_eclass_joins;
+
+ /*
+ * Note: we could compute appropriate attr_needed data for the child's
+ * variables, by transforming the parent's attr_needed through the
+ * translated_vars mapping. However, currently there's no need
+ * because attr_needed is only examined for base relations not
+ * otherrels. So we just leave the child's attr_needed empty.
+ */
+
+ /*
* If parallelism is allowable for this query in general, see whether
* it's allowable for this childrel in particular. But if we've
* already decided the appendrel is not parallel-safe as a whole,
* there's no point in considering parallelism for this child. For
* consistency, do this before calling set_rel_size() for the child.
+ *
+ * The aggregated relations do not use the consider_parallel flag.
*/
- if (root->glob->parallelModeOK && rel->consider_parallel)
+ if (root->glob->parallelModeOK && rel->consider_parallel &&
+ !grouped)
set_rel_consider_parallel(root, childrel, childRTE);
/*
* Compute the child's size.
*/
- set_rel_size(root, childrel, childRTindex, childRTE);
+ set_rel_size(root, childrel, childRTindex, childRTE, grouped);
/*
* It is possible that constraint exclusion detected a contradiction
@@ -1299,13 +1578,20 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
*/
static void
set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
- Index rti, RangeTblEntry *rte)
+ Index rti, RangeTblEntry *rte, bool grouped)
{
int parentRTindex = rti;
List *live_childrels = NIL;
ListCell *l;
/*
+ * TODO Only allow per-child AGGSPLIT_SIMPLE if the partitioning allows
+ * it, i.e. each partition generates distinct set of grouping keys.
+ */
+ if (grouped)
+ return;
+
+ /*
* Generate access paths for each member relation, and remember the
* non-dummy children.
*/
@@ -1323,7 +1609,7 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
/* Re-locate the child RTE and RelOptInfo */
childRTindex = appinfo->child_relid;
childRTE = root->simple_rte_array[childRTindex];
- childrel = root->simple_rel_array[childRTindex];
+ childrel = find_base_rel(root, childRTindex);
/*
* If set_append_rel_size() decided the parent appendrel was
@@ -1337,7 +1623,7 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
/*
* Compute the child's access paths.
*/
- set_rel_pathlist(root, childrel, childRTindex, childRTE);
+ set_rel_pathlist(root, childrel, childRTindex, childRTE, grouped);
/*
* If child is dummy, ignore it.
@@ -1351,13 +1637,9 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
list_concat(rel->partitioned_child_rels,
list_copy(childrel->partitioned_child_rels));
- /*
- * Child is live, so add it to the live_childrels list for use below.
- */
live_childrels = lappend(live_childrels, childrel);
}
- /* Add paths to the append relation. */
add_paths_to_append_rel(root, rel, live_childrels);
}
@@ -1794,6 +2076,7 @@ generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
List *partitioned_rels)
{
ListCell *lcp;
+ PathTarget *target = NULL;
foreach(lcp, all_child_pathkeys)
{
@@ -1802,23 +2085,25 @@ generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
List *total_subpaths = NIL;
bool startup_neq_total = false;
ListCell *lcr;
+ Path *path;
/* Select the child paths for this ordering... */
foreach(lcr, live_childrels)
{
RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr);
+ List *pathlist = childrel->pathlist;
Path *cheapest_startup,
*cheapest_total;
/* Locate the right paths, if they are available. */
cheapest_startup =
- get_cheapest_path_for_pathkeys(childrel->pathlist,
+ get_cheapest_path_for_pathkeys(pathlist,
pathkeys,
NULL,
STARTUP_COST,
false);
cheapest_total =
- get_cheapest_path_for_pathkeys(childrel->pathlist,
+ get_cheapest_path_for_pathkeys(pathlist,
pathkeys,
NULL,
TOTAL_COST,
@@ -1851,19 +2136,28 @@ generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
}
/* ... and build the MergeAppend paths */
- add_path(rel, (Path *) create_merge_append_path(root,
- rel,
- startup_subpaths,
- pathkeys,
- NULL,
- partitioned_rels));
+ path = (Path *) create_merge_append_path(root,
+ rel,
+ target,
+ startup_subpaths,
+ pathkeys,
+ NULL,
+ partitioned_rels);
+
+ add_path(rel, path);
+
if (startup_neq_total)
- add_path(rel, (Path *) create_merge_append_path(root,
- rel,
- total_subpaths,
- pathkeys,
- NULL,
- partitioned_rels));
+ {
+ path = (Path *) create_merge_append_path(root,
+ rel,
+ target,
+ total_subpaths,
+ pathkeys,
+ NULL,
+ partitioned_rels);
+ add_path(rel, path);
+ }
+
}
}
@@ -2665,11 +2959,22 @@ make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
root->initial_rels = initial_rels;
if (join_search_hook)
- return (*join_search_hook) (root, levels_needed, initial_rels);
+ return (*join_search_hook) (root, levels_needed,
+ initial_rels);
else if (enable_geqo && levels_needed >= geqo_threshold)
+ {
+ /*
+ * TODO Teach GEQO about grouped relations. Don't forget that
+ * pathlist can be NIL before set_cheapest() gets called.
+ *
+ * This processing makes no difference between plain and grouped
+ * rels, so process them in the same loop.
+ */
return geqo(root, levels_needed, initial_rels);
+ }
else
- return standard_join_search(root, levels_needed, initial_rels);
+ return standard_join_search(root, levels_needed,
+ initial_rels);
}
}
@@ -2767,6 +3072,23 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
/* Find and save the cheapest paths for this rel */
set_cheapest(rel);
+ if (rel->grouped)
+ {
+ RelOptInfo *rel_grouped;
+
+ rel_grouped = rel->grouped;
+
+ Assert(rel_grouped->partial_pathlist == NIL);
+
+ if (rel_grouped->pathlist != NIL)
+ set_cheapest(rel_grouped);
+ else
+ {
+ pfree(rel_grouped);
+ rel->grouped = NULL;
+ }
+ }
+
#ifdef OPTIMIZER_DEBUG
debug_print_rel(root, rel);
#endif
@@ -3404,6 +3726,7 @@ create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
{
int parallel_workers;
double pages_fetched;
+ Path *bmhpath;
/* Compute heap pages for bitmap heap scan */
pages_fetched = compute_bitmap_pages(root, rel, bitmapqual, 1.0,
@@ -3415,8 +3738,21 @@ create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
if (parallel_workers <= 0)
return;
- add_partial_path(rel, (Path *) create_bitmap_heap_path(root, rel,
- bitmapqual, rel->lateral_relids, 1.0, parallel_workers));
+ bmhpath = (Path *) create_bitmap_heap_path(root, rel, bitmapqual,
+ rel->lateral_relids, 1.0,
+ parallel_workers);
+
+ if (rel->agg_info == NULL)
+ add_partial_path(rel, bmhpath);
+ else
+ {
+ /*
+ * Only AGG_HASHED is suitable here as it does not expect the input
+ * set to be sorted.
+ */
+ create_grouped_path(root, rel, (Path *) bmhpath, false, true,
+ AGG_HASHED);
+ }
}
/*
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 7bf67a0529..b904ba3f85 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -88,6 +88,7 @@
#include "optimizer/plancat.h"
#include "optimizer/planmain.h"
#include "optimizer/restrictinfo.h"
+#include "optimizer/var.h"
#include "parser/parsetree.h"
#include "utils/lsyscache.h"
#include "utils/selfuncs.h"
@@ -1065,6 +1066,17 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
*cost = path->total_cost;
*selec = ((BitmapOrPath *) path)->bitmapselectivity;
}
+ else if (IsA(path, AggPath))
+ {
+ /*
+ * If partial aggregation was already applied, use only the input
+ * path.
+ *
+ * TODO Take the aggregation into account, both cost and its effect on
+ * selectivity (i.e. how it reduces the number of rows).
+ */
+ cost_bitmap_tree_node(((AggPath *) path)->subpath, cost, selec);
+ }
else
{
elog(ERROR, "unrecognized node type: %d", nodeTag(path));
@@ -2287,6 +2299,41 @@ cost_group(Path *path, PlannerInfo *root,
path->total_cost = total_cost;
}
+static void
+estimate_join_rows(PlannerInfo *root, Path *path, RelAggInfo *agg_info)
+{
+ bool grouped = agg_info != NULL;
+
+ if (path->param_info)
+ {
+ double nrows;
+
+ path->rows = path->param_info->ppi_rows;
+ if (grouped)
+ {
+ nrows = estimate_num_groups(root, agg_info->group_exprs,
+ path->rows, NULL);
+ path->rows = clamp_row_est(nrows);
+ }
+ }
+ else
+ {
+ if (!grouped)
+ path->rows = path->parent->rows;
+ else
+ {
+ /*
+ * XXX agg_info->rows is an estimate of the output rows if we join
+ * the non-grouped rels and aggregate the output. However the
+ * figure can be different if an already grouped rel is joined to
+ * non-grouped one. Is this worth adding a new field to the
+ * agg_info?
+ */
+ path->rows = agg_info->rows;
+ }
+ }
+}
+
/*
* initial_cost_nestloop
* Preliminary estimate of the cost of a nestloop join path.
@@ -2408,10 +2455,7 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
inner_path_rows = 1;
/* Mark the path with the correct row estimate */
- if (path->path.param_info)
- path->path.rows = path->path.param_info->ppi_rows;
- else
- path->path.rows = path->path.parent->rows;
+ estimate_join_rows(root, (Path *) path, path->path.parent->agg_info);
/* For partial paths, scale row estimate. */
if (path->path.parallel_workers > 0)
@@ -2854,10 +2898,8 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
inner_path_rows = 1;
/* Mark the path with the correct row estimate */
- if (path->jpath.path.param_info)
- path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
- else
- path->jpath.path.rows = path->jpath.path.parent->rows;
+ estimate_join_rows(root, (Path *) path,
+ path->jpath.path.parent->agg_info);
/* For partial paths, scale row estimate. */
if (path->jpath.path.parallel_workers > 0)
@@ -3279,10 +3321,8 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
ListCell *hcl;
/* Mark the path with the correct row estimate */
- if (path->jpath.path.param_info)
- path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
- else
- path->jpath.path.rows = path->jpath.path.parent->rows;
+ estimate_join_rows(root, (Path *) path,
+ path->jpath.path.parent->agg_info);
/* For partial paths, scale row estimate. */
if (path->jpath.path.parallel_workers > 0)
@@ -3805,8 +3845,9 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
* estimated execution cost given by pg_proc.procost (remember to multiply
* this by cpu_operator_cost).
*
- * Vars and Consts are charged zero, and so are boolean operators (AND,
- * OR, NOT). Simplistic, but a lot better than no model at all.
+ * Vars, GroupedVars and Consts are charged zero, and so are boolean
+ * operators (AND, OR, NOT). Simplistic, but a lot better than no model at
+ * all.
*
* Should we try to account for the possibility of short-circuit
* evaluation of AND/OR? Probably *not*, because that would make the
@@ -4287,11 +4328,13 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
* restriction clauses).
* width: the estimated average output tuple width in bytes.
* baserestrictcost: estimated cost of evaluating baserestrictinfo clauses.
+ * grouped: will partial aggregation be applied to each path?
*/
void
set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
{
double nrows;
+ bool grouped = rel->agg_info != NULL;
/* Should only be applied to base relations */
Assert(rel->relid > 0);
@@ -4302,12 +4345,31 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
0,
JOIN_INNER,
NULL);
-
rel->rows = clamp_row_est(nrows);
+ /*
+ * Grouping essentially changes the number of rows.
+ */
+ if (grouped)
+ {
+ nrows = estimate_num_groups(root,
+ rel->agg_info->group_exprs, nrows,
+ NULL);
+ rel->agg_info->rows = clamp_row_est(nrows);
+ }
+
cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo, root);
- set_rel_width(root, rel);
+ /*
+ * The grouped target should have the cost and width set immediately on
+ * creation, see create_rel_agg_info().
+ */
+ if (!grouped)
+ set_rel_width(root, rel);
+#ifdef USE_ASSERT_CHECKING
+ else
+ Assert(rel->reltarget->width > 0);
+#endif
}
/*
@@ -4375,12 +4437,23 @@ set_joinrel_size_estimates(PlannerInfo *root, RelOptInfo *rel,
SpecialJoinInfo *sjinfo,
List *restrictlist)
{
+ double outer_rows,
+ inner_rows;
+
+ /*
+ * Take grouping of the input rels into account.
+ */
+ outer_rows = outer_rel->agg_info ? outer_rel->agg_info->rows :
+ outer_rel->rows;
+ inner_rows = inner_rel->agg_info ? inner_rel->agg_info->rows :
+ inner_rel->rows;
+
rel->rows = calc_joinrel_size_estimate(root,
rel,
outer_rel,
inner_rel,
- outer_rel->rows,
- inner_rel->rows,
+ outer_rows,
+ inner_rows,
sjinfo,
restrictlist);
}
@@ -5257,11 +5330,11 @@ set_pathtarget_cost_width(PlannerInfo *root, PathTarget *target)
foreach(lc, target->exprs)
{
Node *node = (Node *) lfirst(lc);
+ int32 item_width;
if (IsA(node, Var))
{
Var *var = (Var *) node;
- int32 item_width;
/* We should not see any upper-level Vars here */
Assert(var->varlevelsup == 0);
@@ -5292,6 +5365,25 @@ set_pathtarget_cost_width(PlannerInfo *root, PathTarget *target)
Assert(item_width > 0);
tuple_width += item_width;
}
+ else if (IsA(node, GroupedVar))
+ {
+ GroupedVar *gvar = (GroupedVar *) node;
+ Node *expr;
+
+ /*
+ * Only AggPath can evaluate GroupedVar if it's an aggregate, or
+ * the AggPath's input path if it's a generic grouping expression.
+ * In the other cases the GroupedVar we see here only bubbled up
+ * from a lower AggPath, so it does not add any cost to the path
+ * that owns this target.
+ *
+ * XXX Is the value worth caching in GroupedVar?
+ */
+ expr = (Node *) gvar->gvexpr;
+ item_width = get_typavgwidth(exprType(expr), exprTypmod(expr));
+ Assert(item_width > 0);
+ tuple_width += item_width;
+ }
else
{
/*
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b22b36ec0e..921e6f405b 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -65,6 +65,19 @@ static bool reconsider_outer_join_clause(PlannerInfo *root,
static bool reconsider_full_join_clause(PlannerInfo *root,
RestrictInfo *rinfo);
+typedef struct translate_expr_context
+{
+ Var **keys; /* Dictionary keys. */
+ Var **values; /* Dictionary values */
+ int nitems; /* Number of dictionary items. */
+ Relids *gv_eval_at_p; /* See GroupedVarInfo. */
+ Index relid; /* Translate into this relid. */
+} translate_expr_context;
+
+static Node *translate_expression_to_rels_mutator(Node *node,
+ translate_expr_context *context);
+static int var_dictionary_comparator(const void *a, const void *b);
+
/*
* process_equivalence
@@ -2511,3 +2524,329 @@ is_redundant_derived_clause(RestrictInfo *rinfo, List *clauselist)
return false;
}
+
+/*
+ * translate_expression_to_rels
+ * If the appropriate equivalence classes exist, replace vars in
+ * gvi->gvexpr with vars whose varno is equal to relid.
+ */
+GroupedVarInfo *
+translate_expression_to_rels(PlannerInfo *root, GroupedVarInfo *gvi,
+ Index relid)
+{
+ List *vars;
+ ListCell *l1;
+ int i,
+ j;
+ int nkeys,
+ nkeys_resolved;
+ Var **keys,
+ **values,
+ **keys_tmp;
+ Var *key,
+ *key_prev;
+ translate_expr_context context;
+ GroupedVarInfo *result;
+
+ /* Can't do anything w/o equivalence classes. */
+ if (root->eq_classes == NIL)
+ return NULL;
+
+ /*
+ * Before actually trying to modify the expression tree, find out if all
+ * vars can be translated.
+ */
+ vars = pull_var_clause((Node *) gvi->gvexpr, PVC_RECURSE_AGGREGATES);
+
+ /* No vars to translate? */
+ if (vars == NIL)
+ return NULL;
+
+ /*
+ * Both the search for individual replacement vars and the actual expression
+ * translation will be more efficient if we use a dictionary whose keys
+ * (i.e. the "source vars") are unique and sorted.
+ */
+ nkeys = list_length(vars);
+ keys = (Var **) palloc(nkeys * sizeof(Var *));
+ i = 0;
+ foreach(l1, vars)
+ {
+ key = lfirst_node(Var, l1);
+ keys[i++] = key;
+ }
+
+ /*
+ * Sort the keys by varno, with varattno as a tie-breaker where varnos
+ * are equal.
+ */
+ if (nkeys > 1)
+ pg_qsort(keys, nkeys, sizeof(Var *), var_dictionary_comparator);
+
+ /*
+ * Pick unique values and get rid of the vars that need no translation.
+ */
+ keys_tmp = (Var **) palloc(nkeys * sizeof(Var *));
+ key_prev = NULL;
+ j = 0;
+ for (i = 0; i < nkeys; i++)
+ {
+ key = keys[i];
+
+ if ((key_prev == NULL || key->varno != key_prev->varno ||
+ key->varattno != key_prev->varattno) &&
+ key->varno != relid)
+ keys_tmp[j++] = key;
+
+ key_prev = key;
+ }
+ pfree(keys);
+ keys = keys_tmp;
+ nkeys = j;
+
+ /*
+ * Is there actually nothing to be translated?
+ */
+ if (nkeys == 0)
+ {
+ pfree(keys);
+ return NULL;
+ }
+
+ nkeys_resolved = 0;
+
+ /*
+ * Find the replacement vars.
+ */
+ values = (Var **) palloc0(nkeys * sizeof(Var *));
+ foreach(l1, root->eq_classes)
+ {
+ EquivalenceClass *ec = lfirst_node(EquivalenceClass, l1);
+ Relids ec_var_relids;
+ Var **ec_vars;
+ int ec_nvars;
+ ListCell *l2;
+
+ /* TODO Re-check if any other EC kind should be ignored. */
+ if (ec->ec_has_volatile || ec->ec_below_outer_join || ec->ec_broken)
+ continue;
+
+ /* Single-element EC can hardly help in translations. */
+ if (list_length(ec->ec_members) == 1)
+ continue;
+
+ /*
+ * Collect all vars of this EC and their varnos.
+ *
+ * ec->ec_relids does not help because we're only interested in a
+ * subset of EC members.
+ */
+ ec_vars = (Var **) palloc(list_length(ec->ec_members) * sizeof(Var *));
+ ec_nvars = 0;
+ ec_var_relids = NULL;
+ foreach(l2, ec->ec_members)
+ {
+ EquivalenceMember *em = lfirst_node(EquivalenceMember, l2);
+ Var *ec_var;
+
+ if (!IsA(em->em_expr, Var))
+ continue;
+
+ ec_var = castNode(Var, em->em_expr);
+ ec_vars[ec_nvars++] = ec_var;
+ ec_var_relids = bms_add_member(ec_var_relids, ec_var->varno);
+ }
+
+ /*
+ * At least two vars are needed so that the EC is usable for
+ * translation.
+ */
+ if (ec_nvars <= 1)
+ {
+ pfree(ec_vars);
+ bms_free(ec_var_relids);
+ continue;
+ }
+
+ /*
+ * Now check where this EC can help.
+ */
+ for (i = 0; i < nkeys; i++)
+ {
+ Relids ec_rest;
+ bool relid_ok,
+ key_found;
+ Var *key = keys[i];
+ Var *value = values[i];
+
+ /* Skip this item if it's already resolved. */
+ if (value != NULL)
+ continue;
+
+ /*
+ * Can't translate if the EC does not mention key->varno.
+ */
+ if (!bms_is_member(key->varno, ec_var_relids))
+ continue;
+
+ /*
+ * Besides key, at least one EC member must belong to the relation
+ * we're translating our expression to.
+ */
+ ec_rest = bms_copy(ec_var_relids);
+ ec_rest = bms_del_member(ec_rest, key->varno);
+ relid_ok = bms_is_member(relid, ec_rest);
+ bms_free(ec_rest);
+ if (!relid_ok)
+ continue;
+
+ /*
+ * The preliminary checks passed, so try to find the exact vars.
+ */
+ key_found = false;
+ for (j = 0; j < ec_nvars; j++)
+ {
+ Var *ec_var = ec_vars[j];
+
+ if (!key_found && key->varno == ec_var->varno &&
+ key->varattno == ec_var->varattno)
+ key_found = true;
+
+ /*
+ * Is this Var useful for our dictionary?
+ *
+ * XXX Shouldn't ec_var be copied?
+ */
+ if (value == NULL && ec_var->varno == relid)
+ value = ec_var;
+
+ if (key_found && value != NULL)
+ break;
+ }
+
+ /*
+ * The replacement Var must have the same data type, otherwise the
+ * values are not guaranteed to be grouped in the same way as
+ * values of the original Var.
+ */
+ if (key_found && value != NULL &&
+ key->vartype == value->vartype)
+ {
+ values[i] = value;
+ nkeys_resolved++;
+
+ if (nkeys_resolved == nkeys)
+ break;
+ }
+ }
+
+ pfree(ec_vars);
+ bms_free(ec_var_relids);
+
+ /* Don't need to check the remaining ECs? */
+ if (nkeys_resolved == nkeys)
+ break;
+ }
+
+ /* Couldn't compose usable dictionary? */
+ if (nkeys_resolved < nkeys)
+ {
+ pfree(keys);
+ pfree(values);
+ return NULL;
+ }
+
+ result = makeNode(GroupedVarInfo);
+ memcpy(result, gvi, sizeof(GroupedVarInfo));
+
+ /*
+ * translate_expression_to_rels_mutator updates gv_eval_at.
+ */
+ result->gv_eval_at = bms_copy(result->gv_eval_at);
+
+ /* The dictionary is ready, so perform the translation. */
+ context.keys = keys;
+ context.values = values;
+ context.nitems = nkeys;
+ context.gv_eval_at_p = &result->gv_eval_at;
+ context.relid = relid;
+ result->gvexpr = (Expr *)
+ translate_expression_to_rels_mutator((Node *) gvi->gvexpr, &context);
+ result->derived = true;
+
+ pfree(keys);
+ pfree(values);
+ return result;
+}
+
+static Node *
+translate_expression_to_rels_mutator(Node *node,
+ translate_expr_context *context)
+{
+ if (node == NULL)
+ return NULL;
+
+ if (IsA(node, Var))
+ {
+ Var *var = castNode(Var, node);
+ Var **key_p;
+ Var *value;
+ int index;
+
+ /*
+ * Simply return the existing variable if it already belongs to the
+ * relation we're adjusting the expression to.
+ */
+ if (var->varno == context->relid)
+ return (Node *) var;
+
+ key_p = bsearch(&var, context->keys, context->nitems, sizeof(Var *),
+ var_dictionary_comparator);
+
+ /* We shouldn't have omitted any var from the dictionary. */
+ Assert(key_p != NULL);
+
+ index = key_p - context->keys;
+ Assert(index >= 0 && index < context->nitems);
+ value = context->values[index];
+
+ /* All values should be present in the dictionary. */
+ Assert(value != NULL);
+
+ /* Update gv_eval_at accordingly. */
+ *context->gv_eval_at_p = bms_del_member(*context->gv_eval_at_p,
+ var->varno);
+ *context->gv_eval_at_p = bms_add_member(*context->gv_eval_at_p,
+ value->varno);
+
+ return (Node *) value;
+ }
+
+ return expression_tree_mutator(node, translate_expression_to_rels_mutator,
+ (void *) context);
+}
+
+static int
+var_dictionary_comparator(const void *a, const void *b)
+{
+ Var **var1_p,
+ **var2_p;
+ Var *var1,
+ *var2;
+
+ var1_p = (Var **) a;
+ var1 = castNode(Var, *var1_p);
+ var2_p = (Var **) b;
+ var2 = castNode(Var, *var2_p);
+
+ if (var1->varno < var2->varno)
+ return -1;
+ else if (var1->varno > var2->varno)
+ return 1;
+
+ if (var1->varattno < var2->varattno)
+ return -1;
+ else if (var1->varattno > var2->varattno)
+ return 1;
+
+ return 0;
+}
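(An aside, to make the new equivclass.c machinery above easier to follow: translate_expression_to_rels() first builds a dictionary of the expression's source Vars, sorted by (varno, varattno), resolves a replacement Var for each key via the equivalence classes, and finally lets the mutator bsearch() that dictionary for every Var it visits. Below is a minimal standalone sketch of that sort-then-bsearch pattern; ToyVar and the sample data are made up for illustration and are not PostgreSQL code.)

#include <stdio.h>
#include <stdlib.h>

/*
 * Minimal standalone sketch (illustrative, not PostgreSQL code) of the
 * dictionary technique used by translate_expression_to_rels(): sort the
 * source vars once by (varno, varattno), then bsearch each var during
 * the expression mutation.
 */
typedef struct ToyVar
{
	int			varno;
	int			varattno;
} ToyVar;

static int
toyvar_cmp(const void *a, const void *b)
{
	const ToyVar *v1 = *(ToyVar *const *) a;
	const ToyVar *v2 = *(ToyVar *const *) b;

	if (v1->varno != v2->varno)
		return (v1->varno < v2->varno) ? -1 : 1;
	if (v1->varattno != v2->varattno)
		return (v1->varattno < v2->varattno) ? -1 : 1;
	return 0;
}

int
main(void)
{
	ToyVar		rel2_att1 = {2, 1};
	ToyVar		rel2_att2 = {2, 2};
	ToyVar		rel1_att1 = {1, 1};
	ToyVar	   *keys[] = {&rel2_att2, &rel2_att1};	/* source vars, unsorted */
	ToyVar	   *values[2] = {NULL, NULL};	/* replacements, same order */
	ToyVar	   *probe = &rel2_att1;
	ToyVar	  **hit;

	qsort(keys, 2, sizeof(ToyVar *), toyvar_cmp);
	values[0] = &rel1_att1;		/* (2,1) -> (1,1), as an EC would allow */

	hit = bsearch(&probe, keys, 2, sizeof(ToyVar *), toyvar_cmp);
	if (hit != NULL)
		printf("replace (%d,%d) with (%d,%d)\n",
			   (*hit)->varno, (*hit)->varattno,
			   values[hit - keys]->varno, values[hit - keys]->varattno);
	return 0;
}

(Run as-is, this prints the replacement of Var (2,1) by Var (1,1), standing in for a translation that an equivalence class between the two relations makes legal. The same comparator serves both qsort and bsearch, which is why var_dictionary_comparator() compares pointers to Vars.)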
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index f295558f76..bdbeee1a6a 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -32,6 +32,7 @@
#include "optimizer/predtest.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+#include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
@@ -76,13 +77,13 @@ typedef struct
int indexcol; /* index column we want to match to */
} ec_member_matches_arg;
-
static void consider_index_join_clauses(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index,
IndexClauseSet *rclauseset,
IndexClauseSet *jclauseset,
IndexClauseSet *eclauseset,
- List **bitindexpaths);
+ List **bitindexpaths,
+ bool grouped);
static void consider_index_join_outer_rels(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index,
IndexClauseSet *rclauseset,
@@ -91,7 +92,8 @@ static void consider_index_join_outer_rels(PlannerInfo *root, RelOptInfo *rel,
List **bitindexpaths,
List *indexjoinclauses,
int considered_clauses,
- List **considered_relids);
+ List **considered_relids,
+ bool grouped);
static void get_join_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index,
IndexClauseSet *rclauseset,
@@ -99,23 +101,28 @@ static void get_join_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexClauseSet *eclauseset,
List **bitindexpaths,
Relids relids,
- List **considered_relids);
+ List **considered_relids,
+ bool grouped);
static bool eclass_already_used(EquivalenceClass *parent_ec, Relids oldrelids,
List *indexjoinclauses);
static bool bms_equal_any(Relids relids, List *relids_list);
static void get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
- List **bitindexpaths);
+ List **bitindexpaths,
+ bool grouped);
static List *build_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
- bool *skip_lower_saop);
+ bool *skip_lower_saop,
+ bool grouped);
static List *build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
- List *clauses, List *other_clauses);
+ List *clauses, List *other_clauses,
+ bool grouped);
static List *generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
- List *clauses, List *other_clauses);
+ List *clauses, List *other_clauses,
+ bool grouped);
static Path *choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel,
List *paths);
static int path_usage_comparator(const void *a, const void *b);
@@ -227,7 +234,7 @@ static Const *string_to_const(const char *str, Oid datatype);
* as meaning "unparameterized so far as the indexquals are concerned".
*/
void
-create_index_paths(PlannerInfo *root, RelOptInfo *rel)
+create_index_paths(PlannerInfo *root, RelOptInfo *rel, bool grouped)
{
List *indexpaths;
List *bitindexpaths;
@@ -272,8 +279,8 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel)
* non-parameterized paths. Plain paths go directly to add_path(),
* bitmap paths are added to bitindexpaths to be handled below.
*/
- get_index_paths(root, rel, index, &rclauseset,
- &bitindexpaths);
+ get_index_paths(root, rel, index, &rclauseset, &bitindexpaths,
+ grouped);
/*
* Identify the join clauses that can match the index. For the moment
@@ -302,15 +309,25 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel)
&rclauseset,
&jclauseset,
&eclauseset,
- &bitjoinpaths);
+ &bitjoinpaths,
+ grouped);
}
+
+ /*
+ * Bitmap paths are currently not aggregated: AggPath does not accept the
+ * TID bitmap as input, and even if it did, it'd seem weird to aggregate
+ * the individual paths and then AND them together.
+ */
+ if (rel->agg_info != NULL)
+ return;
+
/*
* Generate BitmapOrPaths for any suitable OR-clauses present in the
* restriction list. Add these to bitindexpaths.
*/
- indexpaths = generate_bitmap_or_paths(root, rel,
- rel->baserestrictinfo, NIL);
+ indexpaths = generate_bitmap_or_paths(root, rel, rel->baserestrictinfo,
+ NIL, grouped);
bitindexpaths = list_concat(bitindexpaths, indexpaths);
/*
@@ -318,7 +335,8 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel)
* the joinclause list. Add these to bitjoinpaths.
*/
indexpaths = generate_bitmap_or_paths(root, rel,
- joinorclauses, rel->baserestrictinfo);
+ joinorclauses, rel->baserestrictinfo,
+ grouped);
bitjoinpaths = list_concat(bitjoinpaths, indexpaths);
/*
@@ -439,7 +457,8 @@ consider_index_join_clauses(PlannerInfo *root, RelOptInfo *rel,
IndexClauseSet *rclauseset,
IndexClauseSet *jclauseset,
IndexClauseSet *eclauseset,
- List **bitindexpaths)
+ List **bitindexpaths,
+ bool grouped)
{
int considered_clauses = 0;
List *considered_relids = NIL;
@@ -475,7 +494,8 @@ consider_index_join_clauses(PlannerInfo *root, RelOptInfo *rel,
bitindexpaths,
jclauseset->indexclauses[indexcol],
considered_clauses,
- &considered_relids);
+ &considered_relids,
+ grouped);
/* Consider each applicable eclass join clause */
considered_clauses += list_length(eclauseset->indexclauses[indexcol]);
consider_index_join_outer_rels(root, rel, index,
@@ -483,7 +503,8 @@ consider_index_join_clauses(PlannerInfo *root, RelOptInfo *rel,
bitindexpaths,
eclauseset->indexclauses[indexcol],
considered_clauses,
- &considered_relids);
+ &considered_relids,
+ grouped);
}
}
@@ -508,7 +529,8 @@ consider_index_join_outer_rels(PlannerInfo *root, RelOptInfo *rel,
List **bitindexpaths,
List *indexjoinclauses,
int considered_clauses,
- List **considered_relids)
+ List **considered_relids,
+ bool grouped)
{
ListCell *lc;
@@ -575,7 +597,8 @@ consider_index_join_outer_rels(PlannerInfo *root, RelOptInfo *rel,
rclauseset, jclauseset, eclauseset,
bitindexpaths,
bms_union(clause_relids, oldrelids),
- considered_relids);
+ considered_relids,
+ grouped);
}
/* Also try this set of relids by itself */
@@ -583,7 +606,8 @@ consider_index_join_outer_rels(PlannerInfo *root, RelOptInfo *rel,
rclauseset, jclauseset, eclauseset,
bitindexpaths,
clause_relids,
- considered_relids);
+ considered_relids,
+ grouped);
}
}
@@ -608,7 +632,8 @@ get_join_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexClauseSet *eclauseset,
List **bitindexpaths,
Relids relids,
- List **considered_relids)
+ List **considered_relids,
+ bool grouped)
{
IndexClauseSet clauseset;
int indexcol;
@@ -665,7 +690,8 @@ get_join_index_paths(PlannerInfo *root, RelOptInfo *rel,
Assert(clauseset.nonempty);
/* Build index path(s) using the collected set of clauses */
- get_index_paths(root, rel, index, &clauseset, bitindexpaths);
+ get_index_paths(root, rel, index, &clauseset, bitindexpaths,
+ grouped);
/*
* Remember we considered paths for this set of relids. We use lcons not
@@ -715,7 +741,6 @@ bms_equal_any(Relids relids, List *relids_list)
return false;
}
-
/*
* get_index_paths
* Given an index and a set of index clauses for it, construct IndexPaths.
@@ -734,7 +759,7 @@ bms_equal_any(Relids relids, List *relids_list)
static void
get_index_paths(PlannerInfo *root, RelOptInfo *rel,
IndexOptInfo *index, IndexClauseSet *clauses,
- List **bitindexpaths)
+ List **bitindexpaths, bool grouped)
{
List *indexpaths;
bool skip_nonnative_saop = false;
@@ -746,18 +771,26 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
* clauses only if the index AM supports them natively, and skip any such
* clauses for index columns after the first (so that we produce ordered
* paths if possible).
+ *
+ * These paths are good candidates for AGG_SORTED partial aggregation,
+ * since they may be ordered on the grouping keys. AGG_HASHED is
+ * considered for paths with no pathkeys.
*/
indexpaths = build_index_paths(root, rel,
index, clauses,
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
- &skip_lower_saop);
+ &skip_lower_saop,
+ grouped);
/*
* If we skipped any lower-order ScalarArrayOpExprs on an index with an AM
* that supports them, then try again including those clauses. This will
* produce paths with more selectivity but no ordering.
+ *
+ * As for the grouping paths, only AGG_HASHED is considered due to the
+ * missing ordering.
*/
if (skip_lower_saop)
{
@@ -767,7 +800,8 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
index->predOK,
ST_ANYSCAN,
&skip_nonnative_saop,
- NULL));
+ NULL,
+ grouped));
}
/*
@@ -799,6 +833,9 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
* If there were ScalarArrayOpExpr clauses that the index can't handle
* natively, generate bitmap scan paths relying on executor-managed
* ScalarArrayOpExpr.
+ *
+ * As for grouping, only AGG_HASHED is possible here, again because
+ * there's no ordering.
*/
if (skip_nonnative_saop)
{
@@ -807,7 +844,8 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
false,
ST_BITMAPSCAN,
NULL,
- NULL);
+ NULL,
+ grouped);
*bitindexpaths = list_concat(*bitindexpaths, indexpaths);
}
}
@@ -845,13 +883,18 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
* NULL, we do not ignore non-first ScalarArrayOpExpr clauses, but they will
* result in considering the scan's output to be unordered.
*
+ * If 'grouped' is true, AGG_SORTED and / or AGG_HASHED partial aggregation
+ * is applied on top of the index path (as long as the index path is
+ * appropriate) and the resulting grouped paths are returned instead.
+ *
* 'rel' is the index's heap relation
* 'index' is the index for which we want to generate paths
* 'clauses' is the collection of indexable clauses (RestrictInfo nodes)
* 'useful_predicate' indicates whether the index has a useful predicate
* 'scantype' indicates whether we need plain or bitmap scan support
* 'skip_nonnative_saop' indicates whether to accept SAOP if index AM doesn't
- * 'skip_lower_saop' indicates whether to accept non-first-column SAOP
+ * 'skip_lower_saop' indicates whether to accept non-first-column SAOP.
*/
static List *
build_index_paths(PlannerInfo *root, RelOptInfo *rel,
@@ -859,7 +902,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
bool useful_predicate,
ScanTypeControl scantype,
bool *skip_nonnative_saop,
- bool *skip_lower_saop)
+ bool *skip_lower_saop,
+ bool grouped)
{
List *result = NIL;
IndexPath *ipath;
@@ -876,6 +920,9 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
bool index_is_ordered;
bool index_only_scan;
int indexcol;
+ bool can_agg_sorted,
+ can_agg_hashed;
+ AggPath *agg_path;
/*
* Check that index supports the desired scan type(s)
@@ -1029,7 +1076,12 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
* in the current clauses, OR the index ordering is potentially useful for
* later merging or final output ordering, OR the index has a useful
* predicate, OR an index-only scan is possible.
+ *
+ * This is where grouped paths start to be considered.
*/
+ can_agg_sorted = true;
+ can_agg_hashed = true;
+
if (index_clauses != NIL || useful_pathkeys != NIL || useful_predicate ||
index_only_scan)
{
@@ -1046,7 +1098,72 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
outer_relids,
loop_count,
false);
- result = lappend(result, ipath);
+
+ if (!grouped)
+ {
+ make_uniquekeys(root, (Path *) ipath);
+ result = lappend(result, ipath);
+ }
+ else
+ {
+ /*
+ * Try to create the grouped paths if caller is interested in
+ * them.
+ */
+ if (useful_pathkeys != NIL)
+ {
+ agg_path = create_agg_sorted_path(root,
+ (Path *) ipath,
+ true,
+ ipath->path.rows);
+
+ if (agg_path != NULL)
+ {
+ make_uniquekeys(root, (Path *) agg_path);
+ result = lappend(result, agg_path);
+ }
+ else
+ {
+ /*
+ * If ipath could not be used as a source for AGG_SORTED
+ * partial aggregation, it probably does not have the
+ * appropriate pathkeys. Avoid trying to apply AGG_SORTED
+ * to the next index paths because those will have the
+ * same pathkeys.
+ */
+ can_agg_sorted = false;
+ }
+ }
+ else
+ can_agg_sorted = false;
+
+ /*
+ * Hashed aggregation should not be parameterized: the cost of
+ * repeatedly rebuilding the hash table (for different parameter
+ * values) is probably not worth it.
+ */
+ if (outer_relids == NULL)
+ {
+ agg_path = create_agg_hashed_path(root,
+ (Path *) ipath,
+ ipath->path.rows);
+
+ if (agg_path != NULL)
+ {
+ make_uniquekeys(root, (Path *) agg_path);
+ result = lappend(result, agg_path);
+ }
+ else
+ {
+ /*
+ * If ipath could not be used as a source for AGG_HASHED,
+ * we should not expect any other path of the same index
+ * to succeed. Avoid wasting the effort next time.
+ */
+ can_agg_hashed = false;
+ }
+ }
+ }
/*
* If appropriate, consider parallel index scan. We don't allow
@@ -1075,7 +1192,46 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
* parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
- add_partial_path(rel, (Path *) ipath);
+ {
+ if (!grouped)
+ add_partial_path(rel, (Path *) ipath);
+ else
+ {
+ if (useful_pathkeys != NIL && can_agg_sorted)
+ {
+ /*
+ * No need to check the pathkeys again.
+ */
+ agg_path = create_agg_sorted_path(root,
+ (Path *) ipath,
+ false,
+ ipath->path.rows);
+
+ /*
+ * If create_agg_sorted_path succeeded once, it should
+ * always succeed.
+ */
+ Assert(agg_path != NULL);
+
+ add_partial_path(rel, (Path *) agg_path);
+ }
+
+ if (can_agg_hashed && outer_relids == NULL)
+ {
+ agg_path = create_agg_hashed_path(root,
+ (Path *) ipath,
+ ipath->path.rows);
+
+ /*
+ * If create_agg_hashed_path succeeded once, it should
+ * always succeed.
+ */
+ Assert(agg_path != NULL);
+
+ add_partial_path(rel, (Path *) agg_path);
+ }
+ }
+ }
else
pfree(ipath);
}
@@ -1103,7 +1259,38 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
outer_relids,
loop_count,
false);
- result = lappend(result, ipath);
+
+ if (!grouped)
+ {
+ make_uniquekeys(root, (Path *) ipath);
+ result = lappend(result, ipath);
+ }
+ else
+ {
+ /*
+ * As the input set ordering does not matter to AGG_HASHED,
+ * only AGG_SORTED makes sense here. (The AGG_HASHED path we'd
+ * create here should already exist.)
+ *
+ * The existing value of can_agg_sorted is not up-to-date for
+ * the new pathkeys.
+ */
+ can_agg_sorted = true;
+
+ /* pathkeys are new, so check them. */
+ agg_path = create_agg_sorted_path(root,
+ (Path *) ipath,
+ true,
+ ipath->path.rows);
+
+ if (agg_path != NULL)
+ {
+ make_uniquekeys(root, (Path *) agg_path);
+ result = lappend(result, agg_path);
+ }
+ else
+ can_agg_sorted = false;
+ }
/* If appropriate, consider parallel index scan */
if (index->amcanparallel &&
@@ -1127,7 +1314,26 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
* using parallel workers, just free it.
*/
if (ipath->path.parallel_workers > 0)
- add_partial_path(rel, (Path *) ipath);
+ {
+ if (!grouped)
+ add_partial_path(rel, (Path *) ipath);
+ else
+ {
+ if (can_agg_sorted)
+ {
+ /*
+ * The non-partial path above should have been
+ * created, so no need to check pathkeys.
+ */
+ agg_path = create_agg_sorted_path(root,
+ (Path *) ipath,
+ false,
+ ipath->path.rows);
+ Assert(agg_path != NULL);
+ add_partial_path(rel, (Path *) agg_path);
+ }
+ }
+ }
else
pfree(ipath);
}
@@ -1162,10 +1368,12 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
* 'rel' is the relation for which we want to generate index paths
* 'clauses' is the current list of clauses (RestrictInfo nodes)
* 'other_clauses' is the list of additional upper-level clauses
+ * 'grouped' tells whether grouped (partially aggregated) paths are wanted.
*/
static List *
build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
- List *clauses, List *other_clauses)
+ List *clauses, List *other_clauses,
+ bool grouped)
{
List *result = NIL;
List *all_clauses = NIL; /* not computed till needed */
@@ -1235,14 +1443,16 @@ build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
match_clauses_to_index(index, other_clauses, &clauseset);
/*
- * Construct paths if possible.
+ * Construct paths if possible. Forbid partial aggregation even if the
+ * relation is grouped --- it'll be applied to the bitmap heap path.
*/
indexpaths = build_index_paths(root, rel,
index, &clauseset,
useful_predicate,
ST_BITMAPSCAN,
NULL,
- NULL);
+ NULL,
+ grouped);
result = list_concat(result, indexpaths);
}
@@ -1261,7 +1471,8 @@ build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
*/
static List *
generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
- List *clauses, List *other_clauses)
+ List *clauses, List *other_clauses,
+ bool grouped)
{
List *result = NIL;
List *all_clauses;
@@ -1301,13 +1512,15 @@ generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
indlist = build_paths_for_OR(root, rel,
andargs,
- all_clauses);
+ all_clauses,
+ grouped);
/* Recurse in case there are sub-ORs */
indlist = list_concat(indlist,
generate_bitmap_or_paths(root, rel,
andargs,
- all_clauses));
+ all_clauses,
+ grouped));
}
else
{
@@ -1319,7 +1532,8 @@ generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
indlist = build_paths_for_OR(root, rel,
orargs,
- all_clauses);
+ all_clauses,
+ grouped);
}
/*
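(A note on the indxpath.c changes above, since the flags are easy to lose track of: AGG_SORTED is only attempted on index paths with useful pathkeys, and AGG_HASHED only on unparameterized paths, because rebuilding the hash table for every parameter value would be too expensive. The toy program below just restates that eligibility rule; the function and names are illustrative, not patch code.)

#include <stdbool.h>
#include <stdio.h>

/*
 * Toy restatement (illustrative, not patch code) of which partial
 * aggregation strategies build_index_paths() tries for a candidate path.
 */
static void
strategies_for_path(const char *desc, bool has_useful_pathkeys,
					bool parameterized)
{
	printf("%s:", desc);
	if (has_useful_pathkeys)
		printf(" AGG_SORTED");	/* input ordered on the grouping keys */
	if (!parameterized)
		printf(" AGG_HASHED");	/* hash table is built only once */
	if (!has_useful_pathkeys && parameterized)
		printf(" (none)");
	printf("\n");
}

int
main(void)
{
	strategies_for_path("ordered index scan", true, false);
	strategies_for_path("parameterized index scan", true, true);
	strategies_for_path("SAOP scan without ordering", false, false);
	return 0;
}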
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 642f951093..cb45f03fe2 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -51,10 +51,12 @@ static void try_partial_mergejoin_path(PlannerInfo *root,
JoinPathExtraData *extra);
static void sort_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
- JoinType jointype, JoinPathExtraData *extra);
+ JoinType jointype, JoinPathExtraData *extra,
+ bool grouped, bool do_aggregate);
static void match_unsorted_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
- JoinType jointype, JoinPathExtraData *extra);
+ JoinType jointype, JoinPathExtraData *extra,
+ bool grouped, bool do_aggregate);
static void consider_parallel_nestloop(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
@@ -67,10 +69,13 @@ static void consider_parallel_mergejoin(PlannerInfo *root,
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
- Path *inner_cheapest_total);
+ Path *inner_cheapest_total,
+ bool grouped,
+ bool do_aggregate);
static void hash_inner_and_outer(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
- JoinType jointype, JoinPathExtraData *extra);
+ JoinType jointype, JoinPathExtraData *extra,
+ bool grouped, bool do_aggregate);
static List *select_mergejoin_clauses(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
@@ -87,7 +92,9 @@ static void generate_mergejoin_paths(PlannerInfo *root,
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
- bool is_partial);
+ bool is_partial,
+ bool grouped,
+ bool do_aggregate);
/*
@@ -120,7 +127,9 @@ add_paths_to_joinrel(PlannerInfo *root,
RelOptInfo *innerrel,
JoinType jointype,
SpecialJoinInfo *sjinfo,
- List *restrictlist)
+ List *restrictlist,
+ bool grouped,
+ bool do_aggregate)
{
JoinPathExtraData extra;
bool mergejoin_allowed = true;
@@ -267,7 +276,7 @@ add_paths_to_joinrel(PlannerInfo *root,
*/
if (mergejoin_allowed)
sort_inner_and_outer(root, joinrel, outerrel, innerrel,
- jointype, &extra);
+ jointype, &extra, grouped, do_aggregate);
/*
* 2. Consider paths where the outer relation need not be explicitly
@@ -278,7 +287,7 @@ add_paths_to_joinrel(PlannerInfo *root,
*/
if (mergejoin_allowed)
match_unsorted_outer(root, joinrel, outerrel, innerrel,
- jointype, &extra);
+ jointype, &extra, grouped, do_aggregate);
#ifdef NOT_USED
@@ -305,7 +314,7 @@ add_paths_to_joinrel(PlannerInfo *root,
*/
if (enable_hashjoin || jointype == JOIN_FULL)
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
- jointype, &extra);
+ jointype, &extra, grouped, do_aggregate);
/*
* 5. If inner and outer relations are foreign tables (or joins) belonging
@@ -366,7 +375,9 @@ try_nestloop_path(PlannerInfo *root,
Path *inner_path,
List *pathkeys,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ bool grouped,
+ bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
@@ -376,6 +387,11 @@ try_nestloop_path(PlannerInfo *root,
Relids outerrelids;
Relids inner_paramrels = PATH_REQ_OUTER(inner_path);
Relids outer_paramrels = PATH_REQ_OUTER(outer_path);
+ bool success = false;
+ RelOptInfo *joinrel_plain = joinrel; /* Non-grouped joinrel. */
+
+ if (grouped)
+ joinrel = joinrel->grouped;
/*
* Paths are parameterized by top-level parents, so run parameterization
@@ -422,10 +438,61 @@ try_nestloop_path(PlannerInfo *root,
initial_cost_nestloop(root, &workspace, jointype,
outer_path, inner_path, extra);
- if (add_path_precheck(joinrel,
- workspace.startup_cost, workspace.total_cost,
- pathkeys, required_outer))
+ /*
+ * If the join output should be (partially) aggregated, the precheck
+ * includes the aggregation and is postponed to create_grouped_path().
+ */
+ if ((!do_aggregate &&
+ add_path_precheck(joinrel,
+ workspace.startup_cost, workspace.total_cost,
+ pathkeys, required_outer)) ||
+ do_aggregate)
{
+ PathTarget *target;
+ List *uniquekeys = NIL;
+ Path *path;
+
+ /*
+ * If the join output is subject to partial aggregation, the path must
+ * generate aggregation input.
+ */
+ if (!do_aggregate)
+ {
+ target = joinrel->reltarget;
+
+ /*
+ * 1-stage aggregation can only be used if the join produces
+ * unique grouping keys, so we have to check that.
+ */
+ if (grouped)
+ {
+ bool keys_ok;
+
+ /*
+ * We're not going to produce AggPath, so the grouping keys
+ * are not guaranteed to be unique across the output set. This
+ * function returns NULL if no appropriate uniquekeys could be
+ * generated.
+ */
+ uniquekeys = make_uniquekeys_for_join(root, outer_path,
+ inner_path,
+ target, &keys_ok);
+
+ /*
+ * Do not create the join path if it would duplicate the
+ * grouping keys and the upper paths do not expect those
+ * duplicates.
+ */
+ if (!keys_ok)
+ return;
+ }
+ }
+ else
+ {
+ Assert(joinrel->agg_info != NULL);
+ target = joinrel->agg_info->input;
+ }
+
/*
* If the inner path is parameterized, it is parameterized by the
* topmost parent of the outer rel, not the outer rel itself. Fix
@@ -447,21 +514,72 @@ try_nestloop_path(PlannerInfo *root,
}
}
- add_path(joinrel, (Path *)
- create_nestloop_path(root,
- joinrel,
- jointype,
- &workspace,
- extra,
- outer_path,
- inner_path,
- extra->restrictlist,
- pathkeys,
- required_outer));
+ path = (Path *) create_nestloop_path(root,
+ joinrel_plain,
+ target,
+ jointype,
+ &workspace,
+ extra,
+ outer_path,
+ inner_path,
+ extra->restrictlist,
+ pathkeys,
+ required_outer);
+
+ /*
+ * TODO joinrel_plain had to be passed above because of row estimates.
+ * Pass "grouped" in addition so that we do not need the following
+ * hack?
+ */
+ path->parent = joinrel;
+
+ /*
+ * If uniquekeys is NIL, we should not need it: either
+ * grouped==false (obviously no aggregation push-down), or the input
+ * path(s) have no unique keys, or we're going to apply AggPath to
+ * the join.
+ */
+ path->uniquekeys = uniquekeys;
+
+ if (!do_aggregate)
+ {
+ add_path(joinrel, path);
+ success = true;
+ }
+ else
+ {
+ /*
+ * Try both AGG_HASHED and AGG_SORTED partial aggregation.
+ *
+ * AGG_HASHED should not be parameterized because we don't want to
+ * create the hashtable again for each set of parameters.
+ */
+ if (required_outer == NULL)
+ success = create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ false,
+ AGG_HASHED);
+
+ /*
+ * Don't try AGG_SORTED if create_grouped_path() would reject it
+ * anyway.
+ */
+ if (pathkeys != NIL)
+ success = success ||
+ create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ false,
+ AGG_SORTED);
+ }
}
- else
+
+ if (!success)
{
- /* Waste no memory when we reject a path here */
+ /* Waste no memory when we reject path(s) here */
bms_free(required_outer);
}
}
@@ -538,6 +656,7 @@ try_partial_nestloop_path(PlannerInfo *root,
add_partial_path(joinrel, (Path *)
create_nestloop_path(root,
joinrel,
+ joinrel->reltarget,
jointype,
&workspace,
extra,
@@ -564,15 +683,22 @@ try_mergejoin_path(PlannerInfo *root,
List *innersortkeys,
JoinType jointype,
JoinPathExtraData *extra,
- bool is_partial)
+ bool is_partial,
+ bool grouped,
+ bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
+ bool success = false;
+ RelOptInfo *joinrel_plain = joinrel;
+
+ if (grouped)
+ joinrel = joinrel->grouped;
- if (is_partial)
+ if (!grouped && is_partial)
{
try_partial_mergejoin_path(root,
- joinrel,
+ joinrel_plain,
outer_path,
inner_path,
pathkeys,
@@ -617,26 +743,90 @@ try_mergejoin_path(PlannerInfo *root,
outersortkeys, innersortkeys,
extra);
- if (add_path_precheck(joinrel,
- workspace.startup_cost, workspace.total_cost,
- pathkeys, required_outer))
+ /*
+ * See comments in try_nestloop_path().
+ */
+ if ((!do_aggregate &&
+ add_path_precheck(joinrel,
+ workspace.startup_cost, workspace.total_cost,
+ pathkeys, required_outer)) ||
+ do_aggregate)
{
- add_path(joinrel, (Path *)
- create_mergejoin_path(root,
- joinrel,
- jointype,
- &workspace,
- extra,
- outer_path,
- inner_path,
- extra->restrictlist,
- pathkeys,
- required_outer,
- mergeclauses,
- outersortkeys,
- innersortkeys));
+ PathTarget *target;
+ List *uniquekeys = NIL;
+ Path *path;
+
+ if (!do_aggregate)
+ {
+ target = joinrel->reltarget;
+
+ if (grouped)
+ {
+ bool keys_ok;
+
+ uniquekeys = make_uniquekeys_for_join(root, outer_path,
+ inner_path,
+ target,
+ &keys_ok);
+ if (!keys_ok)
+ return;
+ }
+ }
+ else
+ {
+ Assert(joinrel->agg_info != NULL);
+ target = joinrel->agg_info->input;
+ }
+
+ path = (Path *) create_mergejoin_path(root,
+ joinrel_plain,
+ target,
+ jointype,
+ &workspace,
+ extra,
+ outer_path,
+ inner_path,
+ extra->restrictlist,
+ pathkeys,
+ required_outer,
+ mergeclauses,
+ outersortkeys,
+ innersortkeys);
+ /* See try_nestloop_path() */
+ path->parent = joinrel;
+
+ /*
+ * See comments in try_nestloop_path().
+ */
+ path->uniquekeys = uniquekeys;
+
+ if (!do_aggregate)
+ {
+ add_path(joinrel, path);
+ success = true;
+ }
+ else
+ {
+ if (required_outer == NULL)
+ success = create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ false,
+ AGG_HASHED);
+
+ if (pathkeys != NIL)
+ success = success ||
+ create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ false,
+ AGG_SORTED);
+ }
}
- else
+
+ if (!success)
{
/* Waste no memory when we reject a path here */
bms_free(required_outer);
@@ -700,6 +890,7 @@ try_partial_mergejoin_path(PlannerInfo *root,
add_partial_path(joinrel, (Path *)
create_mergejoin_path(root,
joinrel,
+ joinrel->reltarget,
jointype,
&workspace,
extra,
@@ -725,10 +916,17 @@ try_hashjoin_path(PlannerInfo *root,
Path *inner_path,
List *hashclauses,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ bool grouped,
+ bool do_aggregate)
{
Relids required_outer;
JoinCostWorkspace workspace;
+ bool success = false;
+ RelOptInfo *joinrel_plain = joinrel; /* Non-grouped joinrel. */
+
+ if (grouped)
+ joinrel = joinrel->grouped;
/*
* Check to see if proposed path is still parameterized, and reject if the
@@ -745,30 +943,98 @@ try_hashjoin_path(PlannerInfo *root,
}
/*
+ * Parameterized execution of a grouped path would mean repeated hashing
+ * of the hashjoin output, so forget about AGG_HASHED if there
+ * are any parameters. And AGG_SORTED makes no sense because the hash join
+ * output is not sorted.
+ */
+ if (required_outer && joinrel->agg_info)
+ return;
+
+ /*
* See comments in try_nestloop_path(). Also note that hashjoin paths
* never have any output pathkeys, per comments in create_hashjoin_path.
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra, false);
- if (add_path_precheck(joinrel,
- workspace.startup_cost, workspace.total_cost,
- NIL, required_outer))
+
+ /*
+ * See comments in try_nestloop_path().
+ */
+ if ((!do_aggregate &&
+ add_path_precheck(joinrel,
+ workspace.startup_cost, workspace.total_cost,
+ NIL, required_outer)) ||
+ do_aggregate)
{
- add_path(joinrel, (Path *)
- create_hashjoin_path(root,
- joinrel,
- jointype,
- &workspace,
- extra,
- outer_path,
- inner_path,
- false, /* parallel_hash */
- extra->restrictlist,
- required_outer,
- hashclauses));
+ PathTarget *target;
+ List *uniquekeys = NIL;
+ Path *path = NULL;
+
+ if (!do_aggregate)
+ {
+ target = joinrel->reltarget;
+
+ if (grouped)
+ {
+ bool keys_ok;
+
+ uniquekeys = make_uniquekeys_for_join(root, outer_path,
+ inner_path,
+ target,
+ &keys_ok);
+ if (!keys_ok)
+ return;
+ }
+ }
+ else
+ {
+ Assert(joinrel->agg_info != NULL);
+ target = joinrel->agg_info->input;
+ }
+
+ path = (Path *) create_hashjoin_path(root,
+ joinrel_plain,
+ target,
+ jointype,
+ &workspace,
+ extra,
+ outer_path,
+ inner_path,
+ false, /* parallel_hash */
+ extra->restrictlist,
+ required_outer,
+ hashclauses);
+ /* See try_nestloop_path() */
+ path->parent = joinrel;
+
+ /*
+ * See comments in try_nestloop_path().
+ */
+ path->uniquekeys = uniquekeys;
+
+ if (!do_aggregate)
+ {
+ add_path(joinrel, path);
+ success = true;
+ }
+ else
+ {
+ /*
+ * As the hashjoin path is not sorted, only try AGG_HASHED.
+ */
+ if (create_grouped_path(root,
+ joinrel,
+ path,
+ true,
+ false,
+ AGG_HASHED))
+ success = true;
+ }
}
- else
+
+ if (!success)
{
/* Waste no memory when we reject a path here */
bms_free(required_outer);
@@ -824,6 +1090,7 @@ try_partial_hashjoin_path(PlannerInfo *root,
add_partial_path(joinrel, (Path *)
create_hashjoin_path(root,
joinrel,
+ joinrel->reltarget,
jointype,
&workspace,
extra,
@@ -876,6 +1143,7 @@ clause_sides_match_join(RestrictInfo *rinfo, RelOptInfo *outerrel,
* 'innerrel' is the inner join relation
* 'jointype' is the type of join to do
* 'extra' contains additional input values
+ * 'grouped' and 'do_aggregate' tell if/how to apply partial aggregation
+ * to the output.
*/
static void
sort_inner_and_outer(PlannerInfo *root,
@@ -883,7 +1151,9 @@ sort_inner_and_outer(PlannerInfo *root,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ bool grouped,
+ bool do_aggregate)
{
JoinType save_jointype = jointype;
Path *outer_path;
@@ -1045,13 +1315,15 @@ sort_inner_and_outer(PlannerInfo *root,
innerkeys,
jointype,
extra,
- false);
+ false,
+ grouped,
+ do_aggregate);
/*
* If we have partial outer and parallel safe inner path then try
* partial mergejoin path.
*/
- if (cheapest_partial_outer && cheapest_safe_inner)
+ if (!grouped && cheapest_partial_outer && cheapest_safe_inner)
try_partial_mergejoin_path(root,
joinrel,
cheapest_partial_outer,
@@ -1089,7 +1361,9 @@ generate_mergejoin_paths(PlannerInfo *root,
bool useallclauses,
Path *inner_cheapest_total,
List *merge_pathkeys,
- bool is_partial)
+ bool is_partial,
+ bool grouped,
+ bool do_aggregate)
{
List *mergeclauses;
List *innersortkeys;
@@ -1150,7 +1424,9 @@ generate_mergejoin_paths(PlannerInfo *root,
innersortkeys,
jointype,
extra,
- is_partial);
+ is_partial,
+ grouped,
+ do_aggregate);
/* Can't do anything else if inner path needs to be unique'd */
if (save_jointype == JOIN_UNIQUE_INNER)
@@ -1247,7 +1523,9 @@ generate_mergejoin_paths(PlannerInfo *root,
NIL,
jointype,
extra,
- is_partial);
+ is_partial,
+ grouped,
+ do_aggregate);
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
@@ -1291,7 +1569,9 @@ generate_mergejoin_paths(PlannerInfo *root,
NIL,
jointype,
extra,
- is_partial);
+ is_partial,
+ grouped,
+ do_aggregate);
}
cheapest_startup_inner = innerpath;
}
@@ -1333,7 +1613,9 @@ match_unsorted_outer(PlannerInfo *root,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ bool grouped,
+ bool do_aggregate)
{
JoinType save_jointype = jointype;
bool nestjoinOK;
@@ -1456,7 +1738,9 @@ match_unsorted_outer(PlannerInfo *root,
inner_cheapest_total,
merge_pathkeys,
jointype,
- extra);
+ extra,
+ grouped,
+ do_aggregate);
}
else if (nestjoinOK)
{
@@ -1478,7 +1762,9 @@ match_unsorted_outer(PlannerInfo *root,
innerpath,
merge_pathkeys,
jointype,
- extra);
+ extra,
+ grouped,
+ do_aggregate);
}
/* Also consider materialized form of the cheapest inner path */
@@ -1489,7 +1775,9 @@ match_unsorted_outer(PlannerInfo *root,
matpath,
merge_pathkeys,
jointype,
- extra);
+ extra,
+ grouped,
+ do_aggregate);
}
/* Can't do anything else if outer path needs to be unique'd */
@@ -1504,7 +1792,7 @@ match_unsorted_outer(PlannerInfo *root,
generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
save_jointype, extra, useallclauses,
inner_cheapest_total, merge_pathkeys,
- false);
+ false, grouped, do_aggregate);
}
/*
@@ -1516,7 +1804,8 @@ match_unsorted_outer(PlannerInfo *root,
* parameterized. Similarly, we can't handle JOIN_FULL and JOIN_RIGHT,
* because they can produce false null extended rows.
*/
- if (joinrel->consider_parallel &&
+ if (!grouped &&
+ joinrel->consider_parallel &&
save_jointype != JOIN_UNIQUE_OUTER &&
save_jointype != JOIN_FULL &&
save_jointype != JOIN_RIGHT &&
@@ -1545,7 +1834,9 @@ match_unsorted_outer(PlannerInfo *root,
if (inner_cheapest_total)
consider_parallel_mergejoin(root, joinrel, outerrel, innerrel,
save_jointype, extra,
- inner_cheapest_total);
+ inner_cheapest_total,
+ grouped,
+ do_aggregate);
}
}
@@ -1568,7 +1859,9 @@ consider_parallel_mergejoin(PlannerInfo *root,
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra,
- Path *inner_cheapest_total)
+ Path *inner_cheapest_total,
+ bool grouped,
+ bool do_aggregate)
{
ListCell *lc1;
@@ -1586,7 +1879,8 @@ consider_parallel_mergejoin(PlannerInfo *root,
generate_mergejoin_paths(root, joinrel, innerrel, outerpath, jointype,
extra, false, inner_cheapest_total,
- merge_pathkeys, true);
+ merge_pathkeys, true, grouped,
+ do_aggregate);
}
}
@@ -1679,7 +1973,9 @@ hash_inner_and_outer(PlannerInfo *root,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
- JoinPathExtraData *extra)
+ JoinPathExtraData *extra,
+ bool grouped,
+ bool do_aggregate)
{
JoinType save_jointype = jointype;
bool isouterjoin = IS_OUTER_JOIN(jointype);
@@ -1754,7 +2050,9 @@ hash_inner_and_outer(PlannerInfo *root,
cheapest_total_inner,
hashclauses,
jointype,
- extra);
+ extra,
+ grouped,
+ do_aggregate);
/* no possibility of cheap startup here */
}
else if (jointype == JOIN_UNIQUE_INNER)
@@ -1770,7 +2068,9 @@ hash_inner_and_outer(PlannerInfo *root,
cheapest_total_inner,
hashclauses,
jointype,
- extra);
+ extra,
+ grouped,
+ do_aggregate);
if (cheapest_startup_outer != NULL &&
cheapest_startup_outer != cheapest_total_outer)
try_hashjoin_path(root,
@@ -1779,7 +2079,9 @@ hash_inner_and_outer(PlannerInfo *root,
cheapest_total_inner,
hashclauses,
jointype,
- extra);
+ extra,
+ grouped,
+ do_aggregate);
}
else
{
@@ -1800,7 +2102,9 @@ hash_inner_and_outer(PlannerInfo *root,
cheapest_total_inner,
hashclauses,
jointype,
- extra);
+ extra,
+ grouped,
+ do_aggregate);
foreach(lc1, outerrel->cheapest_parameterized_paths)
{
@@ -1834,7 +2138,9 @@ hash_inner_and_outer(PlannerInfo *root,
innerpath,
hashclauses,
jointype,
- extra);
+ extra,
+ grouped,
+ do_aggregate);
}
}
}
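(Similarly, for the joinpath.c changes: each try_*_path() function now either adds the plain join path as before, adds it to the grouped joinrel with uniquekeys checked, or, when do_aggregate is set, feeds it through create_grouped_path(), where the strategies on offer depend on the join method's output. A toy summary of that last case, again illustrative rather than patch code:)

#include <stdbool.h>
#include <stdio.h>

/*
 * Toy summary (illustrative, not patch code) of which partial aggregation
 * strategies each join method's output can feed when do_aggregate is set.
 */
static void
report(const char *method, bool output_has_pathkeys, bool parameterized)
{
	printf("%s:%s%s\n", method,
		   !parameterized ? " AGG_HASHED" : "",		/* hash once, no params */
		   output_has_pathkeys ? " AGG_SORTED" : "");
}

int
main(void)
{
	report("nestloop with ordered outer path", true, false);
	report("mergejoin", true, false);
	report("hashjoin", false, false);	/* hashjoin output is never sorted */
	return 0;
}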
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 7008e1318e..3370a217f3 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,13 +16,16 @@
#include "miscadmin.h"
#include "optimizer/clauses.h"
+#include "optimizer/cost.h"
#include "optimizer/joininfo.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/prep.h"
+#include "optimizer/tlist.h"
#include "partitioning/partbounds.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/selfuncs.h"
static void make_rels_by_clause_joins(PlannerInfo *root,
@@ -31,23 +34,35 @@ static void make_rels_by_clause_joins(PlannerInfo *root,
static void make_rels_by_clauseless_joins(PlannerInfo *root,
RelOptInfo *old_rel,
ListCell *other_rels);
+static void set_grouped_joinrel_target(PlannerInfo *root, RelOptInfo *joinrel,
+ RelOptInfo *rel1, RelOptInfo *rel2,
+ SpecialJoinInfo *sjinfo, List *restrictlist,
+ RelAggInfo *agg_info);
static bool has_join_restriction(PlannerInfo *root, RelOptInfo *rel);
static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
static bool is_dummy_rel(RelOptInfo *rel);
static bool restriction_is_constant_false(List *restrictlist,
RelOptInfo *joinrel,
bool only_pushed_down);
+static RelOptInfo *make_join_rel_common(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelAggInfo *agg_info, bool grouped,
+ bool do_aggregate);
+static void make_join_rel_common_grouped(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelAggInfo *agg_info, bool do_aggregate);
static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
RelOptInfo *rel2, RelOptInfo *joinrel,
- SpecialJoinInfo *sjinfo, List *restrictlist);
-static void try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1,
- RelOptInfo *rel2, RelOptInfo *joinrel,
- SpecialJoinInfo *parent_sjinfo,
- List *parent_restrictlist);
+ SpecialJoinInfo *sjinfo, List *restrictlist,
+ bool grouped,
+ bool do_aggregate);
+static void try_partition_wise_join(PlannerInfo *root, RelOptInfo *rel1,
+ RelOptInfo *rel2, RelOptInfo *joinrel,
+ SpecialJoinInfo *parent_sjinfo,
+ List *parent_restrictlist,
+ bool grouped,
+ bool do_aggregate);
static int match_expr_to_partition_keys(Expr *expr, RelOptInfo *rel,
bool strict_op);
-
/*
* join_search_one_level
* Consider ways to produce join relations containing exactly 'level'
@@ -322,6 +337,58 @@ make_rels_by_clauseless_joins(PlannerInfo *root,
}
}
+/*
+ * Set joinrel's reltarget according to agg_info and estimate the number of
+ * rows.
+ */
+static void
+set_grouped_joinrel_target(PlannerInfo *root, RelOptInfo *joinrel,
+ RelOptInfo *rel1, RelOptInfo *rel2,
+ SpecialJoinInfo *sjinfo, List *restrictlist,
+ RelAggInfo *agg_info)
+{
+ PathTarget *target = NULL;
+
+ Assert(agg_info != NULL);
+
+ /*
+ * build_join_rel() / build_child_join_rel() does not create the target
+ * for a grouped relation.
+ */
+ Assert(joinrel->reltarget == NULL);
+ Assert(joinrel->agg_info == NULL);
+
+ target = agg_info->target;
+
+ /*
+ * The output will actually be grouped, i.e. partially aggregated. No
+ * additional processing needed.
+ */
+ joinrel->reltarget = copy_pathtarget(target);
+
+ /*
+ * The rest of agg_info will be needed at aggregation time.
+ */
+ joinrel->agg_info = agg_info;
+
+ /*
+ * Now that we have the target, compute the estimates.
+ */
+ set_joinrel_size_estimates(root, joinrel, rel1, rel2, sjinfo,
+ restrictlist);
+
+ /*
+ * Grouping essentially changes the number of rows.
+ *
+ * XXX We do not distinguish whether two plain rels are joined and the
+ * result is partially aggregated, or the partial aggregation has been
+ * already applied to one of the input rels. Is this worth extra effort,
+ * e.g. maintaining a separate RelOptInfo for each case (one difficulty
+ * that would introduce is construction of AppendPath)?
+ */
+ joinrel->rows = estimate_num_groups(root, joinrel->agg_info->group_exprs,
+ joinrel->rows, NULL);
+}
/*
* join_is_legal
@@ -651,32 +718,45 @@ join_is_legal(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
return true;
}
-
/*
- * make_join_rel
+ * make_join_rel_common
* Find or create a join RelOptInfo that represents the join of
* the two given rels, and add to it path information for paths
* created with the two rels as outer and inner rel.
* (The join rel may already contain paths generated from other
* pairs of rels that add up to the same set of base rels.)
*
- * NB: will return NULL if attempted join is not valid. This can happen
- * when working with outer joins, or with IN or EXISTS clauses that have been
- * turned into joins.
+ * 'agg_info' contains the reltarget of the grouped relation and everything
+ * we need to aggregate the join result. If NULL, then the join relation
+ * should not be grouped.
+ *
+ * 'do_aggregate' tells that two non-grouped rels should be grouped and
+ * partial aggregation should be applied to all their paths.
+ *
+ * NB: will return NULL if attempted join is not valid. This can happen when
+ * working with outer joins, or with IN or EXISTS clauses that have been
+ * turned into joins. NULL is also returned if caller is interested in a
+ * grouped relation but there's no useful grouped input relation.
*/
-RelOptInfo *
-make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
+static RelOptInfo *
+make_join_rel_common(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelAggInfo *agg_info, bool grouped,
+ bool do_aggregate)
{
Relids joinrelids;
SpecialJoinInfo *sjinfo;
bool reversed;
SpecialJoinInfo sjinfo_data;
- RelOptInfo *joinrel;
+ RelOptInfo *joinrel,
+ *joinrel_plain;
List *restrictlist;
/* We should never try to join two overlapping sets of rels. */
Assert(!bms_overlap(rel1->relids, rel2->relids));
+ /* do_aggregate implies that the output is grouped. */
+ Assert(!do_aggregate || grouped);
+
/* Construct Relids set that identifies the joinrel. */
joinrelids = bms_union(rel1->relids, rel2->relids);
@@ -725,8 +805,38 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
* Find or build the join RelOptInfo, and compute the restrictlist that
* goes with this particular joining.
*/
- joinrel = build_join_rel(root, joinrelids, rel1, rel2, sjinfo,
- &restrictlist);
+ joinrel = joinrel_plain = build_join_rel(root, joinrelids, rel1, rel2, sjinfo,
+ &restrictlist, false);
+
+ if (grouped)
+ {
+ /*
+ * Make sure there's a grouped join relation.
+ */
+ if (joinrel->grouped == NULL)
+ joinrel->grouped = build_join_rel(root,
+ joinrelids,
+ rel1,
+ rel2,
+ sjinfo,
+ &restrictlist,
+ true);
+
+ /*
+ * The grouped join is what we need to return.
+ */
+ joinrel = joinrel->grouped;
+
+
+ /*
+ * Make sure the grouped joinrel has reltarget initialized. Caller
+ * should supply the target for group relation, so build_join_rel()
+ * should have omitted its creation.
+ */
+ if (joinrel->reltarget == NULL)
+ set_grouped_joinrel_target(root, joinrel, rel1, rel2, sjinfo,
+ restrictlist, agg_info);
+ }
/*
* If we've already proven this join is empty, we needn't consider any
@@ -738,15 +848,182 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
return joinrel;
}
- /* Add paths to the join relation. */
- populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
- restrictlist);
+ /*
+ * Add paths to the join relation.
+ *
+ * Pass joinrel_plain plus the 'grouped' and 'do_aggregate' flags instead
+ * of joinrel, since the function needs those flags anyway.
+ */
+ populate_joinrel_with_paths(root, rel1, rel2, joinrel_plain, sjinfo,
+ restrictlist, grouped, do_aggregate);
bms_free(joinrelids);
return joinrel;
}
+static void
+make_join_rel_common_grouped(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelAggInfo *agg_info, bool do_aggregate)
+{
+ RelOptInfo *rel1_grouped = NULL;
+ RelOptInfo *rel2_grouped = NULL;
+ bool rel1_grouped_useful = false;
+ bool rel2_grouped_useful = false;
+
+ /*
+ * Retrieve the grouped relations.
+ *
+ * A dummy rel indicates a join relation that can generate grouped paths
+ * in principle (i.e. it has a valid agg_info), but for which no path
+ * could actually be created (e.g. only the AGG_HASHED strategy was
+ * possible but work_mem was not sufficient for the hash table).
+ */
+ if (rel1->grouped)
+ rel1_grouped = rel1->grouped;
+ if (rel2->grouped)
+ rel2_grouped = rel2->grouped;
+
+ rel1_grouped_useful = rel1_grouped != NULL && !IS_DUMMY_REL(rel1_grouped);
+ rel2_grouped_useful = rel2_grouped != NULL && !IS_DUMMY_REL(rel2_grouped);
+
+ /*
+ * Nothing else to do?
+ */
+ if (!rel1_grouped_useful && !rel2_grouped_useful)
+ return;
+
+ /*
+ * At most one input rel can be grouped (here we don't care if any rel
+ * is eventually dummy; the existence of a grouped rel indicates that
+ * aggregates can be pushed down to it). If both were grouped, then
+ * grouping of one side would change the occurrence of the other side's
+ * aggregate transient states on the input of the final aggregation. This
+ * can be handled by adjusting the transient states, but it's not worth
+ * the effort because it's hard to find a use case for this kind of join.
+ *
+ * XXX If the join of two grouped rels is implemented someday, note that
+ * both rels can have aggregates, so it'd be hard to join grouped rel to
+ * non-grouped here: 1) such a "mixed join" would require a special
+ * target, 2) both AGGSPLIT_FINAL_DESERIAL and AGGSPLIT_SIMPLE aggregates
+ * could appear in the target of the final aggregation node, originating
+ * from the grouped and the non-grouped input rel respectively.
+ */
+ if (rel1_grouped && rel2_grouped)
+ return;
+
+ if (rel1_grouped_useful)
+ {
+ if (rel1_grouped->agg_info->target)
+ make_join_rel_common(root, rel1_grouped, rel2, agg_info, true,
+ do_aggregate);
+ }
+ else if (rel2_grouped_useful)
+ {
+ if (rel2_grouped->agg_info->target)
+ make_join_rel_common(root, rel1, rel2_grouped, agg_info, true,
+ do_aggregate);
+ }
+}
+
+/*
+ * Front-end to make_join_rel_common(). Generates the plain (non-grouped)
+ * join and then uses all the possible strategies to generate the grouped one.
+ */
+RelOptInfo *
+make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
+{
+ Relids joinrelids;
+ RelAggInfo *agg_info;
+ RelOptInfo *joinrel;
+ double nrows_plain;
+ RelOptInfo *result;
+
+ /* 1) form the plain join. */
+ result = make_join_rel_common(root, rel1, rel2, NULL, false,
+ false);
+
+ if (result == NULL)
+ return result;
+
+ nrows_plain = result->rows;
+
+ /*
+ * We're done if there are no grouping expressions nor aggregates.
+ */
+ if (root->grouped_var_list == NIL)
+ return result;
+
+ /*
+ * If the same joinrel was already formed, just with the base rels divided
+ * between rel1 and rel2 in a different way, we might already have the
+ * matching agg_info.
+ */
+ joinrelids = bms_union(rel1->relids, rel2->relids);
+ joinrel = find_join_rel(root, joinrelids);
+
+ /*
+ * At the moment we know that non-grouped join exists, so it should have
+ * been fetched.
+ */
+ Assert(joinrel != NULL);
+
+ if (joinrel->grouped != NULL)
+ {
+ Assert(joinrel->grouped->agg_info != NULL);
+
+ agg_info = joinrel->grouped->agg_info;
+ }
+ else
+ {
+ double nrows;
+
+ /*
+ * agg_info must be created from scratch.
+ */
+ agg_info = create_rel_agg_info(root, result);
+
+ /*
+ * Grouping essentially changes the number of rows.
+ */
+ if (agg_info != NULL)
+ {
+ nrows = estimate_num_groups(root,
+ agg_info->group_exprs,
+ nrows_plain,
+ NULL);
+ agg_info->rows = clamp_row_est(nrows);
+ }
+ }
+
+ /*
+ * Couldn't build the grouped join?
+ */
+ if (agg_info == NULL)
+ return result;
+
+ /*
+ * 2) join two plain rels and aggregate the join paths.
+ */
+ result->grouped = make_join_rel_common(root, rel1, rel2,
+ agg_info,
+ true,
+ true);
+
+ /*
+ * If the non-grouped join relation could be built, its aggregated form
+ * should exist too.
+ */
+ Assert(result->grouped != NULL);
+
+ /*
+ * 3) combine plain and grouped relations.
+ */
+ make_join_rel_common_grouped(root, rel1, rel2, agg_info, false);
+
+ return result;
+}
+
/*
* populate_joinrel_with_paths
* Add paths to the given joinrel for given pair of joining relations. The
@@ -757,8 +1034,24 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
static void
populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
RelOptInfo *rel2, RelOptInfo *joinrel,
- SpecialJoinInfo *sjinfo, List *restrictlist)
+ SpecialJoinInfo *sjinfo, List *restrictlist,
+ bool grouped, bool do_aggregate)
{
+ RelOptInfo *joinrel_plain;
+
+ /*
+ * joinrel_plain plus the 'grouped' and 'do_aggregate' flags are passed
+ * to add_paths_to_joinrel() since it needs those flags anyway.
+ *
+ * TODO As for the other uses, find out where joinrel can be used safely
+ * instead of joinrel_plain, i.e. check that even grouped joinrel has all
+ * the information needed.
+ */
+ joinrel_plain = joinrel;
+
+ if (grouped)
+ joinrel = joinrel->grouped;
+
/*
* Consider paths using each rel as both outer and inner. Depending on
* the join type, a provably empty outer or inner rel might mean the join
@@ -781,17 +1074,17 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
{
case JOIN_INNER:
if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
- restriction_is_constant_false(restrictlist, joinrel, false))
+ restriction_is_constant_false(restrictlist, joinrel_plain, false))
{
mark_dummy_rel(joinrel);
break;
}
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_INNER, sjinfo,
- restrictlist);
- add_paths_to_joinrel(root, joinrel, rel2, rel1,
+ restrictlist, grouped, do_aggregate);
+ add_paths_to_joinrel(root, joinrel_plain, rel2, rel1,
JOIN_INNER, sjinfo,
- restrictlist);
+ restrictlist, grouped, do_aggregate);
break;
case JOIN_LEFT:
if (is_dummy_rel(rel1) ||
@@ -800,29 +1093,29 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
mark_dummy_rel(joinrel);
break;
}
- if (restriction_is_constant_false(restrictlist, joinrel, false) &&
+ if (restriction_is_constant_false(restrictlist, joinrel_plain, false) &&
bms_is_subset(rel2->relids, sjinfo->syn_righthand))
mark_dummy_rel(rel2);
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_LEFT, sjinfo,
- restrictlist);
- add_paths_to_joinrel(root, joinrel, rel2, rel1,
+ restrictlist, grouped, do_aggregate);
+ add_paths_to_joinrel(root, joinrel_plain, rel2, rel1,
JOIN_RIGHT, sjinfo,
- restrictlist);
+ restrictlist, grouped, do_aggregate);
break;
case JOIN_FULL:
if ((is_dummy_rel(rel1) && is_dummy_rel(rel2)) ||
- restriction_is_constant_false(restrictlist, joinrel, true))
+ restriction_is_constant_false(restrictlist, joinrel_plain, true))
{
mark_dummy_rel(joinrel);
break;
}
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_FULL, sjinfo,
- restrictlist);
- add_paths_to_joinrel(root, joinrel, rel2, rel1,
+ restrictlist, grouped, do_aggregate);
+ add_paths_to_joinrel(root, joinrel_plain, rel2, rel1,
JOIN_FULL, sjinfo,
- restrictlist);
+ restrictlist, grouped, do_aggregate);
/*
* If there are join quals that aren't mergeable or hashable, we
@@ -848,14 +1141,14 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
bms_is_subset(sjinfo->min_righthand, rel2->relids))
{
if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
- restriction_is_constant_false(restrictlist, joinrel, false))
+ restriction_is_constant_false(restrictlist, joinrel_plain, false))
{
mark_dummy_rel(joinrel);
break;
}
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_SEMI, sjinfo,
- restrictlist);
+ restrictlist, grouped, do_aggregate);
}
/*
@@ -871,32 +1164,32 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
sjinfo) != NULL)
{
if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
- restriction_is_constant_false(restrictlist, joinrel, false))
+ restriction_is_constant_false(restrictlist, joinrel_plain, false))
{
mark_dummy_rel(joinrel);
break;
}
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_UNIQUE_INNER, sjinfo,
- restrictlist);
- add_paths_to_joinrel(root, joinrel, rel2, rel1,
+ restrictlist, grouped, do_aggregate);
+ add_paths_to_joinrel(root, joinrel_plain, rel2, rel1,
JOIN_UNIQUE_OUTER, sjinfo,
- restrictlist);
+ restrictlist, grouped, do_aggregate);
}
break;
case JOIN_ANTI:
if (is_dummy_rel(rel1) ||
- restriction_is_constant_false(restrictlist, joinrel, true))
+ restriction_is_constant_false(restrictlist, joinrel_plain, true))
{
mark_dummy_rel(joinrel);
break;
}
- if (restriction_is_constant_false(restrictlist, joinrel, false) &&
+ if (restriction_is_constant_false(restrictlist, joinrel_plain, false) &&
bms_is_subset(rel2->relids, sjinfo->syn_righthand))
mark_dummy_rel(rel2);
- add_paths_to_joinrel(root, joinrel, rel1, rel2,
+ add_paths_to_joinrel(root, joinrel_plain, rel1, rel2,
JOIN_ANTI, sjinfo,
- restrictlist);
+ restrictlist, grouped, do_aggregate);
break;
default:
/* other values not expected here */
@@ -904,8 +1197,16 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
break;
}
- /* Apply partitionwise join technique, if possible. */
- try_partitionwise_join(root, rel1, rel2, joinrel, sjinfo, restrictlist);
+ /*
+ * TODO Only allow per-child AGGSPLIT_SIMPLE if the partitioning allows
+ * it, i.e. each partition generates a distinct set of grouping keys.
+ */
+ if (grouped)
+ return;
+
+ /* Apply partition-wise join technique, if possible. */
+ try_partition_wise_join(root, rel1, rel2, joinrel_plain, sjinfo, restrictlist,
+ grouped, do_aggregate);
}
@@ -1308,16 +1609,16 @@ restriction_is_constant_false(List *restrictlist,
* obtained by translating the respective parent join structures.
*/
static void
-try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
- RelOptInfo *joinrel, SpecialJoinInfo *parent_sjinfo,
- List *parent_restrictlist)
+try_partition_wise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+ RelOptInfo *joinrel, SpecialJoinInfo *parent_sjinfo,
+ List *parent_restrictlist, bool grouped,
+ bool do_aggregate)
{
int nparts;
int cnt_parts;
/* Guard against stack overflow due to overly deep partition hierarchy. */
check_stack_depth();
-
/* Nothing to do, if the join relation is not partitioned. */
if (!IS_PARTITIONED_REL(joinrel))
return;
@@ -1390,23 +1691,91 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
(List *) adjust_appendrel_attrs(root,
(Node *) parent_restrictlist,
nappinfos, appinfos);
- pfree(appinfos);
child_joinrel = joinrel->part_rels[cnt_parts];
if (!child_joinrel)
{
- child_joinrel = build_child_join_rel(root, child_rel1, child_rel2,
- joinrel, child_restrictlist,
- child_sjinfo,
- child_sjinfo->jointype);
- joinrel->part_rels[cnt_parts] = child_joinrel;
+ if (!grouped)
+ child_joinrel = build_child_join_rel(root, child_rel1, child_rel2,
+ joinrel,
+ child_restrictlist,
+ child_sjinfo,
+ child_sjinfo->jointype,
+ false);
+ else
+ {
+ /*
+ * The join should have been created when we were called with
+ * !grouped.
+ */
+ child_joinrel = find_join_rel(root, bms_union(child_rel1->relids,
+ child_rel2->relids));
+ Assert(child_joinrel);
+ }
}
+ if (grouped)
+ {
+ RelOptInfo *joinrel_grouped,
+ *child_joinrel_grouped;
+ RelAggInfo *child_agg_info;
+
+ joinrel_grouped = joinrel->grouped;
+
+ if (child_joinrel->grouped == NULL)
+ child_joinrel->grouped =
+ build_child_join_rel(root, child_rel1, child_rel2,
+ joinrel_grouped,
+ child_restrictlist,
+ child_sjinfo,
+ child_sjinfo->jointype,
+ true);
+
+ /*
+ * The grouped join is what we need till the end of the function.
+ */
+ child_joinrel_grouped = child_joinrel->grouped;
+
+ /*
+ * Translate the parent's RelAggInfo for this child join.
+ *
+ * Although build_child_join_rel() creates reltarget for each
+ * child join from scratch as opposed to translating the parent
+ * reltarget (XXX set_append_rel_size() uses the translation ---
+ * is this inconsistency justified?), we just translate the parent
+ * reltarget here. A per-child call of create_rel_agg_info() would
+ * introduce too much duplicate work because it needs the *parent*
+ * target as a source, and that one is identical for all the child
+ * joins.
+ */
+ child_agg_info = translate_rel_agg_info(root,
+ joinrel_grouped->agg_info,
+ appinfos, nappinfos);
+
+ /*
+ * Make sure the child joinrel has reltarget initialized.
+ */
+ if (child_joinrel_grouped->reltarget == NULL)
+ {
+ set_grouped_joinrel_target(root, child_joinrel_grouped, rel1, rel2,
+ child_sjinfo, child_restrictlist,
+ child_agg_info);
+ }
+
+ joinrel_grouped->part_rels[cnt_parts] = child_joinrel_grouped;
+ }
+ else
+ joinrel->part_rels[cnt_parts] = child_joinrel;
+
+ pfree(appinfos);
+
Assert(bms_equal(child_joinrel->relids, child_joinrelids));
populate_joinrel_with_paths(root, child_rel1, child_rel2,
child_joinrel, child_sjinfo,
- child_restrictlist);
+ child_restrictlist,
+ grouped,
+ do_aggregate);
}
}
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index ec66cb9c3c..52e76530f3 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -1664,3 +1664,165 @@ has_useful_pathkeys(PlannerInfo *root, RelOptInfo *rel)
return true; /* might be able to use them for ordering */
return false; /* definitely useless */
}
+
+/*
+ * Add a new set of unique keys to a list of unique key sets, to which keys_p
+ * points. If an identical set is already there, free new_set instead of
+ * adding it.
+ */
+void
+add_uniquekeys(List **keys_p, Bitmapset *new_set)
+{
+ ListCell *lc;
+
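+	/* Is an identical set already present? */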
+ foreach(lc, *keys_p)
+ {
+ Bitmapset *set = (Bitmapset *) lfirst(lc);
+
+ if (bms_equal(new_set, set))
+ break;
+ }
+ if (lc == NULL)
+ *keys_p = lappend(*keys_p, new_set);
+ else
+ bms_free(new_set);
+}
+
+/*
+ * Return true if the output of a path having the given uniquekeys and
+ * target contains only distinct values of root->group_pathkeys.
+ */
+bool
+match_uniquekeys_to_group_pathkeys(PlannerInfo *root,
+ List *uniquekeys,
+ PathTarget *target)
+{
+ Bitmapset *uniquekeys_all = NULL;
+ ListCell *l1;
+ int i;
+ bool *is_group_expr;
+
+ /*
+ * group_pathkeys are essential for this function.
+ */
+ if (root->group_pathkeys == NIL)
+ return false;
+
+ /*
+ * The path is not aware of being unique.
+ */
+ if (uniquekeys == NIL)
+ return false;
+
+ /*
+ * There can be multiple known unique key sets. Gather the positions of
+ * all the unique expressions the sets may reference.
+ */
+ foreach(l1, uniquekeys)
+ {
+ Bitmapset *set = (Bitmapset *) lfirst(l1);
+
+ uniquekeys_all = bms_union(uniquekeys_all, set);
+ }
+
+ /*
+ * Find out which target expressions match some grouping pathkey.
+ */
+ is_group_expr = (bool *)
+ palloc0(list_length(target->exprs) * sizeof(bool));
+
+ i = 0;
+ foreach(l1, target->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(l1);
+
+ if (bms_is_member(i, uniquekeys_all))
+ {
+ ListCell *l2;
+ bool found = false;
+
+ /*
+ * This is a unique expression, so find its pathkey.
+ */
+ foreach(l2, root->group_pathkeys)
+ {
+ PathKey *pk = lfirst_node(PathKey, l2);
+ EquivalenceClass *ec = pk->pk_eclass;
+ ListCell *l3;
+ EquivalenceMember *em = NULL;
+
+ if (ec->ec_below_outer_join)
+ continue;
+ if (ec->ec_has_volatile)
+ continue;
+
+ foreach(l3, ec->ec_members)
+ {
+ em = lfirst_node(EquivalenceMember, l3);
+
+ if (em->em_nullable_relids)
+ continue;
+
+ if (equal(em->em_expr, expr))
+ {
+ found = true;
+ break;
+ }
+ }
+ if (found)
+ break;
+
+ }
+ is_group_expr[i] = found;
+ }
+
+ i++;
+ }
+
+ /*
+ * Now check the unique key sets and see if any one matches all items of
+ * group_pathkeys.
+ */
+ foreach(l1, uniquekeys)
+ {
+ Bitmapset *set = (Bitmapset *) lfirst(l1);
+ bool found = false;
+
+ /*
+ * Check unique keys associated with this set.
+ */
+ for (i = 0; i < list_length(target->exprs); i++)
+ {
+ /*
+ * Is this expression a unique key?
+ */
+ if (bms_is_member(i, set))
+ {
+ /*
+ * If any expression of the unique key set is not among the
+ * grouping expressions, the path can generate multiple rows
+ * with the same values of the grouping expressions, so this
+ * set is not usable.
+ */
+ if (!is_group_expr[i])
+ {
+ found = true;
+ break;
+ }
+ }
+ }
+
+ /*
+ * No problem with this set. No need to check the other ones.
+ */
+ if (!found)
+ {
+ pfree(is_group_expr);
+ return true;
+ }
+ }
+
+ /* No match found. */
+ pfree(is_group_expr);
+ return false;
+}
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
index 3bb5b8def6..bb0f8142a6 100644
--- a/src/backend/optimizer/path/tidpath.c
+++ b/src/backend/optimizer/path/tidpath.c
@@ -250,10 +250,11 @@ TidQualFromBaseRestrictinfo(RelOptInfo *rel)
* Candidate paths are added to the rel's pathlist (using add_path).
*/
void
-create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
+create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel, bool grouped)
{
Relids required_outer;
List *tidquals;
+ Path *tidpath;
/*
* We don't support pushing join clauses into the quals of a tidscan, but
@@ -263,8 +264,20 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
required_outer = rel->lateral_relids;
tidquals = TidQualFromBaseRestrictinfo(rel);
+ if (!tidquals)
+ return;
- if (tidquals)
- add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
- required_outer));
+ tidpath = (Path *) create_tidscan_path(root, rel, tidquals,
+ required_outer);
+
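+	/*
+	 * For a grouped rel we don't add the bare scan; instead, aggregation is
+	 * applied on top of the TID scan below.
+	 */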
+ if (!grouped)
+ add_path(rel, tidpath);
+ else if (required_outer == NULL)
+ {
+ /*
+ * Only AGG_HASHED is suitable here as it does not expect the input
+ * set to be sorted.
+ */
+ create_grouped_path(root, rel, tidpath, false, false, AGG_HASHED);
+ }
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ae41c9efa0..f9dde17ce8 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -824,6 +824,12 @@ use_physical_tlist(PlannerInfo *root, Path *path, int flags)
return false;
/*
+ * Grouped relation's target list contains GroupedVars.
+ */
+ if (rel->agg_info != NULL)
+ return false;
+
+ /*
* If a bitmap scan's tlist is empty, keep it as-is. This may allow the
* executor to skip heap page fetches, and in any case, the benefit of
* using a physical tlist instead would be minimal.
@@ -1667,7 +1673,8 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
* therefore can't predict whether it will require an exact tlist. For
* both of these reasons, we have to recheck here.
*/
- if (use_physical_tlist(root, &best_path->path, flags))
+ if (!best_path->force_result &&
+ use_physical_tlist(root, &best_path->path, flags))
{
/*
* Our caller doesn't really care what tlist we return, so we don't
@@ -1680,7 +1687,8 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
apply_pathtarget_labeling_to_tlist(tlist,
best_path->path.pathtarget);
}
- else if (is_projection_capable_path(best_path->subpath))
+ else if (!best_path->force_result &&
+ is_projection_capable_path(best_path->subpath))
{
/*
* Our caller requires that we return the exact tlist, but no separate
@@ -5881,6 +5889,21 @@ find_ec_member_for_tle(EquivalenceClass *ec,
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
+ /*
+ * GroupedVar can contain either a non-Var grouping expression or an
+ * aggregate. The grouping expression might be useful for sorting;
+ * however, aggregates shouldn't currently appear among pathkeys.
+ */
+ if (IsA(tlexpr, GroupedVar))
+ {
+ GroupedVar *gvar = castNode(GroupedVar, tlexpr);
+
+ if (!IsA(gvar->gvexpr, Aggref))
+ tlexpr = gvar->gvexpr;
+ else
+ return NULL;
+ }
+
foreach(lc, ec->ec_members)
{
EquivalenceMember *em = (EquivalenceMember *) lfirst(lc);
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 01335db511..0dca87a589 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
#include "catalog/pg_type.h"
#include "catalog/pg_class.h"
#include "nodes/nodeFuncs.h"
@@ -27,6 +28,7 @@
#include "optimizer/planner.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+#include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "parser/analyze.h"
#include "rewrite/rewriteManip.h"
@@ -46,6 +48,9 @@ typedef struct PostponedQual
} PostponedQual;
+static void create_aggregate_grouped_var_infos(PlannerInfo *root);
+static void create_grouping_expr_grouped_var_infos(PlannerInfo *root);
+static RelOptInfo *copy_simple_rel(PlannerInfo *root, RelOptInfo *rel);
static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
Index rtindex);
static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -96,10 +101,9 @@ static void check_hashjoinable(RestrictInfo *restrictinfo);
* jtnode. Internally, the function recurses through the jointree.
*
* At the end of this process, there should be one baserel RelOptInfo for
- * every non-join RTE that is used in the query. Therefore, this routine
- * is the only place that should call build_simple_rel with reloptkind
- * RELOPT_BASEREL. (Note: build_simple_rel recurses internally to build
- * "other rel" RelOptInfos for the members of any appendrels we find here.)
+ * every non-grouped non-join RTE that is used in the query. (Note:
+ * build_simple_rel recurses internally to build "other rel" RelOptInfos for
+ * the members of any appendrels we find here.)
*/
void
add_base_rels_to_query(PlannerInfo *root, Node *jtnode)
@@ -241,6 +245,415 @@ add_vars_to_targetlist(PlannerInfo *root, List *vars,
}
}
+/*
+ * Add GroupedVarInfo to grouped_var_list for each aggregate as well as for
+ * each possible grouping expression, and set up a RelOptInfo for each base
+ * or 'other' relation that can produce grouped paths.
+ *
+ * Note that targets of the 'other' relations are not set here ---
+ * set_append_rel_size() will create them by translating the targets of the
+ * base rel.
+ *
+ * root->group_pathkeys must be setup before this function is called.
+ */
+extern void
+add_grouped_base_rels_to_query(PlannerInfo *root)
+{
+ int i;
+
+ /*
+ * Has the user disabled the aggregate push-down feature?
+ */
+ if (!enable_agg_pushdown)
+ return;
+
+ /* No grouping in the query? */
+ if (!root->parse->groupClause)
+ return;
+
+ /*
+ * Grouping sets require multiple different groupings but the base
+ * relation can only generate one.
+ */
+ if (root->parse->groupingSets)
+ return;
+
+ /*
+ * SRFs are not allowed in aggregate arguments and we don't even want them
+ * in the GROUP BY clause, so forbid them in general. It remains to be
+ * analyzed whether evaluating a GROUP BY clause containing an SRF below
+ * the query targetlist would be correct. Currently that does not seem to
+ * be an important use case.
+ */
+ if (root->parse->hasTargetSRFs)
+ return;
+
+ /*
+ * TODO Consider if this is a real limitation.
+ */
+ if (root->parse->hasWindowFuncs)
+ return;
+
+ /* Create GroupedVarInfo per (distinct) aggregate. */
+ create_aggregate_grouped_var_infos(root);
+
+	/* Is there no aggregate to push down? */
+ if (root->grouped_var_list == NIL)
+ return;
+
+ /* Create GroupedVarInfo per grouping expression. */
+ create_grouping_expr_grouped_var_infos(root);
+
+ /*
+ * Are all the aggregates AGGSPLIT_SIMPLE?
+ */
+ if (root->grouped_var_list == NIL)
+ return;
+
+ /* Process the individual base relations. */
+ for (i = 1; i < root->simple_rel_array_size; i++)
+ {
+ RelOptInfo *rel = root->simple_rel_array[i];
+ RangeTblEntry *rte;
+ RelAggInfo *agg_info;
+
+ /* NULL should mean a join relation. */
+ if (rel == NULL)
+ continue;
+
+ /*
+ * Not all RTE kinds are supported when grouping is considered.
+ *
+ * TODO Consider relaxing some of these restrictions.
+ */
+ rte = root->simple_rte_array[rel->relid];
+ if (rte->rtekind != RTE_RELATION ||
+ rte->relkind == RELKIND_FOREIGN_TABLE ||
+ rte->tablesample != NULL)
+ return;
+
+ /*
+ * Grouped "other member rels" should not be created until we know
+ * whether the parent can be grouped, i.e. until the parent has
+ * rel->agg_info initialized.
+ */
+ if (rel->reloptkind != RELOPT_BASEREL)
+ continue;
+
+ /*
+ * Retrieve the information we need for aggregation of the rel
+ * contents.
+ */
+ Assert(rel->agg_info == NULL);
+ agg_info = create_rel_agg_info(root, rel);
+ if (agg_info == NULL)
+ continue;
+
+ /*
+ * Create the grouped counterpart of "rel". This may include the
+ * "other member rels" rejected above, if they're children of this
+ * rel. (The child rels will have their ->target and ->agg_info
+ * initialized later by set_append_rel_size().)
+ */
+ Assert(rel->agg_info == NULL);
+ Assert(rel->grouped == NULL);
+ rel->grouped = copy_simple_rel(root, rel);
+
+ /*
+ * Assign it the aggregation-specific info.
+ *
+ * The aggregation paths will get their input target from agg_info, so
+ * store it too.
+ */
+ rel->grouped->reltarget = agg_info->target;
+ rel->grouped->agg_info = agg_info;
+ }
+}
+
+/*
+ * Create GroupedVarInfo for each distinct aggregate.
+ *
+ * If any aggregate is not suitable, set root->grouped_var_list to NIL and
+ * return.
+ */
+static void
+create_aggregate_grouped_var_infos(PlannerInfo *root)
+{
+ List *tlist_exprs;
+ ListCell *lc;
+
+ Assert(root->grouped_var_list == NIL);
+
+ tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+ PVC_INCLUDE_AGGREGATES);
+
+ /*
+ * Although GroupingFunc is related to root->parse->groupingSets, this
+ * field does not necessarily reflect its presence.
+ */
+ foreach(lc, tlist_exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+
+ if (IsA(expr, GroupingFunc))
+ return;
+ }
+
+ /*
+ * Aggregates within the HAVING clause need to be processed in the same
+ * way as those in the main targetlist.
+ */
+ if (root->parse->havingQual != NULL)
+ {
+ List *having_exprs;
+
+ having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+ PVC_INCLUDE_AGGREGATES);
+ if (having_exprs != NIL)
+ tlist_exprs = list_concat(tlist_exprs, having_exprs);
+ }
+
+ if (tlist_exprs == NIL)
+ return;
+
+ /* tlist_exprs may also contain Vars, but we only need Aggrefs. */
+ foreach(lc, tlist_exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Aggref *aggref;
+ ListCell *lc2;
+ GroupedVarInfo *gvi;
+ bool exists;
+
+ if (IsA(expr, Var))
+ continue;
+
+ aggref = castNode(Aggref, expr);
+
+ /* TODO Think if (some of) these can be handled. */
+ if (aggref->aggvariadic ||
+ aggref->aggdirectargs || aggref->aggorder ||
+ aggref->aggdistinct || aggref->aggfilter)
+ {
+ /*
+ * Partial aggregation is not useful if at least one aggregate
+ * cannot be evaluated below the top-level join.
+ *
+ * XXX Is it worth freeing the GroupedVarInfos and their subtrees?
+ */
+ root->grouped_var_list = NIL;
+ break;
+ }
+
+ /*
+ * Aggregation push-down does not work without aggcombinefn. This field is
+ * not mandatory, so check if this particular aggregate can handle
+ * partial aggregation.
+ */
+ if (!OidIsValid(aggref->aggcombinefn))
+ {
+ root->grouped_var_list = NIL;
+ break;
+ }
+
+ /* Does GroupedVarInfo for this aggregate already exist? */
+ exists = false;
+ foreach(lc2, root->grouped_var_list)
+ {
+ gvi = lfirst_node(GroupedVarInfo, lc2);
+
+ if (equal(expr, gvi->gvexpr))
+ {
+ exists = true;
+ break;
+ }
+ }
+
+	/* Construct a new GroupedVarInfo if one does not exist yet. */
+ if (!exists)
+ {
+ Relids relids;
+
+ gvi = makeNode(GroupedVarInfo);
+ gvi->gvid = list_length(root->grouped_var_list);
+ gvi->gvexpr = (Expr *) copyObject(aggref);
+
+ /* Find out where the aggregate should be evaluated. */
+ relids = pull_varnos((Node *) aggref);
+ if (!bms_is_empty(relids))
+ gvi->gv_eval_at = relids;
+ else
+ gvi->gv_eval_at = NULL;
+
+ root->grouped_var_list = lappend(root->grouped_var_list, gvi);
+ }
+ }
+
+ list_free(tlist_exprs);
+}
+
+/*
+ * Create GroupedVarInfo for each expression usable as grouping key.
+ *
+ * In addition to the expressions of the query targetlist, group_pathkeys is
+ * also considered a source of grouping expressions. That increases the
+ * chance of getting the relation output grouped.
+ */
+static void
+create_grouping_expr_grouped_var_infos(PlannerInfo *root)
+{
+ ListCell *l1,
+ *l2;
+ List *exprs = NIL;
+ List *sortgrouprefs = NIL;
+
+ /*
+ * Make sure GroupedVarInfo exists for each expression usable as grouping
+ * key.
+ */
+ foreach(l1, root->parse->groupClause)
+ {
+ SortGroupClause *sgClause;
+ TargetEntry *te;
+ Index sortgroupref;
+
+ sgClause = lfirst_node(SortGroupClause, l1);
+ te = get_sortgroupclause_tle(sgClause, root->processed_tlist);
+ sortgroupref = te->ressortgroupref;
+
+ if (sortgroupref == 0)
+ continue;
+
+ /*
+ * A non-zero sortgroupref does not necessarily imply a grouping
+ * expression: data can also be sorted by an aggregate.
+ */
+ if (IsA(te->expr, Aggref))
+ continue;
+
+ exprs = lappend(exprs, te->expr);
+ sortgrouprefs = lappend_int(sortgrouprefs, sortgroupref);
+ }
+
+ /*
+ * Construct GroupedVarInfo for each expression.
+ */
+ forboth(l1, exprs, l2, sortgrouprefs)
+ {
+ Expr *expr = (Expr *) lfirst(l1);
+ int sortgroupref = lfirst_int(l2);
+ GroupedVarInfo *gvi = makeNode(GroupedVarInfo);
+
+ gvi->gvid = list_length(root->grouped_var_list);
+ gvi->gvexpr = (Expr *) copyObject(expr);
+ gvi->sortgroupref = sortgroupref;
+
+ /* Find out where the expression should be evaluated. */
+ gvi->gv_eval_at = pull_varnos((Node *) expr);
+
+ root->grouped_var_list = lappend(root->grouped_var_list, gvi);
+ }
+}
+
+/*
+ * Take a flat copy of an already initialized RelOptInfo and process child
+ * rels recursively.
+ *
+ * Flat copy ensures that we do not miss any information that the non-grouped
+ * rel already contains. XXX Do we need to copy any Node field?
+ *
+ * TODO The function only produces grouped rels; the name should reflect
+ * that (create_grouped_rel()?).
+ */
+static RelOptInfo *
+copy_simple_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+ Index relid = rel->relid;
+ RangeTblEntry *rte;
+ ListCell *l;
+ List *indexlist = NIL;
+ RelOptInfo *result;
+
+ result = makeNode(RelOptInfo);
+ memcpy(result, rel, sizeof(RelOptInfo));
+
+ /*
+ * The new relation is grouped itself.
+ */
+ result->grouped = NULL;
+
+ /*
+ * The target to generate aggregation input will be initialized later.
+ */
+ result->reltarget = NULL;
+
+ /*
+ * Make sure that index paths have access to the parent rel's agg_info,
+ * which is used to indicate that the rel should produce grouped paths.
+ */
+ foreach(l, result->indexlist)
+ {
+ IndexOptInfo *src,
+ *dst;
+
+ src = lfirst_node(IndexOptInfo, l);
+ dst = makeNode(IndexOptInfo);
+ memcpy(dst, src, sizeof(IndexOptInfo));
+
+ dst->rel = result;
+ indexlist = lappend(indexlist, dst);
+ }
+ result->indexlist = indexlist;
+
+ /*
+ * This is very similar to child rel processing in build_simple_rel().
+ */
+ rte = root->simple_rte_array[relid];
+ if (rte->inh)
+ {
+ int nparts = rel->nparts;
+ int cnt_parts = 0;
+
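+		/* The grouped rel gets its own part_rels array, filled in the loop below. */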
+ if (nparts > 0)
+ result->part_rels = (RelOptInfo **)
+ palloc(sizeof(RelOptInfo *) * nparts);
+
+ foreach(l, root->append_rel_list)
+ {
+ AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);
+ RelOptInfo *childrel;
+
+ /* append_rel_list contains all append rels; ignore others */
+ if (appinfo->parent_relid != relid)
+ continue;
+
+ /*
+ * The non-grouped child rel must already exist.
+ */
+ childrel = root->simple_rel_array[appinfo->child_relid];
+ Assert(childrel != NULL);
+
+ /*
+ * Create the copy.
+ */
+ Assert(childrel->agg_info == NULL);
+ childrel->grouped = copy_simple_rel(root, childrel);
+
+ /* Nothing more to do for an unpartitioned table. */
+ if (!rel->part_scheme)
+ continue;
+
+ Assert(cnt_parts < nparts);
+ result->part_rels[cnt_parts] = childrel;
+ cnt_parts++;
+ }
+
+ /* We should have seen all the child partitions. */
+ Assert(cnt_parts == nparts);
+ }
+
+ return result;
+}
/*****************************************************************************
*
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index b05adc70c4..0ca5d6ea0b 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -43,6 +43,8 @@
* (this is NOT necessarily root->parse->targetList!)
* qp_callback is a function to compute query_pathkeys once it's safe to do so
* qp_extra is optional extra data to pass to qp_callback
+ * *partially_grouped may receive a relation that contains a partial
+ * aggregate anywhere in the join tree.
*
* Note: the PlannerInfo node also includes a query_pathkeys field, which
* tells query_planner the sort order that is desired in the final output
@@ -66,6 +68,8 @@ query_planner(PlannerInfo *root, List *tlist,
*/
if (parse->jointree->fromlist == NIL)
{
+ RelOptInfo *final_rel;
+
/* We need a dummy joinrel to describe the empty set of baserels */
final_rel = build_empty_join_rel(root);
@@ -114,6 +118,7 @@ query_planner(PlannerInfo *root, List *tlist,
root->full_join_clauses = NIL;
root->join_info_list = NIL;
root->placeholder_list = NIL;
+ root->grouped_var_list = NIL;
root->fkey_list = NIL;
root->initial_rels = NIL;
@@ -232,6 +237,16 @@ query_planner(PlannerInfo *root, List *tlist,
extract_restriction_or_clauses(root);
/*
+ * If the query result can be grouped, check if any grouping can be
+ * performed below the top-level join. If so, set up
+ * root->grouped_var_list and create a RelOptInfo for each base relation
+ * capable of doing the grouping.
+ *
+ * The base relations should be fully initialized now, so that we have
+ * enough info to decide whether grouping is possible.
+ */
+ add_grouped_base_rels_to_query(root);
+
+ /*
* We should now have size estimates for every actual table involved in
* the query, and we also know which if any have been deleted from the
* query by join removal; so we can compute total_table_pages.
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index fd06da98b9..ffef925f5f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -133,9 +133,6 @@ static double get_number_of_groups(PlannerInfo *root,
double path_rows,
grouping_sets_data *gd,
List *target_list);
-static Size estimate_hashagg_tablesize(Path *path,
- const AggClauseCosts *agg_costs,
- double dNumGroups);
static RelOptInfo *create_grouping_paths(PlannerInfo *root,
RelOptInfo *input_rel,
PathTarget *target,
@@ -2044,6 +2041,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
grouping_target_parallel_safe,
&agg_costs,
gset_data);
+
/* Fix things up if grouping_target contains SRFs */
if (parse->hasTargetSRFs)
adjust_paths_for_srfs(root, current_rel,
@@ -3640,40 +3638,6 @@ get_number_of_groups(PlannerInfo *root,
}
/*
- * estimate_hashagg_tablesize
- * estimate the number of bytes that a hash aggregate hashtable will
- * require based on the agg_costs, path width and dNumGroups.
- *
- * XXX this may be over-estimating the size now that hashagg knows to omit
- * unneeded columns from the hashtable. Also for mixed-mode grouping sets,
- * grouping columns not in the hashed set are counted here even though hashagg
- * won't store them. Is this a problem?
- */
-static Size
-estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
- double dNumGroups)
-{
- Size hashentrysize;
-
- /* Estimate per-hash-entry space at tuple width... */
- hashentrysize = MAXALIGN(path->pathtarget->width) +
- MAXALIGN(SizeofMinimalTupleHeader);
-
- /* plus space for pass-by-ref transition values... */
- hashentrysize += agg_costs->transitionSpace;
- /* plus the per-hash-entry overhead */
- hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
-
- /*
- * Note that this disregards the effect of fill-factor and growth policy
- * of the hash-table. That's probably ok, given that the default
- * fill-factor is relatively high. It'd be hard to meaningfully factor in
- * "double-in-size" growth policies here.
- */
- return hashentrysize * dNumGroups;
-}
-
-/*
* create_grouping_paths
*
* Build a new upperrel containing Paths for grouping and/or aggregation.
@@ -3720,6 +3684,7 @@ create_grouping_paths(PlannerInfo *root,
{
int flags = 0;
GroupPathExtraData extra;
+ List *agg_pushdown_paths = NIL;
/*
* Determine whether it's possible to perform sort-based
@@ -3787,6 +3752,38 @@ create_grouping_paths(PlannerInfo *root,
create_ordinary_grouping_paths(root, input_rel, grouped_rel,
agg_costs, gd, &extra,
&partially_grouped_rel);
+
+ /*
+ * Process paths generated by the aggregation push-down feature.
+ */
+	if (input_rel->grouped != NULL)
+ {
+ RelOptInfo *agg_pushdown_rel;
+ ListCell *lc;
+
+ agg_pushdown_rel = input_rel->grouped;
+ agg_pushdown_paths = agg_pushdown_rel->pathlist;
+
+ /*
+ * See create_grouped_path().
+ */
+ Assert(agg_pushdown_rel->partial_pathlist == NIL);
+
+ foreach(lc, agg_pushdown_paths)
+ {
+ Path *path = (Path *) lfirst(lc);
+
+ /*
+ * The aggregate push-down feature currently turns an append rel
+ * into a dummy rel; see the comment in set_append_rel_pathlist().
+ * XXX Can we eliminate this situation earlier?
+ */
+ if (IS_DUMMY_PATH(path))
+ continue;
+
+ add_path(grouped_rel, path);
+ }
+ }
}
set_cheapest(grouped_rel);
@@ -3992,11 +3989,11 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
bool force_rel_creation;
/*
- * If we're doing partitionwise aggregation at this level, force
- * creation of a partially_grouped_rel so we can add partitionwise
- * paths to it.
+ * If we're doing partitionwise aggregation at this level or if
+ * aggregation push-down took place, force creation of a
+ * partially_grouped_rel so we can add the related paths to it.
*/
- force_rel_creation = (patype == PARTITIONWISE_AGGREGATE_PARTIAL);
+ force_rel_creation = patype == PARTITIONWISE_AGGREGATE_PARTIAL;
partially_grouped_rel =
create_partial_grouping_paths(root,
@@ -4029,10 +4026,14 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
/* Gather any partially grouped partial paths. */
if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
- {
gather_grouping_paths(root, partially_grouped_rel);
+
+ /*
+ * The non-partial paths can come either from the Gather above or from
+ * aggregate push-down.
+ */
+ if (partially_grouped_rel && partially_grouped_rel->pathlist)
set_cheapest(partially_grouped_rel);
- }
/*
* Estimate number of groups.
@@ -7117,6 +7118,7 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
if (partially_grouped_rel && partial_grouping_valid)
{
Assert(partially_grouped_live_children != NIL);
+ Assert(partially_grouped_rel->agg_info == NULL);
add_paths_to_append_rel(root, partially_grouped_rel,
partially_grouped_live_children);
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 69dd327f0c..f03d979f15 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -40,6 +40,7 @@ typedef struct
List *tlist; /* underlying target list */
int num_vars; /* number of plain Var tlist entries */
bool has_ph_vars; /* are there PlaceHolderVar entries? */
+ bool has_grp_vars; /* are there GroupedVar entries? */
bool has_non_vars; /* are there other entries? */
bool has_conv_whole_rows; /* are there ConvertRowtypeExpr
* entries encapsulating a whole-row
@@ -1988,6 +1989,7 @@ build_tlist_index(List *tlist)
indexed_tlist *itlist;
tlist_vinfo *vinfo;
ListCell *l;
+ List *tlist_gvars = NIL;
/* Create data structure with enough slots for all tlist entries */
itlist = (indexed_tlist *)
@@ -1996,6 +1998,7 @@ build_tlist_index(List *tlist)
itlist->tlist = tlist;
itlist->has_ph_vars = false;
+ itlist->has_grp_vars = false;
itlist->has_non_vars = false;
itlist->has_conv_whole_rows = false;
@@ -2016,6 +2019,8 @@ build_tlist_index(List *tlist)
}
else if (tle->expr && IsA(tle->expr, PlaceHolderVar))
itlist->has_ph_vars = true;
+ else if (tle->expr && IsA(tle->expr, GroupedVar))
+ tlist_gvars = lappend(tlist_gvars, tle);
else if (is_converted_whole_row_reference((Node *) tle->expr))
itlist->has_conv_whole_rows = true;
else
@@ -2024,6 +2029,42 @@ build_tlist_index(List *tlist)
itlist->num_vars = (vinfo - itlist->vars);
+ /*
+ * If the targetlist contains GroupedVars, we may need to match them to
+ * Aggrefs in the upper plan. Thus the upper planner can always put
+ * Aggrefs into the targetlists, regardless of whether the subplan(s)
+ * contain the original Aggrefs or GroupedVar substitutions.
+ */
+ if (list_length(tlist_gvars) > 0)
+ {
+ List *tlist_new;
+
+ /*
+ * Copy the source list because the caller does not expect to see the
+ * items we're going to add.
+ */
+ tlist_new = list_copy(itlist->tlist);
+
+ foreach(l, tlist_gvars)
+ {
+ TargetEntry *tle = lfirst_node(TargetEntry, l);
+ TargetEntry *tle_new;
+ GroupedVar *gvar = castNode(GroupedVar, tle->expr);
+
+ /*
+ * Add the entry to match the Aggref.
+ */
+ tle_new = flatCopyTargetEntry(tle);
+ tle_new->expr = gvar->gvexpr;
+ tlist_new = lappend(tlist_new, tle_new);
+ }
+
+ itlist->tlist = tlist_new;
+ itlist->has_grp_vars = true;
+
+ list_free(tlist_gvars);
+ }
+
return itlist;
}
@@ -2299,6 +2340,48 @@ fix_join_expr_mutator(Node *node, fix_join_expr_context *context)
/* No referent found for Var */
elog(ERROR, "variable not found in subplan target lists");
}
+	if (IsA(node, GroupedVar) || IsA(node, Aggref))
+ {
+ bool try = true;
+
+ /*
+ * The upper plan targetlist can contain an Aggref whose value has
+ * already been evaluated by the subplan and is being delivered via a
+ * GroupedVar. However, this is true only for specific kinds of Aggref.
+ */
+ if (IsA(node, Aggref))
+ {
+ Aggref *aggref = castNode(Aggref, node);
+
+ if (aggref->aggsplit != AGGSPLIT_SIMPLE &&
+ aggref->aggsplit != AGGSPLIT_INITIAL_SERIAL)
+ try = false;
+ }
+
+ if (try)
+ {
+ /* See if the GroupedVar has bubbled up from a lower plan node */
+ if (context->outer_itlist && context->outer_itlist->has_grp_vars)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) node,
+ context->outer_itlist,
+ OUTER_VAR);
+ if (newvar)
+ return (Node *) newvar;
+ }
+ if (context->inner_itlist && context->inner_itlist->has_grp_vars)
+ {
+ newvar = search_indexed_tlist_for_non_var((Expr *) node,
+ context->inner_itlist,
+ INNER_VAR);
+ if (newvar)
+ return (Node *) newvar;
+ }
+ }
+
+ /* No referent found for GroupedVar */
+ elog(ERROR, "grouped variable not found in subplan target lists");
+ }
if (IsA(node, PlaceHolderVar))
{
PlaceHolderVar *phv = (PlaceHolderVar *) node;
@@ -2461,7 +2544,8 @@ fix_upper_expr_mutator(Node *node, fix_upper_expr_context *context)
/* If no match, just fall through to process it normally */
}
/* Try matching more complex expressions too, if tlist has any */
- if (context->subplan_itlist->has_non_vars ||
+ if (context->subplan_itlist->has_grp_vars ||
+ context->subplan_itlist->has_non_vars ||
(context->subplan_itlist->has_conv_whole_rows &&
is_converted_whole_row_reference(node)))
{
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index c5aaaf5c22..793d44e6d3 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -27,6 +27,7 @@
#include "optimizer/planmain.h"
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
+/* TODO Remove this if create_grouped_path ends up in another module. */
#include "optimizer/tlist.h"
#include "optimizer/var.h"
#include "parser/parsetree.h"
@@ -56,7 +57,12 @@ static int append_startup_cost_compare(const void *a, const void *b);
static List *reparameterize_pathlist_by_child(PlannerInfo *root,
List *pathlist,
RelOptInfo *child_rel);
-
+static Bitmapset *combine_uniquekeys(Path *outerpath, Bitmapset *outerset,
+ Path *innerpath,
+ Bitmapset *innerset,
+ PathTarget *target);
+static void make_uniquekeys_for_unique_index(PathTarget *reltarget,
+ IndexOptInfo *index, Path *path);
/*****************************************************************************
* MISC. PATH UTILITIES
@@ -955,10 +961,15 @@ create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
Relids required_outer, int parallel_workers)
{
Path *pathnode = makeNode(Path);
+ bool grouped = rel->agg_info != NULL;
pathnode->pathtype = T_SeqScan;
pathnode->parent = rel;
- pathnode->pathtarget = rel->reltarget;
+ /* For grouped relation only generate the aggregation input. */
+ if (!grouped)
+ pathnode->pathtarget = rel->reltarget;
+ else
+ pathnode->pathtarget = rel->agg_info->input;
pathnode->param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->parallel_aware = parallel_workers > 0 ? true : false;
@@ -1038,10 +1049,15 @@ create_index_path(PlannerInfo *root,
RelOptInfo *rel = index->rel;
List *indexquals,
*indexqualcols;
+ bool grouped = rel->agg_info != NULL;
pathnode->path.pathtype = indexonly ? T_IndexOnlyScan : T_IndexScan;
pathnode->path.parent = rel;
- pathnode->path.pathtarget = rel->reltarget;
+ /* For grouped relation only generate the aggregation input. */
+ if (!grouped)
+ pathnode->path.pathtarget = rel->reltarget;
+ else
+ pathnode->path.pathtarget = rel->agg_info->input;
pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->path.parallel_aware = false;
@@ -1189,10 +1205,15 @@ create_tidscan_path(PlannerInfo *root, RelOptInfo *rel, List *tidquals,
Relids required_outer)
{
TidPath *pathnode = makeNode(TidPath);
+ bool grouped = rel->agg_info != NULL;
pathnode->path.pathtype = T_TidScan;
pathnode->path.parent = rel;
- pathnode->path.pathtarget = rel->reltarget;
+ /* For grouped relation only generate the aggregation input. */
+ if (!grouped)
+ pathnode->path.pathtarget = rel->reltarget;
+ else
+ pathnode->path.pathtarget = rel->agg_info->input;
pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->path.parallel_aware = false;
@@ -1229,9 +1250,11 @@ create_append_path(PlannerInfo *root,
Assert(!parallel_aware || parallel_workers > 0);
pathnode->path.pathtype = T_Append;
- pathnode->path.parent = rel;
+
pathnode->path.pathtarget = rel->reltarget;
+ pathnode->path.parent = rel;
+
/*
* When generating an Append path for a partitioned table, there may be
* parameters that are useful so we can eliminate certain partitions
@@ -1341,11 +1364,13 @@ append_startup_cost_compare(const void *a, const void *b)
/*
* create_merge_append_path
* Creates a path corresponding to a MergeAppend plan, returning the
- * pathnode.
+ * pathnode. 'target' can be supplied by the caller; if NULL is passed,
+ * the field is set to rel->reltarget.
*/
MergeAppendPath *
create_merge_append_path(PlannerInfo *root,
RelOptInfo *rel,
+ PathTarget *target,
List *subpaths,
List *pathkeys,
Relids required_outer,
@@ -1358,7 +1383,7 @@ create_merge_append_path(PlannerInfo *root,
pathnode->path.pathtype = T_MergeAppend;
pathnode->path.parent = rel;
- pathnode->path.pathtarget = rel->reltarget;
+ pathnode->path.pathtarget = target ? target : rel->reltarget;
pathnode->path.param_info = get_appendrel_parampathinfo(rel,
required_outer);
pathnode->path.parallel_aware = false;
@@ -1495,6 +1520,7 @@ create_material_path(RelOptInfo *rel, Path *subpath)
subpath->parallel_safe;
pathnode->path.parallel_workers = subpath->parallel_workers;
pathnode->path.pathkeys = subpath->pathkeys;
+ pathnode->path.uniquekeys = subpath->uniquekeys;
pathnode->subpath = subpath;
@@ -1528,7 +1554,9 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
MemoryContext oldcontext;
int numCols;
- /* Caller made a mistake if subpath isn't cheapest_total ... */
+ /*
+ * Caller made a mistake if subpath isn't cheapest_total.
+ */
Assert(subpath == rel->cheapest_total_path);
Assert(subpath->parent == rel);
/* ... or if SpecialJoinInfo is the wrong one */
@@ -2149,6 +2177,7 @@ calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path)
* relations.
*
* 'joinrel' is the join relation.
+ * 'target' is the join path target
* 'jointype' is the type of join required
* 'workspace' is the result from initial_cost_nestloop
* 'extra' contains various information about the join
@@ -2163,6 +2192,7 @@ calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path)
NestPath *
create_nestloop_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -2203,7 +2233,7 @@ create_nestloop_path(PlannerInfo *root,
pathnode->path.pathtype = T_NestLoop;
pathnode->path.parent = joinrel;
- pathnode->path.pathtarget = joinrel->reltarget;
+ pathnode->path.pathtarget = target;
pathnode->path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
@@ -2235,6 +2265,7 @@ create_nestloop_path(PlannerInfo *root,
* two relations
*
* 'joinrel' is the join relation
+ * 'target' is the join path target
* 'jointype' is the type of join required
* 'workspace' is the result from initial_cost_mergejoin
* 'extra' contains various information about the join
@@ -2251,6 +2282,7 @@ create_nestloop_path(PlannerInfo *root,
MergePath *
create_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -2267,7 +2299,7 @@ create_mergejoin_path(PlannerInfo *root,
pathnode->jpath.path.pathtype = T_MergeJoin;
pathnode->jpath.path.parent = joinrel;
- pathnode->jpath.path.pathtarget = joinrel->reltarget;
+ pathnode->jpath.path.pathtarget = target;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
@@ -2303,6 +2335,7 @@ create_mergejoin_path(PlannerInfo *root,
* Creates a pathnode corresponding to a hash join between two relations.
*
* 'joinrel' is the join relation
+ * 'target' is the join path target
* 'jointype' is the type of join required
* 'workspace' is the result from initial_cost_hashjoin
* 'extra' contains various information about the join
@@ -2317,6 +2350,7 @@ create_mergejoin_path(PlannerInfo *root,
HashPath *
create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -2331,7 +2365,7 @@ create_hashjoin_path(PlannerInfo *root,
pathnode->jpath.path.pathtype = T_HashJoin;
pathnode->jpath.path.parent = joinrel;
- pathnode->jpath.path.pathtarget = joinrel->reltarget;
+ pathnode->jpath.path.pathtarget = target;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
@@ -2413,8 +2447,8 @@ create_projection_path(PlannerInfo *root,
* Note: in the latter case, create_projection_plan has to recheck our
* conclusion; see comments therein.
*/
- if (is_projection_capable_path(subpath) ||
- equal(oldtarget->exprs, target->exprs))
+ if ((is_projection_capable_path(subpath) ||
+ equal(oldtarget->exprs, target->exprs)))
{
/* No separate Result node needed */
pathnode->dummypp = true;
@@ -2647,6 +2681,7 @@ create_sort_path(PlannerInfo *root,
subpath->parallel_safe;
pathnode->path.parallel_workers = subpath->parallel_workers;
pathnode->path.pathkeys = pathkeys;
+ pathnode->path.uniquekeys = subpath->uniquekeys;
pathnode->subpath = subpath;
@@ -2799,8 +2834,7 @@ create_agg_path(PlannerInfo *root,
pathnode->path.pathtype = T_Agg;
pathnode->path.parent = rel;
pathnode->path.pathtarget = target;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
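+	/*
+	 * Aggregation can now take place below a join, so the subpath may be
+	 * parameterized; preserve its parameterization.
+	 */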
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
@@ -2833,6 +2867,179 @@ create_agg_path(PlannerInfo *root,
}
/*
+ * Apply AGG_SORTED aggregation path to subpath if it's suitably sorted.
+ *
+ * check_pathkeys can be passed FALSE if the function was already called
+ * for the given index --- since the target should not change, we can skip
+ * the sorting check during subsequent calls.
+ *
+ * NULL is returned if sorting of subpath output is not suitable.
+ */
+AggPath *
+create_agg_sorted_path(PlannerInfo *root, Path *subpath,
+ bool check_pathkeys, double input_rows)
+{
+ RelOptInfo *rel;
+ Node *agg_exprs;
+ AggSplit aggsplit;
+ AggClauseCosts agg_costs;
+ PathTarget *target;
+ double dNumGroups;
+ Node *qual = NULL;
+ AggPath *result = NULL;
+ RelAggInfo *agg_info;
+
+ rel = subpath->parent;
+ agg_info = rel->agg_info;
+ Assert(agg_info != NULL);
+
+ aggsplit = AGGSPLIT_SIMPLE;
+ agg_exprs = (Node *) agg_info->agg_exprs;
+ target = agg_info->target;
+
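+	/* AGG_SORTED requires the input to be sorted. */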
+ if (subpath->pathkeys == NIL)
+ return NULL;
+
+ if (!grouping_is_sortable(root->parse->groupClause))
+ return NULL;
+
+ if (check_pathkeys)
+ {
+ ListCell *lc1;
+ List *key_subset = NIL;
+
+ /*
+ * Find all query pathkeys that our relation does affect.
+ */
+ foreach(lc1, root->group_pathkeys)
+ {
+ PathKey *gkey = castNode(PathKey, lfirst(lc1));
+ ListCell *lc2;
+
+ foreach(lc2, subpath->pathkeys)
+ {
+ PathKey *skey = castNode(PathKey, lfirst(lc2));
+
+ if (skey == gkey)
+ {
+ key_subset = lappend(key_subset, gkey);
+ break;
+ }
+ }
+ }
+
+ if (key_subset == NIL)
+ return NULL;
+
+ /* Check if AGG_SORTED is useful for the whole query. */
+ if (!pathkeys_contained_in(key_subset, subpath->pathkeys))
+ return NULL;
+ }
+
+ MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+ get_agg_clause_costs(root, (Node *) agg_exprs, aggsplit, &agg_costs);
+
+ if (root->parse->havingQual)
+ {
+ qual = root->parse->havingQual;
+ get_agg_clause_costs(root, agg_exprs, aggsplit, &agg_costs);
+ }
+
+ Assert(agg_info->group_exprs != NIL);
+ dNumGroups = estimate_num_groups(root, agg_info->group_exprs,
+ input_rows, NULL);
+
+ Assert(agg_info->group_clauses != NIL);
+ result = create_agg_path(root, rel, subpath, target,
+ AGG_SORTED, aggsplit,
+ agg_info->group_clauses,
+ (List *) qual, &agg_costs,
+ dNumGroups);
+
+ return result;
+}
+
+/*
+ * Apply AGG_HASHED aggregation to subpath.
+ *
+ * Arguments have the same meaning as those of create_agg_sorted_path.
+ */
+AggPath *
+create_agg_hashed_path(PlannerInfo *root, Path *subpath,
+ double input_rows)
+{
+ RelOptInfo *rel;
+ bool can_hash;
+ Node *agg_exprs;
+ AggSplit aggsplit;
+ AggClauseCosts agg_costs;
+ PathTarget *target;
+ double dNumGroups;
+ Size hashaggtablesize;
+ Query *parse = root->parse;
+ Node *qual = NULL;
+ AggPath *result = NULL;
+ RelAggInfo *agg_info;
+
+ rel = subpath->parent;
+ agg_info = rel->agg_info;
+ Assert(agg_info != NULL);
+
+ aggsplit = AGGSPLIT_SIMPLE;
+ agg_exprs = (Node *) agg_info->agg_exprs;
+ target = agg_info->target;
+
+ MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+ get_agg_clause_costs(root, agg_exprs, aggsplit, &agg_costs);
+
+ if (parse->havingQual)
+ {
+ qual = parse->havingQual;
+ get_agg_clause_costs(root, agg_exprs, aggsplit, &agg_costs);
+ }
+
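+	/* The usual conditions for hashed aggregation must hold. */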
+ can_hash = (parse->groupClause != NIL &&
+ parse->groupingSets == NIL &&
+ agg_costs.numOrderedAggs == 0 &&
+ grouping_is_hashable(parse->groupClause));
+
+ if (can_hash)
+ {
+ Assert(agg_info->group_exprs != NIL);
+ dNumGroups = estimate_num_groups(root, agg_info->group_exprs,
+ input_rows, NULL);
+
+ hashaggtablesize = estimate_hashagg_tablesize(subpath, &agg_costs,
+ dNumGroups);
+
+ if (hashaggtablesize < work_mem * 1024L)
+ {
+ /*
+ * Create the partial aggregation path.
+ */
+ Assert(agg_info->group_clauses != NIL);
+
+ result = create_agg_path(root, rel, subpath,
+ target,
+ AGG_HASHED,
+ aggsplit,
+ agg_info->group_clauses,
+ (List *) qual,
+ &agg_costs,
+ dNumGroups);
+
+ /*
+ * The agg path should require no fewer parameters than the plain
+ * one.
+ */
+ result->path.param_info = subpath->param_info;
+ }
+ }
+
+ return result;
+}
+
+/*
* create_groupingsets_path
* Creates a pathnode that represents performing GROUPING SETS aggregation
*
@@ -3956,3 +4163,402 @@ reparameterize_pathlist_by_child(PlannerInfo *root,
return result;
}
+
+/*
+ * Find out if the path produces a unique set of expressions and set
+ * uniquekeys accordingly.
+ *
+ * TODO Check whether any expression of any unique key is nullable, whether
+ * in the table / index or due to an outer join.
+ */
+void
+make_uniquekeys(PlannerInfo *root, Path *path)
+{
+ RelOptInfo *rel;
+
+ /*
+ * The unique keys are not interesting if there's no chance to push
+ * aggregation down to base relations / joins.
+ */
+ if (root->grouped_var_list == NIL)
+ return;
+
+ /*
+ * Do not accept repeated calls of the function on the same path.
+ */
+ if (path->uniquekeys != NIL)
+ return;
+
+ rel = path->parent;
+
+ /*
+ * Base relations.
+ */
+ if (IsA(path, IndexPath) ||
+		(IsA(path, Path) && path->pathtype == T_SeqScan))
+ {
+ ListCell *lc;
+
+ /*
+ * Derive grouping keys from unique indexes.
+ */
+		if (IsA(path, IndexPath) || IsA(path, Path))
+ {
+ foreach(lc, rel->indexlist)
+ {
+ IndexOptInfo *index = lfirst_node(IndexOptInfo, lc);
+
+ make_uniquekeys_for_unique_index(rel->reltarget, index, path);
+ }
+ }
+#ifdef USE_ASSERT_CHECKING
+ else
+ Assert(false);
+#endif
+ return;
+ }
+
+ if (IsA(path, AggPath))
+ {
+ /*
+		 * The immediate output of aggregation essentially produces a unique
+ * set of grouping keys.
+ */
+ make_uniquekeys_for_agg_path(path);
+ return;
+ }
+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * TODO Consider other ones, e.g. UniquePath.
+ */
+ Assert(false);
+#endif
+}
+
+/*
+ * Create uniquekeys for a path that has Aggrefs in its target.
+ *
+ * Besides AggPath, ForeignPath is a known use case for this function.
+ */
+void
+make_uniquekeys_for_agg_path(Path *path)
+{
+ PathTarget *target;
+ ListCell *lc;
+ Bitmapset *keyset = NULL;
+ int i = 0;
+
+ target = path->pathtarget;
+ Assert(target->sortgrouprefs != NULL);
+
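+	/* Collect the positions of the grouping expressions in the target. */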
+ foreach(lc, target->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+
+ if (IsA(expr, GroupedVar))
+ {
+ GroupedVar *gvar = castNode(GroupedVar, expr);
+
+ if (!IsA(gvar->gvexpr, Aggref))
+ {
+ /*
+ * Generic grouping expression.
+ */
+ keyset = bms_add_member(keyset, i);
+ }
+ }
+ else
+ {
+ Assert(IsA(expr, Var));
+
+ if (target->sortgrouprefs[i] > 0)
+ {
+ /*
+ * Plain Var grouping expression.
+ */
+ keyset = bms_add_member(keyset, i);
+ }
+ else
+ {
+ /*
+ * A column functionally dependent on the GROUP BY clause?
+ */
+ }
+ }
+
+ i++;
+ }
+
+ add_uniquekeys(&path->uniquekeys, keyset);
+}
+
+/*
+ * Unlike other kinds of path, creation of a join path might be rejected
+ * due to inappropriate uniquekeys. Therefore this function only derives
+ * uniquekeys for a join and checks whether the join would produce unique
+ * grouping keys. The caller is responsible for adding them to the path if
+ * it's eventually created.
+ */
+List *
+make_uniquekeys_for_join(PlannerInfo *root, Path *outerpath, Path *innerpath,
+ PathTarget *target, bool *keys_ok)
+{
+ ListCell *l1;
+ List *result = NIL;
+
+ *keys_ok = true;
+
+ /*
+ * Find out if the join produces unique keys for various combinations of
+ * input sets of unique keys.
+ *
+ * TODO Implement heuristic that picks a few most useful sets on each
+ * side, to avoid exponential growth of the uniquekeys list as we proceed
+ * from lower to higher joins. Maybe also discard the resulting sets
+ * containing unique expressions which are not grouping expressions (and
+ * of course which are not aggregates) of this join's target.
+ */
+ foreach(l1, outerpath->uniquekeys)
+ {
+ ListCell *l2;
+ Bitmapset *outerset = (Bitmapset *) lfirst(l1);
+
+ foreach(l2, innerpath->uniquekeys)
+ {
+ Bitmapset *innerset = (Bitmapset *) lfirst(l2);
+ Bitmapset *joinset;
+
+ joinset = combine_uniquekeys(outerpath, outerset, innerpath,
+ innerset, target);
+ if (joinset != NULL)
+ {
+ /* Add the set to the path. */
+ add_uniquekeys(&result, joinset);
+ }
+ }
+ }
+
+ if (!match_uniquekeys_to_group_pathkeys(root, result, target))
+ *keys_ok = false;
+
+ return result;
+}
+
+/*
+ * Create join uniquekeys out of the uniquekeys of input paths.
+ */
+static Bitmapset *
+combine_uniquekeys(Path *outerpath, Bitmapset *outerset, Path *innerpath,
+ Bitmapset *innerset, PathTarget *target)
+{
+ ListCell *l1;
+ PathTarget *unique_exprs;
+ Expr *expr;
+ Index sortgroupref;
+ int i;
+ Bitmapset *result = NULL;
+
+ /*
+ * TODO sortgroupref is used to improve matching of the input and output
+ * path. A better solution might be to store the uniquekeys as a list of
+ * EC pointers, which is how PathKey is implemented.
+ */
+
+ /*
+ * Use PathTarget so that we can store both expression and its
+ * sortgroupref.
+ */
+ unique_exprs = create_empty_pathtarget();
+
+ /*
+ * First, collect the expressions corresponding to the uniquekeys of each
+ * input target.
+ */
+ i = 0;
+ foreach(l1, outerpath->pathtarget->exprs)
+ {
+ expr = (Expr *) lfirst(l1);
+
+ sortgroupref = 0;
+ if (outerpath->pathtarget->sortgrouprefs)
+ sortgroupref = outerpath->pathtarget->sortgrouprefs[i];
+
+ if (bms_is_member(i, outerset))
+ add_column_to_pathtarget(unique_exprs, expr, sortgroupref);
+
+ i++;
+ }
+ i = 0;
+ foreach(l1, innerpath->pathtarget->exprs)
+ {
+ expr = (Expr *) lfirst(l1);
+
+ sortgroupref = 0;
+ if (innerpath->pathtarget->sortgrouprefs)
+ sortgroupref = innerpath->pathtarget->sortgrouprefs[i];
+
+ if (bms_is_member(i, innerset))
+ add_column_to_pathtarget(unique_exprs, expr, sortgroupref);
+
+ i++;
+ }
+ if (unique_exprs->exprs == NIL)
+ return NULL;
+
+ /*
+ * Now find each expression in the join target and add the position to the
+ * output uniquekeys.
+ */
+ i = 0;
+ foreach(l1, unique_exprs->exprs)
+ {
+ Expr *unique_expr = (Expr *) lfirst(l1);
+ Index sortgroupref = 0;
+ ListCell *l2;
+ int j;
+ bool match = false;
+
+ if (unique_exprs->sortgrouprefs)
+ sortgroupref = unique_exprs->sortgrouprefs[i];
+
+ /*
+ * As some expressions of the input uniquekeys do not necessarily
+ * appear in the output target (they could have been there just
+ * because of the join clause of the current join), we first try to
+ * find a match using sortgroupref. If one expression is gone but
+ * another with the same sortgroupref still exists (i.e. one was derived
+ * from the other), it should always have the same value.
+ */
+ j = 0;
+ if (sortgroupref > 0)
+ {
+ foreach(l2, target->exprs)
+ {
+ Index sortgroupref_target = 0;
+
+ if (target->sortgrouprefs)
+ sortgroupref_target = target->sortgrouprefs[j];
+
+ if (sortgroupref_target == sortgroupref)
+ {
+					result = bms_add_member(result, j);
+ match = true;
+ break;
+				}
+
+				j++;
+			}
+ }
+
+ if (match)
+ {
+ i++;
+ continue;
+ }
+
+ /*
+ * If sortgroupref didn't help, we need to find the exact expression.
+ */
+ j = 0;
+ foreach(l2, target->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(l2);
+
+ if (equal(expr, unique_expr))
+ {
+				result = bms_add_member(result, j);
+ match = true;
+ break;
+ }
+
+ j++;
+ }
+
+ /*
+ * We can't construct uniquekeys for the join.
+ */
+ if (!match)
+ return NULL;
+
+ i++;
+ }
+
+ return result;
+}
+
+void
+free_uniquekeys(List *uniquekeys)
+{
+ ListCell *lc;
+
+ foreach(lc, uniquekeys)
+		bms_free((Bitmapset *) lfirst(lc));
+ list_free(uniquekeys);
+}
+
+/*
+ * Create a set of positions of expressions in reltarget if the index is
+ * unique and if reltarget contains all the index columns. Add the set to
+ * uniquekeys if identical one is not already there.
+ */
+static void
+make_uniquekeys_for_unique_index(PathTarget *reltarget, IndexOptInfo *index,
+ Path *path)
+{
+ int i;
+ Bitmapset *new_set = NULL;
+
+ /*
+ * Give up if the index does not guarantee uniqueness.
+ */
+ if (!index->unique || !index->immediate ||
+ (index->indpred != NIL && !index->predOK))
+ return;
+
+ /*
+ * For the index path to be acceptable, reltarget must contain all the
+ * index columns.
+ *
+	 * reltarget is not supposed to contain non-var expressions, so neither
+	 * should the index.
+ */
+ if (index->indexprs != NULL)
+ return;
+
+ for (i = 0; i < index->ncolumns; i++)
+ {
+ int indkey = index->indexkeys[i];
+ ListCell *lc;
+ bool found = false;
+ int j = 0;
+
+ foreach(lc, reltarget->exprs)
+ {
+ Var *var = lfirst_node(Var, lc);
+
+ if (var->varno == index->rel->relid && var->varattno == indkey)
+ {
+ new_set = bms_add_member(new_set, j);
+ found = true;
+ break;
+ }
+
+ j++;
+ }
+
+ /*
+		 * If the rel needs less than the whole index key, the values of the
+		 * columns matched so far can contain duplicates.
+ */
+ if (!found)
+ {
+ bms_free(new_set);
+ return;
+ }
+ }
+
+ /*
+ * Add the set to the path, unless it's already there.
+ */
+ add_uniquekeys(&path->uniquekeys, new_set);
+}
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index c69740eda6..120dd4ab49 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -17,6 +17,7 @@
#include <limits.h>
#include "miscadmin.h"
+#include "catalog/pg_constraint.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
@@ -26,6 +27,8 @@
#include "optimizer/prep.h"
#include "optimizer/restrictinfo.h"
#include "optimizer/tlist.h"
+#include "optimizer/var.h"
+#include "parser/parse_oper.h"
#include "partitioning/partbounds.h"
#include "utils/hsearch.h"
@@ -57,6 +60,9 @@ static void add_join_rel(PlannerInfo *root, RelOptInfo *joinrel);
static void build_joinrel_partition_info(RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *inner_rel,
List *restrictlist, JoinType jointype);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+ PathTarget *target, PathTarget *agg_input,
+ List *gvis);
/*
@@ -72,7 +78,10 @@ setup_simple_rel_arrays(PlannerInfo *root)
/* Arrays are accessed using RT indexes (1..N) */
root->simple_rel_array_size = list_length(root->parse->rtable) + 1;
- /* simple_rel_array is initialized to all NULLs */
+ /*
+ * simple_rel_array / simple_grouped_rel_array are both initialized to all
+ * NULLs
+ */
root->simple_rel_array = (RelOptInfo **)
palloc0(root->simple_rel_array_size * sizeof(RelOptInfo *));
@@ -148,7 +157,14 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
rel->reloptkind = parent ? RELOPT_OTHER_MEMBER_REL : RELOPT_BASEREL;
rel->relids = bms_make_singleton(relid);
rel->rows = 0;
- /* cheap startup cost is interesting iff not all tuples to be retrieved */
+
+ /*
+ * Cheap startup cost is interesting iff not all tuples to be retrieved.
+ * XXX As for grouped relation, the startup cost might be interesting for
+ * AGG_SORTED (if it can produce the ordering that matches
+ * root->query_pathkeys) but not in general (other kinds of aggregation
+ * need the whole relation). Yet it seems worth trying.
+ */
rel->consider_startup = (root->tuple_fraction > 0);
rel->consider_param_startup = false; /* might get changed later */
rel->consider_parallel = false; /* might get changed later */
@@ -162,6 +178,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
rel->cheapest_parameterized_paths = NIL;
rel->direct_lateral_relids = NULL;
rel->lateral_relids = NULL;
+ rel->agg_info = NULL;
+ rel->grouped = NULL;
rel->relid = relid;
rel->rtekind = rte->rtekind;
/* min_attr, max_attr, attr_needed, attr_widths are set below */
@@ -380,13 +398,23 @@ build_join_rel_hash(PlannerInfo *root)
RelOptInfo *
find_join_rel(PlannerInfo *root, Relids relids)
{
+ HTAB *join_rel_hash;
+ List *join_rel_list;
+
+ join_rel_hash = root->join_rel_hash;
+ join_rel_list = root->join_rel_list;
+
/*
* Switch to using hash lookup when list grows "too long". The threshold
* is arbitrary and is known only here.
*/
- if (!root->join_rel_hash && list_length(root->join_rel_list) > 32)
+ if (!join_rel_hash && list_length(join_rel_list) > 32)
+ {
build_join_rel_hash(root);
+ join_rel_hash = root->join_rel_hash;
+ }
+
/*
* Use either hashtable lookup or linear search, as appropriate.
*
@@ -395,12 +423,12 @@ find_join_rel(PlannerInfo *root, Relids relids)
* so would force relids out of a register and thus probably slow down the
* list-search case.
*/
- if (root->join_rel_hash)
+ if (join_rel_hash)
{
Relids hashkey = relids;
JoinHashEntry *hentry;
- hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
+ hentry = (JoinHashEntry *) hash_search(join_rel_hash,
&hashkey,
HASH_FIND,
NULL);
@@ -411,7 +439,7 @@ find_join_rel(PlannerInfo *root, Relids relids)
{
ListCell *l;
- foreach(l, root->join_rel_list)
+ foreach(l, join_rel_list)
{
RelOptInfo *rel = (RelOptInfo *) lfirst(l);
@@ -481,7 +509,9 @@ set_foreign_rel_properties(RelOptInfo *joinrel, RelOptInfo *outer_rel,
static void
add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
{
- /* GEQO requires us to append the new joinrel to the end of the list! */
+ /*
+ * GEQO requires us to append the new joinrel to the end of the list!
+ */
root->join_rel_list = lappend(root->join_rel_list, joinrel);
/* store it into the auxiliary hashtable if there is one. */
@@ -511,6 +541,9 @@ add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
* 'restrictlist_ptr': result variable. If not NULL, *restrictlist_ptr
* receives the list of RestrictInfo nodes that apply to this
* particular pair of joinable relations.
+ * 'grouped' forces creation of a "standalone" object, i.e. w/o search in the
+ * join list and without adding the result to the list. Caller is
+ * responsible for setup of reltarget in such a case.
*
* restrictlist_ptr makes the routine's API a little grotty, but it saves
* duplicated calculation of the restrictlist...
@@ -521,10 +554,12 @@ build_join_rel(PlannerInfo *root,
RelOptInfo *outer_rel,
RelOptInfo *inner_rel,
SpecialJoinInfo *sjinfo,
- List **restrictlist_ptr)
+ List **restrictlist_ptr,
+ bool grouped)
{
- RelOptInfo *joinrel;
+ RelOptInfo *joinrel = NULL;
List *restrictlist;
+ bool create_target = !grouped;
/* This function should be used only for join between parents. */
Assert(!IS_OTHER_REL(outer_rel) && !IS_OTHER_REL(inner_rel));
@@ -532,7 +567,8 @@ build_join_rel(PlannerInfo *root,
/*
* See if we already have a joinrel for this set of base rels.
*/
- joinrel = find_join_rel(root, joinrelids);
+ if (!grouped)
+ joinrel = find_join_rel(root, joinrelids);
if (joinrel)
{
@@ -555,11 +591,11 @@ build_join_rel(PlannerInfo *root,
joinrel->reloptkind = RELOPT_JOINREL;
joinrel->relids = bms_copy(joinrelids);
joinrel->rows = 0;
- /* cheap startup cost is interesting iff not all tuples to be retrieved */
+ /* See the comment in build_simple_rel(). */
joinrel->consider_startup = (root->tuple_fraction > 0);
joinrel->consider_param_startup = false;
joinrel->consider_parallel = false;
- joinrel->reltarget = create_empty_pathtarget();
+ joinrel->reltarget = NULL;
joinrel->pathlist = NIL;
joinrel->ppilist = NIL;
joinrel->partial_pathlist = NIL;
@@ -573,6 +609,8 @@ build_join_rel(PlannerInfo *root,
inner_rel->direct_lateral_relids);
joinrel->lateral_relids = min_join_parameterization(root, joinrel->relids,
outer_rel, inner_rel);
+ joinrel->agg_info = NULL;
+ joinrel->grouped = NULL;
joinrel->relid = 0; /* indicates not a baserel */
joinrel->rtekind = RTE_JOIN;
joinrel->min_attr = 0;
@@ -623,9 +661,13 @@ build_join_rel(PlannerInfo *root,
* and inner rels we first try to build it from. But the contents should
* be the same regardless.
*/
- build_joinrel_tlist(root, joinrel, outer_rel);
- build_joinrel_tlist(root, joinrel, inner_rel);
- add_placeholders_to_joinrel(root, joinrel, outer_rel, inner_rel);
+ if (create_target)
+ {
+ joinrel->reltarget = create_empty_pathtarget();
+ build_joinrel_tlist(root, joinrel, outer_rel);
+ build_joinrel_tlist(root, joinrel, inner_rel);
+ add_placeholders_to_joinrel(root, joinrel, outer_rel, inner_rel);
+ }
/*
* add_placeholders_to_joinrel also took care of adding the ph_lateral
@@ -662,31 +704,39 @@ build_join_rel(PlannerInfo *root,
/*
* Set estimates of the joinrel's size.
- */
- set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
- sjinfo, restrictlist);
-
- /*
- * Set the consider_parallel flag if this joinrel could potentially be
- * scanned within a parallel worker. If this flag is false for either
- * inner_rel or outer_rel, then it must be false for the joinrel also.
- * Even if both are true, there might be parallel-restricted expressions
- * in the targetlist or quals.
*
- * Note that if there are more than two rels in this relation, they could
- * be divided between inner_rel and outer_rel in any arbitrary way. We
- * assume this doesn't matter, because we should hit all the same baserels
- * and joinclauses while building up to this joinrel no matter which we
- * take; therefore, we should make the same decision here however we get
- * here.
+ * XXX The function claims to need reltarget but it does not seem to
+ * actually use it. Should we call it unconditionally so that callers of
+ * build_join_rel() do not have to care?
*/
- if (inner_rel->consider_parallel && outer_rel->consider_parallel &&
- is_parallel_safe(root, (Node *) restrictlist) &&
- is_parallel_safe(root, (Node *) joinrel->reltarget->exprs))
- joinrel->consider_parallel = true;
+ if (create_target)
+ {
+ set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
+ sjinfo, restrictlist);
+
+ /*
+ * Set the consider_parallel flag if this joinrel could potentially be
+ * scanned within a parallel worker. If this flag is false for either
+ * inner_rel or outer_rel, then it must be false for the joinrel also.
+ * Even if both are true, there might be parallel-restricted
+ * expressions in the targetlist or quals.
+ *
+ * Note that if there are more than two rels in this relation, they
+ * could be divided between inner_rel and outer_rel in any arbitrary
+ * way. We assume this doesn't matter, because we should hit all the
+ * same baserels and joinclauses while building up to this joinrel no
+ * matter which we take; therefore, we should make the same decision
+ * here however we get here.
+ */
+ if (inner_rel->consider_parallel && outer_rel->consider_parallel &&
+ is_parallel_safe(root, (Node *) restrictlist) &&
+ is_parallel_safe(root, (Node *) joinrel->reltarget->exprs))
+ joinrel->consider_parallel = true;
+ }
/* Add the joinrel to the PlannerInfo. */
- add_join_rel(root, joinrel);
+ if (!grouped)
+ add_join_rel(root, joinrel);
/*
* Also, if dynamic-programming join search is active, add the new joinrel
@@ -694,7 +744,7 @@ build_join_rel(PlannerInfo *root,
* of members should be for equality, but some of the level 1 rels might
* have been joinrels already, so we can only assert <=.
*/
- if (root->join_rel_level)
+ if (root->join_rel_level && !grouped)
{
Assert(root->join_cur_level > 0);
Assert(root->join_cur_level <= bms_num_members(joinrel->relids));
@@ -718,16 +768,19 @@ build_join_rel(PlannerInfo *root,
* 'restrictlist': list of RestrictInfo nodes that apply to this particular
* pair of joinable relations
* 'jointype' is the join type (inner, left, full, etc)
+ * 'grouped': does the join contain partial aggregate? (If it does, then
+ * caller is responsible for setup of reltarget.)
*/
RelOptInfo *
build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
RelOptInfo *inner_rel, RelOptInfo *parent_joinrel,
List *restrictlist, SpecialJoinInfo *sjinfo,
- JoinType jointype)
+ JoinType jointype, bool grouped)
{
RelOptInfo *joinrel = makeNode(RelOptInfo);
AppendRelInfo **appinfos;
int nappinfos;
+ bool create_target = !grouped;
/* Only joins between "other" relations land here. */
Assert(IS_OTHER_REL(outer_rel) && IS_OTHER_REL(inner_rel));
@@ -735,11 +788,11 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
joinrel->reloptkind = RELOPT_OTHER_JOINREL;
joinrel->relids = bms_union(outer_rel->relids, inner_rel->relids);
joinrel->rows = 0;
- /* cheap startup cost is interesting iff not all tuples to be retrieved */
+ /* See the comment in build_simple_rel(). */
joinrel->consider_startup = (root->tuple_fraction > 0);
joinrel->consider_param_startup = false;
joinrel->consider_parallel = false;
- joinrel->reltarget = create_empty_pathtarget();
+ joinrel->reltarget = NULL;
joinrel->pathlist = NIL;
joinrel->ppilist = NIL;
joinrel->partial_pathlist = NIL;
@@ -749,6 +802,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
joinrel->cheapest_parameterized_paths = NIL;
joinrel->direct_lateral_relids = NULL;
joinrel->lateral_relids = NULL;
+ joinrel->agg_info = NULL;
+ joinrel->grouped = NULL;
joinrel->relid = 0; /* indicates not a baserel */
joinrel->rtekind = RTE_JOIN;
joinrel->min_attr = 0;
@@ -789,11 +844,15 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
/* Compute information relevant to foreign relations. */
set_foreign_rel_properties(joinrel, outer_rel, inner_rel);
- /* Build targetlist */
- build_joinrel_tlist(root, joinrel, outer_rel);
- build_joinrel_tlist(root, joinrel, inner_rel);
- /* Add placeholder variables. */
- add_placeholders_to_child_joinrel(root, joinrel, parent_joinrel);
+ if (create_target)
+ {
+ /* Build targetlist */
+ joinrel->reltarget = create_empty_pathtarget();
+ build_joinrel_tlist(root, joinrel, outer_rel);
+ build_joinrel_tlist(root, joinrel, inner_rel);
+ /* Add placeholder variables. */
+ add_placeholders_to_child_joinrel(root, joinrel, parent_joinrel);
+ }
/* Construct joininfo list. */
appinfos = find_appinfos_by_relids(root, joinrel->relids, &nappinfos);
@@ -801,7 +860,6 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
(Node *) parent_joinrel->joininfo,
nappinfos,
appinfos);
- pfree(appinfos);
/*
* Lateral relids referred in child join will be same as that referred in
@@ -828,14 +886,22 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
/* Set estimates of the child-joinrel's size. */
- set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
- sjinfo, restrictlist);
+ /* XXX See the corresponding comment in build_join_rel(). */
+ if (create_target)
+ set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
+ sjinfo, restrictlist);
- /* We build the join only once. */
- Assert(!find_join_rel(root, joinrel->relids));
+ /*
+ * We build the join only once. (Grouped joins should not exist in the
+ * list.)
+ */
+ Assert(!find_join_rel(root, joinrel->relids) || grouped);
/* Add the relation to the PlannerInfo. */
- add_join_rel(root, joinrel);
+ if (!grouped)
+ add_join_rel(root, joinrel);
+
+ pfree(appinfos);
return joinrel;
}
@@ -1768,3 +1834,728 @@ build_joinrel_partition_info(RelOptInfo *joinrel, RelOptInfo *outer_rel,
joinrel->nullable_partexprs[cnt] = nullable_partexpr;
}
}
+
+/*
+ * Check if the relation can produce grouped paths and return the information
+ * it'll need for it. The passed relation is the non-grouped one which has the
+ * reltarget already constructed.
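+ *
+ * A hypothetical example (illustration only): given
+ *
+ *   SELECT y.k, sum(x.a) FROM x JOIN y ON x.k = y.k GROUP BY y.k
+ *
+ * and rel = "x", the equivalence class {x.k, y.k} lets us derive x.k as a
+ * grouping expression for "x", so the returned RelAggInfo describes the
+ * partial aggregation Agg(Scan(x)) that can later be joined to "y".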
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+ List *gvis;
+ List *aggregates = NIL;
+ List *grp_exprs = NIL;
+ bool found_other_rel_agg;
+ ListCell *lc;
+ RelAggInfo *result;
+ PathTarget *agg_input;
+ PathTarget *target = NULL;
+ int i;
+ Bitmapset *sgr_set = NULL;
+ Bitmapset *sgr_query_set = NULL;
+
+ /*
+ * The function shouldn't have been called if there's no opportunity for
+ * aggregation push-down.
+ */
+ Assert(root->grouped_var_list != NIL);
+
+ /*
+ * The source relation has nothing to do with grouping.
+ */
+ Assert(rel->agg_info == NULL);
+
+ /*
+ * The current implementation of aggregation push-down cannot handle
+ * PlaceHolderVar (PHV).
+ *
+ * If we knew that the PHV should be evaluated in this target (and of
+ * course, if its expression matched some grouping expression or Aggref
+ * argument), we'd just let init_grouping_targets create GroupedVar for
+ * the corresponding expression (phexpr). On the other hand, if we knew
+ * that the PHV is evaluated below the current rel, we'd ignore it because
+ * the referencing GroupedVar would take care of propagation of the value
+ * to upper joins. (PHV whose ph_eval_at is above the current rel make the
+ * aggregation push-down impossible in any case because the partial
+ * aggregation would receive wrong input if we ignored the ph_eval_at.)
+ *
+ * The problem is that the same PHV can be evaluated in the target of the
+ * current rel or in that of lower rel --- depending on the input paths.
+ * For example, consider rel->relids = {A, B, C} and if ph_eval_at = {B,
+ * C}. Path "A JOIN (B JOIN C)" implies that the PHV is evaluated by the
+ * "(B JOIN C)", while path "(A JOIN B) JOIN C" evaluates the PHV itself.
+ */
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Expr *expr = lfirst(lc);
+
+ if (IsA(expr, PlaceHolderVar))
+ return NULL;
+ }
+
+ if (IS_SIMPLE_REL(rel))
+ {
+ RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+ /*
+ * rtekind != RTE_RELATION case is not supported yet.
+ */
+ if (rte->rtekind != RTE_RELATION)
+ return NULL;
+ }
+
+ /* Caller should only pass base relations or joins. */
+ Assert(rel->reloptkind == RELOPT_BASEREL ||
+ rel->reloptkind == RELOPT_JOINREL);
+
+ /*
+ * If any outer join can set the attribute value to NULL, the Agg plan
+ * would receive different input at the base rel level.
+ *
+ * XXX For RELOPT_JOINREL, do not return if all the joins that can set
+ * any entry of this rel's grouped target to NULL are provably below rel.
+ * (It's ok if rel is one of these joins. Do we need to postpone this
+ * check until the grouped target is available, and let
+ * init_grouping_targets take care of it?)
+ */
+ if (bms_overlap(rel->relids, root->nullable_baserels))
+ return NULL;
+
+ /*
+ * Use equivalence classes to generate additional grouping expressions for
+ * the current rel. Without these we might not be able to apply
+ * aggregation to the relation result set.
+ *
+ * It's important that create_grouping_expr_grouped_var_infos has
+ * processed the explicit grouping columns by now. If the grouping clause
+ * contains multiple expressions belonging to the same EC, the original
+ * (i.e. not derived) one should be preferred when we build grouping
+ * target for a relation. Otherwise we have a problem when trying to match
+ * target entries to grouping clauses during plan creation, see
+ * get_grouping_expression().
+ */
+ gvis = list_copy(root->grouped_var_list);
+ foreach(lc, root->grouped_var_list)
+ {
+ GroupedVarInfo *gvi = lfirst_node(GroupedVarInfo, lc);
+ int relid = -1;
+
+ /* Only interested in grouping expressions. */
+ if (IsA(gvi->gvexpr, Aggref))
+ continue;
+
+ while ((relid = bms_next_member(rel->relids, relid)) >= 0)
+ {
+ GroupedVarInfo *gvi_trans;
+
+ gvi_trans = translate_expression_to_rels(root, gvi, relid);
+ if (gvi_trans != NULL)
+ gvis = lappend(gvis, gvi_trans);
+ }
+ }
+
+ /*
+ * Check if some aggregates or grouping expressions can be evaluated in
+ * this relation's target, and collect all vars referenced by these
+ * aggregates / grouping expressions;
+ */
+ found_other_rel_agg = false;
+ foreach(lc, gvis)
+ {
+ GroupedVarInfo *gvi = lfirst_node(GroupedVarInfo, lc);
+
+ /*
+ * An uninitialized (NULL) gv_eval_at passes the subset test as well;
+ * that covers aggregates whose arguments contain no Vars, e.g. ones
+ * with Aggref.aggstar set.
+ */
+ if (bms_is_subset(gvi->gv_eval_at, rel->relids))
+ {
+ /*
+ * init_grouping_targets will handle plain Var grouping
+ * expressions because it needs to look them up in
+ * grouped_var_list anyway.
+ *
+ * XXX A plain Var could actually be handled w/o GroupedVar, but
+ * thus init_grouping_targets would have to spend extra effort
+ * looking for the EC-related vars, instead of relying on
+ * create_grouping_expr_grouped_var_infos. (Processing of
+ * particular expression would look different, so we could hardly
+ * reuse the same piece of code.)
+ */
+ if (IsA(gvi->gvexpr, Var))
+ continue;
+
+ /*
+ * The derived grouping expressions should not be referenced by
+ * the query targetlist, so do not add them if we're at the top of
+ * the join tree.
+ */
+ if (gvi->derived && bms_equal(rel->relids, root->all_baserels))
+ continue;
+
+ /*
+ * Accept the aggregate / grouping expression.
+ *
+ * (GroupedVarInfo is more convenient for the next processing than
+ * Aggref, see add_aggregates_to_grouped_target.)
+ */
+ if (IsA(gvi->gvexpr, Aggref))
+ aggregates = lappend(aggregates, gvi);
+ else
+ grp_exprs = lappend(grp_exprs, gvi);
+ }
+ else if (IsA(gvi->gvexpr, Aggref))
+ {
+ /*
+ * Remember that there is at least one aggregate expression that
+ * needs something else than this rel.
+ */
+ found_other_rel_agg = true;
+
+ /*
+ * This condition effectively terminates creation of the
+ * RelAggInfo, so there's no reason to check the next
+ * GroupedVarInfo.
+ */
+ break;
+ }
+ }
+
+ /*
+ * Grouping makes little sense w/o aggregate function and w/o grouping
+ * expressions.
+ *
+ * In contrast, grp_exprs is only supposed to contain generic grouping
+ * expressions, so it can be NIL so far. If all the grouping keys are just
+ * plain Vars, init_grouping_targets will take care of them.
+ */
+ if (aggregates == NIL)
+ {
+ list_free(gvis);
+ return NULL;
+ }
+
+ /*
+ * Give up if some other aggregate(s) need relations other than the
+ * current one.
+ *
+ * If the aggregate needs the current rel plus anything else, then the
+ * problem is that grouping of the current relation could make some input
+ * variables unavailable for the "higher aggregate", and it'd also
+ * decrease the number of input rows the "higher aggregate" receives.
+ *
+ * If the aggregate does not even need the current rel, then neither the
+ * current rel nor anything else should be grouped because we do not
+ * support join of two grouped relations.
+ */
+ if (found_other_rel_agg)
+ {
+ list_free(gvis);
+ return NULL;
+ }
+
+ /*
+ * Create target for grouped paths as well as one for the input paths of
+ * the aggregation paths.
+ */
+ target = create_empty_pathtarget();
+ agg_input = create_empty_pathtarget();
+
+ /*
+ * Give up if suitable targets for the aggregation push-down cannot be
+ * derived.
+ */
+ if (!init_grouping_targets(root, rel, target, agg_input, gvis))
+ {
+ list_free(gvis);
+ return NULL;
+ }
+
+ list_free(gvis);
+
+ /*
+ * Add (non-Var) grouping expressions (in the form of GroupedVar) to
+ * the grouping target.
+ *
+ * Follow the convention that the grouping expressions should precede
+ * aggregates.
+ */
+ add_grouped_vars_to_target(root, target, grp_exprs);
+
+ /*
+ * Aggregation push-down makes no sense w/o grouping expressions.
+ */
+ if (list_length(target->exprs) == 0)
+ return NULL;
+
+ /*
+ * Add aggregates (in the form of GroupedVar) to the grouping target.
+ */
+ add_grouped_vars_to_target(root, target, aggregates);
+
+ /*
+ * Make sure that the paths generating input data for partial aggregation
+ * include non-Var grouping expressions.
+ *
+ * TODO Shouldn't GroupedVar be added instead?
+ */
+ foreach(lc, grp_exprs)
+ {
+ GroupedVarInfo *gvi;
+
+ gvi = lfirst_node(GroupedVarInfo, lc);
+ add_column_to_pathtarget(agg_input, gvi->gvexpr, gvi->sortgroupref);
+ }
+
+ /*
+ * Since neither target nor agg_input is supposed to be identical to the
+ * source reltarget, compute the width and cost again.
+ */
+ set_pathtarget_cost_width(root, target);
+ set_pathtarget_cost_width(root, agg_input);
+
+ /*
+ * Check if the target for 1-stage aggregation can be set up.
+ *
+ * For this to work, the relation grouping target must contain all the
+ * grouping expressions of the query. (Again, there's no final aggregation
+ * that ensures utilization of all the grouping expressions.)
+ *
+ * TODO Take into account the fact that grouping expressions can be
+ * derived using ECs, so a single expression of the target can correspond
+ * to multiple expressions of the query target. (There's usually no
+ * reason to put multiple members of the same EC in the GROUP BY clause,
+ * but if the user does, it should not disable aggregation push-down.)
+ *
+ * First, collect sortgrouprefs of the relation target.
+ */
+ i = 0;
+ foreach(lc, target->exprs)
+ {
+ Index sortgroupref = 0;
+
+ Assert(target->sortgrouprefs != NULL);
+ sortgroupref = target->sortgrouprefs[i++];
+ if (sortgroupref > 0)
+ sgr_set = bms_add_member(sgr_set, sortgroupref);
+ }
+
+ /*
+ * Collect those of the query.
+ */
+ foreach(lc, root->processed_tlist)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc);
+ ListCell *lc2;
+
+ if (te->ressortgroupref > 0)
+ {
+ bool accept = false;
+
+ /*
+ * Ignore target entries that contain aggregates.
+ */
+ if (IsA(te->expr, Var))
+ accept = true;
+ else
+ {
+ List *l = pull_var_clause((Node *) te->expr,
+ PVC_INCLUDE_AGGREGATES);
+
+ foreach(lc2, l)
+ {
+ Expr *expr = lfirst(lc2);
+
+ if (IsA(expr, Aggref))
+ break;
+ }
+
+ /*
+ * Accept the target entry if no Aggref was found.
+ */
+ if (lc2 == NULL)
+ accept = true;
+ list_free(l);
+ }
+ if (accept)
+ sgr_query_set = bms_add_member(sgr_query_set,
+ te->ressortgroupref);
+ }
+ }
+
+ /*
+ * No grouping at this relation if the check failed.
+ */
+ if (!bms_equal(sgr_set, sgr_query_set))
+ return NULL;
+
+ result = makeNode(RelAggInfo);
+ result->target = target;
+ result->input = agg_input;
+
+ /*
+ * Build a list of grouping expressions and a list of the corresponding
+ * SortGroupClauses.
+ */
+ i = 0;
+ foreach(lc, target->exprs)
+ {
+ Index sortgroupref = 0;
+ SortGroupClause *cl;
+ Expr *texpr;
+
+ texpr = (Expr *) lfirst(lc);
+
+ if (IsA(texpr, GroupedVar) &&
+ IsA(((GroupedVar *) texpr)->gvexpr, Aggref))
+ {
+ /*
+ * texpr should represent the first aggregate in the targetlist.
+ */
+ break;
+ }
+
+ /*
+ * Find the clause by sortgroupref.
+ */
+ sortgroupref = target->sortgrouprefs[i++];
+
+ /*
+ * Besides being an aggregate, the only other reason for a target
+ * expression to lack a sortgroupref should be that it's a column
+ * functionally dependent on the GROUP BY clause. Such a column is
+ * not actually a grouping column.
+ */
+ if (sortgroupref == 0)
+ continue;
+
+ cl = get_sortgroupref_clause(sortgroupref,
+ root->parse->groupClause);
+
+ result->group_clauses = list_append_unique(result->group_clauses,
+ cl);
+
+ /*
+ * Add only unique clauses because of joins (both sides of a join can
+ * point at the same grouping clause). XXX Is it worth adding a bool
+ * argument indicating that we're dealing with a join right now?
+ */
+ result->group_exprs = list_append_unique(result->group_exprs,
+ texpr);
+ }
+
+ /* Finally collect the aggregates. */
+ while (lc != NULL)
+ {
+ GroupedVar *gvar = castNode(GroupedVar, lfirst(lc));
+
+ Assert(IsA(gvar->gvexpr, Aggref));
+ result->agg_exprs = lappend(result->agg_exprs,
+ gvar->gvexpr);
+
+ lc = lnext(lc);
+ }
+
+ return result;
+}
+
+/*
+ * Initialize target for grouped paths (target) as well as a target for paths
+ * that generate input for partial aggregation (agg_input).
+ *
+ * 'gvis' is a list of GroupedVarInfos possibly useful for rel.
+ *
+ * Return true iff the targets could be initialized.
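+ *
+ * A hypothetical example (illustration only): if rel is "x" with reltarget
+ * (x.k, x.a), where x.k is a grouping expression and x.a is referenced
+ * only as input of sum(x.a), then "target" ends up containing x.k only,
+ * while "agg_input" ends up containing both x.k and x.a.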
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+ PathTarget *target, PathTarget *agg_input,
+ List *gvis)
+{
+ ListCell *lc;
+ List *unresolved = NIL;
+
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Var *tvar;
+ GroupedVar *gvar;
+ Expr *expr_unresolved;
+ bool derived = false;
+ ListCell *lc2;
+ bool needed_by_aggregate;
+
+ /*
+ * Given that PlaceHolderVar currently prevents us from doing
+ * aggregation push-down, the source target cannot contain anything
+ * more complex than a Var. (As for generic grouping expressions,
+ * add_grouped_vars_to_target will retrieve them from the query
+ * targetlist and add them to "target" outside this function.)
+ */
+ tvar = lfirst_node(Var, lc);
+
+ gvar = get_grouping_expression(gvis, (Expr *) tvar, &derived);
+
+ /*
+ * Derived grouping expressions should not be referenced by the query
+ * targetlist, so let them fall into the "unresolved" list. Whether
+ * the current targetlist needs them is checked later.
+ */
+ if (gvar != NULL && !derived)
+ {
+ /*
+ * It's o.k. to use the target expression for grouping.
+ *
+ * The actual Var is added to the target. If we used the
+ * containing GroupedVar, references from various clauses (e.g.
+ * join quals) wouldn't work.
+ */
+ add_column_to_pathtarget(target, gvar->gvexpr,
+ gvar->sortgroupref);
+
+ /*
+ * As for agg_input, add the original expression but set
+ * sortgroupref in addition.
+ */
+ add_column_to_pathtarget(agg_input, gvar->gvexpr,
+ gvar->sortgroupref);
+
+ /* Process the next expression. */
+ continue;
+ }
+
+ /*
+ * Is this Var needed in the query targetlist for anything other than
+ * aggregate input?
+ */
+ needed_by_aggregate = false;
+ foreach(lc2, root->grouped_var_list)
+ {
+ GroupedVarInfo *gvi = lfirst_node(GroupedVarInfo, lc2);
+ ListCell *lc3;
+ List *vars;
+
+ if (!IsA(gvi->gvexpr, Aggref))
+ continue;
+
+ if (!bms_is_member(tvar->varno, gvi->gv_eval_at))
+ continue;
+
+ /*
+ * XXX Consider some sort of caching.
+ */
+ vars = pull_var_clause((Node *) gvi->gvexpr, PVC_RECURSE_AGGREGATES);
+ foreach(lc3, vars)
+ {
+ Var *var = lfirst_node(Var, lc3);
+
+ if (equal(var, tvar))
+ {
+ needed_by_aggregate = true;
+ break;
+ }
+ }
+ list_free(vars);
+ if (needed_by_aggregate)
+ break;
+ }
+
+ if (needed_by_aggregate)
+ {
+ bool found = false;
+
+ foreach(lc2, root->processed_tlist)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc2);
+
+ if (IsA(te->expr, Aggref))
+ continue;
+
+ /*
+ * Match tvar only to plain Vars in the targetlist.
+ *
+ * In contrast, the occurrence of tvar in a generic (grouping)
+ * expression is not a reason to add tvar to the "unresolved" list
+ * and eventually to the grouping target, because generic grouping
+ * expressions should already have their GroupedVars created, and
+ * those are added to the grouping target separately.
+ */
+ if (equal(te->expr, tvar))
+ {
+ found = true;
+ break;
+ }
+ }
+
+ /*
+ * If it's only Aggref input, add it to the aggregation input
+ * target and that's it.
+ */
+ if (!found)
+ {
+ add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+ continue;
+ }
+ }
+
+ if (gvar != NULL)
+ {
+ Assert(derived);
+
+ /*
+ * Use the whole GroupedVar as it contains sortgroupref.
+ */
+ expr_unresolved = (Expr *) gvar;
+ }
+ else
+ expr_unresolved = (Expr *) tvar;
+
+ /*
+ * Further investigation involves dependency check, for which we need
+ * to have all the plain-var grouping expressions gathered.
+ */
+ unresolved = lappend(unresolved, expr_unresolved);
+ }
+
+ /*
+ * Check for other possible reasons for the var to be in the plain target.
+ */
+ foreach(lc, unresolved)
+ {
+ Expr *unresolved_expr = (Expr *) lfirst(lc);
+ Var *var;
+ GroupedVar *gvar = NULL;
+ Index sortgroupref;
+ RangeTblEntry *rte;
+ List *deps = NIL;
+ Relids relids_subtract;
+ int ndx;
+ RelOptInfo *baserel;
+
+ if (IsA(unresolved_expr, Var))
+ {
+ var = castNode(Var, unresolved_expr);
+ sortgroupref = 0;
+ }
+ else
+ {
+ gvar = castNode(GroupedVar, unresolved_expr);
+ var = castNode(Var, gvar->gvexpr);
+ sortgroupref = gvar->sortgroupref;
+ Assert(sortgroupref > 0);
+ }
+
+ rte = root->simple_rte_array[var->varno];
+
+ /*
+ * Check if the Var can be in the grouping key even though it's not
+ * mentioned by the GROUP BY clause (and could not be derived using
+ * ECs).
+ */
+ if (sortgroupref == 0 &&
+ check_functional_grouping(rte->relid, var->varno,
+ var->varlevelsup,
+ target->exprs, &deps))
+ {
+ /*
+ * The var shouldn't be actually used as a grouping key (instead,
+ * the one this depends on will be), so sortgroupref should not be
+ * important.
+ */
+ add_new_column_to_pathtarget(target, (Expr *) var);
+ add_new_column_to_pathtarget(agg_input, (Expr *) var);
+
+ /*
+ * The var may or may not be present in generic grouping
+ * expression(s) in addition, but this is handled elsewhere.
+ */
+ continue;
+ }
+
+ /*
+ * Isn't the expression needed by joins above the current rel?
+ *
+ * The relids we're not interested in do include 0, which is the
+ * top-level targetlist. The only reason for relids to contain 0
+ * should be that the var is referenced either by an aggregate or by
+ * a grouping expression, but right now we're interested in the
+ * *other* reasons. (As soon as GroupedVars are installed, the
+ * top-level aggregates / grouping expressions no longer need a
+ * direct reference to the var anyway.)
+ */
+ relids_subtract = bms_copy(rel->relids);
+ bms_add_member(relids_subtract, 0);
+
+ baserel = find_base_rel(root, var->varno);
+ ndx = var->varattno - baserel->min_attr;
+ if (bms_nonempty_difference(baserel->attr_needed[ndx],
+ relids_subtract))
+ {
+ /*
+ * The variable is needed by upper join. This includes one that is
+ * referenced by a generic grouping expression but couldn't be
+ * recognized as grouping expression on its own at the top of the
+ * loop.
+ *
+ * The only way to bring this var to the aggregation output is to
+ * add it to the grouping expressions too, but we can't do this
+ * unless aggregation push-down involves 2-stage aggregation. So
+ * give up.
+ */
+ return false;
+ }
+ else
+ {
+ /*
+ * As long as the query is semantically correct, arriving here
+ * means that the var is referenced by a generic grouping
+ * expression. "target" should not contain it, as it only provides
+ * input for the final aggregation (it contains a GroupedVar for
+ * the whole grouping expression).
+ */
+ }
+
+ add_column_to_pathtarget(agg_input, (Expr *) var, 0);
+ }
+
+ return true;
+}
+
+/*
+ * Translate RelAggInfo of parent relation so it matches given child relation.
+ */
+RelAggInfo *
+translate_rel_agg_info(PlannerInfo *root, RelAggInfo *parent,
+ AppendRelInfo **appinfos, int nappinfos)
+{
+ RelAggInfo *result;
+
+ result = makeNode(RelAggInfo);
+
+ result->target = copy_pathtarget(parent->target);
+ result->target->exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) result->target->exprs,
+ nappinfos, appinfos);
+
+ result->input = copy_pathtarget(parent->input);
+ result->input->exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) result->input->exprs,
+ nappinfos, appinfos);
+
+ result->group_clauses = parent->group_clauses;
+
+ result->group_exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) parent->group_exprs,
+ nappinfos, appinfos);
+
+ result->agg_exprs = (List *)
+ adjust_appendrel_attrs(root,
+ (Node *) parent->agg_exprs,
+ nappinfos, appinfos);
+ return result;
+}
diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c
index 5500f33e63..fbc2851b39 100644
--- a/src/backend/optimizer/util/tlist.c
+++ b/src/backend/optimizer/util/tlist.c
@@ -105,6 +105,17 @@ tlist_member_ignore_relabel(Expr *node, List *targetlist)
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
+ /*
+ * The targetlist may contain GroupedVar where the caller expects the
+ * actual expression.
+ *
+ * XXX prepare_sort_from_pathkeys() needs this special case. Should
+ * the same assignment also be added to the other tlist_member_...()
+ * functions?
+ */
+ if (IsA(tlexpr, GroupedVar))
+ tlexpr = ((GroupedVar *) tlexpr)->gvexpr;
+
if (equal(node, tlexpr))
return tlentry;
}
@@ -426,7 +437,6 @@ get_sortgrouplist_exprs(List *sgClauses, List *targetList)
return result;
}
-
/*****************************************************************************
* Functions to extract data from a list of SortGroupClauses
*
@@ -801,6 +811,68 @@ apply_pathtarget_labeling_to_tlist(List *tlist, PathTarget *target)
}
/*
+ * For each aggregate add GroupedVar to the grouped target.
+ *
+ * Caller passes the aggregates in the form of GroupedVarInfos so that we
+ * don't have to look for gvid.
+ */
+void
+add_grouped_vars_to_target(PlannerInfo *root, PathTarget *target,
+ List *expressions)
+{
+ ListCell *lc;
+
+ /* Create the vars and add them to the target. */
+ foreach(lc, expressions)
+ {
+ GroupedVarInfo *gvi;
+ GroupedVar *gvar;
+
+ gvi = lfirst_node(GroupedVarInfo, lc);
+ gvar = makeNode(GroupedVar);
+ gvar->gvid = gvi->gvid;
+ gvar->gvexpr = gvi->gvexpr;
+ add_column_to_pathtarget(target, (Expr *) gvar, gvi->sortgroupref);
+ }
+}
+
+/*
+ * Return GroupedVar containing the passed-in expression if one exists, or
+ * NULL if the expression cannot be used as grouping key.
+ *
+ * is_derived reflects the ->derived field of the corresponding
+ * GroupedVarInfo.
+ */
+GroupedVar *
+get_grouping_expression(List *gvis, Expr *expr, bool *is_derived)
+{
+ ListCell *lc;
+
+ foreach(lc, gvis)
+ {
+ GroupedVarInfo *gvi = lfirst_node(GroupedVarInfo, lc);
+
+ if (IsA(gvi->gvexpr, Aggref))
+ continue;
+
+ if (equal(gvi->gvexpr, expr))
+ {
+ GroupedVar *result = makeNode(GroupedVar);
+
+ Assert(gvi->sortgroupref > 0);
+ result->gvexpr = gvi->gvexpr;
+ result->gvid = gvi->gvid;
+ result->sortgroupref = gvi->sortgroupref;
+ *is_derived = gvi->derived;
+ return result;
+ }
+ }
+
+ /* The expression cannot be used as grouping key. */
+ return NULL;
+}
+
+/*
* split_pathtarget_at_srfs
* Split given PathTarget into multiple levels to position SRFs safely
*
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index b16b1e4656..459dc3087c 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -840,3 +840,25 @@ alias_relid_set(PlannerInfo *root, Relids relids)
}
return result;
}
+
+/*
+ * Return GroupedVarInfo for given GroupedVar.
+ *
+ * XXX Consider better location of this routine.
+ */
+GroupedVarInfo *
+find_grouped_var_info(PlannerInfo *root, GroupedVar *gvar)
+{
+ ListCell *l;
+
+ foreach(l, root->grouped_var_list)
+ {
+ GroupedVarInfo *gvi = lfirst_node(GroupedVarInfo, l);
+
+ if (gvi->gvid == gvar->gvid)
+ return gvi;
+ }
+
+ elog(ERROR, "GroupedVarInfo not found");
+ return NULL; /* keep compiler quiet */
+}
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 44257154b8..a47aa0e92c 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -104,6 +104,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
Oid vatype;
FuncDetailCode fdresult;
char aggkind = 0;
+ Oid aggcombinefn = InvalidOid;
ParseCallbackState pcbstate;
/*
@@ -360,6 +361,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
elog(ERROR, "cache lookup failed for aggregate %u", funcid);
classForm = (Form_pg_aggregate) GETSTRUCT(tup);
aggkind = classForm->aggkind;
+ aggcombinefn = classForm->aggcombinefn;
catDirectArgs = classForm->aggnumdirectargs;
ReleaseSysCache(tup);
@@ -759,6 +761,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
aggref->aggstar = agg_star;
aggref->aggvariadic = func_variadic;
aggref->aggkind = aggkind;
+ aggref->aggcombinefn = aggcombinefn;
/* agglevelsup will be set by transformAggregateCall */
aggref->aggsplit = AGGSPLIT_SIMPLE; /* planner might change this */
aggref->location = location;
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 03e9a28a63..5b1c86a4d0 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -7757,6 +7757,35 @@ get_rule_expr(Node *node, deparse_context *context,
get_agg_expr((Aggref *) node, context, (Aggref *) node);
break;
+ case T_GroupedVar:
+ {
+ GroupedVar *gvar = castNode(GroupedVar, node);
+ Expr *expr = gvar->gvexpr;
+
+ /*
+ * GroupedVar that setrefs.c leaves in the tree should only
+ * exist in the Agg plan targetlist (while the GroupedVars in
+ * upper plans should have been replaced with Vars). If
+ * agg_partial is not initialized, the AGGSPLIT_SIMPLE
+ * aggregate has been pushed down.
+ */
+ if (IsA(expr, Aggref))
+ {
+ Aggref *aggref;
+
+ aggref = (Aggref *) gvar->gvexpr;
+ get_agg_expr(aggref, context, (Aggref *) gvar->gvexpr);
+ }
+ else if (IsA(expr, Var))
+ (void) get_variable((Var *) expr, 0, false, context);
+ else
+ {
+ Assert(IsA(gvar->gvexpr, OpExpr));
+ get_oper_expr((OpExpr *) expr, context);
+ }
+ break;
+ }
+
case T_GroupingFunc:
{
GroupingFunc *gexpr = (GroupingFunc *) node;
@@ -9242,10 +9271,18 @@ get_agg_combine_expr(Node *node, deparse_context *context, void *private)
Aggref *aggref;
Aggref *original_aggref = private;
- if (!IsA(node, Aggref))
+ if (IsA(node, Aggref))
+ aggref = (Aggref *) node;
+ else if (IsA(node, GroupedVar))
+ {
+ GroupedVar *gvar = castNode(GroupedVar, node);
+
+ aggref = (Aggref *) gvar->gvexpr;
+ original_aggref = castNode(Aggref, gvar->gvexpr);
+ }
+ else
elog(ERROR, "combining Aggref does not point to an Aggref");
- aggref = (Aggref *) node;
get_agg_expr(aggref, context, original_aggref);
}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index f1c78ffb65..bd4868cb56 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -113,6 +113,7 @@
#include "catalog/pg_statistic_ext.h"
#include "catalog/pg_type.h"
#include "executor/executor.h"
+#include "executor/nodeAgg.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -3883,6 +3884,39 @@ estimate_hash_bucket_stats(PlannerInfo *root, Node *hashkey, double nbuckets,
ReleaseVariableStats(vardata);
}
+/*
+ * estimate_hashagg_tablesize
+ * estimate the number of bytes that a hash aggregate hashtable will
+ * require based on the agg_costs, path width and dNumGroups.
+ *
+ * XXX this may be over-estimating the size now that hashagg knows to omit
+ * unneeded columns from the hashtable. Also for mixed-mode grouping sets,
+ * grouping columns not in the hashed set are counted here even though hashagg
+ * won't store them. Is this a problem?
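+ *
+ * A very rough illustration (ballpark numbers, not part of the actual
+ * computation): with a 40-byte path width, the minimal tuple header plus
+ * the per-entry hash overhead bring hashentrysize to the order of a
+ * hundred bytes, so one million groups suggest a hashtable of roughly
+ * 100MB.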
+ */
+Size
+estimate_hashagg_tablesize(Path *path, const AggClauseCosts *agg_costs,
+ double dNumGroups)
+{
+ Size hashentrysize;
+
+ /* Estimate per-hash-entry space at tuple width... */
+ hashentrysize = MAXALIGN(path->pathtarget->width) +
+ MAXALIGN(SizeofMinimalTupleHeader);
+
+ /* plus space for pass-by-ref transition values... */
+ hashentrysize += agg_costs->transitionSpace;
+ /* plus the per-hash-entry overhead */
+ hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
+
+ /*
+ * Note that this disregards the effect of fill-factor and growth policy
+ * of the hash-table. That's probably ok, given that the default
+ * fill-factor is relatively high. It'd be hard to meaningfully factor in
+ * "double-in-size" growth policies here.
+ */
+ return hashentrysize * dNumGroups;
+}
/*-------------------------------------------------------------------------
*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c5ba149996..61388d1254 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -952,6 +952,15 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
{
+ {"enable_agg_pushdown", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables aggregation push-down."),
+ NULL
+ },
+ &enable_agg_pushdown,
+ false,
+ NULL, NULL, NULL
+ },
+ {
{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of parallel append plans."),
NULL
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 697d3d7a5f..1e422fc140 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -223,6 +223,7 @@ typedef enum NodeTag
T_IndexOptInfo,
T_ForeignKeyOptInfo,
T_ParamPathInfo,
+ T_RelAggInfo,
T_Path,
T_IndexPath,
T_BitmapHeapPath,
@@ -263,9 +264,11 @@ typedef enum NodeTag
T_PathTarget,
T_RestrictInfo,
T_PlaceHolderVar,
+ T_GroupedVar,
T_SpecialJoinInfo,
T_AppendRelInfo,
T_PlaceHolderInfo,
+ T_GroupedVarInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
T_RollupData,
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1b4b0d75af..6af31f2722 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -296,6 +296,7 @@ typedef struct Aggref
Oid aggcollid; /* OID of collation of result */
Oid inputcollid; /* OID of collation that function should use */
Oid aggtranstype; /* type Oid of aggregate's transition value */
+ Oid aggcombinefn; /* combine function (see pg_aggregate.h) */
List *aggargtypes; /* type Oids of direct and aggregated args */
List *aggdirectargs; /* direct arguments, if an ordered-set agg */
List *args; /* aggregated arguments and sort expressions */
@@ -306,6 +307,7 @@ typedef struct Aggref
bool aggvariadic; /* true if variadic arguments have been
* combined into an array last argument */
char aggkind; /* aggregate kind (see pg_aggregate.h) */
+
Index agglevelsup; /* > 0 if agg belongs to outer query */
AggSplit aggsplit; /* expected agg-splitting mode of parent Agg */
int location; /* token location, or -1 if unknown */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 41caf873fb..3e0c4cb060 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -193,7 +193,8 @@ typedef struct PlannerInfo
* unreferenced view RTE; or if the RelOptInfo hasn't been made yet.
*/
struct RelOptInfo **simple_rel_array; /* All 1-rel RelOptInfos */
- int simple_rel_array_size; /* allocated size of array */
+
+ int simple_rel_array_size; /* allocated size of the arrays above */
/*
* simple_rte_array is the same length as simple_rel_array and holds
@@ -247,6 +248,7 @@ typedef struct PlannerInfo
* join_rel_level is NULL if not in use.
*/
List **join_rel_level; /* lists of join-relation RelOptInfos */
+
int join_cur_level; /* index of list being extended */
List *init_plans; /* init SubPlans for query */
@@ -279,6 +281,8 @@ typedef struct PlannerInfo
List *placeholder_list; /* list of PlaceHolderInfos */
+ List *grouped_var_list; /* List of GroupedVarInfos. */
+
List *fkey_list; /* list of ForeignKeyOptInfos */
List *query_pathkeys; /* desired pathkeys for query_planner() */
@@ -467,6 +471,8 @@ typedef struct PartitionSchemeData *PartitionScheme;
* direct_lateral_relids - rels this rel has direct LATERAL references to
* lateral_relids - required outer rels for LATERAL, as a Relids set
* (includes both direct and indirect lateral references)
+ * agg_info - RelAggInfo if the relation can produce grouped paths, NULL
+ * otherwise.
*
* If the relation is a base relation it will have these fields set:
*
@@ -646,6 +652,16 @@ typedef struct RelOptInfo
Relids direct_lateral_relids; /* rels directly laterally referenced */
Relids lateral_relids; /* minimum parameterization of rel */
+ /* Information needed to apply partial aggregation to this rel's paths. */
+ struct RelAggInfo *agg_info;
+
+ /*
+ * If the relation can produce grouped paths, store them here.
+ *
+ * If "grouped" is valid then "agg_info" must be NULL and vice versa.
+ */
+ struct RelOptInfo *grouped;
+
/* information about a base rel (not set for join rels!) */
Index relid;
Oid reltablespace; /* containing tablespace */
@@ -1049,6 +1065,64 @@ typedef struct ParamPathInfo
List *ppi_clauses; /* join clauses available from outer rels */
} ParamPathInfo;
+/*
+ * RelAggInfo
+ *
+ * RelOptInfo needs information contained here if its paths should be
+ * aggregated.
+ *
+ * "target" will be used as the pathtarget for aggregation if "explicit
+ * aggregation" is applied to a base relation or join. The same target will
+ * also --- if the relation is a join --- be used to join a grouped path to
+ * a non-grouped one.
+ *
+ * These targets contain plain-Var grouping expressions, generic grouping
+ * expressions wrapped in GroupedVar structure, or Aggrefs which are also
+ * wrapped in GroupedVar. Once GroupedVar is evaluated, its value is passed to
+ * the upper paths w/o being evaluated again. If final aggregation appears to
+ * be necessary above the final join, the contained Aggrefs are supposed to
+ * provide the final aggregation plan with input values, i.e. the aggregate
+ * transient state.
+ *
+ * Note: There's a convention that GroupedVars that contain Aggref expressions
+ * are supposed to follow the other expressions of the target. Iterations of
+ * ->exprs may rely on this arrangement.
+ *
+ * "input" contains Vars used either as grouping expressions or aggregate
+ * arguments, plus those used in grouping expressions which are not plain Vars
+ * themselves. Paths providing the aggregation plan with input data should use
+ * this target.
+ *
+ * "group_clauses" and "group_exprs" are lists of SortGroupClause and the
+ * corresponding grouping expressions respectively.
+ *
+ * "agg_exprs" is a list of Aggref nodes for the aggregation of the relation's
+ * paths.
+ *
+ * "rows" is the estimated number of result tuples produced by grouped
+ * paths.
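+ *
+ * A hypothetical example (illustration only): for
+ *
+ *   SELECT y.k, sum(x.a) FROM x JOIN y ON x.k = y.k GROUP BY y.k
+ *
+ * with the aggregation pushed down to "x", "target" would contain x.k plus
+ * a GroupedVar wrapping sum(x.a), "input" would contain x.k and x.a,
+ * "group_exprs" just x.k, and "agg_exprs" the sum(x.a) Aggref.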
+ */
+typedef struct RelAggInfo
+{
+ NodeTag type;
+
+ PathTarget *target; /* Target for grouped paths. */
+
+ PathTarget *input; /* pathtarget of paths that generate input for
+ * aggregation paths. */
+
+ List *group_clauses;
+ List *group_exprs;
+
+ /*
+ * TODO Consider removing this field and creating the Aggref, partial or
+ * simple, when needed, but avoid creating it multiple times (e.g. once
+ * for hash grouping, other times for sorted grouping).
+ */
+ List *agg_exprs; /* Aggref expressions. */
+
+ double rows;
+} RelAggInfo;
/*
* Type "Path" is used as-is for sequential-scan paths, as well as some other
@@ -1078,6 +1152,10 @@ typedef struct ParamPathInfo
*
* "pathkeys" is a List of PathKey nodes (see above), describing the sort
* ordering of the path's output rows.
+ *
+ * "uniquekeys" is a List of Bitmapset objects, each pointing at a set of
+ * expressions of "pathtarget" whose values within the path output are
+ * distinct.
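+ *
+ * For example (hypothetical): if pathtarget->exprs is (t.id, t.val) and
+ * t.id alone is known to be distinct within the output (say, thanks to a
+ * unique index), uniquekeys would contain the singleton bitmapset {0}.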
*/
typedef struct Path
{
@@ -1101,6 +1179,10 @@ typedef struct Path
List *pathkeys; /* sort ordering of path's output */
/* pathkeys is a List of PathKey nodes; see above */
+
+ List *uniquekeys; /* list of bitmapsets where each set contains
+ * positions of unique expressions within
+ * pathtarget. */
} Path;
/* Macro for extracting a path's parameterization relids; beware double eval */
@@ -1526,12 +1608,16 @@ typedef struct HashPath
* ProjectionPath node, which is marked dummy to indicate that we intend to
* assign the work to the input plan node. The estimated cost for the
* ProjectionPath node will account for whether a Result will be used or not.
+ *
+ * The force_result field indicates that a Result node must be used for
+ * some reason, even though the subpath could normally handle the
+ * projection.
*/
typedef struct ProjectionPath
{
Path path;
Path *subpath; /* path representing input source */
bool dummypp; /* true if no separate Result is needed */
+ bool force_result; /* Is Result node required? */
} ProjectionPath;
/*
@@ -2007,6 +2093,29 @@ typedef struct PlaceHolderVar
Index phlevelsup; /* > 0 if PHV belongs to outer query */
} PlaceHolderVar;
+
+/*
+ * Similar to the concept of PlaceHolderVar, we treat aggregates and grouping
+ * columns as special variables if grouping is possible below the top-level
+ * join. Likewise, the variable is evaluated below the query targetlist (in
+ * particular, in the targetlist of AGGSPLIT_INITIAL_SERIAL aggregation node
+ * which has a base relation or a join as its input) and bubbles up through the
+ * join tree until it reaches AGGSPLIT_FINAL_DESERIAL aggregation node.
+ *
+ * gvexpr is either Aggref or a generic (non-Var) grouping expression. (If a
+ * simple Var, we don't replace it with GroupedVar.)
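+ *
+ * For example (hypothetical): with GROUP BY x.a + 1, the whole expression
+ * x.a + 1 is wrapped in a GroupedVar, whereas GROUP BY x.a keeps the plain
+ * Var.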
+ */
+typedef struct GroupedVar
+{
+ Expr xpr;
+ Expr *gvexpr; /* the represented expression */
+
+ Index sortgroupref; /* SortGroupClause.tleSortGroupRef if gvexpr
+ * is grouping expression. */
+ Index gvid; /* GroupedVarInfo */
+ int32 width; /* Expression width. */
+} GroupedVar;
+
/*
* "Special join" info.
*
@@ -2203,6 +2312,24 @@ typedef struct PlaceHolderInfo
} PlaceHolderInfo;
/*
+ * Likewise, GroupedVarInfo exists for each distinct GroupedVar.
+ */
+typedef struct GroupedVarInfo
+{
+ NodeTag type;
+
+ Index gvid; /* GroupedVar.gvid */
+ Expr *gvexpr; /* the represented expression. */
+ Index sortgroupref; /* If gvexpr is a grouping expression, this is
+ * the tleSortGroupRef of the corresponding
+ * SortGroupClause. */
+ Relids gv_eval_at; /* lowest level we can evaluate the expression
+ * at or NULL if it can happen anywhere. */
+ bool derived; /* derived from another GroupedVarInfo using
+ * equivalence classes? */
+} GroupedVarInfo;
+
+/*
* This struct describes one potentially index-optimizable MIN/MAX aggregate
* function. MinMaxAggPath contains a list of these, and if we accept that
* path, the list is stored into root->minmax_aggs for use during setrefs.c.
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index ed854fdd40..f9f3d14b0b 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -88,4 +88,6 @@ extern Query *inline_set_returning_function(PlannerInfo *root,
extern List *expand_function_arguments(List *args, Oid result_type,
HeapTuple func_tuple);
+extern GroupedVarInfo *translate_expression_to_rels(PlannerInfo *root,
+ GroupedVarInfo *gvi, Index relid);
#endif /* CLAUSES_H */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 77ca7ff837..bb6ec0f4e1 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -72,6 +72,7 @@ extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
extern PGDLLIMPORT bool enable_partition_pruning;
+extern PGDLLIMPORT bool enable_agg_pushdown;
extern PGDLLIMPORT int constraint_exclusion;
extern double clamp_row_est(double nrows);
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 7c5ff22650..e7edd2f34e 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -71,6 +71,7 @@ extern AppendPath *create_append_path(PlannerInfo *root, RelOptInfo *rel,
List *partitioned_rels, double rows);
extern MergeAppendPath *create_merge_append_path(PlannerInfo *root,
RelOptInfo *rel,
+ PathTarget *target,
List *subpaths,
List *pathkeys,
Relids required_outer,
@@ -123,6 +124,7 @@ extern Relids calc_non_nestloop_required_outer(Path *outer_path, Path *inner_pat
extern NestPath *create_nestloop_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -134,6 +136,7 @@ extern NestPath *create_nestloop_path(PlannerInfo *root,
extern MergePath *create_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -148,6 +151,7 @@ extern MergePath *create_mergejoin_path(PlannerInfo *root,
extern HashPath *create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
+ PathTarget *target,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
@@ -196,6 +200,13 @@ extern AggPath *create_agg_path(PlannerInfo *root,
List *qual,
const AggClauseCosts *aggcosts,
double numGroups);
+extern AggPath *create_agg_sorted_path(PlannerInfo *root,
+ Path *subpath,
+ bool check_pathkeys,
+ double input_rows);
+extern AggPath *create_agg_hashed_path(PlannerInfo *root,
+ Path *subpath,
+ double input_rows);
extern GroupingSetsPath *create_groupingsets_path(PlannerInfo *root,
RelOptInfo *rel,
Path *subpath,
@@ -255,6 +266,14 @@ extern Path *reparameterize_path(PlannerInfo *root, Path *path,
double loop_count);
extern Path *reparameterize_path_by_child(PlannerInfo *root, Path *path,
RelOptInfo *child_rel);
+extern void make_uniquekeys(PlannerInfo *root, Path *path);
+extern void make_uniquekeys_for_agg_path(Path *path);
+extern List *make_uniquekeys_for_join(PlannerInfo *root,
+ Path *outerpath,
+ Path *innerpath,
+ PathTarget *target,
+ bool *keys_ok);
+extern void free_uniquekeys(List *uniquekeys);
/*
* prototypes for relnode.c
@@ -270,7 +289,8 @@ extern RelOptInfo *build_join_rel(PlannerInfo *root,
RelOptInfo *outer_rel,
RelOptInfo *inner_rel,
SpecialJoinInfo *sjinfo,
- List **restrictlist_ptr);
+ List **restrictlist_ptr,
+ bool grouped);
extern Relids min_join_parameterization(PlannerInfo *root,
Relids joinrelids,
RelOptInfo *outer_rel,
@@ -296,6 +316,11 @@ extern ParamPathInfo *find_param_path_info(RelOptInfo *rel,
extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
RelOptInfo *outer_rel, RelOptInfo *inner_rel,
RelOptInfo *parent_joinrel, List *restrictlist,
- SpecialJoinInfo *sjinfo, JoinType jointype);
-
+ SpecialJoinInfo *sjinfo, JoinType jointype,
+ bool grouped);
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
+extern RelAggInfo *translate_rel_agg_info(PlannerInfo *root,
+ RelAggInfo *agg_info,
+ AppendRelInfo **appinfos,
+ int nappinfos);
#endif /* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index cafde307ad..ede0cf242d 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,6 +21,7 @@
* allpaths.c
*/
extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_agg_pushdown;
extern PGDLLIMPORT int geqo_threshold;
extern PGDLLIMPORT int min_parallel_table_scan_size;
extern PGDLLIMPORT int min_parallel_index_scan_size;
@@ -50,11 +51,16 @@ extern PGDLLIMPORT join_search_hook_type join_search_hook;
extern RelOptInfo *make_one_rel(PlannerInfo *root, List *joinlist);
extern void set_dummy_rel_pathlist(RelOptInfo *rel);
-extern RelOptInfo *standard_join_search(PlannerInfo *root, int levels_needed,
+extern RelOptInfo *standard_join_search(PlannerInfo *root,
+ int levels_needed,
List *initial_rels);
extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
bool override_rows);
+
+extern bool create_grouped_path(PlannerInfo *root, RelOptInfo *rel,
+ Path *subpath, bool precheck,
+ bool parallel, AggStrategy aggstrategy);
extern int compute_parallel_worker(RelOptInfo *rel, double heap_pages,
double index_pages, int max_workers);
extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
@@ -70,7 +76,8 @@ extern void debug_print_rel(PlannerInfo *root, RelOptInfo *rel);
* indxpath.c
* routines to generate index paths
*/
-extern void create_index_paths(PlannerInfo *root, RelOptInfo *rel);
+extern void create_index_paths(PlannerInfo *root, RelOptInfo *rel,
+ bool grouped);
extern bool relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
List *restrictlist,
List *exprlist, List *oprlist);
@@ -92,7 +99,8 @@ extern Expr *adjust_rowcompare_for_index(RowCompareExpr *clause,
* tidpath.h
* routines to generate tid paths
*/
-extern void create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel);
+extern void create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel,
+ bool grouped);
/*
* joinpath.c
@@ -101,7 +109,8 @@ extern void create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel);
extern void add_paths_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outerrel, RelOptInfo *innerrel,
JoinType jointype, SpecialJoinInfo *sjinfo,
- List *restrictlist);
+ List *restrictlist,
+ bool grouped, bool do_aggregate);
/*
* joinrels.c
@@ -237,6 +246,10 @@ extern bool has_useful_pathkeys(PlannerInfo *root, RelOptInfo *rel);
extern PathKey *make_canonical_pathkey(PlannerInfo *root,
EquivalenceClass *eclass, Oid opfamily,
int strategy, bool nulls_first);
+extern void add_uniquekeys(List **keys_p, Bitmapset *new_set);
+extern bool match_uniquekeys_to_group_pathkeys(PlannerInfo *root,
+ List *uniquekeys,
+ PathTarget *target);
extern void add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
List *live_childrels);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index c8ab0280d2..ac76375b31 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,8 @@ extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
Relids where_needed, bool create_new_ph);
+extern void add_grouped_base_rels_to_query(PlannerInfo *root);
+extern void add_grouped_vars_to_rels(PlannerInfo *root);
extern void find_lateral_references(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
extern List *deconstruct_jointree(PlannerInfo *root);
diff --git a/src/include/optimizer/tlist.h b/src/include/optimizer/tlist.h
index 9fa52e1278..6c7619ad31 100644
--- a/src/include/optimizer/tlist.h
+++ b/src/include/optimizer/tlist.h
@@ -16,7 +16,6 @@
#include "nodes/relation.h"
-
extern TargetEntry *tlist_member(Expr *node, List *targetlist);
extern TargetEntry *tlist_member_ignore_relabel(Expr *node, List *targetlist);
@@ -41,7 +40,6 @@ extern Node *get_sortgroupclause_expr(SortGroupClause *sgClause,
List *targetList);
extern List *get_sortgrouplist_exprs(List *sgClauses,
List *targetList);
-
extern SortGroupClause *get_sortgroupref_clause(Index sortref,
List *clauses);
extern SortGroupClause *get_sortgroupref_clause_noerr(Index sortref,
@@ -65,6 +63,13 @@ extern void split_pathtarget_at_srfs(PlannerInfo *root,
PathTarget *target, PathTarget *input_target,
List **targets, List **targets_contain_srfs);
+/* TODO Find the best location (position and in some cases even file) for the
+ * following ones. */
+extern void add_grouped_vars_to_target(PlannerInfo *root, PathTarget *target,
+ List *expressions);
+extern GroupedVar *get_grouping_expression(List *gvis, Expr *expr,
+ bool *is_derived);
+
/* Convenience macro to get a PathTarget with valid cost/width fields */
#define create_pathtarget(root, tlist) \
set_pathtarget_cost_width(root, make_pathtarget_from_tlist(tlist))
diff --git a/src/include/optimizer/var.h b/src/include/optimizer/var.h
index 43c53b5344..5a795c3231 100644
--- a/src/include/optimizer/var.h
+++ b/src/include/optimizer/var.h
@@ -36,5 +36,7 @@ extern bool contain_vars_of_level(Node *node, int levelsup);
extern int locate_var_of_level(Node *node, int levelsup);
extern List *pull_var_clause(Node *node, int flags);
extern Node *flatten_join_alias_vars(PlannerInfo *root, Node *node);
+extern GroupedVarInfo *find_grouped_var_info(PlannerInfo *root,
+ GroupedVar *gvar);
#endif /* VAR_H */
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index 95e44280c4..3a14fc6036 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -213,6 +213,9 @@ extern void estimate_hash_bucket_stats(PlannerInfo *root,
Node *hashkey, double nbuckets,
Selectivity *mcv_freq,
Selectivity *bucketsize_frac);
+extern Size estimate_hashagg_tablesize(Path *path,
+ const AggClauseCosts *agg_costs,
+ double dNumGroups);
extern List *deconstruct_indexquals(IndexPath *path);
extern void genericcostestimate(PlannerInfo *root, IndexPath *path,
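The paths.h hunk above adds an enable_agg_pushdown GUC, so a quick way to
experiment with the prototype is to toggle it around an aggregate-over-join
query. A minimal sketch of such a session follows; the orders/customers
tables are invented for illustration, and the GUC's default value in the
prototype is an assumption here:
set enable_agg_pushdown = on;
explain
select o.cust_id, sum(o.amount)
from orders o
join customers c on c.id = o.cust_id
group by o.cust_id;
If the pushdown applies, the plan should show the aggregate below the join,
on the orders side; with enable_agg_pushdown = off, the planner should fall
back to the usual Agg-on-top-of-Join shape.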
On Fri, Aug 03, 2018 at 04:50:11PM +0200, Antonin Houska wrote:
> Antonin Houska <ah@cybertec.at> wrote:
> > I didn't have enough time to separate "your functionality" and can do
> > it when I'm back from vacation.
>
> So I've separated the code that does not use the 2-stage aggregation
> (and therefore the feature is not involved in parallel queries).
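For context, "2-stage aggregation" refers to the Partial/Finalize aggregate
split that parallel query depends on: workers produce partial aggregates
below a Gather node, and a Finalize step combines them above it. That is
why dropping it takes parallel queries out of scope for the separated
patch. The plan shape given up looks roughly like this sketch (illustrative,
not output from the patch):
Finalize HashAggregate
  Group Key: x.id
  -> Gather
        Workers Planned: 2
        -> Partial HashAggregate
              Group Key: x.id
              -> Parallel Seq Scan on x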
This patch has been waiting for input from its author for a couple of
months now, so I am switching it to "Returned with Feedback". If a new
version can be provided, please feel free to send it.
--
Michael