PoC: Partial sort
Hackers!
Currently, when we need an ordered result from a table, we have to choose
one of two approaches: get the results from an index in exactly the order we
need, or sort the tuples. However, it could be useful to mix both methods:
get results from an index in an order that partially meets our requirements,
and do the rest of the work from the heap.
The two attached patches are a proof of concept for this approach.
*partial-sort-1.patch*
This patch allows an index to be used for ORDER BY when the order-by clause
and the index have a non-empty common prefix. That is, the index gives the
right ordering for the first n order-by columns. To provide the right order
for the remaining m columns, a sort node is inserted. This sort node sorts
each group of tuples in which the values of the first n order-by columns
are equal.
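To make the "common prefix" condition concrete, here is a minimal standalone sketch of how n could be computed. It is an illustration only: the function name common_prefix_len and the comparison of columns by name are assumptions made for this example, while the patch itself compares PathKey pointers in pathkeys_common().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the length of the longest common prefix of two column lists.
 * Columns are compared by name purely for illustration (an assumption of
 * this sketch); the patch compares PathKey pointers instead.
 */
static size_t
common_prefix_len(const char *const *order_by, size_t n_order_by,
                  const char *const *index_cols, size_t n_index_cols)
{
    size_t n = 0;

    assert(order_by != NULL && index_cols != NULL);

    while (n < n_order_by && n < n_index_cols &&
           strcmp(order_by[n], index_cols[n]) == 0)
        n++;

    return n;
}
```

With an order-by list of (v1, v2) and an index on (v1), this returns 1, so the index supplies the ordering for the first column and the sort node only has to handle the one remaining column.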
See an example.
create table test as (select id, (random()*10000)::int as v1, random() as v2
                      from generate_series(1,1000000) id);
create index test_v1_idx on test (v1);
We have an index only on column v1, but we can still get results ordered by
v1, v2.
postgres=# select * from test order by v1, v2 limit 10;
   id   | v1 |         v2
--------+----+--------------------
 390371 |  0 | 0.0284479795955122
 674617 |  0 | 0.0322008323855698
 881905 |  0 |  0.042586590629071
 972877 |  0 | 0.0531588457524776
 364903 |  0 | 0.0594307743012905
  82333 |  0 | 0.0666455538012087
 266488 |  0 |  0.072808934841305
 892215 |  0 | 0.0744258034974337
  13805 |  0 | 0.0794667331501842
 338435 |  0 |  0.171817752998322
(10 rows)
And it's fast, using the following plan.
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=69214.06..69214.08 rows=10 width=16) (actual time=0.097..0.099 rows=10 loops=1)
   ->  Sort  (cost=69214.06..71714.06 rows=1000000 width=16) (actual time=0.096..0.097 rows=10 loops=1)
         Sort Key: v1, v2
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Index Scan using test_v1_idx on test  (cost=0.42..47604.42 rows=1000000 width=16) (actual time=0.017..0.066 rows=56 loops=1)
 Total runtime: 0.125 ms
(6 rows)
Of course, this approach is effective only when the first n order-by columns
provide enough distinct values (so that the sorted groups are small). The
patch is only a PoC because it makes no attempt to estimate the real cost of
using a partial sort.
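The group-by-group sorting described above can be sketched outside the executor as follows. This is a minimal illustration under stated assumptions, not the patch's code: the Row struct, cmp_v2() and partial_sort() are hypothetical names, the input array stands in for tuples already delivered in v1 order by an index scan, and the limit parameter plays the role of LIMIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical row type standing in for a tuple of the "test" table. */
typedef struct Row
{
    int     id;
    int     v1;
    double  v2;
} Row;

/* qsort comparator on the remaining sort column (v2). */
static int
cmp_v2(const void *a, const void *b)
{
    const Row *ra = (const Row *) a;
    const Row *rb = (const Row *) b;

    if (ra->v2 < rb->v2)
        return -1;
    if (ra->v2 > rb->v2)
        return 1;
    return 0;
}

/*
 * rows[] must already be ordered by v1, as an index scan on v1 would
 * deliver them.  Sort each run of equal v1 values by v2, stopping once at
 * least "limit" rows are in final order.  Returns the number of leading
 * rows that are in final (v1, v2) order.
 */
static size_t
partial_sort(Row *rows, size_t nrows, size_t limit)
{
    size_t start = 0;

    assert(rows != NULL || nrows == 0);

    while (start < nrows && start < limit)
    {
        size_t end = start + 1;

        /* Find the end of the group sharing the same v1 value. */
        while (end < nrows && rows[end].v1 == rows[start].v1)
            end++;

        /* Only this group needs sorting; earlier groups are final. */
        qsort(rows + start, end - start, sizeof(Row), cmp_v2);
        start = end;
    }

    return start;
}
```

The sketch shows why the groups must be small for this to pay off: with a limit of 10, only the groups needed to cover the first 10 rows are sorted, which matches the plan above where the index scan returned 56 rows rather than the whole table.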
*partial-knn-1.patch*
KNN-GiST provides the ability to get ordered results from an index, but this
order is based only on the information in the index. For instance, a GiST
index contains bounding rectangles for polygons, so we can't get the exact
distance to a polygon from the index (the situation in PostGIS is similar).
In the attached patch, the GiST distance method can set a recheck flag
(similar to the consistent method). This flag means that the distance method
returned a lower bound of the distance and we should recheck it from the heap.
See an example.
create table test as (select id,
                             polygon(3+(random()*10)::int,
                                     circle(point(random(), random()),
                                            0.0003 + random()*0.001)) as p
                      from generate_series(1,1000000) id);
create index test_idx on test using gist (p);
We can get results ordered by the distance from the polygon to a point.
postgres=# select id, p <-> point(0.5,0.5) from test
           order by p <-> point(0.5,0.5) limit 10;
   id   |       ?column?
--------+----------------------
 755611 | 0.000405855808916853
 807562 | 0.000464123777564343
 437778 | 0.000738524708741959
 947860 |  0.00076250998760724
 389843 | 0.000886362723569568
  17586 | 0.000981960100555216
 411329 |  0.00145338112316853
 894191 |  0.00149399559703506
 391907 |   0.0016647896049741
 235381 |  0.00167554614889509
(10 rows)
It's fast, using just an index scan.
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..1.86 rows=10 width=36) (actual time=0.180..0.230 rows=10 loops=1)
   ->  Index Scan using test_idx on test  (cost=0.29..157672.29 rows=1000000 width=36) (actual time=0.179..0.228 rows=10 loops=1)
         Order By: (p <-> '(0.5,0.5)'::point)
 Total runtime: 0.305 ms
(4 rows)
This patch is also only a PoC, for the following reasons:
1) It's probably wrong altogether to fetch the heap tuple from the index scan
node. This work should be done by another node.
2) The assumption that the order-by operator returns a float8 comparable with
the result of the GiST distance method is wrong in the general case.
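The recheck idea can be sketched independently of GiST as follows. This is a simplified illustration under stated assumptions: KnnItem and knn_next() are hypothetical names, the "exact" field stands in for the distance that the patch recomputes from the heap tuple, and a linearly scanned array replaces the RB-tree queue used by the real scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue entry: one heap item with its current distance. */
typedef struct KnnItem
{
    int     id;         /* heap item identifier */
    double  distance;   /* lower bound while recheck is set, else exact */
    bool    recheck;    /* true if distance is only a lower bound */
    double  exact;      /* stands in for the distance computed from heap */
    bool    returned;   /* already emitted to the caller */
} KnnItem;

/*
 * Emit the id of the next-nearest item, or -1 when nothing is left.
 * The queue is a plain array scanned linearly to keep the sketch short.
 */
static int
knn_next(KnnItem *items, size_t nitems)
{
    for (;;)
    {
        KnnItem *best = NULL;
        size_t   i;

        /* Pick the pending item with the smallest current distance. */
        for (i = 0; i < nitems; i++)
        {
            if (items[i].returned)
                continue;
            if (best == NULL || items[i].distance < best->distance)
                best = &items[i];
        }
        if (best == NULL)
            return -1;

        if (best->recheck)
        {
            /* Replace the lower bound with the exact value and requeue. */
            assert(best->exact >= best->distance);
            best->distance = best->exact;
            best->recheck = false;
            continue;
        }

        best->returned = true;
        return best->id;
    }
}
```

An item is emitted only when its exact distance is no larger than every other pending distance, and those are either exact or lower bounds, so the output order is correct even though the index supplied only approximations.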
------
With best regards,
Alexander Korotkov.
Attachments:
partial-sort-1.patch (application/octet-stream)
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
new file mode 100644
index 09b2eb0..65bf9fd
*** a/src/backend/executor/nodeSort.c
--- b/src/backend/executor/nodeSort.c
***************
*** 15,25 ****
--- 15,52 ----
#include "postgres.h"
+ #include "access/htup_details.h"
#include "executor/execdebug.h"
#include "executor/nodeSort.h"
#include "miscadmin.h"
#include "utils/tuplesort.h"
+ /*
+ * Check if first "skipCols" sort values are equal.
+ */
+ static bool
+ cmpTuples(SortState *node, TupleDesc tupDesc, HeapTuple a, TupleTableSlot *b)
+ {
+ int n = ((Sort *)node->ss.ps.plan)->skipCols, i;
+ SortSupport sortKeys = tuplesort_get_sortkeys(node->tuplesortstate);
+
+ for (i = 0; i < n; i++)
+ {
+ Datum datumA, datumB;
+ bool isnullA, isnullB;
+ AttrNumber attno = sortKeys[i].ssup_attno;
+
+ datumA = heap_getattr(a, attno, tupDesc, &isnullA);
+ datumB = slot_getattr(b, attno, &isnullB);
+
+ if (ApplySortComparator(datumA, isnullA,
+ datumB, isnullB,
+ &sortKeys[i]))
+ return false;
+ }
+ return true;
+ }
+
/* ----------------------------------------------------------------
* ExecSort
*************** ExecSort(SortState *node)
*** 54,131 ****
tuplesortstate = (Tuplesortstate *) node->tuplesortstate;
/*
* If first time through, read all tuples from outer plan and pass them to
* tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
*/
! if (!node->sort_Done)
! {
! Sort *plannode = (Sort *) node->ss.ps.plan;
! PlanState *outerNode;
! TupleDesc tupDesc;
! SO1_printf("ExecSort: %s\n",
! "sorting subplan");
! /*
! * Want to scan subplan in the forward direction while creating the
! * sorted data.
! */
! estate->es_direction = ForwardScanDirection;
! /*
! * Initialize tuplesort module.
! */
! SO1_printf("ExecSort: %s\n",
! "calling tuplesort_begin");
! outerNode = outerPlanState(node);
! tupDesc = ExecGetResultType(outerNode);
! tuplesortstate = tuplesort_begin_heap(tupDesc,
! plannode->numCols,
! plannode->sortColIdx,
! plannode->sortOperators,
! plannode->collations,
! plannode->nullsFirst,
! work_mem,
! node->randomAccess);
! if (node->bounded)
! tuplesort_set_bound(tuplesortstate, node->bound);
! node->tuplesortstate = (void *) tuplesortstate;
! /*
! * Scan the subplan and feed all the tuples to tuplesort.
! */
! for (;;)
{
! slot = ExecProcNode(outerNode);
if (TupIsNull(slot))
break;
!
! tuplesort_puttupleslot(tuplesortstate, slot);
}
! /*
! * Complete the sort.
! */
! tuplesort_performsort(tuplesortstate);
! /*
! * restore to user specified direction
! */
! estate->es_direction = dir;
! /*
! * finally set the sorted flag to true
! */
! node->sort_Done = true;
! node->bounded_Done = node->bounded;
! node->bound_Done = node->bound;
! SO1_printf("ExecSort: %s\n", "sorting done");
! }
SO1_printf("ExecSort: %s\n",
"retrieving tuple from tuplesort");
--- 81,194 ----
tuplesortstate = (Tuplesortstate *) node->tuplesortstate;
/*
+ * Return next tuple from sorted set if any.
+ */
+ if (node->sort_Done)
+ {
+ slot = node->ss.ps.ps_ResultTupleSlot;
+ if (tuplesort_gettupleslot(tuplesortstate,
+ ScanDirectionIsForward(dir),
+ slot) || node->finished)
+ return slot;
+ }
+
+ /*
* If first time through, read all tuples from outer plan and pass them to
* tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
*/
! Sort *plannode = (Sort *) node->ss.ps.plan;
! PlanState *outerNode;
! TupleDesc tupDesc;
! SO1_printf("ExecSort: %s\n",
! "sorting subplan");
! /*
! * Want to scan subplan in the forward direction while creating the
! * sorted data.
! */
! estate->es_direction = ForwardScanDirection;
! /*
! * Initialize tuplesort module.
! */
! SO1_printf("ExecSort: %s\n",
! "calling tuplesort_begin");
! outerNode = outerPlanState(node);
! tupDesc = ExecGetResultType(outerNode);
! tuplesortstate = tuplesort_begin_heap(tupDesc,
! plannode->numCols,
! plannode->sortColIdx,
! plannode->sortOperators,
! plannode->collations,
! plannode->nullsFirst,
! work_mem,
! node->randomAccess);
! if (node->bounded)
! tuplesort_set_bound(tuplesortstate, node->bound);
! node->tuplesortstate = (void *) tuplesortstate;
! /*
! * Put next group of tuples where "skipCols" sort values are equal to
! * tuplesort.
! */
! for (;;)
! {
! slot = ExecProcNode(outerNode);
! if (node->prev)
{
! ExecStoreTuple(node->prev, node->ss.ps.ps_ResultTupleSlot, InvalidBuffer, false);
! tuplesort_puttupleslot(tuplesortstate, node->ss.ps.ps_ResultTupleSlot);
if (TupIsNull(slot))
+ {
+ node->finished = true;
break;
! }
! else
! {
! bool cmp;
! cmp = cmpTuples(node, tupDesc, node->prev, slot);
! node->prev = ExecCopySlotTuple(slot);
! if (!cmp)
! break;
! }
}
+ else
+ {
+ if (TupIsNull(slot))
+ {
+ node->finished = true;
+ break;
+ }
+ else
+ {
+ node->prev = ExecCopySlotTuple(slot);
+ }
+ }
+ }
! /*
! * Complete the sort.
! */
! tuplesort_performsort(tuplesortstate);
! /*
! * restore to user specified direction
! */
! estate->es_direction = dir;
! /*
! * finally set the sorted flag to true
! */
! node->sort_Done = true;
! node->bounded_Done = node->bounded;
! node->bound_Done = node->bound;
! SO1_printf("ExecSort: %s\n", "sorting done");
SO1_printf("ExecSort: %s\n",
"retrieving tuple from tuplesort");
*************** ExecInitSort(Sort *node, EState *estate,
*** 174,180 ****
--- 237,245 ----
sortstate->bounded = false;
sortstate->sort_Done = false;
+ sortstate->finished = false;
sortstate->tuplesortstate = NULL;
+ sortstate->prev = NULL;
/*
* Miscellaneous initialization
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index e3edcf6..d698559
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copySort(const Sort *from)
*** 735,740 ****
--- 735,741 ----
CopyPlanFields((const Plan *) from, (Plan *) newnode);
COPY_SCALAR_FIELD(numCols);
+ COPY_SCALAR_FIELD(skipCols);
COPY_POINTER_FIELD(sortColIdx, from->numCols * sizeof(AttrNumber));
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
new file mode 100644
index 9c8ede6..067730f
*** a/src/backend/optimizer/path/pathkeys.c
--- b/src/backend/optimizer/path/pathkeys.c
*************** compare_pathkeys(List *keys1, List *keys
*** 312,317 ****
--- 312,343 ----
}
/*
+ * pathkeys_common
+ * Returns length of longest common prefix of keys1 and keys2.
+ */
+ int
+ pathkeys_common(List *keys1, List *keys2)
+ {
+ int n;
+ ListCell *key1,
+ *key2;
+ n = 0;
+
+ forboth(key1, keys1, key2, keys2)
+ {
+ PathKey *pathkey1 = (PathKey *) lfirst(key1);
+ PathKey *pathkey2 = (PathKey *) lfirst(key2);
+
+ if (pathkey1 != pathkey2)
+ return n;
+ n++;
+ }
+
+ return n;
+ }
+
+
+ /*
* pathkeys_contained_in
* Common special case of compare_pathkeys: we just want to know
* if keys2 are at least as well sorted as keys1.
*************** get_cheapest_fractional_path_for_pathkey
*** 403,409 ****
compare_fractional_path_costs(matched_path, path, fraction) <= 0)
continue;
! if (pathkeys_contained_in(pathkeys, path->pathkeys) &&
bms_is_subset(PATH_REQ_OUTER(path), required_outer))
matched_path = path;
}
--- 429,435 ----
compare_fractional_path_costs(matched_path, path, fraction) <= 0)
continue;
! if (pathkeys_common(pathkeys, path->pathkeys) != 0 &&
bms_is_subset(PATH_REQ_OUTER(path), required_outer))
matched_path = path;
}
*************** right_merge_direction(PlannerInfo *root,
*** 1457,1469 ****
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
if (root->query_pathkeys == NIL)
return 0; /* no special ordering requested */
if (pathkeys == NIL)
return 0; /* unordered path */
! if (pathkeys_contained_in(root->query_pathkeys, pathkeys))
{
/* It's useful ... or at least the first N keys are */
return list_length(root->query_pathkeys);
--- 1483,1499 ----
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
+ int n;
+
if (root->query_pathkeys == NIL)
return 0; /* no special ordering requested */
if (pathkeys == NIL)
return 0; /* unordered path */
! n = pathkeys_common(root->query_pathkeys, pathkeys);
!
! if (n != 0)
{
/* It's useful ... or at least the first N keys are */
return list_length(root->query_pathkeys);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index f2c122d..87dd985
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** static MergeJoin *make_mergejoin(List *t
*** 148,154 ****
bool *mergenullsfirst,
Plan *lefttree, Plan *righttree,
JoinType jointype);
! static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples);
--- 148,154 ----
bool *mergenullsfirst,
Plan *lefttree, Plan *righttree,
JoinType jointype);
! static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols, int skipCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples);
*************** create_merge_append_plan(PlannerInfo *ro
*** 808,814 ****
/* Now, insert a Sort node if subplan isn't sufficiently ordered */
if (!pathkeys_contained_in(pathkeys, subpath->pathkeys))
! subplan = (Plan *) make_sort(root, subplan, numsortkeys,
sortColIdx, sortOperators,
collations, nullsFirst,
best_path->limit_tuples);
--- 808,814 ----
/* Now, insert a Sort node if subplan isn't sufficiently ordered */
if (!pathkeys_contained_in(pathkeys, subpath->pathkeys))
! subplan = (Plan *) make_sort(root, subplan, numsortkeys, 0,
sortColIdx, sortOperators,
collations, nullsFirst,
best_path->limit_tuples);
*************** create_mergejoin_plan(PlannerInfo *root,
*** 2186,2192 ****
make_sort_from_pathkeys(root,
outer_plan,
best_path->outersortkeys,
! -1.0);
outerpathkeys = best_path->outersortkeys;
}
else
--- 2186,2192 ----
make_sort_from_pathkeys(root,
outer_plan,
best_path->outersortkeys,
! -1.0, 0);
outerpathkeys = best_path->outersortkeys;
}
else
*************** create_mergejoin_plan(PlannerInfo *root,
*** 2199,2205 ****
make_sort_from_pathkeys(root,
inner_plan,
best_path->innersortkeys,
! -1.0);
innerpathkeys = best_path->innersortkeys;
}
else
--- 2199,2205 ----
make_sort_from_pathkeys(root,
inner_plan,
best_path->innersortkeys,
! -1.0, 0);
innerpathkeys = best_path->innersortkeys;
}
else
*************** make_mergejoin(List *tlist,
*** 3738,3744 ****
* limit_tuples is as for cost_sort (in particular, pass -1 if no limit)
*/
static Sort *
! make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples)
--- 3738,3744 ----
* limit_tuples is as for cost_sort (in particular, pass -1 if no limit)
*/
static Sort *
! make_sort(PlannerInfo *root, Plan *lefttree, int numCols, int skipCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples)
*************** make_sort(PlannerInfo *root, Plan *leftt
*** 3762,3767 ****
--- 3762,3768 ----
plan->lefttree = lefttree;
plan->righttree = NULL;
node->numCols = numCols;
+ node->skipCols = skipCols;
node->sortColIdx = sortColIdx;
node->sortOperators = sortOperators;
node->collations = collations;
*************** find_ec_member_for_tle(EquivalenceClass
*** 4090,4096 ****
*/
Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
! double limit_tuples)
{
int numsortkeys;
AttrNumber *sortColIdx;
--- 4091,4097 ----
*/
Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
! double limit_tuples, int skipCols)
{
int numsortkeys;
AttrNumber *sortColIdx;
*************** make_sort_from_pathkeys(PlannerInfo *roo
*** 4110,4116 ****
&nullsFirst);
/* Now build the Sort node */
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, limit_tuples);
}
--- 4111,4117 ----
&nullsFirst);
/* Now build the Sort node */
! return make_sort(root, lefttree, numsortkeys, skipCols,
sortColIdx, sortOperators, collations,
nullsFirst, limit_tuples);
}
*************** make_sort_from_sortclauses(PlannerInfo *
*** 4153,4159 ****
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
--- 4154,4160 ----
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys, 0,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
*************** make_sort_from_groupcols(PlannerInfo *ro
*** 4208,4214 ****
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
--- 4209,4215 ----
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys, 0,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 6670794..94cb114
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** grouping_planner(PlannerInfo *root, doub
*** 1360,1367 ****
Path sort_path; /* dummy for result of cost_sort */
if (root->query_pathkeys == NIL ||
! pathkeys_contained_in(root->query_pathkeys,
! cheapest_path->pathkeys))
{
/* No sort needed for cheapest path */
sort_path.startup_cost = cheapest_path->startup_cost;
--- 1360,1367 ----
Path sort_path; /* dummy for result of cost_sort */
if (root->query_pathkeys == NIL ||
! pathkeys_common(root->query_pathkeys,
! cheapest_path->pathkeys) != 0)
{
/* No sort needed for cheapest path */
sort_path.startup_cost = cheapest_path->startup_cost;
*************** grouping_planner(PlannerInfo *root, doub
*** 1721,1727 ****
sort_plan = make_sort_from_pathkeys(root,
result_plan,
window_pathkeys,
! -1.0);
if (!pathkeys_contained_in(window_pathkeys,
current_pathkeys))
{
--- 1721,1727 ----
sort_plan = make_sort_from_pathkeys(root,
result_plan,
window_pathkeys,
! -1.0, 0);
if (!pathkeys_contained_in(window_pathkeys,
current_pathkeys))
{
*************** grouping_planner(PlannerInfo *root, doub
*** 1881,1887 ****
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
current_pathkeys,
! -1.0);
}
result_plan = (Plan *) make_unique(result_plan,
--- 1881,1887 ----
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
current_pathkeys,
! -1.0, 0);
}
result_plan = (Plan *) make_unique(result_plan,
*************** grouping_planner(PlannerInfo *root, doub
*** 1897,1908 ****
*/
if (parse->sortClause)
{
! if (!pathkeys_contained_in(root->sort_pathkeys, current_pathkeys))
{
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
root->sort_pathkeys,
! limit_tuples);
current_pathkeys = root->sort_pathkeys;
}
}
--- 1897,1912 ----
*/
if (parse->sortClause)
{
! int common = pathkeys_common(root->sort_pathkeys, current_pathkeys);
! int sortLength = list_length(root->sort_pathkeys);
!
! if (common < sortLength)
{
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
root->sort_pathkeys,
! limit_tuples,
! common);
current_pathkeys = root->sort_pathkeys;
}
}
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
new file mode 100644
index ea8af9f..29b90f2
*** a/src/backend/utils/sort/tuplesort.c
--- b/src/backend/utils/sort/tuplesort.c
*************** free_sort_tuple(Tuplesortstate *state, S
*** 3455,3457 ****
--- 3455,3464 ----
FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
pfree(stup->tuple);
}
+
+ SortSupport
+ tuplesort_get_sortkeys(Tuplesortstate *state)
+ {
+ return state->sortKeys;
+ }
+
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index 5a40347..3723a18
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
*************** typedef struct SortState
*** 1663,1670 ****
--- 1663,1672 ----
int64 bound; /* if bounded, how many tuples are needed */
bool sort_Done; /* sort completed yet? */
bool bounded_Done; /* value of bounded we did the sort with */
+ bool finished;
int64 bound_Done; /* value of bound we did the sort with */
void *tuplesortstate; /* private state of tuplesort.c */
+ HeapTuple prev;
} SortState;
/* ---------------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 101e22c..28b871e
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef struct Sort
*** 582,587 ****
--- 582,588 ----
{
Plan plan;
int numCols; /* number of sort-key columns */
+ int skipCols;
AttrNumber *sortColIdx; /* their indexes in the target list */
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
new file mode 100644
index 999adaa..7c09301
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** typedef enum
*** 157,162 ****
--- 157,163 ----
extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2);
extern bool pathkeys_contained_in(List *keys1, List *keys2);
+ extern int pathkeys_common(List *keys1, List *keys2);
extern Path *get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
Relids required_outer,
CostSelector cost_criterion);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
new file mode 100644
index ba7ae7c..b46d71c
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
*************** extern RecursiveUnion *make_recursive_un
*** 50,56 ****
Plan *lefttree, Plan *righttree, int wtParam,
List *distinctList, long numGroups);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
! List *pathkeys, double limit_tuples);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
--- 50,56 ----
Plan *lefttree, Plan *righttree, int wtParam,
List *distinctList, long numGroups);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
! List *pathkeys, double limit_tuples, int skipCols);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
new file mode 100644
index 25fa6de..267a988
*** a/src/include/utils/tuplesort.h
--- b/src/include/utils/tuplesort.h
***************
*** 24,29 ****
--- 24,30 ----
#include "executor/tuptable.h"
#include "fmgr.h"
#include "utils/relcache.h"
+ #include "utils/sortsupport.h"
/* Tuplesortstate is an opaque type whose details are not known outside
*************** extern void tuplesort_get_stats(Tuplesor
*** 108,113 ****
--- 109,116 ----
extern int tuplesort_merge_order(int64 allowedMem);
+ extern SortSupport tuplesort_get_sortkeys(Tuplesortstate *state);
+
/*
* These routines may only be called if randomAccess was specified 'true'.
* Likewise, backwards scan in gettuple/getdatum is only allowed if
partial-knn-1.patch (application/octet-stream)
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index e97ab8f..6ad5677
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
***************
*** 16,21 ****
--- 16,22 ----
#include "access/gist_private.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "utils/builtins.h"
*************** gistindex_keytest(IndexScanDesc scan,
*** 55,61 ****
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! double *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
--- 56,62 ----
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! GISTSearchTreeItemDistance *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
*************** gistindex_keytest(IndexScanDesc scan,
*** 72,78 ****
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! so->distances[i] = -get_float8_infinity();
return true;
}
--- 73,82 ----
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! {
! so->distances[i].value = -get_float8_infinity();
! so->distances[i].recheck = false;
! }
return true;
}
*************** gistindex_keytest(IndexScanDesc scan,
*** 170,176 ****
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! *distance_p = get_float8_infinity();
}
else
{
--- 174,181 ----
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! distance_p->value = get_float8_infinity();
! distance_p->recheck = false;
}
else
{
*************** gistindex_keytest(IndexScanDesc scan,
*** 195,208 ****
* can't tolerate lossy distance calculations on leaf tuples;
* there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
! *distance_p = DatumGetFloat8(dist);
}
key++;
--- 200,215 ----
* can't tolerate lossy distance calculations on leaf tuples;
* there is no opportunity to re-sort the tuples afterwards.
*/
! distance_p->recheck = false;
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&distance_p->recheck));
! distance_p->value = DatumGetFloat8(dist);
}
key++;
*************** gistindex_keytest(IndexScanDesc scan,
*** 234,240 ****
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
--- 241,247 ----
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, GISTSearchTreeItemDistance *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 284,290 ****
tmpItem->head = item;
tmpItem->lastHeap = NULL;
memcpy(tmpItem->distances, myDistances,
! sizeof(double) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
--- 291,297 ----
tmpItem->head = item;
tmpItem->lastHeap = NULL;
memcpy(tmpItem->distances, myDistances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 375,381 ****
tmpItem->head = item;
tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
memcpy(tmpItem->distances, so->distances,
! sizeof(double) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
--- 382,388 ----
tmpItem->head = item;
tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
memcpy(tmpItem->distances, so->distances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 387,392 ****
--- 394,485 ----
}
/*
+ * Do this tree item's distance values need a recheck?
+ */
+ static bool
+ searchTreeItemNeedDistanceRecheck(IndexScanDesc scan, GISTSearchTreeItem *item)
+ {
+ int i;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (item->distances[i].recheck)
+ return true;
+ }
+ return false;
+ }
+
+ /*
+ * Recheck distance values of item from heap and reinsert it into RB-tree.
+ */
+ static void
+ searchTreeItemDistanceRecheck(IndexScanDesc scan, GISTSearchTreeItem *treeItem,
+ GISTSearchItem *item)
+ {
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+ GISTSearchTreeItem *tmpItem = so->tmpTreeItem;
+ Buffer buffer;
+ bool got_heap_tuple, all_dead;
+ HeapTupleData tup;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ bool isNew;
+ int i;
+
+ buffer = ReadBuffer(scan->heapRelation,
+ ItemPointerGetBlockNumber(&item->data.heap.heapPtr));
+ LockBuffer(buffer, BUFFER_LOCK_SHARE);
+ got_heap_tuple = heap_hot_search_buffer(&item->data.heap.heapPtr,
+ scan->heapRelation,
+ buffer,
+ scan->xs_snapshot,
+ &tup,
+ &all_dead,
+ true);
+ if (!got_heap_tuple)
+ {
+ UnlockReleaseBuffer(buffer);
+ pfree(item);
+ return;
+ }
+
+ memcpy(tmpItem, treeItem, GSTIHDRSZ +
+ sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
+ tmpItem->head = item;
+ tmpItem->lastHeap = item;
+ item->next = NULL;
+
+ ExecStoreTuple(&tup, so->slot, InvalidBuffer, false);
+ FormIndexDatum(so->indexInfo, so->slot, so->estate, values, isnull);
+
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (tmpItem->distances[i].recheck)
+ {
+ ScanKey key = scan->orderByData + i;
+ float8 newDistance;
+
+ tmpItem->distances[i].recheck = false;
+ if (isnull[key->sk_attno - 1])
+ {
+ tmpItem->distances[i].value = -get_float8_infinity();
+ continue;
+ }
+
+ newDistance = DatumGetFloat8(
+ FunctionCall2Coll(&so->orderByRechecks[i],
+ key->sk_collation,
+ values[key->sk_attno - 1],
+ key->sk_argument));
+
+ tmpItem->distances[i].value = newDistance;
+
+ }
+ }
+ (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ UnlockReleaseBuffer(buffer);
+ }
+
+ /*
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 396,403 ****
* the distances value for the item.
*/
static GISTSearchItem *
! getNextGISTSearchItem(GISTScanOpaque so)
{
for (;;)
{
GISTSearchItem *item;
--- 489,498 ----
* the distances value for the item.
*/
static GISTSearchItem *
! getNextGISTSearchItem(IndexScanDesc scan)
{
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+
for (;;)
{
GISTSearchItem *item;
*************** getNextGISTSearchItem(GISTScanOpaque so)
*** 418,423 ****
--- 513,526 ----
so->curTreeItem->head = item->next;
if (item == so->curTreeItem->lastHeap)
so->curTreeItem->lastHeap = NULL;
+
+ /* Recheck distance from heap tuple if needed */
+ if (GISTSearchItemIsHeap(*item) &&
+ searchTreeItemNeedDistanceRecheck(scan, so->curTreeItem))
+ {
+ searchTreeItemDistanceRecheck(scan, so->curTreeItem, item);
+ continue;
+ }
/* Return item; caller is responsible to pfree it */
return item;
}
*************** getNextNearest(IndexScanDesc scan)
*** 441,447 ****
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 544,550 ----
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
*************** gistgettuple(PG_FUNCTION_ARGS)
*** 521,527 ****
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
PG_RETURN_BOOL(false);
--- 624,630 ----
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
PG_RETURN_BOOL(false);
*************** gistgetbitmap(PG_FUNCTION_ARGS)
*** 573,579 ****
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 676,682 ----
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index 3a45781..afe447f
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_poly_consistent(PG_FUNCTION_ARGS)
*** 1094,1099 ****
--- 1094,1100 ----
PG_RETURN_BOOL(result);
}
+
/**************************************************
* Circle ops
**************************************************/
*************** computeDistance(bool isLeaf, BOX *box, P
*** 1270,1275 ****
--- 1271,1337 ----
return result;
}
+ static double
+ computeDistanceMBR(BOX *box, Point *point)
+ {
+ double result = 0.0;
+
+ if (point->x <= box->high.x && point->x >= box->low.x &&
+ point->y <= box->high.y && point->y >= box->low.y)
+ {
+ /* point inside the box */
+ result = 0.0;
+ }
+ else if (point->x <= box->high.x && point->x >= box->low.x)
+ {
+ /* point is over or below box */
+ Assert(box->low.y <= box->high.y);
+ if (point->y > box->high.y)
+ result = point->y - box->high.y;
+ else if (point->y < box->low.y)
+ result = box->low.y - point->y;
+ else
+ elog(ERROR, "inconsistent point values");
+ }
+ else if (point->y <= box->high.y && point->y >= box->low.y)
+ {
+ /* point is to left or right of box */
+ Assert(box->low.x <= box->high.x);
+ if (point->x > box->high.x)
+ result = point->x - box->high.x;
+ else if (point->x < box->low.x)
+ result = box->low.x - point->x;
+ else
+ elog(ERROR, "inconsistent point values");
+ }
+ else
+ {
+ /* closest point will be a vertex */
+ Point p;
+ double subresult;
+
+ result = point_point_distance(point, &box->low);
+
+ subresult = point_point_distance(point, &box->high);
+ if (result > subresult)
+ result = subresult;
+
+ p.x = box->low.x;
+ p.y = box->high.y;
+ subresult = point_point_distance(point, &p);
+ if (result > subresult)
+ result = subresult;
+
+ p.x = box->high.x;
+ p.y = box->low.y;
+ subresult = point_point_distance(point, &p);
+ if (result > subresult)
+ result = subresult;
+ }
+
+ return result;
+ }
+
static bool
gist_point_consistent_internal(StrategyNumber strategy,
bool isLeaf, BOX *key, Point *query)
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1451,1453 ****
--- 1513,1540 ----
PG_RETURN_FLOAT8(distance);
}
+
+ Datum
+ gist_poly_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistanceMBR(DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index b5553ff..61e5597
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
***************
*** 17,22 ****
--- 17,25 ----
#include "access/gist_private.h"
#include "access/gistscan.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
+ #include "executor/executor.h"
+ #include "executor/tuptable.h"
#include "utils/memutils.h"
#include "utils/rel.h"
*************** GISTSearchTreeItemComparator(const RBNod
*** 36,43 ****
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! if (sa->distances[i] != sb->distances[i])
! return (sa->distances[i] > sb->distances[i]) ? 1 : -1;
}
return 0;
--- 39,53 ----
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! if (sa->distances[i].value != sb->distances[i].value)
! return (sa->distances[i].value > sb->distances[i].value) ? 1 : -1;
!
! /*
! * Items without recheck can be immediately returned. So they are
! * placed first.
! */
! if (sa->distances[i].recheck != sb->distances[i].recheck)
! return sa->distances[i].recheck ? 1 : -1;
}
return 0;
*************** GISTSearchTreeItemAllocator(void *arg)
*** 83,89 ****
{
IndexScanDesc scan = (IndexScanDesc) arg;
! return palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
}
static void
--- 93,99 ----
{
IndexScanDesc scan = (IndexScanDesc) arg;
! return palloc(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
}
static void
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 127,136 ****
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
! so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
--- 137,153 ----
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
! so->distances = palloc(sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ so->orderByRechecks = (FmgrInfo *)palloc(sizeof(FmgrInfo) * scan->numberOfOrderBys);
+ so->indexInfo = BuildIndexInfo(scan->indexRelation);
+ so->estate = CreateExecutorState();
+ }
+
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
*************** gistrescan(PG_FUNCTION_ARGS)
*** 186,194 ****
first_time = false;
}
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
! so->queue = rb_create(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys,
GISTSearchTreeItemComparator,
GISTSearchTreeItemCombiner,
GISTSearchTreeItemAllocator,
--- 203,216 ----
first_time = false;
}
+ if (scan->numberOfOrderBys > 0 && !so->slot)
+ {
+ so->slot = MakeSingleTupleTableSlot(RelationGetDescr(scan->heapRelation));
+ }
+
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
! so->queue = rb_create(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys,
GISTSearchTreeItemComparator,
GISTSearchTreeItemCombiner,
GISTSearchTreeItemAllocator,
*************** gistrescan(PG_FUNCTION_ARGS)
*** 289,294 ****
--- 311,319 ----
GIST_DISTANCE_PROC, skey->sk_attno,
RelationGetRelationName(scan->indexRelation));
+ fmgr_info_copy(&so->orderByRechecks[i], &(skey->sk_func),
+ so->giststate->scanCxt);
+
fmgr_info_copy(&(skey->sk_func), finfo, so->giststate->scanCxt);
/* Restore prior fn_extra pointers, if not first time */
*************** gistendscan(PG_FUNCTION_ARGS)
*** 323,328 ****
--- 348,356 ----
IndexScanDesc scan = (IndexScanDesc) PG_GETARG_POINTER(0);
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+ if (so->slot)
+ ExecDropSingleTupleTableSlot(so->slot);
+
/*
* freeGISTstate is enough to clean up everything made by gistbeginscan,
* as well as the queueCxt if there is a separate context for it.
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 41178a6..16b60fe
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** dist_cpoly(PG_FUNCTION_ARGS)
*** 2664,2669 ****
--- 2664,2715 ----
PG_RETURN_FLOAT8(result);
}
+ Datum
+ dist_polyp(PG_FUNCTION_ARGS)
+ {
+ POLYGON *poly = PG_GETARG_POLYGON_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+ float8 d;
+ int i;
+ LSEG seg;
+
+ if (point_inside(point, poly->npts, poly->p) != 0)
+ {
+ #ifdef GEODEBUG
+ printf("dist_polyp- point inside of polygon\n");
+ #endif
+ PG_RETURN_FLOAT8(0.0);
+ }
+
+ /* initialize distance with segment between first and last points */
+ seg.p[0].x = poly->p[0].x;
+ seg.p[0].y = poly->p[0].y;
+ seg.p[1].x = poly->p[poly->npts - 1].x;
+ seg.p[1].y = poly->p[poly->npts - 1].y;
+ result = dist_ps_internal(point, &seg);
+ #ifdef GEODEBUG
+ printf("dist_polyp- segment 0/n distance is %f\n", result);
+ #endif
+
+ /* check distances for other segments */
+ for (i = 0; (i < poly->npts - 1); i++)
+ {
+ seg.p[0].x = poly->p[i].x;
+ seg.p[0].y = poly->p[i].y;
+ seg.p[1].x = poly->p[i + 1].x;
+ seg.p[1].y = poly->p[i + 1].y;
+ d = dist_ps_internal(point, &seg);
+ #ifdef GEODEBUG
+ printf("dist_polyp- segment %d distance is %f\n", (i + 1), d);
+ #endif
+ if (d < result)
+ result = d;
+ }
+
+ PG_RETURN_FLOAT8(result);
+ }
+
/*---------------------------------------------------------------------
* interpt_
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
new file mode 100644
index cae6dbc..b2572df
*** a/src/include/access/gist_private.h
--- b/src/include/access/gist_private.h
***************
*** 16,22 ****
--- 16,24 ----
#include "access/gist.h"
#include "access/itup.h"
+ #include "executor/tuptable.h"
#include "fmgr.h"
+ #include "nodes/execnodes.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
#include "utils/rbtree.h"
*************** typedef struct GISTSearchItem
*** 119,124 ****
--- 121,132 ----
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
+ typedef struct GISTSearchTreeItemDistance
+ {
+ double value;
+ bool recheck;
+ } GISTSearchTreeItemDistance;
+
/*
* Within a GISTSearchTreeItem's chain, heap items always appear before
* index-page items, since we want to visit heap items first. lastHeap points
*************** typedef struct GISTSearchTreeItem
*** 129,135 ****
RBNode rbnode; /* this is an RBTree item */
GISTSearchItem *head; /* first chain member */
GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
! double distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchTreeItem;
#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
--- 137,143 ----
RBNode rbnode; /* this is an RBTree item */
GISTSearchItem *head; /* first chain member */
GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
! GISTSearchTreeItemDistance distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchTreeItem;
#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
*************** typedef struct GISTScanOpaqueData
*** 149,160 ****
/* pre-allocated workspace arrays */
GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
! double *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
--- 157,172 ----
/* pre-allocated workspace arrays */
GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
! GISTSearchTreeItemDistance *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
+ FmgrInfo *orderByRechecks;
+ IndexInfo *indexInfo;
+ TupleTableSlot *slot;
+ EState *estate;
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index c8a548c..e7a79c6
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 638,643 ****
--- 638,644 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 3569 783 1970 ));
/*
* gist circle_ops
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index 53a3a7a..29c7c09
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 188,193 ****
--- 188,194 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 3567 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index 78efaa5..32ac483
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 709 ( "<->" PGNSP
*** 591,596 ****
--- 591,598 ----
DESCR("distance between");
DATA(insert OID = 712 ( "<->" PGNSP PGUID b f f 604 604 701 712 0 poly_distance - - ));
DESCR("distance between");
+ DATA(insert OID = 3569 ( "<->" PGNSP PGUID b f f 604 600 701 0 0 dist_polyp - - ));
+ DESCR("distance between");
DATA(insert OID = 713 ( "<>" PGNSP PGUID b f f 600 600 16 713 510 point_ne neqsel neqjoinsel ));
DESCR("not equal");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 0117500..85d077b
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 726 ( dist_lb PGN
*** 809,814 ****
--- 809,815 ----
DATA(insert OID = 727 ( dist_sl PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "601 628" _null_ _null_ _null_ _null_ dist_sl _null_ _null_ _null_ ));
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
+ DATA(insert OID = 3568 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2585 ( gist_poly_cons
*** 3937,3942 ****
--- 3938,3945 ----
DESCR("GiST support");
DATA(insert OID = 2586 ( gist_poly_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_poly_compress _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 3567 ( gist_poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_poly_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
DATA(insert OID = 2591 ( gist_circle_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i 5 0 16 "2281 718 23 26 2281" _null_ _null_ _null_ _null_ gist_circle_consistent _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2592 ( gist_circle_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_circle_compress _null_ _null_ _null_ ));
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 1e648c0..b8a04cb
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_radius(PG_FUNCTION_A
*** 395,400 ****
--- 395,401 ----
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 418,423 ****
--- 419,425 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_poly_distance(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
Hi,
Cool stuff.
On 2013-12-14 13:59:02 +0400, Alexander Korotkov wrote:
Currently when we need to get ordered result from table we have to choose
one of two approaches: get results from index in exact order we need or do
sort of tuples. However, it could be useful to mix both methods: get
results from index in order which partially meets our requirements and do
rest of work from heap.
------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=69214.06..69214.08 rows=10 width=16) (actual
time=0.097..0.099 rows=10 loops=1)
-> Sort (cost=69214.06..71714.06 rows=1000000 width=16) (actual
time=0.096..0.097 rows=10 loops=1)
Sort Key: v1, v2
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_v1_idx on test (cost=0.42..47604.42
rows=1000000 width=16) (actual time=0.017..0.066 rows=56 loops=1)
Total runtime: 0.125 ms
(6 rows)
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I think we might need to switch to a different algorithm to really
benefit from mostly pre-sorted input.
*partial-knn-1.patch*
KNN-GiST provides ability to get ordered results from index, but this order
is based only on index information. For instance, GiST index contains
bounding rectangles for polygons, and we can't get exact distance to
polygon from index (similar situation is in PostGIS). In attached patch,
GiST distance method can set recheck flag (similar to consistent method).
This flag means that distance method returned lower bound of distance and
we should recheck it from heap.
See an example.
create table test as (select id, polygon(3+(random()*10)::int,
circle(point(random(), random()), 0.0003 + random()*0.001)) as p from
generate_series(1,1000000) id);
create index test_idx on test using gist (p);
We can get results ordered by distance from polygon to point.
postgres=# select id, p <-> point(0.5,0.5) from test order by p <->
point(0.5,0.5) limit 10;
----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..1.86 rows=10 width=36) (actual time=0.180..0.230
rows=10 loops=1)
-> Index Scan using test_idx on test (cost=0.29..157672.29
rows=1000000 width=36) (actual time=0.179..0.228 rows=10 loops=1)
Order By: (p <-> '(0.5,0.5)'::point)
Total runtime: 0.305 ms
(4 rows)
Rechecking from the heap means adding a sort node though, which I don't
see here? Or am I misunderstanding something?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi!
Thanks for feedback!
On Sat, Dec 14, 2013 at 4:54 PM, Andres Freund <andres@2ndquadrant.com> wrote:
Hi,
Cool stuff.
On 2013-12-14 13:59:02 +0400, Alexander Korotkov wrote:
Currently when we need to get ordered result from table we have to choose
one of two approaches: get results from index in exact order we need or do
sort of tuples. However, it could be useful to mix both methods: get
results from index in order which partially meets our requirements and do
rest of work from heap.
------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=69214.06..69214.08 rows=10 width=16) (actual
time=0.097..0.099 rows=10 loops=1)
-> Sort (cost=69214.06..71714.06 rows=1000000 width=16) (actual
time=0.096..0.097 rows=10 loops=1)
Sort Key: v1, v2
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_v1_idx on test (cost=0.42..47604.42
rows=1000000 width=16) (actual time=0.017..0.066 rows=56 loops=1)
Total runtime: 0.125 ms
(6 rows)
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I think we might need to switch to a different algorithm to really
benefit from mostly pre-sorted input.
In this patch I don't do a full sort of the dataset. For instance, the index
returns data ordered by the first column and we need to order it by the
second column as well. This node then sorts each group of tuples (assumed
to be small) that share the same value of the first column by the value of
the second column. With a LIMIT clause, only the required number of such
groups is processed. But I don't think we should expect pre-sorted values
of the second column inside a group.
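The group-by-group idea can be sketched outside the executor roughly as follows. This is only an illustration under stated assumptions, not the code from partial-sort-1.patch: the Row type, cmp_v2 and partial_sort are hypothetical names, and a plain array stands in for the tuples coming out of the index scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical row type standing in for a heap tuple (v1, v2). */
typedef struct Row
{
    int     v1;     /* column already ordered by the index scan */
    double  v2;     /* column that still needs sorting */
} Row;

/* qsort comparator on the not-yet-sorted column only. */
static int
cmp_v2(const void *a, const void *b)
{
    double  x = ((const Row *) a)->v2;
    double  y = ((const Row *) b)->v2;

    return (x > y) - (x < y);
}

/*
 * Put rows[0..nrows) into (v1, v2) order, assuming the input is already
 * ordered by v1.  Stops as soon as at least 'limit' rows are in final
 * order and returns how many rows were finalized (always a whole number
 * of groups).
 */
static size_t
partial_sort(Row *rows, size_t nrows, size_t limit)
{
    size_t  start = 0;

    assert(rows != NULL || nrows == 0);

    while (start < nrows && start < limit)
    {
        size_t  end = start + 1;

        /* Find the end of the group sharing the same v1. */
        while (end < nrows && rows[end].v1 == rows[start].v1)
            end++;

        /* Only this group is sorted; earlier groups are already final. */
        qsort(rows + start, end - start, sizeof(Row), cmp_v2);
        start = end;
    }
    return start;
}
```

With a limit of 2 and a first group of three rows, only that first group is sorted and the remaining rows are never touched, which is the fast-start behaviour described above.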
*partial-knn-1.patch*
KNN-GiST provides ability to get ordered results from index, but this order
is based only on index information. For instance, GiST index contains
bounding rectangles for polygons, and we can't get exact distance to
polygon from index (similar situation is in PostGIS). In attached patch,
GiST distance method can set recheck flag (similar to consistent method).
This flag means that distance method returned lower bound of distance and
we should recheck it from heap.
See an example.
create table test as (select id, polygon(3+(random()*10)::int,
circle(point(random(), random()), 0.0003 + random()*0.001)) as p from
generate_series(1,1000000) id);
create index test_idx on test using gist (p);
We can get results ordered by distance from polygon to point.
postgres=# select id, p <-> point(0.5,0.5) from test order by p <->
point(0.5,0.5) limit 10;
----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..1.86 rows=10 width=36) (actual time=0.180..0.230
rows=10 loops=1)
-> Index Scan using test_idx on test (cost=0.29..157672.29
rows=1000000 width=36) (actual time=0.179..0.228 rows=10 loops=1)
Order By: (p <-> '(0.5,0.5)'::point)
Total runtime: 0.305 ms
(4 rows)
Rechecking from the heap means adding a sort node though, which I don't
see here? Or am I misunderstanding something?
KNN-GiST maintains an RB-tree of scanned items. In this patch an item is
rechecked inside GiST and reinserted into the same RB-tree. This appears to
be a much easier implementation for a PoC, and it also looks very efficient.
I'm not sure what the right design for it actually is. This is what I'd
like to discuss.
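To show why no separate sort node is needed, here is a minimal standalone sketch of the recheck-and-reinsert loop. It is not the GiST code from partial-knn-1.patch: Item, Queue, next_nearest and demo_exact_dist are hypothetical names, a small array with a linear scan replaces the RB-tree, and the exact distances are made-up numbers.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_MAX 64

/* Hypothetical queue entry; not the GiST structs from the patch. */
typedef struct Item
{
    int     id;
    double  dist;       /* lower bound while recheck is set, else exact */
    int     recheck;    /* nonzero: dist is only a lower bound */
} Item;

typedef struct Queue
{
    Item    items[QUEUE_MAX];
    size_t  n;
} Queue;

static void
queue_push(Queue *q, Item it)
{
    assert(q->n < QUEUE_MAX);
    q->items[q->n++] = it;
}

/* Order by distance; on ties, exact items come before unchecked ones. */
static int
item_before(const Item *a, const Item *b)
{
    if (a->dist != b->dist)
        return a->dist < b->dist;
    return !a->recheck && b->recheck;
}

/* Remove and return the smallest item (linear scan keeps the sketch short). */
static Item
queue_pop(Queue *q)
{
    size_t  best = 0;
    Item    result;

    assert(q->n > 0);
    for (size_t i = 1; i < q->n; i++)
        if (item_before(&q->items[i], &q->items[best]))
            best = i;
    result = q->items[best];
    q->items[best] = q->items[--q->n];
    return result;
}

/* Made-up exact distances for the usage example: id -> distance. */
static double
demo_exact_dist(int id)
{
    static const double exact[] = {0.50, 0.20, 0.35};

    return exact[id];
}

/*
 * Return the id of the next nearest item, or -1 if the queue is empty.
 * exact_dist(id) plays the role of the distance computed from the heap tuple.
 */
static int
next_nearest(Queue *q, double (*exact_dist) (int id))
{
    while (q->n > 0)
    {
        Item    it = queue_pop(q);

        if (!it.recheck)
            return it.id;       /* exact and smallest: safe to return */

        /* Replace the lower bound by the exact value and requeue. */
        it.dist = exact_dist(it.id);
        it.recheck = 0;
        queue_push(q, it);
    }
    return -1;
}
```

An exact item is only returned once it is the smallest entry in the queue, so every remaining lower bound, and therefore every remaining exact distance, is at least as large. That is what keeps the output ordered without an extra sort step.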
------
With best regards,
Alexander Korotkov.
On 14/12/13 12:54, Andres Freund wrote:
On 2013-12-14 13:59:02 +0400, Alexander Korotkov wrote:
Currently when we need to get ordered result from table we have to choose
one of two approaches: get results from index in exact order we need or do
sort of tuples. However, it could be useful to mix both methods: get
results from index in order which partially meets our requirements and do
rest of work from heap.
------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=69214.06..69214.08 rows=10 width=16) (actual
time=0.097..0.099 rows=10 loops=1)
-> Sort (cost=69214.06..71714.06 rows=1000000 width=16) (actual
time=0.096..0.097 rows=10 loops=1)
Sort Key: v1, v2
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_v1_idx on test (cost=0.42..47604.42
rows=1000000 width=16) (actual time=0.017..0.066 rows=56 loops=1)
Total runtime: 0.125 ms
(6 rows)
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I think we might need to switch to a different algorithm to really
benefit from mostly pre-sorted input.
Eg: /messages/by-id/5291467E.6070807@wizmail.org
Maybe Alexander and I should bash our heads together.
--
Cheers,
Jeremy
On Sat, Dec 14, 2013 at 06:21:18PM +0400, Alexander Korotkov wrote:
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I think we might need to switch to a different algorithm to really
benefit from mostly pre-sorted input.
In this patch I don't do full sort of dataset. For instance, index returns
data ordered by first column and we need to order them also by second
column. Then this node sorts groups (assumed to be small) where values of
the first column are same by value of second column. And with limit clause
only required number of such groups will be processed. But, I don't think
we should expect pre-sorted values of second column inside a group.
Nice. I imagine this would be mostly beneficial for fast-start plans,
since you no longer need to sort the whole table prior to returning the
first tuple.
Reduced memory usage might be a factor, especially for large sorts
where you otherwise might need to spool to disk.
You can now use an index on (a) to improve sorting for (a,b).
Cost of sorting n groups of size l goes from O(nl log nl) to just O(nl
log l), useful for large n.
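To make the saving concrete, here is a back-of-the-envelope comparison. It counts only comparisons and assumes, purely for illustration, n = 10^4 groups of l = 100 tuples each, with base-2 logarithms:

```latex
\begin{align*}
nl \log_2(nl) &= nl \log_2 n + nl \log_2 l \\
\text{full sort: } nl \log_2(nl) &= 10^6 \cdot \log_2 10^6 \approx 10^6 \cdot 19.9 \approx 2.0 \times 10^7 \\
\text{partial sort: } nl \log_2 l &= 10^6 \cdot \log_2 10^2 \approx 10^6 \cdot 6.6 \approx 6.6 \times 10^6
\end{align*}
```

The term that disappears is nl log n, which is why the gain grows with the number of groups.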
Minor comments:
I find cmpTuple a bad name. That's what it's doing but perhaps
cmpSkipColumns would be clearer.
I think it's worthwhile adding a separate path for the skipCols = 0
case, to avoid extra copies.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.
-- Arthur Schopenhauer
Hi,
Limit (cost=69214.06..69214.08 rows=10 width=16) (actual
time=0.097..0.099 rows=10 loops=1)
-> Sort (cost=69214.06..71714.06 rows=1000000 width=16) (actual
time=0.096..0.097 rows=10 loops=1)
Sort Key: v1, v2
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_v1_idx on test (cost=0.42..47604.42
rows=1000000 width=16) (actual time=0.017..0.066 rows=56 loops=1)
Total runtime: 0.125 ms
(6 rows)
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I think we might need to switch to a different algorithm to really
benefit from mostly pre-sorted input.
In this patch I don't do full sort of dataset. For instance, index returns
data ordered by first column and we need to order them also by second
column.
Ah, that makes sense.
But, I don't think we should expect pre-sorted values of second column
inside a group.
Yes, if you do it that way, there doesn't seem to be any need to assume
that any more than we usually do.
I think you should make the explain output reflect the fact that we're
assuming v1 is presorted and just sorting v2. I'd be happy enough with:
Sort Key: v1, v2
Partial Sort: v2
or even just
"Partial Sort Key: [v1,] v2"
but I am sure others disagree.
*partial-knn-1.patch*
Rechecking from the heap means adding a sort node though, which I don't
see here? Or am I misunderstanding something?
KNN-GiST contain RB-tree of scanned items. In this patch item is rechecked
inside GiST and reinserted into same RB-tree. It appears to be much easier
implementation for PoC and also looks very efficient. I'm not sure what is
actually right design for it. This is what I like to discuss.
I don't have enough clue about gist to say whether it's the right design,
but it doesn't look wrong to my eyes. It'd probably be useful to export
the knowledge that we are rechecking and how often that happens to the
outside.
While I didn't really look into the patch, I noticed in passing that you
pass an all_dead variable to heap_hot_search_buffer without using the
result - just pass NULL instead, that performs a bit less work.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 12/14/2013 10:59 AM, Alexander Korotkov wrote:
This patch allows to use index for order-by if order-by clause and index
has non-empty common prefix. So, index gives right ordering for first n
order-by columns. In order to provide right order for rest m columns,
sort node is inserted. This sort node sorts groups of tuples where
values of first n order-by columns are equal.
I recently looked at the same problem. I see that you solved the
rescanning problem by simply forcing the sort to be redone on
ExecReScanSort if you have done a partial sort.
My idea for a solution was to modify tuplesort to allow storing the
already sorted keys in either memtuples or the sort result file, but
setting a field so it does not sort the already sorted tuples again.
This would allow the rescan to work as it used to, but I am unsure how
clean or ugly this code would be. Was this something you considered?
--
Andreas Karlsson
On Sat, Dec 14, 2013 at 6:39 PM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
On Sat, Dec 14, 2013 at 06:21:18PM +0400, Alexander Korotkov wrote:
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I think we might need to switch to a different algorithm to really
benefit from mostly pre-sorted input.
In this patch I don't do full sort of dataset. For instance, index returns
data ordered by first column and we need to order them also by second
column. Then this node sorts groups (assumed to be small) where values of
the first column are same by value of second column. And with limit clause
only required number of such groups will be processed. But, I don't think
we should expect pre-sorted values of second column inside a group.
Nice. I imagine this would be mostly beneficial for fast-start plans,
since you no longer need to sort the whole table prior to returning the
first tuple.
Reduced memory usage might be a factor, especially for large sorts
where you otherwise might need to spool to disk.
You can now use an index on (a) to improve sorting for (a,b).
Cost of sorting n groups of size l goes from O(nl log nl) to just O(nl
log l), useful for large n.
Agree. Your reasoning looks correct.
Minor comments:
I find cmpTuple a bad name. That's what it's doing but perhaps
cmpSkipColumns would be clearer.
I think it's worthwhile adding a separate path for the skipCols = 0
case, to avoid extra copies.
Thanks. I'll take care of it.
------
With best regards,
Alexander Korotkov.
On Sat, Dec 14, 2013 at 7:04 PM, Andres Freund <andres@2ndquadrant.com> wrote:
Hi,
Limit (cost=69214.06..69214.08 rows=10 width=16) (actual
time=0.097..0.099 rows=10 loops=1)
-> Sort (cost=69214.06..71714.06 rows=1000000 width=16) (actual
time=0.096..0.097 rows=10 loops=1)
Sort Key: v1, v2
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_v1_idx on test (cost=0.42..47604.42
rows=1000000 width=16) (actual time=0.017..0.066 rows=56 loops=1)
Total runtime: 0.125 ms
(6 rows)
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I think we might need to switch to a different algorithm to really
benefit from mostly pre-sorted input.
In this patch I don't do full sort of dataset. For instance, index returns
data ordered by first column and we need to order them also by second
column.
Ah, that makes sense.
But, I don't think we should expect pre-sorted values of second column
inside a group.
Yes, if you do it that way, there doesn't seem to be any need to assume
that any more than we usually do.
I think you should make the explain output reflect the fact that we're
assuming v1 is presorted and just sorting v2. I'd be happy enough with:
Sort Key: v1, v2
Partial Sort: v2
or even just
"Partial Sort Key: [v1,] v2"
but I am sure others disagree.
Sure, I just didn't change explain output yet. It should look like what you
propose.
*partial-knn-1.patch*
Rechecking from the heap means adding a sort node though, which I don't
see here? Or am I misunderstanding something?
KNN-GiST contain RB-tree of scanned items. In this patch item is rechecked
inside GiST and reinserted into same RB-tree. It appears to be much easier
implementation for PoC and also looks very efficient. I'm not sure what is
actually right design for it. This is what I like to discuss.
I don't have enough clue about gist to say whether it's the right design,
but it doesn't look wrong to my eyes. It'd probably be useful to export
the knowledge that we are rechecking and how often that happens to the
outside.
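As a rough illustration of the recheck-and-reinsert idea discussed above, here is a standalone sketch. It is not the patch's code: a small binary min-heap is swapped in for the RB-tree, all names (Item, Queue, knn_next) are hypothetical, and the exact field stands in for the distance that would be computed after fetching the heap tuple. It assumes the index distance is a lower bound of the exact one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 64

typedef struct Item
{
	int			id;
	double		exact;		/* stand-in for the distance found on recheck */
	double		key;		/* lower bound until rechecked, then exact */
	bool		rechecked;
} Item;

typedef struct Queue
{
	Item		items[QUEUE_CAP];
	size_t		n;
} Queue;

Item
make_item(int id, double lower_bound, double exact)
{
	Item		it = {id, exact, lower_bound, false};

	assert(lower_bound <= exact);
	return it;
}

static void
swap_items(Item *a, Item *b)
{
	Item		tmp = *a;

	*a = *b;
	*b = tmp;
}

/* Insert an item, sifting it up to keep the smallest key at the root. */
void
queue_push(Queue *q, Item it)
{
	size_t		i;

	assert(q->n < QUEUE_CAP);
	i = q->n++;
	q->items[i] = it;
	while (i > 0 && q->items[(i - 1) / 2].key > q->items[i].key)
	{
		swap_items(&q->items[(i - 1) / 2], &q->items[i]);
		i = (i - 1) / 2;
	}
}

/* Remove and return the item with the smallest key. */
Item
queue_pop(Queue *q)
{
	Item		top;
	size_t		i = 0;

	assert(q->n > 0);
	top = q->items[0];
	q->items[0] = q->items[--q->n];
	for (;;)
	{
		size_t		l = 2 * i + 1, r = l + 1, m = i;

		if (l < q->n && q->items[l].key < q->items[m].key)
			m = l;
		if (r < q->n && q->items[r].key < q->items[m].key)
			m = r;
		if (m == i)
			break;
		swap_items(&q->items[i], &q->items[m]);
		i = m;
	}
	return top;
}

/*
 * Return the next item in exact-distance order, or false if none is left.
 * An item popped for the first time is rechecked and reinserted under its
 * exact distance; it is emitted only when it reaches the front again.
 */
bool
knn_next(Queue *q, Item *out)
{
	while (q->n > 0)
	{
		Item		it = queue_pop(q);

		if (it.rechecked)
		{
			*out = it;
			return true;
		}
		it.key = it.exact;
		it.rechecked = true;
		queue_push(q, it);
	}
	return false;
}
```

Counting how often the reinsertion branch is taken would be one way to expose how often rechecks happen, as suggested above.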
While I didn't really look into the patch, I noticed in passing that you
pass an all_dead variable to heap_hot_search_buffer without using the
result - just pass NULL instead, that performs a bit less work.
Useful notice, thanks.
------
With best regards,
Alexander Korotkov.
On Sat, Dec 14, 2013 at 11:47 PM, Andreas Karlsson <andreas@proxel.se> wrote:
On 12/14/2013 10:59 AM, Alexander Korotkov wrote:
This patch allows to use index for order-by if order-by clause and index
has non-empty common prefix. So, index gives right ordering for first n
order-by columns. In order to provide right order for rest m columns,
sort node is inserted. This sort node sorts groups of tuples where
values of first n order-by columns are equal.
I recently looked at the same problem. I see that you solved the
rescanning problem by simply forcing the sort to be redone on
ExecReScanSort if you have done a partial sort.
Naturally, I'm not sure I solved it at all :) I just got a version of the patch
working for very limited use cases.
My idea for a solution was to modify tuplesort to allow storing the
already sorted keys in either memtuples or the sort result file, but
setting a field so it does not sort the already sorted tuples again. This
would allow the rescan to work as it used to, but I am unsure how clean or
ugly this code would be. Was this something you considered?
I'm not sure. I believe the best answer depends on particular parameters:
how much memory we have for the sort, how expensive the underlying node is and
how it performs rescan, and how big the groups in the partial sort are.
------
With best regards,
Alexander Korotkov.
On 12/18/2013 01:02 PM, Alexander Korotkov wrote:
My idea for a solution was to modify tuplesort to allow storing the
already sorted keys in either memtuples or the sort result file, but
setting a field so it does not sort the already sorted tuples
again. This would allow the rescan to work as it used to, but I am
unsure how clean or ugly this code would be. Was this something you
considered?
I'm not sure. I believe the best answer depends on particular
parameters: how much memory we have for the sort, how expensive the underlying
node is and how it performs rescan, and how big the groups in the partial sort are.
Yes, if one does not need a rescan your solution will use less memory
and about the same amount of CPU (if the tuplesort does not spill to
disk). While if we keep all the already sorted tuples in the tuplesort
rescans will be cheap but more memory will be used with an increased
chance of spilling to disk.
--
Andreas Karlsson
Hi!
Next revision. It is expected to work better with the optimizer. It introduces
a presorted_keys argument to the cost_sort function, which represents the number
of keys already sorted in the Path. The function then uses estimate_num_groups to
estimate the number of groups with distinct values of the presorted keys and
assumes that the dataset is divided uniformly among the groups.
get_cheapest_fractional_path_for_pathkeys tries to select the path
matching the largest part of the path keys.
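The costing idea can be sketched in isolation as follows. This is a simplified reading of the in-memory quicksort branch of the attached cost_sort change, not the function itself: SortCost, sketch_log2 and partial_sort_cost are hypothetical names, the number of groups is passed in directly instead of coming from estimate_num_groups, and LIMIT, bounded heapsort and disk spill are ignored.

```c
#include <assert.h>

typedef struct SortCost
{
	double		startup;	/* cost before the first tuple is returned */
	double		total;		/* cost to return all tuples */
} SortCost;

/*
 * Stand-in for a log2 function, kept dependency-free for this sketch.
 * Exact for powers of two, linearly interpolated in between.
 */
static double
sketch_log2(double x)
{
	double		result = 0.0;

	while (x >= 2.0)
	{
		x /= 2.0;
		result += 1.0;
	}
	return result + (x - 1.0);
}

/*
 * Cost of sorting input that is presorted into num_groups groups of
 * equal size.  num_groups = 1 corresponds to an ordinary full sort.
 */
SortCost
partial_sort_cost(double input_startup, double input_total,
				  double tuples, double num_groups,
				  double comparison_cost, double cpu_operator_cost)
{
	SortCost	c;
	double		input_run = input_total - input_startup;
	double		group_tuples = tuples / num_groups;
	double		group_cost;
	double		run;

	assert(num_groups >= 1.0 && tuples >= num_groups);

	/* About N log2 N comparisons, but N is the size of one group. */
	group_cost = comparison_cost * group_tuples * sketch_log2(group_tuples);

	/* The first tuple is available after reading and sorting one group. */
	c.startup = input_startup + group_cost + input_run / num_groups;

	/* The remaining groups are read and sorted while tuples are returned. */
	run = (num_groups - 1.0) * group_cost
		+ cpu_operator_cost * tuples
		+ input_run * (num_groups - 1.0) / num_groups;

	c.total = c.startup + run;
	return c;
}
```

The point of splitting the cost this way is that most of the work moves from startup cost to run cost, so a LIMIT above the node only pays for the groups it actually reads.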
You can see it works pretty well on single-table queries.
create table test as (select id, (random()*5)::int as v1,
(random()*1000)::int as v2 from generate_series(1,1000000) id);
create index test_v1_idx on test (v1);
create index test_v1_v2_idx on test (v1, v2);
create index test_v2_idx on test (v2);
vacuum analyze;
postgres=# explain analyze select * from test order by v1, id;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Sort (cost=149244.84..151744.84 rows=1000000 width=12) (actual
time=2111.476..2586.493 rows=1000000 loops=1)
Sort Key: v1, id
Sort Method: external merge Disk: 21512kB
-> Seq Scan on test (cost=0.00..15406.00 rows=1000000 width=12)
(actual time=0.012..113.815 rows=1000000 loops=1)
Total runtime: 2683.011 ms
(5 rows)
postgres=# explain analyze select * from test order by v1, id limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=11441.77..11442.18 rows=10 width=12) (actual
time=79.980..79.982 rows=10 loops=1)
-> Partial sort (cost=11441.77..53140.44 rows=1000000 width=12)
(actual time=79.978..79.978 rows=10 loops=1)
Sort Key: v1, id
Presorted Key: v1
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_v1_idx on test (cost=0.42..47038.83
rows=1000000 width=12) (actual time=0.031..38.275 rows=100213 loops=1)
Total runtime: 81.786 ms
(7 rows)
postgres=# explain analyze select * from test order by v1, v2 limit 10;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..0.90 rows=10 width=12) (actual time=0.031..0.047
rows=10 loops=1)
-> Index Scan using test_v1_v2_idx on test (cost=0.42..47286.28
rows=1000000 width=12) (actual time=0.029..0.043 rows=10 loops=1)
Total runtime: 0.083 ms
(3 rows)
postgres=# explain analyze select * from test order by v2, id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Partial sort (cost=97.75..99925.50 rows=1000000 width=12) (actual
time=1.069..1299.481 rows=1000000 loops=1)
Sort Key: v2, id
Presorted Key: v2
Sort Method: quicksort Memory: 52kB
-> Index Scan using test_v2_idx on test (cost=0.42..47603.79
rows=1000000 width=12) (actual time=0.030..812.083 rows=1000000 loops=1)
Total runtime: 1393.850 ms
(6 rows)
However, the handling of joins needs more improvement.
------
With best regards,
Alexander Korotkov.
Attachments:
partial-sort-2.patch (application/octet-stream)
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
new file mode 100644
index bd5428d..9edcc44
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
*************** static void show_sort_keys(SortState *so
*** 77,83 ****
static void show_merge_append_keys(MergeAppendState *mstate, List *ancestors,
ExplainState *es);
static void show_sort_keys_common(PlanState *planstate,
! int nkeys, AttrNumber *keycols,
List *ancestors, ExplainState *es);
static void show_sort_info(SortState *sortstate, ExplainState *es);
static void show_hash_info(HashState *hashstate, ExplainState *es);
--- 77,83 ----
static void show_merge_append_keys(MergeAppendState *mstate, List *ancestors,
ExplainState *es);
static void show_sort_keys_common(PlanState *planstate,
! int nkeys, int nPresortedKeys, AttrNumber *keycols,
List *ancestors, ExplainState *es);
static void show_sort_info(SortState *sortstate, ExplainState *es);
static void show_hash_info(HashState *hashstate, ExplainState *es);
*************** ExplainNode(PlanState *planstate, List *
*** 901,907 ****
pname = sname = "Materialize";
break;
case T_Sort:
! pname = sname = "Sort";
break;
case T_Group:
pname = sname = "Group";
--- 901,910 ----
pname = sname = "Materialize";
break;
case T_Sort:
! if (((Sort *) plan)->skipCols > 0)
! pname = sname = "Partial sort";
! else
! pname = sname = "Sort";
break;
case T_Group:
pname = sname = "Group";
*************** show_sort_keys(SortState *sortstate, Lis
*** 1694,1700 ****
Sort *plan = (Sort *) sortstate->ss.ps.plan;
show_sort_keys_common((PlanState *) sortstate,
! plan->numCols, plan->sortColIdx,
ancestors, es);
}
--- 1697,1703 ----
Sort *plan = (Sort *) sortstate->ss.ps.plan;
show_sort_keys_common((PlanState *) sortstate,
! plan->numCols, plan->skipCols, plan->sortColIdx,
ancestors, es);
}
*************** show_merge_append_keys(MergeAppendState
*** 1708,1724 ****
MergeAppend *plan = (MergeAppend *) mstate->ps.plan;
show_sort_keys_common((PlanState *) mstate,
! plan->numCols, plan->sortColIdx,
ancestors, es);
}
static void
! show_sort_keys_common(PlanState *planstate, int nkeys, AttrNumber *keycols,
! List *ancestors, ExplainState *es)
{
Plan *plan = planstate->plan;
List *context;
! List *result = NIL;
bool useprefix;
int keyno;
char *exprstr;
--- 1711,1728 ----
MergeAppend *plan = (MergeAppend *) mstate->ps.plan;
show_sort_keys_common((PlanState *) mstate,
! plan->numCols, 0, plan->sortColIdx,
ancestors, es);
}
static void
! show_sort_keys_common(PlanState *planstate, int nkeys, int nPresortedKeys,
! AttrNumber *keycols, List *ancestors, ExplainState *es)
{
Plan *plan = planstate->plan;
List *context;
! List *resultSort = NIL;
! List *resultPresorted = NIL;
bool useprefix;
int keyno;
char *exprstr;
*************** show_sort_keys_common(PlanState *plansta
*** 1745,1754 ****
/* Deparse the expression, showing any top-level cast */
exprstr = deparse_expression((Node *) target->expr, context,
useprefix, true);
! result = lappend(result, exprstr);
}
! ExplainPropertyList("Sort Key", result, es);
}
/*
--- 1749,1763 ----
/* Deparse the expression, showing any top-level cast */
exprstr = deparse_expression((Node *) target->expr, context,
useprefix, true);
!
! if (keyno < nPresortedKeys)
! resultPresorted = lappend(resultPresorted, exprstr);
! resultSort = lappend(resultSort, exprstr);
}
! ExplainPropertyList("Sort Key", resultSort, es);
! if (nPresortedKeys > 0)
! ExplainPropertyList("Presorted Key", resultPresorted, es);
}
/*
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
new file mode 100644
index 09b2eb0..e6a9a0c
*** a/src/backend/executor/nodeSort.c
--- b/src/backend/executor/nodeSort.c
***************
*** 15,25 ****
--- 15,52 ----
#include "postgres.h"
+ #include "access/htup_details.h"
#include "executor/execdebug.h"
#include "executor/nodeSort.h"
#include "miscadmin.h"
#include "utils/tuplesort.h"
+ /*
+ * Check if first "skipCols" sort values are equal.
+ */
+ static bool
+ cmpSortSkipCols(SortState *node, TupleDesc tupDesc, HeapTuple a, TupleTableSlot *b)
+ {
+ int n = ((Sort *)node->ss.ps.plan)->skipCols, i;
+ SortSupport sortKeys = tuplesort_get_sortkeys(node->tuplesortstate);
+
+ for (i = 0; i < n; i++)
+ {
+ Datum datumA, datumB;
+ bool isnullA, isnullB;
+ AttrNumber attno = sortKeys[i].ssup_attno;
+
+ datumA = heap_getattr(a, attno, tupDesc, &isnullA);
+ datumB = slot_getattr(b, attno, &isnullB);
+
+ if (ApplySortComparator(datumA, isnullA,
+ datumB, isnullB,
+ &sortKeys[i]))
+ return false;
+ }
+ return true;
+ }
+
/* ----------------------------------------------------------------
* ExecSort
*************** ExecSort(SortState *node)
*** 42,47 ****
--- 69,75 ----
ScanDirection dir;
Tuplesortstate *tuplesortstate;
TupleTableSlot *slot;
+ int skipCols = ((Sort *)node->ss.ps.plan)->skipCols;
/*
* get state info from node
*************** ExecSort(SortState *node)
*** 54,131 ****
tuplesortstate = (Tuplesortstate *) node->tuplesortstate;
/*
* If first time through, read all tuples from outer plan and pass them to
* tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
*/
! if (!node->sort_Done)
! {
! Sort *plannode = (Sort *) node->ss.ps.plan;
! PlanState *outerNode;
! TupleDesc tupDesc;
! SO1_printf("ExecSort: %s\n",
! "sorting subplan");
! /*
! * Want to scan subplan in the forward direction while creating the
! * sorted data.
! */
! estate->es_direction = ForwardScanDirection;
! /*
! * Initialize tuplesort module.
! */
! SO1_printf("ExecSort: %s\n",
! "calling tuplesort_begin");
! outerNode = outerPlanState(node);
! tupDesc = ExecGetResultType(outerNode);
! tuplesortstate = tuplesort_begin_heap(tupDesc,
! plannode->numCols,
! plannode->sortColIdx,
! plannode->sortOperators,
! plannode->collations,
! plannode->nullsFirst,
! work_mem,
! node->randomAccess);
! if (node->bounded)
! tuplesort_set_bound(tuplesortstate, node->bound);
! node->tuplesortstate = (void *) tuplesortstate;
! /*
! * Scan the subplan and feed all the tuples to tuplesort.
! */
! for (;;)
{
- slot = ExecProcNode(outerNode);
-
if (TupIsNull(slot))
break;
!
tuplesort_puttupleslot(tuplesortstate, slot);
}
! /*
! * Complete the sort.
! */
! tuplesort_performsort(tuplesortstate);
! /*
! * restore to user specified direction
! */
! estate->es_direction = dir;
! /*
! * finally set the sorted flag to true
! */
! node->sort_Done = true;
! node->bounded_Done = node->bounded;
! node->bound_Done = node->bound;
! SO1_printf("ExecSort: %s\n", "sorting done");
! }
SO1_printf("ExecSort: %s\n",
"retrieving tuple from tuplesort");
--- 82,204 ----
tuplesortstate = (Tuplesortstate *) node->tuplesortstate;
/*
+ * Return next tuple from sorted set if any.
+ */
+ if (node->sort_Done)
+ {
+ slot = node->ss.ps.ps_ResultTupleSlot;
+ if (tuplesort_gettupleslot(tuplesortstate,
+ ScanDirectionIsForward(dir),
+ slot) || node->finished)
+ return slot;
+ }
+
+ /*
* If first time through, read all tuples from outer plan and pass them to
* tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
*/
! Sort *plannode = (Sort *) node->ss.ps.plan;
! PlanState *outerNode;
! TupleDesc tupDesc;
! SO1_printf("ExecSort: %s\n",
! "sorting subplan");
! /*
! * Want to scan subplan in the forward direction while creating the
! * sorted data.
! */
! estate->es_direction = ForwardScanDirection;
! /*
! * Initialize tuplesort module.
! */
! SO1_printf("ExecSort: %s\n",
! "calling tuplesort_begin");
! outerNode = outerPlanState(node);
! tupDesc = ExecGetResultType(outerNode);
! tuplesortstate = tuplesort_begin_heap(tupDesc,
! plannode->numCols,
! plannode->sortColIdx,
! plannode->sortOperators,
! plannode->collations,
! plannode->nullsFirst,
! work_mem,
! node->randomAccess);
! if (node->bounded)
! tuplesort_set_bound(tuplesortstate, node->bound);
! node->tuplesortstate = (void *) tuplesortstate;
! /*
! * Put next group of tuples where "skipCols" sort values are equal to
! * tuplesort.
! */
! for (;;)
! {
! slot = ExecProcNode(outerNode);
! if (skipCols == 0)
{
if (TupIsNull(slot))
+ {
+ node->finished = true;
break;
! }
tuplesort_puttupleslot(tuplesortstate, slot);
}
+ else if (node->prev)
+ {
+ ExecStoreTuple(node->prev, node->ss.ps.ps_ResultTupleSlot, InvalidBuffer, false);
+ tuplesort_puttupleslot(tuplesortstate, node->ss.ps.ps_ResultTupleSlot);
! if (TupIsNull(slot))
! {
! node->finished = true;
! break;
! }
! else
! {
! bool cmp;
! cmp = cmpSortSkipCols(node, tupDesc, node->prev, slot);
! node->prev = ExecCopySlotTuple(slot);
! if (!cmp)
! break;
! }
! }
! else
! {
! if (TupIsNull(slot))
! {
! node->finished = true;
! break;
! }
! else
! {
! node->prev = ExecCopySlotTuple(slot);
! }
! }
! }
! /*
! * Complete the sort.
! */
! tuplesort_performsort(tuplesortstate);
! /*
! * restore to user specified direction
! */
! estate->es_direction = dir;
!
! /*
! * finally set the sorted flag to true
! */
! node->sort_Done = true;
! node->bounded_Done = node->bounded;
! node->bound_Done = node->bound;
! SO1_printf("ExecSort: %s\n", "sorting done");
SO1_printf("ExecSort: %s\n",
"retrieving tuple from tuplesort");
*************** ExecInitSort(Sort *node, EState *estate,
*** 174,180 ****
--- 247,255 ----
sortstate->bounded = false;
sortstate->sort_Done = false;
+ sortstate->finished = false;
sortstate->tuplesortstate = NULL;
+ sortstate->prev = NULL;
/*
* Miscellaneous initialization
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index e3edcf6..d698559
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copySort(const Sort *from)
*** 735,740 ****
--- 735,741 ----
CopyPlanFields((const Plan *) from, (Plan *) newnode);
COPY_SCALAR_FIELD(numCols);
+ COPY_SCALAR_FIELD(skipCols);
COPY_POINTER_FIELD(sortColIdx, from->numCols * sizeof(AttrNumber));
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
new file mode 100644
index 50f0852..1a38407
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
*************** cost_recursive_union(Plan *runion, Plan
*** 1281,1295 ****
*/
void
cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, Cost input_cost, double tuples, int width,
! Cost comparison_cost, int sort_mem,
double limit_tuples)
{
! Cost startup_cost = input_cost;
! Cost run_cost = 0;
double input_bytes = relation_byte_size(tuples, width);
double output_bytes;
double output_tuples;
long sort_mem_bytes = sort_mem * 1024L;
if (!enable_sort)
--- 1281,1302 ----
*/
void
cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, int presorted_keys,
! Cost input_startup_cost, Cost input_total_cost,
! double tuples, int width, Cost comparison_cost, int sort_mem,
double limit_tuples)
{
! Cost startup_cost = input_startup_cost;
! Cost run_cost = 0,
! rest_cost,
! group_cost,
! input_run_cost = input_total_cost - input_startup_cost;
double input_bytes = relation_byte_size(tuples, width);
double output_bytes;
double output_tuples;
+ double num_groups,
+ group_input_bytes,
+ group_tuples;
long sort_mem_bytes = sort_mem * 1024L;
if (!enable_sort)
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1319,1331 ****
output_bytes = input_bytes;
}
! if (output_bytes > sort_mem_bytes)
{
/*
* We'll have to use a disk-based sort of all the tuples
*/
! double npages = ceil(input_bytes / BLCKSZ);
! double nruns = (input_bytes / sort_mem_bytes) * 0.5;
double mergeorder = tuplesort_merge_order(sort_mem_bytes);
double log_runs;
double npageaccesses;
--- 1326,1367 ----
output_bytes = input_bytes;
}
! if (presorted_keys > 0)
! {
! List *groupExprs = NIL;
! ListCell *l;
! int i = 0;
!
! foreach(l, pathkeys)
! {
! PathKey *key = (PathKey *)lfirst(l);
! EquivalenceMember *member = (EquivalenceMember *)
! lfirst(list_head(key->pk_eclass->ec_members));
!
! groupExprs = lappend(groupExprs, member->em_expr);
!
! i++;
! if (i >= presorted_keys)
! break;
! }
!
! num_groups = estimate_num_groups(root, groupExprs, tuples);
! }
! else
! {
! num_groups = 1.0;
! }
!
! group_input_bytes = input_bytes / num_groups;
! group_tuples = tuples / num_groups;
!
! if (output_bytes > sort_mem_bytes && group_input_bytes > sort_mem_bytes)
{
/*
* We'll have to use a disk-based sort of all the tuples
*/
! double npages = ceil(group_input_bytes / BLCKSZ);
! double nruns = (group_input_bytes / sort_mem_bytes) * 0.5;
double mergeorder = tuplesort_merge_order(sort_mem_bytes);
double log_runs;
double npageaccesses;
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1335,1341 ****
*
* Assume about N log2 N comparisons
*/
! startup_cost += comparison_cost * tuples * LOG2(tuples);
/* Disk costs */
--- 1371,1377 ----
*
* Assume about N log2 N comparisons
*/
! group_cost = comparison_cost * group_tuples * LOG2(group_tuples);
/* Disk costs */
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1346,1355 ****
log_runs = 1.0;
npageaccesses = 2.0 * npages * log_runs;
/* Assume 3/4ths of accesses are sequential, 1/4th are not */
! startup_cost += npageaccesses *
(seq_page_cost * 0.75 + random_page_cost * 0.25);
}
! else if (tuples > 2 * output_tuples || input_bytes > sort_mem_bytes)
{
/*
* We'll use a bounded heap-sort keeping just K tuples in memory, for
--- 1382,1391 ----
log_runs = 1.0;
npageaccesses = 2.0 * npages * log_runs;
/* Assume 3/4ths of accesses are sequential, 1/4th are not */
! group_cost += npageaccesses *
(seq_page_cost * 0.75 + random_page_cost * 0.25);
}
! else if (group_tuples > 2 * output_tuples || group_input_bytes > sort_mem_bytes)
{
/*
* We'll use a bounded heap-sort keeping just K tuples in memory, for
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1357,1368 ****
* factor is a bit higher than for quicksort. Tweak it so that the
* cost curve is continuous at the crossover point.
*/
! startup_cost += comparison_cost * tuples * LOG2(2.0 * output_tuples);
}
else
{
/* We'll use plain quicksort on all the input tuples */
! startup_cost += comparison_cost * tuples * LOG2(tuples);
}
/*
--- 1393,1404 ----
* factor is a bit higher than for quicksort. Tweak it so that the
* cost curve is continuous at the crossover point.
*/
! group_cost = comparison_cost * group_tuples * LOG2(2.0 * output_tuples);
}
else
{
/* We'll use plain quicksort on all the input tuples */
! group_cost = comparison_cost * group_tuples * LOG2(group_tuples);
}
/*
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1373,1380 ****
--- 1409,1423 ----
* here --- the upper LIMIT will pro-rate the run cost so we'd be double
* counting the LIMIT otherwise.
*/
+ startup_cost += group_cost;
+ rest_cost = (num_groups * (output_tuples / tuples) - 1.0) * group_cost;
+ if (rest_cost > 0.0)
+ run_cost += rest_cost;
run_cost += cpu_operator_cost * tuples;
+ startup_cost += input_run_cost / num_groups;
+ run_cost += input_run_cost * ((num_groups - 1.0) / num_groups);
+
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
*************** initial_cost_mergejoin(PlannerInfo *root
*** 2075,2080 ****
--- 2118,2125 ----
cost_sort(&sort_path,
root,
outersortkeys,
+ pathkeys_common(outer_path->pathkeys, outersortkeys),
+ outer_path->startup_cost,
outer_path->total_cost,
outer_path_rows,
outer_path->parent->width,
*************** initial_cost_mergejoin(PlannerInfo *root
*** 2101,2106 ****
--- 2146,2153 ----
cost_sort(&sort_path,
root,
innersortkeys,
+ pathkeys_common(inner_path->pathkeys, innersortkeys),
+ inner_path->startup_cost,
inner_path->total_cost,
inner_path_rows,
inner_path->parent->width,
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
new file mode 100644
index 9c8ede6..cdb9ae7
*** a/src/backend/optimizer/path/pathkeys.c
--- b/src/backend/optimizer/path/pathkeys.c
*************** compare_pathkeys(List *keys1, List *keys
*** 312,317 ****
--- 312,343 ----
}
/*
+ * pathkeys_common
+ * Returns length of longest common prefix of keys1 and keys2.
+ */
+ int
+ pathkeys_common(List *keys1, List *keys2)
+ {
+ int n;
+ ListCell *key1,
+ *key2;
+ n = 0;
+
+ forboth(key1, keys1, key2, keys2)
+ {
+ PathKey *pathkey1 = (PathKey *) lfirst(key1);
+ PathKey *pathkey2 = (PathKey *) lfirst(key2);
+
+ if (pathkey1 != pathkey2)
+ return n;
+ n++;
+ }
+
+ return n;
+ }
+
+
+ /*
* pathkeys_contained_in
* Common special case of compare_pathkeys: we just want to know
* if keys2 are at least as well sorted as keys1.
*************** get_cheapest_fractional_path_for_pathkey
*** 389,394 ****
--- 415,423 ----
double fraction)
{
Path *matched_path = NULL;
+ int matched_n_common_pathkeys = 0,
+ costs_cmp, n_common_pathkeys,
+ n_pathkeys = list_length(pathkeys);
ListCell *l;
foreach(l, paths)
*************** get_cheapest_fractional_path_for_pathkey
*** 399,411 ****
* Since cost comparison is a lot cheaper than pathkey comparison, do
* that first. (XXX is that still true?)
*/
! if (matched_path != NULL &&
! compare_fractional_path_costs(matched_path, path, fraction) <= 0)
continue;
! if (pathkeys_contained_in(pathkeys, path->pathkeys) &&
bms_is_subset(PATH_REQ_OUTER(path), required_outer))
matched_path = path;
}
return matched_path;
}
--- 428,457 ----
* Since cost comparison is a lot cheaper than pathkey comparison, do
* that first. (XXX is that still true?)
*/
! if (matched_path != NULL)
! {
! costs_cmp = compare_fractional_path_costs(matched_path, path, fraction);
! if (matched_n_common_pathkeys == n_pathkeys && costs_cmp < 0)
! continue;
! }
! else
! {
! costs_cmp = 1;
! }
!
! n_common_pathkeys = pathkeys_common(pathkeys, path->pathkeys);
! if (n_common_pathkeys == 0)
continue;
! if ((
! n_common_pathkeys > matched_n_common_pathkeys
! || (n_common_pathkeys == matched_n_common_pathkeys
! && costs_cmp > 0)) &&
bms_is_subset(PATH_REQ_OUTER(path), required_outer))
+ {
matched_path = path;
+ matched_n_common_pathkeys = n_common_pathkeys;
+ }
}
return matched_path;
}
*************** right_merge_direction(PlannerInfo *root,
*** 1457,1472 ****
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
if (root->query_pathkeys == NIL)
return 0; /* no special ordering requested */
if (pathkeys == NIL)
return 0; /* unordered path */
! if (pathkeys_contained_in(root->query_pathkeys, pathkeys))
{
/* It's useful ... or at least the first N keys are */
! return list_length(root->query_pathkeys);
}
return 0; /* path ordering not useful */
--- 1503,1522 ----
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
+ int n;
+
if (root->query_pathkeys == NIL)
return 0; /* no special ordering requested */
if (pathkeys == NIL)
return 0; /* unordered path */
! n = pathkeys_common(root->query_pathkeys, pathkeys);
!
! if (n != 0)
{
/* It's useful ... or at least the first N keys are */
! return n;
}
return 0; /* path ordering not useful */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index f2c122d..a300342
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** static MergeJoin *make_mergejoin(List *t
*** 149,154 ****
--- 149,155 ----
Plan *lefttree, Plan *righttree,
JoinType jointype);
static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
+ List *pathkeys, int skipCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples);
*************** create_merge_append_plan(PlannerInfo *ro
*** 774,779 ****
--- 775,781 ----
Oid *sortOperators;
Oid *collations;
bool *nullsFirst;
+ int n_common_pathkeys;
/* Build the child plan */
subplan = create_plan_recurse(root, subpath);
*************** create_merge_append_plan(PlannerInfo *ro
*** 807,814 ****
numsortkeys * sizeof(bool)) == 0);
/* Now, insert a Sort node if subplan isn't sufficiently ordered */
! if (!pathkeys_contained_in(pathkeys, subpath->pathkeys))
subplan = (Plan *) make_sort(root, subplan, numsortkeys,
sortColIdx, sortOperators,
collations, nullsFirst,
best_path->limit_tuples);
--- 809,818 ----
numsortkeys * sizeof(bool)) == 0);
/* Now, insert a Sort node if subplan isn't sufficiently ordered */
! n_common_pathkeys = pathkeys_common(pathkeys, subpath->pathkeys);
! if (n_common_pathkeys < list_length(pathkeys))
subplan = (Plan *) make_sort(root, subplan, numsortkeys,
+ pathkeys, n_common_pathkeys,
sortColIdx, sortOperators,
collations, nullsFirst,
best_path->limit_tuples);
*************** create_mergejoin_plan(PlannerInfo *root,
*** 2184,2192 ****
disuse_physical_tlist(root, outer_plan, best_path->jpath.outerjoinpath);
outer_plan = (Plan *)
make_sort_from_pathkeys(root,
! outer_plan,
! best_path->outersortkeys,
! -1.0);
outerpathkeys = best_path->outersortkeys;
}
else
--- 2188,2198 ----
disuse_physical_tlist(root, outer_plan, best_path->jpath.outerjoinpath);
outer_plan = (Plan *)
make_sort_from_pathkeys(root,
! outer_plan,
! best_path->outersortkeys,
! -1.0,
! pathkeys_common(best_path->outersortkeys,
! best_path->jpath.outerjoinpath->pathkeys));
outerpathkeys = best_path->outersortkeys;
}
else
*************** create_mergejoin_plan(PlannerInfo *root,
*** 2197,2205 ****
disuse_physical_tlist(root, inner_plan, best_path->jpath.innerjoinpath);
inner_plan = (Plan *)
make_sort_from_pathkeys(root,
! inner_plan,
! best_path->innersortkeys,
! -1.0);
innerpathkeys = best_path->innersortkeys;
}
else
--- 2203,2213 ----
disuse_physical_tlist(root, inner_plan, best_path->jpath.innerjoinpath);
inner_plan = (Plan *)
make_sort_from_pathkeys(root,
! inner_plan,
! best_path->innersortkeys,
! -1.0,
! pathkeys_common(best_path->innersortkeys,
! best_path->jpath.innerjoinpath->pathkeys));
innerpathkeys = best_path->innersortkeys;
}
else
*************** make_mergejoin(List *tlist,
*** 3739,3744 ****
--- 3747,3753 ----
*/
static Sort *
make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
+ List *pathkeys, int skipCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples)
*************** make_sort(PlannerInfo *root, Plan *leftt
*** 3748,3754 ****
Path sort_path; /* dummy for result of cost_sort */
copy_plan_costsize(plan, lefttree); /* only care about copying size */
! cost_sort(&sort_path, root, NIL,
lefttree->total_cost,
lefttree->plan_rows,
lefttree->plan_width,
--- 3757,3764 ----
Path sort_path; /* dummy for result of cost_sort */
copy_plan_costsize(plan, lefttree); /* only care about copying size */
! cost_sort(&sort_path, root, pathkeys, skipCols,
! lefttree->startup_cost,
lefttree->total_cost,
lefttree->plan_rows,
lefttree->plan_width,
*************** make_sort(PlannerInfo *root, Plan *leftt
*** 3762,3767 ****
--- 3772,3778 ----
plan->lefttree = lefttree;
plan->righttree = NULL;
node->numCols = numCols;
+ node->skipCols = skipCols;
node->sortColIdx = sortColIdx;
node->sortOperators = sortOperators;
node->collations = collations;
*************** find_ec_member_for_tle(EquivalenceClass
*** 4090,4096 ****
*/
Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
! double limit_tuples)
{
int numsortkeys;
AttrNumber *sortColIdx;
--- 4101,4107 ----
*/
Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
! double limit_tuples, int skipCols)
{
int numsortkeys;
AttrNumber *sortColIdx;
*************** make_sort_from_pathkeys(PlannerInfo *roo
*** 4110,4116 ****
&nullsFirst);
/* Now build the Sort node */
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, limit_tuples);
}
--- 4121,4127 ----
&nullsFirst);
/* Now build the Sort node */
! return make_sort(root, lefttree, numsortkeys, pathkeys, skipCols,
sortColIdx, sortOperators, collations,
nullsFirst, limit_tuples);
}
*************** make_sort_from_sortclauses(PlannerInfo *
*** 4153,4159 ****
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
--- 4164,4170 ----
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys, NIL, 0,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
*************** Sort *
*** 4175,4181 ****
make_sort_from_groupcols(PlannerInfo *root,
List *groupcls,
AttrNumber *grpColIdx,
! Plan *lefttree)
{
List *sub_tlist = lefttree->targetlist;
ListCell *l;
--- 4186,4193 ----
make_sort_from_groupcols(PlannerInfo *root,
List *groupcls,
AttrNumber *grpColIdx,
! Plan *lefttree,
! List *pathkeys, int skipCols)
{
List *sub_tlist = lefttree->targetlist;
ListCell *l;
*************** make_sort_from_groupcols(PlannerInfo *ro
*** 4208,4214 ****
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
--- 4220,4226 ----
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys, pathkeys, skipCols,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 6670794..56ffb75
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** grouping_planner(PlannerInfo *root, doub
*** 1358,1367 ****
if (sorted_path)
{
Path sort_path; /* dummy for result of cost_sort */
if (root->query_pathkeys == NIL ||
! pathkeys_contained_in(root->query_pathkeys,
! cheapest_path->pathkeys))
{
/* No sort needed for cheapest path */
sort_path.startup_cost = cheapest_path->startup_cost;
--- 1358,1371 ----
if (sorted_path)
{
Path sort_path; /* dummy for result of cost_sort */
+ Path partial_sort_path; /* dummy for result of cost_sort */
+ int n_common_pathkeys;
+
+ n_common_pathkeys = pathkeys_common(root->query_pathkeys,
+ cheapest_path->pathkeys);
if (root->query_pathkeys == NIL ||
! n_common_pathkeys == list_length(root->query_pathkeys))
{
/* No sort needed for cheapest path */
sort_path.startup_cost = cheapest_path->startup_cost;
*************** grouping_planner(PlannerInfo *root, doub
*** 1371,1382 ****
{
/* Figure cost for sorting */
cost_sort(&sort_path, root, root->query_pathkeys,
cheapest_path->total_cost,
path_rows, path_width,
0.0, work_mem, root->limit_tuples);
}
! if (compare_fractional_path_costs(sorted_path, &sort_path,
tuple_fraction) > 0)
{
/* Presorted path is a loser */
--- 1375,1409 ----
{
/* Figure cost for sorting */
cost_sort(&sort_path, root, root->query_pathkeys,
+ n_common_pathkeys,
+ cheapest_path->startup_cost,
cheapest_path->total_cost,
path_rows, path_width,
0.0, work_mem, root->limit_tuples);
}
! n_common_pathkeys = pathkeys_common(root->query_pathkeys,
! sorted_path->pathkeys);
!
! if (root->query_pathkeys == NIL ||
! n_common_pathkeys == list_length(root->query_pathkeys))
! {
! /* No sort needed for presorted path */
! partial_sort_path.startup_cost = sorted_path->startup_cost;
! partial_sort_path.total_cost = sorted_path->total_cost;
! }
! else
! {
! /* Figure cost for sorting */
! cost_sort(&partial_sort_path, root, root->query_pathkeys,
! n_common_pathkeys,
! sorted_path->startup_cost,
! sorted_path->total_cost,
! path_rows, path_width,
! 0.0, work_mem, root->limit_tuples);
! }
!
! if (compare_fractional_path_costs(&partial_sort_path, &sort_path,
tuple_fraction) > 0)
{
/* Presorted path is a loser */
*************** grouping_planner(PlannerInfo *root, doub
*** 1457,1469 ****
* results.
*/
bool need_sort_for_grouping = false;
result_plan = create_plan(root, best_path);
current_pathkeys = best_path->pathkeys;
/* Detect if we'll need an explicit sort for grouping */
if (parse->groupClause && !use_hashed_grouping &&
! !pathkeys_contained_in(root->group_pathkeys, current_pathkeys))
{
need_sort_for_grouping = true;
--- 1484,1499 ----
* results.
*/
bool need_sort_for_grouping = false;
+ int n_common_pathkeys_grouping;
result_plan = create_plan(root, best_path);
current_pathkeys = best_path->pathkeys;
/* Detect if we'll need an explicit sort for grouping */
+ n_common_pathkeys_grouping = pathkeys_common(root->group_pathkeys,
+ current_pathkeys);
if (parse->groupClause && !use_hashed_grouping &&
! n_common_pathkeys_grouping < list_length(root->group_pathkeys))
{
need_sort_for_grouping = true;
*************** grouping_planner(PlannerInfo *root, doub
*** 1557,1563 ****
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan);
current_pathkeys = root->group_pathkeys;
}
aggstrategy = AGG_SORTED;
--- 1587,1595 ----
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan,
! root->group_pathkeys,
! n_common_pathkeys_grouping);
current_pathkeys = root->group_pathkeys;
}
aggstrategy = AGG_SORTED;
*************** grouping_planner(PlannerInfo *root, doub
*** 1600,1606 ****
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan);
current_pathkeys = root->group_pathkeys;
}
--- 1632,1640 ----
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan,
! root->group_pathkeys,
! n_common_pathkeys_grouping);
current_pathkeys = root->group_pathkeys;
}
*************** grouping_planner(PlannerInfo *root, doub
*** 1717,1729 ****
if (window_pathkeys)
{
Sort *sort_plan;
sort_plan = make_sort_from_pathkeys(root,
result_plan,
window_pathkeys,
! -1.0);
! if (!pathkeys_contained_in(window_pathkeys,
! current_pathkeys))
{
/* we do indeed need to sort */
result_plan = (Plan *) sort_plan;
--- 1751,1767 ----
if (window_pathkeys)
{
Sort *sort_plan;
+ int n_common_pathkeys;
+
+ n_common_pathkeys = pathkeys_common(window_pathkeys,
+ current_pathkeys);
sort_plan = make_sort_from_pathkeys(root,
result_plan,
window_pathkeys,
! -1.0,
! n_common_pathkeys);
! if (n_common_pathkeys < list_length(window_pathkeys))
{
/* we do indeed need to sort */
result_plan = (Plan *) sort_plan;
*************** grouping_planner(PlannerInfo *root, doub
*** 1869,1887 ****
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
! current_pathkeys = root->distinct_pathkeys;
else
{
! current_pathkeys = root->sort_pathkeys;
/* Assert checks that parser didn't mess up... */
Assert(pathkeys_contained_in(root->distinct_pathkeys,
! current_pathkeys));
}
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
! current_pathkeys,
! -1.0);
}
result_plan = (Plan *) make_unique(result_plan,
--- 1907,1927 ----
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
! needed_pathkeys = root->distinct_pathkeys;
else
{
! needed_pathkeys = root->sort_pathkeys;
/* Assert checks that parser didn't mess up... */
Assert(pathkeys_contained_in(root->distinct_pathkeys,
! needed_pathkeys));
}
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
! needed_pathkeys,
! -1.0,
! pathkeys_common(needed_pathkeys, current_pathkeys));
! current_pathkeys = needed_pathkeys;
}
result_plan = (Plan *) make_unique(result_plan,
*************** grouping_planner(PlannerInfo *root, doub
*** 1897,1908 ****
*/
if (parse->sortClause)
{
! if (!pathkeys_contained_in(root->sort_pathkeys, current_pathkeys))
{
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
root->sort_pathkeys,
! limit_tuples);
current_pathkeys = root->sort_pathkeys;
}
}
--- 1937,1951 ----
*/
if (parse->sortClause)
{
! int common = pathkeys_common(root->sort_pathkeys, current_pathkeys);
!
! if (common < list_length(root->sort_pathkeys))
{
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
root->sort_pathkeys,
! limit_tuples,
! common);
current_pathkeys = root->sort_pathkeys;
}
}
*************** choose_hashed_grouping(PlannerInfo *root
*** 2647,2652 ****
--- 2690,2696 ----
List *current_pathkeys;
Path hashed_p;
Path sorted_p;
+ int n_common_pathkeys;
/*
* Executor doesn't support hashed aggregation with DISTINCT or ORDER BY
*************** choose_hashed_grouping(PlannerInfo *root
*** 2726,2732 ****
path_rows);
/* Result of hashed agg is always unsorted */
if (target_pathkeys)
! cost_sort(&hashed_p, root, target_pathkeys, hashed_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
--- 2770,2777 ----
path_rows);
/* Result of hashed agg is always unsorted */
if (target_pathkeys)
! cost_sort(&hashed_p, root, target_pathkeys, 0,
! hashed_p.startup_cost, hashed_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_grouping(PlannerInfo *root
*** 2742,2750 ****
sorted_p.total_cost = cheapest_path->total_cost;
current_pathkeys = cheapest_path->pathkeys;
}
! if (!pathkeys_contained_in(root->group_pathkeys, current_pathkeys))
{
! cost_sort(&sorted_p, root, root->group_pathkeys, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
current_pathkeys = root->group_pathkeys;
--- 2787,2798 ----
sorted_p.total_cost = cheapest_path->total_cost;
current_pathkeys = cheapest_path->pathkeys;
}
!
! n_common_pathkeys = pathkeys_common(root->group_pathkeys, current_pathkeys);
! if (n_common_pathkeys < list_length(root->group_pathkeys))
{
! cost_sort(&sorted_p, root, root->group_pathkeys,
! n_common_pathkeys, sorted_p.startup_cost, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
current_pathkeys = root->group_pathkeys;
*************** choose_hashed_grouping(PlannerInfo *root
*** 2759,2768 ****
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
/* The Agg or Group node will preserve ordering */
! if (target_pathkeys &&
! !pathkeys_contained_in(target_pathkeys, current_pathkeys))
! cost_sort(&sorted_p, root, target_pathkeys, sorted_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
--- 2807,2818 ----
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
+
/* The Agg or Group node will preserve ordering */
! n_common_pathkeys = pathkeys_common(target_pathkeys, current_pathkeys);
! if (target_pathkeys && n_common_pathkeys < list_length(target_pathkeys))
! cost_sort(&sorted_p, root, target_pathkeys, n_common_pathkeys,
! sorted_p.startup_cost, sorted_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_distinct(PlannerInfo *root
*** 2815,2820 ****
--- 2865,2871 ----
List *needed_pathkeys;
Path hashed_p;
Path sorted_p;
+ int n_common_pathkeys;
/*
* If we have a sortable DISTINCT ON clause, we always use sorting. This
*************** choose_hashed_distinct(PlannerInfo *root
*** 2880,2886 ****
* need to charge for the final sort.
*/
if (parse->sortClause)
! cost_sort(&hashed_p, root, root->sort_pathkeys, hashed_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
--- 2931,2938 ----
* need to charge for the final sort.
*/
if (parse->sortClause)
! cost_sort(&hashed_p, root, root->sort_pathkeys, 0,
! hashed_p.startup_cost, hashed_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_distinct(PlannerInfo *root
*** 2897,2919 ****
needed_pathkeys = root->sort_pathkeys;
else
needed_pathkeys = root->distinct_pathkeys;
! if (!pathkeys_contained_in(needed_pathkeys, current_pathkeys))
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
current_pathkeys = root->distinct_pathkeys;
else
current_pathkeys = root->sort_pathkeys;
! cost_sort(&sorted_p, root, current_pathkeys, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
}
cost_group(&sorted_p, root, numDistinctCols, dNumDistinctRows,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
if (parse->sortClause &&
! !pathkeys_contained_in(root->sort_pathkeys, current_pathkeys))
! cost_sort(&sorted_p, root, root->sort_pathkeys, sorted_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
--- 2949,2978 ----
needed_pathkeys = root->sort_pathkeys;
else
needed_pathkeys = root->distinct_pathkeys;
!
! n_common_pathkeys = pathkeys_common(needed_pathkeys, current_pathkeys);
! if (n_common_pathkeys < list_length(needed_pathkeys))
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
current_pathkeys = root->distinct_pathkeys;
else
current_pathkeys = root->sort_pathkeys;
! cost_sort(&sorted_p, root, current_pathkeys,
! n_common_pathkeys, sorted_p.startup_cost, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
}
cost_group(&sorted_p, root, numDistinctCols, dNumDistinctRows,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
+
+
+ n_common_pathkeys = pathkeys_common(root->sort_pathkeys, current_pathkeys);
if (parse->sortClause &&
! n_common_pathkeys < list_length(root->sort_pathkeys))
! cost_sort(&sorted_p, root, root->sort_pathkeys, n_common_pathkeys,
! sorted_p.startup_cost, sorted_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
*************** plan_cluster_use_sort(Oid tableOid, Oid
*** 3703,3710 ****
/* Estimate the cost of seq scan + sort */
seqScanPath = create_seqscan_path(root, rel, NULL);
! cost_sort(&seqScanAndSortPath, root, NIL,
! seqScanPath->total_cost, rel->tuples, rel->width,
comparisonCost, maintenance_work_mem, -1.0);
/* Estimate the cost of index scan */
--- 3762,3770 ----
/* Estimate the cost of seq scan + sort */
seqScanPath = create_seqscan_path(root, rel, NULL);
! cost_sort(&seqScanAndSortPath, root, NIL, 0,
! seqScanPath->startup_cost, seqScanPath->total_cost,
! rel->tuples, rel->width,
comparisonCost, maintenance_work_mem, -1.0);
/* Estimate the cost of index scan */
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
new file mode 100644
index e249628..b0b5471
*** a/src/backend/optimizer/prep/prepunion.c
--- b/src/backend/optimizer/prep/prepunion.c
*************** choose_hashed_setop(PlannerInfo *root, L
*** 859,865 ****
sorted_p.startup_cost = input_plan->startup_cost;
sorted_p.total_cost = input_plan->total_cost;
/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
! cost_sort(&sorted_p, root, NIL, sorted_p.total_cost,
input_plan->plan_rows, input_plan->plan_width,
0.0, work_mem, -1.0);
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
--- 859,866 ----
sorted_p.startup_cost = input_plan->startup_cost;
sorted_p.total_cost = input_plan->total_cost;
/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
! cost_sort(&sorted_p, root, NIL, 0,
! sorted_p.startup_cost, sorted_p.total_cost,
input_plan->plan_rows, input_plan->plan_width,
0.0, work_mem, -1.0);
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
new file mode 100644
index a7169ef..3d0a842
*** a/src/backend/optimizer/util/pathnode.c
--- b/src/backend/optimizer/util/pathnode.c
*************** create_merge_append_path(PlannerInfo *ro
*** 971,980 ****
foreach(l, subpaths)
{
Path *subpath = (Path *) lfirst(l);
pathnode->path.rows += subpath->rows;
! if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
{
/* Subpath is adequately ordered, we won't need to sort it */
input_startup_cost += subpath->startup_cost;
--- 971,981 ----
foreach(l, subpaths)
{
Path *subpath = (Path *) lfirst(l);
+ int n_common_pathkeys = pathkeys_common(pathkeys, subpath->pathkeys);
pathnode->path.rows += subpath->rows;
! if (n_common_pathkeys == list_length(pathkeys))
{
/* Subpath is adequately ordered, we won't need to sort it */
input_startup_cost += subpath->startup_cost;
*************** create_merge_append_path(PlannerInfo *ro
*** 988,993 ****
--- 989,996 ----
cost_sort(&sort_path,
root,
pathkeys,
+ n_common_pathkeys,
+ subpath->startup_cost,
subpath->total_cost,
subpath->parent->tuples,
subpath->parent->width,
*************** create_unique_path(PlannerInfo *root, Re
*** 1343,1349 ****
/*
* Estimate cost for sort+unique implementation
*/
! cost_sort(&sort_path, root, NIL,
subpath->total_cost,
rel->rows,
rel->width,
--- 1346,1353 ----
/*
* Estimate cost for sort+unique implementation
*/
! cost_sort(&sort_path, root, NIL, 0,
! subpath->startup_cost,
subpath->total_cost,
rel->rows,
rel->width,
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
new file mode 100644
index ea8af9f..29b90f2
*** a/src/backend/utils/sort/tuplesort.c
--- b/src/backend/utils/sort/tuplesort.c
*************** free_sort_tuple(Tuplesortstate *state, S
*** 3455,3457 ****
--- 3455,3464 ----
FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
pfree(stup->tuple);
}
+
+ SortSupport
+ tuplesort_get_sortkeys(Tuplesortstate *state)
+ {
+ return state->sortKeys;
+ }
+
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index 5a40347..3723a18
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
*************** typedef struct SortState
*** 1663,1670 ****
--- 1663,1672 ----
int64 bound; /* if bounded, how many tuples are needed */
bool sort_Done; /* sort completed yet? */
bool bounded_Done; /* value of bounded we did the sort with */
+ bool finished;
int64 bound_Done; /* value of bound we did the sort with */
void *tuplesortstate; /* private state of tuplesort.c */
+ HeapTuple prev;
} SortState;
/* ---------------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 101e22c..28b871e
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef struct Sort
*** 582,587 ****
--- 582,588 ----
{
Plan plan;
int numCols; /* number of sort-key columns */
+ int skipCols; /* number of leading sort keys already ordered in input */
AttrNumber *sortColIdx; /* their indexes in the target list */
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
new file mode 100644
index 444ab74..e98fb0c
*** a/src/include/optimizer/cost.h
--- b/src/include/optimizer/cost.h
*************** extern void cost_ctescan(Path *path, Pla
*** 88,95 ****
RelOptInfo *baserel, ParamPathInfo *param_info);
extern void cost_recursive_union(Plan *runion, Plan *nrterm, Plan *rterm);
extern void cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, Cost input_cost, double tuples, int width,
! Cost comparison_cost, int sort_mem,
double limit_tuples);
extern void cost_merge_append(Path *path, PlannerInfo *root,
List *pathkeys, int n_streams,
--- 88,96 ----
RelOptInfo *baserel, ParamPathInfo *param_info);
extern void cost_recursive_union(Plan *runion, Plan *nrterm, Plan *rterm);
extern void cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, int presorted_keys,
! Cost input_startup_cost, Cost input_total_cost,
! double tuples, int width, Cost comparison_cost, int sort_mem,
double limit_tuples);
extern void cost_merge_append(Path *path, PlannerInfo *root,
List *pathkeys, int n_streams,
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
new file mode 100644
index 999adaa..7c09301
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** typedef enum
*** 157,162 ****
--- 157,163 ----
extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2);
extern bool pathkeys_contained_in(List *keys1, List *keys2);
+ extern int pathkeys_common(List *keys1, List *keys2);
extern Path *get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
Relids required_outer,
CostSelector cost_criterion);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
new file mode 100644
index ba7ae7c..d33c615
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
*************** extern RecursiveUnion *make_recursive_un
*** 50,60 ****
Plan *lefttree, Plan *righttree, int wtParam,
List *distinctList, long numGroups);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
! List *pathkeys, double limit_tuples);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
! AttrNumber *grpColIdx, Plan *lefttree);
extern Agg *make_agg(PlannerInfo *root, List *tlist, List *qual,
AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
--- 50,61 ----
Plan *lefttree, Plan *righttree, int wtParam,
List *distinctList, long numGroups);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
! List *pathkeys, double limit_tuples, int skipCols);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
! AttrNumber *grpColIdx, Plan *lefttree, List *pathkeys,
! int skipCols);
extern Agg *make_agg(PlannerInfo *root, List *tlist, List *qual,
AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
new file mode 100644
index 25fa6de..267a988
*** a/src/include/utils/tuplesort.h
--- b/src/include/utils/tuplesort.h
***************
*** 24,29 ****
--- 24,30 ----
#include "executor/tuptable.h"
#include "fmgr.h"
#include "utils/relcache.h"
+ #include "utils/sortsupport.h"
/* Tuplesortstate is an opaque type whose details are not known outside
*************** extern void tuplesort_get_stats(Tuplesor
*** 108,113 ****
--- 109,116 ----
extern int tuplesort_merge_order(int64 allowedMem);
+ extern SortSupport tuplesort_get_sortkeys(Tuplesortstate *state);
+
/*
* These routines may only be called if randomAccess was specified 'true'.
* Likewise, backwards scan in gettuple/getdatum is only allowed if
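A note on the planner hunks above: most of them replace a pathkeys_contained_in() test with a comparison between pathkeys_common() and list_length(). The following is only a minimal sketch of that decision, assuming pathkeys_common() returns the length of the shared leading prefix of the two lists. The names common_prefix_len, classify_sort and SortNeed are hypothetical and not part of the patch, and plain ints stand in for PathKey nodes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pathkeys_common(): count how many leading
 * entries of keys1 are also the leading entries of keys2.  Plain ints
 * stand in for PathKey nodes here.
 */
static int
common_prefix_len(const int *keys1, size_t n1, const int *keys2, size_t n2)
{
    size_t n = 0;

    while (n < n1 && n < n2 && keys1[n] == keys2[n])
        n++;
    return (int) n;
}

typedef enum { SORT_NONE, SORT_PARTIAL, SORT_FULL } SortNeed;

/*
 * Decide what kind of sort is needed to turn an input ordered by "have"
 * into an output ordered by "wanted".
 */
static SortNeed
classify_sort(const int *wanted, size_t n_wanted,
              const int *have, size_t n_have)
{
    int common = common_prefix_len(wanted, n_wanted, have, n_have);

    assert(common >= 0 && (size_t) common <= n_wanted);

    if ((size_t) common == n_wanted)
        return SORT_NONE;       /* input already satisfies the ordering */
    return common > 0 ? SORT_PARTIAL : SORT_FULL;
}
```

With wanted = (v1, v2) and an index on (v1) the sketch reports SORT_PARTIAL, which corresponds to the skipCols > 0 case in the patch. An empty wanted list yields SORT_NONE, matching the query_pathkeys == NIL branch.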
On Sun, Dec 22, 2013 at 07:38:05PM +0400, Alexander Korotkov wrote:
Hi!
Next revision. It is expected to work better with the optimizer. It introduces
a presorted_keys argument to the cost_sort function, which represents the
number of keys already sorted in the Path. This function then uses
estimate_num_groups to estimate the number of groups with different values of
the presorted keys, and assumes that the dataset is divided uniformly among the
groups. get_cheapest_fractional_path_for_pathkeys tries to select the path
matching the largest part of the path keys.
You can see it's working pretty well on single-table queries.
Nice work! The plans look good and the calculated costs seem sane also.
I suppose the problem with the joins is generating the pathkeys?
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.
-- Arthur Schopenhauer
On Sun, Dec 22, 2013 at 8:12 PM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
On Sun, Dec 22, 2013 at 07:38:05PM +0400, Alexander Korotkov wrote:
Hi!
Next revision. It is expected to work better with the optimizer. It introduces
a presorted_keys argument to the cost_sort function, which represents the
number of keys already sorted in the Path. This function then uses
estimate_num_groups to estimate the number of groups with different values of
the presorted keys, and assumes that the dataset is divided uniformly among the
groups. get_cheapest_fractional_path_for_pathkeys tries to select the path
matching the largest part of the path keys.
You can see it's working pretty well on single-table queries.
Nice work! The plans look good and the calculated costs seem sane also.
I suppose the problem with the joins is generating the pathkeys?
In general, the problem is that partial sort is an alternative to doing a less
restrictive merge join and filtering its results. As far as I can see, taking
care of this requires some rework of the merge optimization. For now, I don't
see what it is going to look like. I'll try to dig more into the details.
------
With best regards,
Alexander Korotkov.
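To make the costing idea quoted above concrete, here is a rough sketch of how a cost estimate could use the number of presorted groups under the uniform-groups assumption. It is not the formula from the patch: the function and struct names are hypothetical, disk and memory effects are ignored, and ceil_log2 is a crude stand-in for the usual N log N comparison count.

```c
#include <assert.h>

/* Rough stand-in for log2(n): the smallest k such that 2^k >= n. */
static double
ceil_log2(double n)
{
    double k = 0.0;
    double p = 1.0;

    while (p < n)
    {
        p *= 2.0;
        k += 1.0;
    }
    return k;
}

typedef struct SketchCost
{
    double startup;             /* cost before the first tuple is returned */
    double total;               /* cost to return all tuples */
} SketchCost;

/*
 * Hypothetical partial sort cost: the input is assumed to be split into
 * num_groups equally sized groups, each sorted separately.  The first
 * tuple can be returned once the first group has been read and sorted.
 */
static SketchCost
sketch_partial_sort_cost(double input_startup, double input_total,
                         double tuples, double num_groups, double cmp_cost)
{
    SketchCost  c;
    double      group_tuples;
    double      group_sort;
    double      input_run;

    assert(tuples >= 1.0 && num_groups >= 1.0 && num_groups <= tuples);

    group_tuples = tuples / num_groups;
    group_sort = cmp_cost * group_tuples * ceil_log2(group_tuples);
    input_run = input_total - input_startup;

    c.startup = input_startup + input_run / num_groups + group_sort;
    c.total = input_total + num_groups * group_sort;
    return c;
}
```

With num_groups = 1 the sketch degenerates to an ordinary sort, where startup and total cost coincide. With many groups the startup cost drops sharply, which is what makes partial sort attractive under LIMIT.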
On Sat, Dec 14, 2013 at 6:30 PM, Jeremy Harris <jgh@wizmail.org> wrote:
On 14/12/13 12:54, Andres Freund wrote:
On 2013-12-14 13:59:02 +0400, Alexander Korotkov wrote:
Currently when we need to get ordered result from table we have to choose
one of two approaches: get results from index in exact order we need or do
sort of tuples. However, it could be useful to mix both methods: get
results from index in order which partially meets our requirements and do
rest of work from heap.
------------------------------------------------------------
------------------------------------------------------------
------------------
Limit (cost=69214.06..69214.08 rows=10 width=16) (actual
time=0.097..0.099 rows=10 loops=1)
-> Sort (cost=69214.06..71714.06 rows=1000000 width=16) (actual
time=0.096..0.097 rows=10 loops=1)
Sort Key: v1, v2
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_v1_idx on test (cost=0.42..47604.42
rows=1000000 width=16) (actual time=0.017..0.066 rows=56 loops=1)
Total runtime: 0.125 ms
(6 rows)
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I think we might need to switch to a different algorithm to really
benefit from mostly pre-sorted input.
Eg: /messages/by-id/5291467E.6070807@wizmail.org
Maybe Alexander and I should bash our heads together.
The partial sort patch is mostly an optimizer/executor improvement rather than
an improvement of the sort algorithm itself. But I would appreciate using
enhancements of sorting algorithms in my work.
------
With best regards,
Alexander Korotkov.
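To illustrate why this is an executor-level change rather than a new sort algorithm, here is a minimal standalone sketch of the idea: the input arrives ordered by the first key, and an ordinary qsort() is run on each run of equal first-key values. The Row type and function names are hypothetical. The patch itself works on tuples through tuplesort, not on arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical row layout mirroring the (v1, v2) example in this thread. */
typedef struct Row
{
    int     v1;                 /* presorted key, e.g. from an index scan */
    double  v2;                 /* key that still has to be sorted */
} Row;

/* qsort() comparator on the remaining key only. */
static int
cmp_v2(const void *a, const void *b)
{
    const Row *ra = a;
    const Row *rb = b;

    return (ra->v2 > rb->v2) - (ra->v2 < rb->v2);
}

/*
 * Sort rows[] by (v1, v2), given that rows[] is already ordered by v1.
 * Each group of equal v1 values is sorted on its own, so a consumer
 * needing only the first few rows could stop after the first groups.
 */
static void
partial_sort_rows(Row *rows, size_t n)
{
    size_t start = 0;

    while (start < n)
    {
        size_t end = start + 1;

        /* find the end of the current group of equal presorted keys */
        while (end < n && rows[end].v1 == rows[start].v1)
            end++;

        /* precondition check: the presorted key must be ascending */
        assert(end == n || rows[end].v1 > rows[start].v1);

        qsort(rows + start, end - start, sizeof(Row), cmp_v2);
        start = end;
    }
}
```

Any improvement to the underlying sort routine would apply to each per-group qsort() call unchanged, so the two lines of work look complementary.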
On 12/22/2013 04:38 PM, Alexander Korotkov wrote:
postgres=# explain analyze select * from test order by v1, id limit 10;
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=11441.77..11442.18 rows=10 width=12) (actual
time=79.980..79.982 rows=10 loops=1)
-> Partial sort (cost=11441.77..53140.44 rows=1000000 width=12)
(actual time=79.978..79.978 rows=10 loops=1)
Sort Key: v1, id
Presorted Key: v1
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_v1_idx on test (cost=0.42..47038.83
rows=1000000 width=12) (actual time=0.031..38.275 rows=100213 loops=1)
Total runtime: 81.786 ms
(7 rows)
Have you thought about how you plan to print which sort method and
how much memory was used? Several different sort methods may have been
used in the query. Should the largest amount of memory/disk be printed?
However, work with joins needs more improvements.
That would be really nice to have, but the patch seems useful even
without the improvements to joins.
--
Andreas Karlsson
On Tue, Dec 24, 2013 at 6:02 AM, Andreas Karlsson <andreas@proxel.se> wrote:
On 12/22/2013 04:38 PM, Alexander Korotkov wrote:
postgres=# explain analyze select * from test order by v1, id limit 10;
QUERY
PLAN
------------------------------------------------------------
------------------------------------------------------------
-----------------------
Limit (cost=11441.77..11442.18 rows=10 width=12) (actual
time=79.980..79.982 rows=10 loops=1)
-> Partial sort (cost=11441.77..53140.44 rows=1000000 width=12)
(actual time=79.978..79.978 rows=10 loops=1)
Sort Key: v1, id
Presorted Key: v1
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_v1_idx on test (cost=0.42..47038.83
rows=1000000 width=12) (actual time=0.031..38.275 rows=100213 loops=1)
Total runtime: 81.786 ms
(7 rows)
Have you thought about how you plan to print which sort method and how
much memory was used? Several different sort methods may have been used in
the query. Should the largest amount of memory/disk be printed?
Apparently, right now the amount of memory used for the last sorted group is
printed. Your proposal makes sense: the largest amount of memory/disk should be
printed.
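One way to report the largest amount would be to keep a running maximum as each group's sort finishes. The sketch below only illustrates that bookkeeping. SortSpaceStats and sort_space_note_group are hypothetical names, not part of the patch or of tuplesort.c, and the sketch compares memory and disk figures directly, which may not be the right policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node statistics accumulated across all sorted groups. */
typedef struct SortSpaceStats
{
    long    peak_kb;            /* largest space used by any single group */
    bool    peak_on_disk;       /* was that peak a disk (external) sort? */
    int     groups;             /* number of groups sorted so far */
} SortSpaceStats;

/* Record the space used by one finished group, keeping the maximum. */
static void
sort_space_note_group(SortSpaceStats *stats, long used_kb, bool on_disk)
{
    assert(stats != NULL && used_kb >= 0);

    stats->groups++;
    if (used_kb > stats->peak_kb)
    {
        stats->peak_kb = used_kb;
        stats->peak_on_disk = on_disk;
    }
}
```

EXPLAIN ANALYZE could then print the peak values instead of whatever the last group happened to use.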
However, work with joins needs more improvements.
That would be really nice to have, but the patch seems useful even without
the improvements to joins.
The attached revision of the patch implements partial sort usage in merge joins.
create table test1 as (
select id,
(random()*100)::int as v1,
(random()*10000)::int as v2
from generate_series(1,1000000) id);
create table test2 as (
select id,
(random()*100)::int as v1,
(random()*10000)::int as v2
from generate_series(1,1000000) id);
create index test1_v1_idx on test1 (v1);
create index test2_v1_idx on test2 (v1);
# explain select * from test1 t1 join test2 t2 on t1.v1 = t2.v1 and t1.v2 =
t2.v2;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Merge Join (cost=2257.67..255273.39 rows=983360 width=24)
Merge Cond: ((t1.v1 = t2.v1) AND (t1.v2 = t2.v2))
-> Partial sort (cost=1128.84..116470.79 rows=1000000 width=12)
Sort Key: t1.v1, t1.v2
Presorted Key: t1.v1
-> Index Scan using test1_v1_idx on test1 t1
(cost=0.42..47604.01 rows=1000000 width=12)
-> Materialize (cost=1128.83..118969.00 rows=1000000 width=12)
-> Partial sort (cost=1128.83..116469.00 rows=1000000 width=12)
Sort Key: t2.v1, t2.v2
Presorted Key: t2.v1
-> Index Scan using test2_v1_idx on test2 t2
(cost=0.42..47602.22 rows=1000000 width=12)
I believe the patch now covers the desired functionality. I'm going to focus on
nailing down details, refactoring and documenting.
------
With best regards,
Alexander Korotkov.
Attachments:
partial-sort-3.patch (application/octet-stream)
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
new file mode 100644
index 9969a25..07cb66d
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
*************** static void show_agg_keys(AggState *asta
*** 81,87 ****
static void show_group_keys(GroupState *gstate, List *ancestors,
ExplainState *es);
static void show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, AttrNumber *keycols,
List *ancestors, ExplainState *es);
static void show_sort_info(SortState *sortstate, ExplainState *es);
static void show_hash_info(HashState *hashstate, ExplainState *es);
--- 81,87 ----
static void show_group_keys(GroupState *gstate, List *ancestors,
ExplainState *es);
static void show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, int nPresortedKeys, AttrNumber *keycols,
List *ancestors, ExplainState *es);
static void show_sort_info(SortState *sortstate, ExplainState *es);
static void show_hash_info(HashState *hashstate, ExplainState *es);
*************** ExplainNode(PlanState *planstate, List *
*** 905,911 ****
pname = sname = "Materialize";
break;
case T_Sort:
! pname = sname = "Sort";
break;
case T_Group:
pname = sname = "Group";
--- 905,914 ----
pname = sname = "Materialize";
break;
case T_Sort:
! if (((Sort *) plan)->skipCols > 0)
! pname = sname = "Partial sort";
! else
! pname = sname = "Sort";
break;
case T_Group:
pname = sname = "Group";
*************** show_sort_keys(SortState *sortstate, Lis
*** 1705,1711 ****
Sort *plan = (Sort *) sortstate->ss.ps.plan;
show_sort_group_keys((PlanState *) sortstate, "Sort Key",
! plan->numCols, plan->sortColIdx,
ancestors, es);
}
--- 1708,1714 ----
Sort *plan = (Sort *) sortstate->ss.ps.plan;
show_sort_group_keys((PlanState *) sortstate, "Sort Key",
! plan->numCols, plan->skipCols, plan->sortColIdx,
ancestors, es);
}
*************** show_merge_append_keys(MergeAppendState
*** 1719,1725 ****
MergeAppend *plan = (MergeAppend *) mstate->ps.plan;
show_sort_group_keys((PlanState *) mstate, "Sort Key",
! plan->numCols, plan->sortColIdx,
ancestors, es);
}
--- 1722,1728 ----
MergeAppend *plan = (MergeAppend *) mstate->ps.plan;
show_sort_group_keys((PlanState *) mstate, "Sort Key",
! plan->numCols, 0, plan->sortColIdx,
ancestors, es);
}
*************** show_agg_keys(AggState *astate, List *an
*** 1737,1743 ****
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(astate, ancestors);
show_sort_group_keys(outerPlanState(astate), "Group Key",
! plan->numCols, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
--- 1740,1746 ----
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(astate, ancestors);
show_sort_group_keys(outerPlanState(astate), "Group Key",
! plan->numCols, 0, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
*************** show_group_keys(GroupState *gstate, List
*** 1755,1761 ****
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(gstate, ancestors);
show_sort_group_keys(outerPlanState(gstate), "Group Key",
! plan->numCols, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
--- 1758,1764 ----
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(gstate, ancestors);
show_sort_group_keys(outerPlanState(gstate), "Group Key",
! plan->numCols, 0, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
*************** show_group_keys(GroupState *gstate, List
*** 1765,1777 ****
* as arrays of targetlist indexes
*/
static void
! show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, AttrNumber *keycols,
List *ancestors, ExplainState *es)
{
Plan *plan = planstate->plan;
List *context;
! List *result = NIL;
bool useprefix;
int keyno;
char *exprstr;
--- 1768,1781 ----
* as arrays of targetlist indexes
*/
static void
! show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, int nPresortedKeys, AttrNumber *keycols,
List *ancestors, ExplainState *es)
{
Plan *plan = planstate->plan;
List *context;
! List *resultSort = NIL;
! List *resultPresorted = NIL;
bool useprefix;
int keyno;
char *exprstr;
*************** show_sort_group_keys(PlanState *planstat
*** 1798,1807 ****
/* Deparse the expression, showing any top-level cast */
exprstr = deparse_expression((Node *) target->expr, context,
useprefix, true);
! result = lappend(result, exprstr);
}
! ExplainPropertyList(qlabel, result, es);
}
/*
--- 1802,1816 ----
/* Deparse the expression, showing any top-level cast */
exprstr = deparse_expression((Node *) target->expr, context,
useprefix, true);
!
! if (keyno < nPresortedKeys)
! resultPresorted = lappend(resultPresorted, exprstr);
! resultSort = lappend(resultSort, exprstr);
}
! ExplainPropertyList(qlabel, resultSort, es);
! if (nPresortedKeys > 0)
! ExplainPropertyList("Presorted Key", resultPresorted, es);
}
/*
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
new file mode 100644
index 09b2eb0..1693d46
*** a/src/backend/executor/nodeSort.c
--- b/src/backend/executor/nodeSort.c
***************
*** 15,25 ****
--- 15,52 ----
#include "postgres.h"
+ #include "access/htup_details.h"
#include "executor/execdebug.h"
#include "executor/nodeSort.h"
#include "miscadmin.h"
#include "utils/tuplesort.h"
+ /*
+ * Check if first "skipCols" sort values are equal.
+ */
+ static bool
+ cmpSortSkipCols(SortState *node, TupleDesc tupDesc, HeapTuple a, TupleTableSlot *b)
+ {
+ int n = ((Sort *)node->ss.ps.plan)->skipCols, i;
+ SortSupport sortKeys = tuplesort_get_sortkeys(node->tuplesortstate);
+
+ for (i = 0; i < n; i++)
+ {
+ Datum datumA, datumB;
+ bool isnullA, isnullB;
+ AttrNumber attno = sortKeys[i].ssup_attno;
+
+ datumA = heap_getattr(a, attno, tupDesc, &isnullA);
+ datumB = slot_getattr(b, attno, &isnullB);
+
+ if (ApplySortComparator(datumA, isnullA,
+ datumB, isnullB,
+ &sortKeys[i]))
+ return false;
+ }
+ return true;
+ }
+
/* ----------------------------------------------------------------
* ExecSort
*************** ExecSort(SortState *node)
*** 42,47 ****
--- 69,75 ----
ScanDirection dir;
Tuplesortstate *tuplesortstate;
TupleTableSlot *slot;
+ int skipCols = ((Sort *)node->ss.ps.plan)->skipCols;
/*
* get state info from node
*************** ExecSort(SortState *node)
*** 54,131 ****
tuplesortstate = (Tuplesortstate *) node->tuplesortstate;
/*
* If first time through, read all tuples from outer plan and pass them to
* tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
*/
! if (!node->sort_Done)
! {
! Sort *plannode = (Sort *) node->ss.ps.plan;
! PlanState *outerNode;
! TupleDesc tupDesc;
! SO1_printf("ExecSort: %s\n",
! "sorting subplan");
! /*
! * Want to scan subplan in the forward direction while creating the
! * sorted data.
! */
! estate->es_direction = ForwardScanDirection;
! /*
! * Initialize tuplesort module.
! */
! SO1_printf("ExecSort: %s\n",
! "calling tuplesort_begin");
! outerNode = outerPlanState(node);
! tupDesc = ExecGetResultType(outerNode);
! tuplesortstate = tuplesort_begin_heap(tupDesc,
! plannode->numCols,
! plannode->sortColIdx,
! plannode->sortOperators,
! plannode->collations,
! plannode->nullsFirst,
! work_mem,
! node->randomAccess);
! if (node->bounded)
! tuplesort_set_bound(tuplesortstate, node->bound);
! node->tuplesortstate = (void *) tuplesortstate;
! /*
! * Scan the subplan and feed all the tuples to tuplesort.
! */
! for (;;)
{
- slot = ExecProcNode(outerNode);
-
if (TupIsNull(slot))
break;
!
tuplesort_puttupleslot(tuplesortstate, slot);
}
! /*
! * Complete the sort.
! */
! tuplesort_performsort(tuplesortstate);
! /*
! * restore to user specified direction
! */
! estate->es_direction = dir;
! /*
! * finally set the sorted flag to true
! */
! node->sort_Done = true;
! node->bounded_Done = node->bounded;
! node->bound_Done = node->bound;
! SO1_printf("ExecSort: %s\n", "sorting done");
! }
SO1_printf("ExecSort: %s\n",
"retrieving tuple from tuplesort");
--- 82,206 ----
tuplesortstate = (Tuplesortstate *) node->tuplesortstate;
/*
+ * Return next tuple from sorted set if any.
+ */
+ if (node->sort_Done)
+ {
+ slot = node->ss.ps.ps_ResultTupleSlot;
+ if (tuplesort_gettupleslot(tuplesortstate,
+ ScanDirectionIsForward(dir),
+ slot) || node->finished)
+ return slot;
+ }
+
+ /*
* If first time through, read all tuples from outer plan and pass them to
* tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
*/
! Sort *plannode = (Sort *) node->ss.ps.plan;
! PlanState *outerNode;
! TupleDesc tupDesc;
! SO1_printf("ExecSort: %s\n",
! "sorting subplan");
! /*
! * Want to scan subplan in the forward direction while creating the
! * sorted data.
! */
! estate->es_direction = ForwardScanDirection;
! /*
! * Initialize tuplesort module.
! */
! SO1_printf("ExecSort: %s\n",
! "calling tuplesort_begin");
! outerNode = outerPlanState(node);
! tupDesc = ExecGetResultType(outerNode);
! if (node->tuplesortstate != NULL)
! tuplesort_end((Tuplesortstate *) node->tuplesortstate);
! tuplesortstate = tuplesort_begin_heap(tupDesc,
! plannode->numCols,
! plannode->sortColIdx,
! plannode->sortOperators,
! plannode->collations,
! plannode->nullsFirst,
! work_mem,
! node->randomAccess);
! if (node->bounded)
! tuplesort_set_bound(tuplesortstate, node->bound);
! node->tuplesortstate = (void *) tuplesortstate;
! /*
! * Put the next group of tuples, whose first "skipCols" sort values are
! * equal, into the tuplesort.
! */
! for (;;)
! {
! slot = ExecProcNode(outerNode);
! if (skipCols == 0)
{
if (TupIsNull(slot))
+ {
+ node->finished = true;
break;
! }
tuplesort_puttupleslot(tuplesortstate, slot);
}
+ else if (node->prev)
+ {
+ ExecStoreTuple(node->prev, node->ss.ps.ps_ResultTupleSlot, InvalidBuffer, false);
+ tuplesort_puttupleslot(tuplesortstate, node->ss.ps.ps_ResultTupleSlot);
! if (TupIsNull(slot))
! {
! node->finished = true;
! break;
! }
! else
! {
! bool cmp;
! cmp = cmpSortSkipCols(node, tupDesc, node->prev, slot);
! node->prev = ExecCopySlotTuple(slot);
! if (!cmp)
! break;
! }
! }
! else
! {
! if (TupIsNull(slot))
! {
! node->finished = true;
! break;
! }
! else
! {
! node->prev = ExecCopySlotTuple(slot);
! }
! }
! }
! /*
! * Complete the sort.
! */
! tuplesort_performsort(tuplesortstate);
! /*
! * restore to user specified direction
! */
! estate->es_direction = dir;
!
! /*
! * finally set the sorted flag to true
! */
! node->sort_Done = true;
! node->bounded_Done = node->bounded;
! node->bound_Done = node->bound;
! SO1_printf("ExecSort: %s\n", "sorting done");
SO1_printf("ExecSort: %s\n",
"retrieving tuple from tuplesort");
*************** ExecInitSort(Sort *node, EState *estate,
*** 174,180 ****
--- 249,257 ----
sortstate->bounded = false;
sortstate->sort_Done = false;
+ sortstate->finished = false;
sortstate->tuplesortstate = NULL;
+ sortstate->prev = NULL;
/*
* Miscellaneous initialization
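
To make the executor change above easier to follow, here is a minimal standalone sketch of the same idea outside PostgreSQL. It assumes the input is already ordered by the first key and sorts each run of equal first-key values by the remaining key. The Row type, cmp_v2 and partial_sort are hypothetical names used only for illustration, and qsort stands in for tuplesort; this is not the patch's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical row type: v1 is the presorted key, v2 the remaining key. */
typedef struct Row
{
    int     v1;
    double  v2;
} Row;

/* qsort comparator on the remaining key only. */
static int
cmp_v2(const void *a, const void *b)
{
    double  x = ((const Row *) a)->v2;
    double  y = ((const Row *) b)->v2;

    return (x > y) - (x < y);
}

/*
 * Sort rows[0..n) by (v1, v2), given that the input is already in
 * ascending v1 order.  Returns the number of groups that were sorted.
 */
static size_t
partial_sort(Row *rows, size_t n)
{
    size_t  start = 0;
    size_t  groups = 0;

    while (start < n)
    {
        size_t  end = start + 1;

        /* Extend the group while the presorted key stays the same. */
        while (end < n && rows[end].v1 == rows[start].v1)
            end++;

        /* The input must really be presorted on v1. */
        assert(end == n || rows[end].v1 > rows[start].v1);

        /* Only this group needs sorting, and only by the remaining key. */
        qsort(rows + start, end - start, sizeof(Row), cmp_v2);
        groups++;
        start = end;
    }
    return groups;
}
```

With a LIMIT on top, a caller could stop after the first few groups, which is where the saving in the example plan comes from.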
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index e4184c5..b41213a
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copySort(const Sort *from)
*** 735,740 ****
--- 735,741 ----
CopyPlanFields((const Plan *) from, (Plan *) newnode);
COPY_SCALAR_FIELD(numCols);
+ COPY_SCALAR_FIELD(skipCols);
COPY_POINTER_FIELD(sortColIdx, from->numCols * sizeof(AttrNumber));
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
new file mode 100644
index 50f0852..1a38407
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
*************** cost_recursive_union(Plan *runion, Plan
*** 1281,1295 ****
*/
void
cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, Cost input_cost, double tuples, int width,
! Cost comparison_cost, int sort_mem,
double limit_tuples)
{
! Cost startup_cost = input_cost;
! Cost run_cost = 0;
double input_bytes = relation_byte_size(tuples, width);
double output_bytes;
double output_tuples;
long sort_mem_bytes = sort_mem * 1024L;
if (!enable_sort)
--- 1281,1302 ----
*/
void
cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, int presorted_keys,
! Cost input_startup_cost, Cost input_total_cost,
! double tuples, int width, Cost comparison_cost, int sort_mem,
double limit_tuples)
{
! Cost startup_cost = input_startup_cost;
! Cost run_cost = 0,
! rest_cost,
! group_cost,
! input_run_cost = input_total_cost - input_startup_cost;
double input_bytes = relation_byte_size(tuples, width);
double output_bytes;
double output_tuples;
+ double num_groups,
+ group_input_bytes,
+ group_tuples;
long sort_mem_bytes = sort_mem * 1024L;
if (!enable_sort)
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1319,1331 ****
output_bytes = input_bytes;
}
! if (output_bytes > sort_mem_bytes)
{
/*
* We'll have to use a disk-based sort of all the tuples
*/
! double npages = ceil(input_bytes / BLCKSZ);
! double nruns = (input_bytes / sort_mem_bytes) * 0.5;
double mergeorder = tuplesort_merge_order(sort_mem_bytes);
double log_runs;
double npageaccesses;
--- 1326,1367 ----
output_bytes = input_bytes;
}
! if (presorted_keys > 0)
! {
! List *groupExprs = NIL;
! ListCell *l;
! int i = 0;
!
! foreach(l, pathkeys)
! {
! PathKey *key = (PathKey *)lfirst(l);
! EquivalenceMember *member = (EquivalenceMember *)
! lfirst(list_head(key->pk_eclass->ec_members));
!
! groupExprs = lappend(groupExprs, member->em_expr);
!
! i++;
! if (i >= presorted_keys)
! break;
! }
!
! num_groups = estimate_num_groups(root, groupExprs, tuples);
! }
! else
! {
! num_groups = 1.0;
! }
!
! group_input_bytes = input_bytes / num_groups;
! group_tuples = tuples / num_groups;
!
! if (output_bytes > sort_mem_bytes && group_input_bytes > sort_mem_bytes)
{
/*
* We'll have to use a disk-based sort of all the tuples
*/
! double npages = ceil(group_input_bytes / BLCKSZ);
! double nruns = (group_input_bytes / sort_mem_bytes) * 0.5;
double mergeorder = tuplesort_merge_order(sort_mem_bytes);
double log_runs;
double npageaccesses;
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1335,1341 ****
*
* Assume about N log2 N comparisons
*/
! startup_cost += comparison_cost * tuples * LOG2(tuples);
/* Disk costs */
--- 1371,1377 ----
*
* Assume about N log2 N comparisons
*/
! group_cost = comparison_cost * group_tuples * LOG2(group_tuples);
/* Disk costs */
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1346,1355 ****
log_runs = 1.0;
npageaccesses = 2.0 * npages * log_runs;
/* Assume 3/4ths of accesses are sequential, 1/4th are not */
! startup_cost += npageaccesses *
(seq_page_cost * 0.75 + random_page_cost * 0.25);
}
! else if (tuples > 2 * output_tuples || input_bytes > sort_mem_bytes)
{
/*
* We'll use a bounded heap-sort keeping just K tuples in memory, for
--- 1382,1391 ----
log_runs = 1.0;
npageaccesses = 2.0 * npages * log_runs;
/* Assume 3/4ths of accesses are sequential, 1/4th are not */
! group_cost += npageaccesses *
(seq_page_cost * 0.75 + random_page_cost * 0.25);
}
! else if (group_tuples > 2 * output_tuples || group_input_bytes > sort_mem_bytes)
{
/*
* We'll use a bounded heap-sort keeping just K tuples in memory, for
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1357,1368 ****
* factor is a bit higher than for quicksort. Tweak it so that the
* cost curve is continuous at the crossover point.
*/
! startup_cost += comparison_cost * tuples * LOG2(2.0 * output_tuples);
}
else
{
/* We'll use plain quicksort on all the input tuples */
! startup_cost += comparison_cost * tuples * LOG2(tuples);
}
/*
--- 1393,1404 ----
* factor is a bit higher than for quicksort. Tweak it so that the
* cost curve is continuous at the crossover point.
*/
! group_cost = comparison_cost * group_tuples * LOG2(2.0 * output_tuples);
}
else
{
/* We'll use plain quicksort on all the input tuples */
! group_cost = comparison_cost * group_tuples * LOG2(group_tuples);
}
/*
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1373,1380 ****
--- 1409,1423 ----
* here --- the upper LIMIT will pro-rate the run cost so we'd be double
* counting the LIMIT otherwise.
*/
+ startup_cost += group_cost;
+ rest_cost = (num_groups * (output_tuples / tuples) - 1.0) * group_cost;
+ if (rest_cost > 0.0)
+ run_cost += rest_cost;
run_cost += cpu_operator_cost * tuples;
+ startup_cost += input_run_cost / num_groups;
+ run_cost += input_run_cost * ((num_groups - 1.0) / num_groups);
+
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
*************** initial_cost_mergejoin(PlannerInfo *root
*** 2075,2080 ****
--- 2118,2125 ----
cost_sort(&sort_path,
root,
outersortkeys,
+ pathkeys_common(outer_path->pathkeys, outersortkeys),
+ outer_path->startup_cost,
outer_path->total_cost,
outer_path_rows,
outer_path->parent->width,
*************** initial_cost_mergejoin(PlannerInfo *root
*** 2101,2106 ****
--- 2146,2153 ----
cost_sort(&sort_path,
root,
innersortkeys,
+ pathkeys_common(inner_path->pathkeys, innersortkeys),
+ inner_path->startup_cost,
inner_path->total_cost,
inner_path_rows,
inner_path->parent->width,
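
As a rough illustration of how the cost_sort changes above split the cost between startup and run, here is a standalone sketch. It assumes the per-group sort cost (group_cost) has already been computed by one of the three branches, and it takes every input as a plain double. SortCost and partial_sort_cost are hypothetical names, not part of the patch.

```c
#include <assert.h>

/* Hypothetical result type for the sketch. */
typedef struct SortCost
{
    double  startup_cost;
    double  total_cost;
} SortCost;

static SortCost
partial_sort_cost(double input_startup_cost, double input_total_cost,
                  double group_cost, double num_groups,
                  double tuples, double output_tuples,
                  double cpu_operator_cost)
{
    SortCost    result;
    double      input_run_cost = input_total_cost - input_startup_cost;
    double      startup_cost = input_startup_cost;
    double      run_cost = 0.0;
    double      rest_cost;

    assert(num_groups >= 1.0 && tuples > 0.0);

    /* The first group has to be read and sorted before any output. */
    startup_cost += group_cost;
    startup_cost += input_run_cost / num_groups;

    /* Remaining groups, scaled by the share of output actually needed. */
    rest_cost = (num_groups * (output_tuples / tuples) - 1.0) * group_cost;
    if (rest_cost > 0.0)
        run_cost += rest_cost;
    run_cost += cpu_operator_cost * tuples;
    run_cost += input_run_cost * ((num_groups - 1.0) / num_groups);

    result.startup_cost = startup_cost;
    result.total_cost = startup_cost + run_cost;
    return result;
}
```

With num_groups = 1 the whole input run cost and the whole sort land in the startup cost, as in an ordinary full sort. More groups move most of the work into the run cost, which is what lets a LIMIT above the sort pro-rate it.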
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
new file mode 100644
index 5b477e5..5909dfe
*** a/src/backend/optimizer/path/joinpath.c
--- b/src/backend/optimizer/path/joinpath.c
*************** sort_inner_and_outer(PlannerInfo *root,
*** 662,668 ****
cur_mergeclauses = find_mergeclauses_for_pathkeys(root,
outerkeys,
true,
! mergeclause_list);
/* Should have used them all... */
Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list));
--- 662,670 ----
cur_mergeclauses = find_mergeclauses_for_pathkeys(root,
outerkeys,
true,
! mergeclause_list,
! NULL,
! NULL);
/* Should have used them all... */
Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list));
*************** match_unsorted_outer(PlannerInfo *root,
*** 832,837 ****
--- 834,840 ----
List *mergeclauses;
List *innersortkeys;
List *trialsortkeys;
+ List *outersortkeys;
Path *cheapest_startup_inner;
Path *cheapest_total_inner;
int num_sortkeys;
*************** match_unsorted_outer(PlannerInfo *root,
*** 937,943 ****
mergeclauses = find_mergeclauses_for_pathkeys(root,
outerpath->pathkeys,
true,
! mergeclause_list);
/*
* Done with this outer path if no chance for a mergejoin.
--- 940,948 ----
mergeclauses = find_mergeclauses_for_pathkeys(root,
outerpath->pathkeys,
true,
! mergeclause_list,
! joinrel,
! &outersortkeys);
/*
* Done with this outer path if no chance for a mergejoin.
*************** match_unsorted_outer(PlannerInfo *root,
*** 961,967 ****
/* Compute the required ordering of the inner path */
innersortkeys = make_inner_pathkeys_for_merge(root,
mergeclauses,
! outerpath->pathkeys);
/*
* Generate a mergejoin on the basis of sorting the cheapest inner.
--- 966,972 ----
/* Compute the required ordering of the inner path */
innersortkeys = make_inner_pathkeys_for_merge(root,
mergeclauses,
! outersortkeys);
/*
* Generate a mergejoin on the basis of sorting the cheapest inner.
*************** match_unsorted_outer(PlannerInfo *root,
*** 980,986 ****
restrictlist,
merge_pathkeys,
mergeclauses,
! NIL,
innersortkeys);
/* Can't do anything else if inner path needs to be unique'd */
--- 985,991 ----
restrictlist,
merge_pathkeys,
mergeclauses,
! outersortkeys,
innersortkeys);
/* Can't do anything else if inner path needs to be unique'd */
*************** match_unsorted_outer(PlannerInfo *root,
*** 1038,1044 ****
for (sortkeycnt = num_sortkeys; sortkeycnt > 0; sortkeycnt--)
{
Path *innerpath;
- List *newclauses = NIL;
/*
* Look for an inner path ordered well enough for the first
--- 1043,1048 ----
*************** match_unsorted_outer(PlannerInfo *root,
*** 1055,1073 ****
compare_path_costs(innerpath, cheapest_total_inner,
TOTAL_COST) < 0))
{
- /* Found a cheap (or even-cheaper) sorted path */
- /* Select the right mergeclauses, if we didn't already */
- if (sortkeycnt < num_sortkeys)
- {
- newclauses =
- find_mergeclauses_for_pathkeys(root,
- trialsortkeys,
- false,
- mergeclauses);
- Assert(newclauses != NIL);
- }
- else
- newclauses = mergeclauses;
try_mergejoin_path(root,
joinrel,
jointype,
--- 1059,1064 ----
*************** match_unsorted_outer(PlannerInfo *root,
*** 1078,1086 ****
innerpath,
restrictlist,
merge_pathkeys,
! newclauses,
! NIL,
! NIL);
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
--- 1069,1077 ----
innerpath,
restrictlist,
merge_pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys);
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
*************** match_unsorted_outer(PlannerInfo *root,
*** 1096,1119 ****
/* Found a cheap (or even-cheaper) sorted path */
if (innerpath != cheapest_total_inner)
{
- /*
- * Avoid rebuilding clause list if we already made one;
- * saves memory in big join trees...
- */
- if (newclauses == NIL)
- {
- if (sortkeycnt < num_sortkeys)
- {
- newclauses =
- find_mergeclauses_for_pathkeys(root,
- trialsortkeys,
- false,
- mergeclauses);
- Assert(newclauses != NIL);
- }
- else
- newclauses = mergeclauses;
- }
try_mergejoin_path(root,
joinrel,
jointype,
--- 1087,1092 ----
*************** match_unsorted_outer(PlannerInfo *root,
*** 1124,1132 ****
innerpath,
restrictlist,
merge_pathkeys,
! newclauses,
! NIL,
! NIL);
}
cheapest_startup_inner = innerpath;
}
--- 1097,1105 ----
innerpath,
restrictlist,
merge_pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys);
}
cheapest_startup_inner = innerpath;
}
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
new file mode 100644
index 9c8ede6..63c0b03
*** a/src/backend/optimizer/path/pathkeys.c
--- b/src/backend/optimizer/path/pathkeys.c
***************
*** 26,31 ****
--- 26,32 ----
#include "optimizer/paths.h"
#include "optimizer/tlist.h"
#include "utils/lsyscache.h"
+ #include "utils/selfuncs.h"
static PathKey *make_canonical_pathkey(PlannerInfo *root,
*************** compare_pathkeys(List *keys1, List *keys
*** 312,317 ****
--- 313,344 ----
}
/*
+ * pathkeys_common
+ * Returns length of longest common prefix of keys1 and keys2.
+ */
+ int
+ pathkeys_common(List *keys1, List *keys2)
+ {
+ int n;
+ ListCell *key1,
+ *key2;
+ n = 0;
+
+ forboth(key1, keys1, key2, keys2)
+ {
+ PathKey *pathkey1 = (PathKey *) lfirst(key1);
+ PathKey *pathkey2 = (PathKey *) lfirst(key2);
+
+ if (pathkey1 != pathkey2)
+ return n;
+ n++;
+ }
+
+ return n;
+ }
+
+
+ /*
* pathkeys_contained_in
* Common special case of compare_pathkeys: we just want to know
* if keys2 are at least as well sorted as keys1.
*************** get_cheapest_path_for_pathkeys(List *pat
*** 368,373 ****
--- 395,421 ----
return matched_path;
}
+ static int
+ compare_bifractional_path_costs(Path *path1, Path *path2,
+ double fraction1, double fraction2)
+ {
+ Cost cost1,
+ cost2;
+
+ if (fraction1 <= 0.0 || fraction1 >= 1.0 ||
+ fraction2 <= 0.0 || fraction2 >= 1.0)
+ return compare_path_costs(path1, path2, TOTAL_COST);
+ cost1 = path1->startup_cost +
+ fraction1 * (path1->total_cost - path1->startup_cost);
+ cost2 = path2->startup_cost +
+ fraction2 * (path2->total_cost - path2->startup_cost);
+ if (cost1 < cost2)
+ return -1;
+ if (cost1 > cost2)
+ return +1;
+ return 0;
+ }
+
/*
* get_cheapest_fractional_path_for_pathkeys
* Find the cheapest path (for retrieving a specified fraction of all
*************** Path *
*** 386,411 ****
get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction)
{
Path *matched_path = NULL;
ListCell *l;
foreach(l, paths)
{
Path *path = (Path *) lfirst(l);
/*
* Since cost comparison is a lot cheaper than pathkey comparison, do
* that first. (XXX is that still true?)
*/
! if (matched_path != NULL &&
! compare_fractional_path_costs(matched_path, path, fraction) <= 0)
! continue;
! if (pathkeys_contained_in(pathkeys, path->pathkeys) &&
bms_is_subset(PATH_REQ_OUTER(path), required_outer))
matched_path = path;
}
return matched_path;
}
--- 434,508 ----
get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction,
! PlannerInfo *root,
! double tuples)
{
Path *matched_path = NULL;
+ int matched_n_common_pathkeys = 0,
+ costs_cmp, n_common_pathkeys,
+ n_pathkeys = list_length(pathkeys);
ListCell *l;
+ List *groupExprs = NIL;
+ double *num_groups, matched_fraction;
+ int i;
+
+ i = 0;
+ num_groups = (double *)palloc(sizeof(double) * list_length(pathkeys));
+ foreach(l, pathkeys)
+ {
+ PathKey *key = (PathKey *)lfirst(l);
+ EquivalenceMember *member = (EquivalenceMember *)
+ lfirst(list_head(key->pk_eclass->ec_members));
+
+ groupExprs = lappend(groupExprs, member->em_expr);
+
+ num_groups[i] = estimate_num_groups(root, groupExprs, tuples);
+ i++;
+ }
+
foreach(l, paths)
{
Path *path = (Path *) lfirst(l);
+ double current_fraction;
+
+ n_common_pathkeys = pathkeys_common(pathkeys, path->pathkeys);
+ if (n_common_pathkeys < matched_n_common_pathkeys ||
+ n_common_pathkeys == 0)
+ continue;
+
+ current_fraction = fraction;
+ if (n_common_pathkeys < n_pathkeys)
+ {
+ current_fraction += 1.0 / num_groups[n_common_pathkeys - 1];
+ current_fraction = Min(current_fraction, 1.0);
+ }
/*
* Since cost comparison is a lot cheaper than pathkey comparison, do
* that first. (XXX is that still true?)
*/
! if (matched_path != NULL)
! {
! costs_cmp = compare_bifractional_path_costs(matched_path, path,
! matched_fraction, current_fraction);
! }
! else
! {
! costs_cmp = 1;
! }
! if ((
! n_common_pathkeys > matched_n_common_pathkeys
! || (n_common_pathkeys == matched_n_common_pathkeys
! && costs_cmp > 0)) &&
bms_is_subset(PATH_REQ_OUTER(path), required_outer))
+ {
matched_path = path;
+ matched_n_common_pathkeys = n_common_pathkeys;
+ matched_fraction = current_fraction;
+ }
}
return matched_path;
}
*************** List *
*** 965,974 ****
find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos)
{
List *mergeclauses = NIL;
ListCell *i;
/* make sure we have eclasses cached in the clauses */
foreach(i, restrictinfos)
--- 1062,1077 ----
find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos,
! RelOptInfo *joinrel,
! List **outersortkeys)
{
List *mergeclauses = NIL;
ListCell *i;
+ bool *used = (bool *)palloc0(sizeof(bool) * list_length(restrictinfos));
+ int k;
+ List *unusedRestrictinfos = NIL;
+ List *usedPathkeys = NIL;
/* make sure we have eclasses cached in the clauses */
foreach(i, restrictinfos)
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1021,1026 ****
--- 1124,1130 ----
* deal with the case in create_mergejoin_plan().
*----------
*/
+ k = 0;
foreach(j, restrictinfos)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(j);
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1033,1039 ****
--- 1137,1147 ----
clause_ec = rinfo->outer_is_left ?
rinfo->right_ec : rinfo->left_ec;
if (clause_ec == pathkey_ec)
+ {
matched_restrictinfos = lappend(matched_restrictinfos, rinfo);
+ used[k] = true;
+ }
+ k++;
}
/*
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1044,1049 ****
--- 1152,1159 ----
if (matched_restrictinfos == NIL)
break;
+ usedPathkeys = lappend(usedPathkeys, pathkey);
+
/*
* If we did find usable mergeclause(s) for this sort-key position,
* add them to result list.
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1051,1056 ****
--- 1161,1201 ----
mergeclauses = list_concat(mergeclauses, matched_restrictinfos);
}
+ if (outersortkeys)
+ {
+ List *addPathkeys, *addMergeclauses;
+
+ *outersortkeys = pathkeys;
+
+ if (!mergeclauses)
+ return mergeclauses;
+
+ k = 0;
+ foreach(i, restrictinfos)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) lfirst(i);
+ if (!used[k])
+ unusedRestrictinfos = lappend(unusedRestrictinfos, rinfo);
+ k++;
+ }
+
+ if (!unusedRestrictinfos)
+ return mergeclauses;
+
+ addPathkeys = select_outer_pathkeys_for_merge(root,
+ unusedRestrictinfos, joinrel);
+
+ if (!addPathkeys)
+ return mergeclauses;
+
+ addMergeclauses = find_mergeclauses_for_pathkeys(root,
+ addPathkeys, true, unusedRestrictinfos, NULL, NULL);
+
+ *outersortkeys = list_concat(usedPathkeys, addPathkeys);
+ mergeclauses = list_concat(mergeclauses, addMergeclauses);
+
+ }
+
return mergeclauses;
}
*************** right_merge_direction(PlannerInfo *root,
*** 1457,1472 ****
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
if (root->query_pathkeys == NIL)
return 0; /* no special ordering requested */
if (pathkeys == NIL)
return 0; /* unordered path */
! if (pathkeys_contained_in(root->query_pathkeys, pathkeys))
{
/* It's useful ... or at least the first N keys are */
! return list_length(root->query_pathkeys);
}
return 0; /* path ordering not useful */
--- 1602,1621 ----
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
+ int n;
+
if (root->query_pathkeys == NIL)
return 0; /* no special ordering requested */
if (pathkeys == NIL)
return 0; /* unordered path */
! n = pathkeys_common(root->query_pathkeys, pathkeys);
!
! if (n != 0)
{
/* It's useful ... or at least the first N keys are */
! return n;
}
return 0; /* path ordering not useful */
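
The two small calculations the pathkeys.c changes rely on can be sketched standalone as follows. Integer ids stand in for canonical PathKey pointers, which the patch compares by identity. I assume the adjusted fraction is meant to be capped at 1.0. common_prefix_length and adjusted_fraction are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the longest common prefix of two key lists. */
static int
common_prefix_length(const int *keys1, size_t n1,
                     const int *keys2, size_t n2)
{
    size_t  i;

    for (i = 0; i < n1 && i < n2; i++)
    {
        if (keys1[i] != keys2[i])
            break;
    }
    return (int) i;
}

/*
 * Fraction of a partially sorted path that has to be read in order to
 * return "fraction" of the result: roughly one extra group, because the
 * group containing the last wanted tuple must be read completely.
 * Capping the result at 1.0 is an assumption of this sketch.
 */
static double
adjusted_fraction(double fraction, double num_groups)
{
    double  result;

    assert(num_groups >= 1.0);

    result = fraction + 1.0 / num_groups;
    return (result > 1.0) ? 1.0 : result;
}
```

For example, an ORDER BY over keys (1, 2, 3) and an index providing (1, 2, 4) share a prefix of length 2, so only the third key would need the extra sort.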
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index f2c122d..a300342
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** static MergeJoin *make_mergejoin(List *t
*** 149,154 ****
--- 149,155 ----
Plan *lefttree, Plan *righttree,
JoinType jointype);
static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
+ List *pathkeys, int skipCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples);
*************** create_merge_append_plan(PlannerInfo *ro
*** 774,779 ****
--- 775,781 ----
Oid *sortOperators;
Oid *collations;
bool *nullsFirst;
+ int n_common_pathkeys;
/* Build the child plan */
subplan = create_plan_recurse(root, subpath);
*************** create_merge_append_plan(PlannerInfo *ro
*** 807,814 ****
numsortkeys * sizeof(bool)) == 0);
/* Now, insert a Sort node if subplan isn't sufficiently ordered */
! if (!pathkeys_contained_in(pathkeys, subpath->pathkeys))
subplan = (Plan *) make_sort(root, subplan, numsortkeys,
sortColIdx, sortOperators,
collations, nullsFirst,
best_path->limit_tuples);
--- 809,818 ----
numsortkeys * sizeof(bool)) == 0);
/* Now, insert a Sort node if subplan isn't sufficiently ordered */
! n_common_pathkeys = pathkeys_common(pathkeys, subpath->pathkeys);
! if (n_common_pathkeys < list_length(pathkeys))
subplan = (Plan *) make_sort(root, subplan, numsortkeys,
+ pathkeys, n_common_pathkeys,
sortColIdx, sortOperators,
collations, nullsFirst,
best_path->limit_tuples);
*************** create_mergejoin_plan(PlannerInfo *root,
*** 2184,2192 ****
disuse_physical_tlist(root, outer_plan, best_path->jpath.outerjoinpath);
outer_plan = (Plan *)
make_sort_from_pathkeys(root,
! outer_plan,
! best_path->outersortkeys,
! -1.0);
outerpathkeys = best_path->outersortkeys;
}
else
--- 2188,2198 ----
disuse_physical_tlist(root, outer_plan, best_path->jpath.outerjoinpath);
outer_plan = (Plan *)
make_sort_from_pathkeys(root,
! outer_plan,
! best_path->outersortkeys,
! -1.0,
! pathkeys_common(best_path->outersortkeys,
! best_path->jpath.outerjoinpath->pathkeys));
outerpathkeys = best_path->outersortkeys;
}
else
*************** create_mergejoin_plan(PlannerInfo *root,
*** 2197,2205 ****
disuse_physical_tlist(root, inner_plan, best_path->jpath.innerjoinpath);
inner_plan = (Plan *)
make_sort_from_pathkeys(root,
! inner_plan,
! best_path->innersortkeys,
! -1.0);
innerpathkeys = best_path->innersortkeys;
}
else
--- 2203,2213 ----
disuse_physical_tlist(root, inner_plan, best_path->jpath.innerjoinpath);
inner_plan = (Plan *)
make_sort_from_pathkeys(root,
! inner_plan,
! best_path->innersortkeys,
! -1.0,
! pathkeys_common(best_path->innersortkeys,
! best_path->jpath.innerjoinpath->pathkeys));
innerpathkeys = best_path->innersortkeys;
}
else
*************** make_mergejoin(List *tlist,
*** 3739,3744 ****
--- 3747,3753 ----
*/
static Sort *
make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
+ List *pathkeys, int skipCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples)
*************** make_sort(PlannerInfo *root, Plan *leftt
*** 3748,3754 ****
Path sort_path; /* dummy for result of cost_sort */
copy_plan_costsize(plan, lefttree); /* only care about copying size */
! cost_sort(&sort_path, root, NIL,
lefttree->total_cost,
lefttree->plan_rows,
lefttree->plan_width,
--- 3757,3764 ----
Path sort_path; /* dummy for result of cost_sort */
copy_plan_costsize(plan, lefttree); /* only care about copying size */
! cost_sort(&sort_path, root, pathkeys, skipCols,
! lefttree->startup_cost,
lefttree->total_cost,
lefttree->plan_rows,
lefttree->plan_width,
*************** make_sort(PlannerInfo *root, Plan *leftt
*** 3762,3767 ****
--- 3772,3778 ----
plan->lefttree = lefttree;
plan->righttree = NULL;
node->numCols = numCols;
+ node->skipCols = skipCols;
node->sortColIdx = sortColIdx;
node->sortOperators = sortOperators;
node->collations = collations;
*************** find_ec_member_for_tle(EquivalenceClass
*** 4090,4096 ****
*/
Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
! double limit_tuples)
{
int numsortkeys;
AttrNumber *sortColIdx;
--- 4101,4107 ----
*/
Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
! double limit_tuples, int skipCols)
{
int numsortkeys;
AttrNumber *sortColIdx;
*************** make_sort_from_pathkeys(PlannerInfo *roo
*** 4110,4116 ****
&nullsFirst);
/* Now build the Sort node */
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, limit_tuples);
}
--- 4121,4127 ----
&nullsFirst);
/* Now build the Sort node */
! return make_sort(root, lefttree, numsortkeys, pathkeys, skipCols,
sortColIdx, sortOperators, collations,
nullsFirst, limit_tuples);
}
*************** make_sort_from_sortclauses(PlannerInfo *
*** 4153,4159 ****
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
--- 4164,4170 ----
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys, NIL, 0,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
*************** Sort *
*** 4175,4181 ****
make_sort_from_groupcols(PlannerInfo *root,
List *groupcls,
AttrNumber *grpColIdx,
! Plan *lefttree)
{
List *sub_tlist = lefttree->targetlist;
ListCell *l;
--- 4186,4193 ----
make_sort_from_groupcols(PlannerInfo *root,
List *groupcls,
AttrNumber *grpColIdx,
! Plan *lefttree,
! List *pathkeys, int skipCols)
{
List *sub_tlist = lefttree->targetlist;
ListCell *l;
*************** make_sort_from_groupcols(PlannerInfo *ro
*** 4208,4214 ****
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
--- 4220,4226 ----
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys, pathkeys, skipCols,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
new file mode 100644
index 53fc238..4675402
*** a/src/backend/optimizer/plan/planagg.c
--- b/src/backend/optimizer/plan/planagg.c
*************** build_minmax_path(PlannerInfo *root, Min
*** 494,500 ****
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
subroot->query_pathkeys,
NULL,
! path_fraction);
if (!sorted_path)
return false;
--- 494,502 ----
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
subroot->query_pathkeys,
NULL,
! path_fraction,
! subroot,
! final_rel->rows);
if (!sorted_path)
return false;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 1da4b2f..df5563a
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** grouping_planner(PlannerInfo *root, doub
*** 1349,1355 ****
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
root->query_pathkeys,
NULL,
! tuple_fraction);
/* Don't consider same path in both guises; just wastes effort */
if (sorted_path == cheapest_path)
--- 1349,1357 ----
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
root->query_pathkeys,
NULL,
! tuple_fraction,
! root,
! path_rows);
/* Don't consider same path in both guises; just wastes effort */
if (sorted_path == cheapest_path)
*************** grouping_planner(PlannerInfo *root, doub
*** 1365,1374 ****
if (sorted_path)
{
Path sort_path; /* dummy for result of cost_sort */
if (root->query_pathkeys == NIL ||
! pathkeys_contained_in(root->query_pathkeys,
! cheapest_path->pathkeys))
{
/* No sort needed for cheapest path */
sort_path.startup_cost = cheapest_path->startup_cost;
--- 1367,1380 ----
if (sorted_path)
{
Path sort_path; /* dummy for result of cost_sort */
+ Path partial_sort_path; /* dummy for result of cost_sort */
+ int n_common_pathkeys;
+
+ n_common_pathkeys = pathkeys_common(root->query_pathkeys,
+ cheapest_path->pathkeys);
if (root->query_pathkeys == NIL ||
! n_common_pathkeys == list_length(root->query_pathkeys))
{
/* No sort needed for cheapest path */
sort_path.startup_cost = cheapest_path->startup_cost;
*************** grouping_planner(PlannerInfo *root, doub
*** 1378,1389 ****
{
/* Figure cost for sorting */
cost_sort(&sort_path, root, root->query_pathkeys,
cheapest_path->total_cost,
path_rows, path_width,
0.0, work_mem, root->limit_tuples);
}
! if (compare_fractional_path_costs(sorted_path, &sort_path,
tuple_fraction) > 0)
{
/* Presorted path is a loser */
--- 1384,1418 ----
{
/* Figure cost for sorting */
cost_sort(&sort_path, root, root->query_pathkeys,
+ n_common_pathkeys,
+ cheapest_path->startup_cost,
cheapest_path->total_cost,
path_rows, path_width,
0.0, work_mem, root->limit_tuples);
}
! n_common_pathkeys = pathkeys_common(root->query_pathkeys,
! sorted_path->pathkeys);
!
! if (root->query_pathkeys == NIL ||
! n_common_pathkeys == list_length(root->query_pathkeys))
! {
! /* No sort needed for presorted path */
! partial_sort_path.startup_cost = sorted_path->startup_cost;
! partial_sort_path.total_cost = sorted_path->total_cost;
! }
! else
! {
! /* Figure cost for sorting */
! cost_sort(&partial_sort_path, root, root->query_pathkeys,
! n_common_pathkeys,
! sorted_path->startup_cost,
! sorted_path->total_cost,
! path_rows, path_width,
! 0.0, work_mem, root->limit_tuples);
! }
!
! if (compare_fractional_path_costs(&partial_sort_path, &sort_path,
tuple_fraction) > 0)
{
/* Presorted path is a loser */
*************** grouping_planner(PlannerInfo *root, doub
*** 1464,1476 ****
* results.
*/
bool need_sort_for_grouping = false;
result_plan = create_plan(root, best_path);
current_pathkeys = best_path->pathkeys;
/* Detect if we'll need an explicit sort for grouping */
if (parse->groupClause && !use_hashed_grouping &&
! !pathkeys_contained_in(root->group_pathkeys, current_pathkeys))
{
need_sort_for_grouping = true;
--- 1493,1508 ----
* results.
*/
bool need_sort_for_grouping = false;
+ int n_common_pathkeys_grouping;
result_plan = create_plan(root, best_path);
current_pathkeys = best_path->pathkeys;
/* Detect if we'll need an explicit sort for grouping */
+ n_common_pathkeys_grouping = pathkeys_common(root->group_pathkeys,
+ current_pathkeys);
if (parse->groupClause && !use_hashed_grouping &&
! n_common_pathkeys_grouping < list_length(root->group_pathkeys))
{
need_sort_for_grouping = true;
*************** grouping_planner(PlannerInfo *root, doub
*** 1564,1570 ****
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan);
current_pathkeys = root->group_pathkeys;
}
aggstrategy = AGG_SORTED;
--- 1596,1604 ----
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan,
! root->group_pathkeys,
! n_common_pathkeys_grouping);
current_pathkeys = root->group_pathkeys;
}
aggstrategy = AGG_SORTED;
*************** grouping_planner(PlannerInfo *root, doub
*** 1607,1613 ****
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan);
current_pathkeys = root->group_pathkeys;
}
--- 1641,1649 ----
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan,
! root->group_pathkeys,
! n_common_pathkeys_grouping);
current_pathkeys = root->group_pathkeys;
}
*************** grouping_planner(PlannerInfo *root, doub
*** 1724,1736 ****
if (window_pathkeys)
{
Sort *sort_plan;
sort_plan = make_sort_from_pathkeys(root,
result_plan,
window_pathkeys,
! -1.0);
! if (!pathkeys_contained_in(window_pathkeys,
! current_pathkeys))
{
/* we do indeed need to sort */
result_plan = (Plan *) sort_plan;
--- 1760,1776 ----
if (window_pathkeys)
{
Sort *sort_plan;
+ int n_common_pathkeys;
+
+ n_common_pathkeys = pathkeys_common(window_pathkeys,
+ current_pathkeys);
sort_plan = make_sort_from_pathkeys(root,
result_plan,
window_pathkeys,
! -1.0,
! n_common_pathkeys);
! if (n_common_pathkeys < list_length(window_pathkeys))
{
/* we do indeed need to sort */
result_plan = (Plan *) sort_plan;
*************** grouping_planner(PlannerInfo *root, doub
*** 1876,1894 ****
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
! current_pathkeys = root->distinct_pathkeys;
else
{
! current_pathkeys = root->sort_pathkeys;
/* Assert checks that parser didn't mess up... */
Assert(pathkeys_contained_in(root->distinct_pathkeys,
! current_pathkeys));
}
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
! current_pathkeys,
! -1.0);
}
result_plan = (Plan *) make_unique(result_plan,
--- 1916,1936 ----
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
! needed_pathkeys = root->distinct_pathkeys;
else
{
! needed_pathkeys = root->sort_pathkeys;
/* Assert checks that parser didn't mess up... */
Assert(pathkeys_contained_in(root->distinct_pathkeys,
! needed_pathkeys));
}
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
! needed_pathkeys,
! -1.0,
! pathkeys_common(needed_pathkeys, current_pathkeys));
! current_pathkeys = needed_pathkeys;
}
result_plan = (Plan *) make_unique(result_plan,
*************** grouping_planner(PlannerInfo *root, doub
*** 1904,1915 ****
*/
if (parse->sortClause)
{
! if (!pathkeys_contained_in(root->sort_pathkeys, current_pathkeys))
{
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
root->sort_pathkeys,
! limit_tuples);
current_pathkeys = root->sort_pathkeys;
}
}
--- 1946,1960 ----
*/
if (parse->sortClause)
{
! int common = pathkeys_common(root->sort_pathkeys, current_pathkeys);
!
! if (common < list_length(root->sort_pathkeys))
{
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
root->sort_pathkeys,
! limit_tuples,
! common);
current_pathkeys = root->sort_pathkeys;
}
}
*************** choose_hashed_grouping(PlannerInfo *root
*** 2654,2659 ****
--- 2699,2705 ----
List *current_pathkeys;
Path hashed_p;
Path sorted_p;
+ int n_common_pathkeys;
/*
* Executor doesn't support hashed aggregation with DISTINCT or ORDER BY
*************** choose_hashed_grouping(PlannerInfo *root
*** 2735,2741 ****
path_rows);
/* Result of hashed agg is always unsorted */
if (target_pathkeys)
! cost_sort(&hashed_p, root, target_pathkeys, hashed_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
--- 2781,2788 ----
path_rows);
/* Result of hashed agg is always unsorted */
if (target_pathkeys)
! cost_sort(&hashed_p, root, target_pathkeys, 0,
! hashed_p.startup_cost, hashed_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_grouping(PlannerInfo *root
*** 2751,2759 ****
sorted_p.total_cost = cheapest_path->total_cost;
current_pathkeys = cheapest_path->pathkeys;
}
! if (!pathkeys_contained_in(root->group_pathkeys, current_pathkeys))
{
! cost_sort(&sorted_p, root, root->group_pathkeys, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
current_pathkeys = root->group_pathkeys;
--- 2798,2809 ----
sorted_p.total_cost = cheapest_path->total_cost;
current_pathkeys = cheapest_path->pathkeys;
}
!
! n_common_pathkeys = pathkeys_common(root->group_pathkeys, current_pathkeys);
! if (n_common_pathkeys < list_length(root->group_pathkeys))
{
! cost_sort(&sorted_p, root, root->group_pathkeys,
! n_common_pathkeys, sorted_p.startup_cost, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
current_pathkeys = root->group_pathkeys;
*************** choose_hashed_grouping(PlannerInfo *root
*** 2768,2777 ****
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
/* The Agg or Group node will preserve ordering */
! if (target_pathkeys &&
! !pathkeys_contained_in(target_pathkeys, current_pathkeys))
! cost_sort(&sorted_p, root, target_pathkeys, sorted_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
--- 2818,2829 ----
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
+
/* The Agg or Group node will preserve ordering */
! n_common_pathkeys = pathkeys_common(target_pathkeys, current_pathkeys);
! if (target_pathkeys && n_common_pathkeys < list_length(target_pathkeys))
! cost_sort(&sorted_p, root, target_pathkeys, n_common_pathkeys,
! sorted_p.startup_cost, sorted_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_distinct(PlannerInfo *root
*** 2824,2829 ****
--- 2876,2882 ----
List *needed_pathkeys;
Path hashed_p;
Path sorted_p;
+ int n_common_pathkeys;
/*
* If we have a sortable DISTINCT ON clause, we always use sorting. This
*************** choose_hashed_distinct(PlannerInfo *root
*** 2889,2895 ****
* need to charge for the final sort.
*/
if (parse->sortClause)
! cost_sort(&hashed_p, root, root->sort_pathkeys, hashed_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
--- 2942,2949 ----
* need to charge for the final sort.
*/
if (parse->sortClause)
! cost_sort(&hashed_p, root, root->sort_pathkeys, 0,
! hashed_p.startup_cost, hashed_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_distinct(PlannerInfo *root
*** 2906,2928 ****
needed_pathkeys = root->sort_pathkeys;
else
needed_pathkeys = root->distinct_pathkeys;
! if (!pathkeys_contained_in(needed_pathkeys, current_pathkeys))
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
current_pathkeys = root->distinct_pathkeys;
else
current_pathkeys = root->sort_pathkeys;
! cost_sort(&sorted_p, root, current_pathkeys, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
}
cost_group(&sorted_p, root, numDistinctCols, dNumDistinctRows,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
if (parse->sortClause &&
! !pathkeys_contained_in(root->sort_pathkeys, current_pathkeys))
! cost_sort(&sorted_p, root, root->sort_pathkeys, sorted_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
--- 2960,2989 ----
needed_pathkeys = root->sort_pathkeys;
else
needed_pathkeys = root->distinct_pathkeys;
!
! n_common_pathkeys = pathkeys_common(needed_pathkeys, current_pathkeys);
! if (n_common_pathkeys < list_length(needed_pathkeys))
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
current_pathkeys = root->distinct_pathkeys;
else
current_pathkeys = root->sort_pathkeys;
! cost_sort(&sorted_p, root, current_pathkeys,
! n_common_pathkeys, sorted_p.startup_cost, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
}
cost_group(&sorted_p, root, numDistinctCols, dNumDistinctRows,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
+
+
+ n_common_pathkeys = pathkeys_common(root->sort_pathkeys, current_pathkeys);
if (parse->sortClause &&
! n_common_pathkeys < list_length(root->sort_pathkeys))
! cost_sort(&sorted_p, root, root->sort_pathkeys, n_common_pathkeys,
! sorted_p.startup_cost, sorted_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
*************** plan_cluster_use_sort(Oid tableOid, Oid
*** 3712,3719 ****
/* Estimate the cost of seq scan + sort */
seqScanPath = create_seqscan_path(root, rel, NULL);
! cost_sort(&seqScanAndSortPath, root, NIL,
! seqScanPath->total_cost, rel->tuples, rel->width,
comparisonCost, maintenance_work_mem, -1.0);
/* Estimate the cost of index scan */
--- 3773,3781 ----
/* Estimate the cost of seq scan + sort */
seqScanPath = create_seqscan_path(root, rel, NULL);
! cost_sort(&seqScanAndSortPath, root, NIL, 0,
! seqScanPath->startup_cost, seqScanPath->total_cost,
! rel->tuples, rel->width,
comparisonCost, maintenance_work_mem, -1.0);
/* Estimate the cost of index scan */
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
new file mode 100644
index e249628..b0b5471
*** a/src/backend/optimizer/prep/prepunion.c
--- b/src/backend/optimizer/prep/prepunion.c
*************** choose_hashed_setop(PlannerInfo *root, L
*** 859,865 ****
sorted_p.startup_cost = input_plan->startup_cost;
sorted_p.total_cost = input_plan->total_cost;
/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
! cost_sort(&sorted_p, root, NIL, sorted_p.total_cost,
input_plan->plan_rows, input_plan->plan_width,
0.0, work_mem, -1.0);
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
--- 859,866 ----
sorted_p.startup_cost = input_plan->startup_cost;
sorted_p.total_cost = input_plan->total_cost;
/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
! cost_sort(&sorted_p, root, NIL, 0,
! sorted_p.startup_cost, sorted_p.total_cost,
input_plan->plan_rows, input_plan->plan_width,
0.0, work_mem, -1.0);
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
new file mode 100644
index a7169ef..3d0a842
*** a/src/backend/optimizer/util/pathnode.c
--- b/src/backend/optimizer/util/pathnode.c
*************** create_merge_append_path(PlannerInfo *ro
*** 971,980 ****
foreach(l, subpaths)
{
Path *subpath = (Path *) lfirst(l);
pathnode->path.rows += subpath->rows;
! if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
{
/* Subpath is adequately ordered, we won't need to sort it */
input_startup_cost += subpath->startup_cost;
--- 971,981 ----
foreach(l, subpaths)
{
Path *subpath = (Path *) lfirst(l);
+ int n_common_pathkeys = pathkeys_common(pathkeys, subpath->pathkeys);
pathnode->path.rows += subpath->rows;
! if (n_common_pathkeys == list_length(pathkeys))
{
/* Subpath is adequately ordered, we won't need to sort it */
input_startup_cost += subpath->startup_cost;
*************** create_merge_append_path(PlannerInfo *ro
*** 988,993 ****
--- 989,996 ----
cost_sort(&sort_path,
root,
pathkeys,
+ n_common_pathkeys,
+ subpath->startup_cost,
subpath->total_cost,
subpath->parent->tuples,
subpath->parent->width,
*************** create_unique_path(PlannerInfo *root, Re
*** 1343,1349 ****
/*
* Estimate cost for sort+unique implementation
*/
! cost_sort(&sort_path, root, NIL,
subpath->total_cost,
rel->rows,
rel->width,
--- 1346,1353 ----
/*
* Estimate cost for sort+unique implementation
*/
! cost_sort(&sort_path, root, NIL, 0,
! subpath->startup_cost,
subpath->total_cost,
rel->rows,
rel->width,
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
new file mode 100644
index 52f05e6..6a09138
*** a/src/backend/utils/sort/tuplesort.c
--- b/src/backend/utils/sort/tuplesort.c
*************** free_sort_tuple(Tuplesortstate *state, S
*** 3525,3527 ****
--- 3525,3534 ----
FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
pfree(stup->tuple);
}
+
+ SortSupport
+ tuplesort_get_sortkeys(Tuplesortstate *state)
+ {
+ return state->sortKeys;
+ }
+
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index 2a7b36e..76aab79
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
*************** typedef struct SortState
*** 1664,1671 ****
--- 1664,1673 ----
int64 bound; /* if bounded, how many tuples are needed */
bool sort_Done; /* sort completed yet? */
bool bounded_Done; /* value of bounded we did the sort with */
+ bool finished;
int64 bound_Done; /* value of bound we did the sort with */
void *tuplesortstate; /* private state of tuplesort.c */
+ HeapTuple prev;
} SortState;
/* ---------------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 101e22c..28b871e
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef struct Sort
*** 582,587 ****
--- 582,588 ----
{
Plan plan;
int numCols; /* number of sort-key columns */
+ int skipCols; /* number of leading sort-key columns already sorted in input */
AttrNumber *sortColIdx; /* their indexes in the target list */
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
new file mode 100644
index 444ab74..e98fb0c
*** a/src/include/optimizer/cost.h
--- b/src/include/optimizer/cost.h
*************** extern void cost_ctescan(Path *path, Pla
*** 88,95 ****
RelOptInfo *baserel, ParamPathInfo *param_info);
extern void cost_recursive_union(Plan *runion, Plan *nrterm, Plan *rterm);
extern void cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, Cost input_cost, double tuples, int width,
! Cost comparison_cost, int sort_mem,
double limit_tuples);
extern void cost_merge_append(Path *path, PlannerInfo *root,
List *pathkeys, int n_streams,
--- 88,96 ----
RelOptInfo *baserel, ParamPathInfo *param_info);
extern void cost_recursive_union(Plan *runion, Plan *nrterm, Plan *rterm);
extern void cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, int presorted_keys,
! Cost input_startup_cost, Cost input_total_cost,
! double tuples, int width, Cost comparison_cost, int sort_mem,
double limit_tuples);
extern void cost_merge_append(Path *path, PlannerInfo *root,
List *pathkeys, int n_streams,
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
new file mode 100644
index 999adaa..043641d
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** typedef enum
*** 157,169 ****
extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2);
extern bool pathkeys_contained_in(List *keys1, List *keys2);
extern Path *get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
Relids required_outer,
CostSelector cost_criterion);
extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction);
extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
ScanDirection scandir);
extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr,
--- 157,172 ----
extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2);
extern bool pathkeys_contained_in(List *keys1, List *keys2);
+ extern int pathkeys_common(List *keys1, List *keys2);
extern Path *get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
Relids required_outer,
CostSelector cost_criterion);
extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction,
! PlannerInfo *root,
! double tuples);
extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
ScanDirection scandir);
extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr,
*************** extern void update_mergeclause_eclasses(
*** 185,191 ****
extern List *find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos);
extern List *select_outer_pathkeys_for_merge(PlannerInfo *root,
List *mergeclauses,
RelOptInfo *joinrel);
--- 188,196 ----
extern List *find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos,
! RelOptInfo *joinrel,
! List **outerpathkeys);
extern List *select_outer_pathkeys_for_merge(PlannerInfo *root,
List *mergeclauses,
RelOptInfo *joinrel);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
new file mode 100644
index ba7ae7c..d33c615
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
*************** extern RecursiveUnion *make_recursive_un
*** 50,60 ****
Plan *lefttree, Plan *righttree, int wtParam,
List *distinctList, long numGroups);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
! List *pathkeys, double limit_tuples);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
! AttrNumber *grpColIdx, Plan *lefttree);
extern Agg *make_agg(PlannerInfo *root, List *tlist, List *qual,
AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
--- 50,61 ----
Plan *lefttree, Plan *righttree, int wtParam,
List *distinctList, long numGroups);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
! List *pathkeys, double limit_tuples, int skipCols);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
! AttrNumber *grpColIdx, Plan *lefttree, List *pathkeys,
! int skipCols);
extern Agg *make_agg(PlannerInfo *root, List *tlist, List *qual,
AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
new file mode 100644
index 5f87254..5a65cd2
*** a/src/include/utils/tuplesort.h
--- b/src/include/utils/tuplesort.h
***************
*** 24,29 ****
--- 24,30 ----
#include "executor/tuptable.h"
#include "fmgr.h"
#include "utils/relcache.h"
+ #include "utils/sortsupport.h"
/* Tuplesortstate is an opaque type whose details are not known outside
*************** extern void tuplesort_get_stats(Tuplesor
*** 111,116 ****
--- 112,119 ----
extern int tuplesort_merge_order(int64 allowedMem);
+ extern SortSupport tuplesort_get_sortkeys(Tuplesortstate *state);
+
/*
* These routines may only be called if randomAccess was specified 'true'.
* Likewise, backwards scan in gettuple/getdatum is only allowed if
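Most hunks above replace a `pathkeys_contained_in()` test with a comparison of `pathkeys_common()` against `list_length()` of the required keys. The body of `pathkeys_common()` is not part of this excerpt, so the following is only a rough sketch of the idea, assuming it returns the length of the longest common prefix of the two key lists. Plain `int` arrays stand in for PostgreSQL's `List` of pathkeys, and the names `common_prefix_len` and `needs_sort` are hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for pathkeys_common(): count how many leading
 * elements of "required" and "available" are equal.  Integers play the
 * role of canonical pathkeys here (an assumption made for illustration).
 */
static int
common_prefix_len(const int *required, int nrequired,
                  const int *available, int navailable)
{
    int n = 0;

    assert(nrequired >= 0 && navailable >= 0);

    /* Stop at the first mismatch or at the end of the shorter list. */
    while (n < nrequired && n < navailable && required[n] == available[n])
        n++;
    return n;
}

/*
 * A sort step is needed unless every required key is already provided.
 * When 0 < ncommon < nrequired, only the trailing keys need sorting
 * within groups that are equal on the first ncommon keys.
 */
static int
needs_sort(int ncommon, int nrequired)
{
    return ncommon < nrequired;
}
```

With required keys (v1, v2) and an input ordered by (v1) only, the prefix length is 1 and `needs_sort(1, 2)` is true, which is the case where the patch passes a non-zero `skipCols` to the sort node.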
On Sat, Dec 28, 2013 at 9:28 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Tue, Dec 24, 2013 at 6:02 AM, Andreas Karlsson <andreas@proxel.se> wrote:
Attached revision of patch implements partial sort usage in merge joins.
I'm looking forward to doing a bit of testing on this patch. I think it is
a really useful feature to get a bit more out of existing indexes.
I was about to test it tonight, but I'm having trouble getting the patch to
compile. I'm really wondering which compiler you are using, as it seems
you're declaring your variables in some strange places. See nodeSort.c
line 101: these variables are declared after an if statement in the same
scope. That's not valid in C89. (The patch did, however, apply without any
complaints.)
Here's a list of the errors I get when compiling with Visual Studio on
Windows.
"D:\Postgres\c\pgsql.sln" (default target) (1) ->
"D:\Postgres\c\postgres.vcxproj" (default target) (2) ->
(ClCompile target) ->
src\backend\executor\nodeSort.c(101): error C2275: 'Sort' : illegal use
of this type as an expression [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(101): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(102): error C2275: 'PlanState' : illegal
use of this type as an expression [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(102): error C2065: 'outerNode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(103): error C2275: 'TupleDesc' : illegal
use of this type as an expression [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(103): error C2146: syntax error : missing
';' before identifier 'tupDesc' [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(103): error C2065: 'tupDesc' : undeclared
identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(120): error C2065: 'outerNode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(121): error C2065: 'tupDesc' : undeclared
identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(121): error C2065: 'outerNode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(125): error C2065: 'tupDesc' : undeclared
identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(126): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(126): error C2223: left of '->numCols'
must point to struct/union [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(127): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(127): error C2223: left of '->sortColIdx'
must point to struct/union [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(128): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(128): error C2223: left of
'->sortOperators' must point to struct/union
[D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(129): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(129): error C2223: left of '->collations'
must point to struct/union [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(130): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(130): error C2223: left of '->nullsFirst'
must point to struct/union [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(132): error C2198: 'tuplesort_begin_heap'
: too few arguments for call [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(143): error C2065: 'outerNode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(167): error C2065: 'tupDesc' : undeclared
identifier [D:\Postgres\c\postgres.vcxproj]
13 Warning(s)
24 Error(s)
Regards
David Rowley
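The declaration-after-statement problem described in this message can be illustrated with a minimal sketch. This is not the nodeSort.c code; the function names are made up for the example. C89 requires every declaration in a block to precede the first statement, while C99 allows them to be mixed, so a C99 compiler accepts the first function below and a strict C89 compiler rejects it.

```c
#include <stddef.h>

/* C99 style: "count" and "i" are declared after the if statement. */
static size_t
count_nonzero_c99(const int *values, size_t n)
{
    if (values == NULL)
        return 0;

    size_t count = 0;           /* declaration after a statement */
    size_t i;

    for (i = 0; i < n; i++)
    {
        if (values[i] != 0)
            count++;
    }
    return count;
}

/* C89 style: all declarations come first, so any C compiler accepts it. */
static size_t
count_nonzero_c89(const int *values, size_t n)
{
    size_t count = 0;
    size_t i;

    if (values == NULL)
        return 0;

    for (i = 0; i < n; i++)
    {
        if (values[i] != 0)
            count++;
    }
    return count;
}
```

Both functions behave identically; only the placement of the declarations differs. Moving the declarations to the top of the enclosing block is the usual fix.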
On Sat, Dec 28, 2013 at 1:04 PM, David Rowley <dgrowleyml@gmail.com> wrote:
I've compiled it with clang. Yeah, there were mixed declarations. I've
rechecked it with gcc, and now it gives no warnings. I haven't tried it with
Visual Studio, but I hope it will be OK.
------
With best regards,
Alexander Korotkov.
Attachments:
partial-sort-4.patch (application/octet-stream)
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
new file mode 100644
index 9969a25..07cb66d
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
*************** static void show_agg_keys(AggState *asta
*** 81,87 ****
static void show_group_keys(GroupState *gstate, List *ancestors,
ExplainState *es);
static void show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, AttrNumber *keycols,
List *ancestors, ExplainState *es);
static void show_sort_info(SortState *sortstate, ExplainState *es);
static void show_hash_info(HashState *hashstate, ExplainState *es);
--- 81,87 ----
static void show_group_keys(GroupState *gstate, List *ancestors,
ExplainState *es);
static void show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, int nPresortedKeys, AttrNumber *keycols,
List *ancestors, ExplainState *es);
static void show_sort_info(SortState *sortstate, ExplainState *es);
static void show_hash_info(HashState *hashstate, ExplainState *es);
*************** ExplainNode(PlanState *planstate, List *
*** 905,911 ****
pname = sname = "Materialize";
break;
case T_Sort:
! pname = sname = "Sort";
break;
case T_Group:
pname = sname = "Group";
--- 905,914 ----
pname = sname = "Materialize";
break;
case T_Sort:
! if (((Sort *) plan)->skipCols > 0)
! pname = sname = "Partial sort";
! else
! pname = sname = "Sort";
break;
case T_Group:
pname = sname = "Group";
*************** show_sort_keys(SortState *sortstate, Lis
*** 1705,1711 ****
Sort *plan = (Sort *) sortstate->ss.ps.plan;
show_sort_group_keys((PlanState *) sortstate, "Sort Key",
! plan->numCols, plan->sortColIdx,
ancestors, es);
}
--- 1708,1714 ----
Sort *plan = (Sort *) sortstate->ss.ps.plan;
show_sort_group_keys((PlanState *) sortstate, "Sort Key",
! plan->numCols, plan->skipCols, plan->sortColIdx,
ancestors, es);
}
*************** show_merge_append_keys(MergeAppendState
*** 1719,1725 ****
MergeAppend *plan = (MergeAppend *) mstate->ps.plan;
show_sort_group_keys((PlanState *) mstate, "Sort Key",
! plan->numCols, plan->sortColIdx,
ancestors, es);
}
--- 1722,1728 ----
MergeAppend *plan = (MergeAppend *) mstate->ps.plan;
show_sort_group_keys((PlanState *) mstate, "Sort Key",
! plan->numCols, 0, plan->sortColIdx,
ancestors, es);
}
*************** show_agg_keys(AggState *astate, List *an
*** 1737,1743 ****
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(astate, ancestors);
show_sort_group_keys(outerPlanState(astate), "Group Key",
! plan->numCols, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
--- 1740,1746 ----
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(astate, ancestors);
show_sort_group_keys(outerPlanState(astate), "Group Key",
! plan->numCols, 0, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
*************** show_group_keys(GroupState *gstate, List
*** 1755,1761 ****
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(gstate, ancestors);
show_sort_group_keys(outerPlanState(gstate), "Group Key",
! plan->numCols, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
--- 1758,1764 ----
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(gstate, ancestors);
show_sort_group_keys(outerPlanState(gstate), "Group Key",
! plan->numCols, 0, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
*************** show_group_keys(GroupState *gstate, List
*** 1765,1777 ****
* as arrays of targetlist indexes
*/
static void
! show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, AttrNumber *keycols,
List *ancestors, ExplainState *es)
{
Plan *plan = planstate->plan;
List *context;
! List *result = NIL;
bool useprefix;
int keyno;
char *exprstr;
--- 1768,1781 ----
* as arrays of targetlist indexes
*/
static void
! show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, int nPresortedKeys, AttrNumber *keycols,
List *ancestors, ExplainState *es)
{
Plan *plan = planstate->plan;
List *context;
! List *resultSort = NIL;
! List *resultPresorted = NIL;
bool useprefix;
int keyno;
char *exprstr;
*************** show_sort_group_keys(PlanState *planstat
*** 1798,1807 ****
/* Deparse the expression, showing any top-level cast */
exprstr = deparse_expression((Node *) target->expr, context,
useprefix, true);
! result = lappend(result, exprstr);
}
! ExplainPropertyList(qlabel, result, es);
}
/*
--- 1802,1816 ----
/* Deparse the expression, showing any top-level cast */
exprstr = deparse_expression((Node *) target->expr, context,
useprefix, true);
!
! if (keyno < nPresortedKeys)
! resultPresorted = lappend(resultPresorted, exprstr);
! resultSort = lappend(resultSort, exprstr);
}
! ExplainPropertyList(qlabel, resultSort, es);
! if (nPresortedKeys > 0)
! ExplainPropertyList("Presorted Key", resultPresorted, es);
}
/*
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
new file mode 100644
index 09b2eb0..02dcd7a
*** a/src/backend/executor/nodeSort.c
--- b/src/backend/executor/nodeSort.c
***************
*** 15,25 ****
--- 15,52 ----
#include "postgres.h"
+ #include "access/htup_details.h"
#include "executor/execdebug.h"
#include "executor/nodeSort.h"
#include "miscadmin.h"
#include "utils/tuplesort.h"
+ /*
+ * Check if first "skipCols" sort values are equal.
+ */
+ static bool
+ cmpSortSkipCols(SortState *node, TupleDesc tupDesc, HeapTuple a, TupleTableSlot *b)
+ {
+ int n = ((Sort *)node->ss.ps.plan)->skipCols, i;
+ SortSupport sortKeys = tuplesort_get_sortkeys(node->tuplesortstate);
+
+ for (i = 0; i < n; i++)
+ {
+ Datum datumA, datumB;
+ bool isnullA, isnullB;
+ AttrNumber attno = sortKeys[i].ssup_attno;
+
+ datumA = heap_getattr(a, attno, tupDesc, &isnullA);
+ datumB = slot_getattr(b, attno, &isnullB);
+
+ if (ApplySortComparator(datumA, isnullA,
+ datumB, isnullB,
+ &sortKeys[i]))
+ return false;
+ }
+ return true;
+ }
+
/* ----------------------------------------------------------------
* ExecSort
*************** ExecSort(SortState *node)
*** 42,47 ****
--- 69,78 ----
ScanDirection dir;
Tuplesortstate *tuplesortstate;
TupleTableSlot *slot;
+ Sort *plannode = (Sort *) node->ss.ps.plan;
+ PlanState *outerNode;
+ TupleDesc tupDesc;
+ int skipCols = plannode->skipCols;
/*
* get state info from node
*************** ExecSort(SortState *node)
*** 54,131 ****
tuplesortstate = (Tuplesortstate *) node->tuplesortstate;
/*
* If first time through, read all tuples from outer plan and pass them to
* tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
*/
! if (!node->sort_Done)
! {
! Sort *plannode = (Sort *) node->ss.ps.plan;
! PlanState *outerNode;
! TupleDesc tupDesc;
!
! SO1_printf("ExecSort: %s\n",
! "sorting subplan");
! /*
! * Want to scan subplan in the forward direction while creating the
! * sorted data.
! */
! estate->es_direction = ForwardScanDirection;
! /*
! * Initialize tuplesort module.
! */
! SO1_printf("ExecSort: %s\n",
! "calling tuplesort_begin");
! outerNode = outerPlanState(node);
! tupDesc = ExecGetResultType(outerNode);
! tuplesortstate = tuplesort_begin_heap(tupDesc,
! plannode->numCols,
! plannode->sortColIdx,
! plannode->sortOperators,
! plannode->collations,
! plannode->nullsFirst,
! work_mem,
! node->randomAccess);
! if (node->bounded)
! tuplesort_set_bound(tuplesortstate, node->bound);
! node->tuplesortstate = (void *) tuplesortstate;
! /*
! * Scan the subplan and feed all the tuples to tuplesort.
! */
! for (;;)
{
- slot = ExecProcNode(outerNode);
-
if (TupIsNull(slot))
break;
!
tuplesort_puttupleslot(tuplesortstate, slot);
}
! /*
! * Complete the sort.
! */
! tuplesort_performsort(tuplesortstate);
! /*
! * restore to user specified direction
! */
! estate->es_direction = dir;
! /*
! * finally set the sorted flag to true
! */
! node->sort_Done = true;
! node->bounded_Done = node->bounded;
! node->bound_Done = node->bound;
! SO1_printf("ExecSort: %s\n", "sorting done");
! }
SO1_printf("ExecSort: %s\n",
"retrieving tuple from tuplesort");
--- 85,205 ----
tuplesortstate = (Tuplesortstate *) node->tuplesortstate;
/*
+ * Return next tuple from sorted set if any.
+ */
+ if (node->sort_Done)
+ {
+ slot = node->ss.ps.ps_ResultTupleSlot;
+ if (tuplesort_gettupleslot(tuplesortstate,
+ ScanDirectionIsForward(dir),
+ slot) || node->finished)
+ return slot;
+ }
+
+ /*
* If first time through, read all tuples from outer plan and pass them to
* tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
*/
! SO1_printf("ExecSort: %s\n",
! "sorting subplan");
! /*
! * Want to scan subplan in the forward direction while creating the
! * sorted data.
! */
! estate->es_direction = ForwardScanDirection;
! /*
! * Initialize tuplesort module.
! */
! SO1_printf("ExecSort: %s\n",
! "calling tuplesort_begin");
! outerNode = outerPlanState(node);
! tupDesc = ExecGetResultType(outerNode);
! if (node->tuplesortstate != NULL)
! tuplesort_end((Tuplesortstate *) node->tuplesortstate);
! tuplesortstate = tuplesort_begin_heap(tupDesc,
! plannode->numCols,
! plannode->sortColIdx,
! plannode->sortOperators,
! plannode->collations,
! plannode->nullsFirst,
! work_mem,
! node->randomAccess);
! if (node->bounded)
! tuplesort_set_bound(tuplesortstate, node->bound);
! node->tuplesortstate = (void *) tuplesortstate;
! /*
! * Put the next group of tuples, in which the first "skipCols" sort values
! * are equal, into tuplesort.
! */
! for (;;)
! {
! slot = ExecProcNode(outerNode);
! if (skipCols == 0)
{
if (TupIsNull(slot))
+ {
+ node->finished = true;
break;
! }
tuplesort_puttupleslot(tuplesortstate, slot);
}
+ else if (node->prev)
+ {
+ ExecStoreTuple(node->prev, node->ss.ps.ps_ResultTupleSlot, InvalidBuffer, false);
+ tuplesort_puttupleslot(tuplesortstate, node->ss.ps.ps_ResultTupleSlot);
! if (TupIsNull(slot))
! {
! node->finished = true;
! break;
! }
! else
! {
! bool cmp;
! cmp = cmpSortSkipCols(node, tupDesc, node->prev, slot);
! node->prev = ExecCopySlotTuple(slot);
! if (!cmp)
! break;
! }
! }
! else
! {
! if (TupIsNull(slot))
! {
! node->finished = true;
! break;
! }
! else
! {
! node->prev = ExecCopySlotTuple(slot);
! }
! }
! }
! /*
! * Complete the sort.
! */
! tuplesort_performsort(tuplesortstate);
! /*
! * restore to user specified direction
! */
! estate->es_direction = dir;
!
! /*
! * finally set the sorted flag to true
! */
! node->sort_Done = true;
! node->bounded_Done = node->bounded;
! node->bound_Done = node->bound;
! SO1_printf("ExecSort: %s\n", "sorting done");
SO1_printf("ExecSort: %s\n",
"retrieving tuple from tuplesort");
*************** ExecInitSort(Sort *node, EState *estate,
*** 174,180 ****
--- 248,256 ----
sortstate->bounded = false;
sortstate->sort_Done = false;
+ sortstate->finished = false;
sortstate->tuplesortstate = NULL;
+ sortstate->prev = NULL;
/*
* Miscellaneous initialization
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index e4184c5..b41213a
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copySort(const Sort *from)
*** 735,740 ****
--- 735,741 ----
CopyPlanFields((const Plan *) from, (Plan *) newnode);
COPY_SCALAR_FIELD(numCols);
+ COPY_SCALAR_FIELD(skipCols);
COPY_POINTER_FIELD(sortColIdx, from->numCols * sizeof(AttrNumber));
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
new file mode 100644
index 50f0852..1a38407
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
*************** cost_recursive_union(Plan *runion, Plan
*** 1281,1295 ****
*/
void
cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, Cost input_cost, double tuples, int width,
! Cost comparison_cost, int sort_mem,
double limit_tuples)
{
! Cost startup_cost = input_cost;
! Cost run_cost = 0;
double input_bytes = relation_byte_size(tuples, width);
double output_bytes;
double output_tuples;
long sort_mem_bytes = sort_mem * 1024L;
if (!enable_sort)
--- 1281,1302 ----
*/
void
cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, int presorted_keys,
! Cost input_startup_cost, Cost input_total_cost,
! double tuples, int width, Cost comparison_cost, int sort_mem,
double limit_tuples)
{
! Cost startup_cost = input_startup_cost;
! Cost run_cost = 0,
! rest_cost,
! group_cost,
! input_run_cost = input_total_cost - input_startup_cost;
double input_bytes = relation_byte_size(tuples, width);
double output_bytes;
double output_tuples;
+ double num_groups,
+ group_input_bytes,
+ group_tuples;
long sort_mem_bytes = sort_mem * 1024L;
if (!enable_sort)
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1319,1331 ****
output_bytes = input_bytes;
}
! if (output_bytes > sort_mem_bytes)
{
/*
* We'll have to use a disk-based sort of all the tuples
*/
! double npages = ceil(input_bytes / BLCKSZ);
! double nruns = (input_bytes / sort_mem_bytes) * 0.5;
double mergeorder = tuplesort_merge_order(sort_mem_bytes);
double log_runs;
double npageaccesses;
--- 1326,1367 ----
output_bytes = input_bytes;
}
! if (presorted_keys > 0)
! {
! List *groupExprs = NIL;
! ListCell *l;
! int i = 0;
!
! foreach(l, pathkeys)
! {
! PathKey *key = (PathKey *)lfirst(l);
! EquivalenceMember *member = (EquivalenceMember *)
! lfirst(list_head(key->pk_eclass->ec_members));
!
! groupExprs = lappend(groupExprs, member->em_expr);
!
! i++;
! if (i >= presorted_keys)
! break;
! }
!
! num_groups = estimate_num_groups(root, groupExprs, tuples);
! }
! else
! {
! num_groups = 1.0;
! }
!
! group_input_bytes = input_bytes / num_groups;
! group_tuples = tuples / num_groups;
!
! if (output_bytes > sort_mem_bytes && group_input_bytes > sort_mem_bytes)
{
/*
* We'll have to use a disk-based sort of all the tuples
*/
! double npages = ceil(group_input_bytes / BLCKSZ);
! double nruns = (group_input_bytes / sort_mem_bytes) * 0.5;
double mergeorder = tuplesort_merge_order(sort_mem_bytes);
double log_runs;
double npageaccesses;
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1335,1341 ****
*
* Assume about N log2 N comparisons
*/
! startup_cost += comparison_cost * tuples * LOG2(tuples);
/* Disk costs */
--- 1371,1377 ----
*
* Assume about N log2 N comparisons
*/
! group_cost = comparison_cost * group_tuples * LOG2(group_tuples);
/* Disk costs */
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1346,1355 ****
log_runs = 1.0;
npageaccesses = 2.0 * npages * log_runs;
/* Assume 3/4ths of accesses are sequential, 1/4th are not */
! startup_cost += npageaccesses *
(seq_page_cost * 0.75 + random_page_cost * 0.25);
}
! else if (tuples > 2 * output_tuples || input_bytes > sort_mem_bytes)
{
/*
* We'll use a bounded heap-sort keeping just K tuples in memory, for
--- 1382,1391 ----
log_runs = 1.0;
npageaccesses = 2.0 * npages * log_runs;
/* Assume 3/4ths of accesses are sequential, 1/4th are not */
! group_cost += npageaccesses *
(seq_page_cost * 0.75 + random_page_cost * 0.25);
}
! else if (group_tuples > 2 * output_tuples || group_input_bytes > sort_mem_bytes)
{
/*
* We'll use a bounded heap-sort keeping just K tuples in memory, for
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1357,1368 ****
* factor is a bit higher than for quicksort. Tweak it so that the
* cost curve is continuous at the crossover point.
*/
! startup_cost += comparison_cost * tuples * LOG2(2.0 * output_tuples);
}
else
{
/* We'll use plain quicksort on all the input tuples */
! startup_cost += comparison_cost * tuples * LOG2(tuples);
}
/*
--- 1393,1404 ----
* factor is a bit higher than for quicksort. Tweak it so that the
* cost curve is continuous at the crossover point.
*/
! group_cost = comparison_cost * group_tuples * LOG2(2.0 * output_tuples);
}
else
{
/* We'll use plain quicksort on all the input tuples */
! group_cost = comparison_cost * group_tuples * LOG2(group_tuples);
}
/*
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1373,1380 ****
--- 1409,1423 ----
* here --- the upper LIMIT will pro-rate the run cost so we'd be double
* counting the LIMIT otherwise.
*/
+ startup_cost += group_cost;
+ rest_cost = (num_groups * (output_tuples / tuples) - 1.0) * group_cost;
+ if (rest_cost > 0.0)
+ run_cost += rest_cost;
run_cost += cpu_operator_cost * tuples;
+ startup_cost += input_run_cost / num_groups;
+ run_cost += input_run_cost * ((num_groups - 1.0) / num_groups);
+
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
*************** initial_cost_mergejoin(PlannerInfo *root
*** 2075,2080 ****
--- 2118,2125 ----
cost_sort(&sort_path,
root,
outersortkeys,
+ pathkeys_common(outer_path->pathkeys, outersortkeys),
+ outer_path->startup_cost,
outer_path->total_cost,
outer_path_rows,
outer_path->parent->width,
*************** initial_cost_mergejoin(PlannerInfo *root
*** 2101,2106 ****
--- 2146,2153 ----
cost_sort(&sort_path,
root,
innersortkeys,
+ pathkeys_common(inner_path->pathkeys, innersortkeys),
+ inner_path->startup_cost,
inner_path->total_cost,
inner_path_rows,
inner_path->parent->width,
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
new file mode 100644
index 5b477e5..5909dfe
*** a/src/backend/optimizer/path/joinpath.c
--- b/src/backend/optimizer/path/joinpath.c
*************** sort_inner_and_outer(PlannerInfo *root,
*** 662,668 ****
cur_mergeclauses = find_mergeclauses_for_pathkeys(root,
outerkeys,
true,
! mergeclause_list);
/* Should have used them all... */
Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list));
--- 662,670 ----
cur_mergeclauses = find_mergeclauses_for_pathkeys(root,
outerkeys,
true,
! mergeclause_list,
! NULL,
! NULL);
/* Should have used them all... */
Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list));
*************** match_unsorted_outer(PlannerInfo *root,
*** 832,837 ****
--- 834,840 ----
List *mergeclauses;
List *innersortkeys;
List *trialsortkeys;
+ List *outersortkeys;
Path *cheapest_startup_inner;
Path *cheapest_total_inner;
int num_sortkeys;
*************** match_unsorted_outer(PlannerInfo *root,
*** 937,943 ****
mergeclauses = find_mergeclauses_for_pathkeys(root,
outerpath->pathkeys,
true,
! mergeclause_list);
/*
* Done with this outer path if no chance for a mergejoin.
--- 940,948 ----
mergeclauses = find_mergeclauses_for_pathkeys(root,
outerpath->pathkeys,
true,
! mergeclause_list,
! joinrel,
! &outersortkeys);
/*
* Done with this outer path if no chance for a mergejoin.
*************** match_unsorted_outer(PlannerInfo *root,
*** 961,967 ****
/* Compute the required ordering of the inner path */
innersortkeys = make_inner_pathkeys_for_merge(root,
mergeclauses,
! outerpath->pathkeys);
/*
* Generate a mergejoin on the basis of sorting the cheapest inner.
--- 966,972 ----
/* Compute the required ordering of the inner path */
innersortkeys = make_inner_pathkeys_for_merge(root,
mergeclauses,
! outersortkeys);
/*
* Generate a mergejoin on the basis of sorting the cheapest inner.
*************** match_unsorted_outer(PlannerInfo *root,
*** 980,986 ****
restrictlist,
merge_pathkeys,
mergeclauses,
! NIL,
innersortkeys);
/* Can't do anything else if inner path needs to be unique'd */
--- 985,991 ----
restrictlist,
merge_pathkeys,
mergeclauses,
! outersortkeys,
innersortkeys);
/* Can't do anything else if inner path needs to be unique'd */
*************** match_unsorted_outer(PlannerInfo *root,
*** 1038,1044 ****
for (sortkeycnt = num_sortkeys; sortkeycnt > 0; sortkeycnt--)
{
Path *innerpath;
- List *newclauses = NIL;
/*
* Look for an inner path ordered well enough for the first
--- 1043,1048 ----
*************** match_unsorted_outer(PlannerInfo *root,
*** 1055,1073 ****
compare_path_costs(innerpath, cheapest_total_inner,
TOTAL_COST) < 0))
{
- /* Found a cheap (or even-cheaper) sorted path */
- /* Select the right mergeclauses, if we didn't already */
- if (sortkeycnt < num_sortkeys)
- {
- newclauses =
- find_mergeclauses_for_pathkeys(root,
- trialsortkeys,
- false,
- mergeclauses);
- Assert(newclauses != NIL);
- }
- else
- newclauses = mergeclauses;
try_mergejoin_path(root,
joinrel,
jointype,
--- 1059,1064 ----
*************** match_unsorted_outer(PlannerInfo *root,
*** 1078,1086 ****
innerpath,
restrictlist,
merge_pathkeys,
! newclauses,
! NIL,
! NIL);
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
--- 1069,1077 ----
innerpath,
restrictlist,
merge_pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys);
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
*************** match_unsorted_outer(PlannerInfo *root,
*** 1096,1119 ****
/* Found a cheap (or even-cheaper) sorted path */
if (innerpath != cheapest_total_inner)
{
- /*
- * Avoid rebuilding clause list if we already made one;
- * saves memory in big join trees...
- */
- if (newclauses == NIL)
- {
- if (sortkeycnt < num_sortkeys)
- {
- newclauses =
- find_mergeclauses_for_pathkeys(root,
- trialsortkeys,
- false,
- mergeclauses);
- Assert(newclauses != NIL);
- }
- else
- newclauses = mergeclauses;
- }
try_mergejoin_path(root,
joinrel,
jointype,
--- 1087,1092 ----
*************** match_unsorted_outer(PlannerInfo *root,
*** 1124,1132 ****
innerpath,
restrictlist,
merge_pathkeys,
! newclauses,
! NIL,
! NIL);
}
cheapest_startup_inner = innerpath;
}
--- 1097,1105 ----
innerpath,
restrictlist,
merge_pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys);
}
cheapest_startup_inner = innerpath;
}
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
new file mode 100644
index 9c8ede6..63c0b03
*** a/src/backend/optimizer/path/pathkeys.c
--- b/src/backend/optimizer/path/pathkeys.c
***************
*** 26,31 ****
--- 26,32 ----
#include "optimizer/paths.h"
#include "optimizer/tlist.h"
#include "utils/lsyscache.h"
+ #include "utils/selfuncs.h"
static PathKey *make_canonical_pathkey(PlannerInfo *root,
*************** compare_pathkeys(List *keys1, List *keys
*** 312,317 ****
--- 313,344 ----
}
/*
+ * pathkeys_common
+ * Returns length of longest common prefix of keys1 and keys2.
+ */
+ int
+ pathkeys_common(List *keys1, List *keys2)
+ {
+ int n;
+ ListCell *key1,
+ *key2;
+ n = 0;
+
+ forboth(key1, keys1, key2, keys2)
+ {
+ PathKey *pathkey1 = (PathKey *) lfirst(key1);
+ PathKey *pathkey2 = (PathKey *) lfirst(key2);
+
+ if (pathkey1 != pathkey2)
+ return n;
+ n++;
+ }
+
+ return n;
+ }
+
+
+ /*
* pathkeys_contained_in
* Common special case of compare_pathkeys: we just want to know
* if keys2 are at least as well sorted as keys1.
*************** get_cheapest_path_for_pathkeys(List *pat
*** 368,373 ****
--- 395,421 ----
return matched_path;
}
+ static int
+ compare_bifractional_path_costs(Path *path1, Path *path2,
+ double fraction1, double fraction2)
+ {
+ Cost cost1,
+ cost2;
+
+ if (fraction1 <= 0.0 || fraction1 >= 1.0 ||
+ fraction2 <= 0.0 || fraction2 >= 1.0)
+ return compare_path_costs(path1, path2, TOTAL_COST);
+ cost1 = path1->startup_cost +
+ fraction1 * (path1->total_cost - path1->startup_cost);
+ cost2 = path2->startup_cost +
+ fraction2 * (path2->total_cost - path2->startup_cost);
+ if (cost1 < cost2)
+ return -1;
+ if (cost1 > cost2)
+ return +1;
+ return 0;
+ }
+
/*
* get_cheapest_fractional_path_for_pathkeys
* Find the cheapest path (for retrieving a specified fraction of all
*************** Path *
*** 386,411 ****
get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction)
{
Path *matched_path = NULL;
ListCell *l;
foreach(l, paths)
{
Path *path = (Path *) lfirst(l);
/*
* Since cost comparison is a lot cheaper than pathkey comparison, do
* that first. (XXX is that still true?)
*/
! if (matched_path != NULL &&
! compare_fractional_path_costs(matched_path, path, fraction) <= 0)
! continue;
! if (pathkeys_contained_in(pathkeys, path->pathkeys) &&
bms_is_subset(PATH_REQ_OUTER(path), required_outer))
matched_path = path;
}
return matched_path;
}
--- 434,508 ----
get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction,
! PlannerInfo *root,
! double tuples)
{
Path *matched_path = NULL;
+ int matched_n_common_pathkeys = 0,
+ costs_cmp, n_common_pathkeys,
+ n_pathkeys = list_length(pathkeys);
ListCell *l;
+ List *groupExprs = NIL;
+ double *num_groups, matched_fraction;
+ int i;
+
+ i = 0;
+ num_groups = (double *)palloc(sizeof(double) * list_length(pathkeys));
+ foreach(l, pathkeys)
+ {
+ PathKey *key = (PathKey *)lfirst(l);
+ EquivalenceMember *member = (EquivalenceMember *)
+ lfirst(list_head(key->pk_eclass->ec_members));
+
+ groupExprs = lappend(groupExprs, member->em_expr);
+
+ num_groups[i] = estimate_num_groups(root, groupExprs, tuples);
+ i++;
+ }
+
foreach(l, paths)
{
Path *path = (Path *) lfirst(l);
+ double current_fraction;
+
+ n_common_pathkeys = pathkeys_common(pathkeys, path->pathkeys);
+ if (n_common_pathkeys < matched_n_common_pathkeys ||
+ n_common_pathkeys == 0)
+ continue;
+
+ current_fraction = fraction;
+ if (n_common_pathkeys < n_pathkeys)
+ {
+ current_fraction += 1.0 / num_groups[n_common_pathkeys - 1];
+ current_fraction = Min(current_fraction, 1.0);
+ }
/*
* Since cost comparison is a lot cheaper than pathkey comparison, do
* that first. (XXX is that still true?)
*/
! if (matched_path != NULL)
! {
! costs_cmp = compare_bifractional_path_costs(matched_path, path,
! matched_fraction, current_fraction);
! }
! else
! {
! costs_cmp = 1;
! }
! if ((
! n_common_pathkeys > matched_n_common_pathkeys
! || (n_common_pathkeys == matched_n_common_pathkeys
! && costs_cmp > 0)) &&
bms_is_subset(PATH_REQ_OUTER(path), required_outer))
+ {
matched_path = path;
+ matched_n_common_pathkeys = n_common_pathkeys;
+ matched_fraction = current_fraction;
+ }
}
return matched_path;
}
*************** List *
*** 965,974 ****
find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos)
{
List *mergeclauses = NIL;
ListCell *i;
/* make sure we have eclasses cached in the clauses */
foreach(i, restrictinfos)
--- 1062,1077 ----
find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos,
! RelOptInfo *joinrel,
! List **outersortkeys)
{
List *mergeclauses = NIL;
ListCell *i;
+ bool *used = (bool *)palloc0(sizeof(bool) * list_length(restrictinfos));
+ int k;
+ List *unusedRestrictinfos = NIL;
+ List *usedPathkeys = NIL;
/* make sure we have eclasses cached in the clauses */
foreach(i, restrictinfos)
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1021,1026 ****
--- 1124,1130 ----
* deal with the case in create_mergejoin_plan().
*----------
*/
+ k = 0;
foreach(j, restrictinfos)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(j);
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1033,1039 ****
--- 1137,1147 ----
clause_ec = rinfo->outer_is_left ?
rinfo->right_ec : rinfo->left_ec;
if (clause_ec == pathkey_ec)
+ {
matched_restrictinfos = lappend(matched_restrictinfos, rinfo);
+ used[k] = true;
+ }
+ k++;
}
/*
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1044,1049 ****
--- 1152,1159 ----
if (matched_restrictinfos == NIL)
break;
+ usedPathkeys = lappend(usedPathkeys, pathkey);
+
/*
* If we did find usable mergeclause(s) for this sort-key position,
* add them to result list.
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1051,1056 ****
--- 1161,1201 ----
mergeclauses = list_concat(mergeclauses, matched_restrictinfos);
}
+ if (outersortkeys)
+ {
+ List *addPathkeys, *addMergeclauses;
+
+ *outersortkeys = pathkeys;
+
+ if (!mergeclauses)
+ return mergeclauses;
+
+ k = 0;
+ foreach(i, restrictinfos)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) lfirst(i);
+ if (!used[k])
+ unusedRestrictinfos = lappend(unusedRestrictinfos, rinfo);
+ k++;
+ }
+
+ if (!unusedRestrictinfos)
+ return mergeclauses;
+
+ addPathkeys = select_outer_pathkeys_for_merge(root,
+ unusedRestrictinfos, joinrel);
+
+ if (!addPathkeys)
+ return mergeclauses;
+
+ addMergeclauses = find_mergeclauses_for_pathkeys(root,
+ addPathkeys, true, unusedRestrictinfos, NULL, NULL);
+
+ *outersortkeys = list_concat(usedPathkeys, addPathkeys);
+ mergeclauses = list_concat(mergeclauses, addMergeclauses);
+
+ }
+
return mergeclauses;
}
*************** right_merge_direction(PlannerInfo *root,
*** 1457,1472 ****
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
if (root->query_pathkeys == NIL)
return 0; /* no special ordering requested */
if (pathkeys == NIL)
return 0; /* unordered path */
! if (pathkeys_contained_in(root->query_pathkeys, pathkeys))
{
/* It's useful ... or at least the first N keys are */
! return list_length(root->query_pathkeys);
}
return 0; /* path ordering not useful */
--- 1602,1621 ----
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
+ int n;
+
if (root->query_pathkeys == NIL)
return 0; /* no special ordering requested */
if (pathkeys == NIL)
return 0; /* unordered path */
! n = pathkeys_common(root->query_pathkeys, pathkeys);
!
! if (n != 0)
{
/* It's useful ... or at least the first N keys are */
! return n;
}
return 0; /* path ordering not useful */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index f2c122d..a300342
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** static MergeJoin *make_mergejoin(List *t
*** 149,154 ****
--- 149,155 ----
Plan *lefttree, Plan *righttree,
JoinType jointype);
static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
+ List *pathkeys, int skipCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples);
*************** create_merge_append_plan(PlannerInfo *ro
*** 774,779 ****
--- 775,781 ----
Oid *sortOperators;
Oid *collations;
bool *nullsFirst;
+ int n_common_pathkeys;
/* Build the child plan */
subplan = create_plan_recurse(root, subpath);
*************** create_merge_append_plan(PlannerInfo *ro
*** 807,814 ****
numsortkeys * sizeof(bool)) == 0);
/* Now, insert a Sort node if subplan isn't sufficiently ordered */
! if (!pathkeys_contained_in(pathkeys, subpath->pathkeys))
subplan = (Plan *) make_sort(root, subplan, numsortkeys,
sortColIdx, sortOperators,
collations, nullsFirst,
best_path->limit_tuples);
--- 809,818 ----
numsortkeys * sizeof(bool)) == 0);
/* Now, insert a Sort node if subplan isn't sufficiently ordered */
! n_common_pathkeys = pathkeys_common(pathkeys, subpath->pathkeys);
! if (n_common_pathkeys < list_length(pathkeys))
subplan = (Plan *) make_sort(root, subplan, numsortkeys,
+ pathkeys, n_common_pathkeys,
sortColIdx, sortOperators,
collations, nullsFirst,
best_path->limit_tuples);
*************** create_mergejoin_plan(PlannerInfo *root,
*** 2184,2192 ****
disuse_physical_tlist(root, outer_plan, best_path->jpath.outerjoinpath);
outer_plan = (Plan *)
make_sort_from_pathkeys(root,
! outer_plan,
! best_path->outersortkeys,
! -1.0);
outerpathkeys = best_path->outersortkeys;
}
else
--- 2188,2198 ----
disuse_physical_tlist(root, outer_plan, best_path->jpath.outerjoinpath);
outer_plan = (Plan *)
make_sort_from_pathkeys(root,
! outer_plan,
! best_path->outersortkeys,
! -1.0,
! pathkeys_common(best_path->outersortkeys,
! best_path->jpath.outerjoinpath->pathkeys));
outerpathkeys = best_path->outersortkeys;
}
else
*************** create_mergejoin_plan(PlannerInfo *root,
*** 2197,2205 ****
disuse_physical_tlist(root, inner_plan, best_path->jpath.innerjoinpath);
inner_plan = (Plan *)
make_sort_from_pathkeys(root,
! inner_plan,
! best_path->innersortkeys,
! -1.0);
innerpathkeys = best_path->innersortkeys;
}
else
--- 2203,2213 ----
disuse_physical_tlist(root, inner_plan, best_path->jpath.innerjoinpath);
inner_plan = (Plan *)
make_sort_from_pathkeys(root,
! inner_plan,
! best_path->innersortkeys,
! -1.0,
! pathkeys_common(best_path->innersortkeys,
! best_path->jpath.innerjoinpath->pathkeys));
innerpathkeys = best_path->innersortkeys;
}
else
*************** make_mergejoin(List *tlist,
*** 3739,3744 ****
--- 3747,3753 ----
*/
static Sort *
make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
+ List *pathkeys, int skipCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples)
*************** make_sort(PlannerInfo *root, Plan *leftt
*** 3748,3754 ****
Path sort_path; /* dummy for result of cost_sort */
copy_plan_costsize(plan, lefttree); /* only care about copying size */
! cost_sort(&sort_path, root, NIL,
lefttree->total_cost,
lefttree->plan_rows,
lefttree->plan_width,
--- 3757,3764 ----
Path sort_path; /* dummy for result of cost_sort */
copy_plan_costsize(plan, lefttree); /* only care about copying size */
! cost_sort(&sort_path, root, pathkeys, skipCols,
! lefttree->startup_cost,
lefttree->total_cost,
lefttree->plan_rows,
lefttree->plan_width,
*************** make_sort(PlannerInfo *root, Plan *leftt
*** 3762,3767 ****
--- 3772,3778 ----
plan->lefttree = lefttree;
plan->righttree = NULL;
node->numCols = numCols;
+ node->skipCols = skipCols;
node->sortColIdx = sortColIdx;
node->sortOperators = sortOperators;
node->collations = collations;
*************** find_ec_member_for_tle(EquivalenceClass
*** 4090,4096 ****
*/
Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
! double limit_tuples)
{
int numsortkeys;
AttrNumber *sortColIdx;
--- 4101,4107 ----
*/
Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
! double limit_tuples, int skipCols)
{
int numsortkeys;
AttrNumber *sortColIdx;
*************** make_sort_from_pathkeys(PlannerInfo *roo
*** 4110,4116 ****
&nullsFirst);
/* Now build the Sort node */
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, limit_tuples);
}
--- 4121,4127 ----
&nullsFirst);
/* Now build the Sort node */
! return make_sort(root, lefttree, numsortkeys, pathkeys, skipCols,
sortColIdx, sortOperators, collations,
nullsFirst, limit_tuples);
}
*************** make_sort_from_sortclauses(PlannerInfo *
*** 4153,4159 ****
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
--- 4164,4170 ----
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys, NIL, 0,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
*************** Sort *
*** 4175,4181 ****
make_sort_from_groupcols(PlannerInfo *root,
List *groupcls,
AttrNumber *grpColIdx,
! Plan *lefttree)
{
List *sub_tlist = lefttree->targetlist;
ListCell *l;
--- 4186,4193 ----
make_sort_from_groupcols(PlannerInfo *root,
List *groupcls,
AttrNumber *grpColIdx,
! Plan *lefttree,
! List *pathkeys, int skipCols)
{
List *sub_tlist = lefttree->targetlist;
ListCell *l;
*************** make_sort_from_groupcols(PlannerInfo *ro
*** 4208,4214 ****
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
--- 4220,4226 ----
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys, pathkeys, skipCols,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
new file mode 100644
index 53fc238..4675402
*** a/src/backend/optimizer/plan/planagg.c
--- b/src/backend/optimizer/plan/planagg.c
*************** build_minmax_path(PlannerInfo *root, Min
*** 494,500 ****
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
subroot->query_pathkeys,
NULL,
! path_fraction);
if (!sorted_path)
return false;
--- 494,502 ----
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
subroot->query_pathkeys,
NULL,
! path_fraction,
! subroot,
! final_rel->rows);
if (!sorted_path)
return false;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 1da4b2f..df5563a
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** grouping_planner(PlannerInfo *root, doub
*** 1349,1355 ****
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
root->query_pathkeys,
NULL,
! tuple_fraction);
/* Don't consider same path in both guises; just wastes effort */
if (sorted_path == cheapest_path)
--- 1349,1357 ----
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
root->query_pathkeys,
NULL,
! tuple_fraction,
! root,
! path_rows);
/* Don't consider same path in both guises; just wastes effort */
if (sorted_path == cheapest_path)
*************** grouping_planner(PlannerInfo *root, doub
*** 1365,1374 ****
if (sorted_path)
{
Path sort_path; /* dummy for result of cost_sort */
if (root->query_pathkeys == NIL ||
! pathkeys_contained_in(root->query_pathkeys,
! cheapest_path->pathkeys))
{
/* No sort needed for cheapest path */
sort_path.startup_cost = cheapest_path->startup_cost;
--- 1367,1380 ----
if (sorted_path)
{
Path sort_path; /* dummy for result of cost_sort */
+ Path partial_sort_path; /* dummy for result of cost_sort */
+ int n_common_pathkeys;
+
+ n_common_pathkeys = pathkeys_common(root->query_pathkeys,
+ cheapest_path->pathkeys);
if (root->query_pathkeys == NIL ||
! n_common_pathkeys == list_length(root->query_pathkeys))
{
/* No sort needed for cheapest path */
sort_path.startup_cost = cheapest_path->startup_cost;
*************** grouping_planner(PlannerInfo *root, doub
*** 1378,1389 ****
{
/* Figure cost for sorting */
cost_sort(&sort_path, root, root->query_pathkeys,
cheapest_path->total_cost,
path_rows, path_width,
0.0, work_mem, root->limit_tuples);
}
! if (compare_fractional_path_costs(sorted_path, &sort_path,
tuple_fraction) > 0)
{
/* Presorted path is a loser */
--- 1384,1418 ----
{
/* Figure cost for sorting */
cost_sort(&sort_path, root, root->query_pathkeys,
+ n_common_pathkeys,
+ cheapest_path->startup_cost,
cheapest_path->total_cost,
path_rows, path_width,
0.0, work_mem, root->limit_tuples);
}
! n_common_pathkeys = pathkeys_common(root->query_pathkeys,
! sorted_path->pathkeys);
!
! if (root->query_pathkeys == NIL ||
! n_common_pathkeys == list_length(root->query_pathkeys))
! {
! /* No sort needed for presorted path */
! partial_sort_path.startup_cost = sorted_path->startup_cost;
! partial_sort_path.total_cost = sorted_path->total_cost;
! }
! else
! {
! /* Figure cost for sorting */
! cost_sort(&partial_sort_path, root, root->query_pathkeys,
! n_common_pathkeys,
! sorted_path->startup_cost,
! sorted_path->total_cost,
! path_rows, path_width,
! 0.0, work_mem, root->limit_tuples);
! }
!
! if (compare_fractional_path_costs(&partial_sort_path, &sort_path,
tuple_fraction) > 0)
{
/* Presorted path is a loser */
*************** grouping_planner(PlannerInfo *root, doub
*** 1464,1476 ****
* results.
*/
bool need_sort_for_grouping = false;
result_plan = create_plan(root, best_path);
current_pathkeys = best_path->pathkeys;
/* Detect if we'll need an explicit sort for grouping */
if (parse->groupClause && !use_hashed_grouping &&
! !pathkeys_contained_in(root->group_pathkeys, current_pathkeys))
{
need_sort_for_grouping = true;
--- 1493,1508 ----
* results.
*/
bool need_sort_for_grouping = false;
+ int n_common_pathkeys_grouping;
result_plan = create_plan(root, best_path);
current_pathkeys = best_path->pathkeys;
/* Detect if we'll need an explicit sort for grouping */
+ n_common_pathkeys_grouping = pathkeys_common(root->group_pathkeys,
+ current_pathkeys);
if (parse->groupClause && !use_hashed_grouping &&
! n_common_pathkeys_grouping < list_length(root->group_pathkeys))
{
need_sort_for_grouping = true;
*************** grouping_planner(PlannerInfo *root, doub
*** 1564,1570 ****
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan);
current_pathkeys = root->group_pathkeys;
}
aggstrategy = AGG_SORTED;
--- 1596,1604 ----
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan,
! root->group_pathkeys,
! n_common_pathkeys_grouping);
current_pathkeys = root->group_pathkeys;
}
aggstrategy = AGG_SORTED;
*************** grouping_planner(PlannerInfo *root, doub
*** 1607,1613 ****
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan);
current_pathkeys = root->group_pathkeys;
}
--- 1641,1649 ----
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan,
! root->group_pathkeys,
! n_common_pathkeys_grouping);
current_pathkeys = root->group_pathkeys;
}
*************** grouping_planner(PlannerInfo *root, doub
*** 1724,1736 ****
if (window_pathkeys)
{
Sort *sort_plan;
sort_plan = make_sort_from_pathkeys(root,
result_plan,
window_pathkeys,
! -1.0);
! if (!pathkeys_contained_in(window_pathkeys,
! current_pathkeys))
{
/* we do indeed need to sort */
result_plan = (Plan *) sort_plan;
--- 1760,1776 ----
if (window_pathkeys)
{
Sort *sort_plan;
+ int n_common_pathkeys;
+
+ n_common_pathkeys = pathkeys_common(window_pathkeys,
+ current_pathkeys);
sort_plan = make_sort_from_pathkeys(root,
result_plan,
window_pathkeys,
! -1.0,
! n_common_pathkeys);
! if (n_common_pathkeys < list_length(window_pathkeys))
{
/* we do indeed need to sort */
result_plan = (Plan *) sort_plan;
*************** grouping_planner(PlannerInfo *root, doub
*** 1876,1894 ****
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
! current_pathkeys = root->distinct_pathkeys;
else
{
! current_pathkeys = root->sort_pathkeys;
/* Assert checks that parser didn't mess up... */
Assert(pathkeys_contained_in(root->distinct_pathkeys,
! current_pathkeys));
}
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
! current_pathkeys,
! -1.0);
}
result_plan = (Plan *) make_unique(result_plan,
--- 1916,1936 ----
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
! needed_pathkeys = root->distinct_pathkeys;
else
{
! needed_pathkeys = root->sort_pathkeys;
/* Assert checks that parser didn't mess up... */
Assert(pathkeys_contained_in(root->distinct_pathkeys,
! needed_pathkeys));
}
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
! needed_pathkeys,
! -1.0,
! pathkeys_common(needed_pathkeys, current_pathkeys));
! current_pathkeys = needed_pathkeys;
}
result_plan = (Plan *) make_unique(result_plan,
*************** grouping_planner(PlannerInfo *root, doub
*** 1904,1915 ****
*/
if (parse->sortClause)
{
! if (!pathkeys_contained_in(root->sort_pathkeys, current_pathkeys))
{
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
root->sort_pathkeys,
! limit_tuples);
current_pathkeys = root->sort_pathkeys;
}
}
--- 1946,1960 ----
*/
if (parse->sortClause)
{
! int common = pathkeys_common(root->sort_pathkeys, current_pathkeys);
!
! if (common < list_length(root->sort_pathkeys))
{
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
root->sort_pathkeys,
! limit_tuples,
! common);
current_pathkeys = root->sort_pathkeys;
}
}
*************** choose_hashed_grouping(PlannerInfo *root
*** 2654,2659 ****
--- 2699,2705 ----
List *current_pathkeys;
Path hashed_p;
Path sorted_p;
+ int n_common_pathkeys;
/*
* Executor doesn't support hashed aggregation with DISTINCT or ORDER BY
*************** choose_hashed_grouping(PlannerInfo *root
*** 2735,2741 ****
path_rows);
/* Result of hashed agg is always unsorted */
if (target_pathkeys)
! cost_sort(&hashed_p, root, target_pathkeys, hashed_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
--- 2781,2788 ----
path_rows);
/* Result of hashed agg is always unsorted */
if (target_pathkeys)
! cost_sort(&hashed_p, root, target_pathkeys, 0,
! hashed_p.startup_cost, hashed_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_grouping(PlannerInfo *root
*** 2751,2759 ****
sorted_p.total_cost = cheapest_path->total_cost;
current_pathkeys = cheapest_path->pathkeys;
}
! if (!pathkeys_contained_in(root->group_pathkeys, current_pathkeys))
{
! cost_sort(&sorted_p, root, root->group_pathkeys, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
current_pathkeys = root->group_pathkeys;
--- 2798,2809 ----
sorted_p.total_cost = cheapest_path->total_cost;
current_pathkeys = cheapest_path->pathkeys;
}
!
! n_common_pathkeys = pathkeys_common(root->group_pathkeys, current_pathkeys);
! if (n_common_pathkeys < list_length(root->group_pathkeys))
{
! cost_sort(&sorted_p, root, root->group_pathkeys,
! n_common_pathkeys, sorted_p.startup_cost, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
current_pathkeys = root->group_pathkeys;
*************** choose_hashed_grouping(PlannerInfo *root
*** 2768,2777 ****
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
/* The Agg or Group node will preserve ordering */
! if (target_pathkeys &&
! !pathkeys_contained_in(target_pathkeys, current_pathkeys))
! cost_sort(&sorted_p, root, target_pathkeys, sorted_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
--- 2818,2829 ----
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
+
/* The Agg or Group node will preserve ordering */
! n_common_pathkeys = pathkeys_common(target_pathkeys, current_pathkeys);
! if (target_pathkeys && n_common_pathkeys < list_length(target_pathkeys))
! cost_sort(&sorted_p, root, target_pathkeys, n_common_pathkeys,
! sorted_p.startup_cost, sorted_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_distinct(PlannerInfo *root
*** 2824,2829 ****
--- 2876,2882 ----
List *needed_pathkeys;
Path hashed_p;
Path sorted_p;
+ int n_common_pathkeys;
/*
* If we have a sortable DISTINCT ON clause, we always use sorting. This
*************** choose_hashed_distinct(PlannerInfo *root
*** 2889,2895 ****
* need to charge for the final sort.
*/
if (parse->sortClause)
! cost_sort(&hashed_p, root, root->sort_pathkeys, hashed_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
--- 2942,2949 ----
* need to charge for the final sort.
*/
if (parse->sortClause)
! cost_sort(&hashed_p, root, root->sort_pathkeys, 0,
! hashed_p.startup_cost, hashed_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_distinct(PlannerInfo *root
*** 2906,2928 ****
needed_pathkeys = root->sort_pathkeys;
else
needed_pathkeys = root->distinct_pathkeys;
! if (!pathkeys_contained_in(needed_pathkeys, current_pathkeys))
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
current_pathkeys = root->distinct_pathkeys;
else
current_pathkeys = root->sort_pathkeys;
! cost_sort(&sorted_p, root, current_pathkeys, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
}
cost_group(&sorted_p, root, numDistinctCols, dNumDistinctRows,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
if (parse->sortClause &&
! !pathkeys_contained_in(root->sort_pathkeys, current_pathkeys))
! cost_sort(&sorted_p, root, root->sort_pathkeys, sorted_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
--- 2960,2989 ----
needed_pathkeys = root->sort_pathkeys;
else
needed_pathkeys = root->distinct_pathkeys;
!
! n_common_pathkeys = pathkeys_common(needed_pathkeys, current_pathkeys);
! if (n_common_pathkeys < list_length(needed_pathkeys))
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
current_pathkeys = root->distinct_pathkeys;
else
current_pathkeys = root->sort_pathkeys;
! cost_sort(&sorted_p, root, current_pathkeys,
! n_common_pathkeys, sorted_p.startup_cost, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
}
cost_group(&sorted_p, root, numDistinctCols, dNumDistinctRows,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
+
+
+ n_common_pathkeys = pathkeys_common(root->sort_pathkeys, current_pathkeys);
if (parse->sortClause &&
! n_common_pathkeys < list_length(root->sort_pathkeys))
! cost_sort(&sorted_p, root, root->sort_pathkeys, n_common_pathkeys,
! sorted_p.startup_cost, sorted_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
*************** plan_cluster_use_sort(Oid tableOid, Oid
*** 3712,3719 ****
/* Estimate the cost of seq scan + sort */
seqScanPath = create_seqscan_path(root, rel, NULL);
! cost_sort(&seqScanAndSortPath, root, NIL,
! seqScanPath->total_cost, rel->tuples, rel->width,
comparisonCost, maintenance_work_mem, -1.0);
/* Estimate the cost of index scan */
--- 3773,3781 ----
/* Estimate the cost of seq scan + sort */
seqScanPath = create_seqscan_path(root, rel, NULL);
! cost_sort(&seqScanAndSortPath, root, NIL, 0,
! seqScanPath->startup_cost, seqScanPath->total_cost,
! rel->tuples, rel->width,
comparisonCost, maintenance_work_mem, -1.0);
/* Estimate the cost of index scan */
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
new file mode 100644
index e249628..b0b5471
*** a/src/backend/optimizer/prep/prepunion.c
--- b/src/backend/optimizer/prep/prepunion.c
*************** choose_hashed_setop(PlannerInfo *root, L
*** 859,865 ****
sorted_p.startup_cost = input_plan->startup_cost;
sorted_p.total_cost = input_plan->total_cost;
/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
! cost_sort(&sorted_p, root, NIL, sorted_p.total_cost,
input_plan->plan_rows, input_plan->plan_width,
0.0, work_mem, -1.0);
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
--- 859,866 ----
sorted_p.startup_cost = input_plan->startup_cost;
sorted_p.total_cost = input_plan->total_cost;
/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
! cost_sort(&sorted_p, root, NIL, 0,
! sorted_p.startup_cost, sorted_p.total_cost,
input_plan->plan_rows, input_plan->plan_width,
0.0, work_mem, -1.0);
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
new file mode 100644
index a7169ef..3d0a842
*** a/src/backend/optimizer/util/pathnode.c
--- b/src/backend/optimizer/util/pathnode.c
*************** create_merge_append_path(PlannerInfo *ro
*** 971,980 ****
foreach(l, subpaths)
{
Path *subpath = (Path *) lfirst(l);
pathnode->path.rows += subpath->rows;
! if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
{
/* Subpath is adequately ordered, we won't need to sort it */
input_startup_cost += subpath->startup_cost;
--- 971,981 ----
foreach(l, subpaths)
{
Path *subpath = (Path *) lfirst(l);
+ int n_common_pathkeys = pathkeys_common(pathkeys, subpath->pathkeys);
pathnode->path.rows += subpath->rows;
! if (n_common_pathkeys == list_length(pathkeys))
{
/* Subpath is adequately ordered, we won't need to sort it */
input_startup_cost += subpath->startup_cost;
*************** create_merge_append_path(PlannerInfo *ro
*** 988,993 ****
--- 989,996 ----
cost_sort(&sort_path,
root,
pathkeys,
+ n_common_pathkeys,
+ subpath->startup_cost,
subpath->total_cost,
subpath->parent->tuples,
subpath->parent->width,
*************** create_unique_path(PlannerInfo *root, Re
*** 1343,1349 ****
/*
* Estimate cost for sort+unique implementation
*/
! cost_sort(&sort_path, root, NIL,
subpath->total_cost,
rel->rows,
rel->width,
--- 1346,1353 ----
/*
* Estimate cost for sort+unique implementation
*/
! cost_sort(&sort_path, root, NIL, 0,
! subpath->startup_cost,
subpath->total_cost,
rel->rows,
rel->width,
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
new file mode 100644
index 52f05e6..6a09138
*** a/src/backend/utils/sort/tuplesort.c
--- b/src/backend/utils/sort/tuplesort.c
*************** free_sort_tuple(Tuplesortstate *state, S
*** 3525,3527 ****
--- 3525,3534 ----
FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
pfree(stup->tuple);
}
+
+ SortSupport
+ tuplesort_get_sortkeys(Tuplesortstate *state)
+ {
+ return state->sortKeys;
+ }
+
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index 2a7b36e..76aab79
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
*************** typedef struct SortState
*** 1664,1671 ****
--- 1664,1673 ----
int64 bound; /* if bounded, how many tuples are needed */
bool sort_Done; /* sort completed yet? */
bool bounded_Done; /* value of bounded we did the sort with */
+ bool finished;
int64 bound_Done; /* value of bound we did the sort with */
void *tuplesortstate; /* private state of tuplesort.c */
+ HeapTuple prev;
} SortState;
/* ---------------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 101e22c..28b871e
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef struct Sort
*** 582,587 ****
--- 582,588 ----
{
Plan plan;
int numCols; /* number of sort-key columns */
+ int skipCols; /* number of leading sort-key columns already sorted in input */
AttrNumber *sortColIdx; /* their indexes in the target list */
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
new file mode 100644
index 444ab74..e98fb0c
*** a/src/include/optimizer/cost.h
--- b/src/include/optimizer/cost.h
*************** extern void cost_ctescan(Path *path, Pla
*** 88,95 ****
RelOptInfo *baserel, ParamPathInfo *param_info);
extern void cost_recursive_union(Plan *runion, Plan *nrterm, Plan *rterm);
extern void cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, Cost input_cost, double tuples, int width,
! Cost comparison_cost, int sort_mem,
double limit_tuples);
extern void cost_merge_append(Path *path, PlannerInfo *root,
List *pathkeys, int n_streams,
--- 88,96 ----
RelOptInfo *baserel, ParamPathInfo *param_info);
extern void cost_recursive_union(Plan *runion, Plan *nrterm, Plan *rterm);
extern void cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, int presorted_keys,
! Cost input_startup_cost, Cost input_total_cost,
! double tuples, int width, Cost comparison_cost, int sort_mem,
double limit_tuples);
extern void cost_merge_append(Path *path, PlannerInfo *root,
List *pathkeys, int n_streams,
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
new file mode 100644
index 999adaa..043641d
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** typedef enum
*** 157,169 ****
extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2);
extern bool pathkeys_contained_in(List *keys1, List *keys2);
extern Path *get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
Relids required_outer,
CostSelector cost_criterion);
extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction);
extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
ScanDirection scandir);
extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr,
--- 157,172 ----
extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2);
extern bool pathkeys_contained_in(List *keys1, List *keys2);
+ extern int pathkeys_common(List *keys1, List *keys2);
extern Path *get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
Relids required_outer,
CostSelector cost_criterion);
extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction,
! PlannerInfo *root,
! double tuples);
extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
ScanDirection scandir);
extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr,
*************** extern void update_mergeclause_eclasses(
*** 185,191 ****
extern List *find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos);
extern List *select_outer_pathkeys_for_merge(PlannerInfo *root,
List *mergeclauses,
RelOptInfo *joinrel);
--- 188,196 ----
extern List *find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos,
! RelOptInfo *joinrel,
! List **outerpathkeys);
extern List *select_outer_pathkeys_for_merge(PlannerInfo *root,
List *mergeclauses,
RelOptInfo *joinrel);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
new file mode 100644
index ba7ae7c..d33c615
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
*************** extern RecursiveUnion *make_recursive_un
*** 50,60 ****
Plan *lefttree, Plan *righttree, int wtParam,
List *distinctList, long numGroups);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
! List *pathkeys, double limit_tuples);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
! AttrNumber *grpColIdx, Plan *lefttree);
extern Agg *make_agg(PlannerInfo *root, List *tlist, List *qual,
AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
--- 50,61 ----
Plan *lefttree, Plan *righttree, int wtParam,
List *distinctList, long numGroups);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
! List *pathkeys, double limit_tuples, int skipCols);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
! AttrNumber *grpColIdx, Plan *lefttree, List *pathkeys,
! int skipCols);
extern Agg *make_agg(PlannerInfo *root, List *tlist, List *qual,
AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
new file mode 100644
index 5f87254..5a65cd2
*** a/src/include/utils/tuplesort.h
--- b/src/include/utils/tuplesort.h
***************
*** 24,29 ****
--- 24,30 ----
#include "executor/tuptable.h"
#include "fmgr.h"
#include "utils/relcache.h"
+ #include "utils/sortsupport.h"
/* Tuplesortstate is an opaque type whose details are not known outside
*************** extern void tuplesort_get_stats(Tuplesor
*** 111,116 ****
--- 112,119 ----
extern int tuplesort_merge_order(int64 allowedMem);
+ extern SortSupport tuplesort_get_sortkeys(Tuplesortstate *state);
+
/*
* These routines may only be called if randomAccess was specified 'true'.
* Likewise, backwards scan in gettuple/getdatum is only allowed if
On Sun, Dec 29, 2013 at 4:51 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
I've compiled it with clang. Yeah, there were mixed declarations. I've
rechecked it with gcc, and now it gives no warnings. I didn't try it with
Visual Studio, but I hope it will be OK.
Thanks for the patch. It now compiles without any problems.
I've been testing the patch against a few different workloads. One thing
I've found is that, in my test case, when the table contains only 1 tuple
for each distinct value of the presorted columns, the query is actually
slower than when there are, say, 100 tuples to sort in each presorted
group.
Here is my test case:
DROP TABLE IF EXISTS temperature_readings;
CREATE TABLE temperature_readings (
readingid SERIAL NOT NULL,
timestamp TIMESTAMP NOT NULL,
locationid INT NOT NULL,
temperature INT NOT NULL,
PRIMARY KEY (readingid)
);
INSERT INTO temperature_readings (timestamp,locationid,temperature)
SELECT ts.timestamp, loc.locationid, -10 + random() * 40
FROM generate_series('1900-04-01','2000-04-01','1 day'::interval)
ts(timestamp)
CROSS JOIN generate_series(1,1) loc(locationid);
VACUUM ANALYZE temperature_readings;
-- Warm buffers
SELECT AVG(temperature) FROM temperature_readings;
explain (buffers, analyze) select * from temperature_readings order by
timestamp,locationid; -- (seqscan -> sort) 70.805ms
-- create an index to allow presorting on timestamp.
CREATE INDEX temperature_readings_timestamp_idx ON temperature_readings
(timestamp);
-- warm index buffers
SELECT COUNT(*) FROM (SELECT timestamp FROM temperature_readings ORDER BY
timestamp) c;
explain (buffers, analyze) select * from temperature_readings order by
timestamp,locationid; -- index scan -> partial sort 253.032ms
The first query, without the index to presort on, runs in 70.805 ms; the
second query uses the index to presort and runs in 253.032 ms.
I ran the code through a performance profiler and found that about 80% of
the time is spent in tuplesort_end and tuplesort_begin_heap.
If it were possible to devise some way to reuse the previous tuplesortstate,
perhaps by just inventing a reset method which clears out the tuples, then we
could see performance exceed the standard seqscan -> sort. As it stands, the
code seems to look up the sort functions from the syscache for each group
and then allocate some sort space, so quite a bit of time is also spent in
palloc0() and pfree().
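To make the reset idea concrete, here is a minimal standalone sketch, not
code from the patch and not the tuplesort.c API. The names Row, GroupSorter,
sorter_reset, sorter_put, partial_sort and sorter_free are all hypothetical.
It assumes the input is already ordered by k1 and sorts each k1 group by k2,
reusing one buffer across groups instead of allocating and freeing per group.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical row: k1 is the presorted column, k2 still needs sorting. */
typedef struct Row
{
	int			k1;
	int			k2;
} Row;

/* Hypothetical stand-in for a reusable sort state (not Tuplesortstate). */
typedef struct GroupSorter
{
	Row		   *buf;			/* group buffer, kept across groups */
	size_t		cap;			/* allocated capacity, in rows */
	size_t		n;				/* rows currently buffered */
	size_t		allocs;			/* number of (re)allocations, for illustration */
} GroupSorter;

static int
cmp_k2(const void *a, const void *b)
{
	const Row  *ra = a;
	const Row  *rb = b;

	return (ra->k2 > rb->k2) - (ra->k2 < rb->k2);
}

/* Forget the buffered rows but keep the memory: the cheap per-group step. */
static void
sorter_reset(GroupSorter *s)
{
	s->n = 0;
}

/* Append one row, growing the buffer only when it is full. */
static int
sorter_put(GroupSorter *s, Row r)
{
	if (s->n == s->cap)
	{
		size_t		newcap = s->cap ? s->cap * 2 : 16;
		Row		   *nb = realloc(s->buf, newcap * sizeof(Row));

		if (nb == NULL)
			return -1;
		s->buf = nb;
		s->cap = newcap;
		s->allocs++;
	}
	s->buf[s->n++] = r;
	return 0;
}

/* Sort rows (already ordered by k1) by (k1, k2), one k1 group at a time. */
static int
partial_sort(Row *rows, size_t nrows, GroupSorter *s)
{
	size_t		start = 0;

	while (start < nrows)
	{
		size_t		end = start;

		sorter_reset(s);
		while (end < nrows && rows[end].k1 == rows[start].k1)
		{
			if (sorter_put(s, rows[end]) != 0)
				return -1;
			end++;
		}
		qsort(s->buf, s->n, sizeof(Row), cmp_k2);
		memcpy(&rows[start], s->buf, s->n * sizeof(Row));
		start = end;
	}
	return 0;
}

/* Release the buffer once, after the last group has been returned. */
static void
sorter_free(GroupSorter *s)
{
	free(s->buf);
	s->buf = NULL;
	s->cap = s->n = 0;
}
```

A caller would zero-initialize a GroupSorter, call partial_sort() once over
the presorted rows, and call sorter_free() at the end. In this sketch the
allocation count stays at one as long as no group exceeds the initial
capacity, which is the effect a reset method is meant to have.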
If that is not possible, then maybe adding a cost based on the number of
sort groups would be better, so that the optimization is skipped when there
are too many sort groups.
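As a rough illustration of the costing idea, the sketch below charges a
fixed overhead per sort group. It is a toy model, not PostgreSQL's
cost_sort(): the function names, the per_group_overhead parameter and the
numbers in the usage note are all assumptions made for the example.

```c
#include <stdbool.h>

/* Crude log2: number of halvings needed to bring x down to 1 or below. */
static double
approx_log2(double x)
{
	double		l = 0.0;

	while (x > 1.0)
	{
		x /= 2.0;
		l += 1.0;
	}
	return l;
}

/* Toy cost of sorting all tuples at once: about n * log2(n) comparisons. */
static double
full_sort_cost(double ntuples, double cmp_cost)
{
	return cmp_cost * ntuples * approx_log2(ntuples);
}

/*
 * Toy cost of a partial sort: each group pays a fixed setup/teardown
 * overhead (hypothetical parameter) plus the comparisons needed to sort
 * a group of average size. ngroups must be at least 1.
 */
static double
partial_sort_cost(double ntuples, double ngroups,
				  double cmp_cost, double per_group_overhead)
{
	double		group_size = ntuples / ngroups;

	return ngroups * per_group_overhead +
		cmp_cost * ntuples * approx_log2(group_size);
}

/* Skip the optimization when the per-group overhead outweighs the savings. */
static bool
prefer_partial_sort(double ntuples, double ngroups,
					double cmp_cost, double per_group_overhead)
{
	return partial_sort_cost(ntuples, ngroups, cmp_cost, per_group_overhead) <
		full_sort_cost(ntuples, cmp_cost);
}
```

With made-up values of 36525 tuples, a comparison cost of 1 and a per-group
overhead of 50, this model rejects the partial sort when every tuple is its
own group and accepts it when there are 365 groups of about 100 tuples each.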
Regards
David Rowley
------
With best regards,
Alexander Korotkov.
On 12/29/2013 08:24 AM, David Rowley wrote:
If it was possible to devise some way to reuse any
previous tuplesortstate perhaps just inventing a reset method which
clears out tuples, then we could see performance exceed the standard
seqscan -> sort. The code the way it is seems to lookup the sort
functions from the syscache for each group then allocate some sort
space, so quite a bit of time is also spent in palloc0() and pfree()
If it was not possible to do this then maybe adding a cost to the number
of sort groups would be better so that the optimization is skipped if
there are too many sort groups.
It should be possible. I have hacked a quick proof of concept for
reusing the tuplesort state. Can you try it and see if the performance
regression is fixed by this?
One thing that still has to be fixed in my patch is that we probably want
to close the tuplesort once we have returned the last tuple from ExecSort().
I have attached my patch and the incremental patch on Alexander's patch.
--
Andreas Karlsson
Attachments:
partial-sort-4-reset.patch (text/x-patch)
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
new file mode 100644
index 9969a25..07cb66d
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
*************** static void show_agg_keys(AggState *asta
*** 81,87 ****
static void show_group_keys(GroupState *gstate, List *ancestors,
ExplainState *es);
static void show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, AttrNumber *keycols,
List *ancestors, ExplainState *es);
static void show_sort_info(SortState *sortstate, ExplainState *es);
static void show_hash_info(HashState *hashstate, ExplainState *es);
--- 81,87 ----
static void show_group_keys(GroupState *gstate, List *ancestors,
ExplainState *es);
static void show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, int nPresortedKeys, AttrNumber *keycols,
List *ancestors, ExplainState *es);
static void show_sort_info(SortState *sortstate, ExplainState *es);
static void show_hash_info(HashState *hashstate, ExplainState *es);
*************** ExplainNode(PlanState *planstate, List *
*** 905,911 ****
pname = sname = "Materialize";
break;
case T_Sort:
! pname = sname = "Sort";
break;
case T_Group:
pname = sname = "Group";
--- 905,914 ----
pname = sname = "Materialize";
break;
case T_Sort:
! if (((Sort *) plan)->skipCols > 0)
! pname = sname = "Partial sort";
! else
! pname = sname = "Sort";
break;
case T_Group:
pname = sname = "Group";
*************** show_sort_keys(SortState *sortstate, Lis
*** 1705,1711 ****
Sort *plan = (Sort *) sortstate->ss.ps.plan;
show_sort_group_keys((PlanState *) sortstate, "Sort Key",
! plan->numCols, plan->sortColIdx,
ancestors, es);
}
--- 1708,1714 ----
Sort *plan = (Sort *) sortstate->ss.ps.plan;
show_sort_group_keys((PlanState *) sortstate, "Sort Key",
! plan->numCols, plan->skipCols, plan->sortColIdx,
ancestors, es);
}
*************** show_merge_append_keys(MergeAppendState
*** 1719,1725 ****
MergeAppend *plan = (MergeAppend *) mstate->ps.plan;
show_sort_group_keys((PlanState *) mstate, "Sort Key",
! plan->numCols, plan->sortColIdx,
ancestors, es);
}
--- 1722,1728 ----
MergeAppend *plan = (MergeAppend *) mstate->ps.plan;
show_sort_group_keys((PlanState *) mstate, "Sort Key",
! plan->numCols, 0, plan->sortColIdx,
ancestors, es);
}
*************** show_agg_keys(AggState *astate, List *an
*** 1737,1743 ****
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(astate, ancestors);
show_sort_group_keys(outerPlanState(astate), "Group Key",
! plan->numCols, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
--- 1740,1746 ----
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(astate, ancestors);
show_sort_group_keys(outerPlanState(astate), "Group Key",
! plan->numCols, 0, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
*************** show_group_keys(GroupState *gstate, List
*** 1755,1761 ****
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(gstate, ancestors);
show_sort_group_keys(outerPlanState(gstate), "Group Key",
! plan->numCols, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
--- 1758,1764 ----
/* The key columns refer to the tlist of the child plan */
ancestors = lcons(gstate, ancestors);
show_sort_group_keys(outerPlanState(gstate), "Group Key",
! plan->numCols, 0, plan->grpColIdx,
ancestors, es);
ancestors = list_delete_first(ancestors);
}
*************** show_group_keys(GroupState *gstate, List
*** 1765,1777 ****
* as arrays of targetlist indexes
*/
static void
! show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, AttrNumber *keycols,
List *ancestors, ExplainState *es)
{
Plan *plan = planstate->plan;
List *context;
! List *result = NIL;
bool useprefix;
int keyno;
char *exprstr;
--- 1768,1781 ----
* as arrays of targetlist indexes
*/
static void
! show_sort_group_keys(PlanState *planstate, const char *qlabel,
! int nkeys, int nPresortedKeys, AttrNumber *keycols,
List *ancestors, ExplainState *es)
{
Plan *plan = planstate->plan;
List *context;
! List *resultSort = NIL;
! List *resultPresorted = NIL;
bool useprefix;
int keyno;
char *exprstr;
*************** show_sort_group_keys(PlanState *planstat
*** 1798,1807 ****
/* Deparse the expression, showing any top-level cast */
exprstr = deparse_expression((Node *) target->expr, context,
useprefix, true);
! result = lappend(result, exprstr);
}
! ExplainPropertyList(qlabel, result, es);
}
/*
--- 1802,1816 ----
/* Deparse the expression, showing any top-level cast */
exprstr = deparse_expression((Node *) target->expr, context,
useprefix, true);
!
! if (keyno < nPresortedKeys)
! resultPresorted = lappend(resultPresorted, exprstr);
! resultSort = lappend(resultSort, exprstr);
}
! ExplainPropertyList(qlabel, resultSort, es);
! if (nPresortedKeys > 0)
! ExplainPropertyList("Presorted Key", resultPresorted, es);
}
/*
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
new file mode 100644
index 09b2eb0..c25ed7d
*** a/src/backend/executor/nodeSort.c
--- b/src/backend/executor/nodeSort.c
***************
*** 15,25 ****
--- 15,52 ----
#include "postgres.h"
+ #include "access/htup_details.h"
#include "executor/execdebug.h"
#include "executor/nodeSort.h"
#include "miscadmin.h"
#include "utils/tuplesort.h"
+ /*
+ * Check if first "skipCols" sort values are equal.
+ */
+ static bool
+ cmpSortSkipCols(SortState *node, TupleDesc tupDesc, HeapTuple a, TupleTableSlot *b)
+ {
+ int n = ((Sort *)node->ss.ps.plan)->skipCols, i;
+ SortSupport sortKeys = tuplesort_get_sortkeys(node->tuplesortstate);
+
+ for (i = 0; i < n; i++)
+ {
+ Datum datumA, datumB;
+ bool isnullA, isnullB;
+ AttrNumber attno = sortKeys[i].ssup_attno;
+
+ datumA = heap_getattr(a, attno, tupDesc, &isnullA);
+ datumB = slot_getattr(b, attno, &isnullB);
+
+ if (ApplySortComparator(datumA, isnullA,
+ datumB, isnullB,
+ &sortKeys[i]))
+ return false;
+ }
+ return true;
+ }
+
/* ----------------------------------------------------------------
* ExecSort
*************** ExecSort(SortState *node)
*** 42,47 ****
--- 69,78 ----
ScanDirection dir;
Tuplesortstate *tuplesortstate;
TupleTableSlot *slot;
+ Sort *plannode = (Sort *) node->ss.ps.plan;
+ PlanState *outerNode;
+ TupleDesc tupDesc;
+ int skipCols = plannode->skipCols;
/*
* get state info from node
*************** ExecSort(SortState *node)
*** 54,87 ****
tuplesortstate = (Tuplesortstate *) node->tuplesortstate;
/*
* If first time through, read all tuples from outer plan and pass them to
* tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
*/
! if (!node->sort_Done)
! {
! Sort *plannode = (Sort *) node->ss.ps.plan;
! PlanState *outerNode;
! TupleDesc tupDesc;
!
! SO1_printf("ExecSort: %s\n",
! "sorting subplan");
! /*
! * Want to scan subplan in the forward direction while creating the
! * sorted data.
! */
! estate->es_direction = ForwardScanDirection;
! /*
! * Initialize tuplesort module.
! */
! SO1_printf("ExecSort: %s\n",
! "calling tuplesort_begin");
! outerNode = outerPlanState(node);
! tupDesc = ExecGetResultType(outerNode);
tuplesortstate = tuplesort_begin_heap(tupDesc,
plannode->numCols,
plannode->sortColIdx,
--- 85,128 ----
tuplesortstate = (Tuplesortstate *) node->tuplesortstate;
/*
+ * Return next tuple from sorted set if any.
+ */
+ if (node->sort_Done)
+ {
+ slot = node->ss.ps.ps_ResultTupleSlot;
+ if (tuplesort_gettupleslot(tuplesortstate,
+ ScanDirectionIsForward(dir),
+ slot) || node->finished)
+ return slot;
+ }
+
+ /*
* If first time through, read all tuples from outer plan and pass them to
* tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
*/
! SO1_printf("ExecSort: %s\n",
! "sorting subplan");
! /*
! * Want to scan subplan in the forward direction while creating the
! * sorted data.
! */
! estate->es_direction = ForwardScanDirection;
! /*
! * Initialize tuplesort module.
! */
! SO1_printf("ExecSort: %s\n",
! "calling tuplesort_begin");
! outerNode = outerPlanState(node);
! tupDesc = ExecGetResultType(outerNode);
+ if (node->tuplesortstate != NULL)
+ tuplesort_reset((Tuplesortstate *) node->tuplesortstate);
+ else
+ {
tuplesortstate = tuplesort_begin_heap(tupDesc,
plannode->numCols,
plannode->sortColIdx,
*************** ExecSort(SortState *node)
*** 93,131 ****
if (node->bounded)
tuplesort_set_bound(tuplesortstate, node->bound);
node->tuplesortstate = (void *) tuplesortstate;
! /*
! * Scan the subplan and feed all the tuples to tuplesort.
! */
! for (;;)
{
- slot = ExecProcNode(outerNode);
-
if (TupIsNull(slot))
break;
!
tuplesort_puttupleslot(tuplesortstate, slot);
}
! /*
! * Complete the sort.
! */
! tuplesort_performsort(tuplesortstate);
! /*
! * restore to user specified direction
! */
! estate->es_direction = dir;
! /*
! * finally set the sorted flag to true
! */
! node->sort_Done = true;
! node->bounded_Done = node->bounded;
! node->bound_Done = node->bound;
! SO1_printf("ExecSort: %s\n", "sorting done");
! }
SO1_printf("ExecSort: %s\n",
"retrieving tuple from tuplesort");
--- 134,208 ----
if (node->bounded)
tuplesort_set_bound(tuplesortstate, node->bound);
node->tuplesortstate = (void *) tuplesortstate;
+ }
! /*
! * Put the next group of tuples, whose first "skipCols" sort values are
! * equal, into tuplesort.
! */
! for (;;)
! {
! slot = ExecProcNode(outerNode);
! if (skipCols == 0)
{
if (TupIsNull(slot))
+ {
+ node->finished = true;
break;
! }
tuplesort_puttupleslot(tuplesortstate, slot);
}
+ else if (node->prev)
+ {
+ ExecStoreTuple(node->prev, node->ss.ps.ps_ResultTupleSlot, InvalidBuffer, false);
+ tuplesort_puttupleslot(tuplesortstate, node->ss.ps.ps_ResultTupleSlot);
! if (TupIsNull(slot))
! {
! node->finished = true;
! break;
! }
! else
! {
! bool cmp;
! cmp = cmpSortSkipCols(node, tupDesc, node->prev, slot);
! node->prev = ExecCopySlotTuple(slot);
! if (!cmp)
! break;
! }
! }
! else
! {
! if (TupIsNull(slot))
! {
! node->finished = true;
! break;
! }
! else
! {
! node->prev = ExecCopySlotTuple(slot);
! }
! }
! }
! /*
! * Complete the sort.
! */
! tuplesort_performsort(tuplesortstate);
! /*
! * restore to user specified direction
! */
! estate->es_direction = dir;
!
! /*
! * finally set the sorted flag to true
! */
! node->sort_Done = true;
! node->bounded_Done = node->bounded;
! node->bound_Done = node->bound;
! SO1_printf("ExecSort: %s\n", "sorting done");
SO1_printf("ExecSort: %s\n",
"retrieving tuple from tuplesort");
*************** ExecInitSort(Sort *node, EState *estate,
*** 174,180 ****
--- 251,259 ----
sortstate->bounded = false;
sortstate->sort_Done = false;
+ sortstate->finished = false;
sortstate->tuplesortstate = NULL;
+ sortstate->prev = NULL;
/*
* Miscellaneous initialization
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index e4184c5..b41213a
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copySort(const Sort *from)
*** 735,740 ****
--- 735,741 ----
CopyPlanFields((const Plan *) from, (Plan *) newnode);
COPY_SCALAR_FIELD(numCols);
+ COPY_SCALAR_FIELD(skipCols);
COPY_POINTER_FIELD(sortColIdx, from->numCols * sizeof(AttrNumber));
COPY_POINTER_FIELD(sortOperators, from->numCols * sizeof(Oid));
COPY_POINTER_FIELD(collations, from->numCols * sizeof(Oid));
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
new file mode 100644
index 50f0852..1a38407
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
*************** cost_recursive_union(Plan *runion, Plan
*** 1281,1295 ****
*/
void
cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, Cost input_cost, double tuples, int width,
! Cost comparison_cost, int sort_mem,
double limit_tuples)
{
! Cost startup_cost = input_cost;
! Cost run_cost = 0;
double input_bytes = relation_byte_size(tuples, width);
double output_bytes;
double output_tuples;
long sort_mem_bytes = sort_mem * 1024L;
if (!enable_sort)
--- 1281,1302 ----
*/
void
cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, int presorted_keys,
! Cost input_startup_cost, Cost input_total_cost,
! double tuples, int width, Cost comparison_cost, int sort_mem,
double limit_tuples)
{
! Cost startup_cost = input_startup_cost;
! Cost run_cost = 0,
! rest_cost,
! group_cost,
! input_run_cost = input_total_cost - input_startup_cost;
double input_bytes = relation_byte_size(tuples, width);
double output_bytes;
double output_tuples;
+ double num_groups,
+ group_input_bytes,
+ group_tuples;
long sort_mem_bytes = sort_mem * 1024L;
if (!enable_sort)
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1319,1331 ****
output_bytes = input_bytes;
}
! if (output_bytes > sort_mem_bytes)
{
/*
* We'll have to use a disk-based sort of all the tuples
*/
! double npages = ceil(input_bytes / BLCKSZ);
! double nruns = (input_bytes / sort_mem_bytes) * 0.5;
double mergeorder = tuplesort_merge_order(sort_mem_bytes);
double log_runs;
double npageaccesses;
--- 1326,1367 ----
output_bytes = input_bytes;
}
! if (presorted_keys > 0)
! {
! List *groupExprs = NIL;
! ListCell *l;
! int i = 0;
!
! foreach(l, pathkeys)
! {
! PathKey *key = (PathKey *)lfirst(l);
! EquivalenceMember *member = (EquivalenceMember *)
! lfirst(list_head(key->pk_eclass->ec_members));
!
! groupExprs = lappend(groupExprs, member->em_expr);
!
! i++;
! if (i >= presorted_keys)
! break;
! }
!
! num_groups = estimate_num_groups(root, groupExprs, tuples);
! }
! else
! {
! num_groups = 1.0;
! }
!
! group_input_bytes = input_bytes / num_groups;
! group_tuples = tuples / num_groups;
!
! if (output_bytes > sort_mem_bytes && group_input_bytes > sort_mem_bytes)
{
/*
* We'll have to use a disk-based sort of all the tuples
*/
! double npages = ceil(group_input_bytes / BLCKSZ);
! double nruns = (group_input_bytes / sort_mem_bytes) * 0.5;
double mergeorder = tuplesort_merge_order(sort_mem_bytes);
double log_runs;
double npageaccesses;
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1335,1341 ****
*
* Assume about N log2 N comparisons
*/
! startup_cost += comparison_cost * tuples * LOG2(tuples);
/* Disk costs */
--- 1371,1377 ----
*
* Assume about N log2 N comparisons
*/
! group_cost = comparison_cost * group_tuples * LOG2(group_tuples);
/* Disk costs */
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1346,1355 ****
log_runs = 1.0;
npageaccesses = 2.0 * npages * log_runs;
/* Assume 3/4ths of accesses are sequential, 1/4th are not */
! startup_cost += npageaccesses *
(seq_page_cost * 0.75 + random_page_cost * 0.25);
}
! else if (tuples > 2 * output_tuples || input_bytes > sort_mem_bytes)
{
/*
* We'll use a bounded heap-sort keeping just K tuples in memory, for
--- 1382,1391 ----
log_runs = 1.0;
npageaccesses = 2.0 * npages * log_runs;
/* Assume 3/4ths of accesses are sequential, 1/4th are not */
! group_cost += npageaccesses *
(seq_page_cost * 0.75 + random_page_cost * 0.25);
}
! else if (group_tuples > 2 * output_tuples || group_input_bytes > sort_mem_bytes)
{
/*
* We'll use a bounded heap-sort keeping just K tuples in memory, for
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1357,1368 ****
* factor is a bit higher than for quicksort. Tweak it so that the
* cost curve is continuous at the crossover point.
*/
! startup_cost += comparison_cost * tuples * LOG2(2.0 * output_tuples);
}
else
{
/* We'll use plain quicksort on all the input tuples */
! startup_cost += comparison_cost * tuples * LOG2(tuples);
}
/*
--- 1393,1404 ----
* factor is a bit higher than for quicksort. Tweak it so that the
* cost curve is continuous at the crossover point.
*/
! group_cost = comparison_cost * group_tuples * LOG2(2.0 * output_tuples);
}
else
{
/* We'll use plain quicksort on all the input tuples */
! group_cost = comparison_cost * group_tuples * LOG2(group_tuples);
}
/*
*************** cost_sort(Path *path, PlannerInfo *root,
*** 1373,1380 ****
--- 1409,1423 ----
* here --- the upper LIMIT will pro-rate the run cost so we'd be double
* counting the LIMIT otherwise.
*/
+ startup_cost += group_cost;
+ rest_cost = (num_groups * (output_tuples / tuples) - 1.0) * group_cost;
+ if (rest_cost > 0.0)
+ run_cost += rest_cost;
run_cost += cpu_operator_cost * tuples;
+ startup_cost += input_run_cost / num_groups;
+ run_cost += input_run_cost * ((num_groups - 1.0) / num_groups);
+
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
*************** initial_cost_mergejoin(PlannerInfo *root
*** 2075,2080 ****
--- 2118,2125 ----
cost_sort(&sort_path,
root,
outersortkeys,
+ pathkeys_common(outer_path->pathkeys, outersortkeys),
+ outer_path->startup_cost,
outer_path->total_cost,
outer_path_rows,
outer_path->parent->width,
*************** initial_cost_mergejoin(PlannerInfo *root
*** 2101,2106 ****
--- 2146,2153 ----
cost_sort(&sort_path,
root,
innersortkeys,
+ pathkeys_common(inner_path->pathkeys, innersortkeys),
+ inner_path->startup_cost,
inner_path->total_cost,
inner_path_rows,
inner_path->parent->width,
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
new file mode 100644
index 5b477e5..5909dfe
*** a/src/backend/optimizer/path/joinpath.c
--- b/src/backend/optimizer/path/joinpath.c
*************** sort_inner_and_outer(PlannerInfo *root,
*** 662,668 ****
cur_mergeclauses = find_mergeclauses_for_pathkeys(root,
outerkeys,
true,
! mergeclause_list);
/* Should have used them all... */
Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list));
--- 662,670 ----
cur_mergeclauses = find_mergeclauses_for_pathkeys(root,
outerkeys,
true,
! mergeclause_list,
! NULL,
! NULL);
/* Should have used them all... */
Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list));
*************** match_unsorted_outer(PlannerInfo *root,
*** 832,837 ****
--- 834,840 ----
List *mergeclauses;
List *innersortkeys;
List *trialsortkeys;
+ List *outersortkeys;
Path *cheapest_startup_inner;
Path *cheapest_total_inner;
int num_sortkeys;
*************** match_unsorted_outer(PlannerInfo *root,
*** 937,943 ****
mergeclauses = find_mergeclauses_for_pathkeys(root,
outerpath->pathkeys,
true,
! mergeclause_list);
/*
* Done with this outer path if no chance for a mergejoin.
--- 940,948 ----
mergeclauses = find_mergeclauses_for_pathkeys(root,
outerpath->pathkeys,
true,
! mergeclause_list,
! joinrel,
! &outersortkeys);
/*
* Done with this outer path if no chance for a mergejoin.
*************** match_unsorted_outer(PlannerInfo *root,
*** 961,967 ****
/* Compute the required ordering of the inner path */
innersortkeys = make_inner_pathkeys_for_merge(root,
mergeclauses,
! outerpath->pathkeys);
/*
* Generate a mergejoin on the basis of sorting the cheapest inner.
--- 966,972 ----
/* Compute the required ordering of the inner path */
innersortkeys = make_inner_pathkeys_for_merge(root,
mergeclauses,
! outersortkeys);
/*
* Generate a mergejoin on the basis of sorting the cheapest inner.
*************** match_unsorted_outer(PlannerInfo *root,
*** 980,986 ****
restrictlist,
merge_pathkeys,
mergeclauses,
! NIL,
innersortkeys);
/* Can't do anything else if inner path needs to be unique'd */
--- 985,991 ----
restrictlist,
merge_pathkeys,
mergeclauses,
! outersortkeys,
innersortkeys);
/* Can't do anything else if inner path needs to be unique'd */
*************** match_unsorted_outer(PlannerInfo *root,
*** 1038,1044 ****
for (sortkeycnt = num_sortkeys; sortkeycnt > 0; sortkeycnt--)
{
Path *innerpath;
- List *newclauses = NIL;
/*
* Look for an inner path ordered well enough for the first
--- 1043,1048 ----
*************** match_unsorted_outer(PlannerInfo *root,
*** 1055,1073 ****
compare_path_costs(innerpath, cheapest_total_inner,
TOTAL_COST) < 0))
{
- /* Found a cheap (or even-cheaper) sorted path */
- /* Select the right mergeclauses, if we didn't already */
- if (sortkeycnt < num_sortkeys)
- {
- newclauses =
- find_mergeclauses_for_pathkeys(root,
- trialsortkeys,
- false,
- mergeclauses);
- Assert(newclauses != NIL);
- }
- else
- newclauses = mergeclauses;
try_mergejoin_path(root,
joinrel,
jointype,
--- 1059,1064 ----
*************** match_unsorted_outer(PlannerInfo *root,
*** 1078,1086 ****
innerpath,
restrictlist,
merge_pathkeys,
! newclauses,
! NIL,
! NIL);
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
--- 1069,1077 ----
innerpath,
restrictlist,
merge_pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys);
cheapest_total_inner = innerpath;
}
/* Same on the basis of cheapest startup cost ... */
*************** match_unsorted_outer(PlannerInfo *root,
*** 1096,1119 ****
/* Found a cheap (or even-cheaper) sorted path */
if (innerpath != cheapest_total_inner)
{
- /*
- * Avoid rebuilding clause list if we already made one;
- * saves memory in big join trees...
- */
- if (newclauses == NIL)
- {
- if (sortkeycnt < num_sortkeys)
- {
- newclauses =
- find_mergeclauses_for_pathkeys(root,
- trialsortkeys,
- false,
- mergeclauses);
- Assert(newclauses != NIL);
- }
- else
- newclauses = mergeclauses;
- }
try_mergejoin_path(root,
joinrel,
jointype,
--- 1087,1092 ----
*************** match_unsorted_outer(PlannerInfo *root,
*** 1124,1132 ****
innerpath,
restrictlist,
merge_pathkeys,
! newclauses,
! NIL,
! NIL);
}
cheapest_startup_inner = innerpath;
}
--- 1097,1105 ----
innerpath,
restrictlist,
merge_pathkeys,
! mergeclauses,
! outersortkeys,
! innersortkeys);
}
cheapest_startup_inner = innerpath;
}
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
new file mode 100644
index 9c8ede6..63c0b03
*** a/src/backend/optimizer/path/pathkeys.c
--- b/src/backend/optimizer/path/pathkeys.c
***************
*** 26,31 ****
--- 26,32 ----
#include "optimizer/paths.h"
#include "optimizer/tlist.h"
#include "utils/lsyscache.h"
+ #include "utils/selfuncs.h"
static PathKey *make_canonical_pathkey(PlannerInfo *root,
*************** compare_pathkeys(List *keys1, List *keys
*** 312,317 ****
--- 313,344 ----
}
/*
+ * pathkeys_common
+ * Returns length of longest common prefix of keys1 and keys2.
+ */
+ int
+ pathkeys_common(List *keys1, List *keys2)
+ {
+ int n;
+ ListCell *key1,
+ *key2;
+ n = 0;
+
+ forboth(key1, keys1, key2, keys2)
+ {
+ PathKey *pathkey1 = (PathKey *) lfirst(key1);
+ PathKey *pathkey2 = (PathKey *) lfirst(key2);
+
+ if (pathkey1 != pathkey2)
+ return n;
+ n++;
+ }
+
+ return n;
+ }
+
+
+ /*
* pathkeys_contained_in
* Common special case of compare_pathkeys: we just want to know
* if keys2 are at least as well sorted as keys1.
*************** get_cheapest_path_for_pathkeys(List *pat
*** 368,373 ****
--- 395,421 ----
return matched_path;
}
+ static int
+ compare_bifractional_path_costs(Path *path1, Path *path2,
+ double fraction1, double fraction2)
+ {
+ Cost cost1,
+ cost2;
+
+ if (fraction1 <= 0.0 || fraction1 >= 1.0 ||
+ fraction2 <= 0.0 || fraction2 >= 1.0)
+ return compare_path_costs(path1, path2, TOTAL_COST);
+ cost1 = path1->startup_cost +
+ fraction1 * (path1->total_cost - path1->startup_cost);
+ cost2 = path2->startup_cost +
+ fraction2 * (path2->total_cost - path2->startup_cost);
+ if (cost1 < cost2)
+ return -1;
+ if (cost1 > cost2)
+ return +1;
+ return 0;
+ }
+
/*
* get_cheapest_fractional_path_for_pathkeys
* Find the cheapest path (for retrieving a specified fraction of all
*************** Path *
*** 386,411 ****
get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction)
{
Path *matched_path = NULL;
ListCell *l;
foreach(l, paths)
{
Path *path = (Path *) lfirst(l);
/*
* Since cost comparison is a lot cheaper than pathkey comparison, do
* that first. (XXX is that still true?)
*/
! if (matched_path != NULL &&
! compare_fractional_path_costs(matched_path, path, fraction) <= 0)
! continue;
! if (pathkeys_contained_in(pathkeys, path->pathkeys) &&
bms_is_subset(PATH_REQ_OUTER(path), required_outer))
matched_path = path;
}
return matched_path;
}
--- 434,508 ----
get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction,
! PlannerInfo *root,
! double tuples)
{
Path *matched_path = NULL;
+ int matched_n_common_pathkeys = 0,
+ costs_cmp, n_common_pathkeys,
+ n_pathkeys = list_length(pathkeys);
ListCell *l;
+ List *groupExprs = NIL;
+ double *num_groups, matched_fraction;
+ int i;
+
+ i = 0;
+ num_groups = (double *)palloc(sizeof(double) * list_length(pathkeys));
+ foreach(l, pathkeys)
+ {
+ PathKey *key = (PathKey *)lfirst(l);
+ EquivalenceMember *member = (EquivalenceMember *)
+ lfirst(list_head(key->pk_eclass->ec_members));
+
+ groupExprs = lappend(groupExprs, member->em_expr);
+
+ num_groups[i] = estimate_num_groups(root, groupExprs, tuples);
+ i++;
+ }
+
foreach(l, paths)
{
Path *path = (Path *) lfirst(l);
+ double current_fraction;
+
+ n_common_pathkeys = pathkeys_common(pathkeys, path->pathkeys);
+ if (n_common_pathkeys < matched_n_common_pathkeys ||
+ n_common_pathkeys == 0)
+ continue;
+
+ current_fraction = fraction;
+ if (n_common_pathkeys < n_pathkeys)
+ {
+ current_fraction += 1.0 / num_groups[n_common_pathkeys - 1];
+ current_fraction = Min(current_fraction, 1.0);
+ }
/*
* Since cost comparison is a lot cheaper than pathkey comparison, do
* that first. (XXX is that still true?)
*/
! if (matched_path != NULL)
! {
! costs_cmp = compare_bifractional_path_costs(matched_path, path,
! matched_fraction, current_fraction);
! }
! else
! {
! costs_cmp = 1;
! }
! if ((
! n_common_pathkeys > matched_n_common_pathkeys
! || (n_common_pathkeys == matched_n_common_pathkeys
! && costs_cmp > 0)) &&
bms_is_subset(PATH_REQ_OUTER(path), required_outer))
+ {
matched_path = path;
+ matched_n_common_pathkeys = n_common_pathkeys;
+ matched_fraction = current_fraction;
+ }
}
return matched_path;
}
*************** List *
*** 965,974 ****
find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos)
{
List *mergeclauses = NIL;
ListCell *i;
/* make sure we have eclasses cached in the clauses */
foreach(i, restrictinfos)
--- 1062,1077 ----
find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos,
! RelOptInfo *joinrel,
! List **outersortkeys)
{
List *mergeclauses = NIL;
ListCell *i;
+ bool *used = (bool *)palloc0(sizeof(bool) * list_length(restrictinfos));
+ int k;
+ List *unusedRestrictinfos = NIL;
+ List *usedPathkeys = NIL;
/* make sure we have eclasses cached in the clauses */
foreach(i, restrictinfos)
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1021,1026 ****
--- 1124,1130 ----
* deal with the case in create_mergejoin_plan().
*----------
*/
+ k = 0;
foreach(j, restrictinfos)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(j);
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1033,1039 ****
--- 1137,1147 ----
clause_ec = rinfo->outer_is_left ?
rinfo->right_ec : rinfo->left_ec;
if (clause_ec == pathkey_ec)
+ {
matched_restrictinfos = lappend(matched_restrictinfos, rinfo);
+ used[k] = true;
+ }
+ k++;
}
/*
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1044,1049 ****
--- 1152,1159 ----
if (matched_restrictinfos == NIL)
break;
+ usedPathkeys = lappend(usedPathkeys, pathkey);
+
/*
* If we did find usable mergeclause(s) for this sort-key position,
* add them to result list.
*************** find_mergeclauses_for_pathkeys(PlannerIn
*** 1051,1056 ****
--- 1161,1201 ----
mergeclauses = list_concat(mergeclauses, matched_restrictinfos);
}
+ if (outersortkeys)
+ {
+ List *addPathkeys, *addMergeclauses;
+
+ *outersortkeys = pathkeys;
+
+ if (!mergeclauses)
+ return mergeclauses;
+
+ k = 0;
+ foreach(i, restrictinfos)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) lfirst(i);
+ if (!used[k])
+ unusedRestrictinfos = lappend(unusedRestrictinfos, rinfo);
+ k++;
+ }
+
+ if (!unusedRestrictinfos)
+ return mergeclauses;
+
+ addPathkeys = select_outer_pathkeys_for_merge(root,
+ unusedRestrictinfos, joinrel);
+
+ if (!addPathkeys)
+ return mergeclauses;
+
+ addMergeclauses = find_mergeclauses_for_pathkeys(root,
+ addPathkeys, true, unusedRestrictinfos, NULL, NULL);
+
+ *outersortkeys = list_concat(usedPathkeys, addPathkeys);
+ mergeclauses = list_concat(mergeclauses, addMergeclauses);
+
+ }
+
return mergeclauses;
}
*************** right_merge_direction(PlannerInfo *root,
*** 1457,1472 ****
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
if (root->query_pathkeys == NIL)
return 0; /* no special ordering requested */
if (pathkeys == NIL)
return 0; /* unordered path */
! if (pathkeys_contained_in(root->query_pathkeys, pathkeys))
{
/* It's useful ... or at least the first N keys are */
! return list_length(root->query_pathkeys);
}
return 0; /* path ordering not useful */
--- 1602,1621 ----
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
+ int n;
+
if (root->query_pathkeys == NIL)
return 0; /* no special ordering requested */
if (pathkeys == NIL)
return 0; /* unordered path */
! n = pathkeys_common(root->query_pathkeys, pathkeys);
!
! if (n != 0)
{
/* It's useful ... or at least the first N keys are */
! return n;
}
return 0; /* path ordering not useful */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index 701fe78..8467e0d
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** static MergeJoin *make_mergejoin(List *t
*** 149,154 ****
--- 149,155 ----
Plan *lefttree, Plan *righttree,
JoinType jointype);
static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
+ List *pathkeys, int skipCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples);
*************** create_merge_append_plan(PlannerInfo *ro
*** 774,779 ****
--- 775,781 ----
Oid *sortOperators;
Oid *collations;
bool *nullsFirst;
+ int n_common_pathkeys;
/* Build the child plan */
subplan = create_plan_recurse(root, subpath);
*************** create_merge_append_plan(PlannerInfo *ro
*** 807,814 ****
numsortkeys * sizeof(bool)) == 0);
/* Now, insert a Sort node if subplan isn't sufficiently ordered */
! if (!pathkeys_contained_in(pathkeys, subpath->pathkeys))
subplan = (Plan *) make_sort(root, subplan, numsortkeys,
sortColIdx, sortOperators,
collations, nullsFirst,
best_path->limit_tuples);
--- 809,818 ----
numsortkeys * sizeof(bool)) == 0);
/* Now, insert a Sort node if subplan isn't sufficiently ordered */
! n_common_pathkeys = pathkeys_common(pathkeys, subpath->pathkeys);
! if (n_common_pathkeys < list_length(pathkeys))
subplan = (Plan *) make_sort(root, subplan, numsortkeys,
+ pathkeys, n_common_pathkeys,
sortColIdx, sortOperators,
collations, nullsFirst,
best_path->limit_tuples);
*************** create_mergejoin_plan(PlannerInfo *root,
*** 2181,2189 ****
disuse_physical_tlist(root, outer_plan, best_path->jpath.outerjoinpath);
outer_plan = (Plan *)
make_sort_from_pathkeys(root,
! outer_plan,
! best_path->outersortkeys,
! -1.0);
outerpathkeys = best_path->outersortkeys;
}
else
--- 2185,2195 ----
disuse_physical_tlist(root, outer_plan, best_path->jpath.outerjoinpath);
outer_plan = (Plan *)
make_sort_from_pathkeys(root,
! outer_plan,
! best_path->outersortkeys,
! -1.0,
! pathkeys_common(best_path->outersortkeys,
! best_path->jpath.outerjoinpath->pathkeys));
outerpathkeys = best_path->outersortkeys;
}
else
*************** create_mergejoin_plan(PlannerInfo *root,
*** 2194,2202 ****
disuse_physical_tlist(root, inner_plan, best_path->jpath.innerjoinpath);
inner_plan = (Plan *)
make_sort_from_pathkeys(root,
! inner_plan,
! best_path->innersortkeys,
! -1.0);
innerpathkeys = best_path->innersortkeys;
}
else
--- 2200,2210 ----
disuse_physical_tlist(root, inner_plan, best_path->jpath.innerjoinpath);
inner_plan = (Plan *)
make_sort_from_pathkeys(root,
! inner_plan,
! best_path->innersortkeys,
! -1.0,
! pathkeys_common(best_path->innersortkeys,
! best_path->jpath.innerjoinpath->pathkeys));
innerpathkeys = best_path->innersortkeys;
}
else
*************** make_mergejoin(List *tlist,
*** 3736,3741 ****
--- 3744,3750 ----
*/
static Sort *
make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
+ List *pathkeys, int skipCols,
AttrNumber *sortColIdx, Oid *sortOperators,
Oid *collations, bool *nullsFirst,
double limit_tuples)
*************** make_sort(PlannerInfo *root, Plan *leftt
*** 3745,3751 ****
Path sort_path; /* dummy for result of cost_sort */
copy_plan_costsize(plan, lefttree); /* only care about copying size */
! cost_sort(&sort_path, root, NIL,
lefttree->total_cost,
lefttree->plan_rows,
lefttree->plan_width,
--- 3754,3761 ----
Path sort_path; /* dummy for result of cost_sort */
copy_plan_costsize(plan, lefttree); /* only care about copying size */
! cost_sort(&sort_path, root, pathkeys, skipCols,
! lefttree->startup_cost,
lefttree->total_cost,
lefttree->plan_rows,
lefttree->plan_width,
*************** make_sort(PlannerInfo *root, Plan *leftt
*** 3759,3764 ****
--- 3769,3775 ----
plan->lefttree = lefttree;
plan->righttree = NULL;
node->numCols = numCols;
+ node->skipCols = skipCols;
node->sortColIdx = sortColIdx;
node->sortOperators = sortOperators;
node->collations = collations;
*************** find_ec_member_for_tle(EquivalenceClass
*** 4087,4093 ****
*/
Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
! double limit_tuples)
{
int numsortkeys;
AttrNumber *sortColIdx;
--- 4098,4104 ----
*/
Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
! double limit_tuples, int skipCols)
{
int numsortkeys;
AttrNumber *sortColIdx;
*************** make_sort_from_pathkeys(PlannerInfo *roo
*** 4107,4113 ****
&nullsFirst);
/* Now build the Sort node */
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, limit_tuples);
}
--- 4118,4124 ----
&nullsFirst);
/* Now build the Sort node */
! return make_sort(root, lefttree, numsortkeys, pathkeys, skipCols,
sortColIdx, sortOperators, collations,
nullsFirst, limit_tuples);
}
*************** make_sort_from_sortclauses(PlannerInfo *
*** 4150,4156 ****
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
--- 4161,4167 ----
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys, NIL, 0,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
*************** Sort *
*** 4172,4178 ****
make_sort_from_groupcols(PlannerInfo *root,
List *groupcls,
AttrNumber *grpColIdx,
! Plan *lefttree)
{
List *sub_tlist = lefttree->targetlist;
ListCell *l;
--- 4183,4190 ----
make_sort_from_groupcols(PlannerInfo *root,
List *groupcls,
AttrNumber *grpColIdx,
! Plan *lefttree,
! List *pathkeys, int skipCols)
{
List *sub_tlist = lefttree->targetlist;
ListCell *l;
*************** make_sort_from_groupcols(PlannerInfo *ro
*** 4205,4211 ****
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
--- 4217,4223 ----
numsortkeys++;
}
! return make_sort(root, lefttree, numsortkeys, pathkeys, skipCols,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
}
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
new file mode 100644
index 53fc238..4675402
*** a/src/backend/optimizer/plan/planagg.c
--- b/src/backend/optimizer/plan/planagg.c
*************** build_minmax_path(PlannerInfo *root, Min
*** 494,500 ****
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
subroot->query_pathkeys,
NULL,
! path_fraction);
if (!sorted_path)
return false;
--- 494,502 ----
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
subroot->query_pathkeys,
NULL,
! path_fraction,
! subroot,
! final_rel->rows);
if (!sorted_path)
return false;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 1da4b2f..df5563a
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** grouping_planner(PlannerInfo *root, doub
*** 1349,1355 ****
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
root->query_pathkeys,
NULL,
! tuple_fraction);
/* Don't consider same path in both guises; just wastes effort */
if (sorted_path == cheapest_path)
--- 1349,1357 ----
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
root->query_pathkeys,
NULL,
! tuple_fraction,
! root,
! path_rows);
/* Don't consider same path in both guises; just wastes effort */
if (sorted_path == cheapest_path)
*************** grouping_planner(PlannerInfo *root, doub
*** 1365,1374 ****
if (sorted_path)
{
Path sort_path; /* dummy for result of cost_sort */
if (root->query_pathkeys == NIL ||
! pathkeys_contained_in(root->query_pathkeys,
! cheapest_path->pathkeys))
{
/* No sort needed for cheapest path */
sort_path.startup_cost = cheapest_path->startup_cost;
--- 1367,1380 ----
if (sorted_path)
{
Path sort_path; /* dummy for result of cost_sort */
+ Path partial_sort_path; /* dummy for result of cost_sort */
+ int n_common_pathkeys;
+
+ n_common_pathkeys = pathkeys_common(root->query_pathkeys,
+ cheapest_path->pathkeys);
if (root->query_pathkeys == NIL ||
! n_common_pathkeys == list_length(root->query_pathkeys))
{
/* No sort needed for cheapest path */
sort_path.startup_cost = cheapest_path->startup_cost;
*************** grouping_planner(PlannerInfo *root, doub
*** 1378,1389 ****
{
/* Figure cost for sorting */
cost_sort(&sort_path, root, root->query_pathkeys,
cheapest_path->total_cost,
path_rows, path_width,
0.0, work_mem, root->limit_tuples);
}
! if (compare_fractional_path_costs(sorted_path, &sort_path,
tuple_fraction) > 0)
{
/* Presorted path is a loser */
--- 1384,1418 ----
{
/* Figure cost for sorting */
cost_sort(&sort_path, root, root->query_pathkeys,
+ n_common_pathkeys,
+ cheapest_path->startup_cost,
cheapest_path->total_cost,
path_rows, path_width,
0.0, work_mem, root->limit_tuples);
}
! n_common_pathkeys = pathkeys_common(root->query_pathkeys,
! sorted_path->pathkeys);
!
! if (root->query_pathkeys == NIL ||
! n_common_pathkeys == list_length(root->query_pathkeys))
! {
! /* No sort needed for cheapest path */
! partial_sort_path.startup_cost = sorted_path->startup_cost;
! partial_sort_path.total_cost = sorted_path->total_cost;
! }
! else
! {
! /* Figure cost for sorting */
! cost_sort(&partial_sort_path, root, root->query_pathkeys,
! n_common_pathkeys,
! sorted_path->startup_cost,
! sorted_path->total_cost,
! path_rows, path_width,
! 0.0, work_mem, root->limit_tuples);
! }
!
! if (compare_fractional_path_costs(&partial_sort_path, &sort_path,
tuple_fraction) > 0)
{
/* Presorted path is a loser */
*************** grouping_planner(PlannerInfo *root, doub
*** 1464,1476 ****
* results.
*/
bool need_sort_for_grouping = false;
result_plan = create_plan(root, best_path);
current_pathkeys = best_path->pathkeys;
/* Detect if we'll need an explicit sort for grouping */
if (parse->groupClause && !use_hashed_grouping &&
! !pathkeys_contained_in(root->group_pathkeys, current_pathkeys))
{
need_sort_for_grouping = true;
--- 1493,1508 ----
* results.
*/
bool need_sort_for_grouping = false;
+ int n_common_pathkeys_grouping;
result_plan = create_plan(root, best_path);
current_pathkeys = best_path->pathkeys;
/* Detect if we'll need an explicit sort for grouping */
+ n_common_pathkeys_grouping = pathkeys_common(root->group_pathkeys,
+ current_pathkeys);
if (parse->groupClause && !use_hashed_grouping &&
! n_common_pathkeys_grouping < list_length(root->group_pathkeys))
{
need_sort_for_grouping = true;
*************** grouping_planner(PlannerInfo *root, doub
*** 1564,1570 ****
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan);
current_pathkeys = root->group_pathkeys;
}
aggstrategy = AGG_SORTED;
--- 1596,1604 ----
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan,
! root->group_pathkeys,
! n_common_pathkeys_grouping);
current_pathkeys = root->group_pathkeys;
}
aggstrategy = AGG_SORTED;
*************** grouping_planner(PlannerInfo *root, doub
*** 1607,1613 ****
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan);
current_pathkeys = root->group_pathkeys;
}
--- 1641,1649 ----
make_sort_from_groupcols(root,
parse->groupClause,
groupColIdx,
! result_plan,
! root->group_pathkeys,
! n_common_pathkeys_grouping);
current_pathkeys = root->group_pathkeys;
}
*************** grouping_planner(PlannerInfo *root, doub
*** 1724,1736 ****
if (window_pathkeys)
{
Sort *sort_plan;
sort_plan = make_sort_from_pathkeys(root,
result_plan,
window_pathkeys,
! -1.0);
! if (!pathkeys_contained_in(window_pathkeys,
! current_pathkeys))
{
/* we do indeed need to sort */
result_plan = (Plan *) sort_plan;
--- 1760,1776 ----
if (window_pathkeys)
{
Sort *sort_plan;
+ int n_common_pathkeys;
+
+ n_common_pathkeys = pathkeys_common(window_pathkeys,
+ current_pathkeys);
sort_plan = make_sort_from_pathkeys(root,
result_plan,
window_pathkeys,
! -1.0,
! n_common_pathkeys);
! if (n_common_pathkeys < list_length(window_pathkeys))
{
/* we do indeed need to sort */
result_plan = (Plan *) sort_plan;
*************** grouping_planner(PlannerInfo *root, doub
*** 1876,1894 ****
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
! current_pathkeys = root->distinct_pathkeys;
else
{
! current_pathkeys = root->sort_pathkeys;
/* Assert checks that parser didn't mess up... */
Assert(pathkeys_contained_in(root->distinct_pathkeys,
! current_pathkeys));
}
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
! current_pathkeys,
! -1.0);
}
result_plan = (Plan *) make_unique(result_plan,
--- 1916,1936 ----
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
! needed_pathkeys = root->distinct_pathkeys;
else
{
! needed_pathkeys = root->sort_pathkeys;
/* Assert checks that parser didn't mess up... */
Assert(pathkeys_contained_in(root->distinct_pathkeys,
! needed_pathkeys));
}
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
! needed_pathkeys,
! -1.0,
! pathkeys_common(needed_pathkeys, current_pathkeys));
! current_pathkeys = needed_pathkeys;
}
result_plan = (Plan *) make_unique(result_plan,
*************** grouping_planner(PlannerInfo *root, doub
*** 1904,1915 ****
*/
if (parse->sortClause)
{
! if (!pathkeys_contained_in(root->sort_pathkeys, current_pathkeys))
{
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
root->sort_pathkeys,
! limit_tuples);
current_pathkeys = root->sort_pathkeys;
}
}
--- 1946,1960 ----
*/
if (parse->sortClause)
{
! int common = pathkeys_common(root->sort_pathkeys, current_pathkeys);
!
! if (common < list_length(root->sort_pathkeys))
{
result_plan = (Plan *) make_sort_from_pathkeys(root,
result_plan,
root->sort_pathkeys,
! limit_tuples,
! common);
current_pathkeys = root->sort_pathkeys;
}
}
*************** choose_hashed_grouping(PlannerInfo *root
*** 2654,2659 ****
--- 2699,2705 ----
List *current_pathkeys;
Path hashed_p;
Path sorted_p;
+ int n_common_pathkeys;
/*
* Executor doesn't support hashed aggregation with DISTINCT or ORDER BY
*************** choose_hashed_grouping(PlannerInfo *root
*** 2735,2741 ****
path_rows);
/* Result of hashed agg is always unsorted */
if (target_pathkeys)
! cost_sort(&hashed_p, root, target_pathkeys, hashed_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
--- 2781,2788 ----
path_rows);
/* Result of hashed agg is always unsorted */
if (target_pathkeys)
! cost_sort(&hashed_p, root, target_pathkeys, 0,
! hashed_p.startup_cost, hashed_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_grouping(PlannerInfo *root
*** 2751,2759 ****
sorted_p.total_cost = cheapest_path->total_cost;
current_pathkeys = cheapest_path->pathkeys;
}
! if (!pathkeys_contained_in(root->group_pathkeys, current_pathkeys))
{
! cost_sort(&sorted_p, root, root->group_pathkeys, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
current_pathkeys = root->group_pathkeys;
--- 2798,2809 ----
sorted_p.total_cost = cheapest_path->total_cost;
current_pathkeys = cheapest_path->pathkeys;
}
!
! n_common_pathkeys = pathkeys_common(root->group_pathkeys, current_pathkeys);
! if (n_common_pathkeys < list_length(root->group_pathkeys))
{
! cost_sort(&sorted_p, root, root->group_pathkeys,
! n_common_pathkeys, sorted_p.startup_cost, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
current_pathkeys = root->group_pathkeys;
*************** choose_hashed_grouping(PlannerInfo *root
*** 2768,2777 ****
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
/* The Agg or Group node will preserve ordering */
! if (target_pathkeys &&
! !pathkeys_contained_in(target_pathkeys, current_pathkeys))
! cost_sort(&sorted_p, root, target_pathkeys, sorted_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
--- 2818,2829 ----
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
+
/* The Agg or Group node will preserve ordering */
! n_common_pathkeys = pathkeys_common(target_pathkeys, current_pathkeys);
! if (target_pathkeys && n_common_pathkeys < list_length(target_pathkeys))
! cost_sort(&sorted_p, root, target_pathkeys, n_common_pathkeys,
! sorted_p.startup_cost, sorted_p.total_cost,
dNumGroups, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_distinct(PlannerInfo *root
*** 2824,2829 ****
--- 2876,2882 ----
List *needed_pathkeys;
Path hashed_p;
Path sorted_p;
+ int n_common_pathkeys;
/*
* If we have a sortable DISTINCT ON clause, we always use sorting. This
*************** choose_hashed_distinct(PlannerInfo *root
*** 2889,2895 ****
* need to charge for the final sort.
*/
if (parse->sortClause)
! cost_sort(&hashed_p, root, root->sort_pathkeys, hashed_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
--- 2942,2949 ----
* need to charge for the final sort.
*/
if (parse->sortClause)
! cost_sort(&hashed_p, root, root->sort_pathkeys, 0,
! hashed_p.startup_cost, hashed_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
*************** choose_hashed_distinct(PlannerInfo *root
*** 2906,2928 ****
needed_pathkeys = root->sort_pathkeys;
else
needed_pathkeys = root->distinct_pathkeys;
! if (!pathkeys_contained_in(needed_pathkeys, current_pathkeys))
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
current_pathkeys = root->distinct_pathkeys;
else
current_pathkeys = root->sort_pathkeys;
! cost_sort(&sorted_p, root, current_pathkeys, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
}
cost_group(&sorted_p, root, numDistinctCols, dNumDistinctRows,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
if (parse->sortClause &&
! !pathkeys_contained_in(root->sort_pathkeys, current_pathkeys))
! cost_sort(&sorted_p, root, root->sort_pathkeys, sorted_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
--- 2960,2989 ----
needed_pathkeys = root->sort_pathkeys;
else
needed_pathkeys = root->distinct_pathkeys;
!
! n_common_pathkeys = pathkeys_common(needed_pathkeys, current_pathkeys);
! if (n_common_pathkeys < list_length(needed_pathkeys))
{
if (list_length(root->distinct_pathkeys) >=
list_length(root->sort_pathkeys))
current_pathkeys = root->distinct_pathkeys;
else
current_pathkeys = root->sort_pathkeys;
! cost_sort(&sorted_p, root, current_pathkeys,
! n_common_pathkeys, sorted_p.startup_cost, sorted_p.total_cost,
path_rows, path_width,
0.0, work_mem, -1.0);
}
cost_group(&sorted_p, root, numDistinctCols, dNumDistinctRows,
sorted_p.startup_cost, sorted_p.total_cost,
path_rows);
+
+
+ n_common_pathkeys = pathkeys_common(root->sort_pathkeys, current_pathkeys);
if (parse->sortClause &&
! n_common_pathkeys < list_length(root->sort_pathkeys))
! cost_sort(&sorted_p, root, root->sort_pathkeys, n_common_pathkeys,
! sorted_p.startup_cost, sorted_p.total_cost,
dNumDistinctRows, path_width,
0.0, work_mem, limit_tuples);
*************** plan_cluster_use_sort(Oid tableOid, Oid
*** 3712,3719 ****
/* Estimate the cost of seq scan + sort */
seqScanPath = create_seqscan_path(root, rel, NULL);
! cost_sort(&seqScanAndSortPath, root, NIL,
! seqScanPath->total_cost, rel->tuples, rel->width,
comparisonCost, maintenance_work_mem, -1.0);
/* Estimate the cost of index scan */
--- 3773,3781 ----
/* Estimate the cost of seq scan + sort */
seqScanPath = create_seqscan_path(root, rel, NULL);
! cost_sort(&seqScanAndSortPath, root, NIL, 0,
! seqScanPath->startup_cost, seqScanPath->total_cost,
! rel->tuples, rel->width,
comparisonCost, maintenance_work_mem, -1.0);
/* Estimate the cost of index scan */
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
new file mode 100644
index e249628..b0b5471
*** a/src/backend/optimizer/prep/prepunion.c
--- b/src/backend/optimizer/prep/prepunion.c
*************** choose_hashed_setop(PlannerInfo *root, L
*** 859,865 ****
sorted_p.startup_cost = input_plan->startup_cost;
sorted_p.total_cost = input_plan->total_cost;
/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
! cost_sort(&sorted_p, root, NIL, sorted_p.total_cost,
input_plan->plan_rows, input_plan->plan_width,
0.0, work_mem, -1.0);
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
--- 859,866 ----
sorted_p.startup_cost = input_plan->startup_cost;
sorted_p.total_cost = input_plan->total_cost;
/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
! cost_sort(&sorted_p, root, NIL, 0,
! sorted_p.startup_cost, sorted_p.total_cost,
input_plan->plan_rows, input_plan->plan_width,
0.0, work_mem, -1.0);
cost_group(&sorted_p, root, numGroupCols, dNumGroups,
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
new file mode 100644
index a7169ef..3d0a842
*** a/src/backend/optimizer/util/pathnode.c
--- b/src/backend/optimizer/util/pathnode.c
*************** create_merge_append_path(PlannerInfo *ro
*** 971,980 ****
foreach(l, subpaths)
{
Path *subpath = (Path *) lfirst(l);
pathnode->path.rows += subpath->rows;
! if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
{
/* Subpath is adequately ordered, we won't need to sort it */
input_startup_cost += subpath->startup_cost;
--- 971,981 ----
foreach(l, subpaths)
{
Path *subpath = (Path *) lfirst(l);
+ int n_common_pathkeys = pathkeys_common(pathkeys, subpath->pathkeys);
pathnode->path.rows += subpath->rows;
! if (n_common_pathkeys == list_length(pathkeys))
{
/* Subpath is adequately ordered, we won't need to sort it */
input_startup_cost += subpath->startup_cost;
*************** create_merge_append_path(PlannerInfo *ro
*** 988,993 ****
--- 989,996 ----
cost_sort(&sort_path,
root,
pathkeys,
+ n_common_pathkeys,
+ subpath->startup_cost,
subpath->total_cost,
subpath->parent->tuples,
subpath->parent->width,
*************** create_unique_path(PlannerInfo *root, Re
*** 1343,1349 ****
/*
* Estimate cost for sort+unique implementation
*/
! cost_sort(&sort_path, root, NIL,
subpath->total_cost,
rel->rows,
rel->width,
--- 1346,1353 ----
/*
* Estimate cost for sort+unique implementation
*/
! cost_sort(&sort_path, root, NIL, 0,
! subpath->startup_cost,
subpath->total_cost,
rel->rows,
rel->width,
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
new file mode 100644
index 52f05e6..8983251
*** a/src/backend/utils/sort/tuplesort.c
--- b/src/backend/utils/sort/tuplesort.c
*************** tuplesort_end(Tuplesortstate *state)
*** 960,965 ****
--- 960,984 ----
MemoryContextDelete(state->sortcontext);
}
+ void
+ tuplesort_reset(Tuplesortstate *state)
+ {
+ int i;
+
+ if (state->tapeset)
+ LogicalTapeSetClose(state->tapeset);
+
+ for (i = 0; i < state->memtupcount; i++)
+ free_sort_tuple(state, state->memtuples + i);
+
+ state->status = TSS_INITIAL;
+ state->memtupcount = 0;
+ state->boundUsed = false;
+ state->tapeset = NULL;
+ state->currentRun = 0;
+ state->result_tape = -1;
+ }
+
/*
* Grow the memtuples[] array, if possible within our memory constraint. We
* must not exceed INT_MAX tuples in memory or the caller-provided memory
*************** free_sort_tuple(Tuplesortstate *state, S
*** 3525,3527 ****
--- 3544,3553 ----
FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
pfree(stup->tuple);
}
+
+ SortSupport
+ tuplesort_get_sortkeys(Tuplesortstate *state)
+ {
+ return state->sortKeys;
+ }
+
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index 2a7b36e..76aab79
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
*************** typedef struct SortState
*** 1664,1671 ****
--- 1664,1673 ----
int64 bound; /* if bounded, how many tuples are needed */
bool sort_Done; /* sort completed yet? */
bool bounded_Done; /* value of bounded we did the sort with */
+ bool finished;
int64 bound_Done; /* value of bound we did the sort with */
void *tuplesortstate; /* private state of tuplesort.c */
+ HeapTuple prev;
} SortState;
/* ---------------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 101e22c..28b871e
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef struct Sort
*** 582,587 ****
--- 582,588 ----
{
Plan plan;
int numCols; /* number of sort-key columns */
+ int skipCols;
AttrNumber *sortColIdx; /* their indexes in the target list */
Oid *sortOperators; /* OIDs of operators to sort them by */
Oid *collations; /* OIDs of collations */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
new file mode 100644
index 444ab74..e98fb0c
*** a/src/include/optimizer/cost.h
--- b/src/include/optimizer/cost.h
*************** extern void cost_ctescan(Path *path, Pla
*** 88,95 ****
RelOptInfo *baserel, ParamPathInfo *param_info);
extern void cost_recursive_union(Plan *runion, Plan *nrterm, Plan *rterm);
extern void cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, Cost input_cost, double tuples, int width,
! Cost comparison_cost, int sort_mem,
double limit_tuples);
extern void cost_merge_append(Path *path, PlannerInfo *root,
List *pathkeys, int n_streams,
--- 88,96 ----
RelOptInfo *baserel, ParamPathInfo *param_info);
extern void cost_recursive_union(Plan *runion, Plan *nrterm, Plan *rterm);
extern void cost_sort(Path *path, PlannerInfo *root,
! List *pathkeys, int presorted_keys,
! Cost input_startup_cost, Cost input_total_cost,
! double tuples, int width, Cost comparison_cost, int sort_mem,
double limit_tuples);
extern void cost_merge_append(Path *path, PlannerInfo *root,
List *pathkeys, int n_streams,
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
new file mode 100644
index dfe3a22..2b3313b
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** typedef enum
*** 148,160 ****
extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2);
extern bool pathkeys_contained_in(List *keys1, List *keys2);
extern Path *get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
Relids required_outer,
CostSelector cost_criterion);
extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction);
extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
ScanDirection scandir);
extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr,
--- 148,163 ----
extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2);
extern bool pathkeys_contained_in(List *keys1, List *keys2);
+ extern int pathkeys_common(List *keys1, List *keys2);
extern Path *get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
Relids required_outer,
CostSelector cost_criterion);
extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys,
Relids required_outer,
! double fraction,
! PlannerInfo *root,
! double tuples);
extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
ScanDirection scandir);
extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr,
*************** extern void update_mergeclause_eclasses(
*** 176,182 ****
extern List *find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos);
extern List *select_outer_pathkeys_for_merge(PlannerInfo *root,
List *mergeclauses,
RelOptInfo *joinrel);
--- 179,187 ----
extern List *find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys,
bool outer_keys,
! List *restrictinfos,
! RelOptInfo *joinrel,
! List **outerpathkeys);
extern List *select_outer_pathkeys_for_merge(PlannerInfo *root,
List *mergeclauses,
RelOptInfo *joinrel);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
new file mode 100644
index ba7ae7c..d33c615
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
*************** extern RecursiveUnion *make_recursive_un
*** 50,60 ****
Plan *lefttree, Plan *righttree, int wtParam,
List *distinctList, long numGroups);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
! List *pathkeys, double limit_tuples);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
! AttrNumber *grpColIdx, Plan *lefttree);
extern Agg *make_agg(PlannerInfo *root, List *tlist, List *qual,
AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
--- 50,61 ----
Plan *lefttree, Plan *righttree, int wtParam,
List *distinctList, long numGroups);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
! List *pathkeys, double limit_tuples, int skipCols);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
! AttrNumber *grpColIdx, Plan *lefttree, List *pathkeys,
! int skipCols);
extern Agg *make_agg(PlannerInfo *root, List *tlist, List *qual,
AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators,
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
new file mode 100644
index 5f87254..d5bc45e
*** a/src/include/utils/tuplesort.h
--- b/src/include/utils/tuplesort.h
***************
*** 24,29 ****
--- 24,30 ----
#include "executor/tuptable.h"
#include "fmgr.h"
#include "utils/relcache.h"
+ #include "utils/sortsupport.h"
/* Tuplesortstate is an opaque type whose details are not known outside
*************** extern bool tuplesort_skiptuples(Tupleso
*** 104,109 ****
--- 105,112 ----
extern void tuplesort_end(Tuplesortstate *state);
+ extern void tuplesort_reset(Tuplesortstate *state);
+
extern void tuplesort_get_stats(Tuplesortstate *state,
const char **sortMethod,
const char **spaceType,
*************** extern void tuplesort_get_stats(Tuplesor
*** 111,116 ****
--- 114,121 ----
extern int tuplesort_merge_order(int64 allowedMem);
+ extern SortSupport tuplesort_get_sortkeys(Tuplesortstate *state);
+
/*
* These routines may only be called if randomAccess was specified 'true'.
* Likewise, backwards scan in gettuple/getdatum is only allowed if
Attachment: partial-sort-4-resetdiff.patch (text/x-patch)
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
new file mode 100644
index 02dcd7a..c25ed7d
*** a/src/backend/executor/nodeSort.c
--- b/src/backend/executor/nodeSort.c
*************** ExecSort(SortState *node)
*** 120,137 ****
tupDesc = ExecGetResultType(outerNode);
if (node->tuplesortstate != NULL)
! tuplesort_end((Tuplesortstate *) node->tuplesortstate);
! tuplesortstate = tuplesort_begin_heap(tupDesc,
! plannode->numCols,
! plannode->sortColIdx,
! plannode->sortOperators,
! plannode->collations,
! plannode->nullsFirst,
! work_mem,
! node->randomAccess);
! if (node->bounded)
! tuplesort_set_bound(tuplesortstate, node->bound);
! node->tuplesortstate = (void *) tuplesortstate;
/*
* Put next group of tuples where skipCols" sort values are equal to
--- 120,140 ----
tupDesc = ExecGetResultType(outerNode);
if (node->tuplesortstate != NULL)
! tuplesort_reset((Tuplesortstate *) node->tuplesortstate);
! else
! {
! tuplesortstate = tuplesort_begin_heap(tupDesc,
! plannode->numCols,
! plannode->sortColIdx,
! plannode->sortOperators,
! plannode->collations,
! plannode->nullsFirst,
! work_mem,
! node->randomAccess);
! if (node->bounded)
! tuplesort_set_bound(tuplesortstate, node->bound);
! node->tuplesortstate = (void *) tuplesortstate;
! }
/*
* Put next group of tuples where skipCols" sort values are equal to
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
new file mode 100644
index 6a09138..8983251
*** a/src/backend/utils/sort/tuplesort.c
--- b/src/backend/utils/sort/tuplesort.c
*************** tuplesort_end(Tuplesortstate *state)
*** 960,965 ****
--- 960,984 ----
MemoryContextDelete(state->sortcontext);
}
+ void
+ tuplesort_reset(Tuplesortstate *state)
+ {
+ int i;
+
+ if (state->tapeset)
+ LogicalTapeSetClose(state->tapeset);
+
+ for (i = 0; i < state->memtupcount; i++)
+ free_sort_tuple(state, state->memtuples + i);
+
+ state->status = TSS_INITIAL;
+ state->memtupcount = 0;
+ state->boundUsed = false;
+ state->tapeset = NULL;
+ state->currentRun = 0;
+ state->result_tape = -1;
+ }
+
/*
* Grow the memtuples[] array, if possible within our memory constraint. We
* must not exceed INT_MAX tuples in memory or the caller-provided memory
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
new file mode 100644
index 5a65cd2..d5bc45e
*** a/src/include/utils/tuplesort.h
--- b/src/include/utils/tuplesort.h
*************** extern bool tuplesort_skiptuples(Tupleso
*** 105,110 ****
--- 105,112 ----
extern void tuplesort_end(Tuplesortstate *state);
+ extern void tuplesort_reset(Tuplesortstate *state);
+
extern void tuplesort_get_stats(Tuplesortstate *state,
const char **sortMethod,
const char **spaceType,
On Tue, Dec 31, 2013 at 2:41 PM, Andreas Karlsson <andreas@proxel.se> wrote:
On 12/29/2013 08:24 AM, David Rowley wrote:
If it was possible to devise some way to reuse any
previous tuplesortstate perhaps just inventing a reset method which
clears out tuples, then we could see performance exceed the standard
seqscan -> sort. The code the way it is seems to lookup the sort
functions from the syscache for each group then allocate some sort
space, so quite a bit of time is also spent in palloc0() and pfree()
If it was not possible to do this then maybe adding a cost to the number
of sort groups would be better so that the optimization is skipped if
there are too many sort groups.
It should be possible. I have hacked a quick proof of concept for reusing
the tuplesort state. Can you try it and see if the performance regression
is fixed by this?
One thing which have to be fixed with my patch is that we probably want to
close the tuplesort once we have returned the last tuple from ExecSort().
I have attached my patch and the incremental patch on Alexander's patch.
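The reuse idea discussed above can be sketched outside PostgreSQL as follows. This is only an illustration under stated assumptions: GroupSorter and the sorter_* functions are hypothetical stand-ins, not the real Tuplesortstate API, and qsort() stands in for the actual sort. The point it shows is that a reset keeps the buffer allocated, so the per-group setup cost is paid once rather than once per group.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a sort state; NOT the real Tuplesortstate. */
typedef struct GroupSorter
{
    int    *items;        /* buffer kept across groups */
    size_t  count;        /* items currently held */
    size_t  capacity;     /* allocated slots */
    size_t  allocations;  /* how many times the buffer was (re)allocated */
} GroupSorter;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Create an empty sorter; returns NULL if out of memory. */
static GroupSorter *sorter_begin(void)
{
    return calloc(1, sizeof(GroupSorter));
}

/* Add one value, growing the buffer if needed; returns 0 on success. */
static int sorter_put(GroupSorter *s, int value)
{
    assert(s != NULL);
    if (s->count == s->capacity)
    {
        size_t  newcap = s->capacity ? s->capacity * 2 : 8;
        int    *p = realloc(s->items, newcap * sizeof(int));

        if (p == NULL)
            return -1;
        s->items = p;
        s->capacity = newcap;
        s->allocations++;
    }
    s->items[s->count++] = value;
    return 0;
}

/* Sort the values of the current group. */
static void sorter_sort(GroupSorter *s)
{
    if (s->count > 1)
        qsort(s->items, s->count, sizeof(int), cmp_int);
}

/* Forget the current group but keep the buffer for the next one. */
static void sorter_reset(GroupSorter *s)
{
    s->count = 0;
}

/* Release everything; call once after the last group. */
static void sorter_end(GroupSorter *s)
{
    free(s->items);
    free(s);
}
```

In this sketch a caller would invoke sorter_begin() once, then sorter_reset(), sorter_put() and sorter_sort() per group, and sorter_end() after the last group, which mirrors the concern about closing the state once the last tuple has been returned.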
Thanks, the attached is about 5 times faster than it was previously with my
test case upthread.
The times now look like:
No pre-sortable index:
Total runtime: 86.278 ms
With pre-sortable index with partial sorting
Total runtime: 47.500 ms
With the query where there is no index the sort remained in memory.
I spent some time trying to find a case where the partial sort is slower
than the seqscan -> sort. The only cases where partial sort seems slower are
those where the estimated number of sort groups is around the crossover point
at which the planner would start to consider a seqscan -> sort
instead. I'm thinking right now that it's not worth raising the costs
around this, as the partial sort is less likely to become a disk sort than
the full sort is.
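One way to read that last remark, offered as an assumption rather than something stated in the thread: a partial sort only has to hold one group of equal presorted keys at a time, so its peak memory follows the largest group instead of the whole input. The hypothetical helper below just measures that largest group for an already ordered key column.

```c
#include <assert.h>
#include <stddef.h>

/*
 * keys[] must already be ordered on the presorted column.
 * Returns the length of the longest run of equal keys, i.e. the largest
 * group a partial sort would have to hold at once in this simplified model.
 */
static size_t largest_group(const int *keys, size_t n)
{
    size_t  best = 0;
    size_t  run = 0;
    size_t  i;

    assert(keys != NULL || n == 0);
    for (i = 0; i < n; i++)
    {
        if (i > 0 && keys[i] == keys[i - 1])
            run++;          /* same group as the previous key */
        else
            run = 1;        /* a new group starts here */
        if (run > best)
            best = run;
    }
    return best;
}
```

A full sort of the same input would have to hold all n rows, so whenever the largest group is much smaller than n the partial sort has far more headroom before spilling to disk.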
I'll keep going with trying to break it.
Regards
David Rowley
--
Andreas Karlsson
David Rowley wrote:
I was about to test it tonight, but I'm having trouble getting the patch to
compile... I'm really wondering which compiler you are using as it seems
you're declaring your variables in some strange places. See nodeSort.c
line 101. These variables are declared after there has been an if statement
in the same scope. That's not valid in C. (The patch did however apply
without any complaints).
AFAIR C99 allows mixed declarations and code. Visual Studio only
implements C89 though, which is why it fails to compile there.
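A minimal illustration of the difference, with hypothetical function names that have nothing to do with nodeSort.c: the first function declares variables after a statement, which is accepted from C99 onward, while the second hoists every declaration to the top of the block so that a C89-only compiler accepts it too.

```c
#include <assert.h>

/*
 * C99 and later: 'total' and 'i' are declared after statements in the
 * same block.  A compiler restricted to C89 rejects this function.
 */
static int sum_doubled_c99(const int *v, int n)
{
    assert(n >= 0);
    if (n == 0)
        return 0;

    int total = 0;                  /* declaration follows a statement */

    for (int i = 0; i < n; i++)     /* loop-scoped declaration, also C99 */
        total += 2 * v[i];
    return total;
}

/* C89-compatible: every declaration sits at the top of its block. */
static int sum_doubled_c89(const int *v, int n)
{
    int total = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        total += 2 * v[i];
    return total;
}
```

Both functions compute the same result; only the placement of the declarations differs, which is why a patch written in the first style applies cleanly but fails to build with a C89-only compiler.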
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 12/28/2013 04:51 PM, Alexander Korotkov wrote:
I've compiled it with clang. Yeah, there were mixed declarations. I've
rechecked it with gcc, now it gives no warnings. I didn't try it with
visual studio, but I hope it will be OK.
I looked at this version of the patch and noticed a possibility for
improvement. You could decrement the bound for the tuplesort after every
completed sort. Otherwise the optimizations for small limits won't apply
to partial sort.
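A rough sketch of the suggestion, assuming a simplified two-column row and using qsort() as a stand-in for the real bounded tuplesort (Row, cmp_k2 and partial_sort_limit are hypothetical names, not part of the patch): after each group is sorted and emitted, the remaining bound shrinks by the number of rows already returned, so later groups only need their first few rows.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical two-column tuple: k1 is presorted, k2 is not. */
typedef struct Row
{
    int     k1;
    double  k2;
} Row;

static int cmp_k2(const void *a, const void *b)
{
    double x = ((const Row *) a)->k2;
    double y = ((const Row *) b)->k2;

    return (x > y) - (x < y);
}

/*
 * rows[] must already be ordered by k1, as an index scan would deliver it.
 * Writes at most 'limit' rows, ordered by (k1, k2), into out[] and returns
 * how many were written.  rows[] is reordered in place within each group.
 */
static size_t partial_sort_limit(Row *rows, size_t n, size_t limit, Row *out)
{
    size_t  remaining = limit;  /* the bound still to be satisfied */
    size_t  emitted = 0;
    size_t  start = 0;

    assert(rows != NULL || n == 0);
    while (start < n && remaining > 0)
    {
        size_t  end = start + 1;
        size_t  take;

        /* Find the end of the group sharing the presorted key. */
        while (end < n && rows[end].k1 == rows[start].k1)
            end++;

        /* Stand-in for a bounded sort: only 'remaining' rows matter. */
        qsort(rows + start, end - start, sizeof(Row), cmp_k2);

        take = end - start;
        if (take > remaining)
            take = remaining;
        memcpy(out + emitted, rows + start, take * sizeof(Row));

        emitted += take;
        remaining -= take;      /* shrink the bound for the next group */
        start = end;
    }
    return emitted;
}
```

Here the full qsort() per group hides the benefit; in a real bounded sort the value of remaining would be passed as the bound, so a group larger than the bound could use a top-N strategy instead of sorting every row.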
--
Andreas Karlsson
On Tue, Dec 31, 2013 at 5:41 AM, Andreas Karlsson <andreas@proxel.se> wrote:
On 12/29/2013 08:24 AM, David Rowley wrote:
If it was possible to devise some way to reuse any
previous tuplesortstate perhaps just inventing a reset method which
clears out tuples, then we could see performance exceed the standard
seqscan -> sort. The code the way it is seems to lookup the sort
functions from the syscache for each group then allocate some sort
space, so quite a bit of time is also spent in palloc0() and pfree()
If it was not possible to do this then maybe adding a cost to the number
of sort groups would be better so that the optimization is skipped if
there are too many sort groups.
It should be possible. I have hacked a quick proof of concept for reusing
the tuplesort state. Can you try it and see if the performance regression
is fixed by this?
One thing which have to be fixed with my patch is that we probably want to
close the tuplesort once we have returned the last tuple from ExecSort().
I have attached my patch and the incremental patch on Alexander's patch.
Thanks. It's included in the attached version of the patch, along with
estimation improvements, more comments and a fix for the regression tests.
------
With best regards,
Alexander Korotkov.
Attachments:
Hi Alexander,
First, thanks a lot for working on this feature. This PostgreSQL
shortcoming crops up all the time in web applications that implement
paging by multiple sorted columns.
I've been trying it out in a few situations. I implemented a new
enable_partialsort GUC to turn it on and off, which makes it a
lot easier to test. The attached patch applies on top of
partial-sort-5.patch.
I will spend more time reviewing the patch, but some of this planner code
is over my head. If there's any way I can help to make sure this lands in
the next version, let me know.
----
The patch performs just as well as I would expect it to:
marti=# select ac.name, r.name from artist_credit ac join release r on (
ac.id=r.artist_credit) order by ac.name, r.name limit 1000;
Time: 9.830 ms
marti=# set enable_partialsort = off;
marti=# select ac.name, r.name from artist_credit ac join release r on (
ac.id=r.artist_credit) order by ac.name, r.name limit 1000;
Time: 1442.815 ms
A difference of almost 150x!
There's a missed opportunity in that the code doesn't consider pushing new
Sort steps into subplans. For example, if there's no index on
language(name) then this query cannot take advantage of partial sorts:
marti=# explain select l.name, r.name from language l join release r on (
l.id=r.language) order by l.name, r.name limit 1000;
Limit (cost=123203.20..123205.70 rows=1000 width=32)
-> Sort (cost=123203.20..126154.27 rows=1180430 width=32)
Sort Key: l.name, r.name
-> Hash Join (cost=229.47..58481.49 rows=1180430 width=32)
Hash Cond: (r.language = l.id)
-> Seq Scan on release r (cost=0.00..31040.10 rows=1232610
width=26)
-> Hash (cost=131.43..131.43 rows=7843 width=14)
-> Seq Scan on language l (cost=0.00..131.43
rows=7843 width=14)
But because there are so few languages, it would be a lot faster to
sort languages in advance and then do a partial sort:
Limit (rows=1000 width=31)
-> Partial sort (rows=1180881 width=31)
Sort Key: l.name, r.name
Presorted Key: l.name
-> Nested Loop (rows=1180881 width=31)
-> Sort (rows=7843 width=10)
Sort Key: name
-> Seq Scan on language (rows=7843 width=14)
-> Index Scan using release_language_idx on release r
(rows=11246 width=25)
Index Cond: (language = l.id)
Even an explicit sorted CTE cannot take advantage of partial sorts:
marti=# explain with sorted_lang as (select id, name from language order by
name)
marti-# select l.name, r.name from sorted_lang l join release r on
(l.id=r.language)
order by l.name, r.name limit 1000;
Limit (cost=3324368.83..3324371.33 rows=1000 width=240)
CTE sorted_lang
-> Sort (cost=638.76..658.37 rows=7843 width=14)
Sort Key: language.name
-> Seq Scan on language (cost=0.00..131.43 rows=7843 width=14)
-> Sort (cost=3323710.46..3439436.82 rows=46290543 width=240)
Sort Key: l.name, r.name
-> Merge Join (cost=664.62..785649.92 rows=46290543 width=240)
Merge Cond: (r.language = l.id)
-> Index Scan using release_language_idx on release r
(cost=0.43..87546.06 rows=1232610 width=26)
-> Sort (cost=664.19..683.80 rows=7843 width=222)
Sort Key: l.id
-> CTE Scan on sorted_lang l (cost=0.00..156.86
rows=7843 width=222)
But even with these limitations, this will easily be the killer feature of
the next release, for me at least.
Regards,
Marti
Attachments:
0001-Add-enable_partialsort-GUC-for-disabling-partial-sor.patch (text/x-patch; charset=US-ASCII)
From 3f05447e7feb99583336b381df60ff013a144bab Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <marti@juffo.org>
Date: Mon, 13 Jan 2014 22:24:20 +0200
Subject: [PATCH] Add enable_partialsort GUC for disabling partial sorts
---
doc/src/sgml/config.sgml | 13 +++++++++++++
src/backend/optimizer/path/costsize.c | 3 ++-
src/backend/optimizer/path/pathkeys.c | 1 +
src/backend/utils/misc/guc.c | 10 ++++++++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/optimizer/cost.h | 1 +
6 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0f2f2bf..1995625 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2808,6 +2808,19 @@ include 'filename'
</listitem>
</varlistentry>
+ <varlistentry id="guc-enable-partialsort" xreflabel="enable_partialsort">
+ <term><varname>enable_partialsort</varname> (<type>boolean</type>)</term>
+ <indexterm>
+ <primary><varname>enable_partialsort</> configuration parameter</primary>
+ </indexterm>
+ <listitem>
+ <para>
+ Enables or disables the query planner's use of partial sort steps.
+ The default is <literal>on</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-enable-tidscan" xreflabel="enable_tidscan">
<term><varname>enable_tidscan</varname> (<type>boolean</type>)</term>
<indexterm>
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index da64825..cefd480 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -112,6 +112,7 @@ bool enable_indexonlyscan = true;
bool enable_bitmapscan = true;
bool enable_tidscan = true;
bool enable_sort = true;
+bool enable_partialsort = true;
bool enable_hashagg = true;
bool enable_nestloop = true;
bool enable_material = true;
@@ -1329,7 +1330,7 @@ cost_sort(Path *path, PlannerInfo *root,
/*
* Estimate number of groups which dataset is divided by presorted keys.
*/
- if (presorted_keys > 0)
+ if (presorted_keys > 0 && enable_partialsort)
{
List *groupExprs = NIL;
ListCell *l;
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 55d8ef4..d5a1357 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -22,6 +22,7 @@
#include "nodes/nodeFuncs.h"
#include "nodes/plannodes.h"
#include "optimizer/clauses.h"
+#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/tlist.h"
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1217098..c3f2f29 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1,3 +1,4 @@
+
/*--------------------------------------------------------------------
* guc.c
*
@@ -724,6 +725,15 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
{
+ {"enable_partialsort", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables the planner's use of partial sort steps."),
+ NULL
+ },
+ &enable_partialsort,
+ true,
+ NULL, NULL, NULL
+ },
+ {
{"enable_hashagg", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of hashed aggregation plans."),
NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 27791cc..20072fb 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -270,6 +270,7 @@
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
+#enable_partialsort = on
#enable_tidscan = on
# - Planner Cost Constants -
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 47aef12..30203c7 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -56,6 +56,7 @@ extern bool enable_indexonlyscan;
extern bool enable_bitmapscan;
extern bool enable_tidscan;
extern bool enable_sort;
+extern bool enable_partialsort;
extern bool enable_hashagg;
extern bool enable_nestloop;
extern bool enable_material;
--
1.8.5.2
Hi!
On Tue, Jan 14, 2014 at 12:54 AM, Marti Raudsepp <marti@juffo.org> wrote:
First, thanks a lot for working on this feature. This PostgreSQL
shortcoming crops up all the time in web applications that implement
paging by multiple sorted columns.
Thanks!
I've been trying it out in a few situations. I implemented a new
enable_partialsort GUC to make it easier to turn on/off, this way it's a
lot easier to test. The attached patch applies on top of
partial-sort-5.patch
I thought about such an option. Generally not because of testing convenience,
but because of the planning overhead. The way you implemented it is quite
naive :) For instance, merge join relies on partial sort, which will be
replaced with a simple sort.
I will spend more time reviewing the patch, but some of this planner code
is over my head. If there's any way I can help to make sure this lands in
the next version, let me know.
----
The patch performs just as well as I would expect it to:
marti=# select ac.name, r.name from artist_credit ac join release r on (
ac.id=r.artist_credit) order by ac.name, r.name limit 1000;
Time: 9.830 ms
marti=# set enable_partialsort = off;
marti=# select ac.name, r.name from artist_credit ac join release r on (
ac.id=r.artist_credit) order by ac.name, r.name limit 1000;
Time: 1442.815 ms
A difference of almost 150x!
There's a missed opportunity in that the code doesn't consider pushing new
Sort steps into subplans. For example, if there's no index on
language(name) then this query cannot take advantage of partial sorts:
marti=# explain select l.name, r.name from language l join release r on (
l.id=r.language) order by l.name, r.name limit 1000;
Limit (cost=123203.20..123205.70 rows=1000 width=32)
-> Sort (cost=123203.20..126154.27 rows=1180430 width=32)
Sort Key: l.name, r.name
-> Hash Join (cost=229.47..58481.49 rows=1180430 width=32)
Hash Cond: (r.language = l.id)
-> Seq Scan on release r (cost=0.00..31040.10
rows=1232610 width=26)
-> Hash (cost=131.43..131.43 rows=7843 width=14)
-> Seq Scan on language l (cost=0.00..131.43
rows=7843 width=14)
But because there are so few languages, it would be a lot faster to
sort languages in advance and then do a partial sort:
Limit (rows=1000 width=31)
-> Partial sort (rows=1180881 width=31)
Sort Key: l.name, r.name
Presorted Key: l.name
-> Nested Loop (rows=1180881 width=31)
-> Sort (rows=7843 width=10)
Sort Key: name
-> Seq Scan on language (rows=7843 width=14)
-> Index Scan using release_language_idx on release r
(rows=11246 width=25)
Index Cond: (language = l.id)
Even an explicit sorted CTE cannot take advantage of partial sorts:
marti=# explain with sorted_lang as (select id, name from language order
by name)
marti-# select l.name, r.name from sorted_lang l join release r on (l.id=r.language)
order by l.name, r.name limit 1000;
Limit (cost=3324368.83..3324371.33 rows=1000 width=240)
CTE sorted_lang
-> Sort (cost=638.76..658.37 rows=7843 width=14)
Sort Key: language.name
-> Seq Scan on language (cost=0.00..131.43 rows=7843 width=14)
-> Sort (cost=3323710.46..3439436.82 rows=46290543 width=240)
Sort Key: l.name, r.name
-> Merge Join (cost=664.62..785649.92 rows=46290543 width=240)
Merge Cond: (r.language = l.id)
-> Index Scan using release_language_idx on release r
(cost=0.43..87546.06 rows=1232610 width=26)
-> Sort (cost=664.19..683.80 rows=7843 width=222)
Sort Key: l.id
-> CTE Scan on sorted_lang l (cost=0.00..156.86
rows=7843 width=222)
But even with these limitations, this will easily be the killer feature of
the next release, for me at least.
I see. But I don't think it can be achieved by small changes in the planner.
Moreover, I didn't check, but I think that if you remove the ordering by
r.name you will still not get the languages sorted in the inner node. So,
this problem is not directly related to partial sort.
------
With best regards,
Alexander Korotkov.
On Tue, Jan 14, 2014 at 5:49 PM, Alexander Korotkov <aekorotkov@gmail.com>
wrote:
I implemented a new
enable_partialsort GUC to make it easier to turn on/off
I thought about such an option. Generally not because of testing convenience,
but because of the planning overhead. The way you implemented it is quite
naive :) For instance, merge join relies on partial sort, which will be
replaced with a simple sort.
Oh, this actually highlights a performance regression with the partial sort
patch. I assumed the planner will discard the full sort because of higher
costs, but it looks like the new code always assumes that a Partial sort
will be cheaper than a Join Filter without considering costs. When doing a
join USING (unique_indexed_value, something), the new plan is significantly
worse.
Unpatched:
marti=# explain analyze select * from release a join release b using (id,
name);
Merge Join (cost=0.85..179810.75 rows=12 width=158) (actual
time=0.011..1279.596 rows=1232610 loops=1)
Merge Cond: (a.id = b.id)
Join Filter: ((a.name)::text = (b.name)::text)
-> Index Scan using release_id_idx on release a (cost=0.43..79120.04
rows=1232610 width=92) (actual time=0.005..211.928 rows=1232610 loops=1)
-> Index Scan using release_id_idx on release b (cost=0.43..79120.04
rows=1232610 width=92) (actual time=0.004..371.592 rows=1232610 loops=1)
Total runtime: 1309.049 ms
Patched:
Merge Join (cost=0.98..179810.87 rows=12 width=158) (actual
time=0.037..5034.158 rows=1232610 loops=1)
Merge Cond: ((a.id = b.id) AND ((a.name)::text = (b.name)::text))
-> Partial sort (cost=0.49..82201.56 rows=1232610 width=92) (actual
time=0.013..955.938 rows=1232610 loops=1)
Sort Key: a.id, a.name
Presorted Key: a.id
Sort Method: quicksort Memory: 25kB
-> Index Scan using release_id_idx on release a
(cost=0.43..79120.04 rows=1232610 width=92) (actual time=0.007..449.332
rows=1232610 loops=1)
-> Materialize (cost=0.49..85283.09 rows=1232610 width=92) (actual
time=0.019..1352.377 rows=1232610 loops=1)
-> Partial sort (cost=0.49..82201.56 rows=1232610 width=92)
(actual time=0.018..1223.251 rows=1232610 loops=1)
Sort Key: b.id, b.name
Presorted Key: b.id
Sort Method: quicksort Memory: 25kB
-> Index Scan using release_id_idx on release b
(cost=0.43..79120.04 rows=1232610 width=92) (actual time=0.004..597.258
rows=1232610 loops=1)
Total runtime: 5166.906 ms
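The regression above can be put in rough numbers. What follows is a back-of-the-envelope comparison-count model, not the formula used by cost_sort. Its symbols are assumptions made for this illustration: N is the number of input rows, G is the number of groups with equal presorted keys (assumed to be of equal size), and c_g is a fixed per-group overhead for starting and finishing a sort, expressed in comparison units.

```latex
\begin{align*}
C_{\text{full}}    &\approx N \log_2 N \\
C_{\text{partial}} &\approx G \cdot \frac{N}{G} \log_2 \frac{N}{G} + c_g\, G
                    = N \log_2 \frac{N}{G} + c_g\, G \\
G = N \ (\text{unique presorted key})
                   &\;\Rightarrow\; C_{\text{partial}} \approx c_g\, N
\end{align*}
```

Under this model, a unique leading key such as release.id puts every row in its own group, so the partial sort performs no useful ordering work and only pays the per-group overhead, whereas the unpatched plan checks the second column with a Join Filter and has no sort node at all.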
----
There's another "wishlist" kind of thing with top-N heapsort bounds; if I
do a query with LIMIT 1000 then every sort batch has Tuplesortstate.bound
set to 1000, but it could be reduced after each batch. If the first batch
is 900 rows then the 2nd batch only needs the top 100 rows at most.
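The bound arithmetic described here is simple enough to sketch. The code below is only an illustration of the idea, assuming batches arrive in order and each batch returns at most the rows still needed; remaining_bound and plan_batch_bounds are hypothetical names and do not exist in tuplesort.c or in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rows still needed after a batch has returned `emitted` rows, given that
 * `bound` rows were still needed before that batch.
 */
size_t
remaining_bound(size_t bound, size_t emitted)
{
    return emitted >= bound ? 0 : bound - emitted;
}

/*
 * Walk over the batch sizes and record the top-N bound each batch would be
 * sorted with. Returns the number of batches that have to be sorted at all;
 * later batches can be skipped because the LIMIT is already satisfied.
 */
size_t
plan_batch_bounds(const size_t *batch_sizes, size_t nbatches,
                  size_t limit, size_t *bounds_out)
{
    size_t bound = limit;
    size_t i;

    assert(batch_sizes != NULL && bounds_out != NULL);
    for (i = 0; i < nbatches && bound > 0; i++)
    {
        /* A batch cannot return more rows than are still needed. */
        size_t emitted = batch_sizes[i] < bound ? batch_sizes[i] : bound;

        bounds_out[i] = bound;
        bound = remaining_bound(bound, emitted);
    }
    return i;
}
```

With LIMIT 1000 and a first batch of 900 rows, the second batch is sorted with a bound of 100, which matches the example in the text.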
Also, I find the name "partial sort" a bit confusing; this feature is not
actually sorting *partially*, it's finishing the sort of partially-sorted
data. Perhaps "batched sort" would explain the feature better? Because it
does the sort in multiple batches instead of all at once. But maybe that's
just me.
Regards,
Marti
On Tue, Jan 14, 2014 at 11:16 PM, Marti Raudsepp <marti@juffo.org> wrote:
On Tue, Jan 14, 2014 at 5:49 PM, Alexander Korotkov <aekorotkov@gmail.com>
wrote:
I implemented a new
enable_partialsort GUC to make it easier to turn on/off
I thought about such an option. Generally not because of testing convenience,
but because of the planning overhead. The way you implemented it is quite
naive :) For instance, merge join relies on partial sort, which will be
replaced with a simple sort.
Oh, this actually highlights a performance regression with the partial
sort patch. I assumed the planner will discard the full sort because of
higher costs, but it looks like the new code always assumes that a Partial
sort will be cheaper than a Join Filter without considering costs. When
doing a join USING (unique_indexed_value, something), the new plan is
significantly worse.
Unpatched:
marti=# explain analyze select * from release a join release b using (id,
name);
Merge Join (cost=0.85..179810.75 rows=12 width=158) (actual
time=0.011..1279.596 rows=1232610 loops=1)
Merge Cond: (a.id = b.id)
Join Filter: ((a.name)::text = (b.name)::text)
-> Index Scan using release_id_idx on release a (cost=0.43..79120.04
rows=1232610 width=92) (actual time=0.005..211.928 rows=1232610 loops=1)
-> Index Scan using release_id_idx on release b (cost=0.43..79120.04
rows=1232610 width=92) (actual time=0.004..371.592 rows=1232610 loops=1)
Total runtime: 1309.049 ms
Patched:
Merge Join (cost=0.98..179810.87 rows=12 width=158) (actual
time=0.037..5034.158 rows=1232610 loops=1)
Merge Cond: ((a.id = b.id) AND ((a.name)::text = (b.name)::text))
-> Partial sort (cost=0.49..82201.56 rows=1232610 width=92) (actual
time=0.013..955.938 rows=1232610 loops=1)
Sort Key: a.id, a.name
Presorted Key: a.id
Sort Method: quicksort Memory: 25kB
-> Index Scan using release_id_idx on release a
(cost=0.43..79120.04 rows=1232610 width=92) (actual time=0.007..449.332
rows=1232610 loops=1)
-> Materialize (cost=0.49..85283.09 rows=1232610 width=92) (actual
time=0.019..1352.377 rows=1232610 loops=1)
-> Partial sort (cost=0.49..82201.56 rows=1232610 width=92)
(actual time=0.018..1223.251 rows=1232610 loops=1)
Sort Key: b.id, b.name
Presorted Key: b.id
Sort Method: quicksort Memory: 25kB
-> Index Scan using release_id_idx on release b
(cost=0.43..79120.04 rows=1232610 width=92) (actual time=0.004..597.258
rows=1232610 loops=1)
Total runtime: 5166.906 ms
----
Interesting. Could you share the dataset?
There's another "wishlist" kind of thing with top-N heapsort bounds; if I
do a query with LIMIT 1000 then every sort batch has Tuplesortstate.bound
set to 1000, but it could be reduced after each batch. If the first batch
is 900 rows then the 2nd batch only needs the top 100 rows at most.
Right. Just didn't implement it yet.
Also, I find the name "partial sort" a bit confusing; this feature is not
actually sorting *partially*, it's finishing the sort of partially-sorted
data. Perhaps "batched sort" would explain the feature better? Because it
does the sort in multiple batches instead of all at once. But maybe that's
just me.
I'm not sure. To me, "batched sort" sounds like we're going to sort in a
batch something that we sorted separately before. Probably I'm wrong,
because I'm far from a native English speaker :)
------
With best regards,
Alexander Korotkov.
On Tue, Jan 14, 2014 at 9:28 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Tue, Jan 14, 2014 at 11:16 PM, Marti Raudsepp <marti@juffo.org> wrote:
Oh, this actually highlights a performance regression with the partial
sort patch.
Interesting. Could you share the dataset?
It occurs with many datasets if work_mem is sufficiently low (10MB in my
case). Here's a quicker way to reproduce a similar issue:
create table foo as select i, i as j from generate_series(1,10000000) i;
create index on foo(i);
explain analyze select * from foo a join foo b using (i, j);
The real data is from the "release" table from MusicBrainz database dump:
https://musicbrainz.org/doc/MusicBrainz_Database/Download . It's nontrivial
to set up though, so if you still need the real data, I can upload a pgdump
for you.
Regards,
Marti
On 22/12/13 20:26, Alexander Korotkov wrote:
On Sat, Dec 14, 2013 at 6:30 PM, Jeremy Harris <jgh@wizmail.org> wrote:
On 14/12/13 12:54, Andres Freund wrote:
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I think we might need to switch to a different algorithm to really
benefit from mostly pre-sorted input.
Eg: /messages/by-id/5291467E.6070807@wizmail.org
Maybe Alexander and I should bash our heads together.
Partial sort patch is mostly optimizer/executor improvement rather than
improvement of sort algorithm itself.
I finally got as far as understanding Alexander's cleverness, and it
does make the performance advantage (on partially-sorted input) of the
merge-sort irrelevant.
There's a slight tradeoff possible between the code complexity of
the chunking code front-ending the sorter and just using the
enhanced sorter. The chunking does reduce the peak memory usage
quite nicely too.
The implementation of the chunker does O(n) compares using the
keys of the feed-stream index, to identify the chunk boundaries.
Would it be possible to get this information from the Index Scan?
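The chunking described above can be sketched in a few lines. This is only an illustration under assumptions, not the executor code from the patch: Row, cmp_v2 and chunked_sort are hypothetical names, the presorted key is a plain int, and the input is assumed to arrive already ordered by that key, as an index scan would deliver it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical row shape mirroring the thread's (v1, v2) example. */
typedef struct Row
{
    int     v1;     /* presorted key, as delivered by the index scan */
    double  v2;     /* remaining key, sorted within each chunk */
} Row;

static int
cmp_v2(const void *a, const void *b)
{
    double x = ((const Row *) a)->v2;
    double y = ((const Row *) b)->v2;

    return (x > y) - (x < y);
}

/*
 * Sort rows by (v1, v2), assuming they already arrive ordered by v1.
 * Returns the number of chunks that were found and sorted.
 */
size_t
chunked_sort(Row *rows, size_t n)
{
    size_t start = 0;   /* first row of the current chunk */
    size_t chunks = 0;
    size_t i;

    assert(rows != NULL || n == 0);
    for (i = 1; i <= n; i++)
    {
        /* One comparison on the presorted key per row: O(n) overall. */
        if (i == n || rows[i].v1 != rows[start].v1)
        {
            qsort(rows + start, i - start, sizeof(Row), cmp_v2);
            chunks++;
            start = i;
        }
    }
    return chunks;
}
```

Only one chunk is held by the sorter at a time, which is where the lower peak memory usage comes from; the boundary test in the loop is the O(n) comparison work that the question about the Index Scan refers to.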
--
Cheers,
Jeremy
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 13/01/14 18:01, Alexander Korotkov wrote:
Thanks. It's included in the attached version of the patch, as well as
estimation improvements, more comments and a regression test fix.
Would it be possible to totally separate the two sets of sort-keys,
only giving the non-index set to the tuplesort? At present tuplesort
will, when it has a group to sort, make wasted compares on the
indexed set of keys before starting on the non-indexed set.
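The wasted compares can be made visible with a small experiment. The sketch below is an assumption-laden illustration, not tuplesort code: Tuple, cmp_from and sort_group are hypothetical names, keys are C strings, and the call counter exists only to show the difference between comparing all keys and comparing only the non-indexed ones within a group.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NKEYS 2

/* Hypothetical tuple: key[0] is the indexed column, key[1] is not. */
typedef struct Tuple
{
    const char *key[NKEYS];
} Tuple;

/* Number of strcmp() calls so far; only here to make the saving visible. */
static unsigned long compare_calls = 0;

/* Compare two tuples on keys first_key .. NKEYS-1. */
static int
cmp_from(const Tuple *a, const Tuple *b, int first_key)
{
    int k;

    for (k = first_key; k < NKEYS; k++)
    {
        int c;

        compare_calls++;
        c = strcmp(a->key[k], b->key[k]);
        if (c != 0)
            return c;
    }
    return 0;
}

static int
cmp_all_keys(const void *a, const void *b)
{
    return cmp_from((const Tuple *) a, (const Tuple *) b, 0);
}

static int
cmp_unsorted_keys(const void *a, const void *b)
{
    return cmp_from((const Tuple *) a, (const Tuple *) b, 1);
}

/*
 * Sort one group whose key[0] values are already known to be equal.
 * Returns the number of strcmp() calls that the sort needed.
 */
unsigned long
sort_group(Tuple *group, size_t n, int skip_known_equal)
{
    compare_calls = 0;
    qsort(group, n, sizeof(Tuple),
          skip_known_equal ? cmp_unsorted_keys : cmp_all_keys);
    return compare_calls;
}
```

Both variants produce the same order inside a group, but the full comparator first walks the equal leading key on every call, which is expensive when that key is a long text value.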
--
Cheers,
Jeremy
Hi,
There's another small regression with this patch when used with expensive
comparison functions, such as long text fields.
If we go through all this trouble in cmpSortSkipCols to prove that the
first N sortkeys are equal, it would be nice if Tuplesort could skip their
comparisons entirely; that's another nice optimization this patch can
provide.
I've implemented that in the attached patch, which applies on top of your
partial-sort-5.patch
Should the "Sort Key" field in EXPLAIN output be changed as well? I'd say
no, I think that makes the partial sort steps harder to read.
Generate test data:
create table longtext as select (select repeat('a', 1000*100)) a,
generate_series(1,1000) i;
create index on longtext(a);
Unpatched (using your original partial-sort-5.patch):
=# explain analyze select * from longtext order by a, i limit 10;
Limit (cost=2.34..19.26 rows=10 width=1160) (actual time=13477.739..13477.756
rows=10 loops=1)
-> Partial sort (cost=2.34..1694.15 rows=1000 width=1160) (actual time=
13477.737..13477.742 rows=10 loops=1)
Sort Key: a, i
Presorted Key: a
Sort Method: top-N heapsort Memory: 45kB
-> Index Scan using longtext_a_idx on longtext (cost=0.65..1691.65
rows=1000 width=1160) (actual time=0.015..2.364 rows=1000 loops=1)
Total runtime: 13478.158 ms
(7 rows)
=# set enable_indexscan=off;
=# explain analyze select * from longtext order by a, i limit 10;
Limit (cost=198.61..198.63 rows=10 width=1160) (actual
time=6970.439..6970.458 rows=10 loops=1)
-> Sort (cost=198.61..201.11 rows=1000 width=1160) (actual
time=6970.438..6970.444 rows=10 loops=1)
Sort Key: a, i
Sort Method: top-N heapsort Memory: 45kB
-> Seq Scan on longtext (cost=0.00..177.00 rows=1000 width=1160)
(actual time=0.007..1.763 rows=1000 loops=1)
Total runtime: 6970.491 ms
Patched:
=# explain analyze select * from longtext order by a, i ;
Partial sort (cost=2.34..1694.15 rows=1000 width=1160) (actual
time=0.024..4.603 rows=1000 loops=1)
Sort Key: a, i
Presorted Key: a
Sort Method: quicksort Memory: 27kB
-> Index Scan using longtext_a_idx on longtext (cost=0.65..1691.65
rows=1000 width=1160) (actual time=0.013..2.094 rows=1000 loops=1)
Total runtime: 5.418 ms
Regards,
Marti
Attachments:
0001-Batched-sort-skip-comparisons-for-known-equal-column.patch (text/x-patch; charset=US-ASCII)
From fbc6c31528018bca64dc54f65e1cd292f8de482a Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <marti@juffo.org>
Date: Sat, 18 Jan 2014 19:16:15 +0200
Subject: [PATCH] Batched sort: skip comparisons for known-equal columns
---
src/backend/executor/nodeSort.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index cf1f79e..5abda1d 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -125,10 +125,10 @@ ExecSort(SortState *node)
{
tuplesortstate = tuplesort_begin_heap(tupDesc,
plannode->numCols - skipCols,
- &(plannode->sortColIdx)[skipCols],
- plannode->sortOperators,
- plannode->collations,
- plannode->nullsFirst,
+ &(plannode->sortColIdx[skipCols]),
+ &(plannode->sortOperators[skipCols]),
+ &(plannode->collations[skipCols]),
+ &(plannode->nullsFirst[skipCols]),
work_mem,
node->randomAccess);
if (node->bounded)
--
1.8.5.3
Funny, I just wrote a patch to do that some minutes ago (didn't see your
email yet).
/messages/by-id/CABRT9RCK=wmFUYZdqU_+MOFW5PDevLxJmZ5B=eTJJNUBvyARxw@mail.gmail.com
Regards,
Marti
On 31/12/13 01:41, Andreas Karlsson wrote:
On 12/29/2013 08:24 AM, David Rowley wrote:
If it was possible to devise some way to reuse any
previous tuplesortstate, perhaps just inventing a reset method which
clears out tuples, then we could see performance exceed the standard
seqscan -> sort. The code the way it is seems to look up the sort
functions from the syscache for each group and then allocate some sort
space, so quite a bit of time is also spent in palloc0() and pfree().
If it was not possible to do this, then maybe adding a cost to the number
of sort groups would be better, so that the optimization is skipped if
there are too many sort groups.
It should be possible. I have hacked a quick proof of concept for
reusing the tuplesort state. Can you try it and see if the performance
regression is fixed by this?
One thing which has to be fixed with my patch is that we probably want
to close the tuplesort once we have returned the last tuple from
ExecSort().
I have attached my patch and the incremental patch on Alexander's patch.
How does this work in combination with randomAccess ?
--
Thanks,
Jeremy
On Sat, Jan 18, 2014 at 7:22 PM, Marti Raudsepp <marti@juffo.org> wrote:
Total runtime: 5.418 ms
Oops, shouldn't have rushed this. Clearly the timings should have
tipped me off that it's broken. I didn't notice that cmpSortSkipCols
was re-using tuplesort's sortkeys.
Here's a patch that actually works; I added a new skipKeys attribute
to SortState. I had to extract the SortSupport-creation code from
tuplesort_begin_heap to a new function; but that's fine, because it
was already duplicated in ExecInitMergeAppend too.
I reverted the addition of tuplesort_get_sortkeys, which is not needed now.
Now the timings are:
Unpatched partial sort: 13478.158 ms
Full sort: 6802.063 ms
Patched partial sort: 6618.962 ms
Regards,
Marti
Attachments:
0001-Partial-sort-skip-comparisons-for-known-equal-column.patch (text/x-patch; charset=US-ASCII)
From 7d9f34c09e7836504725ff11be7e63a2fc438ae9 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <marti@juffo.org>
Date: Mon, 13 Jan 2014 20:38:45 +0200
Subject: [PATCH] Partial sort: skip comparisons for known-equal columns
---
src/backend/executor/nodeMergeAppend.c | 18 +++++-------------
src/backend/executor/nodeSort.c | 26 +++++++++++++++++---------
src/backend/utils/sort/sortsupport.c | 29 +++++++++++++++++++++++++++++
src/backend/utils/sort/tuplesort.c | 31 +++++--------------------------
src/include/nodes/execnodes.h | 1 +
src/include/utils/sortsupport.h | 3 +++
src/include/utils/tuplesort.h | 2 --
7 files changed, 60 insertions(+), 50 deletions(-)
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 74fa40d..db6ec23 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -126,19 +126,11 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* initialize sort-key information
*/
mergestate->ms_nkeys = node->numCols;
- mergestate->ms_sortkeys = palloc0(sizeof(SortSupportData) * node->numCols);
-
- for (i = 0; i < node->numCols; i++)
- {
- SortSupport sortKey = mergestate->ms_sortkeys + i;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = node->collations[i];
- sortKey->ssup_nulls_first = node->nullsFirst[i];
- sortKey->ssup_attno = node->sortColIdx[i];
-
- PrepareSortSupportFromOrderingOp(node->sortOperators[i], sortKey);
- }
+ mergestate->ms_sortkeys = MakeSortSupportKeys(mergestate->ms_nkeys,
+ node->sortColIdx,
+ node->sortOperators,
+ node->collations,
+ node->nullsFirst);
/*
* initialize to show we have not run the subplans yet
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 55cdb05..7645645 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -28,20 +28,19 @@ static bool
cmpSortSkipCols(SortState *node, TupleDesc tupDesc, HeapTuple a, TupleTableSlot *b)
{
int n = ((Sort *)node->ss.ps.plan)->skipCols, i;
- SortSupport sortKeys = tuplesort_get_sortkeys(node->tuplesortstate);
for (i = 0; i < n; i++)
{
Datum datumA, datumB;
bool isnullA, isnullB;
- AttrNumber attno = sortKeys[i].ssup_attno;
+ AttrNumber attno = node->skipKeys[i].ssup_attno;
datumA = heap_getattr(a, attno, tupDesc, &isnullA);
datumB = slot_getattr(b, attno, &isnullB);
if (ApplySortComparator(datumA, isnullA,
- datumB, isnullB,
- &sortKeys[i]))
+ datumB, isnullB,
+ &node->skipKeys[i]))
return false;
}
return true;
@@ -123,12 +122,21 @@ ExecSort(SortState *node)
tuplesort_reset((Tuplesortstate *) node->tuplesortstate);
else
{
+ /* Support structures for cmpSortSkipCols - already sorted columns */
+ if (skipCols)
+ node->skipKeys = MakeSortSupportKeys(skipCols,
+ plannode->sortColIdx,
+ plannode->sortOperators,
+ plannode->collations,
+ plannode->nullsFirst);
+
+ /* Only pass on remaining columns that are unsorted */
tuplesortstate = tuplesort_begin_heap(tupDesc,
- plannode->numCols,
- plannode->sortColIdx,
- plannode->sortOperators,
- plannode->collations,
- plannode->nullsFirst,
+ plannode->numCols - skipCols,
+ &(plannode->sortColIdx[skipCols]),
+ &(plannode->sortOperators[skipCols]),
+ &(plannode->collations[skipCols]),
+ &(plannode->nullsFirst[skipCols]),
work_mem,
node->randomAccess);
if (node->bounded)
diff --git a/src/backend/utils/sort/sortsupport.c b/src/backend/utils/sort/sortsupport.c
index 347f448..df82f5f 100644
--- a/src/backend/utils/sort/sortsupport.c
+++ b/src/backend/utils/sort/sortsupport.c
@@ -85,6 +85,35 @@ PrepareSortSupportComparisonShim(Oid cmpFunc, SortSupport ssup)
}
/*
+ * Build an array of SortSupportData structures from separated arrays.
+ */
+SortSupport
+MakeSortSupportKeys(int nkeys, AttrNumber *attNums,
+ Oid *sortOperators, Oid *sortCollations,
+ bool *nullsFirstFlags)
+{
+ SortSupport sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+ int i;
+
+ for (i = 0; i < nkeys; i++)
+ {
+ SortSupport sortKey = sortKeys + i;
+
+ AssertArg(attNums[i] != 0);
+ AssertArg(sortOperators[i] != 0);
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = sortCollations[i];
+ sortKey->ssup_nulls_first = nullsFirstFlags[i];
+ sortKey->ssup_attno = attNums[i];
+
+ PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
+ }
+
+ return sortKeys;
+}
+
+/*
* Fill in SortSupport given an ordering operator (btree "<" or ">" operator).
*
* Caller must previously have zeroed the SortSupportData structure and then
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 9fb5a9f..738f7a1 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -604,7 +604,6 @@ tuplesort_begin_heap(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, randomAccess);
MemoryContext oldcontext;
- int i;
oldcontext = MemoryContextSwitchTo(state->sortcontext);
@@ -632,24 +631,11 @@ tuplesort_begin_heap(TupleDesc tupDesc,
state->reversedirection = reversedirection_heap;
state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
-
- /* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
-
- for (i = 0; i < nkeys; i++)
- {
- SortSupport sortKey = state->sortKeys + i;
-
- AssertArg(attNums[i] != 0);
- AssertArg(sortOperators[i] != 0);
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = sortCollations[i];
- sortKey->ssup_nulls_first = nullsFirstFlags[i];
- sortKey->ssup_attno = attNums[i];
-
- PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
- }
+ state->sortKeys = MakeSortSupportKeys(nkeys,
+ attNums,
+ sortOperators,
+ sortCollations,
+ nullsFirstFlags);
if (nkeys == 1)
state->onlyKey = state->sortKeys;
@@ -3544,10 +3530,3 @@ free_sort_tuple(Tuplesortstate *state, SortTuple *stup)
FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
pfree(stup->tuple);
}
-
-SortSupport
-tuplesort_get_sortkeys(Tuplesortstate *state)
-{
- return state->sortKeys;
-}
-
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 9fa1823..13a4f0f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1671,6 +1671,7 @@ typedef struct SortState
bool finished;
int64 bound_Done; /* value of bound we did the sort with */
void *tuplesortstate; /* private state of tuplesort.c */
+ SortSupport skipKeys; /* columns already sorted in input */
HeapTuple prev;
} SortState;
diff --git a/src/include/utils/sortsupport.h b/src/include/utils/sortsupport.h
index 13d3fbe..cd48a45 100644
--- a/src/include/utils/sortsupport.h
+++ b/src/include/utils/sortsupport.h
@@ -150,6 +150,9 @@ ApplySortComparator(Datum datum1, bool isNull1,
#endif /*-- PG_USE_INLINE || SORTSUPPORT_INCLUDE_DEFINITIONS */
/* Other functions in utils/sort/sortsupport.c */
+extern SortSupport MakeSortSupportKeys(int nkeys, AttrNumber *attNums,
+ Oid *sortOperators, Oid *sortCollations,
+ bool *nullsFirstFlags);
extern void PrepareSortSupportComparisonShim(Oid cmpFunc, SortSupport ssup);
extern void PrepareSortSupportFromOrderingOp(Oid orderingOp, SortSupport ssup);
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 106c3fd..eb882d3 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -114,8 +114,6 @@ extern void tuplesort_get_stats(Tuplesortstate *state,
extern int tuplesort_merge_order(int64 allowedMem);
-extern SortSupport tuplesort_get_sortkeys(Tuplesortstate *state);
-
/*
* These routines may only be called if randomAccess was specified 'true'.
* Likewise, backwards scan in gettuple/getdatum is only allowed if
--
1.8.5.3
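The key trick in the nodeSort.c hunk above is that the presorted prefix and the remaining columns are both views into the same parallel arrays: the prefix starts at index 0, and the suffix is passed as `&array[skipCols]` with a count of `numCols - skipCols`. A minimal standalone sketch of that slicing follows. `DemoKeys`, `demo_slice_keys` and `demo_split_keys` are hypothetical names invented for illustration and are not part of PostgreSQL or of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view over a run of sort columns; not a PostgreSQL type. */
typedef struct DemoKeys
{
    int         nkeys;       /* number of columns in this view */
    const int  *colIdx;      /* attribute numbers, one per column */
    const int  *nullsFirst;  /* 0 or 1 per column */
} DemoKeys;

/* Return a view of columns [start, start + n) without copying anything. */
static DemoKeys
demo_slice_keys(const int *colIdx, const int *nullsFirst, int start, int n)
{
    DemoKeys    k;

    assert(start >= 0 && n >= 0);
    k.nkeys = n;
    k.colIdx = colIdx + start;          /* same as &colIdx[start] */
    k.nullsFirst = nullsFirst + start;
    return k;
}

/*
 * Split numCols sort columns into the first skipCols columns (assumed to be
 * already ordered in the input) and the remaining columns that still need
 * sorting.
 */
static void
demo_split_keys(const int *colIdx, const int *nullsFirst,
                int numCols, int skipCols,
                DemoKeys *skipKeys, DemoKeys *sortKeys)
{
    assert(skipCols >= 0 && skipCols <= numCols);
    *skipKeys = demo_slice_keys(colIdx, nullsFirst, 0, skipCols);
    *sortKeys = demo_slice_keys(colIdx, nullsFirst, skipCols,
                                numCols - skipCols);
}
```

With three sort columns and `skipCols = 1`, `skipKeys` covers the first column only and `sortKeys` covers the last two, which mirrors what the hunk hands to `MakeSortSupportKeys` and `tuplesort_begin_heap` respectively.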
On 01/18/2014 08:13 PM, Jeremy Harris wrote:
On 31/12/13 01:41, Andreas Karlsson wrote:
On 12/29/2013 08:24 AM, David Rowley wrote:
If it was possible to devise some way to reuse any
previous tuplesortstate perhaps just inventing a reset method which
clears out tuples, then we could see performance exceed the standard
seqscan -> sort. The code the way it is seems to lookup the sort
functions from the syscache for each group then allocate some sort
space, so quite a bit of time is also spent in palloc0() and pfree().
If it was not possible to do this then maybe adding a cost to the number
of sort groups would be better so that the optimization is skipped if
there are too many sort groups.
It should be possible. I have hacked a quick proof of concept for
reusing the tuplesort state. Can you try it and see if the performance
regression is fixed by this?
One thing that has to be fixed with my patch is that we probably want
to close the tuplesort once we have returned the last tuple from
ExecSort().
I have attached my patch and the incremental patch on Alexander's patch.
How does this work in combination with randomAccess ?
As far as I can tell randomAccess was broken by the partial sort patch
even before my change since it would not iterate over multiple
tuplesorts anyway.
Alexander: Is this true or am I missing something?
--
Andreas Karlsson
On Sun, Jan 19, 2014 at 5:57 AM, Andreas Karlsson <andreas@proxel.se> wrote:
As far as I can tell randomAccess was broken by the partial sort patch
even before my change since it would not iterate over multiple tuplesorts
anyway.
Alexander: Is this true or am I missing something?
Yes, I decided that the Sort node shouldn't provide randomAccess when
skipCols != 0. See the assert at the beginning of ExecInitSort. I decided that
it would be better to add an explicit Materialize node rather than store extra
tuples in the tuplesortstate each time.
I also adjusted ExecSupportsMarkRestore and ExecMaterializesOutput to make the
planner believe so. I found path->pathtype to be absolutely never T_Sort.
Correct me if I'm wrong.
Other changes in this version of the patch:
1) Applied Marti Raudsepp's patch so that tuplesort does not compare skipCols.
2) Adjusted the sort bound after processing buckets.
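To make the group-by-group idea and the role of the bound concrete, here is a minimal standalone sketch, assuming rows that arrive already ordered by v1 (as from the index scan on test_v1_idx) and that still need ordering by v2 within each group. It is not the patch's code: `DemoRow`, `demo_cmp_v2` and `demo_partial_sort` are hypothetical names, and the standard library `qsort` stands in for tuplesort. The sketch sorts in place, so it needs no per-group allocation at all, whereas the patch reuses one tuplesort state across groups.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical row type mirroring the (v1, v2) columns of the example. */
typedef struct DemoRow
{
    int         v1;
    double      v2;
} DemoRow;

/* Compare only the column that the input order does not already cover. */
static int
demo_cmp_v2(const void *a, const void *b)
{
    const DemoRow *ra = a;
    const DemoRow *rb = b;

    return (ra->v2 > rb->v2) - (ra->v2 < rb->v2);
}

/*
 * rows[] must already be ordered by v1.  Sort each run of equal v1 by v2,
 * one group at a time.  If bound > 0, stop as soon as at least `bound`
 * leading rows are in final order; later groups are left untouched.
 * Returns the number of leading rows that are in final (v1, v2) order.
 */
static size_t
demo_partial_sort(DemoRow *rows, size_t nrows, long bound)
{
    size_t      done = 0;

    assert(rows != NULL || nrows == 0);
    while (done < nrows)
    {
        size_t      end = done + 1;

        /* Find the end of the current group of equal v1 values. */
        while (end < nrows && rows[end].v1 == rows[done].v1)
            end++;

        qsort(rows + done, end - done, sizeof(DemoRow), demo_cmp_v2);
        done = end;

        /* Rows still needed after this group: bound - done. */
        if (bound > 0 && done >= (size_t) bound)
            break;
    }
    return done;
}
```

With a LIMIT-style bound, only the groups that overlap the first `bound` rows are sorted, which is why the plan in the first message reads just 56 rows from the index to return 10.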
------
With best regards,
Alexander Korotkov.
Attachments:
partial-sort-6.patch.gz (application/x-gzip)