JIT compiling expressions/deform + inlining prototype v2.0
Hi,
I previously had an early prototype of JITing [1] expression evaluation
and tuple deforming. I've since then worked a lot on this.
Here's an initial, not really pretty but functional, submission. This
supports all types of expressions, and tuples, and allows, albeit with
some drawbacks, inlining of builtin functions. Between the version at
[1] [...]
experiment more with llvm, but I've now translated everything back.
I had to re-implement some features due to limitations of the C API.
As a teaser:
tpch_5[9586][1]=# set jit_expressions=0;set jit_tuple_deforming=0;
tpch_5[9586][1]=# \i ~/tmp/tpch/pg-tpch/queries/q01.sql
┌──────────────┬──────────────┬───────────┬──────────────────┬──────────────────┬──────────────────┬──────────────────┬──────────────────┬────────────────────┬─────────────┐
│ l_returnflag │ l_linestatus │ sum_qty │ sum_base_price │ sum_disc_price │ sum_charge │ avg_qty │ avg_price │ avg_disc │ count_order │
├──────────────┼──────────────┼───────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────┼────────────────────┼─────────────┤
│ A │ F │ 188818373 │ 283107483036.109 │ 268952035589.054 │ 279714361804.23 │ 25.5025937044707 │ 38237.6725307617 │ 0.0499976863510723 │ 7403889 │
│ N │ F │ 4913382 │ 7364213967.94998 │ 6995782725.6633 │ 7275821143.98952 │ 25.5321530459003 │ 38267.7833908406 │ 0.0500308669240696 │ 192439 │
│ N │ O │ 375088356 │ 562442339707.852 │ 534321895537.884 │ 555701690243.972 │ 25.4978961033505 │ 38233.9150565265 │ 0.0499956453049625 │ 14710561 │
│ R │ F │ 188960009 │ 283310887148.206 │ 269147687267.211 │ 279912972474.866 │ 25.5132328961366 │ 38252.4148049933 │ 0.0499958481590264 │ 7406353 │
└──────────────┴──────────────┴───────────┴──────────────────┴──────────────────┴──────────────────┴──────────────────┴──────────────────┴────────────────────┴─────────────┘
(4 rows)
Time: 4367.486 ms (00:04.367)
tpch_5[9586][1]=# set jit_expressions=1;set jit_tuple_deforming=1;
tpch_5[9586][1]=# \i ~/tmp/tpch/pg-tpch/queries/q01.sql
<repeat>
(4 rows)
Time: 3158.575 ms (00:03.159)
tpch_5[9586][1]=# set jit_expressions=0;set jit_tuple_deforming=0;
tpch_5[9586][1]=# \i ~/tmp/tpch/pg-tpch/queries/q01.sql
<repeat>
(4 rows)
Time: 4383.562 ms (00:04.384)
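To put the timings above in perspective, here is a small sketch of the arithmetic. The helper names are made up for this illustration and are not part of the patch series.

```c
#include <assert.h>

/* Hypothetical helpers, only for illustrating the numbers above. */

/* Speedup factor: how many times faster the JITed run is. */
double
jit_speedup(double baseline_ms, double jit_ms)
{
    assert(baseline_ms > 0 && jit_ms > 0);
    return baseline_ms / jit_ms;
}

/* Fraction of the baseline runtime saved by enabling JIT (0.0 .. 1.0). */
double
jit_time_saved(double baseline_ms, double jit_ms)
{
    assert(baseline_ms > 0 && jit_ms > 0);
    return (baseline_ms - jit_ms) / baseline_ms;
}
```

Fed with the first two runs (4367.486 ms without JIT, 3158.575 ms with JIT), this works out to a speedup of roughly 1.38x, or about 28% less runtime.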
The potential wins of the JITing itself are considerably larger than the
already significant gains demonstrated above - this version here doesn't
exactly generate the nicest native code around. After these patches the
bottlenecks for TPC-H's Q01 are largely inside the float* functions and
the non-expressionified execGrouping.c code. The latter needs to be
expressified to gain benefits due to JIT - that shouldn't be very hard.
The code generation can be improved by moving more of the variable data
into llvm allocated stack data, which also has other benefits.
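For readers less familiar with why JITing expressions can pay off, here is a toy model of the general idea. It is not PostgreSQL code and not what the patches generate. All names are hypothetical, and the "specialized" function is hand-written to stand in for what a JIT might emit for one specific expression.

```c
#include <assert.h>

/* Toy model, NOT PostgreSQL code: every name here is hypothetical. */
#define TOY_STACK_DEPTH 16

typedef enum { OP_LOAD_COL, OP_CONST, OP_ADD, OP_MUL, OP_DONE } ToyOp;

typedef struct ToyStep
{
    ToyOp   op;
    double  constval;   /* used by OP_CONST */
    int     colno;      /* used by OP_LOAD_COL */
} ToyStep;

/* Interpreted evaluation: one dispatch per step, repeated for every row. */
double
toy_eval_interpreted(const ToyStep *steps, const double *row)
{
    double  stack[TOY_STACK_DEPTH];
    int     sp = 0;

    for (;; steps++)
    {
        switch (steps->op)
        {
            case OP_LOAD_COL:
                assert(sp < TOY_STACK_DEPTH);
                stack[sp++] = row[steps->colno];
                break;
            case OP_CONST:
                assert(sp < TOY_STACK_DEPTH);
                stack[sp++] = steps->constval;
                break;
            case OP_ADD:
                assert(sp >= 2);
                sp--;
                stack[sp - 1] += stack[sp];
                break;
            case OP_MUL:
                assert(sp >= 2);
                sp--;
                stack[sp - 1] *= stack[sp];
                break;
            case OP_DONE:
                assert(sp >= 1);
                return stack[sp - 1];
        }
    }
}

/*
 * Stand-in for JIT output for the expression col0 * (1 + col1):
 * no dispatch loop, no operand stack, just straight-line code.
 */
double
toy_eval_specialized(const double *row)
{
    return row[0] * (1.0 + row[1]);
}
```

Both functions compute the same value for the step sequence LOAD 0, CONST 1, LOAD 1, ADD, MUL, DONE. The second one simply skips the per-step dispatch, which is the kind of overhead a JIT can remove.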
The patch series currently consists of the following:
0001-Rely-on-executor-utils-to-build-targetlist-for-DML-R.patch
- boring prep work
0002-WIP-Allow-tupleslots-to-have-a-fixed-tupledesc-use-i.patch
- for JITed deforming we need to know whether a slot's tupledesc will
change
0003-WIP-Add-configure-infrastructure-to-enable-LLVM.patch
- boring
0004-WIP-Beginning-of-a-LLVM-JIT-infrastructure.patch
- infrastructure for llvm, including memory lifetime management, and
bulk emission of functions.
0005-Perform-slot-validity-checks-in-a-separate-pass-over.patch
- boring, prep work for expression jiting
0006-WIP-deduplicate-int-float-overflow-handling-code.patch
- boring
0007-Pass-through-PlanState-parent-to-expression-instanti.patch
- boring
0008-WIP-JIT-compile-expression.patch
- that's the biggest patch, actually adding JITing
- code needs to be better documented, tested, and deduplicated
0009-Simplify-aggregate-code-a-bit.patch
0010-More-efficient-AggState-pertrans-iteration.patch
0011-Avoid-dereferencing-tts_values-nulls-repeatedly.patch
0012-Centralize-slot-deforming-logic-a-bit.patch
- boring, mostly to make comparison between JITed and non-jitted a bit
fairer and to remove unnecessary other bottlenecks.
0013-WIP-Make-scan-desc-available-for-all-PlanStates.patch
- this isn't clean enough.
0014-WIP-JITed-tuple-deforming.patch
- do JITing of deforming, but only when called from within an expression,
since there we know which columns we want to be deformed etc.
- Not clear what'd be a good way to also JIT other deforming without
additional infrastructure - doing a separate function emission for
every slot_deform_tuple() is unattractive performancewise and
memory-lifetime wise, I did have that at first.
0015-WIP-Expression-based-agg-transition.patch
- allows JITing of aggregate transition invocation, but also speeds up
aggregates without JIT.
0016-Hacky-Preliminary-inlining-implementation.patch
- allows inlining of functions by using bitcode. That bitcode can be
loaded from a list of directories - as long as compatibly configured
the bitcode doesn't have to be generated by the same compiler as the
postgres binary. i.e. gcc postgres + clang bitcode works.
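As a rough illustration of the idea behind 0002 (slots whose tupledesc is known not to change), here is a standalone sketch. The structs and function names are simplified stand-ins rather than the real PostgreSQL definitions, and toy_can_jit_deform() is only my assumption of what such a check could look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins, NOT the real PostgreSQL structs. */
typedef struct ToyTupleDesc
{
    int         natts;
} ToyTupleDesc;

typedef struct ToySlot
{
    bool        fixed_desc;         /* models tts_fixedTupleDescriptor */
    const ToyTupleDesc *desc;       /* models tts_tupleDescriptor */
} ToySlot;

/* Models ExecSetSlotDescriptor(): only legal on non-fixed slots. */
void
toy_set_slot_descriptor(ToySlot *slot, const ToyTupleDesc *desc)
{
    assert(!slot->fixed_desc);
    slot->desc = desc;
}

/* Models MakeTupleTableSlot(desc): a non-NULL desc pins the slot. */
ToySlot *
toy_make_slot(const ToyTupleDesc *desc)
{
    ToySlot    *slot = malloc(sizeof(ToySlot));

    if (slot == NULL)
        abort();
    slot->fixed_desc = false;
    slot->desc = NULL;
    if (desc != NULL)
    {
        toy_set_slot_descriptor(slot, desc);
        slot->fixed_desc = true;
    }
    return slot;
}

/* Hypothetical check: only specialize deforming if the layout is pinned. */
bool
toy_can_jit_deform(const ToySlot *slot)
{
    return slot->fixed_desc && slot->desc != NULL;
}
```

A slot created with a descriptor is pinned to it, so code generated for that layout stays valid. A slot created with NULL can still have its descriptor set later, which is why it cannot safely be specialized.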
I've whacked this around quite heavily today, so this likely has some new
bugs, sorry for that :(
I plan to spend some considerable time over the next weeks to clean this
up and address some of the areas where the performance isn't yet as good
as desirable.
Greetings,
Andres Freund
[1]: http://archives.postgresql.org/message-id/20161206034955.bh33paeralxbtluv%40alap3.anarazel.de
Attachments:
0001-Rely-on-executor-utils-to-build-targetlist-for-DML-R.patch (text/x-diff; charset=us-ascii)
From 9dd646b49ce2385fdf950f459879ded116ef4bb0 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 16 Aug 2017 01:03:51 -0700
Subject: [PATCH 01/16] Rely on executor utils to build targetlist for DML
RETURNING.
---
src/backend/executor/nodeModifyTable.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index e12721a9b6..57946e1591 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -1793,7 +1793,6 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
int nplans = list_length(node->plans);
ResultRelInfo *saved_resultRelInfo;
ResultRelInfo *resultRelInfo;
- TupleDesc tupDesc;
Plan *subplan;
ListCell *l;
int i;
@@ -2027,12 +2026,11 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Initialize result tuple slot and assign its rowtype using the first
* RETURNING list. We assume the rest will look the same.
*/
- tupDesc = ExecTypeFromTL((List *) linitial(node->returningLists),
- false);
+ mtstate->ps.plan->targetlist = (List *) linitial(node->returningLists);
/* Set up a slot for the output of the RETURNING projection(s) */
ExecInitResultTupleSlot(estate, &mtstate->ps);
- ExecAssignResultType(&mtstate->ps, tupDesc);
+ ExecAssignResultTypeFromTL(&mtstate->ps);
slot = mtstate->ps.ps_ResultTupleSlot;
/* Need an econtext too */
@@ -2084,9 +2082,9 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* We still must construct a dummy result tuple type, because InitPlan
* expects one (maybe should change that?).
*/
- tupDesc = ExecTypeFromTL(NIL, false);
+ mtstate->ps.plan->targetlist = NIL;
ExecInitResultTupleSlot(estate, &mtstate->ps);
- ExecAssignResultType(&mtstate->ps, tupDesc);
+ ExecAssignResultTypeFromTL(&mtstate->ps);
mtstate->ps.ps_ExprContext = NULL;
}
--
2.14.1.2.g4274c698f4.dirty
0002-WIP-Allow-tupleslots-to-have-a-fixed-tupledesc-use-i.patch (text/x-diff; charset=us-ascii)
From 9000faa6ccb3f5bbbd8944590c82826fa13404e9 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Fri, 25 Aug 2017 14:19:23 -0700
Subject: [PATCH 02/16] WIP: Allow tupleslots to have a fixed tupledesc, use in
executor nodes.
The reason for doing so is that it will allow expression evaluation to
optimize based on the underlying tupledesc. In particular it allows
JITing tuple deforming together with the expression itself.
For that expression initialization needs to be moved after the
relevant slots are initialized - mostly unproblematic, except in the
case of nodeWorktablescan.c.
---
src/backend/executor/execMain.c | 2 +-
src/backend/executor/execTuples.c | 50 ++++++++++++-----
src/backend/executor/execUtils.c | 45 +--------------
src/backend/executor/nodeAgg.c | 42 +++++++-------
src/backend/executor/nodeAppend.c | 17 +++---
src/backend/executor/nodeBitmapAnd.c | 14 ++---
src/backend/executor/nodeBitmapHeapscan.c | 43 +++++++-------
src/backend/executor/nodeBitmapIndexscan.c | 14 ++---
src/backend/executor/nodeBitmapOr.c | 14 ++---
src/backend/executor/nodeCtescan.c | 20 +++----
src/backend/executor/nodeCustom.c | 16 +++---
src/backend/executor/nodeForeignscan.c | 24 ++++----
src/backend/executor/nodeFunctionscan.c | 26 ++++-----
src/backend/executor/nodeGather.c | 3 +-
src/backend/executor/nodeGatherMerge.c | 3 +-
src/backend/executor/nodeGroup.c | 18 +++---
src/backend/executor/nodeHash.c | 13 ++---
src/backend/executor/nodeHashjoin.c | 25 ++++-----
src/backend/executor/nodeIndexonlyscan.c | 28 +++++-----
src/backend/executor/nodeIndexscan.c | 49 ++++++++--------
src/backend/executor/nodeLimit.c | 3 +-
src/backend/executor/nodeLockRows.c | 15 +++--
src/backend/executor/nodeMaterial.c | 20 +++----
src/backend/executor/nodeMergeAppend.c | 3 +-
src/backend/executor/nodeMergejoin.c | 25 ++++-----
src/backend/executor/nodeModifyTable.c | 6 +-
src/backend/executor/nodeNamedtuplestorescan.c | 18 ++----
src/backend/executor/nodeNestloop.c | 21 ++++---
src/backend/executor/nodeProjectSet.c | 3 +-
src/backend/executor/nodeRecursiveunion.c | 40 ++++++-------
src/backend/executor/nodeResult.c | 27 +++++----
src/backend/executor/nodeSamplescan.c | 77 ++++++++++----------------
src/backend/executor/nodeSeqscan.c | 56 +++++++------------
src/backend/executor/nodeSetOp.c | 3 +-
src/backend/executor/nodeSort.c | 20 +++----
src/backend/executor/nodeSubqueryscan.c | 18 +++---
src/backend/executor/nodeTableFuncscan.c | 25 ++++-----
src/backend/executor/nodeTidscan.c | 49 ++++++++--------
src/backend/executor/nodeUnique.c | 3 +-
src/backend/executor/nodeValuesscan.c | 18 +++---
src/backend/executor/nodeWindowAgg.c | 6 +-
src/backend/executor/nodeWorktablescan.c | 13 ++---
src/include/executor/executor.h | 8 +--
src/include/executor/tuptable.h | 5 +-
44 files changed, 422 insertions(+), 526 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 2946a0edee..89ef0fc8ee 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -3290,7 +3290,7 @@ ExecSetupPartitionTupleRouting(Relation rel,
* (such as ModifyTableState) and released when the node finishes
* processing.
*/
- *partition_tuple_slot = MakeTupleTableSlot();
+ *partition_tuple_slot = MakeTupleTableSlot(NULL);
leaf_part_rri = *partitions;
i = 0;
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 31f814c0f0..8280b89f7f 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -58,7 +58,7 @@
* At ExecutorStart()
* ----------------
* - ExecInitSeqScan() calls ExecInitScanTupleSlot() and
- * ExecInitResultTupleSlot() to construct TupleTableSlots
+ * ExecInitResultTupleSlotTL() to construct TupleTableSlots
* for the tuples returned by the access methods and the
* tuples resulting from performing target list projections.
*
@@ -108,7 +108,7 @@ static TupleDesc ExecTypeFromTLInternal(List *targetList,
* --------------------------------
*/
TupleTableSlot *
-MakeTupleTableSlot(void)
+MakeTupleTableSlot(TupleDesc tupleDesc)
{
TupleTableSlot *slot = makeNode(TupleTableSlot);
@@ -116,6 +116,7 @@ MakeTupleTableSlot(void)
slot->tts_shouldFree = false;
slot->tts_shouldFreeMin = false;
slot->tts_tuple = NULL;
+ slot->tts_fixedTupleDescriptor = false;
slot->tts_tupleDescriptor = NULL;
slot->tts_mcxt = CurrentMemoryContext;
slot->tts_buffer = InvalidBuffer;
@@ -124,6 +125,13 @@ MakeTupleTableSlot(void)
slot->tts_isnull = NULL;
slot->tts_mintuple = NULL;
+ /* FIXME: instead allocate everything in one go */
+ if (tupleDesc != NULL)
+ {
+ ExecSetSlotDescriptor(slot, tupleDesc);
+ slot->tts_fixedTupleDescriptor = true;
+ }
+
return slot;
}
@@ -134,9 +142,9 @@ MakeTupleTableSlot(void)
* --------------------------------
*/
TupleTableSlot *
-ExecAllocTableSlot(List **tupleTable)
+ExecAllocTableSlot(List **tupleTable, TupleDesc desc)
{
- TupleTableSlot *slot = MakeTupleTableSlot();
+ TupleTableSlot *slot = MakeTupleTableSlot(desc);
*tupleTable = lappend(*tupleTable, slot);
@@ -198,9 +206,7 @@ ExecResetTupleTable(List *tupleTable, /* tuple table */
TupleTableSlot *
MakeSingleTupleTableSlot(TupleDesc tupdesc)
{
- TupleTableSlot *slot = MakeTupleTableSlot();
-
- ExecSetSlotDescriptor(slot, tupdesc);
+ TupleTableSlot *slot = MakeTupleTableSlot(tupdesc);
return slot;
}
@@ -247,6 +253,8 @@ void
ExecSetSlotDescriptor(TupleTableSlot *slot, /* slot to change */
TupleDesc tupdesc) /* new tuple descriptor */
{
+ Assert(!slot->tts_fixedTupleDescriptor);
+
/* For safety, make sure slot is empty before changing it */
ExecClearTuple(slot);
@@ -825,13 +833,28 @@ ExecCopySlot(TupleTableSlot *dstslot, TupleTableSlot *srcslot)
*/
/* ----------------
- * ExecInitResultTupleSlot
+ * ExecInitResultTupleSlotTL
* ----------------
*/
void
-ExecInitResultTupleSlot(EState *estate, PlanState *planstate)
+ExecInitResultTupleSlotTL(EState *estate, PlanState *planstate)
{
- planstate->ps_ResultTupleSlot = ExecAllocTableSlot(&estate->es_tupleTable);
+ bool hasoid;
+ TupleDesc tupDesc;
+
+ if (ExecContextForcesOids(planstate, &hasoid))
+ {
+ /* context forces OID choice; hasoid is now set correctly */
+ }
+ else
+ {
+ /* given free choice, don't leave space for OIDs in result tuples */
+ hasoid = false;
+ }
+
+ tupDesc = ExecTypeFromTL(planstate->plan->targetlist, hasoid);
+
+ planstate->ps_ResultTupleSlot = ExecAllocTableSlot(&estate->es_tupleTable, tupDesc);
}
/* ----------------
@@ -839,9 +862,10 @@ ExecInitResultTupleSlot(EState *estate, PlanState *planstate)
* ----------------
*/
void
-ExecInitScanTupleSlot(EState *estate, ScanState *scanstate)
+ExecInitScanTupleSlot(EState *estate, ScanState *scanstate, TupleDesc tupledesc)
{
- scanstate->ss_ScanTupleSlot = ExecAllocTableSlot(&estate->es_tupleTable);
+ scanstate->ss_ScanTupleSlot = ExecAllocTableSlot(&estate->es_tupleTable,
+ tupledesc);
}
/* ----------------
@@ -851,7 +875,7 @@ ExecInitScanTupleSlot(EState *estate, ScanState *scanstate)
TupleTableSlot *
ExecInitExtraTupleSlot(EState *estate)
{
- return ExecAllocTableSlot(&estate->es_tupleTable);
+ return ExecAllocTableSlot(&estate->es_tupleTable, NULL);
}
/* ----------------
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 9528393976..5928c38f90 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -425,47 +425,6 @@ ExecAssignExprContext(EState *estate, PlanState *planstate)
planstate->ps_ExprContext = CreateExprContext(estate);
}
-/* ----------------
- * ExecAssignResultType
- * ----------------
- */
-void
-ExecAssignResultType(PlanState *planstate, TupleDesc tupDesc)
-{
- TupleTableSlot *slot = planstate->ps_ResultTupleSlot;
-
- ExecSetSlotDescriptor(slot, tupDesc);
-}
-
-/* ----------------
- * ExecAssignResultTypeFromTL
- * ----------------
- */
-void
-ExecAssignResultTypeFromTL(PlanState *planstate)
-{
- bool hasoid;
- TupleDesc tupDesc;
-
- if (ExecContextForcesOids(planstate, &hasoid))
- {
- /* context forces OID choice; hasoid is now set correctly */
- }
- else
- {
- /* given free choice, don't leave space for OIDs in result tuples */
- hasoid = false;
- }
-
- /*
- * ExecTypeFromTL needs the parse-time representation of the tlist, not a
- * list of ExprStates. This is good because some plan nodes don't bother
- * to set up planstate->targetlist ...
- */
- tupDesc = ExecTypeFromTL(planstate->plan->targetlist, hasoid);
- ExecAssignResultType(planstate, tupDesc);
-}
-
/* ----------------
* ExecGetResultType
* ----------------
@@ -554,7 +513,7 @@ ExecAssignScanType(ScanState *scanstate, TupleDesc tupDesc)
* ----------------
*/
void
-ExecAssignScanTypeFromOuterPlan(ScanState *scanstate)
+ExecCreateScanSlotForOuterPlan(EState *estate, ScanState *scanstate)
{
PlanState *outerPlan;
TupleDesc tupDesc;
@@ -562,7 +521,7 @@ ExecAssignScanTypeFromOuterPlan(ScanState *scanstate)
outerPlan = outerPlanState(scanstate);
tupDesc = ExecGetResultType(outerPlan);
- ExecAssignScanType(scanstate, tupDesc);
+ ExecInitScanTupleSlot(estate, scanstate, tupDesc);
}
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 0ae5873868..1783f38f14 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -2795,10 +2795,28 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
*
* For hashtables, we create some additional slots below.
*/
- ExecInitScanTupleSlot(estate, &aggstate->ss);
- ExecInitResultTupleSlot(estate, &aggstate->ss.ps);
+ ExecInitResultTupleSlotTL(estate, &aggstate->ss.ps);
aggstate->sort_slot = ExecInitExtraTupleSlot(estate);
+ /*
+ * Initialize child nodes.
+ *
+ * If we are doing a hashed aggregation then the child plan does not need
+ * to handle REWIND efficiently; see ExecReScanAgg.
+ */
+ if (node->aggstrategy == AGG_HASHED)
+ eflags &= ~EXEC_FLAG_REWIND;
+ outerPlan = outerPlan(node);
+ outerPlanState(aggstate) = ExecInitNode(outerPlan, estate, eflags);
+
+ /*
+ * initialize source tuple type.
+ */
+ ExecCreateScanSlotForOuterPlan(estate, &aggstate->ss);
+ if (node->chain)
+ ExecSetSlotDescriptor(aggstate->sort_slot,
+ aggstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor);
+
/*
* initialize child expressions
*
@@ -2814,29 +2832,9 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
aggstate->ss.ps.qual =
ExecInitQual(node->plan.qual, (PlanState *) aggstate);
- /*
- * Initialize child nodes.
- *
- * If we are doing a hashed aggregation then the child plan does not need
- * to handle REWIND efficiently; see ExecReScanAgg.
- */
- if (node->aggstrategy == AGG_HASHED)
- eflags &= ~EXEC_FLAG_REWIND;
- outerPlan = outerPlan(node);
- outerPlanState(aggstate) = ExecInitNode(outerPlan, estate, eflags);
-
- /*
- * initialize source tuple type.
- */
- ExecAssignScanTypeFromOuterPlan(&aggstate->ss);
- if (node->chain)
- ExecSetSlotDescriptor(aggstate->sort_slot,
- aggstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor);
-
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&aggstate->ss.ps);
ExecAssignProjectionInfo(&aggstate->ss.ps, NULL);
/*
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index bed9bb8713..e67d0c36d3 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -152,18 +152,11 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->appendplans = appendplanstates;
appendstate->as_nplans = nplans;
- /*
- * Miscellaneous initialization
- *
- * Append plans don't have expression contexts because they never call
- * ExecQual or ExecProject.
- */
-
/*
* append nodes still have Result slots, which hold pointers to tuples, so
* we have to initialize them.
*/
- ExecInitResultTupleSlot(estate, &appendstate->ps);
+ ExecInitResultTupleSlotTL(estate, &appendstate->ps);
/*
* call ExecInitNode on each of the plans to be executed and save the
@@ -181,9 +174,15 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
/*
* initialize output tuple type
*/
- ExecAssignResultTypeFromTL(&appendstate->ps);
appendstate->ps.ps_ProjInfo = NULL;
+ /*
+ * Miscellaneous initialization
+ *
+ * Append plans don't have expression contexts because they never call
+ * ExecQual or ExecProject.
+ */
+
/*
* initialize to scan first subplan
*/
diff --git a/src/backend/executor/nodeBitmapAnd.c b/src/backend/executor/nodeBitmapAnd.c
index 1c5c312c95..b2b30842c6 100644
--- a/src/backend/executor/nodeBitmapAnd.c
+++ b/src/backend/executor/nodeBitmapAnd.c
@@ -80,13 +80,6 @@ ExecInitBitmapAnd(BitmapAnd *node, EState *estate, int eflags)
bitmapandstate->bitmapplans = bitmapplanstates;
bitmapandstate->nplans = nplans;
- /*
- * Miscellaneous initialization
- *
- * BitmapAnd plans don't have expression contexts because they never call
- * ExecQual or ExecProject. They don't need any tuple slots either.
- */
-
/*
* call ExecInitNode on each of the plans to be executed and save the
* results into the array "bitmapplanstates".
@@ -99,6 +92,13 @@ ExecInitBitmapAnd(BitmapAnd *node, EState *estate, int eflags)
i++;
}
+ /*
+ * Miscellaneous initialization
+ *
+ * BitmapAnd plans don't have expression contexts because they never call
+ * ExecQual or ExecProject. They don't need any tuple slots either.
+ */
+
return bitmapandstate;
}
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index f7e55e0b45..021fe0e272 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -824,6 +824,27 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
+ /*
+ * open the base relation and acquire appropriate lock on it.
+ */
+ currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+
+ /*
+ * get the scan type from the relation descriptor.
+ */
+ ExecInitScanTupleSlot(estate, &scanstate->ss,
+ RelationGetDescr(currentRelation));
+
+ /*
+ * Initialize result tuple type and projection info.
+ */
+ ExecAssignScanProjectionInfo(&scanstate->ss);
+
/*
* initialize child expressions
*/
@@ -832,17 +853,6 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
scanstate->bitmapqualorig =
ExecInitQual(node->bitmapqualorig, (PlanState *) scanstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * open the base relation and acquire appropriate lock on it.
- */
- currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
-
/*
* Determine the maximum for prefetch_target. If the tablespace has a
* specific IO concurrency set, use that to compute the corresponding
@@ -870,17 +880,6 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
0,
NULL);
- /*
- * get the scan type from the relation descriptor.
- */
- ExecAssignScanType(&scanstate->ss, RelationGetDescr(currentRelation));
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
- ExecAssignScanProjectionInfo(&scanstate->ss);
-
/*
* initialize child nodes
*
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index 6feb70f4ae..1178dc82b2 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -226,13 +226,6 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
/* normally we don't make the result bitmap till runtime */
indexstate->biss_result = NULL;
- /*
- * Miscellaneous initialization
- *
- * We do not need a standard exprcontext for this node, though we may
- * decide below to create a runtime-key exprcontext
- */
-
/*
* initialize child expressions
*
@@ -248,6 +241,13 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
* the heap relation throughout the execution of the plan tree.
*/
+ /*
+ * Miscellaneous initialization
+ *
+ * We do not need a standard exprcontext for this node, though we may
+ * decide below to create a runtime-key exprcontext
+ */
+
indexstate->ss.ss_currentRelation = NULL;
indexstate->ss.ss_currentScanDesc = NULL;
diff --git a/src/backend/executor/nodeBitmapOr.c b/src/backend/executor/nodeBitmapOr.c
index 66a7a89a8b..73c08e4652 100644
--- a/src/backend/executor/nodeBitmapOr.c
+++ b/src/backend/executor/nodeBitmapOr.c
@@ -81,13 +81,6 @@ ExecInitBitmapOr(BitmapOr *node, EState *estate, int eflags)
bitmaporstate->bitmapplans = bitmapplanstates;
bitmaporstate->nplans = nplans;
- /*
- * Miscellaneous initialization
- *
- * BitmapOr plans don't have expression contexts because they never call
- * ExecQual or ExecProject. They don't need any tuple slots either.
- */
-
/*
* call ExecInitNode on each of the plans to be executed and save the
* results into the array "bitmapplanstates".
@@ -100,6 +93,13 @@ ExecInitBitmapOr(BitmapOr *node, EState *estate, int eflags)
i++;
}
+ /*
+ * Miscellaneous initialization
+ *
+ * BitmapOr plans don't have expression contexts because they never call
+ * ExecQual or ExecProject. They don't need any tuple slots either.
+ */
+
return bitmaporstate;
}
diff --git a/src/backend/executor/nodeCtescan.c b/src/backend/executor/nodeCtescan.c
index 79676ca978..934885dce5 100644
--- a/src/backend/executor/nodeCtescan.c
+++ b/src/backend/executor/nodeCtescan.c
@@ -242,31 +242,29 @@ ExecInitCteScan(CteScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
- /*
- * initialize child expressions
- */
- scanstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
-
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
/*
* The scan tuple type (ie, the rowtype we expect to find in the work
* table) is the same as the result rowtype of the CTE query.
*/
- ExecAssignScanType(&scanstate->ss,
- ExecGetResultType(scanstate->cteplanstate));
+ ExecInitScanTupleSlot(estate, &scanstate->ss,
+ ExecGetResultType(scanstate->cteplanstate));
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
ExecAssignScanProjectionInfo(&scanstate->ss);
+ /*
+ * initialize child expressions
+ */
+ scanstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
+
return scanstate;
}
diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index 07dcabef55..3d9b69e550 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -54,13 +54,8 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
/* create expression context for node */
ExecAssignExprContext(estate, &css->ss.ps);
- /* initialize child expressions */
- css->ss.ps.qual =
- ExecInitQual(cscan->scan.plan.qual, (PlanState *) css);
-
/* tuple table initialization */
- ExecInitScanTupleSlot(estate, &css->ss);
- ExecInitResultTupleSlot(estate, &css->ss.ps);
+ ExecInitResultTupleSlotTL(estate, &css->ss.ps);
/*
* open the base relation, if any, and acquire an appropriate lock on it
@@ -81,13 +76,13 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
TupleDesc scan_tupdesc;
scan_tupdesc = ExecTypeFromTL(cscan->custom_scan_tlist, false);
- ExecAssignScanType(&css->ss, scan_tupdesc);
+ ExecInitScanTupleSlot(estate, &css->ss, scan_tupdesc);
/* Node's targetlist will contain Vars with varno = INDEX_VAR */
tlistvarno = INDEX_VAR;
}
else
{
- ExecAssignScanType(&css->ss, RelationGetDescr(scan_rel));
+ ExecInitScanTupleSlot(estate, &css->ss, RelationGetDescr(scan_rel));
/* Node's targetlist will contain Vars with varno = scanrelid */
tlistvarno = scanrelid;
}
@@ -95,9 +90,12 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&css->ss.ps);
ExecAssignScanProjectionInfoWithVarno(&css->ss, tlistvarno);
+ /* initialize child expressions */
+ css->ss.ps.qual =
+ ExecInitQual(cscan->scan.plan.qual, (PlanState *) css);
+
/*
* The callback of custom-scan provider applies the final initialization
* of the custom-scan-state node according to its logic.
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index 20892d6d5f..fb67a53d1e 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -155,19 +155,10 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
- /*
- * initialize child expressions
- */
- scanstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
- scanstate->fdw_recheck_quals =
- ExecInitQual(node->fdw_recheck_quals, (PlanState *) scanstate);
-
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
/*
* open the base relation, if any, and acquire an appropriate lock on it;
@@ -194,13 +185,13 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
TupleDesc scan_tupdesc;
scan_tupdesc = ExecTypeFromTL(node->fdw_scan_tlist, false);
- ExecAssignScanType(&scanstate->ss, scan_tupdesc);
+ ExecInitScanTupleSlot(estate, &scanstate->ss, scan_tupdesc);
/* Node's targetlist will contain Vars with varno = INDEX_VAR */
tlistvarno = INDEX_VAR;
}
else
{
- ExecAssignScanType(&scanstate->ss, RelationGetDescr(currentRelation));
+ ExecInitScanTupleSlot(estate, &scanstate->ss, RelationGetDescr(currentRelation));
/* Node's targetlist will contain Vars with varno = scanrelid */
tlistvarno = scanrelid;
}
@@ -208,9 +199,16 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
ExecAssignScanProjectionInfoWithVarno(&scanstate->ss, tlistvarno);
+ /*
+ * initialize child expressions
+ */
+ scanstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
+ scanstate->fdw_recheck_quals =
+ ExecInitQual(node->fdw_recheck_quals, (PlanState *) scanstate);
+
/*
* Initialize FDW-related state.
*/
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 9f87a7e5cd..8d834820c1 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -334,18 +334,6 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * initialize child expressions
- */
- scanstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
-
scanstate->funcstates = palloc(nfuncs * sizeof(FunctionScanPerFuncState));
natts = 0;
@@ -491,14 +479,24 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
Assert(attno == natts);
}
- ExecAssignScanType(&scanstate->ss, scan_tupdesc);
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecInitScanTupleSlot(estate, &scanstate->ss, scan_tupdesc);
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
ExecAssignScanProjectionInfo(&scanstate->ss);
+ /*
+ * initialize child expressions
+ */
+ scanstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
+
+
/*
* Create a memory context that ExecMakeTableFunctionResult can use to
* evaluate function arguments in. We can't use the per-tuple context for
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index d93fbacdf9..7656aaaee2 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -93,7 +93,7 @@ ExecInitGather(Gather *node, EState *estate, int eflags)
* tuple table initialization
*/
gatherstate->funnel_slot = ExecInitExtraTupleSlot(estate);
- ExecInitResultTupleSlot(estate, &gatherstate->ps);
+ ExecInitResultTupleSlotTL(estate, &gatherstate->ps);
/*
* now initialize outer plan
@@ -104,7 +104,6 @@ ExecInitGather(Gather *node, EState *estate, int eflags)
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&gatherstate->ps);
ExecAssignProjectionInfo(&gatherstate->ps, NULL);
/*
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index b8bb4f8eb0..b940fb3cb5 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -106,7 +106,7 @@ ExecInitGatherMerge(GatherMerge *node, EState *estate, int eflags)
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &gm_state->ps);
+ ExecInitResultTupleSlotTL(estate, &gm_state->ps);
/*
* now initialize outer plan
@@ -117,7 +117,6 @@ ExecInitGatherMerge(GatherMerge *node, EState *estate, int eflags)
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&gm_state->ps);
ExecAssignProjectionInfo(&gm_state->ps, NULL);
/*
diff --git a/src/backend/executor/nodeGroup.c b/src/backend/executor/nodeGroup.c
index ab4ae24a6b..acd679fa14 100644
--- a/src/backend/executor/nodeGroup.c
+++ b/src/backend/executor/nodeGroup.c
@@ -187,14 +187,7 @@ ExecInitGroup(Group *node, EState *estate, int eflags)
/*
* tuple table initialization
*/
- ExecInitScanTupleSlot(estate, &grpstate->ss);
- ExecInitResultTupleSlot(estate, &grpstate->ss.ps);
-
- /*
- * initialize child expressions
- */
- grpstate->ss.ps.qual =
- ExecInitQual(node->plan.qual, (PlanState *) grpstate);
+ ExecInitResultTupleSlotTL(estate, &grpstate->ss.ps);
/*
* initialize child nodes
@@ -204,14 +197,19 @@ ExecInitGroup(Group *node, EState *estate, int eflags)
/*
* initialize tuple type.
*/
- ExecAssignScanTypeFromOuterPlan(&grpstate->ss);
+ ExecCreateScanSlotForOuterPlan(estate, &grpstate->ss);
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&grpstate->ss.ps);
ExecAssignProjectionInfo(&grpstate->ss.ps, NULL);
+ /*
+ * initialize child expressions
+ */
+ grpstate->ss.ps.qual =
+ ExecInitQual(node->plan.qual, (PlanState *) grpstate);
+
/*
* Precompute fmgr lookup data for inner loop
*/
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index d10d94ccc2..c551f319ac 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -186,7 +186,12 @@ ExecInitHash(Hash *node, EState *estate, int eflags)
/*
* initialize our result slot
*/
- ExecInitResultTupleSlot(estate, &hashstate->ps);
+ ExecInitResultTupleSlotTL(estate, &hashstate->ps);
+
+ /*
+ * initialize child nodes
+ */
+ outerPlanState(hashstate) = ExecInitNode(outerPlan(node), estate, eflags);
/*
* initialize child expressions
@@ -194,16 +199,10 @@ ExecInitHash(Hash *node, EState *estate, int eflags)
hashstate->ps.qual =
ExecInitQual(node->plan.qual, (PlanState *) hashstate);
- /*
- * initialize child nodes
- */
- outerPlanState(hashstate) = ExecInitNode(outerPlan(node), estate, eflags);
-
/*
* initialize tuple type. no need to initialize projection info because
* this node doesn't do projections
*/
- ExecAssignResultTypeFromTL(&hashstate->ps);
hashstate->ps.ps_ProjInfo = NULL;
return hashstate;
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index ab1632cc13..f35538924a 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -401,6 +401,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
hjstate->js.ps.plan = (Plan *) node;
hjstate->js.ps.state = estate;
hjstate->js.ps.ExecProcNode = ExecHashJoin;
+ hjstate->js.jointype = node->join.jointype;
/*
* Miscellaneous initialization
@@ -409,17 +410,6 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &hjstate->js.ps);
- /*
- * initialize child expressions
- */
- hjstate->js.ps.qual =
- ExecInitQual(node->join.plan.qual, (PlanState *) hjstate);
- hjstate->js.jointype = node->join.jointype;
- hjstate->js.joinqual =
- ExecInitQual(node->join.joinqual, (PlanState *) hjstate);
- hjstate->hashclauses =
- ExecInitQual(node->hashclauses, (PlanState *) hjstate);
-
/*
* initialize child nodes
*
@@ -436,7 +426,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &hjstate->js.ps);
+ ExecInitResultTupleSlotTL(estate, &hjstate->js.ps);
hjstate->hj_OuterTupleSlot = ExecInitExtraTupleSlot(estate);
/*
@@ -492,12 +482,21 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
/*
* initialize tuple type and projection info
*/
- ExecAssignResultTypeFromTL(&hjstate->js.ps);
ExecAssignProjectionInfo(&hjstate->js.ps, NULL);
ExecSetSlotDescriptor(hjstate->hj_OuterTupleSlot,
ExecGetResultType(outerPlanState(hjstate)));
+ /*
+ * initialize child expressions
+ */
+ hjstate->js.ps.qual =
+ ExecInitQual(node->join.plan.qual, (PlanState *) hjstate);
+ hjstate->js.joinqual =
+ ExecInitQual(node->join.joinqual, (PlanState *) hjstate);
+ hjstate->hashclauses =
+ ExecInitQual(node->hashclauses, (PlanState *) hjstate);
+
/*
* initialize hash-specific info
*/
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index 5351cb8981..479d6b3e0a 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -474,22 +474,10 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &indexstate->ss.ps);
- /*
- * initialize child expressions
- *
- * Note: we don't initialize all of the indexorderby expression, only the
- * sub-parts corresponding to runtime keys (see below).
- */
- indexstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) indexstate);
- indexstate->indexqual =
- ExecInitQual(node->indexqual, (PlanState *) indexstate);
-
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &indexstate->ss.ps);
- ExecInitScanTupleSlot(estate, &indexstate->ss);
+ ExecInitResultTupleSlotTL(estate, &indexstate->ss.ps);
/*
* open the base relation and acquire appropriate lock on it.
@@ -507,16 +495,26 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
* suitable data anyway.)
*/
tupDesc = ExecTypeFromTL(node->indextlist, false);
- ExecAssignScanType(&indexstate->ss, tupDesc);
+ ExecInitScanTupleSlot(estate, &indexstate->ss, tupDesc);
/*
* Initialize result tuple type and projection info. The node's
* targetlist will contain Vars with varno = INDEX_VAR, referencing the
* scan tuple.
*/
- ExecAssignResultTypeFromTL(&indexstate->ss.ps);
ExecAssignScanProjectionInfoWithVarno(&indexstate->ss, INDEX_VAR);
+ /*
+ * initialize child expressions
+ *
+ * Note: we don't initialize all of the indexorderby expression, only the
+ * sub-parts corresponding to runtime keys (see below).
+ */
+ indexstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) indexstate);
+ indexstate->indexqual =
+ ExecInitQual(node->indexqual, (PlanState *) indexstate);
+
/*
* If we are just doing EXPLAIN (ie, aren't going to run the plan), stop
* here. This allows an index-advisor plugin to EXPLAIN a plan containing
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 638b17b07c..998a39418c 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -900,6 +900,30 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &indexstate->ss.ps);
+ /*
+ * open the base relation and acquire appropriate lock on it.
+ */
+ currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+
+ indexstate->ss.ss_currentRelation = currentRelation;
+ indexstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
+
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &indexstate->ss.ps);
+
+ /*
+ * get the scan type from the relation descriptor.
+ */
+ ExecInitScanTupleSlot(estate, &indexstate->ss,
+ RelationGetDescr(currentRelation));
+
+ /*
+ * Initialize result tuple type and projection info.
+ */
+ ExecAssignScanProjectionInfo(&indexstate->ss);
+
/*
* initialize child expressions
*
@@ -917,31 +941,6 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
indexstate->indexorderbyorig =
ExecInitExprList(node->indexorderbyorig, (PlanState *) indexstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &indexstate->ss.ps);
- ExecInitScanTupleSlot(estate, &indexstate->ss);
-
- /*
- * open the base relation and acquire appropriate lock on it.
- */
- currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
-
- indexstate->ss.ss_currentRelation = currentRelation;
- indexstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
-
- /*
- * get the scan type from the relation descriptor.
- */
- ExecAssignScanType(&indexstate->ss, RelationGetDescr(currentRelation));
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&indexstate->ss.ps);
- ExecAssignScanProjectionInfo(&indexstate->ss);
-
/*
* If we are just doing EXPLAIN (ie, aren't going to run the plan), stop
* here. This allows an index-advisor plugin to EXPLAIN a plan containing
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index 883f46ce7c..0eed8f74b1 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -364,7 +364,7 @@ ExecInitLimit(Limit *node, EState *estate, int eflags)
/*
* Tuple table initialization (XXX not actually used...)
*/
- ExecInitResultTupleSlot(estate, &limitstate->ps);
+ ExecInitResultTupleSlotTL(estate, &limitstate->ps);
/*
* then initialize outer plan
@@ -376,7 +376,6 @@ ExecInitLimit(Limit *node, EState *estate, int eflags)
* limit nodes do no projections, so initialize projection info for this
* node appropriately
*/
- ExecAssignResultTypeFromTL(&limitstate->ps);
limitstate->ps.ps_ProjInfo = NULL;
return limitstate;
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 93895600a5..9dfb2f4524 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -367,16 +367,10 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
lrstate->ps.state = estate;
lrstate->ps.ExecProcNode = ExecLockRows;
- /*
- * Miscellaneous initialization
- *
- * LockRows nodes never call ExecQual or ExecProject.
- */
-
/*
* Tuple table initialization (XXX not actually used...)
*/
- ExecInitResultTupleSlot(estate, &lrstate->ps);
+ ExecInitResultTupleSlotTL(estate, &lrstate->ps);
/*
* then initialize outer plan
@@ -387,9 +381,14 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
* LockRows nodes do no projections, so initialize projection info for
* this node appropriately
*/
- ExecAssignResultTypeFromTL(&lrstate->ps);
lrstate->ps.ps_ProjInfo = NULL;
+ /*
+ * Miscellaneous initialization
+ *
+ * LockRows nodes never call ExecQual or ExecProject.
+ */
+
/*
* Create workspace in which we can remember per-RTE locked tuples
*/
diff --git a/src/backend/executor/nodeMaterial.c b/src/backend/executor/nodeMaterial.c
index 91178f1019..470d3ed717 100644
--- a/src/backend/executor/nodeMaterial.c
+++ b/src/backend/executor/nodeMaterial.c
@@ -199,20 +199,12 @@ ExecInitMaterial(Material *node, EState *estate, int eflags)
matstate->eof_underlying = false;
matstate->tuplestorestate = NULL;
- /*
- * Miscellaneous initialization
- *
- * Materialization nodes don't need ExprContexts because they never call
- * ExecQual or ExecProject.
- */
-
/*
* tuple table initialization
*
* material nodes only return tuples from their materialized relation.
*/
- ExecInitResultTupleSlot(estate, &matstate->ss.ps);
- ExecInitScanTupleSlot(estate, &matstate->ss);
+ ExecInitResultTupleSlotTL(estate, &matstate->ss.ps);
/*
* initialize child nodes
@@ -229,10 +221,16 @@ ExecInitMaterial(Material *node, EState *estate, int eflags)
* initialize tuple type. no need to initialize projection info because
* this node doesn't do projections.
*/
- ExecAssignResultTypeFromTL(&matstate->ss.ps);
- ExecAssignScanTypeFromOuterPlan(&matstate->ss);
+ ExecCreateScanSlotForOuterPlan(estate, &matstate->ss);
matstate->ss.ps.ps_ProjInfo = NULL;
+ /*
+ * Miscellaneous initialization
+ *
+ * Materialization nodes don't need ExprContexts because they never call
+ * ExecQual or ExecProject.
+ */
+
return matstate;
}
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 6bf490bd70..f50464fd7d 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -109,7 +109,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* MergeAppend nodes do have Result slots, which hold pointers to tuples,
* so we have to initialize them.
*/
- ExecInitResultTupleSlot(estate, &mergestate->ps);
+ ExecInitResultTupleSlotTL(estate, &mergestate->ps);
/*
* call ExecInitNode on each of the plans to be executed and save the
@@ -127,7 +127,6 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
/*
* initialize output tuple type
*/
- ExecAssignResultTypeFromTL(&mergestate->ps);
mergestate->ps.ps_ProjInfo = NULL;
/*
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 925b4cf553..b11af73aaa 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -1450,6 +1450,8 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
mergestate->js.ps.plan = (Plan *) node;
mergestate->js.ps.state = estate;
mergestate->js.ps.ExecProcNode = ExecMergeJoin;
+ mergestate->js.jointype = node->join.jointype;
+ mergestate->mj_ConstFalseJoin = false;
/*
* Miscellaneous initialization
@@ -1466,17 +1468,6 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
mergestate->mj_OuterEContext = CreateExprContext(estate);
mergestate->mj_InnerEContext = CreateExprContext(estate);
- /*
- * initialize child expressions
- */
- mergestate->js.ps.qual =
- ExecInitQual(node->join.plan.qual, (PlanState *) mergestate);
- mergestate->js.jointype = node->join.jointype;
- mergestate->js.joinqual =
- ExecInitQual(node->join.joinqual, (PlanState *) mergestate);
- mergestate->mj_ConstFalseJoin = false;
- /* mergeclauses are handled below */
-
/*
* initialize child nodes
*
@@ -1513,12 +1504,21 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &mergestate->js.ps);
+ ExecInitResultTupleSlotTL(estate, &mergestate->js.ps);
mergestate->mj_MarkedTupleSlot = ExecInitExtraTupleSlot(estate);
ExecSetSlotDescriptor(mergestate->mj_MarkedTupleSlot,
ExecGetResultType(innerPlanState(mergestate)));
+ /*
+ * initialize child expressions
+ */
+ mergestate->js.ps.qual =
+ ExecInitQual(node->join.plan.qual, (PlanState *) mergestate);
+ mergestate->js.joinqual =
+ ExecInitQual(node->join.joinqual, (PlanState *) mergestate);
+ /* mergeclauses are handled below */
+
/*
* detect whether we need only consider the first matching inner tuple
*/
@@ -1586,7 +1586,6 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
/*
* initialize tuple type and projection info
*/
- ExecAssignResultTypeFromTL(&mergestate->js.ps);
ExecAssignProjectionInfo(&mergestate->js.ps, NULL);
/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 57946e1591..1e76ef7dd4 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2029,8 +2029,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
mtstate->ps.plan->targetlist = (List *) linitial(node->returningLists);
/* Set up a slot for the output of the RETURNING projection(s) */
- ExecInitResultTupleSlot(estate, &mtstate->ps);
- ExecAssignResultTypeFromTL(&mtstate->ps);
+ ExecInitResultTupleSlotTL(estate, &mtstate->ps);
slot = mtstate->ps.ps_ResultTupleSlot;
/* Need an econtext too */
@@ -2083,8 +2082,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* expects one (maybe should change that?).
*/
mtstate->ps.plan->targetlist = NIL;
- ExecInitResultTupleSlot(estate, &mtstate->ps);
- ExecAssignResultTypeFromTL(&mtstate->ps);
+ ExecInitResultTupleSlotTL(estate, &mtstate->ps);
mtstate->ps.ps_ExprContext = NULL;
}
diff --git a/src/backend/executor/nodeNamedtuplestorescan.c b/src/backend/executor/nodeNamedtuplestorescan.c
index 3a65b9f5dc..28036d5076 100644
--- a/src/backend/executor/nodeNamedtuplestorescan.c
+++ b/src/backend/executor/nodeNamedtuplestorescan.c
@@ -132,27 +132,21 @@ ExecInitNamedTuplestoreScan(NamedTuplestoreScan *node, EState *estate, int eflag
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecInitScanTupleSlot(estate, &scanstate->ss, scanstate->tupdesc);
+
/*
* initialize child expressions
*/
scanstate->ss.ps.qual =
ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * The scan tuple type is specified for the tuplestore.
- */
- ExecAssignScanType(&scanstate->ss, scanstate->tupdesc);
-
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
ExecAssignScanProjectionInfo(&scanstate->ss);
return scanstate;
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 4447b7c051..32c1544f4f 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -285,15 +285,6 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &nlstate->js.ps);
- /*
- * initialize child expressions
- */
- nlstate->js.ps.qual =
- ExecInitQual(node->join.plan.qual, (PlanState *) nlstate);
- nlstate->js.jointype = node->join.jointype;
- nlstate->js.joinqual =
- ExecInitQual(node->join.joinqual, (PlanState *) nlstate);
-
/*
* initialize child nodes
*
@@ -313,7 +304,16 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &nlstate->js.ps);
+ ExecInitResultTupleSlotTL(estate, &nlstate->js.ps);
+
+ /*
+ * initialize child expressions
+ */
+ nlstate->js.ps.qual =
+ ExecInitQual(node->join.plan.qual, (PlanState *) nlstate);
+ nlstate->js.jointype = node->join.jointype;
+ nlstate->js.joinqual =
+ ExecInitQual(node->join.joinqual, (PlanState *) nlstate);
/*
* detect whether we need only consider the first matching inner tuple
@@ -341,7 +341,6 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
/*
* initialize tuple type and projection info
*/
- ExecAssignResultTypeFromTL(&nlstate->js.ps);
ExecAssignProjectionInfo(&nlstate->js.ps, NULL);
/*
diff --git a/src/backend/executor/nodeProjectSet.c b/src/backend/executor/nodeProjectSet.c
index d93462c542..c55a5fded0 100644
--- a/src/backend/executor/nodeProjectSet.c
+++ b/src/backend/executor/nodeProjectSet.c
@@ -230,7 +230,7 @@ ExecInitProjectSet(ProjectSet *node, EState *estate, int eflags)
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &state->ps);
+ ExecInitResultTupleSlotTL(estate, &state->ps);
/* We don't support any qual on ProjectSet nodes */
Assert(node->plan.qual == NIL);
@@ -248,7 +248,6 @@ ExecInitProjectSet(ProjectSet *node, EState *estate, int eflags)
/*
* initialize tuple type and projection info
*/
- ExecAssignResultTypeFromTL(&state->ps);
/* Create workspace for per-tlist-entry expr state & SRF-is-done state */
state->nelems = list_length(node->plan.targetlist);
diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index a64dd1397a..3c1f9ac7c6 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -214,6 +214,26 @@ ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags)
prmdata->value = PointerGetDatum(rustate);
prmdata->isnull = false;
+
+ /*
+ * initialize child nodes
+ */
+ outerPlanState(rustate) = ExecInitNode(outerPlan(node), estate, eflags);
+ innerPlanState(rustate) = ExecInitNode(innerPlan(node), estate, eflags);
+
+ /*
+ * RecursiveUnion nodes still have Result slots, which hold pointers to
+ * tuples, so we have to initialize them.
+ */
+ ExecInitResultTupleSlotTL(estate, &rustate->ps);
+
+ /*
+ * Initialize result tuple type and projection info. (Note: we have to
+ * set up the result type before initializing child nodes, because
+ * nodeWorktablescan.c expects it to be valid.)
+ */
+ rustate->ps.ps_ProjInfo = NULL;
+
/*
* Miscellaneous initialization
*
@@ -222,26 +242,6 @@ ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags)
*/
Assert(node->plan.qual == NIL);
- /*
- * RecursiveUnion nodes still have Result slots, which hold pointers to
- * tuples, so we have to initialize them.
- */
- ExecInitResultTupleSlot(estate, &rustate->ps);
-
- /*
- * Initialize result tuple type and projection info. (Note: we have to
- * set up the result type before initializing child nodes, because
- * nodeWorktablescan.c expects it to be valid.)
- */
- ExecAssignResultTypeFromTL(&rustate->ps);
- rustate->ps.ps_ProjInfo = NULL;
-
- /*
- * initialize child nodes
- */
- outerPlanState(rustate) = ExecInitNode(outerPlan(node), estate, eflags);
- innerPlanState(rustate) = ExecInitNode(innerPlan(node), estate, eflags);
-
/*
* If hashing, precompute fmgr lookup data for inner loop, and create the
* hash table.
diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index 4c879d8765..7767c7426f 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -204,19 +204,6 @@ ExecInitResult(Result *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &resstate->ps);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &resstate->ps);
-
- /*
- * initialize child expressions
- */
- resstate->ps.qual =
- ExecInitQual(node->plan.qual, (PlanState *) resstate);
- resstate->resconstantqual =
- ExecInitQual((List *) node->resconstantqual, (PlanState *) resstate);
-
/*
* initialize child nodes
*/
@@ -227,12 +214,24 @@ ExecInitResult(Result *node, EState *estate, int eflags)
*/
Assert(innerPlan(node) == NULL);
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &resstate->ps);
+
/*
* initialize tuple type and projection info
*/
- ExecAssignResultTypeFromTL(&resstate->ps);
ExecAssignProjectionInfo(&resstate->ps, NULL);
+ /*
+ * initialize child expressions
+ */
+ resstate->ps.qual =
+ ExecInitQual(node->plan.qual, (PlanState *) resstate);
+ resstate->resconstantqual =
+ ExecInitQual((List *) node->resconstantqual, (PlanState *) resstate);
+
return resstate;
}
diff --git a/src/backend/executor/nodeSamplescan.c b/src/backend/executor/nodeSamplescan.c
index 9c74a836e4..89f2a78d14 100644
--- a/src/backend/executor/nodeSamplescan.c
+++ b/src/backend/executor/nodeSamplescan.c
@@ -26,7 +26,6 @@
#include "utils/rel.h"
#include "utils/tqual.h"
-static void InitScanRelation(SampleScanState *node, EState *estate, int eflags);
static TupleTableSlot *SampleNext(SampleScanState *node);
static void tablesample_init(SampleScanState *scanstate);
static HeapTuple tablesample_getnext(SampleScanState *scanstate);
@@ -106,35 +105,6 @@ ExecSampleScan(PlanState *pstate)
(ExecScanRecheckMtd) SampleRecheck);
}
-/* ----------------------------------------------------------------
- * InitScanRelation
- *
- * Set up to access the scan relation.
- * ----------------------------------------------------------------
- */
-static void
-InitScanRelation(SampleScanState *node, EState *estate, int eflags)
-{
- Relation currentRelation;
-
- /*
- * get the relation object id from the relid'th entry in the range table,
- * open that relation and acquire appropriate lock on it.
- */
- currentRelation = ExecOpenScanRelation(estate,
- ((SampleScan *) node->ss.ps.plan)->scan.scanrelid,
- eflags);
-
- node->ss.ss_currentRelation = currentRelation;
-
- /* we won't set up the HeapScanDesc till later */
- node->ss.ss_currentScanDesc = NULL;
-
- /* and report the scan tuple slot's rowtype */
- ExecAssignScanType(&node->ss, RelationGetDescr(currentRelation));
-}
-
-
/* ----------------------------------------------------------------
* ExecInitSampleScan
* ----------------------------------------------------------------
@@ -164,6 +134,36 @@ ExecInitSampleScan(SampleScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+
+ /*
+ * initialize scan relation
+ */
+
+ /*
+ * get the relation object id from the relid'th entry in the range table,
+ * open that relation and acquire appropriate lock on it.
+ */
+ scanstate->ss.ss_currentRelation =
+ ExecOpenScanRelation(estate,
+ node->scan.scanrelid,
+ eflags);
+
+ /* we won't set up the HeapScanDesc till later */
+ scanstate->ss.ss_currentScanDesc = NULL;
+
+ /* and create slot with the appropriate rowtype */
+ ExecInitScanTupleSlot(estate, &scanstate->ss,
+ RelationGetDescr(scanstate->ss.ss_currentRelation));
+
+ /*
+ * Initialize result tuple type and projection info.
+ */
+ ExecAssignScanProjectionInfo(&scanstate->ss);
+
/*
* initialize child expressions
*/
@@ -174,23 +174,6 @@ ExecInitSampleScan(SampleScan *node, EState *estate, int eflags)
scanstate->repeatable =
ExecInitExpr(tsc->repeatable, (PlanState *) scanstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * initialize scan relation
- */
- InitScanRelation(scanstate, estate, eflags);
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
- ExecAssignScanProjectionInfo(&scanstate->ss);
-
/*
* If we don't have a REPEATABLE clause, select a random seed. We want to
* do this just once, since the seed shouldn't change over rescans.
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index d4ac939c9b..da2bb69583 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -32,7 +32,6 @@
#include "executor/nodeSeqscan.h"
#include "utils/rel.h"
-static void InitScanRelation(SeqScanState *node, EState *estate, int eflags);
static TupleTableSlot *SeqNext(SeqScanState *node);
/* ----------------------------------------------------------------
@@ -132,31 +131,6 @@ ExecSeqScan(PlanState *pstate)
(ExecScanRecheckMtd) SeqRecheck);
}
-/* ----------------------------------------------------------------
- * InitScanRelation
- *
- * Set up to access the scan relation.
- * ----------------------------------------------------------------
- */
-static void
-InitScanRelation(SeqScanState *node, EState *estate, int eflags)
-{
- Relation currentRelation;
-
- /*
- * get the relation object id from the relid'th entry in the range table,
- * open that relation and acquire appropriate lock on it.
- */
- currentRelation = ExecOpenScanRelation(estate,
- ((SeqScan *) node->ss.ps.plan)->scanrelid,
- eflags);
-
- node->ss.ss_currentRelation = currentRelation;
-
- /* and report the scan tuple slot's rowtype */
- ExecAssignScanType(&node->ss, RelationGetDescr(currentRelation));
-}
-
/* ----------------------------------------------------------------
* ExecInitSeqScan
@@ -189,29 +163,39 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
- /*
- * initialize child expressions
- */
- scanstate->ss.ps.qual =
- ExecInitQual(node->plan.qual, (PlanState *) scanstate);
-
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
/*
* initialize scan relation
*/
- InitScanRelation(scanstate, estate, eflags);
+
+ /*
+ * get the relation object id from the relid'th entry in the range table,
+ * open that relation and acquire appropriate lock on it.
+ */
+ scanstate->ss.ss_currentRelation =
+ ExecOpenScanRelation(estate,
+ node->scanrelid,
+ eflags);
+
+ /* and create slot with the appropriate rowtype */
+ ExecInitScanTupleSlot(estate, &scanstate->ss,
+ RelationGetDescr(scanstate->ss.ss_currentRelation));
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
ExecAssignScanProjectionInfo(&scanstate->ss);
+ /*
+ * initialize child expressions
+ */
+ scanstate->ss.ps.qual =
+ ExecInitQual(node->plan.qual, (PlanState *) scanstate);
+
return scanstate;
}
diff --git a/src/backend/executor/nodeSetOp.c b/src/backend/executor/nodeSetOp.c
index 571cbf86b1..126e65d4c7 100644
--- a/src/backend/executor/nodeSetOp.c
+++ b/src/backend/executor/nodeSetOp.c
@@ -523,7 +523,7 @@ ExecInitSetOp(SetOp *node, EState *estate, int eflags)
/*
* Tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &setopstate->ps);
+ ExecInitResultTupleSlotTL(estate, &setopstate->ps);
/*
* initialize child nodes
@@ -539,7 +539,6 @@ ExecInitSetOp(SetOp *node, EState *estate, int eflags)
* setop nodes do no projections, so initialize projection info for this
* node appropriately
*/
- ExecAssignResultTypeFromTL(&setopstate->ps);
setopstate->ps.ps_ProjInfo = NULL;
/*
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 98bcaeb66f..de4733bb19 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -191,20 +191,12 @@ ExecInitSort(Sort *node, EState *estate, int eflags)
sortstate->sort_Done = false;
sortstate->tuplesortstate = NULL;
- /*
- * Miscellaneous initialization
- *
- * Sort nodes don't initialize their ExprContexts because they never call
- * ExecQual or ExecProject.
- */
-
/*
* tuple table initialization
*
* sort nodes only return scan tuples from their sorted relation.
*/
- ExecInitResultTupleSlot(estate, &sortstate->ss.ps);
- ExecInitScanTupleSlot(estate, &sortstate->ss);
+ ExecInitResultTupleSlotTL(estate, &sortstate->ss.ps);
/*
* initialize child nodes
@@ -220,10 +212,16 @@ ExecInitSort(Sort *node, EState *estate, int eflags)
* initialize tuple type. no need to initialize projection info because
* this node doesn't do projections.
*/
- ExecAssignResultTypeFromTL(&sortstate->ss.ps);
- ExecAssignScanTypeFromOuterPlan(&sortstate->ss);
+ ExecCreateScanSlotForOuterPlan(estate, &sortstate->ss);
sortstate->ss.ps.ps_ProjInfo = NULL;
+ /*
+ * Miscellaneous initialization
+ *
+ * Sort nodes don't initialize their ExprContexts because they never call
+ * ExecQual or ExecProject.
+ */
+
SO1_printf("ExecInitSort: %s\n",
"sort node initialized");
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 088c92992e..93f582eb02 100644
--- a/src/backend/executor/nodeSubqueryscan.c
+++ b/src/backend/executor/nodeSubqueryscan.c
@@ -120,17 +120,10 @@ ExecInitSubqueryScan(SubqueryScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &subquerystate->ss.ps);
- /*
- * initialize child expressions
- */
- subquerystate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) subquerystate);
-
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &subquerystate->ss.ps);
- ExecInitScanTupleSlot(estate, &subquerystate->ss);
+ ExecInitResultTupleSlotTL(estate, &subquerystate->ss.ps);
/*
* initialize subquery
@@ -140,15 +133,20 @@ ExecInitSubqueryScan(SubqueryScan *node, EState *estate, int eflags)
/*
* Initialize scan tuple type (needed by ExecAssignScanProjectionInfo)
*/
- ExecAssignScanType(&subquerystate->ss,
+ ExecInitScanTupleSlot(estate, &subquerystate->ss,
ExecGetResultType(subquerystate->subplan));
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&subquerystate->ss.ps);
ExecAssignScanProjectionInfo(&subquerystate->ss);
+ /*
+ * initialize child expressions
+ */
+ subquerystate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) subquerystate);
+
return subquerystate;
}
diff --git a/src/backend/executor/nodeTableFuncscan.c b/src/backend/executor/nodeTableFuncscan.c
index 165fae8c83..d10d8b8097 100644
--- a/src/backend/executor/nodeTableFuncscan.c
+++ b/src/backend/executor/nodeTableFuncscan.c
@@ -139,18 +139,6 @@ ExecInitTableFuncScan(TableFuncScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
- /*
- * initialize child expressions
- */
- scanstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, &scanstate->ss.ps);
-
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
/*
* initialize source tuple type
*/
@@ -159,14 +147,23 @@ ExecInitTableFuncScan(TableFuncScan *node, EState *estate, int eflags)
tf->coltypmods,
tf->colcollations);
- ExecAssignScanType(&scanstate->ss, tupdesc);
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecInitScanTupleSlot(estate, &scanstate->ss, tupdesc);
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
ExecAssignScanProjectionInfo(&scanstate->ss);
+ /*
+ * initialize child expressions
+ */
+ scanstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, &scanstate->ss.ps);
+
/* Only XMLTABLE is supported currently */
scanstate->routine = &XmlTableRoutine;
diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c
index 0ee76e7d25..8d7ea412c2 100644
--- a/src/backend/executor/nodeTidscan.c
+++ b/src/backend/executor/nodeTidscan.c
@@ -530,6 +530,30 @@ ExecInitTidScan(TidScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &tidstate->ss.ps);
+ /*
+ * open the base relation and acquire appropriate lock on it.
+ */
+ currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+
+ tidstate->ss.ss_currentRelation = currentRelation;
+ tidstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
+
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &tidstate->ss.ps);
+
+ /*
+ * get the scan type from the relation descriptor.
+ */
+ ExecInitScanTupleSlot(estate, &tidstate->ss,
+ RelationGetDescr(currentRelation));
+
+ /*
+ * Initialize result tuple type and projection info.
+ */
+ ExecAssignScanProjectionInfo(&tidstate->ss);
+
/*
* initialize child expressions
*/
@@ -538,12 +562,6 @@ ExecInitTidScan(TidScan *node, EState *estate, int eflags)
TidExprListCreate(tidstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &tidstate->ss.ps);
- ExecInitScanTupleSlot(estate, &tidstate->ss);
-
/*
* mark tid list as not computed yet
*/
@@ -551,25 +569,6 @@ ExecInitTidScan(TidScan *node, EState *estate, int eflags)
tidstate->tss_NumTids = 0;
tidstate->tss_TidPtr = -1;
- /*
- * open the base relation and acquire appropriate lock on it.
- */
- currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
-
- tidstate->ss.ss_currentRelation = currentRelation;
- tidstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
-
- /*
- * get the scan type from the relation descriptor.
- */
- ExecAssignScanType(&tidstate->ss, RelationGetDescr(currentRelation));
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&tidstate->ss.ps);
- ExecAssignScanProjectionInfo(&tidstate->ss);
-
/*
* all done.
*/
diff --git a/src/backend/executor/nodeUnique.c b/src/backend/executor/nodeUnique.c
index 621fdd9b9c..b86f1e549a 100644
--- a/src/backend/executor/nodeUnique.c
+++ b/src/backend/executor/nodeUnique.c
@@ -143,7 +143,7 @@ ExecInitUnique(Unique *node, EState *estate, int eflags)
/*
* Tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &uniquestate->ps);
+ ExecInitResultTupleSlotTL(estate, &uniquestate->ps);
/*
* then initialize outer plan
@@ -154,7 +154,6 @@ ExecInitUnique(Unique *node, EState *estate, int eflags)
* unique nodes do no projections, so initialize projection info for this
* node appropriately
*/
- ExecAssignResultTypeFromTL(&uniquestate->ps);
uniquestate->ps.ps_ProjInfo = NULL;
/*
diff --git a/src/backend/executor/nodeValuesscan.c b/src/backend/executor/nodeValuesscan.c
index 1a72bfe160..232234cfd8 100644
--- a/src/backend/executor/nodeValuesscan.c
+++ b/src/backend/executor/nodeValuesscan.c
@@ -239,21 +239,20 @@ ExecInitValuesScan(ValuesScan *node, EState *estate, int eflags)
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * initialize child expressions
- */
- scanstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
/*
* get info about values list
*/
tupdesc = ExecTypeFromExprList((List *) linitial(node->values_lists));
- ExecAssignScanType(&scanstate->ss, tupdesc);
+ ExecInitScanTupleSlot(estate, &scanstate->ss, tupdesc);
+
+ /*
+ * initialize child expressions
+ */
+ scanstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
/*
* Other node-specific setup
@@ -273,7 +272,6 @@ ExecInitValuesScan(ValuesScan *node, EState *estate, int eflags)
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
ExecAssignScanProjectionInfo(&scanstate->ss);
return scanstate;
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 80be46029f..1b3d075b3a 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -1823,8 +1823,7 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
/*
* tuple table initialization
*/
- ExecInitScanTupleSlot(estate, &winstate->ss);
- ExecInitResultTupleSlot(estate, &winstate->ss.ps);
+ ExecInitResultTupleSlotTL(estate, &winstate->ss.ps);
winstate->first_part_slot = ExecInitExtraTupleSlot(estate);
winstate->agg_row_slot = ExecInitExtraTupleSlot(estate);
winstate->temp_slot_1 = ExecInitExtraTupleSlot(estate);
@@ -1847,7 +1846,7 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
* initialize source tuple type (which is also the tuple type that we'll
* store in the tuplestore and use in all our working slots).
*/
- ExecAssignScanTypeFromOuterPlan(&winstate->ss);
+ ExecCreateScanSlotForOuterPlan(estate, &winstate->ss);
ExecSetSlotDescriptor(winstate->first_part_slot,
winstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor);
@@ -1861,7 +1860,6 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
/*
* Initialize result tuple type and projection info.
*/
- ExecAssignResultTypeFromTL(&winstate->ss.ps);
ExecAssignProjectionInfo(&winstate->ss.ps, NULL);
/* Set up data for comparing tuples */
diff --git a/src/backend/executor/nodeWorktablescan.c b/src/backend/executor/nodeWorktablescan.c
index d5ffadda3e..f7ec95ba67 100644
--- a/src/backend/executor/nodeWorktablescan.c
+++ b/src/backend/executor/nodeWorktablescan.c
@@ -156,22 +156,21 @@ ExecInitWorkTableScan(WorkTableScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecInitScanTupleSlot(estate, &scanstate->ss, NULL);
+
/*
* initialize child expressions
*/
scanstate->ss.ps.qual =
ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
/*
* Initialize result tuple type, but not yet projection info.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
return scanstate;
}
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index f48a603dae..f0601cb870 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -418,8 +418,8 @@ extern void ExecScanReScan(ScanState *node);
/*
* prototypes from functions in execTuples.c
*/
-extern void ExecInitResultTupleSlot(EState *estate, PlanState *planstate);
-extern void ExecInitScanTupleSlot(EState *estate, ScanState *scanstate);
+extern void ExecInitResultTupleSlotTL(EState *estate, PlanState *planstate);
+extern void ExecInitScanTupleSlot(EState *estate, ScanState *scanstate, TupleDesc tupleDesc);
extern TupleTableSlot *ExecInitExtraTupleSlot(EState *estate);
extern TupleTableSlot *ExecInitNullTupleSlot(EState *estate,
TupleDesc tupType);
@@ -489,14 +489,12 @@ extern ExprContext *MakePerTupleExprContext(EState *estate);
} while (0)
extern void ExecAssignExprContext(EState *estate, PlanState *planstate);
-extern void ExecAssignResultType(PlanState *planstate, TupleDesc tupDesc);
-extern void ExecAssignResultTypeFromTL(PlanState *planstate);
extern TupleDesc ExecGetResultType(PlanState *planstate);
extern void ExecAssignProjectionInfo(PlanState *planstate,
TupleDesc inputDesc);
extern void ExecFreeExprContext(PlanState *planstate);
extern void ExecAssignScanType(ScanState *scanstate, TupleDesc tupDesc);
-extern void ExecAssignScanTypeFromOuterPlan(ScanState *scanstate);
+extern void ExecCreateScanSlotForOuterPlan(EState *estate, ScanState *scanstate);
extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index 55f4cce4ee..6c24fd334d 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -127,6 +127,7 @@ typedef struct TupleTableSlot
MinimalTuple tts_mintuple; /* minimal tuple, or NULL if none */
HeapTupleData tts_minhdr; /* workspace for minimal-tuple-only case */
long tts_off; /* saved state for slot_deform_tuple */
+ bool tts_fixedTupleDescriptor;
} TupleTableSlot;
#define TTS_HAS_PHYSICAL_TUPLE(slot) \
@@ -139,8 +140,8 @@ typedef struct TupleTableSlot
((slot) == NULL || (slot)->tts_isempty)
/* in executor/execTuples.c */
-extern TupleTableSlot *MakeTupleTableSlot(void);
-extern TupleTableSlot *ExecAllocTableSlot(List **tupleTable);
+extern TupleTableSlot *MakeTupleTableSlot(TupleDesc desc);
+extern TupleTableSlot *ExecAllocTableSlot(List **tupleTable, TupleDesc desc);
extern void ExecResetTupleTable(List *tupleTable, bool shouldFree);
extern TupleTableSlot *MakeSingleTupleTableSlot(TupleDesc tupdesc);
extern void ExecDropSingleTupleTableSlot(TupleTableSlot *slot);
--
2.14.1.2.g4274c698f4.dirty
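
A note on the API change in the patch above: slot creation now takes the tuple descriptor up front (MakeTupleTableSlot(TupleDesc desc), ExecInitScanTupleSlot(..., tupleDesc)), and TupleTableSlot gains tts_fixedTupleDescriptor. Below is a minimal, self-contained sketch of that idea. MockTupleDesc, MockSlot, mock_make_slot and mock_set_descriptor are hypothetical stand-ins, not PostgreSQL code, and the rule that a fixed slot refuses a later descriptor change is an assumption about what the flag is for, since this part of the patch does not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the PostgreSQL definitions. */
typedef struct MockTupleDesc
{
	int			natts;			/* number of attributes */
} MockTupleDesc;

typedef struct MockSlot
{
	const MockTupleDesc *desc;	/* descriptor, or NULL if not yet known */
	bool		fixed_desc;		/* loosely mirrors tts_fixedTupleDescriptor */
} MockSlot;

/*
 * Create a slot. Passing a descriptor at creation time pins the slot's
 * shape; passing NULL leaves it open for a later assignment.
 */
static MockSlot *
mock_make_slot(const MockTupleDesc *desc)
{
	MockSlot   *slot = calloc(1, sizeof(MockSlot));

	if (slot == NULL)
		return NULL;
	slot->desc = desc;
	slot->fixed_desc = (desc != NULL);
	return slot;
}

/*
 * Assign a descriptor after creation. Assumed rule: only slots created
 * without a descriptor accept this. Returns true on success.
 */
static bool
mock_set_descriptor(MockSlot *slot, const MockTupleDesc *desc)
{
	assert(slot != NULL);
	if (slot->fixed_desc)
		return false;
	slot->desc = desc;
	return true;
}
```

The point of knowing the descriptor at creation time is that code generated for a slot (e.g. tuple deforming) can rely on the slot's shape not changing underneath it. The NULL case corresponds to callers such as ExecInitWorkTableScan above, which pass NULL because the descriptor is not known yet.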
0003-WIP-Add-configure-infrastructure-to-enable-LLVM.patch (text/x-diff; charset=us-ascii)
From 47dfd412a19a88aef125b4473337bb68c2dadb6a Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Mon, 13 Mar 2017 20:22:10 -0700
Subject: [PATCH 03/16] WIP: Add configure infrastructure to enable LLVM.
---
configure | 97 ++++++++++++++++++++++++++++++++++++++++++++++
configure.in | 27 +++++++++++++
src/Makefile.global.in | 2 +
src/backend/Makefile | 4 ++
src/include/pg_config.h.in | 3 ++
5 files changed, 133 insertions(+)
diff --git a/configure b/configure
index a2f9a256b4..fe905e294b 100755
--- a/configure
+++ b/configure
@@ -700,6 +700,9 @@ LDFLAGS_EX
ELF_SYS
EGREP
GREP
+with_llvm
+LLVM_LIBS
+LLVM_CONFIG
with_zlib
with_system_tzdata
with_libxslt
@@ -848,6 +851,7 @@ with_libxml
with_libxslt
with_system_tzdata
with_zlib
+with_llvm
with_gnu_ld
enable_largefile
enable_float4_byval
@@ -1546,6 +1550,7 @@ Optional Packages:
--with-system-tzdata=DIR
use system time zone data in DIR
--without-zlib do not use Zlib
+ --with-llvm build with llvm (JIT) support
--with-gnu-ld assume the C compiler uses GNU ld [default=no]
Some influential environment variables:
@@ -6460,6 +6465,98 @@ fi
+
+
+
+# Check whether --with-llvm was given.
+if test "${with_llvm+set}" = set; then :
+ withval=$with_llvm;
+ case $withval in
+ yes)
+
+$as_echo "#define USE_LLVM 1" >>confdefs.h
+
+ ;;
+ no)
+ :
+ ;;
+ *)
+ as_fn_error $? "no argument expected for --with-llvm option" "$LINENO" 5
+ ;;
+ esac
+
+else
+ with_llvm=no
+
+fi
+
+
+
+if test "$with_llvm" = yes ; then
+ for ac_prog in llvm-config
+do
+ # Extract the first word of "$ac_prog", so it can be a program name with args.
+set dummy $ac_prog; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_prog_LLVM_CONFIG+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ if test -n "$LLVM_CONFIG"; then
+ ac_cv_prog_LLVM_CONFIG="$LLVM_CONFIG" # Let the user override the test.
+else
+as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+ IFS=$as_save_IFS
+ test -z "$as_dir" && as_dir=.
+ for ac_exec_ext in '' $ac_executable_extensions; do
+ if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+ ac_cv_prog_LLVM_CONFIG="$ac_prog"
+ $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+ break 2
+ fi
+done
+ done
+IFS=$as_save_IFS
+
+fi
+fi
+LLVM_CONFIG=$ac_cv_prog_LLVM_CONFIG
+if test -n "$LLVM_CONFIG"; then
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: $LLVM_CONFIG" >&5
+$as_echo "$LLVM_CONFIG" >&6; }
+else
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+ test -n "$LLVM_CONFIG" && break
+done
+
+ if test -n "$LLVM_CONFIG"; then
+ for pgac_option in `$LLVM_CONFIG --cflags`; do
+ case $pgac_option in
+ -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
+ esac
+ done
+ for pgac_option in `$LLVM_CONFIG --ldflags`; do
+ case $pgac_option in
+ -L*) LDFLAGS="$LDFLAGS $pgac_option";;
+ esac
+ done
+ for pgac_option in `$LLVM_CONFIG --libs --system-libs engine`; do
+ case $pgac_option in
+ -l*) LLVM_LIBS="$LLVM_LIBS $pgac_option";;
+ esac
+ done
+ fi
+fi
+
+
+
+
#
# Elf
#
diff --git a/configure.in b/configure.in
index e94fba5235..a99da9dff3 100644
--- a/configure.in
+++ b/configure.in
@@ -856,6 +856,33 @@ PGAC_ARG_BOOL(with, zlib, yes,
[do not use Zlib])
AC_SUBST(with_zlib)
+PGAC_ARG_BOOL(with, llvm, no, [build with llvm (JIT) support],
+ [AC_DEFINE([USE_LLVM], 1, [Define to 1 to build with llvm support. (--with-llvm)])])
+
+if test "$with_llvm" = yes ; then
+ AC_CHECK_PROGS(LLVM_CONFIG, llvm-config)
+ if test -n "$LLVM_CONFIG"; then
+ for pgac_option in `$LLVM_CONFIG --cflags`; do
+ case $pgac_option in
+ -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
+ esac
+ done
+ for pgac_option in `$LLVM_CONFIG --ldflags`; do
+ case $pgac_option in
+ -L*) LDFLAGS="$LDFLAGS $pgac_option";;
+ esac
+ done
+ for pgac_option in `$LLVM_CONFIG --libs --system-libs engine`; do
+ case $pgac_option in
+ -l*) LLVM_LIBS="$LLVM_LIBS $pgac_option";;
+ esac
+ done
+ fi
+fi
+AC_SUBST(LLVM_LIBS)
+AC_SUBST(with_llvm)
+
+
#
# Elf
#
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index e8b3a519cb..ab5862b472 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -186,6 +186,7 @@ with_tcl = @with_tcl@
with_openssl = @with_openssl@
with_selinux = @with_selinux@
with_systemd = @with_systemd@
+with_llvm = @with_llvm@
with_libxml = @with_libxml@
with_libxslt = @with_libxslt@
with_system_tzdata = @with_system_tzdata@
@@ -270,6 +271,7 @@ LDAP_LIBS_FE = @LDAP_LIBS_FE@
LDAP_LIBS_BE = @LDAP_LIBS_BE@
UUID_LIBS = @UUID_LIBS@
UUID_EXTRA_OBJS = @UUID_EXTRA_OBJS@
+LLVM_LIBS=@LLVM_LIBS@
LD = @LD@
with_gnu_ld = @with_gnu_ld@
diff --git a/src/backend/Makefile b/src/backend/Makefile
index aab676dbbd..c82ad75bda 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -45,6 +45,10 @@ LIBS := $(filter-out -lpgport -lpgcommon, $(LIBS)) $(LDAP_LIBS_BE) $(ICU_LIBS)
# The backend doesn't need everything that's in LIBS, however
LIBS := $(filter-out -lz -lreadline -ledit -ltermcap -lncurses -lcurses, $(LIBS))
+# Only the backend needs LLVM (if enabled), and it's a big library, so
+# only specify it here
+LIBS += $(LLVM_LIBS)
+
ifeq ($(with_systemd),yes)
LIBS += -lsystemd
endif
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index dcb7a1a320..633c670de9 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -832,6 +832,9 @@
(--with-libxslt) */
#undef USE_LIBXSLT
+/* Define to 1 to build with llvm support. (--with-llvm) */
+#undef USE_LLVM
+
/* Define to select named POSIX semaphores. */
#undef USE_NAMED_POSIX_SEMAPHORES
--
2.14.1.2.g4274c698f4.dirty
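
A note on the configure logic in the patch above: the output of llvm-config is not used wholesale. Only -I/-D options from --cflags go into CPPFLAGS, only -L options from --ldflags go into LDFLAGS, and only -l options from --libs --system-libs go into LLVM_LIBS; everything else is dropped. The following is a small sketch of that filtering rule in C. The names FlagSource, FlagDest and classify_llvm_flag are made up for illustration; the real logic is the shell case statements in configure.in.

```c
#include <assert.h>
#include <string.h>

/* Which llvm-config invocation a token came from (hypothetical names). */
typedef enum FlagSource
{
	SRC_CFLAGS,					/* llvm-config --cflags */
	SRC_LDFLAGS,				/* llvm-config --ldflags */
	SRC_LIBS					/* llvm-config --libs --system-libs ... */
} FlagSource;

/* Which build variable the token ends up in, if any. */
typedef enum FlagDest
{
	DEST_IGNORED,
	DEST_CPPFLAGS,
	DEST_LDFLAGS,
	DEST_LLVM_LIBS
} FlagDest;

/*
 * Mirror the prefix matching of the shell "case" statements: each source
 * only accepts the option prefixes listed for it, the rest is ignored.
 */
static FlagDest
classify_llvm_flag(FlagSource src, const char *opt)
{
	assert(opt != NULL);

	switch (src)
	{
		case SRC_CFLAGS:
			if (strncmp(opt, "-I", 2) == 0 || strncmp(opt, "-D", 2) == 0)
				return DEST_CPPFLAGS;
			break;
		case SRC_LDFLAGS:
			if (strncmp(opt, "-L", 2) == 0)
				return DEST_LDFLAGS;
			break;
		case SRC_LIBS:
			if (strncmp(opt, "-l", 2) == 0)
				return DEST_LLVM_LIBS;
			break;
	}
	return DEST_IGNORED;
}
```

The effect is that optimization and warning flags that llvm-config reports for LLVM's own build (for example -O2) do not leak into the PostgreSQL build flags.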
0004-WIP-Beginning-of-a-LLVM-JIT-infrastructure.patch (text/x-diff; charset=us-ascii)
From 23e5dce848ed8dac7b590da5c77321344b30310d Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Mon, 13 Mar 2017 20:22:10 -0700
Subject: [PATCH 04/16] WIP: Beginning of a LLVM JIT infrastructure.
This needs to do a lot more, especially around error handling, and
memory management.
---
configure | 2 +-
configure.in | 2 +-
src/backend/executor/execUtils.c | 2 +
src/backend/lib/Makefile | 2 +-
src/backend/lib/llvmjit.c | 519 ++++++++++++++++++++++++++++++++++
src/backend/utils/misc/guc.c | 27 ++
src/backend/utils/resowner/resowner.c | 40 +++
src/include/lib/llvmjit.h | 83 ++++++
src/include/nodes/execnodes.h | 4 +-
src/include/utils/resowner_private.h | 7 +
10 files changed, 684 insertions(+), 4 deletions(-)
create mode 100644 src/backend/lib/llvmjit.c
create mode 100644 src/include/lib/llvmjit.h
diff --git a/configure b/configure
index fe905e294b..b6adceb990 100755
--- a/configure
+++ b/configure
@@ -6546,7 +6546,7 @@ done
-L*) LDFLAGS="$LDFLAGS $pgac_option";;
esac
done
- for pgac_option in `$LLVM_CONFIG --libs --system-libs engine`; do
+ for pgac_option in `$LLVM_CONFIG --libs --system-libs engine debuginfodwarf orcjit passes perfjitevents`; do
case $pgac_option in
-l*) LLVM_LIBS="$LLVM_LIBS $pgac_option";;
esac
diff --git a/configure.in b/configure.in
index a99da9dff3..7028a31137 100644
--- a/configure.in
+++ b/configure.in
@@ -872,7 +872,7 @@ if test "$with_llvm" = yes ; then
-L*) LDFLAGS="$LDFLAGS $pgac_option";;
esac
done
- for pgac_option in `$LLVM_CONFIG --libs --system-libs engine`; do
+ for pgac_option in `$LLVM_CONFIG --libs --system-libs engine debuginfodwarf orcjit passes perfjitevents`; do
case $pgac_option in
-l*) LLVM_LIBS="$LLVM_LIBS $pgac_option";;
esac
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 5928c38f90..aee6111c14 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -156,6 +156,8 @@ CreateExecutorState(void)
estate->es_epqScanDone = NULL;
estate->es_sourceText = NULL;
+ estate->es_jit = NULL;
+
/*
* Return the executor state structure
*/
diff --git a/src/backend/lib/Makefile b/src/backend/lib/Makefile
index d1fefe43f2..dd1390f9cf 100644
--- a/src/backend/lib/Makefile
+++ b/src/backend/lib/Makefile
@@ -13,6 +13,6 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = binaryheap.o bipartite_match.o dshash.o hyperloglog.o ilist.o \
- knapsack.o pairingheap.o rbtree.o stringinfo.o
+ knapsack.o llvmjit.o pairingheap.o rbtree.o stringinfo.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/lib/llvmjit.c b/src/backend/lib/llvmjit.c
new file mode 100644
index 0000000000..460cb6b325
--- /dev/null
+++ b/src/backend/lib/llvmjit.c
@@ -0,0 +1,519 @@
+/*
+ * JIT infrastructure.
+ */
+
+#include "postgres.h"
+
+
+#include "lib/llvmjit.h"
+
+#include "utils/memutils.h"
+#include "utils/resowner_private.h"
+
+#ifdef USE_LLVM
+
+#include <fcntl.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include <llvm-c/Core.h>
+#include <llvm-c/ExecutionEngine.h>
+#include <llvm-c/Target.h>
+#include <llvm-c/Analysis.h>
+#include <llvm-c/BitWriter.h>
+#include <llvm-c/OrcBindings.h>
+#include <llvm-c/Support.h>
+#include <llvm-c/Transforms/IPO.h>
+#include <llvm-c/Transforms/Scalar.h>
+
+
+/* GUCs */
+bool jit_log_ir = false;
+bool jit_dump_bitcode = false;
+
+static bool llvm_initialized = false;
+static LLVMPassManagerBuilderRef llvm_pmb;
+
+/* very common public things */
+const char *llvm_triple = NULL;
+
+LLVMTargetMachineRef llvm_targetmachine;
+
+LLVMTypeRef TypeSizeT;
+LLVMTypeRef TypeMemoryContext;
+LLVMTypeRef TypePGFunction;
+
+LLVMTypeRef StructHeapTupleFieldsField3;
+LLVMTypeRef StructHeapTupleFields;
+LLVMTypeRef StructHeapTupleHeaderData;
+LLVMTypeRef StructHeapTupleDataChoice;
+LLVMTypeRef StructHeapTupleData;
+LLVMTypeRef StructMinimalTupleData;
+LLVMTypeRef StructItemPointerData;
+LLVMTypeRef StructBlockId;
+LLVMTypeRef StructFormPgAttribute;
+LLVMTypeRef StructTupleConstr;
+LLVMTypeRef StructtupleDesc;
+LLVMTypeRef StructTupleTableSlot;
+LLVMTypeRef StructMemoryContextData;
+LLVMTypeRef StructPGFinfoRecord;
+LLVMTypeRef StructFmgrInfo;
+LLVMTypeRef StructFunctionCallInfoData;
+LLVMTypeRef StructExprState;
+LLVMTypeRef StructExprContext;
+
+
+static LLVMTargetRef llvm_targetref;
+static LLVMOrcJITStackRef llvm_orc;
+
+static void llvm_shutdown(void);
+static void llvm_create_types(void);
+
+
+static void
+llvm_shutdown(void)
+{
+ /* unregister profiling support, needs to be flushed to be useful */
+ if (llvm_orc)
+ {
+ LLVMOrcUnregisterPerf(llvm_orc);
+ llvm_orc = NULL;
+ }
+}
+
+void
+llvm_initialize(void)
+{
+ char *error = NULL;
+ MemoryContext oldcontext;
+
+ if (llvm_initialized)
+ return;
+
+ oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+
+ LLVMInitializeNativeTarget();
+ LLVMInitializeNativeAsmPrinter();
+ LLVMInitializeNativeAsmParser();
+
+ /* force symbols in main binary to be loaded */
+ LLVMLoadLibraryPermanently("");
+
+ llvm_triple = LLVMGetDefaultTargetTriple();
+
+ if (LLVMGetTargetFromTriple(llvm_triple, &llvm_targetref, &error) != 0)
+ {
+ elog(FATAL, "failed to query triple %s", error);
+ }
+
+ llvm_targetmachine =
+ LLVMCreateTargetMachine(llvm_targetref, llvm_triple, NULL, NULL,
+ LLVMCodeGenLevelAggressive,
+ LLVMRelocDefault,
+ LLVMCodeModelJITDefault);
+
+ llvm_pmb = LLVMPassManagerBuilderCreate();
+ LLVMPassManagerBuilderSetOptLevel(llvm_pmb, 3);
+
+ llvm_orc = LLVMOrcCreateInstance(llvm_targetmachine);
+
+ LLVMOrcRegisterGDB(llvm_orc);
+ LLVMOrcRegisterPerf(llvm_orc);
+
+ atexit(llvm_shutdown);
+
+ llvm_create_types();
+
+ llvm_initialized = true;
+ MemoryContextSwitchTo(oldcontext);
+}
+
+static void
+llvm_create_types(void)
+{
+ /* so we don't constantly have to decide between 32/64 bit */
+#if SIZEOF_DATUM == 8
+ TypeSizeT = LLVMInt64Type();
+#else
+ TypeSizeT = LLVMInt32Type();
+#endif
+
+ /*
+ * XXX: should rather load these from disk using bitcode? It's ugly to
+ * duplicate the information, but in either case we're going to have to
+ * use member indexes for structs :(.
+ */
+ {
+ LLVMTypeRef members[2];
+ members[0] = LLVMInt16Type(); /* bi_hi */
+ members[1] = LLVMInt16Type(); /* bi_lo */
+ StructBlockId = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.BlockId");
+ LLVMStructSetBody(StructBlockId, members, lengthof(members), false);
+ }
+
+ {
+ LLVMTypeRef members[2];
+ members[0] = StructBlockId; /* ip_blkid */
+ members[1] = LLVMInt16Type(); /* ip_posid */
+
+ StructItemPointerData = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.ItemPointerData");
+ LLVMStructSetBody(StructItemPointerData, members, lengthof(members), false);
+ }
+
+
+ {
+ LLVMTypeRef members[1];
+
+ members[0] = LLVMInt32Type() ; /* cid | xvac */
+
+ StructHeapTupleFieldsField3 = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.StructHeapTupleFieldsField3");
+ LLVMStructSetBody(StructHeapTupleFieldsField3, members, lengthof(members), false);
+ }
+
+ {
+ LLVMTypeRef members[1];
+
+ members[0] = LLVMInt32Type() ; /* ? */
+
+ StructPGFinfoRecord = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.PGFinfoRecord");
+ LLVMStructSetBody(StructPGFinfoRecord, members, lengthof(members), false);
+ }
+
+ {
+ LLVMTypeRef members[3];
+ members[0] = LLVMInt32Type(); /* xmin */
+ members[1] = LLVMInt32Type(); /* xmax */
+ members[2] = StructHeapTupleFieldsField3; /* cid | xvac */
+
+ StructHeapTupleFields = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.HeapTupleFields");
+ LLVMStructSetBody(StructHeapTupleFields, members, lengthof(members), false);
+ }
+
+ {
+ LLVMTypeRef members[1];
+
+ members[0] = StructHeapTupleFields; /* t_heap | t_datum */
+
+ StructHeapTupleDataChoice = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.HeapTupleHeaderDataChoice");
+ LLVMStructSetBody(StructHeapTupleDataChoice, members, lengthof(members), false);
+
+ }
+
+ {
+ LLVMTypeRef members[6];
+
+ members[0] = StructHeapTupleDataChoice; /* t_heap | t_datum */
+ members[1] = StructItemPointerData; /* t_ctid */
+ members[2] = LLVMInt16Type(); /* t_infomask2 */
+ members[3] = LLVMInt16Type(); /* t_infomask */
+ members[4] = LLVMInt8Type(); /* t_hoff */
+ members[5] = LLVMArrayType(LLVMInt8Type(), 0); /* t_bits */
+ /* t_bits and other data follow */
+
+ StructHeapTupleHeaderData = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.HeapTupleHeaderData");
+ LLVMStructSetBody(StructHeapTupleHeaderData, members, lengthof(members), false);
+ }
+
+ {
+ LLVMTypeRef members[4];
+ members[0] = LLVMInt32Type(); /* t_len */
+ members[1] = StructItemPointerData; /* t_self */
+ members[2] = LLVMInt32Type(); /* t_tableOid */
+ members[3] = LLVMPointerType(StructHeapTupleHeaderData, 0); /* t_data */
+
+ StructHeapTupleData = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.HeapTupleData");
+ LLVMStructSetBody(StructHeapTupleData, members, lengthof(members), false);
+ }
+
+ {
+ StructMinimalTupleData = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.MinimalTupleData");
+ }
+
+
+ {
+ StructFormPgAttribute = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.Form_pg_attribute");
+ }
+
+ {
+ StructTupleConstr = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.TupleConstr");
+ }
+
+ {
+ LLVMTypeRef members[7];
+
+ members[0] = LLVMInt32Type(); /* natts */
+ members[1] = LLVMInt32Type(); /* tdtypeid */
+ members[2] = LLVMInt32Type(); /* tdtypemod */
+ members[3] = LLVMInt8Type(); /* tdhasoid */
+ members[4] = LLVMInt32Type(); /* tdrefcount */
+ members[5] = LLVMPointerType(StructTupleConstr, 0); /* constr */
+ members[6] = LLVMArrayType(LLVMPointerType(StructFormPgAttribute, 0), 0); /* attrs */
+
+ StructtupleDesc = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.tupleDesc");
+ LLVMStructSetBody(StructtupleDesc, members, lengthof(members), false);
+ }
+
+ {
+ StructMemoryContextData = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.MemoryContext");
+ }
+
+ {
+ TypeMemoryContext = LLVMPointerType(StructMemoryContextData, 0);
+ }
+
+ {
+ LLVMTypeRef members[15];
+
+ members[ 0] = LLVMInt32Type(); /* type */
+ members[ 1] = LLVMInt8Type(); /* isempty */
+ members[ 2] = LLVMInt8Type(); /* shouldFree */
+ members[ 3] = LLVMInt8Type(); /* shouldFreeMin */
+ members[ 4] = LLVMInt8Type(); /* slow */
+ members[ 5] = LLVMPointerType(StructHeapTupleData, 0); /* tuple */
+ members[ 6] = LLVMPointerType(StructtupleDesc, 0); /* tupleDescriptor */
+ members[ 7] = TypeMemoryContext; /* mcxt */
+ members[ 8] = LLVMInt32Type(); /* buffer */
+ members[ 9] = LLVMInt32Type(); /* nvalid */
+ members[10] = LLVMPointerType(TypeSizeT, 0); /* values */
+ members[11] = LLVMPointerType(LLVMInt8Type(), 0); /* nulls */
+ members[12] = LLVMPointerType(StructMinimalTupleData, 0); /* mintuple */
+ members[13] = StructHeapTupleData; /* minhdr */
+ members[14] = LLVMInt64Type(); /* off: FIXME, deterministic type, not long */
+
+ StructTupleTableSlot = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.TupleTableSlot");
+ LLVMStructSetBody(StructTupleTableSlot, members, lengthof(members), false);
+ }
+
+ {
+ StructFmgrInfo = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.FmgrInfo");
+ }
+
+ {
+ LLVMTypeRef members[8];
+
+ members[0] = LLVMPointerType(StructFmgrInfo, 0); /* flinfo */
+ members[1] = LLVMPointerType(StructPGFinfoRecord, 0); /* context */
+ members[2] = LLVMPointerType(StructPGFinfoRecord, 0); /* resultinfo */
+ members[3] = LLVMInt32Type(); /* fncollation */
+ members[4] = LLVMInt8Type(); /* isnull */
+ members[5] = LLVMInt16Type(); /* nargs */
+ members[6] = LLVMArrayType(TypeSizeT, FUNC_MAX_ARGS);
+ members[7] = LLVMArrayType(LLVMInt8Type(), FUNC_MAX_ARGS);
+
+ StructFunctionCallInfoData = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.FunctionCallInfoData");
+ LLVMStructSetBody(StructFunctionCallInfoData, members, lengthof(members), false);
+ }
+
+ {
+ LLVMTypeRef members[14];
+
+ members[ 0] = LLVMInt32Type(); /* tag */
+ members[ 1] = LLVMInt8Type(); /* flags */
+ members[ 2] = LLVMInt8Type(); /* resnull */
+ members[ 3] = TypeSizeT; /* resvalue */
+ members[ 4] = LLVMPointerType(StructTupleTableSlot, 0); /* resultslot */
+ members[ 5] = LLVMPointerType(TypeSizeT, 0); /* steps */
+ members[ 6] = LLVMPointerType(TypeSizeT, 0); /* evalfunc */
+ members[ 7] = LLVMPointerType(TypeSizeT, 0); /* expr */
+ members[ 8] = TypeSizeT; /* steps_len */
+ members[ 9] = TypeSizeT; /* steps_alloc */
+ members[10] = LLVMPointerType(TypeSizeT, 0); /* innermost caseval */
+ members[11] = LLVMPointerType(LLVMInt8Type(), 0); /* innermost casenull */
+ members[12] = LLVMPointerType(TypeSizeT, 0); /* innermost domainval */
+ members[13] = LLVMPointerType(LLVMInt8Type(), 0); /* innermost domainnull */
+
+ StructExprState = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.ExprState");
+ LLVMStructSetBody(StructExprState, members, lengthof(members), false);
+ }
+
+ {
+ LLVMTypeRef members[16];
+
+ members[ 0] = LLVMInt32Type(); /* tag */
+ members[ 1] = LLVMPointerType(StructTupleTableSlot, 0); /* scantuple */
+ members[ 2] = LLVMPointerType(StructTupleTableSlot, 0); /* innertuple */
+ members[ 3] = LLVMPointerType(StructTupleTableSlot, 0); /* outertuple */
+
+ members[ 4] = LLVMPointerType(TypeSizeT, 0); /* per_query_memory */
+ members[ 5] = LLVMPointerType(TypeSizeT, 0); /* per_tuple_memory */
+
+ members[ 6] = LLVMPointerType(TypeSizeT, 0); /* param_exec */
+ members[ 7] = LLVMPointerType(TypeSizeT, 0); /* param_list_info */
+
+ members[ 8] = LLVMPointerType(TypeSizeT, 0); /* aggvalues */
+ members[ 9] = LLVMPointerType(LLVMInt8Type(), 0); /* aggnulls */
+
+ members[10] = TypeSizeT; /* casevalue */
+ members[11] = LLVMInt8Type(); /* casenull */
+
+ members[12] = TypeSizeT; /* domainvalue */
+ members[13] = LLVMInt8Type(); /* domainnull */
+
+ members[14] = LLVMPointerType(TypeSizeT, 0); /* estate */
+ members[15] = LLVMPointerType(TypeSizeT, 0); /* callbacks */
+
+ StructExprContext = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.ExprContext");
+ LLVMStructSetBody(StructExprContext, members, lengthof(members), false);
+ }
+
+ {
+ LLVMTypeRef params[1];
+ params[0] = LLVMPointerType(StructFunctionCallInfoData, 0);
+ TypePGFunction = LLVMFunctionType(TypeSizeT, params, lengthof(params), 0);
+ }
+}
+
+static uint64_t
+llvm_resolve_symbol(const char *name, void *ctx)
+{
+ return (uint64_t) LLVMSearchForAddressOfSymbol(name);
+}
+
+void *
+llvm_get_function(LLVMJitContext *context, const char *funcname)
+{
+ /*
+ * If there is a pending, not emitted, module, compile and emit
+ * now. Otherwise we might not find the [correct] function.
+ */
+ if (!context->compiled)
+ {
+ int handle;
+ LLVMSharedModuleRef smod = LLVMOrcMakeSharedModule(context->module);
+ MemoryContext oldcontext;
+
+ if (jit_log_ir)
+ {
+ LLVMDumpModule(context->module);
+ }
+
+ if (jit_dump_bitcode)
+ {
+ /* FIXME: invent module rather than function specific name */
+ char *filename = psprintf("%s.bc", funcname);
+ LLVMWriteBitcodeToFile(context->module, filename);
+ pfree(filename);
+ }
+
+
+ /* perform optimization */
+ {
+ LLVMValueRef func;
+ LLVMPassManagerRef llvm_fpm;
+ LLVMPassManagerRef llvm_mpm;
+
+ llvm_fpm = LLVMCreateFunctionPassManagerForModule(context->module);
+ llvm_mpm = LLVMCreatePassManager();
+
+ LLVMPassManagerBuilderPopulateFunctionPassManager(llvm_pmb, llvm_fpm);
+ LLVMPassManagerBuilderPopulateModulePassManager(llvm_pmb, llvm_mpm);
+ LLVMPassManagerBuilderPopulateLTOPassManager(llvm_pmb, llvm_mpm, true, true);
+
+ LLVMAddAnalysisPasses(llvm_targetmachine, llvm_mpm);
+ LLVMAddAnalysisPasses(llvm_targetmachine, llvm_fpm);
+
+ LLVMAddDeadStoreEliminationPass(llvm_fpm);
+
+ /* do function level optimization */
+ LLVMInitializeFunctionPassManager(llvm_fpm);
+ for (func = LLVMGetFirstFunction(context->module);
+ func != NULL;
+ func = LLVMGetNextFunction(func))
+ LLVMRunFunctionPassManager(llvm_fpm, func);
+ LLVMFinalizeFunctionPassManager(llvm_fpm);
+
+ /* do module level optimization */
+ LLVMRunPassManager(llvm_mpm, context->module);
+
+ LLVMDisposePassManager(llvm_fpm);
+ LLVMDisposePassManager(llvm_mpm);
+ }
+
+ /* and emit the code */
+ {
+ handle =
+ LLVMOrcAddEagerlyCompiledIR(llvm_orc, smod,
+ llvm_resolve_symbol, NULL);
+
+ oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+ context->handles = lappend_int(context->handles, handle);
+ MemoryContextSwitchTo(oldcontext);
+
+ LLVMOrcDisposeSharedModuleRef(smod);
+
+ ResourceOwnerEnlargeJIT(CurrentResourceOwner);
+ ResourceOwnerRememberJIT(CurrentResourceOwner, PointerGetDatum(context));
+ }
+
+ context->module = NULL;
+ context->compiled = true;
+ }
+
+ /* search all emitted modules for function we're asked for */
+ {
+ void *addr;
+ char *mangled;
+ ListCell *lc;
+
+ LLVMOrcGetMangledSymbol(llvm_orc, &mangled, funcname);
+ foreach(lc, context->handles)
+ {
+ int handle = lfirst_int(lc);
+
+ addr = (void *) LLVMOrcGetSymbolAddressIn(llvm_orc, handle, mangled);
+ if (addr)
+ return addr;
+ }
+ }
+
+ elog(ERROR, "failed to JIT: %s", funcname);
+
+ return NULL;
+}
+
+void
+llvm_release_handle(ResourceOwner resowner, Datum handle)
+{
+ LLVMJitContext *context = (LLVMJitContext *) DatumGetPointer(handle);
+ ListCell *lc;
+
+ foreach(lc, context->handles)
+ {
+ int modhandle = lfirst_int(lc); /* don't shadow the Datum parameter */
+
+ LLVMOrcRemoveModule(llvm_orc, modhandle);
+ }
+ list_free(context->handles);
+ context->handles = NIL;
+
+ ResourceOwnerForgetJIT(resowner, handle);
+}
+
+#else /* USE_LLVM */
+
+void
+llvm_release_handle(ResourceOwner resowner, Datum handle)
+{
+}
+
+#endif
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 246fea8693..2edc0b33c5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -42,6 +42,7 @@
#include "commands/variable.h"
#include "commands/trigger.h"
#include "funcapi.h"
+#include "lib/llvmjit.h"
#include "libpq/auth.h"
#include "libpq/be-fsstubs.h"
#include "libpq/libpq.h"
@@ -995,6 +996,32 @@ static struct config_bool ConfigureNamesBool[] =
false,
NULL, NULL, NULL
},
+
+#ifdef USE_LLVM
+ {
+ {"jit_log_ir", PGC_USERSET, DEVELOPER_OPTIONS,
+ gettext_noop("just-in-time debugging: print IR to stderr"),
+ NULL,
+ GUC_NOT_IN_SAMPLE
+ },
+ &jit_log_ir,
+ false,
+ NULL, NULL, NULL
+ },
+
+ {
+ {"jit_dump_bitcode", PGC_USERSET, DEVELOPER_OPTIONS,
+ gettext_noop("just-in-time debugging: write out bitcode"),
+ NULL,
+ GUC_NOT_IN_SAMPLE
+ },
+ &jit_dump_bitcode,
+ false,
+ NULL, NULL, NULL
+ },
+
+#endif
+
{
{"zero_damaged_pages", PGC_SUSET, DEVELOPER_OPTIONS,
gettext_noop("Continues processing past damaged page headers."),
diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index 4a4a287148..3c89db5003 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -27,6 +27,7 @@
#include "utils/rel.h"
#include "utils/resowner_private.h"
#include "utils/snapmgr.h"
+#include "lib/llvmjit.h"
/*
@@ -124,6 +125,7 @@ typedef struct ResourceOwnerData
ResourceArray snapshotarr; /* snapshot references */
ResourceArray filearr; /* open temporary files */
ResourceArray dsmarr; /* dynamic shmem segments */
+ ResourceArray jitarr; /* JIT handles */
/* We can remember up to MAX_RESOWNER_LOCKS references to local locks. */
int nlocks; /* number of owned locks */
@@ -437,6 +439,7 @@ ResourceOwnerCreate(ResourceOwner parent, const char *name)
ResourceArrayInit(&(owner->snapshotarr), PointerGetDatum(NULL));
ResourceArrayInit(&(owner->filearr), FileGetDatum(-1));
ResourceArrayInit(&(owner->dsmarr), PointerGetDatum(NULL));
+ ResourceArrayInit(&(owner->jitarr), PointerGetDatum(NULL));
return owner;
}
@@ -552,6 +555,21 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
PrintDSMLeakWarning(res);
dsm_detach(res);
}
+
+ /* Ditto for jited functions */
+ while (ResourceArrayGetAny(&(owner->jitarr), &foundres))
+ {
+ if (isTopLevel)
+ llvm_release_handle(owner, foundres);
+ else
+ {
+ ResourceOwnerForgetJIT(owner, foundres);
+ ResourceOwnerEnlargeJIT(owner->parent);
+ ResourceOwnerRememberJIT(owner->parent, foundres);
+
+ }
+ }
+
}
else if (phase == RESOURCE_RELEASE_LOCKS)
{
@@ -699,6 +717,7 @@ ResourceOwnerDelete(ResourceOwner owner)
Assert(owner->snapshotarr.nitems == 0);
Assert(owner->filearr.nitems == 0);
Assert(owner->dsmarr.nitems == 0);
+ Assert(owner->jitarr.nitems == 0);
Assert(owner->nlocks == 0 || owner->nlocks == MAX_RESOWNER_LOCKS + 1);
/*
@@ -725,6 +744,7 @@ ResourceOwnerDelete(ResourceOwner owner)
ResourceArrayFree(&(owner->snapshotarr));
ResourceArrayFree(&(owner->filearr));
ResourceArrayFree(&(owner->dsmarr));
+ ResourceArrayFree(&(owner->jitarr));
pfree(owner);
}
@@ -1267,3 +1287,23 @@ PrintDSMLeakWarning(dsm_segment *seg)
elog(WARNING, "dynamic shared memory leak: segment %u still referenced",
dsm_segment_handle(seg));
}
+
+void
+ResourceOwnerEnlargeJIT(ResourceOwner owner)
+{
+ ResourceArrayEnlarge(&(owner->jitarr));
+}
+
+void
+ResourceOwnerRememberJIT(ResourceOwner owner, Datum handle)
+{
+ ResourceArrayAdd(&(owner->jitarr), handle);
+}
+
+void
+ResourceOwnerForgetJIT(ResourceOwner owner, Datum handle)
+{
+ if (!ResourceArrayRemove(&(owner->jitarr), handle))
+ elog(ERROR, "jit context %p is not owned by resource owner %s",
+ DatumGetPointer(handle), owner->name);
+}
diff --git a/src/include/lib/llvmjit.h b/src/include/lib/llvmjit.h
new file mode 100644
index 0000000000..82b0b91c93
--- /dev/null
+++ b/src/include/lib/llvmjit.h
@@ -0,0 +1,82 @@
+#ifndef LLVMJIT_H
+#define LLVMJIT_H
+
+#include "utils/resowner.h"
+
+#ifdef USE_LLVM
+
+/* symbol conflict :( */
+#undef PM
+
+#include "nodes/pg_list.h"
+
+#include <llvm-c/Core.h>
+#include <llvm-c/ExecutionEngine.h>
+#include <llvm-c/Target.h>
+#include <llvm-c/Analysis.h>
+#include <llvm-c/BitWriter.h>
+#include <llvm-c/IRReader.h>
+#include <llvm-c/BitReader.h>
+#include <llvm-c/Linker.h>
+#include <llvm-c/OrcBindings.h>
+#include <llvm-c/Transforms/PassManagerBuilder.h>
+
+typedef struct LLVMJitContext
+{
+ int counter;
+ LLVMModuleRef module;
+ bool compiled;
+ List *handles;
+} LLVMJitContext;
+
+extern bool jit_log_ir;
+extern bool jit_dump_bitcode;
+
+extern LLVMTargetMachineRef llvm_targetmachine;
+extern const char *llvm_triple;
+
+extern LLVMTypeRef TypeSizeT;
+extern LLVMTypeRef TypePGFunction;
+extern LLVMTypeRef TypeMemoryContext;
+
+extern LLVMTypeRef StructFormPgAttribute;
+extern LLVMTypeRef StructTupleConstr;
+extern LLVMTypeRef StructtupleDesc;
+extern LLVMTypeRef StructHeapTupleFields;
+extern LLVMTypeRef StructHeapTupleFieldsField3;
+extern LLVMTypeRef StructHeapTupleHeaderData;
+extern LLVMTypeRef StructHeapTupleDataChoice;
+extern LLVMTypeRef StructHeapTupleData;
+extern LLVMTypeRef StructMinimalTupleData;
+extern LLVMTypeRef StructItemPointerData;
+extern LLVMTypeRef StructBlockId;
+extern LLVMTypeRef StructTupleTableSlot;
+extern LLVMTypeRef StructMemoryContextData;
+extern LLVMTypeRef StructPGFinfoRecord;
+extern LLVMTypeRef StructFmgrInfo;
+extern LLVMTypeRef StructFunctionCallInfoData;
+extern LLVMTypeRef StructExprState;
+extern LLVMTypeRef StructExprContext;
+
+extern void llvm_initialize(void);
+extern void llvm_dispose_module(LLVMModuleRef mod, const char *funcname);
+
+extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
+
+extern void llvm_perf_support(LLVMExecutionEngineRef EE);
+extern void llvm_shutdown_perf_support(LLVMExecutionEngineRef EE);
+
+extern void llvm_perf_orc_support(LLVMOrcJITStackRef llvm_orc);
+extern void llvm_shutdown_orc_perf_support(LLVMOrcJITStackRef llvm_orc);
+
+#else
+
+typedef struct LLVMJitContext
+{
+} LLVMJitContext;
+
+#endif /* USE_LLVM */
+
+extern void llvm_release_handle(ResourceOwner resowner, Datum handle);
+
+#endif /* LLVMJIT_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 90a60abc4d..0dc9fa8d79 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -508,7 +508,9 @@ typedef struct EState
bool *es_epqScanDone; /* true if EPQ tuple has been fetched */
/* The per-query shared memory area to use for parallel execution. */
- struct dsa_area *es_query_dsa;
+ struct dsa_area *es_query_dsa;
+
+ struct LLVMJitContext *es_jit;
} EState;
diff --git a/src/include/utils/resowner_private.h b/src/include/utils/resowner_private.h
index 2420b651b3..1921e4e666 100644
--- a/src/include/utils/resowner_private.h
+++ b/src/include/utils/resowner_private.h
@@ -88,4 +88,11 @@ extern void ResourceOwnerRememberDSM(ResourceOwner owner,
extern void ResourceOwnerForgetDSM(ResourceOwner owner,
dsm_segment *);
+/* support for JITed functions */
+extern void ResourceOwnerEnlargeJIT(ResourceOwner owner);
+extern void ResourceOwnerRememberJIT(ResourceOwner owner,
+ Datum handle);
+extern void ResourceOwnerForgetJIT(ResourceOwner owner,
+ Datum handle);
+
#endif /* RESOWNER_PRIVATE_H */
--
2.14.1.2.g4274c698f4.dirty
Attachment: 0005-Perform-slot-validity-checks-in-a-separate-pass-over.patch (text/x-diff; charset=us-ascii)
From e4a5f0a418949b9d4700399ba0a85577cd85cb7f Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 31 Aug 2017 13:22:41 -0700
Subject: [PATCH 05/16] Perform slot validity checks in a separate pass over
expression.
This is better for JITing and allows getting rid of some code
duplication.
---
src/backend/executor/execExpr.c | 8 +-
src/backend/executor/execExprInterp.c | 192 ++++++++++++++--------------------
src/include/executor/execExpr.h | 9 +-
src/include/nodes/execnodes.h | 2 +
4 files changed, 86 insertions(+), 125 deletions(-)
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index be9d23bc32..81549ee915 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -637,20 +637,20 @@ ExecInitExprRec(Expr *node, PlanState *parent, ExprState *state,
/* regular user column */
scratch.d.var.attnum = variable->varattno - 1;
scratch.d.var.vartype = variable->vartype;
- /* select EEOP_*_FIRST opcode to force one-time checks */
+
switch (variable->varno)
{
case INNER_VAR:
- scratch.opcode = EEOP_INNER_VAR_FIRST;
+ scratch.opcode = EEOP_INNER_VAR;
break;
case OUTER_VAR:
- scratch.opcode = EEOP_OUTER_VAR_FIRST;
+ scratch.opcode = EEOP_OUTER_VAR;
break;
/* INDEX_VAR is handled by default case */
default:
- scratch.opcode = EEOP_SCAN_VAR_FIRST;
+ scratch.opcode = EEOP_SCAN_VAR;
break;
}
}
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 83e04471e4..50e3f8b176 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -131,7 +131,6 @@ static Datum ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnul
static void ExecInitInterpreter(void);
/* support functions */
-static void CheckVarSlotCompatibility(TupleTableSlot *slot, int attnum, Oid vartype);
static TupleDesc get_cached_rowtype(Oid type_id, int32 typmod,
TupleDesc *cache_field, ExprContext *econtext);
static void ShutdownTupleDescRef(Datum arg);
@@ -139,11 +138,8 @@ static void ExecEvalRowNullInt(ExprState *state, ExprEvalStep *op,
ExprContext *econtext, bool checkisnull);
/* fast-path evaluation functions */
-static Datum ExecJustInnerVarFirst(ExprState *state, ExprContext *econtext, bool *isnull);
static Datum ExecJustInnerVar(ExprState *state, ExprContext *econtext, bool *isnull);
-static Datum ExecJustOuterVarFirst(ExprState *state, ExprContext *econtext, bool *isnull);
static Datum ExecJustOuterVar(ExprState *state, ExprContext *econtext, bool *isnull);
-static Datum ExecJustScanVarFirst(ExprState *state, ExprContext *econtext, bool *isnull);
static Datum ExecJustScanVar(ExprState *state, ExprContext *econtext, bool *isnull);
static Datum ExecJustConst(ExprState *state, ExprContext *econtext, bool *isnull);
static Datum ExecJustAssignInnerVar(ExprState *state, ExprContext *econtext, bool *isnull);
@@ -172,6 +168,8 @@ ExecReadyInterpretedExpr(ExprState *state)
if (state->flags & EEO_FLAG_INTERPRETER_INITIALIZED)
return;
+ state->evalfunc = ExecInterpExprStillValid;
+
/* DIRECT_THREADED should not already be set */
Assert((state->flags & EEO_FLAG_DIRECT_THREADED) == 0);
@@ -195,46 +193,46 @@ ExecReadyInterpretedExpr(ExprState *state)
ExprEvalOp step1 = state->steps[1].opcode;
if (step0 == EEOP_INNER_FETCHSOME &&
- step1 == EEOP_INNER_VAR_FIRST)
+ step1 == EEOP_INNER_VAR)
{
- state->evalfunc = ExecJustInnerVarFirst;
+ state->evalfunc_private = ExecJustInnerVar;
return;
}
else if (step0 == EEOP_OUTER_FETCHSOME &&
- step1 == EEOP_OUTER_VAR_FIRST)
+ step1 == EEOP_OUTER_VAR)
{
- state->evalfunc = ExecJustOuterVarFirst;
+ state->evalfunc_private = ExecJustOuterVar;
return;
}
else if (step0 == EEOP_SCAN_FETCHSOME &&
- step1 == EEOP_SCAN_VAR_FIRST)
+ step1 == EEOP_SCAN_VAR)
{
- state->evalfunc = ExecJustScanVarFirst;
+ state->evalfunc_private = ExecJustScanVar;
return;
}
else if (step0 == EEOP_INNER_FETCHSOME &&
step1 == EEOP_ASSIGN_INNER_VAR)
{
- state->evalfunc = ExecJustAssignInnerVar;
+ state->evalfunc_private = ExecJustAssignInnerVar;
return;
}
else if (step0 == EEOP_OUTER_FETCHSOME &&
step1 == EEOP_ASSIGN_OUTER_VAR)
{
- state->evalfunc = ExecJustAssignOuterVar;
+ state->evalfunc_private = ExecJustAssignOuterVar;
return;
}
else if (step0 == EEOP_SCAN_FETCHSOME &&
step1 == EEOP_ASSIGN_SCAN_VAR)
{
- state->evalfunc = ExecJustAssignScanVar;
+ state->evalfunc_private = ExecJustAssignScanVar;
return;
}
}
else if (state->steps_len == 2 &&
state->steps[0].opcode == EEOP_CONST)
{
- state->evalfunc = ExecJustConst;
+ state->evalfunc_private = ExecJustConst;
return;
}
@@ -258,7 +256,7 @@ ExecReadyInterpretedExpr(ExprState *state)
}
#endif /* EEO_USE_COMPUTED_GOTO */
- state->evalfunc = ExecInterpExpr;
+ state->evalfunc_private = ExecInterpExpr;
}
@@ -289,11 +287,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
&&CASE_EEOP_INNER_FETCHSOME,
&&CASE_EEOP_OUTER_FETCHSOME,
&&CASE_EEOP_SCAN_FETCHSOME,
- &&CASE_EEOP_INNER_VAR_FIRST,
&&CASE_EEOP_INNER_VAR,
- &&CASE_EEOP_OUTER_VAR_FIRST,
&&CASE_EEOP_OUTER_VAR,
- &&CASE_EEOP_SCAN_VAR_FIRST,
&&CASE_EEOP_SCAN_VAR,
&&CASE_EEOP_INNER_SYSVAR,
&&CASE_EEOP_OUTER_SYSVAR,
@@ -415,22 +410,6 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
- EEO_CASE(EEOP_INNER_VAR_FIRST)
- {
- int attnum = op->d.var.attnum;
-
- /*
- * First time through, check whether attribute matches Var. Might
- * not be ok anymore, due to schema changes.
- */
- CheckVarSlotCompatibility(innerslot, attnum + 1, op->d.var.vartype);
-
- /* Skip that check on subsequent evaluations */
- op->opcode = EEO_OPCODE(EEOP_INNER_VAR);
-
- /* FALL THROUGH to EEOP_INNER_VAR */
- }
-
EEO_CASE(EEOP_INNER_VAR)
{
int attnum = op->d.var.attnum;
@@ -448,18 +427,6 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
- EEO_CASE(EEOP_OUTER_VAR_FIRST)
- {
- int attnum = op->d.var.attnum;
-
- /* See EEOP_INNER_VAR_FIRST comments */
-
- CheckVarSlotCompatibility(outerslot, attnum + 1, op->d.var.vartype);
- op->opcode = EEO_OPCODE(EEOP_OUTER_VAR);
-
- /* FALL THROUGH to EEOP_OUTER_VAR */
- }
-
EEO_CASE(EEOP_OUTER_VAR)
{
int attnum = op->d.var.attnum;
@@ -473,18 +440,6 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
- EEO_CASE(EEOP_SCAN_VAR_FIRST)
- {
- int attnum = op->d.var.attnum;
-
- /* See EEOP_INNER_VAR_FIRST comments */
-
- CheckVarSlotCompatibility(scanslot, attnum + 1, op->d.var.vartype);
- op->opcode = EEO_OPCODE(EEOP_SCAN_VAR);
-
- /* FALL THROUGH to EEOP_SCAN_VAR */
- }
-
EEO_CASE(EEOP_SCAN_VAR)
{
int attnum = op->d.var.attnum;
@@ -1524,7 +1479,7 @@ out:
* expression. This should succeed unless there have been schema changes
* since the expression tree has been created.
*/
-static void
+void
CheckVarSlotCompatibility(TupleTableSlot *slot, int attnum, Oid vartype)
{
/*
@@ -1572,6 +1527,61 @@ CheckVarSlotCompatibility(TupleTableSlot *slot, int attnum, Oid vartype)
}
}
+Datum
+ExecInterpExprStillValid(ExprState *state, ExprContext *econtext, bool *isNull)
+{
+ CheckExprStillValid(state, econtext, isNull);
+
+ state->evalfunc = state->evalfunc_private;
+
+ return state->evalfunc(state, econtext, isNull);
+}
+
+void
+CheckExprStillValid(ExprState *state, ExprContext *econtext, bool *isNull)
+{
+ int i = 0;
+ TupleTableSlot *innerslot;
+ TupleTableSlot *outerslot;
+ TupleTableSlot *scanslot;
+
+ innerslot = econtext->ecxt_innertuple;
+ outerslot = econtext->ecxt_outertuple;
+ scanslot = econtext->ecxt_scantuple;
+
+ for (i = 0; i < state->steps_len; i++)
+ {
+ ExprEvalStep *op = &state->steps[i];
+
+ switch (ExecEvalStepOp(state, op))
+ {
+ case EEOP_INNER_VAR:
+ {
+ int attnum = op->d.var.attnum;
+ CheckVarSlotCompatibility(innerslot, attnum + 1, op->d.var.vartype);
+ break;
+ }
+
+ case EEOP_OUTER_VAR:
+ {
+ int attnum = op->d.var.attnum;
+ CheckVarSlotCompatibility(outerslot, attnum + 1, op->d.var.vartype);
+ break;
+ }
+
+ case EEOP_SCAN_VAR:
+ {
+ int attnum = op->d.var.attnum;
+ CheckVarSlotCompatibility(scanslot, attnum + 1, op->d.var.vartype);
+ break;
+ }
+ default:
+ break;
+ }
+ }
+}
+
+
/*
* get_cached_rowtype: utility function to lookup a rowtype tupdesc
*
@@ -1631,28 +1641,6 @@ ShutdownTupleDescRef(Datum arg)
* Fast-path functions, for very simple expressions
*/
-/* Simple reference to inner Var, first time through */
-static Datum
-ExecJustInnerVarFirst(ExprState *state, ExprContext *econtext, bool *isnull)
-{
- ExprEvalStep *op = &state->steps[1];
- int attnum = op->d.var.attnum + 1;
- TupleTableSlot *slot = econtext->ecxt_innertuple;
-
- /* See ExecInterpExpr()'s comments for EEOP_INNER_VAR_FIRST */
-
- CheckVarSlotCompatibility(slot, attnum, op->d.var.vartype);
- op->opcode = EEOP_INNER_VAR; /* just for cleanliness */
- state->evalfunc = ExecJustInnerVar;
-
- /*
- * Since we use slot_getattr(), we don't need to implement the FETCHSOME
- * step explicitly, and we also needn't Assert that the attnum is in range
- * --- slot_getattr() will take care of any problems.
- */
- return slot_getattr(slot, attnum, isnull);
-}
-
/* Simple reference to inner Var */
static Datum
ExecJustInnerVar(ExprState *state, ExprContext *econtext, bool *isnull)
@@ -1661,23 +1649,11 @@ ExecJustInnerVar(ExprState *state, ExprContext *econtext, bool *isnull)
int attnum = op->d.var.attnum + 1;
TupleTableSlot *slot = econtext->ecxt_innertuple;
- /* See comments in ExecJustInnerVarFirst */
- return slot_getattr(slot, attnum, isnull);
-}
-
-/* Simple reference to outer Var, first time through */
-static Datum
-ExecJustOuterVarFirst(ExprState *state, ExprContext *econtext, bool *isnull)
-{
- ExprEvalStep *op = &state->steps[1];
- int attnum = op->d.var.attnum + 1;
- TupleTableSlot *slot = econtext->ecxt_outertuple;
-
- CheckVarSlotCompatibility(slot, attnum, op->d.var.vartype);
- op->opcode = EEOP_OUTER_VAR; /* just for cleanliness */
- state->evalfunc = ExecJustOuterVar;
-
- /* See comments in ExecJustInnerVarFirst */
+ /*
+ * Since we use slot_getattr(), we don't need to implement the FETCHSOME
+ * step explicitly, and we also needn't Assert that the attnum is in range
+ * --- slot_getattr() will take care of any problems.
+ */
return slot_getattr(slot, attnum, isnull);
}
@@ -1689,23 +1665,7 @@ ExecJustOuterVar(ExprState *state, ExprContext *econtext, bool *isnull)
int attnum = op->d.var.attnum + 1;
TupleTableSlot *slot = econtext->ecxt_outertuple;
- /* See comments in ExecJustInnerVarFirst */
- return slot_getattr(slot, attnum, isnull);
-}
-
-/* Simple reference to scan Var, first time through */
-static Datum
-ExecJustScanVarFirst(ExprState *state, ExprContext *econtext, bool *isnull)
-{
- ExprEvalStep *op = &state->steps[1];
- int attnum = op->d.var.attnum + 1;
- TupleTableSlot *slot = econtext->ecxt_scantuple;
-
- CheckVarSlotCompatibility(slot, attnum, op->d.var.vartype);
- op->opcode = EEOP_SCAN_VAR; /* just for cleanliness */
- state->evalfunc = ExecJustScanVar;
-
- /* See comments in ExecJustInnerVarFirst */
+ /* See comments in ExecJustInnerVar */
return slot_getattr(slot, attnum, isnull);
}
@@ -1717,7 +1677,7 @@ ExecJustScanVar(ExprState *state, ExprContext *econtext, bool *isnull)
int attnum = op->d.var.attnum + 1;
TupleTableSlot *slot = econtext->ecxt_scantuple;
- /* See comments in ExecJustInnerVarFirst */
+ /* See comments in ExecJustInnerVar */
return slot_getattr(slot, attnum, isnull);
}
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 8ee0496e01..0fbc112890 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -45,12 +45,8 @@ typedef enum ExprEvalOp
EEOP_SCAN_FETCHSOME,
/* compute non-system Var value */
- /* "FIRST" variants are used only the first time through */
- EEOP_INNER_VAR_FIRST,
EEOP_INNER_VAR,
- EEOP_OUTER_VAR_FIRST,
EEOP_OUTER_VAR,
- EEOP_SCAN_VAR_FIRST,
EEOP_SCAN_VAR,
/* compute system Var value */
@@ -62,7 +58,6 @@ typedef enum ExprEvalOp
EEOP_WHOLEROW,
/* compute non-system Var value, assign it into ExprState's resultslot */
- /* (these are not used if _FIRST checks would be needed) */
EEOP_ASSIGN_INNER_VAR,
EEOP_ASSIGN_OUTER_VAR,
EEOP_ASSIGN_SCAN_VAR,
@@ -604,6 +599,10 @@ extern void ExecReadyInterpretedExpr(ExprState *state);
extern ExprEvalOp ExecEvalStepOp(ExprState *state, ExprEvalStep *op);
+extern void CheckVarSlotCompatibility(TupleTableSlot *slot, int attnum, Oid vartype);
+extern Datum ExecInterpExprStillValid(ExprState *state, ExprContext *econtext, bool *isNull);
+extern void CheckExprStillValid(ExprState *state, ExprContext *econtext, bool *isNull);
+
/*
* Non fast-path execution functions. These are externs instead of statics in
* execExprInterp.c, because that allows them to be used by other methods of
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 0dc9fa8d79..8ae8179ee7 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -94,6 +94,8 @@ typedef struct ExprState
Datum *innermost_domainval;
bool *innermost_domainnull;
+
+ void *evalfunc_private;
} ExprState;
--
2.14.1.2.g4274c698f4.dirty
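
For readers skimming 0005: the core idea is that evalfunc initially points at a wrapper which runs the validity pass once, then replaces itself with the real evaluator kept in evalfunc_private. Below is a minimal standalone sketch of that pattern, not the patch's code. DemoState, demo_eval and the other demo_* names are hypothetical stand-ins for ExprState, ExecInterpExpr and CheckExprStillValid, and the "check" merely counts invocations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ExprState; only the two function pointers matter. */
typedef struct DemoState DemoState;
typedef int (*DemoEvalFunc) (DemoState *state, int input);

struct DemoState
{
	DemoEvalFunc evalfunc;			/* what callers invoke */
	DemoEvalFunc evalfunc_private;	/* real evaluator, installed after the check */
	int			checks_run;			/* how often the validity pass ran */
};

/* The "real" evaluator; stands in for ExecInterpExpr or a fast-path function. */
static int
demo_eval(DemoState *state, int input)
{
	(void) state;
	return input * 2;
}

/* Stands in for the separate validity pass; here it only counts calls. */
static void
demo_check_still_valid(DemoState *state)
{
	state->checks_run++;
}

/* First-call wrapper: check once, then swap in the real evaluator. */
static int
demo_eval_still_valid(DemoState *state, int input)
{
	assert(state->evalfunc_private != NULL);

	demo_check_still_valid(state);
	state->evalfunc = state->evalfunc_private;

	return state->evalfunc(state, input);
}

/* Stands in for the "ready expression" step that wires up both pointers. */
static void
demo_ready(DemoState *state)
{
	state->evalfunc = demo_eval_still_valid;
	state->evalfunc_private = demo_eval;
	state->checks_run = 0;
}
```

After demo_ready(), the first call through state->evalfunc pays for the check; every later call goes straight to demo_eval. The per-opcode _FIRST variants become unnecessary because the check no longer lives inside the evaluator itself.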
Attachment: 0006-WIP-deduplicate-int-float-overflow-handling-code.patch (text/x-diff; charset=us-ascii)
From f3d02385d7914e540c6eaf0ee506d0161d265380 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 31 Aug 2017 13:25:28 -0700
Subject: [PATCH 06/16] WIP: deduplicate int/float overflow handling code.
Author:
Reviewed-By:
Discussion: https://postgr.es/m/
Backpatch:
---
src/backend/utils/adt/float.c | 26 +++++++---
src/backend/utils/adt/int8.c | 113 +++++++++++++-----------------------------
2 files changed, 54 insertions(+), 85 deletions(-)
diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index 18b3b949ac..78c06b6c41 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -47,20 +47,31 @@ static const uint32 nan[2] = {0xffffffff, 0x7fffffff};
#define MAXFLOATWIDTH 64
#define MAXDOUBLEWIDTH 128
+static void
+floaterr(bool is_overflow)
+{
+ if (is_overflow)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("value out of range: overflow")));
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("value out of range: underflow")));
+}
+
+#undef isinf
+#define isinf __builtin_isinf
+
/*
* check to see if a float4/8 val has underflowed or overflowed
*/
#define CHECKFLOATVAL(val, inf_is_valid, zero_is_valid) \
do { \
if (isinf(val) && !(inf_is_valid)) \
- ereport(ERROR, \
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), \
- errmsg("value out of range: overflow"))); \
- \
+ floaterr(true); \
if ((val) == 0.0 && !(zero_is_valid)) \
- ereport(ERROR, \
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), \
- errmsg("value out of range: underflow"))); \
+ floaterr(false); \
} while(0)
@@ -903,6 +914,7 @@ float8mul(PG_FUNCTION_ARGS)
CHECKFLOATVAL(result, isinf(arg1) || isinf(arg2),
arg1 == 0 || arg2 == 0);
+
PG_RETURN_FLOAT8(result);
}
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index e8354dee44..8b95a7c479 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -45,6 +45,14 @@ typedef struct
* Formatting and conversion routines.
*---------------------------------------------------------*/
+static void
+overflowerr(void)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+}
+
/*
* scanint8 --- try to parse a string into an int8.
*
@@ -495,9 +503,7 @@ int8um(PG_FUNCTION_ARGS)
result = -arg;
/* overflow check (needed for INT64_MIN) */
if (arg != 0 && SAMESIGN(result, arg))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -524,9 +530,7 @@ int8pl(PG_FUNCTION_ARGS)
* better be that sign too.
*/
if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -545,9 +549,8 @@ int8mi(PG_FUNCTION_ARGS)
* result should be of the same sign as the first input.
*/
if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
+
PG_RETURN_INT64(result);
}
@@ -576,9 +579,7 @@ int8mul(PG_FUNCTION_ARGS)
if (arg2 != 0 &&
((arg2 == -1 && arg1 < 0 && result < 0) ||
result / arg2 != arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
}
PG_RETURN_INT64(result);
}
@@ -610,9 +611,7 @@ int8div(PG_FUNCTION_ARGS)
result = -arg1;
/* overflow check (needed for INT64_MIN) */
if (arg1 != 0 && SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -635,9 +634,7 @@ int8abs(PG_FUNCTION_ARGS)
result = (arg1 < 0) ? -arg1 : arg1;
/* overflow check (needed for INT64_MIN) */
if (result < 0)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -692,9 +689,7 @@ int8inc(PG_FUNCTION_ARGS)
result = *arg + 1;
/* Overflow check */
if (result < 0 && *arg > 0)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
*arg = result;
PG_RETURN_POINTER(arg);
@@ -709,9 +704,7 @@ int8inc(PG_FUNCTION_ARGS)
result = arg + 1;
/* Overflow check */
if (result < 0 && arg > 0)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -736,9 +729,7 @@ int8dec(PG_FUNCTION_ARGS)
result = *arg - 1;
/* Overflow check */
if (result > 0 && *arg < 0)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
*arg = result;
PG_RETURN_POINTER(arg);
@@ -753,9 +744,7 @@ int8dec(PG_FUNCTION_ARGS)
result = arg - 1;
/* Overflow check */
if (result > 0 && arg < 0)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -829,9 +818,7 @@ int84pl(PG_FUNCTION_ARGS)
* better be that sign too.
*/
if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -850,9 +837,7 @@ int84mi(PG_FUNCTION_ARGS)
* result should be of the same sign as the first input.
*/
if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -877,9 +862,7 @@ int84mul(PG_FUNCTION_ARGS)
*/
if (arg1 != (int64) ((int32) arg1) &&
result / arg1 != arg2)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -910,9 +893,7 @@ int84div(PG_FUNCTION_ARGS)
result = -arg1;
/* overflow check (needed for INT64_MIN) */
if (arg1 != 0 && SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -938,9 +919,7 @@ int48pl(PG_FUNCTION_ARGS)
* better be that sign too.
*/
if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -959,9 +938,7 @@ int48mi(PG_FUNCTION_ARGS)
* result should be of the same sign as the first input.
*/
if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -986,9 +963,7 @@ int48mul(PG_FUNCTION_ARGS)
*/
if (arg2 != (int64) ((int32) arg2) &&
result / arg2 != arg1)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -1026,9 +1001,7 @@ int82pl(PG_FUNCTION_ARGS)
* better be that sign too.
*/
if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -1047,9 +1020,7 @@ int82mi(PG_FUNCTION_ARGS)
* result should be of the same sign as the first input.
*/
if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -1074,9 +1045,7 @@ int82mul(PG_FUNCTION_ARGS)
*/
if (arg1 != (int64) ((int32) arg1) &&
result / arg1 != arg2)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -1107,9 +1076,7 @@ int82div(PG_FUNCTION_ARGS)
result = -arg1;
/* overflow check (needed for INT64_MIN) */
if (arg1 != 0 && SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -1135,9 +1102,7 @@ int28pl(PG_FUNCTION_ARGS)
* better be that sign too.
*/
if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -1156,9 +1121,7 @@ int28mi(PG_FUNCTION_ARGS)
* result should be of the same sign as the first input.
*/
if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -1183,9 +1146,7 @@ int28mul(PG_FUNCTION_ARGS)
*/
if (arg2 != (int64) ((int32) arg2) &&
result / arg2 != arg1)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -1356,9 +1317,7 @@ dtoi8(PG_FUNCTION_ARGS)
result = (int64) arg;
if ((float8) result != arg)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
@@ -1395,9 +1354,7 @@ ftoi8(PG_FUNCTION_ARGS)
result = (int64) darg;
if ((float8) result != darg)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("bigint out of range")));
+ overflowerr();
PG_RETURN_INT64(result);
}
--
2.14.1.2.g4274c698f4.dirty
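Note on the hunks above: each one replaces a repeated three-line ereport(ERROR, ...) block with a call to overflowerr(). The definition of overflowerr() is not part of this excerpt, so the following is only a minimal sketch of the general pattern (moving the rarely taken error path into one out-of-line helper so the arithmetic function itself stays small), not the patch's actual code. The names sketch_overflowerr, sketch_int8pl and overflow_reported are hypothetical; a real backend helper would call ereport(ERROR) and not return. The noinline attribute is GCC/Clang syntax, and the wrap-around cast relies on the modular conversion those compilers implement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SAMESIGN(a, b) (((a) < 0) == ((b) < 0))

/* Hypothetical stand-in for the error that ereport(ERROR) would raise. */
static bool overflow_reported = false;

/*
 * Hypothetical counterpart of overflowerr(): kept out of line so that
 * callers only contain a compare and a call on the error path.
 */
static void __attribute__((noinline))
sketch_overflowerr(void)
{
    overflow_reported = true;
}

/* Overflow-checked 64-bit addition using the same sign test as the diff. */
static int64_t
sketch_int8pl(int64_t arg1, int64_t arg2)
{
    /* Add in unsigned arithmetic so that wrap-around is well defined. */
    int64_t result = (int64_t) ((uint64_t) arg1 + (uint64_t) arg2);

    /*
     * Overflow is only possible if both inputs have the same sign; in that
     * case the result must have that sign too.
     */
    if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
        sketch_overflowerr();
    return result;
}
```

Calling sketch_int8pl(1, 2) leaves overflow_reported untouched, while sketch_int8pl(INT64_MAX, 1) sets it.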
0007-Pass-through-PlanState-parent-to-expression-instanti.patch (text/x-diff; charset=us-ascii)
From f0f4766679abcd55fbb117abfc97ccebf0522c80 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 31 Aug 2017 13:27:28 -0700
Subject: [PATCH 07/16] Pass through PlanState parent to expression
instantiation.
---
src/backend/executor/execExpr.c | 21 +++++++++++++++------
src/backend/executor/execExprInterp.c | 2 +-
2 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 81549ee915..b5bde3fa80 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -54,7 +54,7 @@ typedef struct LastAttnumInfo
AttrNumber last_scan;
} LastAttnumInfo;
-static void ExecReadyExpr(ExprState *state);
+static void ExecReadyExpr(ExprState *state, PlanState *parent);
static void ExecInitExprRec(Expr *node, PlanState *parent, ExprState *state,
Datum *resv, bool *resnull);
static void ExprEvalPushStep(ExprState *es, const ExprEvalStep *s);
@@ -123,6 +123,9 @@ ExecInitExpr(Expr *node, PlanState *parent)
state = makeNode(ExprState);
state->expr = node;
+ scratch.resvalue = NULL;
+ scratch.resnull = NULL;
+
/* Insert EEOP_*_FETCHSOME steps as needed */
ExecInitExprSlots(state, (Node *) node);
@@ -133,7 +136,7 @@ ExecInitExpr(Expr *node, PlanState *parent)
scratch.opcode = EEOP_DONE;
ExprEvalPushStep(state, &scratch);
- ExecReadyExpr(state);
+ ExecReadyExpr(state, parent);
return state;
}
@@ -225,7 +228,7 @@ ExecInitQual(List *qual, PlanState *parent)
scratch.opcode = EEOP_DONE;
ExprEvalPushStep(state, &scratch);
- ExecReadyExpr(state);
+ ExecReadyExpr(state, parent);
return state;
}
@@ -316,6 +319,9 @@ ExecBuildProjectionInfo(List *targetList,
state->expr = (Expr *) targetList;
state->resultslot = slot;
+ scratch.resvalue = NULL;
+ scratch.resnull = NULL;
+
/* Insert EEOP_*_FETCHSOME steps as needed */
ExecInitExprSlots(state, (Node *) targetList);
@@ -417,7 +423,7 @@ ExecBuildProjectionInfo(List *targetList,
scratch.opcode = EEOP_DONE;
ExprEvalPushStep(state, &scratch);
- ExecReadyExpr(state);
+ ExecReadyExpr(state, parent);
return projInfo;
}
@@ -571,9 +577,9 @@ ExecCheck(ExprState *state, ExprContext *econtext)
* ExecReadyInterpretedExpr().
*/
static void
-ExecReadyExpr(ExprState *state)
+ExecReadyExpr(ExprState *state, PlanState *parent)
{
- ExecReadyInterpretedExpr(state);
+ ExecReadyInterpretedExpr(state, parent);
}
/*
@@ -2173,6 +2179,9 @@ ExecInitExprSlots(ExprState *state, Node *node)
LastAttnumInfo info = {0, 0, 0};
ExprEvalStep scratch;
+ scratch.resvalue = NULL;
+ scratch.resnull = NULL;
+
/*
* Figure out which attributes we're going to need.
*/
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 50e3f8b176..df453b2ab4 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -151,7 +151,7 @@ static Datum ExecJustAssignScanVar(ExprState *state, ExprContext *econtext, bool
* Prepare ExprState for interpreted execution.
*/
void
-ExecReadyInterpretedExpr(ExprState *state)
+ExecReadyInterpretedExpr(ExprState *state, PlanState *parent)
{
/* Ensure one-time interpreter setup has been done */
ExecInitInterpreter();
--
2.14.1.2.g4274c698f4.dirty
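Note on patch 07: it only changes signatures. ExecReadyExpr() and ExecReadyInterpretedExpr() now receive the PlanState parent, and the interpreter does not use it in the hunks shown. The next patch in the series uses the argument to decide whether an expression is JIT-compiled, falling back to the interpreter otherwise. A minimal sketch of that "try the compiled path, else interpret" dispatch follows; all Sketch* types and sketch_* functions are hypothetical reductions for illustration, not the executor's real structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker recording which path readied the expression. */
typedef enum SketchBackend
{
    SKETCH_NONE,
    SKETCH_INTERPRETED,
    SKETCH_COMPILED
} SketchBackend;

/* Hypothetical, heavily reduced stand-ins for PlanState and ExprState. */
typedef struct SketchPlanState
{
    bool jit_enabled;   /* stands in for per-query JIT state */
} SketchPlanState;

typedef struct SketchExprState
{
    SketchBackend backend;
} SketchExprState;

/* Returns true if it took over, false to request the fallback. */
static bool
sketch_ready_compiled(SketchExprState *state, SketchPlanState *parent)
{
    /* Without a parent there is no per-query state to consult. */
    if (parent == NULL || !parent->jit_enabled)
        return false;
    state->backend = SKETCH_COMPILED;
    return true;
}

static void
sketch_ready_interpreted(SketchExprState *state, SketchPlanState *parent)
{
    (void) parent;      /* accepted but unused, as in patch 07 */
    state->backend = SKETCH_INTERPRETED;
}

/* Single entry point: prefer the compiled path, else interpret. */
static void
sketch_ready_expr(SketchExprState *state, SketchPlanState *parent)
{
    if (sketch_ready_compiled(state, parent))
        return;
    sketch_ready_interpreted(state, parent);
}
```

Threading the parent through every caller first keeps the later behavioural change confined to the single dispatch function.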
0008-WIP-JIT-compile-expression.patch (text/x-diff; charset=us-ascii)
From a75729ca7b46d7e2cdaa97bf4269d459e36fe655 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 31 Aug 2017 14:26:02 -0700
Subject: [PATCH 08/16] WIP: JIT compile expression.
---
src/backend/executor/Makefile | 2 +-
src/backend/executor/execExpr.c | 5 +
src/backend/executor/execExprCompile.c | 2403 ++++++++++++++++++++++++++++++++
src/backend/utils/fmgr/fmgr.c | 2 +-
src/backend/utils/misc/guc.c | 11 +
src/include/executor/execExpr.h | 3 +-
src/include/executor/executor.h | 4 +
src/include/lib/llvmjit.h | 5 +-
src/include/utils/fmgrtab.h | 2 +
9 files changed, 2431 insertions(+), 6 deletions(-)
create mode 100644 src/backend/executor/execExprCompile.c
diff --git a/src/backend/executor/Makefile b/src/backend/executor/Makefile
index 083b20f3fe..277c2e8bf0 100644
--- a/src/backend/executor/Makefile
+++ b/src/backend/executor/Makefile
@@ -12,7 +12,7 @@ subdir = src/backend/executor
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-OBJS = execAmi.o execCurrent.o execExpr.o execExprInterp.o \
+OBJS = execAmi.o execCurrent.o execExpr.o execExprCompile.o execExprInterp.o \
execGrouping.o execIndexing.o execJunk.o \
execMain.o execParallel.o execProcnode.o \
execReplication.o execScan.o execSRF.o execTuples.o \
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index b5bde3fa80..e6ffe6e062 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -579,6 +579,11 @@ ExecCheck(ExprState *state, ExprContext *econtext)
static void
ExecReadyExpr(ExprState *state, PlanState *parent)
{
+#ifdef USE_LLVM
+ if (ExecReadyCompiledExpr(state, parent))
+ return;
+#endif
+
ExecReadyInterpretedExpr(state, parent);
}
diff --git a/src/backend/executor/execExprCompile.c b/src/backend/executor/execExprCompile.c
new file mode 100644
index 0000000000..d41405b648
--- /dev/null
+++ b/src/backend/executor/execExprCompile.c
@@ -0,0 +1,2403 @@
+/*-------------------------------------------------------------------------
+ *
+ * execExprCompile.c
+ * LLVM compilation based expression evaluation.
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/executor/execExprCompile.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#ifdef USE_LLVM
+
+#include "access/htup_details.h"
+#include "access/nbtree.h"
+#include "access/tupconvert.h"
+#include "catalog/objectaccess.h"
+#include "catalog/pg_type.h"
+#include "executor/execdebug.h"
+#include "executor/nodeSubplan.h"
+#include "executor/execExpr.h"
+#include "funcapi.h"
+#include "lib/llvmjit.h"
+#include "miscadmin.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/planner.h"
+#include "parser/parse_coerce.h"
+#include "parser/parsetree.h"
+#include "pgstat.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/date.h"
+#include "utils/fmgrtab.h"
+#include "utils/lsyscache.h"
+#include "utils/memutils.h"
+#include "utils/timestamp.h"
+#include "utils/typcache.h"
+#include "utils/xml.h"
+
+
+typedef struct CompiledExprState
+{
+ LLVMJitContext *context;
+ const char *funcname;
+} CompiledExprState;
+
+
+bool jit_expressions = false;
+
+
+static LLVMValueRef
+create_slot_getsomeattrs(LLVMModuleRef mod)
+{
+ LLVMTypeRef sig;
+ LLVMValueRef fn;
+ LLVMTypeRef param_types[2];
+ const char *nm = "slot_getsomeattrs";
+
+ fn = LLVMGetNamedFunction(mod, nm);
+ if (fn)
+ return fn;
+
+ param_types[0] = LLVMPointerType(StructTupleTableSlot, 0);
+ param_types[1] = LLVMInt32Type();
+
+ sig = LLVMFunctionType(LLVMInt64Type(), param_types, lengthof(param_types), 0);
+ fn = LLVMAddFunction(mod, nm, sig);
+
+ return fn;
+}
+
+
+static LLVMValueRef
+create_heap_getsysattr(LLVMModuleRef mod)
+{
+ LLVMTypeRef sig;
+ LLVMValueRef fn;
+ LLVMTypeRef param_types[4];
+ const char *nm = "heap_getsysattr";
+
+ fn = LLVMGetNamedFunction(mod, nm);
+ if (fn)
+ return fn;
+
+ /* heap_getsysattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull) */
+ param_types[0] = LLVMPointerType(StructHeapTupleData, 0);
+ param_types[1] = LLVMInt32Type();
+ param_types[2] = LLVMPointerType(StructtupleDesc, 0);
+ param_types[3] = LLVMPointerType(LLVMInt8Type(), 0);
+
+ sig = LLVMFunctionType(LLVMInt64Type(), param_types, lengthof(param_types), 0);
+ fn = LLVMAddFunction(mod, nm, sig);
+
+ return fn;
+}
+
+static LLVMValueRef
+create_EvalXFunc(LLVMModuleRef mod, const char *funcname)
+{
+ LLVMTypeRef sig;
+ LLVMValueRef fn;
+ LLVMTypeRef param_types[3];
+
+ fn = LLVMGetNamedFunction(mod, funcname);
+ if (fn)
+ return fn;
+
+ param_types[0] = LLVMPointerType(StructExprState, 0);
+ param_types[1] = LLVMPointerType(TypeSizeT, 0);
+ param_types[2] = LLVMPointerType(StructExprContext, 0);
+
+ sig = LLVMFunctionType(LLVMVoidType(), param_types, lengthof(param_types), 0);
+ fn = LLVMAddFunction(mod, funcname, sig);
+
+ return fn;
+}
+
+static LLVMValueRef
+create_MakeExpandedObjectReadOnly(LLVMModuleRef mod)
+{
+ LLVMTypeRef sig;
+ LLVMValueRef fn;
+ LLVMTypeRef param_types[1];
+ const char *nm = "MakeExpandedObjectReadOnlyInternal";
+
+ fn = LLVMGetNamedFunction(mod, nm);
+ if (fn)
+ return fn;
+
+ param_types[0] = TypeSizeT;
+
+ sig = LLVMFunctionType(TypeSizeT, param_types, lengthof(param_types), 0);
+ fn = LLVMAddFunction(mod, nm, sig);
+
+ return fn;
+}
+
+static LLVMValueRef
+create_EvalArrayRefSubscript(LLVMModuleRef mod)
+{
+ LLVMTypeRef sig;
+ LLVMValueRef fn;
+ LLVMTypeRef param_types[3];
+ const char *nm = "ExecEvalArrayRefSubscript";
+
+ fn = LLVMGetNamedFunction(mod, nm);
+ if (fn)
+ return fn;
+
+ param_types[0] = LLVMPointerType(StructExprState, 0);
+ param_types[1] = LLVMPointerType(TypeSizeT, 0);
+ param_types[2] = LLVMPointerType(StructExprContext, 0);
+
+ sig = LLVMFunctionType(LLVMInt8Type(), param_types, lengthof(param_types), 0);
+ fn = LLVMAddFunction(mod, nm, sig);
+
+ return fn;
+}
+
+static LLVMValueRef
+get_LifetimeEnd(LLVMModuleRef mod)
+{
+ LLVMTypeRef sig;
+ LLVMValueRef fn;
+ LLVMTypeRef param_types[2];
+ const char *nm = "llvm.lifetime.end.p0i8";
+
+ fn = LLVMGetNamedFunction(mod, nm);
+ if (fn)
+ return fn;
+
+ param_types[0] = LLVMInt64Type();
+ param_types[1] = LLVMPointerType(LLVMInt8Type(), 0);
+
+ sig = LLVMFunctionType(LLVMVoidType(), param_types, lengthof(param_types), 0);
+ fn = LLVMAddFunction(mod, nm, sig);
+
+ LLVMSetFunctionCallConv(fn, LLVMCCallConv);
+
+ Assert(LLVMGetIntrinsicID(fn));
+
+ return fn;
+}
+
+static LLVMValueRef
+BuildFunctionCall(LLVMJitContext *context, LLVMBuilderRef builder,
+ LLVMModuleRef mod, FunctionCallInfo fcinfo,
+ LLVMValueRef *v_fcinfo_isnull)
+{
+ bool forceinline = false;
+ LLVMValueRef v_fn_addr;
+ LLVMValueRef v_fcinfo_isnullp;
+ LLVMValueRef v_retval;
+ LLVMValueRef v_fcinfo;
+ const FmgrBuiltin *builtin;
+
+ builtin = fmgr_isbuiltin(fcinfo->flinfo->fn_oid);
+
+ if (builtin && LLVMGetNamedFunction(mod, builtin->funcName))
+ {
+ v_fn_addr = LLVMGetNamedFunction(mod, builtin->funcName);
+
+ forceinline = true;
+
+ }
+ else if (builtin)
+ {
+ LLVMAddFunction(mod, builtin->funcName, TypePGFunction);
+ v_fn_addr = LLVMGetNamedFunction(mod, builtin->funcName);
+ Assert(v_fn_addr);
+ }
+ else
+ {
+ v_fn_addr = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) fcinfo->flinfo->fn_addr, false),
+ LLVMPointerType(TypePGFunction, 0),
+ "v_fn_addr");
+ }
+
+ v_fcinfo = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) fcinfo, false),
+ LLVMPointerType(StructFunctionCallInfoData, 0),
+ "v_fcinfo");
+
+ v_fcinfo_isnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) &fcinfo->isnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_fcinfo_isnull");
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 0, false),
+ v_fcinfo_isnullp);
+
+ v_retval = LLVMBuildCall(builder, v_fn_addr, &v_fcinfo, 1, "funccall");
+
+ if (forceinline)
+ {
+ int id = LLVMGetEnumAttributeKindForName("alwaysinline", sizeof("alwaysinline") - 1);
+ LLVMAttributeRef attr;
+
+ attr = LLVMCreateEnumAttribute(LLVMGetGlobalContext(),
+ id, 0);
+ LLVMAddCallSiteAttribute(v_retval, LLVMAttributeFunctionIndex, attr);
+ }
+
+ if (v_fcinfo_isnull)
+ *v_fcinfo_isnull = LLVMBuildLoad(builder, v_fcinfo_isnullp, "");
+
+ /*
+ * Add lifetime-end annotation, signalling that writes to memory don't
+ * have to be retained (important for inlining potential).
+ */
+ {
+ LLVMValueRef v_lifetime = get_LifetimeEnd(mod);
+ LLVMValueRef params[2];
+
+ params[0] = LLVMConstInt(LLVMInt64Type(), sizeof(FunctionCallInfoData), false);
+ params[1] = LLVMBuildBitCast(
+ builder, v_fcinfo,
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+ LLVMBuildCall(builder, v_lifetime, params, lengthof(params), "");
+ }
+
+ return v_retval;
+}
+
+static Datum
+ExecRunCompiledExpr(ExprState *state, ExprContext *econtext, bool *isNull)
+{
+ CompiledExprState *cstate = state->evalfunc_private;
+ ExprStateEvalFunc func;
+
+ CheckExprStillValid(state, econtext, isNull);
+
+ func = (ExprStateEvalFunc) llvm_get_function(cstate->context,
+ cstate->funcname);
+ if (!func)
+ elog(ERROR, "failed to jit");
+
+ state->evalfunc = func;
+
+ return func(state, econtext, isNull);
+}
+
+bool
+ExecReadyCompiledExpr(ExprState *state, PlanState *parent)
+{
+ ExprEvalStep *op;
+ int i = 0;
+ char *funcname;
+
+ LLVMJitContext *context = NULL;
+
+ LLVMBuilderRef builder;
+ LLVMModuleRef mod;
+ LLVMTypeRef eval_sig;
+ LLVMValueRef eval_fn;
+ LLVMBasicBlockRef entry;
+ LLVMBasicBlockRef *opblocks;
+
+ /* referenced functions */
+ LLVMValueRef l_heap_getsysattr = NULL;
+
+ /* state itself */
+ LLVMValueRef v_state;
+ LLVMValueRef v_econtext;
+
+ /* returnvalue */
+ LLVMValueRef v_isnullp;
+
+ /* tmp vars in state */
+ LLVMValueRef v_tmpvaluep;
+ LLVMValueRef v_tmpisnullp;
+
+ /* slots */
+ LLVMValueRef v_innerslot;
+ LLVMValueRef v_outerslot;
+ LLVMValueRef v_scanslot;
+ LLVMValueRef v_resultslot;
+
+ /* nulls/values of slots */
+ LLVMValueRef v_innervalues;
+ LLVMValueRef v_innernulls;
+ LLVMValueRef v_outervalues;
+ LLVMValueRef v_outernulls;
+ LLVMValueRef v_scanvalues;
+ LLVMValueRef v_scannulls;
+ LLVMValueRef v_resultvalues;
+ LLVMValueRef v_resultnulls;
+
+ /* stuff in econtext */
+ LLVMValueRef v_aggvalues;
+ LLVMValueRef v_aggnulls;
+
+ /* only do JITing if enabled */
+ if (!jit_expressions || !parent)
+ return false;
+
+ llvm_initialize();
+
+ if (parent && parent->state->es_jit)
+ {
+ context = parent->state->es_jit;
+ }
+ else
+ {
+ context = MemoryContextAllocZero(TopMemoryContext,
+ sizeof(LLVMJitContext));
+
+ if (parent)
+ {
+ parent->state->es_jit = context;
+ }
+
+ }
+
+ mod = context->module;
+ if (!mod)
+ {
+ context->compiled = false;
+ mod = context->module = LLVMModuleCreateWithName("evalexpr");
+ LLVMSetTarget(mod, llvm_triple);
+ }
+
+ op = state->steps;
+ funcname = psprintf("evalexpr%d", context->counter);
+ context->counter++;
+
+ builder = LLVMCreateBuilder();
+
+ /* Create the signature and function */
+ {
+ LLVMTypeRef param_types[] = {
+ LLVMPointerType(StructExprState, 0), /* state */
+ LLVMPointerType(StructExprContext, 0), /* econtext */
+ LLVMPointerType(LLVMInt8Type(), 0)}; /* isnull */
+ eval_sig = LLVMFunctionType(TypeSizeT, param_types, lengthof(param_types), 0);
+ }
+ eval_fn = LLVMAddFunction(mod, funcname, eval_sig);
+ LLVMSetLinkage(eval_fn, LLVMExternalLinkage);
+ LLVMSetVisibility(eval_fn, LLVMDefaultVisibility);
+
+ entry = LLVMAppendBasicBlock(eval_fn, "entry");
+
+ /* build state */
+ v_state = LLVMGetParam(eval_fn, 0);
+ v_econtext = LLVMGetParam(eval_fn, 1);
+ v_isnullp = LLVMGetParam(eval_fn, 2);
+
+ LLVMPositionBuilderAtEnd(builder, entry);
+
+ v_tmpvaluep = LLVMBuildStructGEP(builder, v_state, 3, "v.state.resvalue");
+ v_tmpisnullp = LLVMBuildStructGEP(builder, v_state, 2, "v.state.resnull");
+
+ /* build global slots */
+ v_scanslot = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_econtext, 1, ""), "v_scanslot");
+ v_innerslot = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_econtext, 2, ""), "v_innerslot");
+ v_outerslot = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_econtext, 3, ""), "v_outerslot");
+ v_resultslot = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_state, 4, ""), "v_resultslot");
+
+ /* build global values/isnull pointers */
+ v_scanvalues = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_scanslot, 10, ""), "v_scanvalues");
+ v_scannulls = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_scanslot, 11, ""), "v_scannulls");
+ v_innervalues = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_innerslot, 10, ""), "v_innervalues");
+ v_innernulls = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_innerslot, 11, ""), "v_innernulls");
+ v_outervalues = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_outerslot, 10, ""), "v_outervalues");
+ v_outernulls = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_outerslot, 11, ""), "v_outernulls");
+ v_resultvalues = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_resultslot, 10, ""), "v_resultvalues");
+ v_resultnulls = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_resultslot, 11, ""), "v_resultnulls");
+
+ /* aggvalues/aggnulls */
+ v_aggvalues = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_econtext, 8, ""), "v.econtext.aggvalues");
+ v_aggnulls = LLVMBuildLoad(builder, LLVMBuildStructGEP(builder, v_econtext, 9, ""), "v.econtext.aggnulls");
+
+ /* allocate blocks for each op upfront, so we can do jumps easily */
+ opblocks = palloc(sizeof(LLVMBasicBlockRef) * state->steps_len);
+ for (i = 0; i < state->steps_len; i++)
+ {
+ char *blockname = psprintf("block.op.%d.start", i);
+ opblocks[i] = LLVMAppendBasicBlock(eval_fn, blockname);
+ pfree(blockname);
+ }
+
+ /* jump from entry to first block */
+ LLVMBuildBr(builder, opblocks[0]);
+
+ for (i = 0; i < state->steps_len; i++)
+ {
+ LLVMValueRef v_resvaluep; /* FIXME */
+ LLVMValueRef v_resnullp;
+
+ LLVMPositionBuilderAtEnd(builder, opblocks[i]);
+
+ op = &state->steps[i];
+
+ v_resvaluep = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) op->resvalue, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "v_resvaluep");
+ v_resnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) op->resnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_resnullp");
+
+ switch ((ExprEvalOp) op->opcode)
+ {
+ case EEOP_DONE:
+ {
+ LLVMValueRef v_tmpisnull, v_tmpvalue;
+
+ v_tmpvalue = LLVMBuildLoad(builder, v_tmpvaluep, "");
+ v_tmpisnull = LLVMBuildLoad(builder, v_tmpisnullp, "");
+
+ LLVMBuildStore(builder, v_tmpisnull, v_isnullp);
+
+
+ {
+ LLVMValueRef v_lifetime;
+ LLVMValueRef v_steps;
+ LLVMValueRef params[2];
+
+ v_lifetime = get_LifetimeEnd(mod);
+
+ v_steps = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) state->steps, false),
+ LLVMPointerType(TypeSizeT, 0), "");
+
+ params[0] = LLVMConstInt(LLVMInt64Type(),
+ sizeof(state->steps[0]) * state->steps_len,
+ false);
+ params[1] = LLVMBuildBitCast(
+ builder, v_steps,
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+
+ LLVMBuildCall(builder, v_lifetime, params, lengthof(params), "");
+
+ params[0] = LLVMConstInt(LLVMInt64Type(), sizeof(*state), false);
+ params[1] = LLVMBuildBitCast(
+ builder, v_state,
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+
+ LLVMBuildCall(builder, v_lifetime, params, lengthof(params), "");
+ }
+
+
+ LLVMBuildRet(builder, v_tmpvalue);
+ break;
+ }
+ case EEOP_INNER_FETCHSOME:
+ case EEOP_OUTER_FETCHSOME:
+ case EEOP_SCAN_FETCHSOME:
+ {
+ LLVMValueRef v_slot;
+ LLVMBasicBlockRef b_fetch = LLVMInsertBasicBlock(opblocks[i + 1], "");
+ LLVMValueRef v_nvalid;
+
+ if (op->opcode == EEOP_INNER_FETCHSOME)
+ {
+
+ v_slot = v_innerslot;
+
+ }
+ else if (op->opcode == EEOP_OUTER_FETCHSOME)
+ {
+ v_slot = v_outerslot;
+ }
+ else
+ {
+ v_slot = v_scanslot;
+ }
+
+ /*
+ * Check if all required attributes are available, or
+ * whether deforming is required.
+ */
+ v_nvalid = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_slot, 9, ""), "");
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntUGE,
+ v_nvalid,
+ LLVMConstInt(LLVMInt32Type(), op->d.fetch.last_var, false),
+ ""),
+ opblocks[i + 1], b_fetch);
+
+ LLVMPositionBuilderAtEnd(builder, b_fetch);
+ {
+ LLVMValueRef params[2];
+
+ params[0] = v_slot;
+ params[1] = LLVMConstInt(LLVMInt32Type(), op->d.fetch.last_var, false);
+
+ LLVMBuildCall(builder, create_slot_getsomeattrs(mod), params, lengthof(params), "");
+ }
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_INNER_VAR:
+ {
+ LLVMValueRef value, isnull;
+ LLVMValueRef v_attnum;
+
+ v_attnum = LLVMConstInt(LLVMInt32Type(), op->d.var.attnum, false);
+ value = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_innervalues, &v_attnum, 1, ""), "");
+ isnull = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_innernulls, &v_attnum, 1, ""), "");
+ LLVMBuildStore(builder, value, v_resvaluep);
+ LLVMBuildStore(builder, isnull, v_resnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_OUTER_VAR:
+ {
+ LLVMValueRef value, isnull;
+ LLVMValueRef v_attnum;
+
+ v_attnum = LLVMConstInt(LLVMInt32Type(), op->d.var.attnum, false);
+ value = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_outervalues, &v_attnum, 1, ""), "");
+ isnull = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_outernulls, &v_attnum, 1, ""), "");
+ LLVMBuildStore(builder, value, v_resvaluep);
+ LLVMBuildStore(builder, isnull, v_resnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_SCAN_VAR:
+ {
+ LLVMValueRef value, isnull;
+ LLVMValueRef v_attnum;
+
+ v_attnum = LLVMConstInt(LLVMInt32Type(), op->d.var.attnum, false);
+ value = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_scanvalues, &v_attnum, 1, ""), "");
+ isnull = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_scannulls, &v_attnum, 1, ""), "");
+ LLVMBuildStore(builder, value, v_resvaluep);
+ LLVMBuildStore(builder, isnull, v_resnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_ASSIGN_INNER_VAR:
+ {
+ LLVMValueRef v_value, v_isnull;
+ LLVMValueRef v_rvaluep, v_risnullp;
+ LLVMValueRef v_attnum, v_resultnum;
+
+ v_attnum = LLVMConstInt(LLVMInt32Type(), op->d.assign_var.attnum, false);
+ v_resultnum = LLVMConstInt(LLVMInt32Type(), op->d.assign_var.resultnum, false);
+ v_rvaluep = LLVMBuildGEP(builder, v_resultvalues, &v_resultnum, 1, "");
+ v_risnullp = LLVMBuildGEP(builder, v_resultnulls, &v_resultnum, 1, "");
+
+ v_value = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_innervalues, &v_attnum, 1, ""), "");
+ v_isnull = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_innernulls, &v_attnum, 1, ""), "");
+
+ LLVMBuildStore(builder, v_value, v_rvaluep);
+ LLVMBuildStore(builder, v_isnull, v_risnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+
+ }
+
+ case EEOP_ASSIGN_OUTER_VAR:
+ {
+ LLVMValueRef v_value, v_isnull;
+ LLVMValueRef v_rvaluep, v_risnullp;
+ LLVMValueRef v_attnum, v_resultnum;
+
+ v_attnum = LLVMConstInt(LLVMInt32Type(), op->d.assign_var.attnum, false);
+ v_resultnum = LLVMConstInt(LLVMInt32Type(), op->d.assign_var.resultnum, false);
+ v_rvaluep = LLVMBuildGEP(builder, v_resultvalues, &v_resultnum, 1, "");
+ v_risnullp = LLVMBuildGEP(builder, v_resultnulls, &v_resultnum, 1, "");
+
+ v_value = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_outervalues, &v_attnum, 1, ""), "");
+ v_isnull = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_outernulls, &v_attnum, 1, ""), "");
+
+ LLVMBuildStore(builder, v_value, v_rvaluep);
+ LLVMBuildStore(builder, v_isnull, v_risnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_ASSIGN_SCAN_VAR:
+ {
+ LLVMValueRef v_value, v_isnull;
+ LLVMValueRef v_rvaluep, v_risnullp;
+ LLVMValueRef v_attnum, v_resultnum;
+
+ v_attnum = LLVMConstInt(LLVMInt32Type(), op->d.assign_var.attnum, false);
+ v_resultnum = LLVMConstInt(LLVMInt32Type(), op->d.assign_var.resultnum, false);
+ v_rvaluep = LLVMBuildGEP(builder, v_resultvalues, &v_resultnum, 1, "");
+ v_risnullp = LLVMBuildGEP(builder, v_resultnulls, &v_resultnum, 1, "");
+
+ v_value = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_scanvalues, &v_attnum, 1, ""), "");
+ v_isnull = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_scannulls, &v_attnum, 1, ""), "");
+
+ LLVMBuildStore(builder, v_value, v_rvaluep);
+ LLVMBuildStore(builder, v_isnull, v_risnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_ASSIGN_TMP:
+ {
+ LLVMValueRef v_value, v_isnull;
+ LLVMValueRef v_rvaluep, v_risnullp;
+ LLVMValueRef v_resultnum;
+ size_t resultnum = op->d.assign_tmp.resultnum;
+
+ v_resultnum = LLVMConstInt(LLVMInt32Type(), resultnum, false);
+ v_value = LLVMBuildLoad(builder, v_tmpvaluep, "");
+ v_isnull = LLVMBuildLoad(builder, v_tmpisnullp, "");
+ v_rvaluep = LLVMBuildGEP(builder, v_resultvalues, &v_resultnum, 1, "");
+ v_risnullp = LLVMBuildGEP(builder, v_resultnulls, &v_resultnum, 1, "");
+
+ LLVMBuildStore(builder, v_value, v_rvaluep);
+ LLVMBuildStore(builder, v_isnull, v_risnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_ASSIGN_TMP_MAKE_RO:
+ {
+ LLVMBasicBlockRef b_notnull;
+ LLVMValueRef v_params[1];
+ LLVMValueRef v_ret;
+ LLVMValueRef v_value, v_isnull;
+ LLVMValueRef v_rvaluep, v_risnullp;
+ LLVMValueRef v_resultnum;
+ size_t resultnum = op->d.assign_tmp.resultnum;
+
+ b_notnull = LLVMInsertBasicBlock(opblocks[i + 1], "assign_tmp.notnull");
+
+ v_resultnum = LLVMConstInt(LLVMInt32Type(), resultnum, false);
+ v_value = LLVMBuildLoad(builder, v_tmpvaluep, "");
+ v_isnull = LLVMBuildLoad(builder, v_tmpisnullp, "");
+ v_rvaluep = LLVMBuildGEP(builder, v_resultvalues, &v_resultnum, 1, "");
+ v_risnullp = LLVMBuildGEP(builder, v_resultnulls, &v_resultnum, 1, "");
+
+ LLVMBuildStore(builder, v_isnull, v_risnullp);
+
+ /* check if value is NULL */
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_isnull,
+ LLVMConstInt(LLVMInt8Type(), 0, false), ""),
+ b_notnull, opblocks[i + 1]);
+
+ /* if value is not null, convert to RO datum */
+ LLVMPositionBuilderAtEnd(builder, b_notnull);
+
+ v_params[0] = v_value;
+ v_ret = LLVMBuildCall(builder, create_MakeExpandedObjectReadOnly(mod),
+ v_params, lengthof(v_params), "");
+
+ LLVMBuildStore(builder, v_ret, v_rvaluep);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_INNER_SYSVAR:
+ {
+ int attnum = op->d.var.attnum;
+ LLVMValueRef v_attnum;
+ LLVMValueRef v_tuple;
+ LLVMValueRef v_tupleDescriptor;
+ LLVMValueRef v_params[4];
+ LLVMValueRef v_syscol;
+
+ Assert(op->d.var.attnum < 0);
+
+ if (!l_heap_getsysattr)
+ l_heap_getsysattr = create_heap_getsysattr(mod);
+
+ v_tuple = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_innerslot, 5, "v.innertuple"),
+ "");
+ v_tupleDescriptor = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_innerslot, 6, "v.innertupledesc"),
+ "");
+
+ v_attnum = LLVMConstInt(LLVMInt32Type(), attnum, 0);
+
+ v_params[0] = v_tuple;
+ v_params[1] = v_attnum;
+ v_params[2] = v_tupleDescriptor;
+ v_params[3] = v_resnullp;
+ v_syscol = LLVMBuildCall(builder, l_heap_getsysattr, v_params, lengthof(v_params), "");
+ LLVMBuildStore(builder, v_syscol, v_resvaluep);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_OUTER_SYSVAR:
+ {
+ int attnum = op->d.var.attnum;
+ LLVMValueRef v_attnum;
+ LLVMValueRef v_tuple;
+ LLVMValueRef v_tupleDescriptor;
+ LLVMValueRef v_params[4];
+ LLVMValueRef v_syscol;
+
+ Assert(op->d.var.attnum < 0);
+
+ if (!l_heap_getsysattr)
+ l_heap_getsysattr = create_heap_getsysattr(mod);
+
+ v_tuple = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_outerslot, 5, "v.outertuple"),
+ "");
+ v_tupleDescriptor = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_outerslot, 6, "v.outertupledesc"),
+ "");
+
+ v_attnum = LLVMConstInt(LLVMInt32Type(), attnum, 0);
+
+ v_params[0] = v_tuple;
+ v_params[1] = v_attnum;
+ v_params[2] = v_tupleDescriptor;
+ v_params[3] = v_resnullp;
+ v_syscol = LLVMBuildCall(builder, l_heap_getsysattr, v_params, lengthof(v_params), "");
+ LLVMBuildStore(builder, v_syscol, v_resvaluep);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_SCAN_SYSVAR:
+ {
+ int attnum = op->d.var.attnum;
+ LLVMValueRef v_attnum;
+ LLVMValueRef v_tuple;
+ LLVMValueRef v_tupleDescriptor;
+ LLVMValueRef v_params[4];
+ LLVMValueRef v_syscol;
+
+ Assert(op->d.var.attnum < 0);
+
+ if (!l_heap_getsysattr)
+ l_heap_getsysattr = create_heap_getsysattr(mod);
+
+ v_tuple = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_scanslot, 5, "v.scantuple"),
+ "");
+ v_tupleDescriptor = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_scanslot, 6, "v.scantupledesc"),
+ "");
+
+ v_attnum = LLVMConstInt(LLVMInt32Type(), attnum, 0);
+
+ v_params[0] = v_tuple;
+ v_params[1] = v_attnum;
+ v_params[2] = v_tupleDescriptor;
+ v_params[3] = v_resnullp;
+ v_syscol = LLVMBuildCall(builder, l_heap_getsysattr, v_params, lengthof(v_params), "");
+ LLVMBuildStore(builder, v_syscol, v_resvaluep);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_CONST:
+ {
+ LLVMValueRef v_constvalue, v_constnull;
+
+ v_constvalue = LLVMConstInt(TypeSizeT, op->d.constval.value, false);
+ v_constnull = LLVMConstInt(LLVMInt8Type(), op->d.constval.isnull, false);
+
+ LLVMBuildStore(builder, v_constvalue, v_resvaluep);
+ LLVMBuildStore(builder, v_constnull, v_resnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_FUNCEXPR_STRICT:
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ LLVMBasicBlockRef b_nonull = LLVMInsertBasicBlock(opblocks[i + 1], "no-null-args");
+ int argno;
+ LLVMValueRef v_argnullp;
+ LLVMBasicBlockRef *b_checkargnulls;
+
+ /* should make sure they're optimized beforehand */
+ if (op->d.func.nargs == 0)
+ elog(ERROR, "argumentless strict functions are pointless");
+
+ v_argnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo->argnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_argnullp");
+
+ /* set resnull to true; if the function is actually called, it'll be reset */
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 1, false), v_resnullp);
+
+ /* create blocks for checking args */
+ b_checkargnulls = palloc(sizeof(LLVMBasicBlockRef) * op->d.func.nargs);
+ for (argno = 0; argno < op->d.func.nargs;argno++)
+ {
+ b_checkargnulls[argno] = LLVMInsertBasicBlock(b_nonull, "check-null-arg");
+ }
+
+ LLVMBuildBr(builder, b_checkargnulls[0]);
+
+ /* strict function, check for NULL args */
+ for (argno = 0; argno < op->d.func.nargs;argno++)
+ {
+ LLVMValueRef v_argno = LLVMConstInt(LLVMInt32Type(), argno, false);
+ LLVMValueRef v_argisnull;
+ LLVMBasicBlockRef b_argnotnull;
+
+ LLVMPositionBuilderAtEnd(builder, b_checkargnulls[argno]);
+
+ if (argno + 1 == op->d.func.nargs)
+ b_argnotnull = b_nonull;
+ else
+ b_argnotnull = b_checkargnulls[argno + 1];
+
+ v_argisnull = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_argnullp, &v_argno, 1, ""), "");
+
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_argisnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ opblocks[i + 1],
+ b_argnotnull);
+ }
+
+ LLVMPositionBuilderAtEnd(builder, b_nonull);
+ }
+ /* explicit fallthrough */
+ case EEOP_FUNCEXPR:
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ LLVMValueRef v_fcinfo_isnull;
+ LLVMValueRef v_retval;
+
+ v_retval = BuildFunctionCall(context, builder, mod, fcinfo, &v_fcinfo_isnull);
+ LLVMBuildStore(builder, v_retval, v_resvaluep);
+ LLVMBuildStore(builder, v_fcinfo_isnull, v_resnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_AGGREF:
+ {
+ AggrefExprState *aggref = op->d.aggref.astate;
+ LLVMValueRef v_aggnop;
+ LLVMValueRef v_aggno;
+ LLVMValueRef value, isnull;
+
+ /*
+ * At this point aggref->aggno has not necessarily been set
+ * yet. So load it from memory each time round. Yes,
+ * that's really ugly. XXX
+ */
+ v_aggnop = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) &aggref->aggno, false),
+ LLVMPointerType(LLVMInt32Type(), 0),
+ "aggnop");
+ v_aggno = LLVMBuildLoad(builder, v_aggnop, "");
+ value = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_aggvalues, &v_aggno, 1, ""), "aggvalue");
+ isnull = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_aggnulls, &v_aggno, 1, ""), "aggnull");
+
+ LLVMBuildStore(builder, value, v_resvaluep);
+ LLVMBuildStore(builder, isnull, v_resnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+ case EEOP_WINDOW_FUNC:
+ {
+ WindowFuncExprState *wfunc = op->d.window_func.wfstate;
+ LLVMValueRef v_aggnop;// = LLVMConstInt(LLVMInt32Type(), wfunc->wfuncno, false);
+ LLVMValueRef v_aggno;
+ LLVMValueRef value, isnull;
+
+ /*
+ * At this point wfunc->wfuncno has not necessarily been set
+ * yet. So load it from memory each time round. Yes,
+ * that's really ugly. XXX
+ */
+ v_aggnop = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) &wfunc->wfuncno, false),
+ LLVMPointerType(LLVMInt32Type(), 0),
+ "aggnop");
+
+ v_aggno = LLVMBuildLoad(builder, v_aggnop, "");
+ value = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_aggvalues, &v_aggno, 1, ""), "windowvalue");
+ isnull = LLVMBuildLoad(builder, LLVMBuildGEP(builder, v_aggnulls, &v_aggno, 1, ""), "windownull");
+
+ LLVMBuildStore(builder, value, v_resvaluep);
+ LLVMBuildStore(builder, isnull, v_resnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_BOOL_AND_STEP_FIRST:
+ {
+ LLVMValueRef v_boolanynullp;
+
+ v_boolanynullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->d.boolexpr.anynull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "boolanynull");
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 0, false), v_boolanynullp);
+
+ /* intentionally fall through */
+ }
+
+ case EEOP_BOOL_AND_STEP_LAST: /* FIXME */
+ case EEOP_BOOL_AND_STEP:
+ {
+ LLVMValueRef v_boolvaluep, v_boolvalue;
+ LLVMValueRef v_boolnullp, v_boolnull;
+ LLVMValueRef v_boolanynullp, v_boolanynull;
+ LLVMBasicBlockRef boolisnullblock;
+ LLVMBasicBlockRef boolcheckfalseblock;
+ LLVMBasicBlockRef boolisfalseblock;
+ LLVMBasicBlockRef boolcontblock;
+ LLVMBasicBlockRef boolisanynullblock;
+ char *blockname;
+
+ blockname = psprintf("block.op.%d.boolisnull", i);
+ boolisnullblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ blockname = psprintf("block.op.%d.boolcheckfalse", i);
+ boolcheckfalseblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ blockname = psprintf("block.op.%d.boolisfalse", i);
+ boolisfalseblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ blockname = psprintf("block.op.%d.boolisanynullblock", i);
+ boolisanynullblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ blockname = psprintf("block.op.%d.boolcontblock", i);
+ boolcontblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ v_boolvaluep = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->resvalue, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "boolvaluep");
+
+ v_boolnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->resnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "boolnullp");
+
+ v_boolanynullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->d.boolexpr.anynull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "anynull");
+
+ v_boolnull = LLVMBuildLoad(builder, v_boolnullp, "");
+ v_boolvalue = LLVMBuildLoad(builder, v_boolvaluep, "");
+
+ /* set resnull to boolnull */
+ LLVMBuildStore(builder, v_boolnull, v_resnullp);
+ /* set resvalue to boolvalue */
+ LLVMBuildStore(builder, v_boolvalue, v_resvaluep);
+
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_boolnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ boolisnullblock,
+ boolcheckfalseblock);
+
+ /* build block that sets anynull */
+ LLVMPositionBuilderAtEnd(builder, boolisnullblock);
+ /* set boolanynull to true */
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 1, false), v_boolanynullp);
+ /* set resnull to true */
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 1, false), v_resnullp);
+ /* reset resvalue (cleanliness) */
+ LLVMBuildStore(builder, LLVMConstInt(TypeSizeT, 0, false), v_resvaluep);
+ /* and jump to next block */
+ LLVMBuildBr(builder, boolcontblock);
+
+ /* build block checking for false, which can jump out on false */
+ LLVMPositionBuilderAtEnd(builder, boolcheckfalseblock);
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_boolvalue,
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ boolisfalseblock,
+ boolcontblock);
+
+ /* Build block handling FALSE. Value is false, so short circuit. */
+ LLVMPositionBuilderAtEnd(builder, boolisfalseblock);
+ /* set resnull to false */
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 0, false), v_resnullp);
+ /* reset resvalue to false */
+ LLVMBuildStore(builder, LLVMConstInt(TypeSizeT, 0, false), v_resvaluep);
+ /* and jump to the end of the AND expression */
+ LLVMBuildBr(builder, opblocks[op->d.boolexpr.jumpdone]);
+
+ /* build block that continues if bool is TRUE */
+ LLVMPositionBuilderAtEnd(builder, boolcontblock);
+
+ v_boolanynull = LLVMBuildLoad(builder, v_boolanynullp, "");
+
+ /* set value to NULL if any previous values were NULL */
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_boolanynull,
+ LLVMConstInt(LLVMInt8Type(), 0, false), ""),
+ opblocks[i + 1], boolisanynullblock);
+
+ LLVMPositionBuilderAtEnd(builder, boolisanynullblock);
+ /* set resnull to true */
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 1, false), v_resnullp);
+ /* reset resvalue */
+ LLVMBuildStore(builder, LLVMConstInt(TypeSizeT, 0, false), v_resvaluep);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+ case EEOP_BOOL_OR_STEP_FIRST:
+ {
+ LLVMValueRef v_boolanynullp;
+
+ v_boolanynullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->d.boolexpr.anynull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "boolanynull");
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 0, false), v_boolanynullp);
+
+ /* intentionally fall through */
+ }
+
+ case EEOP_BOOL_OR_STEP_LAST: /* FIXME */
+ case EEOP_BOOL_OR_STEP:
+ {
+ LLVMValueRef v_boolvaluep, v_boolvalue;
+ LLVMValueRef v_boolnullp, v_boolnull;
+ LLVMValueRef v_boolanynullp, v_boolanynull;
+ LLVMBasicBlockRef boolisnullblock;
+ LLVMBasicBlockRef boolchecktrueblock;
+ LLVMBasicBlockRef boolistrueblock;
+ LLVMBasicBlockRef boolcontblock;
+ LLVMBasicBlockRef boolisanynullblock;
+ char *blockname;
+
+ blockname = psprintf("block.op.%d.boolisnull", i);
+ boolisnullblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ blockname = psprintf("block.op.%d.boolchecktrue", i);
+ boolchecktrueblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ blockname = psprintf("block.op.%d.boolistrue", i);
+ boolistrueblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ blockname = psprintf("block.op.%d.boolisanynullblock", i);
+ boolisanynullblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ blockname = psprintf("block.op.%d.boolcontblock", i);
+ boolcontblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ v_boolvaluep = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->resvalue, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "boolvaluep");
+
+ v_boolnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->resnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "boolnullp");
+
+ v_boolanynullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->d.boolexpr.anynull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "anynull");
+
+ v_boolnull = LLVMBuildLoad(builder, v_boolnullp, "");
+ v_boolvalue = LLVMBuildLoad(builder, v_boolvaluep, "");
+
+ /* set resnull to boolnull */
+ LLVMBuildStore(builder, v_boolnull, v_resnullp);
+ /* set resvalue to boolvalue */
+ LLVMBuildStore(builder, v_boolvalue, v_resvaluep);
+
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_boolnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ boolisnullblock,
+ boolchecktrueblock);
+
+ /* build block that sets anynull */
+ LLVMPositionBuilderAtEnd(builder, boolisnullblock);
+ /* set boolanynull to true */
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 1, false), v_boolanynullp);
+ /* set resnull to true */
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 1, false), v_resnullp);
+ /* reset resvalue (cleanliness) */
+ LLVMBuildStore(builder, LLVMConstInt(TypeSizeT, 0, false), v_resvaluep);
+ /* and jump to next block */
+ LLVMBuildBr(builder, boolcontblock);
+
+ /* build block checking for true, which can jump out on true */
+ LLVMPositionBuilderAtEnd(builder, boolchecktrueblock);
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_boolvalue,
+ LLVMConstInt(TypeSizeT, 1, false), ""),
+ boolistrueblock,
+ boolcontblock);
+
+ /* Build block handling TRUE. Value is true, so short circuit. */
+ LLVMPositionBuilderAtEnd(builder, boolistrueblock);
+ /* set resnull to false */
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 0, false), v_resnullp);
+ /* reset resvalue to true */
+ LLVMBuildStore(builder, LLVMConstInt(TypeSizeT, 1, false), v_resvaluep);
+ /* and jump to the end of the OR expression */
+ LLVMBuildBr(builder, opblocks[op->d.boolexpr.jumpdone]);
+
+ /* build block that continues if bool is FALSE */
+ LLVMPositionBuilderAtEnd(builder, boolcontblock);
+
+ v_boolanynull = LLVMBuildLoad(builder, v_boolanynullp, "");
+
+ /* set value to NULL if any previous values were NULL */
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_boolanynull,
+ LLVMConstInt(LLVMInt8Type(), 0, false), ""),
+ opblocks[i + 1], boolisanynullblock);
+
+ LLVMPositionBuilderAtEnd(builder, boolisanynullblock);
+ /* set resnull to true */
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 1, false), v_resnullp);
+ /* reset resvalue */
+ LLVMBuildStore(builder, LLVMConstInt(TypeSizeT, 0, false), v_resvaluep);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_BOOL_NOT_STEP:
+ {
+ LLVMValueRef v_boolvaluep, v_boolvalue;
+ LLVMValueRef v_boolnullp, v_boolnull;
+ LLVMValueRef v_negbool;
+
+ v_boolvaluep = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->resvalue, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "boolvaluep");
+
+ v_boolnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->resnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "boolnullp");
+
+ v_boolnull = LLVMBuildLoad(builder, v_boolnullp, "");
+ v_boolvalue = LLVMBuildLoad(builder, v_boolvaluep, "");
+
+ v_negbool = LLVMBuildZExt(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_boolvalue,
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ TypeSizeT, "");
+ /* set resnull to boolnull */
+ LLVMBuildStore(builder, v_boolnull, v_resnullp);
+ /* set resvalue to !boolvalue */
+ LLVMBuildStore(builder, v_negbool, v_resvaluep);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_QUAL:
+ {
+ LLVMValueRef v_resnull;
+ LLVMValueRef v_resvalue;
+ LLVMValueRef v_nullorfalse;
+ LLVMBasicBlockRef qualfailblock;
+ char *blockname;
+
+ blockname = psprintf("block.op.%d.qualfail", i);
+ qualfailblock = LLVMInsertBasicBlock(opblocks[i + 1], blockname);
+ pfree(blockname);
+
+ v_resvalue = LLVMBuildLoad(builder, v_resvaluep, "");
+ v_resnull = LLVMBuildLoad(builder, v_resnullp, "");
+
+ v_nullorfalse = LLVMBuildOr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_resnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_resvalue,
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ "");
+
+ LLVMBuildCondBr(
+ builder,
+ v_nullorfalse,
+ qualfailblock,
+ opblocks[i + 1]);
+
+ /* build block handling NULL or false */
+ LLVMPositionBuilderAtEnd(builder, qualfailblock);
+ /* set resnull to false */
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt8Type(), 0, false), v_resnullp);
+ /* set resvalue to false */
+ LLVMBuildStore(builder, LLVMConstInt(TypeSizeT, 0, false), v_resvaluep);
+ /* and jump out */
+ LLVMBuildBr(builder, opblocks[op->d.qualexpr.jumpdone]);
+
+ break;
+ }
+
+ case EEOP_JUMP:
+ {
+ LLVMBuildBr(builder, opblocks[op->d.jump.jumpdone]);
+
+ break;
+ }
+
+ case EEOP_JUMP_IF_NULL:
+ {
+ LLVMValueRef v_resnull;
+
+ /* Transfer control if current result is null */
+
+ v_resnull = LLVMBuildLoad(builder, v_resnullp, "");
+
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_resnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ opblocks[op->d.jump.jumpdone],
+ opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_JUMP_IF_NOT_NULL:
+ {
+ LLVMValueRef v_resnull;
+
+ /* Transfer control if current result is non-null */
+
+ v_resnull = LLVMBuildLoad(builder, v_resnullp, "");
+
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_resnull,
+ LLVMConstInt(LLVMInt8Type(), 0, false), ""),
+ opblocks[op->d.jump.jumpdone],
+ opblocks[i + 1]);
+ break;
+ }
+
+
+ case EEOP_JUMP_IF_NOT_TRUE:
+ {
+ LLVMValueRef v_resnull;
+ LLVMValueRef v_resvalue;
+ LLVMValueRef v_nullorfalse;
+
+ /* Transfer control if current result is null or false */
+
+ v_resvalue = LLVMBuildLoad(builder, v_resvaluep, "");
+ v_resnull = LLVMBuildLoad(builder, v_resnullp, "");
+
+ v_nullorfalse = LLVMBuildOr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_resnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_resvalue,
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ "");
+
+ LLVMBuildCondBr(
+ builder,
+ v_nullorfalse,
+ opblocks[op->d.jump.jumpdone],
+ opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_PARAM_EXEC:
+ case EEOP_PARAM_EXTERN:
+ case EEOP_SQLVALUEFUNCTION:
+ case EEOP_CURRENTOFEXPR:
+ case EEOP_NEXTVALUEEXPR:
+ case EEOP_ARRAYEXPR:
+ case EEOP_ARRAYCOERCE:
+ case EEOP_ROW:
+ case EEOP_MINMAX:
+ case EEOP_FIELDSELECT:
+ case EEOP_FIELDSTORE_DEFORM:
+ case EEOP_FIELDSTORE_FORM:
+ case EEOP_ARRAYREF_ASSIGN:
+ case EEOP_ARRAYREF_FETCH:
+ case EEOP_ARRAYREF_OLD:
+ case EEOP_CONVERT_ROWTYPE:
+ case EEOP_SCALARARRAYOP:
+ case EEOP_DOMAIN_NOTNULL:
+ case EEOP_DOMAIN_CHECK:
+ case EEOP_XMLEXPR:
+ case EEOP_GROUPING_FUNC:
+ case EEOP_SUBPLAN:
+ case EEOP_ALTERNATIVE_SUBPLAN:
+ case EEOP_NULLTEST_ROWISNULL:
+ case EEOP_NULLTEST_ROWISNOTNULL:
+ case EEOP_WHOLEROW:
+ {
+ LLVMValueRef v_params[3];
+ const char *funcname;
+
+ if (op->opcode == EEOP_PARAM_EXEC)
+ funcname = "ExecEvalParamExec";
+ else if (op->opcode == EEOP_PARAM_EXTERN)
+ funcname = "ExecEvalParamExtern";
+ else if (op->opcode == EEOP_SQLVALUEFUNCTION)
+ funcname = "ExecEvalSQLValueFunction";
+ else if (op->opcode == EEOP_CURRENTOFEXPR)
+ funcname = "ExecEvalCurrentOfExpr";
+ else if (op->opcode == EEOP_NEXTVALUEEXPR)
+ funcname = "ExecEvalNextValueExpr";
+ else if (op->opcode == EEOP_ARRAYEXPR)
+ funcname = "ExecEvalArrayExpr";
+ else if (op->opcode == EEOP_ARRAYCOERCE)
+ funcname = "ExecEvalArrayCoerce";
+ else if (op->opcode == EEOP_ROW)
+ funcname = "ExecEvalRow";
+ else if (op->opcode == EEOP_MINMAX)
+ funcname = "ExecEvalMinMax";
+ else if (op->opcode == EEOP_FIELDSELECT)
+ funcname = "ExecEvalFieldSelect";
+ else if (op->opcode == EEOP_FIELDSTORE_DEFORM)
+ funcname = "ExecEvalFieldStoreDeForm";
+ else if (op->opcode == EEOP_FIELDSTORE_FORM)
+ funcname = "ExecEvalFieldStoreForm";
+ else if (op->opcode == EEOP_ARRAYREF_FETCH)
+ funcname = "ExecEvalArrayRefFetch";
+ else if (op->opcode == EEOP_ARRAYREF_ASSIGN)
+ funcname = "ExecEvalArrayRefAssign";
+ else if (op->opcode == EEOP_ARRAYREF_OLD)
+ funcname = "ExecEvalArrayRefOld";
+ else if (op->opcode == EEOP_NULLTEST_ROWISNULL)
+ funcname = "ExecEvalRowNull";
+ else if (op->opcode == EEOP_NULLTEST_ROWISNOTNULL)
+ funcname = "ExecEvalRowNotNull";
+ else if (op->opcode == EEOP_CONVERT_ROWTYPE)
+ funcname = "ExecEvalConvertRowtype";
+ else if (op->opcode == EEOP_SCALARARRAYOP)
+ funcname = "ExecEvalScalarArrayOp";
+ else if (op->opcode == EEOP_DOMAIN_NOTNULL)
+ funcname = "ExecEvalConstraintNotNull";
+ else if (op->opcode == EEOP_DOMAIN_CHECK)
+ funcname = "ExecEvalConstraintCheck";
+ else if (op->opcode == EEOP_XMLEXPR)
+ funcname = "ExecEvalXmlExpr";
+ else if (op->opcode == EEOP_GROUPING_FUNC)
+ funcname = "ExecEvalGroupingFunc";
+ else if (op->opcode == EEOP_SUBPLAN)
+ funcname = "ExecEvalSubPlan";
+ else if (op->opcode == EEOP_ALTERNATIVE_SUBPLAN)
+ funcname = "ExecEvalAlternativeSubPlan";
+ else if (op->opcode == EEOP_WHOLEROW)
+ funcname = "ExecEvalWholeRowVar";
+ else
+ {
+ Assert(false);
+ funcname = NULL; /* prevent compiler warning */
+ }
+
+ v_params[0] = v_state;
+ v_params[1] = LLVMBuildIntToPtr(builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) op, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "");
+ v_params[2] = v_econtext;
+ LLVMBuildCall(builder,
+ create_EvalXFunc(mod, funcname),
+ v_params, lengthof(v_params), "");
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_ARRAYREF_SUBSCRIPT:
+ {
+ LLVMValueRef v_params[3];
+ LLVMValueRef v_ret;
+
+ v_params[0] = v_state;
+ v_params[1] = LLVMBuildIntToPtr(builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) op, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "");
+ v_params[2] = v_econtext;
+ v_ret = LLVMBuildCall(builder, create_EvalArrayRefSubscript(mod),
+ v_params, lengthof(v_params), "");
+
+ LLVMBuildCondBr(builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_ret,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ opblocks[i + 1],
+ opblocks[op->d.arrayref_subscript.jumpdone]);
+ break;
+ }
+
+ case EEOP_CASE_TESTVAL:
+ {
+ LLVMBasicBlockRef b_avail, b_notavail;
+ LLVMValueRef v_casevaluep, v_casevalue;
+ LLVMValueRef v_casenullp, v_casenull;
+
+ b_avail = LLVMInsertBasicBlock(opblocks[i + 1], "");
+ b_notavail = LLVMInsertBasicBlock(opblocks[i + 1], "");
+
+ v_casevaluep = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) op->d.casetest.value, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "");
+ v_casenullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) op->d.casetest.isnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ,
+ LLVMBuildPtrToInt(builder, v_casevaluep, TypeSizeT, ""),
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ b_notavail, b_avail);
+
+ /* if casetest != NULL */
+ LLVMPositionBuilderAtEnd(builder, b_avail);
+ v_casevalue = LLVMBuildLoad(builder, v_casevaluep, "");
+ v_casenull = LLVMBuildLoad(builder, v_casenullp, "");
+ LLVMBuildStore(builder, v_casevalue, v_resvaluep);
+ LLVMBuildStore(builder, v_casenull, v_resnullp);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ /* if casetest == NULL */
+ LLVMPositionBuilderAtEnd(builder, b_notavail);
+ v_casevalue = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_econtext, 10, ""), "");
+ v_casenull = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_econtext, 11, ""), "");
+ LLVMBuildStore(builder, v_casevalue, v_resvaluep);
+ LLVMBuildStore(builder, v_casenull, v_resnullp);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_DOMAIN_TESTVAL:
+ {
+ LLVMBasicBlockRef b_avail, b_notavail;
+ LLVMValueRef v_casevaluep, v_casevalue;
+ LLVMValueRef v_casenullp, v_casenull;
+
+ b_avail = LLVMInsertBasicBlock(opblocks[i + 1], "");
+ b_notavail = LLVMInsertBasicBlock(opblocks[i + 1], "");
+
+ v_casevaluep = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) op->d.casetest.value, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "");
+ v_casenullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) op->d.casetest.isnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ,
+ LLVMBuildPtrToInt(builder, v_casevaluep, TypeSizeT, ""),
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ b_notavail, b_avail);
+
+ /* if casetest != NULL */
+ LLVMPositionBuilderAtEnd(builder, b_avail);
+ v_casevalue = LLVMBuildLoad(builder, v_casevaluep, "");
+ v_casenull = LLVMBuildLoad(builder, v_casenullp, "");
+ LLVMBuildStore(builder, v_casevalue, v_resvaluep);
+ LLVMBuildStore(builder, v_casenull, v_resnullp);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ /* if casetest == NULL */
+ LLVMPositionBuilderAtEnd(builder, b_notavail);
+ v_casevalue = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_econtext, 12, ""), "");
+ v_casenull = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_econtext, 13, ""), "");
+ LLVMBuildStore(builder, v_casevalue, v_resvaluep);
+ LLVMBuildStore(builder, v_casenull, v_resnullp);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_NULLTEST_ISNULL:
+ {
+ LLVMValueRef v_resnull = LLVMBuildLoad(builder, v_resnullp, "");
+ LLVMValueRef v_resvalue;
+
+ v_resvalue = LLVMBuildSelect(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_resnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ LLVMConstInt(TypeSizeT, 1, false),
+ LLVMConstInt(TypeSizeT, 0, false),
+ "");
+ LLVMBuildStore(
+ builder,
+ v_resvalue,
+ v_resvaluep);
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(LLVMInt8Type(), 0, false),
+ v_resnullp);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_NULLTEST_ISNOTNULL:
+ {
+ LLVMValueRef v_resnull = LLVMBuildLoad(builder, v_resnullp, "");
+ LLVMValueRef v_resvalue;
+
+ v_resvalue = LLVMBuildSelect(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_resnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ LLVMConstInt(TypeSizeT, 0, false),
+ LLVMConstInt(TypeSizeT, 1, false),
+ "");
+ LLVMBuildStore(
+ builder,
+ v_resvalue,
+ v_resvaluep);
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(LLVMInt8Type(), 0, false),
+ v_resnullp);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_BOOLTEST_IS_TRUE:
+ case EEOP_BOOLTEST_IS_NOT_FALSE:
+ case EEOP_BOOLTEST_IS_FALSE:
+ case EEOP_BOOLTEST_IS_NOT_TRUE:
+ {
+ LLVMBasicBlockRef b_isnull, b_notnull;
+ LLVMValueRef v_resnull = LLVMBuildLoad(builder, v_resnullp, "");
+
+ b_isnull = LLVMInsertBasicBlock(opblocks[i + 1], "booltest.isnull");
+ b_notnull = LLVMInsertBasicBlock(opblocks[i + 1], "booltest.isnotnull");
+
+ /* check if value is NULL */
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_resnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ b_isnull, b_notnull);
+
+ /*
+ * If value is NULL, the result is false for IS TRUE / IS FALSE and
+ * true for IS NOT TRUE / IS NOT FALSE.
+ */
+ LLVMPositionBuilderAtEnd(builder, b_isnull);
+
+ /* result is not null */
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(LLVMInt8Type(), 0, false),
+ v_resnullp);
+
+ if (op->opcode == EEOP_BOOLTEST_IS_TRUE
+ || op->opcode == EEOP_BOOLTEST_IS_FALSE)
+ {
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(TypeSizeT, 0, false),
+ v_resvaluep);
+ }
+ else
+ {
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(TypeSizeT, 1, false),
+ v_resvaluep);
+ }
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ LLVMPositionBuilderAtEnd(builder, b_notnull);
+
+ /* FIXME: don't think this is correct */
+
+ if (op->opcode == EEOP_BOOLTEST_IS_TRUE ||
+ op->opcode == EEOP_BOOLTEST_IS_NOT_FALSE)
+ {
+ /* if value is not NULL, return value (already set) */
+ }
+ else
+ {
+ LLVMValueRef v_value = LLVMBuildLoad(
+ builder, v_resvaluep, "");
+
+ v_value = LLVMBuildZExt(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_value,
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ TypeSizeT, "");
+ LLVMBuildStore(
+ builder,
+ v_value,
+ v_resvaluep);
+ }
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_NULLIF:
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ LLVMValueRef v_fcinfo_isnull;
+
+ LLVMValueRef v_argnullp;
+ LLVMValueRef v_argnull0;
+ LLVMValueRef v_argnull1;
+ LLVMValueRef v_anyargisnull;
+ LLVMValueRef v_argp;
+ LLVMValueRef v_arg0;
+ LLVMValueRef v_argno;
+ LLVMBasicBlockRef b_hasnull =
+ LLVMInsertBasicBlock(opblocks[i + 1], "null-args");
+ LLVMBasicBlockRef b_nonull =
+ LLVMInsertBasicBlock(opblocks[i + 1], "no-null-args");
+ LLVMBasicBlockRef b_argsequal =
+ LLVMInsertBasicBlock(opblocks[i + 1], "argsequal");
+ LLVMValueRef v_retval;
+ LLVMValueRef v_argsequal;
+
+ v_argnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo->argnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_argnullp");
+
+ v_argp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo->arg, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "v_arg");
+
+ /* if either argument is NULL they can't be equal */
+ v_argno = LLVMConstInt(LLVMInt32Type(), 0, false);
+ v_argnull0 = LLVMBuildLoad(
+ builder,
+ LLVMBuildGEP(builder, v_argnullp, &v_argno, 1, "")
+ , "");
+ v_argno = LLVMConstInt(LLVMInt32Type(), 1, false);
+ v_argnull1 = LLVMBuildLoad(
+ builder,
+ LLVMBuildGEP(builder, v_argnullp, &v_argno, 1, "")
+ , "");
+
+ v_anyargisnull = LLVMBuildOr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_argnull0,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_argnull1,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ "");
+
+ LLVMBuildCondBr(
+ builder,
+ v_anyargisnull,
+ b_hasnull,
+ b_nonull);
+
+ /* one (or both) of the arguments is NULL, or the arguments differ: return arg[0] */
+ LLVMPositionBuilderAtEnd(builder, b_hasnull);
+ v_argno = LLVMConstInt(LLVMInt32Type(), 0, false);
+ v_arg0 = LLVMBuildLoad(
+ builder,
+ LLVMBuildGEP(builder, v_argp, &v_argno, 1, "")
+ , "");
+ LLVMBuildStore(
+ builder,
+ v_argnull0,
+ v_resnullp);
+ LLVMBuildStore(
+ builder,
+ v_arg0,
+ v_resvaluep);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ /* build block to invoke function and check result */
+ LLVMPositionBuilderAtEnd(builder, b_nonull);
+
+ v_retval = BuildFunctionCall(context, builder, mod, fcinfo, &v_fcinfo_isnull);
+
+ /* if the result is not NULL and the arguments are equal, return NULL */
+ v_argsequal = LLVMBuildAnd(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_fcinfo_isnull,
+ LLVMConstInt(LLVMInt8Type(), 0, false), ""),
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_retval,
+ LLVMConstInt(TypeSizeT, 1, false), ""),
+ "");
+ LLVMBuildCondBr(
+ builder,
+ v_argsequal,
+ b_argsequal,
+ b_hasnull);
+
+ /* build block setting result to NULL, if args are equal */
+ LLVMPositionBuilderAtEnd(builder, b_argsequal);
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(LLVMInt8Type(), 1, false),
+ v_resnullp);
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(TypeSizeT, 0, false),
+ v_resvaluep);
+ LLVMBuildStore(builder, v_retval, v_resvaluep);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_IOCOERCE:
+ {
+ FunctionCallInfo fcinfo_out, fcinfo_in;
+ LLVMValueRef v_fcinfo_out, v_fcinfo_in;
+ LLVMValueRef v_fn_addr_out, v_fn_addr_in;
+ LLVMValueRef v_fcinfo_in_isnullp;
+ LLVMValueRef v_in_argp, v_out_argp;
+ LLVMValueRef v_in_argnullp, v_out_argnullp;
+ LLVMValueRef v_retval;
+ LLVMValueRef v_resvalue;
+ LLVMValueRef v_resnull;
+
+ LLVMValueRef v_output_skip;
+ LLVMValueRef v_output;
+
+ LLVMValueRef v_argno;
+
+ LLVMBasicBlockRef b_skipoutput =
+ LLVMInsertBasicBlock(opblocks[i + 1], "skipoutputnull");
+ LLVMBasicBlockRef b_calloutput =
+ LLVMInsertBasicBlock(opblocks[i + 1], "calloutput");
+ LLVMBasicBlockRef b_input =
+ LLVMInsertBasicBlock(opblocks[i + 1], "input");
+ LLVMBasicBlockRef b_inputcall =
+ LLVMInsertBasicBlock(opblocks[i + 1], "inputcall");
+
+ fcinfo_out = op->d.iocoerce.fcinfo_data_out;
+ fcinfo_in = op->d.iocoerce.fcinfo_data_in;
+
+ v_fcinfo_out = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) fcinfo_out, false),
+ LLVMPointerType(StructFunctionCallInfoData, 0),
+ "v_fcinfo");
+
+ v_fcinfo_in = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) fcinfo_in, false),
+ LLVMPointerType(StructFunctionCallInfoData, 0),
+ "v_fcinfo");
+
+ v_fn_addr_out = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) fcinfo_out->flinfo->fn_addr, false),
+ LLVMPointerType(TypePGFunction, 0),
+ "v_fn_addr");
+
+ v_fn_addr_in = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) fcinfo_in->flinfo->fn_addr, false),
+ LLVMPointerType(TypePGFunction, 0),
+ "v_fn_addr");
+
+ v_fcinfo_in_isnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) &fcinfo_in->isnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_fcinfo_isnull");
+
+ v_out_argnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo_out->argnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_out_argnullp");
+
+ v_in_argnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo_in->argnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_in_argnullp");
+
+ v_out_argp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo_out->arg, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "v_out_arg");
+
+ v_in_argp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo_in->arg, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "v_in_arg");
+
+ /*
+ * If input is NULL, don't call output functions, as
+ * they're not called on NULL.
+ */
+ v_resnull = LLVMBuildLoad(builder, v_resnullp, "");
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_resnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ b_skipoutput,
+ b_calloutput);
+
+ LLVMPositionBuilderAtEnd(builder, b_skipoutput);
+ v_output_skip = LLVMConstInt(TypeSizeT, 0, false);
+ LLVMBuildBr(builder, b_input);
+
+ LLVMPositionBuilderAtEnd(builder, b_calloutput);
+ v_resvalue = LLVMBuildLoad(builder, v_resvaluep, "");
+ /* set arg[0] */
+ v_argno = LLVMConstInt(LLVMInt32Type(), 0, false);
+ LLVMBuildStore(
+ builder,
+ v_resvalue,
+ LLVMBuildGEP(builder, v_out_argp, &v_argno, 1, ""));
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(LLVMInt8Type(), 0, false),
+ LLVMBuildGEP(builder, v_out_argnullp, &v_argno, 1, ""));
+ /* and call output function (can never return NULL) */
+ v_output = LLVMBuildCall(
+ builder, v_fn_addr_out, &v_fcinfo_out, 1, "funccall_coerce_out");
+ LLVMBuildBr(builder, b_input);
+
+ /* build block handling input function call */
+ LLVMPositionBuilderAtEnd(builder, b_input);
+ {
+ LLVMValueRef incoming_values[] =
+ {v_output_skip, v_output};
+ LLVMBasicBlockRef incoming_blocks[] =
+ {b_skipoutput, b_calloutput};
+ v_output = LLVMBuildPhi(builder, TypeSizeT, "output");
+ LLVMAddIncoming(v_output,
+ incoming_values, incoming_blocks,
+ lengthof(incoming_blocks));
+
+ }
+
+ /* if input function is strict, skip if input string is NULL */
+ if (op->d.iocoerce.finfo_in->fn_strict)
+ {
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_output,
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ opblocks[i + 1],
+ b_inputcall);
+ }
+ else
+ {
+ LLVMBuildBr(builder, b_inputcall);
+ }
+
+ LLVMPositionBuilderAtEnd(builder, b_inputcall);
+ /* set arguments */
+ /* arg0: output */
+ v_argno = LLVMConstInt(LLVMInt32Type(), 0, false);
+ LLVMBuildStore(
+ builder,
+ v_output,
+ LLVMBuildGEP(builder, v_in_argp, &v_argno, 1, ""));
+ LLVMBuildStore(
+ builder,
+ v_resnull,
+ LLVMBuildGEP(builder, v_in_argnullp, &v_argno, 1, ""));
+
+ /* arg1: ioparam: preset in execExpr.c */
+ /* arg2: typmod: preset in execExpr.c */
+
+ /* reset fcinfo_in->isnull */
+ LLVMBuildStore(
+ builder, LLVMConstInt(LLVMInt8Type(), 0, false),
+ v_fcinfo_in_isnullp);
+ /* and call function */
+ v_retval = LLVMBuildCall(
+ builder, v_fn_addr_in, &v_fcinfo_in, 1,
+ "funccall_iocoerce_in");
+
+ LLVMBuildStore(builder, v_retval, v_resvaluep);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_DISTINCT:
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+
+ LLVMValueRef v_fcinfo_isnull;
+
+ LLVMValueRef v_argnullp;
+ LLVMValueRef v_argnull0, v_argisnull0;
+ LLVMValueRef v_argnull1, v_argisnull1;
+
+ LLVMValueRef v_anyargisnull;
+ LLVMValueRef v_bothargisnull;
+
+ LLVMValueRef v_argno;
+ LLVMValueRef v_result;
+
+ LLVMBasicBlockRef b_noargnull =
+ LLVMInsertBasicBlock(opblocks[i + 1], "nonull");
+ LLVMBasicBlockRef b_checkbothargnull =
+ LLVMInsertBasicBlock(opblocks[i + 1], "checkbothargnull");
+ LLVMBasicBlockRef b_bothargnull =
+ LLVMInsertBasicBlock(opblocks[i + 1], "bothargnull");
+ LLVMBasicBlockRef b_anyargnull =
+ LLVMInsertBasicBlock(opblocks[i + 1], "anyargnull");
+
+ v_argnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo->argnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_argnullp");
+
+ /* load argnull[0|1] for both arguments */
+ v_argno = LLVMConstInt(LLVMInt32Type(), 0, false);
+ v_argnull0 = LLVMBuildLoad(
+ builder,
+ LLVMBuildGEP(builder, v_argnullp, &v_argno, 1, "")
+ , "");
+ v_argisnull0 = LLVMBuildICmp(
+ builder, LLVMIntEQ, v_argnull0,
+ LLVMConstInt(LLVMInt8Type(), 1, false), "");
+
+ v_argno = LLVMConstInt(LLVMInt32Type(), 1, false);
+ v_argnull1 = LLVMBuildLoad(
+ builder,
+ LLVMBuildGEP(builder, v_argnullp, &v_argno, 1, "")
+ , "");
+ v_argisnull1 = LLVMBuildICmp(
+ builder, LLVMIntEQ, v_argnull1,
+ LLVMConstInt(LLVMInt8Type(), 1, false), "");
+
+ v_anyargisnull = LLVMBuildOr(
+ builder, v_argisnull0, v_argisnull1, "");
+ v_bothargisnull = LLVMBuildAnd(
+ builder, v_argisnull0, v_argisnull1, "");
+
+ /*
+ * Check function arguments for NULLness: If either is
+ * NULL, we check if both args are NULL. Otherwise call
+ * comparator.
+ */
+ LLVMBuildCondBr(
+ builder,
+ v_anyargisnull,
+ b_checkbothargnull,
+ b_noargnull);
+
+ /*
+ * build block checking if both args are null
+ */
+ LLVMPositionBuilderAtEnd(builder, b_checkbothargnull);
+ LLVMBuildCondBr(
+ builder,
+ v_bothargisnull,
+ b_bothargnull,
+ b_anyargnull);
+
+
+ /* Both NULL? Then is not distinct... */
+ LLVMPositionBuilderAtEnd(builder, b_bothargnull);
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(LLVMInt8Type(), 0, false),
+ v_resnullp);
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(TypeSizeT, 0, false),
+ v_resvaluep);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ /* Only one is NULL? Then is distinct... */
+ LLVMPositionBuilderAtEnd(builder, b_anyargnull);
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(LLVMInt8Type(), 0, false),
+ v_resnullp);
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(TypeSizeT, 1, false),
+ v_resvaluep);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ /* neither argument is null: compare */
+ LLVMPositionBuilderAtEnd(builder, b_noargnull);
+
+ v_result = BuildFunctionCall(context, builder, mod, fcinfo, &v_fcinfo_isnull);
+
+ /* Must invert result of "=" */
+ v_result = LLVMBuildZExt(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_result,
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ TypeSizeT, "");
+
+ LLVMBuildStore(
+ builder,
+ v_fcinfo_isnull,
+ v_resnullp);
+ LLVMBuildStore(
+ builder,
+ v_result,
+ v_resvaluep);
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_ROWCOMPARE_STEP:
+ {
+ FunctionCallInfo fcinfo = op->d.rowcompare_step.fcinfo_data;
+ LLVMValueRef v_fcinfo_isnull;
+
+ LLVMBasicBlockRef b_null =
+ LLVMInsertBasicBlock(opblocks[i + 1], "row-null");
+ LLVMBasicBlockRef b_compare =
+ LLVMInsertBasicBlock(opblocks[i + 1], "row-compare");
+ LLVMBasicBlockRef b_compare_result =
+ LLVMInsertBasicBlock(opblocks[i + 1], "row-compare-result");
+
+ LLVMValueRef v_retval;
+
+ /* if function is strict, and either arg is null, we're done */
+ if (op->d.rowcompare_step.finfo->fn_strict)
+ {
+ LLVMValueRef v_argnullp;
+ LLVMValueRef v_argnull0;
+ LLVMValueRef v_argnull1;
+ LLVMValueRef v_argno;
+ LLVMValueRef v_anyargisnull;
+
+ v_argnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo->argnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_argnullp");
+
+ v_argno = LLVMConstInt(LLVMInt32Type(), 0, false);
+ v_argnull0 = LLVMBuildLoad(
+ builder,
+ LLVMBuildGEP(builder, v_argnullp, &v_argno, 1, "")
+ , "");
+ v_argno = LLVMConstInt(LLVMInt32Type(), 1, false);
+ v_argnull1 = LLVMBuildLoad(
+ builder,
+ LLVMBuildGEP(builder, v_argnullp, &v_argno, 1, "")
+ , "");
+
+ v_anyargisnull = LLVMBuildOr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_argnull0,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_argnull1,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ "");
+
+ LLVMBuildCondBr(
+ builder,
+ v_anyargisnull,
+ b_null,
+ b_compare);
+ }
+ else
+ {
+ LLVMBuildBr(builder, b_compare);
+ }
+
+ /* build block invoking comparison function */
+ LLVMPositionBuilderAtEnd(builder, b_compare);
+
+ /* call function */
+ v_retval = BuildFunctionCall(context, builder, mod, fcinfo, &v_fcinfo_isnull);
+ LLVMBuildStore(builder, v_retval, v_resvaluep);
+
+ /* if result of function is NULL, force NULL result */
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_fcinfo_isnull,
+ LLVMConstInt(LLVMInt8Type(), 0, false), ""),
+ b_compare_result,
+ b_null);
+
+ /* build block analyzing the !NULL comparator result */
+ LLVMPositionBuilderAtEnd(builder, b_compare_result);
+
+ /* if results equal, compare next, otherwise done */
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_retval,
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ opblocks[i + 1],
+ opblocks[op->d.rowcompare_step.jumpdone]);
+
+ /* build block handling NULL input or NULL comparator result */
+ LLVMPositionBuilderAtEnd(builder, b_null);
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(LLVMInt8Type(), 1, false),
+ v_resnullp);
+ LLVMBuildBr(
+ builder,
+ opblocks[op->d.rowcompare_step.jumpnull]);
+
+ break;
+ }
+
+ case EEOP_ROWCOMPARE_FINAL:
+ {
+ RowCompareType rctype = op->d.rowcompare_final.rctype;
+
+ LLVMValueRef v_cmpresult;
+ LLVMValueRef v_result;
+ LLVMIntPredicate predicate;
+
+ /*
+ * Btree comparators return 32 bit results, need to be
+ * careful about sign (used as a 64 bit value it's
+ * otherwise wrong).
+ */
+ v_cmpresult = LLVMBuildTrunc(
+ builder,
+ LLVMBuildLoad(builder, v_resvaluep, ""),
+ LLVMInt32Type(), "");
+
+ switch (rctype)
+ {
+ /* EQ and NE cases aren't allowed here */
+ case ROWCOMPARE_LT:
+ predicate = LLVMIntSLT;
+ break;
+ case ROWCOMPARE_LE:
+ predicate = LLVMIntSLE;
+ break;
+ case ROWCOMPARE_GT:
+ predicate = LLVMIntSGT;
+ break;
+ case ROWCOMPARE_GE:
+ predicate = LLVMIntSGE;
+ break;
+ default:
+ Assert(false);
+ predicate = 0; /* prevent compiler warning */
+ break;
+ }
+
+ v_result = LLVMBuildZExt(
+ builder,
+ LLVMBuildICmp(
+ builder,
+ predicate,
+ v_cmpresult,
+ LLVMConstInt(LLVMInt32Type(), 0, false), ""),
+ TypeSizeT, "");
+
+ LLVMBuildStore(
+ builder,
+ LLVMConstInt(LLVMInt8Type(), 0, false),
+ v_resnullp);
+ LLVMBuildStore(
+ builder,
+ v_result,
+ v_resvaluep);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_MAKE_READONLY:
+ {
+ LLVMBasicBlockRef b_notnull;
+ LLVMValueRef v_params[1];
+ LLVMValueRef v_ret;
+ LLVMValueRef v_nullp;
+ LLVMValueRef v_valuep;
+ LLVMValueRef v_null;
+ LLVMValueRef v_value;
+
+ b_notnull = LLVMInsertBasicBlock(opblocks[i + 1], "readonly.notnull");
+
+ v_nullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) op->d.make_readonly.isnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+
+ v_null = LLVMBuildLoad(builder, v_nullp, "");
+
+ /* store isnull value in result */
+ LLVMBuildStore(
+ builder,
+ v_null,
+ v_resnullp);
+
+ /* check if value is NULL */
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_null,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ opblocks[i + 1], b_notnull);
+
+ /* if value is not null, convert to RO datum */
+ LLVMPositionBuilderAtEnd(builder, b_notnull);
+
+ v_valuep = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) op->d.make_readonly.value, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "");
+
+ v_value = LLVMBuildLoad(builder, v_valuep, "");
+
+ v_params[0] = v_value;
+ v_ret = LLVMBuildCall(builder, create_MakeExpandedObjectReadOnly(mod),
+ v_params, lengthof(v_params), "");
+ LLVMBuildStore(
+ builder,
+ v_ret,
+ v_resvaluep);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_FUNCEXPR_FUSAGE:
+ case EEOP_FUNCEXPR_STRICT_FUSAGE:
+
+ elog(ERROR, "unimplemented in jit: %zu", op->opcode);
+
+ case EEOP_LAST:
+ Assert(false);
+ break;
+ }
+ }
+
+ /*
+ * Don't emit the function immediately; instead do so the first time the
+ * expression is actually evaluated. That allows emitting many functions
+ * together, avoiding a lot of repeated LLVM and memory remapping
+ * overhead.
+ */
+ state->evalfunc = ExecRunCompiledExpr;
+
+ {
+ CompiledExprState *cstate = palloc0(sizeof(CompiledExprState));
+ cstate->context = context;
+ cstate->funcname = funcname;
+ state->evalfunc_private = cstate;
+ }
+
+ LLVMDisposeBuilder(builder);
+
+ return true;
+}
+
+#endif
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index a7b07827e0..fb5cfedb5d 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -67,7 +67,7 @@ static Datum fmgr_security_definer(PG_FUNCTION_ARGS);
* or name, but search by Oid is much faster.
*/
-static const FmgrBuiltin *
+const FmgrBuiltin *
fmgr_isbuiltin(Oid id)
{
int low = 0;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 2edc0b33c5..9a80ecedc2 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1020,6 +1020,17 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ {"jit_expressions", PGC_USERSET, DEVELOPER_OPTIONS,
+ gettext_noop("just-in-time compile expression evaluation"),
+ NULL,
+ GUC_NOT_IN_SAMPLE
+ },
+ &jit_expressions,
+ false,
+ NULL, NULL, NULL
+ },
+
#endif
{
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 0fbc112890..3919ac5598 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -595,7 +595,8 @@ typedef struct ArrayRefState
} ArrayRefState;
-extern void ExecReadyInterpretedExpr(ExprState *state);
+extern void ExecReadyInterpretedExpr(ExprState *state, PlanState *parent);
+extern bool ExecReadyCompiledExpr(ExprState *state, PlanState *parent);
extern ExprEvalOp ExecEvalStepOp(ExprState *state, ExprEvalStep *op);
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index f0601cb870..4de4bf4035 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -88,6 +88,10 @@ extern PGDLLIMPORT ExecutorEnd_hook_type ExecutorEnd_hook;
typedef bool (*ExecutorCheckPerms_hook_type) (List *, bool);
extern PGDLLIMPORT ExecutorCheckPerms_hook_type ExecutorCheckPerms_hook;
+/* GUC variables for JITing */
+#ifdef USE_LLVM
+extern bool jit_expressions;
+#endif
/*
* prototypes from functions in execAmi.c
diff --git a/src/include/lib/llvmjit.h b/src/include/lib/llvmjit.h
index 82b0b91c93..9711d398ca 100644
--- a/src/include/lib/llvmjit.h
+++ b/src/include/lib/llvmjit.h
@@ -72,9 +72,8 @@ extern void llvm_shutdown_orc_perf_support(LLVMOrcJITStackRef llvm_orc);
#else
-typedef struct LLVMJitContext
-{
-} LLVMJitContext;
+struct LLVMJitContext;
+typedef struct LLVMJitContext LLVMJitContext;
#endif /* USE_LLVM */
diff --git a/src/include/utils/fmgrtab.h b/src/include/utils/fmgrtab.h
index 6130ef8f9c..0a29198db0 100644
--- a/src/include/utils/fmgrtab.h
+++ b/src/include/utils/fmgrtab.h
@@ -36,4 +36,6 @@ extern const FmgrBuiltin fmgr_builtins[];
extern const int fmgr_nbuiltins; /* number of entries in table */
+extern const FmgrBuiltin *fmgr_isbuiltin(Oid id);
+
#endif /* FMGRTAB_H */
--
2.14.1.2.g4274c698f4.dirty
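
An aside on the EEOP_ROWCOMPARE_FINAL step in the patch above, which truncates the stored comparator result to 32 bits before the signed comparison. The following is a minimal plain-C sketch of why that matters, not the JIT code itself. The names datum_t, store_cmp_result, rowcompare_lt_naive and rowcompare_lt_trunc are hypothetical, and the sketch assumes the 32-bit result is stored zero-extended in a 64-bit word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a pointer-sized Datum on a 64-bit platform. */
typedef uint64_t datum_t;

/* Assumption: a 32-bit comparator result is stored zero-extended. */
static datum_t
store_cmp_result(int32_t cmp)
{
	return (datum_t) (uint32_t) cmp;
}

/* Wrong: interprets the whole 64-bit word as a signed value. */
static bool
rowcompare_lt_naive(datum_t d)
{
	return (int64_t) d < 0;
}

/*
 * Right: truncate to 32 bits first, then compare as signed.  The
 * uint32_t -> int32_t conversion is implementation-defined before C23,
 * but wraps as expected on common compilers.
 */
static bool
rowcompare_lt_trunc(datum_t d)
{
	return (int32_t) (uint32_t) d < 0;
}
```

With a comparator result of -1, the naive variant sees a large positive number, while the truncating variant recovers the sign. That is the same effect the LLVMBuildTrunc to LLVMInt32Type() followed by a signed predicate is after.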
0009-Simplify-aggregate-code-a-bit.patch (text/x-diff; charset=us-ascii)
From 31f06781bf6a53de34c20fbaaed0d81367e0e7f4 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 3 Aug 2017 15:23:40 -0700
Subject: [PATCH 09/16] Simplify aggregate code a bit.
---
src/backend/executor/nodeAgg.c | 94 ++++++++++++++++++++----------------------
src/include/nodes/execnodes.h | 6 ++-
2 files changed, 48 insertions(+), 52 deletions(-)
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 1783f38f14..7e521459d6 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -522,13 +522,13 @@ static void select_current_set(AggState *aggstate, int setno, bool is_hash);
static void initialize_phase(AggState *aggstate, int newphase);
static TupleTableSlot *fetch_input_tuple(AggState *aggstate);
static void initialize_aggregates(AggState *aggstate,
- AggStatePerGroup pergroup,
- int numReset);
+ AggStatePerGroup *pergroups,
+ bool isHash, int numReset);
static void advance_transition_function(AggState *aggstate,
AggStatePerTrans pertrans,
AggStatePerGroup pergroupstate);
-static void advance_aggregates(AggState *aggstate, AggStatePerGroup pergroup,
- AggStatePerGroup *pergroups);
+static void advance_aggregates(AggState *aggstate, AggStatePerGroup *sort_pergroups,
+ AggStatePerGroup *hash_pergroups);
static void advance_combine_function(AggState *aggstate,
AggStatePerTrans pertrans,
AggStatePerGroup pergroupstate);
@@ -782,15 +782,14 @@ initialize_aggregate(AggState *aggstate, AggStatePerTrans pertrans,
* If there are multiple grouping sets, we initialize only the first numReset
* of them (the grouping sets are ordered so that the most specific one, which
* is reset most often, is first). As a convenience, if numReset is 0, we
- * reinitialize all sets. numReset is -1 to initialize a hashtable entry, in
- * which case the caller must have used select_current_set appropriately.
+ * reinitialize all sets.
*
* When called, CurrentMemoryContext should be the per-query context.
*/
static void
initialize_aggregates(AggState *aggstate,
- AggStatePerGroup pergroup,
- int numReset)
+ AggStatePerGroup *pergroups,
+ bool isHash, int numReset)
{
int transno;
int numGroupingSets = Max(aggstate->phase->numsets, 1);
@@ -801,31 +800,19 @@ initialize_aggregates(AggState *aggstate,
if (numReset == 0)
numReset = numGroupingSets;
- for (transno = 0; transno < numTrans; transno++)
+ for (setno = 0; setno < numReset; setno++)
{
- AggStatePerTrans pertrans = &transstates[transno];
+ AggStatePerGroup pergroup = pergroups[setno];
- if (numReset < 0)
+ select_current_set(aggstate, setno, isHash);
+
+ for (transno = 0; transno < numTrans; transno++)
{
- AggStatePerGroup pergroupstate;
-
- pergroupstate = &pergroup[transno];
+ AggStatePerTrans pertrans = &transstates[transno];
+ AggStatePerGroup pergroupstate = &pergroup[transno];
initialize_aggregate(aggstate, pertrans, pergroupstate);
}
- else
- {
- for (setno = 0; setno < numReset; setno++)
- {
- AggStatePerGroup pergroupstate;
-
- pergroupstate = &pergroup[transno + (setno * numTrans)];
-
- select_current_set(aggstate, setno, false);
-
- initialize_aggregate(aggstate, pertrans, pergroupstate);
- }
- }
}
}
@@ -965,7 +952,7 @@ advance_transition_function(AggState *aggstate,
* When called, CurrentMemoryContext should be the per-query context.
*/
static void
-advance_aggregates(AggState *aggstate, AggStatePerGroup pergroup, AggStatePerGroup *pergroups)
+advance_aggregates(AggState *aggstate, AggStatePerGroup *sort_pergroups, AggStatePerGroup *hash_pergroups)
{
int transno;
int setno = 0;
@@ -1002,7 +989,7 @@ advance_aggregates(AggState *aggstate, AggStatePerGroup pergroup, AggStatePerGro
{
/* DISTINCT and/or ORDER BY case */
Assert(slot->tts_nvalid >= (pertrans->numInputs + inputoff));
- Assert(!pergroups);
+ Assert(!hash_pergroups);
/*
* If the transfn is strict, we want to check for nullity before
@@ -1063,9 +1050,9 @@ advance_aggregates(AggState *aggstate, AggStatePerGroup pergroup, AggStatePerGro
fcinfo->argnull[i + 1] = slot->tts_isnull[i + inputoff];
}
- if (pergroup)
+ if (sort_pergroups)
{
- /* advance transition states for ordered grouping */
+ /* advance transition states for ordered grouping */
for (setno = 0; setno < numGroupingSets; setno++)
{
@@ -1073,13 +1060,13 @@ advance_aggregates(AggState *aggstate, AggStatePerGroup pergroup, AggStatePerGro
select_current_set(aggstate, setno, false);
- pergroupstate = &pergroup[transno + (setno * numTrans)];
+ pergroupstate = &sort_pergroups[setno][transno];
advance_transition_function(aggstate, pertrans, pergroupstate);
}
}
- if (pergroups)
+ if (hash_pergroups)
{
/* advance transition states for hashed grouping */
@@ -1089,7 +1076,7 @@ advance_aggregates(AggState *aggstate, AggStatePerGroup pergroup, AggStatePerGro
select_current_set(aggstate, setno, true);
- pergroupstate = &pergroups[setno][transno];
+ pergroupstate = &hash_pergroups[setno][transno];
advance_transition_function(aggstate, pertrans, pergroupstate);
}
@@ -2061,8 +2048,8 @@ lookup_hash_entry(AggState *aggstate)
MemoryContextAlloc(perhash->hashtable->tablecxt,
sizeof(AggStatePerGroupData) * aggstate->numtrans);
/* initialize aggregates for new tuple group */
- initialize_aggregates(aggstate, (AggStatePerGroup) entry->additional,
- -1);
+ initialize_aggregates(aggstate, (AggStatePerGroup*) &entry->additional,
+ true, 1);
}
return entry;
@@ -2146,7 +2133,7 @@ agg_retrieve_direct(AggState *aggstate)
ExprContext *econtext;
ExprContext *tmpcontext;
AggStatePerAgg peragg;
- AggStatePerGroup pergroup;
+ AggStatePerGroup *pergroups;
AggStatePerGroup *hash_pergroups = NULL;
TupleTableSlot *outerslot;
TupleTableSlot *firstSlot;
@@ -2169,7 +2156,7 @@ agg_retrieve_direct(AggState *aggstate)
tmpcontext = aggstate->tmpcontext;
peragg = aggstate->peragg;
- pergroup = aggstate->pergroup;
+ pergroups = aggstate->pergroups;
firstSlot = aggstate->ss.ss_ScanTupleSlot;
/*
@@ -2371,7 +2358,7 @@ agg_retrieve_direct(AggState *aggstate)
/*
* Initialize working state for a new input tuple group.
*/
- initialize_aggregates(aggstate, pergroup, numReset);
+ initialize_aggregates(aggstate, pergroups, false, numReset);
if (aggstate->grp_firstTuple != NULL)
{
@@ -2408,9 +2395,9 @@ agg_retrieve_direct(AggState *aggstate)
hash_pergroups = NULL;
if (DO_AGGSPLIT_COMBINE(aggstate->aggsplit))
- combine_aggregates(aggstate, pergroup);
+ combine_aggregates(aggstate, pergroups[0]);
else
- advance_aggregates(aggstate, pergroup, hash_pergroups);
+ advance_aggregates(aggstate, pergroups, hash_pergroups);
/* Reset per-input-tuple context after each tuple */
ResetExprContext(tmpcontext);
@@ -2474,7 +2461,7 @@ agg_retrieve_direct(AggState *aggstate)
finalize_aggregates(aggstate,
peragg,
- pergroup + (currentSet * aggstate->numtrans));
+ pergroups[currentSet]);
/*
* If there's no row to project right now, we must continue rather
@@ -2715,7 +2702,7 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
aggstate->curpertrans = NULL;
aggstate->input_done = false;
aggstate->agg_done = false;
- aggstate->pergroup = NULL;
+ aggstate->pergroups = NULL;
aggstate->grp_firstTuple = NULL;
aggstate->sort_in = NULL;
aggstate->sort_out = NULL;
@@ -3019,13 +3006,17 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
if (node->aggstrategy != AGG_HASHED)
{
- AggStatePerGroup pergroup;
+ AggStatePerGroup *pergroups =
+ (AggStatePerGroup*) palloc0(sizeof(AggStatePerGroup)
+ * numGroupingSets);
- pergroup = (AggStatePerGroup) palloc0(sizeof(AggStatePerGroupData)
- * numaggs
- * numGroupingSets);
+ for (i = 0; i < numGroupingSets; i++)
+ {
+ pergroups[i] = (AggStatePerGroup) palloc0(sizeof(AggStatePerGroupData)
+ * numaggs);
+ }
- aggstate->pergroup = pergroup;
+ aggstate->pergroups = pergroups;
}
/*
@@ -3988,8 +3979,11 @@ ExecReScanAgg(AggState *node)
/*
* Reset the per-group state (in particular, mark transvalues null)
*/
- MemSet(node->pergroup, 0,
- sizeof(AggStatePerGroupData) * node->numaggs * numGroupingSets);
+ for (setno = 0; setno < numGroupingSets; setno++)
+ {
+ MemSet(node->pergroups[setno], 0,
+ sizeof(AggStatePerGroupData) * node->numaggs);
+ }
/* reset to phase 1 */
initialize_phase(node, 1);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 8ae8179ee7..bc5874f1ee 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1823,13 +1823,15 @@ typedef struct AggState
Tuplesortstate *sort_out; /* input is copied here for next phase */
TupleTableSlot *sort_slot; /* slot for sort results */
/* these fields are used in AGG_PLAIN and AGG_SORTED modes: */
- AggStatePerGroup pergroup; /* per-Aggref-per-group working state */
+ AggStatePerGroup *pergroups; /* grouping set indexed array of per-group
+ * pointers */
HeapTuple grp_firstTuple; /* copy of first tuple of current group */
/* these fields are used in AGG_HASHED and AGG_MIXED modes: */
bool table_filled; /* hash table filled yet? */
int num_hashes;
AggStatePerHash perhash;
- AggStatePerGroup *hash_pergroup; /* array of per-group pointers */
+ AggStatePerGroup *hash_pergroup; /* grouping set indexed array of
+ * per-group pointers */
/* support for evaluation of agg inputs */
TupleTableSlot *evalslot; /* slot for agg inputs */
ProjectionInfo *evalproj; /* projection machinery */
--
2.14.1.2.g4274c698f4.dirty
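
To illustrate the layout change in the patch above: the sorted-grouping state moves from one flat array indexed as transno + setno * numTrans to one array per grouping set, reached through an array of pointers (matching what the hashed case already did). Below is a reduced sketch under stated assumptions. PerGroupData, flat_lookup, alloc_pergroups and free_pergroups are hypothetical stand-ins, and calloc/free stand in for palloc0 and memory-context cleanup.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-in for AggStatePerGroupData. */
typedef struct PerGroupData
{
	long		transValue;
} PerGroupData;

/* Old layout: one flat array, indexed as transno + setno * numTrans. */
static PerGroupData *
flat_lookup(PerGroupData *pergroup, int numTrans, int setno, int transno)
{
	return &pergroup[transno + (setno * numTrans)];
}

/* New layout: one zeroed array per grouping set, behind a pointer array. */
static PerGroupData **
alloc_pergroups(int numSets, int numTrans)
{
	PerGroupData **pergroups = calloc((size_t) numSets, sizeof(PerGroupData *));

	if (pergroups == NULL)
		return NULL;

	for (int setno = 0; setno < numSets; setno++)
	{
		pergroups[setno] = calloc((size_t) numTrans, sizeof(PerGroupData));
		if (pergroups[setno] == NULL)
		{
			/* undo the allocations made so far */
			while (setno-- > 0)
				free(pergroups[setno]);
			free(pergroups);
			return NULL;
		}
	}
	return pergroups;
}

static void
free_pergroups(PerGroupData **pergroups, int numSets)
{
	for (int setno = 0; setno < numSets; setno++)
		free(pergroups[setno]);
	free(pergroups);
}
```

With the second layout a caller writes pergroups[setno][transno], and a single set can be handed to a function as pergroups[setno] without pointer arithmetic at the call site.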
0010-More-efficient-AggState-pertrans-iteration.patch (text/x-diff; charset=us-ascii)
From 86ab48144bda740ba2b3781894abf9cfa939eb43 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Mon, 13 Mar 2017 20:22:10 -0700
Subject: [PATCH 10/16] More efficient AggState->pertrans iteration.
Turns out AggStatePerTrans is so large that multiplications are needed
to access elements of AggState->pertrans on x86.
Author: Andres Freund
---
src/backend/executor/nodeAgg.c | 27 +++++++++++++++------------
1 file changed, 15 insertions(+), 12 deletions(-)
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 7e521459d6..291f15fd94 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -960,14 +960,15 @@ advance_aggregates(AggState *aggstate, AggStatePerGroup *sort_pergroups, AggStat
int numHashes = aggstate->num_hashes;
int numTrans = aggstate->numtrans;
TupleTableSlot *slot = aggstate->evalslot;
+ AggStatePerTrans pertrans;
/* compute input for all aggregates */
if (aggstate->evalproj)
aggstate->evalslot = ExecProject(aggstate->evalproj);
- for (transno = 0; transno < numTrans; transno++)
+ for (transno = 0, pertrans = &aggstate->pertrans[0];
+ transno < numTrans; transno++, pertrans++)
{
- AggStatePerTrans pertrans = &aggstate->pertrans[transno];
ExprState *filter = pertrans->aggfilter;
int numTransInputs = pertrans->numTransInputs;
int i;
@@ -1098,6 +1099,7 @@ combine_aggregates(AggState *aggstate, AggStatePerGroup pergroup)
int transno;
int numTrans = aggstate->numtrans;
TupleTableSlot *slot;
+ AggStatePerTrans pertrans;
/* combine not supported with grouping sets */
Assert(aggstate->phase->numsets <= 1);
@@ -1105,9 +1107,9 @@ combine_aggregates(AggState *aggstate, AggStatePerGroup pergroup)
/* compute input for all aggregates */
slot = ExecProject(aggstate->evalproj);
- for (transno = 0; transno < numTrans; transno++)
+ for (transno = 0, pertrans = &aggstate->pertrans[0];
+ transno < numTrans; transno++, pertrans++)
{
- AggStatePerTrans pertrans = &aggstate->pertrans[transno];
AggStatePerGroup pergroupstate = &pergroup[transno];
FunctionCallInfo fcinfo = &pertrans->transfn_fcinfo;
int inputoff = pertrans->inputoff;
@@ -2659,6 +2661,7 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
AggState *aggstate;
AggStatePerAgg peraggs;
AggStatePerTrans pertransstates;
+ AggStatePerTrans pertrans;
Plan *outerPlan;
ExprContext *econtext;
int numaggs,
@@ -3349,9 +3352,9 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
*/
combined_inputeval = NIL;
column_offset = 0;
- for (transno = 0; transno < aggstate->numtrans; transno++)
+ for (transno = 0, pertrans = &pertransstates[0];
+ transno < aggstate->numtrans; transno++, pertrans++)
{
- AggStatePerTrans pertrans = &pertransstates[transno];
ListCell *arg;
pertrans->inputoff = column_offset;
@@ -3842,6 +3845,7 @@ ExecEndAgg(AggState *node)
int transno;
int numGroupingSets = Max(node->maxsets, 1);
int setno;
+ AggStatePerTrans pertrans;
/* Make sure we have closed any open tuplesorts */
@@ -3850,10 +3854,9 @@ ExecEndAgg(AggState *node)
if (node->sort_out)
tuplesort_end(node->sort_out);
- for (transno = 0; transno < node->numtrans; transno++)
+ for (transno = 0, pertrans = &node->pertrans[0];
+ transno < node->numtrans; transno++, pertrans++)
{
- AggStatePerTrans pertrans = &node->pertrans[transno];
-
for (setno = 0; setno < numGroupingSets; setno++)
{
if (pertrans->sortstates[setno])
@@ -3890,6 +3893,7 @@ ExecReScanAgg(AggState *node)
int transno;
int numGroupingSets = Max(node->maxsets, 1);
int setno;
+ AggStatePerTrans pertrans;
node->agg_done = false;
@@ -3921,12 +3925,11 @@ ExecReScanAgg(AggState *node)
}
/* Make sure we have closed any open tuplesorts */
- for (transno = 0; transno < node->numtrans; transno++)
+ for (transno = 0, pertrans = &node->pertrans[0];
+ transno < node->numtrans; transno++, pertrans++)
{
for (setno = 0; setno < numGroupingSets; setno++)
{
- AggStatePerTrans pertrans = &node->pertrans[transno];
-
if (pertrans->sortstates[setno])
{
tuplesort_end(pertrans->sortstates[setno]);
--
2.14.1.2.g4274c698f4.dirty
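
The loop rewrite in the patch above can be sketched in isolation as follows. This is only an illustration: PerTrans is a hypothetical stand-in, padded so that its size is not a power of two, and both function names are made up. Both loops compute the same result. Whether the pointer-walking form is actually faster depends on the compiler and the surrounding code, since optimizers often perform this strength reduction on their own.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a large per-transition struct.  The padding
 * makes sizeof(PerTrans) a non-power-of-two, so indexing needs a multiply
 * rather than a shift.
 */
typedef struct PerTrans
{
	int			numInputs;
	char		padding[1000];
} PerTrans;

/* Indexed form: each access computes base + transno * sizeof(PerTrans). */
static int
sum_inputs_indexed(PerTrans *pertranses, int numTrans)
{
	int			total = 0;

	for (int transno = 0; transno < numTrans; transno++)
		total += pertranses[transno].numInputs;
	return total;
}

/* Pointer-walking form: advance the element pointer alongside the index. */
static int
sum_inputs_pointer(PerTrans *pertranses, int numTrans)
{
	int			total = 0;
	int			transno;
	PerTrans   *pertrans;

	for (transno = 0, pertrans = &pertranses[0];
		 transno < numTrans; transno++, pertrans++)
		total += pertrans->numInputs;
	return total;
}
```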
0011-Avoid-dereferencing-tts_values-nulls-repeatedly.patch (text/x-diff; charset=us-ascii)
From 87d95e24295df49fd9b64da275385bc1c775ae2d Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Mon, 10 Jul 2017 15:07:32 -0700
Subject: [PATCH 11/16] Avoid dereferencing tts_values/nulls repeatedly.
Author:
Reviewed-By:
Discussion: https://postgr.es/m/
Backpatch:
---
src/backend/executor/nodeAgg.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 291f15fd94..a63c05cb68 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -960,6 +960,8 @@ advance_aggregates(AggState *aggstate, AggStatePerGroup *sort_pergroups, AggStat
int numHashes = aggstate->num_hashes;
int numTrans = aggstate->numtrans;
TupleTableSlot *slot = aggstate->evalslot;
+ Datum *values = slot->tts_values;
+ bool *nulls = slot->tts_isnull;
AggStatePerTrans pertrans;
/* compute input for all aggregates */
@@ -1015,8 +1017,7 @@ advance_aggregates(AggState *aggstate, AggStatePerGroup *sort_pergroups, AggStat
/* OK, put the tuple into the tuplesort object */
if (pertrans->numInputs == 1)
tuplesort_putdatum(pertrans->sortstates[setno],
- slot->tts_values[inputoff],
- slot->tts_isnull[inputoff]);
+ values[inputoff], nulls[inputoff]);
else
{
/*
@@ -1025,10 +1026,10 @@ advance_aggregates(AggState *aggstate, AggStatePerGroup *sort_pergroups, AggStat
*/
ExecClearTuple(pertrans->sortslot);
memcpy(pertrans->sortslot->tts_values,
- &slot->tts_values[inputoff],
+ &values[inputoff],
pertrans->numInputs * sizeof(Datum));
memcpy(pertrans->sortslot->tts_isnull,
- &slot->tts_isnull[inputoff],
+ &nulls[inputoff],
pertrans->numInputs * sizeof(bool));
pertrans->sortslot->tts_nvalid = pertrans->numInputs;
ExecStoreVirtualTuple(pertrans->sortslot);
@@ -1047,8 +1048,8 @@ advance_aggregates(AggState *aggstate, AggStatePerGroup *sort_pergroups, AggStat
for (i = 0; i < numTransInputs; i++)
{
- fcinfo->arg[i + 1] = slot->tts_values[i + inputoff];
- fcinfo->argnull[i + 1] = slot->tts_isnull[i + inputoff];
+ fcinfo->arg[i + 1] = values[i + inputoff];
+ fcinfo->argnull[i + 1] = nulls[i + inputoff];
}
if (sort_pergroups)
--
2.14.1.2.g4274c698f4.dirty
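
A reduced sketch of the idea in the patch above: load the array pointers out of the slot once and index the locals inside the loop. The likely motivation (an assumption on my part, the commit message does not say) is that the compiler cannot always prove that stores through arg/argnull leave slot->values and slot->isnull unchanged, so it may reload them on every iteration. Datum, Slot and copy_args are hypothetical stand-ins here, not the PostgreSQL definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for Datum and TupleTableSlot. */
typedef uintptr_t Datum;

typedef struct Slot
{
	Datum	   *values;
	bool	   *isnull;
} Slot;

/*
 * Copy n inputs starting at inputoff into arg[1..n] / argnull[1..n],
 * leaving slot 0 alone (in the patch that position holds the transition
 * value).  The slot's pointers are read once, before the loop.
 */
static void
copy_args(const Slot *slot, int inputoff, int n, Datum *arg, bool *argnull)
{
	Datum	   *values = slot->values;
	bool	   *nulls = slot->isnull;

	for (int i = 0; i < n; i++)
	{
		arg[i + 1] = values[i + inputoff];
		argnull[i + 1] = nulls[i + inputoff];
	}
}
```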
0012-Centralize-slot-deforming-logic-a-bit.patch (text/x-diff; charset=us-ascii)
From 4db1d9b56b28ea81c542d2276c3b494a86eb75dc Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Fri, 4 Aug 2017 15:06:29 -0700
Subject: [PATCH 12/16] Centralize slot deforming logic a bit.
Author:
Reviewed-By:
Discussion: https://postgr.es/m/
Backpatch:
---
src/backend/access/common/heaptuple.c | 148 +++++++++++-----------------------
1 file changed, 47 insertions(+), 101 deletions(-)
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 13ee528e26..f77ea477fb 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -1046,6 +1046,7 @@ slot_deform_tuple(TupleTableSlot *slot, int natts)
long off; /* offset in tuple data */
bits8 *bp = tup->t_bits; /* ptr to null bitmap in tuple */
bool slow; /* can we use/set attcacheoff? */
+ int valnatts = natts;
/*
* Check whether the first call for this tuple, and initialize or restore
@@ -1065,6 +1066,9 @@ slot_deform_tuple(TupleTableSlot *slot, int natts)
slow = slot->tts_slow;
}
+
+ natts = Min(natts, Min(HeapTupleHeaderGetNatts(tuple->t_data), slot->tts_tupleDescriptor->natts));
+
tp = (char *) tup + tup->t_hoff;
for (; attnum < natts; attnum++)
@@ -1118,10 +1122,16 @@ slot_deform_tuple(TupleTableSlot *slot, int natts)
slow = true; /* can't use attcacheoff anymore */
}
+ for (; attnum < valnatts; attnum++)
+ {
+ values[attnum] = 0;
+ isnull[attnum] = 1;
+ }
+
/*
* Save state for next execution
*/
- slot->tts_nvalid = attnum;
+ slot->tts_nvalid = valnatts;
slot->tts_off = off;
slot->tts_slow = slow;
}
@@ -1142,46 +1152,38 @@ Datum
slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull)
{
HeapTuple tuple = slot->tts_tuple;
- TupleDesc tupleDesc = slot->tts_tupleDescriptor;
- HeapTupleHeader tup;
+ TupleDesc tupleDesc PG_USED_FOR_ASSERTS_ONLY = slot->tts_tupleDescriptor;
/*
* system attributes are handled by heap_getsysattr
*/
- if (attnum <= 0)
+ if (unlikely(attnum <= 0))
{
- if (tuple == NULL) /* internal error */
- elog(ERROR, "cannot extract system attribute from virtual tuple");
- if (tuple == &(slot->tts_minhdr)) /* internal error */
- elog(ERROR, "cannot extract system attribute from minimal tuple");
+
+ /* cannot extract system attribute from virtual tuple */
+ Assert(tuple);
+ /* cannot extract system attribute from minimal tuple */
+ Assert(tuple != &(slot->tts_minhdr));
return heap_getsysattr(tuple, attnum, tupleDesc, isnull);
}
/*
* fast path if desired attribute already cached
*/
- if (attnum <= slot->tts_nvalid)
+ if (likely(attnum <= slot->tts_nvalid))
{
*isnull = slot->tts_isnull[attnum - 1];
return slot->tts_values[attnum - 1];
}
/*
- * return NULL if attnum is out of range according to the tupdesc
+ * While tuples might possibly be wider than the slot, attributes beyond
+ * the slot's descriptor should never be accessed. We used to return NULL
+ * in that case, but that a) isn't free and b) seems more likely to hide bugs.
*/
- if (attnum > tupleDesc->natts)
- {
- *isnull = true;
- return (Datum) 0;
- }
-
- /*
- * otherwise we had better have a physical tuple (tts_nvalid should equal
- * natts in all virtual-tuple cases)
- */
- if (tuple == NULL) /* internal error */
- elog(ERROR, "cannot extract attribute from empty tuple slot");
+ Assert(attnum <= tupleDesc->natts);
+#ifdef NOT_ANYMORE
/*
* return NULL if attnum is out of range according to the tuple
*
@@ -1195,26 +1197,13 @@ slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull)
*isnull = true;
return (Datum) 0;
}
+#endif
/*
- * check if target attribute is null: no point in groveling through tuple
+ * otherwise we had better have a physical tuple (tts_nvalid should equal
+ * natts in all virtual-tuple cases)
*/
- if (HeapTupleHasNulls(tuple) && att_isnull(attnum - 1, tup->t_bits))
- {
- *isnull = true;
- return (Datum) 0;
- }
-
- /*
- * If the attribute's column has been dropped, we force a NULL result.
- * This case should not happen in normal use, but it could happen if we
- * are executing a plan cached before the column was dropped.
- */
- if (TupleDescAttr(tupleDesc, attnum - 1)->attisdropped)
- {
- *isnull = true;
- return (Datum) 0;
- }
+ Assert(tuple != NULL);
/*
* Extract the attribute, along with any preceding attributes.
@@ -1238,8 +1227,7 @@ void
slot_getallattrs(TupleTableSlot *slot)
{
int tdesc_natts = slot->tts_tupleDescriptor->natts;
- int attnum;
- HeapTuple tuple;
+ HeapTuple tuple PG_USED_FOR_ASSERTS_ONLY;
/* Quick out if we have 'em all already */
if (slot->tts_nvalid == tdesc_natts)
@@ -1250,27 +1238,10 @@ slot_getallattrs(TupleTableSlot *slot)
* natts in all virtual-tuple cases)
*/
tuple = slot->tts_tuple;
- if (tuple == NULL) /* internal error */
- elog(ERROR, "cannot extract attribute from empty tuple slot");
+ Assert(tuple != NULL);
- /*
- * load up any slots available from physical tuple
- */
- attnum = HeapTupleHeaderGetNatts(tuple->t_data);
- attnum = Min(attnum, tdesc_natts);
-
- slot_deform_tuple(slot, attnum);
-
- /*
- * If tuple doesn't have all the atts indicated by tupleDesc, read the
- * rest as null
- */
- for (; attnum < tdesc_natts; attnum++)
- {
- slot->tts_values[attnum] = (Datum) 0;
- slot->tts_isnull[attnum] = true;
- }
- slot->tts_nvalid = tdesc_natts;
+ slot_deform_tuple(slot, tdesc_natts);
+ Assert(tdesc_natts <= slot->tts_nvalid);
}
/*
@@ -1281,43 +1252,22 @@ slot_getallattrs(TupleTableSlot *slot)
void
slot_getsomeattrs(TupleTableSlot *slot, int attnum)
{
- HeapTuple tuple;
- int attno;
-
/* Quick out if we have 'em all already */
if (slot->tts_nvalid >= attnum)
return;
/* Check for caller error */
- if (attnum <= 0 || attnum > slot->tts_tupleDescriptor->natts)
- elog(ERROR, "invalid attribute number %d", attnum);
+ Assert(attnum > 0);
+ Assert(attnum <= slot->tts_tupleDescriptor->natts);
/*
* otherwise we had better have a physical tuple (tts_nvalid should equal
* natts in all virtual-tuple cases)
*/
- tuple = slot->tts_tuple;
- if (tuple == NULL) /* internal error */
- elog(ERROR, "cannot extract attribute from empty tuple slot");
+ Assert(slot->tts_tuple != NULL); /* internal error */
- /*
- * load up any slots available from physical tuple
- */
- attno = HeapTupleHeaderGetNatts(tuple->t_data);
- attno = Min(attno, attnum);
-
- slot_deform_tuple(slot, attno);
-
- /*
- * If tuple doesn't have all the atts indicated by tupleDesc, read the
- * rest as null
- */
- for (; attno < attnum; attno++)
- {
- slot->tts_values[attno] = (Datum) 0;
- slot->tts_isnull[attno] = true;
- }
- slot->tts_nvalid = attnum;
+ slot_deform_tuple(slot, attnum);
+ Assert(attnum <= slot->tts_nvalid);
}
/*
@@ -1329,38 +1279,34 @@ bool
slot_attisnull(TupleTableSlot *slot, int attnum)
{
HeapTuple tuple = slot->tts_tuple;
- TupleDesc tupleDesc = slot->tts_tupleDescriptor;
+ TupleDesc tupleDesc PG_USED_FOR_ASSERTS_ONLY = slot->tts_tupleDescriptor;
/*
* system attributes are handled by heap_attisnull
*/
- if (attnum <= 0)
+ if (unlikely(attnum <= 0))
{
- if (tuple == NULL) /* internal error */
- elog(ERROR, "cannot extract system attribute from virtual tuple");
- if (tuple == &(slot->tts_minhdr)) /* internal error */
- elog(ERROR, "cannot extract system attribute from minimal tuple");
+ /* cannot extract system attribute from virtual tuple */
+ Assert(tuple);
+ /* cannot extract system attribute from minimal tuple */
+ Assert(tuple != &(slot->tts_minhdr));
return heap_attisnull(tuple, attnum);
}
/*
* fast path if desired attribute already cached
*/
- if (attnum <= slot->tts_nvalid)
+ if (likely(attnum <= slot->tts_nvalid))
return slot->tts_isnull[attnum - 1];
- /*
- * return NULL if attnum is out of range according to the tupdesc
- */
- if (attnum > tupleDesc->natts)
- return true;
+ /* Check for caller error */
+ Assert(attnum <= tupleDesc->natts);
/*
* otherwise we had better have a physical tuple (tts_nvalid should equal
* natts in all virtual-tuple cases)
*/
- if (tuple == NULL) /* internal error */
- elog(ERROR, "cannot extract attribute from empty tuple slot");
+ Assert(tuple != NULL);
/* and let the tuple tell it */
return heap_attisnull(tuple, attnum);
--
2.14.1.2.g4274c698f4.dirty
0013-WIP-Make-scan-desc-available-for-all-PlanStates.patch
From 7fb3295812398f5a70f3b48f89d4b252b05755c4 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 31 Aug 2017 11:39:18 -0700
Subject: [PATCH 13/16] WIP: Make scan desc available for all PlanStates.
This is to allow JITing tuple deforming.
---
src/backend/executor/execTuples.c | 1 +
src/include/nodes/execnodes.h | 3 +++
2 files changed, 4 insertions(+)
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 8280b89f7f..78ec871f50 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -866,6 +866,7 @@ ExecInitScanTupleSlot(EState *estate, ScanState *scanstate, TupleDesc tupledesc)
{
scanstate->ss_ScanTupleSlot = ExecAllocTableSlot(&estate->es_tupleTable,
tupledesc);
+ scanstate->ps.scandesc = tupledesc;
}
/* ----------------
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index bc5874f1ee..b0c4856392 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -869,6 +869,9 @@ typedef struct PlanState
ExprState *qual; /* boolean qual condition */
struct PlanState *lefttree; /* input plan tree(s) */
struct PlanState *righttree;
+
+ TupleDesc scandesc;
+
List *initPlan; /* Init SubPlanState nodes (un-correlated expr
* subselects) */
List *subPlan; /* SubPlanState nodes in my expressions */
--
2.14.1.2.g4274c698f4.dirty
0014-WIP-JITed-tuple-deforming.patch
From af483065afd0c21a33321332abaab3823d9d4285 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 31 Aug 2017 11:40:26 -0700
Subject: [PATCH 14/16] WIP: JITed tuple deforming.
---
src/backend/access/common/heaptuple.c | 660 +++++++++++++++++++++++++++++++++
src/backend/executor/execExprCompile.c | 36 ++
src/backend/executor/execTuples.c | 5 +
src/backend/lib/llvmjit.c | 2 +-
src/backend/utils/misc/guc.c | 12 +
src/include/executor/executor.h | 1 +
src/include/executor/tuptable.h | 2 +-
src/include/lib/llvmjit.h | 6 +
8 files changed, 722 insertions(+), 2 deletions(-)
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index f77ea477fb..0e552fb49a 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -60,7 +60,11 @@
#include "access/sysattr.h"
#include "access/tuptoaster.h"
#include "executor/tuptable.h"
+#include "nodes/execnodes.h"
#include "utils/expandeddatum.h"
+#include "utils/memutils.h"
+#include "utils/resowner.h"
+#include "lib/llvmjit.h"
/* Does att's datatype allow packing into the 1-byte-header varlena format? */
@@ -70,6 +74,11 @@
#define VARLENA_ATT_IS_PACKABLE(att) \
((att)->attstorage != 'p')
+#ifdef USE_LLVM
+bool jit_tuple_deforming = false;
+
+#endif /* USE_LLVM */
+
/* ----------------------------------------------------------------
* misc support routines
@@ -1058,6 +1067,7 @@ slot_deform_tuple(TupleTableSlot *slot, int natts)
/* Start from the first attribute */
off = 0;
slow = false;
+ Assert(slot->tts_off == 0);
}
else
{
@@ -1476,3 +1486,653 @@ minimal_tuple_from_heap_tuple(HeapTuple htup)
result->t_len = len;
return result;
}
+
+
+#ifdef USE_LLVM
+
+extern size_t varsize_any(void *p);
+
+size_t
+varsize_any(void *p)
+{
+ return VARSIZE_ANY(p);
+}
+
+/* build extern reference for varsize_any */
+static LLVMValueRef
+create_varsize_any(LLVMModuleRef mod)
+{
+ LLVMTypeRef *param_types = palloc(sizeof(LLVMTypeRef) * 1);
+ LLVMTypeRef sig;
+ LLVMValueRef fn;
+ const char *nm = "varsize_any";
+
+ fn = LLVMGetNamedFunction(mod, nm);
+ if (fn)
+ return fn;
+
+ param_types[0] = LLVMPointerType(LLVMInt8Type(), 0);
+ sig = LLVMFunctionType(LLVMInt64Type(), param_types, 1, 0);
+ fn = LLVMAddFunction(mod, nm, sig);
+
+ {
+ char argname[] = "readonly";
+ LLVMAttributeRef ref =
+ LLVMCreateStringAttribute(LLVMGetGlobalContext(), argname, strlen(argname), NULL, 0);
+ LLVMAddAttributeAtIndex(fn, LLVMAttributeFunctionIndex, ref);
+ }
+ {
+ char argname[] = "argmemonly";
+ LLVMAttributeRef ref =
+ LLVMCreateStringAttribute(LLVMGetGlobalContext(), argname, strlen(argname), NULL, 0);
+ LLVMAddAttributeAtIndex(fn, LLVMAttributeFunctionIndex, ref);
+ }
+
+ return fn;
+}
+
+/* build extern reference for strlen */
+static LLVMValueRef
+create_strlen(LLVMModuleRef mod)
+{
+ LLVMTypeRef *param_types = palloc(sizeof(LLVMTypeRef) * 1);
+ LLVMTypeRef sig;
+ LLVMValueRef fn;
+ const char *nm = "strlen";
+
+ fn = LLVMGetNamedFunction(mod, nm);
+ if (fn)
+ return fn;
+
+ param_types[0] = LLVMPointerType(LLVMInt8Type(), 0);
+ sig = LLVMFunctionType(TypeSizeT, param_types, 1, 0);
+ fn = LLVMAddFunction(mod, nm, sig);
+
+ return fn;
+}
+
+
+LLVMValueRef
+slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
+{
+ static int deformcounter = 0;
+ char *funcname;
+
+ LLVMModuleRef mod;
+ LLVMBuilderRef builder;
+
+ LLVMTypeRef deform_sig;
+ LLVMValueRef deform_fn;
+
+ LLVMBasicBlockRef entry;
+ LLVMBasicBlockRef outblock;
+ LLVMBasicBlockRef deadblock;
+ LLVMBasicBlockRef *attcheckattnoblocks;
+ LLVMBasicBlockRef *attstartblocks;
+ LLVMBasicBlockRef *attisnullblocks;
+ LLVMBasicBlockRef *attcheckalignblocks;
+ LLVMBasicBlockRef *attalignblocks;
+ LLVMBasicBlockRef *attstoreblocks;
+ LLVMBasicBlockRef *attoutblocks;
+
+ LLVMValueRef l_varsize_any;
+ LLVMValueRef l_strlen;
+
+ LLVMValueRef v_tupdata_base;
+ LLVMValueRef v_off, v_off_inc, v_off_start;
+ LLVMValueRef v_tts_values;
+ LLVMValueRef v_tts_nulls;
+ LLVMValueRef v_slotoffp;
+ LLVMValueRef v_nvalidp, v_nvalid;
+ LLVMValueRef v_maxatt;
+
+ LLVMValueRef v_slot;
+
+ LLVMValueRef v_tupleheaderp;
+ LLVMValueRef v_tuplep;
+ LLVMValueRef v_infomask1;
+ //LLVMValueRef v_infomask2;
+ LLVMValueRef v_bits;
+
+ LLVMValueRef v_hoff;
+ //LLVMValueRef v_natts;
+
+ LLVMValueRef v_hasnulls;
+
+
+ int attnum;
+ int attcuralign = 0;
+ bool lastcouldbenull = false;
+
+ llvm_initialize();
+
+ mod = context->module;
+ if (!mod)
+ {
+ context->compiled = false;
+ mod = context->module = LLVMModuleCreateWithName("deform");
+ LLVMSetTarget(mod, llvm_triple);
+ }
+
+ funcname = psprintf("deform%d", context->counter++);
+ deformcounter++;
+
+ /* Create the signature and function */
+ {
+ LLVMTypeRef param_types[] = {
+ LLVMPointerType(StructTupleTableSlot, 0),
+ LLVMInt16Type()};
+ deform_sig = LLVMFunctionType(LLVMVoidType(), param_types,
+ lengthof(param_types), 0);
+ }
+ deform_fn = LLVMAddFunction(mod, funcname, deform_sig);
+ LLVMSetLinkage(deform_fn, LLVMInternalLinkage);
+ LLVMSetVisibility(deform_fn, LLVMDefaultVisibility);
+ LLVMSetParamAlignment(LLVMGetParam(deform_fn, 0), MAXIMUM_ALIGNOF);
+
+ entry = LLVMAppendBasicBlock(deform_fn, "entry");
+ outblock = LLVMAppendBasicBlock(deform_fn, "out");
+ deadblock = LLVMAppendBasicBlock(deform_fn, "deadblock");
+ builder = LLVMCreateBuilder();
+
+ attcheckattnoblocks = palloc(sizeof(LLVMBasicBlockRef) * natts);
+ attstartblocks = palloc(sizeof(LLVMBasicBlockRef) * natts);
+ attisnullblocks = palloc(sizeof(LLVMBasicBlockRef) * natts);
+ attcheckalignblocks = palloc(sizeof(LLVMBasicBlockRef) * natts);
+ attalignblocks = palloc(sizeof(LLVMBasicBlockRef) * natts);
+ attstoreblocks = palloc(sizeof(LLVMBasicBlockRef) * natts);
+ attoutblocks = palloc(sizeof(LLVMBasicBlockRef) * natts);
+
+ l_varsize_any = create_varsize_any(mod);
+ l_strlen = create_strlen(mod);
+
+ attcuralign = 0;
+ lastcouldbenull = false;
+
+
+ LLVMPositionBuilderAtEnd(builder, entry);
+
+ v_slot = LLVMGetParam(deform_fn, 0);
+
+ v_tts_values = LLVMBuildLoad(builder,
+ LLVMBuildStructGEP(builder, v_slot, 10, ""),
+ "tts_values");
+ v_tts_nulls = LLVMBuildLoad(builder,
+ LLVMBuildStructGEP(builder, v_slot, 11, ""),
+ "tts_isnull");
+ v_slotoffp = LLVMBuildStructGEP(builder, v_slot, 14, "");
+ v_nvalidp = LLVMBuildStructGEP(builder, v_slot, 9, "");
+
+ v_tupleheaderp = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_slot, 5, ""),
+ "tupleheader");
+ v_tuplep = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(builder, v_tupleheaderp, 3, ""),
+ "tuple");
+ v_bits = LLVMBuildBitCast(
+ builder,
+ LLVMBuildStructGEP(builder, v_tuplep, 5, "t_bits"),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+
+ v_infomask1 =
+ LLVMBuildLoad(builder,
+ LLVMBuildStructGEP(builder, v_tuplep, 3, ""),
+ "infomask");
+ //(tuple)->t_data->t_infomask & HEAP_HASNULL
+ v_hasnulls =
+ LLVMBuildICmp(builder, LLVMIntNE,
+ LLVMBuildAnd(builder,
+ LLVMConstInt(LLVMInt16Type(), HEAP_HASNULL, false),
+ v_infomask1, ""),
+ LLVMConstInt(LLVMInt16Type(), 0, false),
+ "hasnulls");
+
+ v_hoff = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(
+ builder,
+ v_tuplep,
+ 4,
+ ""),
+ "t_hoff");
+
+ v_tupdata_base = LLVMBuildGEP(
+ builder,
+ LLVMBuildBitCast(
+ builder,
+ v_tuplep,
+ LLVMPointerType(LLVMInt8Type(), 0),
+ ""),
+ &v_hoff, 1,
+ "v_tupdata_base");
+
+ v_off_start = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(
+ builder,
+ v_slot,
+ 14,
+ ""),
+ "v_slot_off");
+
+ v_off_inc = v_off = v_off_start;
+
+ v_maxatt = LLVMGetParam(deform_fn, 1);
+
+ /* build the basic blocks for each attribute up front, they are needed as jump targets */
+ for (attnum = 0; attnum < natts; attnum++)
+ {
+ char *blockname;
+
+ blockname = psprintf("block.attr.%d.attcheckattno", attnum);
+ attcheckattnoblocks[attnum] = LLVMAppendBasicBlock(deform_fn, blockname);
+ pfree(blockname);
+ blockname = psprintf("block.attr.%d.start", attnum);
+ attstartblocks[attnum] = LLVMAppendBasicBlock(deform_fn, blockname);
+ pfree(blockname);
+ blockname = psprintf("block.attr.%d.attisnull", attnum);
+ attisnullblocks[attnum] = LLVMAppendBasicBlock(deform_fn, blockname);
+ pfree(blockname);
+ blockname = psprintf("block.attr.%d.attcheckalign", attnum);
+ attcheckalignblocks[attnum] = LLVMAppendBasicBlock(deform_fn, blockname);
+ pfree(blockname);
+ blockname = psprintf("block.attr.%d.align", attnum);
+ attalignblocks[attnum] = LLVMAppendBasicBlock(deform_fn, blockname);
+ pfree(blockname);
+ blockname = psprintf("block.attr.%d.store", attnum);
+ attstoreblocks[attnum] = LLVMAppendBasicBlock(deform_fn, blockname);
+ pfree(blockname);
+ blockname = psprintf("block.attr.%d.out", attnum);
+ attoutblocks[attnum] = LLVMAppendBasicBlock(deform_fn, blockname);
+ pfree(blockname);
+ }
+
+ v_nvalid = LLVMBuildLoad(builder, v_nvalidp, "");
+
+ /* build switch to go from nvalid to the right startblock */
+ if (true)
+ {
+ LLVMValueRef v_switch = LLVMBuildSwitch(builder, v_nvalid,
+ deadblock, natts);
+ for (attnum = 0; attnum < natts; attnum++)
+ {
+ LLVMValueRef v_attno = LLVMConstInt(LLVMInt32Type(), attnum, false);
+ LLVMAddCase(v_switch, v_attno, attstartblocks[attnum]);
+ }
+
+ }
+ else
+ {
+ /* jump from entry block to first block */
+ LLVMBuildBr(builder, attstartblocks[0]);
+ }
+
+ LLVMPositionBuilderAtEnd(builder, deadblock);
+ LLVMBuildUnreachable(builder);
+
+ for (attnum = 0; attnum < natts; attnum++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, attnum);
+ LLVMValueRef incby;
+ int alignto;
+ LLVMValueRef l_attno = LLVMConstInt(LLVMInt32Type(), attnum, false);
+ LLVMValueRef v_attdatap;
+ LLVMValueRef v_resultp;
+ LLVMValueRef v_islast;
+
+ /* build block checking whether we did all the necessary attributes */
+ LLVMPositionBuilderAtEnd(builder, attcheckattnoblocks[attnum]);
+
+ /*
+ * Build phi node if the previous attribute could be NULL. In that case
+ * this block can be reached from:
+ * - isnull block of the previous attribute
+ * - store block of the previous attribute
+ */
+ if (lastcouldbenull)
+ {
+ LLVMValueRef incoming_values[] =
+ {v_off, v_off_inc};
+ LLVMBasicBlockRef incoming_blocks[] =
+ {attisnullblocks[attnum - 1], attstoreblocks[attnum - 1]};
+ v_off = LLVMBuildPhi(builder, LLVMInt32Type(), "off");
+ LLVMAddIncoming(v_off,
+ incoming_values, incoming_blocks,
+ lengthof(incoming_blocks));
+ }
+ else
+ {
+ v_off = v_off_inc;
+ }
+
+ /* check if done */
+ v_islast = LLVMBuildICmp(builder, LLVMIntEQ,
+ LLVMConstInt(LLVMInt16Type(), attnum, false),
+ v_maxatt, "");
+ LLVMBuildCondBr(
+ builder,
+ v_islast,
+ attoutblocks[attnum], attstartblocks[attnum]);
+
+ /* build block to jump out */
+ LLVMPositionBuilderAtEnd(builder, attoutblocks[attnum]);
+ LLVMBuildStore(builder, LLVMConstInt(LLVMInt32Type(), attnum, false), v_nvalidp);
+ LLVMBuildStore(builder, v_off, v_slotoffp);
+ LLVMBuildRetVoid(builder);
+
+ LLVMPositionBuilderAtEnd(builder, attstartblocks[attnum]);
+
+ /*
+ * This block can be reached from:
+ * - the entry block's switch, to continue deforming where we left off
+ * - this attribute's checkattno block
+ * Build the appropriate phi node.
+ */
+ {
+ LLVMValueRef incoming_values[] =
+ {v_off_start, v_off};
+ LLVMBasicBlockRef incoming_blocks[] =
+ {entry, attcheckattnoblocks[attnum]};
+
+ v_off = LLVMBuildPhi(builder, LLVMInt32Type(), "off");
+ LLVMAddIncoming(v_off,
+ incoming_values, incoming_blocks,
+ lengthof(incoming_blocks));
+ }
+
+ /* check for nulls if necessary */
+ if (!att->attnotnull)
+ {
+ LLVMBasicBlockRef blockifnotnull;
+ LLVMBasicBlockRef blockifnull;
+ LLVMBasicBlockRef blocknext;
+ LLVMValueRef attisnull;
+ LLVMValueRef v_nullbyteno;
+ LLVMValueRef v_nullbytemask;
+ LLVMValueRef v_nullbyte;
+ LLVMValueRef v_nullbit;
+
+ blockifnotnull = attcheckalignblocks[attnum];
+ blockifnull = attisnullblocks[attnum];
+
+ if (attnum + 1 == natts)
+ blocknext = outblock;
+ else
+ blocknext = attcheckattnoblocks[attnum + 1];
+
+ /* FIXME: replace with neg */
+ v_nullbyteno = LLVMConstInt(LLVMInt32Type(), attnum >> 3, false);
+ v_nullbytemask = LLVMConstInt(LLVMInt8Type(), 1 << ((attnum) & 0x07), false);
+ v_nullbyte = LLVMBuildLoad(
+ builder,
+ LLVMBuildGEP(builder, v_bits,
+ &v_nullbyteno, 1, ""),
+ "attnullbyte");
+
+ v_nullbit = LLVMBuildICmp(
+ builder,
+ LLVMIntEQ,
+ LLVMBuildAnd(builder, v_nullbyte, v_nullbytemask, ""),
+ LLVMConstInt(LLVMInt8Type(), 0, false),
+ "attisnull");
+
+ attisnull = LLVMBuildAnd(builder, v_hasnulls, v_nullbit, "");
+
+ LLVMBuildCondBr(builder, attisnull, blockifnull, blockifnotnull);
+
+ LLVMPositionBuilderAtEnd(builder, blockifnull);
+
+ /* store null-byte */
+ LLVMBuildStore(builder,
+ LLVMConstInt(LLVMInt8Type(), 1, false),
+ LLVMBuildGEP(builder, v_tts_nulls, &l_attno, 1, ""));
+ /* store zero datum */
+ LLVMBuildStore(builder,
+ LLVMConstInt(TypeSizeT, 0, false),
+ LLVMBuildGEP(builder, v_tts_values, &l_attno, 1, ""));
+
+ LLVMBuildBr(builder, blocknext);
+
+ lastcouldbenull = true;
+ }
+ else
+ {
+ LLVMBuildBr(builder, attcheckalignblocks[attnum]);
+ lastcouldbenull = false;
+
+ /* yuck, dirty hack */
+ LLVMPositionBuilderAtEnd(builder, attisnullblocks[attnum]);
+ LLVMBuildBr(builder, attcheckalignblocks[attnum]);
+ }
+ LLVMPositionBuilderAtEnd(builder, attcheckalignblocks[attnum]);
+
+ /* perform alignment */
+ if (att->attalign == 'i')
+ {
+ alignto = ALIGNOF_INT;
+ }
+ else if (att->attalign == 'c')
+ {
+ alignto = 1;
+ }
+ else if (att->attalign == 'd')
+ {
+ alignto = ALIGNOF_DOUBLE;
+ }
+ else if (att->attalign == 's')
+ {
+ alignto = ALIGNOF_SHORT;
+ }
+ else
+ {
+ elog(ERROR, "unknown alignment");
+ alignto = 0;
+ }
+
+ if ((alignto > 1 &&
+ (attcuralign < 0 || attcuralign != TYPEALIGN(alignto, attcuralign))))
+ {
+ LLVMValueRef v_off_aligned;
+ bool conditional_alignment;
+
+ /*
+ * For a varlena, only align if it is not a short varlena. A zero
+ * byte at the current offset is padding, so alignment is needed.
+ */
+ if (att->attlen == -1)
+ {
+ LLVMValueRef possible_padbyte;
+ LLVMValueRef ispad;
+ possible_padbyte =
+ LLVMBuildLoad(builder,
+ LLVMBuildGEP(builder, v_tupdata_base, &v_off, 1, ""),
+ "padbyte");
+ ispad =
+ LLVMBuildICmp(builder, LLVMIntEQ, possible_padbyte,
+ LLVMConstInt(LLVMInt8Type(), 0, false),
+ "ispadbyte");
+ LLVMBuildCondBr(builder, ispad,
+ attalignblocks[attnum],
+ attstoreblocks[attnum]);
+ conditional_alignment = true;
+ }
+ else
+ {
+ LLVMBuildBr(builder, attalignblocks[attnum]);
+ conditional_alignment = false;
+ }
+
+ LLVMPositionBuilderAtEnd(builder, attalignblocks[attnum]);
+
+ {
+ /* translation of alignment code (cf TYPEALIGN()) */
+
+ /* ((ALIGNVAL) - 1) */
+ LLVMValueRef alignval = LLVMConstInt(LLVMInt32Type(), alignto - 1, false);
+ /* ((uintptr_t) (LEN) + ((ALIGNVAL) - 1)) */
+ LLVMValueRef lh = LLVMBuildAdd(builder, v_off, alignval, "");
+ /* ~((uintptr_t) ((ALIGNVAL) - 1))*/
+ LLVMValueRef rh = LLVMConstInt(LLVMInt32Type(), ~(alignto - 1), false);
+
+ v_off_aligned = LLVMBuildAnd(builder, lh, rh, "aligned_offset");
+ }
+
+ LLVMBuildBr(builder, attstoreblocks[attnum]);
+ LLVMPositionBuilderAtEnd(builder, attstoreblocks[attnum]);
+
+ if (conditional_alignment)
+ {
+ LLVMValueRef incoming_values[] =
+ {v_off, v_off_aligned};
+ LLVMBasicBlockRef incoming_blocks[] =
+ {attcheckalignblocks[attnum], attalignblocks[attnum]};
+ v_off_inc = LLVMBuildPhi(builder, LLVMInt32Type(), "");
+ LLVMAddIncoming(v_off_inc,
+ incoming_values, incoming_blocks,
+ lengthof(incoming_values));
+ }
+ else
+ {
+ v_off_inc = v_off_aligned;
+ }
+ }
+ else
+ {
+ LLVMPositionBuilderAtEnd(builder, attcheckalignblocks[attnum]);
+ LLVMBuildBr(builder, attalignblocks[attnum]);
+ LLVMPositionBuilderAtEnd(builder, attalignblocks[attnum]);
+ LLVMBuildBr(builder, attstoreblocks[attnum]);
+ v_off_inc = v_off;
+ }
+ LLVMPositionBuilderAtEnd(builder, attstoreblocks[attnum]);
+
+
+ /* compute what following columns are aligned to */
+ if (att->attlen < 0)
+ {
+ /* can't guarantee any alignment after varlen field */
+ attcuralign = -1;
+ }
+ else if (att->attnotnull && attcuralign >= 0)
+ {
+ Assert(att->attlen > 0);
+ attcuralign += att->attlen;
+ }
+ else if (att->attnotnull)
+ {
+ /*
+ * After a NOT NULL fixed-width column, alignment is
+ * guaranteed to be the minimum of the forced alignment and
+ * length. XXX
+ */
+ attcuralign = alignto + att->attlen;
+ Assert(attcuralign > 0);
+ }
+ else
+ {
+ //elog(LOG, "attnotnullreset: %d", attnum);
+ attcuralign = -1;
+ }
+
+ /* compute address to load data from */
+ v_attdatap =
+ LLVMBuildGEP(builder, v_tupdata_base, &v_off_inc, 1, "");
+
+ /* compute address to store value at */
+ v_resultp = LLVMBuildGEP(builder, v_tts_values, &l_attno, 1, "");
+
+ /* store null-byte (false) */
+ LLVMBuildStore(builder,
+ LLVMConstInt(LLVMInt8Type(), 0, false),
+ LLVMBuildGEP(builder, v_tts_nulls, &l_attno, 1, ""));
+
+ if (att->attbyval)
+ {
+ LLVMValueRef tmp_loaddata;
+ LLVMTypeRef vartypep =
+ LLVMPointerType(LLVMIntType(att->attlen*8), 0);
+ tmp_loaddata =
+ LLVMBuildPointerCast(builder, v_attdatap, vartypep, "");
+ tmp_loaddata = LLVMBuildLoad(builder, tmp_loaddata, "attr_byval");
+ tmp_loaddata = LLVMBuildZExt(builder, tmp_loaddata, TypeSizeT, "");
+
+ LLVMBuildStore(builder, tmp_loaddata, v_resultp);
+ }
+ else
+ {
+ LLVMValueRef tmp_loaddata;
+
+ /* store pointer */
+ tmp_loaddata =
+ LLVMBuildPtrToInt(builder,
+ v_attdatap,
+ TypeSizeT,
+ "attr_ptr");
+ LLVMBuildStore(builder, tmp_loaddata, v_resultp);
+ }
+
+ /* increment data pointer */
+ if (att->attlen > 0)
+ {
+ incby = LLVMConstInt(LLVMInt32Type(), att->attlen, false);
+ }
+ else if (att->attlen == -1)
+ {
+ incby =
+ LLVMBuildCall(builder, l_varsize_any,
+ &v_attdatap, 1,
+ "varsize_any");
+ {
+ char argname[] = "readonly";
+ LLVMAttributeRef ref =
+ LLVMCreateStringAttribute(LLVMGetGlobalContext(), argname, strlen(argname), NULL, 0);
+ LLVMAddCallSiteAttribute(incby, LLVMAttributeFunctionIndex, ref);
+ }
+ incby = LLVMBuildTrunc(builder, incby,
+ LLVMInt32Type(), "");
+ }
+ else if (att->attlen == -2)
+ {
+ incby = LLVMBuildCall(builder, l_strlen, &v_attdatap, 1, "strlen");
+ incby = LLVMBuildTrunc(builder, incby,
+ LLVMInt32Type(), "");
+ /* add 1 for the terminating NUL byte */
+ incby =
+ LLVMBuildAdd(builder, incby,
+ LLVMConstInt(LLVMInt32Type(), 1, false), "");
+ }
+ else
+ {
+ Assert(false);
+ incby = NULL; /* silence compiler */
+ }
+
+ v_off_inc = LLVMBuildAdd(builder, v_off_inc, incby, "increment_offset");
+
+ /*
+ * jump to next block, unless last possible column, or all desired
+ * (available) attributes have been fetched.
+ */
+ if (attnum + 1 == natts)
+ {
+ LLVMBuildBr(builder, outblock);
+ }
+ else
+ {
+ LLVMBuildBr(builder, attcheckattnoblocks[attnum + 1]);
+ }
+ }
+
+ /* jump out */
+ LLVMPositionBuilderAtEnd(builder, outblock);
+ LLVMBuildStore(builder, LLVMBuildZExt(builder, v_maxatt, LLVMInt32Type(), ""), v_nvalidp);
+ LLVMBuildStore(builder, v_off, v_slotoffp);
+ LLVMBuildRetVoid(builder);
+
+ LLVMDisposeBuilder(builder);
+
+ return deform_fn;
+}
+#endif
diff --git a/src/backend/executor/execExprCompile.c b/src/backend/executor/execExprCompile.c
index d41405b648..79b3ebd6c4 100644
--- a/src/backend/executor/execExprCompile.c
+++ b/src/backend/executor/execExprCompile.c
@@ -504,23 +504,37 @@ ExecReadyCompiledExpr(ExprState *state, PlanState *parent)
case EEOP_OUTER_FETCHSOME:
case EEOP_SCAN_FETCHSOME:
{
+ TupleDesc desc = NULL;
LLVMValueRef v_slot;
LLVMBasicBlockRef b_fetch = LLVMInsertBasicBlock(opblocks[i + 1], "");
LLVMValueRef v_nvalid;
if (op->opcode == EEOP_INNER_FETCHSOME)
{
+ PlanState *is = innerPlanState(parent);
v_slot = v_innerslot;
+ if (is &&
+ is->ps_ResultTupleSlot &&
+ is->ps_ResultTupleSlot->tts_fixedTupleDescriptor)
+ desc = is->ps_ResultTupleSlot->tts_tupleDescriptor;
}
else if (op->opcode == EEOP_OUTER_FETCHSOME)
{
+ PlanState *os = outerPlanState(parent);
+
v_slot = v_outerslot;
+
+ if (os &&
+ os->ps_ResultTupleSlot &&
+ os->ps_ResultTupleSlot->tts_fixedTupleDescriptor)
+ desc = os->ps_ResultTupleSlot->tts_tupleDescriptor;
}
else
{
v_slot = v_scanslot;
+ desc = parent ? parent->scandesc : NULL;
}
/*
@@ -539,6 +553,28 @@ ExecReadyCompiledExpr(ExprState *state, PlanState *parent)
opblocks[i + 1], b_fetch);
LLVMPositionBuilderAtEnd(builder, b_fetch);
+
+ /*
+ * If the tupledesc of the to-be-deformed tuple is known,
+ * and JITing of deforming is enabled, build deform
+ * function specific to tupledesc and the exact number of
+ * to-be-extracted attributes.
+ */
+ if (desc && jit_tuple_deforming)
+ {
+ LLVMValueRef params[2];
+ LLVMValueRef l_jit_deform;
+
+ l_jit_deform = slot_compile_deform(context,
+ desc,
+ op->d.fetch.last_var);
+ params[0] = v_slot;
+ params[1] = LLVMConstInt(LLVMInt16Type(), op->d.fetch.last_var, false);
+
+ LLVMBuildCall(builder, l_jit_deform, params, lengthof(params), "");
+
+ }
+ else
{
LLVMValueRef params[2];
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 78ec871f50..e5568b922e 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -121,6 +121,7 @@ MakeTupleTableSlot(TupleDesc tupleDesc)
slot->tts_mcxt = CurrentMemoryContext;
slot->tts_buffer = InvalidBuffer;
slot->tts_nvalid = 0;
+ slot->tts_off = 0;
slot->tts_values = NULL;
slot->tts_isnull = NULL;
slot->tts_mintuple = NULL;
@@ -358,6 +359,7 @@ ExecStoreTuple(HeapTuple tuple,
/* Mark extracted state invalid */
slot->tts_nvalid = 0;
+ slot->tts_off = 0;
/*
* If tuple is on a disk page, keep the page pinned as long as we hold a
@@ -431,6 +433,7 @@ ExecStoreMinimalTuple(MinimalTuple mtup,
/* Mark extracted state invalid */
slot->tts_nvalid = 0;
+ slot->tts_off = 0;
return slot;
}
@@ -477,6 +480,7 @@ ExecClearTuple(TupleTableSlot *slot) /* slot in which to store tuple */
*/
slot->tts_isempty = true;
slot->tts_nvalid = 0;
+ slot->tts_off = 0;
return slot;
}
@@ -776,6 +780,7 @@ ExecMaterializeSlot(TupleTableSlot *slot)
* that we have not pfree'd tts_mintuple, if there is one.)
*/
slot->tts_nvalid = 0;
+ slot->tts_off = 0;
/*
* On the same principle of not depending on previous remote storage,
diff --git a/src/backend/lib/llvmjit.c b/src/backend/lib/llvmjit.c
index 460cb6b325..e05fe2dd72 100644
--- a/src/backend/lib/llvmjit.c
+++ b/src/backend/lib/llvmjit.c
@@ -293,7 +293,7 @@ llvm_create_types(void)
members[11] = LLVMPointerType(LLVMInt8Type(), 0); /* nulls */
members[12] = LLVMPointerType(StructMinimalTupleData, 0); /* mintuple */
members[13] = StructHeapTupleData; /* minhdr */
- members[14] = LLVMInt64Type(); /* off: FIXME, deterministic type, not long */
+ members[14] = LLVMInt32Type(); /* off */
StructTupleTableSlot = LLVMStructCreateNamed(LLVMGetGlobalContext(),
"struct.TupleTableSlot");
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9a80ecedc2..4cc9f305a2 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -41,6 +41,7 @@
#include "commands/vacuum.h"
#include "commands/variable.h"
#include "commands/trigger.h"
+#include "executor/executor.h"
#include "funcapi.h"
#include "lib/llvmjit.h"
#include "libpq/auth.h"
@@ -1031,6 +1032,17 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ {"jit_tuple_deforming", PGC_USERSET, DEVELOPER_OPTIONS,
+ gettext_noop("just-in-time compile tuple deforming"),
+ NULL,
+ GUC_NOT_IN_SAMPLE
+ },
+ &jit_tuple_deforming,
+ false,
+ NULL, NULL, NULL
+ },
+
#endif
{
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 4de4bf4035..ab2df96ca0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -91,6 +91,7 @@ extern PGDLLIMPORT ExecutorCheckPerms_hook_type ExecutorCheckPerms_hook;
/* GUC variables for JITing */
#ifdef USE_LLVM
extern bool jit_expressions;
+extern bool jit_tuple_deforming;
#endif
/*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index 6c24fd334d..475b2bdcef 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -126,7 +126,7 @@ typedef struct TupleTableSlot
bool *tts_isnull; /* current per-attribute isnull flags */
MinimalTuple tts_mintuple; /* minimal tuple, or NULL if none */
HeapTupleData tts_minhdr; /* workspace for minimal-tuple-only case */
- long tts_off; /* saved state for slot_deform_tuple */
+ int32 tts_off; /* saved state for slot_deform_tuple */
bool tts_fixedTupleDescriptor;
} TupleTableSlot;
diff --git a/src/include/lib/llvmjit.h b/src/include/lib/llvmjit.h
index 9711d398ca..61d7c67d6f 100644
--- a/src/include/lib/llvmjit.h
+++ b/src/include/lib/llvmjit.h
@@ -9,6 +9,7 @@
#undef PM
#include "nodes/pg_list.h"
+#include "access/tupdesc.h"
#include <llvm-c/Core.h>
#include <llvm-c/Core.h>
@@ -70,6 +71,8 @@ extern void llvm_shutdown_perf_support(LLVMExecutionEngineRef EE);
extern void llvm_perf_orc_support(LLVMOrcJITStackRef llvm_orc);
extern void llvm_shutdown_orc_perf_support(LLVMOrcJITStackRef llvm_orc);
+extern LLVMValueRef slot_compile_deform(struct LLVMJitContext *context, TupleDesc desc, int natts);
+
#else
struct LLVMJitContext;
@@ -79,4 +82,7 @@ typedef struct LLVMJitContext LLVMJitContext;
extern void llvm_release_handle(ResourceOwner resowner, Datum handle);
+
+struct LLVMJitContext;
+
#endif /* LLVMJIT_H */
--
2.14.1.2.g4274c698f4.dirty
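
For readers following the IR construction in slot_compile_deform() above, here is a plain-C sketch of the per-attribute work the generated function does for fixed-width columns: test the null bitmap, align the offset the way TYPEALIGN() does, then advance by the column width. ToyAttr, align_offset, att_is_null and toy_deform are hypothetical names made up for illustration and are not part of the patch. Varlena and cstring columns, the tts_off resume logic and the actual value loads are left out, so this only tracks where each column would start.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical column description: fixed width and required alignment. */
typedef struct ToyAttr
{
	int			attlen;		/* width in bytes, > 0 (no varlena here) */
	uint32_t	alignto;	/* power of two: 1, 2, 4 or 8 */
} ToyAttr;

/* Same arithmetic as TYPEALIGN(): round off up to a multiple of alignto. */
static uint32_t
align_offset(uint32_t off, uint32_t alignto)
{
	assert(alignto != 0 && (alignto & (alignto - 1)) == 0);
	return (off + (alignto - 1)) & ~(alignto - 1);
}

/* A clear bit in the null bitmap means the attribute is NULL. */
static int
att_is_null(const uint8_t *bits, int attnum)
{
	return (bits[attnum >> 3] & (1 << (attnum & 0x07))) == 0;
}

/*
 * Walk natts columns, recording for each one whether it is NULL and, if
 * not, the offset at which its data starts. Returns the offset just past
 * the last non-NULL column.
 */
static uint32_t
toy_deform(const ToyAttr *atts, int natts,
		   const uint8_t *bits, int hasnulls,
		   uint32_t *offsets, uint8_t *isnull)
{
	uint32_t	off = 0;
	int			attnum;

	for (attnum = 0; attnum < natts; attnum++)
	{
		if (hasnulls && att_is_null(bits, attnum))
		{
			/* NULL columns occupy no space, the offset is unchanged */
			offsets[attnum] = 0;
			isnull[attnum] = 1;
			continue;
		}

		off = align_offset(off, atts[attnum].alignto);
		offsets[attnum] = off;
		isnull[attnum] = 0;
		off += (uint32_t) atts[attnum].attlen;
	}

	return off;
}
```

With four columns of widths 1, 4, 2 and 8 (each aligned to its own width) and the third column NULL, the non-NULL columns start at offsets 0, 4 and 8, and the walk ends at offset 16. The JITed version gains over a loop like this because attlen, alignto and attnum are constants baked into the emitted code for one specific tuple descriptor, so the per-column branches on type properties disappear.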
0015-WIP-Expression-based-agg-transition.patch
From 343c747ac083bab9a3582b043af1c8615d2eeee7 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 30 Aug 2017 21:16:56 -0700
Subject: [PATCH 15/16] WIP: Expression based agg transition.
Todo:
- Split EEOP_AGG_PLAIN_TRANS into lifetime caring/not variant
- Fix memory lifetime for JITed
---
src/backend/executor/execExpr.c | 289 ++++++++++++++++++
src/backend/executor/execExprCompile.c | 357 ++++++++++++++++++++++
src/backend/executor/execExprInterp.c | 223 ++++++++++++++
src/backend/executor/nodeAgg.c | 534 ++++++---------------------------
src/backend/lib/llvmjit.c | 13 +
src/include/executor/execExpr.h | 69 +++++
src/include/executor/executor.h | 2 +
src/include/executor/nodeAgg.h | 308 +++++++++++++++++++
src/include/lib/llvmjit.h | 1 +
src/include/nodes/execnodes.h | 5 +
10 files changed, 1351 insertions(+), 450 deletions(-)
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index e6ffe6e062..29aa8718fc 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -43,6 +43,7 @@
#include "optimizer/planner.h"
#include "pgstat.h"
#include "utils/builtins.h"
+#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/typcache.h"
@@ -2532,6 +2533,294 @@ ExecInitArrayRef(ExprEvalStep *scratch, ArrayRef *aref, PlanState *parent,
}
}
+static void
+ExecInitAggTransTrans(ExprState *state, AggState *aggstate, ExprEvalStep *scratch, FunctionCallInfo fcinfo,
+ AggStatePerTrans pertrans, int transno, int setno, int setoff, bool ishash)
+{
+ int adjust_init_jumpnull = -1;
+ int adjust_strict_jumpnull = -1;
+ ExprContext *aggcontext;
+
+ if (ishash)
+ aggcontext = aggstate->hashcontext;
+ else
+ aggcontext = aggstate->aggcontexts[setno];
+
+ /*
+ * If the initial value for the transition state doesn't exist in the
+ * pg_aggregate table then we will let the first non-NULL value
+ * returned from the outer procNode become the initial value. (This is
+ * useful for aggregates like max() and min().) The noTransValue flag
+ * signals that we still need to do this.
+ */
+ if (pertrans->numSortCols == 0 &&
+ fcinfo->flinfo->fn_strict &&
+ pertrans->initValueIsNull)
+ {
+ scratch->opcode = EEOP_AGG_INIT_TRANS;
+ scratch->d.agg_init_trans.aggstate = aggstate;
+ scratch->d.agg_init_trans.pertrans = pertrans;
+ scratch->d.agg_init_trans.setno = setno;
+ scratch->d.agg_init_trans.setoff = setoff;
+ scratch->d.agg_init_trans.transno = transno;
+ scratch->d.agg_init_trans.aggcontext = aggcontext;
+ scratch->d.agg_init_trans.jumpnull = -1; /* adjust later */
+ ExprEvalPushStep(state, scratch);
+
+ adjust_init_jumpnull = state->steps_len - 1;
+ }
+
+ if (pertrans->numSortCols == 0 &&
+ fcinfo->flinfo->fn_strict)
+ {
+ scratch->opcode = EEOP_AGG_STRICT_TRANS_CHECK;
+ scratch->d.agg_strict_trans_check.aggstate = aggstate;
+ scratch->d.agg_strict_trans_check.setno = setno;
+ scratch->d.agg_strict_trans_check.setoff = setoff;
+ scratch->d.agg_strict_trans_check.transno = transno;
+ scratch->d.agg_strict_trans_check.jumpnull = -1; /* adjust later */
+ ExprEvalPushStep(state, scratch);
+
+ /*
+ * Note, we don't push into adjust_bailout here - those jump
+ * to the end of all transition value computations.
+ */
+ adjust_strict_jumpnull = state->steps_len - 1;
+ }
+
+ if (pertrans->numSortCols == 0)
+ {
+ scratch->opcode = EEOP_AGG_PLAIN_TRANS;
+ scratch->d.agg_plain_trans.aggstate = aggstate;
+ scratch->d.agg_plain_trans.pertrans = pertrans;
+ scratch->d.agg_plain_trans.setno = setno;
+ scratch->d.agg_plain_trans.setoff = setoff;
+ scratch->d.agg_plain_trans.transno = transno;
+ scratch->d.agg_plain_trans.aggcontext = aggcontext;
+ ExprEvalPushStep(state, scratch);
+ }
+ else if (pertrans->numInputs == 1)
+ {
+ scratch->opcode = EEOP_AGG_ORDERED_TRANS_DATUM;
+ scratch->d.agg_ordered_trans.aggstate = aggstate;
+ scratch->d.agg_ordered_trans.pertrans = pertrans;
+ scratch->d.agg_ordered_trans.setno = setno;
+ scratch->d.agg_ordered_trans.setoff = setoff;
+ scratch->d.agg_ordered_trans.aggcontext = aggcontext;
+ ExprEvalPushStep(state, scratch);
+ }
+ else
+ {
+ scratch->opcode = EEOP_AGG_ORDERED_TRANS_TUPLE;
+ scratch->d.agg_ordered_trans.aggstate = aggstate;
+ scratch->d.agg_ordered_trans.pertrans = pertrans;
+ scratch->d.agg_ordered_trans.setno = setno;
+ scratch->d.agg_ordered_trans.setoff = setoff;
+ scratch->d.agg_ordered_trans.aggcontext = aggcontext;
+ ExprEvalPushStep(state, scratch);
+ }
+
+ if (adjust_init_jumpnull != -1)
+ {
+ ExprEvalStep *as = &state->steps[adjust_init_jumpnull];
+ Assert(as->d.agg_init_trans.jumpnull == -1);
+ as->d.agg_init_trans.jumpnull = state->steps_len;
+ }
+
+ if (adjust_strict_jumpnull != -1)
+ {
+ ExprEvalStep *as = &state->steps[adjust_strict_jumpnull];
+ Assert(as->d.agg_strict_trans_check.jumpnull == -1);
+ as->d.agg_strict_trans_check.jumpnull = state->steps_len;
+ }
+}
+
+ExprState *
+ExecInitAggTrans(AggState *aggstate, AggStatePerPhase phase,
+ PlanState *parent, bool doSort, bool doHash)
+{
+ ExprState *state = makeNode(ExprState);
+ List *exprList = NIL;
+ ExprEvalStep scratch;
+ int transno = 0;
+ int setoff = 0;
+
+ state->expr = (Expr *) aggstate;
+
+ scratch.resvalue = &state->resvalue;
+ scratch.resnull = &state->resnull;
+
+ /*
+ * First figure out which slots we're going to need. Out of expediency
+ * build one list for all expressions and then use existing code :(
+ */
+ for (transno = 0; transno < aggstate->numtrans; transno++)
+ {
+ AggStatePerTrans pertrans = &aggstate->pertrans[transno];
+
+ exprList = lappend(exprList, pertrans->aggref->aggdirectargs);
+ exprList = lappend(exprList, pertrans->aggref->args);
+ exprList = lappend(exprList, pertrans->aggref->aggorder);
+ exprList = lappend(exprList, pertrans->aggref->aggdistinct);
+ exprList = lappend(exprList, pertrans->aggref->aggfilter);
+ }
+ ExecInitExprSlots(state, (Node *) exprList);
+
+ /*
+ * Emit instructions for each transition value / grouping set combination.
+ */
+ for (transno = 0; transno < aggstate->numtrans; transno++)
+ {
+ AggStatePerTrans pertrans = &aggstate->pertrans[transno];
+ int numInputs = pertrans->numInputs;
+ int argno;
+ int setno;
+ FunctionCallInfo fcinfo = &pertrans->transfn_fcinfo;
+ ListCell *arg, *bail;
+ List *adjust_bailout = NIL;
+ bool *strictnulls = NULL;
+
+ /*
+ * If filter present, emit. Do so before evaluating the input, to
+ * avoid potentially unneeded computations.
+ */
+ if (pertrans->aggref->aggfilter)
+ {
+ /* evaluate filter expression */
+ ExecInitExprRec(pertrans->aggref->aggfilter, parent, state,
+ &state->resvalue, &state->resnull);
+ /* and jump out if false */
+ scratch.opcode = EEOP_AGG_FILTER;
+ scratch.d.agg_filter.jumpfalse = -1; /* adjust later */
+ ExprEvalPushStep(state, &scratch);
+ adjust_bailout = lappend_int(adjust_bailout,
+ state->steps_len - 1);
+ }
+
+ /*
+ * Evaluate the aggregate's inputs directly into the location that consumes them.
+ */
+ argno = 0;
+ if (pertrans->numSortCols == 0)
+ {
+ strictnulls = fcinfo->argnull + 1;
+
+ foreach (arg, pertrans->aggref->args)
+ {
+ TargetEntry *source_tle = (TargetEntry *) lfirst(arg);
+
+ /* Start from 1, since the 0th arg will be the transition value */
+ ExecInitExprRec(source_tle->expr, parent, state,
+ &fcinfo->arg[argno + 1],
+ &fcinfo->argnull[argno + 1]);
+ argno++;
+ }
+ }
+ else if (pertrans->numInputs == 1)
+ {
+ TargetEntry *source_tle =
+ (TargetEntry *) linitial(pertrans->aggref->args);
+ Assert(list_length(pertrans->aggref->args) == 1);
+
+ ExecInitExprRec(source_tle->expr, parent, state,
+ &state->resvalue,
+ &state->resnull);
+ strictnulls = &state->resnull;
+ argno++;
+ }
+ else
+ {
+ Datum *values = pertrans->sortslot->tts_values;
+ bool *nulls = pertrans->sortslot->tts_isnull;
+
+ strictnulls = nulls;
+
+ foreach (arg, pertrans->aggref->args)
+ {
+ TargetEntry *source_tle = (TargetEntry *) lfirst(arg);
+
+ ExecInitExprRec(source_tle->expr, parent, state,
+ &values[argno], &nulls[argno]);
+ argno++;
+ }
+ }
+ Assert(numInputs == argno);
+
+ /*
+ * For a strict transfn, nothing happens when there's a NULL input; we
+ * just keep the prior transValue. This is true for both plain and
+ * sorted/distinct aggregates.
+ */
+ if (fcinfo->flinfo->fn_strict && numInputs > 0)
+ {
+ scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK;
+ scratch.d.agg_strict_input_check.nulls = strictnulls;
+ scratch.d.agg_strict_input_check.jumpnull = -1; /* adjust later */
+ scratch.d.agg_strict_input_check.nargs = numInputs;
+ ExprEvalPushStep(state, &scratch);
+ adjust_bailout = lappend_int(adjust_bailout,
+ state->steps_len - 1);
+ }
+
+
+ /* and call transition function (once for each grouping set) */
+
+ setoff = 0;
+ if (doSort)
+ {
+ int processGroupingSets = Max(phase->numsets, 1);
+
+ for (setno = 0; setno < processGroupingSets; setno++)
+ {
+ ExecInitAggTransTrans(state, aggstate, &scratch, fcinfo, pertrans, transno, setno, setoff, false);
+ setoff++;
+ }
+ }
+
+ if (doHash)
+ {
+ int numHashes = aggstate->num_hashes;
+
+ if (aggstate->aggstrategy != AGG_HASHED)
+ setoff = aggstate->maxsets;
+ else
+ setoff = 0;
+
+ for (setno = 0; setno < numHashes; setno++)
+ {
+ ExecInitAggTransTrans(state, aggstate, &scratch, fcinfo, pertrans, transno, setno, setoff, true);
+ setoff++;
+ }
+ }
+
+ /* adjust early bail out jump target(s) */
+ foreach (bail, adjust_bailout)
+ {
+ ExprEvalStep *as = &state->steps[lfirst_int(bail)];
+ if (as->opcode == EEOP_AGG_FILTER)
+ {
+ Assert(as->d.agg_filter.jumpfalse == -1);
+ as->d.agg_filter.jumpfalse = state->steps_len;
+ }
+ else if (as->opcode == EEOP_AGG_STRICT_INPUT_CHECK)
+ {
+ Assert(as->d.agg_strict_input_check.jumpnull == -1);
+ as->d.agg_strict_input_check.jumpnull = state->steps_len;
+ }
+ }
+
+ }
+
+ scratch.resvalue = NULL;
+ scratch.resnull = NULL;
+ scratch.opcode = EEOP_DONE;
+ ExprEvalPushStep(state, &scratch);
+
+ ExecReadyExpr(state, parent);
+
+ return state;
+}
+
/*
* Helper for preparing ArrayRef expressions for evaluation: is expr a nested
* FieldStore or ArrayRef that needs the old element value passed down?
diff --git a/src/backend/executor/execExprCompile.c b/src/backend/executor/execExprCompile.c
index 79b3ebd6c4..d0b943530c 100644
--- a/src/backend/executor/execExprCompile.c
+++ b/src/backend/executor/execExprCompile.c
@@ -23,6 +23,7 @@
#include "catalog/objectaccess.h"
#include "catalog/pg_type.h"
#include "executor/execdebug.h"
+#include "executor/nodeAgg.h"
#include "executor/nodeSubplan.h"
#include "executor/execExpr.h"
#include "funcapi.h"
@@ -273,6 +274,28 @@ BuildFunctionCall(LLVMJitContext *context, LLVMBuilderRef builder,
return v_retval;
}
+static LLVMValueRef
+create_ExecAggInitGroup(LLVMModuleRef mod)
+{
+ LLVMTypeRef sig;
+ LLVMValueRef fn;
+ LLVMTypeRef param_types[3];
+ const char *nm = "ExecAggInitGroup";
+
+ fn = LLVMGetNamedFunction(mod, nm);
+ if (fn)
+ return fn;
+
+ param_types[0] = LLVMPointerType(TypeSizeT, 0);
+ param_types[1] = LLVMPointerType(TypeSizeT, 0);
+ param_types[2] = LLVMPointerType(StructAggStatePerGroupData, 0);
+
+ sig = LLVMFunctionType(LLVMVoidType(), param_types, lengthof(param_types), 0);
+ fn = LLVMAddFunction(mod, nm, sig);
+
+ return fn;
+}
+
static Datum
ExecRunCompiledExpr(ExprState *state, ExprContext *econtext, bool *isNull)
{
@@ -1446,6 +1469,8 @@ ExecReadyCompiledExpr(ExprState *state, PlanState *parent)
case EEOP_NULLTEST_ROWISNULL:
case EEOP_NULLTEST_ROWISNOTNULL:
case EEOP_WHOLEROW:
+ case EEOP_AGG_ORDERED_TRANS_DATUM:
+ case EEOP_AGG_ORDERED_TRANS_TUPLE:
{
LLVMValueRef v_params[3];
const char *funcname;
@@ -1502,6 +1527,10 @@ ExecReadyCompiledExpr(ExprState *state, PlanState *parent)
funcname = "ExecEvalAlternativeSubPlan";
else if (op->opcode == EEOP_WHOLEROW)
funcname = "ExecEvalWholeRowVar";
+ else if (op->opcode == EEOP_AGG_ORDERED_TRANS_DATUM)
+ funcname = "ExecEvalAggOrderedTransDatum";
+ else if (op->opcode == EEOP_AGG_ORDERED_TRANS_TUPLE)
+ funcname = "ExecEvalAggOrderedTransTuple";
else
{
Assert(false);
@@ -2346,6 +2375,334 @@ ExecReadyCompiledExpr(ExprState *state, PlanState *parent)
LLVMBuildBr(builder, opblocks[i + 1]);
+ break;
+ }
+ case EEOP_AGG_FILTER:
+ {
+ LLVMValueRef v_resnull, v_resvalue;
+ LLVMValueRef v_filtered;
+
+ v_resnull = LLVMBuildLoad(builder, v_resnullp, "");
+ v_resvalue = LLVMBuildLoad(builder, v_resvaluep, "");
+
+ v_filtered = LLVMBuildOr(
+ builder,
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_resnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ LLVMBuildICmp(
+ builder, LLVMIntEQ, v_resvalue,
+ LLVMConstInt(TypeSizeT, 0, false), ""),
+ "");
+
+ LLVMBuildCondBr(
+ builder,
+ v_filtered,
+ opblocks[op->d.agg_filter.jumpfalse],
+ opblocks[i + 1]);
+
+ break;
+ }
+
+ case EEOP_AGG_STRICT_INPUT_CHECK:
+ {
+ int nargs = op->d.agg_strict_input_check.nargs;
+ bool *nulls = op->d.agg_strict_input_check.nulls;
+ int argno;
+
+ LLVMValueRef v_nullp;
+ LLVMBasicBlockRef *b_checknulls;
+
+ v_nullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) nulls, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_nullp");
+
+ /* create blocks for checking args */
+ b_checknulls = palloc(sizeof(LLVMBasicBlockRef) * nargs);
+ for (argno = 0; argno < nargs; argno++)
+ {
+ b_checknulls[argno] = LLVMInsertBasicBlock(opblocks[i + 1], "check-null");
+ }
+
+ LLVMBuildBr(builder, b_checknulls[0]);
+
+ /* strict function, check for NULL args */
+ for (argno = 0; argno < nargs; argno++)
+ {
+ LLVMValueRef v_argno = LLVMConstInt(LLVMInt32Type(), argno, false);
+ LLVMValueRef v_argisnull;
+ LLVMBasicBlockRef b_argnotnull;
+
+ LLVMPositionBuilderAtEnd(builder, b_checknulls[argno]);
+
+ if (argno + 1 == nargs)
+ b_argnotnull = opblocks[i + 1];
+ else
+ b_argnotnull = b_checknulls[argno + 1];
+
+ v_argisnull = LLVMBuildLoad(
+ builder,
+ LLVMBuildGEP(
+ builder, v_nullp, &v_argno, 1, ""),
+ "");
+
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_argisnull,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ opblocks[op->d.agg_strict_input_check.jumpnull],
+ b_argnotnull);
+ }
+
+ break;
+ }
+
+ case EEOP_AGG_INIT_TRANS:
+ {
+ AggState *aggstate;
+ AggStatePerTrans pertrans;
+
+ LLVMValueRef v_aggstatep;
+ LLVMValueRef v_pertransp;
+
+ LLVMValueRef v_allpergroupspp;
+
+ LLVMValueRef v_pergroupp;
+
+ LLVMValueRef v_setoff, v_transno;
+
+ LLVMValueRef v_notransvalue;
+
+ LLVMBasicBlockRef b_init;
+
+ aggstate = op->d.agg_init_trans.aggstate;
+ pertrans = op->d.agg_init_trans.pertrans;
+
+ v_aggstatep = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) aggstate, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "");
+
+ v_pertransp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) pertrans, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "");
+
+ v_allpergroupspp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (intptr_t) &aggstate->all_pergroups, false),
+ LLVMPointerType(LLVMPointerType(LLVMPointerType(StructAggStatePerGroupData, 0), 0), 0),
+ "aggstate.all_pergroups");
+
+ v_setoff = LLVMConstInt(LLVMInt32Type(), op->d.agg_init_trans.setoff, 0);
+ v_transno = LLVMConstInt(LLVMInt32Type(), op->d.agg_init_trans.transno, 0);
+
+ v_pergroupp = LLVMBuildGEP(
+ builder,
+ LLVMBuildLoad(
+ builder,
+ v_allpergroupspp,
+ ""),
+ &v_setoff, 1, "");
+
+ v_pergroupp = LLVMBuildGEP(
+ builder,
+ LLVMBuildLoad(
+ builder,
+ v_pergroupp,
+ ""),
+ &v_transno, 1, "");
+
+ v_notransvalue = LLVMBuildLoad(
+ builder,
+ LLVMBuildStructGEP(
+ builder, v_pergroupp, 2, "notransvalue"),
+ ""
+ );
+
+ b_init = LLVMInsertBasicBlock(opblocks[i + 1], "inittrans");
+
+ LLVMBuildCondBr(
+ builder,
+ LLVMBuildICmp(builder, LLVMIntEQ, v_notransvalue,
+ LLVMConstInt(LLVMInt8Type(), 1, false), ""),
+ b_init,
+ opblocks[i + 1]);
+
+ LLVMPositionBuilderAtEnd(builder, b_init);
+
+ {
+ LLVMValueRef params[3];
+
+ params[0] = v_aggstatep;
+ params[1] = v_pertransp;
+ params[2] = v_pergroupp;
+
+ LLVMBuildCall(
+ builder,
+ create_ExecAggInitGroup(mod),
+ params, lengthof(params),
+ "");
+ }
+ LLVMBuildBr(builder, opblocks[op->d.agg_init_trans.jumpnull]);
+
+ break;
+ }
+
+ case EEOP_AGG_STRICT_TRANS_CHECK:
+ {
+ LLVMBuildBr(
+ builder,
+ opblocks[i + 1]);
+ break;
+ }
+
+ case EEOP_AGG_PLAIN_TRANS:
+ {
+ AggState *aggstate;
+ AggStatePerTrans pertrans;
+ FunctionCallInfo fcinfo;
+
+ LLVMValueRef v_fcinfo_isnull;
+ LLVMValueRef v_argp, v_argnullp;
+
+ LLVMValueRef v_arg0p;
+ LLVMValueRef v_argnull0p;
+
+ LLVMValueRef v_transvaluep;
+ LLVMValueRef v_transnullp;
+
+ LLVMValueRef v_setno, v_setoff, v_transno;
+ LLVMValueRef v_aggcontext;
+
+ LLVMValueRef v_allpergroupsp;
+ LLVMValueRef v_current_setp;
+ LLVMValueRef v_current_pertransp;
+ LLVMValueRef v_curaggcontext;
+
+ LLVMValueRef v_pertransp;
+
+ LLVMValueRef v_pergroupp;
+ LLVMValueRef v_argno;
+
+
+ LLVMValueRef v_retval;
+
+ aggstate = op->d.agg_plain_trans.aggstate;
+ pertrans = op->d.agg_plain_trans.pertrans;
+
+ fcinfo = &pertrans->transfn_fcinfo;
+
+ v_argnullp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo->argnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "v_argnullp");
+
+ v_argp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) fcinfo->arg, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "v_arg");
+
+ v_setno = LLVMConstInt(LLVMInt32Type(), op->d.agg_plain_trans.setno, 0);
+ v_setoff = LLVMConstInt(LLVMInt32Type(), op->d.agg_plain_trans.setoff, 0);
+ v_transno = LLVMConstInt(LLVMInt32Type(), op->d.agg_plain_trans.transno, 0);
+ v_aggcontext = LLVMConstInt(LLVMInt64Type(), (uintptr_t)op->d.agg_plain_trans.aggcontext, 0);
+
+ v_pertransp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) pertrans, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "");
+
+ v_current_setp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) &aggstate->current_set, false),
+ LLVMPointerType(LLVMInt32Type(), 0),
+ "aggstate.current_set");
+ v_curaggcontext = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) &aggstate->curaggcontext, false),
+ LLVMPointerType(TypeSizeT, 0),
+ "");
+ v_current_pertransp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) &aggstate->curpertrans, false),
+ LLVMPointerType(LLVMPointerType(TypeSizeT, 0), 0),
+ "aggstate.curpertrans");
+
+ v_allpergroupsp = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(TypeSizeT, (uintptr_t) &aggstate->all_pergroups, false),
+ LLVMPointerType(LLVMPointerType(LLVMPointerType(StructAggStatePerGroupData, 0), 0), 0),
+ "aggstate.all_pergroups");
+
+ v_pergroupp = LLVMBuildGEP(
+ builder,
+ LLVMBuildLoad(
+ builder,
+ v_allpergroupsp,
+ ""),
+ &v_setoff, 1, "setoff");
+
+ v_pergroupp = LLVMBuildGEP(
+ builder,
+ LLVMBuildLoad(
+ builder,
+ v_pergroupp,
+ ""),
+ &v_transno, 1, "transno");
+
+ /* set aggstate globals */
+ LLVMBuildStore(builder, v_setno, v_current_setp);
+ LLVMBuildStore(builder, v_pertransp, v_current_pertransp);
+ LLVMBuildStore(builder, v_aggcontext, v_curaggcontext);
+
+ /* store transvalue in fcinfo->arg/argnull[0] */
+ v_argno = LLVMConstInt(LLVMInt32Type(), 0, false);
+ v_arg0p = LLVMBuildGEP(builder, v_argp, &v_argno, 1, "");
+ v_argnull0p = LLVMBuildGEP(builder, v_argnullp, &v_argno, 1, "");
+
+ v_transvaluep = LLVMBuildStructGEP(
+ builder, v_pergroupp, 0, "transvaluep");
+ v_transnullp = LLVMBuildStructGEP(
+ builder, v_pergroupp, 1, "transnullp");
+
+ LLVMBuildStore(
+ builder,
+ LLVMBuildLoad(
+ builder,
+ v_transvaluep,
+ "transvalue"),
+ v_arg0p);
+
+ LLVMBuildStore(
+ builder,
+ LLVMBuildLoad(
+ builder,
+ v_transnullp,
+ "transnull"),
+ v_argnull0p);
+
+ v_retval = BuildFunctionCall(context, builder, mod, fcinfo, &v_fcinfo_isnull);
+
+ /* retrieve trans value */
+ LLVMBuildStore(
+ builder,
+ v_retval,
+ v_transvaluep);
+ LLVMBuildStore(
+ builder,
+ v_fcinfo_isnull,
+ v_transnullp);
+
+ LLVMBuildBr(builder, opblocks[i + 1]);
+
break;
}
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index df453b2ab4..a8e56f6f3a 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -64,12 +64,14 @@
#include "executor/execExpr.h"
#include "executor/nodeSubplan.h"
#include "funcapi.h"
+#include "utils/memutils.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "pgstat.h"
#include "utils/builtins.h"
#include "utils/date.h"
+#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/timestamp.h"
#include "utils/typcache.h"
@@ -358,6 +360,13 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
&&CASE_EEOP_WINDOW_FUNC,
&&CASE_EEOP_SUBPLAN,
&&CASE_EEOP_ALTERNATIVE_SUBPLAN,
+ &&CASE_EEOP_AGG_FILTER,
+ &&CASE_EEOP_AGG_STRICT_INPUT_CHECK,
+ &&CASE_EEOP_AGG_INIT_TRANS,
+ &&CASE_EEOP_AGG_STRICT_TRANS_CHECK,
+ &&CASE_EEOP_AGG_PLAIN_TRANS,
+ &&CASE_EEOP_AGG_ORDERED_TRANS_DATUM,
+ &&CASE_EEOP_AGG_ORDERED_TRANS_TUPLE,
&&CASE_EEOP_LAST
};
@@ -1461,6 +1470,171 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ EEO_CASE(EEOP_AGG_FILTER)
+ {
+ if (*op->resnull || !DatumGetBool(*op->resvalue))
+ {
+ Assert(op->d.agg_filter.jumpfalse != -1);
+ EEO_JUMP(op->d.agg_filter.jumpfalse);
+ }
+ else
+ EEO_NEXT();
+ }
+
+ EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK)
+ {
+ int argno;
+ bool *nulls = op->d.agg_strict_input_check.nulls;
+
+ Assert(op->d.agg_strict_input_check.jumpnull != -1);
+
+ for (argno = 0; argno < op->d.agg_strict_input_check.nargs; argno++)
+ {
+ if (nulls[argno])
+ {
+ EEO_JUMP(op->d.agg_strict_input_check.jumpnull);
+ }
+ }
+ EEO_NEXT();
+ }
+
+ EEO_CASE(EEOP_AGG_INIT_TRANS)
+ {
+ AggState *aggstate;
+ AggStatePerGroup pergroup;
+
+ aggstate = op->d.agg_init_trans.aggstate;
+ pergroup = &aggstate->all_pergroups
+ [op->d.agg_init_trans.setoff]
+ [op->d.agg_init_trans.transno];
+
+ if (pergroup->noTransValue)
+ {
+ AggStatePerTrans pertrans = op->d.agg_init_trans.pertrans;
+
+ aggstate->curaggcontext = op->d.agg_init_trans.aggcontext;
+ aggstate->current_set = op->d.agg_init_trans.setno;
+
+ ExecAggInitGroup(aggstate, pertrans, pergroup);
+
+ EEO_JUMP(op->d.agg_init_trans.jumpnull);
+ }
+
+ EEO_NEXT();
+ }
+
+ EEO_CASE(EEOP_AGG_STRICT_TRANS_CHECK)
+ {
+ AggState *aggstate;
+ AggStatePerGroup pergroup;
+
+ aggstate = op->d.agg_strict_trans_check.aggstate;
+ pergroup = &aggstate->all_pergroups
+ [op->d.agg_strict_trans_check.setoff]
+ [op->d.agg_strict_trans_check.transno];
+
+ Assert(op->d.agg_strict_trans_check.jumpnull != -1);
+
+ if (unlikely(pergroup->transValueIsNull))
+ {
+ elog(ERROR, "blarg");
+ EEO_JUMP(op->d.agg_strict_trans_check.jumpnull);
+ }
+ EEO_NEXT();
+ }
+
+ EEO_CASE(EEOP_AGG_PLAIN_TRANS)
+ {
+ AggState *aggstate;
+ AggStatePerTrans pertrans;
+ AggStatePerGroup pergroup;
+ FunctionCallInfo fcinfo;
+ MemoryContext oldContext;
+ Datum newVal;
+
+ aggstate = op->d.agg_plain_trans.aggstate;
+ pertrans = op->d.agg_plain_trans.pertrans;
+
+ pergroup = &aggstate->all_pergroups
+ [op->d.agg_plain_trans.setoff]
+ [op->d.agg_plain_trans.transno];
+
+ fcinfo = &pertrans->transfn_fcinfo;
+
+ /* cf. select_current_set() */
+ aggstate->curaggcontext = op->d.agg_plain_trans.aggcontext;
+ aggstate->current_set = op->d.agg_plain_trans.setno;
+
+ oldContext = MemoryContextSwitchTo(aggstate->tmpcontext->ecxt_per_tuple_memory);
+
+ /* set up aggstate->curpertrans for AggGetAggref() */
+ aggstate->curpertrans = pertrans;
+
+ fcinfo->arg[0] = pergroup->transValue;
+ fcinfo->argnull[0] = pergroup->transValueIsNull;
+ fcinfo->isnull = false; /* just in case transfn doesn't set it */
+
+ newVal = FunctionCallInvoke(fcinfo);
+
+ /*
+ * If pass-by-ref datatype, must copy the new value into aggcontext and
+ * free the prior transValue. But if transfn returned a pointer to its
+ * first input, we don't need to do anything. Also, if transfn returned a
+ * pointer to a R/W expanded object that is already a child of the
+ * aggcontext, assume we can adopt that value without copying it.
+ */
+ if (!pertrans->transtypeByVal &&
+ DatumGetPointer(newVal) != DatumGetPointer(pergroup->transValue))
+ {
+ if (!fcinfo->isnull)
+ {
+ MemoryContextSwitchTo(aggstate->curaggcontext->ecxt_per_tuple_memory);
+ if (DatumIsReadWriteExpandedObject(newVal,
+ false,
+ pertrans->transtypeLen) &&
+ MemoryContextGetParent(DatumGetEOHP(newVal)->eoh_context) == CurrentMemoryContext)
+ /* do nothing */ ;
+ else
+ newVal = datumCopy(newVal,
+ pertrans->transtypeByVal,
+ pertrans->transtypeLen);
+ }
+ if (!pergroup->transValueIsNull)
+ {
+ if (DatumIsReadWriteExpandedObject(pergroup->transValue,
+ false,
+ pertrans->transtypeLen))
+ DeleteExpandedObject(pergroup->transValue);
+ else
+ pfree(DatumGetPointer(pergroup->transValue));
+ }
+ }
+
+
+ pergroup->transValue = newVal;
+ pergroup->transValueIsNull = fcinfo->isnull;
+
+ MemoryContextSwitchTo(oldContext);
+
+ EEO_NEXT();
+ }
+
+ EEO_CASE(EEOP_AGG_ORDERED_TRANS_DATUM)
+ {
+ /* too complex for an inline implementation */
+ ExecEvalAggOrderedTransDatum(state, op, econtext);
+
+ EEO_NEXT();
+ }
+
+ EEO_CASE(EEOP_AGG_ORDERED_TRANS_TUPLE)
+ {
+ /* too complex for an inline implementation */
+ ExecEvalAggOrderedTransTuple(state, op, econtext);
+
+ EEO_NEXT();
+ }
+
EEO_CASE(EEOP_LAST)
{
/* unreachable */
@@ -3539,3 +3713,52 @@ ExecEvalWholeRowVar(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
*op->resvalue = PointerGetDatum(dtuple);
*op->resnull = false;
}
+
+void
+ExecAggInitGroup(AggState *aggstate, AggStatePerTrans pertrans, AggStatePerGroup pergroup)
+{
+ FunctionCallInfo fcinfo = &pertrans->transfn_fcinfo;
+ MemoryContext oldContext;
+
+ /*
+ * transValue has not been initialized. This is the first non-NULL
+ * input value. We use it as the initial value for transValue. (We
+ * already checked that the agg's input type is binary-compatible
+ * with its transtype, so straight copy here is OK.)
+ *
+ * We must copy the datum into aggcontext if it is pass-by-ref. We
+ * do not need to pfree the old transValue, since it's NULL.
+ */
+ oldContext = MemoryContextSwitchTo(
+ aggstate->curaggcontext->ecxt_per_tuple_memory);
+ pergroup->transValue = datumCopy(fcinfo->arg[1],
+ pertrans->transtypeByVal,
+ pertrans->transtypeLen);
+ pergroup->transValueIsNull = false;
+ pergroup->noTransValue = false;
+ MemoryContextSwitchTo(oldContext);
+}
+
+
+void
+ExecEvalAggOrderedTransDatum(ExprState *state, ExprEvalStep *op,
+ ExprContext *econtext)
+{
+ AggStatePerTrans pertrans = op->d.agg_ordered_trans.pertrans;
+ int setno = op->d.agg_ordered_trans.setno;
+
+ tuplesort_putdatum(pertrans->sortstates[setno],
+ *op->resvalue, *op->resnull);
+}
+
+void ExecEvalAggOrderedTransTuple(ExprState *state, ExprEvalStep *op,
+ ExprContext *econtext)
+{
+ AggStatePerTrans pertrans = op->d.agg_ordered_trans.pertrans;
+ int setno = op->d.agg_ordered_trans.setno;
+
+ ExecClearTuple(pertrans->sortslot);
+ pertrans->sortslot->tts_nvalid = pertrans->numInputs;
+ ExecStoreVirtualTuple(pertrans->sortslot);
+ tuplesort_puttupleslot(pertrans->sortstates[setno], pertrans->sortslot);
+}
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index a63c05cb68..3f3dadd2da 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -229,295 +229,6 @@
#include "utils/datum.h"
-/*
- * AggStatePerTransData - per aggregate state value information
- *
- * Working state for updating the aggregate's state value, by calling the
- * transition function with an input row. This struct does not store the
- * information needed to produce the final aggregate result from the transition
- * state, that's stored in AggStatePerAggData instead. This separation allows
- * multiple aggregate results to be produced from a single state value.
- */
-typedef struct AggStatePerTransData
-{
- /*
- * These values are set up during ExecInitAgg() and do not change
- * thereafter:
- */
-
- /*
- * Link to an Aggref expr this state value is for.
- *
- * There can be multiple Aggref's sharing the same state value, as long as
- * the inputs and transition function are identical. This points to the
- * first one of them.
- */
- Aggref *aggref;
-
- /*
- * Nominal number of arguments for aggregate function. For plain aggs,
- * this excludes any ORDER BY expressions. For ordered-set aggs, this
- * counts both the direct and aggregated (ORDER BY) arguments.
- */
- int numArguments;
-
- /*
- * Number of aggregated input columns. This includes ORDER BY expressions
- * in both the plain-agg and ordered-set cases. Ordered-set direct args
- * are not counted, though.
- */
- int numInputs;
-
- /* offset of input columns in AggState->evalslot */
- int inputoff;
-
- /*
- * Number of aggregated input columns to pass to the transfn. This
- * includes the ORDER BY columns for ordered-set aggs, but not for plain
- * aggs. (This doesn't count the transition state value!)
- */
- int numTransInputs;
-
- /* Oid of the state transition or combine function */
- Oid transfn_oid;
-
- /* Oid of the serialization function or InvalidOid */
- Oid serialfn_oid;
-
- /* Oid of the deserialization function or InvalidOid */
- Oid deserialfn_oid;
-
- /* Oid of state value's datatype */
- Oid aggtranstype;
-
- /* ExprStates of the FILTER and argument expressions. */
- ExprState *aggfilter; /* state of FILTER expression, if any */
- List *aggdirectargs; /* states of direct-argument expressions */
-
- /*
- * fmgr lookup data for transition function or combine function. Note in
- * particular that the fn_strict flag is kept here.
- */
- FmgrInfo transfn;
-
- /* fmgr lookup data for serialization function */
- FmgrInfo serialfn;
-
- /* fmgr lookup data for deserialization function */
- FmgrInfo deserialfn;
-
- /* Input collation derived for aggregate */
- Oid aggCollation;
-
- /* number of sorting columns */
- int numSortCols;
-
- /* number of sorting columns to consider in DISTINCT comparisons */
- /* (this is either zero or the same as numSortCols) */
- int numDistinctCols;
-
- /* deconstructed sorting information (arrays of length numSortCols) */
- AttrNumber *sortColIdx;
- Oid *sortOperators;
- Oid *sortCollations;
- bool *sortNullsFirst;
-
- /*
- * fmgr lookup data for input columns' equality operators --- only
- * set/used when aggregate has DISTINCT flag. Note that these are in
- * order of sort column index, not parameter index.
- */
- FmgrInfo *equalfns; /* array of length numDistinctCols */
-
- /*
- * initial value from pg_aggregate entry
- */
- Datum initValue;
- bool initValueIsNull;
-
- /*
- * We need the len and byval info for the agg's input and transition data
- * types in order to know how to copy/delete values.
- *
- * Note that the info for the input type is used only when handling
- * DISTINCT aggs with just one argument, so there is only one input type.
- */
- int16 inputtypeLen,
- transtypeLen;
- bool inputtypeByVal,
- transtypeByVal;
-
- /*
- * Stuff for evaluation of aggregate inputs in cases where the aggregate
- * requires sorted input. The arguments themselves will be evaluated via
- * AggState->evalslot/evalproj for all aggregates at once, but we only
- * want to sort the relevant columns for individual aggregates.
- */
- TupleDesc sortdesc; /* descriptor of input tuples */
-
- /*
- * Slots for holding the evaluated input arguments. These are set up
- * during ExecInitAgg() and then used for each input row requiring
- * processing besides what's done in AggState->evalproj.
- */
- TupleTableSlot *sortslot; /* current input tuple */
- TupleTableSlot *uniqslot; /* used for multi-column DISTINCT */
-
- /*
- * These values are working state that is initialized at the start of an
- * input tuple group and updated for each input tuple.
- *
- * For a simple (non DISTINCT/ORDER BY) aggregate, we just feed the input
- * values straight to the transition function. If it's DISTINCT or
- * requires ORDER BY, we pass the input values into a Tuplesort object;
- * then at completion of the input tuple group, we scan the sorted values,
- * eliminate duplicates if needed, and run the transition function on the
- * rest.
- *
- * We need a separate tuplesort for each grouping set.
- */
-
- Tuplesortstate **sortstates; /* sort objects, if DISTINCT or ORDER BY */
-
- /*
- * This field is a pre-initialized FunctionCallInfo struct used for
- * calling this aggregate's transfn. We save a few cycles per row by not
- * re-initializing the unchanging fields; which isn't much, but it seems
- * worth the extra space consumption.
- */
- FunctionCallInfoData transfn_fcinfo;
-
- /* Likewise for serialization and deserialization functions */
- FunctionCallInfoData serialfn_fcinfo;
-
- FunctionCallInfoData deserialfn_fcinfo;
-} AggStatePerTransData;
-
-/*
- * AggStatePerAggData - per-aggregate information
- *
- * This contains the information needed to call the final function, to produce
- * a final aggregate result from the state value. If there are multiple
- * identical Aggrefs in the query, they can all share the same per-agg data.
- *
- * These values are set up during ExecInitAgg() and do not change thereafter.
- */
-typedef struct AggStatePerAggData
-{
- /*
- * Link to an Aggref expr this state value is for.
- *
- * There can be multiple identical Aggref's sharing the same per-agg. This
- * points to the first one of them.
- */
- Aggref *aggref;
-
- /* index to the state value which this agg should use */
- int transno;
-
- /* Optional Oid of final function (may be InvalidOid) */
- Oid finalfn_oid;
-
- /*
- * fmgr lookup data for final function --- only valid when finalfn_oid oid
- * is not InvalidOid.
- */
- FmgrInfo finalfn;
-
- /*
- * Number of arguments to pass to the finalfn. This is always at least 1
- * (the transition state value) plus any ordered-set direct args. If the
- * finalfn wants extra args then we pass nulls corresponding to the
- * aggregated input columns.
- */
- int numFinalArgs;
-
- /*
- * We need the len and byval info for the agg's result data type in order
- * to know how to copy/delete values.
- */
- int16 resulttypeLen;
- bool resulttypeByVal;
-
-} AggStatePerAggData;
-
-/*
- * AggStatePerGroupData - per-aggregate-per-group working state
- *
- * These values are working state that is initialized at the start of
- * an input tuple group and updated for each input tuple.
- *
- * In AGG_PLAIN and AGG_SORTED modes, we have a single array of these
- * structs (pointed to by aggstate->pergroup); we re-use the array for
- * each input group, if it's AGG_SORTED mode. In AGG_HASHED mode, the
- * hash table contains an array of these structs for each tuple group.
- *
- * Logically, the sortstate field belongs in this struct, but we do not
- * keep it here for space reasons: we don't support DISTINCT aggregates
- * in AGG_HASHED mode, so there's no reason to use up a pointer field
- * in every entry of the hashtable.
- */
-typedef struct AggStatePerGroupData
-{
- Datum transValue; /* current transition value */
- bool transValueIsNull;
-
- bool noTransValue; /* true if transValue not set yet */
-
- /*
- * Note: noTransValue initially has the same value as transValueIsNull,
- * and if true both are cleared to false at the same time. They are not
- * the same though: if transfn later returns a NULL, we want to keep that
- * NULL and not auto-replace it with a later input value. Only the first
- * non-NULL input will be auto-substituted.
- */
-} AggStatePerGroupData;
-
-/*
- * AggStatePerPhaseData - per-grouping-set-phase state
- *
- * Grouping sets are divided into "phases", where a single phase can be
- * processed in one pass over the input. If there is more than one phase, then
- * at the end of input from the current phase, state is reset and another pass
- * taken over the data which has been re-sorted in the mean time.
- *
- * Accordingly, each phase specifies a list of grouping sets and group clause
- * information, plus each phase after the first also has a sort order.
- */
-typedef struct AggStatePerPhaseData
-{
- AggStrategy aggstrategy; /* strategy for this phase */
- int numsets; /* number of grouping sets (or 0) */
- int *gset_lengths; /* lengths of grouping sets */
- Bitmapset **grouped_cols; /* column groupings for rollup */
- FmgrInfo *eqfunctions; /* per-grouping-field equality fns */
- Agg *aggnode; /* Agg node for phase data */
- Sort *sortnode; /* Sort node for input ordering for phase */
-} AggStatePerPhaseData;
-
-/*
- * AggStatePerHashData - per-hashtable state
- *
- * When doing grouping sets with hashing, we have one of these for each
- * grouping set. (When doing hashing without grouping sets, we have just one of
- * them.)
- */
-typedef struct AggStatePerHashData
-{
- TupleHashTable hashtable; /* hash table with one entry per group */
- TupleHashIterator hashiter; /* for iterating through hash table */
- TupleTableSlot *hashslot; /* slot for loading hash table */
- FmgrInfo *hashfunctions; /* per-grouping-field hash fns */
- FmgrInfo *eqfunctions; /* per-grouping-field equality fns */
- int numCols; /* number of hash key columns */
- int numhashGrpCols; /* number of columns in hash table */
- int largestGrpColIdx; /* largest col required for hashing */
- AttrNumber *hashGrpColIdxInput; /* hash col indices in input slot */
- AttrNumber *hashGrpColIdxHash; /* indices in hashtbl tuples */
- Agg *aggnode; /* original Agg node, for numGroups etc. */
-} AggStatePerHashData;
-
-
static void select_current_set(AggState *aggstate, int setno, bool is_hash);
static void initialize_phase(AggState *aggstate, int newphase);
static TupleTableSlot *fetch_input_tuple(AggState *aggstate);
@@ -578,21 +289,6 @@ static int find_compatible_pertrans(AggState *aggstate, Aggref *newagg,
List *transnos);
-/*
- * Select the current grouping set; affects current_set and
- * curaggcontext.
- */
-static void
-select_current_set(AggState *aggstate, int setno, bool is_hash)
-{
- if (is_hash)
- aggstate->curaggcontext = aggstate->hashcontext;
- else
- aggstate->curaggcontext = aggstate->aggcontexts[setno];
-
- aggstate->current_set = setno;
-}
-
/*
* Switch to phase "newphase", which must either be 0 or 1 (to reset) or
* current_phase + 1. Juggle the tuplesorts accordingly.
@@ -954,137 +650,12 @@ advance_transition_function(AggState *aggstate,
static void
advance_aggregates(AggState *aggstate, AggStatePerGroup *sort_pergroups, AggStatePerGroup *hash_pergroups)
{
- int transno;
- int setno = 0;
- int numGroupingSets = Max(aggstate->phase->numsets, 1);
- int numHashes = aggstate->num_hashes;
- int numTrans = aggstate->numtrans;
- TupleTableSlot *slot = aggstate->evalslot;
- Datum *values = slot->tts_values;
- bool *nulls = slot->tts_isnull;
- AggStatePerTrans pertrans;
+ bool isnull;
- /* compute input for all aggregates */
- if (aggstate->evalproj)
- aggstate->evalslot = ExecProject(aggstate->evalproj);
-
- for (transno = 0, pertrans = &aggstate->pertrans[0];
- transno < numTrans; transno++, pertrans++)
- {
- ExprState *filter = pertrans->aggfilter;
- int numTransInputs = pertrans->numTransInputs;
- int i;
- int inputoff = pertrans->inputoff;
-
- /* Skip anything FILTERed out */
- if (filter)
- {
- Datum res;
- bool isnull;
-
- res = ExecEvalExprSwitchContext(filter, aggstate->tmpcontext,
- &isnull);
- if (isnull || !DatumGetBool(res))
- continue;
- }
-
- if (pertrans->numSortCols > 0)
- {
- /* DISTINCT and/or ORDER BY case */
- Assert(slot->tts_nvalid >= (pertrans->numInputs + inputoff));
- Assert(!hash_pergroups);
-
- /*
- * If the transfn is strict, we want to check for nullity before
- * storing the row in the sorter, to save space if there are a lot
- * of nulls. Note that we must only check numTransInputs columns,
- * not numInputs, since nullity in columns used only for sorting
- * is not relevant here.
- */
- if (pertrans->transfn.fn_strict)
- {
- for (i = 0; i < numTransInputs; i++)
- {
- if (slot->tts_isnull[i + inputoff])
- break;
- }
- if (i < numTransInputs)
- continue;
- }
-
- for (setno = 0; setno < numGroupingSets; setno++)
- {
- /* OK, put the tuple into the tuplesort object */
- if (pertrans->numInputs == 1)
- tuplesort_putdatum(pertrans->sortstates[setno],
- values[inputoff], nulls[inputoff]);
- else
- {
- /*
- * Copy slot contents, starting from inputoff, into sort
- * slot.
- */
- ExecClearTuple(pertrans->sortslot);
- memcpy(pertrans->sortslot->tts_values,
- &values[inputoff],
- pertrans->numInputs * sizeof(Datum));
- memcpy(pertrans->sortslot->tts_isnull,
- &nulls[inputoff],
- pertrans->numInputs * sizeof(bool));
- pertrans->sortslot->tts_nvalid = pertrans->numInputs;
- ExecStoreVirtualTuple(pertrans->sortslot);
- tuplesort_puttupleslot(pertrans->sortstates[setno], pertrans->sortslot);
- }
- }
- }
- else
- {
- /* We can apply the transition function immediately */
- FunctionCallInfo fcinfo = &pertrans->transfn_fcinfo;
-
- /* Load values into fcinfo */
- /* Start from 1, since the 0th arg will be the transition value */
- Assert(slot->tts_nvalid >= (numTransInputs + inputoff));
-
- for (i = 0; i < numTransInputs; i++)
- {
- fcinfo->arg[i + 1] = values[i + inputoff];
- fcinfo->argnull[i + 1] = nulls[i + inputoff];
- }
-
- if (sort_pergroups)
- {
- /* advance transition states for ordered grouping */
-
- for (setno = 0; setno < numGroupingSets; setno++)
- {
- AggStatePerGroup pergroupstate;
-
- select_current_set(aggstate, setno, false);
-
- pergroupstate = &sort_pergroups[setno][transno];
-
- advance_transition_function(aggstate, pertrans, pergroupstate);
- }
- }
-
- if (hash_pergroups)
- {
- /* advance transition states for hashed grouping */
-
- for (setno = 0; setno < numHashes; setno++)
- {
- AggStatePerGroup pergroupstate;
-
- select_current_set(aggstate, setno, true);
-
- pergroupstate = &hash_pergroups[setno][transno];
-
- advance_transition_function(aggstate, pertrans, pergroupstate);
- }
- }
- }
- }
+ ExecEvalExprSwitchContext(aggstate->phase->evaltrans,
+ aggstate->tmpcontext,
+ &isnull);
+ return;
}
/*
@@ -2663,6 +2234,8 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
AggStatePerAgg peraggs;
AggStatePerTrans pertransstates;
AggStatePerTrans pertrans;
+ AggStatePerGroup *pergroups;
+ ExprContext **contexts;
Plan *outerPlan;
ExprContext *econtext;
int numaggs,
@@ -2985,6 +2558,30 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
aggstate->peragg = peraggs;
aggstate->pertrans = pertransstates;
+
+ aggstate->all_pergroups =
+ (AggStatePerGroup *) palloc0(sizeof(AggStatePerGroup)
+ * (numGroupingSets + numHashes));
+ aggstate->all_contexts =
+ (ExprContext **) palloc0(sizeof(ExprContext *)
+ * (numGroupingSets + numHashes));
+ pergroups = aggstate->all_pergroups;
+ contexts = aggstate->all_contexts;
+
+ if (node->aggstrategy != AGG_HASHED)
+ {
+ for (i = 0; i < numGroupingSets; i++)
+ {
+ pergroups[i] = (AggStatePerGroup) palloc0(sizeof(AggStatePerGroupData)
+ * numaggs);
+ contexts[i] = aggstate->aggcontexts[i];
+ }
+
+ aggstate->pergroups = pergroups;
+ pergroups += numGroupingSets;
+ contexts += numGroupingSets;
+ }
+
/*
* Hashing can only appear in the initial phase.
*/
@@ -3001,28 +2598,13 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
}
/* this is an array of pointers, not structures */
- aggstate->hash_pergroup = palloc0(sizeof(AggStatePerGroup) * numHashes);
+ aggstate->hash_pergroup = pergroups;
find_hash_columns(aggstate);
build_hash_table(aggstate);
aggstate->table_filled = false;
}
- if (node->aggstrategy != AGG_HASHED)
- {
- AggStatePerGroup *pergroups =
- (AggStatePerGroup*) palloc0(sizeof(AggStatePerGroup)
- * numGroupingSets);
-
- for (i = 0; i < numGroupingSets; i++)
- {
- pergroups[i] = (AggStatePerGroup) palloc0(sizeof(AggStatePerGroupData)
- * numaggs);
- }
-
- aggstate->pergroups = pergroups;
- }
-
/*
* Initialize current phase-dependent values to initial phase. The initial
* phase is 1 (first sort pass) for all strategies that use sorting (if
@@ -3388,6 +2970,58 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
NULL);
ExecSetSlotDescriptor(aggstate->evalslot, aggstate->evaldesc);
+ /*
+ * Build expressions that perform all the transition work at once. We build
+ * a different one for each phase, as the number of transition function
+ * invocations differs between phases.
+ */
+ for (phaseidx = 0; phaseidx < aggstate->numphases; phaseidx++)
+ {
+ AggStatePerPhase phase = &aggstate->phases[phaseidx];
+ bool dohash, dosort;
+
+ if (!phase->aggnode)
+ continue;
+
+ if (aggstate->aggstrategy == AGG_MIXED &&
+ phaseidx == 1)
+ {
+ dohash = true;
+ dosort = true;
+ }
+ else if (aggstate->aggstrategy == AGG_MIXED &&
+ phaseidx == 0)
+ {
+ dohash = true;
+ dosort = false;
+ }
+ else if (phase->aggstrategy == AGG_PLAIN ||
+ phase->aggstrategy == AGG_SORTED)
+ {
+ dohash = false;
+ dosort = true;
+ }
+ else if (phase->aggstrategy == AGG_HASHED)
+ {
+ dohash = true;
+ dosort = false;
+ }
+ else if (phase->aggstrategy == AGG_MIXED)
+ {
+ dohash = true;
+ dosort = true;
+ }
+ else
+ {
+ elog(ERROR, "unexpected aggregate strategy: %d", (int) phase->aggstrategy);
+ }
+
+ phase->evaltrans = ExecInitAggTrans(aggstate, phase,
+ &aggstate->ss.ps,
+ dosort, dohash);
+
+ }
+
return aggstate;
}
diff --git a/src/backend/lib/llvmjit.c b/src/backend/lib/llvmjit.c
index e05fe2dd72..57d0663410 100644
--- a/src/backend/lib/llvmjit.c
+++ b/src/backend/lib/llvmjit.c
@@ -63,6 +63,7 @@ LLVMTypeRef StructFmgrInfo;
LLVMTypeRef StructFunctionCallInfoData;
LLVMTypeRef StructExprState;
LLVMTypeRef StructExprContext;
+LLVMTypeRef StructAggStatePerGroupData;
static LLVMTargetRef llvm_targetref;
@@ -381,6 +382,18 @@ llvm_create_types(void)
params[0] = LLVMPointerType(StructFunctionCallInfoData, 0);
TypePGFunction = LLVMFunctionType(TypeSizeT, params, lengthof(params), 0);
}
+
+ {
+ LLVMTypeRef members[3];
+
+ members[0] = TypeSizeT;
+ members[1] = LLVMInt8Type();
+ members[2] = LLVMInt8Type();
+
+ StructAggStatePerGroupData = LLVMStructCreateNamed(LLVMGetGlobalContext(),
+ "struct.AggStatePerGroupData");
+ LLVMStructSetBody(StructAggStatePerGroupData, members, lengthof(members), false);
+ }
}
static uint64_t
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 3919ac5598..661c18fcbe 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -15,6 +15,7 @@
#define EXEC_EXPR_H
#include "nodes/execnodes.h"
+#include "executor/nodeAgg.h"
/* forward reference to avoid circularity */
struct ArrayRefState;
@@ -207,6 +208,14 @@ typedef enum ExprEvalOp
EEOP_SUBPLAN,
EEOP_ALTERNATIVE_SUBPLAN,
+ EEOP_AGG_FILTER,
+ EEOP_AGG_STRICT_INPUT_CHECK,
+ EEOP_AGG_INIT_TRANS,
+ EEOP_AGG_STRICT_TRANS_CHECK,
+ EEOP_AGG_PLAIN_TRANS,
+ EEOP_AGG_ORDERED_TRANS_DATUM,
+ EEOP_AGG_ORDERED_TRANS_TUPLE,
+
/* non-existent operation, used e.g. to check array lengths */
EEOP_LAST
} ExprEvalOp;
@@ -555,6 +564,58 @@ typedef struct ExprEvalStep
/* out-of-line state, created by nodeSubplan.c */
AlternativeSubPlanState *asstate;
} alternative_subplan;
+
+ struct
+ {
+ int jumpfalse;
+ } agg_filter;
+
+ struct
+ {
+ bool *nulls;
+ int nargs;
+ int jumpnull;
+ } agg_strict_input_check;
+
+ struct
+ {
+ AggState *aggstate;
+ AggStatePerTrans pertrans;
+ ExprContext *aggcontext;
+ int setno;
+ int transno;
+ int setoff;
+ int jumpnull;
+ } agg_init_trans;
+
+ struct
+ {
+ AggState *aggstate;
+ int setno;
+ int transno;
+ int setoff;
+ int jumpnull;
+ } agg_strict_trans_check;
+
+ struct
+ {
+ AggState *aggstate;
+ AggStatePerTrans pertrans;
+ ExprContext *aggcontext;
+ int setno;
+ int transno;
+ int setoff;
+ } agg_plain_trans;
+
+ struct
+ {
+ AggState *aggstate;
+ AggStatePerTrans pertrans;
+ ExprContext *aggcontext;
+ int setno;
+ int transno;
+ int setoff;
+ } agg_ordered_trans;
} d;
} ExprEvalStep;
@@ -648,4 +709,12 @@ extern void ExecEvalAlternativeSubPlan(ExprState *state, ExprEvalStep *op,
extern void ExecEvalWholeRowVar(ExprState *state, ExprEvalStep *op,
ExprContext *econtext);
+extern void ExecEvalAggOrderedTransDatum(ExprState *state, ExprEvalStep *op,
+ ExprContext *econtext);
+extern void ExecEvalAggOrderedTransTuple(ExprState *state, ExprEvalStep *op,
+ ExprContext *econtext);
+
+
+extern void ExecAggInitGroup(AggState *aggstate, AggStatePerTrans pertrans, AggStatePerGroup pergroup);
+
#endif /* EXEC_EXPR_H */
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ab2df96ca0..af2dfcc287 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -264,6 +264,8 @@ extern ExprState *ExecInitExpr(Expr *node, PlanState *parent);
extern ExprState *ExecInitQual(List *qual, PlanState *parent);
extern ExprState *ExecInitCheck(List *qual, PlanState *parent);
extern List *ExecInitExprList(List *nodes, PlanState *parent);
+extern ExprState *ExecInitAggTrans(AggState *aggstate, struct AggStatePerPhaseData *phase,
+ PlanState *parent, bool doSort, bool doHash);
extern ProjectionInfo *ExecBuildProjectionInfo(List *targetList,
ExprContext *econtext,
TupleTableSlot *slot,
diff --git a/src/include/executor/nodeAgg.h b/src/include/executor/nodeAgg.h
index eff5af9c2a..3932bd8270 100644
--- a/src/include/executor/nodeAgg.h
+++ b/src/include/executor/nodeAgg.h
@@ -16,6 +16,297 @@
#include "nodes/execnodes.h"
+
+/*
+ * AggStatePerTransData - per aggregate state value information
+ *
+ * Working state for updating the aggregate's state value, by calling the
+ * transition function with an input row. This struct does not store the
+ * information needed to produce the final aggregate result from the transition
+ * state, that's stored in AggStatePerAggData instead. This separation allows
+ * multiple aggregate results to be produced from a single state value.
+ */
+typedef struct AggStatePerTransData
+{
+ /*
+ * These values are set up during ExecInitAgg() and do not change
+ * thereafter:
+ */
+
+ /*
+ * Link to an Aggref expr this state value is for.
+ *
+ * There can be multiple Aggref's sharing the same state value, as long as
+ * the inputs and transition function are identical. This points to the
+ * first one of them.
+ */
+ Aggref *aggref;
+
+ /*
+ * Nominal number of arguments for aggregate function. For plain aggs,
+ * this excludes any ORDER BY expressions. For ordered-set aggs, this
+ * counts both the direct and aggregated (ORDER BY) arguments.
+ */
+ int numArguments;
+
+ /*
+ * Number of aggregated input columns. This includes ORDER BY expressions
+ * in both the plain-agg and ordered-set cases. Ordered-set direct args
+ * are not counted, though.
+ */
+ int numInputs;
+
+ /* offset of input columns in AggState->evalslot */
+ int inputoff;
+
+ /*
+ * Number of aggregated input columns to pass to the transfn. This
+ * includes the ORDER BY columns for ordered-set aggs, but not for plain
+ * aggs. (This doesn't count the transition state value!)
+ */
+ int numTransInputs;
+
+ /* Oid of the state transition or combine function */
+ Oid transfn_oid;
+
+ /* Oid of the serialization function or InvalidOid */
+ Oid serialfn_oid;
+
+ /* Oid of the deserialization function or InvalidOid */
+ Oid deserialfn_oid;
+
+ /* Oid of state value's datatype */
+ Oid aggtranstype;
+
+ /* ExprStates of the FILTER and argument expressions. */
+ ExprState *aggfilter; /* state of FILTER expression, if any */
+ List *aggdirectargs; /* states of direct-argument expressions */
+
+ /*
+ * fmgr lookup data for transition function or combine function. Note in
+ * particular that the fn_strict flag is kept here.
+ */
+ FmgrInfo transfn;
+
+ /* fmgr lookup data for serialization function */
+ FmgrInfo serialfn;
+
+ /* fmgr lookup data for deserialization function */
+ FmgrInfo deserialfn;
+
+ /* Input collation derived for aggregate */
+ Oid aggCollation;
+
+ /* number of sorting columns */
+ int numSortCols;
+
+ /* number of sorting columns to consider in DISTINCT comparisons */
+ /* (this is either zero or the same as numSortCols) */
+ int numDistinctCols;
+
+ /* deconstructed sorting information (arrays of length numSortCols) */
+ AttrNumber *sortColIdx;
+ Oid *sortOperators;
+ Oid *sortCollations;
+ bool *sortNullsFirst;
+
+ /*
+ * fmgr lookup data for input columns' equality operators --- only
+ * set/used when aggregate has DISTINCT flag. Note that these are in
+ * order of sort column index, not parameter index.
+ */
+ FmgrInfo *equalfns; /* array of length numDistinctCols */
+
+ /*
+ * initial value from pg_aggregate entry
+ */
+ Datum initValue;
+ bool initValueIsNull;
+
+ /*
+ * We need the len and byval info for the agg's input and transition data
+ * types in order to know how to copy/delete values.
+ *
+ * Note that the info for the input type is used only when handling
+ * DISTINCT aggs with just one argument, so there is only one input type.
+ */
+ int16 inputtypeLen,
+ transtypeLen;
+ bool inputtypeByVal,
+ transtypeByVal;
+
+ /*
+ * Stuff for evaluation of aggregate inputs in cases where the aggregate
+ * requires sorted input. The arguments themselves will be evaluated via
+ * AggState->evalslot/evalproj for all aggregates at once, but we only
+ * want to sort the relevant columns for individual aggregates.
+ */
+ TupleDesc sortdesc; /* descriptor of input tuples */
+
+ /*
+ * Slots for holding the evaluated input arguments. These are set up
+ * during ExecInitAgg() and then used for each input row requiring
+ * processing besides what's done in AggState->evalproj.
+ */
+ TupleTableSlot *sortslot; /* current input tuple */
+ TupleTableSlot *uniqslot; /* used for multi-column DISTINCT */
+
+ /*
+ * These values are working state that is initialized at the start of an
+ * input tuple group and updated for each input tuple.
+ *
+ * For a simple (non DISTINCT/ORDER BY) aggregate, we just feed the input
+ * values straight to the transition function. If it's DISTINCT or
+ * requires ORDER BY, we pass the input values into a Tuplesort object;
+ * then at completion of the input tuple group, we scan the sorted values,
+ * eliminate duplicates if needed, and run the transition function on the
+ * rest.
+ *
+ * We need a separate tuplesort for each grouping set.
+ */
+
+ Tuplesortstate **sortstates; /* sort objects, if DISTINCT or ORDER BY */
+
+ /*
+ * This field is a pre-initialized FunctionCallInfo struct used for
+ * calling this aggregate's transfn. We save a few cycles per row by not
+ * re-initializing the unchanging fields; which isn't much, but it seems
+ * worth the extra space consumption.
+ */
+ FunctionCallInfoData transfn_fcinfo;
+
+ /* Likewise for serialization and deserialization functions */
+ FunctionCallInfoData serialfn_fcinfo;
+
+ FunctionCallInfoData deserialfn_fcinfo;
+} AggStatePerTransData;
+
+/*
+ * AggStatePerAggData - per-aggregate information
+ *
+ * This contains the information needed to call the final function, to produce
+ * a final aggregate result from the state value. If there are multiple
+ * identical Aggrefs in the query, they can all share the same per-agg data.
+ *
+ * These values are set up during ExecInitAgg() and do not change thereafter.
+ */
+typedef struct AggStatePerAggData
+{
+ /*
+ * Link to an Aggref expr this state value is for.
+ *
+ * There can be multiple identical Aggref's sharing the same per-agg. This
+ * points to the first one of them.
+ */
+ Aggref *aggref;
+
+ /* index to the state value which this agg should use */
+ int transno;
+
+ /* Optional Oid of final function (may be InvalidOid) */
+ Oid finalfn_oid;
+
+ /*
+ * fmgr lookup data for final function --- only valid when finalfn_oid is
+ * not InvalidOid.
+ */
+ FmgrInfo finalfn;
+
+ /*
+ * Number of arguments to pass to the finalfn. This is always at least 1
+ * (the transition state value) plus any ordered-set direct args. If the
+ * finalfn wants extra args then we pass nulls corresponding to the
+ * aggregated input columns.
+ */
+ int numFinalArgs;
+
+ /*
+ * We need the len and byval info for the agg's result data type in order
+ * to know how to copy/delete values.
+ */
+ int16 resulttypeLen;
+ bool resulttypeByVal;
+
+} AggStatePerAggData;
+
+/*
+ * AggStatePerGroupData - per-aggregate-per-group working state
+ *
+ * These values are working state that is initialized at the start of
+ * an input tuple group and updated for each input tuple.
+ *
+ * In AGG_PLAIN and AGG_SORTED modes, we have a single array of these
+ * structs (pointed to by aggstate->pergroup); we re-use the array for
+ * each input group, if it's AGG_SORTED mode. In AGG_HASHED mode, the
+ * hash table contains an array of these structs for each tuple group.
+ *
+ * Logically, the sortstate field belongs in this struct, but we do not
+ * keep it here for space reasons: we don't support DISTINCT aggregates
+ * in AGG_HASHED mode, so there's no reason to use up a pointer field
+ * in every entry of the hashtable.
+ */
+typedef struct AggStatePerGroupData
+{
+ Datum transValue; /* current transition value */
+ bool transValueIsNull;
+
+ bool noTransValue; /* true if transValue not set yet */
+
+ /*
+ * Note: noTransValue initially has the same value as transValueIsNull,
+ * and if true both are cleared to false at the same time. They are not
+ * the same though: if transfn later returns a NULL, we want to keep that
+ * NULL and not auto-replace it with a later input value. Only the first
+ * non-NULL input will be auto-substituted.
+ */
+} AggStatePerGroupData;
+
+/*
+ * AggStatePerPhaseData - per-grouping-set-phase state
+ *
+ * Grouping sets are divided into "phases", where a single phase can be
+ * processed in one pass over the input. If there is more than one phase, then
+ * at the end of input from the current phase, state is reset and another pass
+ * taken over the data which has been re-sorted in the mean time.
+ *
+ * Accordingly, each phase specifies a list of grouping sets and group clause
+ * information, plus each phase after the first also has a sort order.
+ */
+typedef struct AggStatePerPhaseData
+{
+ AggStrategy aggstrategy; /* strategy for this phase */
+ int numsets; /* number of grouping sets (or 0) */
+ int *gset_lengths; /* lengths of grouping sets */
+ Bitmapset **grouped_cols; /* column groupings for rollup */
+ FmgrInfo *eqfunctions; /* per-grouping-field equality fns */
+ Agg *aggnode; /* Agg node for phase data */
+ Sort *sortnode; /* Sort node for input ordering for phase */
+
+ ExprState *evaltrans; /* evaluation of transition functions */
+} AggStatePerPhaseData;
+
+/*
+ * AggStatePerHashData - per-hashtable state
+ *
+ * When doing grouping sets with hashing, we have one of these for each
+ * grouping set. (When doing hashing without grouping sets, we have just one of
+ * them.)
+ */
+typedef struct AggStatePerHashData
+{
+ TupleHashTable hashtable; /* hash table with one entry per group */
+ TupleHashIterator hashiter; /* for iterating through hash table */
+ TupleTableSlot *hashslot; /* slot for loading hash table */
+ FmgrInfo *hashfunctions; /* per-grouping-field hash fns */
+ FmgrInfo *eqfunctions; /* per-grouping-field equality fns */
+ int numCols; /* number of hash key columns */
+ int numhashGrpCols; /* number of columns in hash table */
+ int largestGrpColIdx; /* largest col required for hashing */
+ AttrNumber *hashGrpColIdxInput; /* hash col indices in input slot */
+ AttrNumber *hashGrpColIdxHash; /* indices in hashtbl tuples */
+ Agg *aggnode; /* original Agg node, for numGroups etc. */
+} AggStatePerHashData;
+
extern AggState *ExecInitAgg(Agg *node, EState *estate, int eflags);
extern void ExecEndAgg(AggState *node);
extern void ExecReScanAgg(AggState *node);
@@ -24,4 +315,21 @@ extern Size hash_agg_entry_size(int numAggs);
extern Datum aggregate_dummy(PG_FUNCTION_ARGS);
+/*
+ * Select the current grouping set; affects current_set and
+ * curaggcontext.
+ */
+static inline void
+select_current_set(AggState *aggstate, int setno, bool is_hash)
+{
+ /* when changing this, also adapt ExecInterpExpr() and friends */
+ if (is_hash)
+ aggstate->curaggcontext = aggstate->hashcontext;
+ else
+ aggstate->curaggcontext = aggstate->aggcontexts[setno];
+
+ aggstate->current_set = setno;
+}
+
+
#endif /* NODEAGG_H */
diff --git a/src/include/lib/llvmjit.h b/src/include/lib/llvmjit.h
index 61d7c67d6f..47f9b6d64c 100644
--- a/src/include/lib/llvmjit.h
+++ b/src/include/lib/llvmjit.h
@@ -59,6 +59,7 @@ extern LLVMTypeRef StructFmgrInfo;
extern LLVMTypeRef StructFunctionCallInfoData;
extern LLVMTypeRef StructExprState;
extern LLVMTypeRef StructExprContext;
+extern LLVMTypeRef StructAggStatePerGroupData;
extern void llvm_initialize(void);
extern void llvm_dispose_module(LLVMModuleRef mod, const char *funcname);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index b0c4856392..68352b431c 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1835,6 +1835,11 @@ typedef struct AggState
AggStatePerHash perhash;
AggStatePerGroup *hash_pergroup; /* grouping set indexed array of
* per-group pointers */
+
+ AggStatePerGroup *all_pergroups;
+ ExprContext **all_contexts;
+
+
/* support for evaluation of agg inputs */
TupleTableSlot *evalslot; /* slot for agg inputs */
ProjectionInfo *evalproj; /* projection machinery */
--
2.14.1.2.g4274c698f4.dirty
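To illustrate the idea behind the patch above: the per-row loop in advance_aggregates() is replaced by one flat expression whose steps (EEOP_AGG_FILTER, EEOP_AGG_STRICT_INPUT_CHECK, EEOP_AGG_PLAIN_TRANS, ...) carry jump targets, so a filtered-out or NULL input just skips ahead to the next aggregate. The following is only a toy sketch of that control flow, not the patch's code: every Toy* name is hypothetical, the "transition function" is a plain sum, and none of the real ExprEvalStep fields are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy opcodes, loosely mirroring EEOP_AGG_FILTER and friends (hypothetical). */
typedef enum ToyOp
{
    TOY_AGG_FILTER,             /* skip this aggregate if the filter is false */
    TOY_AGG_STRICT_INPUT_CHECK, /* skip this aggregate if the input is NULL */
    TOY_AGG_PLAIN_TRANS,        /* apply the transition: state += input */
    TOY_DONE                    /* end of the step program */
} ToyOp;

typedef struct ToyStep
{
    ToyOp   opcode;
    int     jump;       /* step index to continue at when skipping */
    int     transno;    /* which transition state this step updates */
} ToyStep;

typedef struct ToyRow
{
    long    value;
    bool    isnull;
    bool    filter_ok;  /* precomputed FILTER result for this row */
} ToyRow;

/* Run the whole step program once for one input row. */
static void
toy_advance(const ToyStep *steps, const ToyRow *row, long *states)
{
    int     pc = 0;

    assert(steps != NULL && row != NULL && states != NULL);

    for (;;)
    {
        const ToyStep *op = &steps[pc];

        switch (op->opcode)
        {
            case TOY_AGG_FILTER:
                pc = row->filter_ok ? pc + 1 : op->jump;
                break;
            case TOY_AGG_STRICT_INPUT_CHECK:
                pc = row->isnull ? op->jump : pc + 1;
                break;
            case TOY_AGG_PLAIN_TRANS:
                states[op->transno] += row->value;
                pc++;
                break;
            case TOY_DONE:
                return;
        }
    }
}
```

A program for two aggregates (one with a FILTER, one strict) is then just an array of five steps. Because the steps are laid out once per phase, the same array can be interpreted or handed to a JIT without re-deciding per row which checks apply.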
0016-Hacky-Preliminary-inlining-implementation.patch (text/x-diff; charset=us-ascii)
From 97b671aff4aafa834befaa3479af0cb2afade4d8 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 31 Aug 2017 23:06:04 -0700
Subject: [PATCH 16/16] Hacky/Preliminary inlining implementation.
---
src/Makefile.global.in | 8 ++
src/backend/Makefile | 6 +
src/backend/executor/execExprCompile.c | 111 +++++++++++++++
src/backend/lib/llvmjit.c | 239 ++++++++++++++++++++++++++++++++-
src/backend/utils/misc/guc.c | 24 ++++
src/include/lib/llvmjit.h | 7 +
6 files changed, 391 insertions(+), 4 deletions(-)
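
Note on the bookkeeping in BuildFunctionCall() below: for each builtin call the patch looks up the bitcode module defining the function and records it with list_append_unique_ptr(), so a module referenced by many calls is recorded only once. A minimal standalone sketch of that "append unless the same pointer is already present" behaviour follows. It assumes a fixed-size array instead of a PostgreSQL List, and ToyModuleSet / toy_append_unique_ptr are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MODULES 8

/* Hypothetical stand-in for context->inline_modules. */
typedef struct ToyModuleSet
{
    const void *modules[TOY_MAX_MODULES];
    size_t      nmodules;
} ToyModuleSet;

/*
 * Append mod unless the same pointer is already recorded.
 * Returns 1 if the pointer was added, 0 otherwise.
 */
static int
toy_append_unique_ptr(ToyModuleSet *set, const void *mod)
{
    size_t      i;

    assert(set != NULL);

    if (mod == NULL)
        return 0;               /* no bitcode found for this function */

    for (i = 0; i < set->nmodules; i++)
    {
        if (set->modules[i] == mod)
            return 0;           /* already recorded, compared by identity */
    }

    if (set->nmodules >= TOY_MAX_MODULES)
        return 0;               /* toy limit; a real list would grow */

    set->modules[set->nmodules++] = mod;
    return 1;
}
```

Comparison is by pointer identity rather than by name, which is sufficient only if the lookup hands back the same module object for repeated requests; that is an assumption of this sketch.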
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index ab5862b472..0d47734a6a 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -812,6 +812,10 @@ ifndef COMPILE.c
COMPILE.c = $(CC) $(CFLAGS) $(CPPFLAGS) -c
endif
+ifndef COMPILE.bc
+COMPILE.bc = $(CC) $(filter-out -g3 -g -ggdb3 -fno-strict-aliasing, $(CFLAGS)) -fstrict-aliasing $(CPPFLAGS) -emit-llvm -c
+endif
+
DEPDIR = .deps
ifeq ($(GCC), yes)
@@ -821,6 +825,10 @@ ifeq ($(GCC), yes)
@if test ! -d $(DEPDIR); then mkdir -p $(DEPDIR); fi
$(COMPILE.c) -o $@ $< -MMD -MP -MF $(DEPDIR)/$(*F).Po
+%.bc : %.c %.o
+ @if test ! -d $(DEPDIR); then mkdir -p $(DEPDIR); fi
+ $(COMPILE.bc) -o $@ $<
+
endif # GCC
# Include all the dependency files generated for the current
diff --git a/src/backend/Makefile b/src/backend/Makefile
index c82ad75bda..1f1945b5f3 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -64,6 +64,12 @@ ifneq ($(PORTNAME), aix)
postgres: $(OBJS)
$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) -o $@
+
+ifeq ($(findstring clang,$(CC)), clang)
+parsed: $(OBJS)
+ $(MAKE) $(patsubst %.o,%.bc,$(call expand_subsys,$(OBJS)))
+endif
+
endif
endif
endif
diff --git a/src/backend/executor/execExprCompile.c b/src/backend/executor/execExprCompile.c
index d0b943530c..73412d0e1b 100644
--- a/src/backend/executor/execExprCompile.c
+++ b/src/backend/executor/execExprCompile.c
@@ -213,9 +213,21 @@ BuildFunctionCall(LLVMJitContext *context, LLVMBuilderRef builder,
}
else if (builtin)
{
+ LLVMModuleRef inline_mod;
+
LLVMAddFunction(mod, builtin->funcName, TypePGFunction);
v_fn_addr = LLVMGetNamedFunction(mod, builtin->funcName);
Assert(v_fn_addr);
+
+ inline_mod = llvm_module_for_function(builtin->funcName);
+ if (inline_mod)
+ {
+ context->inline_modules =
+ list_append_unique_ptr(context->inline_modules,
+ inline_mod);
+ }
+
+ forceinline = true;
}
else
{
@@ -263,6 +275,20 @@ BuildFunctionCall(LLVMJitContext *context, LLVMBuilderRef builder,
LLVMValueRef v_lifetime = get_LifetimeEnd(mod);
LLVMValueRef params[2];
+ params[0] = LLVMConstInt(LLVMInt64Type(), sizeof(fcinfo->arg), false);
+ params[1] = LLVMBuildIntToPtr(
+ builder, LLVMConstInt(TypeSizeT, (intptr_t) fcinfo->arg, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+ LLVMBuildCall(builder, v_lifetime, params, lengthof(params), "");
+
+ params[0] = LLVMConstInt(LLVMInt64Type(), sizeof(fcinfo->argnull), false);
+ params[1] = LLVMBuildIntToPtr(
+ builder, LLVMConstInt(TypeSizeT, (intptr_t) fcinfo->argnull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+ LLVMBuildCall(builder, v_lifetime, params, lengthof(params), "");
+
params[0] = LLVMConstInt(LLVMInt64Type(), sizeof(FunctionCallInfoData), false);
params[1] = LLVMBuildBitCast(
builder, v_fcinfo,
@@ -314,6 +340,9 @@ ExecRunCompiledExpr(ExprState *state, ExprContext *econtext, bool *isNull)
return func(state, econtext, isNull);
}
+static void
+emit_lifetime_end(ExprState *state, LLVMModuleRef mod, LLVMBuilderRef builder);
+
bool
ExecReadyCompiledExpr(ExprState *state, PlanState *parent)
{
@@ -519,6 +548,7 @@ ExecReadyCompiledExpr(ExprState *state, PlanState *parent)
LLVMBuildCall(builder, v_lifetime, params, lengthof(params), "");
}
+ emit_lifetime_end(state, mod, builder);
LLVMBuildRet(builder, v_tmpvalue);
break;
@@ -2793,4 +2823,85 @@ ExecReadyCompiledExpr(ExprState *state, PlanState *parent)
return true;
}
+
+static void
+emit_lifetime_end(ExprState *state, LLVMModuleRef mod, LLVMBuilderRef builder)
+{
+ ExprEvalStep *op;
+ int i = 0;
+ int argno = 0;
+ LLVMValueRef v_lifetime = get_LifetimeEnd(mod);
+
+
+ /*
+ * Add lifetime-end annotation, signalling that writes to memory don't
+ * have to be retained (important for inlining potential).
+ */
+
+ for (i = 0; i < state->steps_len; i++)
+ {
+ FunctionCallInfo fcinfo = NULL;
+ LLVMValueRef v_ptr;
+ LLVMValueRef params[2];
+
+ op = &state->steps[i];
+
+ switch ((ExprEvalOp) op->opcode)
+ {
+ case EEOP_FUNCEXPR:
+ case EEOP_FUNCEXPR_STRICT:
+ case EEOP_NULLIF:
+ case EEOP_DISTINCT:
+ fcinfo = op->d.func.fcinfo_data;
+
+ for (argno = 0; argno < op->d.func.nargs; argno++)
+ {
+ params[0] = LLVMConstInt(LLVMInt64Type(), sizeof(Datum), false);
+ params[1] = LLVMBuildIntToPtr(
+ builder, LLVMConstInt(TypeSizeT, (intptr_t) &fcinfo->arg[argno], false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+ LLVMBuildCall(builder, v_lifetime, params, lengthof(params), "");
+
+ params[0] = LLVMConstInt(LLVMInt64Type(), sizeof(bool), false);
+ params[1] = LLVMBuildIntToPtr(
+ builder, LLVMConstInt(TypeSizeT, (intptr_t) &fcinfo->argnull[argno], false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+ LLVMBuildCall(builder, v_lifetime, params, lengthof(params), "");
+ }
+ params[0] = LLVMConstInt(LLVMInt64Type(), sizeof(FunctionCallInfoData), false);
+ params[1] = LLVMBuildIntToPtr(
+ builder, LLVMConstInt(TypeSizeT, (intptr_t) fcinfo, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "");
+ LLVMBuildCall(builder, v_lifetime, params, lengthof(params), "");
+
+ break;
+ case EEOP_ROWCOMPARE_STEP:
+ fcinfo = op->d.rowcompare_step.fcinfo_data;
+ break;
+ case EEOP_BOOLTEST_IS_TRUE:
+ case EEOP_BOOLTEST_IS_NOT_FALSE:
+ case EEOP_BOOLTEST_IS_FALSE:
+ case EEOP_BOOLTEST_IS_NOT_TRUE:
+ if (op->d.boolexpr.anynull)
+ {
+ v_ptr = LLVMBuildIntToPtr(
+ builder,
+ LLVMConstInt(LLVMInt64Type(), (intptr_t) op->d.boolexpr.anynull, false),
+ LLVMPointerType(LLVMInt8Type(), 0),
+ "anynull");
+
+ params[0] = LLVMConstInt(LLVMInt64Type(), sizeof(bool), false);
+ params[1] = v_ptr;
+ LLVMBuildCall(builder, v_lifetime, params, lengthof(params), "");
+ }
+ break;
+ default:
+ break;
+ }
+
+ }
+}
#endif
diff --git a/src/backend/lib/llvmjit.c b/src/backend/lib/llvmjit.c
index 57d0663410..86d43c3c07 100644
--- a/src/backend/lib/llvmjit.c
+++ b/src/backend/lib/llvmjit.c
@@ -9,6 +9,7 @@
#include "utils/memutils.h"
#include "utils/resowner_private.h"
+#include "utils/varlena.h"
#ifdef USE_LLVM
@@ -32,6 +33,8 @@
/* GUCs */
bool jit_log_ir = 0;
bool jit_dump_bitcode = 0;
+bool jit_perform_inlining = 0;
+char *jit_inline_directories = NULL;
static bool llvm_initialized = false;
static LLVMPassManagerBuilderRef llvm_pmb;
@@ -72,6 +75,8 @@ static LLVMOrcJITStackRef llvm_orc;
static void llvm_shutdown(void);
static void llvm_create_types(void);
+static void llvm_search_inline_directories(void);
+
static void
llvm_shutdown(void)
@@ -103,6 +108,8 @@ llvm_initialize(void)
LLVMLoadLibraryPermanently("");
llvm_triple = LLVMGetDefaultTargetTriple();
+ /* FIXME: overwrite with clang compatible one? */
+ llvm_triple = "x86_64-pc-linux-gnu";
if (LLVMGetTargetFromTriple(llvm_triple, &llvm_targetref, &error) != 0)
{
@@ -117,6 +124,7 @@ llvm_initialize(void)
llvm_pmb = LLVMPassManagerBuilderCreate();
LLVMPassManagerBuilderSetOptLevel(llvm_pmb, 3);
+ LLVMPassManagerBuilderUseInlinerWithThreshold(llvm_pmb, 0);
llvm_orc = LLVMOrcCreateInstance(llvm_targetmachine);
@@ -128,9 +136,188 @@ llvm_initialize(void)
llvm_create_types();
llvm_initialized = true;
+
+ llvm_search_inline_directories();
+
MemoryContextSwitchTo(oldcontext);
}
+#include "common/string.h"
+#include "storage/fd.h"
+#include "miscadmin.h"
+
+static HTAB *InlineModuleHash = NULL;
+
+typedef struct InlineableFunction
+{
+ NameData fname;
+ const char *path;
+ LLVMModuleRef mod;
+} InlineableFunction;
+
+static void
+llvm_preload_bitcode(const char *filename)
+{
+ LLVMMemoryBufferRef buf;
+ char *msg;
+ LLVMValueRef func;
+ LLVMModuleRef mod = NULL;
+
+ mod = LLVMModuleCreateWithName("tmp");
+
+ if (LLVMCreateMemoryBufferWithContentsOfFile(
+ filename, &buf, &msg))
+ {
+ elog(ERROR, "LLVMCreateMemoryBufferWithContentsOfFile(%s) failed: %s",
+ filename, msg);
+ }
+
+#if 1
+ if (LLVMParseBitcode2(buf, &mod))
+ {
+ /* the "2" variants report no message, so don't print the unset msg */
+ elog(ERROR, "LLVMParseBitcode2(%s) failed", filename);
+ }
+#else
+ if (LLVMGetBitcodeModule2(buf, &mod))
+ {
+ elog(ERROR, "LLVMGetBitcodeModule2(%s) failed", filename);
+ }
+#endif
+
+ func = LLVMGetFirstFunction(mod);
+ while (func)
+ {
+ const char *funcname = LLVMGetValueName(func);
+
+ if (!LLVMIsDeclaration(func))
+ {
+ if (LLVMGetLinkage(func) == LLVMExternalLinkage)
+ {
+ InlineableFunction *fentry;
+ bool found;
+
+ fentry = (InlineableFunction *)
+ hash_search(InlineModuleHash,
+ (void *) funcname,
+ HASH_ENTER, &found);
+
+ if (found)
+ {
+ elog(LOG, "skipping loading func %s, already exists at %s, loading %s",
+ funcname, fentry->path, filename);
+ }
+ else
+ {
+ fentry->path = pstrdup(filename);
+ fentry->mod = mod;
+ }
+
+ LLVMSetLinkage(func, LLVMAvailableExternallyLinkage);
+ }
+ }
+
+ func = LLVMGetNextFunction(func);
+ }
+}
+
+static void
+llvm_search_inline_directory(const char *path)
+{
+ DIR *dir;
+ struct dirent *de;
+
+ dir = AllocateDir(path);
+ if (dir == NULL)
+ {
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open directory \"%s\": %m", path)));
+ return;
+ }
+
+ while ((de = ReadDir(dir, path)) != NULL)
+ {
+ char subpath[MAXPGPATH * 2];
+ struct stat fst;
+ int sret;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (strcmp(de->d_name, ".") == 0 ||
+ strcmp(de->d_name, "..") == 0)
+ continue;
+
+ snprintf(subpath, sizeof(subpath), "%s/%s", path, de->d_name);
+
+ sret = lstat(subpath, &fst);
+
+ if (sret < 0)
+ {
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", subpath)));
+ continue;
+ }
+
+ if (S_ISREG(fst.st_mode))
+ {
+ if (pg_str_endswith(subpath, ".bc"))
+ {
+ llvm_preload_bitcode(subpath);
+ }
+ }
+ else if (S_ISDIR(fst.st_mode))
+ {
+ llvm_search_inline_directory(subpath);
+ }
+ }
+
+ FreeDir(dir); /* we ignore any error here */
+}
+
+static void
+llvm_search_inline_directories(void)
+{
+ List *elemlist;
+ ListCell *lc;
+ HASHCTL ctl;
+
+ Assert(InlineModuleHash == NULL);
+ /* First time through: initialize the hash table */
+
+ MemSet(&ctl, 0, sizeof(ctl));
+ ctl.keysize = sizeof(NameData);
+ ctl.entrysize = sizeof(InlineableFunction);
+ InlineModuleHash = hash_create("inlineable function cache", 64,
+ &ctl, HASH_ELEM);
+
+ SplitDirectoriesString(pstrdup(jit_inline_directories), ';', &elemlist);
+
+ foreach(lc, elemlist)
+ {
+ char *curdir = (char *) lfirst(lc);
+
+ llvm_search_inline_directory(curdir);
+ }
+}
+
+LLVMModuleRef
+llvm_module_for_function(const char *funcname)
+{
+ InlineableFunction *fentry;
+ bool found;
+
+ fentry = (InlineableFunction *)
+ hash_search(InlineModuleHash,
+ (void *) funcname,
+ HASH_FIND, &found);
+
+ if (fentry)
+ return fentry->mod;
+ return NULL;
+}
+
+
static void
llvm_create_types(void)
{
@@ -399,7 +586,11 @@ llvm_create_types(void)
static uint64_t
llvm_resolve_symbol(const char *name, void *ctx)
{
- return (uint64_t) LLVMSearchForAddressOfSymbol(name);
+ uint64_t addr = (uint64_t) LLVMSearchForAddressOfSymbol(name);
+
+ if (!addr)
+ elog(ERROR, "failed to resolve name %s", name);
+ return addr;
}
void *
@@ -412,9 +603,22 @@ llvm_get_function(LLVMJitContext *context, const char *funcname)
if (!context->compiled)
{
int handle;
- LLVMSharedModuleRef smod = LLVMOrcMakeSharedModule(context->module);
+ LLVMSharedModuleRef smod;
MemoryContext oldcontext;
+ if (jit_perform_inlining)
+ {
+ ListCell *lc;
+
+ foreach(lc, context->inline_modules)
+ {
+ LLVMModuleRef inline_mod = lfirst(lc);
+
+ inline_mod = LLVMCloneModule(inline_mod);
+ LLVMLinkModules2Needed(context->module, inline_mod);
+ }
+ }
+
if (jit_log_ir)
{
LLVMDumpModule(context->module);
@@ -439,13 +643,26 @@ llvm_get_function(LLVMJitContext *context, const char *funcname)
llvm_mpm = LLVMCreatePassManager();
LLVMPassManagerBuilderPopulateFunctionPassManager(llvm_pmb, llvm_fpm);
- LLVMPassManagerBuilderPopulateModulePassManager(llvm_pmb, llvm_mpm);
+ //LLVMPassManagerBuilderPopulateModulePassManager(llvm_pmb, llvm_mpm);
LLVMPassManagerBuilderPopulateLTOPassManager(llvm_pmb, llvm_mpm, true, true);
+ LLVMAddCFGSimplificationPass(llvm_fpm);
+ LLVMAddJumpThreadingPass(llvm_fpm);
+ LLVMAddTypeBasedAliasAnalysisPass(llvm_fpm);
+ LLVMAddDeadStoreEliminationPass(llvm_fpm);
+ LLVMAddConstantPropagationPass(llvm_fpm);
+ LLVMAddSCCPPass(llvm_fpm);
+
LLVMAddAnalysisPasses(llvm_targetmachine, llvm_mpm);
LLVMAddAnalysisPasses(llvm_targetmachine, llvm_fpm);
- LLVMAddDeadStoreEliminationPass(llvm_fpm);
+ /* do function level optimization */
+ LLVMInitializeFunctionPassManager(llvm_fpm);
+ for (func = LLVMGetFirstFunction(context->module);
+ func != NULL;
+ func = LLVMGetNextFunction(func))
+ LLVMRunFunctionPassManager(llvm_fpm, func);
+ LLVMFinalizeFunctionPassManager(llvm_fpm);
/* do function level optimization */
LLVMInitializeFunctionPassManager(llvm_fpm);
@@ -457,11 +674,24 @@ llvm_get_function(LLVMJitContext *context, const char *funcname)
/* do module level optimization */
LLVMRunPassManager(llvm_mpm, context->module);
+ LLVMRunPassManager(llvm_mpm, context->module);
+ LLVMRunPassManager(llvm_mpm, context->module);
+ LLVMRunPassManager(llvm_mpm, context->module);
LLVMDisposePassManager(llvm_fpm);
LLVMDisposePassManager(llvm_mpm);
}
+ if (jit_dump_bitcode)
+ {
+ /* FIXME: invent module rather than function specific name */
+ char *filename = psprintf("%s.optimized.bc", funcname);
+ LLVMWriteBitcodeToFile(context->module, filename);
+ pfree(filename);
+ }
+
+ smod = LLVMOrcMakeSharedModule(context->module);
+
/* and emit the code */
{
handle =
@@ -480,6 +710,7 @@ llvm_get_function(LLVMJitContext *context, const char *funcname)
context->module = NULL;
context->compiled = true;
+ context->inline_modules = NIL;
}
/* search all emitted modules for function we're asked for */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 4cc9f305a2..82359b1616 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1043,6 +1043,17 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ {"jit_perform_inlining", PGC_USERSET, DEVELOPER_OPTIONS,
+ gettext_noop("inline functions for JIT"),
+ NULL,
+ GUC_NOT_IN_SAMPLE
+ },
+ &jit_perform_inlining,
+ false,
+ NULL, NULL, NULL
+ },
+
#endif
{
@@ -3700,6 +3711,19 @@ static struct config_string ConfigureNamesString[] =
check_wal_consistency_checking, assign_wal_consistency_checking, NULL
},
+#ifdef USE_LLVM
+ {
+ {"jit_inline_directories", PGC_BACKEND, DEVELOPER_OPTIONS,
+ gettext_noop("Sets the directories where inline contents for JIT are located."),
+ NULL,
+ GUC_SUPERUSER_ONLY
+ },
+ &jit_inline_directories,
+ "",
+ NULL, NULL, NULL
+ },
+#endif
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, NULL, NULL, NULL, NULL
diff --git a/src/include/lib/llvmjit.h b/src/include/lib/llvmjit.h
index 47f9b6d64c..e2a8cccafc 100644
--- a/src/include/lib/llvmjit.h
+++ b/src/include/lib/llvmjit.h
@@ -27,12 +27,15 @@ typedef struct LLVMJitContext
{
int counter;
LLVMModuleRef module;
+ List *inline_modules;
bool compiled;
List *handles;
} LLVMJitContext;
extern bool jit_log_ir;
extern bool jit_dump_bitcode;
+extern bool jit_perform_inlining;
+extern char *jit_inline_directories;
extern LLVMTargetMachineRef llvm_targetmachine;
extern const char *llvm_triple;
@@ -74,6 +77,10 @@ extern void llvm_shutdown_orc_perf_support(LLVMOrcJITStackRef llvm_orc);
extern LLVMValueRef slot_compile_deform(struct LLVMJitContext *context, TupleDesc desc, int natts);
+
+extern LLVMModuleRef llvm_module_for_function(const char *funcname);
+
+
#else
struct LLVMJitContext;
--
2.14.1.2.g4274c698f4.dirty
Hi,
On 2017-08-31 23:41:31 -0700, Andres Freund wrote:
I previously had an early prototype of JITing [1] expression evaluation
and tuple deforming. I've since then worked a lot on this.
Here's an initial, not really pretty but functional, submission.
One of the things I'm not really happy about yet is the naming of the
generated functions. Those primarily matter when doing profiling, where
the function name will show up when the profiler supports JIT stuff
(e.g. with a patch I proposed to LLVM that emits perf compatible output,
there's also existing LLVM support for a profiler by intel and
oprofile).
Currently there's essentially a per EState counter and the generated
functions get named deform$n and evalexpr$n. That allows for profiling
of a single query, because different compiled expressions are
disambiguated. It even allows to run the same query over and over, still
giving meaningful results. But it breaks down when running multiple
queries while profiling - evalexpr0 can mean something entirely
different for different queries.
The best idea I have so far would be to name queries like
evalexpr_$fingerprint_$n, but for that we'd need fingerprinting support
outside of pg_stat_statements, which seems painful-ish.
Perhaps somebody has a better idea?
Regards,
Andres
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 09/03/2017 02:59 AM, Andres Freund wrote:
Hi,
On 2017-08-31 23:41:31 -0700, Andres Freund wrote:
I previously had an early prototype of JITing [1] expression evaluation
and tuple deforming. I've since then worked a lot on this.
Here's an initial, not really pretty but functional, submission.
One of the things I'm not really happy about yet is the naming of the
generated functions. Those primarily matter when doing profiling, where
the function name will show up when the profiler supports JIT stuff
(e.g. with a patch I proposed to LLVM that emits perf compatible output,
there's also existing LLVM support for a profiler by intel and
oprofile).
Currently there's essentially a per EState counter and the generated
functions get named deform$n and evalexpr$n. That allows for profiling
of a single query, because different compiled expressions are
disambiguated. It even allows to run the same query over and over, still
giving meaningful results. But it breaks down when running multiple
queries while profiling - evalexpr0 can mean something entirely
different for different queries.
The best idea I have so far would be to name queries like
evalexpr_$fingerprint_$n, but for that we'd need fingerprinting support
outside of pg_stat_statements, which seems painful-ish.
Perhaps somebody has a better idea?
As far as I understand, we do not need a precise fingerprint.
So maybe just calculate some lightweight fingerprint?
For example, take the query text (es_sourceText from EState), replace all non-alphanumeric characters (including spaces) with '_', and take the first N (16?) characters of the result?
It seems to me that in most cases it will help to identify the query...
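A minimal sketch of the lightweight fingerprint described above, assuming the query text is available as a plain C string. The function name query_name_fingerprint and the FINGERPRINT_LEN constant are hypothetical and not part of the patch:

```c
#include <ctype.h>
#include <stddef.h>

#define FINGERPRINT_LEN 16		/* hypothetical value for "N" */

/*
 * Build a profiler-friendly identifier from the query text: keep characters
 * that isalnum() accepts, turn everything else into '_', and stop after
 * FINGERPRINT_LEN characters.  dst must have room for FINGERPRINT_LEN + 1
 * bytes.  Returns the number of characters written (excluding the NUL).
 */
static size_t
query_name_fingerprint(const char *query, char *dst)
{
	size_t		n = 0;

	while (query[n] != '\0' && n < FINGERPRINT_LEN)
	{
		unsigned char c = (unsigned char) query[n];

		dst[n] = isalnum(c) ? (char) c : '_';
		n++;
	}
	dst[n] = '\0';
	return n;
}
```

Two different queries that share their first 16 characters would still collide under this scheme, so it only helps to identify a query, not to guarantee uniqueness.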
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Andres Freund <andres@anarazel.de> writes:
Currently there's essentially a per EState counter and the generated
functions get named deform$n and evalexpr$n. That allows for profiling
of a single query, because different compiled expressions are
disambiguated. It even allows to run the same query over and over, still
giving meaningful results. But it breaks down when running multiple
queries while profiling - evalexpr0 can mean something entirely
different for different queries.
The best idea I have so far would be to name queries like
evalexpr_$fingerprint_$n, but for that we'd need fingerprinting support
outside of pg_stat_statements, which seems painful-ish.
Yeah. Why not just use a static counter to give successive unique IDs
to each query that gets JIT-compiled? Then the function names would
be like deform_$querynumber_$subexprnumber.
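A rough sketch of that static-counter naming, under the assumption that one ID is reserved per JIT-compiled query and shared by all functions generated for it. The names jit_query_counter, jit_next_query_id and jit_function_name are made up for illustration:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical process-wide counter, bumped once per JIT-compiled query. */
static unsigned int jit_query_counter = 0;

/* Reserve the ID that all functions generated for one query will share. */
static unsigned int
jit_next_query_id(void)
{
	return jit_query_counter++;
}

/*
 * Format a name such as "deform_3_0" or "evalexpr_3_1".  Returns what
 * snprintf() returns, so callers can detect truncation.
 */
static int
jit_function_name(char *dst, size_t dstlen, const char *kind,
				  unsigned int query_id, unsigned int subexpr_no)
{
	return snprintf(dst, dstlen, "%s_%u_%u", kind, query_id, subexpr_no);
}
```

Because the counter only ever moves forward, executing the same query again in the same session draws a fresh ID and therefore produces different function names.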
regards, tom lane
On 2017-09-03 10:11:37 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
Currently there's essentially a per EState counter and the generated
functions get named deform$n and evalexpr$n. That allows for profiling
of a single query, because different compiled expressions are
disambiguated. It even allows to run the same query over and over, still
giving meaningful results. But it breaks down when running multiple
queries while profiling - evalexpr0 can mean something entirely
different for different queries.
The best idea I have so far would be to name queries like
evalexpr_$fingerprint_$n, but for that we'd need fingerprinting support
outside of pg_stat_statements, which seems painful-ish.
Yeah. Why not just use a static counter to give successive unique IDs
to each query that gets JIT-compiled? Then the function names would
be like deform_$querynumber_$subexprnumber.
That works, but unfortunately it doesn't keep the names the same over
reruns. So if you rerun the query inside the same session - a quite
reasonable thing to get more accurate profiles - the names in the
profile will change. That makes it quite hard to compare profiles,
especially when a single execution of the query is too quick to see
something meaningful.
Greetings,
Andres Freund
On 01.09.2017 09:41, Andres Freund wrote:
Hi,
I previously had an early prototype of JITing [1] expression evaluation
and tuple deforming. I've since then worked a lot on this.
Here's an initial, not really pretty but functional, submission. This
supports all types of expressions, and tuples, and allows, albeit with
some drawbacks, inlining of builtin functions. Between the version at
[1] and this I'd done some work in c++, because that allowed to
experiment more with llvm, but I've now translated everything back.
Some features I'd to re-implement due to limitations of C API.
I've whacked this around quite heavily today, this likely has some new
bugs, sorry for that :(
Can you please clarify the following fragment calculating attributes
alignment:
/* compute what following columns are aligned to */
+ if (att->attlen < 0)
+ {
+ /* can't guarantee any alignment after varlen field */
+ attcuralign = -1;
+ }
+ else if (att->attnotnull && attcuralign >= 0)
+ {
+ Assert(att->attlen > 0);
+ attcuralign += att->attlen;
+ }
+ else if (att->attnotnull)
+ {
+ /*
+ * After a NOT NULL fixed-width column, alignment is
+ * guaranteed to be the minimum of the forced alignment and
+ * length. XXX
+ */
+ attcuralign = alignto + att->attlen;
+ Assert(attcuralign > 0);
+ }
+ else
+ {
+ //elog(LOG, "attnotnullreset: %d", attnum);
+ attcuralign = -1;
+ }
I wonder why in this branch (att->attnotnull && attcuralign >= 0)
we are not adding "alignto". Also, the comment in the following branch,
else if (att->attnotnull), seems unrelated to that branch, because there
attcuralign is expected to be less than zero, which means that the previous
attribute is a varlen field.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi,
On 2017-09-04 20:01:03 +0300, Konstantin Knizhnik wrote:
I previously had an early prototype of JITing [1] expression evaluation
and tuple deforming. I've since then worked a lot on this.
Here's an initial, not really pretty but functional, submission. This
supports all types of expressions, and tuples, and allows, albeit with
some drawbacks, inlining of builtin functions. Between the version at
[1] and this I'd done some work in c++, because that allowed to
experiment more with llvm, but I've now translated everything back.
Some features I'd to re-implement due to limitations of C API.
I've whacked this around quite heavily today, this likely has some new
bugs, sorry for that :(
Can you please clarify the following fragment calculating attributes
alignment:
Hi. That piece of code isn't particularly clear (and has a bug in the
submitted version), I'm revising it.
/* compute what following columns are aligned to */
+ if (att->attlen < 0)
+ {
+ /* can't guarantee any alignment after varlen field */
+ attcuralign = -1;
+ }
+ else if (att->attnotnull && attcuralign >= 0)
+ {
+ Assert(att->attlen > 0);
+ attcuralign += att->attlen;
+ }
+ else if (att->attnotnull)
+ {
+ /*
+ * After a NOT NULL fixed-width column, alignment is
+ * guaranteed to be the minimum of the forced alignment and
+ * length. XXX
+ */
+ attcuralign = alignto + att->attlen;
+ Assert(attcuralign > 0);
+ }
+ else
+ {
+ //elog(LOG, "attnotnullreset: %d", attnum);
+ attcuralign = -1;
+ }
I wonder why in this branch (att->attnotnull && attcuralign >= 0)
we are not adding "alignto". Also, the comment in the following branch,
else if (att->attnotnull), seems unrelated to that branch, because there
attcuralign is expected to be less than zero, which means that the previous
attribute is a varlen field.
Yea, I've changed that already, although it's currently added earlier,
because the alignment is needed before, to access the column correctly.
I've also made a number of efficiency improvements, primarily to access
columns with an absolute offset if all preceding ones are fixed width
not null columns - that is quite noticeable performancewise.
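To illustrate the absolute-offset idea, here is a simplified sketch that computes the constant offsets available for a leading run of fixed-width columns. It is not the patch's code: ColumnDesc is a hypothetical stand-in for the catalog data, alignment is given in bytes, and offsets are relative to the start of the tuple's data area.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the per-column catalog data. */
typedef struct ColumnDesc
{
	int			attlen;			/* > 0: fixed width in bytes, < 0: varlen */
	int			attalign;		/* required alignment in bytes (1, 2, 4, 8) */
	bool		attnotnull;		/* declared NOT NULL? */
} ColumnDesc;

/* Round off up to the next multiple of align (align is a power of two). */
static size_t
align_up(size_t off, size_t align)
{
	return (off + align - 1) & ~(align - 1);
}

/*
 * Fill offsets[] with the offset of each leading column whose position does
 * not depend on the tuple's contents, and return how many were filled in.
 */
static int
known_offsets(const ColumnDesc *cols, int ncols, size_t *offsets)
{
	size_t		off = 0;
	int			i;

	for (i = 0; i < ncols; i++)
	{
		if (cols[i].attlen < 0)
			break;				/* varlen: not treated as constant here */

		off = align_up(off, (size_t) cols[i].attalign);
		offsets[i] = off;
		off += (size_t) cols[i].attlen;

		if (!cols[i].attnotnull)
		{
			i++;				/* this column's own offset is known ... */
			break;				/* ... but nothing after a nullable column */
		}
	}
	return i;
}
```

For such columns generated code can load from a constant offset; for everything after the returned count the offset has to be tracked at run time.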
Greetings,
Andres Freund
On 04.09.2017 23:52, Andres Freund wrote:
Yea, I've changed that already, although it's currently added earlier,
because the alignment is needed before, to access the column correctly.
I've also made a number of efficiency improvements, primarily to access
columns with an absolute offset if all preceding ones are fixed width
not null columns - that is quite noticeable performancewise.
Unfortunately, in most real tables the columns are nullable.
I wonder if we can perform some optimization in this case (assuming that
in typical cases a column contains either mostly non-null values or
mostly null values).
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 2017-09-05 13:58:56 +0300, Konstantin Knizhnik wrote:
On 04.09.2017 23:52, Andres Freund wrote:
Yea, I've changed that already, although it's currently added earlier,
because the alignment is needed before, to access the column correctly.
I've also made a number of efficiency improvements, primarily to access
columns with an absolute offset if all preceding ones are fixed width
not null columns - that is quite noticeable performancewise.
Unfortunately, in most real tables the columns are nullable.
I'm not sure I agree with that assertion, but:
I wonder if we can perform some optimization in this case (assuming that in
typical cases a column contains either mostly non-null values or mostly
null values).
Even if all columns are NULLABLE, the JITed code is still a good chunk
faster (a significant part of that is the slot->tts_{nulls,values}
accesses). Alignment is still cheaper with constants, and often enough
the alignment can be avoided (consider e.g. a table full of nullable
ints - everything is guaranteed to be aligned, or columns after an
individual NOT NULL column are also guaranteed to be aligned). What
largely changes is that the 'offset' from the start of the tuple has to
be tracked.
Greetings,
Andres Freund
On 5 September 2017 at 11:58, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
I wonder if we can perform some optimization in this case (assuming that in
typical cases a column contains either mostly non-null values or mostly
null values).
If you really wanted to go crazy here you could do lookup tables of
bits of null bitmaps. Ie, you look at the first byte of the null
bitmap, index into an array and it points to 8 offsets for the 8
fields covered by that much of the bitmap. The lookup table might be
kind of large since offsets are 16-bits so you're talking 256 * 16
bytes or 4kB for every 8 columns up until the first variable size
column (or I suppose you could even continue in the case where the
variable size column is null).
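The lookup-table idea could look roughly like the sketch below, assuming eight fixed-width columns covered by one null-bitmap byte. FixedCol, OffsetTable and build_offset_table are illustrative names only; as in PostgreSQL heap tuples, a set bit is taken to mean the column is present.

```c
#include <stdint.h>

#define COLS_PER_BYTE 8

/* Hypothetical descriptor: fixed byte width and alignment of one column. */
typedef struct FixedCol
{
	uint16_t	len;
	uint16_t	align;			/* power of two */
} FixedCol;

/* offsets[bitmap_byte][column]; entries for NULL columns are left at 0. */
typedef struct OffsetTable
{
	uint16_t	offsets[256][COLS_PER_BYTE];
} OffsetTable;

/*
 * Precompute, for every possible null-bitmap byte, where each of the eight
 * columns it covers starts relative to the first of them.
 */
static void
build_offset_table(const FixedCol cols[COLS_PER_BYTE], OffsetTable *tab)
{
	for (int bits = 0; bits < 256; bits++)
	{
		uint16_t	off = 0;

		for (int c = 0; c < COLS_PER_BYTE; c++)
		{
			tab->offsets[bits][c] = 0;
			if (!(bits & (1 << c)))
				continue;		/* NULL: occupies no space */

			/* align, record the start, then skip over the value */
			off = (uint16_t) ((off + cols[c].align - 1) & ~(cols[c].align - 1));
			tab->offsets[bits][c] = off;
			off = (uint16_t) (off + cols[c].len);
		}
	}
}
```

Each table holds 256 rows of eight 16-bit offsets, so sizeof(OffsetTable) is 4096 bytes per group of eight columns.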
--
greg
On 2017-09-05 19:43:33 +0100, Greg Stark wrote:
On 5 September 2017 at 11:58, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
I wonder if we can perform some optimization in this case (assuming that in
typical cases a column contains either mostly non-null values or mostly
null values).
If you really wanted to go crazy here you could do lookup tables of
bits of null bitmaps. Ie, you look at the first byte of the null
bitmap, index into an array and it points to 8 offsets for the 8
fields covered by that much of the bitmap. The lookup table might be
kind of large since offsets are 16-bits so you're talking 256 * 16
bytes or 4kB for every 8 columns up until the first variable size
column (or I suppose you could even continue in the case where the
variable size column is null).
I'm missing something here. What's this saving? The code for lookups
with NULLs after jitting effectively is
a) one load for every 8 columns (could be optimized to one load every
sizeof(void*) cols)
b) one bitmask for every column + one branch for null
c) load for the datum, indexed by register
d) saving the column value, that's independent of NULLness
e) one addi adding the length to the offset
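Written as plain C rather than generated IR, steps a) to e) correspond roughly to the loop below. This is only an illustration for nullable int32 columns, with a hypothetical function name and simplified value/null arrays standing in for the slot's tts_values and tts_nulls; the bitmap uses the heap tuple convention where a set bit means the value is present.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Deform natts nullable int32 columns.  bitmap is the null bitmap, data
 * points at the packed non-NULL values.
 */
static void
deform_int4_columns(const uint8_t *bitmap, const char *data, int natts,
					int32_t *values, bool *isnull)
{
	size_t		off = 0;
	uint8_t		bits = 0;

	for (int att = 0; att < natts; att++)
	{
		if ((att & 7) == 0)
			bits = bitmap[att >> 3];	/* a) one load per 8 columns */

		if (!(bits & (1 << (att & 7))))	/* b) bitmask + branch for NULL */
		{
			values[att] = 0;
			isnull[att] = true;
			continue;
		}

		/* c) load the datum at the tracked offset, d) store it */
		memcpy(&values[att], data + off, sizeof(int32_t));
		isnull[att] = false;
		off += sizeof(int32_t);			/* e) add the length to the offset */
	}
}
```

In JITed code the loop is unrolled per column, so the bitmask and the length in step e) are constants.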
Greetings,
Andres Freund
On 04.09.2017 23:52, Andres Freund wrote:
Hi. That piece of code isn't particularly clear (and has a bug in the
submitted version), I'm revising it.
...
Yea, I've changed that already, although it's currently added earlier,
because the alignment is needed before, to access the column correctly.
I've also made a number of efficiency improvements, primarily to access
columns with an absolute offset if all preceding ones are fixed width
not null columns - that is quite noticeable performancewise.
Should I wait for a new version of your patch or continue reviewing this code?
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 2017-09-19 12:57:33 +0300, Konstantin Knizhnik wrote:
On 04.09.2017 23:52, Andres Freund wrote:
Hi. That piece of code isn't particularly clear (and has a bug in the
submitted version), I'm revising it.
...
Yea, I've changed that already, although it's currently added earlier,
because the alignment is needed before, to access the column correctly.
I've also made a number of efficiency improvements, primarily to access
columns with an absolute offset if all preceding ones are fixed width
not null columns - that is quite noticeable performancewise.
Should I wait for a new version of your patch or continue reviewing this code?
I'll update the posted version later this week, sorry for the delay.
Regards,
Andres
Hi,
Here's an updated version of the patchset. There's some substantial
changes here, but it's still very obviously very far from committable as
a whole. There's some helper commits that are simple and independent
enough to be committable earlier on.
The git tree of this work, which is *frequently* rebased, is at:
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
The biggest changes are:
- The JIT "infrastructure" is less bad than before, and starting to
shape up.
- The tuple deforming logic is considerably faster than before due to
various optimizations. The optimizations are:
  - build deforming exactly to the required natts for the specific caller
  - avoid checking the tuple's natts for attributes that have
    "following" NOT NULL columns.
  - a bunch of minor codegen improvements.
- The tuple deforming codegen also got simpler by relying on LLVM to
promote a stack variable to a register, instead of working with a
register manually - the need to keep IR in SSA form makes doing so
manually rather painful.
- WIP patch to do execGrouping.c TupleHashTableMatch() via JIT. That
makes the column comparison faster, but more importantly it JITs the
deforming (one side at least always is a MinimalTuple).
- All tests pass with JITed expression, tuple deforming, agg transition
value computation and execGrouping logic. There were a number of bugs,
who would have imagined that.
- some more experimental changes later in the series to address some
bottlenecks.
Functionally this covers all of what I think a sensible goal for v11
is. There's a lot of details to figure out, and the inlining
*implementation* isn't what I think we should do. I'll follow up, not
tonight though, with an email outlining the first few design decisions
we're going to have to finalize, which'll be around the memory/lifetime
management of functions, and other infrastructure pieces (currently
patch 0006).
As the patchset is pretty large already, and not going to get any
smaller, I'll make smaller adjustments solely via the git tree, rather
than full reposts.
Greetings,
Andres Freund
Attachments:
0007-JIT-compile-expressions.v4.patch.gz (application/x-patch-gzip)
C�4�<:z3�=|5� $;8YT��u������ ���n���#��wv�����k�s�������(�����Q�,���/\�7��Dx/������rXm�^������Vc��]�C���������A'.�L��wd�c���t���/P���[�(��;/�;H�]_ �m�,]�,�'���R�J�i~�,,�W$R�s� y��TPi6���JP^j�r'����|H�EN���.'�48�+%����U�����=�W���?��BM�k!���8���{0�^�]�������e��O�!�$����/��V�J���sN���������������<����-��)<�i��^&�s�},��&�rn��Rx��IU�d�dt4���^U?]@0-�������T�?�����h�� �Hth���BF%�R���hT����y��*���nu�H��Y�8�;������}�2V�V�� {�?"��<}H�,d<�z��}�m+�J��N�������!%�B�����mY����R�py!�%:+�]Q�d����+a=��W�:i�/[5�
w�kp��$����s�f5���A�|��CT��2/(%��g%5,
��+l���G���+��<R�z�4�{�[�DCS��+��{S�)s���\����[h��JDDsb[wP�by�L�Q��$�i�Le7�)����2 �����J��=Jy^:��+�+w�1y�h�����h��2l�r����`�^4��_�K�����m�B[��V.���;p�q�����<h_�M����]iZ.���V���/Ep������W��q�U�@�T�+�;������KW����������-��u����X�S_�=�e�N�����/]��3�(�����h7i�JYNt-@������tv����%��A�Jz'�����3�'c�[�K���IR'Kr$��#��8>�rna6;Fgi�KVY��c�?p��s!X�Y��l�
��dy}���JZH�T"�,}([o�V V���WF]���M�/wj�p)��-��������f�-�:�����d b*�������G!�Dc�������8e"�� p�)!dr����e����b*���3l����V��#�`ZN%��6��B�=+�R�f�y>h��:R�d�7�V�"UA����f�u����:��G�{��E�s�$H^^ ��k��f���+��Q��;J���u�P��]�7�T��������v�e6R_�;����u����4������Nhf�����f��N�8����7�^&�����e�F� �����s�<� ���?��G����������C�i���{vq*�s���U��LO�
����ie5���[�x�g���*-����N���!Kp�l�p�������J�cF���ox��-��[\jyw�3��1�-����r&`���`�_s�CcX.������s�����s[��F.��C��n���\��0����$���Q���k�v.�^pg��\P��n0�&�\�'���<w�:q ��D6�6��{���]����,��I��(J[8�p��7qu�r���d�3�1sJ�����T M6�B�Al�����9Rh �#]�LJ�{y��v��������U���Gqh{~\fmDm�<��m��#�p��N�%����B=
����_�fh�D����o6�NlCFL�O��8����[�L7Lw����u0qE���e�?2���CVR� �_BX8�p�q��o��;;����w�+S��-������TK���2��d��V%]�}]|tH�H���/������6b����Cc��\'���
oHm�g^@J3�} ��<��u�i����?%~��������0�y�vzQ�c��B�l&��PeF�~��9YS�"t���-��c7����ek ���ig*����Mhu �sCIT�-�5��`�g�[�� �%-�5��d&O>L3m�b7����e�6DV����{��k�������1V�h�� &'�yE:��!���H�n�1�r�ll�U�~�E�V�.��T�4��-�VK1E��c�#5xg��������[�~�����F�]H��"l;D����Rk&~��NS�+Gi�~F�������V;�jG[�h_~G�}�m�;����W<0,�(����n����� �s2�"�Z�Y������O� )��5eM@�a�`��CROD�������1�4s�]Q�D�p��R����n�6\^��9s�Y�I^G;��+��
�*�f �T��\��(�)��[���C\_�j������lY��B��%d��K_�u����cR�4��Le8��'��5�,m�*��������V��bB������O
��4T�����P��F��3.[��'�+�N�h�j
1n}���Y����6��.�Me`O�0����Ii.>��o�<������#
��^��\��{�M���{�$�E���-��P�<S�@-�T����Zf��N��z�k�>U5/
!���~��5��qmS�@$����T�QxV��,tmAS�/�p�rR�>��9F������s������
[���zu����*�����bb��eh!�����EhC,udT�����*nw�����:�/������C�?�������k/x��-�|]V��"��`}P�-9���]�#�(�a_o�\3�sHe"V�.�^A Q)����4��Ym!lf�b�����9,�)��|,�n~���� ,��
����/����X�@!�E�+��;!�����e�%���,i$z�$���
�D�eP�B�R^�_Z�����>eE[IA��� @O��&�!����ft���qu�����X��L;S�3��7����Q �E8N�:���"l�b������p���2[�{4
ys�kt��<�e<�d���yfi�C4h�<��D�5���w�=��� z�e�NZ�����(� 'PJ[W����X�����b7���]x~��+�0�j��KQLx�{1�����DO4x)���_`F�����,>��� ��#���9F�U
@T60�y/c��� $myH���xB,p����=e�JM�@��d��x�y�{�����0���)j�<qZ�[��b|R�p�������X�4��j��tTT�����_}�d|�w��`=�������+9��6U9�u!3J>7��� Hswd!��x����'l���v�����0��'<L��'� 4��#(���Hg%��6T+9,y> 6+f
�I�T�Ye_��3)���cr(�[rlO��2��j�:�Z����wo��y��Enq��}�\T���?�f*� z�Q��
/
���)3+�!V|^x��"D|���G���S�9����KL�G%���V�g��X2�8<j�H�$1"�[R������K<�V�=<���Jd��tR��yK���_ Um���+���&�C�As�������h5��ES��[���-;F�?�"������{��@�����\.�F�Q*���"}��n�b���^��`r�����,��X�������e�������|���5��F��� ���.��+������b O8�2��#�L��&��2����H�L�w"�D�%>�2W��v��j�=a�r������������B��0�e���C�(�Gj\���8�Nf�6����J�C���+D��q�h�N�R'*���������J}$���l�� !$�&�����;���p�%XO������P��P$�M�1)%o�B���u�����>����I^��2�T.�#0 ��(�3�D_B��!`+��(��d��7OpR������`�H������ww��AQH���RH%CzA��e���LRe��D�e,�h��n�����y�������B8�D����Mv2�D��T��e��]�E���(F�g\����6����]�z�@����?���o��N���2���V9#�e�U�5��0�T������L�]|���,�R���`l=�����'������s�B-����>�m��FeVK�-{�?�:��f��+�Vdb����A�7�]I[U�4�mK��U��[�8�W�g��P�S%��$��u�T��$��IP���r���)$�8��j�� �sP����>�|)�����4��X��,cHfY��-C�0��]���-�����y��q�U�i������J���z��f�|5���J������T�s���N�t���`:�/� ��C��hJ�L�*vp/DQ�����-m%�V'n*�vFV�C���D�X������Pwi���
��}{r���~�sO\�����@@Db���Q�cR�wB�������'7���5���reCrVL@�"g��j��9�3X6 Z�89�H%s�wG���:P~@g �����L�����3���K����y��p�I~�;�^���
��Xh���5���m!.rG����|��I�g�Bp1�@
^����e[�}��S�� 3������u��U�~��y4?�Jj/iS���y�sm����E�~���Eg���P����;�
���� �#�N��T�?K<�M|��s�N�G
�V��)���1����`L��c�c�=�\_�@��cX�@�����S������Q�z����Yc�,�3�`��"��F����9o��5�������_y���a���R8K>�E���0�;a!�����5�?��i�d�1/)���~�~�imoE��Rpt�$E@�$�5\���g+�E4�����LG&Us����$K�W;�(���5����t����~�I�e��?s�<�<Z��}RK�H�;������>� ��u��N�C��C��%K�1�X�d���t��@})�\�~�l���(������Q?iE����j���uzh�3���L\8{B�
{����w�xK�3���t�n^2�J��$ G
�V���|�o�x�)R#�&��;��`1FV�=Bf���{a�����+E��(�N �����`��Q��_�Tu��9q/�5 ��A`�+,���*��e��T,��� ��5��N�~<u�AxO`�����j^V���5x
� ��m��]!2��W��D��De���������8��f���G� �����U��I=F�C��{+����>R<� �[���"l={�M�T�1���B�.H����� �s,X��M@�`��]@��EBQ�7��H��`4�`=;�8'�!�sh�k��� WCCv�eYO��xp���j�sE��*�F�^#���� �A��`F��"�f��\>��&�$��!j�����HH�\�"���[���`����(n�g�62��x>�n���gO68����:�'��?,�M� D$�����
Fl�g|��4QU�8L?�����`��I�����1t���s�!�h�1C��9]q_��.3��4p/lB3�|�!~�C�}C�Q�����NAi�3
�r�1h�IH~2uw�<O���GVo������*Z��_'��e�6a}���� sx�pP���~T{��������-i�e����_2��Q����N��HiI_S��$ '��{#��-Y��
����"�u�NQ����(��J�d�\8'�"�%M�hg���"�����ck{�
��Q�<��{���,�&����*��k��~�C����~�|o�Y���Vk����Qc8~n�wv��k���E��mnn���[���[��M�7����x[��� r�R��h5���wI�]�D?h�N/��Y��-�n��4�"���1��:�����:����5|9�Z���9@
� ��#�i|��d@^4$O�1�����q�ma � D�,������W3'!�k�w��7��:�������;�S�L:�&����n}�am��"J"��k�]
�A�#��,t/��G��#|��z��'�g����S���cd~�.BVh��h�����l��������t[�A���$���t�p~��?�{�����ux�8�Ej!���t�������e����y�?�w?)c��$�?�����[�(+Zh����|g2�O���,�����5�KNJ��������k�j���a�q�{�&�<H ��Rxv�� �`:��$���x�X0 `�kh��z46���y��@h���� �1���4}�����),Lg�hJ���"/�������d]"�2�)���S�� �C���u�7j����xog�g�S
� ����� P���� J��tO��
u����xi�d���� � �O�r0r�Z�Z��]7�F���z�C���n
����B�K@���'bg��={8q���������GoE�e ;�L[
���[*"d���rq �@��`*8������C�3�<&��)x�PW)���16����=D�s��F���l��a��A~�����C��nT��� 4�;���H�y��3=�!Jd�t���z���c��Gk�V�2����k����/�w�i���K5��D��U�{��gb*�Pb{�����b�Q��q��C�V��w��t0�dcK�����0�����6��
ci+(T=����i>1��g�^�� ���
���f����c�&�b��=>��D/�����|��i��|5����Y����k��3��]5�w����
0016-WIP-Inline-ExecScan-mostly-to-make-profiles-easie.v4.patch.gz (application/x-patch-gzip)
On Wed, Oct 4, 2017 at 9:48 AM, Andres Freund <andres@anarazel.de> wrote:
Here's an updated version of the patchset. There's some substantial
changes here, but it's still very obviously very far from committable as
a whole. There's some helper commits that are simple and independent
enough to be committable earlier on.
Looks pretty impressive already.
I wanted to take it for a spin, but got errors about the following
symbols being missing:
LLVMOrcUnregisterPerf
LLVMOrcRegisterGDB
LLVMOrcRegisterPerf
LLVMOrcGetSymbolAddressIn
LLVMLinkModules2Needed
As far as I can tell these are not in mainline LLVM. Is there a branch
or patchset of LLVM available somewhere that I need to use this?
Regards,
Ants Aasma
On 2017-10-04 11:56:47 +0300, Ants Aasma wrote:
On Wed, Oct 4, 2017 at 9:48 AM, Andres Freund <andres@anarazel.de> wrote:
Here's an updated version of the patchset. There's some substantial
changes here, but it's still very obviously very far from committable as
a whole. There's some helper commits that are simple and independent
enough to be committable earlier on.
Looks pretty impressive already.
Thanks!
I wanted to take it for a spin, but got errors about the following
symbols being missing:
LLVMOrcUnregisterPerf
LLVMOrcRegisterGDB
LLVMOrcRegisterPerf
LLVMOrcGetSymbolAddressIn
LLVMLinkModules2Needed
As far as I can tell these are not in mainline LLVM. Is there a branch
or patchset of LLVM available somewhere that I need to use this?
Oops, I'd forgotten about the modifications. Sorry. I've attached them
here. The GDB and Perf stuff should now be an optional dependency,
too. The required changes are fairly small, so they hopefully shouldn't
be too hard to upstream.
Please check the git tree for a rebased version of the pg patches, with
a bunch of bugfixes (oops, some last minute "cleanups") and performance
fixes.
Here's some numbers for a TPC-H scale 5 run. Obviously the Q01 numbers
are pretty nice in particular. But it's also visible that the shorter
queries can lose, which is largely due to the JIT overhead - that can be
ameliorated to some degree, but JITing obviously isn't always going to
be a win.
It's pretty impressive that in q01, even after all of this, expression
evaluation *still* is 35% of the total time (25% in the aggregate
transition function). That's partially just because the query does
primarily aggregation, but also because the generated code can stand a
good chunk of improvements.
query  master min  dev min [diff]       dev-jit min [diff]   dev-jit-deform min [diff]  dev-jit-deform-inline min [diff]
q01    14146.498   11479.05  [-23.24]   8659.961  [-63.36]   7279.395  [-94.34]         6997.956  [-102.15]
q02    1234.229    1208.102  [-2.16]    1292.983  [+4.54]    1580.505  [+21.91]         1809.046  [+31.77]
q03    6220.814    5424.107  [-14.69]   5175.125  [-20.21]   4257.368  [-46.12]         4218.115  [-47.48]
q04    947.476     970.608   [+2.38]    969.944   [+2.32]    999.006   [+5.16]          1033.78   [+8.35]
q05    4729.9      4059.665  [-16.51]   4182.941  [-13.08]   4147.493  [-14.04]         4284.473  [-10.40]
q06    1603.708    1592.107  [-0.73]    1556.216  [-3.05]    1516.078  [-5.78]          1579.839  [-1.51]
q07    4549.738    4331.565  [-5.04]    4475.654  [-1.66]    4645.773  [+2.07]          4885.781  [+6.88]
q08    1394.428    1350.363  [-3.26]    1434.366  [+2.78]    1716.65   [+18.77]         1938.152  [+28.05]
q09    5958.198    5700.329  [-4.52]    5491.683  [-8.49]    5582.431  [-6.73]          5797.475  [-2.77]
q10    5228.69     4475.154  [-16.84]   4269.365  [-22.47]   3767.888  [-38.77]         3962.084  [-31.97]
q11    281.201     280.132   [-0.38]    351.85    [+20.08]   455.885   [+38.32]         532.093   [+47.15]
q12    4289.268    4082.359  [-5.07]    4007.199  [-7.04]    3752.396  [-14.31]         3916.653  [-9.51]
q13    7110.545    6898.576  [-3.07]    6579.554  [-8.07]    6304.15   [-12.79]         6135.952  [-15.88]
q14    678.024     650.943   [-4.16]    682.387   [+0.64]    746.354   [+9.16]          878.437   [+22.81]
q15    1641.897    1650.57   [+0.53]    1661.591  [+1.19]    1821.02   [+9.84]          1863.304  [+11.88]
q16    1890.246    1819.423  [-3.89]    1838.079  [-2.84]    1962.274  [+3.67]          2096.154  [+9.82]
q17    502.605     462.881   [-8.58]    495.648   [-1.40]    537.666   [+6.52]          613.144   [+18.03]
q18    12863.972   11257.57  [-14.27]   10847.61  [-18.59]   10119.769 [-27.12]         10103.051 [-27.33]
q19    281.991     264.191   [-6.74]    331.102   [+14.83]   373.759   [+24.55]         531.07    [+46.90]
q20    541.154     511.372   [-5.82]    565.378   [+4.28]    662.926   [+18.37]         805.835   [+32.85]
q22    678.266     656.643   [-3.29]    676.886   [-0.20]    735.058   [+7.73]          943.013   [+28.07]
total  76772.848   69125.71  [-11.06]   65545.522 [-17.13]   62963.844 [-21.93]         64925.407 [-18.25]
Greetings,
Andres Freund
Attachments:
0001-ORC-Add-findSymbolIn-wrapper-to-C-bindings.patch (text/x-diff; charset=us-ascii)
From f636e4caf62ed2a29851b9cca8bb664df73c7bb9 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 4 Oct 2017 15:36:51 -0700
Subject: [PATCH 1/6] [ORC] Add findSymbolIn() wrapper to C bindings.
---
include/llvm-c/OrcBindings.h | 5 +++++
lib/ExecutionEngine/Orc/OrcCBindings.cpp | 8 ++++++++
lib/ExecutionEngine/Orc/OrcCBindingsStack.h | 22 ++++++++++++++++++++++
3 files changed, 35 insertions(+)
diff --git a/include/llvm-c/OrcBindings.h b/include/llvm-c/OrcBindings.h
index abb3ac6a7f0..4ff1f47e87d 100644
--- a/include/llvm-c/OrcBindings.h
+++ b/include/llvm-c/OrcBindings.h
@@ -170,6 +170,11 @@ LLVMOrcErrorCode LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
LLVMOrcTargetAddress *RetAddr,
const char *SymbolName);
+LLVMOrcErrorCode LLVMOrcGetSymbolAddressIn(LLVMOrcJITStackRef JITStack,
+ LLVMOrcTargetAddress *RetAddr,
+ LLVMOrcModuleHandle H,
+ const char *SymbolName);
+
/**
* Dispose of an ORC JIT stack.
*/
diff --git a/lib/ExecutionEngine/Orc/OrcCBindings.cpp b/lib/ExecutionEngine/Orc/OrcCBindings.cpp
index f945acaf95e..9b9c1512402 100644
--- a/lib/ExecutionEngine/Orc/OrcCBindings.cpp
+++ b/lib/ExecutionEngine/Orc/OrcCBindings.cpp
@@ -120,6 +120,14 @@ LLVMOrcErrorCode LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
return J.findSymbolAddress(*RetAddr, SymbolName, true);
}
+LLVMOrcErrorCode LLVMOrcGetSymbolAddressIn(LLVMOrcJITStackRef JITStack,
+ LLVMOrcTargetAddress *RetAddr,
+ LLVMOrcModuleHandle H,
+ const char *SymbolName) {
+ OrcCBindingsStack &J = *unwrap(JITStack);
+ return J.findSymbolAddressIn(*RetAddr, H, SymbolName, true);
+}
+
LLVMOrcErrorCode LLVMOrcDisposeInstance(LLVMOrcJITStackRef JITStack) {
auto *J = unwrap(JITStack);
auto Err = J->shutdown();
diff --git a/lib/ExecutionEngine/Orc/OrcCBindingsStack.h b/lib/ExecutionEngine/Orc/OrcCBindingsStack.h
index 405970e063d..6eaac01d52f 100644
--- a/lib/ExecutionEngine/Orc/OrcCBindingsStack.h
+++ b/lib/ExecutionEngine/Orc/OrcCBindingsStack.h
@@ -354,6 +354,28 @@ public:
return LLVMOrcErrSuccess;
}
+
+ LLVMOrcErrorCode findSymbolAddressIn(JITTargetAddress &RetAddr,
+ ModuleHandleT H,
+ const std::string &Name,
+ bool ExportedSymbolsOnly) {
+ RetAddr = 0;
+ if (auto Sym = findSymbolIn(H, Name, ExportedSymbolsOnly)) {
+ // Successful lookup, non-null symbol:
+ if (auto AddrOrErr = Sym.getAddress()) {
+ RetAddr = *AddrOrErr;
+ return LLVMOrcErrSuccess;
+ } else
+ return mapError(AddrOrErr.takeError());
+ } else if (auto Err = Sym.takeError()) {
+ // Lookup failure - report error.
+ return mapError(std::move(Err));
+ }
+ // Otherwise we had a successful lookup but got a null result. We already
+ // set RetAddr to '0' above, so just return success.
+ return LLVMOrcErrSuccess;
+ }
+
const std::string &getErrorMessage() const { return ErrMsg; }
private:
--
2.14.1.536.g6867272d5b.dirty
0002-C-API-WIP-Add-LLVMGetHostCPUName.patch (text/x-diff; charset=us-ascii)
From 98716f882cf08f521dc8e57693d224006e3f3b68 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 4 Oct 2017 23:38:46 -0700
Subject: [PATCH 2/6] [C-API] WIP: Add LLVMGetHostCPUName().
---
include/llvm-c/TargetMachine.h | 4 ++++
lib/Target/TargetMachineC.cpp | 8 ++++++++
2 files changed, 12 insertions(+)
diff --git a/include/llvm-c/TargetMachine.h b/include/llvm-c/TargetMachine.h
index f4f7f7698c4..44f6b8babd1 100644
--- a/include/llvm-c/TargetMachine.h
+++ b/include/llvm-c/TargetMachine.h
@@ -137,6 +137,10 @@ LLVMBool LLVMTargetMachineEmitToMemoryBuffer(LLVMTargetMachineRef T, LLVMModuleR
disposed with LLVMDisposeMessage. */
char* LLVMGetDefaultTargetTriple(void);
+/** Get the host CPU as a string. The result needs to be disposed with
+ LLVMDisposeMessage. */
+char* LLVMGetHostCPUName(void);
+
/** Adds the target-specific analysis passes to the pass manager. */
void LLVMAddAnalysisPasses(LLVMTargetMachineRef T, LLVMPassManagerRef PM);
diff --git a/lib/Target/TargetMachineC.cpp b/lib/Target/TargetMachineC.cpp
index 210375ff828..63d0bbf74bc 100644
--- a/lib/Target/TargetMachineC.cpp
+++ b/lib/Target/TargetMachineC.cpp
@@ -238,6 +238,14 @@ char *LLVMGetDefaultTargetTriple(void) {
return strdup(sys::getDefaultTargetTriple().c_str());
}
+/** Get the host CPU as a string. The result needs to be disposed with
+ LLVMDisposeMessage. */
+char* LLVMGetHostCPUName(void)
+{
+ /* XXX: verify it's null terminated */
+ return strdup(sys::getHostCPUName().data());
+}
+
void LLVMAddAnalysisPasses(LLVMTargetMachineRef T, LLVMPassManagerRef PM) {
unwrap(PM)->add(
createTargetTransformInfoWrapperPass(unwrap(T)->getTargetIRAnalysis()));
--
2.14.1.536.g6867272d5b.dirty
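The "XXX: verify it's null terminated" comment in the patch above is about whether the pointer returned by `data()` can be handed to `strdup()` safely. A minimal sketch of a copy that does not depend on termination, using `std::string_view` as a stand-in for LLVM's `StringRef` (an assumption made so the example is self-contained; `dup_view` is a hypothetical helper, not an LLVM function):

```cpp
#include <cstdlib>
#include <cstring>
#include <string_view>

// Copy a length-delimited view into a malloc'ed, null-terminated C string.
// The result is released with free(), which is also what LLVMDisposeMessage
// does with the strings it is given.
char *dup_view(std::string_view sv) {
    char *out = static_cast<char *>(std::malloc(sv.size() + 1));
    if (out == nullptr)
        return nullptr;
    if (!sv.empty())
        std::memcpy(out, sv.data(), sv.size());
    out[sv.size()] = '\0';  // terminate explicitly, never read past the view
    return out;
}
```

Copying by length costs nothing extra and removes the need to reason about how the underlying buffer was produced.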
0003-C-API-Add-LLVMLinkModules2Needed.patch (text/x-diff; charset=us-ascii)
From c0cb3ec7472d4667226d0183382e165a7a6d2c30 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 4 Oct 2017 12:55:38 -0700
Subject: [PATCH 3/6] [C API] Add LLVMLinkModules2Needed().
---
include/llvm-c/Linker.h | 1 +
lib/Linker/LinkModules.cpp | 6 ++++++
2 files changed, 7 insertions(+)
diff --git a/include/llvm-c/Linker.h b/include/llvm-c/Linker.h
index d02c37f94c8..06af8193e57 100644
--- a/include/llvm-c/Linker.h
+++ b/include/llvm-c/Linker.h
@@ -33,6 +33,7 @@ typedef enum {
* Use the diagnostic handler to get any diagnostic message.
*/
LLVMBool LLVMLinkModules2(LLVMModuleRef Dest, LLVMModuleRef Src);
+LLVMBool LLVMLinkModules2Needed(LLVMModuleRef Dest, LLVMModuleRef Src);
#ifdef __cplusplus
}
diff --git a/lib/Linker/LinkModules.cpp b/lib/Linker/LinkModules.cpp
index 25f31a3401a..9a34c9ecce8 100644
--- a/lib/Linker/LinkModules.cpp
+++ b/lib/Linker/LinkModules.cpp
@@ -604,3 +604,9 @@ LLVMBool LLVMLinkModules2(LLVMModuleRef Dest, LLVMModuleRef Src) {
std::unique_ptr<Module> M(unwrap(Src));
return Linker::linkModules(*D, std::move(M));
}
+
+LLVMBool LLVMLinkModules2Needed(LLVMModuleRef Dest, LLVMModuleRef Src) {
+ Module *D = unwrap(Dest);
+ std::unique_ptr<Module> M(unwrap(Src));
+ return Linker::linkModules(*D, std::move(M), Linker::Flags::LinkOnlyNeeded);
+}
--
2.14.1.536.g6867272d5b.dirty
0004-MCJIT-Call-JIT-notifiers-only-after-code-sections-ar.patch (text/x-diff; charset=us-ascii)
From 389e5cca8e5145c9378730479de4b42870a8b347 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 1 Feb 2017 21:18:54 -0800
Subject: [PATCH 4/6] [MCJIT] Call JIT notifiers only after code sections are
ready.
Previously JIT notifiers were called before relocations were
performed (leading to ominous function calls of "0"), and before
memory was marked executable (confusing some profilers).
Move notifications to finalizeLoadedModules().
---
lib/ExecutionEngine/MCJIT/MCJIT.cpp | 16 ++++++++++++++--
lib/ExecutionEngine/MCJIT/MCJIT.h | 2 ++
2 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/lib/ExecutionEngine/MCJIT/MCJIT.cpp b/lib/ExecutionEngine/MCJIT/MCJIT.cpp
index 1164d60ffc1..2dd92164794 100644
--- a/lib/ExecutionEngine/MCJIT/MCJIT.cpp
+++ b/lib/ExecutionEngine/MCJIT/MCJIT.cpp
@@ -222,8 +222,10 @@ void MCJIT::generateCodeForModule(Module *M) {
if (Dyld.hasError())
report_fatal_error(Dyld.getErrorString());
- NotifyObjectEmitted(*LoadedObject.get(), *L);
-
+ // Can't call notifiers yet as relocations have not yet been performed, and
+ // memory hasn't been marked executable.
+ PendingLoadedObjects.push_back(LoadedObject->get());
+ PendingLoadedObjectInfos.push_back(std::move(L));
Buffers.push_back(std::move(ObjectToLoad));
LoadedObjects.push_back(std::move(*LoadedObject));
@@ -243,6 +245,16 @@ void MCJIT::finalizeLoadedModules() {
// Set page permissions.
MemMgr->finalizeMemory();
+
+ // Notify listeners about loaded objects now that memory is marked executable
+ // and relocations have been performed.
+ for (size_t i = 0; i < PendingLoadedObjects.size(); i++) {
+ auto &Obj = PendingLoadedObjects[i];
+ auto &Info = PendingLoadedObjectInfos[i];
+ NotifyObjectEmitted(*Obj, *Info);
+ }
+ PendingLoadedObjects.clear();
+ PendingLoadedObjectInfos.clear();
}
// FIXME: Rename this.
diff --git a/lib/ExecutionEngine/MCJIT/MCJIT.h b/lib/ExecutionEngine/MCJIT/MCJIT.h
index daf578f5daa..418578fc7a3 100644
--- a/lib/ExecutionEngine/MCJIT/MCJIT.h
+++ b/lib/ExecutionEngine/MCJIT/MCJIT.h
@@ -189,6 +189,8 @@ class MCJIT : public ExecutionEngine {
SmallVector<std::unique_ptr<MemoryBuffer>, 2> Buffers;
SmallVector<std::unique_ptr<object::ObjectFile>, 2> LoadedObjects;
+ SmallVector<object::ObjectFile*, 2> PendingLoadedObjects;
+ SmallVector<std::unique_ptr<RuntimeDyld::LoadedObjectInfo>, 2> PendingLoadedObjectInfos;
// An optional ObjectCache to be notified of compiled objects and used to
// perform lookup of pre-compiled code to avoid re-compilation.
--
2.14.1.536.g6867272d5b.dirty
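The reordering described in the commit message above (record loaded objects first, notify listeners only once relocation and finalization are done) can be sketched independently of LLVM. `DeferredNotifier` and its method names are hypothetical and only illustrate the pattern; they are not the MCJIT classes:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Queues "object loaded" events and delivers them only at finalization,
// so listeners never observe code that is not yet relocated or executable.
class DeferredNotifier {
public:
    using Listener = std::function<void(const std::string &)>;

    explicit DeferredNotifier(Listener listener)
        : listener_(std::move(listener)) {}

    // Called when an object has been loaded; nothing is reported yet.
    void object_loaded(std::string name) {
        pending_.push_back(std::move(name));
    }

    // Called after relocations are applied and memory permissions are set.
    // Delivers queued events in load order, then empties the queue so a
    // second finalize() does not report the same object twice.
    void finalize() {
        for (const std::string &name : pending_)
            listener_(name);
        pending_.clear();
    }

    std::size_t pending() const { return pending_.size(); }

private:
    Listener listener_;
    std::vector<std::string> pending_;
};
```

Clearing the queue at the end mirrors the two `clear()` calls in the patch: finalization can run more than once, and each object should be reported exactly once.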
0005-Add-PerfJITEventListener-for-perf-profiling-support.patch (text/x-diff; charset=us-ascii)
From 1db0527249415c5abbcf7425f05e04c7dc1713ef Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 1 Feb 2017 23:10:45 -0800
Subject: [PATCH 5/6] Add PerfJITEventListener for perf profiling support.
---
CMakeLists.txt | 13 +
include/llvm/Config/config.h.cmake | 3 +
include/llvm/Config/llvm-config.h.cmake | 3 +
include/llvm/ExecutionEngine/JITEventListener.h | 9 +
lib/ExecutionEngine/CMakeLists.txt | 4 +
lib/ExecutionEngine/LLVMBuild.txt | 2 +-
lib/ExecutionEngine/Orc/LLVMBuild.txt | 2 +-
lib/ExecutionEngine/PerfJITEvents/CMakeLists.txt | 5 +
.../{Orc => PerfJITEvents}/LLVMBuild.txt | 8 +-
.../PerfJITEvents/PerfJITEventListener.cpp | 530 +++++++++++++++++++++
tools/lli/CMakeLists.txt | 9 +
tools/lli/lli.cpp | 2 +
12 files changed, 584 insertions(+), 6 deletions(-)
create mode 100644 lib/ExecutionEngine/PerfJITEvents/CMakeLists.txt
copy lib/ExecutionEngine/{Orc => PerfJITEvents}/LLVMBuild.txt (72%)
create mode 100644 lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 3e2e548df3f..dac3b817477 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -433,6 +433,16 @@ if( LLVM_USE_OPROFILE )
endif( NOT CMAKE_SYSTEM_NAME MATCHES "Linux" )
endif( LLVM_USE_OPROFILE )
+option(LLVM_USE_PERF
+ "Use perf JIT interface to inform perf about JIT code" OFF)
+
+# If enabled, verify we are on a platform that supports perf.
+if( LLVM_USE_PERF )
+ if( NOT CMAKE_SYSTEM_NAME MATCHES "Linux" )
+ message(FATAL_ERROR "perf support is available on Linux only.")
+ endif( NOT CMAKE_SYSTEM_NAME MATCHES "Linux" )
+endif( LLVM_USE_PERF )
+
set(LLVM_USE_SANITIZER "" CACHE STRING
"Define the sanitizer used to build binaries and tests.")
@@ -639,6 +649,9 @@ endif (LLVM_USE_INTEL_JITEVENTS)
if (LLVM_USE_OPROFILE)
set(LLVMOPTIONALCOMPONENTS ${LLVMOPTIONALCOMPONENTS} OProfileJIT)
endif (LLVM_USE_OPROFILE)
+if (LLVM_USE_PERF)
+ set(LLVMOPTIONALCOMPONENTS ${LLVMOPTIONALCOMPONENTS} PerfJITEvents)
+endif (LLVM_USE_PERF)
message(STATUS "Constructing LLVMBuild project information")
execute_process(
diff --git a/include/llvm/Config/config.h.cmake b/include/llvm/Config/config.h.cmake
index d67148f6aa3..59a504ca3cf 100644
--- a/include/llvm/Config/config.h.cmake
+++ b/include/llvm/Config/config.h.cmake
@@ -374,6 +374,9 @@
/* Define if we have the oprofile JIT-support library */
#cmakedefine01 LLVM_USE_OPROFILE
+/* Define if we have the perf JIT-support library */
+#cmakedefine01 LLVM_USE_PERF
+
/* LLVM version information */
#cmakedefine LLVM_VERSION_INFO "${LLVM_VERSION_INFO}"
diff --git a/include/llvm/Config/llvm-config.h.cmake b/include/llvm/Config/llvm-config.h.cmake
index 4b0c5946061..4003b4d7b15 100644
--- a/include/llvm/Config/llvm-config.h.cmake
+++ b/include/llvm/Config/llvm-config.h.cmake
@@ -62,6 +62,9 @@
/* Define if we have the oprofile JIT-support library */
#cmakedefine01 LLVM_USE_OPROFILE
+/* Define if we have the perf JIT-support library */
+#cmakedefine01 LLVM_USE_PERF
+
/* Major version of the LLVM API */
#define LLVM_VERSION_MAJOR ${LLVM_VERSION_MAJOR}
diff --git a/include/llvm/ExecutionEngine/JITEventListener.h b/include/llvm/ExecutionEngine/JITEventListener.h
index ff7840f00a4..ad89599f717 100644
--- a/include/llvm/ExecutionEngine/JITEventListener.h
+++ b/include/llvm/ExecutionEngine/JITEventListener.h
@@ -115,6 +115,15 @@ public:
}
#endif // USE_OPROFILE
+#if LLVM_USE_PERF
+ static JITEventListener *createPerfJITEventListener();
+#else
+ static JITEventListener *createPerfJITEventListener()
+ {
+ return nullptr;
+ }
+#endif // USE_PERF
+
private:
virtual void anchor();
};
diff --git a/lib/ExecutionEngine/CMakeLists.txt b/lib/ExecutionEngine/CMakeLists.txt
index 84b34919e44..c0dea0550fb 100644
--- a/lib/ExecutionEngine/CMakeLists.txt
+++ b/lib/ExecutionEngine/CMakeLists.txt
@@ -30,3 +30,7 @@ endif( LLVM_USE_OPROFILE )
if( LLVM_USE_INTEL_JITEVENTS )
add_subdirectory(IntelJITEvents)
endif( LLVM_USE_INTEL_JITEVENTS )
+
+if( LLVM_USE_PERF )
+ add_subdirectory(PerfJITEvents)
+endif( LLVM_USE_PERF )
diff --git a/lib/ExecutionEngine/LLVMBuild.txt b/lib/ExecutionEngine/LLVMBuild.txt
index 9d29a41f504..b6e1bda6a51 100644
--- a/lib/ExecutionEngine/LLVMBuild.txt
+++ b/lib/ExecutionEngine/LLVMBuild.txt
@@ -16,7 +16,7 @@
;===------------------------------------------------------------------------===;
[common]
-subdirectories = Interpreter MCJIT RuntimeDyld IntelJITEvents OProfileJIT Orc
+subdirectories = Interpreter MCJIT RuntimeDyld IntelJITEvents OProfileJIT Orc PerfJITEvents
[component_0]
type = Library
diff --git a/lib/ExecutionEngine/Orc/LLVMBuild.txt b/lib/ExecutionEngine/Orc/LLVMBuild.txt
index 8f05172e77a..ef4ae64e823 100644
--- a/lib/ExecutionEngine/Orc/LLVMBuild.txt
+++ b/lib/ExecutionEngine/Orc/LLVMBuild.txt
@@ -19,4 +19,4 @@
type = Library
name = OrcJIT
parent = ExecutionEngine
-required_libraries = Core ExecutionEngine Object RuntimeDyld Support TransformUtils
+required_libraries = Core ExecutionEngine Object RuntimeDyld Support TransformUtils PerfJITEvents
diff --git a/lib/ExecutionEngine/PerfJITEvents/CMakeLists.txt b/lib/ExecutionEngine/PerfJITEvents/CMakeLists.txt
new file mode 100644
index 00000000000..136cc429d02
--- /dev/null
+++ b/lib/ExecutionEngine/PerfJITEvents/CMakeLists.txt
@@ -0,0 +1,5 @@
+add_llvm_library(LLVMPerfJITEvents
+ PerfJITEventListener.cpp
+ )
+
+add_dependencies(LLVMPerfJITEvents LLVMCodeGen)
diff --git a/lib/ExecutionEngine/Orc/LLVMBuild.txt b/lib/ExecutionEngine/PerfJITEvents/LLVMBuild.txt
similarity index 72%
copy from lib/ExecutionEngine/Orc/LLVMBuild.txt
copy to lib/ExecutionEngine/PerfJITEvents/LLVMBuild.txt
index 8f05172e77a..5175f9dd791 100644
--- a/lib/ExecutionEngine/Orc/LLVMBuild.txt
+++ b/lib/ExecutionEngine/PerfJITEvents/LLVMBuild.txt
@@ -1,4 +1,4 @@
-;===- ./lib/ExecutionEngine/MCJIT/LLVMBuild.txt ----------------*- Conf -*--===;
+;===- ./lib/ExecutionEngine/PerfJITEvents/LLVMBuild.txt ----------------*- Conf -*--===;
;
; The LLVM Compiler Infrastructure
;
@@ -16,7 +16,7 @@
;===------------------------------------------------------------------------===;
[component_0]
-type = Library
-name = OrcJIT
+type = OptionalLibrary
+name = PerfJITEvents
parent = ExecutionEngine
-required_libraries = Core ExecutionEngine Object RuntimeDyld Support TransformUtils
+required_libraries = CodeGen Core DebugInfoDWARF Support Object ExecutionEngine
diff --git a/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp b/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp
new file mode 100644
index 00000000000..d8b40e8b949
--- /dev/null
+++ b/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp
@@ -0,0 +1,530 @@
+//===-- PerfJITEventListener.cpp - Tell Linux's perf about JITted code ----===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines a JITEventListener object that tells perf about JITted
+// functions, including source line information.
+//
+// Documentation for perf jit integration is available at:
+// https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/jitdump-specification.txt
+// https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/jit-interface.txt
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Config/config.h"
+
+#include <unistd.h> // for getpid(), sysconf()
+#include <syscall.h> // for gettid()
+#include <time.h> // clock_gettime(), time(), localtime_r()
+#include <sys/mman.h> // mmap()
+#include <sys/types.h> // getpid(), open()
+#include <sys/stat.h> // open()
+#include <fcntl.h> // open()
+
+#include "llvm/ExecutionEngine/JITEventListener.h"
+
+#include "llvm/ADT/Twine.h"
+#include "llvm/DebugInfo/DWARF/DWARFContext.h"
+#include "llvm/Object/ObjectFile.h"
+#include "llvm/Object/SymbolSize.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/Errno.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Mutex.h"
+#include "llvm/Support/MutexGuard.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace llvm;
+using namespace llvm::object;
+typedef DILineInfoSpecifier::FileLineInfoKind FileLineInfoKind;
+
+namespace {
+
+// language identifier (XXX: should we generate something better from debug info?)
+#define JIT_LANG "llvm-IR"
+#define LLVM_PERF_JIT_MAGIC ((uint32_t) 'J' << 24 | (uint32_t) 'i' << 16 | (uint32_t) 'T' << 8 | (uint32_t) 'D')
+#define LLVM_PERF_JIT_VERSION 1
+
+/* bit 0: set if the jitdump file is using an architecture-specific timestamp clock source */
+#define JITDUMP_FLAGS_ARCH_TIMESTAMP (1ULL << 0)
+
+struct LLVMPerfJitHeader;
+
+class PerfJITEventListener : public JITEventListener {
+public:
+ PerfJITEventListener();
+ ~PerfJITEventListener() {
+ if (MarkerAddr)
+ CloseMarker();
+ }
+
+ void NotifyObjectEmitted(const ObjectFile &Obj,
+ const RuntimeDyld::LoadedObjectInfo &L) override;
+
+ void NotifyFreeingObject(const ObjectFile &Obj) override;
+
+private:
+
+ bool InitDebuggingDir();
+ bool OpenMarker();
+ void CloseMarker();
+ bool FillMachine(LLVMPerfJitHeader &hdr);
+
+ void NotifyCode(Expected<llvm::StringRef> &Symbol, uint64_t CodeAddr, uint64_t CodeSize);
+ void NotifyDebug(uint64_t CodeAddr, DILineInfoTable Lines);
+
+ // output data stream
+ std::unique_ptr<raw_fd_ostream> Dumpstream;
+
+ // file descriptor of the output data stream, lifetime managed via Dumpstream
+ int Fd;
+
+ // prevent concurrent dumps from messing up the output file
+ sys::Mutex Mutex;
+
+ // cache lookups
+ pid_t Pid;
+
+ // base directory for output data
+ std::string JitPath;
+
+ // perf mmap marker
+ void *MarkerAddr = NULL;
+
+ // perf support ready
+ bool SuccessfullyInitialized = false;
+};
+
+// The following are POD struct definitions from the perf jit specification
+
+enum LLVMPerfJitRecordType {
+ JIT_CODE_LOAD = 0,
+ JIT_CODE_MOVE = 1,
+ JIT_CODE_DEBUG_INFO = 2,
+ JIT_CODE_CLOSE = 3,
+ JIT_CODE_UNWINDING_INFO = 4,
+
+ JIT_CODE_MAX,
+};
+
+struct LLVMPerfJitHeader {
+ uint32_t Magic; /* characters "JiTD" */
+ uint32_t Version; /* header version */
+ uint32_t TotalSize; /* total size of header */
+ uint32_t ElfMach; /* elf mach target */
+ uint32_t Pad1; /* reserved */
+ uint32_t Pid;
+ uint64_t Timestamp; /* timestamp */
+ uint64_t Flags; /* flags */
+};
+
+/* record prefix (mandatory in each record) */
+struct LLVMPerfJitRecordPrefix {
+ uint32_t Id; /* record type identifier */
+ uint32_t TotalSize;
+ uint64_t Timestamp;
+};
+
+struct LLVMPerfJitRecordCodeLoad {
+ LLVMPerfJitRecordPrefix Prefix;
+
+ uint32_t Pid;
+ uint32_t Tid;
+ uint64_t Vma;
+ uint64_t CodeAddr;
+ uint64_t CodeSize;
+ uint64_t CodeIndex;
+};
+
+struct LLVMPerfJitRecordClose {
+ LLVMPerfJitRecordPrefix Prefix;
+};
+
+struct LLVMPerfJitRecordMoveCode {
+ LLVMPerfJitRecordPrefix Prefix;
+
+ uint32_t Pid;
+ uint32_t Tid;
+ uint64_t Vma;
+ uint64_t OldCodeAddr;
+ uint64_t NewCodeAddr;
+ uint64_t CodeSize;
+ uint64_t CodeIndex;
+};
+
+struct LLVMPerfJitDebugEntry {
+ uint64_t Addr;
+ int Lineno; /* source line number starting at 1 */
+ int Discrim; /* column discriminator, 0 is default */
+ char Name[]; /* null terminated filename, \xff\0 if same as previous entry */
+};
+
+struct LLVMPerfJitRecordDebugInfo {
+ LLVMPerfJitRecordPrefix Prefix;
+
+ uint64_t CodeAddr;
+ uint64_t NrEntry;
+ LLVMPerfJitDebugEntry Entries[];
+};
+
+struct LLVMPerfJitRecordUnwindInfo {
+ LLVMPerfJitRecordPrefix prefix;
+
+ uint64_t UnwindingSize;
+ uint64_t EhFrameHdrSize;
+ uint64_t MappedSize;
+ const char UnwindingData[];
+};
+
+// not available otherwise
+static inline pid_t gettid(void) {
+ return (pid_t)syscall(__NR_gettid);
+}
+
+static inline uint64_t
+timespec_to_ns(const struct timespec *ts) {
+ const uint64_t NanoSecPerSec = 1000000000;
+ return ((uint64_t) ts->tv_sec * NanoSecPerSec) + ts->tv_nsec;
+}
+
+static inline uint64_t
+perf_get_timestamp(void) {
+ struct timespec ts;
+ int ret;
+
+ ret = clock_gettime(CLOCK_MONOTONIC, &ts);
+ if (ret)
+ return 0;
+
+ return timespec_to_ns(&ts);
+}
+
+
+PerfJITEventListener::PerfJITEventListener()
+ : Pid(getpid()) {
+
+ LLVMPerfJitHeader Header = {0};
+ std::string Filename;
+ raw_string_ostream FilenameBuf(Filename);
+
+ // check if clock-source is supported
+ if (!perf_get_timestamp()) {
+ errs() << "kernel does not support CLOCK_MONOTONIC("<<CLOCK_MONOTONIC<<")\n";
+ return;
+ }
+
+ memset(&Header, 0, sizeof(Header));
+
+ if (!InitDebuggingDir()) {
+ errs() << "could not initialize debugging directory\n";
+ return;
+ }
+
+ FilenameBuf << JitPath << "/jit-"<<Pid<<".dump";
+
+ Fd = ::open(FilenameBuf.str().c_str(), O_CREAT|O_TRUNC|O_RDWR, 0666);
+ if (Fd == -1) {
+ errs() << "could not open JIT dump file "<<FilenameBuf.str()<<"\n";
+ return;
+ }
+
+ std::error_code EC;
+ Dumpstream = make_unique<raw_fd_ostream>(Fd, true);
+ assert(!EC);
+
+ if (!OpenMarker()) {
+ return;
+ }
+
+ if (!FillMachine(Header)) {
+ return;
+ }
+
+ Header.Magic = LLVM_PERF_JIT_MAGIC;
+ Header.Version = LLVM_PERF_JIT_VERSION;
+ Header.TotalSize = sizeof(Header);
+ Header.Pid = Pid;
+ Header.Timestamp = perf_get_timestamp();
+
+ Dumpstream->write((char *) &Header, sizeof(Header));
+
+ // Everything initialized, can do profiling now.
+ if (!Dumpstream->has_error())
+ SuccessfullyInitialized = true;
+}
+
+void PerfJITEventListener::NotifyObjectEmitted(
+ const ObjectFile &Obj,
+ const RuntimeDyld::LoadedObjectInfo &L) {
+
+ if (!SuccessfullyInitialized)
+ return;
+
+ OwningBinary<ObjectFile> DebugObjOwner = L.getObjectForDebug(Obj);
+ const ObjectFile &DebugObj = *DebugObjOwner.getBinary();
+
+ // Build a DWARF context for the object, used to look up source line info
+ std::unique_ptr<DIContext> Context = DWARFContext::create(DebugObj);
+
+ // Use symbol info to iterate functions in the object.
+ for (const std::pair<SymbolRef, uint64_t> &P : computeSymbolSizes(DebugObj)) {
+ SymbolRef Sym = P.first;
+ std::vector<LLVMPerfJitDebugEntry> LineInfo;
+ std::string SourceFileName;
+
+ Expected<SymbolRef::Type> SymTypeOrErr = Sym.getType();
+ if (!SymTypeOrErr) {
+ // TODO: Actually report errors helpfully.
+ consumeError(SymTypeOrErr.takeError());
+ continue;
+ }
+ SymbolRef::Type SymType = *SymTypeOrErr;
+ if (SymType != SymbolRef::ST_Function)
+ continue;
+
+ Expected<StringRef> Name = Sym.getName();
+ if (!Name) {
+ // TODO: Actually report errors helpfully.
+ consumeError(Name.takeError());
+ continue;
+ }
+
+ Expected<uint64_t> AddrOrErr = Sym.getAddress();
+ if (!AddrOrErr) {
+ // TODO: Actually report errors helpfully.
+ consumeError(AddrOrErr.takeError());
+ continue;
+ }
+ uint64_t Addr = *AddrOrErr;
+ uint64_t Size = P.second;
+
+ // According to the spec, debug info has to be emitted before the
+ // corresponding code load record.
+ DILineInfoTable Lines = Context->getLineInfoForAddressRange(
+ Addr, Size, FileLineInfoKind::AbsoluteFilePath);
+ NotifyDebug(Addr, Lines);
+
+ NotifyCode(Name, Addr, Size);
+ }
+
+ Dumpstream->flush();
+}
+
+void PerfJITEventListener::NotifyFreeingObject(const ObjectFile &Obj) {
+ /* perf currently doesn't have an interface for unloading */
+}
+
+bool PerfJITEventListener::InitDebuggingDir() {
+ const char *BaseDir;
+ llvm::SmallString<128> TestDir;
+ time_t Time;
+ struct tm LocalTime;
+ char TimeBuffer[sizeof("YYYYMMDD")]; // "%Y%m%d" needs 8 chars plus NUL
+
+ time(&Time);
+ localtime_r(&Time, &LocalTime);
+
+ /* perf specific location */
+ BaseDir = getenv("JITDUMPDIR");
+ if (!BaseDir)
+ BaseDir = getenv("HOME");
+ if (!BaseDir)
+ BaseDir = ".";
+
+ strftime(TimeBuffer, sizeof(TimeBuffer), "%Y%m%d", &LocalTime);
+
+ std::string DebugDir(BaseDir);
+ DebugDir += "/.debug/jit/";
+
+ if (sys::fs::create_directories(DebugDir)) {
+ errs() << "could not create jit cache directory "<<DebugDir<<"\n";
+ return false;
+ }
+
+ SmallString<128> UniqueDebugDir;
+
+ if (sys::fs::createUniqueDirectory(Twine(DebugDir) + JIT_LANG"-jit-" + TimeBuffer,
+ UniqueDebugDir)) {
+ errs() << "could not create unique jit cache directory "<<DebugDir<<"\n";
+ return false;
+ }
+
+ JitPath = UniqueDebugDir.str();
+
+ return true;
+}
+
+bool PerfJITEventListener::OpenMarker() {
+ long pgsz;
+
+ pgsz = ::sysconf(_SC_PAGESIZE);
+ if (pgsz == -1)
+ return false;
+
+ /*
+ * We mmap the jitdump to create an MMAP RECORD in perf.data file. The mmap
+ * is captured either live (perf record running when we mmap) or in deferred
+ * mode, via /proc/PID/maps. The MMAP record is used as a marker of a jitdump
+ * file for more meta data info about the jitted code. Perf report/annotate
+ * detect this special filename and process the jitdump file.
+ *
+ * Mapping must be PROT_EXEC to ensure it is captured by perf record
+ * even when not using -d option.
+ */
+ MarkerAddr = ::mmap(NULL, pgsz, PROT_READ|PROT_EXEC, MAP_PRIVATE, Fd, 0);
+
+ if (MarkerAddr == MAP_FAILED) {
+ errs() << "could not mmap JIT marker\n";
+ return false;
+ }
+ return true;
+}
+
+bool PerfJITEventListener::FillMachine(LLVMPerfJitHeader &hdr) {
+ ssize_t sret;
+ char id[16];
+ int fd;
+ struct {
+ uint16_t e_type;
+ uint16_t e_machine;
+ } info;
+
+ fd = ::open("/proc/self/exe", O_RDONLY);
+ if (fd == -1) {
+ errs() << "could not open /proc/self/exe\n";
+ return false;
+ }
+
+ sret = ::read(fd, id, sizeof(id));
+ if (sret != sizeof(id)) {
+ errs() << "could not read elf signature from /proc/self/exe\n";
+ goto error;
+ }
+
+ /* check ELF signature */
+ if (id[0] != 0x7f || id[1] != 'E' || id[2] != 'L' || id[3] != 'F') {
+ errs() << "invalid elf signature\n";
+ goto error;
+ }
+
+ sret = ::read(fd, &info, sizeof(info));
+ if (sret != sizeof(info)) {
+ errs() << "could not read machine identification\n";
+ goto error;
+ }
+
+ hdr.ElfMach = info.e_machine;
+ close(fd);
+ return true;
+ error:
+ close(fd);
+ return false;
+}
+
+void PerfJITEventListener::CloseMarker() {
+ long pgsz;
+
+ if (!MarkerAddr)
+ return;
+
+ pgsz = ::sysconf(_SC_PAGESIZE);
+ if (pgsz == -1)
+ return;
+
+ munmap(MarkerAddr, pgsz);
+ MarkerAddr = nullptr;
+}
+
+void PerfJITEventListener::NotifyCode(Expected<llvm::StringRef> &Symbol, uint64_t CodeAddr, uint64_t CodeSize) {
+ static int code_generation = 1;
+ LLVMPerfJitRecordCodeLoad rec;
+
+ assert(SuccessfullyInitialized);
+
+ // 0 length functions can't have samples.
+ if (CodeSize == 0)
+ return;
+
+ rec.Prefix.Id = JIT_CODE_LOAD;
+ rec.Prefix.TotalSize =
+ sizeof(rec) + // debug record itself
+ Symbol->size() + 1 + // symbol name
+ CodeSize; // and code
+ rec.Prefix.Timestamp = perf_get_timestamp();
+
+ rec.CodeSize = CodeSize;
+ rec.Vma = 0;
+ rec.CodeAddr = CodeAddr;
+ rec.Pid = Pid;
+ rec.Tid = gettid();
+
+ // get code index inside lock to avoid race condition
+ MutexGuard Guard(Mutex);
+
+ rec.CodeIndex = code_generation++;
+
+ Dumpstream->write(reinterpret_cast<const char *>(&rec), sizeof(rec));
+ Dumpstream->write(Symbol->data(), Symbol->size() + 1);
+ Dumpstream->write(reinterpret_cast<const char *>(CodeAddr), CodeSize);
+}
+
+void PerfJITEventListener::NotifyDebug(uint64_t CodeAddr, DILineInfoTable Lines) {
+ LLVMPerfJitRecordDebugInfo rec;
+
+ assert(SuccessfullyInitialized);
+
+ // Didn't get useful debug info.
+ if (Lines.empty())
+ return;
+
+ rec.Prefix.Id = JIT_CODE_DEBUG_INFO;
+ rec.Prefix.TotalSize = sizeof(rec); // will be increased further
+ rec.Prefix.Timestamp = perf_get_timestamp();
+ rec.CodeAddr = CodeAddr;
+ rec.NrEntry = Lines.size();
+
+ /* compute total size of record (variable due to filenames) */
+ DILineInfoTable::iterator Begin = Lines.begin();
+ DILineInfoTable::iterator End = Lines.end();
+ for (DILineInfoTable::iterator It = Begin; It != End; ++It) {
+ DILineInfo &line = It->second;
+ rec.Prefix.TotalSize += sizeof(LLVMPerfJitDebugEntry);
+ rec.Prefix.TotalSize += line.FileName.size() + 1;
+ }
+
+ Dumpstream->write(reinterpret_cast<const char *>(&rec), sizeof(rec));
+
+ // The debug_entry describes the source line information. It is defined as follows in order:
+ // * uint64_t code_addr: address of function for which the debug information is generated
+ // * uint32_t line : source file line number (starting at 1)
+ // * uint32_t discrim : column discriminator, 0 is default
+ // * char name[n] : source file name in ASCII, including null termination
+
+ MutexGuard Guard(Mutex);
+
+ for (DILineInfoTable::iterator It = Begin; It != End; ++It) {
+ LLVMPerfJitDebugEntry LineInfo;
+ DILineInfo &Line = It->second;
+
+ LineInfo.Addr = It->first;
+ // For reasons unknown to me either llvm offsets or perf's use of them is
+ // offset by 0x40. Inquiring.
+ LineInfo.Addr += 0x40;
+ LineInfo.Lineno = Line.Line;
+ LineInfo.Discrim = Line.Discriminator;
+
+ Dumpstream->write(reinterpret_cast<const char *>(&LineInfo), sizeof(LineInfo));
+ Dumpstream->write(Line.FileName.c_str(), Line.FileName.size() + 1);
+ }
+}
+
+} // end anonymous namespace
+
+namespace llvm {
+JITEventListener *JITEventListener::createPerfJITEventListener() {
+ return new PerfJITEventListener();
+}
+} // end llvm namespace
diff --git a/tools/lli/CMakeLists.txt b/tools/lli/CMakeLists.txt
index f02e19313b7..5f235b6f6f3 100644
--- a/tools/lli/CMakeLists.txt
+++ b/tools/lli/CMakeLists.txt
@@ -36,6 +36,15 @@ if( LLVM_USE_INTEL_JITEVENTS )
)
endif( LLVM_USE_INTEL_JITEVENTS )
+if( LLVM_USE_PERF )
+ set(LLVM_LINK_COMPONENTS
+ ${LLVM_LINK_COMPONENTS}
+ DebugInfoDWARF
+ PerfJITEvents
+ Object
+ )
+endif( LLVM_USE_PERF )
+
add_llvm_tool(lli
lli.cpp
OrcLazyJIT.cpp
diff --git a/tools/lli/lli.cpp b/tools/lli/lli.cpp
index cd43e9d5791..a6c22526ea6 100644
--- a/tools/lli/lli.cpp
+++ b/tools/lli/lli.cpp
@@ -496,6 +496,8 @@ int main(int argc, char **argv, char * const *envp) {
JITEventListener::createOProfileJITEventListener());
EE->RegisterJITEventListener(
JITEventListener::createIntelJITEventListener());
+ EE->RegisterJITEventListener(
+ JITEventListener::createPerfJITEventListener());
if (!NoLazyCompilation && RemoteMCJIT) {
errs() << "warning: remote mcjit does not support lazy compilation\n";
--
2.14.1.536.g6867272d5b.dirty
0006-ORC-JIT-event-listener-support.patch (text/x-diff; charset=us-ascii)
From c97ed210b4a3400e23639cc10e746b231729dd82 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 4 Oct 2017 15:37:27 -0700
Subject: [PATCH 6/6] [ORC] JIT event listener support.
---
include/llvm-c/OrcBindings.h | 4 ++
.../ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h | 22 +++++++++-
lib/ExecutionEngine/Orc/OrcCBindings.cpp | 25 ++++++++++++
lib/ExecutionEngine/Orc/OrcCBindingsStack.h | 47 +++++++++++++++++++++-
4 files changed, 94 insertions(+), 4 deletions(-)
diff --git a/include/llvm-c/OrcBindings.h b/include/llvm-c/OrcBindings.h
index 4ff1f47e87d..38fa44c231d 100644
--- a/include/llvm-c/OrcBindings.h
+++ b/include/llvm-c/OrcBindings.h
@@ -180,6 +180,10 @@ LLVMOrcErrorCode LLVMOrcGetSymbolAddressIn(LLVMOrcJITStackRef JITStack,
*/
LLVMOrcErrorCode LLVMOrcDisposeInstance(LLVMOrcJITStackRef JITStack);
+void LLVMOrcRegisterPerf(LLVMOrcJITStackRef JITStack);
+void LLVMOrcRegisterGDB(LLVMOrcJITStackRef JITStack);
+void LLVMOrcUnregisterPerf(LLVMOrcJITStackRef JITStack);
+
#ifdef __cplusplus
}
#endif /* extern "C" */
diff --git a/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h b/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h
index 246c57341f3..d720f4053a3 100644
--- a/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h
+++ b/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h
@@ -75,6 +75,8 @@ protected:
return JITSymbol(SymEntry->second);
}
+ virtual ObjectPtr getObject() const = 0;
+
protected:
StringMap<JITEvaluatedSymbol> SymbolTable;
bool Finalized = false;
@@ -106,6 +108,10 @@ public:
/// @brief Functor for receiving finalization notifications.
using NotifyFinalizedFtor = std::function<void(ObjHandleT)>;
+
+ /// @brief Functor for receiving freeing notifications.
+ using NotifyFreedFtor = std::function<void(ObjHandleT)>;
+
private:
@@ -117,7 +123,8 @@ private:
SymbolResolverPtrT Resolver,
FinalizerFtor Finalizer,
bool ProcessAllSections)
- : MemMgr(std::move(MemMgr)),
+ : MemMgr(std::move(MemMgr)),
+ Obj(Obj),
PFC(llvm::make_unique<PreFinalizeContents>(std::move(Obj),
std::move(Resolver),
std::move(Finalizer),
@@ -168,6 +175,10 @@ private:
PFC->RTDyld->mapSectionAddress(LocalAddress, TargetAddr);
}
+ ObjectPtr getObject() const override {
+ return Obj;
+ };
+
private:
void buildInitialSymbolTable(const ObjectPtr &Obj) {
@@ -209,6 +220,7 @@ private:
};
MemoryManagerPtrT MemMgr;
+ ObjectPtr Obj;
std::unique_ptr<PreFinalizeContents> PFC;
};
@@ -238,10 +250,12 @@ public:
RTDyldObjectLinkingLayer(
MemoryManagerGetter GetMemMgr,
NotifyLoadedFtor NotifyLoaded = NotifyLoadedFtor(),
- NotifyFinalizedFtor NotifyFinalized = NotifyFinalizedFtor())
+ NotifyFinalizedFtor NotifyFinalized = NotifyFinalizedFtor(),
+ NotifyFreedFtor NotifyFreed = NotifyFreedFtor())
: GetMemMgr(GetMemMgr),
NotifyLoaded(std::move(NotifyLoaded)),
NotifyFinalized(std::move(NotifyFinalized)),
+ NotifyFreed(std::move(NotifyFreed)),
ProcessAllSections(false) {}
/// @brief Set the 'ProcessAllSections' flag.
@@ -300,6 +314,9 @@ public:
/// required to detect or resolve such issues it should be added at a higher
/// layer.
Error removeObject(ObjHandleT H) {
+ if (this->NotifyFreed)
+ this->NotifyFreed(H);
+
// How do we invalidate the symbols in H?
LinkedObjList.erase(H);
return Error::success();
@@ -350,6 +367,7 @@ private:
MemoryManagerGetter GetMemMgr;
NotifyLoadedFtor NotifyLoaded;
NotifyFinalizedFtor NotifyFinalized;
+ NotifyFreedFtor NotifyFreed;
bool ProcessAllSections = false;
};
diff --git a/lib/ExecutionEngine/Orc/OrcCBindings.cpp b/lib/ExecutionEngine/Orc/OrcCBindings.cpp
index 9b9c1512402..9c83ce2c340 100644
--- a/lib/ExecutionEngine/Orc/OrcCBindings.cpp
+++ b/lib/ExecutionEngine/Orc/OrcCBindings.cpp
@@ -10,6 +10,8 @@
#include "OrcCBindingsStack.h"
#include "llvm-c/OrcBindings.h"
+#include "llvm/ExecutionEngine/JITEventListener.h"
+
using namespace llvm;
LLVMSharedModuleRef LLVMOrcMakeSharedModule(LLVMModuleRef Mod) {
@@ -134,3 +136,26 @@ LLVMOrcErrorCode LLVMOrcDisposeInstance(LLVMOrcJITStackRef JITStack) {
delete J;
return Err;
}
+
+
+static JITEventListener *perf_listener_orc = NULL;
+static JITEventListener *gdb_listener_orc = NULL;
+
+void LLVMOrcRegisterGDB(LLVMOrcJITStackRef JITStack) {
+ if (!gdb_listener_orc)
+ gdb_listener_orc = JITEventListener::createGDBRegistrationListener();
+ unwrap(JITStack)->RegisterJITEventListener(gdb_listener_orc);
+}
+
+void LLVMOrcRegisterPerf(LLVMOrcJITStackRef JITStack) {
+ if (!perf_listener_orc)
+ perf_listener_orc = JITEventListener::createPerfJITEventListener();
+ unwrap(JITStack)->RegisterJITEventListener(perf_listener_orc);
+}
+
+void LLVMOrcUnregisterPerf(LLVMOrcJITStackRef JITStack) {
+ if (perf_listener_orc) {
+ delete perf_listener_orc;
+ perf_listener_orc = NULL;
+ }
+}
diff --git a/lib/ExecutionEngine/Orc/OrcCBindingsStack.h b/lib/ExecutionEngine/Orc/OrcCBindingsStack.h
index 6eaac01d52f..b448a9c370c 100644
--- a/lib/ExecutionEngine/Orc/OrcCBindingsStack.h
+++ b/lib/ExecutionEngine/Orc/OrcCBindingsStack.h
@@ -15,6 +15,7 @@
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ExecutionEngine/JITSymbol.h"
+#include "llvm/ExecutionEngine/JITEventListener.h"
#include "llvm/ExecutionEngine/Orc/CompileOnDemandLayer.h"
#include "llvm/ExecutionEngine/Orc/CompileUtils.h"
#include "llvm/ExecutionEngine/Orc/ExecutionUtils.h"
@@ -24,8 +25,10 @@
#include "llvm/ExecutionEngine/RuntimeDyld.h"
#include "llvm/ExecutionEngine/SectionMemoryManager.h"
#include "llvm/IR/DataLayout.h"
+#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Mangler.h"
#include "llvm/IR/Module.h"
+#include "llvm/Object/ObjectFile.h"
#include "llvm/Support/CBindingWrapping.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/raw_ostream.h"
@@ -137,13 +140,18 @@ public:
ObjectLayer(
[]() {
return std::make_shared<SectionMemoryManager>();
- }),
+ },
+ std::bind(&OrcCBindingsStack::notifyLoaded, this, std::placeholders::_1, std::placeholders::_2, std::placeholders::_3),
+ std::bind(&OrcCBindingsStack::notifyFinalized, this, std::placeholders::_1),
+ std::bind(&OrcCBindingsStack::notifyFreed, this, std::placeholders::_1)),
CompileLayer(ObjectLayer, orc::SimpleCompiler(TM)),
CODLayer(CompileLayer,
[](Function &F) { return std::set<Function *>({&F}); },
*this->CCMgr, std::move(IndirectStubsMgrBuilder), false),
CXXRuntimeOverrides(
- [this](const std::string &S) { return mangle(S); }) {}
+ [this](const std::string &S) { return mangle(S); }) {
+ ObjectLayer.setProcessAllSections(true);
+ }
LLVMOrcErrorCode shutdown() {
// Run any destructors registered with __cxa_atexit.
@@ -378,6 +386,10 @@ public:
const std::string &getErrorMessage() const { return ErrMsg; }
+ void RegisterJITEventListener(JITEventListener *l) {
+ EventListeners.push_back(l);
+ }
+
private:
template <typename LayerT, typename HandleT>
unsigned createHandle(LayerT &Layer, HandleT Handle) {
@@ -408,6 +420,33 @@ private:
return Result;
}
+ void notifyLoaded(orc::RTDyldObjectLinkingLayerBase::ObjHandleT H,
+ const orc::RTDyldObjectLinkingLayerBase::ObjectPtr &Obj,
+ const RuntimeDyld::LoadedObjectInfo &LoadedObjInfo) {
+ PendingLoadedObjectInfos.push_back(&LoadedObjInfo);
+ PendingLoadedObjects.push_back(Obj->getBinary());
+ }
+
+
+ void notifyFinalized(orc::RTDyldObjectLinkingLayerBase::ObjHandleT H) {
+ for (auto &Listener : EventListeners) {
+ for (size_t I = 0, S = PendingLoadedObjects.size(); I < S; ++I) {
+ auto &Obj = PendingLoadedObjects[I];
+ auto &Info = PendingLoadedObjectInfos[I];
+ Listener->NotifyObjectEmitted(*Obj, *Info);
+ }
+ }
+
+ PendingLoadedObjects.clear();
+ PendingLoadedObjectInfos.clear();
+ }
+
+ void notifyFreed(orc::RTDyldObjectLinkingLayerBase::ObjHandleT H) {
+ for (auto &Listener : EventListeners) {
+ Listener->NotifyFreeingObject(*(*H)->getObject()->getBinary());
+ }
+ }
+
DataLayout DL;
SectionMemoryManager CCMgrMemMgr;
@@ -424,6 +463,10 @@ private:
orc::LocalCXXRuntimeOverrides CXXRuntimeOverrides;
std::vector<orc::CtorDtorRunner<OrcCBindingsStack>> IRStaticDestructorRunners;
std::string ErrMsg;
+
+ std::vector<JITEventListener *> EventListeners;
+ std::vector<const RuntimeDyld::LoadedObjectInfo *> PendingLoadedObjectInfos;
+ std::vector<const object::ObjectFile*> PendingLoadedObjects;
};
} // end namespace llvm
--
2.14.1.536.g6867272d5b.dirty
On 5 October 2017 at 19:57, Andres Freund <andres@anarazel.de> wrote:
Here are some numbers for a TPC-H scale 5 run. Obviously the Q01 numbers
are pretty nice in particular. But it's also visible that the shorter
query can lose, which is largely due to the JIT overhead - that can be
ameliorated to some degree, but JITing obviously isn't always going to
be a win.
It's pretty exciting to see this being worked on.
I've not looked at the code, but I'm thinking, could you not just JIT
if the total cost of the plan is estimated to be > X ? Where X is some
JIT threshold GUC.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017-10-05 23:43:37 +1300, David Rowley wrote:
On 5 October 2017 at 19:57, Andres Freund <andres@anarazel.de> wrote:
Here are some numbers for a TPC-H scale 5 run. Obviously the Q01 numbers
are pretty nice in particular. But it's also visible that the shorter
query can lose, which is largely due to the JIT overhead - that can be
ameliorated to some degree, but JITing obviously isn't always going to
be a win.
It's pretty exciting to see this being worked on.
I've not looked at the code, but I'm thinking, could you not just JIT
if the total cost of the plan is estimated to be > X ? Where X is some
JIT threshold GUC.
Right, that's the plan. But it seems fairly important to make the
envelope in which it is beneficial as broad as possible. Also, test
coverage is more interesting for me right now ;)
Greetings,
Andres Freund
On Thu, Oct 5, 2017 at 2:57 AM, Andres Freund <andres@anarazel.de> wrote:
master q01 min: 14146.498 dev min: 11479.05 [diff -23.24] dev-jit min: 8659.961 [diff -63.36] dev-jit-deform min: 7279.395 [diff -94.34] dev-jit-deform-inline min: 6997.956 [diff -102.15]
I think this is a really strange way to display this information.
Instead of computing the percentage of time that you saved, you've
computed the negative of the percentage that you would have lost if
the patch were already committed and you reverted it. That's just
confusing.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Sep 21, 2017 at 2:52 AM, Andres Freund <andres@anarazel.de> wrote:
On 2017-09-19 12:57:33 +0300, Konstantin Knizhnik wrote:
On 04.09.2017 23:52, Andres Freund wrote:
Hi. That piece of code isn't particularly clear (and has a bug in the
submitted version), I'm revising it....
Yea, I've changed that already, although it's currently added earlier,
because the alignment is needed before, to access the column correctly.
I've also made a number of efficiency improvements, primarily to access
columns with an absolute offset if all preceding ones are fixed width
not null columns - that is quite noticeable performancewise.
Should I wait for a new version of your patch or continue review of this code?
I'll update the posted version later this week, sorry for the delay.
I know that you are working on this actively per the set of patches
you have sent lately, but this thread has stalled, so I am marking it
as returned with feedback. There is now only one CF entry to track
this work: https://commitfest.postgresql.org/15/1285/. Depending on
the work you are doing you may want to spawn a CF entry for each
sub-item. Just an idea.
--
Michael
Hi,
One part of the work to make JITing worth its while is JITing tuple
deforming. That's currently often the biggest consumer of time, and if not,
it is most often among the top entries.
My experimentation shows that tuple deforming is primarily beneficial
when it happens as *part* of jit compiling expressions. I'd originally
tried to jit compile deforming inside heaptuple.c, and cache the
deforming program inside the tuple slot. That turns out to not work very
well, because a lot of tuple descriptors are very short lived, computed
during ExecInitNode(). Even if that were not the case, compiling for
each deforming on demand has significant downsides:
- it requires emitting code in smaller increments (whenever something
new is deformed)
- because the generated code has to be generic for all potential
deformers, the number of branches to check for that are
significant. If instead the deforming code is generated for a
specific callsite, no branches for the number of to-be-deformed
columns have to be generated. The primary remaining branches then are
the ones checking for NULLs and the number of attributes in the
tuple, and those can often be optimized away if there are NOT NULL
columns present.
- the call overhead is still noticeable
- the memory / function lifetime management is awkward.
If the JITing of expressions is instead done as part of expression
evaluation we can emit all the necessary code for the whole plantree
during executor startup, in one go. And, more importantly, LLVM's
optimizer is free to inline the deforming code into the expression code,
often yielding noticeable improvements (although that still could use
some improvements).
To allow doing JITing at ExecReadyExpr() time, we need to know the tuple
descriptor an EEOP_{INNER,OUTER,SCAN}_FETCHSOME step refers to. There are
currently three major impediments to that.
1) At a lot of ExecInitExpr() callsites the tupledescs for inner, outer,
scan aren't yet known. Therefore that code needs to be reordered so
we (if applicable):
a) initialize subsidiary nodes, thereby determining the left/right
(inner/outer) tupledescs
b) initialize the scan tuple desc, often that refers to a)
c) determine the result tuple desc, required to build the projection
d) build projections
e) build expressions
Attached is a patch doing so. Currently it only applies with a few
preliminary patches applied, but that could be easily reordered.
The patch is relatively large, as I decided to try to get the
different ExecInitNode functions to look a bit more similar. There are
some judgement calls involved, but I think the result looks a good
bit better, regardless of the later need.
I'm not really happy with the preexisting split of functions
between execScan.c, execTuples.c, execUtils.c. I wonder if the
majority, except the low level slot ones, shouldn't be moved to
execUtils.c; I think that'd be clearer. There seems to be no
justification for execScan.c to contain
ExecAssignScanProjectionInfo[WithVarno].
2) TupleSlots need to describe whether they'll contain a fixed tupledesc
for all their lifetime, or whether they can change their nature. Most
places don't need to ever change a slot's identity, but in a few
places it's quite convenient.
I've introduced the notion that a slot's tupledesc can be marked as
"fixed", by passing a tupledesc at the slot's creation. That also gains
a bit of efficiency (less memory management overhead, higher cache hit
ratio) because the slot, tts_values, tts_isnull can be allocated in one
chunk.
3) At expression initialization time we need to figure out what slots
(or just descs) INNER/OUTER/SCAN refer to. I've solved that by looking
up inner/outer/scan via the provided parent node, which required
adding a new field to store the scan slot.
Currently no expressions initialized with a parent node have an
INNER/OUTER/SCAN slot + desc that doesn't refer to the relevant node,
but I'm not sure I like that as a requirement.
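The one-chunk allocation mentioned in 2) can be sketched outside of
PostgreSQL roughly as below. This is only an illustration under stated
assumptions: ToySlot, ToyDatum, TOY_MAXALIGN, toy_make_slot() and friends are
invented names, calloc() stands in for palloc0(), and an 8 byte maximum
alignment is assumed. A negative natts plays the role of "no fixed
descriptor".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed maximum alignment; the real value is platform dependent. */
#define TOY_ALIGNOF 8
#define TOY_MAXALIGN(len) \
    (((size_t) (len) + (TOY_ALIGNOF - 1)) & ~((size_t) (TOY_ALIGNOF - 1)))

typedef uintptr_t ToyDatum;     /* stand-in for Datum */

/* Hypothetical, stripped-down slot; not the real TupleTableSlot. */
typedef struct ToySlot
{
    bool      fixed_desc;       /* descriptor fixed for the slot's lifetime? */
    int       natts;
    ToyDatum *values;
    bool     *isnull;
} ToySlot;

/* natts < 0 means the descriptor is not known at creation time. */
ToySlot *
toy_make_slot(int natts)
{
    size_t   sz = TOY_MAXALIGN(sizeof(ToySlot));
    ToySlot *slot;

    /* With a fixed descriptor, reserve room for both arrays up front. */
    if (natts >= 0)
        sz += TOY_MAXALIGN((size_t) natts * sizeof(ToyDatum)) +
              TOY_MAXALIGN((size_t) natts * sizeof(bool));

    slot = calloc(1, sz);
    if (slot == NULL)
        return NULL;

    slot->fixed_desc = natts >= 0;
    slot->natts = natts >= 0 ? natts : 0;
    if (slot->fixed_desc)
    {
        char *base = (char *) slot;

        /* both arrays live directly behind the slot header */
        slot->values = (ToyDatum *) (base + TOY_MAXALIGN(sizeof(ToySlot)));
        slot->isnull = (bool *) (base + TOY_MAXALIGN(sizeof(ToySlot)) +
                                 TOY_MAXALIGN((size_t) natts * sizeof(ToyDatum)));
    }
    return slot;
}

void
toy_slot_set(ToySlot *slot, int attno, ToyDatum value)
{
    assert(attno >= 0 && attno < slot->natts);
    slot->values[attno] = value;
    slot->isnull[attno] = false;
}

void
toy_free_slot(ToySlot *slot)
{
    if (slot == NULL)
        return;
    /* only separately allocated arrays may be freed on their own */
    if (!slot->fixed_desc)
    {
        free(slot->values);
        free(slot->isnull);
    }
    free(slot);
}
```

The point of the free path is that, for a fixed slot, values and isnull
point into the slot's own allocation and must not be released individually.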
Attached is a patch that implements 1 + 2. I'd welcome a quick look
through it. It currently only applies on top of a few other recently
submitted patches, but it'd just be an hour's work or so to reorder
that.
Comments about either the outline above or the patch?
Regards,
Andres
Attachments:
0001-WIP-Allow-tupleslots-to-have-a-fixed-tupledesc-use-i.patch (text/x-diff; charset=us-ascii)
From cd04258d92bed57d7e8a6bbe0408c7fb31f1a182 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 3 Oct 2017 23:45:44 -0700
Subject: [PATCH] WIP: Allow tupleslots to have a fixed tupledesc, use in
executor nodes.
The reason for doing so is that it will allow expression evaluation to
optimize based on the underlying tupledesc. In particular it allows
JITing tuple deforming together with the expression itself.
For that expression initialization needs to be moved after the
relevant slots are initialized - mostly unproblematic, except in the
case of nodeWorktablescan.c.
Author: Andres Freund
---
src/backend/commands/copy.c | 5 +-
src/backend/commands/trigger.c | 6 +-
src/backend/executor/README | 2 +
src/backend/executor/execExpr.c | 2 +-
src/backend/executor/execMain.c | 2 +-
src/backend/executor/execPartition.c | 2 +-
src/backend/executor/execScan.c | 2 +-
src/backend/executor/execTuples.c | 115 ++++++++++++++++++-------
src/backend/executor/execUtils.c | 62 ++-----------
src/backend/executor/nodeAgg.c | 68 +++++++--------
src/backend/executor/nodeAppend.c | 18 ++--
src/backend/executor/nodeBitmapAnd.c | 14 +--
src/backend/executor/nodeBitmapHeapscan.c | 58 ++++++-------
src/backend/executor/nodeBitmapIndexscan.c | 18 ++--
src/backend/executor/nodeBitmapOr.c | 14 +--
src/backend/executor/nodeCtescan.c | 32 +++----
src/backend/executor/nodeCustom.c | 20 ++---
src/backend/executor/nodeForeignscan.c | 30 +++----
src/backend/executor/nodeFunctionscan.c | 31 +++----
src/backend/executor/nodeGather.c | 31 +++----
src/backend/executor/nodeGatherMerge.c | 19 ++--
src/backend/executor/nodeGroup.c | 32 +++----
src/backend/executor/nodeHash.c | 23 ++---
src/backend/executor/nodeHashjoin.c | 45 +++++-----
src/backend/executor/nodeIndexonlyscan.c | 34 +++-----
src/backend/executor/nodeIndexscan.c | 45 +++++-----
src/backend/executor/nodeLimit.c | 18 ++--
src/backend/executor/nodeLockRows.c | 9 +-
src/backend/executor/nodeMaterial.c | 6 +-
src/backend/executor/nodeMergeAppend.c | 6 +-
src/backend/executor/nodeMergejoin.c | 53 ++++++------
src/backend/executor/nodeModifyTable.c | 25 +++---
src/backend/executor/nodeNamedtuplestorescan.c | 21 ++---
src/backend/executor/nodeNestloop.c | 29 +++----
src/backend/executor/nodeProjectSet.c | 14 +--
src/backend/executor/nodeRecursiveunion.c | 9 +-
src/backend/executor/nodeResult.c | 25 +++---
src/backend/executor/nodeSamplescan.c | 73 ++++++----------
src/backend/executor/nodeSeqscan.c | 64 +++++---------
src/backend/executor/nodeSetOp.c | 11 +--
src/backend/executor/nodeSort.c | 18 ++--
src/backend/executor/nodeSubplan.c | 6 +-
src/backend/executor/nodeSubqueryscan.c | 28 +++---
src/backend/executor/nodeTableFuncscan.c | 26 +++---
src/backend/executor/nodeTidscan.c | 29 +++----
src/backend/executor/nodeUnique.c | 11 +--
src/backend/executor/nodeValuesscan.c | 25 ++----
src/backend/executor/nodeWindowAgg.c | 35 +++-----
src/backend/executor/nodeWorktablescan.c | 16 ++--
src/backend/replication/logical/worker.c | 22 ++---
src/include/executor/executor.h | 11 ++-
src/include/executor/tuptable.h | 5 +-
52 files changed, 564 insertions(+), 761 deletions(-)
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index bace390470f..47a21661173 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2448,10 +2448,9 @@ CopyFrom(CopyState cstate)
estate->es_range_table = cstate->range_table;
/* Set up a tuple slot too */
- myslot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(myslot, tupDesc);
+ myslot = ExecInitExtraTupleSlot(estate, tupDesc);
/* Triggers might need a slot as well */
- estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate);
+ estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate, NULL);
/* Prepare to catch AFTER triggers. */
AfterTriggerBeginQuery();
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 92ae3822d8a..6a3d0a83306 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3246,7 +3246,8 @@ TriggerEnabled(EState *estate, ResultRelInfo *relinfo,
if (estate->es_trig_oldtup_slot == NULL)
{
oldContext = MemoryContextSwitchTo(estate->es_query_cxt);
- estate->es_trig_oldtup_slot = ExecInitExtraTupleSlot(estate);
+ estate->es_trig_oldtup_slot =
+ ExecInitExtraTupleSlot(estate, NULL);
MemoryContextSwitchTo(oldContext);
}
oldslot = estate->es_trig_oldtup_slot;
@@ -3259,7 +3260,8 @@ TriggerEnabled(EState *estate, ResultRelInfo *relinfo,
if (estate->es_trig_newtup_slot == NULL)
{
oldContext = MemoryContextSwitchTo(estate->es_query_cxt);
- estate->es_trig_newtup_slot = ExecInitExtraTupleSlot(estate);
+ estate->es_trig_newtup_slot =
+ ExecInitExtraTupleSlot(estate, NULL);
MemoryContextSwitchTo(oldContext);
}
newslot = estate->es_trig_newtup_slot;
diff --git a/src/backend/executor/README b/src/backend/executor/README
index b3e74aa1a54..0d7cd552eb6 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -243,6 +243,8 @@ This is a sketch of control flow for full query processing:
switch to per-query context to run ExecInitNode
AfterTriggerBeginQuery
ExecInitNode --- recursively scans plan tree
+ ExecInitNode
+ recurse into subsidiary nodes
CreateExprContext
creates per-tuple context
ExecInitExpr
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index d3eaf8fb00f..9041cae9d66 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -2335,7 +2335,7 @@ ExecInitWholeRowVar(ExprEvalStep *scratch, Var *variable, PlanState *parent)
scratch->d.wholerow.junkFilter =
ExecInitJunkFilter(subplan->plan->targetlist,
ExecGetResultType(subplan)->tdhasoid,
- ExecInitExtraTupleSlot(parent->state));
+ ExecInitExtraTupleSlot(parent->state, NULL));
}
}
}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index dbaa47f2d30..750a83a7155 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1073,7 +1073,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
j = ExecInitJunkFilter(planstate->plan->targetlist,
tupType->tdhasoid,
- ExecInitExtraTupleSlot(estate));
+ ExecInitExtraTupleSlot(estate, NULL));
estate->es_junkFilter = j;
/* Want to return the cleaned tuple type */
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d545af2b677..ebd86607ed1 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -97,7 +97,7 @@ ExecSetupPartitionTupleRouting(ModifyTableState *mtstate,
* (such as ModifyTableState) and released when the node finishes
* processing.
*/
- *partition_tuple_slot = MakeTupleTableSlot();
+ *partition_tuple_slot = MakeTupleTableSlot(NULL);
leaf_part_rri = (ResultRelInfo *) palloc0(*num_partitions *
sizeof(ResultRelInfo));
diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 837abc0f017..b3f34aac980 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -229,7 +229,7 @@ ExecScan(ScanState *node,
* the scan node, because the planner will preferentially generate a matching
* tlist.
*
- * ExecAssignScanType must have been called already.
+ * The scan slot's descriptor must have been set already.
*/
void
ExecAssignScanProjectionInfo(ScanState *node)
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 51d2c5d166d..2161de9e1c4 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -58,7 +58,7 @@
* At ExecutorStart()
* ----------------
* - ExecInitSeqScan() calls ExecInitScanTupleSlot() and
- * ExecInitResultTupleSlot() to construct TupleTableSlots
+ * ExecInitResultTupleSlotTL() to construct TupleTableSlots
* for the tuples returned by the access methods and the
* tuples resulting from performing target list projections.
*
@@ -104,19 +104,36 @@ static TupleDesc ExecTypeFromTLInternal(List *targetList,
/* --------------------------------
* MakeTupleTableSlot
*
- * Basic routine to make an empty TupleTableSlot.
+ * Basic routine to make an empty TupleTableSlot. If tupleDesc is
+ * specified the slot's descriptor is fixed for its lifetime, gaining
+ * some efficiency. If that's undesirable, pass NULL.
* --------------------------------
*/
TupleTableSlot *
-MakeTupleTableSlot(void)
+MakeTupleTableSlot(TupleDesc tupleDesc)
{
- TupleTableSlot *slot = makeNode(TupleTableSlot);
+ Size sz;
+ TupleTableSlot *slot;
+ /*
+ * When a fixed descriptor is specified, we can reduce overhead a bit by
+ * allocating the entire slot in one go.
+ */
+ if (tupleDesc)
+ sz = MAXALIGN(sizeof(TupleTableSlot)) +
+ MAXALIGN(tupleDesc->natts * sizeof(Datum)) +
+ MAXALIGN(tupleDesc->natts * sizeof(bool));
+ else
+ sz = sizeof(TupleTableSlot);
+
+ slot = palloc0(sz);
+ slot->type = T_TupleTableSlot;
slot->tts_isempty = true;
slot->tts_shouldFree = false;
slot->tts_shouldFreeMin = false;
slot->tts_tuple = NULL;
- slot->tts_tupleDescriptor = NULL;
+ slot->tts_fixedTupleDescriptor = tupleDesc != NULL;
+ slot->tts_tupleDescriptor = tupleDesc;
slot->tts_mcxt = CurrentMemoryContext;
slot->tts_buffer = InvalidBuffer;
slot->tts_nvalid = 0;
@@ -124,6 +141,20 @@ MakeTupleTableSlot(void)
slot->tts_isnull = NULL;
slot->tts_mintuple = NULL;
+ if (tupleDesc != NULL)
+ {
+ slot->tts_values = (Datum * )
+ (((char *) slot)
+ + MAXALIGN(sizeof(TupleTableSlot)));
+ slot->tts_isnull = (bool * )
+ (((char *) slot)
+ + MAXALIGN(sizeof(TupleTableSlot))
+ + MAXALIGN(tupleDesc->natts * sizeof(Datum)));
+ slot->tts_fixedTupleDescriptor = true;
+
+ PinTupleDesc(tupleDesc);
+ }
+
return slot;
}
@@ -134,9 +165,9 @@ MakeTupleTableSlot(void)
* --------------------------------
*/
TupleTableSlot *
-ExecAllocTableSlot(List **tupleTable)
+ExecAllocTableSlot(List **tupleTable, TupleDesc desc)
{
- TupleTableSlot *slot = MakeTupleTableSlot();
+ TupleTableSlot *slot = MakeTupleTableSlot(desc);
*tupleTable = lappend(*tupleTable, slot);
@@ -173,10 +204,13 @@ ExecResetTupleTable(List *tupleTable, /* tuple table */
/* If shouldFree, release memory occupied by the slot itself */
if (shouldFree)
{
- if (slot->tts_values)
- pfree(slot->tts_values);
- if (slot->tts_isnull)
- pfree(slot->tts_isnull);
+ if (!slot->tts_fixedTupleDescriptor)
+ {
+ if (slot->tts_values)
+ pfree(slot->tts_values);
+ if (slot->tts_isnull)
+ pfree(slot->tts_isnull);
+ }
pfree(slot);
}
}
@@ -198,9 +232,7 @@ ExecResetTupleTable(List *tupleTable, /* tuple table */
TupleTableSlot *
MakeSingleTupleTableSlot(TupleDesc tupdesc)
{
- TupleTableSlot *slot = MakeTupleTableSlot();
-
- ExecSetSlotDescriptor(slot, tupdesc);
+ TupleTableSlot *slot = MakeTupleTableSlot(tupdesc);
return slot;
}
@@ -220,10 +252,13 @@ ExecDropSingleTupleTableSlot(TupleTableSlot *slot)
ExecClearTuple(slot);
if (slot->tts_tupleDescriptor)
ReleaseTupleDesc(slot->tts_tupleDescriptor);
- if (slot->tts_values)
- pfree(slot->tts_values);
- if (slot->tts_isnull)
- pfree(slot->tts_isnull);
+ if (!slot->tts_fixedTupleDescriptor)
+ {
+ if (slot->tts_values)
+ pfree(slot->tts_values);
+ if (slot->tts_isnull)
+ pfree(slot->tts_isnull);
+ }
pfree(slot);
}
@@ -247,6 +282,8 @@ void
ExecSetSlotDescriptor(TupleTableSlot *slot, /* slot to change */
TupleDesc tupdesc) /* new tuple descriptor */
{
+ Assert(!slot->tts_fixedTupleDescriptor);
+
/* For safety, make sure slot is empty before changing it */
ExecClearTuple(slot);
@@ -816,7 +853,7 @@ ExecCopySlot(TupleTableSlot *dstslot, TupleTableSlot *srcslot)
*/
/* --------------------------------
- * ExecInit{Result,Scan,Extra}TupleSlot
+ * ExecInit{Result,Scan,Extra}TupleSlot[TL]
*
* These are convenience routines to initialize the specified slot
* in nodes inheriting the appropriate state. ExecInitExtraTupleSlot
@@ -825,13 +862,30 @@ ExecCopySlot(TupleTableSlot *dstslot, TupleTableSlot *srcslot)
*/
/* ----------------
- * ExecInitResultTupleSlot
+ * ExecInitResultTupleSlotTL
+ *
+ * Initialize result tuple slot, using the plan node's targetlist.
* ----------------
*/
void
-ExecInitResultTupleSlot(EState *estate, PlanState *planstate)
+ExecInitResultTupleSlotTL(EState *estate, PlanState *planstate)
{
- planstate->ps_ResultTupleSlot = ExecAllocTableSlot(&estate->es_tupleTable);
+ bool hasoid;
+ TupleDesc tupDesc;
+
+ if (ExecContextForcesOids(planstate, &hasoid))
+ {
+ /* context forces OID choice; hasoid is now set correctly */
+ }
+ else
+ {
+ /* given free choice, don't leave space for OIDs in result tuples */
+ hasoid = false;
+ }
+
+ tupDesc = ExecTypeFromTL(planstate->plan->targetlist, hasoid);
+
+ planstate->ps_ResultTupleSlot = ExecAllocTableSlot(&estate->es_tupleTable, tupDesc);
}
/* ----------------
@@ -839,19 +893,24 @@ ExecInitResultTupleSlot(EState *estate, PlanState *planstate)
* ----------------
*/
void
-ExecInitScanTupleSlot(EState *estate, ScanState *scanstate)
+ExecInitScanTupleSlot(EState *estate, ScanState *scanstate, TupleDesc tupledesc)
{
- scanstate->ss_ScanTupleSlot = ExecAllocTableSlot(&estate->es_tupleTable);
+ scanstate->ss_ScanTupleSlot = ExecAllocTableSlot(&estate->es_tupleTable,
+ tupledesc);
}
/* ----------------
* ExecInitExtraTupleSlot
+ *
+ * Return a newly created slot. If tupledesc is non-NULL it'll have that as a
+ * fixed tupledesc. Otherwise the caller needs to use ExecSetSlotDescriptor()
+ * to set the descriptor before use.
* ----------------
*/
TupleTableSlot *
-ExecInitExtraTupleSlot(EState *estate)
+ExecInitExtraTupleSlot(EState *estate, TupleDesc tupledesc)
{
- return ExecAllocTableSlot(&estate->es_tupleTable);
+ return ExecAllocTableSlot(&estate->es_tupleTable, tupledesc);
}
/* ----------------
@@ -865,9 +924,7 @@ ExecInitExtraTupleSlot(EState *estate)
TupleTableSlot *
ExecInitNullTupleSlot(EState *estate, TupleDesc tupType)
{
- TupleTableSlot *slot = ExecInitExtraTupleSlot(estate);
-
- ExecSetSlotDescriptor(slot, tupType);
+ TupleTableSlot *slot = ExecInitExtraTupleSlot(estate, tupType);
return ExecStoreAllNullTuple(slot);
}
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 876439835a3..50ccd8d6560 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -22,7 +22,6 @@
* ReScanExprContext
*
* ExecAssignExprContext Common code for plan node init routines.
- * ExecAssignResultType
* etc
*
* ExecOpenScanRelation Common code for scan node init routines.
@@ -428,47 +427,6 @@ ExecAssignExprContext(EState *estate, PlanState *planstate)
planstate->ps_ExprContext = CreateExprContext(estate);
}
-/* ----------------
- * ExecAssignResultType
- * ----------------
- */
-void
-ExecAssignResultType(PlanState *planstate, TupleDesc tupDesc)
-{
- TupleTableSlot *slot = planstate->ps_ResultTupleSlot;
-
- ExecSetSlotDescriptor(slot, tupDesc);
-}
-
-/* ----------------
- * ExecAssignResultTypeFromTL
- * ----------------
- */
-void
-ExecAssignResultTypeFromTL(PlanState *planstate)
-{
- bool hasoid;
- TupleDesc tupDesc;
-
- if (ExecContextForcesOids(planstate, &hasoid))
- {
- /* context forces OID choice; hasoid is now set correctly */
- }
- else
- {
- /* given free choice, don't leave space for OIDs in result tuples */
- hasoid = false;
- }
-
- /*
- * ExecTypeFromTL needs the parse-time representation of the tlist, not a
- * list of ExprStates. This is good because some plan nodes don't bother
- * to set up planstate->targetlist ...
- */
- tupDesc = ExecTypeFromTL(planstate->plan->targetlist, hasoid);
- ExecAssignResultType(planstate, tupDesc);
-}
-
/* ----------------
* ExecGetResultType
* ----------------
@@ -609,13 +567,9 @@ ExecFreeExprContext(PlanState *planstate)
planstate->ps_ExprContext = NULL;
}
+
/* ----------------------------------------------------------------
- * the following scan type support functions are for
- * those nodes which are stubborn and return tuples in
- * their Scan tuple slot instead of their Result tuple
- * slot.. luck fur us, these nodes do not do projections
- * so we don't have to worry about getting the ProjectionInfo
- * right for them... -cim 6/3/91
+ * Scan node support
* ----------------------------------------------------------------
*/
@@ -632,11 +586,11 @@ ExecAssignScanType(ScanState *scanstate, TupleDesc tupDesc)
}
/* ----------------
- * ExecAssignScanTypeFromOuterPlan
+ * ExecCreateScanSlotFromOuterPlan
* ----------------
*/
void
-ExecAssignScanTypeFromOuterPlan(ScanState *scanstate)
+ExecCreateScanSlotFromOuterPlan(EState *estate, ScanState *scanstate)
{
PlanState *outerPlan;
TupleDesc tupDesc;
@@ -644,15 +598,9 @@ ExecAssignScanTypeFromOuterPlan(ScanState *scanstate)
outerPlan = outerPlanState(scanstate);
tupDesc = ExecGetResultType(outerPlan);
- ExecAssignScanType(scanstate, tupDesc);
+ ExecInitScanTupleSlot(estate, scanstate, tupDesc);
}
-
-/* ----------------------------------------------------------------
- * Scan node support
- * ----------------------------------------------------------------
- */
-
/* ----------------------------------------------------------------
* ExecRelationIsTargetRelation
*
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index a964b3caf0b..f6469e42d62 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -1390,8 +1390,8 @@ find_hash_columns(AggState *aggstate)
perhash->aggnode->grpOperators,
&perhash->eqfunctions,
&perhash->hashfunctions);
- perhash->hashslot = ExecAllocTableSlot(&estate->es_tupleTable);
- ExecSetSlotDescriptor(perhash->hashslot, hashDesc);
+ perhash->hashslot =
+ ExecAllocTableSlot(&estate->es_tupleTable, hashDesc);
list_free(hashTlist);
bms_free(colnos);
@@ -2174,13 +2174,33 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
ExecAssignExprContext(estate, &aggstate->ss.ps);
/*
- * tuple table initialization.
+ * Initialize child nodes.
*
- * For hashtables, we create some additional slots below.
+ * If we are doing a hashed aggregation then the child plan does not need
+ * to handle REWIND efficiently; see ExecReScanAgg.
*/
- ExecInitScanTupleSlot(estate, &aggstate->ss);
- ExecInitResultTupleSlot(estate, &aggstate->ss.ps);
- aggstate->sort_slot = ExecInitExtraTupleSlot(estate);
+ if (node->aggstrategy == AGG_HASHED)
+ eflags &= ~EXEC_FLAG_REWIND;
+ outerPlan = outerPlan(node);
+ outerPlanState(aggstate) = ExecInitNode(outerPlan, estate, eflags);
+
+ /*
+ * initialize source tuple type.
+ */
+ ExecCreateScanSlotFromOuterPlan(estate, &aggstate->ss);
+ if (node->chain)
+ {
+ TupleDesc scandesc;
+
+ scandesc = aggstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+ aggstate->sort_slot = ExecInitExtraTupleSlot(estate, scandesc);
+ }
+
+ /*
+ * Initialize result type, slot and projection.
+ */
+ ExecInitResultTupleSlotTL(estate, &aggstate->ss.ps);
+ ExecAssignProjectionInfo(&aggstate->ss.ps, NULL);
/*
* initialize child expressions
@@ -2198,31 +2218,6 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
aggstate->ss.ps.qual =
ExecInitQual(node->plan.qual, (PlanState *) aggstate);
- /*
- * Initialize child nodes.
- *
- * If we are doing a hashed aggregation then the child plan does not need
- * to handle REWIND efficiently; see ExecReScanAgg.
- */
- if (node->aggstrategy == AGG_HASHED)
- eflags &= ~EXEC_FLAG_REWIND;
- outerPlan = outerPlan(node);
- outerPlanState(aggstate) = ExecInitNode(outerPlan, estate, eflags);
-
- /*
- * initialize source tuple type.
- */
- ExecAssignScanTypeFromOuterPlan(&aggstate->ss);
- if (node->chain)
- ExecSetSlotDescriptor(aggstate->sort_slot,
- aggstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor);
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&aggstate->ss.ps);
- ExecAssignProjectionInfo(&aggstate->ss.ps, NULL);
-
/*
* We should now have found all Aggrefs in the targetlist and quals.
*/
@@ -3026,8 +3021,8 @@ build_pertrans_for_aggref(AggStatePerTrans pertrans,
if (numSortCols > 0 || aggref->aggfilter)
{
pertrans->sortdesc = ExecTypeFromTL(aggref->args, false);
- pertrans->sortslot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(pertrans->sortslot, pertrans->sortdesc);
+ pertrans->sortslot =
+ ExecInitExtraTupleSlot(estate, pertrans->sortdesc);
}
if (numSortCols > 0)
@@ -3048,9 +3043,8 @@ build_pertrans_for_aggref(AggStatePerTrans pertrans,
else if (numDistinctCols > 0)
{
/* we will need an extra slot to store prior values */
- pertrans->uniqslot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(pertrans->uniqslot,
- pertrans->sortdesc);
+ pertrans->uniqslot =
+ ExecInitExtraTupleSlot(estate, pertrans->sortdesc);
}
/* Extract the sort information for use later */
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 246a0b2d852..25cd942f9a3 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -129,17 +129,9 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_nplans = nplans;
/*
- * Miscellaneous initialization
- *
- * Append plans don't have expression contexts because they never call
- * ExecQual or ExecProject.
+ * Initialize result tuple type and slot.
*/
-
- /*
- * append nodes still have Result slots, which hold pointers to tuples, so
- * we have to initialize them.
- */
- ExecInitResultTupleSlot(estate, &appendstate->ps);
+ ExecInitResultTupleSlotTL(estate, &appendstate->ps);
/*
* call ExecInitNode on each of the plans to be executed and save the
@@ -155,9 +147,11 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
}
/*
- * initialize output tuple type
+ * Miscellaneous initialization
+ *
+ * Append plans don't have expression contexts because they never call
+ * ExecQual or ExecProject.
*/
- ExecAssignResultTypeFromTL(&appendstate->ps);
appendstate->ps.ps_ProjInfo = NULL;
/*
diff --git a/src/backend/executor/nodeBitmapAnd.c b/src/backend/executor/nodeBitmapAnd.c
index 1c5c312c954..b2b30842c6a 100644
--- a/src/backend/executor/nodeBitmapAnd.c
+++ b/src/backend/executor/nodeBitmapAnd.c
@@ -80,13 +80,6 @@ ExecInitBitmapAnd(BitmapAnd *node, EState *estate, int eflags)
bitmapandstate->bitmapplans = bitmapplanstates;
bitmapandstate->nplans = nplans;
- /*
- * Miscellaneous initialization
- *
- * BitmapAnd plans don't have expression contexts because they never call
- * ExecQual or ExecProject. They don't need any tuple slots either.
- */
-
/*
* call ExecInitNode on each of the plans to be executed and save the
* results into the array "bitmapplanstates".
@@ -99,6 +92,13 @@ ExecInitBitmapAnd(BitmapAnd *node, EState *estate, int eflags)
i++;
}
+ /*
+ * Miscellaneous initialization
+ *
+ * BitmapAnd plans don't have expression contexts because they never call
+ * ExecQual or ExecProject. They don't need any tuple slots either.
+ */
+
return bitmapandstate;
}
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index eb5bbb57ef1..5872096979c 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -911,6 +911,33 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
+ /*
+ * open the base relation and acquire appropriate lock on it.
+ */
+ currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+
+ /*
+ * initialize child nodes
+ *
+ * We do this after ExecOpenScanRelation because the child nodes will open
+ * indexscans on our relation's indexes, and we want to be sure we have
+ * acquired a lock on the relation first.
+ */
+ outerPlanState(scanstate) = ExecInitNode(outerPlan(node), estate, eflags);
+
+ /*
+ * get the scan type from the relation descriptor.
+ */
+ ExecInitScanTupleSlot(estate, &scanstate->ss,
+ RelationGetDescr(currentRelation));
+
+
+ /*
+ * Initialize result slot, type and projection.
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecAssignScanProjectionInfo(&scanstate->ss);
+
/*
* initialize child expressions
*/
@@ -919,17 +946,6 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
scanstate->bitmapqualorig =
ExecInitQual(node->bitmapqualorig, (PlanState *) scanstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * open the base relation and acquire appropriate lock on it.
- */
- currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
-
/*
* Determine the maximum for prefetch_target. If the tablespace has a
* specific IO concurrency set, use that to compute the corresponding
@@ -957,26 +973,6 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
0,
NULL);
- /*
- * get the scan type from the relation descriptor.
- */
- ExecAssignScanType(&scanstate->ss, RelationGetDescr(currentRelation));
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
- ExecAssignScanProjectionInfo(&scanstate->ss);
-
- /*
- * initialize child nodes
- *
- * We do this last because the child nodes will open indexscans on our
- * relation's indexes, and we want to be sure we have acquired a lock on
- * the relation first.
- */
- outerPlanState(scanstate) = ExecInitNode(outerPlan(node), estate, eflags);
-
/*
* all done.
*/
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index 6feb70f4ae3..ab95c944833 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -226,6 +226,15 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
/* normally we don't make the result bitmap till runtime */
indexstate->biss_result = NULL;
+ /*
+ * We do not open or lock the base relation here. We assume that an
+ * ancestor BitmapHeapScan node is holding AccessShareLock (or better) on
+ * the heap relation throughout the execution of the plan tree.
+ */
+
+ indexstate->ss.ss_currentRelation = NULL;
+ indexstate->ss.ss_currentScanDesc = NULL;
+
/*
* Miscellaneous initialization
*
@@ -242,15 +251,6 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
* sub-parts corresponding to runtime keys (see below).
*/
- /*
- * We do not open or lock the base relation here. We assume that an
- * ancestor BitmapHeapScan node is holding AccessShareLock (or better) on
- * the heap relation throughout the execution of the plan tree.
- */
-
- indexstate->ss.ss_currentRelation = NULL;
- indexstate->ss.ss_currentScanDesc = NULL;
-
/*
* If we are just doing EXPLAIN (ie, aren't going to run the plan), stop
* here. This allows an index-advisor plugin to EXPLAIN a plan containing
diff --git a/src/backend/executor/nodeBitmapOr.c b/src/backend/executor/nodeBitmapOr.c
index 66a7a89a8b7..73c08e46523 100644
--- a/src/backend/executor/nodeBitmapOr.c
+++ b/src/backend/executor/nodeBitmapOr.c
@@ -81,13 +81,6 @@ ExecInitBitmapOr(BitmapOr *node, EState *estate, int eflags)
bitmaporstate->bitmapplans = bitmapplanstates;
bitmaporstate->nplans = nplans;
- /*
- * Miscellaneous initialization
- *
- * BitmapOr plans don't have expression contexts because they never call
- * ExecQual or ExecProject. They don't need any tuple slots either.
- */
-
/*
* call ExecInitNode on each of the plans to be executed and save the
* results into the array "bitmapplanstates".
@@ -100,6 +93,13 @@ ExecInitBitmapOr(BitmapOr *node, EState *estate, int eflags)
i++;
}
+ /*
+ * Miscellaneous initialization
+ *
+ * BitmapOr plans don't have expression contexts because they never call
+ * ExecQual or ExecProject. They don't need any tuple slots either.
+ */
+
return bitmaporstate;
}
diff --git a/src/backend/executor/nodeCtescan.c b/src/backend/executor/nodeCtescan.c
index 79676ca9787..a349c8a6455 100644
--- a/src/backend/executor/nodeCtescan.c
+++ b/src/backend/executor/nodeCtescan.c
@@ -242,31 +242,25 @@ ExecInitCteScan(CteScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
+ /*
+ * The scan tuple type (ie, the rowtype we expect to find in the work
+ * table) is the same as the result rowtype of the CTE query.
+ */
+ ExecInitScanTupleSlot(estate, &scanstate->ss,
+ ExecGetResultType(scanstate->cteplanstate));
+
+ /*
+ * Initialize result slot, type and projection.
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecAssignScanProjectionInfo(&scanstate->ss);
+
/*
* initialize child expressions
*/
scanstate->ss.ps.qual =
ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * The scan tuple type (ie, the rowtype we expect to find in the work
- * table) is the same as the result rowtype of the CTE query.
- */
- ExecAssignScanType(&scanstate->ss,
- ExecGetResultType(scanstate->cteplanstate));
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
- ExecAssignScanProjectionInfo(&scanstate->ss);
-
return scanstate;
}
diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index 5f1732d6ac0..9e398ea3133 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -54,14 +54,6 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
/* create expression context for node */
ExecAssignExprContext(estate, &css->ss.ps);
- /* initialize child expressions */
- css->ss.ps.qual =
- ExecInitQual(cscan->scan.plan.qual, (PlanState *) css);
-
- /* tuple table initialization */
- ExecInitScanTupleSlot(estate, &css->ss);
- ExecInitResultTupleSlot(estate, &css->ss.ps);
-
/*
* open the base relation, if any, and acquire an appropriate lock on it
*/
@@ -81,23 +73,27 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
TupleDesc scan_tupdesc;
scan_tupdesc = ExecTypeFromTL(cscan->custom_scan_tlist, false);
- ExecAssignScanType(&css->ss, scan_tupdesc);
+ ExecInitScanTupleSlot(estate, &css->ss, scan_tupdesc);
/* Node's targetlist will contain Vars with varno = INDEX_VAR */
tlistvarno = INDEX_VAR;
}
else
{
- ExecAssignScanType(&css->ss, RelationGetDescr(scan_rel));
+ ExecInitScanTupleSlot(estate, &css->ss, RelationGetDescr(scan_rel));
/* Node's targetlist will contain Vars with varno = scanrelid */
tlistvarno = scanrelid;
}
/*
- * Initialize result tuple type and projection info.
+ * Initialize result slot, type and projection.
*/
- ExecAssignResultTypeFromTL(&css->ss.ps);
+ ExecInitResultTupleSlotTL(estate, &css->ss.ps);
ExecAssignScanProjectionInfoWithVarno(&css->ss, tlistvarno);
+ /* initialize child expressions */
+ css->ss.ps.qual =
+ ExecInitQual(cscan->scan.plan.qual, (PlanState *) css);
+
/*
* The callback of custom-scan provider applies the final initialization
* of the custom-scan-state node according to its logic.
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index dc6cfcfa66b..37e23708510 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -155,20 +155,6 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
- /*
- * initialize child expressions
- */
- scanstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
- scanstate->fdw_recheck_quals =
- ExecInitQual(node->fdw_recheck_quals, (PlanState *) scanstate);
-
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
/*
* open the base relation, if any, and acquire an appropriate lock on it;
* also acquire function pointers from the FDW's handler
@@ -194,23 +180,31 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
TupleDesc scan_tupdesc;
scan_tupdesc = ExecTypeFromTL(node->fdw_scan_tlist, false);
- ExecAssignScanType(&scanstate->ss, scan_tupdesc);
+ ExecInitScanTupleSlot(estate, &scanstate->ss, scan_tupdesc);
/* Node's targetlist will contain Vars with varno = INDEX_VAR */
tlistvarno = INDEX_VAR;
}
else
{
- ExecAssignScanType(&scanstate->ss, RelationGetDescr(currentRelation));
+ ExecInitScanTupleSlot(estate, &scanstate->ss, RelationGetDescr(currentRelation));
/* Node's targetlist will contain Vars with varno = scanrelid */
tlistvarno = scanrelid;
}
/*
- * Initialize result tuple type and projection info.
+ * Initialize result slot, type and projection.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
ExecAssignScanProjectionInfoWithVarno(&scanstate->ss, tlistvarno);
+ /*
+ * initialize child expressions
+ */
+ scanstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
+ scanstate->fdw_recheck_quals =
+ ExecInitQual(node->fdw_recheck_quals, (PlanState *) scanstate);
+
/*
* Initialize FDW-related state.
*/
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index de476ac75c4..e7e82203a50 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -334,18 +334,6 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * initialize child expressions
- */
- scanstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
-
scanstate->funcstates = palloc(nfuncs * sizeof(FunctionScanPerFuncState));
natts = 0;
@@ -436,8 +424,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
*/
if (!scanstate->simple)
{
- fs->func_slot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(fs->func_slot, fs->tupdesc);
+ fs->func_slot = ExecInitExtraTupleSlot(estate, fs->tupdesc);
}
else
fs->func_slot = NULL;
@@ -492,14 +479,24 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
Assert(attno == natts);
}
- ExecAssignScanType(&scanstate->ss, scan_tupdesc);
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecInitScanTupleSlot(estate, &scanstate->ss, scan_tupdesc);
/*
- * Initialize result tuple type and projection info.
+ * Initialize projection.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
ExecAssignScanProjectionInfo(&scanstate->ss);
+ /*
+ * initialize child expressions
+ */
+ scanstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
+
+
/*
* Create a memory context that ExecMakeTableFunctionResult can use to
* evaluate function arguments in. We can't use the per-tuple context for
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index a44cf8409af..65daba4d026 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -59,7 +59,6 @@ ExecInitGather(Gather *node, EState *estate, int eflags)
{
GatherState *gatherstate;
Plan *outerNode;
- bool hasoid;
TupleDesc tupDesc;
/* Gather node doesn't have innerPlan node. */
@@ -85,37 +84,29 @@ ExecInitGather(Gather *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &gatherstate->ps);
- /*
- * Gather doesn't support checking a qual (it's always more efficient to
- * do it in the child node).
- */
- Assert(!node->plan.qual);
-
- /*
- * tuple table initialization
- */
- gatherstate->funnel_slot = ExecInitExtraTupleSlot(estate);
- ExecInitResultTupleSlot(estate, &gatherstate->ps);
-
/*
* now initialize outer plan
*/
outerNode = outerPlan(node);
outerPlanState(gatherstate) = ExecInitNode(outerNode, estate, eflags);
+ tupDesc = ExecGetResultType(outerPlanState(gatherstate));
+
+ /*
+ * Initialize result slot, type and projection.
+ */
+ ExecInitResultTupleSlotTL(estate, &gatherstate->ps);
+ ExecConditionalAssignProjectionInfo(&gatherstate->ps, tupDesc, OUTER_VAR);
/*
* Initialize funnel slot to same tuple descriptor as outer plan.
*/
- if (!ExecContextForcesOids(outerPlanState(gatherstate), &hasoid))
- hasoid = false;
- tupDesc = ExecTypeFromTL(outerNode->targetlist, hasoid);
- ExecSetSlotDescriptor(gatherstate->funnel_slot, tupDesc);
+ gatherstate->funnel_slot = ExecInitExtraTupleSlot(estate, tupDesc);
/*
- * Initialize result tuple type and projection info.
+ * Gather doesn't support checking a qual (it's always more efficient to
+ * do it in the child node).
*/
- ExecAssignResultTypeFromTL(&gatherstate->ps);
- ExecConditionalAssignProjectionInfo(&gatherstate->ps, tupDesc, OUTER_VAR);
+ Assert(!node->plan.qual);
return gatherstate;
}
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index 4a8a59eabf1..aaa595351d1 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -73,7 +73,6 @@ ExecInitGatherMerge(GatherMerge *node, EState *estate, int eflags)
{
GatherMergeState *gm_state;
Plan *outerNode;
- bool hasoid;
TupleDesc tupDesc;
/* Gather merge node doesn't have innerPlan node. */
@@ -104,11 +103,6 @@ ExecInitGatherMerge(GatherMerge *node, EState *estate, int eflags)
*/
Assert(!node->plan.qual);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &gm_state->ps);
-
/*
* now initialize outer plan
*/
@@ -119,15 +113,13 @@ ExecInitGatherMerge(GatherMerge *node, EState *estate, int eflags)
* Store the tuple descriptor into gather merge state, so we can use it
* while initializing the gather merge slots.
*/
- if (!ExecContextForcesOids(outerPlanState(gm_state), &hasoid))
- hasoid = false;
- tupDesc = ExecTypeFromTL(outerNode->targetlist, hasoid);
+ tupDesc = ExecGetResultType(outerPlanState(gm_state));
gm_state->tupDesc = tupDesc;
/*
- * Initialize result tuple type and projection info.
+ * Initialize result slot, type and projection.
*/
- ExecAssignResultTypeFromTL(&gm_state->ps);
+ ExecInitResultTupleSlotTL(estate, &gm_state->ps);
ExecConditionalAssignProjectionInfo(&gm_state->ps, tupDesc, OUTER_VAR);
/*
@@ -410,9 +402,8 @@ gather_merge_setup(GatherMergeState *gm_state)
(HeapTuple *) palloc0(sizeof(HeapTuple) * MAX_TUPLE_STORE);
/* Initialize tuple slot for worker */
- gm_state->gm_slots[i + 1] = ExecInitExtraTupleSlot(gm_state->ps.state);
- ExecSetSlotDescriptor(gm_state->gm_slots[i + 1],
- gm_state->tupDesc);
+ gm_state->gm_slots[i + 1] =
+ ExecInitExtraTupleSlot(gm_state->ps.state, gm_state->tupDesc);
}
/* Allocate the resources for the merge */
diff --git a/src/backend/executor/nodeGroup.c b/src/backend/executor/nodeGroup.c
index b9ba0f7c702..a30f6f15278 100644
--- a/src/backend/executor/nodeGroup.c
+++ b/src/backend/executor/nodeGroup.c
@@ -181,10 +181,20 @@ ExecInitGroup(Group *node, EState *estate, int eflags)
ExecAssignExprContext(estate, &grpstate->ss.ps);
/*
- * tuple table initialization
+ * initialize child nodes
*/
- ExecInitScanTupleSlot(estate, &grpstate->ss);
- ExecInitResultTupleSlot(estate, &grpstate->ss.ps);
+ outerPlanState(grpstate) = ExecInitNode(outerPlan(node), estate, eflags);
+
+ /*
+ * Initialize scan slot and type.
+ */
+ ExecCreateScanSlotFromOuterPlan(estate, &grpstate->ss);
+
+ /*
+ * Initialize result slot, type and projection.
+ */
+ ExecInitResultTupleSlotTL(estate, &grpstate->ss.ps);
+ ExecAssignProjectionInfo(&grpstate->ss.ps, NULL);
/*
* initialize child expressions
@@ -192,22 +202,6 @@ ExecInitGroup(Group *node, EState *estate, int eflags)
grpstate->ss.ps.qual =
ExecInitQual(node->plan.qual, (PlanState *) grpstate);
- /*
- * initialize child nodes
- */
- outerPlanState(grpstate) = ExecInitNode(outerPlan(node), estate, eflags);
-
- /*
- * initialize tuple type.
- */
- ExecAssignScanTypeFromOuterPlan(&grpstate->ss);
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&grpstate->ss.ps);
- ExecAssignProjectionInfo(&grpstate->ss.ps, NULL);
-
/*
* Precompute fmgr lookup data for inner loop
*/
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 6fe5d69d558..8f9b7dd51ea 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -184,9 +184,16 @@ ExecInitHash(Hash *node, EState *estate, int eflags)
ExecAssignExprContext(estate, &hashstate->ps);
/*
- * initialize our result slot
+ * initialize child nodes
*/
- ExecInitResultTupleSlot(estate, &hashstate->ps);
+ outerPlanState(hashstate) = ExecInitNode(outerPlan(node), estate, eflags);
+
+ /*
+ * Initialize our result slot and type. No need to build projection info
+ * because this node doesn't do projections.
+ */
+ ExecInitResultTupleSlotTL(estate, &hashstate->ps);
+ hashstate->ps.ps_ProjInfo = NULL;
/*
* initialize child expressions
@@ -194,18 +201,6 @@ ExecInitHash(Hash *node, EState *estate, int eflags)
hashstate->ps.qual =
ExecInitQual(node->plan.qual, (PlanState *) hashstate);
- /*
- * initialize child nodes
- */
- outerPlanState(hashstate) = ExecInitNode(outerPlan(node), estate, eflags);
-
- /*
- * initialize tuple type. no need to initialize projection info because
- * this node doesn't do projections
- */
- ExecAssignResultTypeFromTL(&hashstate->ps);
- hashstate->ps.ps_ProjInfo = NULL;
-
return hashstate;
}
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index ab1632cc13d..d9efbff77e3 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -389,6 +389,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
List *lclauses;
List *rclauses;
List *hoperators;
+ TupleDesc outerDesc, innerDesc;
ListCell *l;
/* check for unsupported flags */
@@ -401,6 +402,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
hjstate->js.ps.plan = (Plan *) node;
hjstate->js.ps.state = estate;
hjstate->js.ps.ExecProcNode = ExecHashJoin;
+ hjstate->js.jointype = node->join.jointype;
/*
* Miscellaneous initialization
@@ -409,17 +411,6 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &hjstate->js.ps);
- /*
- * initialize child expressions
- */
- hjstate->js.ps.qual =
- ExecInitQual(node->join.plan.qual, (PlanState *) hjstate);
- hjstate->js.jointype = node->join.jointype;
- hjstate->js.joinqual =
- ExecInitQual(node->join.joinqual, (PlanState *) hjstate);
- hjstate->hashclauses =
- ExecInitQual(node->hashclauses, (PlanState *) hjstate);
-
/*
* initialize child nodes
*
@@ -431,13 +422,15 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
hashNode = (Hash *) innerPlan(node);
outerPlanState(hjstate) = ExecInitNode(outerNode, estate, eflags);
+ outerDesc = ExecGetResultType(outerPlanState(hjstate));
innerPlanState(hjstate) = ExecInitNode((Plan *) hashNode, estate, eflags);
+ innerDesc = ExecGetResultType(innerPlanState(hjstate));
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &hjstate->js.ps);
- hjstate->hj_OuterTupleSlot = ExecInitExtraTupleSlot(estate);
+ ExecInitResultTupleSlotTL(estate, &hjstate->js.ps);
+ hjstate->hj_OuterTupleSlot = ExecInitExtraTupleSlot(estate, outerDesc);
/*
* detect whether we need only consider the first matching inner tuple
@@ -454,21 +447,17 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
case JOIN_LEFT:
case JOIN_ANTI:
hjstate->hj_NullInnerTupleSlot =
- ExecInitNullTupleSlot(estate,
- ExecGetResultType(innerPlanState(hjstate)));
+ ExecInitNullTupleSlot(estate, innerDesc);
break;
case JOIN_RIGHT:
hjstate->hj_NullOuterTupleSlot =
- ExecInitNullTupleSlot(estate,
- ExecGetResultType(outerPlanState(hjstate)));
+ ExecInitNullTupleSlot(estate, outerDesc);
break;
case JOIN_FULL:
hjstate->hj_NullOuterTupleSlot =
- ExecInitNullTupleSlot(estate,
- ExecGetResultType(outerPlanState(hjstate)));
+ ExecInitNullTupleSlot(estate, outerDesc);
hjstate->hj_NullInnerTupleSlot =
- ExecInitNullTupleSlot(estate,
- ExecGetResultType(innerPlanState(hjstate)));
+ ExecInitNullTupleSlot(estate, innerDesc);
break;
default:
elog(ERROR, "unrecognized join type: %d",
@@ -490,13 +479,19 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
}
/*
- * initialize tuple type and projection info
+ * initialize projection info
*/
- ExecAssignResultTypeFromTL(&hjstate->js.ps);
ExecAssignProjectionInfo(&hjstate->js.ps, NULL);
- ExecSetSlotDescriptor(hjstate->hj_OuterTupleSlot,
- ExecGetResultType(outerPlanState(hjstate)));
+ /*
+ * initialize child expressions
+ */
+ hjstate->js.ps.qual =
+ ExecInitQual(node->join.plan.qual, (PlanState *) hjstate);
+ hjstate->js.joinqual =
+ ExecInitQual(node->join.joinqual, (PlanState *) hjstate);
+ hjstate->hashclauses =
+ ExecInitQual(node->hashclauses, (PlanState *) hjstate);
/*
* initialize hash-specific info
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index c54c5aa6591..4f9db40cbcd 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -474,23 +474,6 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &indexstate->ss.ps);
- /*
- * initialize child expressions
- *
- * Note: we don't initialize all of the indexorderby expression, only the
- * sub-parts corresponding to runtime keys (see below).
- */
- indexstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) indexstate);
- indexstate->indexqual =
- ExecInitQual(node->indexqual, (PlanState *) indexstate);
-
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &indexstate->ss.ps);
- ExecInitScanTupleSlot(estate, &indexstate->ss);
-
/*
* open the base relation and acquire appropriate lock on it.
*/
@@ -507,16 +490,27 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
* suitable data anyway.)
*/
tupDesc = ExecTypeFromTL(node->indextlist, false);
- ExecAssignScanType(&indexstate->ss, tupDesc);
+ ExecInitScanTupleSlot(estate, &indexstate->ss, tupDesc);
/*
- * Initialize result tuple type and projection info. The node's
+ * Initialize result slot, type and projection info. The node's
* targetlist will contain Vars with varno = INDEX_VAR, referencing the
* scan tuple.
*/
- ExecAssignResultTypeFromTL(&indexstate->ss.ps);
+ ExecInitResultTupleSlotTL(estate, &indexstate->ss.ps);
ExecAssignScanProjectionInfoWithVarno(&indexstate->ss, INDEX_VAR);
+ /*
+ * initialize child expressions
+ *
+ * Note: we don't initialize all of the indexorderby expression, only the
+ * sub-parts corresponding to runtime keys (see below).
+ */
+ indexstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) indexstate);
+ indexstate->indexqual =
+ ExecInitQual(node->indexqual, (PlanState *) indexstate);
+
/*
* If we are just doing EXPLAIN (ie, aren't going to run the plan), stop
* here. This allows an index-advisor plugin to EXPLAIN a plan containing
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 2ffef231077..ca1458f6b9c 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -900,6 +900,26 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &indexstate->ss.ps);
+ /*
+ * open the base relation and acquire appropriate lock on it.
+ */
+ currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+
+ indexstate->ss.ss_currentRelation = currentRelation;
+ indexstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
+
+ /*
+ * get the scan type from the relation descriptor.
+ */
+ ExecInitScanTupleSlot(estate, &indexstate->ss,
+ RelationGetDescr(currentRelation));
+
+ /*
+ * Initialize result slot, type and projection.
+ */
+ ExecInitResultTupleSlotTL(estate, &indexstate->ss.ps);
+ ExecAssignScanProjectionInfo(&indexstate->ss);
+
/*
* initialize child expressions
*
@@ -917,31 +937,6 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
indexstate->indexorderbyorig =
ExecInitExprList(node->indexorderbyorig, (PlanState *) indexstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &indexstate->ss.ps);
- ExecInitScanTupleSlot(estate, &indexstate->ss);
-
- /*
- * open the base relation and acquire appropriate lock on it.
- */
- currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
-
- indexstate->ss.ss_currentRelation = currentRelation;
- indexstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
-
- /*
- * get the scan type from the relation descriptor.
- */
- ExecAssignScanType(&indexstate->ss, RelationGetDescr(currentRelation));
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&indexstate->ss.ps);
- ExecAssignScanProjectionInfo(&indexstate->ss);
-
/*
* If we are just doing EXPLAIN (ie, aren't going to run the plan), stop
* here. This allows an index-advisor plugin to EXPLAIN a plan containing
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index 883f46ce7c9..fcd0967f54a 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -353,6 +353,12 @@ ExecInitLimit(Limit *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &limitstate->ps);
+ /*
+ * initialize outer plan
+ */
+ outerPlan = outerPlan(node);
+ outerPlanState(limitstate) = ExecInitNode(outerPlan, estate, eflags);
+
/*
* initialize child expressions
*/
@@ -362,21 +368,15 @@ ExecInitLimit(Limit *node, EState *estate, int eflags)
(PlanState *) limitstate);
/*
- * Tuple table initialization (XXX not actually used...)
+ * Tuple table initialization (XXX not actually used, but upper nodes
+ * access it to get this node's result tupledesc...)
*/
- ExecInitResultTupleSlot(estate, &limitstate->ps);
-
- /*
- * then initialize outer plan
- */
- outerPlan = outerPlan(node);
- outerPlanState(limitstate) = ExecInitNode(outerPlan, estate, eflags);
+ ExecInitResultTupleSlotTL(estate, &limitstate->ps);
/*
* limit nodes do no projections, so initialize projection info for this
* node appropriately
*/
- ExecAssignResultTypeFromTL(&limitstate->ps);
limitstate->ps.ps_ProjInfo = NULL;
return limitstate;
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 93895600a5d..9999437bef3 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -370,13 +370,15 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
/*
* Miscellaneous initialization
*
- * LockRows nodes never call ExecQual or ExecProject.
+ * LockRows nodes never call ExecQual or ExecProject, therefore no
+ * ExprContext is needed.
*/
/*
- * Tuple table initialization (XXX not actually used...)
+ * Tuple table initialization (XXX not actually used, but upper nodes
+ * access it to get this node's result tupledesc...)
*/
- ExecInitResultTupleSlot(estate, &lrstate->ps);
+ ExecInitResultTupleSlotTL(estate, &lrstate->ps);
/*
* then initialize outer plan
@@ -387,7 +389,6 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
* LockRows nodes do no projections, so initialize projection info for
* this node appropriately
*/
- ExecAssignResultTypeFromTL(&lrstate->ps);
lrstate->ps.ps_ProjInfo = NULL;
/*
diff --git a/src/backend/executor/nodeMaterial.c b/src/backend/executor/nodeMaterial.c
index 91178f10198..d69d548ac18 100644
--- a/src/backend/executor/nodeMaterial.c
+++ b/src/backend/executor/nodeMaterial.c
@@ -211,8 +211,7 @@ ExecInitMaterial(Material *node, EState *estate, int eflags)
*
* material nodes only return tuples from their materialized relation.
*/
- ExecInitResultTupleSlot(estate, &matstate->ss.ps);
- ExecInitScanTupleSlot(estate, &matstate->ss);
+ ExecInitResultTupleSlotTL(estate, &matstate->ss.ps);
/*
* initialize child nodes
@@ -229,8 +228,7 @@ ExecInitMaterial(Material *node, EState *estate, int eflags)
* initialize tuple type. no need to initialize projection info because
* this node doesn't do projections.
*/
- ExecAssignResultTypeFromTL(&matstate->ss.ps);
- ExecAssignScanTypeFromOuterPlan(&matstate->ss);
+ ExecCreateScanSlotFromOuterPlan(estate, &matstate->ss);
matstate->ss.ps.ps_ProjInfo = NULL;
return matstate;
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 6bf490bd700..00bc2c47f0c 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -109,7 +109,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* MergeAppend nodes do have Result slots, which hold pointers to tuples,
* so we have to initialize them.
*/
- ExecInitResultTupleSlot(estate, &mergestate->ps);
+ ExecInitResultTupleSlotTL(estate, &mergestate->ps);
/*
* call ExecInitNode on each of the plans to be executed and save the
@@ -124,10 +124,6 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
i++;
}
- /*
- * initialize output tuple type
- */
- ExecAssignResultTypeFromTL(&mergestate->ps);
mergestate->ps.ps_ProjInfo = NULL;
/*
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index ef9e1ee4710..d405b113894 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -1436,6 +1436,7 @@ MergeJoinState *
ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
{
MergeJoinState *mergestate;
+ TupleDesc outerDesc, innerDesc;
/* check for unsupported flags */
Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
@@ -1450,6 +1451,8 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
mergestate->js.ps.plan = (Plan *) node;
mergestate->js.ps.state = estate;
mergestate->js.ps.ExecProcNode = ExecMergeJoin;
+ mergestate->js.jointype = node->join.jointype;
+ mergestate->mj_ConstFalseJoin = false;
/*
* Miscellaneous initialization
@@ -1466,17 +1469,6 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
mergestate->mj_OuterEContext = CreateExprContext(estate);
mergestate->mj_InnerEContext = CreateExprContext(estate);
- /*
- * initialize child expressions
- */
- mergestate->js.ps.qual =
- ExecInitQual(node->join.plan.qual, (PlanState *) mergestate);
- mergestate->js.jointype = node->join.jointype;
- mergestate->js.joinqual =
- ExecInitQual(node->join.joinqual, (PlanState *) mergestate);
- mergestate->mj_ConstFalseJoin = false;
- /* mergeclauses are handled below */
-
/*
* initialize child nodes
*
@@ -1488,10 +1480,12 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
mergestate->mj_SkipMarkRestore = node->skip_mark_restore;
outerPlanState(mergestate) = ExecInitNode(outerPlan(node), estate, eflags);
+ outerDesc = ExecGetResultType(outerPlanState(mergestate));
innerPlanState(mergestate) = ExecInitNode(innerPlan(node), estate,
mergestate->mj_SkipMarkRestore ?
eflags :
(eflags | EXEC_FLAG_MARK));
+ innerDesc = ExecGetResultType(innerPlanState(mergestate));
/*
* For certain types of inner child nodes, it is advantageous to issue
@@ -1510,14 +1504,25 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
else
mergestate->mj_ExtraMarks = false;
+ /*
+ * Initialize result slot, type and projection.
+ */
+ ExecInitResultTupleSlotTL(estate, &mergestate->js.ps);
+ ExecAssignProjectionInfo(&mergestate->js.ps, NULL);
+
/*
* tuple table initialization
*/
- ExecInitResultTupleSlot(estate, &mergestate->js.ps);
+ mergestate->mj_MarkedTupleSlot = ExecInitExtraTupleSlot(estate, innerDesc);
- mergestate->mj_MarkedTupleSlot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(mergestate->mj_MarkedTupleSlot,
- ExecGetResultType(innerPlanState(mergestate)));
+ /*
+ * initialize child expressions
+ */
+ mergestate->js.ps.qual =
+ ExecInitQual(node->join.plan.qual, (PlanState *) mergestate);
+ mergestate->js.joinqual =
+ ExecInitQual(node->join.joinqual, (PlanState *) mergestate);
+ /* mergeclauses are handled below */
/*
* detect whether we need only consider the first matching inner tuple
@@ -1538,15 +1543,13 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
mergestate->mj_FillOuter = true;
mergestate->mj_FillInner = false;
mergestate->mj_NullInnerTupleSlot =
- ExecInitNullTupleSlot(estate,
- ExecGetResultType(innerPlanState(mergestate)));
+ ExecInitNullTupleSlot(estate, innerDesc);
break;
case JOIN_RIGHT:
mergestate->mj_FillOuter = false;
mergestate->mj_FillInner = true;
mergestate->mj_NullOuterTupleSlot =
- ExecInitNullTupleSlot(estate,
- ExecGetResultType(outerPlanState(mergestate)));
+ ExecInitNullTupleSlot(estate, outerDesc);
/*
* Can't handle right or full join with non-constant extra
@@ -1562,11 +1565,9 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
mergestate->mj_FillOuter = true;
mergestate->mj_FillInner = true;
mergestate->mj_NullOuterTupleSlot =
- ExecInitNullTupleSlot(estate,
- ExecGetResultType(outerPlanState(mergestate)));
+ ExecInitNullTupleSlot(estate, outerDesc);
mergestate->mj_NullInnerTupleSlot =
- ExecInitNullTupleSlot(estate,
- ExecGetResultType(innerPlanState(mergestate)));
+ ExecInitNullTupleSlot(estate, innerDesc);
/*
* Can't handle right or full join with non-constant extra
@@ -1583,12 +1584,6 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
(int) node->join.jointype);
}
- /*
- * initialize tuple type and projection info
- */
- ExecAssignResultTypeFromTL(&mergestate->js.ps);
- ExecAssignProjectionInfo(&mergestate->js.ps, NULL);
-
/*
* preprocess the merge clauses
*/
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 82cd4462a3e..105b03c92d7 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2070,8 +2070,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
mtstate->ps.plan->targetlist = (List *) linitial(node->returningLists);
/* Set up a slot for the output of the RETURNING projection(s) */
- ExecInitResultTupleSlot(estate, &mtstate->ps);
- ExecAssignResultTypeFromTL(&mtstate->ps);
+ ExecInitResultTupleSlotTL(estate, &mtstate->ps);
slot = mtstate->ps.ps_ResultTupleSlot;
/* Need an econtext too */
@@ -2125,8 +2124,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* expects one (maybe should change that?).
*/
mtstate->ps.plan->targetlist = NIL;
- ExecInitResultTupleSlot(estate, &mtstate->ps);
- ExecAssignResultTypeFromTL(&mtstate->ps);
+ ExecInitResultTupleSlotTL(estate, &mtstate->ps);
mtstate->ps.ps_ExprContext = NULL;
}
@@ -2143,6 +2141,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
if (node->onConflictAction == ONCONFLICT_UPDATE)
{
ExprContext *econtext;
+ TupleDesc relationDesc;
TupleDesc tupDesc;
/* insert may only have one plan, inheritance is not expanded */
@@ -2153,26 +2152,26 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ExecAssignExprContext(estate, &mtstate->ps);
econtext = mtstate->ps.ps_ExprContext;
+ relationDesc = resultRelInfo->ri_RelationDesc->rd_att;
/* initialize slot for the existing tuple */
- mtstate->mt_existing = ExecInitExtraTupleSlot(mtstate->ps.state);
- ExecSetSlotDescriptor(mtstate->mt_existing,
- resultRelInfo->ri_RelationDesc->rd_att);
+ mtstate->mt_existing =
+ ExecInitExtraTupleSlot(mtstate->ps.state, relationDesc);
/* carried forward solely for the benefit of explain */
mtstate->mt_excludedtlist = node->exclRelTlist;
/* create target slot for UPDATE SET projection */
tupDesc = ExecTypeFromTL((List *) node->onConflictSet,
- resultRelInfo->ri_RelationDesc->rd_rel->relhasoids);
- mtstate->mt_conflproj = ExecInitExtraTupleSlot(mtstate->ps.state);
- ExecSetSlotDescriptor(mtstate->mt_conflproj, tupDesc);
+ relationDesc->tdhasoid);
+ mtstate->mt_conflproj =
+ ExecInitExtraTupleSlot(mtstate->ps.state, tupDesc);
/* build UPDATE SET projection state */
resultRelInfo->ri_onConflictSetProj =
ExecBuildProjectionInfo(node->onConflictSet, econtext,
mtstate->mt_conflproj, &mtstate->ps,
- resultRelInfo->ri_RelationDesc->rd_att);
+ relationDesc);
/* build DO UPDATE WHERE clause expression */
if (node->onConflictWhere)
@@ -2277,7 +2276,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
j = ExecInitJunkFilter(subplan->targetlist,
resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
- ExecInitExtraTupleSlot(estate));
+ ExecInitExtraTupleSlot(estate, NULL));
if (operation == CMD_UPDATE || operation == CMD_DELETE)
{
@@ -2327,7 +2326,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* we keep it in the estate.
*/
if (estate->es_trig_tuple_slot == NULL)
- estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate);
+ estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate, NULL);
/*
* Lastly, if this is not the primary (canSetTag) ModifyTable node, add it
diff --git a/src/backend/executor/nodeNamedtuplestorescan.c b/src/backend/executor/nodeNamedtuplestorescan.c
index 3a65b9f5dc9..e3c17d8613b 100644
--- a/src/backend/executor/nodeNamedtuplestorescan.c
+++ b/src/backend/executor/nodeNamedtuplestorescan.c
@@ -132,6 +132,13 @@ ExecInitNamedTuplestoreScan(NamedTuplestoreScan *node, EState *estate, int eflag
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
+ /*
+ * Tuple table and result type initialization. The scan tuple type is
+ * specified for the tuplestore.
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecInitScanTupleSlot(estate, &scanstate->ss, scanstate->tupdesc);
+
/*
* initialize child expressions
*/
@@ -139,20 +146,8 @@ ExecInitNamedTuplestoreScan(NamedTuplestoreScan *node, EState *estate, int eflag
ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
/*
- * tuple table initialization
+ * Initialize projection.
*/
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * The scan tuple type is specified for the tuplestore.
- */
- ExecAssignScanType(&scanstate->ss, scanstate->tupdesc);
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
ExecAssignScanProjectionInfo(&scanstate->ss);
return scanstate;
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 4447b7c051a..540bd249fe3 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -285,15 +285,6 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &nlstate->js.ps);
- /*
- * initialize child expressions
- */
- nlstate->js.ps.qual =
- ExecInitQual(node->join.plan.qual, (PlanState *) nlstate);
- nlstate->js.jointype = node->join.jointype;
- nlstate->js.joinqual =
- ExecInitQual(node->join.joinqual, (PlanState *) nlstate);
-
/*
* initialize child nodes
*
@@ -311,9 +302,19 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
innerPlanState(nlstate) = ExecInitNode(innerPlan(node), estate, eflags);
/*
- * tuple table initialization
+ * Initialize result slot, type and projection.
*/
- ExecInitResultTupleSlot(estate, &nlstate->js.ps);
+ ExecInitResultTupleSlotTL(estate, &nlstate->js.ps);
+ ExecAssignProjectionInfo(&nlstate->js.ps, NULL);
+
+ /*
+ * initialize child expressions
+ */
+ nlstate->js.ps.qual =
+ ExecInitQual(node->join.plan.qual, (PlanState *) nlstate);
+ nlstate->js.jointype = node->join.jointype;
+ nlstate->js.joinqual =
+ ExecInitQual(node->join.joinqual, (PlanState *) nlstate);
/*
* detect whether we need only consider the first matching inner tuple
@@ -338,12 +339,6 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
(int) node->join.jointype);
}
- /*
- * initialize tuple type and projection info
- */
- ExecAssignResultTypeFromTL(&nlstate->js.ps);
- ExecAssignProjectionInfo(&nlstate->js.ps, NULL);
-
/*
* finally, wipe the current outer tuple clean.
*/
diff --git a/src/backend/executor/nodeProjectSet.c b/src/backend/executor/nodeProjectSet.c
index 30789bcce4d..f3186c51446 100644
--- a/src/backend/executor/nodeProjectSet.c
+++ b/src/backend/executor/nodeProjectSet.c
@@ -243,14 +243,6 @@ ExecInitProjectSet(ProjectSet *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &state->ps);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &state->ps);
-
- /* We don't support any qual on ProjectSet nodes */
- Assert(node->plan.qual == NIL);
-
/*
* initialize child nodes
*/
@@ -262,9 +254,9 @@ ExecInitProjectSet(ProjectSet *node, EState *estate, int eflags)
Assert(innerPlan(node) == NULL);
/*
- * initialize tuple type and projection info
+ * tuple table and result type initialization
*/
- ExecAssignResultTypeFromTL(&state->ps);
+ ExecInitResultTupleSlotTL(estate, &state->ps);
/* Create workspace for per-tlist-entry expr state & SRF-is-done state */
state->nelems = list_length(node->plan.targetlist);
@@ -301,6 +293,8 @@ ExecInitProjectSet(ProjectSet *node, EState *estate, int eflags)
off++;
}
+ /* We don't support any qual on ProjectSet nodes */
+ Assert(node->plan.qual == NIL);
/*
* Create a memory context that ExecMakeFunctionResult can use to evaluate
diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index ed229158038..86110cb7707 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -229,14 +229,13 @@ ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags)
* RecursiveUnion nodes still have Result slots, which hold pointers to
* tuples, so we have to initialize them.
*/
- ExecInitResultTupleSlot(estate, &rustate->ps);
+ ExecInitResultTupleSlotTL(estate, &rustate->ps);
/*
- * Initialize result tuple type and projection info. (Note: we have to
- * set up the result type before initializing child nodes, because
- * nodeWorktablescan.c expects it to be valid.)
+ * Initialize result tuple type. (Note: we have to set up the result type
+ * before initializing child nodes, because nodeWorktablescan.c expects it
+ * to be valid.)
*/
- ExecAssignResultTypeFromTL(&rustate->ps);
rustate->ps.ps_ProjInfo = NULL;
/*
diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index 4c879d87655..40bcefb50a9 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -204,19 +204,6 @@ ExecInitResult(Result *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &resstate->ps);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &resstate->ps);
-
- /*
- * initialize child expressions
- */
- resstate->ps.qual =
- ExecInitQual(node->plan.qual, (PlanState *) resstate);
- resstate->resconstantqual =
- ExecInitQual((List *) node->resconstantqual, (PlanState *) resstate);
-
/*
* initialize child nodes
*/
@@ -228,11 +215,19 @@ ExecInitResult(Result *node, EState *estate, int eflags)
Assert(innerPlan(node) == NULL);
/*
- * initialize tuple type and projection info
+ * Initialize result slot, type and projection.
*/
- ExecAssignResultTypeFromTL(&resstate->ps);
+ ExecInitResultTupleSlotTL(estate, &resstate->ps);
ExecAssignProjectionInfo(&resstate->ps, NULL);
+ /*
+ * initialize child expressions
+ */
+ resstate->ps.qual =
+ ExecInitQual(node->plan.qual, (PlanState *) resstate);
+ resstate->resconstantqual =
+ ExecInitQual((List *) node->resconstantqual, (PlanState *) resstate);
+
return resstate;
}
diff --git a/src/backend/executor/nodeSamplescan.c b/src/backend/executor/nodeSamplescan.c
index 9c74a836e40..c40bb2735cf 100644
--- a/src/backend/executor/nodeSamplescan.c
+++ b/src/backend/executor/nodeSamplescan.c
@@ -26,7 +26,6 @@
#include "utils/rel.h"
#include "utils/tqual.h"
-static void InitScanRelation(SampleScanState *node, EState *estate, int eflags);
static TupleTableSlot *SampleNext(SampleScanState *node);
static void tablesample_init(SampleScanState *scanstate);
static HeapTuple tablesample_getnext(SampleScanState *scanstate);
@@ -106,35 +105,6 @@ ExecSampleScan(PlanState *pstate)
(ExecScanRecheckMtd) SampleRecheck);
}
-/* ----------------------------------------------------------------
- * InitScanRelation
- *
- * Set up to access the scan relation.
- * ----------------------------------------------------------------
- */
-static void
-InitScanRelation(SampleScanState *node, EState *estate, int eflags)
-{
- Relation currentRelation;
-
- /*
- * get the relation object id from the relid'th entry in the range table,
- * open that relation and acquire appropriate lock on it.
- */
- currentRelation = ExecOpenScanRelation(estate,
- ((SampleScan *) node->ss.ps.plan)->scan.scanrelid,
- eflags);
-
- node->ss.ss_currentRelation = currentRelation;
-
- /* we won't set up the HeapScanDesc till later */
- node->ss.ss_currentScanDesc = NULL;
-
- /* and report the scan tuple slot's rowtype */
- ExecAssignScanType(&node->ss, RelationGetDescr(currentRelation));
-}
-
-
/* ----------------------------------------------------------------
* ExecInitSampleScan
* ----------------------------------------------------------------
@@ -164,6 +134,32 @@ ExecInitSampleScan(SampleScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
+ /*
+ * Initialize scan relation.
+ *
+ * Get the relation object id from the relid'th entry in the range table,
+ * open that relation and acquire appropriate lock on it.
+ */
+ scanstate->ss.ss_currentRelation =
+ ExecOpenScanRelation(estate,
+ node->scan.scanrelid,
+ eflags);
+
+ /* we won't set up the HeapScanDesc till later */
+ scanstate->ss.ss_currentScanDesc = NULL;
+
+ /* and create slot with appropriate rowtype */
+ ExecInitScanTupleSlot(estate, &scanstate->ss,
+ RelationGetDescr(scanstate->ss.ss_currentRelation));
+
+
+ /*
+ * Initialize result slot, type and projection.
+ * tuple table and result tuple initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecAssignScanProjectionInfo(&scanstate->ss);
+
/*
* initialize child expressions
*/
@@ -174,23 +170,6 @@ ExecInitSampleScan(SampleScan *node, EState *estate, int eflags)
scanstate->repeatable =
ExecInitExpr(tsc->repeatable, (PlanState *) scanstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * initialize scan relation
- */
- InitScanRelation(scanstate, estate, eflags);
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
- ExecAssignScanProjectionInfo(&scanstate->ss);
-
/*
* If we don't have a REPEATABLE clause, select a random seed. We want to
* do this just once, since the seed shouldn't change over rescans.
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index a5bd60e5795..59b759db533 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -32,7 +32,6 @@
#include "executor/nodeSeqscan.h"
#include "utils/rel.h"
-static void InitScanRelation(SeqScanState *node, EState *estate, int eflags);
static TupleTableSlot *SeqNext(SeqScanState *node);
/* ----------------------------------------------------------------
@@ -132,31 +131,6 @@ ExecSeqScan(PlanState *pstate)
(ExecScanRecheckMtd) SeqRecheck);
}
-/* ----------------------------------------------------------------
- * InitScanRelation
- *
- * Set up to access the scan relation.
- * ----------------------------------------------------------------
- */
-static void
-InitScanRelation(SeqScanState *node, EState *estate, int eflags)
-{
- Relation currentRelation;
-
- /*
- * get the relation object id from the relid'th entry in the range table,
- * open that relation and acquire appropriate lock on it.
- */
- currentRelation = ExecOpenScanRelation(estate,
- ((SeqScan *) node->ss.ps.plan)->scanrelid,
- eflags);
-
- node->ss.ss_currentRelation = currentRelation;
-
- /* and report the scan tuple slot's rowtype */
- ExecAssignScanType(&node->ss, RelationGetDescr(currentRelation));
-}
-
/* ----------------------------------------------------------------
* ExecInitSeqScan
@@ -189,29 +163,33 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
+ /*
+ * Initialize scan relation.
+ *
+ * Get the relation object id from the relid'th entry in the range table,
+ * open that relation and acquire appropriate lock on it.
+ */
+ scanstate->ss.ss_currentRelation =
+ ExecOpenScanRelation(estate,
+ node->scanrelid,
+ eflags);
+
+ /* and create slot with the appropriate rowtype */
+ ExecInitScanTupleSlot(estate, &scanstate->ss,
+ RelationGetDescr(scanstate->ss.ss_currentRelation));
+
+ /*
+ * Initialize result slot, type and projection.
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecAssignScanProjectionInfo(&scanstate->ss);
+
/*
* initialize child expressions
*/
scanstate->ss.ps.qual =
ExecInitQual(node->plan.qual, (PlanState *) scanstate);
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * initialize scan relation
- */
- InitScanRelation(scanstate, estate, eflags);
-
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
- ExecAssignScanProjectionInfo(&scanstate->ss);
-
return scanstate;
}
diff --git a/src/backend/executor/nodeSetOp.c b/src/backend/executor/nodeSetOp.c
index e5300c20692..56a412aeea4 100644
--- a/src/backend/executor/nodeSetOp.c
+++ b/src/backend/executor/nodeSetOp.c
@@ -520,11 +520,6 @@ ExecInitSetOp(SetOp *node, EState *estate, int eflags)
"SetOp hash table",
ALLOCSET_DEFAULT_SIZES);
- /*
- * Tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &setopstate->ps);
-
/*
* initialize child nodes
*
@@ -536,10 +531,10 @@ ExecInitSetOp(SetOp *node, EState *estate, int eflags)
outerPlanState(setopstate) = ExecInitNode(outerPlan(node), estate, eflags);
/*
- * setop nodes do no projections, so initialize projection info for this
- * node appropriately
+ * Initialize result slot and type. Setop nodes do no projections, so
+ * initialize projection info for this node appropriately.
*/
- ExecAssignResultTypeFromTL(&setopstate->ps);
+ ExecInitResultTupleSlotTL(estate, &setopstate->ps);
setopstate->ps.ps_ProjInfo = NULL;
/*
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 73aa3715e6d..e0d1b08bcba 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -198,14 +198,6 @@ ExecInitSort(Sort *node, EState *estate, int eflags)
* ExecQual or ExecProject.
*/
- /*
- * tuple table initialization
- *
- * sort nodes only return scan tuples from their sorted relation.
- */
- ExecInitResultTupleSlot(estate, &sortstate->ss.ps);
- ExecInitScanTupleSlot(estate, &sortstate->ss);
-
/*
* initialize child nodes
*
@@ -217,11 +209,15 @@ ExecInitSort(Sort *node, EState *estate, int eflags)
outerPlanState(sortstate) = ExecInitNode(outerPlan(node), estate, eflags);
/*
- * initialize tuple type. no need to initialize projection info because
+ * Initialize scan slot and type.
+ */
+ ExecCreateScanSlotFromOuterPlan(estate, &sortstate->ss);
+
+ /*
+ * Initialize return slot and type. No need to initialize projection info because
* this node doesn't do projections.
*/
- ExecAssignResultTypeFromTL(&sortstate->ss.ps);
- ExecAssignScanTypeFromOuterPlan(&sortstate->ss);
+ ExecInitResultTupleSlotTL(estate, &sortstate->ss.ps);
sortstate->ss.ps.ps_ProjInfo = NULL;
SO1_printf("ExecInitSort: %s\n",
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index 499bd5c5b2a..16de639b82d 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -957,8 +957,7 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)
* own innerecontext.
*/
tupDesc = ExecTypeFromTL(lefttlist, false);
- slot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(slot, tupDesc);
+ slot = ExecInitExtraTupleSlot(estate, tupDesc);
sstate->projLeft = ExecBuildProjectionInfo(lefttlist,
NULL,
slot,
@@ -967,8 +966,7 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)
tupDesc = ExecTypeFromTL(righttlist, false);
sstate->descRight = tupDesc;
- slot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(slot, tupDesc);
+ slot = ExecInitExtraTupleSlot(estate, tupDesc);
sstate->projRight = ExecBuildProjectionInfo(righttlist,
sstate->innerecontext,
slot,
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 088c92992ec..0b029fc104a 100644
--- a/src/backend/executor/nodeSubqueryscan.c
+++ b/src/backend/executor/nodeSubqueryscan.c
@@ -120,35 +120,29 @@ ExecInitSubqueryScan(SubqueryScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &subquerystate->ss.ps);
- /*
- * initialize child expressions
- */
- subquerystate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) subquerystate);
-
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &subquerystate->ss.ps);
- ExecInitScanTupleSlot(estate, &subquerystate->ss);
-
/*
* initialize subquery
*/
subquerystate->subplan = ExecInitNode(node->subplan, estate, eflags);
/*
- * Initialize scan tuple type (needed by ExecAssignScanProjectionInfo)
+ * Initialize scan slot and type (needed by ExecAssignScanProjectionInfo)
*/
- ExecAssignScanType(&subquerystate->ss,
- ExecGetResultType(subquerystate->subplan));
+ ExecInitScanTupleSlot(estate, &subquerystate->ss,
+ ExecGetResultType(subquerystate->subplan));
/*
- * Initialize result tuple type and projection info.
+ * Initialize result slot, type and projection.
*/
- ExecAssignResultTypeFromTL(&subquerystate->ss.ps);
+ ExecInitResultTupleSlotTL(estate, &subquerystate->ss.ps);
ExecAssignScanProjectionInfo(&subquerystate->ss);
+ /*
+ * initialize child expressions
+ */
+ subquerystate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) subquerystate);
+
return subquerystate;
}
diff --git a/src/backend/executor/nodeTableFuncscan.c b/src/backend/executor/nodeTableFuncscan.c
index 165fae8c83b..af84ec30b70 100644
--- a/src/backend/executor/nodeTableFuncscan.c
+++ b/src/backend/executor/nodeTableFuncscan.c
@@ -139,18 +139,6 @@ ExecInitTableFuncScan(TableFuncScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
- /*
- * initialize child expressions
- */
- scanstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, &scanstate->ss.ps);
-
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
/*
* initialize source tuple type
*/
@@ -158,15 +146,21 @@ ExecInitTableFuncScan(TableFuncScan *node, EState *estate, int eflags)
tf->coltypes,
tf->coltypmods,
tf->colcollations);
-
- ExecAssignScanType(&scanstate->ss, tupdesc);
+ /* and the corresponding scan slot */
+ ExecInitScanTupleSlot(estate, &scanstate->ss, tupdesc);
/*
- * Initialize result tuple type and projection info.
+ * Initialize result slot, type and projection.
*/
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
ExecAssignScanProjectionInfo(&scanstate->ss);
+ /*
+ * initialize child expressions
+ */
+ scanstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, &scanstate->ss.ps);
+
/* Only XMLTABLE is supported currently */
scanstate->routine = &XmlTableRoutine;
diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c
index 0ee76e7d252..24bf0f7cb69 100644
--- a/src/backend/executor/nodeTidscan.c
+++ b/src/backend/executor/nodeTidscan.c
@@ -530,20 +530,6 @@ ExecInitTidScan(TidScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &tidstate->ss.ps);
- /*
- * initialize child expressions
- */
- tidstate->ss.ps.qual =
- ExecInitQual(node->scan.plan.qual, (PlanState *) tidstate);
-
- TidExprListCreate(tidstate);
-
- /*
- * tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &tidstate->ss.ps);
- ExecInitScanTupleSlot(estate, &tidstate->ss);
-
/*
* mark tid list as not computed yet
*/
@@ -562,14 +548,23 @@ ExecInitTidScan(TidScan *node, EState *estate, int eflags)
/*
* get the scan type from the relation descriptor.
*/
- ExecAssignScanType(&tidstate->ss, RelationGetDescr(currentRelation));
+ ExecInitScanTupleSlot(estate, &tidstate->ss,
+ RelationGetDescr(currentRelation));
/*
- * Initialize result tuple type and projection info.
+ * Initialize result slot, type and projection.
*/
- ExecAssignResultTypeFromTL(&tidstate->ss.ps);
+ ExecInitResultTupleSlotTL(estate, &tidstate->ss.ps);
ExecAssignScanProjectionInfo(&tidstate->ss);
+ /*
+ * initialize child expressions
+ */
+ tidstate->ss.ps.qual =
+ ExecInitQual(node->scan.plan.qual, (PlanState *) tidstate);
+
+ TidExprListCreate(tidstate);
+
/*
* all done.
*/
diff --git a/src/backend/executor/nodeUnique.c b/src/backend/executor/nodeUnique.c
index 7baaf3847f2..54572b3643d 100644
--- a/src/backend/executor/nodeUnique.c
+++ b/src/backend/executor/nodeUnique.c
@@ -136,21 +136,16 @@ ExecInitUnique(Unique *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &uniquestate->ps);
- /*
- * Tuple table initialization
- */
- ExecInitResultTupleSlot(estate, &uniquestate->ps);
-
/*
* then initialize outer plan
*/
outerPlanState(uniquestate) = ExecInitNode(outerPlan(node), estate, eflags);
/*
- * unique nodes do no projections, so initialize projection info for this
- * node appropriately
+ * Initialize result slot and type. Unique nodes do no projections, so
+ * initialize projection info for this node appropriately.
*/
- ExecAssignResultTypeFromTL(&uniquestate->ps);
+ ExecInitResultTupleSlotTL(estate, &uniquestate->ps);
uniquestate->ps.ps_ProjInfo = NULL;
/*
diff --git a/src/backend/executor/nodeValuesscan.c b/src/backend/executor/nodeValuesscan.c
index 47ba9faa78e..a39a4134b3b 100644
--- a/src/backend/executor/nodeValuesscan.c
+++ b/src/backend/executor/nodeValuesscan.c
@@ -248,10 +248,16 @@ ExecInitValuesScan(ValuesScan *node, EState *estate, int eflags)
ExecAssignExprContext(estate, planstate);
/*
- * tuple table initialization
+ * get info about values list
*/
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
+ tupdesc = ExecTypeFromExprList((List *) linitial(node->values_lists));
+ ExecInitScanTupleSlot(estate, &scanstate->ss, tupdesc);
+
+ /*
+ * Initialize result slot, type and projection.
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecAssignScanProjectionInfo(&scanstate->ss);
/*
* initialize child expressions
@@ -259,13 +265,6 @@ ExecInitValuesScan(ValuesScan *node, EState *estate, int eflags)
scanstate->ss.ps.qual =
ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
- /*
- * get info about values list
- */
- tupdesc = ExecTypeFromExprList((List *) linitial(node->values_lists));
-
- ExecAssignScanType(&scanstate->ss, tupdesc);
-
/*
* Other node-specific setup
*/
@@ -281,12 +280,6 @@ ExecInitValuesScan(ValuesScan *node, EState *estate, int eflags)
scanstate->exprlists[i++] = (List *) lfirst(vtl);
}
- /*
- * Initialize result tuple type and projection info.
- */
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
- ExecAssignScanProjectionInfo(&scanstate->ss);
-
return scanstate;
}
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 75de7728a46..c7022cf2ceb 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -1785,6 +1785,7 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
numaggs,
aggno;
ListCell *l;
+ TupleDesc scandesc;
/* check for unsupported flags */
Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
@@ -1824,16 +1825,6 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
"WindowAgg Aggregates",
ALLOCSET_DEFAULT_SIZES);
- /*
- * tuple table initialization
- */
- ExecInitScanTupleSlot(estate, &winstate->ss);
- ExecInitResultTupleSlot(estate, &winstate->ss.ps);
- winstate->first_part_slot = ExecInitExtraTupleSlot(estate);
- winstate->agg_row_slot = ExecInitExtraTupleSlot(estate);
- winstate->temp_slot_1 = ExecInitExtraTupleSlot(estate);
- winstate->temp_slot_2 = ExecInitExtraTupleSlot(estate);
-
/*
* WindowAgg nodes never have quals, since they can only occur at the
* logical top level of a query (ie, after any WHERE or HAVING filters)
@@ -1851,21 +1842,21 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
* initialize source tuple type (which is also the tuple type that we'll
* store in the tuplestore and use in all our working slots).
*/
- ExecAssignScanTypeFromOuterPlan(&winstate->ss);
-
- ExecSetSlotDescriptor(winstate->first_part_slot,
- winstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor);
- ExecSetSlotDescriptor(winstate->agg_row_slot,
- winstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor);
- ExecSetSlotDescriptor(winstate->temp_slot_1,
- winstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor);
- ExecSetSlotDescriptor(winstate->temp_slot_2,
- winstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor);
+ ExecCreateScanSlotFromOuterPlan(estate, &winstate->ss);
+ scandesc = winstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
/*
- * Initialize result tuple type and projection info.
+ * tuple table initialization
*/
- ExecAssignResultTypeFromTL(&winstate->ss.ps);
+ winstate->first_part_slot = ExecInitExtraTupleSlot(estate, scandesc);
+ winstate->agg_row_slot = ExecInitExtraTupleSlot(estate, scandesc);
+ winstate->temp_slot_1 = ExecInitExtraTupleSlot(estate, scandesc);
+ winstate->temp_slot_2 = ExecInitExtraTupleSlot(estate, scandesc);
+
+ /*
+ * Initialize result slot, type and projection.
+ */
+ ExecInitResultTupleSlotTL(estate, &winstate->ss.ps);
ExecAssignProjectionInfo(&winstate->ss.ps, NULL);
/* Set up data for comparing tuples */
diff --git a/src/backend/executor/nodeWorktablescan.c b/src/backend/executor/nodeWorktablescan.c
index d5ffadda3e8..2900087b40b 100644
--- a/src/backend/executor/nodeWorktablescan.c
+++ b/src/backend/executor/nodeWorktablescan.c
@@ -156,6 +156,12 @@ ExecInitWorkTableScan(WorkTableScan *node, EState *estate, int eflags)
*/
ExecAssignExprContext(estate, &scanstate->ss.ps);
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlotTL(estate, &scanstate->ss.ps);
+ ExecInitScanTupleSlot(estate, &scanstate->ss, NULL);
+
/*
* initialize child expressions
*/
@@ -163,15 +169,9 @@ ExecInitWorkTableScan(WorkTableScan *node, EState *estate, int eflags)
ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
/*
- * tuple table initialization
+ * Do not yet initialize projection info, see ExecWorkTableScan() for
+ * details.
*/
- ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
- ExecInitScanTupleSlot(estate, &scanstate->ss);
-
- /*
- * Initialize result tuple type, but not yet projection info.
- */
- ExecAssignResultTypeFromTL(&scanstate->ss.ps);
return scanstate;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index fa5d9bb1201..36a41c386a4 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -208,7 +208,7 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
/* Triggers might need a slot */
if (resultRelInfo->ri_TrigDesc)
- estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate);
+ estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate, NULL);
/* Prepare to catch AFTER triggers. */
AfterTriggerBeginQuery();
@@ -585,8 +585,8 @@ apply_handle_insert(StringInfo s)
/* Initialize the executor state. */
estate = create_estate_for_relation(rel);
- remoteslot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(remoteslot, RelationGetDescr(rel->localrel));
+ remoteslot = ExecInitExtraTupleSlot(estate,
+ RelationGetDescr(rel->localrel));
/* Process and store remote tuple in the slot */
oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
@@ -689,10 +689,10 @@ apply_handle_update(StringInfo s)
/* Initialize the executor state. */
estate = create_estate_for_relation(rel);
- remoteslot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(remoteslot, RelationGetDescr(rel->localrel));
- localslot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(localslot, RelationGetDescr(rel->localrel));
+ remoteslot = ExecInitExtraTupleSlot(estate,
+ RelationGetDescr(rel->localrel));
+ localslot = ExecInitExtraTupleSlot(estate,
+ RelationGetDescr(rel->localrel));
EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
PushActiveSnapshot(GetTransactionSnapshot());
@@ -807,10 +807,10 @@ apply_handle_delete(StringInfo s)
/* Initialize the executor state. */
estate = create_estate_for_relation(rel);
- remoteslot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(remoteslot, RelationGetDescr(rel->localrel));
- localslot = ExecInitExtraTupleSlot(estate);
- ExecSetSlotDescriptor(localslot, RelationGetDescr(rel->localrel));
+ remoteslot = ExecInitExtraTupleSlot(estate,
+ RelationGetDescr(rel->localrel));
+ localslot = ExecInitExtraTupleSlot(estate,
+ RelationGetDescr(rel->localrel));
EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
PushActiveSnapshot(GetTransactionSnapshot());
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d9f4059c6ee..f9971863d19 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -409,9 +409,10 @@ extern void ExecScanReScan(ScanState *node);
/*
* prototypes from functions in execTuples.c
*/
-extern void ExecInitResultTupleSlot(EState *estate, PlanState *planstate);
-extern void ExecInitScanTupleSlot(EState *estate, ScanState *scanstate);
-extern TupleTableSlot *ExecInitExtraTupleSlot(EState *estate);
+extern void ExecInitResultTupleSlotTL(EState *estate, PlanState *planstate);
+extern void ExecInitScanTupleSlot(EState *estate, ScanState *scanstate, TupleDesc tupleDesc);
+extern TupleTableSlot *ExecInitExtraTupleSlot(EState *estate,
+ TupleDesc tupleDesc);
extern TupleTableSlot *ExecInitNullTupleSlot(EState *estate,
TupleDesc tupType);
extern TupleDesc ExecTypeFromTL(List *targetList, bool hasoid);
@@ -480,8 +481,6 @@ extern ExprContext *MakePerTupleExprContext(EState *estate);
} while (0)
extern void ExecAssignExprContext(EState *estate, PlanState *planstate);
-extern void ExecAssignResultType(PlanState *planstate, TupleDesc tupDesc);
-extern void ExecAssignResultTypeFromTL(PlanState *planstate);
extern TupleDesc ExecGetResultType(PlanState *planstate);
extern void ExecAssignProjectionInfo(PlanState *planstate,
TupleDesc inputDesc);
@@ -489,7 +488,7 @@ extern void ExecConditionalAssignProjectionInfo(PlanState *planstate,
TupleDesc inputDesc, Index varno);
extern void ExecFreeExprContext(PlanState *planstate);
extern void ExecAssignScanType(ScanState *scanstate, TupleDesc tupDesc);
-extern void ExecAssignScanTypeFromOuterPlan(ScanState *scanstate);
+extern void ExecCreateScanSlotFromOuterPlan(EState *estate, ScanState *scanstate);
extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index db2a42af5e9..7258ce80bee 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -127,6 +127,7 @@ typedef struct TupleTableSlot
MinimalTuple tts_mintuple; /* minimal tuple, or NULL if none */
HeapTupleData tts_minhdr; /* workspace for minimal-tuple-only case */
long tts_off; /* saved state for slot_deform_tuple */
+ bool tts_fixedTupleDescriptor; /* descriptor can't be changed */
} TupleTableSlot;
#define TTS_HAS_PHYSICAL_TUPLE(slot) \
@@ -139,8 +140,8 @@ typedef struct TupleTableSlot
((slot) == NULL || (slot)->tts_isempty)
/* in executor/execTuples.c */
-extern TupleTableSlot *MakeTupleTableSlot(void);
-extern TupleTableSlot *ExecAllocTableSlot(List **tupleTable);
+extern TupleTableSlot *MakeTupleTableSlot(TupleDesc desc);
+extern TupleTableSlot *ExecAllocTableSlot(List **tupleTable, TupleDesc desc);
extern void ExecResetTupleTable(List *tupleTable, bool shouldFree);
extern TupleTableSlot *MakeSingleTupleTableSlot(TupleDesc tupdesc);
extern void ExecDropSingleTupleTableSlot(TupleTableSlot *slot);
--
2.14.1.536.g6867272d5b.dirty
Hi,
I've spent the last weeks working on my LLVM compilation patchset. In
the course of that I *heavily* revised it. While still a good bit away
from committable, it's IMO definitely not a prototype anymore.
There are too many small changes, so I'm only going to list the major
things. A good bit of that is new. The actual LLVM IR emission itself
hasn't changed that drastically. Since I've not described them in
detail before I'll describe from scratch in a few cases, even if things
haven't fully changed.
== JIT Interface ==
To avoid emitting code in very small increments (which increases
mmap/mremap rw vs exec remapping and compile/optimization time), code
generation doesn't happen for every single expression individually, but
in batches.
The basic object to emit code via is a jit context created with:
extern LLVMJitContext *llvm_create_context(bool optimize);
which, in the case of expressions, is stored on-demand in the EState.
For other use cases that might not be the right location.
To emit LLVM IR (i.e. the portable code that LLVM then optimizes and
generates native code for), one gets a module from that with:
extern LLVMModuleRef llvm_mutable_module(LLVMJitContext *context);
to which "arbitrary" numbers of functions can be added. In case of
expression evaluation, we get the module once for every expression, and
emit one function for the expression itself, and one for every
applicable/referenced deform function.
As explained above, we do not want to emit code immediately from within
ExecInitExpr()/ExecReadyExpr(). To facilitate that, readying a JITed
expression sets its evaluation function to a callback, which fetches the
actual native function on the first actual call. That allows batching
together the generation of all native functions that are defined before
the first expression is evaluated - in a lot of queries that'll be all
of them.
Said callback then calls
extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
which'll emit code for the "in progress" mutable module if necessary,
and then searches all generated functions for the name. The names are
created via
extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
currently "evalexpr" and "deform" with a generation and counter suffix.
Currently expressions which do not have access to an EState, basically
all "parent" less expressions, aren't JIT compiled. That could be
changed, but I so far do not see a huge need.
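A minimal sketch of the deferred, batched code generation described above, as a reading aid only. It does not use LLVM at all: DemoJitContext, DemoExpr, demo_ready_expr() and the other demo_* names are hypothetical stand-ins, and "emitting code" is simulated by swapping a function pointer. The point is that readying an expression only installs a callback, and the first real call emits everything that is pending in one batch.

```c
#include <assert.h>

#define DEMO_MAX_EXPRS 8

typedef struct DemoExpr DemoExpr;

/* Hypothetical stand-in for a JIT context that collects pending work. */
typedef struct DemoJitContext
{
    DemoExpr   *pending[DEMO_MAX_EXPRS];    /* readied, not yet emitted */
    int         npending;
    int         nbatches;                   /* how often code was emitted */
} DemoJitContext;

struct DemoExpr
{
    int         (*evalfunc) (DemoExpr *expr, int arg);  /* what callers invoke */
    DemoJitContext *context;
};

/* Stand-in for the native function that real code generation would produce. */
int
demo_native(DemoExpr *expr, int arg)
{
    (void) expr;
    return arg + 1;
}

/* "Emit" all pending expressions at once; a no-op if nothing is pending. */
void
demo_emit_pending(DemoJitContext *context)
{
    if (context->npending == 0)
        return;
    for (int i = 0; i < context->npending; i++)
        context->pending[i]->evalfunc = demo_native;
    context->npending = 0;
    context->nbatches++;
}

/* Callback installed at "ready" time; runs only on the first actual call. */
int
demo_first_call(DemoExpr *expr, int arg)
{
    demo_emit_pending(expr->context);
    assert(expr->evalfunc != demo_first_call);
    return expr->evalfunc(expr, arg);
}

/* Readying an expression registers it, but does not emit any code yet. */
void
demo_ready_expr(DemoJitContext *context, DemoExpr *expr)
{
    assert(context->npending < DEMO_MAX_EXPRS);
    expr->context = context;
    expr->evalfunc = demo_first_call;
    context->pending[context->npending++] = expr;
}
```

With two expressions readied before the first evaluation, calling either one emits both in a single batch, and later calls go straight to the native function.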
== Error handling ==
There's two aspects to error handling.
Firstly, generated (LLVM IR) and emitted functions (mmap()ed segments)
need to be cleaned up both after a successful query execution and after
an error. I've settled on a fairly boring resowner based mechanism. On
errors all expressions owned by a resowner are released, upon success
expressions are reassigned to the parent / released on commit (unless
executor shutdown has cleaned them up of course).
A second, less pretty and newly developed, aspect of error handling is
OOM handling inside LLVM itself. The above resowner based mechanism
takes care of cleaning up emitted code upon ERROR, but there's also the
chance that LLVM itself runs out of memory. LLVM by default does *not*
use any C++ exceptions. Its allocations are primarily funneled through
the standard "new" handlers, and some direct use of malloc() and
mmap(). For the former a 'new handler' exists (see
http://en.cppreference.com/w/cpp/memory/new/set_new_handler), for the
latter LLVM provides callbacks that get called upon failure
(unfortunately mmap() failures are treated as fatal rather than OOM
errors).
What I've chosen to do, and I'd be interested to get some input about
that, is to have two functions that LLVM using code must use:
extern void llvm_enter_fatal_on_oom(void);
extern void llvm_leave_fatal_on_oom(void);
Before interacting with LLVM code (ie. emitting IR, or using the above
functions) llvm_enter_fatal_on_oom() needs to be called.
When a libstdc++ new or LLVM error occurs, the handlers set up by the
above functions trigger a FATAL error. We have to use FATAL rather than
ERROR, as we *cannot* reliably throw ERROR inside a foreign library
without risking corrupting its internal state.
Users of the above sections do *not* have to use PG_TRY/CATCH blocks,
the handlers instead are reset on toplevel sigsetjmp() level.
Using a relatively small enter/leave protected section of code, rather
than setting up these handlers globally, avoids negative interactions
with extensions that might use C++ like e.g. postgis. As LLVM code
generation should never execute arbitrary code, just setting these
handlers temporarily ought to suffice.
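A rough sketch of the enter/leave protocol described above, under loud assumptions: the library's failure hook is simulated by a plain function pointer (demo_library_oom_hook), the handler only counts instead of raising FATAL, and the nesting counter is an assumption of this sketch rather than something the description states. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_oom_handler) (void);

/* Simulated process-wide failure hook of the foreign library (hypothetical). */
demo_oom_handler demo_library_oom_hook = NULL;

int         demo_fatal_count = 0;   /* how often the handler fired */
int         demo_fatal_depth = 0;   /* nesting depth; an assumption here */

/* Real code would report FATAL and never return; this one just counts. */
void
demo_fatal_on_oom(void)
{
    demo_fatal_count++;
}

/* Install the handler when entering the outermost protected section. */
void
demo_enter_fatal_on_oom(void)
{
    if (demo_fatal_depth++ == 0)
        demo_library_oom_hook = demo_fatal_on_oom;
}

/* Remove the handler again when leaving the outermost protected section. */
void
demo_leave_fatal_on_oom(void)
{
    assert(demo_fatal_depth > 0);
    if (--demo_fatal_depth == 0)
        demo_library_oom_hook = NULL;
}

/* What a toplevel error-recovery path could do instead of PG_TRY/CATCH. */
void
demo_reset_after_error(void)
{
    demo_fatal_depth = 0;
    demo_library_oom_hook = NULL;
}
```

Keeping the handler installed only between enter and leave is what limits interference with other C++ code in the same process.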
== LLVM Interface / patches ==
Unfortunately a bit of required LLVM functionality, particularly around
error handling but also initialization, isn't currently fully exposed
via LLVM's C-API. A bit more *optional* API isn't exposed either.
Instead of requiring a brand-new version of LLVM that has exposed this
functionality I decided it's better to have a small C++ wrapper that can
provide this functionality. Due to that new wrapper significantly older
LLVM versions can now be used (for now I've only runtime tested 5.0 and
master, 4.0 would be possible with a few ifdefs, a bit older probably
doable as well). Given that LLVM is written in C++ itself, an optional
dependency on a C++ compiler for one file doesn't seem to be too bad.
== Inlining ==
One big advantage of JITing expressions is that it can significantly
reduce the overhead of postgres' extensible function/operator mechanism,
by inlining the body of called operators.
This is the part of code that I've worked on most significantly. While I
think JITing is an entirely viable project without committed inlining, I
felt that we definitely need to know how exactly we want to do inlining
before merging other parts. 3 different implementations later, I'm
fairly confident that I have a good concept, even though a few corners
still need to be smoothed.
As a quick background, LLVM works on the basis of a high-level
"abstract" assembly representation (llvm.org/docs/LangRef.html). This
can be generated in memory, stored in binary form (bitcode files ending
in .bc) or text representation (.ll files). The clang compiler always
generates the in-memory representation and the -emit-llvm flag tells it
to write that out to disk, rather than .o files/binaries.
This facility allows us to get the bitcode for all operators
(e.g. int8eq, float8pl etc), without maintaining two copies. The way
I've currently set it up is that, if --with-llvm is passed to configure,
all backend files are also compiled to bitcode files. These bitcode
files get installed into the server's
$pkglibdir/bitcode/postgres/
under their original subfolder, eg.
~/build/postgres/dev-assert/install/lib/bitcode/postgres/utils/adt/float.bc
Using existing LLVM functionality (for parallel LTO compilation),
additionally an index over these is stored to
$pkglibdir/bitcode/postgres.index.bc
When deciding to JIT for the first time, $pkglibdir/bitcode/ is scanned
for all .index.bc files and a *combined* index over all these files is
built in memory. The reason for doing so is that it allows "easy"
access to inlining for extensions - they can install code into
$pkglibdir/bitcode/[extension]/
accompanied by
$pkglibdir/bitcode/[extension].index.bc
just alongside the actual library.
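To make the directory layout above concrete, here is a small sketch that only assembles the two kinds of paths. The helper names and the example pkglibdir value used below are hypothetical; the path shapes are the ones listed above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds "$pkglibdir/bitcode/<name>.index.bc" for postgres or an extension. */
int
demo_bitcode_index_path(char *buf, size_t buflen,
                        const char *pkglibdir, const char *name)
{
    return snprintf(buf, buflen, "%s/bitcode/%s.index.bc", pkglibdir, name);
}

/* Builds "$pkglibdir/bitcode/<name>/<relpath>" for one bitcode file. */
int
demo_bitcode_module_path(char *buf, size_t buflen,
                         const char *pkglibdir, const char *name,
                         const char *relpath)
{
    return snprintf(buf, buflen, "%s/bitcode/%s/%s", pkglibdir, name, relpath);
}
```

For the core server, name would be "postgres" and relpath something like "utils/adt/float.bc"; an extension would use its own name for both the directory and the index file.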
The inlining implementation (I had to write my own, as LLVM's isn't suitable
for a number of reasons) can then use the combined in-memory index to
look up all 'extern' function references, judge their size, and then
open just the file containing its implementation (ie. the above
float.bc). Currently there's a limit of 150 instructions for functions
to be inlined, functions used by inlined functions have a budget of 0.5
* limit, and so on. This gets rid of most operators in queries I
tested, although there's a few that resist inlining due to references to
file-local static variables - but those largely don't seem to be
performance relevant.
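The budget rule above (150 instructions at the top level, half of that for each further level of nesting) can be sketched as follows. The function names are hypothetical, and two details are assumptions of this sketch: halving uses integer division, and a function whose size equals the budget still counts as eligible.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_INLINE_LIMIT 150   /* instruction limit at the top level */

/* Budget at a given nesting depth: the limit, halved once per level. */
int
demo_inline_budget(int depth)
{
    int         budget = DEMO_INLINE_LIMIT;

    while (depth-- > 0)
        budget /= 2;
    return budget;
}

/* A function is a candidate if its instruction count fits the budget. */
bool
demo_inline_eligible(int instcount, int depth)
{
    return instcount <= demo_inline_budget(depth);
}
```

Under these assumptions a 150 instruction function is a candidate when referenced directly, but not when it is only reached two levels down, where the budget has shrunk to 37.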
== Type Synchronization ==
For my current two main avenues of performance optimizations due to
JITing, expression eval and tuple deforming, it's obviously required
that code generation knows about at least a few postgres types (tuple
slots, heap tuples, expr context/state, etc).
Initially I'd provided types to LLVM by emitting them manually like:
{
    LLVMTypeRef members[15];

    members[ 0] = LLVMInt32Type(); /* type */
    members[ 1] = LLVMInt8Type(); /* isempty */
    members[ 2] = LLVMInt8Type(); /* shouldFree */
    members[ 3] = LLVMInt8Type(); /* shouldFreeMin */
    members[ 4] = LLVMInt8Type(); /* slow */
    members[ 5] = LLVMPointerType(StructHeapTupleData, 0); /* tuple */
    members[ 6] = LLVMPointerType(StructtupleDesc, 0); /* tupleDescriptor */
    members[ 7] = TypeMemoryContext; /* mcxt */
    members[ 8] = LLVMInt32Type(); /* buffer */
    members[ 9] = LLVMInt32Type(); /* nvalid */
    members[10] = LLVMPointerType(TypeSizeT, 0); /* values */
    members[11] = LLVMPointerType(LLVMInt8Type(), 0); /* nulls */
    members[12] = LLVMPointerType(StructMinimalTupleData, 0); /* mintuple */
    members[13] = StructHeapTupleData; /* minhdr */
    members[14] = LLVMInt64Type(); /* off */

    StructTupleTableSlot = LLVMStructCreateNamed(LLVMGetGlobalContext(),
                                                 "struct.TupleTableSlot");
    LLVMStructSetBody(StructTupleTableSlot, members, lengthof(members), false);
}
and then using a numeric offset when emitting code like:
LLVMBuildStructGEP(builder, v_slot, 9, "")
to compute the address of the nvalid field of a slot at runtime.
But that obviously duplicates a lot of information and is incredibly
failure prone. Doesn't seem acceptable.
What I've now instead done is have one small file (llvmjit_types.c)
which references each of the types required for JITing. That file is
translated to bitcode at compile time, and loaded when LLVM is
initialized in a backend. That works very well to synchronize the type
definition, unfortunately it does *not* synchronize offsets as the IR
level representation doesn't know field names.
Instead I've added defines to the original struct definition that
provide access to the relevant offsets. Eg.
#define FIELDNO_TUPLETABLESLOT_NVALID 9
int tts_nvalid; /* # of valid values in tts_values */
While that still needs to be defined, it's only required for a
relatively small number of fields, and it's bunched together with the
struct definition, so it's easily kept synchronized.
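A small self-contained sketch of the FIELDNO convention just described. DemoSlot and its fields are hypothetical and far smaller than the real slot struct. The hand-written offset table merely plays the role of what LLVM derives from the loaded type definition: given a field number, it yields that field's address, which is what a struct GEP computes.

```c
#include <assert.h>
#include <stddef.h>

/* Field numbers are kept right next to the fields they describe. */
typedef struct DemoSlot
{
    int         type;
#define FIELDNO_DEMOSLOT_NVALID 1
    int         nvalid;         /* # of valid values */
#define FIELDNO_DEMOSLOT_VALUES 2
    long       *values;
} DemoSlot;

/* Stand-in for the layout knowledge LLVM gets from the type definition. */
const size_t demo_slot_field_offsets[] = {
    offsetof(DemoSlot, type),
    offsetof(DemoSlot, nvalid),
    offsetof(DemoSlot, values),
};

/* Address of field number fieldno, i.e. what a struct GEP would compute. */
void *
demo_field_address(DemoSlot *slot, int fieldno)
{
    return (char *) slot + demo_slot_field_offsets[fieldno];
}
```

If someone inserts a field before nvalid without updating the define, the field number silently points at the wrong member, which is why keeping the define adjacent to the field matters.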
A significant downside for this is that clang needs to be around to
create that bitcode file, but that doesn't seem that bad as an optional
*build*-time, *not* runtime, dependency.
Not a perfect solution, but I don't quite see a better approach.
== Minimal cost based planning & config ==
Currently there's a number of GUCs that influence JITing:
- jit_above_cost = -1, 0-DBL_MAX - all queries with a higher total cost
get JITed, *without* optimization (expensive part), corresponding to
-O0. This commonly already results in significant speedups if
expression/deforming is a bottleneck (removing dynamic branches
mostly).
- jit_optimize_above_cost = -1, 0-DBL_MAX - all queries with a higher total cost
get JITed, *with* optimization (expensive part).
- jit_inline_above_cost = -1, 0-DBL_MAX - inlining is tried if query has
higher cost.
For all of these -1 is a hard disable.
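One possible reading of how the three cost GUCs above combine, written out as a sketch. The struct and function names are hypothetical. It assumes that "higher" means strictly greater than the threshold, and that optimization and inlining are only considered once the query is JITed at all; neither detail is spelled out above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result of the planning-time decision. */
typedef struct DemoJitDecision
{
    bool        jit;            /* JIT at all (unoptimized) */
    bool        optimize;       /* additionally run expensive optimization */
    bool        inline_funcs;   /* additionally try inlining */
} DemoJitDecision;

/* -1 (any negative value here) is a hard disable for that threshold. */
bool
demo_above(double total_cost, double threshold)
{
    return threshold >= 0 && total_cost > threshold;
}

DemoJitDecision
demo_decide_jit(double total_cost,
                double jit_above_cost,
                double jit_optimize_above_cost,
                double jit_inline_above_cost)
{
    DemoJitDecision d = {false, false, false};

    d.jit = demo_above(total_cost, jit_above_cost);
    if (d.jit)
    {
        /* Assumption: these two only matter if we JIT in the first place. */
        d.optimize = demo_above(total_cost, jit_optimize_above_cost);
        d.inline_funcs = demo_above(total_cost, jit_inline_above_cost);
    }
    return d;
}
```

Setting all three thresholds to 0 forces every step on for any query with a positive cost, while -1 for jit_above_cost turns everything off under this reading.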
There currently also exist:
- jit_expressions = 0/1
- jit_deform = 0/1
- jit_perform_inlining = 0/1
but I think they could just be removed in favor of the above.
Additionally there's a few debugging/other GUCs:
- jit_debugging_support = 0/1 - register generated functions with the
debugger. Unfortunately GDB's JIT integration scales O(#functions^2),
albeit with a very small constant, so it cannot always be enabled :(
- jit_profiling_support = 0/1 - emit information so perf gets notified
about JITed functions. As this logs data to disk that is not
automatically cleaned up (otherwise it'd be useless), this definitely
cannot be enabled by default.
- jit_dump_bitcode = 0/1 - log generated pre/post optimization bitcode
to disk. This is quite useful for development, so I'd want to keep it.
- jit_log_ir = 0/1 - dump generated IR to the logfile. I found this to
be too verbose, and I think it should be yanked.
Do people feel these should be hidden behind #ifdefs, always present but
prevented from being set to a meaningful value, or unrestricted?
== Remaining work ==
These I'm planning to tackle in the near future and need to be tackled
before merging.
- Add a big readme
- Add docs
- Add / check LLVM 4.0 support
- reconsider location of JITing code (lib/ and heaptuple.c specifically)
- Split llvmjit_wrap.cpp into three files (error handling, inlining,
temporary LLVM C API extensions)
- Split the bigger commit, improve commit messages
- Significant amounts of local code cleanup and comments
- duplicated code in expression emission for very related step types
- more consistent LLVM variable naming
- pgindent
- timing information about JITing needs to be fewer messages, and hidden
behind a GUC.
- improve logging (mostly remove)
== Future Todo (some already in-progress) ==
- JITed hash computation for nodeAgg & nodeHash. That's currently a
major bottleneck.
- Increase quality of generated code. There's a *lot* left still on the
table. The generated code currently spills far too much into memory,
and LLVM only can optimize that away to a limited degree. I've
experimented some and for TPCH Q01 it's possible to get at least
another x1.8 due to that, with expression eval *still* being the
bottleneck afterwards...
- Caching of the generated code, drastically reducing overhead and
allowing JITing to be beneficial in OLTP cases. Currently the biggest
obstacle to that is the number of specific memory locations referenced
in the expression representation, but that definitely can be improved
(a lot of it by the above point alone).
- More elaborate planning model
- The cloning of modules could be reduced to only cloning required
parts. As that's the most expensive part of inlining and most of the
time only a few functions are used, this should probably be done soon.
== Code ==
As the patchset is large (500kb) and I'm still quickly evolving it, I do
not yet want to attach it. The git tree is at
https://git.postgresql.org/git/users/andresfreund/postgres.git
in the jit branch
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
To build, --with-llvm has to be passed to configure, llvm-config either
needs to be in PATH or provided with LLVM_CONFIG to make. A c++ compiler
and clang need to be available under common names or provided via CXX /
CLANG respectively.
Regards,
Andres Freund
On Wednesday, January 24, 2018 8:20:38 AM CET Andres Freund wrote:
As the patchset is large (500kb) and I'm still quickly evolving it, I do
not yet want to attach it. The git tree is at
https://git.postgresql.org/git/users/andresfreund/postgres.git
in the jit branch
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
to build --with-llvm has to be passed to configure, llvm-config either
needs to be in PATH or provided with LLVM_CONFIG to make. A c++ compiler
and clang need to be available under common names or provided via CXX /
CLANG respectively.
Regards,
Andres Freund
Hi
I tried to build on Debian sid, using GCC 7 and LLVM 5. I used the following
to compile, using your branch @3195c2821d :
$ export LLVM_CONFIG=/usr/bin/llvm-config-5.0
$ ./configure --with-llvm
$ make
And I had the following build error :
llvmjit_wrap.cpp:32:10: fatal error: llvm-c/DebugInfo.h: No such file or
directory
#include "llvm-c/DebugInfo.h"
^~~~~~~~~~~~~~~~~~~~
compilation terminated.
In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
C++ API in llvm/IR/DebugInfo.h.
For 'sport' (I have not played with the LLVM API for more than a year), I
tried to fix it, changing it to the C++ include.
The DebugInfo related one was easy, only one function was used.
But I still could not build because the LLVM API changed between 5.0 and 6.0
regarding value info SummaryList.
llvmjit_wrap.cpp: In function
‘std::unique_ptr<llvm::StringMap<llvm::StringSet<> > >
llvm_build_inline_plan(llvm::Module*)’:
llvmjit_wrap.cpp:285:48: error: ‘class llvm::GlobalValueSummary’ has no member
named ‘getBaseObject’
fs = llvm::cast<llvm::FunctionSummary>(gvs->getBaseObject());
^~~~~~~~~~~~~
That one was a bit uglier.
I'm not sure how to test everything properly, so the patch is attached for
both these issues, do as you wish with it… :)
Regards
Pierre Ducroquet
Attachments:
0001-Allow-building-with-LLVM-5.0.patch (text/x-patch; charset=utf-8)
From fdfea09dd7410d6ed7ad54df1ba3092bd0eecb92 Mon Sep 17 00:00:00 2001
From: Pierre Ducroquet <pinaraf@pinaraf.info>
Date: Wed, 24 Jan 2018 22:28:34 +0100
Subject: [PATCH] Allow building with LLVM 5.0
---
src/backend/lib/llvmjit_wrap.cpp | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/src/backend/lib/llvmjit_wrap.cpp b/src/backend/lib/llvmjit_wrap.cpp
index b745aec4fe..7961148a85 100644
--- a/src/backend/lib/llvmjit_wrap.cpp
+++ b/src/backend/lib/llvmjit_wrap.cpp
@@ -29,7 +29,6 @@ extern "C"
#include "llvm-c/Core.h"
#include "llvm-c/BitReader.h"
-#include "llvm-c/DebugInfo.h"
#include <fcntl.h>
#include <sys/mman.h>
@@ -50,6 +49,7 @@ extern "C"
#include "llvm/Analysis/ModuleSummaryAnalysis.h"
#include "llvm/Bitcode/BitcodeReader.h"
#include "llvm/IR/CallSite.h"
+#include "llvm/IR/DebugInfo.h"
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/ModuleSummaryIndex.h"
#include "llvm/Linker/IRMover.h"
@@ -218,6 +218,13 @@ llvm_inline(LLVMModuleRef M)
llvm_execute_inline_plan(mod, globalsToInline.get());
}
+
+inline llvm::GlobalValueSummary *GlobalValueSummary__getBaseObject(llvm::GlobalValueSummary *gvs) {
+ if (auto *AS = llvm::dyn_cast<llvm::AliasSummary>(gvs))
+ return &AS->getAliasee();
+ return gvs;
+}
+
/*
* Build information necessary for inlining external function references in
* mod.
@@ -282,7 +289,7 @@ llvm_build_inline_plan(llvm::Module *mod)
const llvm::Module *defMod;
llvm::Function *funcDef;
- fs = llvm::cast<llvm::FunctionSummary>(gvs->getBaseObject());
+ fs = llvm::cast<llvm::FunctionSummary>(GlobalValueSummary__getBaseObject(gvs.get()));
elog(DEBUG2, "func %s might be in %s",
funcName.data(),
modPath.data());
@@ -476,7 +483,7 @@ load_module(llvm::StringRef Identifier)
* code. Until that changes, not much point in wasting memory and cycles
* on processing debuginfo.
*/
- LLVMStripModuleDebugInfo(mod);
+ llvm::StripDebugInfo(*llvm::unwrap(mod));
return std::unique_ptr<llvm::Module>(llvm::unwrap(mod));
}
--
2.15.1
Hi,
On 2018-01-24 22:35:08 +0100, Pierre Ducroquet wrote:
I tried to build on Debian sid, using GCC 7 and LLVM 5. I used the following
to compile, using your branch @3195c2821d :
Thanks!
$ export LLVM_CONFIG=/usr/bin/llvm-config-5.0
$ ./configure --with-llvm
$ make
And I had the following build error :
llvmjit_wrap.cpp:32:10: fatal error: llvm-c/DebugInfo.h: No such file or
directory
#include "llvm-c/DebugInfo.h"
^~~~~~~~~~~~~~~~~~~~
compilation terminated.
In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
C++ API in llvm/IR/DebugInfo.h.
Hm, I compiled against 5.0 quite recently, but added the stripping of
debuginfo later on. I'll add a fallback method, thanks for pointing that
out!
But I still could not build because the LLVM API changed between 5.0 and 6.0
regarding value info SummaryList.
Hm, thought these changes were from before my 5.0 test. But the code
evolved heavily, so I might misremember. Let me see.
Thanks, I'll try to push fixes into the tree soon-ish..
I'm not sure how to test everything properly, so the patch is attached for
both these issues, do as you wish with it… :)
What I do for testing is running postgres' tests against a started
server that has all cost based behaviour turned off (which makes no
sense from a runtime optimization perspective, but increases
coverage...).
The flags I pass to the server are:
-c jit_expressions=1 -c jit_tuple_deforming=1 -c jit_perform_inlining=1 -c jit_above_cost=0 -c jit_optimize_above_cost=0
then I run
make -s installcheck-parallel
to see whether things pass. The flags make the tests slow-ish, but
test everything under jit. In particular errors.sql's recursion check
takes a while...
Obviously none of the standard tests are interesting from a performance
perspective...
FWIW, here's a shortened excerpt of the debugging output of a TPCH query:
DEBUG: checking inlinability of ExecAggInitGroup
DEBUG: considering extern function datumCopy at 75 for inlining
DEBUG: inline top function ExecAggInitGroup total_instcount: 24, partial: 21
so the inliner found a reference to ExecAggInitGroup, inlined it, and
scheduled datumCopy, externally referenced from ExecAggInitGroup, to be
checked later.
DEBUG: uneligible to import errstart due to early threshold: 150 vs 37
elog stuff wasn't inlined because errstart has 150 insn, but at this
point the limit was 37 (aka 150 / 2 / 2). Early means this was decided
based on the summary. There's also 'late' checks preventing inlining if
dependencies of the inlined variable (local static functions, constant
static global variables) make it bigger than the summary knows about.
Then we get to execute the importing:
DEBUG: performing import of postgres/utils/fmgr/fmgr.bc pg_detoast_datum, pg_detoast_datum_packed
DEBUG: performing import of postgres/utils/adt/arrayfuncs.bc construct_array
DEBUG: performing import of postgres/utils/error/assert.bc ExceptionalCondition, .str.1, .str
DEBUG: performing import of postgres/utils/adt/expandeddatum.bc EOH_flatten_into, DeleteExpandedObject, .str.1, .str.2, .str.4, EOH_get_flat_size
DEBUG: performing import of postgres/utils/adt/int8.bc __func__.overflowerr, .str, .str.12, int8inc, overflowerr, pg_add_s64_overflow
...
DEBUG: performing import of postgres/utils/adt/date.bc date_le_timestamp, date2timestamp, .str, __func__.date2timestamp, .str.26
And there's a timing summary (debugging build)
DEBUG: time to inline: 0.145s
DEBUG: time to opt: 0.156s
DEBUG: time to emit: 0.078s
Same debugging build:
tpch_10[6930][1]=# set jit_expressions = 1;
tpch_10[6930][1]=# \i ~/tmp/tpch/pg-tpch/queries/q01.sql
...
Time: 28442.870 ms (00:28.443)
tpch_10[6930][1]=# set jit_expressions = 0;
tpch_10[6930][1]=# \i ~/tmp/tpch/pg-tpch/queries/q01.sql
...
Time: 70357.830 ms (01:10.358)
tpch_10[6930][1]=# show max_parallel_workers_per_gather;
┌─────────────────────────────────┐
│ max_parallel_workers_per_gather │
├─────────────────────────────────┤
│ 0 │
└─────────────────────────────────┘
Now admittedly a debugging/assertion enabled build isn't quite a fair
fight, but it's not that much smaller a win without that.
- Andres
Hi,
On 2018-01-24 14:06:30 -0800, Andres Freund wrote:
In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
C++ API in llvm/IR/DebugInfo.h.
Hm, I compiled against 5.0 quite recently, but added the stripping of
debuginfo later on. I'll add a fallback method, thanks for pointing that
out!
Went more with your fix, there's not much point in using the C API
here. Should probably remove the use of it nearly entirely from the .cpp
file (save for wrap/unwrap() use). But man, the 'class Error' usage is
one major ugly pain.
But I still could not build because the LLVM API changed between 5.0 and 6.0
regarding value info SummaryList.
Hm, thought these changes were from before my 5.0 test. But the code
evolved heavily, so I might misremember. Let me see.
Ah, that one was actually easier to fix. There's no need to get the base
object at all, so it's just a one-line change.
Thanks, I'll try to push fixes into the tree soon-ish..
Pushed.
Thanks again for looking!
- Andres
On Wed, Jan 24, 2018 at 1:35 PM, Pierre Ducroquet <p.psql@pinaraf.info> wrote:
In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
C++ API in llvm/IR/DebugInfo.h.
The LLVM APIs don't seem to be very stable; won't there just be a
continuous stream of similar issues?
Pinning major postgresql versions to specific LLVM versions doesn't
seem very appealing. Even if you aren't interested in the latest
changes in LLVM, trying to get the right version on your machine will
be annoying.
Regards,
Jeff Davis
Hi,
On 2018-01-24 22:33:30 -0800, Jeff Davis wrote:
On Wed, Jan 24, 2018 at 1:35 PM, Pierre Ducroquet <p.psql@pinaraf.info> wrote:
In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
C++ API in llvm/IR/DebugInfo.h.
The LLVM APIs don't seem to be very stable; won't there just be a
continuous stream of similar issues?
There'll be some of that yes. But the entire difference between 5 and
what will be 6 was not including one header, and not calling one unneeded
function. That doesn't seem like a crazy amount of adaptation that needs
to be done. From a quick look about porting to 4, it'll be a bit, but
not much more effort.
The reason I'm using the C-API where possible is that it's largely
forward compatible (i.e. new features are added, but things are seldom
removed). The C++ code changes a bit more, but it's not that much code
we're interfacing with either.
I think we'll have to make do with a number of ifdefs - I don't really
see an alternative. Unless you've a better idea?
Greetings,
Andres Freund
On Tue, Jan 23, 2018 at 11:20 PM, Andres Freund <andres@anarazel.de> wrote:
Hi,
I've spent the last weeks working on my LLVM compilation patchset. In
the course of that I *heavily* revised it. While still a good bit away
from committable, it's IMO definitely not a prototype anymore.
Great!
A couple high-level questions:
1. I notice a lot of use of the LLVM builder, for example, in
slot_compile_deform(). Why can't you do the same thing you did with
function code, where you create the ".bc" at build time from plain C
code, and then load it at runtime?
2. I'm glad you considered extensions. How far can we go with this in
the future? Can we have bitcode-only extensions that don't need a .so
file? Can we store the bitcode in pg_proc, simplifying deployment and
allowing extensions to travel over replication? I am not asking for
this now, of course, but I'd like to get the idea out there so we
leave room.
Regards,
Jeff Davis
Hi!
On 2018-01-24 22:51:36 -0800, Jeff Davis wrote:
A couple high-level questions:
1. I notice a lot of use of the LLVM builder, for example, in
slot_compile_deform(). Why can't you do the same thing you did with
function code, where you create the ".bc" at build time from plain C
code, and then load it at runtime?
Not entirely sure what you mean. You mean why I don't inline
slot_getsomeattrs() etc and instead generate code manually? The reason
is that the generated code is a *lot* smarter due to knowing the
specific tupledesc.
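To illustrate why code generated for one specific tuple descriptor can be smarter than a generic routine, here is a deliberately simplified sketch. It is not the patch's deform code: the function names are hypothetical, and it ignores NULLs, alignment padding and variable-length attributes. The generic version has to inspect a runtime descriptor for every attribute, while the specialized one has the layout (int32, int64) baked in as constants, with no loop and no branches.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Generic: walks a runtime array of attribute widths (4 or 8 bytes). */
void
demo_deform_generic(const char *data, const int *attlen, int natts,
                    int64_t *values)
{
    size_t      off = 0;

    for (int i = 0; i < natts; i++)
    {
        if (attlen[i] == 4)
        {
            int32_t     tmp;

            memcpy(&tmp, data + off, sizeof(tmp));
            values[i] = tmp;
        }
        else
        {
            assert(attlen[i] == 8);
            memcpy(&values[i], data + off, sizeof(int64_t));
        }
        off += (size_t) attlen[i];
    }
}

/* Specialized for the layout (int32, int64): offsets are constants. */
void
demo_deform_int4_int8(const char *data, int64_t *values)
{
    int32_t     a;
    int64_t     b;

    memcpy(&a, data, sizeof(a));
    memcpy(&b, data + 4, sizeof(b));
    values[0] = a;
    values[1] = b;
}
```

Both functions produce the same values for a matching buffer; the specialized one is roughly the kind of straight-line code a generator could emit once the descriptor is known.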
2. I'm glad you considered extensions. How far can we go with this in
the future?
Can we have bitcode-only extensions that don't need a .so
file?
Hm. I don't see a big problem introducing this. There'd be some
complexity in how to manage the lifetime of JITed functions generated
that way, but that should be solvable.
Can we store the bitcode in pg_proc, simplifying deployment and
allowing extensions to travel over replication?
Yes, we could. You'd need to be a bit careful that all the machines have
similar-ish cpu generations or compile with defensive settings, but that
seems okay.
Greetings,
Andres Freund
On Thursday, January 25, 2018 7:38:16 AM CET Andres Freund wrote:
Hi,
On 2018-01-24 22:33:30 -0800, Jeff Davis wrote:
On Wed, Jan 24, 2018 at 1:35 PM, Pierre Ducroquet <p.psql@pinaraf.info>
wrote:
In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only
as a C++ API in llvm/IR/DebugInfo.h.
The LLVM APIs don't seem to be very stable; won't there just be a
continuous stream of similar issues?
There'll be some of that yes. But the entire difference between 5 and
what will be 6 was not including one header, and not calling one unneeded
function. That doesn't seem like a crazy amount of adaptation that needs
to be done. From a quick look about porting to 4, it'll be a bit, but
not much more effort.
I don't know when this would be released, but the minimal supported LLVM
version will have a strong influence on the availability of that feature. If
today this JIT compiling was released with only LLVM 5/6 support, it would be
unusable for most Debian users (llvm-5 is only available in sid). Even llvm 4
is not available in latest stable.
I'm already trying to build with llvm-4 and I'm going to try further with llvm
3.9 (Debian Stretch doesn't have a more recent one than this, and I won't have
something better to play with my data), I'll keep you informed. For sport, I
may also try llvm 3.5 (for Debian Jessie).
Pierre
On 24.01.2018 10:20, Andres Freund wrote:
Hi,
I've spent the last weeks working on my LLVM compilation patchset. In
the course of that I *heavily* revised it. While still a good bit away
from committable, it's IMO definitely not a prototype anymore.
There's too many small changes, so I'm only going to list the major
things. A good bit of that is new. The actual LLVM IR emissions itself
hasn't changed that drastically. Since I've not described them in
detail before I'll describe from scratch in a few cases, even if things
haven't fully changed.
== JIT Interface ==
To avoid emitting code in very small increments (increases mmap/mremap
rw vs exec remapping, compile/optimization time), code generation
doesn't happen for every single expression individually, but in batches.
The basic object to emit code via is a jit context created with:
extern LLVMJitContext *llvm_create_context(bool optimize);
which in case of expression is stored on-demand in the EState. For other
usecases that might not be the right location.
To emit LLVM IR (ie. the portable code that LLVM then optimizes and
generates native code for), one gets a module from that with:
extern LLVMModuleRef llvm_mutable_module(LLVMJitContext *context);
to which "arbitrary" numbers of functions can be added. In case of
expression evaluation, we get the module once for every expression, and
emit one function for the expression itself, and one for every
applicable/referenced deform function.
As explained above, we do not want to emit code immediately from within
ExecInitExpr()/ExecReadyExpr(). To facilitate that, readying a JITed
expression sets the function to a callback, which gets the actual native
function on the first actual call. That allows batching together the
generation of all native functions that are defined before the first
expression is evaluated - in a lot of queries that'll be all.
Said callback then calls
extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
which'll emit code for the "in progress" mutable module if necessary,
and then searches all generated functions for the name. The names are
created via
extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
currently "evalexpr" and "deform" with a generation and counter suffix.
Currently expressions which do not have access to an EState, basically
all "parent" less expressions, aren't JIT compiled. That could be
changed, but I so far do not see a huge need.
Hi,
As far as I understand generation of native code is now always done for
all supported expressions and individually by each backend.
I wonder whether it would be useful to put more effort into understanding when
compilation to native code should be done and when interpretation is better.
For example many JIT-able languages like Lua are using traces, i.e. the
query is first interpreted and a trace is generated. If the same trace is
followed more than N times, then native code is generated for it.
In context of DBMS executor it is obvious that only frequently executed
or expensive queries have to be compiled.
So we can use estimated plan cost and number of query executions as
simple criteria for JIT-ing the query.
Maybe compilation of simple queries (with small cost) should be done
only for prepared statements...
Another question is whether it is sensible to redundantly do expensive
work (llvm compilation) in all backends.
This question refers to a shared prepared statement cache. But even
without such a cache, it seems possible to use some signature of the
compiled expression as the library name and allow
these libraries to be shared between backends. So before starting code
generation, ExecReadyCompiledExpr can first build the signature and check if
the corresponding library is already present.
Also it will be easier to control space used by compiled libraries in
this case.
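A sketch of the signature idea above: derive a library name from a textual signature of the expression, so that a backend can check whether a matching library already exists before generating code. Everything here is an illustration only. The textual signature, the "jitexpr_" prefix and the use of a 64-bit FNV-1a hash are choices made for this sketch, not something proposed in the thread, and a real scheme would also have to handle hash collisions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* 64-bit FNV-1a hash over a NUL-terminated string. */
uint64_t
demo_fnv1a64(const char *s)
{
    uint64_t    h = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */

    for (; *s != '\0'; s++)
    {
        h ^= (unsigned char) *s;
        h *= UINT64_C(0x100000001b3);               /* FNV prime */
    }
    return h;
}

/* Writes a name like "jitexpr_<16 hex digits>.so" into buf. */
int
demo_library_name(char *buf, size_t buflen, const char *expr_signature)
{
    return snprintf(buf, buflen, "jitexpr_%016llx.so",
                    (unsigned long long) demo_fnv1a64(expr_signature));
}
```

Two backends that build the same signature arrive at the same name and could look for an existing file before compiling.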
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi,
On 2018-01-25 10:00:14 +0100, Pierre Ducroquet wrote:
I don't know when this would be released,
August-October range.
but the minimal supported LLVM
version will have a strong influence on the availability of that feature. If
today this JIT compiling was released with only LLVM 5/6 support, it would be
unusable for most Debian users (llvm-5 is only available in sid). Even llvm 4
is not available in latest stable.
I'm already trying to build with llvm-4 and I'm going to try further with llvm
3.9 (Debian Stretch doesn't have a more recent than this one, and I won't have
something better to play with my data), I'll keep you informed. For sport, I
may also try llvm 3.5 (for Debian Jessie).
I don't think it's unreasonable to not support super old llvm
versions. This is a complex feature, and will take some time to
mature. Supporting too many LLVM versions at the outset will have some
cost. Versions before 3.8 would require supporting mcjit rather than
orc, and I don't think that'd be worth doing. I think 3.9 might be a
reasonable baseline...
Greetings,
Andres Freund
Hi,
On 2018-01-25 18:40:53 +0300, Konstantin Knizhnik wrote:
As far as I understand generation of native code is now always done for all
supported expressions and individually by each backend.
Mostly, yes. It's not done "always", because there are cost-based checks
on whether to do so or not.
I wonder whether it would be useful to put more effort into understanding when
compilation to native code should be done and when interpretation is better.
For example, many JIT-able languages like Lua use traces, i.e. the query is
first interpreted and a trace is generated. If the same trace is followed
more than N times, then native code is generated for it.
Right. That's where I actually had started out, but my experimentation
showed that that's not that interesting a path to pursue. Emitting code
in much smaller increments (as you'd do so for individual expressions)
has considerable overhead. We also have a planner that allows us
reasonable guesses when to JIT and when not - something not available in
many other languages.
That said, nothing in the infrastructure would prevent you from pursuing
that, it'd just be a wrapper function for the generated exprs that
tracks invocations.
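A minimal sketch of such an invocation-tracking wrapper, assuming a much simplified expression signature (one long in, one long out). The struct, the function names and the stand-in "compile" step are hypothetical and only show the control flow: interpret until a threshold is reached, then switch to the compiled function.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*expr_fn) (long arg);

typedef struct CountingExpr
{
    expr_fn     interpreted;        /* always available */
    expr_fn     (*compile) (void);  /* produces native code on demand */
    expr_fn     compiled;           /* NULL until compilation happened */
    unsigned    calls;              /* invocations seen so far */
    unsigned    threshold;          /* compile once calls reaches this */
} CountingExpr;

/* Evaluate the expression, compiling it after enough invocations. */
static long
counting_expr_eval(CountingExpr *e, long arg)
{
    if (e->compiled != NULL)
        return e->compiled(arg);

    if (++e->calls >= e->threshold)
    {
        e->compiled = e->compile();
        if (e->compiled != NULL)
            return e->compiled(arg);
    }
    return e->interpreted(arg);
}

/* Stand-ins for the interpreter and for the JIT-generated function. */
static long interp_add_one(long x) { return x + 1; }
static long native_add_one(long x) { return x + 1; }
static expr_fn fake_compile(void) { return native_add_one; }
```

With a threshold of 3, the first two calls go through the interpreter and the third triggers compilation.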
Another question is whether it is sensible to redundantly do expensive work
(llvm compilation) in all backends.
Right now we kinda have to, but I really want to get rid of
that. There's some pointers included as constants in the generated
code. I plan to work on getting rid of that requirement, but after
getting the basics in (i.e. realistically not this release). Even after
that I'm personally much more interested in caching the generated code
inside a backend, rather than across backends. Function addresses et
al being different between backends would add some complications, can be
overcome, but I'm doubtful it's immediately worth it.
So before starting code generation, ExecReadyCompiledExpr can first
build signature and check if correspondent library is already present.
Also it will be easier to control space used by compiled libraries in
this case.
Right, I definitely think we want to do that at some point not too far
away in the future. That makes the applicability of JITing much broader.
More advanced forms of this are that you JIT in the background for
frequently executed code (so not to incur latency the first time
somebody executes). And/or that you emit unoptimized code the first
time through, which is quite quick, and run the optimizer after the
query has been executed a number of times.
Greetings,
Andres Freund
Hi,
I've spent the last weeks working on my LLVM compilation patchset. In
the course of that I *heavily* revised it. While still a good bit away
from committable, it's IMO definitely not a prototype anymore.
Below are results on my system for Q1 TPC-H scale 10 (~13Gb database)
Options                    Time
Default                    20075
jit_expressions=on         16105
jit_tuple_deforming=on     14734
jit_perform_inlining=on    13441
Also I noticed that parallel execution disables JIT.
On my computer with 4 cores, the time of Q1 with parallel execution is 6549.
Are there any fundamental problems with combining JIT and parallel execution?
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi,
Thanks for testing things out!
On 2018-01-26 10:44:24 +0300, Konstantin Knizhnik wrote:
Also I noticed that parallel execution disables JIT.
Oh, oops, I broke that recently by moving where the decision about
whether to JIT or not is made. There actually is JITing, but only in the
leader.
Are there any fundamental problems with combining JIT and parallel execution?
No, there's not, I just need to send the flag to JIT down to the
workers. Will look at it tomorrow. If you want to measure / play around
till then you can manually hack the PGJIT_* checks in execExprCompile.c.
With that done, on my laptop, tpch-Q01, scale 10:
SET max_parallel_workers_per_gather=0; SET jit_expressions = 1;
15145.508 ms
SET max_parallel_workers_per_gather=0; SET jit_expressions = 0;
23808.809 ms
SET max_parallel_workers_per_gather=4; SET jit_expressions = 1;
4775.170 ms
SET max_parallel_workers_per_gather=4; SET jit_expressions = 0;
7173.483 ms
(that's with inlining and deforming enabled too)
Greetings,
Andres Freund
On 26.01.2018 11:23, Andres Freund wrote:
Hi,
Thanks for testing things out!
Thank you for this work.
One more question: do you have any idea how to profile JITed code?
There is no LLVMOrcRegisterPerf in LLVM 5, so jit_profiling_support
option does nothing.
And without it perf is not able to unwind stack trace for generated code.
I attached the produced profile; it looks like the "unknown" bar corresponds to
JIT code.
There is NoFramePointerElim option in LLVMMCJITCompilerOptions
structure, but it requires use of ExecutionEngine.
Something like this:
mod = llvm_mutable_module(context);
{
    struct LLVMMCJITCompilerOptions options;
    LLVMExecutionEngineRef jit;
    char *error = NULL;

    /* Start from the defaults, then keep frame pointers so perf can unwind. */
    LLVMInitializeMCJITCompilerOptions(&options, sizeof(options));
    options.NoFramePointerElim = 1;
    /* One engine per module: a second creation call would reuse mod. */
    LLVMCreateMCJITCompilerForModule(&jit, mod, &options,
                                     sizeof(options),
                                     &error);
}
...
But you are compiling code using LLVMOrcAddEagerlyCompiledIR
and I find no way to pass no-omit-frame pointer option here.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Thu, Jan 25, 2018 at 11:20:28AM -0800, Andres Freund wrote:
On 2018-01-25 18:40:53 +0300, Konstantin Knizhnik wrote:
Another question is whether it is sensible to redundantly do
expensive work (llvm compilation) in all backends.
Right now we kinda have to, but I really want to get rid of that.
There's some pointers included as constants in the generated code. I
plan to work on getting rid of that requirement, but after getting
the basics in (i.e. realistically not this release). Even after
that I'm personally much more interested in caching the generated
code inside a backend, rather than across backends. Function
addresses et al being different between backends would add some
complications, can be overcome, but I'm doubtful it's immediately
worth it.
If we go with threading for this part, sharing that state may be
simpler. It seems a lot of work is going into things that threading
does at a much lower developer cost, but that's a different
conversation.
So before starting code generation, ExecReadyCompiledExpr can first
build signature and check if correspondent library is already present.
Also it will be easier to control space used by compiled libraries in
this case.
Right, I definitely think we want to do that at some point not too far
away in the future. That makes the applicability of JITing much broader.
More advanced forms of this are that you JIT in the background for
frequently executed code (so not to incur latency the first time
somebody executes). And/or that you emit unoptimized code the first
time through, which is quite quick, and run the optimizer after the
query has been executed a number of times.
Both sound pretty neat.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Hi,
On 2018-01-26 13:06:27 +0300, Konstantin Knizhnik wrote:
One more question: do you have any idea how to profile JITed code?
Yes ;). It depends a bit on what exactly you want to do. Is it
sufficient to get time associated with the parent caller, or do you need
instruction-level access?
There is no LLVMOrcRegisterPerf in LLVM 5, so jit_profiling_support option
does nothing.
Right, it's a patch I'm trying to get into the next version of
llvm. With that you get access to the shared object and everything.
And without it perf is not able to unwind stack trace for generated
code.
You can work around that by using --call-graph lbr with a sufficiently
new perf. That'll not know function names et al, but at least the parent
will be associated correctly.
But you are compiling code using LLVMOrcAddEagerlyCompiledIR
and I find no way to pass no-omit-frame pointer option here.
It shouldn't be too hard to open code support for it, encapsulated in a
function:
// Set function attribute "no-frame-pointer-elim" based on
// NoFramePointerElim.
for (auto &F : *Mod) {
  auto Attrs = F.getAttributes();
  StringRef Value(options.NoFramePointerElim ? "true" : "false");
  Attrs = Attrs.addAttribute(F.getContext(), AttributeList::FunctionIndex,
                             "no-frame-pointer-elim", Value);
  F.setAttributes(Attrs);
}
that's all that option did for mcjit.
Greetings,
Andres Freund
On Wed, Jan 24, 2018 at 11:02 PM, Andres Freund <andres@anarazel.de> wrote:
Not entirely sure what you mean. You mean why I don't inline
slot_getsomeattrs() etc and instead generate code manually? The reason
is that the generated code is a *lot* smarter due to knowing the
specific tupledesc.
I would like to see if we can get a combination of JIT and LTO to work
together to specialize generic code at runtime.
Let's say you have a function f(int x, int y, int z). You want to be
able to specialize it on y at runtime, so that a loop gets unrolled in
the common case where y is small.
1. At build time, create bitcode for the generic implementation of f().
2. At run time, load the generic bitcode into a module (let's call it
the "generic module")
3. At run time, create a new module (let's call it the "bind module")
that only does the following things:
a. declares a global variable bind_y, and initialize it to the value 3
b. declares a wrapper function f_wrapper(int x, int z), and all the
function does is call f(x, bind_y, z)
4. Link the generic module and the bind module together (let's call
the result the "linked module")
5. Optimize the linked module
After sorting out a few details about symbols and inlining, what will
happen is that the generic f() will be inlined into f_wrapper, and it
will see that bind_y is a constant, and then unroll a "for" loop over
y.
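In plain C terms, the generic function plus "bind module" described in the steps above looks roughly like this. The body of f() is made up for illustration; only the names f, bind_y and f_wrapper come from the description. An ordinary compiler at -O2 will typically inline f() into f_wrapper() and unroll the loop here, which is the effect the linked-and-optimized LLVM module is hoped to reproduce at run time.

```c
#include <assert.h>

/* Generic implementation: the loop trip count depends on y. */
static inline int
f(int x, int y, int z)
{
    int acc = z;

    for (int i = 0; i < y; i++)
        acc += x * (i + 1);
    return acc;
}

/* "Bind module": one constant and a thin wrapper that fixes y. */
static const int bind_y = 3;

static int
f_wrapper(int x, int z)
{
    /* After inlining, bind_y is a known constant for the optimizer. */
    return f(x, bind_y, z);
}
```

Calling f_wrapper(2, 10) gives the same result as f(2, 3, 10); whether the loop is actually unrolled is up to the optimizer.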
I experimented a bit before and it works for basic cases, but I'm not
sure if it's as good as your hand-generated LLVM.
If we can make this work, it would be a big win for
readability/maintainability. The hand-generated LLVM is limited to the
bind module, which is very simple, and doesn't need to be changed when
the implementation of f() changes.
Regards,
Jeff Davis
Hi,
On 2018-01-26 18:26:03 -0800, Jeff Davis wrote:
On Wed, Jan 24, 2018 at 11:02 PM, Andres Freund <andres@anarazel.de> wrote:
Not entirely sure what you mean. You mean why I don't inline
slot_getsomeattrs() etc and instead generate code manually? The reason
is that the generated code is a *lot* smarter due to knowing the
specific tupledesc.
I would like to see if we can get a combination of JIT and LTO to work
together to specialize generic code at runtime.
Well, LTO can't quite work. It relies on being able to mark code in
modules linked together as externally visible - and clearly we can't do
that for a running postgres binary. At least in all incarnations I'm
aware of. But that's why the tree I posted supports inlining of code.
Let's say you have a function f(int x, int y, int z). You want to be
able to specialize it on y at runtime, so that a loop gets unrolled in
the common case where y is small.
1. At build time, create bitcode for the generic implementation of f().
2. At run time, load the generic bitcode into a module (let's call it
the "generic module")
3. At run time, create a new module (let's call it the "bind module")
that only does the following things:
a. declares a global variable bind_y, and initialize it to the value 3
b. declares a wrapper function f_wrapper(int x, int z), and all the
function does is call f(x, bind_y, z)
4. Link the generic module and the bind module together (let's call
the result the "linked module")
5. Optimize the linked module
Afaict that's effectively what I've already implemented. We could export
more input as constants to the generated program, but other than that...
Whenever any extern functions are referenced, and jit_inlining=1, then
the code will see whether the called external code is available as jit
bitcode. Based on a simple instruction based cost limit that function
will get inlined (unless it references file local non-constant static
variables and such).
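The inlining decision described above amounts to a check along these lines. This is a sketch of the rule as stated in this mail, not the patch's actual code; the struct, its fields and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of an external function referenced by JITed code. */
typedef struct FuncSummary
{
    bool    bitcode_available;      /* is the function present as bitcode? */
    int     instruction_count;      /* simple size-based cost */
    bool    refs_mutable_static;    /* uses file-local non-constant statics */
} FuncSummary;

/* Inline only if enabled, bitcode exists, it is safe, and it is cheap enough. */
static bool
should_inline(const FuncSummary *fn, bool inlining_enabled, int cost_limit)
{
    if (!inlining_enabled || !fn->bitcode_available)
        return false;
    if (fn->refs_mutable_static)
        return false;
    return fn->instruction_count <= cost_limit;
}
```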
Now the JITed expressions tree currently makes it hard for LLVM to
recognize some constant input as constant, but what's largely needed for
that to be better is some improvements in where temporary values are
stored (should be in alloca's rather than local memory, so mem2reg can
do its thing). It's a TODO... Right now LLVM will figure out constant
inputs to non-strict functions, but not strict ones, but after fixing
some of what I've mentioned previously it works pretty universally.
Have I misunderstood and there's some significant functional difference?
I experimented a bit before and it works for basic cases, but I'm not
sure if it's as good as your hand-generated LLVM.
For deforming it doesn't even remotely get as good in my experiments.
If we can make this work, it would be a big win for
readability/maintainability. The hand-generated LLVM is limited to the
bind module, which is very simple, and doesn't need to be changed when
the implementation of f() changes.
Right. That's why I think we definitely want that for the large majority
of referenced functionality.
Greetings,
Andres Freund
Hi,
On Fri, Jan 26, 2018 at 6:40 PM, Andres Freund <andres@anarazel.de> wrote:
I would like to see if we can get a combination of JIT and LTO to work
together to specialize generic code at runtime.
Well, LTO can't quite work. It relies on being able to mark code in
modules linked together as externally visible - and clearly we can't do
that for a running postgres binary. At least in all incarnations I'm
aware of. But that's why the tree I posted supports inlining of code.
I meant a more narrow use of LTO: since we are doing linking in step
#4 and optimization in step #5, it's optimizing the code after
linking, which is a kind of LTO (though perhaps I'm misusing the
term?).
The version of LLVM that I tried this against had a linker option
called "InternalizeLinkedSymbols" that would prevent the visibility
problem you mention (assuming I understand you correctly). That option
is no longer there so I will have to figure out how to do it with the
current LLVM API.
Afaict that's effectively what I've already implemented. We could export
more input as constants to the generated program, but other than that...
I brought this up in the context of slot_compile_deform(). In your
patch, you have code like:
+    if (!att->attnotnull)
+    {
...
+        v_nullbyte = LLVMBuildLoad(
+            builder,
+            LLVMBuildGEP(builder, v_bits,
+                         &v_nullbyteno, 1, ""),
+            "attnullbyte");
+
+        v_nullbit = LLVMBuildICmp(
+            builder,
+            LLVMIntEQ,
+            LLVMBuildAnd(builder, v_nullbyte, v_nullbytemask, ""),
+            LLVMConstInt(LLVMInt8Type(), 0, false),
+            "attisnull");
...
So it looks like you are reimplementing the generic code, but with
conditional code gen. If the generic code changes, someone will need
to read, understand, and change this code, too, right?
With my approach, then it would initially do *un*conditional code gen,
and be less efficient and less specialized than the code generated by
your current patch. But then it would link in the constant tupledesc,
and optimize, and the optimizer will realize that they are constants
(hopefully) and then cut out a lot of the dead code and specialize it
to the given tupledesc.
This places a lot of faith in the optimizer and I realize it may not
happen as nicely with real code as it did with my earlier experiments.
Maybe you already tried and you are saying that's a dead end? I'll
give it a shot, though.
Now the JITed expressions tree currently makes it hard for LLVM to
recognize some constant input as constant, but what's largely needed for
that to be better is some improvements in where temporary values are
stored (should be in alloca's rather than local memory, so mem2reg can
do its thing). It's a TODO... Right now LLVM will figure out constant
inputs to non-strict functions, but not strict ones, but after fixing
some of what I've mentioned previously it works pretty universally.
Have I misunderstood and there's some significant functional difference?
I'll try to explain with code, and then we can know for sure ;-)
Sorry for the ambiguity, I'm probably misusing a few terms.
I experimented a bit before and it works for basic cases, but I'm not
sure if it's as good as your hand-generated LLVM.
For deforming it doesn't even remotely get as good in my experiments.
I'd like some more information here -- what didn't work? It didn't
recognize constants? Or did recognize them, but didn't optimize as
well as you did by hand?
Regards,
Jeff Davis
Hi,
On 2018-01-26 22:52:35 -0800, Jeff Davis wrote:
The version of LLVM that I tried this against had a linker option
called "InternalizeLinkedSymbols" that would prevent the visibility
problem you mention (assuming I understand you correctly).
I don't think they're fully solvable - you can't really internalize a
reference to a mutable static variable in another translation
unit. Unless you modify that translation unit, which doesn't work when
postgres is running.
That option is no longer there so I will have to figure out how to do
it with the current LLVM API.
Look at the llvmjit_wrap.c code invoking FunctionImporter - that pretty
much does that. I'll push a cleaned up version of that code sometime
this weekend (it'll then live in llvmjit_inline.cpp).
Afaict that's effectively what I've already implemented. We could export
more input as constants to the generated program, but other than that...
I brought this up in the context of slot_compile_deform(). In your
patch, you have code like:
+    if (!att->attnotnull)
+    {
...
+        v_nullbyte = LLVMBuildLoad(
+            builder,
+            LLVMBuildGEP(builder, v_bits,
+                         &v_nullbyteno, 1, ""),
+            "attnullbyte");
+
+        v_nullbit = LLVMBuildICmp(
+            builder,
+            LLVMIntEQ,
+            LLVMBuildAnd(builder, v_nullbyte, v_nullbytemask, ""),
+            LLVMConstInt(LLVMInt8Type(), 0, false),
+            "attisnull");
...
So it looks like you are reimplementing the generic code, but with
conditional code gen. If the generic code changes, someone will need
to read, understand, and change this code, too, right?
Right. Not that that's code that has changed that much...
With my approach, then it would initially do *un*conditional code gen,
and be less efficient and less specialized than the code generated by
your current patch. But then it would link in the constant tupledesc,
and optimize, and the optimizer will realize that they are constants
(hopefully) and then cut out a lot of the dead code and specialize it
to the given tupledesc.
Right.
This places a lot of faith in the optimizer and I realize it may not
happen as nicely with real code as it did with my earlier experiments.
Maybe you already tried and you are saying that's a dead end? I'll
give it a shot, though.
I did that, yes. There's two major downsides:
a) The code isn't as efficient as the handrolled code. The handrolled
code e.g. can take into account that it doesn't need to access the
NULL bitmap for a NOT NULL column and we don't need to check the
tuple's number of attributes if there's a following NOT NULL
attribute. Those save a good number of cycles.
b) The optimizations to take advantage of the constants and make the
code faster with the constant tupledesc are fairly slow (you pretty
much need at least an -O2 equivalent), whereas the handrolled tuple
deforming is faster than the slot_getsomeattrs with just a single,
pretty cheap, mem2reg pass. We're talking about ~1ms vs 70-100ms in
a lot of cases. The optimizer often will not actually unroll the
loop with many attributes despite that being beneficial.
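To make point a) concrete, here is a toy comparison in plain C. It is heavily simplified and is not PostgreSQL's tuple format: the ToyTuple layout, the fixed three int32 columns and both function names are assumptions made for illustration. The generic routine consults the null bitmap for every column; the specialized one is what a code generator could emit when it knows columns 0 and 2 are NOT NULL, so only column 1 needs the bitmap check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NATTS 3

/* Toy tuple: a set bit means "value present"; data holds non-null values only. */
typedef struct ToyTuple
{
    uint8_t     nullbits[1];
    int32_t     data[NATTS];
} ToyTuple;

/* Generic deforming: loop over all columns and test the bitmap each time. */
static void
deform_generic(const ToyTuple *tup, int32_t *values, bool *isnull)
{
    int off = 0;

    for (int att = 0; att < NATTS; att++)
    {
        if (!(tup->nullbits[att >> 3] & (1 << (att & 7))))
        {
            values[att] = 0;
            isnull[att] = true;
            continue;
        }
        values[att] = tup->data[off++];
        isnull[att] = false;
    }
}

/* Specialized for a descriptor where columns 0 and 2 are NOT NULL. */
static void
deform_specialized(const ToyTuple *tup, int32_t *values, bool *isnull)
{
    int off = 0;

    /* Column 0 is NOT NULL: no bitmap access needed. */
    values[0] = tup->data[off++];
    isnull[0] = false;

    /* Column 1 is nullable: this is the only bitmap check left. */
    if (tup->nullbits[0] & (1 << 1))
    {
        values[1] = tup->data[off++];
        isnull[1] = false;
    }
    else
    {
        values[1] = 0;
        isnull[1] = true;
    }

    /* Column 2 is NOT NULL as well. */
    values[2] = tup->data[off++];
    isnull[2] = false;
}
```

Both routines produce the same output for any tuple that respects the NOT NULL constraints; the specialized one simply has no loop and fewer branches.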
I think in most cases using the approach you advocate makes sense, to
avoid duplication, but tuple deforming is such a major bottleneck that I
think it's clearly worth doing it manually. Being able to use llvm with
just an always-inline and a mem2reg pass makes it so much more widely
applicable than doing the full inlining and optimization work.
I experimented a bit before and it works for basic cases, but I'm not
sure if it's as good as your hand-generated LLVM.
For deforming it doesn't even remotely get as good in my experiments.
I'd like some more information here -- what didn't work? It didn't
recognize constants? Or did recognize them, but didn't optimize as
well as you did by hand?
It didn't optimize as well as I did by hand, without significantly
complicating (and slowing) the originating code. It sometimes
decided not to unroll the loop, and it takes a *lot* longer than the
direct emission of the code.
I'm hoping to work on making more of the executor JITed, and there I do
think it's largely going to be what you're proposing, due to the sheer
mass of code.
Greetings,
Andres Freund
On Sat, Jan 27, 2018 at 1:20 PM, Andres Freund <andres@anarazel.de> wrote:
b) The optimizations to take advantage of the constants and make the
code faster with the constant tupledesc are fairly slow (you pretty
much need at least an -O2 equivalent), whereas the handrolled tuple
deforming is faster than the slot_getsomeattrs with just a single,
pretty cheap, mem2reg pass. We're talking about ~1ms vs 70-100ms in
a lot of cases. The optimizer often will not actually unroll the
loop with many attributes despite that being beneficial.
This seems like the major point. We would have to customize the
optimization passes a lot and/or choose carefully which ones we apply.
I think in most cases using the approach you advocate makes sense, to
avoid duplication, but tuple deforming is such a major bottleneck that I
think it's clearly worth doing it manually. Being able to use llvm with
just an always-inline and a mem2reg pass makes it so much more widely
applicable than doing the full inlining and optimization work.
OK.
On another topic, I'm trying to find a way we could break this patch
into smaller pieces. For instance, if we concentrate on tuple
deforming, maybe it would be committable in time for v11?
I see that you added some optimizations to the existing generic code.
Do those offer a measurable improvement, and if so, can you commit
those first to make the JIT stuff more readable?
Also, I'm sure you considered this, but I'd like to ask if we can try
harder to make the JIT itself happen in an extension. It has some pretty
huge benefits:
* The JIT code is likely to go through a lot of changes, and it
would be nice if it wasn't tied to a yearly release cycle.
* Would mean postgres itself isn't dependent on a huge library like
llvm, which just seems like a good idea from a packaging standpoint.
* May give GCC or something else a chance to compete with its own JIT.
* It may make it easier to get something in v11.
It appears reasonable to make the slot deforming and expression
evaluator parts an extension. execExpr.h only exports a couple new
functions; heaptuple.c has a lot of changes but they seem like they
could be separated (unless I'm missing something).
The biggest problem is that the inlining would be much harder to
separate out, because you are building the .bc files at build time. I
really like the idea of inlining, but it doesn't necessarily need to
be in the first commit.
Regards,
Jeff Davis
Hi,
On 2018-01-27 16:56:17 -0800, Jeff Davis wrote:
On another topic, I'm trying to find a way we could break this patch
into smaller pieces. For instance, if we concentrate on tuple
deforming, maybe it would be committable in time for v11?
Yea, I'd planned and started to do so. I actually hope we can get more
committed than just the tuple deforming code - for one it currently
integrates directly with the expression evaluation code, and my
experiences with trying to do so outside of it have not gone well.
I see that you added some optimizations to the existing generic code.
Do those offer a measurable improvement, and if so, can you commit
those first to make the JIT stuff more readable?
I think basically the later a patch currently is in the series the less
important it is.
I've already committed a lot of preparatory patches (like that aggs now
use the expression engine), and I plan to continue doing so.
Also, I'm sure you considered this, but I'd like to ask if we can try
harder to make the JIT itself happen in an extension. It has some pretty
huge benefits:
I'm very strongly against this. To the point that I'll not pursue JITing
further if that becomes a requirement.
I could be persuaded to put it into a shared library instead of the main
binary itself, but I think developing it outside of core is entirely
infeasible because quite frequently both non-JITed code and JITed code
need adjustments. That'd solve your concern about
* Would mean postgres itself isn't dependent on a huge library like
llvm, which just seems like a good idea from a packaging standpoint.
to some degree.
I think it's a fool's errand to try to keep in sync with core changes on
the expression evaluation and struct definition side of things. There's
planner integration, error handling integration and similar related
things too, all of which require core changes. Therefore I don't think
there's a reasonable chance of success of doing this outside of core
postgres.
It appears reasonable to make the slot deforming and expression
evaluator parts an extension. execExpr.h only exports a couple new
functions; heaptuple.c has a lot of changes but they seem like they
could be separated (unless I'm missing something).
The heaptuple.c stuff could largely be dropped, that was more an effort
to level the playing field a bit to make the comparison fairer. I kinda
wondered about putting the JIT code in a heaptuple_jit.c file instead of
heaptuple.c.
The biggest problem is that the inlining would be much harder to
separate out, because you are building the .bc files at build time. I
really like the idea of inlining, but it doesn't necessarily need to
be in the first commit.
Well, but doing this outside of core would pretty much prohibit doing so
forever, no? Getting the inlining design right has influenced several
other parts of the code. I think it's right that the inlining doesn't
necessarily have to be part of the initial set of commits (and I plan to
separate it out in the next revision), but I do think it has to be
written in a reasonably ready form at the time of commit.
Greetings,
Andres Freund
On Sat, Jan 27, 2018 at 5:15 PM, Andres Freund <andres@anarazel.de> wrote:
Also, I'm sure you considered this, but I'd like to ask if we can try
harder to make the JIT itself happen in an extension. It has some pretty
huge benefits:
I'm very strongly against this. To the point that I'll not pursue JITing
further if that becomes a requirement.
I would like to see this feature succeed and I'm not making any
specific demands.
infeasible because quite frequently both non-JITed code and JITed code
need adjustments. That'd solve your concern about
Can you explain further?
I think it's a fool's errand to try to keep in sync with core changes on
the expression evaluation and struct definition side of things. There's
planner integration, error handling integration and similar related
things too, all of which require core changes. Therefore I don't think
there's a reasonable chance of success of doing this outside of core
postgres.
I wasn't suggesting the entire patch be done outside of core. Core
will certainly need to know about JIT compilation, but I am not
convinced that it needs to know about the details of LLVM. All the
references to the LLVM library itself are contained in a few files, so
you've already got it well organized. What's stopping us from putting
that code into a "jit provider" extension that implements the proper
interfaces?
Well, but doing this outside of core would pretty much prohibit doing so
forever, no?
First of all, building .bc files at build time is much less invasive
than linking to the LLVM library. Any version of clang will produce
bitcode that can be read by any LLVM library or tool later (more or
less).
Second, we could change our minds later. Mark any extension APIs as
experimental, and decide we want to move LLVM into postgres whenever
it is needed.
Third, there's lots of cool stuff we can do here:
* put the source in the catalog
* an extension could have its own catalog and build the source into
bitcode and cache it there
* the source for functions would flow to replicas, etc.
* security-conscious environments might even choose to run some of
the C code in a safe C interpreter rather than machine code
So I really don't see this as permanently closing off our options.
Regards,
Jeff Davis
On Thursday, January 25, 2018 8:12:42 PM CET Andres Freund wrote:
Hi,
On 2018-01-25 10:00:14 +0100, Pierre Ducroquet wrote:
I don't know when this would be released,
August-October range.
but the minimal supported LLVM
version will have a strong influence on the availability of that feature.
If today this JIT compiling was released with only LLVM 5/6 support, it
would be unusable for most Debian users (llvm-5 is only available in
sid). Even llvm 4 is not available in latest stable.
I'm already trying to build with llvm-4 and I'm going to try further with
llvm 3.9 (Debian Stretch doesn't have a more recent than this one, and I
won't have something better to play with my data), I'll keep you
informed. For sport, I may also try llvm 3.5 (for Debian Jessie).
I don't think it's unreasonable to not support super old llvm
versions. This is a complex feature, and will take some time to
mature. Supporting too many LLVM versions at the outset will have some
cost. Versions before 3.8 would require supporting mcjit rather than
orc, and I don't think that'd be worth doing. I think 3.9 might be a
reasonable baseline...
Greetings,
Andres Freund
Hi
I have fixed the build issues with LLVM 3.9 and 4.0. The LLVM documentation is
really lacking when it comes to porting from version x to x+1.
The only really missing part I found is that in 3.9, GlobalValueSummary has no
flag showing if it's not EligibleToImport. I am not sure about the
consequences.
I'm still fixing some runtime issues so I will not bother you with the patch
right now.
BTW, the makefile for src/backend/lib does not remove the llvmjit_types.bc
file when cleaning, and doesn't seem to install in the right folder.
Regards
Pierre
On Thursday, January 25, 2018 8:02:54 AM CET Andres Freund wrote:
Hi!
On 2018-01-24 22:51:36 -0800, Jeff Davis wrote:
Can we store the bitcode in pg_proc, simplifying deployment and
allowing extensions to travel over replication?
Yes, we could. You'd need to be a bit careful that all the machines have
similar-ish cpu generations or compile with defensive settings, but that
seems okay.
Hi
Doing this would 'bind' the database to the LLVM release used. LLVM can, as
far as I know, generate bitcode only for the current version, and will only be
able to read bitcode from previous versions. So you can't have, for instance a
master server with LLVM 5 and a standby server with LLVM 4.
So maybe PostgreSQL would have to expose what LLVM version is currently used?
Or a major PostgreSQL release could accept only one major LLVM release, as was
suggested in another thread?
Pierre
Hi,
On 2018-01-27 22:06:59 -0800, Jeff Davis wrote:
infeasible because quite frequently both non-JITed code and JITed code
need adjustments. That'd solve your concern about
Can you explain further?
There's already a *lot* of integration points in the patchseries. Error
handling needs to happen in parts of code we do not want to make
extensible, the definition of expression steps has to exactly match, the
core code needs to emit the right types for syncing, the core code needs
to define the right FIELDNO accessors, there needs to be planner
integrations. Many of those aren't doable with even remotely the same
effort, both initial and continual, from non-core code....
I think those alone make it bad, but there'll be more. Short-Medium term
expression evaluation needs to evolve further to make JITing cachable:
http://archives.postgresql.org/message-id/20180124203616.3gx4vm45hpoijpw3%40alap3.anarazel.de
which again definitely has to happen in core and will require
corresponding changes on the JIT side at every step. Then we'll need to
introduce something like plancache (or something similar?) support for
JITing to reuse JITed functions.
Then there's also a significant difference in how large the adoption's
going to be, and how all the core code that'd need to be added is
supposed to be testable without the JIT emitting side in core.
I think it's a fool's errand to try to keep in sync with core changes on
the expression evaluation and struct definition side of things. There's
planner integration, error handling integration and similar related
things too, all of which require core changes. Therefore I don't think
there's a reasonable chance of success of doing this outside of core
postgres.
I wasn't suggesting the entire patch be done outside of core. Core
will certainly need to know about JIT compilation, but I am not
convinced that it needs to know about the details of LLVM. All the
references to the LLVM library itself are contained in a few files, so
you've already got it well organized. What's stopping us from putting
that code into a "jit provider" extension that implements the proper
interfaces?
The above hopefully answers that?
What we could do, imo somewhat realistically, is to put most of the
provider into a dynamically loaded shared library that lives in core
(similar to how we build the pgoutput output plugin shared library as
part of core). But that still would end up hard coding things like LLVM
specific error handling etc, which we currently do *NOT* want to be
extensible.
Well, but doing this outside of core would pretty much prohibit doing so
forever, no?
First of all, building .bc files at build time is much less invasive
than linking to the LLVM library.
Could you expand on that, I don't understand why that'd be the case?
Any version of clang will produce bitcode that can be read by any LLVM
library or tool later (more or less).
Well, forward portable, not backward portable.
Second, we could change our minds later. Mark any extension APIs as
experimental, and decide we want to move LLVM into postgres whenever
it is needed.
Third, there's lots of cool stuff we can do here:
* put the source in the catalog
* an extension could have its own catalog and build the source into
bitcode and cache it there
* the source for functions would flow to replicas, etc.
* security-conscious environments might even choose to run some of
the C code in a safe C interpreter rather than machine code
I agree, but what does that have to do with the llvmjit stuff being an
extension or not?
Greetings,
Andres Freund
Hi,
On 2018-01-28 23:02:56 +0100, Pierre Ducroquet wrote:
I have fixed the build issues with LLVM 3.9 and 4.0. The LLVM documentation is
really lacking when it comes to porting from version x to x+1.
The only really missing part I found is that in 3.9, GlobalValueSummary has no
flag showing if it's not EligibleToImport. I am not sure about the
consequences.
I think that'd not be too bad, it'd just lead to some small increase in
overhead as more modules would be loaded.
BTW, the makefile for src/backend/lib does not remove the llvmjit_types.bc
file when cleaning, and doesn't seem to install in the right folder.
Hm, both seem to be right here? Note that the llvmjit_types.bc file
should *not* go into the bitcode/ directory, as it's about syncing types
not inlining. I've added a comment to that effect.
Greetings,
Andres Freund
Hi,
On 2018-01-23 23:20:38 -0800, Andres Freund wrote:
== Code ==
As the patchset is large (500kb) and I'm still quickly evolving it, I do
not yet want to attach it. The git tree is at
https://git.postgresql.org/git/users/andresfreund/postgres.git
in the jit branch
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
I've just pushed an updated and rebased version of the tree:
- Split the large "jit infrastructure" commits into a number of smaller
commits
- Split the C++ file
- Dropped some of the performance stuff done to heaptuple.c - that was
mostly to make performance comparisons a bit more interesting, but
doesn't seem important enough to deal with.
- Added a commit renaming datetime.h symbols so they don't conflict with
LLVM variables anymore, removing ugly #undef PM/#define PM dance
around includes. Will post separately.
- Reduced the number of pointer constants in the generated LLVM IR, by
doing more getelementptr accesses (stem from before the time types
were automatically synced)
- Increased number of comments a bit
There's a jit-before-rebase-2018-01-29 tag, for the state of the tree
before the rebase.
Regards,
Andres
On Monday, January 29, 2018 10:46:13 AM CET Andres Freund wrote:
Hi,
On 2018-01-28 23:02:56 +0100, Pierre Ducroquet wrote:
I have fixed the build issues with LLVM 3.9 and 4.0. The LLVM
documentation is really lacking when it comes to porting from version x
to x+1.
The only really missing part I found is that in 3.9, GlobalValueSummary
has no flag showing if it's not EligibleToImport. I am not sure about the
consequences.
I think that'd not be too bad, it'd just lead to some small increase in
overhead as more modules would be loaded.
BTW, the makefile for src/backend/lib does not remove the llvmjit_types.bc
file when cleaning, and doesn't seem to install in the right folder.
Hm, both seem to be right here? Note that the llvmjit_types.bc file
should *not* go into the bitcode/ directory, as it's about syncing types
not inlining. I've added a comment to that effect.
The file was installed in lib/ while the code expected it in lib/postgresql.
So there was something wrong here.
And deleting the file when cleaning is needed if a different LLVM version is
selected at configure time. The file must be generated with a clang release
that is not more recent than the LLVM version linked into PostgreSQL. Otherwise,
LLVM does not accept the generated bitcode.
Regards
Pierre
On 26.01.2018 22:38, Andres Freund wrote:
And without it perf is not able to unwind stack trace for generated
code.
You can work around that by using --call-graph lbr with a sufficiently
new perf. That'll not know function names et al, but at least the parent
will be associated correctly.
With --call-graph lbr result is ... slightly different (see attached
profile) but still there is "unknown" bar.
But you are compiling code using LLVMOrcAddEagerlyCompiledIR
and I find no way to pass no-omit-frame pointer option here.
It shouldn't be too hard to open code support for it, encapsulated in a
function:
// Set function attribute "no-frame-pointer-elim" based on
// NoFramePointerElim.
for (auto &F : *Mod) {
    auto Attrs = F.getAttributes();
    StringRef Value(options.NoFramePointerElim ? "true" : "false");
    Attrs = Attrs.addAttribute(F.getContext(), AttributeList::FunctionIndex,
                               "no-frame-pointer-elim", Value);
    F.setAttributes(Attrs);
}
that's all that option did for mcjit.
I have implemented the following function:
void
llvm_no_frame_pointer_elimination(LLVMModuleRef mod)
{
    llvm::Module *module = llvm::unwrap(mod);
    for (auto &F : *module) {
        auto Attrs = F.getAttributes();
        Attrs = Attrs.addAttribute(F.getContext(),
                                   llvm::AttributeList::FunctionIndex,
                                   "no-frame-pointer-elim", "true");
        F.setAttributes(Attrs);
    }
}
and call it before LLVMOrcAddEagerlyCompiledIR in llvm_compile_module:
llvm_no_frame_pointer_elimination(context->module);
smod = LLVMOrcMakeSharedModule(context->module);
if (LLVMOrcAddEagerlyCompiledIR(compile_orc, &orc_handle, smod,
                                llvm_resolve_symbol, NULL))
{
    elog(ERROR, "failed to jit module");
}
... but it has no effect: the produced profile is the same (with
--call-graph dwarf).
Maybe you can point out my mistake...
Actually, I am trying to find out why your version of JIT provides
a ~2x speedup on Q1, while the ISPRAS version
(https://www.pgcon.org/2017/schedule/attachments/467_PGCon%202017-05-26%2015-00%20ISPRAS%20Dynamic%20Compilation%20of%20SQL%20Queries%20in%20PostgreSQL%20Using%20LLVM%20JIT.pdf)
reports a 5.5x speedup on Q1.
Maybe it is because they are using the double type to calculate aggregates,
while as far as I understand you are using standard Postgres aggregate
functions?
Or maybe because the ISPRAS version is not checking for NULL values...
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi,
On 2018-01-29 15:45:56 +0300, Konstantin Knizhnik wrote:
On 26.01.2018 22:38, Andres Freund wrote:
And without it perf is not able to unwind stack trace for generated
code.
You can work around that by using --call-graph lbr with a sufficiently
new perf. That'll not know function names et al, but at least the parent
will be associated correctly.
With --call-graph lbr result is ... slightly different (see attached
profile) but still there is "unknown" bar.
Right. All that allows is to attribute the cost below the parent in the
perf report --children case. For it to be attributed to proper symbols
you need my llvm patch to support perf.
Actually, I am trying to find out why your version of JIT provides a ~2x
speedup on Q1, while the ISPRAS version (https://www.pgcon.org/2017/schedule/attachments/467_PGCon%202017-05-26%2015-00%20ISPRAS%20Dynamic%20Compilation%20of%20SQL%20Queries%20in%20PostgreSQL%20Using%20LLVM%20JIT.pdf)
reports a 5.5x speedup on Q1.
Maybe it is because they are using the double type to calculate aggregates,
while as far as I understand you are using standard Postgres aggregate
functions?
Or maybe because the ISPRAS version is not checking for NULL values...
All of those together, yes. Add to that that I'm aiming to work
incrementally towards inclusion in core, rather than getting the best
results. There's a *lot* that can be done to improve the generated code
- after e.g. hacking together an improvement to the argument passing (by
allocating isnull / nargs / arg[], argnull[] separately on the stack rather
than as part of FunctionCallInfoData), I get another 1.8x. Eliminating
redundant float overflow checks gives another 1.2x. And so on.
Greetings,
Andres Freund
On Mon, Jan 29, 2018 at 1:36 AM, Andres Freund <andres@anarazel.de> wrote:
There's already a *lot* of integration points in the patchseries. Error
handling needs to happen in parts of code we do not want to make
extensible, the definition of expression steps has to exactly match, the
core code needs to emit the right types for syncing, the core code needs
to define the right FIELDNO accessors, there needs to be planner
integrations. Many of those aren't doable with even remotely the same
effort, both initial and continual, from non-core code....
OK. How about this: are you open to changes that move us in the
direction of extensibility later? (By this I do *not* mean imposing a
bunch of requirements on you... either small changes to your patches
or something part of another commit.) Or are you determined that this
always should be a part of core?
I don't want to stand in your way, but I am also hesitant to dive head
first into LLVM and not look back. Postgres has always been lean, fast
building, and with few dependencies. Who knows what LLVM will do in
the future and how that will affect postgres? Especially when, on day
one, we already know that it causes a few annoyances?
In other words, are you "strongly against [extensibility being a
requirement for the first commit]" or "strongly against [extensible
JIT]"?
Well, but doing this outside of core would pretty much prohibit doing so
forever, no?
First of all, building .bc files at build time is much less invasive
than linking to the LLVM library.
Could you expand on that, I don't understand why that'd be the case?
Building the .bc files at build time depends on LLVM, but is not very
version-dependent and has no impact on the resulting binary. That's
less invasive than a dependency on a library with an unstable API that
doesn't entirely work with our error reporting facility.
Third, there's lots of cool stuff we can do here:
* put the source in the catalog
* an extension could have its own catalog and build the source into
bitcode and cache it there
* the source for functions would flow to replicas, etc.
* security-conscious environments might even choose to run some of
the C code in a safe C interpreter rather than machine code
I agree, but what does that have to do with the llvmjit stuff being an
extension or not?
If the source for functions is in the catalog, we could build the
bitcode at runtime and still do the inlining. We wouldn't need to do
anything at build time. (Again, this would be "cool stuff for the
future", I am not asking you for it now.)
Regards,
Jeff Davis
Hi,
On 2018-01-29 10:28:18 -0800, Jeff Davis wrote:
OK. How about this: are you open to changes that move us in the
direction of extensibility later? (By this I do *not* mean imposing a
bunch of requirements on you... either small changes to your patches
or something part of another commit.)
I'm good with that.
Or are you determined that this always should be a part of core?
I do think JIT compilation should be in core, yes. And after quite some
looking around that currently means either using LLVM or building our
own from scratch, and the latter doesn't seem attractive. But that
doesn't mean there can't *also* be extensibility. If somebody wants to
experiment with a more advanced version of JIT compilation, develop a
gcc backed version (which can't be in core due to licensing), ... - I'm
happy to provide hooks that only require a reasonable effort and don't
affect the overall stability of the system (i.e. no callback from
PostgresMain()'s sigsetjmp() block).
I don't want to stand in your way, but I am also hesitant to dive head
first into LLVM and not look back. Postgres has always been lean, fast
building, and with few dependencies.
It's an optional dependency, and it doesn't increase build time that
much... If we were to move the llvm interfacing code to a .so, there'd
not even be a packaging issue, you can just package that .so separately
and get errors if somebody tries to enable LLVM without that .so being
installed.
In other words, are you "strongly against [extensibility being a
requirement for the first commit]" or "strongly against [extensible
JIT]"?
I'm strongly against there not being an in-core JIT. I'm not at all
against adding APIs that allow to do different JIT implementations out
of core.
If the source for functions is in the catalog, we could build the
bitcode at runtime and still do the inlining. We wouldn't need to do
anything at build time. (Again, this would be "cool stuff for the
future", I am not asking you for it now.)
Well, the source would require an actual compiler around. And the
inlining *just* for the function code itself isn't actually that
interesting, you e.g. want to also be able to
Greetings,
Andres Freund
On 01/24/2018 08:20 AM, Andres Freund wrote:
Hi,
I've spent the last weeks working on my LLVM compilation patchset. In
the course of that I *heavily* revised it. While still a good bit away
from committable, it's IMO definitely not a prototype anymore.
There's too many small changes, so I'm only going to list the major
things. A good bit of that is new. The actual LLVM IR emissions itself
hasn't changed that drastically. Since I've not described them in
detail before I'll describe from scratch in a few cases, even if things
haven't fully changed.
Hi, I wanted to look at this, but my attempts to build the jit branch
fail with some compile-time warnings (uninitialized variables) and
errors (unknown types, incorrect number of arguments). See the file
attached.
I wonder if I'm doing something wrong, or if there's something wrong
with my environment. I do have this:
$ clang -v
clang version 5.0.0 (trunk 299717)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
Selected GCC installation: /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
Hi, I wanted to look at this, but my attempts to build the jit branch
fail with some compile-time warnings (uninitialized variables) and
errors (unknown types, incorrect number of arguments). See the file
attached.
Which git hash are you building? What llvm version is this building
against? If you didn't specify LLVM_CONFIG=... what does llvm-config
--version return?
Greetings,
Andres Freund
On 01/29/2018 10:57 PM, Andres Freund wrote:
Hi,
On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
Hi, I wanted to look at this, but my attempts to build the jit branch
fail with some compile-time warnings (uninitialized variables) and
errors (unknown types, incorrect number of arguments). See the file
attached.
Which git hash are you building? What llvm version is this building
against? If you didn't specify LLVM_CONFIG=... what does llvm-config
--version return?
I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current
HEAD in the jit branch, AFAICS).
I'm building like this:
$ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \
--with-llvm --prefix=/home/postgres/pg-llvm
$ make -s -j4 install
and llvm-config --version says this:
$ llvm-config --version
5.0.0svn
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote:
On 01/29/2018 10:57 PM, Andres Freund wrote:
Hi,
On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
Hi, I wanted to look at this, but my attempts to build the jit branch
fail with some compile-time warnings (uninitialized variables) and
errors (unknown types, incorrect number of arguments). See the file
attached.
Which git hash are you building? What llvm version is this building
against? If you didn't specify LLVM_CONFIG=... what does llvm-config
--version return?
I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current
HEAD in the jit branch, AFAICS).
The warnings come from an incomplete patch I probably shouldn't have
pushed (Heavily-WIP: JIT hashing.). They should largely be irrelevant
(although will cause a handful of "ERROR: hm" regression failures),
but I'll definitely pop that commit on the next rebase. If you want you
can just reset --hard to its parent.
Those errors are weird, however:
llvmjit.c: In function ‘llvm_get_function’:
llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from incompatible pointer type [-Wincompatible-pointer-types]
if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, &addr, mangled))
^
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
^~~~~~~~~~~~~~~~~~~~~~~
llvmjit.c:239:6: error: too many arguments to function ‘LLVMOrcGetSymbolAddress’
if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, &addr, mangled))
^~~~~~~~~~~~~~~~~~~~~~~
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: declared here
LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
^~~~~~~~~~~~~~~~~~~~~~~
llvmjit.c:243:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from incompatible pointer type [-Wincompatible-pointer-types]
if (LLVMOrcGetSymbolAddress(llvm_opt3_orc, &addr, mangled))
^
I'm building like this:
$ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \
--with-llvm --prefix=/home/postgres/pg-llvm
$ make -s -j4 install
and llvm-config --version says this:
$ llvm-config --version
5.0.0svn
Is that llvm-config the one in /usr/local/include/ referenced by the
error message above? Or is it possible that llvm-config is from a
different version than the one the compiler picks the headers up from?
Could you go to src/backend/lib, rm llvmjit.o, and show the full output
of make llvmjit.o?
I wonder whether the issue is that my configure patch does
-I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
rather than
-I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
and that it thus picks up the wrong header first?
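Why the ordering could matter can be modelled in a few lines of C. This is a simplified sketch, assuming the compiler searches the -I directories from left to right and stops at the first one that contains the header; the struct, the function, and the directory names used with it are hypothetical and only stand in for the real search.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one -I directory on the compiler command line. */
struct include_dir
{
    const char *path;        /* directory passed via -I */
    int         has_header;  /* nonzero if the wanted header exists there */
};

/*
 * Return the path of the first directory that holds the header, mimicking
 * a left-to-right search over -I options. Returns NULL if none has it.
 */
static const char *
resolve_header(const struct include_dir *dirs, size_t ndirs)
{
    assert(dirs != NULL || ndirs == 0);
    for (size_t i = 0; i < ndirs; i++)
    {
        if (dirs[i].has_header)
            return dirs[i].path;
    }
    return NULL;
}
```

If both an older and a newer LLVM ship llvm-c/OrcBindings.h, then appending the llvm-config directory leaves the pre-existing directory in front and its header wins, whereas prepending puts the llvm-config directory first.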
Greetings,
Andres Freund
On 01/29/2018 11:17 PM, Andres Freund wrote:
On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote:
On 01/29/2018 10:57 PM, Andres Freund wrote:
Hi,
On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
Hi, I wanted to look at this, but my attempts to build the jit branch
fail with some compile-time warnings (uninitialized variables) and
errors (unknown types, incorrect number of arguments). See the file
attached.
Which git hash are you building? What llvm version is this building
against? If you didn't specify LLVM_CONFIG=... what does llvm-config
--version return?
I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current
HEAD in the jit branch, AFAICS).
The warnings come from an incomplete patch I probably shouldn't have
pushed (Heavily-WIP: JIT hashing.). They should largely be irrelevant
(although will cause a handful of "ERROR: hm" regression failures),
but I'll definitely pop that commit on the next rebase. If you want you
can just reset --hard to its parent.
OK
Those errors are weird, however:
... ^
I'm building like this:
$ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \
--with-llvm --prefix=/home/postgres/pg-llvm
$ make -s -j4 install
and llvm-config --version says this:
$ llvm-config --version
5.0.0svn
Is that llvm-config the one in /usr/local/include/ referenced by the
error message above?
I don't see it referenced anywhere, but it comes from here:
$ which llvm-config
/usr/local/bin/llvm-config
Or is it possible that llvm-config is from a different version than
the one the compiler picks the headers up from?
I don't think so. I don't have any other llvm versions installed, AFAICS.
Could you go to src/backend/lib, rm llvmjit.o, and show the full output
of make llvmjit.o?
Attached.
I wonder whether the issue is that my configure patch does
-I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
rather than
-I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
and that it thus picks up the wrong header first?
I've tried this configure tweak:
 if test -n "$LLVM_CONFIG"; then
   for pgac_option in `$LLVM_CONFIG --cflags`; do
     case $pgac_option in
-      -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
+      -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
     esac
   done
and that indeed changes the failure to this:
Writing postgres.bki
Writing schemapg.h
Writing postgres.description
Writing postgres.shdescription
llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not
a member of ‘llvm’
llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
^~~~
llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a
member of ‘llvm’
llvm::remove_bad_alloc_error_handler();
^~~~
llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a
member of ‘llvm’
llvm::remove_bad_alloc_error_handler();
^~~~
make[3]: *** [<builtin>: llvmjit_error.o] Error 1
make[2]: *** [common.mk:45: lib-recursive] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile:38: all-backend-recurse] Error 2
make: *** [GNUmakefile:11: all-src-recurse] Error 2
I'm not sure what that means, though ... maybe my system really is
broken in some strange way.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
On 2018-01-29 23:49:14 +0100, Tomas Vondra wrote:
On 01/29/2018 11:17 PM, Andres Freund wrote:
On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote:
$ llvm-config --version
5.0.0svn
Is that llvm-config the one in /usr/local/include/ referenced by the
error message above?
I don't see it referenced anywhere, but it comes from here:
$ which llvm-config
/usr/local/bin/llvm-config
Or is it possible that llvm-config is from a different version than
the one the compiler picks the headers up from?
I don't think so. I don't have any other llvm versions installed, AFAICS.
Hm.
Could you go to src/backend/lib, rm llvmjit.o, and show the full output
of make llvmjit.o?
Attached.
I wonder whether the issue is that my configure patch does
-I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
rather than
-I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
and that it thus picks up the wrong header first?
I've tried this configure tweak:
 if test -n "$LLVM_CONFIG"; then
   for pgac_option in `$LLVM_CONFIG --cflags`; do
     case $pgac_option in
-      -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
+      -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
     esac
   done
and that indeed changes the failure to this:
Err, huh? I don't understand how that can change anything if you
actually have only one version of LLVM installed. Perhaps the
effect was just an ordering-related artifact of [parallel] make?
I.e. just a question of what failed first?
Writing postgres.bki
Writing schemapg.h
Writing postgres.description
Writing postgres.shdescription
llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not
a member of ‘llvm’
llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
^~~~
llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a
member of ‘llvm’
llvm::remove_bad_alloc_error_handler();
^~~~
llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a
member of ‘llvm’
llvm::remove_bad_alloc_error_handler();
^~~~
It's a bit hard to interpret this without the actual compiler
invocation. But I've just checked both manually by inspecting 5.0 source
and by compiling against 5.0 that that function definition definitely
exists:
andres@alap4:~/src/llvm-5$ git branch
master
* release_50
andres@alap4:~/src/llvm-5$ ack remove_bad_alloc_error_handler
lib/Support/ErrorHandling.cpp
139:void llvm::remove_bad_alloc_error_handler() {
include/llvm/Support/ErrorHandling.h
101:void remove_bad_alloc_error_handler();
So does my system llvm 5:
$ ack remove_bad_alloc_error_handler /usr/include/llvm-5.0/
/usr/include/llvm-5.0/llvm/Support/ErrorHandling.h
101:void remove_bad_alloc_error_handler();
But not in 4.0:
$ ack remove_bad_alloc_error_handler /usr/include/llvm-4.0/
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -fno-omit-frame-pointer -O2 -I../../../src/include -D_GNU_SOURCE -I/usr/local/include -DNDEBUG -DLLVM_BUILD_GLOBAL_ISEL -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -c -o llvmjit.o llvmjit.c
llvmjit.c: In function ‘llvm_get_function’:
llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from incompatible pointer type [-Wincompatible-pointer-types]
if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, &addr, mangled))
^
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
^~~~~~~~~~~~~~~~~~~~~~~
To me this looks like those headers are from llvm 4, rather than 5:
$ grep -A2 -B3 LLVMOrcGetSymbolAddress ~/src/llvm-4/include/llvm-c/OrcBindings.h
/**
* Get symbol address from JIT instance.
*/
LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
const char *SymbolName);
$ grep -A3 -B3 LLVMOrcGetSymbolAddress ~/src/llvm-5/include/llvm-c/OrcBindings.h
/**
* Get symbol address from JIT instance.
*/
LLVMOrcErrorCode LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
LLVMOrcTargetAddress *RetAddr,
const char *SymbolName);
So it does appear that your llvm-config and the actually installed llvm
don't quite agree. How did you install llvm?
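The signature difference shown in the two header excerpts can be illustrated with a version-guarded wrapper. This is a sketch only: the `mock_*` functions, `MOCK_LLVM_VERSION_MAJOR`, `lookup_symbol` and the symbol name are hypothetical stand-ins so the example compiles without LLVM, and the JIT stack argument is left out for brevity. Only the two calling conventions (address returned directly in 4.0, error code plus out parameter in 5.0) are taken from the headers quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t target_address;    /* stands in for LLVMOrcTargetAddress */

/* Mock of the 4.0-style call: the address is the return value, 0 if unknown. */
static target_address
mock_get_symbol_address_v4(const char *symbol_name)
{
    return strcmp(symbol_name, "evalexpr_0") == 0 ? 0x1000 : 0;
}

/* Mock of the 5.0-style call: returns an error code (0 means success in
 * this mock) and hands the address back through an out parameter. */
static int
mock_get_symbol_address_v5(target_address *ret_addr, const char *symbol_name)
{
    *ret_addr = mock_get_symbol_address_v4(symbol_name);
    return 0;
}

#ifndef MOCK_LLVM_VERSION_MAJOR
#define MOCK_LLVM_VERSION_MAJOR 5   /* assumed default for this sketch */
#endif

/* Hypothetical wrapper hiding the signature change behind one interface. */
static target_address
lookup_symbol(const char *symbol_name)
{
    assert(symbol_name != NULL);
#if MOCK_LLVM_VERSION_MAJOR >= 5
    target_address addr = 0;

    if (mock_get_symbol_address_v5(&addr, symbol_name) != 0)
        return 0;
    return addr;
#else
    return mock_get_symbol_address_v4(symbol_name);
#endif
}
```

Compiling a caller written for the three-argument form against headers that declare the two-argument form would produce the "too many arguments" error seen in the log, which is why a header and llvm-config mismatch is a plausible explanation.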
Greetings,
Andres Freund
On 01/29/2018 11:49 PM, Tomas Vondra wrote:
...
and that indeed changes the failure to this:
Writing postgres.bki
Writing schemapg.h
Writing postgres.description
Writing postgres.shdescription
llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not
a member of ‘llvm’
llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
^~~~
llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a
member of ‘llvm’
llvm::remove_bad_alloc_error_handler();
^~~~
llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a
member of ‘llvm’
llvm::remove_bad_alloc_error_handler();
^~~~
make[3]: *** [<builtin>: llvmjit_error.o] Error 1
make[2]: *** [common.mk:45: lib-recursive] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile:38: all-backend-recurse] Error 2
make: *** [GNUmakefile:11: all-src-recurse] Error 2
I'm not sure what that means, though ... maybe my system really is
broken in some strange way.
FWIW I've installed llvm 5.0.1 from distribution package, and now
everything builds fine (I don't even need the configure tweak).
I think I had to build the other binaries because there was no 5.x llvm
back then, but it's too far back so I don't remember.
Anyway, seems I'm fine for now. Sorry for the noise.
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
On 2018-01-30 00:16:46 +0100, Tomas Vondra wrote:
FWIW I've installed llvm 5.0.1 from distribution package, and now
everything builds fine (I don't even need the configure tweak).
I think I had to build the other binaries because there was no 5.x llvm
back then, but it's too far back so I don't remember.
Anyway, seems I'm fine for now.
Phew, I'm relieved. I'd guess you built a 5.0 version while 5.0 was
still in development, so not all 5.0 functionality was available. Hence
the inconsistent-looking result. While I think we can support 4.0
without too much problem, there's obviously no point in trying to
support old between-release versions...
Sorry for the noise.
No worries.
- Andres
On 29 January 2018 at 22:53, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-01-23 23:20:38 -0800, Andres Freund wrote:
== Code ==
As the patchset is large (500kb) and I'm still quickly evolving it, I do
not yet want to attach it. The git tree is at
https://git.postgresql.org/git/users/andresfreund/postgres.git
in the jit branch
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
I've just pushed an updated and rebased version of the tree:
- Split the large "jit infrastructure" commits into a number of smaller
commits
- Split the C++ file
- Dropped some of the performance stuff done to heaptuple.c - that was
mostly to make performance comparisons a bit more interesting, but
doesn't seem important enough to deal with.
- Added a commit renaming datetime.h symbols so they don't conflict with
LLVM variables anymore, removing ugly #undef PM/#define PM dance
around includes. Will post separately.
- Reduced the number of pointer constants in the generated LLVM IR, by
doing more getelementptr accesses (stem from before the time types
were automatically synced)
- Increased number of comments a bit
There's a jit-before-rebase-2018-01-29 tag, for the state of the tree
before the rebase.
If you submit the C++ support separately I'd like to sign up as reviewer
and get that in. It's non-intrusive and just makes our existing c++
compilation support actually work properly. Your patch is a more complete
version of the C++ support I hacked up during linux.conf.au - I should've
thought to look in your tree.
The only part I had to add that I don't see in yours is a workaround for
mismatched throw() annotations on our redefinition of inet_net_ntop :
src/include/port.h:
@@ -421,7 +425,7 @@ extern int pg_codepage_to_encoding(UINT cp);
/* port/inet_net_ntop.c */
extern char *inet_net_ntop(int af, const void *src, int bits,
- char *dst, size_t size);
+ char *dst, size_t size) __THROW;
src/include/c.h:
@@ -1131,6 +1131,16 @@ extern int fdatasync(int fildes);
#define NON_EXEC_STATIC static
#endif
+/*
+ * glibc uses __THROW when compiling with the c++ compiler, but port.h
+ * redeclares inet_net_ntop. If we don't annotate it the same way as the
+ * prototype in <arpa/inet.h> we'll upset g++, so we must use __THROW from
+ * <sys/cdefs.h>. If we're not on glibc, we need to define it away.
+ */
+#ifndef __GNU_LIBRARY__
+#define __THROW
+#endif
+
/* /port compatibility functions */
#include "port.h"
This might be better solved by renaming it to pg_inet_net_ntop so we don't
conflict with a standard name.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi,
On Mon, Jan 29, 2018 at 10:40 AM, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-01-29 10:28:18 -0800, Jeff Davis wrote:
OK. How about this: are you open to changes that move us in the
direction of extensibility later? (By this I do *not* mean imposing a
bunch of requirements on you... either small changes to your patches
or something part of another commit.)
I'm good with that.
Or are you determined that this always should be a part of core?
I'm strongly against there not being an in-core JIT. I'm not at all
against adding APIs that allow to do different JIT implementations out
of core.
I can live with that.
I recommend that you discuss with packagers and a few others, to
reduce the chance of disagreement later.
Well, the source would require an actual compiler around. And the
inlining *just* for the function code itself isn't actually that
interesting, you e.g. want to also be able to
I think you hit enter too quickly... what's the rest of that sentence?
Regards,
Jeff Davis
On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:
It's an optional dependency, and it doesn't increase build time that
much... If we were to move the llvm interfacing code to a .so, there'd
not even be a packaging issue, you can just package that .so separately
and get errors if somebody tries to enable LLVM without that .so being
installed.
I suspect that would be really valuable. If 'yum install
postgresql-server' (or your favorite equivalent) sucks down all of
LLVM, some people are going to complain, either because they are
trying to build little tiny machine images or because they are subject
to policies which preclude the presence of a compiler on a production
server. If you can do 'yum install postgresql-server' without
additional dependencies and 'yum install postgresql-server-jit' to
make it go faster, that issue is solved.
Unfortunately, that has the pretty significant downside that a lot of
people who actually want the postgresql-server-jit package will not
realize that they need to install it, which sucks. But I think it
might still be the better way to go. Anyway, it's for individual
packagers to cope with that problem; as far as the patch goes, +1 for
structuring things in a way which gives packagers the option to divide
it up that way.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jan 24, 2018 at 2:20 AM, Andres Freund <andres@anarazel.de> wrote:
== Error handling ==
There's two aspects to error handling.
Firstly, generated (LLVM IR) and emitted functions (mmap()ed segments)
need to be cleaned up both after a successful query execution and after
an error. I've settled on a fairly boring resowner based mechanism. On
errors all expressions owned by a resowner are released, upon success
expressions are reassigned to the parent / released on commit (unless
executor shutdown has cleaned them up of course).
Cool.
A second, less pretty and newly developed, aspect of error handling is
OOM handling inside LLVM itself. The above resowner based mechanism
takes care of cleaning up emitted code upon ERROR, but there's also the
chance that LLVM itself runs out of memory. LLVM by default does *not*
use any C++ exceptions. Its allocations are primarily funneled through
the standard "new" handlers, and some direct use of malloc() and
mmap(). For the former a 'new handler' exists
(http://en.cppreference.com/w/cpp/memory/new/set_new_handler); for the
latter LLVM provides callbacks that get called upon failure
(unfortunately mmap() failures are treated as fatal rather than OOM
errors).
What I've chosen to do, and I'd be interested to get some input about
that, is to have two functions that LLVM using code must use:
extern void llvm_enter_fatal_on_oom(void);
extern void llvm_leave_fatal_on_oom(void);
Before interacting with LLVM code (i.e. emitting IR, or using the above
functions) llvm_enter_fatal_on_oom() needs to be called.
When a libstdc++ new or LLVM error occurs, the handlers set up by the
above functions trigger a FATAL error. We have to use FATAL rather than
ERROR, as we *cannot* reliably throw ERROR inside a foreign library
without risking corrupting its internal state.
That bites, although it's probably tolerable if we expect such errors
only in exceptional situations such as a needed shared library failing
to load or something. Killing the session when we run out of memory
during JIT compilation is not very nice at all. Does the LLVM library
have any useful hooks that we can leverage here, like a hypothetical
function LLVMProvokeFailureAsSoonAsConvenient()? The equivalent
function for PostgreSQL would do { InterruptPending = true;
QueryCancelPending = true; }. And maybe LLVMSetProgressCallback()
that would get called periodically and let us set a handler that could
check for interrupts on the PostgreSQL side and then call
LLVMProvokeFailureAsSoonAsConvenient() as applicable? This problem
can't be completely unique to PostgreSQL; anybody who is using LLVM
for JIT from a long-running process needs a solution, so you might
think that the library would provide one.
This facility allows us to get the bitcode for all operators
(e.g. int8eq, float8pl etc), without maintaining two copies. The way
I've currently set it up is that, if --with-llvm is passed to configure,
all backend files are also compiled to bitcode files. These bitcode
files get installed into the server's
$pkglibdir/bitcode/postgres/
under their original subfolder, eg.
~/build/postgres/dev-assert/install/lib/bitcode/postgres/utils/adt/float.bc
Using existing LLVM functionality (for parallel LTO compilation),
an index over these is additionally stored to
$pkglibdir/bitcode/postgres.index.bc
That sounds pretty sweet.
When deciding to JIT for the first time, $pkglibdir/bitcode/ is scanned
for all .index.bc files and a *combined* index over all these files is
built in memory. The reason for doing so is that this allows "easy"
access to inlining for extensions - they can install code into
$pkglibdir/bitcode/[extension]/
accompanied by
$pkglibdir/bitcode/[extension].index.bc
just alongside the actual library.
But that means that if an extension is installed after the initial
scan has been done, concurrent sessions won't notice the new files.
Maybe that's OK, but I wonder if we can do better.
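For illustration only, the one-time directory scan described above could look roughly like this C sketch. The names is_index_file() and scan_bitcode_dir() are hypothetical and not taken from the patch; the sketch assumes POSIX opendir()/readdir() and only locates the index files, it does not build the combined in-memory index.

```c
#include <dirent.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define INDEX_SUFFIX ".index.bc"

/* True if name looks like "<something>.index.bc" (a bare suffix is rejected). */
bool
is_index_file(const char *name)
{
    size_t len = strlen(name);
    size_t slen = strlen(INDEX_SUFFIX);

    return len > slen && strcmp(name + len - slen, INDEX_SUFFIX) == 0;
}

/*
 * Walk one directory (e.g. $pkglibdir/bitcode/) and report every index
 * file found.  Returns the number of index files, or -1 if the directory
 * cannot be opened.  on_index is a hypothetical hook where a real
 * implementation would merge the file into the combined index.
 */
int
scan_bitcode_dir(const char *path, void (*on_index) (const char *name))
{
    DIR        *dir = opendir(path);
    struct dirent *de;
    int         found = 0;

    if (dir == NULL)
        return -1;

    while ((de = readdir(dir)) != NULL)
    {
        if (!is_index_file(de->d_name))
            continue;
        if (on_index != NULL)
            on_index(de->d_name);
        found++;
    }

    closedir(dir);
    return found;
}
```

Because a scan like this runs once per session, files installed afterwards are only seen by sessions that scan later, which is the limitation raised here.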
Do people feel these should be hidden behind #ifdefs, always present but
prevented from being set to a meaningful value, or unrestricted?
We shouldn't allow non-superusers to set any GUC that dumps files to
the data directory or provides an easy way to crash the server, run
the machine out of memory, or similar. GUCs that just print stuff, or
make queries faster/slower, can be set by anyone, I think. I favor
having the debugging stuff available in the default build. This
feature has a chance of containing bugs, and those bugs will be hard
to troubleshoot if the first step in getting information on what went
wrong is "recompile".
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
On 2018-01-30 13:57:50 -0500, Robert Haas wrote:
When a libstdc++ new or LLVM error occurs, the handlers set up by the
above functions trigger a FATAL error. We have to use FATAL rather than
ERROR, as we *cannot* reliably throw ERROR inside a foreign library
without risking corrupting its internal state.
That bites, although it's probably tolerable if we expect such errors
only in exceptional situations such as a needed shared library failing
to load or something. Killing the session when we run out of memory
during JIT compilation is not very nice at all. Does the LLVM library
have any useful hooks that we can leverage here, like a hypothetical
function LLVMProvokeFailureAsSoonAsConvenient()?
I don't see how that'd help if a memory allocation fails? We can't just
continue in that case? You could arguably have a reserve memory pool that
you release in that case and then try to continue, but that seems
awfully fragile.
The equivalent function for PostgreSQL would do { InterruptPending =
true; QueryCancelPending = true; }. And maybe
LLVMSetProgressCallback() that would get called periodically and let
us set a handler that could check for interrupts on the PostgreSQL
side and then call LLVMProvokeFailureAsSoonAsConvenient() as
applicable? This problem can't be completely unique to PostgreSQL;
anybody who is using LLVM for JIT from a long-running process needs a
solution, so you might think that the library would provide one.
The ones I looked at just error out. Needing to handle OOM in a soft-fail
manner isn't actually that common a demand, I guess :/.
for all .index.bc files and a *combined* index over all these files is
built in memory. The reason for doing so is that this allows "easy"
access to inlining for extensions - they can install code into
$pkglibdir/bitcode/[extension]/
accompanied by
$pkglibdir/bitcode/[extension].index.bc
just alongside the actual library.
But that means that if an extension is installed after the initial
scan has been done, concurrent sessions won't notice the new files.
Maybe that's OK, but I wonder if we can do better.
I mean we could periodically rescan, rescan after sighup, or such? But
that seems like something for later to me. It's not going to be super
common to install new extensions while a lot of sessions are
running. And things will work in that case, the functions just won't get inlined...
Do people feel these should be hidden behind #ifdefs, always present but
prevented from being set to a meaningful value, or unrestricted?
We shouldn't allow non-superusers to set any GUC that dumps files to
the data directory or provides an easy way to crash the server, run
the machine out of memory, or similar.
I don't buy the OOM one - there's so so so many of those already...
The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying
if profiling can only be done by a superuser? Hm :/
Greetings,
Andres Freund
On Tue, Jan 30, 2018 at 2:08 PM, Andres Freund <andres@anarazel.de> wrote:
That bites, although it's probably tolerable if we expect such errors
only in exceptional situations such as a needed shared library failing
to load or something. Killing the session when we run out of memory
during JIT compilation is not very nice at all. Does the LLVM library
have any useful hooks that we can leverage here, like a hypothetical
function LLVMProvokeFailureAsSoonAsConvenient()?
I don't see how that'd help if a memory allocation fails? We can't just
continue in that case? You could arguably have a reserve memory pool that
you release in that case and then try to continue, but that seems
awfully fragile.
Well, I'm just asking what the library supports. For example:
https://curl.haxx.se/libcurl/c/CURLOPT_PROGRESSFUNCTION.html
If you had something like that, you could arrange to safely interrupt
the library the next time the progress-function was called.
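A minimal sketch of that pattern, in C, might look as follows. Nothing here is an LLVM or PostgreSQL API: long_running_work(), progress_cb, CancelState and demo_run() are made-up names, and the loop merely stands in for a library routine that polls a caller-supplied callback between units of work. It covers interruption at a safe point only, not recovery from a failed allocation.

```c
#include <stddef.h>

/* Callback type: return nonzero to ask the library to stop. */
typedef int (*progress_cb) (void *arg);

/*
 * Stand-in for a long-running library routine.  It polls the callback
 * before each unit of work and returns the number of units completed.
 */
int
long_running_work(int steps, progress_cb cb, void *arg)
{
    for (int i = 0; i < steps; i++)
    {
        if (cb != NULL && cb(arg) != 0)
            return i;           /* stopped at a safe point */
        /* ... one bounded unit of real work would happen here ... */
    }
    return steps;
}

/* Hypothetical caller-side state: cancel once polled cancel_after times. */
typedef struct CancelState
{
    int         polls;
    int         cancel_after;
} CancelState;

static int
check_cancel(void *arg)
{
    CancelState *st = arg;

    return st->polls++ >= st->cancel_after;
}

/* Run the work with a cancellation request arriving after N polls. */
int
demo_run(int steps, int cancel_after)
{
    CancelState st = {0, cancel_after};

    return long_running_work(steps, check_cancel, &st);
}
```

On the PostgreSQL side the callback would presumably check the pending-interrupt flags rather than a counter; the counter is only there to make the sketch deterministic.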
The ones I looked at just error out. Needing to handle OOM in a soft-fail
manner isn't actually that common a demand, I guess :/.
Bummer.
I mean we could periodically rescan, rescan after sighup, or such? But
that seems like something for later to me. It's not going to be super
common to install new extensions while a lot of sessions are
running. And things will work in that case, the functions just won't get inlined...
Fair enough.
Do people feel these should be hidden behind #ifdefs, always present but
prevented from being set to a meaningful value, or unrestricted?
We shouldn't allow non-superusers to set any GUC that dumps files to
the data directory or provides an easy way to crash the server, run
the machine out of memory, or similar.
I don't buy the OOM one - there's so so so many of those already...
The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying
if profiling can only be done by a superuser? Hm :/
The server's ~/.debug/jit? Or are you somehow getting the output to the client?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
Unfortunately, that has the pretty significant downside that a lot of
people who actually want the postgresql-server-jit package will not
realize that they need to install it, which sucks. But I think it
might still be the better way to go. Anyway, it's for individual
packagers to cope with that problem; as far as the patch goes, +1 for
structuring things in a way which gives packagers the option to divide
it up that way.
I don't know about rpm/yum/dnf, but in dpkg/apt one could declare that
postgresql-server recommends postgresql-server-jit, which installs the
package by default, but can be overridden by config or on the command
line.
- ilmari
--
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
the consequences of." -- Skud's Meta-Law
Hi,
On 2018-01-30 15:06:02 -0500, Robert Haas wrote:
On Tue, Jan 30, 2018 at 2:08 PM, Andres Freund <andres@anarazel.de> wrote:
That bites, although it's probably tolerable if we expect such errors
only in exceptional situations such as a needed shared library failing
to load or something. Killing the session when we run out of memory
during JIT compilation is not very nice at all. Does the LLVM library
have any useful hooks that we can leverage here, like a hypothetical
function LLVMProvokeFailureAsSoonAsConvenient()?
I don't see how that'd help if a memory allocation fails? We can't just
continue in that case? You could arguably have a reserve memory pool that
you release in that case and then try to continue, but that seems
awfully fragile.
Well, I'm just asking what the library supports. For example:
https://curl.haxx.se/libcurl/c/CURLOPT_PROGRESSFUNCTION.html
I get that type of function, what I don't understand is how that applies to
OOM:
If you had something like that, you could arrange to safely interrupt
the library the next time the progress-function was called.
Yea, but how are you going to *get* to the next time, given that an
allocator just couldn't allocate memory? You can't just return a NULL
pointer because the caller will use that memory?
The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying
if profiling can only be done by a superuser? Hm :/
The server's ~/.debug/jit? Or are you somehow getting the output to the client?
Yes, the server's - I'm not sure I understand the "client" bit? It's
about perf profiling, which isn't available to the client either?
Greetings,
Andres Freund
On 01/30/2018 12:24 AM, Andres Freund wrote:
Hi,
On 2018-01-30 00:16:46 +0100, Tomas Vondra wrote:
FWIW I've installed llvm 5.0.1 from distribution package, and now
everything builds fine (I don't even need the configure tweak).
I think I had to build the other binaries because there was no 5.x llvm
back then, but it's too far back so I don't remember.
Anyway, seems I'm fine for now.
Phew, I'm relieved. I'd guess you built a 5.0 version while 5.0 was
still in development, so not all 5.0 functionality was available. Hence
the inconsistent looking result. While I think we can support 4.0
without too much problem, there's obviously no point in trying to
support old between-release versions...
That's quite possible, but I don't really remember :-/
But I ran into another issue today, where everything builds fine (llvm
5.0.1, gcc 6.4.0), but at runtime I get errors like this:
ERROR:
LLVMCreateMemoryBufferWithContentsOfFile(/home/tomas/pg-llvm/lib/postgresql/llvmjit_types.bc)
failed: No such file or directory
It seems the llvmjit_types.bc file ended up in the parent directory
(/home/tomas/pg-llvm/lib/) for some reason. After simply copying it to
the expected place everything started working.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote:
On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:
It's an optional dependency, and it doesn't increase build time
that much... If we were to move the llvm interfacing code to a
.so, there'd not even be a packaging issue, you can just package
that .so separately and get errors if somebody tries to enable
LLVM without that .so being installed.
I suspect that would be really valuable. If 'yum install
postgresql-server' (or your favorite equivalent) sucks down all of
LLVM,
As I understand it, LLVM is organized in such a way as not to require
this. Andres, am I understanding correctly that what you're using
doesn't require much of LLVM at runtime?
some people are going to complain, either because they are
trying to build little tiny machine images or because they are
subject to policies which preclude the presence of a compiler on a
production server. If you can do 'yum install postgresql-server'
without additional dependencies and 'yum install
postgresql-server-jit' to make it go faster, that issue is solved.
Would you consider it solved if there were some very small part of the
LLVM (or similar JIT-capable) toolchain added as a dependency, or does
it need to be optional into a long future?
Unfortunately, that has the pretty significant downside that a lot of
people who actually want the postgresql-server-jit package will not
realize that they need to install it, which sucks.
It does indeed.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Hi,
On 2018-01-30 22:57:06 +0100, David Fetter wrote:
On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote:
On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:
It's an optional dependency, and it doesn't increase build time
that much... If we were to move the llvm interfacing code to a
.so, there'd not even be a packaging issue, you can just package
that .so separately and get errors if somebody tries to enable
LLVM without that .so being installed.
I suspect that would be really valuable. If 'yum install
postgresql-server' (or your favorite equivalent) sucks down all of
LLVM,
As I understand it, LLVM is organized in such a way as not to require
this. Andres, am I understanding correctly that what you're using
doesn't require much of LLVM at runtime?
I'm not sure what exactly you mean. Yes, you need the llvm library at
runtime. Perhaps you're thinking of clang or llvm binaries? The
latter we do *not* need.
What's required is something like:
$ apt show libllvm5.0
Package: libllvm5.0
Version: 1:5.0.1-2
Priority: optional
Section: libs
Source: llvm-toolchain-5.0
Maintainer: LLVM Packaging Team <pkg-llvm-team@lists.alioth.debian.org>
Installed-Size: 56.9 MB
Depends: libc6 (>= 2.15), libedit2 (>= 2.11-20080614), libffi6 (>= 3.0.4), libgcc1 (>= 1:3.4), libstdc++6 (>= 6), libtinfo5 (>= 6), zlib1g (>= 1:1.2.0)
Breaks: libllvm3.9v4
Replaces: libllvm3.9v4
Homepage: http://www.llvm.org/
Tag: role::shared-lib
Download-Size: 13.7 MB
APT-Manual-Installed: no
APT-Sources: http://debian.osuosl.org/debian unstable/main amd64 Packages
Description: Modular compiler and toolchain technologies, runtime library
LLVM is a collection of libraries and tools that make it easy to build
compilers, optimizers, just-in-time code generators, and many other
compiler-related programs.
.
This package contains the LLVM runtime library.
So ~14MB to download, ~57MB on disk. We only need a subset of
libllvm5.0, and LLVM allows building such a subset. But obviously
distributions aren't going to target their LLVM just for postgres.
Unfortunately, that has the pretty significant downside that a lot of
people who actually want the postgresql-server-jit package will not
realize that they need to install it, which sucks.
It does indeed.
With things like apt recommends and such I don't think this is a huge
problem. It'll be installed by default unless somebody is on a space
constrained system and doesn't want that...
Greetings,
Andres Freund
Hi,
On 2018-01-30 13:46:37 -0500, Robert Haas wrote:
On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:
It's an optional dependency, and it doesn't increase build time that
much... If we were to move the llvm interfacing code to a .so, there'd
not even be a packaging issue, you can just package that .so separately
and get errors if somebody tries to enable LLVM without that .so being
installed.
I suspect that would be really valuable. If 'yum install
postgresql-server' (or your favorite equivalent) sucks down all of
LLVM, some people are going to complain, either because they are
trying to build little tiny machine images or because they are subject
to policies which preclude the presence of a compiler on a production
server. If you can do 'yum install postgresql-server' without
additional dependencies and 'yum install postgresql-server-jit' to
make it go faster, that issue is solved.
So, I'm working on that now. In the course of this I'll be
painfully rebasing and renaming a lot of code, which I'd like not to repeat
unnecessarily.
Right now there primarily is:
src/backend/lib/llvmjit.c - infrastructure, optimization, error handling
src/backend/lib/llvmjit_{error,wrap,inline}.cpp - expose more stuff to C
src/backend/executor/execExprCompile.c - emit LLVM IR for expressions
src/backend/access/common/heaptuple.c - emit LLVM IR for deforming
Given that we need a shared library it'll be best buildsystem wise if
all of this is in a directory, and there's a separate file containing
the stubs that call into it.
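To make the split concrete, here is a rough C sketch of what such a stub could look like. Every name in it (JitCallbacks, JitLoader, JitStub, jit_stub_compile_expr) is hypothetical; in a real build the loader would presumably wrap dlopen()/dlsym() on the separately packaged .so, which is replaced here by plain functions so the sketch stays self-contained.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback table that the provider library fills in. */
typedef struct JitCallbacks
{
    bool        (*compile_expr) (void *expr);
} JitCallbacks;

/* Hypothetical loader; returns false if the provider is not installed. */
typedef bool (*JitLoader) (JitCallbacks *cb);

/* Per-backend stub state: try to load once, remember the outcome. */
typedef struct JitStub
{
    bool        load_attempted;
    bool        load_ok;
    JitCallbacks cb;
} JitStub;

/*
 * Stub living in core: loads the provider lazily and forwards the call.
 * Returns false if no provider is available; the caller decides whether
 * that is reported as an error or the expression is simply interpreted.
 */
bool
jit_stub_compile_expr(JitStub *stub, JitLoader loader, void *expr)
{
    if (!stub->load_attempted)
    {
        stub->load_attempted = true;
        stub->load_ok = loader != NULL &&
            loader(&stub->cb) &&
            stub->cb.compile_expr != NULL;
    }

    if (!stub->load_ok)
        return false;

    return stub->cb.compile_expr(expr);
}

/* Stand-ins for "provider .so installed" and "provider .so missing". */
static bool
fake_compile(void *expr)
{
    (void) expr;
    return true;
}

bool
loader_present(JitCallbacks *cb)
{
    cb->compile_expr = fake_compile;
    return true;
}

bool
loader_missing(JitCallbacks *cb)
{
    (void) cb;
    return false;
}

bool
demo(bool installed)
{
    JitStub     stub = {false, false, {NULL}};

    return jit_stub_compile_expr(&stub,
                                 installed ? loader_present : loader_missing,
                                 NULL);
}
```

With this shape only the stub file has to be linked into the server binary, and everything that touches LLVM can live in the directory that becomes the shared library.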
I'm not quite sure where to put the code. I'm a bit inclined to add a
new
src/backend/jit/
because we're dealing with code from across different categories? There
we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
specific code?
Alternatively I'd say we put the stub into src/backend/executor/pgjit.c,
and the actual llvm using code into src/backend/executor/llvmjit/?
Comments?
Andres Freund
On Tue, Jan 30, 2018 at 02:08:30PM -0800, Andres Freund wrote:
Hi,
On 2018-01-30 22:57:06 +0100, David Fetter wrote:
On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote:
On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:
It's an optional dependency, and it doesn't increase build
time that much... If we were to move the llvm interfacing code
to a .so, there'd not even be a packaging issue, you can just
package that .so separately and get errors if somebody tries
to enable LLVM without that .so being installed.
I suspect that would be really valuable. If 'yum install
postgresql-server' (or your favorite equivalent) sucks down all
of LLVM,
As I understand it, LLVM is organized in such a way as not to
require this. Andres, am I understanding correctly that what
you're using doesn't require much of LLVM at runtime?
I'm not sure what exactly you mean. Yes, you need the llvm library
at runtime. Perhaps you're thinking of clang or llvm binaries?
The latter we do *not* need.
I was, and glad I understood correctly.
What's required is something like:
$ apt show libllvm5.0
Package: libllvm5.0
Version: 1:5.0.1-2
Priority: optional
Section: libs
Source: llvm-toolchain-5.0
Maintainer: LLVM Packaging Team <pkg-llvm-team@lists.alioth.debian.org>
Installed-Size: 56.9 MB
Depends: libc6 (>= 2.15), libedit2 (>= 2.11-20080614), libffi6 (>= 3.0.4), libgcc1 (>= 1:3.4), libstdc++6 (>= 6), libtinfo5 (>= 6), zlib1g (>= 1:1.2.0)
Breaks: libllvm3.9v4
Replaces: libllvm3.9v4
Homepage: http://www.llvm.org/
Tag: role::shared-lib
Download-Size: 13.7 MB
APT-Manual-Installed: no
APT-Sources: http://debian.osuosl.org/debian unstable/main amd64 Packages
Description: Modular compiler and toolchain technologies, runtime library
LLVM is a collection of libraries and tools that make it easy to build
compilers, optimizers, just-in-time code generators, and many other
compiler-related programs.
.
This package contains the LLVM runtime library.
So ~14MB to download, ~57MB on disk. We only need a subset of
libllvm5.0, and LLVM allows building such a subset. But obviously
distributions aren't going to target their LLVM just for postgres.
True, although if they're using an LLVM only for PostgreSQL and care
about 57MB of disk, they're probably also ready to do that work.
Unfortunately, that has the pretty significant downside that a
lot of people who actually want the postgresql-server-jit
package will not realize that they need to install it, which
sucks.
It does indeed.
With things like apt recommends and such I don't think this is a
huge problem. It'll be installed by default unless somebody is on a
space constrained system and doesn't want that...
Don't most of the wins for JITing come in the OLAP space anyway? I'm
having trouble picturing a severely space-constrained OLAP system, but
of course it's a possible scenario.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Jan 30, 2018, at 2:08 PM, Andres Freund <andres@anarazel.de> wrote:
With things like apt recommends and such I don't think this is a huge problem.
I don’t believe there is a similar widely-supported dependency type in yum/rpm, though. rpm 4.12 adds support for Weak Dependencies, which have Recommends/Suggests-style semantics, but AFAIK it’s not going to be on most RPM machines (I haven’t checked most OSes yet, but IIRC it’s mostly a Fedora thing at this point?)
Which means in the rpm packages we’ll have to decide whether this is required or must be opt-in by end users (which as discussed would hurt adoption).
--
Jason Petersen
Software Engineer | Citus Data
303.736.9255
jason@citusdata.com
On Wed, Jan 31, 2018 at 11:57 AM, Andres Freund <andres@anarazel.de> wrote:
On 2018-01-30 13:46:37 -0500, Robert Haas wrote:
On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:
It's an optional dependency, and it doesn't increase build time that
much... If we were to move the llvm interfacing code to a .so, there'd
not even be a packaging issue, you can just package that .so separately
and get errors if somebody tries to enable LLVM without that .so being
installed.
I suspect that would be really valuable. If 'yum install
postgresql-server' (or your favorite equivalent) sucks down all of
LLVM, some people are going to complain, either because they are
trying to build little tiny machine images or because they are subject
to policies which preclude the presence of a compiler on a production
server. If you can do 'yum install postgresql-server' without
additional dependencies and 'yum install postgresql-server-jit' to
make it go faster, that issue is solved.
So, I'm working on that now. In the course of this I'll be
painfully rebasing and renaming a lot of code, which I'd like not to repeat
unnecessarily.
Right now there primarily is:
src/backend/lib/llvmjit.c - infrastructure, optimization, error handling
src/backend/lib/llvmjit_{error,wrap,inline}.cpp - expose more stuff to C
src/backend/executor/execExprCompile.c - emit LLVM IR for expressions
src/backend/access/common/heaptuple.c - emit LLVM IR for deforming
Given that we need a shared library it'll be best buildsystem wise if
all of this is in a directory, and there's a separate file containing
the stubs that call into it.
I'm not quite sure where to put the code. I'm a bit inclined to add a
new
src/backend/jit/
because we're dealing with code from across different categories? There
we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
specific code?
Alternatively I'd say we put the stub into src/backend/executor/pgjit.c,
and the actual llvm using code into src/backend/executor/llvmjit/?
Comments?
I'm just starting to look at this (amazing) work, and I don't have a
strong opinion yet. But certainly, making it easy for packagers to
put the -jit stuff into a separate package for the reasons already
given sounds sensible to me. Some systems package LLVM as one
gigantic package that'll get you 1GB of compiler/debugger/other stuff
and perhaps violate local rules by installing a compiler when you
really just wanted libLLVM{whatever}.so. I guess it should be made
very clear to users (explain plans, maybe startup message, ...?)
whether JIT support is active/installed so that people are at least
very aware when they encounter a system that is interpreting stuff it
could be compiling. Putting all the JIT into a separate directory
under src/backend/jit certainly looks sensible at first glance, but
I'm not sure.
Incidentally, from commit fdc6c7a6dddbd6df63717f2375637660bcd00fc6
(HEAD -> jit, andresfreund/jit) on your branch I get:
ccache c++ -Wall -Wpointer-arith -fno-strict-aliasing -fwrapv -g -g
-O2 -fno-exceptions -I../../../src/include
-I/usr/local/llvm50/include -DLLVM_BUILD_GLOBAL_ISEL
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/local/include -c -o llvmjit_error.o llvmjit_error.cpp -MMD -MP
-MF .deps/llvmjit_error.Po
In file included from llvmjit_error.cpp:26:
In file included from ../../../src/include/lib/llvmjit.h:48:
In file included from /usr/local/llvm50/include/llvm-c/Types.h:17:
In file included from /usr/local/llvm50/include/llvm/Support/DataTypes.h:33:
/usr/include/c++/v1/cmath:555:1: error: templates must have C++ linkage
template <class _A1>
^~~~~~~~~~~~~~~~~~~~
llvmjit_error.cpp:24:1: note: extern "C" language linkage
specification begins here
extern "C"
^
$ c++ -v
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on
LLVM 4.0.0)
This seems to be a valid complaint. I don't think you should be
(indirectly) wrapping Types.h in extern "C". At a guess, your
llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
specifiers, so you can use it from C or C++, but making sure that you
don't #include LLVM's headers from a bizarro context where __cplusplus
is defined but the linkage is unexpectedly already "C"?
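A sketch of the header layout being suggested, assuming nothing about the real llvmjit.h beyond what is said above (jit_sketch_symbol_len is a made-up function used only so there is something to declare):

```c
/*
 * System and third-party headers (this is where the LLVM headers would
 * go) are included first, outside any linkage block, so a C++ compiler
 * parses them with their normal C++ linkage.
 */
#include <stddef.h>

#ifdef __cplusplus
extern "C"
{
#endif

/* Only this header's own declarations are given C linkage. */
extern size_t jit_sketch_symbol_len(const char *name);

#ifdef __cplusplus
}
#endif

/* Definition; in a real tree this would live in a .c file. */
size_t
jit_sketch_symbol_len(const char *name)
{
    size_t      n = 0;

    while (name[n] != '\0')
        n++;
    return n;
}
```

A .cpp file can then include such a header directly instead of wrapping the #include in its own extern "C" block, which is what pulled the C++ standard headers into C linkage in the error above.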
--
Thomas Munro
http://www.enterprisedb.com
On 2018-01-31 14:42:26 +1300, Thomas Munro wrote:
I'm just starting to look at this (amazing) work, and I don't have a
strong opinion yet. But certainly, making it easy for packagers to
put the -jit stuff into a separate package for the reasons already
given sounds sensible to me. Some systems package LLVM as one
gigantic package that'll get you 1GB of compiler/debugger/other stuff
and perhaps violate local rules by installing a compiler when you
really just wanted libLLVM{whatever}.so. I guess it should be made
very clear to users (explain plans, maybe startup message, ...?)
I'm not quite sure I understand. You mean have it display whether
available? I think my plan is to "just" have setting jit_expressions=on (or
whatever we're going to name it) fail if the prerequisites aren't
available. I personally don't think this should be enabled by default,
definitely not in the first release.
$ c++ -v
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on
LLVM 4.0.0)
This seems to be a valid complaint. I don't think you should be
(indirectly) wrapping Types.h in extern "C". At a guess, your
llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
specifiers, so you can use it from C or C++, but making sure that you
don't #include LLVM's headers from a bizarro context where __cplusplus
is defined but the linkage is unexpectedly already "C"?
Hm, this seems like a bit of pointless nitpickery by the compiler to me,
but I guess...
Greetings,
Andres Freund
On Wed, Jan 31, 2018 at 3:05 PM, Andres Freund <andres@anarazel.de> wrote:
On 2018-01-31 14:42:26 +1300, Thomas Munro wrote:
I'm just starting to look at this (amazing) work, and I don't have a
strong opinion yet. But certainly, making it easy for packagers to
put the -jit stuff into a separate package for the reasons already
given sounds sensible to me. Some systems package LLVM as one
gigantic package that'll get you 1GB of compiler/debugger/other stuff
and perhaps violate local rules by installing a compiler when you
really just wanted libLLVM{whatever}.so. I guess it should be made
very clear to users (explain plans, maybe startup message, ...?)
I'm not quite sure I understand. You mean have it display whether it's
available? I think my plan is to "just" have jit_expressions=on (or
whatever we're going to name it) fail if the prerequisites aren't
available. I personally don't think this should be enabled by default,
definitely not in the first release.
I assumed (incorrectly) that you wanted it to default to on if
available, so I was suggesting making it obvious to end users if
they've accidentally forgotten to install -jit. If it's not enabled
until you actually ask for it and trying to enable it when it's not
installed barfs, then that seems sensible.
This seems to be a valid complaint. I don't think you should be
(indirectly) wrapping Types.h in extern "C". At a guess, your
llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
specifiers, so you can use it from C or C++, but making sure that you
don't #include LLVM's headers from a bizarro context where __cplusplus
is defined but the linkage is unexpectedly already "C"?
Hm, this seems like a bit of pointless nitpickery by the compiler to me,
but I guess...
Well that got me curious about how GCC could possibly be accepting
that (it certainly doesn't like extern "C" template ... any more than
the next compiler). I dug a bit and realised that it's the stdlib
that's different: libstdc++ has its own extern "C++" in <cmath>,
while libc++ doesn't.
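To illustrate the difference described above, here is a small sketch (not code from either standard library) of how re-opening C++ linkage inside an extern "C" block makes a template legal again. Linkage specifications nest and the innermost one wins. The names twice and c_linkage_twice are hypothetical.

```cpp
#include <cassert>

extern "C"
{
    // Without this inner block the template below would be rejected,
    // because templates cannot have C linkage.
    extern "C++"
    {
        template <class T>
        T twice(T x)
        {
            return x + x;
        }
    }

    // An ordinary function with C linkage that uses the template.
    int c_linkage_twice(int x)
    {
        return twice(x);
    }
}

// Small usage check.
inline void linkage_self_check()
{
    assert(c_linkage_twice(21) == 42);
}
```

A header that wraps its own contents this way tolerates being included from inside someone else's extern "C" block; a header that does not will fail as in the FreeBSD build above.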
--
Thomas Munro
http://www.enterprisedb.com
Hi,
On 2018-01-31 15:48:09 +1300, Thomas Munro wrote:
On Wed, Jan 31, 2018 at 3:05 PM, Andres Freund <andres@anarazel.de> wrote:
I'm not quite sure I understand. You mean have it display whether it's
available? I think my plan is to "just" have jit_expressions=on (or
whatever we're going to name it) fail if the prerequisites aren't
available. I personally don't think this should be enabled by default,
definitely not in the first release.
I assumed (incorrectly) that you wanted it to default to on if
available, so I was suggesting making it obvious to end users if
they've accidentally forgotten to install -jit. If it's not enabled
until you actually ask for it and trying to enable it when it's not
installed barfs, then that seems sensible.
I'm open to changing my mind on it, but it seems a bit weird that a
feature that relies on a shlib being installed magically turns itself on
if available. And leaving that angle aside, ISTM, that it's a complex
enough feature that it should be opt-in the first release... Think we
roughly did that right for e.g. parallelism.
Greetings,
Andres Freund
On 31.01.2018 05:48, Thomas Munro wrote:
This seems to be a valid complaint. I don't think you should be
(indirectly) wrapping Types.h in extern "C". At a guess, your
llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
specifiers, so you can use it from C or C++, but making sure that you
don't #include LLVM's headers from a bizarro context where __cplusplus
is defined but the linkage is unexpectedly already "C"?
Hm, this seems like a bit of pointless nitpickery by the compiler to me,
but I guess...
Well that got me curious about how GCC could possibly be accepting
that (it certainly doesn't like extern "C" template ... any more than
the next compiler). I dug a bit and realised that it's the stdlib
that's different: libstdc++ has its own extern "C++" in <cmath>,
while libc++ doesn't.
The same problem takes place with old versions of GCC: I have to upgrade
GCC to 7.2 to make it possible to compile this code.
The problem is not in the compiler itself, but in the libc++ headers.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 1/30/18 21:55, Andres Freund wrote:
I'm open to changing my mind on it, but it seems a bit weird that a
feature that relies on a shlib being installed magically turns itself on
if available. And leaving that angle aside, ISTM, that it's a complex
enough feature that it should be opt-in the first release... Think we
roughly did that right for e.g. parallelism.
That sounds reasonable, for both of those reasons.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 31, 2018 at 10:22 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 1/30/18 21:55, Andres Freund wrote:
I'm open to changing my mind on it, but it seems a bit weird that a
feature that relies on a shlib being installed magically turns itself on
if available. And leaving that angle aside, ISTM, that it's a complex
enough feature that it should be opt-in the first release... Think we
roughly did that right for e.g. parallelism.
That sounds reasonable, for both of those reasons.
The first one is a problem that's not going to go away. If the
problem of JIT being enabled "magically" is something we're concerned
about, we need to figure out a good solution, not just disable the
feature by default.
As far as the second one, looking back at what happened with parallel
query, I found (on a quick read) 13 back-patched commits in
REL9_6_STABLE prior to the release of 10.0, 3 of which I would qualify
as low-importance (improving documentation, fixing something that's
not really a bug, improving a test case). A couple of those were
really stupid mistakes on my part. On the other hand, would it have
been overall worse for our users if that feature had been turned on in
9.6? I don't know. They would have had those bugs (at least until we
fixed them) but they would have had parallel query, too. It's hard
for me to judge whether that was a win or a loss, and so here. Like
parallel query, this is a feature which seems to have a low risk of
data corruption, but a fairly high risk of wrong answers to queries
and/or strange errors. Users don't like that. On the other hand,
also like parallel query, if you've got the right kind of queries, it
can make them go a lot faster. Users DO like that.
So I could go either way on whether to enable this in the first
release. I definitely would not like to see it stay disabled by
default for a second release unless we find a lot of problems with it.
There's no point in developing new features unless users are going to
get the benefit of them, and while SOME users will enable features
that aren't turned on by default, many will not.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jan 30, 2018 at 5:57 PM, Andres Freund <andres@anarazel.de> wrote:
Given that we need a shared library it'll be best buildsystem wise if
all of this is in a directory, and there's a separate file containing
the stubs that call into it.
I'm not quite sure where to put the code. I'm a bit inclined to add a
new
src/backend/jit/
because we're dealing with code from across different categories? There
we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
specific code?
That's kind of ugly, in that if we eventually end up with many
different parts of the system using JIT, they're all going to have to
put their code in that directory rather than putting it with the
subsystem to which it pertains. On the other hand, I don't really
have a better idea. I'd definitely at least try to keep
executor-specific considerations in a separate FILE from general JIT
infrastructure, and make, as far as possible, a clean separation at
the API level.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
On 2018-01-31 11:53:25 -0500, Robert Haas wrote:
On Wed, Jan 31, 2018 at 10:22 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 1/30/18 21:55, Andres Freund wrote:
I'm open to changing my mind on it, but it seems a bit weird that a
feature that relies on a shlib being installed magically turns itself on
if available. And leaving that angle aside, ISTM, that it's a complex
enough feature that it should be opt-in the first release... Think we
roughly did that right for e.g. parallelism.
That sounds reasonable, for both of those reasons.
The first one is a problem that's not going to go away. If the
problem of JIT being enabled "magically" is something we're concerned
about, we need to figure out a good solution, not just disable the
feature by default.
That's a fair argument, and I don't really have a good answer to it. We
could have a jit = off/try/on, and use that to signal things? I.e. it
can be set to try (possibly default in version + 1), and things will
work if it's not installed, but if set to on it'll refuse to work if not
enabled. Similar to how huge pages work now.
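The off/try/on idea can be sketched as follows. This is only an illustration of the proposed semantics, not code from the patch: JitSetting, resolve_jit and the error text are hypothetical names, and provider_available stands for "the JIT shared library could be loaded".

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical tri-state setting, modelled on how huge_pages behaves.
enum class JitSetting { Off, Try, On };

// Returns true if JIT compilation should be used.
// Throws if the setting is "on" but the provider is not available.
inline bool resolve_jit(JitSetting setting, bool provider_available)
{
    switch (setting)
    {
        case JitSetting::Off:
            return false;                 // never JIT
        case JitSetting::Try:
            return provider_available;    // silently fall back if missing
        case JitSetting::On:
            if (!provider_available)
                throw std::runtime_error("jit = on, but JIT support is not installed");
            return true;
    }
    return false;   // not reached; keeps compilers quiet
}

// Small usage check.
inline void resolve_jit_self_check()
{
    assert(resolve_jit(JitSetting::Try, true));
    assert(!resolve_jit(JitSetting::Try, false));
}
```

With "try" as a default, installing the extra package would be enough to get JIT, while "on" gives an explicit error for setups that depend on it.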
Greetings,
Andres Freund
Hi,
On 2018-01-31 11:56:59 -0500, Robert Haas wrote:
On Tue, Jan 30, 2018 at 5:57 PM, Andres Freund <andres@anarazel.de> wrote:
Given that we need a shared library it'll be best buildsystem wise if
all of this is in a directory, and there's a separate file containing
the stubs that call into it.
I'm not quite sure where to put the code. I'm a bit inclined to add a
new
src/backend/jit/
because we're dealing with code from across different categories? There
we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
specific code?
That's kind of ugly, in that if we eventually end up with many
different parts of the system using JIT, they're all going to have to
put their code in that directory rather than putting it with the
subsystem to which it pertains.
Yea, that's what I really dislike about the idea too.
On the other hand, I don't really have a better idea.
I guess one alternative would be to leave the individual files in their
subsystem directories, but not in the corresponding OBJS lists, and
instead pick them up from the makefile in the jit shlib? That might
be better...
It's a bit weird because the files wouldn't be compiled when make-ing
that directory, but rather when the jit shlib one is made, but that's
not too bad.
I'd definitely at least try to keep executor-specific considerations
in a separate FILE from general JIT infrastructure, and make, as far
as possible, a clean separation at the API level.
Absolutely. Right now there's general infrastructure files (error
handling, optimization, inlining), expression compilation, tuple deform
compilation, and I thought to continue keeping the files separate just
like that.
Greetings,
Andres Freund
On Wed, Jan 31, 2018 at 1:34 PM, Andres Freund <andres@anarazel.de> wrote:
The first one is a problem that's not going to go away. If the
problem of JIT being enabled "magically" is something we're concerned
about, we need to figure out a good solution, not just disable the
feature by default.
That's a fair argument, and I don't really have a good answer to it. We
could have a jit = off/try/on, and use that to signal things? I.e. it
can be set to try (possibly default in version + 1), and things will
work if it's not installed, but if set to on it'll refuse to work if not
enabled. Similar to how huge pages work now.
We could do that, but I'd be more inclined just to let JIT be
magically enabled. In general, if a user could do 'yum install ip4r'
(for example) and have that Just Work without any further database
configuration, I think a lot of people would consider that to be a
huge improvement. Unfortunately we can't really do that for various
reasons, the biggest of which is that there's no way for installing an
OS package to modify the internal state of a database that may not
even be running at the time. But as a general principle, I think
having to configure both the OS and the DB is an anti-feature, and
that if installing an extra package is sufficient to get the
new-and-improved behavior, users will like it. Bonus points if it
doesn't require a server restart.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2018-01-31 14:45:46 -0500, Robert Haas wrote:
On Wed, Jan 31, 2018 at 1:34 PM, Andres Freund <andres@anarazel.de> wrote:
The first one is a problem that's not going to go away. If the
problem of JIT being enabled "magically" is something we're concerned
about, we need to figure out a good solution, not just disable the
feature by default.
That's a fair argument, and I don't really have a good answer to it. We
could have a jit = off/try/on, and use that to signal things? I.e. it
can be set to try (possibly default in version + 1), and things will
work if it's not installed, but if set to on it'll refuse to work if not
enabled. Similar to how huge pages work now.
We could do that, but I'd be more inclined just to let JIT be
magically enabled. In general, if a user could do 'yum install ip4r'
(for example) and have that Just Work without any further database
configuration, I think a lot of people would consider that to be a
huge improvement. Unfortunately we can't really do that for various
reasons, the biggest of which is that there's no way for installing an
OS package to modify the internal state of a database that may not
even be running at the time. But as a general principle, I think
having to configure both the OS and the DB is an anti-feature, and
that if installing an extra package is sufficient to get the
new-and-improved behavior, users will like it.
I'm not seeing a contradiction between what you describe as desired, and
what I describe? If it defaulted to try, that'd just do what you want,
no? I do think it's important to configure the system so it'll error if
JITing is not available.
Bonus points if it doesn't require a server restart.
I think server restart might be doable (although it'll increase memory
usage because the shlib needs to be loaded in each backend rather than
postmaster), but once a session is running I'm fairly sure we do not
want to retry. Re-checking whether a shlib is available on the
filesystem every query does not sound like a good idea...
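A rough sketch of the "don't retry within a session" point: the load attempt happens at most once, and its result, success or failure, is remembered. JitProviderState and the injected loader callback are hypothetical; a real backend would attempt to load the shared library where the callback is invoked.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Remembers the outcome of a single provider load attempt per session.
class JitProviderState
{
public:
    explicit JitProviderState(std::function<bool()> loader)
        : loader_(std::move(loader)) {}

    // First call runs the loader; later calls reuse the cached result,
    // even if the first attempt failed.
    bool available()
    {
        if (!attempted_)
        {
            attempted_ = true;
            loaded_ = loader_();
        }
        return loaded_;
    }

private:
    std::function<bool()> loader_;
    bool attempted_ = false;
    bool loaded_ = false;
};

// Small usage check: a failing loader is consulted only once.
inline void provider_state_self_check()
{
    int calls = 0;
    JitProviderState state([&calls]() { ++calls; return false; });
    bool first = state.available();
    bool second = state.available();
    assert(!first && !second);
    assert(calls == 1);
}
```

The cost is that a library installed while a session is running is only picked up by new sessions, which matches the trade-off described above.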
Greetings,
Andres Freund
On Wed, Jan 31, 2018 at 2:49 PM, Andres Freund <andres@anarazel.de> wrote:
We could do that, but I'd be more inclined just to let JIT be
magically enabled. In general, if a user could do 'yum install ip4r'
(for example) and have that Just Work without any further database
configuration, I think a lot of people would consider that to be a
huge improvement. Unfortunately we can't really do that for various
reasons, the biggest of which is that there's no way for installing an
OS package to modify the internal state of a database that may not
even be running at the time. But as a general principle, I think
having to configure both the OS and the DB is an anti-feature, and
that if installing an extra package is sufficient to get the
new-and-improved behavior, users will like it.
I'm not seeing a contradiction between what you describe as desired, and
what I describe? If it defaulted to try, that'd just do what you want,
no? I do think it's important to configure the system so it'll error if
JITing is not available.
Hmm, I guess that's true. I'm not sure that we really need a way to
error out if JIT is not available, but maybe we do.
Bonus points if it doesn't require a server restart.
I think server restart might be doable (although it'll increase memory
usage because the shlib needs to be loaded in each backend rather than
postmaster), but once a session is running I'm fairly sure we do not
want to retry. Re-checking whether a shlib is available on the
filesystem every query does not sound like a good idea...
Agreed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 1/31/18 13:34, Andres Freund wrote:
That's a fair argument, and I don't really have a good answer to it. We
could have a jit = off/try/on, and use that to signal things? I.e. it
can be set to try (possibly default in version + 1), and things will
work if it's not installed, but if set to on it'll refuse to work if not
enabled. Similar to how huge pages work now.
But that setup also has the problem that you can't query the setting to
know whether it's actually on.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/31/18 14:45, Robert Haas wrote:
We could do that, but I'd be more inclined just to let JIT be
magically enabled. In general, if a user could do 'yum install ip4r'
(for example) and have that Just Work without any further database
configuration,
One way to do that would be to have a system-wide configuration file
like /usr/local/pgsql/etc/postgresql/postgresql.conf, which in turn
includes /usr/local/pgsql/etc/postgresql/postgresql.conf.d/*, and have
the add-on package install its configuration file with the setting jit =
on there.
Then again, if we want to make it simpler, just link the whole thing in
and turn it on by default and be done with it.
Presumably, there will be planner-level knobs to model the jit startup
time, and if you don't like it, you can set that very high to disable
it. So we don't necessarily need a separate turn-it-off-it's-broken
setting.
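As a sketch of how a cost knob can double as an off switch: the decision is a comparison of the plan's estimated cost against a threshold, so a very large threshold disables JIT in practice. The function should_jit and the strict ">" comparison are assumptions for illustration; the threshold is named after the jit_above_cost setting that appears later in this thread.

```cpp
#include <limits>

// Hypothetical decision rule: JIT only when the plan is estimated to be
// expensive enough to amortize the compilation startup time.
constexpr bool should_jit(double plan_total_cost, double jit_above_cost)
{
    return plan_total_cost > jit_above_cost;
}

// A threshold of 0 makes even trivial queries qualify ...
static_assert(should_jit(0.01, 0.0), "threshold 0 enables JIT for cheap plans");

// ... while an effectively infinite threshold turns JIT off.
static_assert(!should_jit(1e9, std::numeric_limits<double>::infinity()),
              "infinite threshold disables JIT");
```

Under this model no separate boolean is strictly required, although a dedicated on/off setting may still be easier to discover.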
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-02-01 08:46:08 -0500, Peter Eisentraut wrote:
On 1/31/18 14:45, Robert Haas wrote:
We could do that, but I'd be more inclined just to let JIT be
magically enabled. In general, if a user could do 'yum install ip4r'
(for example) and have that Just Work without any further database
configuration,
One way to do that would be to have a system-wide configuration file
like /usr/local/pgsql/etc/postgresql/postgresql.conf, which in turn
includes /usr/local/pgsql/etc/postgresql/postgresql.conf.d/*, and have
the add-on package install its configuration file with the setting jit =
on there.
I think Robert's comment about extensions wasn't about extensions and
jit, just about needing CREATE EXTENSION. I don't see any
need for per-extension/shlib configurability of JITing.
Then again, if we want to make it simpler, just link the whole thing in
and turn it on by default and be done with it.
I'd personally be ok with that too...
Greetings,
Andres Freund
On Wed, Jan 31, 2018 at 1:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jan 31, 2018 at 1:34 PM, Andres Freund <andres@anarazel.de> wrote:
The first one is a problem that's not going to go away. If the
problem of JIT being enabled "magically" is something we're concerned
about, we need to figure out a good solution, not just disable the
feature by default.
That's a fair argument, and I don't really have a good answer to it. We
could have a jit = off/try/on, and use that to signal things? I.e. it
can be set to try (possibly default in version + 1), and things will
work if it's not installed, but if set to on it'll refuse to work if not
enabled. Similar to how huge pages work now.
We could do that, but I'd be more inclined just to let JIT be
magically enabled. In general, if a user could do 'yum install ip4r'
(for example) and have that Just Work without any further database
configuration, I think a lot of people would consider that to be a
huge improvement. Unfortunately we can't really do that for various
reasons, the biggest of which is that there's no way for installing an
OS package to modify the internal state of a database that may not
even be running at the time. But as a general principle, I think
having to configure both the OS and the DB is an anti-feature, and
that if installing an extra package is sufficient to get the
new-and-improved behavior, users will like it. Bonus points if it
doesn't require a server restart.
You bet. It'd be helpful to have some obvious, well advertised ways
to determine when it's enabled and when it isn't, and to have a
straightforward process to determine what to fix when it's not enabled
and the user thinks it ought to be though.
merlin
On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
The same problem takes place with old versions of GCC: I have to upgrade GCC
to 7.2 to make it possible to compile this code.
The problem is not in the compiler itself, but in the libc++ headers.
How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0
and gcc-5.4 installed. Do I need to compile with clang or gcc? Any
CXXFLAGS required?
Regards,
Jeff Davis
On 2018-02-01 09:32:17 -0800, Jeff Davis wrote:
On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
The same problem takes place with old versions of GCC: I have to upgrade GCC
to 7.2 to make it possible to compile this code.
The problem is not in the compiler itself, but in the libc++ headers.
How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0
and gcc-5.4 installed. Do I need to compile with clang or gcc? Any
CXXFLAGS required?
Just to understand: You're running into the issue with the header being
included from within the extern "C" {}? Hm, I've pushed a quick fix for
that.
Other than that, you can compile with either gcc or clang, but clang
needs to be available. It will be guessed from PATH if clang, clang-5.0
or clang-4.0 (checked in that order) exists; llvm-config and
llvm-config-5.0 are guessed similarly. Passing LLVM_CONFIG=, CLANG= or
CXX= as an argument to configure overrides the guesses. E.g.
./configure --with-llvm LLVM_CONFIG=~/build/llvm/5/opt/install/bin/llvm-config
is what I use, although I also add:
LDFLAGS='-Wl,-rpath,/home/andres/build/llvm/5/opt/install/lib'
so I don't have to install llvm anywhere the system knows about.
Greetings,
Andres Freund
On Fri, Feb 2, 2018 at 2:05 PM, Andres Freund <andres@anarazel.de> wrote:
On 2018-02-01 09:32:17 -0800, Jeff Davis wrote:
On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
The same problem takes place with old versions of GCC: I have to upgrade GCC
to 7.2 to make it possible to compile this code.
The problem is not in the compiler itself, but in the libc++ headers.
How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0
and gcc-5.4 installed. Do I need to compile with clang or gcc? Any
CXXFLAGS required?
Just to understand: You're running into the issue with the header being
included from within the extern "C" {}? Hm, I've pushed a quick fix for
that.
That change wasn't quite enough: to get this building against libc++
(Clang's native stdlib) I also needed this change to llvmjit.h so that
<llvm-c/Types.h> wouldn't be included with the wrong linkage (perhaps
you can find a less ugly way):
+#ifdef __cplusplus
+}
+#endif
#include <llvm-c/Types.h>
+#ifdef __cplusplus
+extern "C"
+{
+#endif
Other than that, you can compile with both gcc or clang, but clang needs
to be available. Will be guessed from PATH if clang clang-5.0 clang-4.0
(in that order) exist, similar with llvm-config llvm-config-5.0 being
guessed. LLVM_CONFIG/CLANG/CXX= as an argument to configure overrides
both of that. E.g.
./configure --with-llvm LLVM_CONFIG=~/build/llvm/5/opt/install/bin/llvm-config
is what I use, although I also add:
LDFLAGS='-Wl,-rpath,/home/andres/build/llvm/5/opt/install/lib'
so I don't have to install llvm anywhere the system knows about.
BTW if you're building with clang (vendor compiler on at least macOS
and FreeBSD) you'll probably need CXXFLAGS=-std=c++11 (or later
standard) because it's still defaulting to '98.
--
Thomas Munro
http://www.enterprisedb.com
Another small thing which might be environmental... llvmjit_types.bc
is getting installed into ${prefix}/lib here, but you're looking for
it in ${prefix}/lib/postgresql:
gmake[3]: Entering directory '/usr/home/munro/projects/postgres/src/backend/lib'
/usr/bin/install -c -m 644 llvmjit_types.bc '/home/munro/install/lib'
postgres=# set jit_above_cost = 0;
SET
postgres=# set jit_expressions = on;
SET
postgres=# select 4 + 4;
ERROR: LLVMCreateMemoryBufferWithContentsOfFile(/usr/home/munro/install/lib/postgresql/llvmjit_types.bc)
failed: No such file or directory
$ mv ~/install/lib/llvmjit_types.bc ~/install/lib/postgresql/
postgres=# select 4 + 4;
?column?
----------
8
(1 row)
--
Thomas Munro
http://www.enterprisedb.com
On Fri, Feb 2, 2018 at 5:11 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
Another small thing which might be environmental... llvmjit_types.bc
is getting installed into ${prefix}/lib here, but you're looking for
it in ${prefix}/lib/postgresql:
Is there something broken about my installation? I see simple
arithmetic expressions apparently compiling and working but I can
easily find stuff that breaks... so far I think it's anything
involving string literals:
postgres=# set jit_above_cost = 0;
SET
postgres=# select quote_ident('x');
ERROR: failed to resolve name MakeExpandedObjectReadOnlyInternal
Well actually just select 'hello world' does it. I've attached a backtrace.
Tab completion is broken for me with jit_above_cost = 0 due to
tab-complete.c queries failing with various other errors including:
set <tab>:
ERROR: failed to resolve name ExecEvalScalarArrayOp
update <tab>:
ERROR: failed to resolve name quote_ident
show <tab>:
ERROR: failed to resolve name slot_getsomeattrs
I wasn't sure from your status message how much of this is expected at
this stage...
This is built from:
commit 302b7a284d30fb0e00eb5f0163aa933d4d9bea10 (HEAD -> jit, andresfreund/jit)
... plus the extern "C" tweak I posted earlier to make my clang 4.0
compiler happy, built on a FreeBSD 11.1 box with:
./configure --prefix=/home/munro/install/ --enable-tap-tests
--enable-cassert --enable-debug --enable-depend --with-llvm CC="ccache
cc" CXX="ccache c++" CXXFLAGS="-std=c++11"
LLVM_CONFIG=/usr/local/llvm50/bin/llvm-config
--with-libraries="/usr/local/lib" --with-includes="/usr/local/include"
The clang that was used for bitcode was the system /usr/bin/clang,
version 4.0. Is it a problem that I used that for compiling the
bitcode, but LLVM5 for JIT? I actually tried
CLANG=/usr/local/llvm50/bin/clang but ran into weird failures I
haven't got to the bottom of at ThinLink time so I couldn't get as far
as a running system.
I installed llvm50 from a package. I did need to make a tiny tweak by
hand: in src/Makefile.global, llvm-config --system-libs had said
-l/usr/lib/libexecinfo.so which wasn't linking and looks wrong to me
so I changed it to -lexecinfo, noted that it worked and reported a bug
upstream: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225621
--
Thomas Munro
http://www.enterprisedb.com
On Thu, Feb 1, 2018 at 5:05 PM, Andres Freund <andres@anarazel.de> wrote:
Just to understand: You're running into the issue with the header being
included from within the extern "C" {}? Hm, I've pushed a quick fix for
that.
Other than that, you can compile with both gcc or clang, but clang needs
to be available. Will be guessed from PATH if clang clang-5.0 clang-4.0
(in that order) exist, similar with llvm-config llvm-config-5.0 being
guessed. LLVM_CONFIG/CLANG/CXX= as an argument to configure overrides
both of that. E.g.
./configure --with-llvm LLVM_CONFIG=~/build/llvm/5/opt/install/bin/llvm-config
is what I use, although I also add:
LDFLAGS='-Wl,-rpath,/home/andres/build/llvm/5/opt/install/lib'
so I don't have to install llvm anywhere the system knows about.
On Ubuntu 16.04
SHA1: 302b7a284
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.6) 5.4.0 20160609
packages: llvm-5.0 llvm-5.0-dev llvm-5.0-runtime libllvm-5.0
clang-5.0 libclang-common-5.0-dev libclang1-5.0
./configure --with-llvm --prefix=/home/jdavis/install/pgsql-dev
...
checking for llvm-config... no
checking for llvm-config-5.0... llvm-config-5.0
checking for clang... no
checking for clang-5.0... clang-5.0
checking for LLVMOrcGetSymbolAddressIn... no
checking for LLVMGetHostCPUName... no
checking for LLVMOrcRegisterGDB... no
checking for LLVMOrcRegisterPerf... no
checking for LLVMOrcUnregisterPerf... no
...
That encounters errors like:
/usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This file
requires compiler and library support for the ISO C++ 2011 standard.
This support must be enabled with the -std=c++11 or -std=gnu++11
compiler options.
...
/usr/include/c++/5/cmath:505:22: error: conflicting declaration of C
function ‘long double
...
/usr/include/c++/5/cmath:926:3: error: template with C linkage
...
So I reconfigure with:
CXXFLAGS="-std=c++11" ./configure --with-llvm
--prefix=/home/jdavis/install/pgsql-dev
I think that got rid of the first error, but the other errors remain.
I also tried installing libc++-dev and using CC=clang-5.0
CXX=clang++-5.0 and with CXXFLAGS="-std=c++11 -stdlib=libc++" but I am
not making much progress, I'm still getting:
/usr/include/c++/v1/cmath:316:1: error: templates must have C++ linkage
I suggest that you share your exact configuration so we can get past
this for now, and you can work on the build issues in the background.
We can't be the first ones with this problem; maybe you can just ask
on an LLVM channel what the right thing to do is that will work on a
variety of machines (or at least reliably detect the problem at
configure time)?
Regards,
Jeff Davis
On Fri, Feb 2, 2018 at 7:06 PM, Jeff Davis <pgsql@j-davis.com> wrote:
/usr/include/c++/5/cmath:505:22: error: conflicting declaration of C
function ‘long double
...
/usr/include/c++/5/cmath:926:3: error: template with C linkage
I suspect you can fix these with this change:
+#ifdef __cplusplus
+}
+#endif
#include <llvm-c/Types.h>
+#ifdef __cplusplus
+extern "C"
+{
+#endif
... in llvmjit.h.
--
Thomas Munro
http://www.enterprisedb.com
On Thu, Feb 1, 2018 at 10:09 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
On Fri, Feb 2, 2018 at 7:06 PM, Jeff Davis <pgsql@j-davis.com> wrote:
/usr/include/c++/5/cmath:505:22: error: conflicting declaration of C
function ‘long double
...
/usr/include/c++/5/cmath:926:3: error: template with C linkage
I suspect you can fix these with this change:
+#ifdef __cplusplus
+}
+#endif
#include <llvm-c/Types.h>
+#ifdef __cplusplus
+extern "C"
+{
+#endif
... in llvmjit.h.
Thanks! That worked, but I had to remove the "-stdlib=libc++" also,
which was causing me problems.
Regards,
Jeff Davis
Hi,
On 2018-02-02 18:22:34 +1300, Thomas Munro wrote:
Is there something broken about my installation? I see simple
arithmetic expressions apparently compiling and working but I can
easily find stuff that breaks... so far I think it's anything
involving string literals:
That definitely should all work. Did you compile with lto and force it
to internalize all symbols or such?
postgres=# set jit_above_cost = 0;
SET
postgres=# select quote_ident('x');
ERROR: failed to resolve name MakeExpandedObjectReadOnlyInternal
...
The clang that was used for bitcode was the system /usr/bin/clang,
version 4.0. Is it a problem that I used that for compiling the
bitcode, but LLVM5 for JIT?
No, I did that locally without problems.
I actually tried CLANG=/usr/local/llvm50/bin/clang but ran into weird
failures I haven't got to the bottom of at ThinLink time so I couldn't
get as far as a running system.
So you had issues at the clang 5 level rather than with this patchset,
do I understand correctly?
I installed llvm50 from a package. I did need to make a tiny tweak by
hand: in src/Makefile.global, llvm-config --system-libs had said
-l/usr/lib/libexecinfo.so which wasn't linking and looks wrong to me
so I changed it to -lexecinfo, noted that it worked and reported a bug
upstream: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225621
Yea, that seems outside of my / our hands.
- Andres
On 2018-02-01 22:20:01 -0800, Jeff Davis wrote:
Thanks! That worked, but I had to remove the "-stdlib=libc++" also,
which was causing me problems.
That'll be gone as soon as I finish the shlib thing. Will hope to have
something over the weekend. Right now I'm at FOSDEM and need to prepare
a talk for tomorrow.
Greetings,
Andres Freund
On Monday, January 29, 2018 10:53:50 AM CET Andres Freund wrote:
Hi,
On 2018-01-23 23:20:38 -0800, Andres Freund wrote:
== Code ==
As the patchset is large (500kb) and I'm still quickly evolving it, I do
not yet want to attach it. The git tree is at
https://git.postgresql.org/git/users/andresfreund/postgres.git
in the jit branch
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
I've just pushed an updated and rebased version of the tree:
- Split the large "jit infrastructure" commits into a number of smaller
commits
- Split the C++ file
- Dropped some of the performance stuff done to heaptuple.c - that was
mostly to make performance comparisons a bit more interesting, but
doesn't seem important enough to deal with.
- Added a commit renaming datetime.h symbols so they don't conflict with
LLVM variables anymore, removing ugly #undef PM/#define PM dance
around includes. Will post separately.
- Reduced the number of pointer constants in the generated LLVM IR, by
doing more getelementptr accesses (stem from before the time types
were automatically synced)
- Increased number of comments a bit
There's a jit-before-rebase-2018-01-29 tag, for the state of the tree
before the rebase.
Regards,
Andres
Hi
I have successfully built the JIT branch against LLVM 4.0.1 on Debian testing.
This is not enough for Debian stable (LLVM 3.9 is the latest available there),
but it's a first step.
I've split the patch in four files. The first three fix the build issues, the
last one fixes a runtime issue.
I think they are small enough to not be a burden for you in your developments.
But if you don't want to carry these ifdefs right now, I maintain them in a
branch on a personal git and rebase as frequently as I can.
LLVM 3.9 support isn't going to be hard, but I prefer splitting. I also hope
this will help more people test this wonderful toy… :)
Regards
Pierre
Attachments:
0001-Add-support-for-LLVM4-in-llvmjit.c.patch (text/x-patch; charset=UTF-8)
From 770104331a36a8d207053227b850396f1392939a Mon Sep 17 00:00:00 2001
From: Pierre <pierre.ducroquet@people-doc.com>
Date: Fri, 2 Feb 2018 09:11:55 +0100
Subject: [PATCH 1/4] Add support for LLVM4 in llvmjit.c
---
src/backend/lib/llvmjit.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/src/backend/lib/llvmjit.c b/src/backend/lib/llvmjit.c
index 8e5ba94c98..d0c5537610 100644
--- a/src/backend/lib/llvmjit.c
+++ b/src/backend/lib/llvmjit.c
@@ -230,12 +230,19 @@ llvm_get_function(LLVMJitContext *context, const char *funcname)
addr = 0;
if (LLVMOrcGetSymbolAddressIn(handle->stack, &addr, handle->orc_handle, mangled))
- elog(ERROR, "failed to lookup symbol");
+ elog(ERROR, "failed to lookup symbol %s", mangled);
if (addr)
return (void *) addr;
}
#endif
+#if LLVM_VERSION_MAJOR < 5
+ if ((addr = LLVMOrcGetSymbolAddress(llvm_opt0_orc, mangled)))
+ return (void *) addr;
+ if ((addr = LLVMOrcGetSymbolAddress(llvm_opt3_orc, mangled)))
+ return (void *) addr;
+ elog(ERROR, "failed to lookup symbol %s for %s", mangled, funcname);
+#else
if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, &addr, mangled))
elog(ERROR, "failed to lookup symbol");
if (addr)
@@ -244,7 +251,7 @@ llvm_get_function(LLVMJitContext *context, const char *funcname)
elog(ERROR, "failed to lookup symbol");
if (addr)
return (void *) addr;
-
+#endif
elog(ERROR, "failed to JIT: %s", funcname);
return NULL;
@@ -380,11 +387,21 @@ llvm_compile_module(LLVMJitContext *context)
* faster instruction selection mechanism is used.
*/
{
- LLVMSharedModuleRef smod;
instr_time tb, ta;
/* emit the code */
INSTR_TIME_SET_CURRENT(ta);
+#if LLVM_VERSION_MAJOR < 5
+ orc_handle = LLVMOrcAddEagerlyCompiledIR(compile_orc, context->module,
+ llvm_resolve_symbol, NULL);
+ if (!orc_handle)
+ {
+ elog(ERROR, "failed to jit module");
+ }
+#else
+ LLVMSharedModuleRef smod;
+
+ LLVMSharedModuleRef smod;
smod = LLVMOrcMakeSharedModule(context->module);
if (LLVMOrcAddEagerlyCompiledIR(compile_orc, &orc_handle, smod,
llvm_resolve_symbol, NULL))
@@ -392,6 +409,7 @@ llvm_compile_module(LLVMJitContext *context)
elog(ERROR, "failed to jit module");
}
LLVMOrcDisposeSharedModuleRef(smod);
+#endif
INSTR_TIME_SET_CURRENT(tb);
INSTR_TIME_SUBTRACT(tb, ta);
ereport(DEBUG1, (errmsg("time to emit: %.3fs",
--
2.15.1
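A general caveat with version guards like the ones in this patch: inside #if, the preprocessor treats an identifier that is not defined as 0, so a misspelled macro name silently selects a branch instead of failing the build. A minimal sketch of that behaviour, using made-up DEMO_* macros rather than the real LLVM ones:

```cpp
#include <cassert>

/* Hypothetical stand-in for LLVM_VERSION_MAJOR; pretend we build against 5. */
#define DEMO_VERSION_MAJOR 5

/* DEMO_VERSION is intentionally never defined. In #if it evaluates to 0,
 * so this guard always takes the "old API" branch. */
#if DEMO_VERSION < 5
static const bool misspelled_guard_selects_old_api = true;
#else
static const bool misspelled_guard_selects_old_api = false;
#endif

/* The correctly spelled guard sees 5 and takes the "new API" branch. */
#if DEMO_VERSION_MAJOR < 5
static const bool correct_guard_selects_old_api = true;
#else
static const bool correct_guard_selects_old_api = false;
#endif
```

Building with -Wundef makes GCC and clang warn about the first guard, which is one way to catch this kind of slip.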
0002-Add-LLVM4-support-in-llvmjit_error.cpp.patch (text/x-patch; charset=UTF-8)
From 079ad7087e2ab106c0f04fa9056c93afa9a43b7c Mon Sep 17 00:00:00 2001
From: Pierre <pierre.ducroquet@people-doc.com>
Date: Fri, 2 Feb 2018 09:13:40 +0100
Subject: [PATCH 2/4] Add LLVM4 support in llvmjit_error.cpp
---
src/backend/lib/llvmjit_error.cpp | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/backend/lib/llvmjit_error.cpp b/src/backend/lib/llvmjit_error.cpp
index 70cecd114b..04e51b2a31 100644
--- a/src/backend/lib/llvmjit_error.cpp
+++ b/src/backend/lib/llvmjit_error.cpp
@@ -56,7 +56,9 @@ llvm_enter_fatal_on_oom(void)
if (fatal_new_handler_depth == 0)
{
old_new_handler = std::set_new_handler(fatal_system_new_handler);
+#if LLVM_VERSION_MAJOR > 4
llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
+#endif
llvm::install_fatal_error_handler(fatal_llvm_error_handler);
}
fatal_new_handler_depth++;
@@ -72,7 +74,9 @@ llvm_leave_fatal_on_oom(void)
if (fatal_new_handler_depth == 0)
{
std::set_new_handler(old_new_handler);
+#if LLVM_VERSION_MAJOR > 4
llvm::remove_bad_alloc_error_handler();
+#endif
llvm::remove_fatal_error_handler();
}
}
@@ -87,7 +91,9 @@ llvm_reset_fatal_on_oom(void)
if (fatal_new_handler_depth != 0)
{
std::set_new_handler(old_new_handler);
+#if LLVM_VERSION_MAJOR > 4
llvm::remove_bad_alloc_error_handler();
+#endif
llvm::remove_fatal_error_handler();
}
fatal_new_handler_depth = 0;
--
2.15.1
0003-Add-LLVM4-support-in-llvmjit_inline.cpp.patch (text/x-patch; charset=UTF-8)
From 51cc99259dc28120309e2a99f8585907d0baae06 Mon Sep 17 00:00:00 2001
From: Pierre <pierre.ducroquet@people-doc.com>
Date: Fri, 2 Feb 2018 09:23:56 +0100
Subject: [PATCH 3/4] Add LLVM4 support in llvmjit_inline.cpp
---
src/backend/lib/llvmjit_inline.cpp | 36 ++++++++++++++++++++++++++++++++++--
1 file changed, 34 insertions(+), 2 deletions(-)
diff --git a/src/backend/lib/llvmjit_inline.cpp b/src/backend/lib/llvmjit_inline.cpp
index 151198547a..8a747cbfc0 100644
--- a/src/backend/lib/llvmjit_inline.cpp
+++ b/src/backend/lib/llvmjit_inline.cpp
@@ -100,6 +100,13 @@ llvm_inline(LLVMModuleRef M)
llvm_execute_inline_plan(mod, globalsToInline.get());
}
+#if LLVM_VERSION_MAJOR < 5
+bool operator!(const llvm::ValueInfo &vi) {
+ return !( (vi.Kind == llvm::ValueInfo::VI_GUID && vi.TheValue.Id)
+ || (vi.Kind == llvm::ValueInfo::VI_Value && vi.TheValue.GV));
+}
+#endif
+
/*
* Build information necessary for inlining external function references in
* mod.
@@ -146,7 +153,14 @@ llvm_build_inline_plan(llvm::Module *mod)
if (threshold == -1)
continue;
+#if LLVM_VERSION_MAJOR > 4
llvm::ValueInfo funcVI = llvm_index->getValueInfo(funcGUID);
+#else
+ const llvm::const_gvsummary_iterator &I = llvm_index->findGlobalValueSummaryList(funcGUID);
+ if (I == llvm_index->end())
+ continue;
+ llvm::ValueInfo funcVI = llvm::ValueInfo(I->first);
+#endif
/* if index doesn't know function, we don't have a body, continue */
if (!funcVI)
@@ -157,7 +171,12 @@ llvm_build_inline_plan(llvm::Module *mod)
* look up module(s), check if function actually is defined (there
* could be hash conflicts).
*/
+#if LLVM_VERSION_MAJOR > 4
for (const auto &gvs : funcVI.getSummaryList())
+#else
+ auto it_gvs = llvm_index->findGlobalValueSummaryList(funcVI.getGUID());
+ for (const auto &gvs: it_gvs->second)
+#endif
{
const llvm::FunctionSummary *fs;
llvm::StringRef modPath = gvs->modulePath();
@@ -318,9 +337,14 @@ llvm_execute_inline_plan(llvm::Module *mod, ImportMapTy *globalsToInline)
}
+#if LLVM_VERSION_MAJOR > 4
+#define IRMOVE_PARAMS , /*IsPerformingImport=*/false
+#else
+#define IRMOVE_PARAMS , /*LinkModuleInlineAsm=*/false, /*IsPerformingImport=*/false
+#endif
if (Mover.move(std::move(importMod), GlobalsToImport.getArrayRef(),
- [](llvm::GlobalValue &, llvm::IRMover::ValueAdder) {},
- /*IsPerformingImport=*/false))
+ [](llvm::GlobalValue &, llvm::IRMover::ValueAdder) {}
+ IRMOVE_PARAMS))
elog(ERROR, "function import failed with linker error");
}
}
@@ -619,9 +643,17 @@ llvm_load_index(void)
elog(ERROR, "failed to open %s: %s", subpath,
EC.message().c_str());
llvm::MemoryBufferRef ref(*MBOrErr.get().get());
+#if LLVM_VERSION_MAJOR > 4
llvm::Error e = llvm::readModuleSummaryIndex(ref, *index, 0);
if (e)
elog(ERROR, "could not load summary at %s", subpath);
+#else
+ std::unique_ptr<llvm::ModuleSummaryIndex> subindex = std::move(llvm::getModuleSummaryIndex(ref).get());
+ if (!subindex)
+ elog(ERROR, "could not load summary at %s", subpath);
+ else
+ index->mergeFrom(std::move(subindex), 0);
+#endif
}
}
--
2.15.1
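The operator! shim in this patch is what lets the existing "if (!funcVI)" check compile unchanged against the older API. The following sketch shows the idea on a hypothetical DemoValueInfo struct. Its layout is only loosely modelled on the fields the patch touches and is not LLVM's real definition.

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical tagged value: either a numeric GUID or a pointer. */
struct DemoValueInfo
{
    enum KindTy { VI_GUID, VI_Value };
    KindTy Kind;
    union
    {
        std::uint64_t Id;
        const void *GV;
    } TheValue;
};

/* "Empty" means that the active union member is zero or null. */
bool operator!(const DemoValueInfo &vi)
{
    return !((vi.Kind == DemoValueInfo::VI_GUID && vi.TheValue.Id) ||
             (vi.Kind == DemoValueInfo::VI_Value && vi.TheValue.GV));
}

/* Helper constructors, made up for this example only. */
DemoValueInfo demo_from_guid(std::uint64_t id)
{
    DemoValueInfo vi;
    vi.Kind = DemoValueInfo::VI_GUID;
    vi.TheValue.Id = id;
    return vi;
}

DemoValueInfo demo_from_pointer(const void *p)
{
    DemoValueInfo vi;
    vi.Kind = DemoValueInfo::VI_Value;
    vi.TheValue.GV = p;
    return vi;
}
```

With this in place, calling code can keep writing "if (!vi) continue;" regardless of which branch of the version guard provided the type.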
0004-Don-t-emit-bitcode-depending-on-an-LLVM-5-function.patch (text/x-patch; charset=UTF-8)
From 77ee0a7bf15b2c962006b8a1d585f35830280eaf Mon Sep 17 00:00:00 2001
From: Pierre <pierre.ducroquet@people-doc.com>
Date: Fri, 2 Feb 2018 10:34:09 +0100
Subject: [PATCH 4/4] Don't emit bitcode depending on an LLVM 5+ function
---
src/backend/executor/execExprCompile.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/src/backend/executor/execExprCompile.c b/src/backend/executor/execExprCompile.c
index 4d6304f748..d129ea7828 100644
--- a/src/backend/executor/execExprCompile.c
+++ b/src/backend/executor/execExprCompile.c
@@ -173,7 +173,11 @@ get_LifetimeEnd(LLVMModuleRef mod)
LLVMTypeRef sig;
LLVMValueRef fn;
LLVMTypeRef param_types[2];
+#if LLVM_VERSION_MAJOR > 4
const char *nm = "llvm.lifetime.end.p0i8";
+#else
+ const char *nm = "llvm.lifetime.end";
+#endif
fn = LLVMGetNamedFunction(mod, nm);
if (fn)
--
2.15.1
Hi,
On 2018-02-02 18:22:34 +1300, Thomas Munro wrote:
The clang that was used for bitcode was the system /usr/bin/clang,
version 4.0. Is it a problem that I used that for compiling the
bitcode, but LLVM5 for JIT? I actually tried
CLANG=/usr/local/llvm50/bin/clang but ran into weird failures I
haven't got to the bottom of at ThinLink time so I couldn't get as far
as a running system.
You're using thinlto to compile pg? Could you provide what you pass to
configure for that? IIRC I tried that a while ago and ran into some
issues with us creating archives (libpgport, libpgcommon).
Greetings,
Andres Freund
On Friday, February 2, 2018 10:48:16 AM CET Pierre Ducroquet wrote:
On Monday, January 29, 2018 10:53:50 AM CET Andres Freund wrote:
Hi,
On 2018-01-23 23:20:38 -0800, Andres Freund wrote:
== Code ==
As the patchset is large (500kb) and I'm still quickly evolving it, I do
not yet want to attach it. The git tree is at
https://git.postgresql.org/git/users/andresfreund/postgres.git
in the jit branch
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
I've just pushed an updated and rebased version of the tree:
- Split the large "jit infrastructure" commits into a number of smaller
commits
- Split the C++ file
- Dropped some of the performance stuff done to heaptuple.c - that was
mostly to make performance comparisons a bit more interesting, but
doesn't seem important enough to deal with.
- Added a commit renaming datetime.h symbols so they don't conflict with
LLVM variables anymore, removing ugly #undef PM/#define PM dance
around includes. Will post separately.
- Reduced the number of pointer constants in the generated LLVM IR, by
doing more getelementptr accesses (stem from before the time types
were automatically synced)
- Increased number of comments a bit
There's a jit-before-rebase-2018-01-29 tag, for the state of the tree
before the rebase.
Regards,
Andres
Hi
I have successfully built the JIT branch against LLVM 4.0.1 on Debian
testing. This is not enough for Debian stable (LLVM 3.9 is the latest
available there), but it's a first step.
I've split the patch in four files. The first three fix the build issues,
the last one fixes a runtime issue.
I think they are small enough to not be a burden for you in your
developments. But if you don't want to carry these ifdefs right now, I
maintain them in a branch on a personal git and rebase as frequently as I
can.
LLVM 3.9 support isn't going to be hard, but I prefer splitting. I also hope
this will help more people test this wonderful toy… :)
Regards
Pierre
For LLVM 3.9, only small changes were needed.
I've attached the patches to this email.
I only did very basic, primitive testing, but it seems to work.
I'll do more testing in the next days.
Pierre
Attachments:
0001-Fix-building-with-LLVM-3.9.patch (text/x-patch; charset=UTF-8)
From 5ca9594a7f52b7daab8562293010fe8c807107ee Mon Sep 17 00:00:00 2001
From: Pierre <pierre.ducroquet@people-doc.com>
Date: Fri, 2 Feb 2018 11:29:45 +0100
Subject: [PATCH 1/2] Fix building with LLVM 3.9
---
src/backend/lib/llvmjit_inline.cpp | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/src/backend/lib/llvmjit_inline.cpp b/src/backend/lib/llvmjit_inline.cpp
index 8a747cbfc0..a785261bea 100644
--- a/src/backend/lib/llvmjit_inline.cpp
+++ b/src/backend/lib/llvmjit_inline.cpp
@@ -37,7 +37,12 @@ extern "C"
#include <llvm/ADT/StringSet.h>
#include <llvm/ADT/StringMap.h>
#include <llvm/Analysis/ModuleSummaryAnalysis.h>
+#if LLVM_VERSION_MAJOR > 3
#include <llvm/Bitcode/BitcodeReader.h>
+#else
+#include "llvm/Bitcode/ReaderWriter.h"
+#include "llvm/Support/Error.h"
+#endif
#include <llvm/IR/CallSite.h>
#include <llvm/IR/DebugInfo.h>
#include <llvm/IR/IntrinsicInst.h>
@@ -100,7 +105,12 @@ llvm_inline(LLVMModuleRef M)
llvm_execute_inline_plan(mod, globalsToInline.get());
}
-#if LLVM_VERSION_MAJOR < 5
+#if LLVM_VERSION_MAJOR < 4
+bool operator!(const llvm::ValueInfo &vi) {
+ return !( (vi.Kind == llvm::ValueInfo::VI_GUID && vi.TheValue.Id)
+ || (vi.Kind == llvm::ValueInfo::VI_Value && vi.TheValue.V));
+}
+#elif LLVM_VERSION_MAJOR < 5
bool operator!(const llvm::ValueInfo &vi) {
return !( (vi.Kind == llvm::ValueInfo::VI_GUID && vi.TheValue.Id)
|| (vi.Kind == llvm::ValueInfo::VI_Value && vi.TheValue.GV));
@@ -188,12 +198,15 @@ llvm_build_inline_plan(llvm::Module *mod)
funcName.data(),
modPath.data());
+// XXX Missing in LLVM < 4.0 ?
+#if LLVM_VERSION_MAJOR > 3
if (gvs->notEligibleToImport())
{
elog(DEBUG1, "uneligible to import %s due to summary",
funcName.data());
continue;
}
+#endif
if ((int) fs->instCount() > threshold)
{
@@ -339,8 +352,10 @@ llvm_execute_inline_plan(llvm::Module *mod, ImportMapTy *globalsToInline)
#if LLVM_VERSION_MAJOR > 4
#define IRMOVE_PARAMS , /*IsPerformingImport=*/false
-#else
+#elif LLVM_VERSION_MAJOR > 3
#define IRMOVE_PARAMS , /*LinkModuleInlineAsm=*/false, /*IsPerformingImport=*/false
+#else
+#define IRMOVE_PARAMS
#endif
if (Mover.move(std::move(importMod), GlobalsToImport.getArrayRef(),
[](llvm::GlobalValue &, llvm::IRMover::ValueAdder) {}
@@ -648,7 +663,11 @@ llvm_load_index(void)
if (e)
elog(ERROR, "could not load summary at %s", subpath);
#else
+#if LLVM_VERSION_MAJOR > 3
std::unique_ptr<llvm::ModuleSummaryIndex> subindex = std::move(llvm::getModuleSummaryIndex(ref).get());
+#else
+ std::unique_ptr<llvm::ModuleSummaryIndex> subindex = std::move(llvm::getModuleSummaryIndex(ref, [](const llvm::DiagnosticInfo &) {}).get());
+#endif
if (!subindex)
elog(ERROR, "could not load summary at %s", subpath);
else
--
2.15.1
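The IRMOVE_PARAMS trick used in this patch, a macro that expands to the trailing arguments including the leading comma, can be shown in isolation. In the sketch below DEMO_API_LEVEL, demo_move and DEMO_MOVE_EXTRA_ARGS are all invented for illustration. The three demo_move signatures only mimic the shape of an API whose parameter list changed between releases.

```cpp
#include <cassert>

/* Hypothetical API level; stands in for a library version macro. */
#define DEMO_API_LEVEL 4

/* The callee's signature differs per API level. Each variant returns
 * n * 10 plus the number of extra arguments it takes, so the caller can
 * tell which one was compiled in. */
#if DEMO_API_LEVEL > 4
static int demo_move(int n, bool is_performing_import)
{
    (void) is_performing_import;
    return n * 10 + 1;
}
#define DEMO_MOVE_EXTRA_ARGS , /*is_performing_import=*/false
#elif DEMO_API_LEVEL > 3
static int demo_move(int n, bool link_inline_asm, bool is_performing_import)
{
    (void) link_inline_asm;
    (void) is_performing_import;
    return n * 10 + 2;
}
#define DEMO_MOVE_EXTRA_ARGS , /*link_inline_asm=*/false, /*is_performing_import=*/false
#else
static int demo_move(int n)
{
    return n * 10;
}
#define DEMO_MOVE_EXTRA_ARGS
#endif

/* A single call site serves all three signatures. */
int demo_call_move(int n)
{
    return demo_move(n DEMO_MOVE_EXTRA_ARGS);
}
```

The benefit is that the call itself is written once. The cost is a call site that looks odd at first glance, so a comment next to the macro definition is probably worth having.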
0002-Fix-segfault-with-LLVM-3.9.patch (text/x-patch; charset=UTF-8)
From be1f76a141ab346b6ba8d9e8b38c81a40a427dc7 Mon Sep 17 00:00:00 2001
From: Pierre <pierre.ducroquet@people-doc.com>
Date: Fri, 2 Feb 2018 11:29:57 +0100
Subject: [PATCH 2/2] Fix segfault with LLVM 3.9
---
src/backend/lib/llvmjit.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/backend/lib/llvmjit.c b/src/backend/lib/llvmjit.c
index d0c5537610..ad9582182f 100644
--- a/src/backend/lib/llvmjit.c
+++ b/src/backend/lib/llvmjit.c
@@ -462,12 +462,12 @@ llvm_session_initialize(void)
cpu = LLVMGetHostCPUName();
llvm_opt0_targetmachine =
- LLVMCreateTargetMachine(llvm_targetref, llvm_triple, cpu, NULL,
+ LLVMCreateTargetMachine(llvm_targetref, llvm_triple, cpu, "",
LLVMCodeGenLevelNone,
LLVMRelocDefault,
LLVMCodeModelJITDefault);
llvm_opt3_targetmachine =
- LLVMCreateTargetMachine(llvm_targetref, llvm_triple, cpu, NULL,
+ LLVMCreateTargetMachine(llvm_targetref, llvm_triple, cpu, "",
LLVMCodeGenLevelAggressive,
LLVMRelocDefault,
LLVMCodeModelJITDefault);
--
2.15.1
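The fix above replaces a NULL feature string with an empty one. Where such a string comes from a variable rather than a literal, the same guard can be written once as a tiny helper. demo_features_or_empty is a made-up name, and this sketch does not try to establish which LLVM releases tolerate NULL here. It simply never passes NULL on.

```cpp
#include <cassert>
#include <cstring>

/* Return the given feature string, or "" when none was supplied, so that
 * the callee never sees a null pointer. */
const char *demo_features_or_empty(const char *features)
{
    return features != nullptr ? features : "";
}
```

A caller would then pass demo_features_or_empty(features) in the position where the patch passes "".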
On Mon, Jan 29, 2018 at 1:53 AM, Andres Freund <andres@anarazel.de> wrote:
https://git.postgresql.org/git/users/andresfreund/postgres.git
There's a patch in there to change the scan order. I suggest that you
rename the GUC "synchronize_seqscans" to something more generic like
"optimize_scan_order", and use it to control your feature as well
(after all, it's the same trade-off: weird scan order vs.
performance). Then, go ahead and commit it. FWIW I see about a 7%
boost on my laptop[1] from that patch on master, without JIT or
anything else.
I also see you dropped "7ae518bf Centralize slot deforming logic a
bit.". Was that intentional? Do we want it? I think I saw about a 2%
gain here over master, but when I applied it on top of the fast scans
it did not seem to add anything on top of fast scans. Seems
reproducible, but I don't have an explanation.
And you are probably already working on this, but it would be helpful
to get the following two patches in also:
* 3c22065f Do execGrouping via expression eval
* a9dde4aa Allow tupleslots to have a fixed tupledesc
I took a brief look at those two, but will review them in more detail.
Regards,
Jeff Davis
[1]: Simple scan with simple predicate on 50M tuples, after pg_prewarm.
Hi,
On 2018-02-02 18:21:12 -0800, Jeff Davis wrote:
On Mon, Jan 29, 2018 at 1:53 AM, Andres Freund <andres@anarazel.de> wrote:
https://git.postgresql.org/git/users/andresfreund/postgres.git
There's a patch in there to change the scan order.
Yes - note it's "deactivated" at the moment in the series. I primarily
have it in there because I found profiles to be a lot more useful if
it's enabled, as otherwise the number of cache misses and related stalls
from heap accesses completely swamp everything else.
FWIW, there's http://archives.postgresql.org/message-id/20161030073655.rfa6nvbyk4w2kkpk%40alap3.anarazel.de
I suggest that you rename the GUC "synchronize_seqscans" to something
more generic like "optimize_scan_order", and use it to control your
feature as well (after all, it's the same trade-off: weird scan order
vs. performance). Then, go ahead and commit it. FWIW I see about a 7%
boost on my laptop[1] from that patch on master, without JIT or
anything else.
Yea, that's roughly the same magnitude of what I'm seeing, some queries
even bigger.
I'm not sure I want to commit this right now - ISTM we couldn't default
this to on without annoying a lot of people, and letting the performance
wins on the table by default seems like a shame. I think we should
probably either change the order we store things on the page by default
or only use the faster order if the scan above doesn't care about order
- the planner could figure that out easily.
I personally don't think it is necessary to get this committed at the
same time as the JIT stuff, so I'm not planning to push very hard on
that front. Should you be interested in taking it up, please feel
entirely free.
I also see you dropped "7ae518bf Centralize slot deforming logic a
bit.". Was that intentional? Do we want it?
The problem is that there's probably some controversial things in
there. I think the checks I dropped largely make no sense, but I don't
really want to push for that hard. I suspect we probably still want it,
but I do not want to put it into the critical path right now.
I think I saw about a 2% gain here over master, but when I applied it
on top of the fast scans it did not seem to add anything on top of
fast scans. Seems reproducible, but I don't have an explanation.
Yea, that makes sense. The primary reason the patch is beneficial is
that it centralizes the place where the HeapTupleHeader is accessed to a
single piece of code (slot_deform_tuple()). In a lot of cases that first
access will result in a cache miss in all layers, requiring a memory
access. In slot_getsomeattrs() there's very little that can be done in
an out-of-order manner, whereas slot_deform_tuple() can continue
execution a bit further. Also, the latter will then go and sequentially
access the rest (or a significant part of) the tuple, so a centralized
access is more prefetchable.
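To illustrate the "single place that touches the tuple" idea, here is a toy model. It is not PostgreSQL code: ToySlot, toy_deform_upto and toy_getattr are invented names, and the "tuple" is just a packed array of 32-bit integers with no header, null bitmap or variable-width columns. The point is only that every read of the raw bytes happens in one loop that walks the tuple front to back and remembers where it stopped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

/* Toy slot: raw tuple bytes plus a cache of already extracted values. */
struct ToySlot
{
    const unsigned char *tuple;       /* packed int32 attributes */
    std::size_t natts;                /* number of attributes in the tuple */
    std::size_t nvalid;               /* how many are already extracted */
    std::size_t offset;               /* byte offset at which to resume */
    std::vector<std::int32_t> values; /* extracted attributes */
};

/* Build the toy on-disk representation for a list of attributes. */
std::vector<unsigned char> toy_pack(const std::vector<std::int32_t> &attrs)
{
    std::vector<unsigned char> raw(attrs.size() * sizeof(std::int32_t));
    if (!attrs.empty())
        std::memcpy(raw.data(), attrs.data(), raw.size());
    return raw;
}

ToySlot toy_make_slot(const unsigned char *tuple, std::size_t natts)
{
    ToySlot slot;
    slot.tuple = tuple;
    slot.natts = natts;
    slot.nvalid = 0;
    slot.offset = 0;
    slot.values.assign(natts, 0);
    return slot;
}

/* The only function that reads raw tuple bytes. It continues sequentially
 * from where the previous call stopped. */
void toy_deform_upto(ToySlot &slot, std::size_t upto)
{
    if (upto > slot.natts)
        upto = slot.natts;
    for (; slot.nvalid < upto; ++slot.nvalid)
    {
        std::int32_t v;
        std::memcpy(&v, slot.tuple + slot.offset, sizeof v);
        slot.values[slot.nvalid] = v;
        slot.offset += sizeof v;
    }
}

/* Attribute access (1-based) never touches the tuple directly. */
std::int32_t toy_getattr(ToySlot &slot, std::size_t attnum)
{
    assert(attnum >= 1 && attnum <= slot.natts);
    if (attnum > slot.nvalid)
        toy_deform_upto(slot, attnum);
    return slot.values[attnum - 1];
}
```

Whether this shape actually helps depends on the cache and out-of-order effects described above, which a toy like this cannot demonstrate. It only shows the code structure being argued for.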
And you are probably already working on this, but it would be helpful
to get the following two patches in also:
* 3c22065f Do execGrouping via expression eval
* a9dde4aa Allow tupleslots to have a fixed tupledesc
Yes, I plan to resume working on whipping them into shape as soon as
I've finished the move to a shared library. This weekend I'm at fosdem,
so that's going to be after...
Thanks for looking!
Andres Freund
Hi,
On 2018-02-03 01:13:21 -0800, Andres Freund wrote:
On 2018-02-02 18:21:12 -0800, Jeff Davis wrote:
I think I saw about a 2% gain here over master, but when I applied it
on top of the fast scans it did not seem to add anything on top of
fast scans. Seems reproducible, but I don't have an explanation.
Yea, that makes sense. The primary reason the patch is beneficial is
that it centralizes the place where the HeapTupleHeader is accessed to a
single piece of code (slot_deform_tuple()). In a lot of cases that first
access will result in a cache miss in all layers, requiring a memory
access. In slot_getsomeattrs() there's very little that can be done in
an out-of-order manner, whereas slot_deform_tuple() can continue
execution a bit further. Also, the latter will then go and sequentially
access the rest (or a significant part of) the tuple, so a centralized
access is more prefetchable.
Oops missed part of the argument here: The reason that isn't that large
an effect anymore with the scan order patch applied is that suddenly the
accesses are, due to the better scan order, more likely to be cacheable
and prefetchable. So in that case the few additional instructions and
branches in slot_getsomeattrs/slot_getattr don't hurt as much
anymore. IIRC I could still show it up, but it's a much smaller win.
Greetings,
Andres Freund
Hi,
I've done some initial benchmarking on the branch over the last couple
of days, focusing on analytics workloads using the DBT-3 benchmark.
Attached are two spreadsheets with results from two machines (the same
two I use for all benchmarks), and a couple of charts illustrating the
impact of enabling different JIT options.
I did the tests with 10GB and 50GB data sets (load into database
generally increases the size by a factor of 2-3x). So at least on the
larger machine the 10GB dataset should be fully in memory. The numbers
are medians for 10 consecutive runs of each query, so the data tends to
be well cached.
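For readers who want to reproduce the "median of 10 consecutive runs" figure from their own timings, a minimal sketch of the calculation follows. demo_median is a made-up helper, and this is not the script used to produce the attached spreadsheets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

/* Median of a non-empty set of run times. The vector is taken by value
 * because sorting reorders it. */
double demo_median(std::vector<double> samples)
{
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    if (n % 2 == 1)
        return samples[n / 2];
    /* Even count: average the two middle values. */
    return (samples[n / 2 - 1] + samples[n / 2]) / 2.0;
}
```

Unlike the mean, the median is barely affected by a single slow first run on a cold cache, which fits the remark that the data tends to be well cached.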
In this round of tests I've disabled parallelism. Based on discussion
with Andres I've decided to repeat the tests with parallel queries
enabled - that's running now, and will take some time to complete.
According to the results, most of the DBT-3 queries see slight
improvement in the 5-10% range, but the JIT options vary depending on
the query. What surprised me quite a bit is that the improvement is way
more significant on the 50GB dataset (on both machines). I had expected
the opposite behavior, i.e. that the JIT impact will be more obvious on
the small dataset and then will diminish as I/O becomes more prominent.
Yet that's not the case, apparently. One possible explanation is that on
the 50GB data set the queries switch to plans that are more sensitive to
the JIT optimizations.
A couple of queries saw much more significant improvements - Q1 and Q20
got about 30%-40% faster, and I have no problem believing that other
queries may see even more significant benefits.
Other queries (Q19 and Q21) saw regressions - for Q19 it's relatively
harmless, I think. It's a short query and so the relative slowdown seems
somewhat worse than in absolute terms. Not sure what's going on for Q21,
though. But I think we'll need to look at the costing model, and try
tweaking it to make the right decision in those cases.
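The relative versus absolute distinction can be made concrete with a small calculation. The helper below is a sketch with made-up names, and the timings used with it further down are invented round numbers, not values from the attached spreadsheets.

```cpp
#include <cassert>

/* Change between a baseline run and a JIT run of the same query. */
struct DemoDelta
{
    double absolute_s;   /* extra seconds (negative means faster) */
    double relative_pct; /* the same change as a percentage of the baseline */
};

DemoDelta demo_compare(double baseline_s, double jit_s)
{
    assert(baseline_s > 0.0);
    DemoDelta d;
    d.absolute_s = jit_s - baseline_s;
    d.relative_pct = 100.0 * (jit_s - baseline_s) / baseline_s;
    return d;
}
```

With a hypothetical short query going from 0.5 s to 0.75 s, this reports a 50% regression that costs only 0.25 s, while a hypothetical long query going from 100 s to 110 s shows just 10% but costs 10 s. That is the sense in which a regression on a short query can look worse than it is.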
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
jit-results-bench.ods (application/vnd.oasis.opendocument.spreadsheet)
U_�����[�r�t����X�pe��?%7;'V�L�K�iE�a��
7�2-W���4 ��?�Y�<�������=����������C���.����T�h$� 1hZj���P?.����`<�~(��k��N'���������*���`���������G-��2��������nT'�7��ez��ru|�E�z�c= ,�����C����<}G����\G8{�3*��h��q9v��/�Lk����b�[�v9H�>�.;��mo�����[g��e�������k>�>|^�����p�Vm�T�p������3KW���J��2��S�
n�����k�MTYy�{L���o���j- .��Y��� 5 ��Z�W�kLa������
�����=N�����L�
����Ct�PK�P�� � PK �[CL Object 5/meta.xml���n�0E��
�v�G���*R7��]d�@���� ���(�����sF����p�+���J&��b��*�L����&"��!VE!8�\��Fi65f
���s)!��T�V�T�[j8U
����i:�'�J�����i(@��v��J����{�����t����V8&���.,�h�_��]+)�nA#>KOq��la�/t����w���a0d�m.������6��L��l5"���<F�4��K���Q:=�L��@`o����(dw=�G��nk��S��r�B�E;�!f^V�,`E�����x:��^���~�����~PK�A�5 g PK �[CL