Partition prune with stable Expr
Hi:
I find we can't prune partitions in the planner if the qual is a stable
function.
CREATE TABLE measurement (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
) PARTITION BY RANGE (logdate);
CREATE TABLE measurement_y2006m02 PARTITION OF measurement
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
postgres=# explain (costs off) select * from measurement
postgres-# where logdate = to_date('2006-03-02', 'yyyy-mm-dd');
QUERY PLAN
-----------------------------------------------------------------------------
Append
Subplans Removed: 1 <-- Here
-> Seq Scan on measurement_y2006m03 measurement_1
Filter: (logdate = to_date('2006-03-02'::text, 'yyyy-mm-dd'::text))
(4 rows)
IMO, we should do it. Why not? The attached is used to show the things
in my mind.
Btw, why the to_date function is declared as stable rather than immutable
since it always delivers the same result for the same inputs.
--
Best Regards
Andy Fan
Attachments:
v1-0001-Allow-planner-prune-partitionn-with-stable-Expr.patchapplication/octet-stream; name=v1-0001-Allow-planner-prune-partitionn-with-stable-Expr.patchDownload
From 0efa0c3244839587ea7a6e81e0fa6fa31f6dac10 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Mon, 28 Sep 2020 03:40:35 +0800
Subject: [PATCH v1] Allow planner prune partitionn with stable Expr
---
src/backend/partitioning/partprune.c | 99 ++++++++++++++-----
src/test/regress/expected/partition_prune.out | 27 +++--
2 files changed, 83 insertions(+), 43 deletions(-)
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6268623d56..7e183343fe 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -603,7 +603,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* and create a list of "partition pruning steps".
*
* 'target' tells whether to generate pruning steps for planning (use
- * immutable clauses only), or for executor startup (use any allowable
+ * immutable or stable clauses only), or for executor startup (use any allowable
* clause except ones containing PARAM_EXEC Params), or for executor
* per-scan pruning (use any allowable clause).
*
@@ -739,6 +739,51 @@ get_matching_partitions(PartitionPruneContext *context, List *pruning_steps)
*/
results = (PruneStepResult **)
palloc0(num_steps * sizeof(PruneStepResult *));
+
+ if (context->planstate == NULL)
+ {
+ /*
+ * This is still in PARTTARGET_PLANNER state, initialize expression
+ * state for each expression we need
+ * XXX: it is recreated in ExecInitPruningContext since the CurrentMemoryContext
+ * looks not good and context->ppcontext fails as well.
+ **/
+ context->exprstates = (ExprState **)
+ palloc0(sizeof(ExprState *) * num_steps * context->nparts);
+ foreach(lc, pruning_steps)
+ {
+ PartitionPruneStepOp *step = (PartitionPruneStepOp *) lfirst(lc);
+ ListCell *lc2;
+ int keyno;
+
+ /* not needed for other step kinds */
+ if (!IsA(step, PartitionPruneStepOp))
+ continue;
+
+ Assert(list_length(step->exprs) <= context->partnatts);
+
+ keyno = 0;
+ foreach(lc2, step->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc2);
+
+ /* not needed for Consts && Param for PARTTARGET_PLANNER stage*/
+ if (!IsA(expr, Const) && !IsA(expr, Param))
+ {
+ int stateidx = PruneCxtStateIdx(context->partnatts,
+ step->step.step_id,
+ keyno);
+
+ context->exprstates[stateidx] =
+ /* XXX: context->planstate is NULL at this stage, but
+ * looks everything is well per testing.
+ */
+ ExecInitExpr(expr, context->planstate);
+ }
+ keyno++;
+ }
+ }
+ }
foreach(lc, pruning_steps)
{
PartitionPruneStep *step = lfirst(lc);
@@ -1845,30 +1890,29 @@ match_clause_to_partition_key(GeneratePruningStepsContext *context,
*/
if (!IsA(expr, Const))
{
+ bool is_volatile = contain_volatile_functions((Node *) expr);
Bitmapset *paramids;
-
/*
- * When pruning in the planner, we only support pruning using
- * comparisons to constants. We cannot prune on the basis of
+ * When pruning in the planner, we support pruning using
+ * comparisons to constants or non volatile exprs. We cannot prune on the basis of
* anything that's not immutable. (Note that has_mutable_arg and
* has_exec_param do not get set for this target value.)
*/
- if (context->target == PARTTARGET_PLANNER)
- return PARTCLAUSE_UNSUPPORTED;
/*
- * We can never prune using an expression that contains Vars.
+ * And we must reject anything containing a volatile function.
+ * Stable functions are OK though.
*/
- if (contain_var_clause((Node *) expr))
+ if (is_volatile)
return PARTCLAUSE_UNSUPPORTED;
/*
- * And we must reject anything containing a volatile function.
- * Stable functions are OK though.
+ * We can never prune using an expression that contains Vars.
*/
- if (contain_volatile_functions((Node *) expr))
+ if (contain_var_clause((Node *) expr))
return PARTCLAUSE_UNSUPPORTED;
+
/*
* See if there are any exec Params. If so, we can only use this
* expression during per-scan pruning.
@@ -1895,13 +1939,6 @@ match_clause_to_partition_key(GeneratePruningStepsContext *context,
if (op_volatile(opno) != PROVOLATILE_IMMUTABLE)
{
context->has_mutable_op = true;
-
- /*
- * When pruning in the planner, we cannot prune with mutable
- * operators.
- */
- if (context->target == PARTTARGET_PLANNER)
- return PARTCLAUSE_UNSUPPORTED;
}
/*
@@ -3257,6 +3294,8 @@ perform_pruning_base_step(PartitionPruneContext *context,
*/
for (keyno = 0; keyno < context->partnatts; keyno++)
{
+ /* during PARTTARGET_PLANNER state, we will skip the Param expr */
+ bool skip_param = lc1 != NULL && context->planstate == NULL && IsA(lfirst(lc1), Param);
/*
* For hash partitioning, it is possible that values of some keys are
* not provided in operator clauses, but instead the planner found
@@ -3272,7 +3311,7 @@ perform_pruning_base_step(PartitionPruneContext *context,
if (keyno > nvalues && context->strategy == PARTITION_STRATEGY_RANGE)
break;
- if (lc1 != NULL)
+ if (lc1 != NULL && !skip_param)
{
Expr *expr;
Datum datum;
@@ -3583,15 +3622,21 @@ partkey_datum_from_expr(PartitionPruneContext *context,
{
ExprState *exprstate;
ExprContext *ectx;
-
- /*
- * We should never see a non-Const in a step unless we're running in
- * the executor.
- */
- Assert(context->planstate != NULL);
-
exprstate = context->exprstates[stateidx];
- ectx = context->planstate->ps_ExprContext;
+ if (context->planstate == NULL)
+ {
+ /* This happens on plan stage, so it must be an immutable or stable Expr */
+ Assert(!contain_volatile_functions((Node *)expr));
+ /* XXX: I don't know how to choose ExprContext for this purpose,
+ * just do it so */
+ ectx = makeNode(ExprContext);
+ ectx->ecxt_per_tuple_memory = context->ppccontext;
+ }
+ else
+ {
+ ectx = context->planstate->ps_ExprContext;
+ }
+
*value = ExecEvalExprSwitchContext(exprstate, ectx, isnull);
}
}
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 50d2a7e4b9..75bb086538 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -1864,13 +1864,11 @@ begin;
create function list_part_fn(int) returns int as $$ begin return $1; end;$$ language plpgsql stable;
-- Ensure pruning works using a stable function containing no Vars
explain (analyze, costs off, summary off, timing off) select * from list_part where a = list_part_fn(1);
- QUERY PLAN
-------------------------------------------------------------------
- Append (actual rows=1 loops=1)
- Subplans Removed: 3
- -> Seq Scan on list_part1 list_part_1 (actual rows=1 loops=1)
- Filter: (a = list_part_fn(1))
-(4 rows)
+ QUERY PLAN
+----------------------------------------------------------
+ Seq Scan on list_part1 list_part (actual rows=1 loops=1)
+ Filter: (a = list_part_fn(1))
+(2 rows)
-- Ensure pruning does not take place when the function has a Var parameter
explain (analyze, costs off, summary off, timing off) select * from list_part where a = list_part_fn(a);
@@ -2911,23 +2909,20 @@ select * from stable_qual_pruning where a < localtimestamp;
QUERY PLAN
--------------------------------------------------------------------------------------
Append (actual rows=0 loops=1)
- Subplans Removed: 1
-> Seq Scan on stable_qual_pruning1 stable_qual_pruning_1 (actual rows=0 loops=1)
Filter: (a < LOCALTIMESTAMP)
-> Seq Scan on stable_qual_pruning2 stable_qual_pruning_2 (actual rows=0 loops=1)
Filter: (a < LOCALTIMESTAMP)
-(6 rows)
+(5 rows)
-- timestamp < timestamptz comparison is only stable, not immutable
explain (analyze, costs off, summary off, timing off)
select * from stable_qual_pruning where a < '2000-02-01'::timestamptz;
- QUERY PLAN
---------------------------------------------------------------------------------------
- Append (actual rows=0 loops=1)
- Subplans Removed: 2
- -> Seq Scan on stable_qual_pruning1 stable_qual_pruning_1 (actual rows=0 loops=1)
- Filter: (a < 'Tue Feb 01 00:00:00 2000 PST'::timestamp with time zone)
-(4 rows)
+ QUERY PLAN
+------------------------------------------------------------------------------
+ Seq Scan on stable_qual_pruning1 stable_qual_pruning (actual rows=0 loops=1)
+ Filter: (a < 'Tue Feb 01 00:00:00 2000 PST'::timestamp with time zone)
+(2 rows)
-- check ScalarArrayOp cases
explain (analyze, costs off, summary off, timing off)
--
2.21.0
On Mon, 28 Sep 2020 at 08:59, Andy Fan <zhihui.fan1213@gmail.com> wrote:
I find we can't prune partitions in the planner if the qual is a stable function.
IMO, we should do it. Why not?
Thanks for showing an interest in partition pruning. Unfortunately,
it's not possible to use stable functions to prune partitions during
planning.
NOW() is one example of a function that's stable, but the return value
will change over time. If we used the return value of that to perform
partition pruning then we'd end up with a plan that's wrong over time.
Here's an example:
create table rp (t timestamp) partition by range(t);
create table rp1 partition of rp for values from ('now'::timestamp) to
('now'::timestamp + '1 min'::interval);
create table rp2 partition of rp for values from ('now'::timestamp +
'1 min'::interval) to ('now'::timestamp + '2 min'::interval);
insert into rp select t from generate_Series('now'::timestamp,
'now'::timestamp + '1 min 59 sec'::interval, '1 sec'::interval) t;
prepare q1 as select count(*) from rp where t > now() and t < now() +
'10 sec'::interval;
Now, if you run the following command with your patch, it'll prune the
rp2 partition as it's not required for the WHERE clause (at the time
we planned). However, just wait 1 minute and execute the plan again.
Oops, my rows vanished!
execute q1; select pg_sleep(60); execute q1;
The 2nd execute should have returned 10 rows, the same as the first
(assuming you executed that directly after creating the tables)
Run-time partition pruning was invented just for this purpose.
David
Thank you David for coming:)
On Mon, Sep 28, 2020 at 4:46 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Mon, 28 Sep 2020 at 08:59, Andy Fan <zhihui.fan1213@gmail.com> wrote:
I find we can't prune partitions in the planner if the qual is a stable
function.
IMO, we should do it. Why not?
Thanks for showing an interest in partition pruning. Unfortunately,
it's not possible to use stable functions to prune partitions during
planning.NOW() is one example of a function that's stable, but the return value
will change over time. If we used the return value of that to perform
partition pruning then we'd end up with a plan that's wrong over time.
Sigh.. I understand you now, I ignored the plan can be cached for later use.
Without that, we should be able to prune with stable function. I know the
run-time partition prune can help with this, but it can't help with
planning time.
I run into some cases that SELECT * FROM p WHERE pkey = to_date(..);
p has 1500+ partitions and planning takes lots of time. and users are not
willing to remove the to_date('2018-11-11', 'yyyy-mm-dd') style code since
too much and can't find out all of them at once. Actually I think to_date
should
be marked as immuable rather than stable.
--
Best Regards
Andy Fan
Andy Fan <zhihui.fan1213@gmail.com> writes:
On Mon, Sep 28, 2020 at 4:46 AM David Rowley <dgrowleyml@gmail.com> wrote:
Thanks for showing an interest in partition pruning. Unfortunately,
it's not possible to use stable functions to prune partitions during
planning.
Sigh.. I understand you now, I ignored the plan can be cached for later use.
Without that, we should be able to prune with stable function.
No, that's still wrong. The contract for a stable function is that
its result won't change over execution of a single query; but that
says *execution*, not *planning and execution*.
In particular, the canonical example of a stable function is one
whose result depends on a database query. The reason it can be
considered stable is that within a single outer query, the MVCC
snapshot it's used with won't change. But we take a new snapshot
(later than the planner's snapshot) when beginning execution.
Somebody (Robert Haas, if memory serves, which it might not)
tried to change that a few years ago. It blew up pretty well,
and was eventually reverted, because of undesirable side-effects
on user-visible query semantics. You'd have to check the archives
for details.
It's possible that we could make that work differently in serializable
mode, thanks to the longer persistence of snapshots. Not sure that
it'd be desirable for planning to work differently in serializable
mode, though.
regards, tom lane
On Mon, Sep 28, 2020 at 7:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
On Mon, Sep 28, 2020 at 4:46 AM David Rowley <dgrowleyml@gmail.com>
wrote:
Thanks for showing an interest in partition pruning. Unfortunately,
it's not possible to use stable functions to prune partitions during
planning.Sigh.. I understand you now, I ignored the plan can be cached for later
use.
Without that, we should be able to prune with stable function.
No, that's still wrong. The contract for a stable function is that
its result won't change over execution of a single query; but that
says *execution*, not *planning and execution*.In particular, the canonical example of a stable function is one
whose result depends on a database query. The reason it can be
considered stable is that within a single outer query, the MVCC
snapshot it's used with won't change. But we take a new snapshot
(later than the planner's snapshot) when beginning execution.Somebody (Robert Haas, if memory serves, which it might not)
tried to change that a few years ago. It blew up pretty well,
and was eventually reverted, because of undesirable side-effects
on user-visible query semantics. You'd have to check the archives
for details.It's possible that we could make that work differently in serializable
mode, thanks to the longer persistence of snapshots. Not sure that
it'd be desirable for planning to work differently in serializable
mode, though.regards, tom lane
Well, that's very interesting. Specific to my user case,
SELECT * FROM p WHERE pkey = to_date('2018-12-13', 'yyyy-mm-dd)';
p has 1500+ partitions and planning takes lots of time, which is so same
with SELECT * FROM p WHERE pkey = '2018-12-13', however the planning
time difference is so huge, that doesn't make sense in human view. Can
we do something for that? to_date(text, text) should be a "immutable"
function
IMO. Does that have a semantic issue or other issues?
--
Best Regards
Andy Fan
Andy Fan <zhihui.fan1213@gmail.com> writes:
Well, that's very interesting. Specific to my user case,
SELECT * FROM p WHERE pkey = to_date('2018-12-13', 'yyyy-mm-dd)';
p has 1500+ partitions and planning takes lots of time, which is so same
with SELECT * FROM p WHERE pkey = '2018-12-13', however the planning
time difference is so huge, that doesn't make sense in human view. Can
we do something for that? to_date(text, text) should be a "immutable"
function IMO. Does that have a semantic issue or other issues?
Yeah. It depends on the lc_time setting, and possibly also the timezone
GUC. (Admittedly, common values of the format string would not have
any lc_time dependency, but the immutability property is not fine-grained
enough to recognize that.)
regards, tom lane
On Mon, Sep 28, 2020 at 9:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
Well, that's very interesting. Specific to my user case,
SELECT * FROM p WHERE pkey = to_date('2018-12-13', 'yyyy-mm-dd)';
p has 1500+ partitions and planning takes lots of time, which is so same
with SELECT * FROM p WHERE pkey = '2018-12-13', however the planning
time difference is so huge, that doesn't make sense in human view. Can
we do something for that? to_date(text, text) should be a "immutable"
function IMO. Does that have a semantic issue or other issues?Yeah. It depends on the lc_time setting, and possibly also the timezone
GUC. (Admittedly, common values of the format string would not have
any lc_time dependency, but the immutability property is not fine-grained
enough to recognize that.)regards, tom lane
Thanks for your reply. Even it has something on GUC or lc_time setting,
suppose
it should be decided at planning time. Do we have concerns about changes
between planning and execution?
The attached patch marked some common formatting function as immutable,
only one partition prune test case needed fixing because of this. I only
changed
to_char/to_date/to_timestamp, however the whole list is below. I can
change
all of them if needed.
proname | count
-----------------+-------
to_ascii | 3
to_char | 8
to_date | 1
to_hex | 2
to_json | 1
to_jsonb | 1
to_number | 1
to_regclass | 1
to_regcollation | 1
to_regnamespace | 1
to_regoper | 1
to_regoperator | 1
to_regproc | 1
to_regprocedure | 1
to_regrole | 1
to_regtype | 1
to_timestamp | 2
to_tsquery | 2
to_tsvector | 6
(19 rows)
With this change, the exact issue on the beginning of this thread can be
fixed as
well with this patch.
--
Best Regards
Andy Fan
Attachments:
v1-0001-Mark-some-formating-builtin-function-as-immutable.patchapplication/octet-stream; name=v1-0001-Mark-some-formating-builtin-function-as-immutable.patchDownload
From c1cc1102815ef83114efbaff0117a18b23a473f6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Mon, 28 Sep 2020 09:59:41 +0800
Subject: [PATCH v1] Mark some formating builtin function as immutable.
Immutable function
usually has more chances for optimization.
---
src/include/catalog/pg_proc.dat | 22 +++++++++----------
src/test/regress/expected/partition_prune.out | 15 +++++--------
2 files changed, 16 insertions(+), 21 deletions(-)
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index f48f5fb4d9..face1fc6b1 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -4432,34 +4432,34 @@
# formatting
{ oid => '1770', descr => 'format timestamp with time zone to text',
- proname => 'to_char', provolatile => 's', prorettype => 'text',
+ proname => 'to_char', provolatile => 'i', prorettype => 'text',
proargtypes => 'timestamptz text', prosrc => 'timestamptz_to_char' },
{ oid => '1772', descr => 'format numeric to text',
- proname => 'to_char', provolatile => 's', prorettype => 'text',
+ proname => 'to_char', provolatile => 'i', prorettype => 'text',
proargtypes => 'numeric text', prosrc => 'numeric_to_char' },
{ oid => '1773', descr => 'format int4 to text',
- proname => 'to_char', provolatile => 's', prorettype => 'text',
+ proname => 'to_char', provolatile => 'i', prorettype => 'text',
proargtypes => 'int4 text', prosrc => 'int4_to_char' },
{ oid => '1774', descr => 'format int8 to text',
- proname => 'to_char', provolatile => 's', prorettype => 'text',
+ proname => 'to_char', provolatile => 'i', prorettype => 'text',
proargtypes => 'int8 text', prosrc => 'int8_to_char' },
{ oid => '1775', descr => 'format float4 to text',
- proname => 'to_char', provolatile => 's', prorettype => 'text',
+ proname => 'to_char', provolatile => 'i', prorettype => 'text',
proargtypes => 'float4 text', prosrc => 'float4_to_char' },
{ oid => '1776', descr => 'format float8 to text',
- proname => 'to_char', provolatile => 's', prorettype => 'text',
+ proname => 'to_char', provolatile => 'i', prorettype => 'text',
proargtypes => 'float8 text', prosrc => 'float8_to_char' },
{ oid => '1777', descr => 'convert text to numeric',
- proname => 'to_number', provolatile => 's', prorettype => 'numeric',
+ proname => 'to_number', provolatile => 'i', prorettype => 'numeric',
proargtypes => 'text text', prosrc => 'numeric_to_number' },
{ oid => '1778', descr => 'convert text to timestamp with time zone',
- proname => 'to_timestamp', provolatile => 's', prorettype => 'timestamptz',
+ proname => 'to_timestamp', provolatile => 'i', prorettype => 'timestamptz',
proargtypes => 'text text', prosrc => 'to_timestamp' },
{ oid => '1780', descr => 'convert text to date',
- proname => 'to_date', provolatile => 's', prorettype => 'date',
+ proname => 'to_date', provolatile => 'i', prorettype => 'date',
proargtypes => 'text text', prosrc => 'to_date' },
{ oid => '1768', descr => 'format interval to text',
- proname => 'to_char', provolatile => 's', prorettype => 'text',
+ proname => 'to_char', provolatile => 'i', prorettype => 'text',
proargtypes => 'interval text', prosrc => 'interval_to_char' },
{ oid => '1282', descr => 'quote an identifier for usage in a querystring',
@@ -5825,7 +5825,7 @@
proname => 'isfinite', prorettype => 'bool', proargtypes => 'timestamp',
prosrc => 'timestamp_finite' },
{ oid => '2049', descr => 'format timestamp to text',
- proname => 'to_char', provolatile => 's', prorettype => 'text',
+ proname => 'to_char', provolatile => 'i', prorettype => 'text',
proargtypes => 'timestamp text', prosrc => 'timestamp_to_char' },
{ oid => '2052',
proname => 'timestamp_eq', proleakproof => 't', prorettype => 'bool',
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 50d2a7e4b9..8753edd4fe 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -1129,16 +1129,11 @@ create table coercepart_ab partition of coercepart for values in ('ab');
create table coercepart_bc partition of coercepart for values in ('bc');
create table coercepart_cd partition of coercepart for values in ('cd');
explain (costs off) select * from coercepart where a in ('ab', to_char(125, '999'));
- QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
- Append
- -> Seq Scan on coercepart_ab coercepart_1
- Filter: ((a)::text = ANY ((ARRAY['ab'::character varying, (to_char(125, '999'::text))::character varying])::text[]))
- -> Seq Scan on coercepart_bc coercepart_2
- Filter: ((a)::text = ANY ((ARRAY['ab'::character varying, (to_char(125, '999'::text))::character varying])::text[]))
- -> Seq Scan on coercepart_cd coercepart_3
- Filter: ((a)::text = ANY ((ARRAY['ab'::character varying, (to_char(125, '999'::text))::character varying])::text[]))
-(7 rows)
+ QUERY PLAN
+-----------------------------------------------------
+ Seq Scan on coercepart_ab coercepart
+ Filter: ((a)::text = ANY ('{ab," 125"}'::text[]))
+(2 rows)
explain (costs off) select * from coercepart where a ~ any ('{ab}');
QUERY PLAN
--
2.21.0
On Sun, Sep 27, 2020 at 7:52 PM Andy Fan wrote:
On Mon, Sep 28, 2020 at 9:15 AM Tom Lane wrote:
Andy Fan writes:
Well, that's very interesting. Specific to my user case,
SELECT * FROM p WHERE pkey = to_date('2018-12-13', 'yyyy-mm-dd)';
p has 1500+ partitions and planning takes lots of time, which is so same
with SELECT * FROM p WHERE pkey = '2018-12-13', however the planning
time difference is so huge, that doesn't make sense in human view. Can
we do something for that? to_date(text, text) should be a "immutable"
function IMO. Does that have a semantic issue or other issues?Yeah. It depends on the lc_time setting, and possibly also the timezone
GUC. (Admittedly, common values of the format string would not have
any lc_time dependency, but the immutability property is not fine-grained
enough to recognize that.)regards, tom lane
Thanks for your reply. Even it has something on GUC or lc_time setting, suppose
it should be decided at planning time. Do we have concerns about changes
between planning and execution?
Planner can be called at prepared statement creation time, like
PREPARE yolo() AS SELECT * FROM foo WHERE pk = to_date(...);
Here, there's an arbitrary gap between planning time, and execution.
The attached patch marked some common formatting function as immutable,
only one partition prune test case needed fixing because of this. I only changed
to_char/to_date/to_timestamp, however the whole list is below. I can change
all of them if needed.proname | count
-----------------+-------
to_ascii | 3
to_char | 8
to_date | 1
to_hex | 2
to_json | 1
to_jsonb | 1
to_number | 1
to_regclass | 1
to_regcollation | 1
to_regnamespace | 1
to_regoper | 1
to_regoperator | 1
to_regproc | 1
to_regprocedure | 1
to_regrole | 1
to_regtype | 1
to_timestamp | 2
to_tsquery | 2
to_tsvector | 6
(19 rows)
This patch is ridiculous.
Immutable functions need to produce the same output for the same
argument values. None of the functions changed in the patch is
immutable: they are all stable because they all depend on GUC settings
(e.g. to_tsvector depends on default_text_search_config).
On Mon, Sep 28, 2020 at 2:44 PM Jesse Zhang <sbjesse@gmail.com> wrote:
On Sun, Sep 27, 2020 at 7:52 PM Andy Fan wrote:
On Mon, Sep 28, 2020 at 9:15 AM Tom Lane wrote:
Andy Fan writes:
Well, that's very interesting. Specific to my user case,
SELECT * FROM p WHERE pkey = to_date('2018-12-13', 'yyyy-mm-dd)';
p has 1500+ partitions and planning takes lots of time, which is sosame
with SELECT * FROM p WHERE pkey = '2018-12-13', however the planning
time difference is so huge, that doesn't make sense in human view.Can
we do something for that? to_date(text, text) should be a "immutable"
function IMO. Does that have a semantic issue or other issues?Yeah. It depends on the lc_time setting, and possibly also the timezone
GUC. (Admittedly, common values of the format string would not have
any lc_time dependency, but the immutability property is notfine-grained
enough to recognize that.)
regards, tom lane
Thanks for your reply. Even it has something on GUC or lc_time setting,
suppose
it should be decided at planning time. Do we have concerns about changes
between planning and execution?Planner can be called at prepared statement creation time, like
PREPARE yolo() AS SELECT * FROM foo WHERE pk = to_date(...);
Here, there's an arbitrary gap between planning time, and execution.
The attached patch marked some common formatting function as immutable,
only one partition prune test case needed fixing because of this. I onlychanged
to_char/to_date/to_timestamp, however the whole list is below. I can
change
all of them if needed.
proname | count
-----------------+-------
to_ascii | 3
to_char | 8
to_date | 1
to_hex | 2
to_json | 1
to_jsonb | 1
to_number | 1
to_regclass | 1
to_regcollation | 1
to_regnamespace | 1
to_regoper | 1
to_regoperator | 1
to_regproc | 1
to_regprocedure | 1
to_regrole | 1
to_regtype | 1
to_timestamp | 2
to_tsquery | 2
to_tsvector | 6
(19 rows)This patch is ridiculous.
Immutable functions need to produce the same output for the same
argument values. None of the functions changed in the patch is
immutable: they are all stable because they all depend on GUC settings
(e.g. to_tsvector depends on default_text_search_config).
Thanks, how can I misunderstand Tom's comment above!!
--
Best Regards
Andy Fan
On Mon, Sep 28, 2020 at 9:23 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
On Mon, Sep 28, 2020 at 2:44 PM Jesse Zhang <sbjesse@gmail.com> wrote:
On Sun, Sep 27, 2020 at 7:52 PM Andy Fan wrote:
On Mon, Sep 28, 2020 at 9:15 AM Tom Lane wrote:
Yeah. It depends on the lc_time setting, and possibly also the
timezone
GUC. (Admittedly, common values of the format string would not have
any lc_time dependency, but the immutability property is notfine-grained
enough to recognize that.)
Thanks for your reply. Even it has something on GUC or lc_time setting,
suppose
it should be decided at planning time. Do we have concerns about
changes
between planning and execution?
Immutable functions need to produce the same output for the same
argument values. None of the functions changed in the patch is
immutable: they are all stable because they all depend on GUC settings
(e.g. to_tsvector depends on default_text_search_config).Thanks, how can I misunderstand Tom's comment above!!
To make data type formatting functions immutable you would need a third
input argument with the locale used for that specific function call, as
proposed in [1]/messages/by-id/01fe23b2-7779-d3ee-056a-074a7385e248@mail.de.
As for the original use case, in the documentation there is a tip about
using to_date() to handle input formats that cannot be converted by simple
casting, and that for most standard date/time formats a cast is the easier
way to do so [2]https://www.postgresql.org/docs/current/functions-formatting.html, I may also add that is better performing in Postgres.
[1]: /messages/by-id/01fe23b2-7779-d3ee-056a-074a7385e248@mail.de
/messages/by-id/01fe23b2-7779-d3ee-056a-074a7385e248@mail.de
[2]: https://www.postgresql.org/docs/current/functions-formatting.html
Regards,
Juan José Santamaría Flecha
Show quoted text
On Mon, Sep 28, 2020 at 7:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
On Mon, Sep 28, 2020 at 4:46 AM David Rowley <dgrowleyml@gmail.com>
wrote:
Thanks for showing an interest in partition pruning. Unfortunately,
it's not possible to use stable functions to prune partitions during
planning.Sigh.. I understand you now, I ignored the plan can be cached for later
use.
Without that, we should be able to prune with stable function.
No, that's still wrong. The contract for a stable function is that
its result won't change over execution of a single query; but that
says *execution*, not *planning and execution*.
I have a slightly different opinion about the impact of "cached the plan
for later use will be wrong" now. Generic plan will never be partition
pruned plan since we don't know which partition to prune at plan time.
So for any cached plan, it is not a plan time partition pruned plan.
Partition prune with stable expr is still unacceptable even this is not
an issue but hope the snapshot issue will be the only one issue to
fix in future for this direction. I'd like to know if I am wrong again.
--
Best Regards
Andy Fan
Andy Fan schrieb am 28.09.2020 um 02:54:
Well, that's very interesting. Specific to my user case,
SELECT * FROM p WHERE pkey = to_date('2018-12-13', 'yyyy-mm-dd)';
Why use to_date() at all for a constant value?
Can't you use a standard ANSI date literal (date '2018-12-13')?
Alternatively, you could use make_date(2018,12,13) which is marked as immutable.
Thomas
On Mon, Sep 28, 2020 at 9:17 PM Thomas Kellerer <shammat@gmx.net> wrote:
Andy Fan schrieb am 28.09.2020 um 02:54:
Well, that's very interesting. Specific to my user case,
SELECT * FROM p WHERE pkey = to_date('2018-12-13', 'yyyy-mm-dd)';Why use to_date() at all for a constant value?
Can't you use a standard ANSI date literal (date '2018-12-13')?
This is from lots of legacy code from Oracle and the real case is
to_date('20181213', 'yyyymmdd').
Alternatively, you could use make_date(2018,12,13) which is marked as
immutable.
Thanks, Nice to know make_date function.
--
Best Regards
Andy Fan
On Mon, Sep 28, 2020 at 8:21 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
On Mon, Sep 28, 2020 at 7:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
On Mon, Sep 28, 2020 at 4:46 AM David Rowley <dgrowleyml@gmail.com>
wrote:
Thanks for showing an interest in partition pruning. Unfortunately,
it's not possible to use stable functions to prune partitions during
planning.Sigh.. I understand you now, I ignored the plan can be cached for later
use.
Without that, we should be able to prune with stable function.
No, that's still wrong. The contract for a stable function is that
its result won't change over execution of a single query; but that
says *execution*, not *planning and execution*.I have a slightly different opinion about the impact of "cached the plan
for later use will be wrong" now. Generic plan will never be partition
pruned plan since we don't know which partition to prune at plan time.
So for any cached plan, it is not a plan time partition pruned plan.
Partition prune with stable expr is still unacceptable even this is not
an issue but hope the snapshot issue will be the only one issue to
fix in future for this direction. I'd like to know if I am wrong again.
Indeed I was wrong again. I'd like to end this thread with this
understanding fix.
prepare s as select * from measurement where logdate = $1 ;
execute s(now());
In this case, even if we run the planning time partition prune with a stable
function, we can still get the correct result (ignore the different
snapshot case).
since the generic plan includes all the partitions and just do initial
partition
prune case.
However if we create the prepared stmt like prepare s as select * from
measurement where logdate = now(); Then the cached plan should be
wrong. Actually this example is exactly the same as Daivd's example
at the beginning..
--
Best Regards
Andy Fan