Should we add GUCs to allow partition pruning to be disabled?
In PG10 the planner's partition pruning could be disabled by changing
the constraint_exclusion GUC to off. This is still the case for PG11,
but only for UPDATE and DELETE queries. There is currently no way to
disable partition pruning for SELECT.
Should we allow this?
To make this a bit more complex, we now also have run-time pruning
which can allow further partition pruning to be performed during
execution. I imagine if we're going to add a GUC for plan-time
pruning then we should also have one for run-time pruning. These could
also perhaps share the same GUC, so it seems there are some sub
choices to make here:
1. Add a single enable_ GUC which allows both plan-time and run-time
pruning to be disabled.
2. Add two new enable_ GUCs, one for plan-time and one for run-time pruning.
3. No new GUCs / Do nothing.
Run-time pruning is a little special here too, as it's the first
feature to exist in PostgreSQL which changes the plan in the executor.
From searching through the code I see no other enable_* GUC being
referenced in the executor. So there's also questions here as to
where we'd disable run-time pruning. We could disable it in the
planner so that the plan does not include the details that the
executor needs to enable the pruning, or we could just disable it in
the executor and have the planner still form plans with these details.
This separates #1 and #2 into:
a) Disable run-time pruning during execution.
b) Disable run-time pruning during planning.
c) Both of the above.
The differentiation of the above is important when you consider
PREPAREd statements. Currently, no enable_ GUC will affect a
pre-PREPAREd query. We might want to keep that rule despite there
being flexibility not to, in this case.
For UPDATE/DELETE:
It would also be quite strange if someone disabled plan-time pruning
and still got partition pruning. So I suggest we require both
constraint_exclusion and the plan-time GUC not off for pruning to be
enabled for UPDATE/DELETE. Alternatively, we just ditch
constraint_exclusion = 'partition'.
Personally, I'm for 2b and ditching constraint_exclusion =
'partition'. I don't see any sense in keeping constraint_exclusion =
'partition' if we have something else to mean the same thing.
Thoughts / Votes / Names for new GUCs?
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Apr 18, 2018 at 12:07:18PM +1200, David Rowley wrote:
In PG10 the planner's partition pruning could be disabled by changing
the constraint_exclusion GUC to off. This is still the case for PG11,
but only for UPDATE and DELETE queries. There is currently no way to
disable partition pruning for SELECT.Should we allow this?
To make this a bit more complex, we now also have run-time pruning
which can allow further partition pruning to be performed during
execution. I imagine if we're going to add a GUC for plan-time
pruning then we should also have one for run-time pruning. These could
also perhaps share the same GUC, so it seems there are some sub
choices to make here:1. Add a single enable_ GUC which allows both plan-time and run-time
pruning to be disabled.
2. Add two new enable_ GUCs, one for plan-time and one for run-time pruning.
3. No new GUCs / Do nothing.
Maybe this is divergent from the details of the implementation; but, from a
user's perspective: why not continue to use constraint_exclusion?
I would suggest to add zero new GUCs:
0. constraint_exclusion={off,partition,on,PLANNER*,EXECUTOR*}
I tentatively assume that "constraint_exclusion=partition" would disable PG11
"pruning", and that the new default setting would be "executor".
* Caveat: there may be a better name than planner/executor..
planner_prune? execute_filter?
Justin
On Tue, Apr 17, 2018 at 5:42 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Wed, Apr 18, 2018 at 12:07:18PM +1200, David Rowley wrote:
In PG10 the planner's partition pruning could be disabled by changing
the constraint_exclusion GUC to off. This is still the case for PG11,
but only for UPDATE and DELETE queries. There is currently no way to
disable partition pruning for SELECT.Should we allow this?
3. No new GUCs / Do nothing.
Maybe this is divergent from the details of the implementation; but, from a
user's perspective: why not continue to use constraint_exclusion?I would suggest to add zero new GUCs:
0. constraint_exclusion={off,partition,on,PLANNER*,EXECUTOR*}
My initial reaction is that we need to fix the bug introduced in v10 -
leaving constraint_exclusion working as it has historically and not affect
the new-as-of-10 ability to prune (maybe better termed as skip...)
partitions known during execution to contain no qualified tuples.
David J.
On 18 April 2018 at 13:03, David G. Johnston <david.g.johnston@gmail.com> wrote:
My initial reaction is that we need to fix the bug introduced in v10 -
leaving constraint_exclusion working as it has historically and not affect
the new-as-of-10 ability to prune (maybe better termed as skip...)
partitions known during execution to contain no qualified tuples.
Can you explain which bug in PG10 you are talking about? Did you
perhaps mean PG11?
I'm not onboard with overloading the constraint_exclusion GUC any
further to mean something it shouldn't. The PG11 partition pruning
code does not use CHECK constraints to eliminate partitions, so I see
no reason why constraint_exclusion should turn it on or off.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Apr 17, 2018 at 6:12 PM, David Rowley <david.rowley@2ndquadrant.com>
wrote:
On 18 April 2018 at 13:03, David G. Johnston <david.g.johnston@gmail.com>
wrote:My initial reaction is that we need to fix the bug introduced in v10 -
leaving constraint_exclusion working as it has historically and notaffect
the new-as-of-10 ability to prune (maybe better termed as skip...)
partitions known during execution to contain no qualified tuples.Can you explain which bug in PG10 you are talking about? Did you
perhaps mean PG11?
"In PG10 the planner's partition pruning could be disabled by changing
the constraint_exclusion GUC to off. This is still the case for PG11,
but only for UPDATE and DELETE queries. There is currently no way to
disable partition pruning for SELECT."
I read the word "currently" in your initial paragraph as meaning "currently
released", hence version v10. Re-reading it now I'm understanding you
meant currently to mean v11 and thus now so do I.
I'm not onboard with overloading the constraint_exclusion GUC any
further to mean something it shouldn't. The PG11 partition pruning
code does not use CHECK constraints to eliminate partitions, so I see
no reason why constraint_exclusion should turn it on or off.
You propose that the "This is still the case for PG11, but only for UPDATE
and DELETE queries" is actually wrong and none of the query types should be
impacted?
Basically go with partition pruning is always on, check constraint
evaluation defaults to off and can be turned on - and the current default
for "constraint_exclusion" changes to 'off' and if someone tries to
explicitly set it to 'partition' it fails. Add some new knobs for
partitions if desired.
I'd go that route in a green-field...I'm less convinced it is the best way
forward from today. non-partition related exclusion is something I'm not
understanding conceptually; and I don't know why one, outside of debugging
system code, would want to not perform partition related exclusion. I
could live with straight removal of the existing option and behave as if it
was indeed set to 'partition'.
David J.
On Wed, Apr 18, 2018 at 5:37 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
In PG10 the planner's partition pruning could be disabled by changing
the constraint_exclusion GUC to off. This is still the case for PG11,
but only for UPDATE and DELETE queries. There is currently no way to
disable partition pruning for SELECT.Should we allow this?
I think GUC would be useful for debugging purposes for sure. Given
that we have added this feature late in v11, there might be some bugs
that will bite customers in production. It's better to provide them
some way to work-around.
1. Add a single enable_ GUC which allows both plan-time and run-time
pruning to be disabled.
I would go for this. Both of those features have common code and it
will get cumbersome to carefully enable/disable them separately.
2. Add two new enable_ GUCs, one for plan-time and one for run-time pruning.
This would give more granularity but
3. No new GUCs / Do nothing.
Run-time pruning is a little special here too, as it's the first
feature to exist in PostgreSQL which changes the plan in the executor.
From searching through the code I see no other enable_* GUC being
referenced in the executor. So there's also questions here as to
where we'd disable run-time pruning. We could disable it in the
planner so that the plan does not include the details that the
executor needs to enable the pruning, or we could just disable it in
the executor and have the planner still form plans with these details.
This separates #1 and #2 into:a) Disable run-time pruning during execution.
b) Disable run-time pruning during planning.
c) Both of the above.The differentiation of the above is important when you consider
PREPAREd statements. Currently, no enable_ GUC will affect a
pre-PREPAREd query. We might want to keep that rule despite there
being flexibility not to, in this case.
If run-time pruning is disabled, why do we want to waste CPU cycles
and memory to produce plan time details? It might be useful to do so,
if there was a large chance that people prepared a statement which
could use partition pruning with run-time pruning disables but
EXECUTEd it with run-time pruning enabled. It will be less likely that
the session which prepares a plan would change the GUCs before
executing it.
For UPDATE/DELETE:
It would also be quite strange if someone disabled plan-time pruning
and still got partition pruning. So I suggest we require both
constraint_exclusion and the plan-time GUC not off for pruning to be
enabled for UPDATE/DELETE. Alternatively, we just ditch
constraint_exclusion = 'partition'.Personally, I'm for 2b and ditching constraint_exclusion =
'partition'. I don't see any sense in keeping constraint_exclusion =
'partition' if we have something else to mean the same thing.
That will still be useful for inheritance based partitioning.
We might re-use constraint_exclusion = 'partition' to mean
enable_partition_pruning (ok, I suggested a name as well) = true,
although that's not my favourite.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On 18 April 2018 at 21:36, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
On Wed, Apr 18, 2018 at 5:37 AM, David Rowley
a) Disable run-time pruning during execution.
b) Disable run-time pruning during planning.
c) Both of the above.The differentiation of the above is important when you consider
PREPAREd statements. Currently, no enable_ GUC will affect a
pre-PREPAREd query. We might want to keep that rule despite there
being flexibility not to, in this case.If run-time pruning is disabled, why do we want to waste CPU cycles
and memory to produce plan time details? It might be useful to do so,
if there was a large chance that people prepared a statement which
could use partition pruning with run-time pruning disables but
EXECUTEd it with run-time pruning enabled. It will be less likely that
the session which prepares a plan would change the GUCs before
executing it.
I have to admit, can't really imagine any valid cases were disabling
this feature would be useful. Generally, enable_* properties can be
used to coax the planner into producing some plan shape that it
otherwise didn't due to some costing problem. I can only imagine it
might be useful to disable either for testing or as a workaround for
some bug that might crop up. Perhaps that's not enough reason to go
and add a GUC that'll likely need to exist forever. But it probably
does mean that we'd want c) so that the code is completely disabled as
soon as the setting is off. If we just did it at plan time then
pre-PREPAREd queries might still prune. That does not seem very
useful if it's being disabled due to the discovery of some bug.
The more I think about this the more undecided I am as to whether we
need to add a GUC for this at all, so I'm keen to hear more people
voice their opinion about this. If bugs are the only true reason to
add it, then the need for the GUC should diminish every day that
nobody reports any bugs.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Apr 19, 2018 at 2:54 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
If we just did it at plan time then
pre-PREPAREd queries might still prune. That does not seem very
useful if it's being disabled due to the discovery of some bug.
As you have pointed out upthread, that's a problem with every enable_*
GUC. After seeing a bug, users would usually re-prepare their
statements with pruning turned off. So, I don't see this as a reason
for introducing two GUCs.
The more I think about this the more undecided I am as to whether we
need to add a GUC for this at all, so I'm keen to hear more people
voice their opinion about this. If bugs are the only true reason to
add it, then the need for the GUC should diminish every day that
nobody reports any bugs.
Apart from bugs, I think, this GUC can be used to avoid extra planning
time/memory/CPU incurred in pruning, when users know for sure that
pruning is not going to happen e.g. the cases like no qual on
partition key or no equality qual on hash partition key etc. Do we
know how much planning time can be saved this way?
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On 2018/04/19 13:32, Ashutosh Bapat wrote:
On Thu, Apr 19, 2018 at 2:54 AM, David Rowley
The more I think about this the more undecided I am as to whether we
need to add a GUC for this at all, so I'm keen to hear more people
voice their opinion about this. If bugs are the only true reason to
add it, then the need for the GUC should diminish every day that
nobody reports any bugs.Apart from bugs, I think, this GUC can be used to avoid extra planning
time/memory/CPU incurred in pruning, when users know for sure that
pruning is not going to happen e.g. the cases like no qual on
partition key or no equality qual on hash partition key etc. Do we
know how much planning time can be saved this way?
I can imagine having a enable_partition_pruning which defaults to true, if
only to avoid the performance overhead of pruning code when a user knows
for sure that it won't help for some queries. Although, I'm a bit dubious
why they'd write such queries if they're using partitioning in the first
place.
Also, I'd think that enable_partition_pruning set to false means pruning
doesn't occur at all, not even using constraint exclusion. That is,
behavior equivalent of constraint_exclusion < partition (that is, off/on).
Also, if we do have such a GUC, it should apply to all command types,
including UPDATE and DELETE which don't yet invoke the new pruning code,
from the start. So, if enable_partition_pruning is false, we won't load
the partition constraints at all, which we currently do for UPDATE and
DELETE so that constraint exclusion can be used for pruning. OTOH, if
enable_partition_pruning is on, we perform constraint exclusion -based
pruning for UPDATE and DELETE irrespective of the setting of
constraint_exclusion GUC. In other words, we completely dissociate
partitioned table pruning from the setting of constraint_exclusion.
Now as Justin pointed out upthread, the new GUC might cause confusion for
users who are long accustomed to using constraint_exclusion for this, but
I'm not sure anybody tries to change its setting a lot. The new GUC
defaulting to pruning=on would be useful for occasional debugging, as we
all seem to more or less agree.
Thanks,
Amit
On Thu, Apr 19, 2018 at 5:02 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2018/04/19 13:32, Ashutosh Bapat wrote:
On Thu, Apr 19, 2018 at 2:54 AM, David Rowley
The more I think about this the more undecided I am as to whether we
need to add a GUC for this at all, so I'm keen to hear more people
voice their opinion about this. If bugs are the only true reason to
add it, then the need for the GUC should diminish every day that
nobody reports any bugs.Apart from bugs, I think, this GUC can be used to avoid extra planning
time/memory/CPU incurred in pruning, when users know for sure that
pruning is not going to happen e.g. the cases like no qual on
partition key or no equality qual on hash partition key etc. Do we
know how much planning time can be saved this way?I can imagine having a enable_partition_pruning which defaults to true, if
only to avoid the performance overhead of pruning code when a user knows
for sure that it won't help for some queries. Although, I'm a bit dubious
why they'd write such queries if they're using partitioning in the first
place.Also, I'd think that enable_partition_pruning set to false means pruning
doesn't occur at all, not even using constraint exclusion. That is,
behavior equivalent of constraint_exclusion < partition (that is, off/on).Also, if we do have such a GUC, it should apply to all command types,
including UPDATE and DELETE which don't yet invoke the new pruning code,
from the start. So, if enable_partition_pruning is false, we won't load
the partition constraints at all, which we currently do for UPDATE and
DELETE so that constraint exclusion can be used for pruning. OTOH, if
enable_partition_pruning is on, we perform constraint exclusion -based
pruning for UPDATE and DELETE irrespective of the setting of
constraint_exclusion GUC. In other words, we completely dissociate
partitioned table pruning from the setting of constraint_exclusion.
Isn't word "dissociate" turns the last sentence into a sentence
contradicting everything you wrote prior to it?
I think we should keep these two things separate.
enable_partition_pruning affects the partition pruning based on the
partition bounds and that currently does not work for UPDATE/DELETE.
When it does work in those case, we might think of not loading
partition bound based constraints. constraint_exclusion affects
whether constraints can be used to exclude a relation (with partition
option affecting the child tables). Once we stop loading partition
bound based constraints, constraint exclusion would stop pruning
partitions based on the bounds. There's no point in confusing users
with by adding dependencies between these two GUCs.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On 2018/04/19 21:50, Ashutosh Bapat wrote:
On Thu, Apr 19, 2018 at 5:02 PM, Amit Langote
I can imagine having a enable_partition_pruning which defaults to true, if
only to avoid the performance overhead of pruning code when a user knows
for sure that it won't help for some queries. Although, I'm a bit dubious
why they'd write such queries if they're using partitioning in the first
place.Also, I'd think that enable_partition_pruning set to false means pruning
doesn't occur at all, not even using constraint exclusion. That is,
behavior equivalent of constraint_exclusion < partition (that is, off/on).Also, if we do have such a GUC, it should apply to all command types,
including UPDATE and DELETE which don't yet invoke the new pruning code,
from the start. So, if enable_partition_pruning is false, we won't load
the partition constraints at all, which we currently do for UPDATE and
DELETE so that constraint exclusion can be used for pruning. OTOH, if
enable_partition_pruning is on, we perform constraint exclusion -based
pruning for UPDATE and DELETE irrespective of the setting of
constraint_exclusion GUC. In other words, we completely dissociate
partitioned table pruning from the setting of constraint_exclusion.Isn't word "dissociate" turns the last sentence into a sentence
contradicting everything you wrote prior to it?I think we should keep these two things separate.
Yes, that's what I meant.
To clarify: if we're going to add a new parameter *for partitioned tables*
to configure whether or not pruning occurs, even if UPDATE and DELETE now
rely on constraint exclusion for pruning, we should ignore the setting of
constraint_exclusion the configuration parameter. For UPDATE and DELETE,
if enable_partition_pruning is on, we proceed to prune using constraint
exclusion (because that's the only method available now), irrespective of
the setting of constraint_exclusion.
So to users, enable_partition_pruning should be the only way to configure
whether or not pruning occurs.
Does that make sense?
It seems like talking about the finer implementation details is making
this discussion a bit confusing.
enable_partition_pruning affects the partition pruning based on the
partition bounds and that currently does not work for UPDATE/DELETE.
When it does work in those case, we might think of not loading
partition bound based constraints. constraint_exclusion affects
whether constraints can be used to exclude a relation (with partition
option affecting the child tables). Once we stop loading partition
bound based constraints, constraint exclusion would stop pruning
partitions based on the bounds. There's no point in confusing users
with by adding dependencies between these two GUCs.
That's exactly what I'm trying to propose. I don't want any new GUC to
work only for SELECT now and UPDATE/DELETE only later when we teach the
code path handling the latter to use the new pruning implementation. In
other words, I don't want a situation where two parameters control pruning
for partitioned tables in PG 11.
BTW, should this thread be listed somewhere on the open items page?
Thanks,
Amit
On 20 April 2018 at 14:07, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
To clarify: if we're going to add a new parameter *for partitioned tables*
to configure whether or not pruning occurs, even if UPDATE and DELETE now
rely on constraint exclusion for pruning, we should ignore the setting of
constraint_exclusion the configuration parameter. For UPDATE and DELETE,
if enable_partition_pruning is on, we proceed to prune using constraint
exclusion (because that's the only method available now), irrespective of
the setting of constraint_exclusion.So to users, enable_partition_pruning should be the only way to configure
whether or not pruning occurs.Does that make sense?
So to summarise my understanding (plus filling in the blanks):
1. Add single GUC named enable_partition_pruning, default = on.
2. Check this setting in set_append_rel_size to only perform
prune_append_rel_partitions when enable_partition_pruning is true.
3. Add code in create_append_plan to only call
make_partition_pruneinfo when enable_partition_pruning is true.
4. Replace test doing (constraint_exclusion ==
CONSTRAINT_EXCLUSION_PARTITION) with (enable_partition_pruning).
5. Get rid of CONSTRAINT_EXCLUSION_PARTITION.
I don't think you mentioned 5. but if I understand you correctly then
it would leave that option doing nothing. So we should remove it.
BTW, should this thread be listed somewhere on the open items page?
Yeah. we need to decide this before PG11 is let loose. I will add it.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi.
On 2018/04/20 11:18, David Rowley wrote:
On 20 April 2018 at 14:07, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
To clarify: if we're going to add a new parameter *for partitioned tables*
to configure whether or not pruning occurs, even if UPDATE and DELETE now
rely on constraint exclusion for pruning, we should ignore the setting of
constraint_exclusion the configuration parameter. For UPDATE and DELETE,
if enable_partition_pruning is on, we proceed to prune using constraint
exclusion (because that's the only method available now), irrespective of
the setting of constraint_exclusion.So to users, enable_partition_pruning should be the only way to configure
whether or not pruning occurs.Does that make sense?
So to summarise my understanding (plus filling in the blanks):
1. Add single GUC named enable_partition_pruning, default = on.
2. Check this setting in set_append_rel_size to only perform
prune_append_rel_partitions when enable_partition_pruning is true.
3. Add code in create_append_plan to only call
make_partition_pruneinfo when enable_partition_pruning is true.
4. Replace test doing (constraint_exclusion ==
CONSTRAINT_EXCLUSION_PARTITION) with (enable_partition_pruning).
5. Get rid of CONSTRAINT_EXCLUSION_PARTITION.I don't think you mentioned 5. but if I understand you correctly then
it would leave that option doing nothing. So we should remove it.
About 4 & 5:
Perhaps we should leave constraint_exclusion = partition alone because
there might be users who want to continue using the old inheritance method
to set up partitioning for whatever reason?
BTW, should this thread be listed somewhere on the open items page?
Yeah. we need to decide this before PG11 is let loose. I will add it.
OK, thanks.
Regards,
Amit
[1]: https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#DDL-PARTITIONING-IMPLEMENTATION-INHERITANCE
https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#DDL-PARTITIONING-IMPLEMENTATION-INHERITANCE
On 20 April 2018 at 14:33, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2018/04/20 11:18, David Rowley wrote:
4. Replace test doing (constraint_exclusion ==
CONSTRAINT_EXCLUSION_PARTITION) with (enable_partition_pruning).
5. Get rid of CONSTRAINT_EXCLUSION_PARTITION.About 4 & 5:
Perhaps we should leave constraint_exclusion = partition alone because
there might be users who want to continue using the old inheritance method
to set up partitioning for whatever reason?
Yeah, for some reason that keeps falling out my brain.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 20 April 2018 at 14:07, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
To clarify: if we're going to add a new parameter *for partitioned tables*
to configure whether or not pruning occurs, even if UPDATE and DELETE now
rely on constraint exclusion for pruning, we should ignore the setting of
constraint_exclusion the configuration parameter. For UPDATE and DELETE,
if enable_partition_pruning is on, we proceed to prune using constraint
exclusion (because that's the only method available now), irrespective of
the setting of constraint_exclusion.So to users, enable_partition_pruning should be the only way to configure
whether or not pruning occurs.
I hope the attached implements what is being discussed here.
Please test it to ensure it behaves as you'd expect.
I was a little unsure if the new GUCs declaration should live in
costsize.c or not since it really has no effect on plan costs, but in
the end, I stuck it there anyway so that it can be with its friends.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
0001-Add-GUC-to-allow-partition-pruning-to-be-disabled.patchapplication/octet-stream; name=0001-Add-GUC-to-allow-partition-pruning-to-be-disabled.patchDownload
From 25eec71bd88fd3363af7859715c98452dcd3ca01 Mon Sep 17 00:00:00 2001
From: "dgrowley@gmail.com" <dgrowley@gmail.com>
Date: Fri, 20 Apr 2018 17:41:14 +1200
Subject: [PATCH] Add GUC to allow partition pruning to be disabled
This both disables the plan-time partition pruning and also run-time partition
pruning. We may one day want to consider two individual switches for turning
off each of these features, but at the moment there seems to be no strong
argument for that.
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
src/backend/optimizer/path/allpaths.c | 3 ++-
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/createplan.c | 3 ++-
src/backend/optimizer/util/plancat.c | 2 +-
src/backend/utils/misc/guc.c | 9 +++++++++
src/include/optimizer/cost.h | 1 +
src/test/regress/expected/sysviews.out | 3 ++-
8 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d5f2d23c4..5dc5d27c04 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3826,6 +3826,23 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
</listitem>
</varlistentry>
+ <varlistentry id="guc-enable-partition-pruning" xreflabel="enable_partition_pruning">
+ <term><varname>enable_partition_pruning</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>enable_partition_pruning</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Enables or disables the query planner's ability to eliminate a
+ partitioned table's subpartitions from query plans. This also
+ controls the planner's ability to generate query plans which allow the
+ query executor to remove or ignoring partitions during query
+ execution. The default is <literal>on</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-enable-partitionwise-join" xreflabel="enable_partitionwise_join">
<term><varname>enable_partitionwise_join</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 3ba3f87eb7..9ed73da0f7 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -901,7 +901,8 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
* store the relids of all partitions which could possibly contain a
* matching tuple, and skip anything else in the loop below.
*/
- if (rte->relkind == RELKIND_PARTITIONED_TABLE &&
+ if (enable_partition_pruning &&
+ rte->relkind == RELKIND_PARTITIONED_TABLE &&
rel->baserestrictinfo != NIL)
{
live_children = prune_append_rel_partitions(rel);
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 47729de896..fc0617ec35 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -138,6 +138,7 @@ bool enable_partitionwise_join = false;
bool enable_partitionwise_aggregate = false;
bool enable_parallel_append = true;
bool enable_parallel_hash = true;
+bool enable_partition_pruning = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 280f21cd45..ea5de0cb1b 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1077,7 +1077,8 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path)
subplans = lappend(subplans, subplan);
}
- if (rel->reloptkind == RELOPT_BASEREL &&
+ if (enable_partition_pruning &&
+ rel->reloptkind == RELOPT_BASEREL &&
best_path->partitioned_rels != NIL)
{
List *prunequal;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 1ff0ef4866..98b353b80b 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1272,7 +1272,7 @@ get_relation_constraints(PlannerInfo *root,
* descriptor, instead of constraint exclusion which is driven by the
* individual partition's partition constraint.
*/
- if (root->parse->commandType != CMD_SELECT)
+ if (enable_partition_pruning && root->parse->commandType != CMD_SELECT)
{
List *pcqual = RelationGetPartitionQual(relation);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index fa92ce2e68..c51a9270e4 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -951,6 +951,15 @@ static struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_partition_pruning", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables the planner's ability to remove non-required partitions from the query plan."),
+ NULL
+ },
+ &enable_partition_pruning,
+ true,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Enables genetic query optimization."),
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index d3269eae71..2995c4d26d 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -71,6 +71,7 @@ extern PGDLLIMPORT bool enable_partitionwise_join;
extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
+extern PGDLLIMPORT bool enable_partition_pruning;
extern PGDLLIMPORT int constraint_exclusion;
extern double clamp_row_est(double nrows);
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index a19ee08749..a1c90eb905 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -83,12 +83,13 @@ select name, setting from pg_settings where name like 'enable%';
enable_nestloop | on
enable_parallel_append | on
enable_parallel_hash | on
+ enable_partition_pruning | on
enable_partitionwise_aggregate | off
enable_partitionwise_join | off
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(16 rows)
+(17 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
--
2.16.2.windows.1
On Fri, Apr 20, 2018 at 7:37 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2018/04/19 21:50, Ashutosh Bapat wrote:
On Thu, Apr 19, 2018 at 5:02 PM, Amit Langote
I can imagine having a enable_partition_pruning which defaults to true, if
only to avoid the performance overhead of pruning code when a user knows
for sure that it won't help for some queries. Although, I'm a bit dubious
why they'd write such queries if they're using partitioning in the first
place.Also, I'd think that enable_partition_pruning set to false means pruning
doesn't occur at all, not even using constraint exclusion. That is,
behavior equivalent of constraint_exclusion < partition (that is, off/on).Also, if we do have such a GUC, it should apply to all command types,
including UPDATE and DELETE which don't yet invoke the new pruning code,
from the start. So, if enable_partition_pruning is false, we won't load
the partition constraints at all, which we currently do for UPDATE and
DELETE so that constraint exclusion can be used for pruning. OTOH, if
enable_partition_pruning is on, we perform constraint exclusion -based
pruning for UPDATE and DELETE irrespective of the setting of
constraint_exclusion GUC. In other words, we completely dissociate
partitioned table pruning from the setting of constraint_exclusion.Isn't word "dissociate" turns the last sentence into a sentence
contradicting everything you wrote prior to it?I think we should keep these two things separate.
Yes, that's what I meant.
To clarify: if we're going to add a new parameter *for partitioned tables*
to configure whether or not pruning occurs, even if UPDATE and DELETE now
rely on constraint exclusion for pruning, we should ignore the setting of
constraint_exclusion the configuration parameter. For UPDATE and DELETE,
if enable_partition_pruning is on, we proceed to prune using constraint
exclusion (because that's the only method available now), irrespective of
the setting of constraint_exclusion.So to users, enable_partition_pruning should be the only way to configure
whether or not pruning occurs.Does that make sense?
It seems like talking about the finer implementation details is making
this discussion a bit confusing.enable_partition_pruning affects the partition pruning based on the
partition bounds and that currently does not work for UPDATE/DELETE.
When it does work in those case, we might think of not loading
partition bound based constraints. constraint_exclusion affects
whether constraints can be used to exclude a relation (with partition
option affecting the child tables). Once we stop loading partition
bound based constraints, constraint exclusion would stop pruning
partitions based on the bounds. There's no point in confusing users
with by adding dependencies between these two GUCs.That's exactly what I'm trying to propose.
Not really. By pruning based on the partition bounds I didn't mean
constraint exclusion working on partition bound based constraints.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On 2018/04/20 15:00, Ashutosh Bapat wrote:
On Fri, Apr 20, 2018 at 7:37 AM, Amit Langote wrote:
On 2018/04/19 21:50, Ashutosh Bapat wrote:
There's no point in confusing users
with by adding dependencies between these two GUCs.That's exactly what I'm trying to propose.
Not really. By pruning based on the partition bounds I didn't mean
constraint exclusion working on partition bound based constraints.
Sorry, I should have said what I said after quoting only the last sentence
of what you had said. That is, I want to the new GUC to be the only
determiner of whether the pruning occurs or not for partitioned tables.
To implement that behavior, it will have to override the setting of
constraint_exclusion (the parameter) in *some* cases, because some
commands still rely on constraint exclusion (the algorithm) as the
underlying pruning mechanism. Now, the "override the setting of
constraint_exclusion" implementation may not be the most popular choice in
the end.
Thanks,
Amit
Hi David.
Thanks for writing the patch.
On 2018/04/20 14:47, David Rowley wrote:
On 20 April 2018 at 14:07, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
To clarify: if we're going to add a new parameter *for partitioned tables*
to configure whether or not pruning occurs, even if UPDATE and DELETE now
rely on constraint exclusion for pruning, we should ignore the setting of
constraint_exclusion the configuration parameter. For UPDATE and DELETE,
if enable_partition_pruning is on, we proceed to prune using constraint
exclusion (because that's the only method available now), irrespective of
the setting of constraint_exclusion.So to users, enable_partition_pruning should be the only way to configure
whether or not pruning occurs.I hope the attached implements what is being discussed here.
Please test it to ensure it behaves as you'd expect.
I was a little unsure if the new GUCs declaration should live in
costsize.c or not since it really has no effect on plan costs, but in
the end, I stuck it there anyway so that it can be with its friends.
The patch looks good except one thing, which I was trying to emphasize
shouldn't be the behavior.
drop table p;
create table p (a int) partition by list (a);
create table p1 partition of p for values in (1);
create table p2 partition of p for values in (2);
set enable_partition_pruning to off;
-- ok
explain select * from p where a = 1;
QUERY PLAN
----------------------------------------------------------
Append (cost=0.00..83.88 rows=26 width=4)
-> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=4)
Filter: (a = 1)
-> Seq Scan on p2 (cost=0.00..41.88 rows=13 width=4)
Filter: (a = 1)
(5 rows)
reset enable_partition_pruning;
-- ok
explain select * from p where a = 1;
QUERY PLAN
----------------------------------------------------------
Append (cost=0.00..41.94 rows=13 width=4)
-> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=4)
Filter: (a = 1)
(3 rows)
set enable_partition_pruning to off;
-- ok
explain update p set a = 2 where a = 1;
QUERY PLAN
-----------------------------------------------------------
Update on p (cost=0.00..83.75 rows=26 width=10)
Update on p1
Update on p2
-> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=10)
Filter: (a = 1)
-> Seq Scan on p2 (cost=0.00..41.88 rows=13 width=10)
Filter: (a = 1)
(7 rows)
reset enable_partition_pruning;
-- ok
explain update p set a = 2 where a = 1;
QUERY PLAN
-----------------------------------------------------------
Update on p (cost=0.00..41.88 rows=13 width=10)
Update on p1
-> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=10)
Filter: (a = 1)
(4 rows)
set constraint_exclusion to off;
-- not ok!
explain update p set a = 2 where a = 1;
QUERY PLAN
-----------------------------------------------------------
Update on p (cost=0.00..83.75 rows=26 width=10)
Update on p1
Update on p2
-> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=10)
Filter: (a = 1)
-> Seq Scan on p2 (cost=0.00..41.88 rows=13 width=10)
Filter: (a = 1)
(7 rows)
I think we should teach relation_excluded_by_constraints() to forge ahead
based on the value of enable_partition_pruning, ignoring whatever
constraint_exclusion has been set to. What do you think of doing that
sort of thing?
Thanks,
Amit
On 2018/04/20 17:51, Amit Langote wrote:
On 2018/04/20 14:47, David Rowley wrote:
On 20 April 2018 at 14:07, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
To clarify: if we're going to add a new parameter *for partitioned tables*
to configure whether or not pruning occurs, even if UPDATE and DELETE now
rely on constraint exclusion for pruning, we should ignore the setting of
constraint_exclusion the configuration parameter. For UPDATE and DELETE,
if enable_partition_pruning is on, we proceed to prune using constraint
exclusion (because that's the only method available now), irrespective of
the setting of constraint_exclusion.So to users, enable_partition_pruning should be the only way to configure
whether or not pruning occurs.I hope the attached implements what is being discussed here.
Please test it to ensure it behaves as you'd expect.
I was a little unsure if the new GUCs declaration should live in
costsize.c or not since it really has no effect on plan costs, but in
the end, I stuck it there anyway so that it can be with its friends.The patch looks good except one thing,
OK, I forgot to comment on a couple of minor issues.
+ <varlistentry id="guc-enable-partition-pruning"
xreflabel="enable_partition_pruning">
+ <term><varname>enable_partition_pruning</varname>
(<type>boolean</type>)
+ <indexterm>
+ <primary><varname>enable_partition_pruning</varname>
configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Enables or disables the query planner's ability to eliminate a
+ partitioned table's subpartitions from query plans.
Why subpartitions? Maybe, just "partitions" is fine.
+ This also
+ controls the planner's ability to generate query plans which
allow the
+ query executor to remove or ignoring partitions during query
Here: ignoring -> ignore
Also, maybe add the GUC to postgresql.conf.sample.
Thanks,
Amit
Amit Langote wrote:
Sorry, I should have said what I said after quoting only the last sentence
of what you had said. That is, I want to the new GUC to be the only
determiner of whether the pruning occurs or not for partitioned tables.
To implement that behavior, it will have to override the setting of
constraint_exclusion (the parameter) in *some* cases, because some
commands still rely on constraint exclusion (the algorithm) as the
underlying pruning mechanism.
I agree -- it will make more sense now, and will continue to make sense
later when we remove usage of constraint exclusion for upd/del, to make
it work as you suggest:
* if the table is partitioned, do constraint exclusion based on
enable_partition_prune=on rather than constraint_exclusion=partition.
This will only affect upd/del, because the select queries would be
affected by the enable_partition_prune anyway since
constraint_exclusion does not apply.
* If the table is using regular inheritance, continue to use the
original behavior.
Now, the "override the setting of constraint_exclusion" implementation
may not be the most popular choice in the end.
I guess there are different ways to implement it. Supposedly this is
going to disappear in pg12, so I don't think it's a big deal.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index fa92ce2e68..c51a9270e4 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -951,6 +951,15 @@ static struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_partition_pruning", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables the planner's ability to remove non-required partitions from the query plan."),
+ NULL
+ },
+ &enable_partition_pruning,
+ true,
+ NULL, NULL, NULL
+ },
I would make the short description shorter, and use the long description
to elaborate. So gettext_noop("Enable plan-time and run-time partition
pruning.")
followed by something like
gettext_noop("Allows the query planner and executor to compare partition
bounds to conditions in the query, and determine which partitions {can be
skipped | must be scanned} ...")
(Not wedded to those particular phrasings.)
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thank you both of you for looking at this.
On 21 April 2018 at 06:28, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
+ {"enable_partition_pruning", PGC_USERSET, QUERY_TUNING_METHOD, + gettext_noop("Enables the planner's ability to remove non-required partitions from the query plan."), + NULL + }, + &enable_partition_pruning, + true, + NULL, NULL, NULL + },I would make the short description shorter, and use the long description
to elaborate. So gettext_noop("Enable plan-time and run-time partition
pruning.")
followed by something likegettext_noop("Allows the query planner and executor to compare partition
bounds to conditions in the query, and determine which partitions {can be
skipped | must be scanned} ...")
I've taken a slight variation of this, but instead of ", and" I used
"to" and went with the "must be scanned" option.
select * from pg_settings where name like 'enable%'; does show that
this is the only enable_* GUC to have a long description, but perhaps
that does not matter.
On 20 April 2018 at 20:51, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
set constraint_exclusion to off;
-- not ok!
It needed a bit more effort than I put in the first time around to
make this work properly. constraint_exclusion = 'off' becomes a bit of
a special case for partitioned tables now. To make this work I had to
get rid of hasInheritedTarget and make a new enum that tracks if we're
inheritance planning for an inheritance parent or a partitioned table.
We can't simply only set hasInheritedTarget to true when planning with
inheritance parents as constraint_exclusion = 'partition' must still
know that we're planning using the inheritance planner.
v2 patch attached.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
v2-0001-Add-GUC-to-allow-partition-pruning-to-be-disabled.patchapplication/octet-stream; name=v2-0001-Add-GUC-to-allow-partition-pruning-to-be-disabled.patchDownload
From 86d077e314b4a3c7a6f379376d90a59d973b3100 Mon Sep 17 00:00:00 2001
From: "dgrowley@gmail.com" <dgrowley@gmail.com>
Date: Sat, 21 Apr 2018 16:58:17 +1200
Subject: [PATCH v2] Add GUC to allow partition pruning to be disabled
This both disables the plan-time partition pruning and also run-time partition
pruning. We may one day want to consider two individual switches for turning
off each of these features, but at the moment there seems to be no strong
argument for that.
---
doc/src/sgml/config.sgml | 30 ++++--
src/backend/nodes/outfuncs.c | 2 +-
src/backend/optimizer/path/allpaths.c | 3 +-
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/createplan.c | 7 +-
src/backend/optimizer/plan/planner.c | 13 ++-
src/backend/optimizer/prep/prepjointree.c | 2 +-
src/backend/optimizer/util/plancat.c | 29 ++++--
src/backend/utils/misc/guc.c | 9 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/nodes/relation.h | 16 ++-
src/include/optimizer/cost.h | 1 +
src/test/regress/expected/partition_prune.out | 138 ++++++++++++++++++++++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/partition_prune.sql | 46 +++++++++
15 files changed, 275 insertions(+), 26 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5d5f2d23c4..49173fcf9c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3826,6 +3826,23 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
</listitem>
</varlistentry>
+ <varlistentry id="guc-enable-partition-pruning" xreflabel="enable_partition_pruning">
+ <term><varname>enable_partition_pruning</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>enable_partition_pruning</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Enables or disables the query planner's ability to eliminate a
+ partitioned table's partitions from query plans. This also controls
+ the planner's ability to generate query plans which allow the query
+ executor to remove or ignore partitions during query execution. The
+ default is <literal>on</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-enable-partitionwise-join" xreflabel="enable_partitionwise_join">
<term><varname>enable_partitionwise_join</varname> (<type>boolean</type>)
<indexterm>
@@ -4417,8 +4434,7 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
<literal>partition</literal> (examine constraints only for inheritance child
tables and <literal>UNION ALL</literal> subqueries).
<literal>partition</literal> is the default setting.
- It is often used with inheritance and partitioned tables to
- improve performance.
+ It is often used with inheritance tables to improve performance.
</para>
<para>
@@ -4441,11 +4457,11 @@ SELECT * FROM parent WHERE key = 2400;
<para>
Currently, constraint exclusion is enabled by default
- only for cases that are often used to implement table partitioning.
- Turning it on for all tables imposes extra planning overhead that is
- quite noticeable on simple queries, and most often will yield no
- benefit for simple queries. If you have no partitioned tables
- you might prefer to turn it off entirely.
+ only for cases that are often used to implement table partitioning via
+ inheritance tables. Turning it on for all tables imposes extra
+ planning overhead that is quite noticeable on simple queries, and most
+ often will yield no benefit for simple queries. If you have no
+ inheritance partitioned tables you might prefer to turn it off entirely.
</para>
<para>
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 3991a0ce83..f2d00c5e31 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2291,7 +2291,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
WRITE_FLOAT_FIELD(tuple_fraction, "%.4f");
WRITE_FLOAT_FIELD(limit_tuples, "%.0f");
WRITE_UINT_FIELD(qual_security_level);
- WRITE_BOOL_FIELD(hasInheritedTarget);
+ WRITE_ENUM_FIELD(inhTargetKind, InheritanceKind);
WRITE_BOOL_FIELD(hasJoinRTEs);
WRITE_BOOL_FIELD(hasLateralRTEs);
WRITE_BOOL_FIELD(hasDeletedRTEs);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 3ba3f87eb7..9ed73da0f7 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -901,7 +901,8 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
* store the relids of all partitions which could possibly contain a
* matching tuple, and skip anything else in the loop below.
*/
- if (rte->relkind == RELKIND_PARTITIONED_TABLE &&
+ if (enable_partition_pruning &&
+ rte->relkind == RELKIND_PARTITIONED_TABLE &&
rel->baserestrictinfo != NIL)
{
live_children = prune_append_rel_partitions(rel);
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 957f751bd4..a2a7e0c520 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -138,6 +138,7 @@ bool enable_partitionwise_join = false;
bool enable_partitionwise_aggregate = false;
bool enable_parallel_append = true;
bool enable_parallel_hash = true;
+bool enable_partition_pruning = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 280f21cd45..0317763f43 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1077,7 +1077,8 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path)
subplans = lappend(subplans, subplan);
}
- if (rel->reloptkind == RELOPT_BASEREL &&
+ if (enable_partition_pruning &&
+ rel->reloptkind == RELOPT_BASEREL &&
best_path->partitioned_rels != NIL)
{
List *prunequal;
@@ -1979,7 +1980,7 @@ create_groupingsets_plan(PlannerInfo *root, GroupingSetsPath *best_path)
* create_modifytable_plan). Fortunately we can't be because there would
* never be grouping in an UPDATE/DELETE; but let's Assert that.
*/
- Assert(!root->hasInheritedTarget);
+ Assert(root->inhTargetKind == INHKIND_NONE);
Assert(root->grouping_map == NULL);
root->grouping_map = grouping_map;
@@ -2141,7 +2142,7 @@ create_minmaxagg_plan(PlannerInfo *root, MinMaxAggPath *best_path)
* create_modifytable_plan). Fortunately we can't be because there would
* never be aggregates in an UPDATE/DELETE; but let's Assert that.
*/
- Assert(!root->hasInheritedTarget);
+ Assert(root->inhTargetKind == INHKIND_NONE);
Assert(root->minmax_aggs == NIL);
root->minmax_aggs = best_path->mmaggregates;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 2e298f8357..c90cd3951f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -623,7 +623,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
root->grouping_map = NULL;
root->minmax_aggs = NIL;
root->qual_security_level = 0;
- root->hasInheritedTarget = false;
+ root->inhTargetKind = INHKIND_NONE;
root->hasRecursion = hasRecursion;
if (hasRecursion)
root->wt_param_id = SS_assign_special_param(root);
@@ -1424,8 +1424,15 @@ inheritance_planner(PlannerInfo *root)
Assert(subroot->join_info_list == NIL);
/* and we haven't created PlaceHolderInfos, either */
Assert(subroot->placeholder_list == NIL);
- /* hack to mark target relation as an inheritance partition */
- subroot->hasInheritedTarget = true;
+
+ /*
+ * Mark if we're planning a query to a partitioned table or an
+ * inheritance parent.
+ */
+ if (partitioned_relids)
+ subroot->inhTargetKind = INHKIND_PARTITIONED;
+ else
+ subroot->inhTargetKind = INHKIND_INHERITED;
/*
* If the child is further partitioned, remember it as a parent. Since
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 45d82da459..c3f46a26c3 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -914,7 +914,7 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
subroot->grouping_map = NULL;
subroot->minmax_aggs = NIL;
subroot->qual_security_level = 0;
- subroot->hasInheritedTarget = false;
+ subroot->inhTargetKind = INHKIND_NONE;
subroot->hasRecursion = false;
subroot->wt_param_id = -1;
subroot->non_recursive_path = NULL;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 1ff0ef4866..28d1021b83 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1272,7 +1272,7 @@ get_relation_constraints(PlannerInfo *root,
* descriptor, instead of constraint exclusion which is driven by the
* individual partition's partition constraint.
*/
- if (root->parse->commandType != CMD_SELECT)
+ if (enable_partition_pruning && root->parse->commandType != CMD_SELECT)
{
List *pcqual = RelationGetPartitionQual(relation);
@@ -1415,13 +1415,28 @@ relation_excluded_by_constraints(PlannerInfo *root,
return true;
}
- /* Skip further tests if constraint exclusion is disabled for the rel */
- if (constraint_exclusion == CONSTRAINT_EXCLUSION_OFF ||
- (constraint_exclusion == CONSTRAINT_EXCLUSION_PARTITION &&
- !(rel->reloptkind == RELOPT_OTHER_MEMBER_REL ||
- (root->hasInheritedTarget &&
+ /*
+ * When constraint_exclusion is disabled, don't try to exclude the rel.
+ *
+ * We currently must ignore this off setting when the inheritance target
+ * is a partitioned table as partition pruning is still implemented using
+ * constraint exclusion. Users must use the enable_partition_pruning GUC
+ * to control the behavior in for partitioned tables.
+ */
+ if (constraint_exclusion == CONSTRAINT_EXCLUSION_OFF &&
+ root->inhTargetKind != INHKIND_PARTITIONED)
+ return false;
+
+ /*
+ * When constraint_exclusion is set to 'partition' we only handle
+ * OTHER_MEMBER_RELs, or BASERELs in cases where the result target is an
+ * inheritance parent or a partitioned table.
+ */
+ if (constraint_exclusion == CONSTRAINT_EXCLUSION_PARTITION &&
+ !(rel->reloptkind == RELOPT_OTHER_MEMBER_REL ||
+ (root->inhTargetKind != INHKIND_NONE &&
rel->reloptkind == RELOPT_BASEREL &&
- rel->relid == root->parse->resultRelation))))
+ rel->relid == root->parse->resultRelation)))
return false;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index fa92ce2e68..ec61e04d21 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -951,6 +951,15 @@ static struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_partition_pruning", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enable plan-time and run-time partition pruning."),
+ gettext_noop("Allows the query planner and executor to compare partition bounds to conditions in the query to determine which partitions must be scanned.")
+ },
+ &enable_partition_pruning,
+ true,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Enables genetic query optimization."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 66d0938827..3d88e80a20 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -308,6 +308,7 @@
#enable_partitionwise_join = off
#enable_partitionwise_aggregate = off
#enable_parallel_hash = on
+#enable_partition_pruning = on
# - Planner Cost Constants -
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 2108b6ab1d..8b153a9d4e 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -82,6 +82,17 @@ typedef enum UpperRelationKind
/* NB: UPPERREL_FINAL must be last enum entry; it's used to size arrays */
} UpperRelationKind;
+/*
+ * This enum identifies which type of relation is being planned through the
+ * inheritance planner. INHKIND_NONE indicates the inheritance planner
+ * was not used.
+ */
+typedef enum InheritanceKind
+{
+ INHKIND_NONE,
+ INHKIND_INHERITED,
+ INHKIND_PARTITIONED
+} InheritanceKind;
/*----------
* PlannerGlobal
@@ -298,8 +309,9 @@ typedef struct PlannerInfo
Index qual_security_level; /* minimum security_level for quals */
/* Note: qual_security_level is zero if there are no securityQuals */
- bool hasInheritedTarget; /* true if parse->resultRelation is an
- * inheritance child rel */
+ InheritanceKind inhTargetKind; /* indicates if the target relation is an
+ * inheritance child or partition or a
+ * partitioned table */
bool hasJoinRTEs; /* true if any RTEs are RTE_JOIN kind */
bool hasLateralRTEs; /* true if any RTEs are marked LATERAL */
bool hasDeletedRTEs; /* true if any RTE was deleted from jointree */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 6e6d0d3c79..55e6a8488f 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -71,6 +71,7 @@ extern PGDLLIMPORT bool enable_partitionwise_join;
extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
+extern PGDLLIMPORT bool enable_partition_pruning;
extern PGDLLIMPORT int constraint_exclusion;
extern double clamp_row_est(double nrows);
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 9c65ee001d..a86169c4e9 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2737,3 +2737,141 @@ explain (costs off) select * from pp_intrangepart where a = '(1,2)'::int4range;
(2 rows)
drop table pp_intrangepart;
+--
+-- Ensure the enable_partition_prune GUC properly disables partition pruning.
+--
+create table pp_lp (a int, value int) partition by list (a);
+create table pp_lp1 partition of pp_lp for values in(1);
+create table pp_lp2 partition of pp_lp for values in(2);
+set enable_partition_pruning = off;
+set constraint_exclusion = 'partition'; -- this should not affect the result.
+explain (costs off) select * from pp_lp where a = 1;
+ QUERY PLAN
+--------------------------
+ Append
+ -> Seq Scan on pp_lp1
+ Filter: (a = 1)
+ -> Seq Scan on pp_lp2
+ Filter: (a = 1)
+(5 rows)
+
+explain (costs off) update pp_lp set value = 10 where a = 1;
+ QUERY PLAN
+--------------------------
+ Update on pp_lp
+ Update on pp_lp1
+ Update on pp_lp2
+ -> Seq Scan on pp_lp1
+ Filter: (a = 1)
+ -> Seq Scan on pp_lp2
+ Filter: (a = 1)
+(7 rows)
+
+explain (costs off) delete from pp_lp where a = 1;
+ QUERY PLAN
+--------------------------
+ Delete on pp_lp
+ Delete on pp_lp1
+ Delete on pp_lp2
+ -> Seq Scan on pp_lp1
+ Filter: (a = 1)
+ -> Seq Scan on pp_lp2
+ Filter: (a = 1)
+(7 rows)
+
+set constraint_exclusion = 'off'; -- this should not affect the result.
+explain (costs off) select * from pp_lp where a = 1;
+ QUERY PLAN
+--------------------------
+ Append
+ -> Seq Scan on pp_lp1
+ Filter: (a = 1)
+ -> Seq Scan on pp_lp2
+ Filter: (a = 1)
+(5 rows)
+
+explain (costs off) update pp_lp set value = 10 where a = 1;
+ QUERY PLAN
+--------------------------
+ Update on pp_lp
+ Update on pp_lp1
+ Update on pp_lp2
+ -> Seq Scan on pp_lp1
+ Filter: (a = 1)
+ -> Seq Scan on pp_lp2
+ Filter: (a = 1)
+(7 rows)
+
+explain (costs off) delete from pp_lp where a = 1;
+ QUERY PLAN
+--------------------------
+ Delete on pp_lp
+ Delete on pp_lp1
+ Delete on pp_lp2
+ -> Seq Scan on pp_lp1
+ Filter: (a = 1)
+ -> Seq Scan on pp_lp2
+ Filter: (a = 1)
+(7 rows)
+
+drop table pp_lp;
+-- Ensure enable_partition_prune does not affect non-partitioned tables.
+create table inh_lp (a int, value int);
+create table inh_lp1 (a int, value int, check(a = 1)) inherits (inh_lp);
+NOTICE: merging column "a" with inherited definition
+NOTICE: merging column "value" with inherited definition
+create table inh_lp2 (a int, value int, check(a = 2)) inherits (inh_lp);
+NOTICE: merging column "a" with inherited definition
+NOTICE: merging column "value" with inherited definition
+set constraint_exclusion = 'partition';
+-- inh_lp2 should be removed in the following 3 cases.
+explain (costs off) select * from inh_lp where a = 1;
+ QUERY PLAN
+---------------------------
+ Append
+ -> Seq Scan on inh_lp
+ Filter: (a = 1)
+ -> Seq Scan on inh_lp1
+ Filter: (a = 1)
+(5 rows)
+
+explain (costs off) update inh_lp set value = 10 where a = 1;
+ QUERY PLAN
+---------------------------
+ Update on inh_lp
+ Update on inh_lp
+ Update on inh_lp1
+ -> Seq Scan on inh_lp
+ Filter: (a = 1)
+ -> Seq Scan on inh_lp1
+ Filter: (a = 1)
+(7 rows)
+
+explain (costs off) delete from inh_lp where a = 1;
+ QUERY PLAN
+---------------------------
+ Delete on inh_lp
+ Delete on inh_lp
+ Delete on inh_lp1
+ -> Seq Scan on inh_lp
+ Filter: (a = 1)
+ -> Seq Scan on inh_lp1
+ Filter: (a = 1)
+(7 rows)
+
+-- Ensure we don't exclude normal relations when we only expect to exclude
+-- inheritance children
+explain (costs off) update inh_lp1 set value = 10 where a = 2;
+ QUERY PLAN
+---------------------------
+ Update on inh_lp1
+ -> Seq Scan on inh_lp1
+ Filter: (a = 2)
+(3 rows)
+
+drop table inh_lp cascade;
+NOTICE: drop cascades to 2 other objects
+DETAIL: drop cascades to table inh_lp1
+drop cascades to table inh_lp2
+reset enable_partition_pruning;
+reset constraint_exclusion;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index a19ee08749..a1c90eb905 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -83,12 +83,13 @@ select name, setting from pg_settings where name like 'enable%';
enable_nestloop | on
enable_parallel_append | on
enable_parallel_hash | on
+ enable_partition_pruning | on
enable_partitionwise_aggregate | off
enable_partitionwise_join | off
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(16 rows)
+(17 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index b38b39c71e..6ced422f17 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -698,3 +698,49 @@ create table pp_intrangepart2inf partition of pp_intrangepart for values in ('[2
explain (costs off) select * from pp_intrangepart where a = '[1,2]'::int4range;
explain (costs off) select * from pp_intrangepart where a = '(1,2)'::int4range;
drop table pp_intrangepart;
+
+--
+-- Ensure the enable_partition_prune GUC properly disables partition pruning.
+--
+
+create table pp_lp (a int, value int) partition by list (a);
+create table pp_lp1 partition of pp_lp for values in(1);
+create table pp_lp2 partition of pp_lp for values in(2);
+
+set enable_partition_pruning = off;
+
+set constraint_exclusion = 'partition'; -- this should not affect the result.
+
+explain (costs off) select * from pp_lp where a = 1;
+explain (costs off) update pp_lp set value = 10 where a = 1;
+explain (costs off) delete from pp_lp where a = 1;
+
+set constraint_exclusion = 'off'; -- this should not affect the result.
+
+explain (costs off) select * from pp_lp where a = 1;
+explain (costs off) update pp_lp set value = 10 where a = 1;
+explain (costs off) delete from pp_lp where a = 1;
+
+drop table pp_lp;
+
+-- Ensure enable_partition_prune does not affect non-partitioned tables.
+
+create table inh_lp (a int, value int);
+create table inh_lp1 (a int, value int, check(a = 1)) inherits (inh_lp);
+create table inh_lp2 (a int, value int, check(a = 2)) inherits (inh_lp);
+
+set constraint_exclusion = 'partition';
+
+-- inh_lp2 should be removed in the following 3 cases.
+explain (costs off) select * from inh_lp where a = 1;
+explain (costs off) update inh_lp set value = 10 where a = 1;
+explain (costs off) delete from inh_lp where a = 1;
+
+-- Ensure we don't exclude normal relations when we only expect to exclude
+-- inheritance children
+explain (costs off) update inh_lp1 set value = 10 where a = 2;
+
+drop table inh_lp cascade;
+
+reset enable_partition_pruning;
+reset constraint_exclusion;
--
2.16.2.windows.1
Hi David.
On 2018/04/21 14:09, David Rowley wrote:
On 20 April 2018 at 20:51, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
set constraint_exclusion to off;
-- not ok!
It needed a bit more effort than I put in the first time around to
make this work properly. constraint_exclusion = 'off' becomes a bit of
a special case for partitioned tables now. To make this work I had to
get rid of hasInheritedTarget and make a new enum that tracks if we're
inheritance planning for an inheritance parent or a partitioned table.
We can't simply only set hasInheritedTarget to true when planning with
inheritance parents as constraint_exclusion = 'partition' must still
know that we're planning using the inheritance planner.v2 patch attached.
Thanks for the updated patch.
Your proposed changes to inheritance_planner() look fine to me. In the
comment added by the patch in relation_excluded_by_constraints():
+ /*
+ * When constraint_exclusion is set to 'partition' we only handle
+ * OTHER_MEMBER_RELs, or BASERELs in cases where the result target is an
+ * inheritance parent or a partitioned table.
+ */
Just to clarify this a bit, would it be a good idea to be specific by
appending " (see inheritance_planner() where this is determined)" or some
such to this sentence?
BTW, while we're at it, would it also be a good idea to consider the patch
you had proposed, which I then posted an updated version of, to adjust the
documentation in ddl.sgml (in the section 5.10. Table Partitioning)
regarding the relationship between constraint exclusion and declarative
partitioning?
/messages/by-id/c2838545-0e77-3c08-cd14-1c3bbf9eb62d@lab.ntt.co.jp
Thanks,
Amit
Hi,
I just pushed David patch, with some pretty minor changes. I hope not
to have broken anything.
Amit Langote wrote:
Your proposed changes to inheritance_planner() look fine to me. In the
comment added by the patch in relation_excluded_by_constraints():+ /* + * When constraint_exclusion is set to 'partition' we only handle + * OTHER_MEMBER_RELs, or BASERELs in cases where the result target is an + * inheritance parent or a partitioned table. + */Just to clarify this a bit, would it be a good idea to be specific by
appending " (see inheritance_planner() where this is determined)" or some
such to this sentence?
I didn't think that was really required.
BTW, while we're at it, would it also be a good idea to consider the patch
you had proposed, which I then posted an updated version of, to adjust the
documentation in ddl.sgml (in the section 5.10. Table Partitioning)
regarding the relationship between constraint exclusion and declarative
partitioning?
I looked at this one. That patch has two hunks. I applied a change
where the first hunk is, to replace constraint_exclusion with the new
GUC -- seemed easy enough. However, the second hunk is on "5.10.4.
Partitioning and Constraint Exclusion" which needs major editing. Not
really sure how best to handle that one. For starters, I think it need
to stop mentioning the GUC name in the title; maybe rename it to
"Partition Pruning" or some such, and then in the text explain that
sometimes the enable_partition_pruning param is used in one case and
constraint_exclusion in the other, and approximately what effects they
have. I don't think it's worth going into too much detail on exactly
how they differ, but then I'm not 100% sure of that either.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018/04/24 6:10, Alvaro Herrera wrote:
BTW, while we're at it, would it also be a good idea to consider the patch
you had proposed, which I then posted an updated version of, to adjust the
documentation in ddl.sgml (in the section 5.10. Table Partitioning)
regarding the relationship between constraint exclusion and declarative
partitioning?I looked at this one. That patch has two hunks. I applied a change
where the first hunk is, to replace constraint_exclusion with the new
GUC -- seemed easy enough.
Looks good.
However, the second hunk is on "5.10.4.
Partitioning and Constraint Exclusion" which needs major editing.
Reading 5.10.4 again, I tend to agree with this.
Not really sure how best to handle that one. For starters, I think it need
to stop mentioning the GUC name in the title;
Hmm, "Constraint Exclusion" that's used in the title is a concept, not a
GUC, although pretty close.
maybe rename it to
"Partition Pruning" or some such, and then in the text explain that
sometimes the enable_partition_pruning param is used in one case and
constraint_exclusion in the other, and approximately what effects they
have. I don't think it's worth going into too much detail on exactly
how they differ, but then I'm not 100% sure of that either.
Just a thought -- How about making 5.10.4 cover partitioning based
optimizations in general? I see that a number of partitioning-based
optimizations have been developed in this release cycle, but I only see
various enable_partition* GUCs listed in config.sgml and not much else.
Although the config.sgml coverage of the new capabilities seems pretty
good, some may find their being mentioned in 5.10 Table Partitioning
helpful. Or if we don't want to hijack 5.10.4, maybe create a 5.10.5.
Thanks,
Amit
The patch looks good to me, David.
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Amit Langote wrote:
On 2018/04/24 6:10, Alvaro Herrera wrote:
Not really sure how best to handle that one. For starters, I think it need
to stop mentioning the GUC name in the title;Hmm, "Constraint Exclusion" that's used in the title is a concept, not a
GUC, although pretty close.
Yeah, I meant that if we want that section to cover the general concept
of partition pruning, with either technique, better not use the words
"constraint exclusion" in the title.
maybe rename it to
"Partition Pruning" or some such, and then in the text explain that
sometimes the enable_partition_pruning param is used in one case and
constraint_exclusion in the other, and approximately what effects they
have. I don't think it's worth going into too much detail on exactly
how they differ, but then I'm not 100% sure of that either.Just a thought -- How about making 5.10.4 cover partitioning based
optimizations in general? I see that a number of partitioning-based
optimizations have been developed in this release cycle, but I only see
various enable_partition* GUCs listed in config.sgml and not much else.
I think we should not rely on the config.sgml blurbs exclusively; some
narrative is always welcome -- except that for planner enable_* settings
I'm not sure we really need all that much text after all. Constraint
exclusion was pretty easy to get wrong, hence the need for a separate
section, and I suppose the new partition pruning may be prey to the same
problems, so it seems worth to document them specially. But not sure
about the others, if they are mostly debugging tools.
Although the config.sgml coverage of the new capabilities seems pretty
good, some may find their being mentioned in 5.10 Table Partitioning
helpful. Or if we don't want to hijack 5.10.4, maybe create a 5.10.5.
Can you (or someone) describe what would that section contain?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Jim Finnerty wrote:
The patch looks good to me, David.
Thanks for checking! It's already pushed.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 24 April 2018 at 09:10, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
I just pushed David patch, with some pretty minor changes. I hope not
to have broken anything.
Thanks for pushing and thanks Amit for reviewing.
The only thing that stands out in the actual commit is:
+ executor to remove (ignore) partitions during query execution. The
I had originally written:
+ executor to remove or ignore partitions during query execution. The
The reason I was using "remove or ignore" was that partitions pruned
during init plan are effectively "removed" from the plan, whereas
partitions pruned during the running of the planner are just
"ignored".
It's minor details but I thought I'd better point it out.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 25 April 2018 at 09:59, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Amit Langote wrote:
Although the config.sgml coverage of the new capabilities seems pretty
good, some may find their being mentioned in 5.10 Table Partitioning
helpful. Or if we don't want to hijack 5.10.4, maybe create a 5.10.5.Can you (or someone) describe what would that section contain?
I've drafted and attached a patch of how I think this should look.
Likely it will need some tweaking, but it is probably a good starting
point for discussion.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
further_enable_partition_pruning_doc_updates.patchapplication/octet-stream; name=further_enable_partition_pruning_doc_updates.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index eabe2a9235..bda6adfdfc 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3840,6 +3840,11 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
executor to remove (ignore) partitions during query execution. The
default is <literal>on</literal>.
</para>
+
+ <para>
+ Refer to <xref linkend="ddl-partition-pruning"/> for more information
+ on partition pruning and partitioning.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 34da0d8d57..89735b4804 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3760,7 +3760,7 @@ ANALYZE measurement;
</sect2>
<sect2 id="ddl-partitioning-constraint-exclusion">
- <title>Partitioning and Constraint Exclusion</title>
+ <title>Inheritance Partitioning and Constraint Exclusion</title>
<indexterm>
<primary>constraint exclusion</primary>
@@ -3768,9 +3768,8 @@ ANALYZE measurement;
<para>
<firstterm>Constraint exclusion</firstterm> is a query optimization technique
- that improves performance for partitioned tables defined in the
- fashion described above (both declaratively partitioned tables and those
- implemented using inheritance). As an example:
+ that improves performance for inheritance partitioned tables defined in the
+ fashion described above. As an example:
<programlisting>
SET constraint_exclusion = on;
@@ -3847,15 +3846,14 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<xref linkend="guc-constraint-exclusion"/> is actually neither
<literal>on</literal> nor <literal>off</literal>, but an intermediate setting
called <literal>partition</literal>, which causes the technique to be
- applied only to queries that are likely to be working on partitioned
+ applied only to queries that are likely to be working on inheritance partitioned
tables. The <literal>on</literal> setting causes the planner to examine
<literal>CHECK</literal> constraints in all queries, even simple ones that
are unlikely to benefit.
</para>
<para>
- The following caveats apply to constraint exclusion, which is used by
- both inheritance and partitioned tables:
+ The following caveats apply to constraint exclusion:
<itemizedlist>
<listitem>
@@ -3879,9 +3877,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
contain only comparisons of the partitioning column(s) to constants
using B-tree-indexable operators, which applies even to partitioned
tables, because only B-tree-indexable column(s) are allowed in the
- partition key. (This is not a problem when using declarative
- partitioning, since the automatically generated constraints are simple
- enough to be understood by the planner.)
+ partition key.
</para>
</listitem>
@@ -3898,6 +3894,95 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</itemizedlist>
</para>
</sect2>
+
+ <sect2 id="ddl-partition-pruning">
+ <title>Declarative Partitioning and Partition Pruning</title>
+
+ <indexterm>
+ <primary>partition pruning</primary>
+ </indexterm>
+
+ <para>
+ <firstterm>Partition Pruning</firstterm> is a query optimization technique
+ similar to constraint exclusion, but applies only to declaratively
+ partitioned tables. Like constraint exclusion, this uses (but is not
+ limited to using) the query's <literal>WHERE</literal> clause to exclude
+ partitions which cannot possibly contain any matching records.
+ </para>
+
+ <para>
+ Unlike constraint exclusion, partition pruning can be performed much more
+ quickly as it does not have to scan each individual partition's metadata
+ to determine if the partition is required for a particular query.
+ </para>
+
+ <para>
+ Partition Pruning is also more powerful than constraint exclusion as
+ partition pruning is not something that is performed only during the
+ planning of a given query. In certain cases, partition pruning may also
+ be performed during execution of the query as well. This allows pruning
+ to be performed using values which are unknown during query planning, for
+ example, using parameters defined in a <command>PREPARE</command>
+ statement, using a value obtained from a subquery or using parameters from
+ a parameterized nested loop join.
+ </para>
+
+ <para>
+ The partition pruning which is performed during execution is done so at
+ either one or both of the following times:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ During initialization of the query plan. Partition pruning can be
+ performed here for parameter values which are known during the
+ initialization phase of execution. If partition pruning can be
+ performed here then there is the added benefit of not having to
+ initialize partitions which are pruned. Partitions which are pruned
+ during this stage will not show up in the query's
+ <command>EXPLAIN</command> or <command>EXPLAIN ANALYZE</command>. It
+ is possible to determine the number of partitions which were removed
+ using this method by observing the <quote>Subplans Removed</quote>
+ property in the <command>EXPLAIN</command> output.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ During actual execution of the query plan. Partition pruning may also
+ be performed here to remove partitions using values which are only known
+ during actual query execution. This includes values from subqueries and
+ values from execution time parameters such as ones from parameterized
+ nested loop joins. Since the value of these parameters may change many
+ times during the execution of the query, partition pruning is performed
+ whenever one of the execution parameters which is being compared to a
+ partition column or expression changes. In order to determine if
+ partitions were pruned at this stage requires careful inspection of the
+ <literal>nloops</literal> property in the
+ <command>EXPLAIN ANALYZE</command> output.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Partition pruning can be disabled using the
+ <xref linkend="guc-enable-partition-pruning"/> setting.
+ </para>
+
+ <note>
+ <para>
+ Currently, partition pruning of partitions during the planning of an
+ <command>UPDATE</command> or <command>DELETE</command> command are
+ internally implemented using the constraint exclusion method. Only
+ <command>SELECT</command> uses the faster partition pruning method. Also
+ partition pruning performed during execution is only done so for the
+ Append node type. Both of these limitations are likely to be removed
+ in a future release of <productname>PostgreSQL</productname>.
+ </para>
+ </note>
+ </sect2>
+
</sect1>
<sect1 id="ddl-foreign-data">
On 2018/04/26 16:29, David Rowley wrote:
On 25 April 2018 at 09:59, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Amit Langote wrote:
Although the config.sgml coverage of the new capabilities seems pretty
good, some may find their being mentioned in 5.10 Table Partitioning
helpful. Or if we don't want to hijack 5.10.4, maybe create a 5.10.5.Can you (or someone) describe what would that section contain?
I've drafted and attached a patch of how I think this should look.
Likely it will need some tweaking, but it is probably a good starting
point for discussion.
Thanks David for drafting this. I see that you have not included the
description of other partitioning-based optimizations such partitionwise
plans that the planner can now consider. I guess that's fine though, as
Alvaro also seemed a bit doubtful about the value of providing the
description of those optimizations in this part of the documentation (that
is, in 5.10 Table Partitioning).
About the patch in general, it seems like the newly added documentation
talks about "Partition Pruning" as something that *replaces* constraint
exclusion. But, I think "Partition Pruning" is not the thing that
replaces constraint exclusion. We used to do partition pruning even
before and used constraint exclusion as the algorithm. What's new is the
algorithm that we now use to perform partition pruning for declaratively
partitioned tables. Also, the characteristics of the new algorithm are
such that it can now be used in more situations, thus making it more
useful than the earlier method of partition pruning, so that new features
like runtime pruning could be realized. I like that the patch adds
various details about the new pruning features, but think that the wording
and the flow could be improved a bit.
What do you think?
Thanks,
Amit
On Tue, Apr 24, 2018 at 5:59 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Constraint
exclusion was pretty easy to get wrong, hence the need for a separate
section, and I suppose the new partition pruning may be prey to the same
problems, so it seems worth to document them specially. But not sure
about the others, if they are mostly debugging tools.
Weighing in here late, but I have a hard time understanding why we
want a GUC to control partition pruning at all. With constraint
exclusion, the issue is whether you want to spend planner cycles to
try to deduce things using CHECK constraints when, quite possibly,
your CHECK constraints are unrelated to table inheritance and thus
won't help. But seems extremely unlikely that the same thing would
happen with partition pruning. Unlike your CHECK constraints, your
partition bounds are, by definition, potentially useful for pruning.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2 May 2018 at 07:24, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Apr 24, 2018 at 5:59 PM, Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:Constraint
exclusion was pretty easy to get wrong, hence the need for a separate
section, and I suppose the new partition pruning may be prey to the same
problems, so it seems worth to document them specially. But not sure
about the others, if they are mostly debugging tools.Weighing in here late, but I have a hard time understanding why we
want a GUC to control partition pruning at all. With constraint
exclusion, the issue is whether you want to spend planner cycles to
try to deduce things using CHECK constraints when, quite possibly,
your CHECK constraints are unrelated to table inheritance and thus
won't help. But seems extremely unlikely that the same thing would
happen with partition pruning. Unlike your CHECK constraints, your
partition bounds are, by definition, potentially useful for pruning.
Thanks for weighing in here. It's certainly true that I was a bit
undecided about this, hence the subject. I ended up leaning more towards
having the GUC due to the fact that partition pruning, although much
cheaper than constraint exclusion, it's still not free. There's a good
chance of there being workloads that just never benefit from it. People
running those workloads might be quite glad we added the ability to switch
it off.
It might be worth running a series of benchmarks to test where the worst
case performance hit is with partition pruning. We'd need some fast to
execute query that has items in the WHERE clause, but none that match the
partition key. It should be easy to test the overhead of this now that the
GUC is committed. Perhaps if we're unable to measure the performance drop
then the GUC is not worth it, but if we can, then perhaps it is, as it will
help speed up someone's workload. I'll try to do this today. I imagine
something like: SELECT * FROM parttable WHERE non_part_key_but_indexed_col
IN(<long list of values that matches almost 0 rows>) might be the best bet.
Another reason to have the GUC is in case some bug is discovered in the
pruning code. Being able to disable it could be useful until we can release
a minor version containing a fix. From my time reviewing the faster
partition pruning code, I very much am aware that it's not simple code, so
it would not surprise me if we find a few bugs in it down the track. The
problem with this reason is that it carries less weight every day that
passes with no bug discovered. If no bug is found in 10 years then we'll
likely wonder why we bothered doing it for this reason. Lack of any sort
of crystal ball makes it hard to know what to do here, so let's focus on
the performance reason first.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2 May 2018 at 09:14, David Rowley <david.rowley@2ndquadrant.com> wrote:
It might be worth running a series of benchmarks to test where the worst case performance hit is with partition pruning.
I just did this:
Setup:
create table parttable (a int, b int) partition by list (a);
create table parttable1 partition of parttable for values in(1);
Benchmark:
$ echo "select * from parttable where b
in(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20);" >
parttable.sql
$ pgbench -T 30 -f parttable.sql -n postgres
transaction type: parttable.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 30 s
number of transactions actually processed: 230908
latency average = 0.130 ms
tps = 7696.884795 (including connections establishing)
tps = 7697.304782 (excluding connections establishing)
$ psql -c "alter system set enable_partition_pruning = off" postgres
ALTER SYSTEM
$ pgbench -T 30 -f parttable.sql -n postgres
transaction type: parttable.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 30 s
number of transactions actually processed: 233545
latency average = 0.128 ms
tps = 7784.800130 (including connections establishing)
tps = 7785.225490 (excluding connections establishing)
So about 1.1% performance improvement. That's not very much.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi Amit,
Thanks for looking at the patch.
On 1 May 2018 at 21:44, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
About the patch in general, it seems like the newly added documentation
talks about "Partition Pruning" as something that *replaces* constraint
exclusion. But, I think "Partition Pruning" is not the thing that
replaces constraint exclusion.
Not sure where you see the mention partition pruning replacing
constraint exclusion.
We used to do partition pruning even
before and used constraint exclusion as the algorithm.
That depends on if you think of partition pruning as the new feature
or the act of removing unneeded partitions. We seem to have settled on
partition pruning being the new feature given that we named the GUC
this way. So I don't quite understand what you mean here.
What's new is the
algorithm that we now use to perform partition pruning for declaratively
partitioned tables. Also, the characteristics of the new algorithm are
such that it can now be used in more situations, thus making it more
useful than the earlier method of partition pruning, so that new features
like runtime pruning could be realized. I like that the patch adds
various details about the new pruning features, but think that the wording
and the flow could be improved a bit.What do you think?
I re-read the patch and it still looks fine to me. I'm sure it could
be made better, but I just don't currently see how. I think it would
be better if you commented on the specifics of what you think could be
improved rather than a general comment that it could be improved.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi David.
On 2018/05/02 8:18, David Rowley wrote:
On 1 May 2018 at 21:44, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
About the patch in general, it seems like the newly added documentation
talks about "Partition Pruning" as something that *replaces* constraint
exclusion. But, I think "Partition Pruning" is not the thing that
replaces constraint exclusion.Not sure where you see the mention partition pruning replacing
constraint exclusion.We used to do partition pruning even
before and used constraint exclusion as the algorithm.That depends on if you think of partition pruning as the new feature
or the act of removing unneeded partitions. We seem to have settled on
partition pruning being the new feature given that we named the GUC
this way. So I don't quite understand what you mean here.What's new is the
algorithm that we now use to perform partition pruning for declaratively
partitioned tables. Also, the characteristics of the new algorithm are
such that it can now be used in more situations, thus making it more
useful than the earlier method of partition pruning, so that new features
like runtime pruning could be realized. I like that the patch adds
various details about the new pruning features, but think that the wording
and the flow could be improved a bit.What do you think?
I re-read the patch and it still looks fine to me. I'm sure it could
be made better, but I just don't currently see how. I think it would
be better if you commented on the specifics of what you think could be
improved rather than a general comment that it could be improved.
Sorry, I may have been a bit vague. I've read the patch one more time by
considering the phrase "partition pruning" as the name of the new feature
and that constraint exclusion is an optimization technique which doubled
as partition pruning until now. The new feature achieves results faster
and can be used in more cases than constraint exclusion. With that
reading, I don't see much to complain about with your patch at a high level.
Except some nitpicking:
+ <para>
+ Partition Pruning is also more powerful than constraint exclusion as
+ partition pruning is not something that is performed only during the
+ planning of a given query.
Maybe, don't repeat "partition pruning" again in the same sentence. How
about:
.. more powerful than constraint exclusion as *it* is not something..
Or may suggest to rewrite it as:
Partition pruning is also more powerful than constraint exclusion as it
can be performed not only during the planning of a given query, but also
during its execution.
If you accept the above rewrite, the next sentences in the paragraph:
+ In certain cases, partition pruning may also
+ be performed during execution of the query as well. This allows pruning
+ to be performed using values which are unknown during query planning, for
+ example, using parameters defined in a <command>PREPARE</command>
+ statement, using a value obtained from a subquery or using parameters
from
+ a parameterized nested loop join.
could be adjusted a bit to read as:
For example, this allows pruning to be performed using values which are
unknown during query planning but will be known during execution, such as
using parameters defined in a <command>PREPARE</command> statement (if a
generic plan is chosen), or using a value obtained from a subquery, or
using values from an outer row of a parameterized nested loop join.
+ <para>
+ The partition pruning which is performed during execution is done so at
+ either one or both of the following times:
done so at -> done at
+ If partition pruning can be
+ performed here then there is the added benefit of not having to
+ initialize partitions which are pruned. Partitions which are pruned
+ during this stage will not show up in the query's
+ <command>EXPLAIN</command> or <command>EXPLAIN ANALYZE</command>. It
+ is possible to determine the number of partitions which were removed
+ using this method by observing the <quote>Subplans Removed</quote>
+ property in the <command>EXPLAIN</command> output.
While it might be OK to keep the last two sentences, not sure about the
1st, which seems like it's spelling out an implementation detail -- that
there is an initialization step for partitions. It's a nice performance
enhancement, sure, but might be irrelevant to the users reading this
documentation.
+ nested loop joins. Since the value of these parameters may change
many
+ times during the execution of the query, partition pruning is
performed
+ whenever one of the execution parameters which is being compared to a
+ partition column or expression changes.
How about writing the last part as: whenever one of the execution
parameters relevant to pruning changes
+ <note>
+ <para>
+ Currently, partition pruning of partitions during the planning of an
+ <command>UPDATE</command> or <command>DELETE</command> command are
+ internally implemented using the constraint exclusion method. Only
+ <command>SELECT</command> uses the faster partition pruning method.
Also
+ partition pruning performed during execution is only done so for the
+ Append node type. Both of these limitations are likely to be removed
+ in a future release of <productname>PostgreSQL</productname>.
+ </para>
+ </note>
Do we need to write this given that we decided to decouple even the
UPDATE/DELETE pruning from the constraint_exclusion configuration? Also,
noting that only Append nodes can use execution-time pruning seems
unnecessary. I don't see plan node names mentioned like this elsewhere in
the documentation. But more to the point, it seems like spilling out
finer implementation details (and/or limitations thereof) in the
user-facing documentation.
Thanks again.
Regards,
Amit
Robert Haas wrote:
On Tue, Apr 24, 2018 at 5:59 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Constraint
exclusion was pretty easy to get wrong, hence the need for a separate
section, and I suppose the new partition pruning may be prey to the same
problems, so it seems worth to document them specially. But not sure
about the others, if they are mostly debugging tools.Weighing in here late, but I have a hard time understanding why we
want a GUC to control partition pruning at all. With constraint
exclusion, the issue is whether you want to spend planner cycles to
try to deduce things using CHECK constraints when, quite possibly,
your CHECK constraints are unrelated to table inheritance and thus
won't help. But seems extremely unlikely that the same thing would
happen with partition pruning. Unlike your CHECK constraints, your
partition bounds are, by definition, potentially useful for pruning.
I admit I am more concerned about the possibility of bugs than I am
about providing a performance-related tool. If partition prune can do
its thing with only a 1.1% of overhead, that's a great result. While
I'm sure that some real-world partitioning scenarios exist that have a
higher overhead than that, that's not what I am worried about the most.
In a couple of releases, once we know for sure that all this new code is
absolutely stable and that there are no bugs (keeping in mind that PG12
will boast additional pruning for MergeAppend as well as for UPDATE/
DELETE queries,) we can remove the GUC -- hoping that no user will bark
at us about they having to keep it disabled by default.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, May 2, 2018 at 1:07 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:
Hi David.
On 2018/05/02 8:18, David Rowley wrote:
On 1 May 2018 at 21:44, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:
I re-read the patch and it still looks fine to me. I'm sure it could
be made better, but I just don't currently see how. I think it would
be better if you commented on the specifics of what you think could be
improved rather than a general comment that it could be improved.Sorry, I may have been a bit vague. I've read the patch one more time by
considering the phrase "partition pruning" as the name of the new feature
and that constraint exclusion is an optimization technique which doubled
as partition pruning until now. The new feature achieves results faster
and can be used in more cases than constraint exclusion. With that
reading, I don't see much to complain about with your patch at a high
level.Except some nitpicking:
+ <para> + Partition Pruning is also more powerful than constraint exclusion as + partition pruning is not something that is performed only during the + planning of a given query.Maybe, don't repeat "partition pruning" again in the same sentence. How
about:
good thought
.. more powerful than constraint exclusion as *it* is not something..
technically "it" refers to "constraint exclusion" when written this way.
Better would be:
Partition pruning, unlike constraint exclusion, may be performed during
query execution.
Saying "not only planning" where there is only one other possible time it
happens is unnecessarily vague.
+ If partition pruning can be + performed here then there is the added benefit of not having to + initialize partitions which are pruned. Partitions which are pruned + during this stage will not show up in the query's + <command>EXPLAIN</command> or <command>EXPLAIN ANALYZE</command>. It + is possible to determine the number of partitions which were removed + using this method by observing the <quote>Subplans Removed</quote> + property in the <command>EXPLAIN</command> output.While it might be OK to keep the last two sentences, not sure about the
1st, which seems like it's spelling out an implementation detail -- that
there is an initialization step for partitions. It's a nice performance
enhancement, sure, but might be irrelevant to the users reading this
documentation.
I would concur with omitting the initialization implementation detail.
+ nested loop joins. Since the value of these parameters may change many + times during the execution of the query, partition pruning is performed + whenever one of the execution parameters which is being compared to a + partition column or expression changes.How about writing the last part as: whenever one of the execution
parameters relevant to pruning changes
Is it when the values change or for each different value? The difference
being if values are not sorted an something like: 1,2,3,2,3,4,1,2 were to
appear.
+ <note> + <para> + Currently, partition pruning of partitions during the planning of an + <command>UPDATE</command> or <command>DELETE</command> command are + internally implemented using the constraint exclusion method. Only + <command>SELECT</command> uses the faster partition pruning method. Also + partition pruning performed during execution is only done so for the + Append node type. Both of these limitations are likely to be removed + in a future release of <productname>PostgreSQL</productname>. + </para> + </note>Do we need to write this given that we decided to decouple even the
UPDATE/DELETE pruning from the constraint_exclusion configuration? Also,
noting that only Append nodes can use execution-time pruning seems
unnecessary. I don't see plan node names mentioned like this elsewhere in
the documentation. But more to the point, it seems like spilling out
finer implementation details (and/or limitations thereof) in the
user-facing documentation.
I suppose it would matter relative to what explain plans the user might
see. I do think the distinction between UPDATE/DELETE and SELECT can be
removed here though. The execution limitation seems potentially worthy
though as written I am unable to convert the provided information into
something I can use. Knowing when it cannot happen, even if incomplete,
would be more helpful to me.
David J.
On 1 May 2018 at 21:44, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
About the patch in general, it seems like the newly added documentation
talks about "Partition Pruning" as something that *replaces* constraint
exclusion. But, I think "Partition Pruning" is not the thing that
replaces constraint exclusion.
Just thinking about this a bit more. I've become a bit concerned that
we've completely misnamed this feature. It's true that at the moment
we build RelOptInfos for all partitions then eliminate what we can,
but the new algorithm that we've been calling "partition pruning" is
really not pruning anything at all, it's selecting the smallest set of
matching partitions. It's only the current usage of the algorithm
that's using it that way, and I kinda hope to change that for PG12.
Isn't the whole thing better to be named "partition selection"?
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, May 2, 2018 at 4:06 PM, David Rowley <david.rowley@2ndquadrant.com>
wrote:
On 1 May 2018 at 21:44, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:About the patch in general, it seems like the newly added documentation
talks about "Partition Pruning" as something that *replaces* constraint
exclusion. But, I think "Partition Pruning" is not the thing that
replaces constraint exclusion.Just thinking about this a bit more. I've become a bit concerned that
we've completely misnamed this feature. It's true that at the moment
we build RelOptInfos for all partitions then eliminate what we can,
but the new algorithm that we've been calling "partition pruning" is
really not pruning anything at all, it's selecting the smallest set of
matching partitions. It's only the current usage of the algorithm
that's using it that way, and I kinda hope to change that for PG12.Isn't the whole thing better to be named "partition selection"?
The user-exposed Name/GUC need (and in some ways should) not reflect the
implementation. Partitioning creates a tree and during planning and
execution we prune those branches/paths from the tree that are not going to
yield fruit. Its not like you can outright ignore their existence so at
some point you choose to ignore them which is a form of pruning.
Writing that I can support partition_pruning on technical grounds but to
what extent are we alienating the international community that we serve?
Maybe "Partition Filtering" (I'm disliking selection, I'm thinking we must
always select partitions)
Then again a Google search suggests we will be keeping good company by
sticking with "Partition Pruning" - any language dynamic is probably
overcome through extent of use.
On the whole I'd stick with what we've got.
David J.
On 3 May 2018 at 11:38, David G. Johnston <david.g.johnston@gmail.com> wrote:
Maybe "Partition Filtering" (I'm disliking selection, I'm thinking we must
always select partitions)
I don't see why "Filtering" is any different from pruning, they both
imply removing something that was once there. What I'm saying is, that
it's backward to think of what we have now as pruning, so I don't
think renaming it to "partition filtering" addresses my concern.
FWIW, I'm not set on changing this. I just want to discuss this now so
that the chances of having regrets about this later are reduced.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, May 2, 2018 at 9:28 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I admit I am more concerned about the possibility of bugs than I am
about providing a performance-related tool.
I agree that if partition pruning has bugs, somebody might want to
turn it off. On the other hand, when they do, there's a good chance
that they will lose so much performance that they'll still be pretty
sad. Somebody certainly could have a workload where the pruning
helps, but by a small enough amount that shutting it off is
acceptable. But I suspect that's a somewhat narrow target.
I'm not going to go to war over this, though. I'm just telling you
what I think.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:
On Wed, May 2, 2018 at 9:28 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I admit I am more concerned about the possibility of bugs than I am
about providing a performance-related tool.I agree that if partition pruning has bugs, somebody might want to
turn it off. On the other hand, when they do, there's a good chance
that they will lose so much performance that they'll still be pretty
sad. Somebody certainly could have a workload where the pruning
helps, but by a small enough amount that shutting it off is
acceptable. But I suspect that's a somewhat narrow target.I'm not going to go to war over this, though. I'm just telling you
what I think.
Well, we didn't have a GUC initially, evidently because none of us
thought that this would be a huge problem. So maybe you're both right
and it's overkill to have it. I'm not set on having it, either. Does
anybody else have an opinion?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wednesday, May 2, 2018, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Robert Haas wrote:
On Wed, May 2, 2018 at 9:28 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
I admit I am more concerned about the possibility of bugs than I am
about providing a performance-related tool.I agree that if partition pruning has bugs, somebody might want to
turn it off. On the other hand, when they do, there's a good chance
that they will lose so much performance that they'll still be pretty
sad. Somebody certainly could have a workload where the pruning
helps, but by a small enough amount that shutting it off is
acceptable. But I suspect that's a somewhat narrow target.I'm not going to go to war over this, though. I'm just telling you
what I think.Well, we didn't have a GUC initially, evidently because none of us
thought that this would be a huge problem. So maybe you're both right
and it's overkill to have it. I'm not set on having it, either. Does
anybody else have an opinion?
I toss my +1 to removing it altogether.
David J.
On Thu, Apr 26, 2018 at 07:29:37PM +1200, David Rowley wrote:
On 25 April 2018 at 09:59, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Amit Langote wrote:
Although the config.sgml coverage of the new capabilities seems pretty
good, some may find their being mentioned in 5.10 Table Partitioning
helpful. Or if we don't want to hijack 5.10.4, maybe create a 5.10.5.Can you (or someone) describe what would that section contain?
I've drafted and attached a patch of how I think this should look.
Likely it will need some tweaking, but it is probably a good starting
point for discussion.
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 34da0d8d57..89735b4804 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
+ <para>
+ Unlike constraint exclusion, partition pruning can be performed much more
+ quickly as it does not have to scan each individual partition's metadata
quickly COMMA
But actually I suggest:
Partition pruning is much more efficient than constraint exclusion, since
pruning avoids scanning each partition's metadata...
+ Partition Pruning is also more powerful than constraint exclusion as
+ partition pruning is not something that is performed only during the
remove "something that is" ?
Or just merge into the next sentence.
Note: Amit and David commented on this previously.
+ planning of a given query. In certain cases, partition pruning may also
+ be performed during execution of the query as well. This allows pruning
"also" is redundant with "as well"
+ to be performed using values which are unknown during query planning, for
could say "are not yet known"
+ The partition pruning which is performed during execution is done so at
+ either one or both of the following times:
remove "either" ?
+ During initialization of the query plan. Partition pruning can be
+ initialization phase of execution. If partition pruning can be
+ performed here then there is the added benefit of not having to
here COMMA
+ initialize partitions which are pruned. Partitions which are pruned
+ during this stage will not show up in the query's
+ During actual execution of the query plan. Partition pruning may also
Remove "actual" ?
+ be performed here to remove partitions using values which are only known
+ during actual query execution. This includes values from subqueries and
+ values from execution time parameters such as ones from parameterized
execution-time?
s/ones/those/
+ partition column or expression changes. In order to determine if
+ partitions were pruned at this stage requires careful inspection of the
+ <literal>nloops</literal> property in the
+ <command>EXPLAIN ANALYZE</command> output.
s/In order to determine/Determining/
+ <para>
+ Currently, partition pruning of partitions during the planning of an
s/partition //1 (just "pruning of partitions")
+ <command>UPDATE</command> or <command>DELETE</command> command are
s/are/is/
+ internally implemented using the constraint exclusion method. Only
remove "internally"?
+ <command>SELECT</command> uses the faster partition pruning method. Also
Also COMMA
+ partition pruning performed during execution is only done so for the
Remove "so".
Justin
Thanks for looking at this. I've taken most of your suggestions, some
I had changed around as a result of Amit's review.
On 7 May 2018 at 15:34, Justin Pryzby <pryzby@telsasoft.com> wrote:
+ During actual execution of the query plan. Partition pruning may also
Remove "actual" ?
I left this out one. I imagined it would be common to think of
executor startup/execution/execution shutdown as "query execution",
but I really only mean during the middle of those three things. If
you can think of a better way to make that more clear, then it might
be worth considering. For now, I think removing "actual" won't help.
A patch will follow shortly, in response to Amit's review.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Many thanks for reviewing this.
On 2 May 2018 at 20:07, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
+ <para> + Partition Pruning is also more powerful than constraint exclusion as + partition pruning is not something that is performed only during the + planning of a given query.Maybe, don't repeat "partition pruning" again in the same sentence. How
about:.. more powerful than constraint exclusion as *it* is not something..
changed.
Or may suggest to rewrite it as:
Partition pruning is also more powerful than constraint exclusion as it
can be performed not only during the planning of a given query, but also
during its execution.If you accept the above rewrite, the next sentences in the paragraph:
+ In certain cases, partition pruning may also + be performed during execution of the query as well. This allows pruning + to be performed using values which are unknown during query planning, for + example, using parameters defined in a <command>PREPARE</command> + statement, using a value obtained from a subquery or using parameters from + a parameterized nested loop join.could be adjusted a bit to read as:
For example, this allows pruning to be performed using values which are
unknown during query planning but will be known during execution, such as
using parameters defined in a <command>PREPARE</command> statement (if a
generic plan is chosen), or using a value obtained from a subquery, or
using values from an outer row of a parameterized nested loop join.
I've changed this a bit but I didn't mention generic plans. What you
say is true, but I didn't think we needed to be so specific.
+ <para> + The partition pruning which is performed during execution is done so at + either one or both of the following times:done so at -> done at
Changed
+ If partition pruning can be + performed here then there is the added benefit of not having to + initialize partitions which are pruned. Partitions which are pruned + during this stage will not show up in the query's + <command>EXPLAIN</command> or <command>EXPLAIN ANALYZE</command>. It + is possible to determine the number of partitions which were removed + using this method by observing the <quote>Subplans Removed</quote> + property in the <command>EXPLAIN</command> output.While it might be OK to keep the last two sentences, not sure about the
1st, which seems like it's spelling out an implementation detail -- that
there is an initialization step for partitions. It's a nice performance
enhancement, sure, but might be irrelevant to the users reading this
documentation.
I've reworded this. I think it's important to inform the reader that
this is performed during initialization of the plan as without that
they might ask why there are two phases of pruning and not just one.
Not having to initialize the subnode for pruned partitions is the sole
advantage of doing this pruning phase, so I would rather be specific
about when it occurs.
+ nested loop joins. Since the value of these parameters may change many + times during the execution of the query, partition pruning is performed + whenever one of the execution parameters which is being compared to a + partition column or expression changes.How about writing the last part as: whenever one of the execution
parameters relevant to pruning changes
I've reworded this.
+ <note> + <para> + Currently, partition pruning of partitions during the planning of an + <command>UPDATE</command> or <command>DELETE</command> command are + internally implemented using the constraint exclusion method. Only + <command>SELECT</command> uses the faster partition pruning method. Also + partition pruning performed during execution is only done so for the + Append node type. Both of these limitations are likely to be removed + in a future release of <productname>PostgreSQL</productname>. + </para> + </note>Do we need to write this given that we decided to decouple even the
UPDATE/DELETE pruning from the constraint_exclusion configuration?
I think it's important to inform people of the limitations. I know
there's a lot of opinions floating around about the usability of
partitioning in PostgreSQL with a large number of partitions. I
included this here so interested parties know that their problems are
not all solved by partition pruning. Perhaps those people can watch
for the removal of this notice.
Also,
noting that only Append nodes can use execution-time pruning seems
unnecessary. I don't see plan node names mentioned like this elsewhere in
the documentation. But more to the point, it seems like spilling out
finer implementation details (and/or limitations thereof) in the
user-facing documentation.
I thought about this while writing the patch, and it forced me to grep
for instances of "Append" in the docs. There were some, so I didn't
think I was breaking any rules. I also have no idea how else we might
explain that it works for Append and not MergeAppend. It's likely
going to be easier to answer possible to future bug reports which
complain run-time pruning is broken with MergeAppend with "It's not a
bug, it's behaving exactly as described in the documents. <link to
docs>".
I have now changed Append for <literal>Append</literal> in the patch
which is aligned to what perform.sgml is doing.
v2 patch is attached.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
further_enable_partition_pruning_doc_updates_v2.patchapplication/octet-stream; name=further_enable_partition_pruning_doc_updates_v2.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index eabe2a9235..bda6adfdfc 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3840,6 +3840,11 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
executor to remove (ignore) partitions during query execution. The
default is <literal>on</literal>.
</para>
+
+ <para>
+ Refer to <xref linkend="ddl-partition-pruning"/> for more information
+ on partition pruning and partitioning.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 004ecacbbf..3bc301d5c1 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3760,7 +3760,7 @@ ANALYZE measurement;
</sect2>
<sect2 id="ddl-partitioning-constraint-exclusion">
- <title>Partitioning and Constraint Exclusion</title>
+ <title>Inheritance Partitioning and Constraint Exclusion</title>
<indexterm>
<primary>constraint exclusion</primary>
@@ -3768,9 +3768,8 @@ ANALYZE measurement;
<para>
<firstterm>Constraint exclusion</firstterm> is a query optimization technique
- that improves performance for partitioned tables defined in the
- fashion described above (both declaratively partitioned tables and those
- implemented using inheritance). As an example:
+ that improves performance for inheritance partitioned tables defined in the
+ fashion described above. As an example:
<programlisting>
SET constraint_exclusion = on;
@@ -3847,15 +3846,14 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<xref linkend="guc-constraint-exclusion"/> is actually neither
<literal>on</literal> nor <literal>off</literal>, but an intermediate setting
called <literal>partition</literal>, which causes the technique to be
- applied only to queries that are likely to be working on partitioned
+ applied only to queries that are likely to be working on inheritance partitioned
tables. The <literal>on</literal> setting causes the planner to examine
<literal>CHECK</literal> constraints in all queries, even simple ones that
are unlikely to benefit.
</para>
<para>
- The following caveats apply to constraint exclusion, which is used by
- both inheritance and partitioned tables:
+ The following caveats apply to constraint exclusion:
<itemizedlist>
<listitem>
@@ -3879,9 +3877,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
contain only comparisons of the partitioning column(s) to constants
using B-tree-indexable operators, which applies even to partitioned
tables, because only B-tree-indexable column(s) are allowed in the
- partition key. (This is not a problem when using declarative
- partitioning, since the automatically generated constraints are simple
- enough to be understood by the planner.)
+ partition key.
</para>
</listitem>
@@ -3898,6 +3894,94 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</itemizedlist>
</para>
</sect2>
+
+ <sect2 id="ddl-partition-pruning">
+ <title>Declarative Partitioning and Partition Pruning</title>
+
+ <indexterm>
+ <primary>partition pruning</primary>
+ </indexterm>
+
+ <para>
+ <firstterm>Partition pruning</firstterm> is a query optimization technique
+ similar to constraint exclusion, but applies only to declaratively
+ partitioned tables. Like constraint exclusion, this uses (but is not
+ limited to using) the query's <literal>WHERE</literal> clause to exclude
+ partitions which cannot possibly contain any matching records.
+ </para>
+
+ <para>
+ Partition pruning is much more efficient than constraint exclusion, since
+ it avoids scanning each partition's metadata to determine if the partition
+ is required for a particular query.
+ </para>
+
+ <para>
+ Partition pruning is also more powerful than constraint exclusion as it
+ can be performed not only during the planning of a given query, but also
+ during its execution. This is useful as it can allow more partitions to
+ be pruned when clauses contain expressions whose values are unknown to the
+ query planner. For example, parameters defined in a
+ <command>PREPARE</command> statement, using a value obtained from a
+ subquery or using a parameterized value on the inner side of a nested loop
+ join.
+ </para>
+
+ <para>
+ Partition pruning done during execution can be performed at any of the
+ following times:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ During initialization of the query plan. Partition pruning can be
+ performed here for parameter values which are known during the
+ initialization phase of execution. Partitions which are pruned during
+ this stage will not show up in the query's <command>EXPLAIN</command>
+ or <command>EXPLAIN ANALYZE</command>. It is possible to determine the
+ number of partitions which were removed during this phase of pruning by
+ observing the <quote>Subplans Removed</quote> property in the
+ <command>EXPLAIN</command> output.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ During actual execution of the query plan. Partition pruning may also
+ be performed here to remove partitions using values which are only
+ known during actual query execution. This includes values from
+ subqueries and values from execution-time parameters such as those from
+ parameterized nested loop joins. Since the value of these parameters
+ may change many times during the execution of the query, partition
+ pruning is performed whenever one of the execution parameters being
+ used by partition pruning changes. Determining if partitions were
+ pruned during this phase requires careful inspection of the
+ <literal>nloops</literal> property in the
+ <command>EXPLAIN ANALYZE</command> output.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Partition pruning can be disabled using the
+ <xref linkend="guc-enable-partition-pruning"/> setting.
+ </para>
+
+ <note>
+ <para>
+ Currently, pruning of partitions during the planning of an
+ <command>UPDATE</command> or <command>DELETE</command> command is
+ implemented using the constraint exclusion method. Only
+ <command>SELECT</command> uses the partition pruning technique. Also,
+ partition pruning performed during execution is only done so for the
+ <literal>Append</literal> node type. Both of these limitations are
+ likely to be removed in a future release of
+ <productname>PostgreSQL</productname>.
+ </para>
+ </note>
+ </sect2>
+
</sect1>
<sect1 id="ddl-foreign-data">
On Mon, May 07, 2018 at 06:00:59PM +1200, David Rowley wrote:
Many thanks for reviewing this.
2nd round - from the minimalist department:
+ partitions which cannot possibly contain any matching records.
maybe: partitions which cannot match any records.
+ <para>
+ Partition pruning done during execution can be performed at any of the
+ following times:
remove "done"?
+ number of partitions which were removed during this phase of pruning by
remove "of prunning"
Justin
Thanks for reviewing again.
On 9 May 2018 at 01:32, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, May 07, 2018 at 06:00:59PM +1200, David Rowley wrote:
Many thanks for reviewing this.
2nd round - from the minimalist department:
+ partitions which cannot possibly contain any matching records.
maybe: partitions which cannot match any records.
I don't think that's an improvement. I don't think there's such a
thing as "partitions which match records". A partition can contain a
record, it never matches one.
+ <para> + Partition pruning done during execution can be performed at any of the + following times:remove "done"?
Removed.
+ number of partitions which were removed during this phase of pruning by
remove "of prunning"
Removed.
v3 attached.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
further_enable_partition_pruning_doc_updates_v3.patchapplication/octet-stream; name=further_enable_partition_pruning_doc_updates_v3.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index eabe2a9235..bda6adfdfc 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3840,6 +3840,11 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
executor to remove (ignore) partitions during query execution. The
default is <literal>on</literal>.
</para>
+
+ <para>
+ Refer to <xref linkend="ddl-partition-pruning"/> for more information
+ on partition pruning and partitioning.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 004ecacbbf..3bc301d5c1 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3760,7 +3760,7 @@ ANALYZE measurement;
</sect2>
<sect2 id="ddl-partitioning-constraint-exclusion">
- <title>Partitioning and Constraint Exclusion</title>
+ <title>Inheritance Partitioning and Constraint Exclusion</title>
<indexterm>
<primary>constraint exclusion</primary>
@@ -3768,9 +3768,8 @@ ANALYZE measurement;
<para>
<firstterm>Constraint exclusion</firstterm> is a query optimization technique
- that improves performance for partitioned tables defined in the
- fashion described above (both declaratively partitioned tables and those
- implemented using inheritance). As an example:
+ that improves performance for inheritance partitioned tables defined in the
+ fashion described above. As an example:
<programlisting>
SET constraint_exclusion = on;
@@ -3847,15 +3846,14 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<xref linkend="guc-constraint-exclusion"/> is actually neither
<literal>on</literal> nor <literal>off</literal>, but an intermediate setting
called <literal>partition</literal>, which causes the technique to be
- applied only to queries that are likely to be working on partitioned
+ applied only to queries that are likely to be working on inheritance partitioned
tables. The <literal>on</literal> setting causes the planner to examine
<literal>CHECK</literal> constraints in all queries, even simple ones that
are unlikely to benefit.
</para>
<para>
- The following caveats apply to constraint exclusion, which is used by
- both inheritance and partitioned tables:
+ The following caveats apply to constraint exclusion:
<itemizedlist>
<listitem>
@@ -3879,9 +3877,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
contain only comparisons of the partitioning column(s) to constants
using B-tree-indexable operators, which applies even to partitioned
tables, because only B-tree-indexable column(s) are allowed in the
- partition key. (This is not a problem when using declarative
- partitioning, since the automatically generated constraints are simple
- enough to be understood by the planner.)
+ partition key.
</para>
</listitem>
@@ -3898,6 +3894,94 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</itemizedlist>
</para>
</sect2>
+
+ <sect2 id="ddl-partition-pruning">
+ <title>Declarative Partitioning and Partition Pruning</title>
+
+ <indexterm>
+ <primary>partition pruning</primary>
+ </indexterm>
+
+ <para>
+ <firstterm>Partition pruning</firstterm> is a query optimization technique
+ similar to constraint exclusion, but applies only to declaratively
+ partitioned tables. Like constraint exclusion, this uses (but is not
+ limited to using) the query's <literal>WHERE</literal> clause to exclude
+ partitions which cannot possibly contain any matching records.
+ </para>
+
+ <para>
+ Partition pruning is much more efficient than constraint exclusion, since
+ it avoids scanning each partition's metadata to determine if the partition
+ is required for a particular query.
+ </para>
+
+ <para>
+ Partition pruning is also more powerful than constraint exclusion as it
+ can be performed not only during the planning of a given query, but also
+ during its execution. This is useful as it can allow more partitions to
+ be pruned when clauses contain expressions whose values are unknown to the
+ query planner. For example, parameters defined in a
+ <command>PREPARE</command> statement, using a value obtained from a
+ subquery or using a parameterized value on the inner side of a nested loop
+ join.
+ </para>
+
+ <para>
+ Partition pruning during execution can be performed at any of the
+ following times:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ During initialization of the query plan. Partition pruning can be
+ performed here for parameter values which are known during the
+ initialization phase of execution. Partitions which are pruned during
+ this stage will not show up in the query's <command>EXPLAIN</command>
+ or <command>EXPLAIN ANALYZE</command>. It is possible to determine the
+ number of partitions which were removed during this phase by observing
+ the <quote>Subplans Removed</quote> property in the
+ <command>EXPLAIN</command> output.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ During actual execution of the query plan. Partition pruning may also
+ be performed here to remove partitions using values which are only
+ known during actual query execution. This includes values from
+ subqueries and values from execution-time parameters such as those from
+ parameterized nested loop joins. Since the value of these parameters
+ may change many times during the execution of the query, partition
+ pruning is performed whenever one of the execution parameters being
+ used by partition pruning changes. Determining if partitions were
+ pruned during this phase requires careful inspection of the
+ <literal>nloops</literal> property in the
+ <command>EXPLAIN ANALYZE</command> output.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Partition pruning can be disabled using the
+ <xref linkend="guc-enable-partition-pruning"/> setting.
+ </para>
+
+ <note>
+ <para>
+ Currently, pruning of partitions during the planning of an
+ <command>UPDATE</command> or <command>DELETE</command> command is
+ implemented using the constraint exclusion method. Only
+ <command>SELECT</command> uses the partition pruning technique. Also,
+ partition pruning performed during execution is only done so for the
+ <literal>Append</literal> node type. Both of these limitations are
+ likely to be removed in a future release of
+ <productname>PostgreSQL</productname>.
+ </para>
+ </note>
+ </sect2>
+
</sect1>
<sect1 id="ddl-foreign-data">
Hi David.
Thanks for addressing my comments.
On 2018/05/07 15:00, David Rowley wrote:
v2 patch is attached.
Looks good to me.
Thanks,
Amit
On 2018/05/09 13:14, Amit Langote wrote:
Hi David.
Thanks for addressing my comments.
On 2018/05/07 15:00, David Rowley wrote:
v2 patch is attached.
Looks good to me.
Sorry, I should've seen noticed v3 before sending my email.
v3 looks good too, but when going through it, I noticed one bit in 5.10.4.
Partitioning and Constraint Exclusion:
A good rule of thumb is that partitioning constraints should
contain only comparisons of the partitioning column(s) to constants
using B-tree-indexable operators, which applies even to partitioned
tables, because only B-tree-indexable column(s) are allowed in the
partition key.
I think the part after ", which applies even to partitioned tables,.."
should be removed.
Attached find the updated patch.
Thanks,
Amit
Attachments:
further_enable_partition_pruning_doc_updates_v4.patchtext/plain; charset=UTF-8; name=further_enable_partition_pruning_doc_updates_v4.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ffea744cb8..76606a8535 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3841,6 +3841,11 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
executor to remove (ignore) partitions during query execution. The
default is <literal>on</literal>.
</para>
+
+ <para>
+ Refer to <xref linkend="ddl-partition-pruning"/> for more information
+ on partition pruning and partitioning.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 004ecacbbf..d02edd771f 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3760,7 +3760,7 @@ ANALYZE measurement;
</sect2>
<sect2 id="ddl-partitioning-constraint-exclusion">
- <title>Partitioning and Constraint Exclusion</title>
+ <title>Inheritance Partitioning and Constraint Exclusion</title>
<indexterm>
<primary>constraint exclusion</primary>
@@ -3768,9 +3768,8 @@ ANALYZE measurement;
<para>
<firstterm>Constraint exclusion</firstterm> is a query optimization technique
- that improves performance for partitioned tables defined in the
- fashion described above (both declaratively partitioned tables and those
- implemented using inheritance). As an example:
+ that improves performance for inheritance partitioned tables defined in the
+ fashion described above. As an example:
<programlisting>
SET constraint_exclusion = on;
@@ -3847,15 +3846,14 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<xref linkend="guc-constraint-exclusion"/> is actually neither
<literal>on</literal> nor <literal>off</literal>, but an intermediate setting
called <literal>partition</literal>, which causes the technique to be
- applied only to queries that are likely to be working on partitioned
+ applied only to queries that are likely to be working on inheritance partitioned
tables. The <literal>on</literal> setting causes the planner to examine
<literal>CHECK</literal> constraints in all queries, even simple ones that
are unlikely to benefit.
</para>
<para>
- The following caveats apply to constraint exclusion, which is used by
- both inheritance and partitioned tables:
+ The following caveats apply to constraint exclusion:
<itemizedlist>
<listitem>
@@ -3877,11 +3875,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
range tests for range partitioning, as illustrated in the preceding
examples. A good rule of thumb is that partitioning constraints should
contain only comparisons of the partitioning column(s) to constants
- using B-tree-indexable operators, which applies even to partitioned
- tables, because only B-tree-indexable column(s) are allowed in the
- partition key. (This is not a problem when using declarative
- partitioning, since the automatically generated constraints are simple
- enough to be understood by the planner.)
+ using B-tree-indexable operators.
</para>
</listitem>
@@ -3898,6 +3892,94 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</itemizedlist>
</para>
</sect2>
+
+ <sect2 id="ddl-partition-pruning">
+ <title>Declarative Partitioning and Partition Pruning</title>
+
+ <indexterm>
+ <primary>partition pruning</primary>
+ </indexterm>
+
+ <para>
+ <firstterm>Partition pruning</firstterm> is a query optimization technique
+ similar to constraint exclusion, but applies only to declaratively
+ partitioned tables. Like constraint exclusion, this uses (but is not
+ limited to using) the query's <literal>WHERE</literal> clause to exclude
+ partitions which cannot possibly contain any matching records.
+ </para>
+
+ <para>
+ Partition pruning is much more efficient than constraint exclusion, since
+ it avoids scanning each partition's metadata to determine if the partition
+ is required for a particular query.
+ </para>
+
+ <para>
+ Partition pruning is also more powerful than constraint exclusion as it
+ can be performed not only during the planning of a given query, but also
+ during its execution. This is useful as it can allow more partitions to
+ be pruned when clauses contain expressions whose values are unknown to the
+ query planner. For example, parameters defined in a
+ <command>PREPARE</command> statement, using a value obtained from a
+ subquery or using a parameterized value on the inner side of a nested loop
+ join.
+ </para>
+
+ <para>
+ Partition pruning during execution can be performed at any of the
+ following times:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ During initialization of the query plan. Partition pruning can be
+ performed here for parameter values which are known during the
+ initialization phase of execution. Partitions which are pruned during
+ this stage will not show up in the query's <command>EXPLAIN</command>
+ or <command>EXPLAIN ANALYZE</command>. It is possible to determine the
+ number of partitions which were removed during this phase by observing
+ the <quote>Subplans Removed</quote> property in the
+ <command>EXPLAIN</command> output.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ During actual execution of the query plan. Partition pruning may also
+ be performed here to remove partitions using values which are only
+ known during actual query execution. This includes values from
+ subqueries and values from execution-time parameters such as those from
+ parameterized nested loop joins. Since the value of these parameters
+ may change many times during the execution of the query, partition
+ pruning is performed whenever one of the execution parameters being
+ used by partition pruning changes. Determining if partitions were
+ pruned during this phase requires careful inspection of the
+ <literal>nloops</literal> property in the
+ <command>EXPLAIN ANALYZE</command> output.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Partition pruning can be disabled using the
+ <xref linkend="guc-enable-partition-pruning"/> setting.
+ </para>
+
+ <note>
+ <para>
+ Currently, pruning of partitions during the planning of an
+ <command>UPDATE</command> or <command>DELETE</command> command is
+ implemented using the constraint exclusion method. Only
+ <command>SELECT</command> uses the partition pruning technique. Also,
+ partition pruning performed during execution is only done so for the
+ <literal>Append</literal> node type. Both of these limitations are
+ likely to be removed in a future release of
+ <productname>PostgreSQL</productname>.
+ </para>
+ </note>
+ </sect2>
+
</sect1>
<sect1 id="ddl-foreign-data">
David Rowley wrote:
Thanks for reviewing again.
Hi,
I'm thinking something a bit more radical. First, since partition
pruning is the future and constraint exclusion is soon to be a thing of
the past, we should describe pruning first, and then describe exclusion
in terms of pruning. Second, I'd put constraint exclusion as a <sect3>
inside the <sect2> that describes pruning (but keep the XML "id" the
same, so that old links continue to work.)
I took a stab at this, but ran out of time before trimming the text for
constraint exclusion. What do you think of this rough sketch? I'm
thinking 5.10.4 is close to its final form (wording suggestions of
course welcome), but 5.10.4.1 still needs to be trimmed heavily, to
avoid repeating what was already explained in 5.10.4 (we need only
explain how exclusion differs from pruning.)
I'm a bit undecided on where to leave the <note>.
(Note:
make -C doc/src/sgml html XSLTPROCFLAGS='--stringparam rootid ddl'
builds only the 'ddl' chapter, which is nice when proofreading.)
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
further_enable_partition_pruning_doc_updates_v4.patchtext/plain; charset=utf-8Download
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 3f3f567222..2152b4d16d 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3759,7 +3759,151 @@ ANALYZE measurement;
</sect3>
</sect2>
- <sect2 id="ddl-partitioning-constraint-exclusion">
+ <sect2 id="ddl-partition-pruning">
+ <title>Partition Pruning</title>
+
+ <indexterm>
+ <primary>partition pruning</primary>
+ </indexterm>
+
+ <para>
+ <firstterm>Partition pruning</firstterm> is a query optimization technique
+ that improves performance for partitioned tables. As an example:
+
+<programlisting>
+SET enable_partition_pruning = on; -- the default
+SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
+</programlisting>
+
+ Without partition pruning, the above query would scan each of the
+ the partitions of the <structname>measurement</structname> table. With
+ partition pruning enabled, the planner will examine the definition of each
+ partition and prove that the partition need not
+ be scanned because it could not contain any rows meeting the query's
+ <literal>WHERE</literal> clause. When the planner can prove this, it
+ excludes the partition from the query plan.
+ </para>
+
+ <para>
+ You can use the <command>EXPLAIN</command> command to show the difference
+ between a plan with <varname>enable_partition_pruning</varname> on and a plan
+ with it off. A typical unoptimized plan for this type of table setup is:
+
+<programlisting>
+SET enable_partition_pruning = off;
+EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
+ QUERY PLAN
+âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
+ Aggregate (cost=188.76..188.77 rows=1 width=8)
+ -> Append (cost=0.00..181.05 rows=3085 width=0)
+ -> Seq Scan on measurement_y2006m02 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate >= '2008-01-01'::date)
+ -> Seq Scan on measurement_y2006m03 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate >= '2008-01-01'::date)
+ -> Seq Scan on measurement_y2007m11 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate >= '2008-01-01'::date)
+ -> Seq Scan on measurement_y2007m12 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate >= '2008-01-01'::date)
+ -> Seq Scan on measurement_y2008m01 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate >= '2008-01-01'::date)
+</programlisting>
+
+ Some or all of the partitions might use index scans instead of
+ full-table sequential scans, but the point here is that there
+ is no need to scan the older partitions at all to answer this query.
+ When we enable partition pruning, we get a significantly
+ cheaper plan that will deliver the same answer:
+
+<programlisting>
+SET enable_partition_pruning = on;
+EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
+ QUERY PLAN
+âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
+ Aggregate (cost=37.75..37.76 rows=1 width=8)
+ -> Append (cost=0.00..36.21 rows=617 width=0)
+ -> Seq Scan on measurement_y2008m01 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate >= '2008-01-01'::date)
+</programlisting>
+ </para>
+
+ <para>
+ Note that partition pruning is driven only by the constraints defined by
+ the partition keys, not by the presence of indexes. Therefore it isn't
+ necessary to define indexes on the key columns. Whether an index
+ needs to be created for a given partition depends on whether you
+ expect that queries that scan the partition will generally scan
+ a large part of the partition or just a small part. An index will
+ be helpful in the latter case but not the former.
+ </para>
+
+ <para>
+ Partition pruning
+ can be performed not only during the planning of a given query, but also
+ during its execution. This is useful as it can allow more partitions to
+ be pruned when clauses contain expressions whose values are unknown to the
+ query planner. For example, parameters defined in a
+ <command>PREPARE</command> statement, using a value obtained from a
+ subquery or using a parameterized value on the inner side of a nested loop
+ join.
+ Partition pruning during execution can be performed at any of the
+ following times:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ During initialization of the query plan. Partition pruning can be
+ performed here for parameter values which are known during the
+ initialization phase of execution. Partitions which are pruned during
+ this stage will not show up in the query's <command>EXPLAIN</command>
+ or <command>EXPLAIN ANALYZE</command>. It is possible to determine the
+ number of partitions which were removed during this phase by observing
+ the <quote>Subplans Removed</quote> property in the
+ <command>EXPLAIN</command> output.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ During actual execution of the query plan. Partition pruning may also
+ be performed here to remove partitions using values which are only
+ known during actual query execution. This includes values from
+ subqueries and values from execution-time parameters such as those from
+ parameterized nested loop joins. Since the value of these parameters
+ may change many times during the execution of the query, partition
+ pruning is performed whenever one of the execution parameters being
+ used by partition pruning changes. Determining if partitions were
+ pruned during this phase requires careful inspection of the
+ <literal>nloops</literal> property in the
+ <command>EXPLAIN ANALYZE</command> output.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Partition pruning can be disabled using the
+ <xref linkend="guc-enable-partition-pruning"/> setting.
+ </para>
+
+ <note>
+ <para>
+ Currently, pruning of partitions during the planning of an
+ <command>UPDATE</command> or <command>DELETE</command> command is
+ implemented using the constraint exclusion method — see the
+ next section for details. However, it is still ruled by the
+ <literal>enable_partition_pruning</literal> setting instead of
+ <literal>constraint_exclusion</literal>. Also note that
+ partition pruning performed during execution is only done so for the
+ <literal>Append</literal> node type, not <literal>MergeAppend</literal>.
+ </para>
+
+ <para>
+ Both of these limitations are likely to be removed in a future release
+ of <productname>PostgreSQL</productname>.
+ </para>
+ </note>
+
+ <sect3 id="ddl-partitioning-constraint-exclusion">
<title>Inheritance Partitioning and Constraint Exclusion</title>
<indexterm>
@@ -3768,8 +3912,9 @@ ANALYZE measurement;
<para>
<firstterm>Constraint exclusion</firstterm> is a query optimization technique
- that improves performance for inheritance partitioned tables defined in the
- fashion described above. As an example:
+ similar to partition pruning, used
+ for partitioned tables using the legacy inheritance method.
+ As an example similar to the one shown in the section above:
<programlisting>
SET constraint_exclusion = on;
@@ -3893,95 +4038,8 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</itemizedlist>
</para>
+ </sect3>
</sect2>
-
- <sect2 id="ddl-partition-pruning">
- <title>Declarative Partitioning and Partition Pruning</title>
-
- <indexterm>
- <primary>partition pruning</primary>
- </indexterm>
-
- <para>
- <firstterm>Partition pruning</firstterm> is a query optimization technique
- similar to constraint exclusion, but applies only to declaratively
- partitioned tables. Like constraint exclusion, this uses (but is not
- limited to using) the query's <literal>WHERE</literal> clause to exclude
- partitions which cannot possibly contain any matching records.
- </para>
-
- <para>
- Partition pruning is much more efficient than constraint exclusion, since
- it avoids scanning each partition's metadata to determine if the partition
- is required for a particular query.
- </para>
-
- <para>
- Partition pruning is also more powerful than constraint exclusion as it
- can be performed not only during the planning of a given query, but also
- during its execution. This is useful as it can allow more partitions to
- be pruned when clauses contain expressions whose values are unknown to the
- query planner. For example, parameters defined in a
- <command>PREPARE</command> statement, using a value obtained from a
- subquery or using a parameterized value on the inner side of a nested loop
- join.
- </para>
-
- <para>
- Partition pruning during execution can be performed at any of the
- following times:
-
- <itemizedlist>
- <listitem>
- <para>
- During initialization of the query plan. Partition pruning can be
- performed here for parameter values which are known during the
- initialization phase of execution. Partitions which are pruned during
- this stage will not show up in the query's <command>EXPLAIN</command>
- or <command>EXPLAIN ANALYZE</command>. It is possible to determine the
- number of partitions which were removed during this phase by observing
- the <quote>Subplans Removed</quote> property in the
- <command>EXPLAIN</command> output.
- </para>
- </listitem>
-
- <listitem>
- <para>
- During actual execution of the query plan. Partition pruning may also
- be performed here to remove partitions using values which are only
- known during actual query execution. This includes values from
- subqueries and values from execution-time parameters such as those from
- parameterized nested loop joins. Since the value of these parameters
- may change many times during the execution of the query, partition
- pruning is performed whenever one of the execution parameters being
- used by partition pruning changes. Determining if partitions were
- pruned during this phase requires careful inspection of the
- <literal>nloops</literal> property in the
- <command>EXPLAIN ANALYZE</command> output.
- </para>
- </listitem>
- </itemizedlist>
- </para>
-
- <para>
- Partition pruning can be disabled using the
- <xref linkend="guc-enable-partition-pruning"/> setting.
- </para>
-
- <note>
- <para>
- Currently, pruning of partitions during the planning of an
- <command>UPDATE</command> or <command>DELETE</command> command is
- implemented using the constraint exclusion method. Only
- <command>SELECT</command> uses the partition pruning technique. Also,
- partition pruning performed during execution is only done so for the
- <literal>Append</literal> node type. Both of these limitations are
- likely to be removed in a future release of
- <productname>PostgreSQL</productname>.
- </para>
- </note>
- </sect2>
-
</sect1>
<sect1 id="ddl-foreign-data">
On 10 May 2018 at 14:01, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I'm thinking something a bit more radical. First, since partition
pruning is the future and constraint exclusion is soon to be a thing of
the past, we should describe pruning first, and then describe exclusion
in terms of pruning.
But... that's not true. The chapter describes inheritance partitioned
tables too, and we're not getting rid of constraint exclusion because
it's needed for those. However, that might not mean your patch has to
be changed. I'd better have a look...
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
David Rowley wrote:
On 10 May 2018 at 14:01, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I'm thinking something a bit more radical. First, since partition
pruning is the future and constraint exclusion is soon to be a thing of
the past, we should describe pruning first, and then describe exclusion
in terms of pruning.But... that's not true. The chapter describes inheritance partitioned
tables too, and we're not getting rid of constraint exclusion because
it's needed for those.
Oh, I'm sure it is, but nobody is going to set up new inheritance
partitioned tables anymore, except people who pg_upgrade from older
releases. (And while I haven't tried, I'm sure it's possible to migrate
from old-style to new-style partitioned tables without incurring full
table rewrites, with little downside and lots to gain.)
Now, maybe you argue that we could have a structure like this instead:
5.10.1. Overview
5.10.2. Declarative Partitioning
5.10.3. Partition Pruning
5.10.4. Implementation Using Inheritance
5.10.5. Constraint Exclusion
I wouldn't oppose that.
However, that might not mean your patch has to be changed. I'd better
have a look...
Thanks :-)
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, May 9, 2018 at 10:10 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
On 10 May 2018 at 14:01, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I'm thinking something a bit more radical. First, since partition
pruning is the future and constraint exclusion is soon to be a thing of
the past, we should describe pruning first, and then describe exclusion
in terms of pruning.But... that's not true. The chapter describes inheritance partitioned
tables too, and we're not getting rid of constraint exclusion because
it's needed for those. However, that might not mean your patch has to
be changed. I'd better have a look...
I agree that constraint exclusion isn't going to die any time soon,
but I think Alvaro is right to say that we should explain the new
partition pruning technique first, and then later explain, hey, we
have this constraint exclusion thing, too. It takes more work to
reorganize the documentation along those lines, but nobody wants to
read about the techniques in the order we implemented them. They want
to read about the important stuff first, and in this case, that's the
new form of partition pruning.
In defense of constraint exclusion, let me note that constraint
exclusion is not restricted to inheritance cases. It could eliminate
the need to scan a completely unpartitioned table if the WHERE clause
can be refuted by CHECK constraints. It could eliminate the need to
scan some partitions of a partitioned table based on whatever
additional CHECK constraints exist beyond the partitioning
constraints. These are less likely scenarios, perhaps, but not out of
the question. For example, imagine a partitioned order table that is
range-partitioned by order ID. You could add CHECK constraints based
on the order_date that appears in each partition, and then constraint
exclusion could eliminate partitions based on quals related to
order_date. The order date correlates with the order ID, but
partition pruning doesn't know that, so it can can only help with
quals based on order ID. Constraint exclusion doesn't have that
restriction. That's potentially useful, I think, although BRIN
indexes on each partition are another way to tackle this sort of
problem.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:
In defense of constraint exclusion, let me note that constraint
exclusion is not restricted to inheritance cases. It could eliminate
the need to scan a completely unpartitioned table if the WHERE clause
can be refuted by CHECK constraints. It could eliminate the need to
scan some partitions of a partitioned table based on whatever
additional CHECK constraints exist beyond the partitioning
constraints.
This is a great point that hadn't occurred to me. It means that we
should keep constraint exclusion on its own <sect2> rather than relegate
it to <sect3>, as my proposed patch does. I think it's a good idea to
add this point there too.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
David Rowley wrote:
On 1 May 2018 at 21:44, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
About the patch in general, it seems like the newly added documentation
talks about "Partition Pruning" as something that *replaces* constraint
exclusion. But, I think "Partition Pruning" is not the thing that
replaces constraint exclusion.Just thinking about this a bit more. I've become a bit concerned that
we've completely misnamed this feature. It's true that at the moment
we build RelOptInfos for all partitions then eliminate what we can,
but the new algorithm that we've been calling "partition pruning" is
really not pruning anything at all, it's selecting the smallest set of
matching partitions. It's only the current usage of the algorithm
that's using it that way, and I kinda hope to change that for PG12.Isn't the whole thing better to be named "partition selection"?
I think that approach makes it more difficult to explain, not less so.
There are two logically opposite ways to explain this feature: a) by
default, all partitions must be scanned, and we examine the query to
determine which ones can be pruned. b) by default, no partitions are
scanned, and we examine the query to determine which ones must be
scanned.
The whole "enable_partition_pruning" thing is based on the idea that we
do a). You propose that we do b) instead. The only difference is what
happens if the feature is disabled -- the "by default" clause gets
inverted. So it would have to be b) if the feature is enabled, by
default no partitions are scanned, and we examine the query to determine
which ones must be scanned; if the feature is disabled, all partitions
must be scanned.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, May 10, 2018 at 8:57 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
b) by default, no partitions are
scanned, and we examine the query to determine which ones must be
scanned.
There is an element of logic that says "by default, no partitions are
scanned" is not a reasonable behavior mode. Thus an alternative analogy
would be:
Bucket A is the set of all relevant partitions in the tree
Pruning: remove from bucket A those which we know we can skip; then iterate
over A
Selection: choose those items from A that are possible holders of our data
and process each one (place all selected items into bucket B and iterate
over B if you want to perform selection in total first).
As a user I don't really need to know which model is implemented and the
name doesn't necessarily imply the implementation. Pruning seems to be the
commonly-used term for this feature and we should stick with that.
David J.
David G. Johnston wrote:
As a user I don't really need to know which model is implemented and the
name doesn't necessarily imply the implementation. Pruning seems to be the
commonly-used term for this feature and we should stick with that.
I agree with this conclusion. So we have it right and we shouldn't
change it.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, May 10, 2018 at 12:58 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
David G. Johnston wrote:
As a user I don't really need to know which model is implemented and the
name doesn't necessarily imply the implementation. Pruning seems to be the
commonly-used term for this feature and we should stick with that.I agree with this conclusion. So we have it right and we shouldn't
change it.
+1.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, May 10, 2018 at 10:13 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, May 10, 2018 at 12:58 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:David G. Johnston wrote:
As a user I don't really need to know which model is implemented and the
name doesn't necessarily imply the implementation. Pruning seems to bethe
commonly-used term for this feature and we should stick with that.
I agree with this conclusion. So we have it right and we shouldn't
change it.+1.
Seems like if it stays the name is good - but at this point no has voiced
opposition to removing it and making the name a moot point.
David J.
David G. Johnston wrote:
Seems like if it stays the name is good - but at this point no has voiced
opposition to removing it and making the name a moot point.
If we think the probability of bugs is 0%, then I'm all for removing it.
I don't. I vote to remove the GUC in a couple of releases, once it's
proven completely useless.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, May 10, 2018 at 1:51 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
David G. Johnston wrote:
Seems like if it stays the name is good - but at this point no has voiced
opposition to removing it and making the name a moot point.If we think the probability of bugs is 0%, then I'm all for removing it.
I don't. I vote to remove the GUC in a couple of releases, once it's
proven completely useless.
No feature ever written has a 0% probability of bugs.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
I'm thinking something like this.
The examples for runtime pruning are lame -- in the first, the text says
"watch out for Subplans Removed" and then the example provided doesn't
show one. (That example is probably exercising the wrong thing.)
Anyway, wording suggestions for 5.10.4 and 5.10.5 in the attached file
are welcome.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
On Thu, May 10, 2018 at 3:45 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
The examples for runtime pruning are lame -- in the first, the text says
"watch out for Subplans Removed" and then the example provided doesn't
show one. (That example is probably exercising the wrong thing.)
It seems to me that EXPLAIN output should have a clear way to show --
and to distinguish -- (1) plan-time pruning, (2) executor startup time
pruning, (3) mid-execution pruning. I don't think that's entirely the
case right now.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 11 May 2018 at 08:05, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, May 10, 2018 at 3:45 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:The examples for runtime pruning are lame -- in the first, the text says
"watch out for Subplans Removed" and then the example provided doesn't
show one. (That example is probably exercising the wrong thing.)It seems to me that EXPLAIN output should have a clear way to show --
and to distinguish -- (1) plan-time pruning, (2) executor startup time
pruning, (3) mid-execution pruning. I don't think that's entirely the
case right now.
I'm open to improving this, but I've just not come up with any bright
ideas on how to, yet.
Here's a recap of the current way to determine where the pruning occurred:
Phase 1: Plan time pruning:
EXPLAIN/EXPLAIN ANALYZE shows Append/MergeAppend/ModifyTable shows
fewer subnodes than there are partitions.
Both EXPLAIN and EXPLAIN ANALYZE output gives no indication of any pruning.
Phase 2: Executor init pruning:
EXPLAIN and EXPLAIN ANALYZE shows Append with fewer subnodes than
there are partitions + "Subplans Removed: <N>" appears to indicate the
number of subnodes removed by this phase.
MergeAppend and ModifyTable are unsupported in PG11.
Phase 3: Executor run pruning:
EXPLAIN/EXPLAIN ANALYZE shows all nodes that survived phase 1+2.
EXPLAIN ANALYZE shows that if a given node was never executed then the
runtime times appear as "(never executed)". If the Append was executed
and a subnode the Append was "(never executed)" then it was pruned by
this phase.
Changing parameters may cause some nodes to be scanned fewer times
than other nodes. The "nloops" count being lower than the nloop count
of the Append indicates this. e.g nloops=5 on an Append subnode vs
nloops=8 on the Append node indicates the node was eliminated 3 times.
Although complications around Parallel Append could make it quite
difficult to count nloops, since a node running a partial plan could
be executed by may workers which would increase the nloops.
Solutions?
The best I can think of right now is to add 2 more int properties to
the EXPLAIN output:
1. Subplans removed by plan-time constraints exclusion: N
2. Subplans removed by plan-time partition pruning: N
The rename the "Subplans Removed" that's there today to "Subplans
removed by run-time pruning"
These names are not very good, also. I'm also not very excited about
adding this. This also does nothing for phase 3.
Would something like that address your concern? Or do you have another idea?
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2018/05/11 2:13, Robert Haas wrote:
On Thu, May 10, 2018 at 12:58 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:David G. Johnston wrote:
As a user I don't really need to know which model is implemented and the
name doesn't necessarily imply the implementation. Pruning seems to be the
commonly-used term for this feature and we should stick with that.I agree with this conclusion. So we have it right and we shouldn't
change it.+1.
+1 from me too.
Thanks,
Amit
Hi.
On 2018/05/11 4:45, Alvaro Herrera wrote:
I'm thinking something like this.
+1 to this more radical overhaul of this part of the documentation.
The examples for runtime pruning are lame -- in the first, the text says
"watch out for Subplans Removed" and then the example provided doesn't
show one. (That example is probably exercising the wrong thing.)Anyway, wording suggestions for 5.10.4 and 5.10.5 in the attached file
are welcome.
A few comments.
1. At the beginning of 5.10.4, in this example EXPLAIN's output:
SET enable_partition_pruning = off;
EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
There used to be [1]https://www.postgresql.org/docs/10/static/ddl-partitioning.html ellipses to show discontinuation between partitions
shown in the output plan, which no longer exists. Should be like this:
-> Seq Scan on measurement_y2006m03 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
...
-> Seq Scan on measurement_y2007m11 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
2. In the following sentence in 5.10.5
"Constraint exclusion works in a very similar way to partition pruning,
except that it uses each table's CHECK constraints — which gives it its
name — instead of the partitioning constraints, as with partition pruning.
Another difference is that it is only applied at plan time; there is no
attempt to remove partitions at execution time."
I think that saying "instead of the partitioning constraints, as with
partition pruning" here may be a bit misleading, because it may give
readers an impression that *all* tables have a partitioning constraint but
constraint exclusion ignores it in favor of using CHECK constraints. How
about saying:
whereas partition pruning uses a table's partitioning constraint which
exists only in the case of declarative partitioning.
3. Do we want the following sentence 5.10.5 to be revised now?
"The default (and recommended) setting of constraint_exclusion is actually
neither on nor off, but an intermediate setting called partition, which
causes the technique to be applied only to queries that are likely to be
working on inheritance partitioned tables."
I'm not sure if it's the time yet, but maybe we would want to recommend
"on" and mention that users may want to switch to "partition" if they need
to use legacy inheritance partitioning for one reason or another.
4. In the following sentence in the caveats part of 5.10.5. Partitioning
and Constraint Exclusion
"A good rule of thumb is that partitioning constraints should contain only
comparisons of the partitioning column(s) to constants using
B-tree-indexable operators, which applies even to partitioned tables,
because only B-tree-indexable column(s) are allowed in the partition key."
The part beginning with ", which applies even to partitioned tables" is no
longer needed as I had pointed out upthread [2]/messages/by-id/a8ad3dd8-ef30-bbd0-6732-a673710378fa@lab.ntt.co.jp. The reason is we no
longer pass the partition key derived partition constraints to constraint
exclusion algorithm, as the new pruning covers that base.
5. The last sentence in caveats, that is,
"Partitioning using these techniques will work well with up to perhaps a
hundred partitions; don't try to use many thousands of partitions."
should perhaps be reworded as:
"So the legacy inheritance based partitioning will work well with up to
perhaps a hundred partitions; don't try to use many thousands of partitions."
Thanks,
Amit
[1]: https://www.postgresql.org/docs/10/static/ddl-partitioning.html
[2]: /messages/by-id/a8ad3dd8-ef30-bbd0-6732-a673710378fa@lab.ntt.co.jp
/messages/by-id/a8ad3dd8-ef30-bbd0-6732-a673710378fa@lab.ntt.co.jp
On Fri, May 11, 2018 at 12:59:27PM +0900, Amit Langote wrote:
On 2018/05/11 2:13, Robert Haas wrote:
On Thu, May 10, 2018 at 12:58 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:David G. Johnston wrote:
As a user I don't really need to know which model is implemented and the
name doesn't necessarily imply the implementation. Pruning seems to be the
commonly-used term for this feature and we should stick with that.I agree with this conclusion. So we have it right and we shouldn't
change it.+1.
+1 from me too.
+1.
--
Michael
Hello
Amit Langote wrote:
+1 to this more radical overhaul of this part of the documentation.
Thanks. I pushed now after some more tweaking, including your suggested
corrections. I removed the examples, because they were both wrong. We
can give this more polish if anybody has the energy, but I think we're
in a pretty decent place now.
I'm not convinced that we need to show so much detail on pruning as
proposed by Robert elsewhere; we didn't have a lot of detail for
exclusion either and I don't have any evidence that it was a terrible
problem for users. Also, one possible use of the new GUC is that you
can compare plans if you so wish.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 13 May 2018 at 03:30, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Amit Langote wrote:
+1 to this more radical overhaul of this part of the documentation.
Thanks. I pushed now after some more tweaking,
Thanks for pushing.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2018/05/14 9:55, David Rowley wrote:
On 13 May 2018 at 03:30, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Amit Langote wrote:
+1 to this more radical overhaul of this part of the documentation.
Thanks. I pushed now after some more tweaking,
Thanks for pushing.
Thank you.
Regards,
Amit
On Thu, May 10, 2018 at 10:22 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
Here's a recap of the current way to determine where the pruning occurred:
Phase 1: Plan time pruning:
EXPLAIN/EXPLAIN ANALYZE shows Append/MergeAppend/ModifyTable shows
fewer subnodes than there are partitions.
Both EXPLAIN and EXPLAIN ANALYZE output gives no indication of any pruning.Phase 2: Executor init pruning:
EXPLAIN and EXPLAIN ANALYZE shows Append with fewer subnodes than
there are partitions + "Subplans Removed: <N>" appears to indicate the
number of subnodes removed by this phase.MergeAppend and ModifyTable are unsupported in PG11.
Phase 3: Executor run pruning:
EXPLAIN/EXPLAIN ANALYZE shows all nodes that survived phase 1+2.
EXPLAIN ANALYZE shows that if a given node was never executed then the
runtime times appear as "(never executed)". If the Append was executed
and a subnode the Append was "(never executed)" then it was pruned by
this phase.
Hmm, that's actually not as bad as I thought. Thanks for the
explanation. I think if I were going to try to improve things, I'd
try to annotate the Append node with the name of the partitioned table
that it's using for pruning in case #2 and case #3, and maybe
something to indicate which type of pruning is in use. That would
make it really clear whether pruning is enabled or not. The methods
you mention above sort of require reading the tea leaves -- and it
might not always be very easy to distinguish between cases where
pruning is possible but nothing got pruned (imagine an inequality
qual) and where it's not even possible in the first place.
e.g.
Append
Execution-Time Pruning: order_lines (at executor startup)
-> Index Scan ...
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 17 May 2018 at 01:19, Robert Haas <robertmhaas@gmail.com> wrote:
Hmm, that's actually not as bad as I thought. Thanks for the
explanation. I think if I were going to try to improve things, I'd
try to annotate the Append node with the name of the partitioned table
that it's using for pruning in case #2 and case #3, and maybe
something to indicate which type of pruning is in use. That would
make it really clear whether pruning is enabled or not. The methods
you mention above sort of require reading the tea leaves -- and it
might not always be very easy to distinguish between cases where
pruning is possible but nothing got pruned (imagine an inequality
qual) and where it's not even possible in the first place.e.g.
Append
Execution-Time Pruning: order_lines (at executor startup)
-> Index Scan ...
Perhaps Append should be shown as "Unordered Partitioned Table Scan on
<table>". That seems more aligned to how else we show which relation a
node belongs to. The partition being scanned is simple to obtain. It's
just the first item in the partitioned_rels List. (MergeAppend would
be an "Ordered Partitioned Table Scan")
I'm not really a fan of overloading properties with a bunch of text.
Multiple int or text properties would be easier to deal with,
especially so when you consider the other explain formats. Remember,
all 3 pruning methods could be used for a single Append node.
I guess doing work here would require additional code in the planner
to track how many relations were removed by both partition pruning and
constraint exclusion. Dunno if that would be tracked together or
separately. However, I'd prefer to have a clear idea of what exactly
the design should be before I go write some code that perhaps nobody
will like.
Unsure what you have in mind for the pruning done during actual
execution; just a yay or nay as to whether we're attempting it or not?
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, May 17, 2018 at 12:04 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
Append
Execution-Time Pruning: order_lines (at executor startup)
-> Index Scan ...Perhaps Append should be shown as "Unordered Partitioned Table Scan on
<table>". That seems more aligned to how else we show which relation a
node belongs to. The partition being scanned is simple to obtain. It's
just the first item in the partitioned_rels List. (MergeAppend would
be an "Ordered Partitioned Table Scan")
Hmm, that's a radical proposal but I'm not sure I like it. For one
thing, table scan might mean sequential scan to some users. For
another, it's not really unordered. Unless it's parallel-aware, we're
going to scan them strictly in the order they're given.
I'm not really a fan of overloading properties with a bunch of text.
Multiple int or text properties would be easier to deal with,
especially so when you consider the other explain formats. Remember,
all 3 pruning methods could be used for a single Append node.
I was imagining it as two properties in non-text format that got
displayed in a special way in text mode. I intended that this would
only give information about execution-time pruning, so there would
only two methods to consider here, but, yeah, you might have something
like:
Execution-Time Pruning: order_lines (at executor startup, at runtime)
I guess doing work here would require additional code in the planner
to track how many relations were removed by both partition pruning and
constraint exclusion. Dunno if that would be tracked together or
separately. However, I'd prefer to have a clear idea of what exactly
the design should be before I go write some code that perhaps nobody
will like.
I don't feel strongly about adding more code to track the number of
removed partitions. I think that the important thing is whether or
not partitioning is happening and at what stage, and I think it's
useful to show the relation name if we can. As you pointed out, it's
largely possible already to figure out how well we did at removing
stuff and at which stages, but to me it seems quite easy to be
confused about which stages tried to remove things. For example,
consider:
Gather
-> Nested Loop
-> Seq Scan
Filter: something
-> Append
-> Index Scan
-> Index Scan
-> Index Scan
I think it's going to be quite tricky to figure out whether that
Append node is trying to do execution-time pruning without some
annotation. The nloops values are going to be affected by how many
rows are in which partitions and how many workers got which rows as
well as by whether execution-time pruning worked and how effectively.
You might be able to figure out it out by staring at the EXPLAIN
output for a while... but it sure seems like it would be a lot nicer
to have an explicit indicator... especially if you're some random user
rather than a PostgreSQL expect.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2018/05/17 23:24, Robert Haas wrote:
On Thu, May 17, 2018 at 12:04 AM, David Rowley wrote:
I'm not really a fan of overloading properties with a bunch of text.
Multiple int or text properties would be easier to deal with,
especially so when you consider the other explain formats. Remember,
all 3 pruning methods could be used for a single Append node.I was imagining it as two properties in non-text format that got
displayed in a special way in text mode. I intended that this would
only give information about execution-time pruning, so there would
only two methods to consider here, but, yeah, you might have something
like:Execution-Time Pruning: order_lines (at executor startup, at runtime)
This looks short enough and useful.
I guess doing work here would require additional code in the planner
to track how many relations were removed by both partition pruning and
constraint exclusion. Dunno if that would be tracked together or
separately. However, I'd prefer to have a clear idea of what exactly
the design should be before I go write some code that perhaps nobody
will like.I don't feel strongly about adding more code to track the number of
removed partitions. I think that the important thing is whether or
not partitioning is happening and at what stage, and I think it's
useful to show the relation name if we can. As you pointed out, it's
largely possible already to figure out how well we did at removing
stuff and at which stages, but to me it seems quite easy to be
confused about which stages tried to remove things. For example,
consider:Gather
-> Nested Loop
-> Seq Scan
Filter: something
-> Append
-> Index Scan
-> Index Scan
-> Index ScanI think it's going to be quite tricky to figure out whether that
Append node is trying to do execution-time pruning without some
annotation. The nloops values are going to be affected by how many
rows are in which partitions and how many workers got which rows as
well as by whether execution-time pruning worked and how effectively.
You might be able to figure out it out by staring at the EXPLAIN
output for a while... but it sure seems like it would be a lot nicer
to have an explicit indicator... especially if you're some random user
rather than a PostgreSQL expect.
Yeah, I think it'd help to have Append be annotated as suggested by Robert
above. I guess if "at executor startup" is shown, then the subnodes
listed under Append will consist of only those that survived
executor-startup pruning and thus will help understand why there are fewer
than shown with EXPLAIN (without ANALYZE). Also, if "at runtime" is
shown, a user may want look at nloops property of the individual subnodes
to guess at how much pruning has occurred; although only the latter (that
is, inspecting nloops) suffices to know that runtime pruning has occurred
as also currently written in the documentation about pruning [1]https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#DDL-PARTITION-PRUNING, the
first piece of information (the "at runtime" annotation) seems nice to have.
Thanks,
Amit
[1]: https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#DDL-PARTITION-PRUNING
https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#DDL-PARTITION-PRUNING
On Fri, May 18, 2018 at 4:22 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
Yeah, I think it'd help to have Append be annotated as suggested by Robert
above. I guess if "at executor startup" is shown, then the subnodes
listed under Append will consist of only those that survived
executor-startup pruning and thus will help understand why there are fewer
than shown with EXPLAIN (without ANALYZE). Also, if "at runtime" is
shown, a user may want look at nloops property of the individual subnodes
to guess at how much pruning has occurred; although only the latter (that
is, inspecting nloops) suffices to know that runtime pruning has occurred
as also currently written in the documentation about pruning [1], the
first piece of information (the "at runtime" annotation) seems nice to have.
Having EXPLAIN and EXPLAIN ANALYZE show different things doesn't sound
like a good idea.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
I reread this and have some more comments.
https://www.postgresql.org/docs/devel/static/ddl-partitioning.html
"however, it is not possible to use some of the inheritance features discussed
in the previous section with partitioned tables and partitions"
=> The referenced section now follows rather than precedes the text; I suggest
to say:
"however, it is not possible to use some features of inheritance (discussed
below) with declaratively partitioned tables or their partitions"
"It is neither possible to specify columns when creating partitions with CREATE
TABLE nor is it possible to add columns to partitions after-the-fact using
ALTER TABLE"
=> change to: "It is not possible .. nor is it possible .."
Immediately after the section in inheritence:
https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#DDL-PARTITION-PRUNING
"Partition pruning is a query optimization technique that improves performance
for partitioned tables"
=> I think should say "improves performance for DECLARATIVELY partitioned
tables"
"You can use the EXPLAIN command to show the difference between a plan whose
partitions have been pruned from one whose partitions haven't, by using the
enable_partition_pruning configuration parameter. A typical unoptimized plan
for this type of table setup is:"
=> should say "difference between .. AND", not FROM.
=> Also, should avoid repeating "use...using". Also, remove the comma or
rearrange the sentence:
"By using the EXPLAIN command and the enable_partition_pruning configuration
parameter, one can show the difference between a plan whose partitions have
been pruned from one whose partitions haven't.
"Constraint exclusion is only applied during query planning; it is not applied
at execution time like partition pruning does."
=> Remove "does" ?
"Partitioning enforces a rule that all partitions must have exactly the same
set of columns as the parent"
=> I think should say "Declarative partitioning enforces"; or maybe:
"Partitions of a partitioned table must have exactly the same set of columns as
the parent"
or:
"For declarative partitioning, partitions must have exactly the same set of
columns as the partitioned table"
Let me know if it's useful to provide a patch.
Justin
On Sat, May 19, 2018 at 5:02 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, May 18, 2018 at 4:22 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:Yeah, I think it'd help to have Append be annotated as suggested by Robert
above. I guess if "at executor startup" is shown, then the subnodes
listed under Append will consist of only those that survived
executor-startup pruning and thus will help understand why there are fewer
than shown with EXPLAIN (without ANALYZE). Also, if "at runtime" is
shown, a user may want look at nloops property of the individual subnodes
to guess at how much pruning has occurred; although only the latter (that
is, inspecting nloops) suffices to know that runtime pruning has occurred
as also currently written in the documentation about pruning [1], the
first piece of information (the "at runtime" annotation) seems nice to have.Having EXPLAIN and EXPLAIN ANALYZE show different things doesn't sound
like a good idea.
Hmm yeah. I think I was misunderstanding how executor-startup pruning
works when I wrote:
...and thus will help understand why there are fewer
than shown with EXPLAIN (without ANALYZE).
Actually, because ExecInitAppend would run for both EXPLAIN and
EXPLAIN ANALYZE, executor-startup pruning should occur in both cases
and will result in the same plan shape to be shown. Sorry about the
confusion.
Thanks,
Amit
On Fri, May 18, 2018 at 08:56:53PM -0500, Justin Pryzby wrote:
I reread this and have some more comments.
https://www.postgresql.org/docs/devel/static/ddl-partitioning.html
Let me know if it's useful to provide a patch.
I propose this.
There's two other, wider changes to consider:
- should "5.10.4. Partition Pruning" be moved after "5.10.2. Declarative
Partitioning", rather than after "5.10.3. Implementation Using Inheritance" ?
- should we find a unified term for "inheritence-based partitioning" and avoid
using the word "partitioning" in that context? For example: "Partitioning
can be implemented using table inheritance[...]". One possible phrase
currently begin used is: "legacy inheritance method".
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 2cd0b8a..6e1ade9 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2967,70 +2967,70 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
<para>
It is not possible to turn a regular table into a partitioned table or
vice versa. However, it is possible to add a regular or partitioned table
containing data as a partition of a partitioned table, or remove a
partition from a partitioned table turning it into a standalone table;
see <xref linkend="sql-altertable"/> to learn more about the
<command>ATTACH PARTITION</command> and <command>DETACH PARTITION</command>
sub-commands.
</para>
<para>
Individual partitions are linked to the partitioned table with inheritance
- behind-the-scenes; however, it is not possible to use some of the
- inheritance features discussed in the previous section with partitioned
- tables and partitions. For example, a partition cannot have any parents
+ behind-the-scenes; however, it is not possible to use some of the generic
+ features of inheritance (discussed below) with declaratively partitioned
+ tables or their partitions For example, a partition cannot have any parents
other than the partitioned table it is a partition of, nor can a regular
- table inherit from a partitioned table making the latter its parent.
- That means partitioned tables and partitions do not participate in
- inheritance with regular tables. Since a partition hierarchy consisting
- of the partitioned table and its partitions is still an inheritance
- hierarchy, all the normal rules of inheritance apply as described in
+ table inherit from a partitioned table making the latter its parent. That
+ means partitioned tables and partitions do not participate in inheritance
+ with regular tables. Since a partition hierarchy consisting of the
+ partitioned table and its partitions is still an inheritance hierarchy, all
+ the normal rules of inheritance apply as described in
<xref linkend="ddl-inherit"/> with some exceptions, most notably:
<itemizedlist>
<listitem>
<para>
Both <literal>CHECK</literal> and <literal>NOT NULL</literal>
constraints of a partitioned table are always inherited by all its
partitions. <literal>CHECK</literal> constraints that are marked
<literal>NO INHERIT</literal> are not allowed to be created on
partitioned tables.
</para>
</listitem>
<listitem>
<para>
Using <literal>ONLY</literal> to add or drop a constraint on only the
partitioned table is supported when there are no partitions. Once
partitions exist, using <literal>ONLY</literal> will result in an error
as adding or dropping constraints on only the partitioned table, when
- partitions exist, is not supported. Instead, constraints can be added
- or dropped, when they are not present in the parent table, directly on
- the partitions. As a partitioned table does not have any data
- directly, attempts to use <command>TRUNCATE</command>
+ partitions exist, is not supported. Instead, constraints on the
+ partitions themselves can be added and (if they are not present in the
+ parent table) dropped. As a partitioned table does not
+ have any data directly, attempts to use <command>TRUNCATE</command>
<literal>ONLY</literal> on a partitioned table will always return an
error.
</para>
</listitem>
<listitem>
<para>
Partitions cannot have columns that are not present in the parent. It
- is neither possible to specify columns when creating partitions with
- <command>CREATE TABLE</command> nor is it possible to add columns to
+ is not possible to specify columns when creating partitions with
+ <command>CREATE TABLE</command>, nor is it possible to add columns to
partitions after-the-fact using <command>ALTER TABLE</command>. Tables may be
added as a partition with <command>ALTER TABLE ... ATTACH PARTITION</command>
only if their columns exactly match the parent, including any
<literal>oid</literal> column.
</para>
</listitem>
<listitem>
<para>
You cannot drop the <literal>NOT NULL</literal> constraint on a
partition's column if the constraint is present in the parent table.
</para>
</listitem>
</itemizedlist>
@@ -3347,37 +3347,37 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
on individual partitions, not the partitioned table.
</para>
</listitem>
</itemizedlist>
</para>
</sect3>
</sect2>
<sect2 id="ddl-partitioning-implementation-inheritance">
<title>Implementation Using Inheritance</title>
<para>
While the built-in declarative partitioning is suitable for most
common use cases, there are some circumstances where a more flexible
approach may be useful. Partitioning can be implemented using table
- inheritance, which allows for several features which are not supported
+ inheritance, which allows for several features not supported
by declarative partitioning, such as:
<itemizedlist>
<listitem>
<para>
- Partitioning enforces a rule that all partitions must have exactly
- the same set of columns as the parent, but table inheritance allows
- children to have extra columns not present in the parent.
+ For declarative partitioning, partitions must have exactly the same set
+ of columns as the partitioned table, whereas with table inheritance,
+ child tables may have extra columns not present in the parent.
</para>
</listitem>
<listitem>
<para>
Table inheritance allows for multiple inheritance.
</para>
</listitem>
<listitem>
<para>
Declarative partitioning only supports range, list and hash
partitioning, whereas table inheritance allows data to be divided in a
manner of the user's choosing. (Note, however, that if constraint
@@ -3757,52 +3757,51 @@ ANALYZE measurement;
</itemizedlist>
</para>
</sect3>
</sect2>
<sect2 id="ddl-partition-pruning">
<title>Partition Pruning</title>
<indexterm>
<primary>partition pruning</primary>
</indexterm>
<para>
<firstterm>Partition pruning</firstterm> is a query optimization technique
- that improves performance for partitioned tables. As an example:
+ that improves performance for declaratively partitioned tables. As an example:
<programlisting>
SET enable_partition_pruning = on; -- the default
SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</programlisting>
Without partition pruning, the above query would scan each of the
partitions of the <structname>measurement</structname> table. With
partition pruning enabled, the planner will examine the definition
of each partition and prove that the partition need not
be scanned because it could not contain any rows meeting the query's
<literal>WHERE</literal> clause. When the planner can prove this, it
excludes (<firstterm>prunes</firstterm>) the partition from the query
plan.
</para>
<para>
- You can use the <command>EXPLAIN</command> command to show the
- difference between a plan whose partitions have been pruned from one
- whose partitions haven't, by using the
- <xref linkend="guc-enable-partition-pruning"/> configuration
- parameter. A typical unoptimized plan for this type of table setup
- is:
+ By using the EXPLAIN command and the <xref
+ linkend="guc-enable-partition-pruning"/> configuration parameter, it's
+ possible to show the difference between a plan whose partitions have been
+ pruned and one whose partitions haven't. A typical unoptimized plan for
+ this type of table setup is:
<programlisting>
SET enable_partition_pruning = off;
EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
QUERY PLAN
───────────────────────────────────────────────────────────────────────────────────
Aggregate (cost=188.76..188.77 rows=1 width=8)
-> Append (cost=0.00..181.05 rows=3085 width=0)
-> Seq Scan on measurement_y2006m02 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
-> Seq Scan on measurement_y2006m03 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
...
-> Seq Scan on measurement_y2007m11 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
@@ -3881,31 +3880,31 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</listitem>
</itemizedlist>
</para>
<para>
Partition pruning can be disabled using the
<xref linkend="guc-enable-partition-pruning"/> setting.
</para>
<note>
<para>
Currently, pruning of partitions during the planning of an
<command>UPDATE</command> or <command>DELETE</command> command is
implemented using the constraint exclusion method (however, it is
- still ruled by the <literal>enable_partition_pruning</literal>
- setting instead of <literal>constraint_exclusion</literal>) —
- see the next section for details and caveats that apply.
+ controlled ruled by the <literal>enable_partition_pruning</literal>
+ rather than <literal>constraint_exclusion</literal>) —
+ see the following section for details and caveats that apply.
</para>
<para>
Also, execution-time partition pruning currently only occurs for the
<literal>Append</literal> node type, not <literal>MergeAppend</literal>.
</para>
<para>
Both of these behaviors are likely to be changed in a future release
of <productname>PostgreSQL</productname>.
</para>
</note>
</sect2>
@@ -3916,72 +3915,72 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<primary>constraint exclusion</primary>
</indexterm>
<para>
<firstterm>Constraint exclusion</firstterm> is a query optimization
technique similar to partition pruning. While it is primarily used
for partitioned tables using the legacy inheritance method, it can be
used for other purposes, including with declarative partitioning.
</para>
<para>
Constraint exclusion works in a very similar way to partition
pruning, except that it uses each table's <literal>CHECK</literal>
constraints — which gives it its name — whereas partition
- pruning uses the table's partitioning constraint, which exists only in
- the case of declarative partitioning. Another difference is that it
- is only applied at plan time; there is no attempt to remove
- partitions at execution time.
+ pruning uses the table's partitioning bounds, which exists only in
+ the case of declarative partitioning. Another difference is that
+ constraint exclusion is only applied at plan time; there is no attempt to
+ remove partitions at execution time.
</para>
<para>
The fact that constraint exclusion uses <literal>CHECK</literal>
constraints, which makes it slow compared to partition pruning, can
sometimes be used as an advantage: because constraints can be defined
even on declaratively-partitioned tables, in addition to the internal
partitioning constraints, and only constraint exclusion would be able
to elide certain partitions from the query plan using those.
</para>
<para>
The default (and recommended) setting of
<xref linkend="guc-constraint-exclusion"/> is neither
<literal>on</literal> nor <literal>off</literal>, but an intermediate setting
called <literal>partition</literal>, which causes the technique to be
applied only to queries that are likely to be working on inheritance partitioned
tables. The <literal>on</literal> setting causes the planner to examine
<literal>CHECK</literal> constraints in all queries, even simple ones that
are unlikely to benefit.
</para>
<para>
The following caveats apply to constraint exclusion:
<itemizedlist>
<listitem>
<para>
- Constraint exclusion is only applied during query planning; it is
- not applied at execution time like partition pruning does.
+ Constraint exclusion is only applied during query planning; unlike
+ partition pruning, it cannot be not applied during execution.
</para>
</listitem>
<listitem>
<para>
Constraint exclusion only works when the query's <literal>WHERE</literal>
clause contains constants (or externally supplied parameters).
For example, a comparison against a non-immutable function such as
<function>CURRENT_TIMESTAMP</function> cannot be optimized, since the
- planner cannot know which partition the function value might fall
+ planner cannot know which partition the function's value might fall
into at run time.
</para>
</listitem>
<listitem>
<para>
Keep the partitioning constraints simple, else the planner may not be
able to prove that partitions don't need to be visited. Use simple
equality conditions for list partitioning, or simple
range tests for range partitioning, as illustrated in the preceding
examples. A good rule of thumb is that partitioning constraints should
contain only comparisons of the partitioning column(s) to constants
using B-tree-indexable operators, because only B-tree-indexable
column(s) are allowed in the partition key.
On 24 May 2018 at 09:35, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Fri, May 18, 2018 at 08:56:53PM -0500, Justin Pryzby wrote:
I reread this and have some more comments.
https://www.postgresql.org/docs/devel/static/ddl-partitioning.htmlLet me know if it's useful to provide a patch.
I propose this.
Thanks for working on this.
Can you just attach the patch?
Personally, for me, it's much easier to review when applied rather
than looking at an email.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, May 24, 2018 at 10:46:38AM +1200, David Rowley wrote:
On 24 May 2018 at 09:35, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Fri, May 18, 2018 at 08:56:53PM -0500, Justin Pryzby wrote:
I reread this and have some more comments.
https://www.postgresql.org/docs/devel/static/ddl-partitioning.htmlLet me know if it's useful to provide a patch.
I propose this.
Thanks for working on this.
Can you just attach the patch?
Attached.
Justin
Attachments:
doc-fix-partitioning-round2text/plain; charset=utf-8Download
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 2cd0b8a..31f3438 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2967,70 +2967,70 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
<para>
It is not possible to turn a regular table into a partitioned table or
vice versa. However, it is possible to add a regular or partitioned table
containing data as a partition of a partitioned table, or remove a
partition from a partitioned table turning it into a standalone table;
see <xref linkend="sql-altertable"/> to learn more about the
<command>ATTACH PARTITION</command> and <command>DETACH PARTITION</command>
sub-commands.
</para>
<para>
Individual partitions are linked to the partitioned table with inheritance
- behind-the-scenes; however, it is not possible to use some of the
- inheritance features discussed in the previous section with partitioned
- tables and partitions. For example, a partition cannot have any parents
+ behind-the-scenes; however, it is not possible to use some of the generic
+ features of inheritance (discussed below) with declaratively partitioned
+ tables or their partitions For example, a partition cannot have any parents
other than the partitioned table it is a partition of, nor can a regular
- table inherit from a partitioned table making the latter its parent.
- That means partitioned tables and partitions do not participate in
- inheritance with regular tables. Since a partition hierarchy consisting
- of the partitioned table and its partitions is still an inheritance
- hierarchy, all the normal rules of inheritance apply as described in
+ table inherit from a partitioned table making the latter its parent. That
+ means partitioned tables and partitions do not participate in inheritance
+ with regular tables. Since a partition hierarchy consisting of the
+ partitioned table and its partitions is still an inheritance hierarchy, all
+ the normal rules of inheritance apply as described in
<xref linkend="ddl-inherit"/> with some exceptions, most notably:
<itemizedlist>
<listitem>
<para>
Both <literal>CHECK</literal> and <literal>NOT NULL</literal>
constraints of a partitioned table are always inherited by all its
partitions. <literal>CHECK</literal> constraints that are marked
<literal>NO INHERIT</literal> are not allowed to be created on
partitioned tables.
</para>
</listitem>
<listitem>
<para>
Using <literal>ONLY</literal> to add or drop a constraint on only the
partitioned table is supported when there are no partitions. Once
partitions exist, using <literal>ONLY</literal> will result in an error
as adding or dropping constraints on only the partitioned table, when
- partitions exist, is not supported. Instead, constraints can be added
- or dropped, when they are not present in the parent table, directly on
- the partitions. As a partitioned table does not have any data
- directly, attempts to use <command>TRUNCATE</command>
+ partitions exist, is not supported. Instead, constraints on the
+ partitions themselves can be added and (if they are not present in the
+ parent table) dropped. As a partitioned table does not
+ have any data directly, attempts to use <command>TRUNCATE</command>
<literal>ONLY</literal> on a partitioned table will always return an
error.
</para>
</listitem>
<listitem>
<para>
Partitions cannot have columns that are not present in the parent. It
- is neither possible to specify columns when creating partitions with
- <command>CREATE TABLE</command> nor is it possible to add columns to
+ is not possible to specify columns when creating partitions with
+ <command>CREATE TABLE</command>, nor is it possible to add columns to
partitions after-the-fact using <command>ALTER TABLE</command>. Tables may be
added as a partition with <command>ALTER TABLE ... ATTACH PARTITION</command>
only if their columns exactly match the parent, including any
<literal>oid</literal> column.
</para>
</listitem>
<listitem>
<para>
You cannot drop the <literal>NOT NULL</literal> constraint on a
partition's column if the constraint is present in the parent table.
</para>
</listitem>
</itemizedlist>
@@ -3347,37 +3347,37 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
on individual partitions, not the partitioned table.
</para>
</listitem>
</itemizedlist>
</para>
</sect3>
</sect2>
<sect2 id="ddl-partitioning-implementation-inheritance">
<title>Implementation Using Inheritance</title>
<para>
While the built-in declarative partitioning is suitable for most
common use cases, there are some circumstances where a more flexible
approach may be useful. Partitioning can be implemented using table
- inheritance, which allows for several features which are not supported
+ inheritance, which allows for several features not supported
by declarative partitioning, such as:
<itemizedlist>
<listitem>
<para>
- Partitioning enforces a rule that all partitions must have exactly
- the same set of columns as the parent, but table inheritance allows
- children to have extra columns not present in the parent.
+ For declarative partitioning, partitions must have exactly the same set
+ of columns as the partitioned table, whereas with table inheritance,
+ child tables may have extra columns not present in the parent.
</para>
</listitem>
<listitem>
<para>
Table inheritance allows for multiple inheritance.
</para>
</listitem>
<listitem>
<para>
Declarative partitioning only supports range, list and hash
partitioning, whereas table inheritance allows data to be divided in a
manner of the user's choosing. (Note, however, that if constraint
@@ -3757,52 +3757,51 @@ ANALYZE measurement;
</itemizedlist>
</para>
</sect3>
</sect2>
<sect2 id="ddl-partition-pruning">
<title>Partition Pruning</title>
<indexterm>
<primary>partition pruning</primary>
</indexterm>
<para>
<firstterm>Partition pruning</firstterm> is a query optimization technique
- that improves performance for partitioned tables. As an example:
+ that improves performance for declaratively partitioned tables. As an example:
<programlisting>
SET enable_partition_pruning = on; -- the default
SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</programlisting>
Without partition pruning, the above query would scan each of the
partitions of the <structname>measurement</structname> table. With
partition pruning enabled, the planner will examine the definition
of each partition and prove that the partition need not
be scanned because it could not contain any rows meeting the query's
<literal>WHERE</literal> clause. When the planner can prove this, it
excludes (<firstterm>prunes</firstterm>) the partition from the query
plan.
</para>
<para>
- You can use the <command>EXPLAIN</command> command to show the
- difference between a plan whose partitions have been pruned from one
- whose partitions haven't, by using the
- <xref linkend="guc-enable-partition-pruning"/> configuration
- parameter. A typical unoptimized plan for this type of table setup
- is:
+ By using the EXPLAIN command and the <xref
+ linkend="guc-enable-partition-pruning"/> configuration parameter, it's
+ possible to show the difference between a plan whose partitions have been
+ pruned and one whose partitions haven't. A typical unoptimized plan for
+ this type of table setup is:
<programlisting>
SET enable_partition_pruning = off;
EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
QUERY PLAN
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
Aggregate (cost=188.76..188.77 rows=1 width=8)
-> Append (cost=0.00..181.05 rows=3085 width=0)
-> Seq Scan on measurement_y2006m02 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
-> Seq Scan on measurement_y2006m03 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
...
-> Seq Scan on measurement_y2007m11 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
@@ -3881,31 +3880,31 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</listitem>
</itemizedlist>
</para>
<para>
Partition pruning can be disabled using the
<xref linkend="guc-enable-partition-pruning"/> setting.
</para>
<note>
<para>
Currently, pruning of partitions during the planning of an
<command>UPDATE</command> or <command>DELETE</command> command is
implemented using the constraint exclusion method (however, it is
- still ruled by the <literal>enable_partition_pruning</literal>
- setting instead of <literal>constraint_exclusion</literal>) —
- see the next section for details and caveats that apply.
+ controlled ruled by the <literal>enable_partition_pruning</literal>
+ rather than <literal>constraint_exclusion</literal>) —
+ see the following section for details and caveats that apply.
</para>
<para>
Also, execution-time partition pruning currently only occurs for the
<literal>Append</literal> node type, not <literal>MergeAppend</literal>.
</para>
<para>
Both of these behaviors are likely to be changed in a future release
of <productname>PostgreSQL</productname>.
</para>
</note>
</sect2>
@@ -3916,72 +3915,72 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<primary>constraint exclusion</primary>
</indexterm>
<para>
<firstterm>Constraint exclusion</firstterm> is a query optimization
technique similar to partition pruning. While it is primarily used
for partitioned tables using the legacy inheritance method, it can be
used for other purposes, including with declarative partitioning.
</para>
<para>
Constraint exclusion works in a very similar way to partition
pruning, except that it uses each table's <literal>CHECK</literal>
constraints — which gives it its name — whereas partition
- pruning uses the table's partitioning constraint, which exists only in
- the case of declarative partitioning. Another difference is that it
- is only applied at plan time; there is no attempt to remove
- partitions at execution time.
+ pruning uses the table's partitioning bounds, which exists only in
+ the case of declarative partitioning. Another difference is that
+ constraint exclusion is only applied at plan time; there is no attempt to
+ remove partitions at execution time.
</para>
<para>
The fact that constraint exclusion uses <literal>CHECK</literal>
constraints, which makes it slow compared to partition pruning, can
sometimes be used as an advantage: because constraints can be defined
even on declaratively-partitioned tables, in addition to the internal
partitioning constraints, and only constraint exclusion would be able
to elide certain partitions from the query plan using those.
</para>
<para>
The default (and recommended) setting of
<xref linkend="guc-constraint-exclusion"/> is neither
<literal>on</literal> nor <literal>off</literal>, but an intermediate setting
called <literal>partition</literal>, which causes the technique to be
applied only to queries that are likely to be working on inheritance partitioned
tables. The <literal>on</literal> setting causes the planner to examine
<literal>CHECK</literal> constraints in all queries, even simple ones that
are unlikely to benefit.
</para>
<para>
The following caveats apply to constraint exclusion:
<itemizedlist>
<listitem>
<para>
- Constraint exclusion is only applied during query planning; it is
- not applied at execution time like partition pruning does.
+ Constraint exclusion is only applied during query planning; unlike
+ partition pruning, it cannot be not applied during execution.
</para>
</listitem>
<listitem>
<para>
Constraint exclusion only works when the query's <literal>WHERE</literal>
clause contains constants (or externally supplied parameters).
For example, a comparison against a non-immutable function such as
<function>CURRENT_TIMESTAMP</function> cannot be optimized, since the
- planner cannot know which partition the function value might fall
+ planner cannot know which partition the function's value might fall
into at run time.
</para>
</listitem>
<listitem>
<para>
Keep the partitioning constraints simple, else the planner may not be
able to prove that partitions don't need to be visited. Use simple
equality conditions for list partitioning, or simple
range tests for range partitioning, as illustrated in the preceding
examples. A good rule of thumb is that partitioning constraints should
contain only comparisons of the partitioning column(s) to constants
using B-tree-indexable operators, because only B-tree-indexable
column(s) are allowed in the partition key.
Hi Justin.
Thanks for writing the patch. I have a couple of comments.
On 2018/05/24 8:31, Justin Pryzby wrote:
On Thu, May 24, 2018 at 10:46:38AM +1200, David Rowley wrote:
On 24 May 2018 at 09:35, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Fri, May 18, 2018 at 08:56:53PM -0500, Justin Pryzby wrote:
I reread this and have some more comments.
https://www.postgresql.org/docs/devel/static/ddl-partitioning.htmlLet me know if it's useful to provide a patch.
I propose this.
Thanks for working on this.
Can you just attach the patch?
Attached.
- behind-the-scenes; however, it is not possible to use some of the
- inheritance features discussed in the previous section with partitioned
- tables and partitions. For example, a partition cannot have any parents
+ behind-the-scenes; however, it is not possible to use some of the generic
+ features of inheritance (discussed below) with declaratively partitioned
+ tables or their partitions For example, a partition cannot have any
parents
As I recall, I had written the "previous section" in the original text to
mean 5.9 Inheritance
https://www.postgresql.org/docs/devel/static/ddl-inherit.html
Although, we do list some inheritance features that cannot be used with
declarative partitioned tables on the same page in 5.10.3, so what you
have here may be fine.
+ possible to show the difference between a plan whose partitions have been
+ pruned and one whose partitions haven't. A typical unoptimized plan for
+ this type of table setup is:
"a plan whose partitions have been pruned" sounds a bit off; maybe, "a
plan in which partitions have been pruned".
+ controlled ruled by the <literal>enable_partition_pruning</literal>
controlled ruled by -> still controlled by
- pruning uses the table's partitioning constraint, which exists only in
- the case of declarative partitioning.
...
+ pruning uses the table's partitioning bounds, which exists only in
+ the case of declarative partitioning.
Maybe say "partition bounds" here if change it at all.
Thanks,
Amit
On Thu, May 24, 2018 at 11:30:40AM +0900, Amit Langote wrote:
Hi Justin.
Thanks for writing the patch. I have a couple of comments.
Thanks for your review, find attached updated patch.
+ possible to show the difference between a plan whose partitions have been + pruned and one whose partitions haven't. A typical unoptimized plan for + this type of table setup is:"a plan whose partitions have been pruned" sounds a bit off; maybe, "a
plan in which partitions have been pruned".
I wrote:
"[...] a plan for which partitions have been pruned and for which they have
not."
Cheers,
Justin
Attachments:
v2-doc-fix-partitioning-round2text/plain; charset=utf-8Download
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 2cd0b8a..bae2589 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2967,70 +2967,70 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
<para>
It is not possible to turn a regular table into a partitioned table or
vice versa. However, it is possible to add a regular or partitioned table
containing data as a partition of a partitioned table, or remove a
partition from a partitioned table turning it into a standalone table;
see <xref linkend="sql-altertable"/> to learn more about the
<command>ATTACH PARTITION</command> and <command>DETACH PARTITION</command>
sub-commands.
</para>
<para>
Individual partitions are linked to the partitioned table with inheritance
- behind-the-scenes; however, it is not possible to use some of the
- inheritance features discussed in the previous section with partitioned
- tables and partitions. For example, a partition cannot have any parents
+ behind-the-scenes; however, it is not possible to use some of the generic
+ features of inheritance (discussed below) with declaratively partitioned
+ tables or their partitions For example, a partition cannot have any parents
other than the partitioned table it is a partition of, nor can a regular
- table inherit from a partitioned table making the latter its parent.
- That means partitioned tables and partitions do not participate in
- inheritance with regular tables. Since a partition hierarchy consisting
- of the partitioned table and its partitions is still an inheritance
- hierarchy, all the normal rules of inheritance apply as described in
+ table inherit from a partitioned table making the latter its parent. That
+ means partitioned tables and partitions do not participate in inheritance
+ with regular tables. Since a partition hierarchy consisting of the
+ partitioned table and its partitions is still an inheritance hierarchy, all
+ the normal rules of inheritance apply as described in
<xref linkend="ddl-inherit"/> with some exceptions, most notably:
<itemizedlist>
<listitem>
<para>
Both <literal>CHECK</literal> and <literal>NOT NULL</literal>
constraints of a partitioned table are always inherited by all its
partitions. <literal>CHECK</literal> constraints that are marked
<literal>NO INHERIT</literal> are not allowed to be created on
partitioned tables.
</para>
</listitem>
<listitem>
<para>
Using <literal>ONLY</literal> to add or drop a constraint on only the
partitioned table is supported when there are no partitions. Once
partitions exist, using <literal>ONLY</literal> will result in an error
as adding or dropping constraints on only the partitioned table, when
- partitions exist, is not supported. Instead, constraints can be added
- or dropped, when they are not present in the parent table, directly on
- the partitions. As a partitioned table does not have any data
- directly, attempts to use <command>TRUNCATE</command>
+ partitions exist, is not supported. Instead, constraints on the
+ partitions themselves can be added and (if they are not present in the
+ parent table) dropped. As a partitioned table does not
+ have any data directly, attempts to use <command>TRUNCATE</command>
<literal>ONLY</literal> on a partitioned table will always return an
error.
</para>
</listitem>
<listitem>
<para>
Partitions cannot have columns that are not present in the parent. It
- is neither possible to specify columns when creating partitions with
- <command>CREATE TABLE</command> nor is it possible to add columns to
+ is not possible to specify columns when creating partitions with
+ <command>CREATE TABLE</command>, nor is it possible to add columns to
partitions after-the-fact using <command>ALTER TABLE</command>. Tables may be
added as a partition with <command>ALTER TABLE ... ATTACH PARTITION</command>
only if their columns exactly match the parent, including any
<literal>oid</literal> column.
</para>
</listitem>
<listitem>
<para>
You cannot drop the <literal>NOT NULL</literal> constraint on a
partition's column if the constraint is present in the parent table.
</para>
</listitem>
</itemizedlist>
@@ -3347,37 +3347,37 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
on individual partitions, not the partitioned table.
</para>
</listitem>
</itemizedlist>
</para>
</sect3>
</sect2>
<sect2 id="ddl-partitioning-implementation-inheritance">
<title>Implementation Using Inheritance</title>
<para>
While the built-in declarative partitioning is suitable for most
common use cases, there are some circumstances where a more flexible
approach may be useful. Partitioning can be implemented using table
- inheritance, which allows for several features which are not supported
+ inheritance, which allows for several features not supported
by declarative partitioning, such as:
<itemizedlist>
<listitem>
<para>
- Partitioning enforces a rule that all partitions must have exactly
- the same set of columns as the parent, but table inheritance allows
- children to have extra columns not present in the parent.
+ For declarative partitioning, partitions must have exactly the same set
+ of columns as the partitioned table, whereas with table inheritance,
+ child tables may have extra columns not present in the parent.
</para>
</listitem>
<listitem>
<para>
Table inheritance allows for multiple inheritance.
</para>
</listitem>
<listitem>
<para>
Declarative partitioning only supports range, list and hash
partitioning, whereas table inheritance allows data to be divided in a
manner of the user's choosing. (Note, however, that if constraint
@@ -3757,52 +3757,51 @@ ANALYZE measurement;
</itemizedlist>
</para>
</sect3>
</sect2>
<sect2 id="ddl-partition-pruning">
<title>Partition Pruning</title>
<indexterm>
<primary>partition pruning</primary>
</indexterm>
<para>
<firstterm>Partition pruning</firstterm> is a query optimization technique
- that improves performance for partitioned tables. As an example:
+ that improves performance for declaratively partitioned tables. As an example:
<programlisting>
SET enable_partition_pruning = on; -- the default
SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</programlisting>
Without partition pruning, the above query would scan each of the
partitions of the <structname>measurement</structname> table. With
partition pruning enabled, the planner will examine the definition
of each partition and prove that the partition need not
be scanned because it could not contain any rows meeting the query's
<literal>WHERE</literal> clause. When the planner can prove this, it
excludes (<firstterm>prunes</firstterm>) the partition from the query
plan.
</para>
<para>
- You can use the <command>EXPLAIN</command> command to show the
- difference between a plan whose partitions have been pruned from one
- whose partitions haven't, by using the
- <xref linkend="guc-enable-partition-pruning"/> configuration
- parameter. A typical unoptimized plan for this type of table setup
- is:
+ By using the EXPLAIN command and the <xref
+ linkend="guc-enable-partition-pruning"/> configuration parameter, it's
+ possible to show the difference between a plan for which partitions have
+ been pruned and for which they have not. A typical unoptimized plan for
+ this type of table setup is:
<programlisting>
SET enable_partition_pruning = off;
EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
QUERY PLAN
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
Aggregate (cost=188.76..188.77 rows=1 width=8)
-> Append (cost=0.00..181.05 rows=3085 width=0)
-> Seq Scan on measurement_y2006m02 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
-> Seq Scan on measurement_y2006m03 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
...
-> Seq Scan on measurement_y2007m11 (cost=0.00..33.12 rows=617 width=0)
Filter: (logdate >= '2008-01-01'::date)
@@ -3881,31 +3880,31 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
</listitem>
</itemizedlist>
</para>
<para>
Partition pruning can be disabled using the
<xref linkend="guc-enable-partition-pruning"/> setting.
</para>
<note>
<para>
Currently, pruning of partitions during the planning of an
<command>UPDATE</command> or <command>DELETE</command> command is
implemented using the constraint exclusion method (however, it is
- still ruled by the <literal>enable_partition_pruning</literal>
- setting instead of <literal>constraint_exclusion</literal>) —
- see the next section for details and caveats that apply.
+ controlled by the <literal>enable_partition_pruning</literal> rather than
+ <literal>constraint_exclusion</literal>) — see the following section
+ for details and caveats that apply.
</para>
<para>
Also, execution-time partition pruning currently only occurs for the
<literal>Append</literal> node type, not <literal>MergeAppend</literal>.
</para>
<para>
Both of these behaviors are likely to be changed in a future release
of <productname>PostgreSQL</productname>.
</para>
</note>
</sect2>
@@ -3916,72 +3915,72 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<primary>constraint exclusion</primary>
</indexterm>
<para>
<firstterm>Constraint exclusion</firstterm> is a query optimization
technique similar to partition pruning. While it is primarily used
for partitioned tables using the legacy inheritance method, it can be
used for other purposes, including with declarative partitioning.
</para>
<para>
Constraint exclusion works in a very similar way to partition
pruning, except that it uses each table's <literal>CHECK</literal>
constraints — which gives it its name — whereas partition
- pruning uses the table's partitioning constraint, which exists only in
- the case of declarative partitioning. Another difference is that it
- is only applied at plan time; there is no attempt to remove
- partitions at execution time.
+ pruning uses the table's partition bounds, which exists only in the case of
+ declarative partitioning. Another difference is that constraint exclusion
+ is only applied at plan time; there is no attempt to remove partitions at
+ execution time.
</para>
<para>
The fact that constraint exclusion uses <literal>CHECK</literal>
constraints, which makes it slow compared to partition pruning, can
sometimes be used as an advantage: because constraints can be defined
even on declaratively-partitioned tables, in addition to the internal
partitioning constraints, and only constraint exclusion would be able
to elide certain partitions from the query plan using those.
</para>
<para>
The default (and recommended) setting of
<xref linkend="guc-constraint-exclusion"/> is neither
<literal>on</literal> nor <literal>off</literal>, but an intermediate setting
called <literal>partition</literal>, which causes the technique to be
applied only to queries that are likely to be working on inheritance partitioned
tables. The <literal>on</literal> setting causes the planner to examine
<literal>CHECK</literal> constraints in all queries, even simple ones that
are unlikely to benefit.
</para>
<para>
The following caveats apply to constraint exclusion:
<itemizedlist>
<listitem>
<para>
- Constraint exclusion is only applied during query planning; it is
- not applied at execution time like partition pruning does.
+ Constraint exclusion is only applied during query planning; unlike
+ partition pruning, it cannot be not applied during execution.
</para>
</listitem>
<listitem>
<para>
Constraint exclusion only works when the query's <literal>WHERE</literal>
clause contains constants (or externally supplied parameters).
For example, a comparison against a non-immutable function such as
<function>CURRENT_TIMESTAMP</function> cannot be optimized, since the
- planner cannot know which partition the function value might fall
+ planner cannot know which partition the function's value might fall
into at run time.
</para>
</listitem>
<listitem>
<para>
Keep the partitioning constraints simple, else the planner may not be
able to prove that partitions don't need to be visited. Use simple
equality conditions for list partitioning, or simple
range tests for range partitioning, as illustrated in the preceding
examples. A good rule of thumb is that partitioning constraints should
contain only comparisons of the partitioning column(s) to constants
using B-tree-indexable operators, because only B-tree-indexable
column(s) are allowed in the partition key.
On Wed, May 2, 2018 at 07:49:42PM -0700, David G. Johnston wrote:
On Wednesday, May 2, 2018, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Robert Haas wrote:
On Wed, May 2, 2018 at 9:28 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
I admit I am more concerned about the possibility of bugs than I am
about providing a performance-related tool.I agree that if partition pruning has bugs, somebody might want to
turn it off.� On the other hand, when they do, there's a good chance
that they will lose so much performance that they'll still be pretty
sad.� Somebody certainly could have a workload where the pruning
helps, but by a small enough amount that shutting it off is
acceptable.� But I suspect that's a somewhat narrow target.I'm not going to go to war over this, though.� I'm just telling you
what I think.Well, we didn't have a GUC initially, evidently because none of us
thought that this would be a huge problem.� So maybe you're both right
and it's overkill to have it.� I'm not set on having it, either.� Does
anybody else have an opinion?I toss my +1 to removing it altogether.
+1 We are terrible at removing old GUCs and having it around means
everyone has to decide if they need to change it, so having it is not a
zero cost.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
On 2018-May-24, Bruce Momjian wrote:
On Wed, May 2, 2018 at 07:49:42PM -0700, David G. Johnston wrote:
I toss my +1 to removing it altogether.
+1 We are terrible at removing old GUCs and having it around means
everyone has to decide if they need to change it, so having it is not a
zero cost.
Are you voting to remove the GUC?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, May 24, 2018 at 02:23:17PM -0400, Alvaro Herrera wrote:
On 2018-May-24, Bruce Momjian wrote:
On Wed, May 2, 2018 at 07:49:42PM -0700, David G. Johnston wrote:
I toss my +1 to removing it altogether.
+1 We are terrible at removing old GUCs and having it around means
everyone has to decide if they need to change it, so having it is not a
zero cost.Are you voting to remove the GUC?
Yes. Sorry but I am very late on this and maybe too late to vote.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
Pushed. I made a couple of minor changes, in particular I added the
word "one" to this sentence, which was already under discussion:
On 2018-May-24, Justin Pryzby wrote:
On Thu, May 24, 2018 at 11:30:40AM +0900, Amit Langote wrote:
+ possible to show the difference between a plan whose partitions have been + pruned and one whose partitions haven't. A typical unoptimized plan for + this type of table setup is:"a plan whose partitions have been pruned" sounds a bit off; maybe, "a
plan in which partitions have been pruned".I wrote:
"[...] a plan for which partitions have been pruned and for which they have
not."
"it's possible to show the difference between a plan for which partitions have
been pruned and *one* for which they have not."
Thanks!
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-May-23, Justin Pryzby wrote:
There's two other, wider changes to consider:
- should "5.10.4. Partition Pruning" be moved after "5.10.2. Declarative
Partitioning", rather than after "5.10.3. Implementation Using Inheritance" ?
I considered that when reorganizing this section, but it seemed more
sensible to me to keep both pruning techniques together rather than put
each one immediately below its partitioning technique. Maybe I'm wrong
in that.
- should we find a unified term for "inheritence-based partitioning" and avoid
using the word "partitioning" in that context? For example: "Partitioning
can be implemented using table inheritance[...]". One possible phrase
currently begin used is: "legacy inheritance method".
Yeah, maybe it'd be a good time to do that. In particular I wondered
whether the section title "Partitioning and Constraint Exclusion" should
be changed somehow to note the fact that it's mostly for the legacy
method.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jun 01, 2018 at 03:00:10PM -0400, Alvaro Herrera wrote:
On 2018-May-23, Justin Pryzby wrote:
There's two other, wider changes to consider:
...
- should we find a unified term for "inheritence-based partitioning" and avoid
using the word "partitioning" in that context? For example: "Partitioning
can be implemented using table inheritance[...]". One possible phrase
currently begin used is: "legacy inheritance method".Yeah, maybe it'd be a good time to do that. In particular I wondered
whether the section title "Partitioning and Constraint Exclusion" should
be changed somehow to note the fact that it's mostly for the legacy
method.
I made changes to avoid "partition" (which I think should mean a child of
relkind='p', and itself of relkind='r') and "partitioned" (meaning relkind='p'
itself) but left alone most instances of "partitioning".
There's two issues. One is the unfortunately-named, recommended setting of
constraint_exclusion='partition' :(
And one is this, which I think should be disambiguated from native
list/range/hash partition bounds:
Use simple equality conditions for list partitioning, or simple range
tests for range partitioning, as illustrated in the preceding examples.
I'm short on words so maybe someone else can recommend language.
On Fri, Jun 01, 2018 at 02:57:22PM -0400, Alvaro Herrera wrote:
Pushed. I made a couple of minor changes, in particular I added the
It looks like you also fixed a double negative - thanks.
Justin
Attachments:
v0-doc-fix-partitioning-round3text/plain; charset=us-asciiDownload
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 0258391..d919818 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3387,8 +3387,8 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
Declarative partitioning only supports range, list and hash
partitioning, whereas table inheritance allows data to be divided in a
manner of the user's choosing. (Note, however, that if constraint
- exclusion is unable to prune partitions effectively, query performance
- will be very poor.)
+ exclusion is unable to prune child tables effectively, query performance
+ may be poor.)
</para>
</listitem>
@@ -3410,18 +3410,18 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
<para>
We use the same <structname>measurement</structname> table we used
- above. To implement it as a partitioned table using inheritance, use
+ above. To implement partitioning using inheritance, use
the following steps:
<orderedlist spacing="compact">
<listitem>
<para>
Create the <quote>master</quote> table, from which all of the
- partitions will inherit. This table will contain no data. Do not
+ <quote>child</quote> tables will inherit. This table will contain no data. Do not
define any check constraints on this table, unless you intend them
- to be applied equally to all partitions. There is no point in
+ to be applied equally to all child tables. There is no point in
defining any indexes or unique constraints on it, either. For our
- example, master table is the <structname>measurement</structname>
+ example, the master table is the <structname>measurement</structname>
table as originally defined.
</para>
</listitem>
@@ -3431,7 +3431,7 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
Create several <quote>child</quote> tables that each inherit from
the master table. Normally, these tables will not add any columns
to the set inherited from the master. Just as with declarative
- partitioning, these partitions are in every way normal
+ partitioning, these tables are in every way normal
<productname>PostgreSQL</productname> tables (or foreign tables).
</para>
@@ -3449,8 +3449,8 @@ CREATE TABLE measurement_y2008m01 () INHERITS (measurement);
<listitem>
<para>
- Add non-overlapping table constraints to the partition tables to
- define the allowed key values in each partition.
+ Add non-overlapping table constraints to the child tables to
+ define the allowed key values in each.
</para>
<para>
@@ -3461,18 +3461,18 @@ CHECK ( county IN ( 'Oxfordshire', 'Buckinghamshire', 'Warwickshire' ))
CHECK ( outletID >= 100 AND outletID < 200 )
</programlisting>
Ensure that the constraints guarantee that there is no overlap
- between the key values permitted in different partitions. A common
+ between the key values permitted in different child tables. A common
mistake is to set up range constraints like:
<programlisting>
CHECK ( outletID BETWEEN 100 AND 200 )
CHECK ( outletID BETWEEN 200 AND 300 )
</programlisting>
- This is wrong since it is not clear which partition the key value
- 200 belongs in.
+ This is wrong since it is not clear into which child table the key
+ value 200 belongs.
</para>
<para>
- It would be better to instead create partitions as follows:
+ It would be better to instead create child tables as follows:
<programlisting>
CREATE TABLE measurement_y2006m02 (
@@ -3501,7 +3501,7 @@ CREATE TABLE measurement_y2008m01 (
<listitem>
<para>
- For each partition, create an index on the key column(s),
+ For each child table, create an index on the key column(s),
as well as any other indexes you might want.
<programlisting>
CREATE INDEX measurement_y2006m02_logdate ON measurement_y2006m02 (logdate);
@@ -3517,9 +3517,9 @@ CREATE INDEX measurement_y2008m01_logdate ON measurement_y2008m01 (logdate);
<para>
We want our application to be able to say <literal>INSERT INTO
measurement ...</literal> and have the data be redirected into the
- appropriate partition table. We can arrange that by attaching
+ appropriate child table. We can arrange that by attaching
a suitable trigger function to the master table.
- If data will be added only to the latest partition, we can
+ If data will be added only to the latest child, we can
use a very simple trigger function:
<programlisting>
@@ -3545,13 +3545,13 @@ CREATE TRIGGER insert_measurement_trigger
</programlisting>
We must redefine the trigger function each month so that it always
- points to the current partition. The trigger definition does
+ points to the current child table. The trigger definition does
not need to be updated, however.
</para>
<para>
We might want to insert data and have the server automatically
- locate the partition into which the row should be added. We
+ locate the child table into which the row should be added. We
could do this with a more complex trigger function, for example:
<programlisting>
@@ -3579,7 +3579,7 @@ LANGUAGE plpgsql;
The trigger definition is the same as before.
Note that each <literal>IF</literal> test must exactly match the
- <literal>CHECK</literal> constraint for its partition.
+ <literal>CHECK</literal> constraint for its child table.
</para>
<para>
@@ -3590,8 +3590,8 @@ LANGUAGE plpgsql;
<note>
<para>
- In practice it might be best to check the newest partition first,
- if most inserts go into that partition. For simplicity we have
+ In practice, it might be best to check the newest child first,
+ if most inserts go into that child. For simplicity, we have
shown the trigger's tests in the same order as in other parts
of this example.
</para>
@@ -3599,7 +3599,7 @@ LANGUAGE plpgsql;
<para>
A different approach to redirecting inserts into the appropriate
- partition table is to set up rules, instead of a trigger, on the
+ child table is to set up rules, instead of a trigger, on the
master table. For example:
<programlisting>
@@ -3625,7 +3625,7 @@ DO INSTEAD
<para>
Be aware that <command>COPY</command> ignores rules. If you want to
use <command>COPY</command> to insert data, you'll need to copy into the
- correct partition table rather than into the master. <command>COPY</command>
+ correct child table rather than directly into the master. <command>COPY</command>
does fire triggers, so you can use it normally if you use the trigger
approach.
</para>
@@ -3641,25 +3641,25 @@ DO INSTEAD
<para>
Ensure that the <xref linkend="guc-constraint-exclusion"/>
configuration parameter is not disabled in
- <filename>postgresql.conf</filename>.
- If it is, queries will not be optimized as desired.
+ <filename>postgresql.conf</filename>, otherwise
+ child tables may be accessed unnecessarily.
</para>
</listitem>
</orderedlist>
</para>
<para>
- As we can see, a complex partitioning scheme could require a
+ As we can see, a complex table hierarchy could require a
substantial amount of DDL. In the above example we would be creating
- a new partition each month, so it might be wise to write a script that
+ a new child table each month, so it might be wise to write a script that
generates the required DDL automatically.
</para>
</sect3>
<sect3 id="ddl-partitioning-inheritance-maintenance">
- <title>Partition Maintenance</title>
+ <title>Maintenance for Inheritence Partitioning</title>
<para>
- To remove old data quickly, simply drop the partition that is no longer
+ To remove old data quickly, simply drop the child table that is no longer
necessary:
<programlisting>
DROP TABLE measurement_y2006m02;
@@ -3667,7 +3667,7 @@ DROP TABLE measurement_y2006m02;
</para>
<para>
- To remove the partition from the partitioned table but retain access to
+ To remove the child table from the inheritence hierarchy table but retain access to
it as a table in its own right:
<programlisting>
@@ -3676,8 +3676,8 @@ ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
</para>
<para>
- To add a new partition to handle new data, create an empty partition
- just as the original partitions were created above:
+ To add a new child table to handle new data, create an empty child table
+ just as the original children were created above:
<programlisting>
CREATE TABLE measurement_y2008m02 (
@@ -3685,9 +3685,11 @@ CREATE TABLE measurement_y2008m02 (
) INHERITS (measurement);
</programlisting>
- Alternatively, one may want to create the new table outside the partition
- structure, and make it a partition after the data is loaded, checked,
- and transformed.
+ Alternatively, one may want to create and populate the new child table
+ before adding it to the table hierarchy — this could allow data to be
+ loaded, checked, and transformed before being made visible to queries on the
+ parent table.
+
<programlisting>
CREATE TABLE measurement_y2008m02
@@ -3705,7 +3707,7 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
<title>Caveats</title>
<para>
- The following caveats apply to partitioned tables implemented using
+ The following caveats apply to partitioning implemented using
inheritance:
<itemizedlist>
<listitem>
@@ -3713,19 +3715,20 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
There is no automatic way to verify that all of the
<literal>CHECK</literal> constraints are mutually
exclusive. It is safer to create code that generates
- partitions and creates and/or modifies associated objects than
+ child tables and creates and/or modifies associated objects than
to write each by hand.
</para>
</listitem>
<listitem>
<para>
- The schemes shown here assume that the partition key column(s)
- of a row never change, or at least do not change enough to require
- it to move to another partition. An <command>UPDATE</command> that attempts
+ The schemes shown here assume that the values of a row's key column(s)
+ never change, or at least do not change enough to make it inconsistent
+ with the constraints on its table.
+ An <command>UPDATE</command> that attempts
to do that will fail because of the <literal>CHECK</literal> constraints.
If you need to handle such cases, you can put suitable update triggers
- on the partition tables, but it makes management of the structure
+ on the child tables, but it makes management of the structure
much more complicated.
</para>
</listitem>
@@ -3734,7 +3737,7 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
<para>
If you are using manual <command>VACUUM</command> or
<command>ANALYZE</command> commands, don't forget that
- you need to run them on each partition individually. A command like:
+ you need to run them on each child table individually. A command like:
<programlisting>
ANALYZE measurement;
</programlisting>
@@ -3754,7 +3757,7 @@ ANALYZE measurement;
<listitem>
<para>
Triggers or rules will be needed to route rows to the desired
- partition, unless the application is explicitly aware of the
+ child table, unless the application is explicitly aware of the
partitioning scheme. Triggers may be complicated to write, and will
be much slower than the tuple routing performed internally by
declarative partitioning.
@@ -3925,7 +3928,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<para>
<firstterm>Constraint exclusion</firstterm> is a query optimization
technique similar to partition pruning. While it is primarily used
- for partitioned tables using the legacy inheritance method, it can be
+ for partitioning implemented using the legacy inheritance method, it can be
used for other purposes, including with declarative partitioning.
</para>
@@ -3933,7 +3936,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
Constraint exclusion works in a very similar way to partition
pruning, except that it uses each table's <literal>CHECK</literal>
constraints — which gives it its name — whereas partition
- pruning uses the table's partition bounds, which exists only in the
+ pruning uses the table's partition bounds, which exist only in the
case of declarative partitioning. Another difference is that
constraint exclusion is only applied at plan time; there is no attempt
to remove partitions at execution time.
@@ -3943,9 +3946,9 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
The fact that constraint exclusion uses <literal>CHECK</literal>
constraints, which makes it slow compared to partition pruning, can
sometimes be used as an advantage: because constraints can be defined
- even on declaratively-partitioned tables, in addition to the internal
- partitioning constraints, and only constraint exclusion would be able
- to elide certain partitions from the query plan using those.
+ even on declaratively-partitioned tables, in addition to their internal
+ partition bounds, constraint exclusion may be able
+ to elide additional partitions from the query plan.
</para>
<para>
@@ -3976,7 +3979,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
clause contains constants (or externally supplied parameters).
For example, a comparison against a non-immutable function such as
<function>CURRENT_TIMESTAMP</function> cannot be optimized, since the
- planner cannot know which partition the function's value might fall
+ planner cannot know which child table the function's value might fall
into at run time.
</para>
</listitem>
@@ -3984,7 +3987,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<listitem>
<para>
Keep the partitioning constraints simple, else the planner may not be
- able to prove that partitions don't need to be visited. Use simple
+ able to prove that child tables may not need to be visited. Use simple
equality conditions for list partitioning, or simple
range tests for range partitioning, as illustrated in the preceding
examples. A good rule of thumb is that partitioning constraints should
@@ -3996,11 +3999,11 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<listitem>
<para>
- All constraints on all partitions of the master table are examined
- during constraint exclusion, so large numbers of partitions are likely
+ All constraints on all children of the parent table are examined
+ during constraint exclusion, so large numbers of children are likely
to increase query planning time considerably. So the legacy
inheritance based partitioning will work well with up to perhaps a
- hundred partitions; don't try to use many thousands of partitions.
+ hundred child tables; don't try to use many thousands of children.
</para>
</listitem>
On 01.06.18 23:33, Justin Pryzby wrote:
- should we find a unified term for "inheritence-based partitioning" and avoid
using the word "partitioning" in that context? For example: "Partitioning
can be implemented using table inheritance[...]". One possible phrase
currently begin used is: "legacy inheritance method".Yeah, maybe it'd be a good time to do that. In particular I wondered
whether the section title "Partitioning and Constraint Exclusion" should
be changed somehow to note the fact that it's mostly for the legacy
method.I made changes to avoid "partition" (which I think should mean a child of
relkind='p', and itself of relkind='r') and "partitioned" (meaning relkind='p'
itself) but left alone most instances of "partitioning".
Committed.
There's two issues. One is the unfortunately-named, recommended setting of
constraint_exclusion='partition' :(And one is this, which I think should be disambiguated from native
list/range/hash partition bounds:Use simple equality conditions for list partitioning, or simple range
tests for range partitioning, as illustrated in the preceding examples.I'm short on words so maybe someone else can recommend language.
I'm not worried about those.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-Jul-05, Peter Eisentraut wrote:
On 01.06.18 23:33, Justin Pryzby wrote:
- should we find a unified term for "inheritence-based partitioning" and avoid
using the word "partitioning" in that context? For example: "Partitioning
can be implemented using table inheritance[...]". One possible phrase
currently begin used is: "legacy inheritance method".Yeah, maybe it'd be a good time to do that. In particular I wondered
whether the section title "Partitioning and Constraint Exclusion" should
be changed somehow to note the fact that it's mostly for the legacy
method.I made changes to avoid "partition" (which I think should mean a child of
relkind='p', and itself of relkind='r') and "partitioned" (meaning relkind='p'
itself) but left alone most instances of "partitioning".Committed.
Thanks for handling this.
Should we do this in REL_11_STABLE too? I vote yes.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 6 July 2018 at 09:41, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2018-Jul-05, Peter Eisentraut wrote:
Committed.
Thanks for handling this.
Should we do this in REL_11_STABLE too? I vote yes.
Sorry for now paying much attention to this, but I've read through
what's been committed and I also think PG11 deserves this too.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2018/07/06 6:55, David Rowley wrote:
On 6 July 2018 at 09:41, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2018-Jul-05, Peter Eisentraut wrote:
Committed.
Thanks for handling this.
Should we do this in REL_11_STABLE too? I vote yes.
Sorry for now paying much attention to this, but I've read through
what's been committed and I also think PG11 deserves this too.
+1
Thanks Justin and Peter.
Regards,
Amit
On 2018-Jul-06, Amit Langote wrote:
On 2018/07/06 6:55, David Rowley wrote:
On 6 July 2018 at 09:41, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2018-Jul-05, Peter Eisentraut wrote:
Committed.
Thanks for handling this.
Should we do this in REL_11_STABLE too? I vote yes.
Sorry for now paying much attention to this, but I've read through
what's been committed and I also think PG11 deserves this too.+1
Done, thanks :-)
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, 11 May 2018 at 17:37, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
5. The last sentence in caveats, that is,
"Partitioning using these techniques will work well with up to perhaps a
hundred partitions; don't try to use many thousands of partitions."should perhaps be reworded as:
"So the legacy inheritance based partitioning will work well with up to
perhaps a hundred partitions; don't try to use many thousands of partitions."
(digging up 10-month-old thread [1]/messages/by-id/6bc4e96a-0e30-e9b6-dcc7-791c7486a491@lab.ntt.co.jp)
There was a report [2]/messages/by-id/739b7a5e-1192-1011-5aa2-41adad55682d@perfexpert.ch on -general today where someone had a 4000
partition partitioned table and were complaining about memory
consumption in the planner during DELETE. They didn't mention the
exact version they were using, but mentioned that the problem exists
on 10, 11 and master. Of course, we're well aware of this issue with
DELETE and UPDATE of large partition hierarchies, Amit has been
working hard with trying to solve it for PG12.
In the -general post, I was just about to point them at the part in
the documents that warn against these large partition hierarchies, but
it looks like the warning was removed in bebc46931a1, or at least
modified to say that constraint exclusion with heritance tables is
slow. I really wonder if we shouldn't put something back in there to
warn against this sort of thing. It might be a bit late for the
people who've read the docs and done it already, but a warning might
at least stop new people making the mistake.
Hopefully one day we can remove the warning again, but it won't be for PG12.
Thoughts?
[1]: /messages/by-id/6bc4e96a-0e30-e9b6-dcc7-791c7486a491@lab.ntt.co.jp
[2]: /messages/by-id/739b7a5e-1192-1011-5aa2-41adad55682d@perfexpert.ch
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Mar 10, 2019 at 10:53:02PM +1300, David Rowley wrote:
On Fri, 11 May 2018 at 17:37, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
5. The last sentence in caveats, that is,
"Partitioning using these techniques will work well with up to perhaps a
hundred partitions; don't try to use many thousands of partitions."should perhaps be reworded as:
"So the legacy inheritance based partitioning will work well with up to
perhaps a hundred partitions; don't try to use many thousands of partitions."
In the -general post, I was just about to point them at the part in
the documents that warn against these large partition hierarchies, but
it looks like the warning was removed in bebc46931a1, or at least
modified to say that constraint exclusion with heritance tables is
slow. I really wonder if we shouldn't put something back in there to
warn against this sort of thing.
+1
I believe I was of the same mind when I wrote:
/messages/by-id/20180525215002.GD14378@telsasoft.com
Justin
PS. Sorry to dredge up another 10 month old thread..
On 2019/03/11 0:25, Justin Pryzby wrote:
On Sun, Mar 10, 2019 at 10:53:02PM +1300, David Rowley wrote:
On Fri, 11 May 2018 at 17:37, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
5. The last sentence in caveats, that is,
"Partitioning using these techniques will work well with up to perhaps a
hundred partitions; don't try to use many thousands of partitions."should perhaps be reworded as:
"So the legacy inheritance based partitioning will work well with up to
perhaps a hundred partitions; don't try to use many thousands of partitions."In the -general post, I was just about to point them at the part in
the documents that warn against these large partition hierarchies, but
it looks like the warning was removed in bebc46931a1, or at least
modified to say that constraint exclusion with heritance tables is
slow. I really wonder if we shouldn't put something back in there to
warn against this sort of thing.+1
I believe I was of the same mind when I wrote:
/messages/by-id/20180525215002.GD14378@telsasoft.com
I agree PG 11 didn't improve things enough to have removed such a warning
from the documentation even for partitioning. Actually, we only ever had
a warning about constraint exclusion getting slower as more children are
added, but nothing about UPDATE/DELETE planning being slow in itself;
perhaps more importantly, much slower than SELECT. It seems very hard to
put that in the documentation though.
In PG 10:
Excluding unnecessary partitions is slow, especially as the number of
partitions increases, because constraint exclusion needs to look at each
partition to determine whether it could be excluded. Also, planning for
UPDATE and DELETE queries is significantly slower than for SELECT queries
for $REASONS. Given that, it is wise to use up to a few hundred
partitions but not more.
PG 11 moved the needle a bit for SELECT queries:
Excluding unnecessary partitions is slow for UPDATE and DELETE queries,
especially as the number of partitions increases, because constraint
exclusion needs to look at each partition to determine whether it could be
excluded. Also, planning for UPDATE and DELETE queries is significantly
slower than for SELECT queries for $REASONS. Given that, it is wise to
use up to a few hundred partitions but not more.
Thanks,
Amit
On Mon, 11 Mar 2019 at 14:33, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
PG 11 moved the needle a bit for SELECT queries:
Excluding unnecessary partitions is slow for UPDATE and DELETE queries,
With those words I expect the user might be surprised that it's still
slow after doing SET enable_partition_pruning = off;
I'm not really talking about constraint exclusion or partition
pruning. The memory growth problem the user was experiencing was down
to the fact that we plan once per partition and each of the
PlannerInfos used for each planner run has a RangeTblEntry for all
partitions. This means if you add one more partition and you get N
partitions more RangeTblEntry items in memory. This is the quadratic
memory growth that I mentioned in the -general post.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2019/03/11 11:00, David Rowley wrote:
On Mon, 11 Mar 2019 at 14:33, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:PG 11 moved the needle a bit for SELECT queries:
Excluding unnecessary partitions is slow for UPDATE and DELETE queries,
With those words I expect the user might be surprised that it's still
slow after doing SET enable_partition_pruning = off;I'm not really talking about constraint exclusion or partition
pruning. The memory growth problem the user was experiencing was down
to the fact that we plan once per partition and each of the
PlannerInfos used for each planner run has a RangeTblEntry for all
partitions. This means if you add one more partition and you get N
partitions more RangeTblEntry items in memory. This is the quadratic
memory growth that I mentioned in the -general post.
Yeah, I get it. As I said in my email, all we have ever mentioned in the
documentation as the reason for queries on partitioned tables being slow
is that partition exclusion is slow and nothing else. Can we put
quadratic memory growth during planning as the reason for performance
degradation into the documentation? Maybe we could, but every time I
tried it, it didn't read like user-facing documentation to me. Do you
have something in mind that we could add?
Thanks,
Amit
On Mon, 11 Mar 2019 at 15:00, David Rowley <david.rowley@2ndquadrant.com> wrote:
On Mon, 11 Mar 2019 at 14:33, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:PG 11 moved the needle a bit for SELECT queries:
Excluding unnecessary partitions is slow for UPDATE and DELETE queries,
With those words I expect the user might be surprised that it's still
slow after doing SET enable_partition_pruning = off;
I had in mind in 10, 11 and master add a note to mention:
Currently, it is not recommended to have partition hierarchies more
than a few hundred partitions. Larger partition hierarchies can
suffer from slow planning times with <command>SELECT</command>
queries. Planning times for <command>UPDATE</command> and
<command>DELETE</command> commands may also suffer slow planning
times, but in addition, memory consumption may also become an issue
due to how the planner currently plans the query once per partition.
These limitations are likely to be resolved in a future version of
<productname>PostgreSQL</productname>.
I've not really thought too much on the fact that the issue also
exists with inheritance tables in earlier version too.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2019/03/11 11:13, David Rowley wrote:
On Mon, 11 Mar 2019 at 15:00, David Rowley <david.rowley@2ndquadrant.com> wrote:
On Mon, 11 Mar 2019 at 14:33, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:PG 11 moved the needle a bit for SELECT queries:
Excluding unnecessary partitions is slow for UPDATE and DELETE queries,
With those words I expect the user might be surprised that it's still
slow after doing SET enable_partition_pruning = off;I had in mind in 10, 11 and master add a note to mention:
Thanks for putting this together.
Currently, it is not recommended to have partition hierarchies more
than a few hundred partitions. Larger partition hierarchies can
suffer from slow planning times with <command>SELECT</command>
queries. Planning times for <command>UPDATE</command> and
<command>DELETE</command> commands may also suffer slow planning
times, but in addition, memory consumption may also become an issue
due to how the planner currently plans the query once per partition.
These limitations are likely to be resolved in a future version of
<productname>PostgreSQL</productname>.
How about slightly rewriting the sentence toward the end as:
memory consumption may also become an issue, because planner currently
plans the query once for every partition.
I've not really thought too much on the fact that the issue also
exists with inheritance tables in earlier version too.
That's fine maybe.
Thanks,
Amit
On Mon, Mar 11, 2019 at 01:06:08PM +0900, Amit Langote wrote:
On 2019/03/11 11:13, David Rowley wrote:
On Mon, 11 Mar 2019 at 15:00, David Rowley <david.rowley@2ndquadrant.com> wrote:
On Mon, 11 Mar 2019 at 14:33, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
PG 11 moved the needle a bit for SELECT queries:
Excluding unnecessary partitions is slow for UPDATE and DELETE queries,
With those words I expect the user might be surprised that it's still
slow after doing SET enable_partition_pruning = off;I had in mind in 10, 11 and master add a note to mention:
Thanks for putting this together.
Currently, it is not recommended to have partition hierarchies more
than a few hundred partitions. Larger partition hierarchies can
suffer from slow planning times with <command>SELECT</command>
queries. Planning times for <command>UPDATE</command> and
<command>DELETE</command> commands may also suffer slow planning
times, but in addition, memory consumption may also become an issue
due to how the planner currently plans the query once per partition.
These limitations are likely to be resolved in a future version of
<productname>PostgreSQL</productname>.
Can I offer the following variation:
| Currently, it is not recommended to have partition hierarchies with more than
| a few hundred partitions. Larger partition hierarchies may incur long
| planning time.
| In addition, <command>UPDATE</command> and <command>DELETE</command>
| commands on larger hierarchies may cause excessive memory consumption.
| These deficiencies are likely to be fixed in a future release of
| <productname>PostgreSQL</productname>.
On 2019/03/11 13:22, Justin Pryzby wrote:
On Mon, Mar 11, 2019 at 01:06:08PM +0900, Amit Langote wrote:
On 2019/03/11 11:13, David Rowley wrote:
On Mon, 11 Mar 2019 at 15:00, David Rowley <david.rowley@2ndquadrant.com> wrote:
On Mon, 11 Mar 2019 at 14:33, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
PG 11 moved the needle a bit for SELECT queries:
Excluding unnecessary partitions is slow for UPDATE and DELETE queries,
With those words I expect the user might be surprised that it's still
slow after doing SET enable_partition_pruning = off;I had in mind in 10, 11 and master add a note to mention:
Thanks for putting this together.
Currently, it is not recommended to have partition hierarchies more
than a few hundred partitions. Larger partition hierarchies can
suffer from slow planning times with <command>SELECT</command>
queries. Planning times for <command>UPDATE</command> and
<command>DELETE</command> commands may also suffer slow planning
times, but in addition, memory consumption may also become an issue
due to how the planner currently plans the query once per partition.
These limitations are likely to be resolved in a future version of
<productname>PostgreSQL</productname>.Can I offer the following variation:
| Currently, it is not recommended to have partition hierarchies with more than
| a few hundred partitions. Larger partition hierarchies may incur long
| planning time.
| In addition, <command>UPDATE</command> and <command>DELETE</command>
| commands on larger hierarchies may cause excessive memory consumption.
| These deficiencies are likely to be fixed in a future release of
| <productname>PostgreSQL</productname>.
Says essentially the same thing but with fewer words, so +1.
Now the question is where to put this text? Currently, we have:
5.10. Table Partitioning
5.10.1. Overview
5.10.2. Declarative Partitioning
5.10.3. Implementation Using Inheritance
5.10.4. Partition Pruning
5.10.5. Partitioning and Constraint Exclusion
Should we add 5.10.6 Notes for the above "note", or should it be stuffed
under one of the existing sub-headings?
Thanks,
Amit
On Mon, Mar 11, 2019 at 12:30 AM Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
Now the question is where to put this text? Currently, we have:
5.10. Table Partitioning
5.10.1. Overview
5.10.2. Declarative Partitioning
5.10.3. Implementation Using Inheritance
5.10.4. Partition Pruning
5.10.5. Partitioning and Constraint ExclusionShould we add 5.10.6 Notes for the above "note", or should it be stuffed
under one of the existing sub-headings?
I think it should be added to one of the existing sub-headings. I
suggest adding it to the end of 5.10.1 and rephrasing it so that it
makes clearer the distinction between what will happen with
inheritance and what will happen with table partitioning, e.g.
When using either declarative partitioning or table inheritance,
partitioning hierarchies with more than a few hundred partitions are
not currently recommended. Larger partition hierarchies may incur long
planning time, and especially in the case of UPDATE and DELETE,
excessive memory usage. When inheritance is used, see also the
limitations described in Section 5.10.5, Partitioning and Constraint
Exclusion.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, 13 Mar 2019 at 04:07, Robert Haas <robertmhaas@gmail.com> wrote:
I think it should be added to one of the existing sub-headings. I
suggest adding it to the end of 5.10.1 and rephrasing it so that it
makes clearer the distinction between what will happen with
inheritance and what will happen with table partitioning, e.g.When using either declarative partitioning or table inheritance,
partitioning hierarchies with more than a few hundred partitions are
not currently recommended. Larger partition hierarchies may incur long
planning time, and especially in the case of UPDATE and DELETE,
excessive memory usage. When inheritance is used, see also the
limitations described in Section 5.10.5, Partitioning and Constraint
Exclusion.
I think I've done that in the attached patch. However, do think the
just saying "excessive memory usage" seems strange without prefixing
it with "can result in" and dropping the "especially". I'm fairly
used to having my wording debated, so I've left your words in the
patch.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
docs_partitioning_warning.patchapplication/octet-stream; name=docs_partitioning_warning.patchDownload
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 110f6b4657..15e62b2d8a 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3465,6 +3465,18 @@ VALUES ('Albany', NULL, NULL, 'NY');
offer flexibility but do not have some of the performance benefits
of built-in declarative partitioning.
</para>
+
+ <note>
+ <para>
+ When using either declarative partitioning or table inheritance,
+ partitioning hierarchies with more than a few hundred partitions are
+ not currently recommended. Larger partition hierarchies may incur long
+ planning time, and especially in the case of <command>UPDATE</command>
+ and <command>DELETE</command>, excessive memory usage. When inheritance
+ is used, see also the limitations described in
+ <xref linkend="ddl-partitioning-constraint-exclusion"/>.
+ </para>
+ </note>
</sect2>
<sect2 id="ddl-partitioning-declarative">
On 2019/03/13 8:28, David Rowley wrote:
On Wed, 13 Mar 2019 at 04:07, Robert Haas <robertmhaas@gmail.com> wrote:
I think it should be added to one of the existing sub-headings. I
suggest adding it to the end of 5.10.1 and rephrasing it so that it
makes clearer the distinction between what will happen with
inheritance and what will happen with table partitioning, e.g.
+1.
When using either declarative partitioning or table inheritance,
partitioning hierarchies with more than a few hundred partitions are
not currently recommended. Larger partition hierarchies may incur long
planning time, and especially in the case of UPDATE and DELETE,
excessive memory usage. When inheritance is used, see also the
limitations described in Section 5.10.5, Partitioning and Constraint
Exclusion.I think I've done that in the attached patch. However, do think the
just saying "excessive memory usage" seems strange without prefixing
it with "can result in" and dropping the "especially".
FWIW, I've gotten used to reading the kind of English that Robert wrote
(meaning it makes perfect sense to me), but wording tweaks you suggest
will work to.
Thanks,
Amit
On Tue, Mar 12, 2019 at 7:28 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
I think I've done that in the attached patch.
Cool, thanks.
However, do think the
just saying "excessive memory usage" seems strange without prefixing
it with "can result in" and dropping the "especially". I'm fairly
used to having my wording debated, so I've left your words in the
patch.
I'm not direly opposed to that. I included "especially" so as not to
rule out the possibility that there might be cases other than UPDATE
and DELETE that, in some circumstances, also use a lot of memory. I
didn't prefix it with "can result in" because I don't think English
grammar requires it to be there. It would be grammatically correct to
say "Larger partitioning hierarchies may incur long planning time and
excessive memory usage," and I don't think that injecting an
appositive phrase before "excessive memory usage" changes that
calculus. However, somebody might find your way easier to follow.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, 14 Mar 2019 at 02:10, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Mar 12, 2019 at 7:28 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:I think I've done that in the attached patch.
Cool, thanks.
Just so I don't forget about this, I've added it to the July 'fest.
https://commitfest.postgresql.org/23/2065/
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, 21 Mar 2019 at 00:51, David Rowley <david.rowley@2ndquadrant.com> wrote:
Just so I don't forget about this, I've added it to the July 'fest.
Now that we have 428b260f8, I think the version of this that goes into
master should be more like the attached.
I think the original patch is fine for the back branches.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
docs_partitioning_warning_master.patchapplication/octet-stream; name=docs_partitioning_warning_master.patchDownload
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 1fe27c5da9..f47170cb35 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3583,6 +3583,26 @@ VALUES ('Albany', NULL, NULL, 'NY');
offer flexibility but do not have some of the performance benefits
of built-in declarative partitioning.
</para>
+
+ <note>
+ <para>
+ When using table inheritance, partitioning hierarchies with more than a
+ few hundred partitions are not currently recommended. Larger partition
+ hierarchies may incur long planning time, and especially in the case of
+ <command>UPDATE</command> and <command>DELETE</command>, excessive
+ memory usage. When inheritance is used, see also the limitations
+ described in <xref linkend="ddl-partitioning-constraint-exclusion"/>.
+ </para>
+
+ <para>
+ When using declarative partitioning, the effort required by the planner
+ is closely tied to the number of unpruned partitions. Planning is
+ generally fast with small numbers of unpruned partitions even in
+ partitioning hierarchies containing many thousands of partitions.
+ However, planning queries where partition pruning is not possible during
+ planning are likely to be slow with such large partitioning hierarchies.
+ </para>
+ </note>
</sect2>
<sect2 id="ddl-partitioning-declarative">
On 2019/04/11 12:34, David Rowley wrote:
On Thu, 21 Mar 2019 at 00:51, David Rowley <david.rowley@2ndquadrant.com> wrote:
Just so I don't forget about this, I've added it to the July 'fest.
Now that we have 428b260f8, I think the version of this that goes into
master should be more like the attached.
Thanks, looks good.
I've posted a patch last week on the "speed up partition planning" thread
[1]: /messages/by-id/4f049572-9440-3c99-afa1-f7ca7f38fe80@lab.ntt.co.jp
constraint exclusion under the covers. Do you think there's any merit to
combining that with this one?
Thanks,
Amit
[1]: /messages/by-id/4f049572-9440-3c99-afa1-f7ca7f38fe80@lab.ntt.co.jp
/messages/by-id/4f049572-9440-3c99-afa1-f7ca7f38fe80@lab.ntt.co.jp
On Thu, 11 Apr 2019 at 16:06, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2019/04/11 12:34, David Rowley wrote:
Now that we have 428b260f8, I think the version of this that goes into
master should be more like the attached.Thanks, looks good.
Thanks for looking.
I've posted a patch last week on the "speed up partition planning" thread
[1] which modifies ddl.sgml to remove the text about UPDATE/DELETE using
constraint exclusion under the covers. Do you think there's any merit to
combining that with this one?
Probably separate is better. I don't think anything you're proposing
there is for back-patching, but I think the original patch over here
should be.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2019/04/11 13:50, David Rowley wrote:
On Thu, 11 Apr 2019 at 16:06, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:I've posted a patch last week on the "speed up partition planning" thread
[1] which modifies ddl.sgml to remove the text about UPDATE/DELETE using
constraint exclusion under the covers. Do you think there's any merit to
combining that with this one?Probably separate is better. I don't think anything you're proposing
there is for back-patching, but I think the original patch over here
should be.
OK, no problem. I just thought to point out my patch because you've
posted a version of the patch here for HEAD *because of* 428b260f8, the
commit which also obsoleted the text that the other patch fixes.
Anyway, let's leave the other patch on its own thread where there are a
few other things to be sorted out.
Thanks,
Amit
On Thu, Apr 11, 2019 at 03:34:30PM +1200, David Rowley wrote:
On Thu, 21 Mar 2019 at 00:51, David Rowley <david.rowley@2ndquadrant.com> wrote:
Just so I don't forget about this, I've added it to the July 'fest.
Now that we have 428b260f8, I think the version of this that goes into
master should be more like the attached.
I tweaked this patch some more (sorry):
- remove "currently" since that's not expected to be changed (right?);
- remove "especially";
- refer to "partition hierarchies" not "partitioning hierarchies";
- rewrite bit about "When partition pruning is not possible"
Also, I noticed awhile ago while grepping for "probably be fixed in future
releases" that some items under ddl-inherit-caveats are actually possible for
relkind=p partitions in v11. I assume those will never be implemented for
inheritence partitioning, so I propose another update to docs (if preferred,
I'll bring up on a new thread).
- unique constraints on parent table;
- FK constraints on parent table;
Note that FK constraints *referencing* a partitiond table are possible in v12
but not in v11. So if there's any finer-grained update to documentation of the
individual limitations, it'd need to be tweaked for back branches (v10 and 11).
Justin
Attachments:
v1-0001-Reinstate-warnings-regarding-large-heirarchies.patchtext/x-diff; charset=us-asciiDownload
From 3a787b95f5a35b53cd958855ec6fc4ff9fc9a455 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Thu, 11 Apr 2019 00:24:44 -0500
Subject: [PATCH v1 1/2] Reinstate warnings regarding large heirarchies
Put back warnings regarding high planning time and/or RAM use for large
inheritance heirarchies, and high planning time for large number of partitions
not pruned during planning with declaratively partitioned tables.
Discussion:
https://www.postgresql.org/message-id/CAKJS1f8RW-mHQ8aEWD5Dv0%2B8A1wH5tHHdYMGW9y5sXqnE0X9wA%40mail.gmail.com
https://commitfest.postgresql.org/23/2065/
Author: Robert Haas, David Rowley
Reviewed by: Amit Langote, Justin Pryzby
---
doc/src/sgml/ddl.sgml | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 244d5ce..83cbc66 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3582,6 +3582,26 @@ VALUES ('Albany', NULL, NULL, 'NY');
offer flexibility but do not have some of the performance benefits
of built-in declarative partitioning.
</para>
+
+ <note>
+ <para>
+ When using table inheritance, partition hierarchies with more than a few
+ hundred partitions are not recommended. Larger partition hierarchies may
+ incur long planning time, and, in the case of <command>UPDATE</command>
+ and <command>DELETE</command>, excessive memory usage. When inheritance
+ is used, see also the limitations described in
+ <xref linkend="ddl-partitioning-constraint-exclusion"/>.
+ </para>
+
+ <para>
+ When using declarative partitioning, the overhead of query planning
+ is directly related to the number of unpruned partitions. Planning is
+ generally fast with small numbers of unpruned partitions, even in
+ partition hierarchies containing many thousands of partitions. However,
+ long planning time will be incurred by large partition hierarchies if
+ partition pruning is not possible during the planning phase.
+ </para>
+ </note>
</sect2>
<sect2 id="ddl-partitioning-declarative">
--
2.1.4
v1-0002-Document-features-of-declarative-partitioning.patchtext/x-diff; charset=us-asciiDownload
From 6bd80e7cdddc3c9552d44439b4b8e9843c1007e4 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Thu, 11 Apr 2019 00:22:56 -0500
Subject: [PATCH v1 2/2] Document features of declarative partitioning..
..which will never be implemented for legacy inheritance.
---
doc/src/sgml/ddl.sgml | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 83cbc66..3495a66 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3450,8 +3450,9 @@ VALUES ('Albany', NULL, NULL, 'NY');
</listitem>
</itemizedlist>
- These deficiencies will probably be fixed in some future release,
- but in the meantime considerable care is needed in deciding whether
+ Some functionality not implemented for inheritance hierarchies is
+ implemented for declarative partitioning.
+ Considerable care is needed in deciding whether partitioning with legacy
inheritance is useful for your application.
</para>
--
2.1.4
On Thu, 11 Apr 2019 at 17:40, Justin Pryzby <pryzby@telsasoft.com> wrote:
I tweaked this patch some more (sorry):
- remove "currently" since that's not expected to be changed (right?);
Seems like a good idea. I think the way we exclude inheritance child
relations will never scale well. Other improvements that we'll see
will most likely be as a consequence of speeding up declarative
partitioning. For example the planner improvements in PG12 just did
that for UPDATE/DELETE.
- remove "especially";
I think that likely needs to be kept for the PG11 version. I was
hoping it was stop a casual tester testing a SELECT and seeing that
it's not so bad only to find later that UPDATE/DELETE OOMs.
- refer to "partition hierarchies" not "partitioning hierarchies";
fine
- rewrite bit about "When partition pruning is not possible"
fine.
Also, I noticed awhile ago while grepping for "probably be fixed in future
releases" that some items under ddl-inherit-caveats are actually possible for
relkind=p partitions in v11. I assume those will never be implemented for
inheritence partitioning, so I propose another update to docs (if preferred,
I'll bring up on a new thread).
Not sure about that. It may be very simple to implement if we one day
get global indexes. It may just be a matter of pointing all the tables
at the same index and letting the wonders of global indexes handle all
the hard stuff. I'm not that excited about removing that. I'd be
equally excited about adding the text if it wasn't already there and
you were proposing to add it.
- unique constraints on parent table;
- FK constraints on parent table;Note that FK constraints *referencing* a partitiond table are possible in v12
but not in v11. So if there's any finer-grained update to documentation of the
individual limitations, it'd need to be tweaked for back branches (v10 and 11).
Don't we just need to remove or update:
<listitem>
<para>
While primary keys are supported on partitioned tables, foreign
keys referencing partitioned tables are not supported. (Foreign key
references from a partitioned table to some other table are supported.)
</para>
</listitem>
I didn't follow this work, but on testing, I see the foreign key does
not CASCADE when doing DETACH PARTITION, it errors instead. Perhaps
that's worth a mention here.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Apr 12, 2019 at 02:01:39PM +1200, David Rowley wrote:
On Thu, 11 Apr 2019 at 17:40, Justin Pryzby <pryzby@telsasoft.com> wrote:
I tweaked this patch some more (sorry):
- remove "especially";I think that likely needs to be kept for the PG11 version. I was
hoping it was stop a casual tester testing a SELECT and seeing that
it's not so bad only to find later that UPDATE/DELETE OOMs.
With "especially", it reads as if "excessive memory usage" might happen for
SELECT, and it'll be additionally worse for UPDATE/DELETE.
Without "especially", it makes "excessive RAM use" apply only to UPDATE/DELETE,
which I think is what's intended.
|Larger partition hierarchies may incur long planning time, and [especially] in
|the case of <command>UPDATE</command> and <command>DELETE</command>, excessive
|memory usage.
I think as long as UPDATE/DELETE are specifically mentioned, that would handle
your concern. If I were to suggest an alternative:
|Larger partition hierarchies may incur long planning time; and, in
|the case of <command>UPDATE</command> and <command>DELETE</command>, may also
|incur excessive memory usage.
..after which I'll stop wrestling with words.
Justin