Out of date comment in cached_plan_cost

Started by David Rowleyabout 8 years ago6 messages
#1David Rowley
david.rowley@2ndquadrant.com
1 attachment(s)

Hi,

I just noticed a comment which has been made a little outdated by the
partition-wise join code from commit f49842d1. The comment claims that
inheritance children don't add to the effort required in join
planning, while that still may be true, we should probably mention
that partitioned tables may be a more complex case.

The attached is my attempt at putting this right.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

cached_plan_cost_comment_fix.patchapplication/octet-stream; name=cached_plan_cost_comment_fix.patchDownload
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 853c1f6..274b463 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1090,10 +1090,13 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
 			 * in the number of relations --- but only until the join collapse
 			 * limits kick in.  Also, while inheritance child relations surely
 			 * add to planning effort, they don't make the join situation
-			 * worse.  So the actual shape of the planning cost curve versus
-			 * number of relations isn't all that obvious.  It will take
-			 * considerable work to arrive at a less crude estimate, and for
-			 * now it's not clear that's worth doing.
+			 * worse.  However, when choosing the join order for partitioned
+			 * tables the effort required to plan the join order is even less
+			 * clear-cut.  Extra effort is put into join planning when
+			 * partition-wise join is in use.  So the actual shape of the
+			 * planning cost curve versus number of relations isn't all that
+			 * obvious.  It will take considerable work to arrive at a less
+			 * crude estimate, and for now it's not clear that's worth doing.
 			 *
 			 * The other big difficulty here is that we don't have any very
 			 * good model of how planning cost compares to execution costs.
#2Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#1)
Re: Out of date comment in cached_plan_cost

On Thu, Dec 7, 2017 at 3:14 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

I just noticed a comment which has been made a little outdated by the
partition-wise join code from commit f49842d1. The comment claims that
inheritance children don't add to the effort required in join
planning, while that still may be true, we should probably mention
that partitioned tables may be a more complex case.

The attached is my attempt at putting this right.

I don't feel entirely right about the way this seems to treat
inheritance and partitioning as two entirely separate features; that's
true from a user perspective, more or less, but not internally to the
code.

Of course, this also begs the question of whether we ought to be
changing the formula somehow.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3David Rowley
david.rowley@2ndquadrant.com
In reply to: Robert Haas (#2)
1 attachment(s)
Re: Out of date comment in cached_plan_cost

On 9 December 2017 at 06:05, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Dec 7, 2017 at 3:14 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

The attached is my attempt at putting this right.

I don't feel entirely right about the way this seems to treat
inheritance and partitioning as two entirely separate features; that's
true from a user perspective, more or less, but not internally to the
code.

Originally I had it typed out in a way that mentioned something about
"unless using partition-wise join with partitioned tables", but I felt
that the partition planning code is in such a state of flux at the
moment that I feared that comment might get outdated again someday in
the near future.

I've attached another patch which just loosens the claim that join
planning situation is never made worse by inheritance children to now
say it does not "generally" make any worse.

Of course, this also begs the question of whether we ought to be
changing the formula somehow.

Perhaps, but not for this patch. The comment already mentions that the
code is far from perfect.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

cached_plan_cost_comment_fix_v2.patchapplication/octet-stream; name=cached_plan_cost_comment_fix_v2.patchDownload
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 853c1f6..2089e2b 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1089,11 +1089,11 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
 			 * Join planning effort actually scales much worse than linearly
 			 * in the number of relations --- but only until the join collapse
 			 * limits kick in.  Also, while inheritance child relations surely
-			 * add to planning effort, they don't make the join situation
-			 * worse.  So the actual shape of the planning cost curve versus
-			 * number of relations isn't all that obvious.  It will take
-			 * considerable work to arrive at a less crude estimate, and for
-			 * now it's not clear that's worth doing.
+			 * add to planning effort, they generally don't make the join
+			 * situation worse.  So the actual shape of the planning cost
+			 * curve versus number of relations isn't all that obvious.  It
+			 * will take considerable work to arrive at a less crude estimate,
+			 * and for now it's not clear that's worth doing.
 			 *
 			 * The other big difficulty here is that we don't have any very
 			 * good model of how planning cost compares to execution costs.
#4Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: David Rowley (#3)
Re: Out of date comment in cached_plan_cost

On Mon, Dec 11, 2017 at 3:02 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

On 9 December 2017 at 06:05, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Dec 7, 2017 at 3:14 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

The attached is my attempt at putting this right.

I don't feel entirely right about the way this seems to treat
inheritance and partitioning as two entirely separate features; that's
true from a user perspective, more or less, but not internally to the
code.

Originally I had it typed out in a way that mentioned something about
"unless using partition-wise join with partitioned tables", but I felt
that the partition planning code is in such a state of flux at the
moment that I feared that comment might get outdated again someday in
the near future.

I've attached another patch which just loosens the claim that join
planning situation is never made worse by inheritance children to now
say it does not "generally" make any worse.

Of course, this also begs the question of whether we ought to be
changing the formula somehow.

Perhaps, but not for this patch. The comment already mentions that the
code is far from perfect.

I don't see much difference in the old and new wording. The word
"generally" confuses more than clarifying the cases when the planning
cost curves do not change with the number of relations i.e.
partitions. I think if we add following sentence after "situation
worse." "However in case of declarative partitioning, we may plan each
child separately e.g partition-wise join aims at planning join between
matching partitions. In that case, plan cost curve significantly
changes with the number of partition relations.". After writing this I
think I understand what you meant by

Originally I had it typed out in a way that mentioned something about
"unless using partition-wise join with partitioned tables", but I felt
that the partition planning code is in such a state of flux at the
moment that I feared that comment might get outdated again someday in
the near future.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

#5David Rowley
david.rowley@2ndquadrant.com
In reply to: Ashutosh Bapat (#4)
Re: Out of date comment in cached_plan_cost

On 11 December 2017 at 21:39, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

I don't see much difference in the old and new wording. The word
"generally" confuses more than clarifying the cases when the planning
cost curves do not change with the number of relations i.e.
partitions.

I added that to remove the false claim that inheritance children don't
make the join problem more complex. This was only true before we had
partition-wise joins.

I've re-read my original patch and I don't really see the problem with
it. The comment is talking about inheritance child relations, which
you could either interpret to mean INHERITS (sometable), or some table
listed in pg_inherits. The statement that I added forces me into
thinking of the former rather than the latter, so I don't really see
any issue.

I'm going to leave it here. I don't want to spend too much effort
rewording a comment. My vote is for the original patch I sent. I only
changed it because Robert complained that technically an inheritance
child could actually be a partition.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#6Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#5)
Re: Out of date comment in cached_plan_cost

On Mon, Dec 11, 2017 at 5:00 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

On 11 December 2017 at 21:39, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

I don't see much difference in the old and new wording. The word
"generally" confuses more than clarifying the cases when the planning
cost curves do not change with the number of relations i.e.
partitions.

I added that to remove the false claim that inheritance children don't
make the join problem more complex. This was only true before we had
partition-wise joins.

I've re-read my original patch and I don't really see the problem with
it. The comment is talking about inheritance child relations, which
you could either interpret to mean INHERITS (sometable), or some table
listed in pg_inherits. The statement that I added forces me into
thinking of the former rather than the latter, so I don't really see
any issue.

I'm going to leave it here. I don't want to spend too much effort
rewording a comment. My vote is for the original patch I sent. I only
changed it because Robert complained that technically an inheritance
child could actually be a partition.

I basically feel like we're not really solving any problem by changing
this. I mean, partition-wise join makes this statement less true, but
adding the word "generally" doesn't really help anybody understand the
situation better. If we're going to add anything here, I think it
should be something like:

(This might need to be rethought in light of partition-wise join.)

If that's more specific than we want to get, then let's just leave it
alone. Partition-wise join isn't even enabled by default at this
point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company