Confusing comment for function ExecParallelEstimate

Started by Wu, Feiover 6 years ago8 messages
#1Wu, Fei
wufei.fnst@cn.fujitsu.com

Hi, all
Lately I was researching Parallelism of Postgres 10.7(and it is same in all version), and I was confused when reading the comment of function ExecParallelEstimate :
(in src/backend/executor/execParallel.c)
----------------------------------------------

* While we're at it, count the number of PlanState nodes in the tree, so
* we know how many SharedPlanStateInstrumentation structures we need.
static bool
ExecParallelEstimate(PlanState *planstate, ExecParallelEstimateContext *e)
----------------------------------------------

The structure SharedPlanStateInstrumentation is not exists at all. And I noticed that the so called “SharedPlanStateInstrumentation”
maybe is the structure instrumentation now, which is used for storing information of planstate in parallelism. The function count the number
of planState nodes and stored it in ExecParallelEstimateContext-> nnodes ,then use it to Estimate space for instrumentation structure in
function ExecInitParallelPlan.

So, I think the comment is out of date now, isn’t it?

Maybe we can modified this piece of comment from “SharedPlanStateInstrumentation” to “instrumentation” for clear

--
Best Regards
-----------------------------------------------------
Wu Fei
Development Department II
Software Division III
Nanjing Fujitsu Nanda Software Tech. Co., Ltd.(FNST)
ADDR.: No.6 Wenzhu Road, Software Avenue,
Nanjing, 210012, China
TEL : +86+25-86630566-9356
COINS: 7998-9356
FAX: +86+25-83317685
MAIL:wufei.fnst@cn.fujitsu.com
http://www.fujitsu.com/cn/fnst/
---------------------------------------------------

#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Wu, Fei (#1)
Re: Confusing comment for function ExecParallelEstimate

On Wed, Jun 5, 2019 at 9:24 AM Wu, Fei <wufei.fnst@cn.fujitsu.com> wrote:

Hi, all

Lately I was researching Parallelism of Postgres 10.7(and it is same in all version), and I was confused when reading the comment of function ExecParallelEstimate :

(in src/backend/executor/execParallel.c)

----------------------------------------------

* While we're at it, count the number of PlanState nodes in the tree, so

* we know how many SharedPlanStateInstrumentation structures we need.

static bool

ExecParallelEstimate(PlanState *planstate, ExecParallelEstimateContext *e)

----------------------------------------------

The structure SharedPlanStateInstrumentation is not exists at all. And I noticed that the so called “SharedPlanStateInstrumentation”

maybe is the structure instrumentation now, which is used for storing information of planstate in parallelism. The function count the number

of planState nodes and stored it in ExecParallelEstimateContext-> nnodes ,then use it to Estimate space for instrumentation structure in

function ExecInitParallelPlan.

I think here it refers to SharedExecutorInstrumentation. This
structure is used for accumulating per-PlanState instrumentation. So,
it is not totally wrong, but I guess we can change it to
SharedExecutorInstrumentation to avoid confusion? What do you think?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#3Wu, Fei
wufei.fnst@cn.fujitsu.com
In reply to: Amit Kapila (#2)
RE: Confusing comment for function ExecParallelEstimate

Thanks for your reply.
From the code below:
(https://github.com/postgres/postgres/blob/REL_10_7/src/backend/executor/execParallel.c)
#######################################################################################
443 /*
444 * Give parallel-aware nodes a chance to add to the estimates, and get a
445 * count of how many PlanState nodes there are.
446 */
447 e.pcxt = pcxt;
448 e.nnodes = 0;
449 ExecParallelEstimate(planstate, &e);
450
451 /* Estimate space for instrumentation, if required. */
452 if (estate->es_instrument)
453 {
454 instrumentation_len =
455 offsetof(SharedExecutorInstrumentation, plan_node_id) +
456 sizeof(int) * e.nnodes;
457 instrumentation_len = MAXALIGN(instrumentation_len);
458 instrument_offset = instrumentation_len;
459 instrumentation_len +=
460 mul_size(sizeof(Instrumentation),
461 mul_size(e.nnodes, nworkers));
462 shm_toc_estimate_chunk(&pcxt->estimator, instrumentation_len);
463 shm_toc_estimate_keys(&pcxt->estimator, 1);

#######################################################################################
It seems that e.nnodes which returns from ExecParallelEstimate(planstate, &e) , determines how much instrumentation structures in DSM(line459~line461).
And e.nnodes also determines the length of SharedExecutorInstrumentation-> plan_node_id(line454~line456).

So, I think here it refers to instrumentation.

SharedExecutorInstrumentation is just likes a master that hold the metadata:
struct SharedExecutorInstrumentation
{
int instrument_options;
int instrument_offset;
int num_workers;
int num_plan_nodes; // this equals to e.nnodes from the source code
int plan_node_id[FLEXIBLE_ARRAY_MEMBER];
/* array of num_plan_nodes * num_workers Instrumentation objects follows */
};

What do you think?

With Regards,
Wu Fei

-----Original Message-----
From: Amit Kapila [mailto:amit.kapila16@gmail.com]
Sent: Wednesday, June 05, 2019 12:20 PM
To: Wu, Fei/吴 非 <wufei.fnst@cn.fujitsu.com>
Cc: pgsql-hackers@postgresql.org
Subject: Re: Confusing comment for function ExecParallelEstimate

On Wed, Jun 5, 2019 at 9:24 AM Wu, Fei <wufei.fnst@cn.fujitsu.com> wrote:

Hi, all

Lately I was researching Parallelism of Postgres 10.7(and it is same in all version), and I was confused when reading the comment of function ExecParallelEstimate :

(in src/backend/executor/execParallel.c)

----------------------------------------------

* While we're at it, count the number of PlanState nodes in the tree,
so

* we know how many SharedPlanStateInstrumentation structures we need.

static bool

ExecParallelEstimate(PlanState *planstate, ExecParallelEstimateContext
*e)

----------------------------------------------

The structure SharedPlanStateInstrumentation is not exists at all. And I noticed that the so called “SharedPlanStateInstrumentation”

maybe is the structure instrumentation now, which is used for storing
information of planstate in parallelism. The function count the
number

of planState nodes and stored it in ExecParallelEstimateContext->
nnodes ,then use it to Estimate space for instrumentation structure in

function ExecInitParallelPlan.

I think here it refers to SharedExecutorInstrumentation. This structure is used for accumulating per-PlanState instrumentation. So, it is not totally wrong, but I guess we can change it to SharedExecutorInstrumentation to avoid confusion? What do you think?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#4Amit Kapila
amit.kapila16@gmail.com
In reply to: Wu, Fei (#3)
Re: Confusing comment for function ExecParallelEstimate

On Wed, Jun 5, 2019 at 11:27 AM Wu, Fei <wufei.fnst@cn.fujitsu.com> wrote:

Thanks for your reply.
From the code below:
(https://github.com/postgres/postgres/blob/REL_10_7/src/backend/executor/execParallel.c)
#######################################################################################
443 /*
444 * Give parallel-aware nodes a chance to add to the estimates, and get a
445 * count of how many PlanState nodes there are.
446 */
447 e.pcxt = pcxt;
448 e.nnodes = 0;
449 ExecParallelEstimate(planstate, &e);
450
451 /* Estimate space for instrumentation, if required. */
452 if (estate->es_instrument)
453 {
454 instrumentation_len =
455 offsetof(SharedExecutorInstrumentation, plan_node_id) +
456 sizeof(int) * e.nnodes;
457 instrumentation_len = MAXALIGN(instrumentation_len);
458 instrument_offset = instrumentation_len;
459 instrumentation_len +=
460 mul_size(sizeof(Instrumentation),
461 mul_size(e.nnodes, nworkers));
462 shm_toc_estimate_chunk(&pcxt->estimator, instrumentation_len);
463 shm_toc_estimate_keys(&pcxt->estimator, 1);

#######################################################################################
It seems that e.nnodes which returns from ExecParallelEstimate(planstate, &e) , determines how much instrumentation structures in DSM(line459~line461).
And e.nnodes also determines the length of SharedExecutorInstrumentation-> plan_node_id(line454~line456).

So, I think here it refers to instrumentation.

Right. I think the way it is mentioned
(SharedPlanStateInstrumentation structures ..) in the comment can
confuse readers. We can replace SharedPlanStateInstrumentation with
Instrumentation in the comment.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#5Wu, Fei
wufei.fnst@cn.fujitsu.com
In reply to: Amit Kapila (#4)
1 attachment(s)
RE: Confusing comment for function ExecParallelEstimate

Sorry, Last mail forget to CC the mailing list.

Now the comment is confusing, Maybe someone should correct it.

Here is a simple patch, What do you think ?

With Regards,
Wu Fei

-----Original Message-----
From: Amit Kapila [mailto:amit.kapila16@gmail.com]
Sent: Wednesday, June 05, 2019 7:18 PM
To: Wu, Fei/吴 非 <wufei.fnst@cn.fujitsu.com>
Cc: pgsql-hackers@postgresql.org
Subject: Re: Confusing comment for function ExecParallelEstimate

On Wed, Jun 5, 2019 at 11:27 AM Wu, Fei <wufei.fnst@cn.fujitsu.com> wrote:

Thanks for your reply.
From the code below:
(https://github.com/postgres/postgres/blob/REL_10_7/src/backend/execut
or/execParallel.c)
#######################################################################################
443 /*
444 * Give parallel-aware nodes a chance to add to the estimates, and get a
445 * count of how many PlanState nodes there are.
446 */
447 e.pcxt = pcxt;
448 e.nnodes = 0;
449 ExecParallelEstimate(planstate, &e);
450
451 /* Estimate space for instrumentation, if required. */
452 if (estate->es_instrument)
453 {
454 instrumentation_len =
455 offsetof(SharedExecutorInstrumentation, plan_node_id) +
456 sizeof(int) * e.nnodes;
457 instrumentation_len = MAXALIGN(instrumentation_len);
458 instrument_offset = instrumentation_len;
459 instrumentation_len +=
460 mul_size(sizeof(Instrumentation),
461 mul_size(e.nnodes, nworkers));
462 shm_toc_estimate_chunk(&pcxt->estimator, instrumentation_len);
463 shm_toc_estimate_keys(&pcxt->estimator, 1);

######################################################################
################# It seems that e.nnodes which returns from
ExecParallelEstimate(planstate, &e) , determines how much instrumentation structures in DSM(line459~line461).
And e.nnodes also determines the length of SharedExecutorInstrumentation-> plan_node_id(line454~line456).

So, I think here it refers to instrumentation.

Right. I think the way it is mentioned
(SharedPlanStateInstrumentation structures ..) in the comment can confuse readers. We can replace SharedPlanStateInstrumentation with Instrumentation in the comment.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_confusing_comment_for_ExecParallelEstimate.patchapplication/octet-stream; name=fix_confusing_comment_for_ExecParallelEstimate.patchDownload
diff old/execParallel.c new/execParallel.c
204c204
<  * we know how many SharedPlanStateInstrumentation structures we need.
---
>  * we know how many instrumentation structures we need.
#6Amit Kapila
amit.kapila16@gmail.com
In reply to: Wu, Fei (#5)
Re: Confusing comment for function ExecParallelEstimate

On Thu, Jun 6, 2019 at 7:37 AM Wu, Fei <wufei.fnst@cn.fujitsu.com> wrote:

Sorry, Last mail forget to CC the mailing list.

Now the comment is confusing, Maybe someone should correct it.

Sure, for the sake of clarity, when this code was initially introduced
in commit d1b7c1ff, the structure used was
SharedPlanStateInstrumentation, but later when it got changed to
Instrumentation structure in commit b287df70, we forgot to update the
comment. So, we should backpatch this till 9.6 where it got
introduced. I will commit this change by tomorrow or so.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#7Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#6)
Re: Confusing comment for function ExecParallelEstimate

On Thu, Jun 6, 2019 at 8:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jun 6, 2019 at 7:37 AM Wu, Fei <wufei.fnst@cn.fujitsu.com> wrote:

Sorry, Last mail forget to CC the mailing list.

Now the comment is confusing, Maybe someone should correct it.

Sure, for the sake of clarity, when this code was initially introduced
in commit d1b7c1ff, the structure used was
SharedPlanStateInstrumentation, but later when it got changed to
Instrumentation structure in commit b287df70, we forgot to update the
comment. So, we should backpatch this till 9.6 where it got
introduced. I will commit this change by tomorrow or so.

Pushed. Note, I was not able to apply your patch using patch -p1 command.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#8Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Kapila (#7)
Re: Confusing comment for function ExecParallelEstimate

On 2019-Jun-07, Amit Kapila wrote:

Pushed. Note, I was not able to apply your patch using patch -p1 command.

Yeah, it's a "normal" diff (old school), not a unified or context diff.
patch doesn't like normal diff, for good reasons, but you can force it
to apply with "patch --normal" (not really recommended).

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services