Explain buffers wrong counter with parallel plans

Started by Adrien Nayrat · about 8 years ago · 54 messages · pgsql-hackers
#1Adrien Nayrat
adrien.nayrat@anayrat.info

Hello,

I noticed EXPLAIN reports wrong counters with parallel plans:

create unlogged table t1 (c1 int);
insert into t1 select generate_series(1,100000000);
vacuum t1;

(stop PG, drop caches, start)

set track_io_timing to on;
explain (analyze,buffers) select count(*) from t1;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=964311.55..964311.56 rows=1 width=8) (actual
time=34478.972..34478.972 rows=1 loops=1)
Buffers: shared read=147952
I/O Timings: read=20407.894
-> Gather (cost=964311.33..964311.54 rows=2 width=8) (actual
time=34478.835..34478.937 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared read=147952
I/O Timings: read=20407.894
-> Partial Aggregate (cost=963311.33..963311.34 rows=1 width=8)
(actual time=34283.809..34283.810 rows=1 loops=3)
Buffers: shared read=442478 dirtied=1 written=1
I/O Timings: read=60706.314 write=0.498
-> Parallel Seq Scan on t1 (cost=0.00..859144.67 rows=41666667
width=0) (actual time=27.027..27405.058 rows=33333333 loops=3)
Buffers: shared read=442478 dirtied=1 written=1
I/O Timings: read=60706.314 write=0.498
Planning time: 39.037 ms
Execution time: 34480.280 ms

(stop PG, drop caches, start)

set max_parallel_workers_per_gather to 0;
explain (analyze,buffers) select count(*) from t1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1692478.00..1692478.01 rows=1 width=8) (actual
time=34367.678..34367.678 rows=1 loops=1)
Buffers: shared read=442478
I/O Timings: read=7977.775
-> Seq Scan on t1 (cost=0.00..1442478.00 rows=100000000 width=0) (actual
time=8.672..19685.654 rows=100000000 loops=1)
Buffers: shared read=442478
I/O Timings: read=7977.775
Planning time: 38.971 ms
Execution time: 34368.097 ms
(8 rows)

Gather and Finalize Aggregate should report shared read=442478.

I am also surprised by the I/O Timings in the parallel node:
I/O Timings: read=60706.314, yet the query lasted only 34s?

I played with git bisect and found this commit:

commit 01edb5c7fc3bcf6aea15f2b3be36189b52ad9d1a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri Sep 1 17:38:54 2017 -0400

Improve division of labor between execParallel.c and nodeGather[Merge].c.

Move the responsibility for creating/destroying TupleQueueReaders into
execParallel.c, to avoid duplicative coding in nodeGather.c and
nodeGatherMerge.c. Also, instead of having DestroyTupleQueueReader do
shm_mq_detach, do it in the caller (which is now only ExecParallelFinish).
This means execParallel.c does both the attaching and detaching of the
tuple-queue-reader shm_mqs, which seems less weird than the previous
arrangement.

These changes also eliminate a vestigial memory leak (of the pei->tqueue
array). It's now demonstrable that rescans of Gather or GatherMerge don't
leak memory.

Discussion: /messages/by-id/8670.1504192177@sss.pgh.pa.us

Before this commit, here is the result:

explain (analyze,buffers) select count(*) from t1;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=964311.55..964311.56 rows=1 width=8) (actual
time=34745.387..34745.387 rows=1 loops=1)
Buffers: shared hit=160 read=442478 dirtied=1 written=1
I/O Timings: read=61219.308 write=0.033
-> Gather (cost=964311.33..964311.54 rows=2 width=8) (actual
time=34745.273..34745.379 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=160 read=442478 dirtied=1 written=1
I/O Timings: read=61219.308 write=0.033
-> Partial Aggregate (cost=963311.33..963311.34 rows=1 width=8)
(actual time=34660.023..34660.023 rows=1 loops=3)
Buffers: shared read=442478 dirtied=1 written=1
I/O Timings: read=61219.308 write=0.033
-> Parallel Seq Scan on t1 (cost=0.00..859144.67 rows=41666667
width=0) (actual time=11.559..27607.271 rows=33333333 loops=3)
Buffers: shared read=442478 dirtied=1 written=1
I/O Timings: read=61219.308 write=0.033
Planning time: 38.094 ms
Execution time: 34746.395 ms
(16 rows)

set max_parallel_workers_per_gather to 0;
explain (analyze,buffers) select count(*) from t1;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1692478.00..1692478.01 rows=1 width=8) (actual
time=34368.540..34368.540 rows=1 loops=1)
Buffers: shared read=442478
I/O Timings: read=8874.662
-> Seq Scan on t1 (cost=0.00..1442478.00 rows=100000000 width=0) (actual
time=20.604..19992.941 rows=100000000 loops=1)
Buffers: shared read=442478
I/O Timings: read=8874.662
Planning time: 37.834 ms
Execution time: 34368.817 ms
(8 rows)

I got similar results on 9.6.

Regards,

--
Adrien NAYRAT

#2Adrien Nayrat
adrien.nayrat@anayrat.info
In reply to: Adrien Nayrat (#1)
Re: Explain buffers wrong counter with parallel plans

Hello,

I tried to understand this issue, and it seems the Gather node only accounts
for its own buffer usage:

create unlogged table t1 (c1 int);
insert into t1 select generate_series(1,1000000);
vacuum t1;

explain (analyze,buffers,timing off,costs off) select count(*) from t1;
QUERY PLAN
------------------------------------------------------------------------
Finalize Aggregate (actual rows=1 loops=1)
Buffers: shared hit=1531
-> Gather (actual rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=1531
-> Partial Aggregate (actual rows=1 loops=3)
Buffers: shared hit=4425
-> Parallel Seq Scan on t1 (actual rows=333333 loops=3)
Buffers: shared hit=4425

Same query without parallelism

QUERY PLAN
----------------------------------------------------
Aggregate (actual rows=1 loops=1)
Buffers: shared hit=4425
-> Seq Scan on t1 (actual rows=1000000 loops=1)
Buffers: shared hit=4425

We can notice Parallel Seq Scan and Partial Aggregate report 4425 buffers, same
for the plan without parallelism.

I put an elog debug message around these lines in execParallel.c:

    /* Accumulate the statistics from all workers. */
    instrument = GetInstrumentationArray(instrumentation);
    instrument += i * instrumentation->num_workers;
    for (n = 0; n < instrumentation->num_workers; ++n)
    {
        elog(LOG, "worker %d - shared_blks_read: %ld - shared_blks_hit: %ld", n,
             instrument[n].bufusage.shared_blks_read,
             instrument[n].bufusage.shared_blks_hit);
        InstrAggNode(planstate->instrument, &instrument[n]);
    }

And I get these messages:

LOG: worker 0 - shared_blks_read: 0 - shared_blks_hit: 1443
LOG: worker 1 - shared_blks_read: 0 - shared_blks_hit: 1451

If you sum each worker's shared_blks_hit and Gather's shared hit, you get:
1443 + 1451 + 1531 = 4425 (the size of t1)

It seems the Gather node and Finalize Aggregate only report buffers read by the
main process; the workers seem to be omitted.
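As a sanity check on the sums above, here is a short sketch (plain Python,
with the hit counts copied from the elog output; the variable names are mine):

```python
# Buffer hit counts copied from the elog output above.
worker_hits = [1443, 1451]   # workers 0 and 1
leader_hits = 1531           # what the Gather node reports on its own

# PG 10 behavior at Gather / Finalize Aggregate: leader only.
gather_reported = leader_hits

# Parallel Seq Scan / Partial Aggregate: leader + all workers.
total_hits = leader_hits + sum(worker_hits)

print(gather_reported)  # 1531
print(total_hits)       # 4425 -- matches the size of t1 in buffers
```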

The same issue occurs for Gather Merge:

explain (analyze,buffers,timing off,costs off)
select * from t1 where c1%10=4 order by 1;

QUERY PLAN
-----------------------------------------------------------------
Gather Merge (actual rows=100000 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=1549
-> Sort (actual rows=33333 loops=3)
Sort Key: c1
Sort Method: quicksort Memory: 3143kB
Buffers: shared hit=4521
-> Parallel Seq Scan on t1 (actual rows=33333 loops=3)
Filter: ((c1 % 10) = 4)
Rows Removed by Filter: 300000
Buffers: shared hit=4425

In my log :
worker 0 - shared_blks_read: 0 - shared_blks_hit: 1487
worker 1 - shared_blks_read: 0 - shared_blks_hit: 1485

1487 + 1485 + 1549 = 4521

I tried to understand how instrumentation works, but it is above my knowledge :(

Regards,

--
Adrien

#3Amit Kapila
amit.kapila16@gmail.com
In reply to: Adrien Nayrat (#1)
Re: Explain buffers wrong counter with parallel plans

On Wed, Mar 28, 2018 at 12:07 AM, Adrien Nayrat
<adrien.nayrat@anayrat.info> wrote:

Hello,

I notice Explain report wrong counters with parallel plans :

create unlogged table t1 (c1 int);
insert into t1 select generate_series(1,100000000);
vacuum t1;

(stop PG, drop caches,start)

set track_io_timing to on;
explain (analyze,buffers) select count(*) from t1;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=964311.55..964311.56 rows=1 width=8) (actual
time=34478.972..34478.972 rows=1 loops=1)
Buffers: shared read=147952
I/O Timings: read=20407.894
-> Gather (cost=964311.33..964311.54 rows=2 width=8) (actual
time=34478.835..34478.937 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared read=147952
I/O Timings: read=20407.894
-> Partial Aggregate (cost=963311.33..963311.34 rows=1 width=8)
(actual time=34283.809..34283.810 rows=1 loops=3)
Buffers: shared read=442478 dirtied=1 written=1
I/O Timings: read=60706.314 write=0.498
-> Parallel Seq Scan on t1 (cost=0.00..859144.67 rows=41666667
width=0) (actual time=27.027..27405.058 rows=33333333 loops=3)
Buffers: shared read=442478 dirtied=1 written=1
I/O Timings: read=60706.314 write=0.498
Planning time: 39.037 ms
Execution time: 34480.280 ms

(stop pg, drop caches,start)

set max_parallel_workers_per_gather to 0;
explain (analyze,buffers) select count(*) from t1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1692478.00..1692478.01 rows=1 width=8) (actual
time=34367.678..34367.678 rows=1 loops=1)
Buffers: shared read=442478
I/O Timings: read=7977.775
-> Seq Scan on t1 (cost=0.00..1442478.00 rows=100000000 width=0) (actual
time=8.672..19685.654 rows=100000000 loops=1)
Buffers: shared read=442478
I/O Timings: read=7977.775
Planning time: 38.971 ms

Execution time: 34368.097 ms

(8 rows)

Gather and Finalize Aggregate should report shared read=442478.

Yeah, it would have been convenient if Gather and Finalize Aggregate
displayed it that way, as it would have been easier for users to
understand. However, as of now, the stats aggregated across the parallel
workers and the leader are displayed at the parallel/partial nodes, as
shown in the plan. So, according to me, what you are seeing is okay; it
is only slightly tricky to understand.

I am also surprised for I/O Timings in parallel node:
I/O Timings: read=60706.314 but the query last 34s?

This is the aggregated time of all the parallel workers, so you need
to divide it by loops (which in your case is 3).
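The arithmetic can be checked with a quick sketch (numbers taken from the
plan upthread):

```python
# The parallel node's I/O timing is summed over the leader and workers;
# dividing by loops gives the average per process.
total_io_ms = 60706.314   # "I/O Timings: read=..." at the Parallel Seq Scan
loops = 3                 # leader + 2 workers

avg_io_ms = total_io_ms / loops
print(round(avg_io_ms, 3))  # 20235.438 ms, consistent with a ~34 s query
```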

I played with git bisect and I found this commit :

commit 01edb5c7fc3bcf6aea15f2b3be36189b52ad9d1a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri Sep 1 17:38:54 2017 -0400

I think you were seeing different results before this commit because,
before it, we were shutting down workers as soon as the parallel workers
were done, and the buffer_usage stats were accumulated and used for
upper nodes. According to me, the behavior after the commit is
consistent; for example, if you check the case of Gather Merge before
this commit, you will still get the stats the way they appear after the
commit.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#4Amit Kapila
amit.kapila16@gmail.com
In reply to: Adrien Nayrat (#2)
Re: Explain buffers wrong counter with parallel plans

On Fri, Apr 20, 2018 at 6:36 PM, Adrien Nayrat
<adrien.nayrat@anayrat.info> wrote:

Hello,

I tried to understand this issue and it seems Gather node only take account of
this own buffer usage :

create unlogged table t1 (c1 int);
insert into t1 select generate_series(1,1000000);
vacuum t1;

explain (analyze,buffers,timing off,costs off) select count(*) from t1;
QUERY PLAN
------------------------------------------------------------------------
Finalize Aggregate (actual rows=1 loops=1)
Buffers: shared hit=1531
-> Gather (actual rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=1531
-> Partial Aggregate (actual rows=1 loops=3)
Buffers: shared hit=4425
-> Parallel Seq Scan on t1 (actual rows=333333 loops=3)
Buffers: shared hit=4425

Same query without parallelism

QUERY PLAN
----------------------------------------------------
Aggregate (actual rows=1 loops=1)
Buffers: shared hit=4425
-> Seq Scan on t1 (actual rows=1000000 loops=1)
Buffers: shared hit=4425

We can notice Parallel Seq Scan and Partial Aggregate report 4425 buffers, same
for the plan without parallelism.

I put elog debug around theses lines in execParallel.c :

/* Accumulate the statistics from all workers. */
instrument = GetInstrumentationArray(instrumentation);
instrument += i * instrumentation->num_workers;
for (n = 0; n < instrumentation->num_workers; ++n)
{
elog(LOG, "worker %d - shared_blks_read: %ld - shared_blks_hit: %ld", n,
instrument[n].bufusage.shared_blks_read,instrument[n].bufusage.shared_blks_hit);
InstrAggNode(planstate->instrument, &instrument[n]);
}

And I get theses messages :

LOG: worker 0 - shared_blks_read: 0 - shared_blks_hit: 1443
LOG: worker 1 - shared_blks_read: 0 - shared_blks_hit: 1451

I think you can try the 'verbose' option; it will give per-worker stats.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#5Adrien Nayrat
adrien.nayrat@anayrat.info
In reply to: Amit Kapila (#4)
Re: Explain buffers wrong counter with parallel plans

On 04/29/2018 05:11 PM, Amit Kapila wrote:

I think you can try 'verbose' option, it will give per-worker stats.

Thanks! I didn't know verbose gives per-worker stats!

--
Adrien

#6Adrien Nayrat
adrien.nayrat@anayrat.info
In reply to: Amit Kapila (#3)
Re: Explain buffers wrong counter with parallel plans

On 04/29/2018 05:10 PM, Amit Kapila wrote:

Yeah, it would have been convenient, if Gather and Finalize Aggregate
displays that way as it would have been easier for users to
understand. However, as of now, the aggregated stats for parallel
workers and leader are displayed at parallel/partial nodes as is
displayed in the plan. So according to me, what you are seeing is
okay, it is only slightly tricky to understand it.

[...]

I played with git bisect and I found this commit :

commit 01edb5c7fc3bcf6aea15f2b3be36189b52ad9d1a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri Sep 1 17:38:54 2017 -0400

I think you were seeing different results before this commit because
before that we were shutting down workers as soon as parallel workers
are done and the buffer_usage stats were accumulated and were being
used for upper nodes. According to me behavior after the commit is
consistent, for example, I think if you check the case of GatherMerge
before this commit, you will still get the stats in the way it is
after commit.

I understand. Maybe this change should be mentioned in the release notes
and/or documentation?

Thanks,

#7Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#3)
Re: Explain buffers wrong counter with parallel plans

Hi,

On 2018-04-29 20:40:43 +0530, Amit Kapila wrote:

On Wed, Mar 28, 2018 at 12:07 AM, Adrien Nayrat

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1692478.00..1692478.01 rows=1 width=8) (actual
time=34367.678..34367.678 rows=1 loops=1)
Buffers: shared read=442478
I/O Timings: read=7977.775
-> Seq Scan on t1 (cost=0.00..1442478.00 rows=100000000 width=0) (actual
time=8.672..19685.654 rows=100000000 loops=1)
Buffers: shared read=442478
I/O Timings: read=7977.775
Planning time: 38.971 ms

Execution time: 34368.097 ms

(8 rows)

Gather and Finalize Aggregate should report shared read=442478.

Yeah, it would have been convenient, if Gather and Finalize Aggregate
displays that way as it would have been easier for users to
understand. However, as of now, the aggregated stats for parallel
workers and leader are displayed at parallel/partial nodes as is
displayed in the plan. So according to me, what you are seeing is
okay, it is only slightly tricky to understand it.

I am also surprised for I/O Timings in parallel node:
I/O Timings: read=60706.314 but the query last 34s?

This is the aggregated time of all the parallel workers, so you need
to divide it by loops (which in your case is 3).

I played with git bisect and I found this commit :

commit 01edb5c7fc3bcf6aea15f2b3be36189b52ad9d1a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri Sep 1 17:38:54 2017 -0400

I think you were seeing different results before this commit because
before that we were shutting down workers as soon as parallel workers
are done and the buffer_usage stats were accumulated and were being
used for upper nodes. According to me behavior after the commit is
consistent, for example, I think if you check the case of GatherMerge
before this commit, you will still get the stats in the way it is
after commit.

Robert, you added this as an open item. I don't think it's clear that
there's a bug here, and even less clear that it's something new for
v11. Am I understanding that right? Should we close this open item?

Greetings,

Andres Freund

#8Adrien Nayrat
adrien.nayrat@anayrat.info
In reply to: Andres Freund (#7)
Re: Explain buffers wrong counter with parallel plans

On 05/01/2018 10:57 PM, Andres Freund wrote:

Robert, you added this as an open item. I don't think it's clear that
there's a bug here, and even less clear that it's something new for
v11. Am I understanding that right? Should we close this open item?

Hi,

FYI, this change is related to 01edb5c7fc, which is not specific to v11;
01edb5c7fc is also in v10.

Regards,

--
Adrien

#9Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#7)
Re: Explain buffers wrong counter with parallel plans

On Tue, May 1, 2018 at 4:57 PM, Andres Freund <andres@anarazel.de> wrote:

Robert, you added this as an open item. I don't think it's clear that
there's a bug here, and even less clear that it's something new for
v11. Am I understanding that right? Should we close this open item?

Yeah, I added it in response to the original post, but it sounds like
this is just another case of somebody being confused by the fact that
EXPLAIN divides the row counts, startup time, total time, and number
of rows filtered by the loop count. So it's not an open item.

That having been said, what I think we ought to do about the confusion
that regularly results from that logic is rip it out, as in the
attached patch. This would affect not only the parallel query case
but also the inner sides of nested loops, but that doesn't bother me a
bit because I think dividing is unhelpful in both cases. Note that
explain.depesz.com, for example, goes and re-multiplies the times by
the loop count to get a figure for how much time was spent in that
node; I regularly do the same thing by hand when I read such EXPLAIN
output. Also, if you did think those cases should be handled
differently, you'd have to explain what should happen when there's a
nested loop that gets run by multiple workers, possibly different
numbers of times in each one.

One problem with this patch is that tools like explain.depesz.com (or
my brain) would need to know if they were getting output from 11+ or
10-, so it might be good to perturb this output in some other way so
as to avoid confusion, like by replacing "loops=" with some other
text. I don't have a specific proposal for that.
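For illustration, the division EXPLAIN performs and the re-multiplication
that tools do to undo it can be sketched as follows (a Python sketch; the
raw total is hypothetical, derived from the per-loop figure in the plan
upthread):

```python
loops = 3
raw_total_time_ms = 27405.058 * loops  # what the executor measured in total

# EXPLAIN currently divides by loops before printing "actual time":
displayed_ms = raw_total_time_ms / loops

# explain.depesz.com (or a reader) multiplies back to recover the total:
recovered_ms = displayed_ms * loops

print(round(displayed_ms, 3))   # 27405.058 -- the per-loop figure shown
print(round(recovered_ms, 3))   # 82215.174 -- total time across all loops
```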

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

no-loop-division.patch (application/octet-stream; +48 -51)
#10Adrien Nayrat
adrien.nayrat@anayrat.info
In reply to: Robert Haas (#9)
Re: Explain buffers wrong counter with parallel plans

Hi,

On 05/02/2018 05:22 PM, Robert Haas wrote:

On Tue, May 1, 2018 at 4:57 PM, Andres Freund <andres@anarazel.de> wrote:

Robert, you added this as an open item. I don't think it's clear that
there's a bug here, and even less clear that it's something new for
v11. Am I understanding that right? Should we close this open item?

Yeah, I added it in response to the original post, but it sounds like
this is just another case of somebody being confused by the fact that
EXPLAIN divides the row counts, startup time, total time, and number
of rows filtered by the loop count. So it's not an open item.

To be precise, I am not confused by these counters; they do not change
between 9.6 and 10.

I am confused by the buffers counters, which are different between 9.6 and 10.

In 9.6, the Gather node reports the sum of buffers for the main process +
workers. In 10, the Gather node only reports buffers from the main process.

Thanks,

--
Adrien

#11Robert Haas
robertmhaas@gmail.com
In reply to: Adrien Nayrat (#10)
Re: Explain buffers wrong counter with parallel plans

On Wed, May 2, 2018 at 11:37 AM, Adrien Nayrat
<adrien.nayrat@anayrat.info> wrote:

In 9.6 gather node reports sum of buffers for main process + workers. In 10,
gather node only reports buffers from the main process.

Oh, really? Well, that sounds like a bug. Amit seems to think it's
expected behavior, but I don't know why it should be. The commit
message makes it sound like it's just refactoring, but in fact it
seems to have made a significant behavior change that doesn't look
very desirable.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#11)
Re: Explain buffers wrong counter with parallel plans

On Fri, May 4, 2018 at 10:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, May 2, 2018 at 11:37 AM, Adrien Nayrat
<adrien.nayrat@anayrat.info> wrote:

In 9.6 gather node reports sum of buffers for main process + workers. In 10,
gather node only reports buffers from the main process.

Oh, really? Well, that sounds like a bug. Amit seems to think it's
expected behavior, but I don't know why it should be.

The reason I think the current behavior is okay is that it was
coincidental that they were displayed correctly. We have not made any
effort to percolate it to upper nodes. For example, before that commit
too, it was not being displayed for Gather Merge, or for Gather under
some kind of node like 'Limit' where we have to stop before reaching the
end of the result.

The commit
message makes it sound like it's just refactoring, but in fact it
seems to have made a significant behavior change that doesn't look
very desirable.

I think it is the below part of that commit which has made this
difference, basically shutting down the workers:

    if (readerdone)
    {
        Assert(!tup);
        ..
        if (gatherstate->nreaders == 0)
-       {
-           ExecShutdownGatherWorkers(gatherstate);
            return NULL;
-       }

The reason the above code led to different results is that, while
shutting down workers, we gather the buffer usage of all workers into
'pgBufferUsage' via InstrAccumParallelQuery, and then an upper-level
node, say Gather in this case, gets a chance to use those stats via
ExecProcNodeInstr->InstrStopNode. However, this won't be true in other
cases where we need to exit before reaching the end of the results, as
with a 'Limit' node: in such cases, after shutting down the workers via
ExecShutdownNode, we won't do InstrStopNode for the upper-level nodes. I
think if we want all the stats collected by workers to percolate to
Gather and the nodes above it, then we need to somehow ensure that we
always shut down Gather/GatherMerge before we do InstrStopNode for those
nodes.
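The ordering dependency Amit describes can be illustrated with a toy model
(plain Python, not the actual executor code; the function names only mimic
the C ones, and the counts are the ones from upthread):

```python
# Toy model: worker stats reach the shared accumulator only at shutdown,
# so a parent node that snapshots the accumulator before the workers are
# shut down misses their contribution.
pg_buffer_usage = 0  # stands in for the global pgBufferUsage counter

def shutdown_workers(per_worker_hits):
    """Mimics InstrAccumParallelQuery: fold worker stats into the global."""
    global pg_buffer_usage
    pg_buffer_usage += sum(per_worker_hits)

def instr_stop_node(own_hits):
    """Mimics InstrStopNode: the node records the global counter + own work."""
    return pg_buffer_usage + own_hits

# Limit-style early exit: the parent stops instrumentation first...
seen_too_early = instr_stop_node(1531)   # 1531 -- workers missing

# ...versus shutting the workers down before the parent stops:
shutdown_workers([1443, 1451])
seen_in_order = instr_stop_node(1531)    # 4425 -- the complete picture

print(seen_too_early, seen_in_order)
```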

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#13Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#12)
Re: Explain buffers wrong counter with parallel plans

On Sat, May 5, 2018 at 8:56 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

The reason why I think the current behavior is okay because it is
coincidental that they were displayed correctly. We have not made any
effort to percolate it to upper nodes. For ex., before that commit
also, it was not being displayed for Gather Merge or Gather with some
kind of node like 'Limit' where we have to stop before reaching the
end of the result.

It's not entirely coincidental. I had the intention to try to ensure
that the workers would be shut down before the Gather or Gather Merge,
and I think that various things in the code testify to that intention.
It sounds like I missed some cases, but now we're missing even more
cases.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#13)
Re: Explain buffers wrong counter with parallel plans

On 2018-May-07, Robert Haas wrote:

On Sat, May 5, 2018 at 8:56 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

The reason why I think the current behavior is okay because it is
coincidental that they were displayed correctly. We have not made any
effort to percolate it to upper nodes. For ex., before that commit
also, it was not being displayed for Gather Merge or Gather with some
kind of node like 'Limit' where we have to stop before reaching the
end of the result.

It's not entirely coincidental. I had the intention to try to ensure
that the workers would be shut down before the Gather or Gather Merge,
and I think that various things in the code testify to that intention.
It sounds like I missed some cases, but now we're missing even more
cases.

So, apparently this is not a Postgres 11 open item, but rather a bug
that goes back to pg10. However, maybe it would be worth fixing soon
anyway? In particular, if we want to perturb the explain output as
suggested upthread, maybe *that* should be considered an open item?

Anyway, this has stalled for a month now.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#14)
Re: Explain buffers wrong counter with parallel plans

On Wed, Jun 6, 2018 at 12:08 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

So, apparently this is not a Postgres 11 open item, but rather a bug
that goes back to pg10. However, maybe it would be worth fixing soon
anyway? In particular, if we want to perturb the explain output as
suggested upthread, maybe *that* should be considered an open item?

Anyway, this has stalled for a month now.

Yeah. I'm not willing to do anything here unilaterally. There is
neither universal agreement that there is a problem here nor agreement
on a fix. Unless that changes, I don't know what to do here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#16Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#15)
Re: Explain buffers wrong counter with parallel plans

On Fri, Jun 8, 2018 at 11:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Jun 6, 2018 at 12:08 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

So, apparently this is not a Postgres 11 open item, but rather a bug
that goes back to pg10. However, maybe it would be worth fixing soon
anyway? In particular, if we want to perturb the explain output as
suggested upthread, maybe *that* should be considered an open item?

Anyway, this has stalled for a month now.

Yeah. I'm not willing to do anything here unilaterally. There is
neither universal agreement that there is a problem here nor agreement
on a fix.

Right, I think we have following options:
(a) Come up with a solution which allows percolating the buffer usage
and or similar stats to upper nodes in all cases.
(b) Allow it to work for some of the cases as it was earlier.

I think (b) can cause confusion and could lead to further questions on
which specific cases will it work and for which it won't work. If we
think (a) is a reasonable approach, then we can close this item with a
conclusion as a work item for future and OTOH if we think option (b)
is the better way to deal with it, then we can come up with a patch to
do so. My inclination is to go with option (a), but I don't mind if
the decision is to choose option (b).

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#17Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#15)
Re: Explain buffers wrong counter with parallel plans

On 2018-Jun-08, Robert Haas wrote:

On Wed, Jun 6, 2018 at 12:08 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

So, apparently this is not a Postgres 11 open item, but rather a bug
that goes back to pg10. However, maybe it would be worth fixing soon
anyway? In particular, if we want to perturb the explain output as
suggested upthread, maybe *that* should be considered an open item?

Anyway, this has stalled for a month now.

Yeah. I'm not willing to do anything here unilaterally. There is
neither universal agreement that there is a problem here nor agreement
on a fix. Unless that changes, I don't know what to do here.

After going over this whole thing once more, my opinion is that this
reporting is broken. IMO if we want to display per-process counts, then
let's display them all, or none; if we're going to display aggregated
counts, then we should mark them as such. Otherwise it's pretty much
impossible to read the output unless you know exactly what the code does
for each value. (The other option would be to document what the code
does for each value, but that seems suboptimal -- that just requires
people to glue one more cheat sheet to their walls, which is not Pg
style.)

I'm CCing Tom here, as author of the patch that caused (most of) the
issue.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#17)
Re: Explain buffers wrong counter with parallel plans

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I'm CCing Tom here, as author of the patch that caused (most of) the
issue.

Uh ... me? I thought this was a parallel-query issue, which I've
pretty much not been involved in.

regards, tom lane

#19Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#18)
Re: Explain buffers wrong counter with parallel plans

On 2018-Jun-29, Tom Lane wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I'm CCing Tom here, as author of the patch that caused (most of) the
issue.

Uh ... me? I thought this was a parallel-query issue, which I've
pretty much not been involved in.

Well, maybe it's a matter of opinion. Amit K said a few messages back
that your 01edb5c7fc3b ("Improve division of labor between
execParallel.c and nodeGather[Merge].c.") had changed the way these
numbers are printed, but only now I realize that he then indicated that
a different code path was already behaving in that way.

I stand by my opinion that we should not give misleading/confusing info;
either let's show it all in the default output, or only do it in
VERBOSE, but if the latter then let's suppress the misleading numbers in
the default output.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#20Amit Kapila
amit.kapila16@gmail.com
In reply to: Alvaro Herrera (#19)
Re: Explain buffers wrong counter with parallel plans

On Sat, Jun 30, 2018 at 4:08 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

On 2018-Jun-29, Tom Lane wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I'm CCing Tom here, as author of the patch that caused (most of) the
issue.

Uh ... me? I thought this was a parallel-query issue, which I've
pretty much not been involved in.

Well, maybe it's a matter of opinion. Amit K said a few messages back
that your 01edb5c7fc3b ("Improve division of labor between
execParallel.c and nodeGather[Merge].c.") had changed the way these
numbers are printed, but only now I realize that he then indicated that
a different code path was already behaving in that way.

I stand by my opinion that we should not give misleading/confusing info;
either let's show it all in the default output, or only do it in
VERBOSE, but if the latter then let's suppress the misleading numbers in
the default output.

It is not clear to me what exactly your expectation is. Can you be a
bit more specific? AFAIU, the primary confusion for the OP is that the
aggregated stats like buffer usage (Buffers: ..) should be displayed
at Gather/Gather Merge and the nodes above them. Currently, those are
displayed correctly at the Parallel (Seq Scan) nodes.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#21Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#18)
#22Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#16)
#23Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#22)
#24Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#23)
#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#23)
#26Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#25)
#27Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#26)
#28Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#27)
#29Jonathan S. Katz
jonathan.katz@excoventures.com
In reply to: Amit Kapila (#28)
#30Amit Kapila
amit.kapila16@gmail.com
In reply to: Jonathan S. Katz (#29)
#31Jonathan S. Katz
jonathan.katz@excoventures.com
In reply to: Amit Kapila (#30)
#32Amit Kapila
amit.kapila16@gmail.com
In reply to: Jonathan S. Katz (#31)
#33Jonathan S. Katz
jonathan.katz@excoventures.com
In reply to: Amit Kapila (#32)
#34Amit Kapila
amit.kapila16@gmail.com
In reply to: Jonathan S. Katz (#33)
#35Jonathan S. Katz
jonathan.katz@excoventures.com
In reply to: Amit Kapila (#34)
#36Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#34)
#37Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#36)
#38Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#34)
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#38)
#40Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#39)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#40)
#42Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#38)
#43Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#42)
#44Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#43)
#45Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#44)
#46Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#45)
#47Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#46)
#48Adrien Nayrat
adrien.nayrat@anayrat.info
In reply to: Amit Kapila (#47)
#49Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#47)
#50Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Adrien Nayrat (#48)
#51Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#47)
#52Amit Kapila
amit.kapila16@gmail.com
In reply to: Alvaro Herrera (#50)
#53Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#51)
#54Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#49)