[PATCH] Add extra statistics to explain for Nested Loop

Started by Ekaterina Sokolovaover 5 years ago33 messageshackers
Jump to latest
#1Ekaterina Sokolova
e.sokolova@postgrespro.ru

Hi, hackers.
For some distributions of data in tables, different loops in nested loop
joins can take different time and process different amounts of entries.
It makes average statistics returned by explain analyze not very useful
for DBA.
To fix it, here is the patch that add printing of min and max statistics
for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
Please don't hesitate to share any thoughts on this topic!
--
Ekaterina Sokolova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

extra_statistics_v0.patchtext/x-diff; name=extra_statistics_v0.patchDownload+209-106
example_v0.sqltext/plain; name=example_v0.sqlDownload
#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Ekaterina Sokolova (#1)
Re: [PATCH] Add extra statistics to explain for Nested Loop

pá 16. 10. 2020 v 9:43 odesílatel <e.sokolova@postgrespro.ru> napsal:

Hi, hackers.
For some distributions of data in tables, different loops in nested loop
joins can take different time and process different amounts of entries.
It makes average statistics returned by explain analyze not very useful
for DBA.
To fix it, here is the patch that add printing of min and max statistics
for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
Please don't hesitate to share any thoughts on this topic!

+1

This is great feature - sometimes it can be pretty messy current limited
format

Pavel

--

Show quoted text

Ekaterina Sokolova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#3Julien Rouhaud
rjuju123@gmail.com
In reply to: Pavel Stehule (#2)
Re: [PATCH] Add extra statistics to explain for Nested Loop

Le ven. 16 oct. 2020 à 16:12, Pavel Stehule <pavel.stehule@gmail.com> a
écrit :

pá 16. 10. 2020 v 9:43 odesílatel <e.sokolova@postgrespro.ru> napsal:

Hi, hackers.
For some distributions of data in tables, different loops in nested loop
joins can take different time and process different amounts of entries.
It makes average statistics returned by explain analyze not very useful
for DBA.
To fix it, here is the patch that add printing of min and max statistics
for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
Please don't hesitate to share any thoughts on this topic!

+1

This is great feature - sometimes it can be pretty messy current limited
format

+1, this can be very handy!

Show quoted text
#4Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Julien Rouhaud (#3)
Re: [PATCH] Add extra statistics to explain for Nested Loop

On 16.10.2020 12:07, Julien Rouhaud wrote:

Le ven. 16 oct. 2020 à 16:12, Pavel Stehule <pavel.stehule@gmail.com
<mailto:pavel.stehule@gmail.com>> a écrit :

pá 16. 10. 2020 v 9:43 odesílatel <e.sokolova@postgrespro.ru
<mailto:e.sokolova@postgrespro.ru>> napsal:

Hi, hackers.
For some distributions of data in tables, different loops in
nested loop
joins can take different time and process different amounts of
entries.
It makes average statistics returned by explain analyze not
very useful
for DBA.
To fix it, here is the patch that add printing of min and max
statistics
for time and rows across all loops in Nested Loop to EXPLAIN
ANALYSE.
Please don't hesitate to share any thoughts on this topic!

+1

This is great feature - sometimes it can be pretty messy current
limited format

+1, this can be very handy!

Cool.
I have added your patch to the commitfest, so it won't get lost.
https://commitfest.postgresql.org/30/2765/

I will review the code next week.  Unfortunately, I cannot give any
feedback about usability of this feature.

User visible change is:

-               ->  Nested Loop (actual rows=N loops=N)
+              ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2)

Pavel, Julien, could you please say if it looks good?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#5Pavel Stehule
pavel.stehule@gmail.com
In reply to: Anastasia Lubennikova (#4)
Re: [PATCH] Add extra statistics to explain for Nested Loop

so 17. 10. 2020 v 0:11 odesílatel Anastasia Lubennikova <
a.lubennikova@postgrespro.ru> napsal:

On 16.10.2020 12:07, Julien Rouhaud wrote:

Le ven. 16 oct. 2020 à 16:12, Pavel Stehule <pavel.stehule@gmail.com> a
écrit :

pá 16. 10. 2020 v 9:43 odesílatel <e.sokolova@postgrespro.ru> napsal:

Hi, hackers.
For some distributions of data in tables, different loops in nested loop
joins can take different time and process different amounts of entries.
It makes average statistics returned by explain analyze not very useful
for DBA.
To fix it, here is the patch that add printing of min and max statistics
for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
Please don't hesitate to share any thoughts on this topic!

+1

This is great feature - sometimes it can be pretty messy current limited
format

+1, this can be very handy!

Cool.

I have added your patch to the commitfest, so it won't get lost.
https://commitfest.postgresql.org/30/2765/

I will review the code next week. Unfortunately, I cannot give any
feedback about usability of this feature.

User visible change is:

-               ->  Nested Loop (actual rows=N loops=N)
+              ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0
loops=2)

This interface is ok - there is not too much space for creativity. I can
imagine displaying variance or average - but I am afraid about very bad
performance impacts.

Regards

Pavel

Show quoted text

Pavel, Julien, could you please say if it looks good?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#6Julien Rouhaud
rjuju123@gmail.com
In reply to: Pavel Stehule (#5)
Re: [PATCH] Add extra statistics to explain for Nested Loop

On Sat, Oct 17, 2020 at 12:15 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

so 17. 10. 2020 v 0:11 odesílatel Anastasia Lubennikova <a.lubennikova@postgrespro.ru> napsal:

On 16.10.2020 12:07, Julien Rouhaud wrote:

Le ven. 16 oct. 2020 à 16:12, Pavel Stehule <pavel.stehule@gmail.com> a écrit :

pá 16. 10. 2020 v 9:43 odesílatel <e.sokolova@postgrespro.ru> napsal:

Hi, hackers.
For some distributions of data in tables, different loops in nested loop
joins can take different time and process different amounts of entries.
It makes average statistics returned by explain analyze not very useful
for DBA.
To fix it, here is the patch that add printing of min and max statistics
for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
Please don't hesitate to share any thoughts on this topic!

+1

This is great feature - sometimes it can be pretty messy current limited format

+1, this can be very handy!

Cool.
I have added your patch to the commitfest, so it won't get lost.

Thanks! I'll also try to review it next week.

https://commitfest.postgresql.org/30/2765/

I will review the code next week. Unfortunately, I cannot give any feedback about usability of this feature.

User visible change is:

-               ->  Nested Loop (actual rows=N loops=N)
+              ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2)

This interface is ok - there is not too much space for creativity.

Yes I also think it's ok. We should also consider usability for tools
like explain.depesz.com, I don't know if the current output is best.
I'm adding Depesz and Pierre which are both working on this kind of
tool for additional input.

I can imagine displaying variance or average - but I am afraid about very bad performance impacts.

The original counter (rows here) is already an average right?
Variance could be nice too. Instrumentation will already spam
gettimeofday() calls for nested loops, I don't think that computing
variance would add that much overhead?

#7Pavel Stehule
pavel.stehule@gmail.com
In reply to: Julien Rouhaud (#6)
Re: [PATCH] Add extra statistics to explain for Nested Loop

so 17. 10. 2020 v 6:26 odesílatel Julien Rouhaud <rjuju123@gmail.com>
napsal:

On Sat, Oct 17, 2020 at 12:15 PM Pavel Stehule <pavel.stehule@gmail.com>
wrote:

so 17. 10. 2020 v 0:11 odesílatel Anastasia Lubennikova <

a.lubennikova@postgrespro.ru> napsal:

On 16.10.2020 12:07, Julien Rouhaud wrote:

Le ven. 16 oct. 2020 à 16:12, Pavel Stehule <pavel.stehule@gmail.com>

a écrit :

pá 16. 10. 2020 v 9:43 odesílatel <e.sokolova@postgrespro.ru> napsal:

Hi, hackers.
For some distributions of data in tables, different loops in nested

loop

joins can take different time and process different amounts of

entries.

It makes average statistics returned by explain analyze not very

useful

for DBA.
To fix it, here is the patch that add printing of min and max

statistics

for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
Please don't hesitate to share any thoughts on this topic!

+1

This is great feature - sometimes it can be pretty messy current

limited format

+1, this can be very handy!

Cool.
I have added your patch to the commitfest, so it won't get lost.

Thanks! I'll also try to review it next week.

https://commitfest.postgresql.org/30/2765/

I will review the code next week. Unfortunately, I cannot give any

feedback about usability of this feature.

User visible change is:

-               ->  Nested Loop (actual rows=N loops=N)
+              ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0

loops=2)

This interface is ok - there is not too much space for creativity.

Yes I also think it's ok. We should also consider usability for tools
like explain.depesz.com, I don't know if the current output is best.
I'm adding Depesz and Pierre which are both working on this kind of
tool for additional input.

I can imagine displaying variance or average - but I am afraid about

very bad performance impacts.

The original counter (rows here) is already an average right?
Variance could be nice too. Instrumentation will already spam
gettimeofday() calls for nested loops, I don't think that computing
variance would add that much overhead?

There is not any problem to write benchmark for worst case and test it

In reply to: Julien Rouhaud (#6)
Re: [PATCH] Add extra statistics to explain for Nested Loop

On Sat, Oct 17, 2020 at 12:26:08PM +0800, Julien Rouhaud wrote:

-               ->  Nested Loop (actual rows=N loops=N)
+              ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2)

This interface is ok - there is not too much space for creativity.

Yes I also think it's ok. We should also consider usability for tools
like explain.depesz.com, I don't know if the current output is best.
I'm adding Depesz and Pierre which are both working on this kind of
tool for additional input.

Thanks for heads up. This definitely will need some changes on my side,
but should be easy to handle.

Best regards,

depesz

#9David G. Johnston
david.g.johnston@gmail.com
In reply to: Anastasia Lubennikova (#4)
Re: [PATCH] Add extra statistics to explain for Nested Loop

On Fri, Oct 16, 2020 at 3:11 PM Anastasia Lubennikova <
a.lubennikova@postgrespro.ru> wrote:

User visible change is:

-               ->  Nested Loop (actual rows=N loops=N)
+              ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0
loops=2)

I'd be inclined to append both new rows to the end.

(actual rows=N loops=N min_rows=N max_rows=N)

rows * loops is still an important calculation.

Why not just add total_rows while we are at it - last in the listing?

(actual rows=N loops=N min_rows=N max_rows=N total_rows=N)

David J.

#10Julien Rouhaud
rjuju123@gmail.com
In reply to: Anastasia Lubennikova (#4)
Re: [PATCH] Add extra statistics to explain for Nested Loop

On Sat, Oct 17, 2020 at 6:11 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:

On 16.10.2020 12:07, Julien Rouhaud wrote:

Le ven. 16 oct. 2020 à 16:12, Pavel Stehule <pavel.stehule@gmail.com> a écrit :

pá 16. 10. 2020 v 9:43 odesílatel <e.sokolova@postgrespro.ru> napsal:

Hi, hackers.
For some distributions of data in tables, different loops in nested loop
joins can take different time and process different amounts of entries.
It makes average statistics returned by explain analyze not very useful
for DBA.
To fix it, here is the patch that add printing of min and max statistics
for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
Please don't hesitate to share any thoughts on this topic!

+1

This is great feature - sometimes it can be pretty messy current limited format

+1, this can be very handy!

Cool.
I have added your patch to the commitfest, so it won't get lost.
https://commitfest.postgresql.org/30/2765/

I will review the code next week. Unfortunately, I cannot give any feedback about usability of this feature.

User visible change is:

-               ->  Nested Loop (actual rows=N loops=N)
+              ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2)

Thanks for working on this feature! Here are some comments on the patch.

First, cosmetic issues. There are a lot of whitespace issues, the new
code is space indented while it should be tab indented. Also there
are 3 patches included with some fixups, could you instead push a
single patch?

It also misses some modification in the regression tests. For instance:

diff --git a/src/test/regress/expected/partition_prune.out
b/src/test/regress/expected/partition_prune.out
index 50d2a7e4b9..db0b167ef4 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2065,7 +2065,7 @@ select explain_parallel_append('select avg(ab.a)
from ab inner join lprt_a a on
          Workers Planned: 1
          Workers Launched: N
          ->  Partial Aggregate (actual rows=N loops=N)
-               ->  Nested Loop (actual rows=N loops=N)
+               ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2)
                      ->  Parallel Seq Scan on lprt_a a (actual rows=N loops=N)

You should update the explain_parallel_append() plpgsql function
created in that test file to make sure that both "rows" and the two
new counters are changed to "N". There might be other similar changes
needed.

Now, for the changes themselves. For the min/max time, you're
aggregating "totaltime - instr->firsttuple". Why removing the startup
time from the loop execution time? I think this should be kept.
Also, in explain.c you display the counters in the "Nested loop" node,
but this should be done in the inner plan node instead, as this is the
one being looped on. So the condition should probably be "nloops > 1"
rather than testing if it's a NestLoop.

I'm switching the patch to WoA.

#11Pierre Giraud
pierre.giraud@dalibo.com
In reply to: Julien Rouhaud (#6)
Re: [PATCH] Add extra statistics to explain for Nested Loop

Le 17/10/2020 à 06:26, Julien Rouhaud a écrit :

On Sat, Oct 17, 2020 at 12:15 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

so 17. 10. 2020 v 0:11 odesílatel Anastasia Lubennikova <a.lubennikova@postgrespro.ru> napsal:

On 16.10.2020 12:07, Julien Rouhaud wrote:

Le ven. 16 oct. 2020 à 16:12, Pavel Stehule <pavel.stehule@gmail.com> a écrit :

pá 16. 10. 2020 v 9:43 odesílatel <e.sokolova@postgrespro.ru> napsal:

Hi, hackers.
For some distributions of data in tables, different loops in nested loop
joins can take different time and process different amounts of entries.
It makes average statistics returned by explain analyze not very useful
for DBA.
To fix it, here is the patch that add printing of min and max statistics
for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
Please don't hesitate to share any thoughts on this topic!

+1

This is great feature - sometimes it can be pretty messy current limited format

+1, this can be very handy!

Cool.
I have added your patch to the commitfest, so it won't get lost.

Thanks! I'll also try to review it next week.

https://commitfest.postgresql.org/30/2765/

I will review the code next week. Unfortunately, I cannot give any feedback about usability of this feature.

User visible change is:

-               ->  Nested Loop (actual rows=N loops=N)
+              ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2)

This interface is ok - there is not too much space for creativity.

Yes I also think it's ok. We should also consider usability for tools
like explain.depesz.com, I don't know if the current output is best.
I'm adding Depesz and Pierre which are both working on this kind of
tool for additional input.

Same for me and PEV2. It should be fairly easy to parse this new format.

I can imagine displaying variance or average - but I am afraid about very bad performance impacts.

The original counter (rows here) is already an average right?
Variance could be nice too. Instrumentation will already spam
gettimeofday() calls for nested loops, I don't think that computing
variance would add that much overhead?

Thus, it's an average value. And to be mentioned: a rounded one! Which I
found a bit tricky to figure out.

#12Andres Freund
andres@anarazel.de
In reply to: Ekaterina Sokolova (#1)
Re: [PATCH] Add extra statistics to explain for Nested Loop

Hi,

On 2020-10-16 10:42:43 +0300, e.sokolova@postgrespro.ru wrote:

For some distributions of data in tables, different loops in nested loop
joins can take different time and process different amounts of entries. It
makes average statistics returned by explain analyze not very useful for
DBA.
To fix it, here is the patch that add printing of min and max statistics for
time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
Please don't hesitate to share any thoughts on this topic!

Interesting idea!

I'm a bit worried that further increasing the size of struct
Instrumentation will increase the overhead of EXPLAIN ANALYZE further -
in some workloads we spend a fair bit of time in code handling that. It
would be good to try to find a few bad cases, and see what the overhead is.

Unfortunately your patch is pretty hard to look at - you seem to have
included your incremental hacking efforts?

From 7871ac1afe7837a6dc0676a6c9819cc68a5c0f07 Mon Sep 17 00:00:00 2001
From: "e.sokolova" <e.sokolova@postgrespro.ru>
Date: Fri, 4 Sep 2020 18:00:47 +0300
Subject: Add min and max statistics without case of
parallel workers. Tags: commitfest_hotfix.

From ebdfe117e4074d268e3e7c480b98d375d1d6f62b Mon Sep 17 00:00:00 2001
From: "e.sokolova" <e.sokolova@postgrespro.ru>
Date: Fri, 11 Sep 2020 23:04:34 +0300
Subject: Add case of parallel workers. Tags:
commitfest_hotfix.

From ecbf04d519e17b8968103364e89169ab965b41d7 Mon Sep 17 00:00:00 2001
From: "e.sokolova" <e.sokolova@postgrespro.ru>
Date: Fri, 18 Sep 2020 13:35:19 +0300
Subject: Fix bugs. Tags: commitfest_hotfix.

From 7566a98bbc33a24052e1334b0afe2cf341c0818f Mon Sep 17 00:00:00 2001
From: "e.sokolova" <e.sokolova@postgrespro.ru>
Date: Fri, 25 Sep 2020 20:09:22 +0300
Subject: Fix tests. Tags: commitfest_hotfix.

Greetings,

Andres Freund

#13Ekaterina Sokolova
e.sokolova@postgrespro.ru
In reply to: Julien Rouhaud (#10)
Re: [PATCH] Add extra statistics to explain for Nested Loop

<rjuju123@gmail.com> wrote:

You should update the explain_parallel_append() plpgsql function
created in that test file to make sure that both "rows" and the two
new counters are changed to "N". There might be other similar changes
needed.

Thank you for watching this issue. I made the necessary changes in tests
following your advice.

Now, for the changes themselves. For the min/max time, you're
aggregating "totaltime - instr->firsttuple". Why removing the startup
time from the loop execution time? I think this should be kept.

I think it's right remark. I fixed it.

Also, in explain.c you display the counters in the "Nested loop" node,
but this should be done in the inner plan node instead, as this is the
one being looped on. So the condition should probably be "nloops > 1"
rather than testing if it's a NestLoop.

Condition "nloops > 1" is not the same as checking if it's NestLoop.
This condition will lead to printing extra statistics for nodes with
different types of join, not only Nested Loop Join. If this statistic is
useful for other plan nodes, it makes sense to change the condition.

<andres@anarazel.de> wrote:

I'm a bit worried that further increasing the size of struct
Instrumentation will increase the overhead of EXPLAIN ANALYZE further -
in some workloads we spend a fair bit of time in code handling that. It
would be good to try to find a few bad cases, and see what the overhead
is.

Thank you for this comment, I will try to figure it out. Do you have
some examples of large overhead dependent on this struct? I think I need
some sample to know which way to think.

Thank you all for the feedback. I hope the new version of my patch will
be more correct and useful.
Please don't hesitate to share any thoughts on this topic!
--
Ekaterina Sokolova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

extra_statistics_v1.patchtext/x-diff; name=extra_statistics_v1.patchDownload+163-52
#14Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Ekaterina Sokolova (#13)
Re: [PATCH] Add extra statistics to explain for Nested Loop

Hello Ekaterina,

seems like an interesting and useful improvement. I did a quick review
of the patch - attached is a 0002 patch with a couple minor changes (the
0001 is just your v1 patch, to keep cfbot happy).

1) There's a couple changes to follow project code style (e.g. brackets
after "if" on a separate line, no brackets around single-line blocks,
etc.). I've reverted some unnecessary whitespace changes. Minor stuff.

2) I don't think InstrEndLoop needs to check if min_t == 0 before
initializing it in the first loop. It certainly has to be 0, no? Same
for min_tuples. I also suggest comment explaining that we don't have to
initialize the max values.

3) In ExplainNode, in the part processing per-worker stats, I think some
of the fields are incorrectly referencing planstate->instrument instead
of using the 'instrument' variable from WorkerInstrumentation.

In general, I agree with Andres this might add overhead to explain
analyze, although I doubt it's going to be measurable. But maybe try
doing some measurements for common and worst-cases.

I wonder if we should have another option EXPLAIN option enabling this.
I.e. by default we'd not collect/print this, and users would have to
pass some option to EXPLAIN. Or maybe we could tie this to VERBOSE?

Also, why print this only for nested loop, and not for all nodes with
(nloops > 1)? I see there was some discussion why checking nodeTag is
necessary to identify NL, but that's not my point.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-extra_statistics_v1.patchtext/plain; charset=us-asciiDownload+163-53
0002-minor-tweaks.patchtext/plain; charset=us-asciiDownload+40-32
#15Georgios Kokolatos
gkokolatos@protonmail.com
In reply to: Tomas Vondra (#14)
Re: [PATCH] Add extra statistics to explain for Nested Loop

Hi,

I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.

Cheers,
//Georgios

The new status of this patch is: Waiting on Author

#16Ekaterina Sokolova
e.sokolova@postgrespro.ru
In reply to: Tomas Vondra (#14)
Re: [PATCH] Add extra statistics to explain for Nested Loop

Tomas Vondra писал 2020-10-31 04:20:

seems like an interesting and useful improvement. I did a quick review
of the patch - attached is a 0002 patch with a couple minor changes
(the
0001 is just your v1 patch, to keep cfbot happy).

Thank you for your review and changes!

3) In ExplainNode, in the part processing per-worker stats, I think
some
of the fields are incorrectly referencing planstate->instrument instead
of using the 'instrument' variable from WorkerInstrumentation.

It's correct behavior because of struct WorkerInstrumentation contains
struct Instrumentation instrument. But planstate->instrument is struct
Instrumentation too.

I wonder if we should have another option EXPLAIN option enabling this.
I.e. by default we'd not collect/print this, and users would have to
pass some option to EXPLAIN. Or maybe we could tie this to VERBOSE?

It's good idea. Now additional statistics are only printed when we set
the VERBOSE.

New version of this patch prints extra statistics for all cases of
multiple loops, not only for Nested Loop. Also I fixed the example by
adding VERBOSE.

Please don't hesitate to share any thoughts on this topic!
--
Ekaterina Sokolova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

extra_statistics_v2.patchtext/x-diff; name=extra_statistics_v2.patchDownload+157-30
example_v1.sqltext/plain; name=example_v1.sqlDownload
#17Michael Christofides
michael@pgmustard.com
In reply to: David G. Johnston (#9)
Re: [PATCH] Add extra statistics to explain for Nested Loop

New version of this patch prints extra statistics for all cases of
multiple loops, not only for Nested Loop. Also I fixed the example by
adding VERBOSE.

Please don't hesitate to share any thoughts on this topic!

Thanks a lot for working on this! I really like the extra details, and
including it only with VERBOSE sounds good.

rows * loops is still an important calculation.

Why not just add total_rows while we are at it - last in the listing?

(actual rows=N loops=N min_rows=N max_rows=N total_rows=N)

This total_rows idea from David would really help us too, especially
in the cases where the actual rows is rounded down to zero. We make an
explain visualisation tool, and it'd be nice to show people a better
total than loops * actual rows. It would also help the accuracy of
some of our tips, that use this number.

Apologies if this input is too late to be helpful.

Cheers,
Michael

#18Yugo Nagata
nagata@sraoss.co.jp
In reply to: Ekaterina Sokolova (#16)
Re: [PATCH] Add extra statistics to explain for Nested Loop

Hello,

On Thu, 12 Nov 2020 23:10:05 +0300
e.sokolova@postgrespro.ru wrote:

New version of this patch prints extra statistics for all cases of
multiple loops, not only for Nested Loop. Also I fixed the example by
adding VERBOSE.

I think this extra statistics seems good because it is useful for DBA
to understand explained plan. I reviewed this patch. Here are a few
comments:

1)
postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2752.00 rows=991 width=8) (actual time=0.021..17.651 rows=991 loops=1)
Output: a.i, b.j
Join Filter: (a.i = b.j)
Rows Removed by Join Filter: 99009
-> Seq Scan on public.b (cost=0.00..2.00 rows=100 width=4) (actual time=0.009..0.023 rows=100 loops=1)
Output: b.j
-> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 min_time=0.065 max_time=0.163 min_rows=1000 rows=1000 max_rows=1000 loops=100)
Output: a.i
Planning Time: 0.066 ms
Execution Time: 17.719 ms
(10 rows)

I don't like this format where the extra statistics appear in the same
line of existing information because the output format differs depended
on whether the plan node's loops > 1 or not. This makes the length of a
line too long. Also, other information reported by VERBOSE doesn't change
the exiting row format and just add extra rows for new information.

Instead, it seems good for me to add extra rows for the new statistics
without changint the existing row format as other VERBOSE information,
like below.

-> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 rows=1000 loops=100)
Output: a.i
Min Time: 0.065 ms
Max Time: 0.163 ms
Min Rows: 1000
Max Rows: 1000

or, like Buffers,

-> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 rows=1000 loops=100)
Output: a.i
Loops: min_time=0.065 max_time=0.163 min_rows=1000 max_rows=1000

and so on. What do you think about it?

2)
In parallel scan, the extra statistics are not reported correctly.

postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..2463.52 rows=991 width=8) (actual time=0.823..25.651 rows=991 loops=1)
Output: a.i, b.j
Workers Planned: 2
Workers Launched: 2
-> Nested Loop (cost=0.00..1364.42 rows=413 width=8) (actual time=9.426..16.723 min_time=0.000 max_time=22.017 min_rows=0 rows=330 max_rows=991 loops=3)
Output: a.i, b.j
Join Filter: (a.i = b.j)
Rows Removed by Join Filter: 33003
Worker 0: actual time=14.689..14.692 rows=0 loops=1
Worker 1: actual time=13.458..13.460 rows=0 loops=1
-> Parallel Seq Scan on public.a (cost=0.00..9.17 rows=417 width=4) (actual time=0.049..0.152 min_time=0.000 max_time=0.202 min_rows=0 rows=333 max_rows=452 loops=3)
Output: a.i
Worker 0: actual time=0.040..0.130 rows=322 loops=1
Worker 1: actual time=0.039..0.125 rows=226 loops=1
-> Seq Scan on public.b (cost=0.00..2.00 rows=100 width=4) (actual time=0.006..0.026 min_time=0.012 max_time=0.066 min_rows=100 rows=100 max_rows=100 loops=1000)
Output: b.j
Worker 0: actual time=0.006..0.024 min_time=0.000 max_time=0.000 min_rows=0 rows=100 max_rows=0 loops=322
Worker 1: actual time=0.008..0.030 min_time=0.000 max_time=0.000 min_rows=0 rows=100 max_rows=0 loops=226
Planning Time: 0.186 ms
Execution Time: 25.838 ms
(20 rows)

This reports max/min rows or time of inner scan as 0 in parallel workers,
and as a result only the leader process's ones are accounted. To fix this,
we would change InstrAggNode as below.

@@ -167,6 +196,10 @@ InstrAggNode(Instrumentation *dst, Instrumentation *add)
        dst->nloops += add->nloops;
        dst->nfiltered1 += add->nfiltered1;
        dst->nfiltered2 += add->nfiltered2;
+       dst->min_t = Min(dst->min_t, add->min_t);
+       dst->max_t = Max(dst->max_t, add->max_t);
+       dst->min_tuples = Min(dst->min_tuples, add->min_tuples);
+       dst->max_tuples = Max(dst->max_tuples, add->max_tuples);

3)
There are garbage lines and I could not apply this patch.

diff --git a/src/test/regress/expected/timetz.out b/src/test/regress/expected/timetz.out
index 038bb5fa094..5294179aa45 100644

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata@sraoss.co.jp>

#19Julien Rouhaud
rjuju123@gmail.com
In reply to: Yugo Nagata (#18)
Re: [PATCH] Add extra statistics to explain for Nested Loop

On Thu, Jan 28, 2021 at 8:38 PM Yugo NAGATA <nagata@sraoss.co.jp> wrote:

postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2752.00 rows=991 width=8) (actual time=0.021..17.651 rows=991 loops=1)
Output: a.i, b.j
Join Filter: (a.i = b.j)
Rows Removed by Join Filter: 99009
-> Seq Scan on public.b (cost=0.00..2.00 rows=100 width=4) (actual time=0.009..0.023 rows=100 loops=1)
Output: b.j
-> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 min_time=0.065 max_time=0.163 min_rows=1000 rows=1000 max_rows=1000 loops=100)
Output: a.i
Planning Time: 0.066 ms
Execution Time: 17.719 ms
(10 rows)

I don't like this format where the extra statistics appear in the same
line of existing information because the output format differs depended
on whether the plan node's loops > 1 or not. This makes the length of a
line too long. Also, other information reported by VERBOSE doesn't change
the exiting row format and just add extra rows for new information.

Instead, it seems good for me to add extra rows for the new statistics
without changint the existing row format as other VERBOSE information,
like below.

-> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 rows=1000 loops=100)
Output: a.i
Min Time: 0.065 ms
Max Time: 0.163 ms
Min Rows: 1000
Max Rows: 1000

or, like Buffers,

-> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 rows=1000 loops=100)
Output: a.i
Loops: min_time=0.065 max_time=0.163 min_rows=1000 max_rows=1000

and so on. What do you think about it?

It's true that the current output is a bit long, which isn't really
convenient to read. Using one of those alternative format would also
have the advantage of not breaking compatibility with tools that
process those entries. I personally prefer the 2nd option with the
extra "Loops:" line . For non text format, should we keep the current
format?

2)
In parallel scan, the extra statistics are not reported correctly.

postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..2463.52 rows=991 width=8) (actual time=0.823..25.651 rows=991 loops=1)
Output: a.i, b.j
Workers Planned: 2
Workers Launched: 2
-> Nested Loop (cost=0.00..1364.42 rows=413 width=8) (actual time=9.426..16.723 min_time=0.000 max_time=22.017 min_rows=0 rows=330 max_rows=991 loops=3)
Output: a.i, b.j
Join Filter: (a.i = b.j)
Rows Removed by Join Filter: 33003
Worker 0: actual time=14.689..14.692 rows=0 loops=1
Worker 1: actual time=13.458..13.460 rows=0 loops=1
-> Parallel Seq Scan on public.a (cost=0.00..9.17 rows=417 width=4) (actual time=0.049..0.152 min_time=0.000 max_time=0.202 min_rows=0 rows=333 max_rows=452 loops=3)
Output: a.i
Worker 0: actual time=0.040..0.130 rows=322 loops=1
Worker 1: actual time=0.039..0.125 rows=226 loops=1
-> Seq Scan on public.b (cost=0.00..2.00 rows=100 width=4) (actual time=0.006..0.026 min_time=0.012 max_time=0.066 min_rows=100 rows=100 max_rows=100 loops=1000)
Output: b.j
Worker 0: actual time=0.006..0.024 min_time=0.000 max_time=0.000 min_rows=0 rows=100 max_rows=0 loops=322
Worker 1: actual time=0.008..0.030 min_time=0.000 max_time=0.000 min_rows=0 rows=100 max_rows=0 loops=226
Planning Time: 0.186 ms
Execution Time: 25.838 ms
(20 rows)

This reports max/min rows or time of inner scan as 0 in parallel workers,
and as a result only the leader process's ones are accounted. To fix this,
we would change InstrAggNode as below.

@@ -167,6 +196,10 @@ InstrAggNode(Instrumentation *dst, Instrumentation *add)
dst->nloops += add->nloops;
dst->nfiltered1 += add->nfiltered1;
dst->nfiltered2 += add->nfiltered2;
+       dst->min_t = Min(dst->min_t, add->min_t);
+       dst->max_t = Max(dst->max_t, add->max_t);
+       dst->min_tuples = Min(dst->min_tuples, add->min_tuples);
+       dst->max_tuples = Max(dst->max_tuples, add->max_tuples);

Agreed.

3)
There are garbage lines and I could not apply this patch.

diff --git a/src/test/regress/expected/timetz.out b/src/test/regress/expected/timetz.out
index 038bb5fa094..5294179aa45 100644

I also switch the patch to "waiting on author" on the commit fest app.

#20Yugo Nagata
nagata@sraoss.co.jp
In reply to: Julien Rouhaud (#19)
Re: [PATCH] Add extra statistics to explain for Nested Loop

On Mon, 1 Feb 2021 13:28:45 +0800
Julien Rouhaud <rjuju123@gmail.com> wrote:

On Thu, Jan 28, 2021 at 8:38 PM Yugo NAGATA <nagata@sraoss.co.jp> wrote:

postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2752.00 rows=991 width=8) (actual time=0.021..17.651 rows=991 loops=1)
Output: a.i, b.j
Join Filter: (a.i = b.j)
Rows Removed by Join Filter: 99009
-> Seq Scan on public.b (cost=0.00..2.00 rows=100 width=4) (actual time=0.009..0.023 rows=100 loops=1)
Output: b.j
-> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 min_time=0.065 max_time=0.163 min_rows=1000 rows=1000 max_rows=1000 loops=100)
Output: a.i
Planning Time: 0.066 ms
Execution Time: 17.719 ms
(10 rows)

I don't like this format where the extra statistics appear in the same
line of existing information because the output format differs depended
on whether the plan node's loops > 1 or not. This makes the length of a
line too long. Also, other information reported by VERBOSE doesn't change
the exiting row format and just add extra rows for new information.

Instead, it seems good for me to add extra rows for the new statistics
without changint the existing row format as other VERBOSE information,
like below.

-> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 rows=1000 loops=100)
Output: a.i
Min Time: 0.065 ms
Max Time: 0.163 ms
Min Rows: 1000
Max Rows: 1000

or, like Buffers,

-> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 rows=1000 loops=100)
Output: a.i
Loops: min_time=0.065 max_time=0.163 min_rows=1000 max_rows=1000

and so on. What do you think about it?

It's true that the current output is a bit long, which isn't really
convenient to read. Using one of those alternative format would also
have the advantage of not breaking compatibility with tools that
process those entries. I personally prefer the 2nd option with the
extra "Loops:" line . For non text format, should we keep the current
format?

For non text format, I think "Max/Min Rows", "Max/Min Times" are a bit
simple and the meaning is unclear. Instead, similar to a style of "Buffers",
does it make sense using "Max/Min Rows in Loops" and "Max/Min Times in Loops"?

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata@sraoss.co.jp>

#21Ekaterina Sokolova
e.sokolova@postgrespro.ru
In reply to: Julien Rouhaud (#19)
#22Justin Pryzby
pryzby@telsasoft.com
In reply to: Ekaterina Sokolova (#21)
#23Ekaterina Sokolova
e.sokolova@postgrespro.ru
In reply to: Justin Pryzby (#22)
#24vignesh C
vignesh21@gmail.com
In reply to: Ekaterina Sokolova (#23)
#25Ekaterina Sokolova
e.sokolova@postgrespro.ru
In reply to: vignesh C (#24)
#26Justin Pryzby
pryzby@telsasoft.com
In reply to: Ekaterina Sokolova (#23)
#27Lukas Fittl
lukas@fittl.com
In reply to: Justin Pryzby (#26)
#28Ekaterina Sokolova
e.sokolova@postgrespro.ru
In reply to: Justin Pryzby (#26)
#29Julien Rouhaud
rjuju123@gmail.com
In reply to: Ekaterina Sokolova (#28)
#30Bruce Momjian
bruce@momjian.us
In reply to: Julien Rouhaud (#29)
#31Justin Pryzby
pryzby@telsasoft.com
In reply to: Bruce Momjian (#30)
#32Julien Rouhaud
rjuju123@gmail.com
In reply to: Bruce Momjian (#30)
#33Justin Pryzby
pryzby@telsasoft.com
In reply to: Julien Rouhaud (#32)