V18 change on EXPLAIN ANALYZE

Started by Marcos Pegoraro4 months ago6 messages
#1Marcos Pegoraro
marcos@f10.com.br

Robert Hass committed ddb17e387aa28d61521227377b00f997756b8a27,
which changed how EXPLAIN indicates fractional rows. That's cool but I have
to see that change on sources because it's not explained on DOCs.

One can understand that cost=4.36, that actual time=0.009, but what means
rows=10.43 ? Costs and Time having decimals are fine but what is a row and
a half ?

So, I don't know exactly how to explain this change but I think it would be
good to have a new paragraph on [1]https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-ANALYZE and explain that now when nloops > 1,
we now display two digits after the decimal point, rather than none.

[1]: https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-ANALYZE
https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-ANALYZE

regards
Marcos

#2Maciek Sakrejda
m.sakrejda@gmail.com
In reply to: Marcos Pegoraro (#1)
Re: V18 change on EXPLAIN ANALYZE

On Fri, Sep 26, 2025 at 1:34 PM Marcos Pegoraro <marcos@f10.com.br> wrote:

Robert Hass committed ddb17e387aa28d61521227377b00f997756b8a27, which changed how EXPLAIN indicates fractional rows. That's cool but I have to see that change on sources because it's not explained on DOCs.

One can understand that cost=4.36, that actual time=0.009, but what means rows=10.43 ? Costs and Time having decimals are fine but what is a row and a half ?

So, I don't know exactly how to explain this change but I think it would be good to have a new paragraph on [1] and explain that now when nloops > 1, we now display two digits after the decimal point, rather than none.

[1] - https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-ANALYZE

The page you link says

In some query plans, it is possible for a subplan node to be
executed more than once. For example, the inner index scan will be
executed once per outer row in the above nested-loop plan. In such
cases, the loops value reports the total number of executions of the
node, and the actual time and rows values shown are averages
per-execution. This is done to make the numbers comparable with the
way that the cost estimates are shown. Multiply by the loops value to
get the total time actually spent in the node. In the above example,
we spent a total of 0.030 milliseconds executing the index scans on
tenk2.

in the second paragraph after the example in this section. Do you
think that's not sufficiently clear?

Thanks,
Maciek

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Maciek Sakrejda (#2)
Re: V18 change on EXPLAIN ANALYZE

Maciek Sakrejda <m.sakrejda@gmail.com> writes:

The page you link says

In some query plans, it is possible for a subplan node to be
executed more than once. For example, the inner index scan will be
executed once per outer row in the above nested-loop plan. In such
cases, the loops value reports the total number of executions of the
node, and the actual time and rows values shown are averages
per-execution. This is done to make the numbers comparable with the
way that the cost estimates are shown. Multiply by the loops value to
get the total time actually spent in the node. In the above example,
we spent a total of 0.030 milliseconds executing the index scans on
tenk2.

in the second paragraph after the example in this section. Do you
think that's not sufficiently clear?

It's not wrong, but it feels a little incomplete now. Maybe change
the last two sentences to

Multiply by the loops value to get the total time actually spent in
the node and the total number of rows processed by the node across all
executions. In the above example, we spent a total of 0.030
milliseconds executing the index scans on tenk2, and they handled a
total of 10 rows.

A bigger gap in perform.sgml is that it doesn't address parallel
query cases at all AFAICS. I think that was one of the main drivers
of this change, so it feels a little sad that it's not covered here.

regards, tom lane

#4Maciek Sakrejda
m.sakrejda@gmail.com
In reply to: Tom Lane (#3)
1 attachment(s)
Re: V18 change on EXPLAIN ANALYZE

On Fri, Sep 26, 2025 at 2:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Maciek Sakrejda <m.sakrejda@gmail.com> writes:

The page you link says

In some query plans, it is possible for a subplan node to be
executed more than once. For example, the inner index scan will be
executed once per outer row in the above nested-loop plan. In such
cases, the loops value reports the total number of executions of the
node, and the actual time and rows values shown are averages
per-execution. This is done to make the numbers comparable with the
way that the cost estimates are shown. Multiply by the loops value to
get the total time actually spent in the node. In the above example,
we spent a total of 0.030 milliseconds executing the index scans on
tenk2.

in the second paragraph after the example in this section. Do you
think that's not sufficiently clear?

It's not wrong, but it feels a little incomplete now. Maybe change
the last two sentences to

Multiply by the loops value to get the total time actually spent in
the node and the total number of rows processed by the node across all
executions. In the above example, we spent a total of 0.030
milliseconds executing the index scans on tenk2, and they handled a
total of 10 rows.

A bigger gap in perform.sgml is that it doesn't address parallel
query cases at all AFAICS. I think that was one of the main drivers
of this change, so it feels a little sad that it's not covered here.

Fair point. I included your proposed change and took a stab at briefly
covering parallelism in the attached (admittedly, my understanding of
how that works is a little shaky, so apologies if I'm way off on some
of this).

However, to get a parallel query in the regression database (I chose
EXPLAIN ANALYZE SELECT * FROM tenk2), I had to change some settings:

SET min_parallel_table_scan_size = 0;
SET parallel_tuple_cost = 0;
SET parallel_setup_cost = 0;

Should I mention that in the example? Or should I generate a bigger
table so using these is not necessary? If we say nothing and use the
example, I think it may be confusing if someone wants to use the
example as a starting point for their own exploration of how this
works. Or is there a better query that works out of the box and does
not need changes to the settings?

It also seems like the EXPLAIN ANALYZE section is getting a little
unwieldy. Should we subdivide it, or is this still okay?

Thanks,
Maciek

Attachments:

v1-0001-Improve-EXPLAIN-docs.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Improve-EXPLAIN-docs.patchDownload
From 1c777b07ff1c085884f273dfd40290a06921e224 Mon Sep 17 00:00:00 2001
From: Maciek Sakrejda <m.sakrejda@gmail.com>
Date: Fri, 26 Sep 2025 17:17:41 -0700
Subject: [PATCH v1] Improve EXPLAIN docs

 - clarify fractional rows in loops
 - add some basic info on parallel execution
---
 doc/src/sgml/perform.sgml | 80 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 76 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 106583fb296..82b71571b86 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -756,12 +756,84 @@ WHERE t1.unique1 &lt; 10 AND t1.unique2 = t2.unique2;
     <literal>loops</literal> value reports the
     total number of executions of the node, and the actual time and rows
     values shown are averages per-execution.  This is done to make the numbers
-    comparable with the way that the cost estimates are shown.  Multiply by
-    the <literal>loops</literal> value to get the total time actually spent in
-    the node.  In the above example, we spent a total of 0.030 milliseconds
-    executing the index scans on <literal>tenk2</literal>.
+    comparable with the way that the cost estimates are shown.  Multiply by the
+    <literal>loops</literal> value to get the total time actually spent in the
+    node and the total number of rows processed by the node across all
+    executions. In the above example, we spent a total of 0.030
+    milliseconds executing the index scans on <literal>tenk2</literal>, and
+    they handled a total of 10 rows.
    </para>
 
+   <para>
+     Parallel execution will also cause nodes to be executed more than once.
+     This is also indicated with the <literal>loops</literal> value:
+   </para>
+
+<screen>
+EXPLAIN ANALYZE SELECT * FROM tenk2;
+                                                        QUERY PLAN                                                         
+---------------------------------------------------------------------------------------------------------------------------
+ Gather  (cost=0.00..386.67 rows=10000 width=244) (actual time=1.291..19.308 rows=10000.00 loops=1)
+   Workers Planned: 2
+   Workers Launched: 2
+   Buffers: shared hit=345
+   ->  Parallel Seq Scan on tenk2  (cost=0.00..386.67 rows=4167 width=244) (actual time=0.006..0.556 rows=3333.33 loops=3)
+         Buffers: shared hit=345
+ Planning:
+   Buffers: shared hit=126
+ Planning Time: 1.375 ms
+ Execution Time: 20.214 ms
+(10 rows)
+</screen>
+
+  <para>
+    The sequential scan was executed three separate times in parallel: once
+    in the leader (since <xref linkend="guc-parallel-leader-participation"/>
+    is on by default), and once in each of the two launched workers.  Similarly
+    to sequential repeated executions, rows and actual time are averages
+    per-worker.  Multiply by the <literal>loops</literal> value to get the
+    total number of rows processed by the node across all workers.  The total
+    time spent in all workers can be similarly calculated, but note this time
+    is spent concurrently, so it is not equivalent to total time spent in the
+    node.
+  </para>
+
+  <para>
+    More detailed information about parallel execution is available with the
+    VERBOSE option:
+  </para>
+
+<screen>
+EXPLAIN (ANALYZE, VERBOSE) SELECT * FROM tenk2;
+                                                                      QUERY PLAN                                                                       
+-------------------------------------------------------------------------------------------------------------------------------------------------------
+ Gather  (cost=0.00..386.67 rows=10000 width=244) (actual time=0.901..8.389 rows=10000.00 loops=1)
+   Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4
+   Workers Planned: 2
+   Workers Launched: 2
+   Buffers: shared hit=345
+   ->  Parallel Seq Scan on public.tenk2  (cost=0.00..386.67 rows=4167 width=244) (actual time=0.011..0.669 rows=3333.33 loops=3)
+         Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4
+         Buffers: shared hit=345
+         Worker 0:  actual time=0.009..0.071 rows=517.00 loops=1
+           Buffers: shared hit=18
+         Worker 1:  actual time=0.009..0.062 rows=435.00 loops=1
+           Buffers: shared hit=15
+ Planning Time: 0.187 ms
+ Execution Time: 9.524 ms
+(14 rows)
+</screen>
+
+  <para>
+    This shows actual time, rows, loops, and buffer usage for each worker.
+    When <xref linkend="guc-parallel-leader-participation"/> is on, the
+    corresponding stats for the leader can be calculated by subtracting the
+    sum of the workers' stats from the total. In the above example, the leader
+    spent 1.874 milliseconds (0.669 * 3 - 0.071 - 0.062) executing its portion
+    of the work and processed 9048 rows (3333.33 * 3 - 517.00 - 435.00, rounded
+    up).
+  </para>
+
    <para>
     In some cases <command>EXPLAIN ANALYZE</command> shows additional execution
     statistics beyond the plan node execution times and row counts.
-- 
2.43.0

#5Ilia Evdokimov
ilya.evdokimov@tantorlabs.com
In reply to: Maciek Sakrejda (#4)
Re: V18 change on EXPLAIN ANALYZE

Hi hackers,

On 27.09.2025 03:31, Maciek Sakrejda wrote:

However, to get a parallel query in the regression database (I chose
EXPLAIN ANALYZE SELECT * FROM tenk2), I had to change some settings:

SET min_parallel_table_scan_size = 0;
SET parallel_tuple_cost = 0;
SET parallel_setup_cost = 0;

Should I mention that in the example? Or should I generate a bigger
table so using these is not necessary? If we say nothing and use the
example, I think it may be confusing if someone wants to use the
example as a starting point for their own exploration of how this
works. Or is there a better query that works out of the box and does
not need changes to the settings?

It also seems like the EXPLAIN ANALYZE section is getting a little
unwieldy. Should we subdivide it, or is this still okay?

Thanks for noticing the documentation gap regarding parallel plans.

1. I think the mention of VERBOSE might be unnecessary, since this is
already covered in parallel.sgml, section 'Parallel Plan Tips'. That
section explicitly says that EXPLAIN (ANALYZE, VERBOSE) shows per-worker
statistics.

2. Instead of introducing another query, why not reuse the one already
shown earlier in the same section, just with the GUCs adjusted to make
it parallel? For example:

SET min_parallel_table_scan_size = 0;
SET parallel_tuple_cost = 0;
SET parallel_setup_cost = 0;

EXPLAIN ANALYZE
SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=4.65..70.96 rows=10 width=488) (actual time=1.670..6.246
rows=10.00 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=78 read=6
   ->  Nested Loop  (cost=4.65..70.96 rows=4 width=488) (actual
time=0.218..0.277 rows=3.33 loops=3)
         Buffers: shared hit=78 read=6
         ->  Parallel Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.31
rows=4 width=244) (actual time=0.195..0.202 rows=3.33 loops=3)
               Recheck Cond: (unique1 < 10)
               Heap Blocks: exact=10
               Buffers: shared hit=54
               ->  Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36
rows=10 width=0) (actual time=0.449..0.450 rows=10.00 loops=1)
                     Index Cond: (unique1 < 10)
                     Index Searches: 1
                     Buffers: shared hit=2
         ->  Index Scan using tenk2_unique2 on tenk2 t2
(cost=0.29..7.90 rows=1 width=244) (actual time=0.020..0.020 rows=1.00
loops=10)
               Index Cond: (unique2 = t1.unique2)
               Index Searches: 10
               Buffers: shared hit=24 read=6
 Planning:
   Buffers: shared hit=141 read=3
 Planning Time: 0.519 ms
 Execution Time: 6.302 ms
(22 rows)

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/

#6Maciek Sakrejda
maciek@pganalyze.com
In reply to: Ilia Evdokimov (#5)
Re: V18 change on EXPLAIN ANALYZE

Thanks for the feedback!

On Thu, Oct 30, 2025 at 7:35 AM Ilia Evdokimov
<ilya.evdokimov@tantorlabs.com> wrote:

1. I think the mention of VERBOSE might be unnecessary, since this is already covered in parallel.sgml, section 'Parallel Plan Tips'. That section explicitly says that EXPLAIN (ANALYZE, VERBOSE) shows per-worker statistics.

Okay. I think the EXPLAIN docs are a better place for that info, but
if that's the consensus, I can update the patch.

2. Instead of introducing another query, why not reuse the one already shown earlier in the same section, just with the GUCs adjusted to make it parallel?

Sure, I can do that. I thought a more concise example would make it
clearer to see the parallelism-related bits.

And to answer my own question, I noticed another example has `SET
enable_seqscan TO off;`, so I'll add the GUCs.

Thanks,
Maciek