Add parallel columns for pg_stat_statements
Hello,
This patch was a bit discussed on [1], and with more details on [2]. It's
based on another patch sent in 2022 (see [3]). It introduces seven new
columns in pg_stat_statements:
* parallelized_queries_planned, number of times the query has been planned
to be parallelized,
* parallelized_queries_launched, number of times the query has been
executed with parallelization,
* parallelized_workers_planned, number of parallel workers planned for
this query,
* parallelized_workers_launched, number of parallel workers executed for
this query,
* parallelized_nodes, number of parallelized nodes,
* parallelized_nodes_all_workers, number of parallelized nodes which had
all requested workers,
* parallelized_nodes_no_worker, number of parallelized nodes which had no
requested workers.
As Benoit said yesterday, the intent is to help administrators evaluate the
usage of parallel workers in their databases and help them configure
parallelism.
A test script (test2.sql) is attached. You can execute it with "psql -Xef
test2.sql your_database" (your_database should not contain a t1 table as it
will be dropped and recreated).
Here is its result, a bit commented:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE EXTENSION
SELECT pg_stat_statements_reset();
pg_stat_statements_reset
-------------------------------
2024-08-29 18:00:35.314557+02
(1 row)
DROP TABLE IF EXISTS t1;
DROP TABLE
CREATE TABLE t1 (id integer);
CREATE TABLE
INSERT INTO t1 SELECT generate_series(1, 10_000_000);
INSERT 0 10000000
VACUUM ANALYZE t1;
VACUUM
SELECT query,
parallelized_queries_planned, parallelized_queries_launched,
parallelized_workers_planned, parallelized_workers_launched,
parallelized_nodes, parallelized_nodes_all_workers,
parallelized_nodes_no_worker
FROM pg_stat_statements
WHERE query LIKE 'SELECT%t1%'
(0 rows)
SELECT * FROM t1 LIMIT 1;
id
----
1
(1 row)
SELECT pg_sleep(1);
SELECT query,
parallelized_queries_planned, parallelized_queries_launched,
parallelized_workers_planned, parallelized_workers_launched,
parallelized_nodes, parallelized_nodes_all_workers,
parallelized_nodes_no_worker
FROM pg_stat_statements
WHERE query LIKE 'SELECT%t1%'
-[ RECORD 1 ]------------------+--------------------------
query | SELECT * FROM t1 LIMIT $1
parallelized_queries_planned | 0
parallelized_queries_launched | 0
parallelized_workers_planned | 0
parallelized_workers_launched | 0
parallelized_nodes | 0
parallelized_nodes_all_workers | 0
parallelized_nodes_no_worker | 0
==> no parallelization
SELECT count(*) FROM t1;
count
----------
10000000
(1 row)
SELECT pg_sleep(1);
SELECT query,
parallelized_queries_planned, parallelized_queries_launched,
parallelized_workers_planned, parallelized_workers_launched,
parallelized_nodes, parallelized_nodes_all_workers,
parallelized_nodes_no_worker
FROM pg_stat_statements
WHERE query LIKE 'SELECT%t1%'
-[ RECORD 1 ]------------------+--------------------------
query | SELECT count(*) FROM t1
parallelized_queries_planned | 1
parallelized_queries_launched | 1
parallelized_workers_planned | 2
parallelized_workers_launched | 2
parallelized_nodes | 1
parallelized_nodes_all_workers | 1
parallelized_nodes_no_worker | 0
-[ RECORD 2 ]------------------+--------------------------
query | SELECT * FROM t1 LIMIT $1
parallelized_queries_planned | 0
parallelized_queries_launched | 0
parallelized_workers_planned | 0
parallelized_workers_launched | 0
parallelized_nodes | 0
parallelized_nodes_all_workers | 0
parallelized_nodes_no_worker | 0
==> one parallelized query
==> I have the default configuration, so 2 for
max_parallel_workers_per_gather
==> hence two workers, with one node with all workers
SET max_parallel_workers_per_gather TO 5;
SET
SELECT count(*) FROM t1;
count
----------
10000000
(1 row)
SELECT pg_sleep(1);
SELECT query,
parallelized_queries_planned, parallelized_queries_launched,
parallelized_workers_planned, parallelized_workers_launched,
parallelized_nodes, parallelized_nodes_all_workers,
parallelized_nodes_no_worker
FROM pg_stat_statements
WHERE query LIKE 'SELECT%t1%'
-[ RECORD 1 ]------------------+--------------------------
query | SELECT count(*) FROM t1
parallelized_queries_planned | 2
parallelized_queries_launched | 2
parallelized_workers_planned | 6
parallelized_workers_launched | 6
parallelized_nodes | 2
parallelized_nodes_all_workers | 2
parallelized_nodes_no_worker | 0
-[ RECORD 2 ]------------------+--------------------------
query | SELECT * FROM t1 LIMIT $1
parallelized_queries_planned | 0
parallelized_queries_launched | 0
parallelized_workers_planned | 0
parallelized_workers_launched | 0
parallelized_nodes | 0
parallelized_nodes_all_workers | 0
parallelized_nodes_no_worker | 0
==> another parallelized query
==> with 5 as max_parallel_workers_per_gather, but only 4 workers to use
==> hence four workers, with one node with all workers
The biggest issue with this patch is that it's unable to track workers for
maintenance queries (CREATE INDEX for btree indexes, and VACUUM).
Documentation is done, tests are missing. Once there's an agreement on this
patch, we'll work on the tests.
This has been a collective work with Benoit Lobréau, Jehan-Guillaume de
Rorthais, and Franck Boudehen.
Thanks.
Regards.
[1]: /messages/by-id/b4220d15-2e21-0e98-921b-b9892543cc93@dalibo.com
[2]: /messages/by-id/d657df20-c4bf-63f6-e74c-cb85a81d0383@dalibo.com
[3]: /messages/by-id/6acbe570-068e-bd8e-95d5-00c737b865e8@gmail.com
--
Guillaume.
On Thu, Aug 29, 2024 at 10:08:23PM +0200, Guillaume Lelarge wrote:
This patch was a bit discussed on [1], and with more details on [2]. It's
based on another patch sent in 2022 (see [3]). It introduces seven new
columns in pg_stat_statements:
* parallelized_queries_planned, number of times the query has been planned
to be parallelized,
* parallelized_queries_launched, number of times the query has been
executed with parallelization,
Comparing the numbers of workers planned and launched with the number
of times a query has been called and planned should provide a rather
good equivalent, no? I am not sure that these two are mandatory to
have.
* parallelized_workers_planned, number of parallel workers planned for
this query,
* parallelized_workers_launched, number of parallel workers executed for
this query,
Yep. Definitely OK with these two. There is an overlap with what
Benoit has sent here when it comes to publish this data to the
executor state:
/messages/by-id/783bc7f7-659a-42fa-99dd-ee0565644e25@dalibo.com
* parallelized_nodes, number of parallelized nodes,
* parallelized_nodes_all_workers, number of parallelized nodes which had
all requested workers,
* parallelized_nodes_no_worker, number of parallelized nodes which had no
requested workers.
I can see why you want to register this extra data on a node-basis,
but how does that help when it comes to tune the parallel GUCs? We
cannot control them at node level and the launched/planned ratio
offers an equivalent of that. Not exactly, but that's enough to get a
picture if there is a drought.
A test script (test2.sql) is attached. You can execute it with "psql -Xef
test2.sql your_database" (your_database should not contain a t1 table as it
will be dropped and recreated).
Let's add proper regression tests instead, including
oldextversions.sql as this bumps the version of the module. See for
example the tests of 6fd5071909a2 that can force workers to spawn
for BRIN and btree queries, validating some of the stats published
here.
--
Michael
Hi Michael,
Le jeu. 3 oct. 2024 à 09:15, Michael Paquier <michael@paquier.xyz> a écrit :
On Thu, Aug 29, 2024 at 10:08:23PM +0200, Guillaume Lelarge wrote:
This patch was a bit discussed on [1], and with more details on [2]. It's
based on another patch sent in 2022 (see [3]). It introduces seven new
columns in pg_stat_statements:
* parallelized_queries_planned, number of times the query has been planned
to be parallelized,
* parallelized_queries_launched, number of times the query has been
executed with parallelization,

Comparing the numbers of workers planned and launched with the number
of times a query has been called and planned should provide a rather
good equivalent, no? I am not sure that these two are mandatory to
have.
I'm not sure I follow. That would mean that every time a query is executed,
it always gets the same amount of workers. Which is not guaranteed to be
true.
I would agree, though, that parallelized_queries_launched is probably not
that interesting. I could get rid of it if you think it should go away.
* parallelized_workers_planned, number of parallel workers planned for
this query,
* parallelized_workers_launched, number of parallel workers executed for
this query,

Yep. Definitely OK with these two. There is an overlap with what
Benoit has sent here when it comes to publish this data to the
executor state:
/messages/by-id/783bc7f7-659a-42fa-99dd-ee0565644e25@dalibo.com
Well, I don't see this as an overlap. Rather more information.
* parallelized_nodes, number of parallelized nodes,
* parallelized_nodes_all_workers, number of parallelized nodes which had
all requested workers,
* parallelized_nodes_no_worker, number of parallelized nodes which had no
requested workers.
I can see why you want to register this extra data on a node-basis,
but how does that help when it comes to tune the parallel GUCs? We
cannot control them at node level and the launched/planned ratio
offers an equivalent of that. Not exactly, but that's enough to get a
picture if there is a drought.
On this, I would agree with you. They are not that particularly useful to
get better setting for parallel GUCs. I can drop them if you want.
A test script (test2.sql) is attached. You can execute it with "psql -Xef
test2.sql your_database" (your_database should not contain a t1 table as it
will be dropped and recreated).
Let's add proper regression tests instead, including
oldextversions.sql as this bumps the version of the module. See for
example the tests of 6fd5071909a2 that can force workers to spawn
for BRIN and btree queries, validating some of the stats published
here.
Did this on the v2 version of the patch (attached here).
Thanks for your review. If you want the parallelized_queries_launched
column and the parallelized_nodes_* columns dropped, I can do that on a v3
patch.
Regards.
--
Guillaume.
Attachments:
v2-0001-Add-parallel-columns-to-pg_stat_statements.patch (text/x-patch)
On Sun, Oct 06, 2024 at 03:32:02PM +0200, Guillaume Lelarge wrote:
I'm not sure I follow. That would mean that every time a query is executed,
it always gets the same amount of workers. Which is not guaranteed to be
true.

I would agree, though, that parallelized_queries_launched is probably not
that interesting. I could get rid of it if you think it should go away.
My point is that these stats are useful to know which action may have
to be taken when reaching a mean, and numbers in pg_stat_statements
offer hints that something is going wrong and that a closer lookup at
an EXPLAIN plan may be required, particularly if the total number of
workers planned and launched aggregated in the counters is unbalanced
across queries. If the planned/launched ratio is balanced across most
queries, a GUC adjustment may be OK. If the ratio is very
unbalanced in a lower set of queries, I'd also look at tweaking GUCs
instead like the per_gather. These counters give information that one
or the other may be required.
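Concretely, the kind of imbalance described here could be spotted with
something like the following (a sketch, assuming the column names from the
v2 patch):

```sql
-- Queries where the launched/planned worker ratio is most unbalanced.
SELECT query, calls,
       parallelized_workers_planned AS planned,
       parallelized_workers_launched AS launched
FROM pg_stat_statements
WHERE parallelized_workers_planned > parallelized_workers_launched
ORDER BY parallelized_workers_planned - parallelized_workers_launched DESC
LIMIT 10;
```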
Well, I don't see this as an overlap. Rather more information.
Later versions of Benoit's patch have been accumulating this data in
the executor state. v4 posted at [1] has the following diffs:
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -724,6 +724,9 @@ typedef struct EState
*/
List *es_insert_pending_result_relations;
List *es_insert_pending_modifytables;
+
+ int es_workers_launched;
+ int es_workers_planned;
} EState;
Your v2 posted on this thread has that:
@@ -707,6 +707,12 @@ typedef struct EState
struct EPQState *es_epq_active;
bool es_use_parallel_mode; /* can we use parallel workers? */
+ bool es_used_parallel_mode; /* was executed in parallel */
+ int es_parallelized_workers_launched;
+ int es_parallelized_workers_planned;
+ int es_parallelized_nodes; /* # of parallelized nodes */
+ int es_parallelized_nodes_all_workers; /* # of nodes with all workers launched */
+ int es_parallelized_nodes_no_worker; /* # of nodes with no workers launched */
es_parallelized_workers_launched and es_workers_launched are the same
thing in both.
On this, I would agree with you. They are not that particularly useful to
get better setting for parallel GUCs. I can drop them if you want.
Yep. I would remove them for now. This leads to more bloat.
Did this on the v2 version of the patch (attached here).
Thanks for your review. If you want the parallelized_queries_launched
column and the parallelized_nodes_* columns dropped, I can do that on a v3
patch.
I'd recommend to split that into more independent patches:
- Introduce the two counters in EState with the incrementations done
in nodeGatherMerge.c and nodeGather.c (mentioned that at [2], you may
want to coordinate with Benoit to avoid duplicating the work).
- Expand pg_stat_statements to use them for DMLs, SELECTs, well where
they matter.
- Look at expanding that for utilities that can do parallel jobs:
CREATE INDEX and VACUUM, but this has lower priority to me, and this
can reuse the same counters as the ones added by patch 2.
[1]: /messages/by-id/6ecad3ad-835c-486c-9ebd-da87a9a97634@dalibo.com
[2]: /messages/by-id/Zv46wTMjLTuu2t9J@paquier.xyz
--
Michael
Le lun. 7 oct. 2024 à 02:18, Michael Paquier <michael@paquier.xyz> a écrit :
On Sun, Oct 06, 2024 at 03:32:02PM +0200, Guillaume Lelarge wrote:
I'm not sure I follow. That would mean that every time a query is executed,
it always gets the same amount of workers. Which is not guaranteed to be
true.

I would agree, though, that parallelized_queries_launched is probably not
that interesting. I could get rid of it if you think it should go away.

My point is that these stats are useful to know which action may have
to be taken when reaching a mean, and numbers in pg_stat_statements
offer hints that something is going wrong and that a closer lookup at
an EXPLAIN plan may be required, particularly if the total number of
workers planned and launched aggregated in the counters is unbalanced
across queries. If the planned/launched ratio is balanced across most
queries, a GUC adjustment may be OK. If the ratio is very
unbalanced in a lower set of queries, I'd also look at tweaking GUCs
instead like the per_gather. These counters give information that one
or the other may be required.

Well, I don't see this as an overlap. Rather more information.

Later versions of Benoit's patch have been accumulating this data in
the executor state. v4 posted at [1] has the following diffs:

--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -724,6 +724,9 @@ typedef struct EState
*/
List *es_insert_pending_result_relations;
List *es_insert_pending_modifytables;
+
+ int es_workers_launched;
+ int es_workers_planned;
} EState;

Your v2 posted on this thread has that:
@@ -707,6 +707,12 @@ typedef struct EState
struct EPQState *es_epq_active;

bool es_use_parallel_mode; /* can we use parallel workers? */
+ bool es_used_parallel_mode; /* was executed in parallel */
+ int es_parallelized_workers_launched;
+ int es_parallelized_workers_planned;
+ int es_parallelized_nodes; /* # of parallelized nodes */
+ int es_parallelized_nodes_all_workers; /* # of nodes with all workers launched */
+ int es_parallelized_nodes_no_worker; /* # of nodes with no workers launched */

es_parallelized_workers_launched and es_workers_launched are the same
thing in both.
My bad. I agree this is the way to go. See patch v3-0001 attached.
On this, I would agree with you. They are not that particularly useful to
get better setting for parallel GUCs. I can drop them if you want.

Yep. I would remove them for now. This leads to more bloat.
Done. See patch v3-0002 attached.
Did this on the v2 version of the patch (attached here).
Thanks for your review. If you want the parallelized_queries_launched
column and the parallelized_nodes_* columns dropped, I can do that on a v3
patch.
I'd recommend to split that into more independent patches:
- Introduce the two counters in EState with the incrementations done
in nodeGatherMerge.c and nodeGather.c (mentioned that at [2], you may
want to coordinate with Benoit to avoid duplicating the work).
- Expand pg_stat_statements to use them for DMLs, SELECTs, well where
they matter.
- Look at expanding that for utilities that can do parallel jobs:
CREATE INDEX and VACUUM, but this has lower priority to me, and this
can reuse the same counters as the ones added by patch 2.
The first two are done. The last one is beyond my scope.
I'm now working on Benoit's patch to make it work with my v3-0001 patch.
I'll send the resulting patch on his thread.
[1]: /messages/by-id/6ecad3ad-835c-486c-9ebd-da87a9a97634@dalibo.com
[2]: /messages/by-id/Zv46wTMjLTuu2t9J@paquier.xyz
--
Michael
Regards.
--
Guillaume.
Attachments:
v3-0001-Introduce-two-new-counters-in-EState.patch (text/x-patch)
v3-0002-Add-parallel-columns-to-pg_stat_statements.patch (text/x-patch)
On Mon, Oct 07, 2024 at 10:00:13AM +0200, Guillaume Lelarge wrote:
Le lun. 7 oct. 2024 à 02:18, Michael Paquier <michael@paquier.xyz> a écrit :
I'd recommend to split that into more independent patches:
- Introduce the two counters in EState with the incrementations done
in nodeGatherMerge.c and nodeGather.c (mentioned that at [2], you may
want to coordinate with Benoit to avoid duplicating the work).
- Expand pg_stat_statements to use them for DMLs, SELECTs, well where
they matter.
- Look at expanding that for utilities that can do parallel jobs:
CREATE INDEX and VACUUM, but this has lower priority to me, and this
can reuse the same counters as the ones added by patch 2.

The first two are done. The last one is beyond my scope.
That's fair. I have put my hands on this patch set, finishing with
the attached.
A couple of notes:
- I've been struggling a bit on the "planned" vs "launched" terms used
in the names for the counters. It is inconsistent with the backend
state, where we talk about workers "to launch" and workers "launched".
"planned" does not really apply to utilities, as this may not be
planned per se.
- The test in parallel.sql can be cheaper; tweaking the right GUCs the
right way, data in the table is not even required to spawn a set of
parallel workers.
- Meson was not updated for the new test and the files to install.
0001 and 0002 are the parts of the patch that I can see myself
applying; it is pretty cool to see pg_stat_statements complain that
the launched/to_launch ratio can get unbalanced really quickly when I
do something stupid. The CI is stable with these.
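For reference, a quick way to see that imbalance once the patch is in (a
sketch; the final column names following the "to launch"/"launched"
renaming are an assumption here):

```sql
-- Sketch using the "to launch"/"launched" naming discussed above.
SELECT query, calls,
       parallel_workers_to_launch,
       parallel_workers_launched
FROM pg_stat_statements
WHERE parallel_workers_to_launch <> parallel_workers_launched;
```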
0003 has the remaining bits with the 3rd and 4th counters, able to
apply on top of 0002.
--
Michael
Attachments:
v4-0001-Introduce-two-new-counters-in-EState-for-parallel.patch (text/x-diff)
v4-0002-Add-parallel-columns-to-pg_stat_statements.patch (text/x-diff)
v4-0003-Introduce-two-more-counters-in-pg_stat_statements.patch (text/x-diff)
Hi,
Le mar. 8 oct. 2024 à 09:29, Michael Paquier <michael@paquier.xyz> a écrit :
On Mon, Oct 07, 2024 at 10:00:13AM +0200, Guillaume Lelarge wrote:
Le lun. 7 oct. 2024 à 02:18, Michael Paquier <michael@paquier.xyz> a
écrit :
I'd recommend to split that into more independent patches:
- Introduce the two counters in EState with the incrementations done
in nodeGatherMerge.c and nodeGather.c (mentioned that at [2], you may
want to coordinate with Benoit to avoid duplicating the work).
- Expand pg_stat_statements to use them for DMLs, SELECTs, well where
they matter.
- Look at expanding that for utilities that can do parallel jobs:
CREATE INDEX and VACUUM, but this has lower priority to me, and this
can reuse the same counters as the ones added by patch 2.

The first two are done. The last one is beyond my scope.
That's fair. I have put my hands on this patch set, finishing with
the attached.

A couple of notes:
- I've been struggling a bit on the "planned" vs "launched" terms used
in the names for the counters. It is inconsistent with the backend
state, where we talk about workers "to launch" and workers "launched".
"planned" does not really apply to utilities, as this may not be
planned per se.
You're right. Much better to keep them consistent.
- The test in parallel.sql can be cheaper; tweaking the right GUCs the
right way, data in the table is not even required to spawn a set of
parallel workers.
Yeah, it could be cheaper. As it is already quick, I didn't go the extra
mile to make it cheaper.
- Meson was not updated for the new test and the files to install.
Oops, sorry. Didn't know that Meson had to be updated.
0001 and 0002 are the parts of the patch that I can see myself
applying; it is pretty cool to see pg_stat_statements complain that
the launched/to_launch ratio can get unbalanced really quickly when I
do something stupid. The CI is stable with these.
Great, thanks :)
0003 has the remaining bits with the 3rd and 4th counters, able to
apply on top of 0002.
Still think they are interesting but I understand your concerns.
I've done a bit of testing with the three patches, and didn't find any
issue with them.
--
Guillaume.
On Tue, Oct 08, 2024 at 03:53:16PM +0200, Guillaume Lelarge wrote:
I've done a bit of testing with the three patches, and didn't find any
issue with them.
Okay, applied 0001 and 0002 then after a second lookup. I'll spend
some more time thinking about 0003 and the other threads.
--
Michael
Le mer. 9 oct. 2024 à 01:33, Michael Paquier <michael@paquier.xyz> a écrit :
On Tue, Oct 08, 2024 at 03:53:16PM +0200, Guillaume Lelarge wrote:
I've done a bit of testing with the three patches, and didn't find any
issue with them.

Okay, applied 0001 and 0002 then after a second lookup. I'll spend
some more time thinking about 0003 and the other threads.
Thanks a lot.
--
Guillaume.
On Wed, Oct 09, 2024 at 08:32:52AM +0900, Michael Paquier wrote:
Okay, applied 0001 and 0002 then after a second lookup. I'll spend
some more time thinking about 0003 and the other threads.
Considered 0003, and I'm still not sure that this is something that
is really required, based on the correlations that are now possible with
the number of times a query has been called and the number of
planned/launched workers.
So I am marking the entry as committed. Let's see about the threads
for the addition of this data at table-level and at the
database-level.
--
Michael
Le jeu. 7 nov. 2024 à 04:19, Michael Paquier <michael@paquier.xyz> a écrit :
On Wed, Oct 09, 2024 at 08:32:52AM +0900, Michael Paquier wrote:
Okay, applied 0001 and 0002 then after a second lookup. I'll spend
some more time thinking about 0003 and the other threads.

Considered 0003, and I'm still not sure that this is something that
is really required, based on the correlations that are now possible with
the number of times a query has been called and the number of
planned/launched workers.
I'm fine with your decision. After using the new metrics, we'll probably
see more clearly if that's enough.
So I am marking the entry as committed. Let's see about the threads
for the addition of this data at table-level and at the
database-level.
Sounds good!
Table level is probably not the most important in my view. Database-level
and logging are what really matters to me.
Thanks.
--
Guillaume.