[PATCH] Query Jumbling for CALL and SET utility statements

Started by Drouvot, Bertrandover 3 years ago45 messages

bdrouvot@amazon.com

over 3 years ago

1 attachment(s)

Hi hackers,

While query jumbling is provided for function calls that’s currently not
the case for procedures calls.
The reason behind this is that all utility statements are currently
discarded for jumbling.

We’ve recently seen performance impacts (LWLock contention) due to the
lack of jumbling on procedure calls with pg_stat_statements and
pg_stat_statements.track_utility enabled (think an application with a
high rate of procedure calls with unique parameters for each call).

Jeremy has had this conversation on twitter (see
https://twitter.com/jer_s/status/1560003560116342785) and Nikolay
reported that he also had to work on a similar performance issue with
SET being used.

That’s why we think it would make sense to allow jumbling for those 2
utility statements: CALL and SET.

Please find attached a patch proposal for doing so.

With the attached patch we would get things like:

CALL MINUS_TWO(3);
CALL MINUS_TWO(7);
CALL SUM_TWO(3, 8);
CALL SUM_TWO(7, 5);
set enable_seqscan=false;
set enable_seqscan=true;
set seq_page_cost=2.0;
set seq_page_cost=1.0;

postgres=# SELECT query, calls, rows FROM pg_stat_statements;
               query               | calls | rows
-----------------------------------+-------+------
set seq_page_cost=$1              |     2 |    0
CALL MINUS_TWO($1)                |     2 |    0
set enable_seqscan=$1             |     2 |    0
CALL SUM_TWO($1, $2)              |     2 |    0

Looking forward to your feedback,

Thanks,

Jeremy & Bertrand

--
Bertrand Drouvot
Amazon Web Services:https://aws.amazon.com

Attachments:

v1-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v1-0001-JumbleQuery-on-Call-and-Set.patchDownload

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index ff0166fb9d..bdb83f5f72 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -272,7 +272,7 @@ FROM pg_stat_statements ORDER BY query COLLATE "C";
  wal_records > $2 as wal_records_generated,               +|       |      |                     |                       | 
  wal_records >= rows as wal_records_ge_rows               +|       |      |                     |                       | 
  FROM pg_stat_statements ORDER BY query COLLATE "C"        |       |      |                     |                       | 
- SET pg_stat_statements.track_utility = FALSE              |     1 |    0 | f                   | f                     | t
+ SET pg_stat_statements.track_utility = $1                 |     1 |    0 | f                   | f                     | t
  UPDATE pgss_test SET b = $1 WHERE a > $2                  |     1 |    3 | t                   | t                     | t
 (7 rows)
 
@@ -423,6 +423,101 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (6 rows)
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - $2 - $3)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+-- SET
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ set enable_seqscan=$1                                                        |     2 |    0
+ set seq_page_cost=$1                                                         |     2 |    0
+(4 rows)
+
+SET pg_stat_statements.track_utility = FALSE;
 --
 -- queries with locking clauses
 --
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index ba868f0de9..b98d9b60df 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -832,7 +832,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 	 * inherit from the underlying statement's one (except DEALLOCATE which is
 	 * entirely untracked).
 	 */
-	if (query->utilityStmt)
+	if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))
 	{
 		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
 			query->queryId = UINT64CONST(0);
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index a01f183727..88d54fea11 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -201,6 +201,71 @@ SELECT PLUS_ONE(1);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+-- SET
+SELECT pg_stat_statements_reset();
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = FALSE;
+
 --
 -- queries with locking clauses
 --
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index ecf6cd6bf3..60dd153a47 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -492,7 +492,7 @@
    structures according to an internal hash calculation.  Typically, two
    queries will be considered the same for this purpose if they are
    semantically equivalent except for the values of literal constants
-   appearing in the query.  Utility commands (that is, all other commands)
+   appearing in the query.  Utility commands (that is, all other commands) except CALL and SET
    are compared strictly on the basis of their textual query strings, however.
   </para>
 
diff --git a/src/backend/utils/misc/queryjumble.c b/src/backend/utils/misc/queryjumble.c
index eeaa0b31fe..8d526c9e9a 100644
--- a/src/backend/utils/misc/queryjumble.c
+++ b/src/backend/utils/misc/queryjumble.c
@@ -104,7 +104,7 @@ JumbleQuery(Query *query, const char *querytext)
 
 	Assert(IsQueryIdEnabled());
 
-	if (query->utilityStmt)
+	if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))
 	{
 		query->queryId = compute_utility_query_id(querytext,
 												  query->stmt_location,
@@ -240,10 +240,11 @@ static void
 JumbleQueryInternal(JumbleState *jstate, Query *query)
 {
 	Assert(IsA(query, Query));
-	Assert(query->utilityStmt == NULL);
+	Assert(query->utilityStmt == NULL || JUMBLE_UTILITY(query->utilityStmt));
 
 	APP_JUMB(query->commandType);
 	/* resultRelation is usually predictable from commandType */
+	JumbleExpr(jstate, (Node *) query->utilityStmt);
 	JumbleExpr(jstate, (Node *) query->cteList);
 	JumbleRangeTable(jstate, query->rtable);
 	JumbleExpr(jstate, (Node *) query->jointree);
@@ -383,6 +384,30 @@ JumbleExpr(JumbleState *jstate, Node *node)
 				APP_JUMB(var->varlevelsup);
 			}
 			break;
+		case T_CallStmt:
+			{
+				CallStmt   *stmt = (CallStmt *) node;
+				FuncExpr   *expr = stmt->funcexpr;
+
+				APP_JUMB(expr->funcid);
+				JumbleExpr(jstate, (Node *) expr->args);
+			}
+			break;
+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				APP_JUMB_STRING(stmt->name);
+				JumbleExpr(jstate, (Node *) stmt->args);
+			}
+			break;
+		case T_A_Const:
+			{
+				int			loc = ((const A_Const *) node)->location;
+
+				RecordConstLocation(jstate, loc);
+			}
+			break;
 		case T_Const:
 			{
 				Const	   *c = (Const *) node;
diff --git a/src/include/utils/queryjumble.h b/src/include/utils/queryjumble.h
index 3c2d9beab2..cada9a1ed3 100644
--- a/src/include/utils/queryjumble.h
+++ b/src/include/utils/queryjumble.h
@@ -18,6 +18,12 @@
 
 #define JUMBLE_SIZE				1024	/* query serialization buffer size */
 
+/*
+ * Jumble those utility statements
+ */
+#define JUMBLE_UTILITY(n)	(IsA(n, CallStmt) || \
+							IsA(n, VariableSetStmt))
+
 /*
  * Struct for tracking locations/lengths of constants during normalization
  */

Pavel Stehule

pavel.stehule@gmail.com

over 3 years ago

In reply to: Drouvot, Bertrand (#1)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

st 31. 8. 2022 v 17:34 odesílatel Drouvot, Bertrand <bdrouvot@amazon.com>
napsal:

Hi hackers,

While query jumbling is provided for function calls that’s currently not
the case for procedures calls.
The reason behind this is that all utility statements are currently
discarded for jumbling.

We’ve recently seen performance impacts (LWLock contention) due to the
lack of jumbling on procedure calls with pg_stat_statements and
pg_stat_statements.track_utility enabled (think an application with a high
rate of procedure calls with unique parameters for each call).

Jeremy has had this conversation on twitter (see
https://twitter.com/jer_s/status/1560003560116342785) and Nikolay
reported that he also had to work on a similar performance issue with SET
being used.

That’s why we think it would make sense to allow jumbling for those 2
utility statements: CALL and SET.

Please find attached a patch proposal for doing so.

With the attached patch we would get things like:
CALL MINUS_TWO(3);
CALL MINUS_TWO(7);
CALL SUM_TWO(3, 8);
CALL SUM_TWO(7, 5);
set enable_seqscan=false;
set enable_seqscan=true;
set seq_page_cost=2.0;
set seq_page_cost=1.0;

postgres=# SELECT query, calls, rows FROM pg_stat_statements;
query | calls | rows
-----------------------------------+-------+------
set seq_page_cost=$1 | 2 | 0
CALL MINUS_TWO($1) | 2 | 0
set enable_seqscan=$1 | 2 | 0
CALL SUM_TWO($1, $2) | 2 | 0

Looking forward to your feedback,

The idea is good, but I think you should use pg_stat_functions instead.
Maybe it is supported already (I didn't test it). I am not sure so SET
statement should be traced in pg_stat_statements - it is usually pretty
fast, and without context it says nothing. It looks like just overhead.

Regards

Pavel

Show quoted text

Thanks,

Jeremy & Bertrand

--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Drouvot, Bertrand (#1)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

Hi,

On 2022-08-31 17:33:44 +0200, Drouvot, Bertrand wrote:

While query jumbling is provided for function calls that’s currently not the
case for procedures calls.
The reason behind this is that all utility statements are currently
discarded for jumbling.
[...]
That’s why we think it would make sense to allow jumbling for those 2
utility statements: CALL and SET.

Yes, I've seen this be an issue repeatedly. Although more heavily for PREPARE
/ EXECUTE than either of the two cases you handle here. IME not tracking
PREPARE / EXECUTE can distort statistics substantially - there's appears to be
a surprising number of applications / frameworks resorting to them. Basically
requiring that track utility is turned on.

I suspect we should carve out things like CALL, PREPARE, EXECUTE from
track_utility - it's more or less an architectural accident that they're
utility statements. It's a bit less clear that SET should be dealt with that
way.

@@ -383,6 +384,30 @@ JumbleExpr(JumbleState *jstate, Node *node)
APP_JUMB(var->varlevelsup);
}
break;
+		case T_CallStmt:
+			{
+				CallStmt   *stmt = (CallStmt *) node;
+				FuncExpr   *expr = stmt->funcexpr;
+
+				APP_JUMB(expr->funcid);
+				JumbleExpr(jstate, (Node *) expr->args);
+			}
+			break;

Why do we need to take the arguments into account?

+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				APP_JUMB_STRING(stmt->name);
+				JumbleExpr(jstate, (Node *) stmt->args);
+			}
+			break;

Same?

+		case T_A_Const:
+			{
+				int			loc = ((const A_Const *) node)->location;
+
+				RecordConstLocation(jstate, loc);
+			}
+			break;

I suspect we only need this because of the jumbling of unparsed arguments I
questioned above? If we do end up needing it, shouldn't we include the type
in the jumbling?

Greetings,

Andres Freund

Pavel Stehule

pavel.stehule@gmail.com

over 3 years ago

In reply to: Pavel Stehule (#2)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

st 31. 8. 2022 v 17:50 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
napsal:

Hi

st 31. 8. 2022 v 17:34 odesílatel Drouvot, Bertrand <bdrouvot@amazon.com>
napsal:

Hi hackers,

While query jumbling is provided for function calls that’s currently not
the case for procedures calls.
The reason behind this is that all utility statements are currently
discarded for jumbling.

We’ve recently seen performance impacts (LWLock contention) due to the
lack of jumbling on procedure calls with pg_stat_statements and
pg_stat_statements.track_utility enabled (think an application with a high
rate of procedure calls with unique parameters for each call).

Jeremy has had this conversation on twitter (see
https://twitter.com/jer_s/status/1560003560116342785) and Nikolay
reported that he also had to work on a similar performance issue with SET
being used.

That’s why we think it would make sense to allow jumbling for those 2
utility statements: CALL and SET.

Please find attached a patch proposal for doing so.

With the attached patch we would get things like:
CALL MINUS_TWO(3);
CALL MINUS_TWO(7);
CALL SUM_TWO(3, 8);
CALL SUM_TWO(7, 5);
set enable_seqscan=false;
set enable_seqscan=true;
set seq_page_cost=2.0;
set seq_page_cost=1.0;

postgres=# SELECT query, calls, rows FROM pg_stat_statements;
query | calls | rows
-----------------------------------+-------+------
set seq_page_cost=$1 | 2 | 0
CALL MINUS_TWO($1) | 2 | 0
set enable_seqscan=$1 | 2 | 0
CALL SUM_TWO($1, $2) | 2 | 0

Looking forward to your feedback,

The idea is good, but I think you should use pg_stat_functions instead.
Maybe it is supported already (I didn't test it). I am not sure so SET
statement should be traced in pg_stat_statements - it is usually pretty
fast, and without context it says nothing. It looks like just overhead.

I was wrong - there is an analogy with SELECT fx, and the statistics are in
pg_stat_statements, and in pg_stat_function too.

Regards

Pavel

Show quoted text

Regards

Pavel

Thanks,

Jeremy & Bertrand

--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com

Jeremy Schneider

schnjere@amazon.com

over 3 years ago

In reply to: Drouvot, Bertrand (#1)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

On 8/31/22 8:33 AM, Drouvot, Bertrand wrote:

We’ve recently seen performance impacts (LWLock contention) due to the
lack of jumbling on procedure calls with pg_stat_statements and
pg_stat_statements.track_utility enabled (think an application with a
high rate of procedure calls with unique parameters for each call).

I ran some performance tests with the patch that Bertrand wrote to get
numbers. From my perspective, this patch is scoped very minimally and is
low risk; I don’t think it should need an enormous amount of validation.
It does appear to address the issues with both SET and CALL statements
that Nikolay and I respectively encountered. Honestly, this almost seems
like it was just an minor oversight in the original patch that added
support for CALL and procedures.

I used an r5.large EC2 instance running Linux and tested Bertrand’s
patch using the PostgreSQL 14.4 code base, compiled without and with
Bertrand’s patch. The difference is a lot more extreme on big servers
with lots of cores, but the difference is obvious even on a small
instance like this one.

As a side note: while I certainly don't want to build a database
primarily based on benchmarks, it's nice when benchmarks showcase the
database's strength. Without this patch, HammerDB completely falls over
in stored procedure mode, since one of the procedure arguments is a
time-based unique value on every call. Someone else at Amazon running
HammerDB was how I originally became aware of this problem.

-Jeremy

===== Setup:
$ psql -c "create procedure test(x int) as 'begin return; end' language
plpgsql;"
CREATE PROCEDURE
$ echo -e "\set x random(1,100000000) \n call test(:x)" >test-call.pgbench
$ echo -e "\set x random(1,100000000) \n set application_name=':x'"

test-set.pgbench

===== CALL results without patch:

[postgres@ip-172-31-44-176 ~]$ pgbench -n -c 100 -j 100 -T15 -r -f
test-set.pgbench
pgbench (14.4)
transaction type: test-set.pgbench
scaling factor: 1
query mode: simple
number of clients: 100
number of threads: 100
duration: 15 s
number of transactions actually processed: 728748
latency average = 2.051 ms
initial connection time = 91.844 ms
tps = 48755.446492 (without initial connection time)
statement latencies in milliseconds:
0.000 \set x random(1,100000000)
2.046 set application_name=':x'

pg-14.4 rw postgres@postgres=# select wait_event, count(*) from
pg_stat_activity group by wait_event; \watch 1
...
Tue 30 Aug 2022 08:26:35 PM UTC (every 1s)

===== CALL results with patch:

[postgres@ip-172-31-44-176 ~]$ pgbench -n -c 100 -j 100 -T15 -r -f
test-call.pgbench
pgbench (14.4)
transaction type: test-call.pgbench
scaling factor: 1
query mode: simple
number of clients: 100
number of threads: 100
duration: 15 s
number of transactions actually processed: 1098776
latency average = 1.361 ms
initial connection time = 89.002 ms
tps = 73491.904878 (without initial connection time)
statement latencies in milliseconds:
0.000 \set x random(1,100000000)
1.383 call test(:x)

pg-14.4 rw postgres@postgres=# select wait_event, count(*) from
pg_stat_activity group by wait_event; \watch 1
...
Tue 30 Aug 2022 08:42:51 PM UTC (every 1s)

===== SET results without patch:

pg-14.4 rw postgres@postgres=# select wait_event, count(*) from
pg_stat_activity group by wait_event; \watch 1
...
Tue 30 Aug 2022 08:26:35 PM UTC (every 1s)

===== SET results with patch:

[postgres@ip-172-31-44-176 ~]$ pgbench -n -c 100 -j 100 -T15 -r -f
test-set.pgbench
pgbench (14.4)
transaction type: test-set.pgbench
scaling factor: 1
query mode: simple
number of clients: 100
number of threads: 100
duration: 15 s
number of transactions actually processed: 1178844
latency average = 1.268 ms
initial connection time = 89.159 ms
tps = 78850.178814 (without initial connection time)
statement latencies in milliseconds:
0.000 \set x random(1,100000000)
1.270 set application_name=':x'

pg-14.4 rw postgres@postgres=# select wait_event, count(*) from
pg_stat_activity group by wait_event; \watch 1
...
Tue 30 Aug 2022 08:44:30 PM UTC (every 1s)

--
Jeremy Schneider
Database Engineer
Amazon Web Services

Jeremy Schneider

schnjere@amazon.com

over 3 years ago

In reply to: Andres Freund (#3)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

On 8/31/22 9:08 AM, Andres Freund wrote:

I suspect we should carve out things like CALL, PREPARE, EXECUTE from
track_utility - it's more or less an architectural accident that they're
utility statements. It's a bit less clear that SET should be dealt with that
way.

Regarding SET, the compelling use case was around "application_name"
whose purpose is to provide a label in pg_stat_activity and on log
lines, which can be used to improve observability and connect queries to
their source in application code. Nikolay's incident (on peak shopping
day for an eCommerce corp) evidently involved an application which
leveraged this, but as a result the contention on the pg_stat_statements
LWLock in exclusive mode effectively caused an outage for the retailer?
Or nearly did? My description here is based on Nikolay's public twitter
comment.

I've seen a lot of applications that make heavy use of temp tables,
where DDL would be pretty important to track as part of the regular
workload. So that probably should be added to the list alongside
prepared statements. And I'd want to spend a little more time thinking
about what other use cases might be missing. I'm hesitant about the
general idea of carving out some utility statements away from this
"track_utility" GUC.

Personally, at this point, I think pg_stat_statements is critical
infrastructure for anyone running PostgreSQL at scale. The information
it provides is indispensable. I don't think it's really defensible to
tell people that if they want to scale, then they need to fly blind on
any utility statements.

-Jeremy

--
Jeremy Schneider
Database Engineer
Amazon Web Services

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Jeremy Schneider (#6)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

Hi,

On 2022-08-31 11:00:05 -0700, Jeremy Schneider wrote:

On 8/31/22 9:08 AM, Andres Freund wrote:

I suspect we should carve out things like CALL, PREPARE, EXECUTE from
track_utility - it's more or less an architectural accident that they're
utility statements. It's a bit less clear that SET should be dealt with that
way.

Regarding SET, the compelling use case was around "application_name"
whose purpose is to provide a label in pg_stat_activity and on log
lines, which can be used to improve observability and connect queries to
their source in application code.

I wasn't saying that SET shouldn't be jumbled, just that it seems more
reasonable to track it only when track_utility is enabled, rather than doing
so even when that's disabled. Which I do think makes sense for executing a
prepared statement and calling a procedure, since they're really only utility
statements by accident.

Personally, at this point, I think pg_stat_statements is critical
infrastructure for anyone running PostgreSQL at scale. The information
it provides is indispensable. I don't think it's really defensible to
tell people that if they want to scale, then they need to fly blind on
any utility statements.

I wasn't suggesting doing so at all.

Greetings,

Andres Freund

Jeremy Schneider

schnjere@amazon.com

over 3 years ago

In reply to: Andres Freund (#7)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

On 8/31/22 12:06 PM, Andres Freund wrote:

Regarding SET, the compelling use case was around "application_name"
whose purpose is to provide a label in pg_stat_activity and on log
lines, which can be used to improve observability and connect queries to
their source in application code.

I wasn't saying that SET shouldn't be jumbled, just that it seems more
reasonable to track it only when track_utility is enabled, rather than doing
so even when that's disabled. Which I do think makes sense for executing a
prepared statement and calling a procedure, since they're really only utility
statements by accident.

Hey Andres, sorry for misunderstanding your email!

Based on this quick test I just now ran (transcript below), I think that
PREPARE/EXECUTE is already excluded from track_utility?

I get your point about CALL, maybe it does make sense to also exclude
this. It might also be worth a small update to the doc for track_utility
about how it behaves, in this regard.

https://www.postgresql.org/docs/14/pgstatstatements.html#id-1.11.7.39.9

Example updated sentence:

|pg_stat_statements.track_utility| controls whether <<most>> utility

=====

pg-14.4 rw root@db1=# set pg_stat_statements.track_utility=on;
SET
pg-14.4 rw root@db1=# select pg_stat_statements_reset();
pg_stat_statements_reset
--------------------------

(1 row)
pg-14.4 rw root@db1=# prepare test as select /* unique123 */ 1;
PREPARE
pg-14.4 rw root@db1=# execute test;
?column?
----------
1
(1 row)

pg-14.4 rw root@db1=# set application_name='test';
SET
pg-14.4 rw root@db1=# select substr(query,1,50) from pg_stat_statements;
substr
-------------------------------------------
prepare test as select /* unique123 */ $1
select pg_stat_statements_reset()
set application_name=$1
(3 rows)

=====

pg-14.4 rw root@db1=# set pg_stat_statements.track_utility=off;
SET
pg-14.4 rw root@db1=# select pg_stat_statements_reset();
pg_stat_statements_reset
--------------------------

(1 row)

pg-14.4 rw root@db1=# prepare test as select /* unique123 */ 1;
PREPARE
pg-14.4 rw root@db1=# execute test;
?column?
----------
1
(1 row)

--
Jeremy Schneider
Database Engineer
Amazon Web Services

Drouvot, Bertrand

bdrouvot@amazon.com

over 3 years ago

In reply to: Andres Freund (#3)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

Hi,

On 8/31/22 6:08 PM, Andres Freund wrote:

Hi,

On 2022-08-31 17:33:44 +0200, Drouvot, Bertrand wrote:

@@ -383,6 +384,30 @@ JumbleExpr(JumbleState *jstate, Node *node)
APP_JUMB(var->varlevelsup);
}
break;
+             case T_CallStmt:
+                     {
+                             CallStmt   *stmt = (CallStmt *) node;
+                             FuncExpr   *expr = stmt->funcexpr;
+
+                             APP_JUMB(expr->funcid);
+                             JumbleExpr(jstate, (Node *) expr->args);
+                     }
+                     break;

Why do we need to take the arguments into account?

Thanks for looking at it!

Agree that It's not needed to "solve" the Lock contention issue, but I
think it's needed for the "render".

Without it we would get, things like:

instead of

+             case T_VariableSetStmt:
+                     {
+                             VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+                             APP_JUMB_STRING(stmt->name);
+                             JumbleExpr(jstate, (Node *) stmt->args);
+                     }
+                     break;

Same?

yeah, same reason. Without it we would get things like:

postgres=# set enable_seqscan=false;
SET
postgres=# set enable_seqscan=true;
SET
postgres=# SELECT query, calls, rows FROM pg_stat_statements;
               query               | calls | rows
-----------------------------------+-------+------
select pg_stat_statements_reset() |     1 |    1
set enable_seqscan=false          |     2 |    0
(2 rows)

instead of

+             case T_A_Const:
+                     {
+                             int                     loc = ((const A_Const *) node)->location;
+
+                             RecordConstLocation(jstate, loc);
+                     }
+                     break;

I suspect we only need this because of the jumbling of unparsed arguments I
questioned above?

Right but only for the T_VariableSetStmt case.

If we do end up needing it, shouldn't we include the type
in the jumbling?

I don't think so as this is only for the T_VariableSetStmt case.

And looking closer I don't see such as thing as "consttype" (that we can
find in the Const struct) in the A_Const struct.

Regards,

--
Bertrand Drouvot
Amazon Web Services:https://aws.amazon.com

#10

Drouvot, Bertrand

bdrouvot@amazon.com

over 3 years ago

In reply to: Jeremy Schneider (#8)

1 attachment(s)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

Hi,

On 8/31/22 10:05 PM, Jeremy Schneider wrote:

On 8/31/22 12:06 PM, Andres Freund wrote:

Regarding SET, the compelling use case was around "application_name"
whose purpose is to provide a label in pg_stat_activity and on log
lines, which can be used to improve observability and connect queries to
their source in application code.

I wasn't saying that SET shouldn't be jumbled, just that it seems more
reasonable to track it only when track_utility is enabled, rather than doing
so even when that's disabled. Which I do think makes sense for executing a
prepared statement and calling a procedure, since they're really only utility
statements by accident.

I get your point about CALL, maybe it does make sense to also exclude
this.

That's a good point and i think we should track CALL whatever the value
of pgss_track_utility is.

I think so because we are tracking function calls in all the cases
(because "linked" to select aka not a utility) and i don't see any
reasons why not to do the same for procedure calls.

Please find attached v2 as an attempt to do so.

With v2 we get things like:

postgres=# set pg_stat_statements.track_utility=on;
SET
postgres=# call MY_PROC(20);
CALL
postgres=# call MY_PROC(10);
CALL
postgres=# set enable_seqscan=false;
SET
postgres=# set enable_seqscan=true;
SET
postgres=# select queryid,query,calls from pg_stat_statements;
       queryid       |                  query                  | calls
---------------------+-----------------------------------------+-------
4670878543381973400 | set pg_stat_statements.track_utility=$1 |     1
-640317129591544054 | set enable_seqscan=$1 |     2
492647827690744963 | select pg_stat_statements_reset() |     1
6541399678435597534 | call MY_PROC($1) |     2

and

postgres=# set pg_stat_statements.track_utility=off;
SET
postgres=# call MY_PROC(10);
CALL
postgres=# call MY_PROC(20);
CALL
postgres=# set enable_seqscan=true;
SET
postgres=# set enable_seqscan=false;
SET
postgres=# select queryid,query,calls from pg_stat_statements;
       queryid       |                  query                  | calls
---------------------+-----------------------------------------+-------
4670878543381973400 | set pg_stat_statements.track_utility=$1 |     1
492647827690744963 | select pg_stat_statements_reset() |     1
6541399678435597534 | call MY_PROC($1) |     2
(3 rows)

It might also be worth a small update to the doc for track_utility
about how it behaves, in this regard.

https://www.postgresql.org/docs/14/pgstatstatements.html#id-1.11.7.39.9

Example updated sentence:

|pg_stat_statements.track_utility| controls whether <<most>> utility

commands are tracked by the module. Utility commands are all those
other than |SELECT|, |INSERT|, |UPDATE| and |DELETE| <<, but this
parameter does not disable tracking of PREPARE, EXECUTE or CALL>>. The
default value is |on|. Only superusers can change this setting.

Agree, wording added to v2.

Regards,

--
Bertrand Drouvot
Amazon Web Services:https://aws.amazon.com

Attachments:

v2-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v2-0001-JumbleQuery-on-Call-and-Set.patchDownload

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index ff0166fb9d..e1834b0782 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -272,7 +272,7 @@ FROM pg_stat_statements ORDER BY query COLLATE "C";
  wal_records > $2 as wal_records_generated,               +|       |      |                     |                       | 
  wal_records >= rows as wal_records_ge_rows               +|       |      |                     |                       | 
  FROM pg_stat_statements ORDER BY query COLLATE "C"        |       |      |                     |                       | 
- SET pg_stat_statements.track_utility = FALSE              |     1 |    0 | f                   | f                     | t
+ SET pg_stat_statements.track_utility = $1                 |     1 |    0 | f                   | f                     | t
  UPDATE pgss_test SET b = $1 WHERE a > $2                  |     1 |    3 | t                   | t                     | t
 (7 rows)
 
@@ -423,6 +423,147 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (6 rows)
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - $2 - $3)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - 1 - 1.0)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- SET
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ set enable_seqscan=$1                                                        |     2 |    0
+ set seq_page_cost=$1                                                         |     2 |    0
+(4 rows)
+
+SET pg_stat_statements.track_utility = FALSE;
 --
 -- queries with locking clauses
 --
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index ba868f0de9..cccd1a8337 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -106,6 +106,11 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
 									!IsA(n, PrepareStmt) && \
 									!IsA(n, DeallocateStmt))
+/*
+ * Force track those utility statements
+ * whatever the value of pgss_track_utility is.
+ */
+#define FORCE_TRACK_UTILITY(n)	(IsA(n, CallStmt))
 
 /*
  * Extension version number, for supporting older extension versions' objects
@@ -832,7 +837,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 	 * inherit from the underlying statement's one (except DEALLOCATE which is
 	 * entirely untracked).
 	 */
-	if (query->utilityStmt)
+	if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))
 	{
 		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
 			query->queryId = UINT64CONST(0);
@@ -1097,7 +1102,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 * that user configured another extension to handle utility statements
 	 * only.
 	 */
-	if (pgss_enabled(exec_nested_level) && pgss_track_utility)
+	if (pgss_enabled(exec_nested_level) &&
+		(pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)))
 		pstmt->queryId = UINT64CONST(0);
 
 	/*
@@ -1114,7 +1120,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 *
 	 * Likewise, we don't track execution of DEALLOCATE.
 	 */
-	if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
+	if ((pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)) &&
+		pgss_enabled(exec_nested_level) &&
 		PGSS_HANDLED_UTILITY(parsetree))
 	{
 		instr_time	start;
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index a01f183727..92359f0184 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -201,6 +201,92 @@ SELECT PLUS_ONE(1);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.track_utility = TRUE;
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.track_utility = TRUE;
+
+-- SET
+SELECT pg_stat_statements_reset();
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = FALSE;
+
 --
 -- queries with locking clauses
 --
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index ecf6cd6bf3..b1f407e794 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -492,7 +492,7 @@
    structures according to an internal hash calculation.  Typically, two
    queries will be considered the same for this purpose if they are
    semantically equivalent except for the values of literal constants
-   appearing in the query.  Utility commands (that is, all other commands)
+   appearing in the query.  Utility commands (that is, all other commands) except CALL and SET
    are compared strictly on the basis of their textual query strings, however.
   </para>
 
@@ -781,9 +781,10 @@
     <listitem>
      <para>
       <varname>pg_stat_statements.track_utility</varname> controls whether
-      utility commands are tracked by the module.  Utility commands are
+      most utility commands are tracked by the module.  Utility commands are
       all those other than <command>SELECT</command>, <command>INSERT</command>,
-      <command>UPDATE</command> and <command>DELETE</command>.
+      <command>UPDATE</command> and <command>DELETE</command>, but this parameter does not disable
+      tracking of PREPARE, EXECUTE or CALL.
       The default value is <literal>on</literal>.
       Only superusers can change this setting.
      </para>
diff --git a/src/backend/utils/misc/queryjumble.c b/src/backend/utils/misc/queryjumble.c
index eeaa0b31fe..8d526c9e9a 100644
--- a/src/backend/utils/misc/queryjumble.c
+++ b/src/backend/utils/misc/queryjumble.c
@@ -104,7 +104,7 @@ JumbleQuery(Query *query, const char *querytext)
 
 	Assert(IsQueryIdEnabled());
 
-	if (query->utilityStmt)
+	if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))
 	{
 		query->queryId = compute_utility_query_id(querytext,
 												  query->stmt_location,
@@ -240,10 +240,11 @@ static void
 JumbleQueryInternal(JumbleState *jstate, Query *query)
 {
 	Assert(IsA(query, Query));
-	Assert(query->utilityStmt == NULL);
+	Assert(query->utilityStmt == NULL || JUMBLE_UTILITY(query->utilityStmt));
 
 	APP_JUMB(query->commandType);
 	/* resultRelation is usually predictable from commandType */
+	JumbleExpr(jstate, (Node *) query->utilityStmt);
 	JumbleExpr(jstate, (Node *) query->cteList);
 	JumbleRangeTable(jstate, query->rtable);
 	JumbleExpr(jstate, (Node *) query->jointree);
@@ -383,6 +384,30 @@ JumbleExpr(JumbleState *jstate, Node *node)
 				APP_JUMB(var->varlevelsup);
 			}
 			break;
+		case T_CallStmt:
+			{
+				CallStmt   *stmt = (CallStmt *) node;
+				FuncExpr   *expr = stmt->funcexpr;
+
+				APP_JUMB(expr->funcid);
+				JumbleExpr(jstate, (Node *) expr->args);
+			}
+			break;
+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				APP_JUMB_STRING(stmt->name);
+				JumbleExpr(jstate, (Node *) stmt->args);
+			}
+			break;
+		case T_A_Const:
+			{
+				int			loc = ((const A_Const *) node)->location;
+
+				RecordConstLocation(jstate, loc);
+			}
+			break;
 		case T_Const:
 			{
 				Const	   *c = (Const *) node;
diff --git a/src/include/utils/queryjumble.h b/src/include/utils/queryjumble.h
index 3c2d9beab2..cada9a1ed3 100644
--- a/src/include/utils/queryjumble.h
+++ b/src/include/utils/queryjumble.h
@@ -18,6 +18,12 @@
 
 #define JUMBLE_SIZE				1024	/* query serialization buffer size */
 
+/*
+ * Jumble those utility statements
+ */
+#define JUMBLE_UTILITY(n)	(IsA(n, CallStmt) || \
+							IsA(n, VariableSetStmt))
+
 /*
  * Struct for tracking locations/lengths of constants during normalization
  */

#11

Imseih (AWS), Sami

simseih@amazon.com

over 3 years ago

In reply to: Drouvot, Bertrand (#10)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

Please find attached v2 as an attempt to do so.

+1 to the idea.
I think it will be better to evaluate jstate instead of
JUMBLE_UTILITY, such as:

if (query->utilityStmt && !jstate)

instead of

if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))

This will allow for support of potentially other utility statements
In the future without having to teach pg_stat_statements about them.
If a jstate is set for the utility statements, pgss will do the right thing.

Thanks

--
Sami Imseih
Amazon Web Services (AWS)

#12

Drouvot, Bertrand

bdrouvot@amazon.com

over 3 years ago

In reply to: Imseih (AWS), Sami (#11)

1 attachment(s)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

Hi,

On 9/1/22 5:13 PM, Imseih (AWS), Sami wrote:

Please find attached v2 as an attempt to do so.

+1 to the idea.

Thanks for looking at it!

I think it will be better to evaluate jstate instead of

JUMBLE_UTILITY, such as:

if (query->utilityStmt && !jstate)

instead of

if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))

This will allow for support of potentially other utility statements

In the future without having to teach pg_stat_statements about them.

If a jstate is set for the utility statements, pgss will do the right
thing.

Fair point, thanks!

v3 including this change is attached.

Thanks,

Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com

Attachments:

v3-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v3-0001-JumbleQuery-on-Call-and-Set.patchDownload

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index ff0166fb9d..e1834b0782 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -272,7 +272,7 @@ FROM pg_stat_statements ORDER BY query COLLATE "C";
  wal_records > $2 as wal_records_generated,               +|       |      |                     |                       | 
  wal_records >= rows as wal_records_ge_rows               +|       |      |                     |                       | 
  FROM pg_stat_statements ORDER BY query COLLATE "C"        |       |      |                     |                       | 
- SET pg_stat_statements.track_utility = FALSE              |     1 |    0 | f                   | f                     | t
+ SET pg_stat_statements.track_utility = $1                 |     1 |    0 | f                   | f                     | t
  UPDATE pgss_test SET b = $1 WHERE a > $2                  |     1 |    3 | t                   | t                     | t
 (7 rows)
 
@@ -423,6 +423,147 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (6 rows)
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - $2 - $3)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - 1 - 1.0)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- SET
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ set enable_seqscan=$1                                                        |     2 |    0
+ set seq_page_cost=$1                                                         |     2 |    0
+(4 rows)
+
+SET pg_stat_statements.track_utility = FALSE;
 --
 -- queries with locking clauses
 --
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index ba868f0de9..62de9b9325 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -106,6 +106,11 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
 									!IsA(n, PrepareStmt) && \
 									!IsA(n, DeallocateStmt))
+/*
+ * Force track those utility statements
+ * whatever the value of pgss_track_utility is.
+ */
+#define FORCE_TRACK_UTILITY(n)	(IsA(n, CallStmt))
 
 /*
  * Extension version number, for supporting older extension versions' objects
@@ -832,7 +837,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 	 * inherit from the underlying statement's one (except DEALLOCATE which is
 	 * entirely untracked).
 	 */
-	if (query->utilityStmt)
+	if (query->utilityStmt && !jstate)
 	{
 		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
 			query->queryId = UINT64CONST(0);
@@ -1097,7 +1102,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 * that user configured another extension to handle utility statements
 	 * only.
 	 */
-	if (pgss_enabled(exec_nested_level) && pgss_track_utility)
+	if (pgss_enabled(exec_nested_level) &&
+		(pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)))
 		pstmt->queryId = UINT64CONST(0);
 
 	/*
@@ -1114,7 +1120,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 *
 	 * Likewise, we don't track execution of DEALLOCATE.
 	 */
-	if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
+	if ((pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)) &&
+		pgss_enabled(exec_nested_level) &&
 		PGSS_HANDLED_UTILITY(parsetree))
 	{
 		instr_time	start;
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index a01f183727..92359f0184 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -201,6 +201,92 @@ SELECT PLUS_ONE(1);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.track_utility = TRUE;
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.track_utility = TRUE;
+
+-- SET
+SELECT pg_stat_statements_reset();
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = FALSE;
+
 --
 -- queries with locking clauses
 --
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index ecf6cd6bf3..b1f407e794 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -492,7 +492,7 @@
    structures according to an internal hash calculation.  Typically, two
    queries will be considered the same for this purpose if they are
    semantically equivalent except for the values of literal constants
-   appearing in the query.  Utility commands (that is, all other commands)
+   appearing in the query.  Utility commands (that is, all other commands) except CALL and SET
    are compared strictly on the basis of their textual query strings, however.
   </para>
 
@@ -781,9 +781,10 @@
     <listitem>
      <para>
       <varname>pg_stat_statements.track_utility</varname> controls whether
-      utility commands are tracked by the module.  Utility commands are
+      most utility commands are tracked by the module.  Utility commands are
       all those other than <command>SELECT</command>, <command>INSERT</command>,
-      <command>UPDATE</command> and <command>DELETE</command>.
+      <command>UPDATE</command> and <command>DELETE</command>, but this parameter does not disable
+      tracking of PREPARE, EXECUTE or CALL.
       The default value is <literal>on</literal>.
       Only superusers can change this setting.
      </para>
diff --git a/src/backend/utils/misc/queryjumble.c b/src/backend/utils/misc/queryjumble.c
index a67487e5fe..11a57d7616 100644
--- a/src/backend/utils/misc/queryjumble.c
+++ b/src/backend/utils/misc/queryjumble.c
@@ -104,7 +104,7 @@ JumbleQuery(Query *query, const char *querytext)
 
 	Assert(IsQueryIdEnabled());
 
-	if (query->utilityStmt)
+	if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))
 	{
 		query->queryId = compute_utility_query_id(querytext,
 												  query->stmt_location,
@@ -240,10 +240,11 @@ static void
 JumbleQueryInternal(JumbleState *jstate, Query *query)
 {
 	Assert(IsA(query, Query));
-	Assert(query->utilityStmt == NULL);
+	Assert(query->utilityStmt == NULL || JUMBLE_UTILITY(query->utilityStmt));
 
 	APP_JUMB(query->commandType);
 	/* resultRelation is usually predictable from commandType */
+	JumbleExpr(jstate, (Node *) query->utilityStmt);
 	JumbleExpr(jstate, (Node *) query->cteList);
 	JumbleRangeTable(jstate, query->rtable);
 	JumbleExpr(jstate, (Node *) query->jointree);
@@ -383,6 +384,30 @@ JumbleExpr(JumbleState *jstate, Node *node)
 				APP_JUMB(var->varlevelsup);
 			}
 			break;
+		case T_CallStmt:
+			{
+				CallStmt   *stmt = (CallStmt *) node;
+				FuncExpr   *expr = stmt->funcexpr;
+
+				APP_JUMB(expr->funcid);
+				JumbleExpr(jstate, (Node *) expr->args);
+			}
+			break;
+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				APP_JUMB_STRING(stmt->name);
+				JumbleExpr(jstate, (Node *) stmt->args);
+			}
+			break;
+		case T_A_Const:
+			{
+				int			loc = ((const A_Const *) node)->location;
+
+				RecordConstLocation(jstate, loc);
+			}
+			break;
 		case T_Const:
 			{
 				Const	   *c = (Const *) node;
diff --git a/src/include/utils/queryjumble.h b/src/include/utils/queryjumble.h
index 3c2d9beab2..cada9a1ed3 100644
--- a/src/include/utils/queryjumble.h
+++ b/src/include/utils/queryjumble.h
@@ -18,6 +18,12 @@
 
 #define JUMBLE_SIZE				1024	/* query serialization buffer size */
 
+/*
+ * Jumble those utility statements
+ */
+#define JUMBLE_UTILITY(n)	(IsA(n, CallStmt) || \
+							IsA(n, VariableSetStmt))
+
 /*
  * Struct for tracking locations/lengths of constants during normalization
  */

#13

Imseih (AWS), Sami

simseih@amazon.com

over 3 years ago

In reply to: Drouvot, Bertrand (#12)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

This is ready for committer but I suggest the following for the
doc changes:

1.
Plannable queries (that is, SELECT, INSERT, UPDATE, and DELETE) are
combined into a single pg_stat_statements entry whenever they have
identical query structures according to an internal hash calculation.
Typically, two queries will be considered the same for this purpose
if they are semantically equivalent except for the values of literal
constants appearing in the query. Utility commands (that is, all other commands)
are compared strictly on the basis of their textual query strings, however.
-- to --
Plannable queries (that is, SELECT, INSERT, UPDATE, and DELETE) as
well as CALL and SET commands are combined into a single
pg_stat_statements entry whenever they have identical query
structures according to an internal hash calculation.
Typically, two queries will be considered the same for this purpose
if they are semantically equivalent except for the values of literal
constants appearing in the command. All other commands are compared
strictly on the basis of their textual query strings, however.

pg_stat_statements.track_utility controls whether utility
commands are tracked by the module. Utility commands
are all those other than SELECT, INSERT, UPDATE and DELETE.
The default value is on. Only superusers can change this setting.
-- to --
pg_stat_statements.track_utility controls whether utility commands
are tracked by the module. Tracked utility commands are all those
other than SELECT, INSERT, UPDATE, DELETE, CALL and SET.
The default value is on. Only superusers can change this setting.

--
Thanks,
Sami Imseih
Amazon Web Services (AWS)

From: "Drouvot, Bertrand" <bdrouvot@amazon.com>
Date: Friday, September 2, 2022 at 4:06 AM
To: "Imseih (AWS), Sami" <simseih@amazon.com>, "Schneider (AWS), Jeremy" <schnjere@amazon.com>, Andres Freund <andres@anarazel.de>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Peter Eisentraut <peter.eisentraut@enterprisedb.com>, Pavel Stehule <pavel.stehule@gmail.com>, Nikolay Samokhvalov <samokhvalov@gmail.com>
Subject: Re: [PATCH] Query Jumbling for CALL and SET utility statements

Hi,
On 9/1/22 5:13 PM, Imseih (AWS), Sami wrote:

Please find attached v2 as an attempt to do so.

+1 to the idea.

Thanks for looking at it!
I think it will be better to evaluate jstate instead of
JUMBLE_UTILITY, such as:

if (query->utilityStmt && !jstate)

instead of

if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))

Fair point, thanks!

v3 including this change is attached.

Thanks,
--

Bertrand Drouvot

PostgreSQL Contributors Team

RDS Open Source Databases

Amazon Web Services: https://aws.amazon.com

#14

Jeremy Schneider

schnjere@amazon.com

over 3 years ago

In reply to: Drouvot, Bertrand (#12)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

On 9/2/22 2:06 AM, Drouvot, Bertrand wrote:

v3 including this change is attached.

FYI: "reset all" core dumps with v3

I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patch

[postgres@ip-172-31-44-176 data]$ gdb /usr/local/pgsql-14.4/bin/postgres
core.27217
...
Core was generated by `postgres: postgres postgres [local]
RESET '.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f7776ae4821 in __strlen_sse2_pminub () from /lib64/libc.so.6
...
(gdb) bt
#0 0x00007f7776ae4821 in __strlen_sse2_pminub () from /lib64/libc.so.6
#1 0x00000000008e061c in JumbleExpr (jstate=0x1cf7f80, node=<optimized
out>) at queryjumble.c:400
#2 0x00000000008dfdd8 in JumbleQueryInternal (jstate=0x1cf7f80,
query=0x1cf7e70) at queryjumble.c:247
#3 0x00000000008e0b4b in JumbleQuery (query=query@entry=0x1cf7e70,
querytext=querytext@entry=0x1cf72f8 "reset all;") at queryjumble.c:127
#4 0x000000000056ba4b in parse_analyze (parseTree=0x1cf7ce0,
sourceText=0x1cf72f8 "reset all;", paramTypes=0x0, numParams=<optimized
out>, queryEnv=0x0) at analyze.c:130
#5 0x000000000079df63 in pg_analyze_and_rewrite
(parsetree=parsetree@entry=0x1cf7ce0,
query_string=query_string@entry=0x1cf72f8 "reset all;",
paramTypes=paramTypes@entry=0x0, numParams=numParams@entry=0,
queryEnv=queryEnv@entry=0x0) at postgres.c:657
#6 0x000000000079e472 in exec_simple_query (query_string=0x1cf72f8
"reset all;") at postgres.c:1130
#7 0x000000000079f9d3 in PostgresMain (argc=argc@entry=1,
argv=argv@entry=0x7ffd0c341f80, dbname=0x1d44948 "postgres",
username=<optimized out>) at postgres.c:4496
#8 0x000000000048c9f3 in BackendRun (port=<optimized out>,
port=<optimized out>) at postmaster.c:4530
#9 BackendStartup (port=0x1d3bdd0) at postmaster.c:4252
#10 ServerLoop () at postmaster.c:1745
#11 0x0000000000721332 in PostmasterMain (argc=argc@entry=5,
argv=argv@entry=0x1cf1e10) at postmaster.c:1417
#12 0x000000000048da6e in main (argc=5, argv=0x1cf1e10) at main.c:209

--
Jeremy Schneider
Database Engineer
Amazon Web Services

#15

Michael Paquier

michael@paquier.xyz

over 3 years ago

In reply to: Jeremy Schneider (#14)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:

I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patch

What happens on HEAD? That would be the target branch for a new
feature.
--
Michael

#16

Drouvot, Bertrand

bdrouvot@amazon.com

over 3 years ago

In reply to: Michael Paquier (#15)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

Hi,

On 9/8/22 7:23 AM, Michael Paquier wrote:

On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:

I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patch

Thanks Jeremy for reporting the issue!

What happens on HEAD? That would be the target branch for a new
feature.

Just tested and i can see the same issue on HEAD.

Issue is on stmt->name being NULL here:

Breakpoint 2, JumbleExpr (jstate=0x55d60e769e30, node=0x55d60e769b60) at
queryjumble.c:364
364             if (node == NULL)
(gdb) n
368             check_stack_depth();
(gdb)
374             APP_JUMB(node->type);
(gdb)
376             switch (nodeTag(node))
(gdb)
398                                     VariableSetStmt *stmt =
(VariableSetStmt *) node;
(gdb) n
400 APP_JUMB_STRING(stmt->name);
(gdb) p stmt->name
$1 = 0x0

I'll have a closer look.

Regards,

Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com

#17

Julien Rouhaud

rjuju123@gmail.com

over 3 years ago

In reply to: Michael Paquier (#15)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

On Thu, Sep 08, 2022 at 02:23:19PM +0900, Michael Paquier wrote:

On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:

I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patch

What happens on HEAD? That would be the target branch for a new
feature.

It would be the same AFAICS. From v3:

+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				APP_JUMB_STRING(stmt->name);
+				JumbleExpr(jstate, (Node *) stmt->args);
+			}

For a RESET ALL command stmt->name is NULL.

#18

Drouvot, Bertrand

bdrouvot@amazon.com

over 3 years ago

In reply to: Julien Rouhaud (#17)

1 attachment(s)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

Hi,

On 9/8/22 8:50 AM, Julien Rouhaud wrote:

Thanks for looking at it!

On Thu, Sep 08, 2022 at 02:23:19PM +0900, Michael Paquier wrote:

On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:

I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patch

What happens on HEAD? That would be the target branch for a new
feature.

It would be the same AFAICS. From v3:
+               case T_VariableSetStmt:
+                       {
+                               VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+                               APP_JUMB_STRING(stmt->name);
+                               JumbleExpr(jstate, (Node *) stmt->args);
+                       }
For a RESET ALL command stmt->name is NULL.

Right, please find attached v4 addressing the issue and also Sami's
comments [1]/messages/by-id/82A35172-BEB3-4DFA-B11C-AE5E50A0F932@amazon.com.

[1]: /messages/by-id/82A35172-BEB3-4DFA-B11C-AE5E50A0F932@amazon.com
/messages/by-id/82A35172-BEB3-4DFA-B11C-AE5E50A0F932@amazon.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com

Attachments:

v4-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v4-0001-JumbleQuery-on-Call-and-Set.patchDownload

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index ff0166fb9d..e1834b0782 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -272,7 +272,7 @@ FROM pg_stat_statements ORDER BY query COLLATE "C";
  wal_records > $2 as wal_records_generated,               +|       |      |                     |                       | 
  wal_records >= rows as wal_records_ge_rows               +|       |      |                     |                       | 
  FROM pg_stat_statements ORDER BY query COLLATE "C"        |       |      |                     |                       | 
- SET pg_stat_statements.track_utility = FALSE              |     1 |    0 | f                   | f                     | t
+ SET pg_stat_statements.track_utility = $1                 |     1 |    0 | f                   | f                     | t
  UPDATE pgss_test SET b = $1 WHERE a > $2                  |     1 |    3 | t                   | t                     | t
 (7 rows)
 
@@ -423,6 +423,147 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (6 rows)
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - $2 - $3)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - 1 - 1.0)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- SET
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ set enable_seqscan=$1                                                        |     2 |    0
+ set seq_page_cost=$1                                                         |     2 |    0
+(4 rows)
+
+SET pg_stat_statements.track_utility = FALSE;
 --
 -- queries with locking clauses
 --
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index ba868f0de9..62de9b9325 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -106,6 +106,11 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
 									!IsA(n, PrepareStmt) && \
 									!IsA(n, DeallocateStmt))
+/*
+ * Force track those utility statements
+ * whatever the value of pgss_track_utility is.
+ */
+#define FORCE_TRACK_UTILITY(n)	(IsA(n, CallStmt))
 
 /*
  * Extension version number, for supporting older extension versions' objects
@@ -832,7 +837,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 	 * inherit from the underlying statement's one (except DEALLOCATE which is
 	 * entirely untracked).
 	 */
-	if (query->utilityStmt)
+	if (query->utilityStmt && !jstate)
 	{
 		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
 			query->queryId = UINT64CONST(0);
@@ -1097,7 +1102,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 * that user configured another extension to handle utility statements
 	 * only.
 	 */
-	if (pgss_enabled(exec_nested_level) && pgss_track_utility)
+	if (pgss_enabled(exec_nested_level) &&
+		(pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)))
 		pstmt->queryId = UINT64CONST(0);
 
 	/*
@@ -1114,7 +1120,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 *
 	 * Likewise, we don't track execution of DEALLOCATE.
 	 */
-	if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
+	if ((pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)) &&
+		pgss_enabled(exec_nested_level) &&
 		PGSS_HANDLED_UTILITY(parsetree))
 	{
 		instr_time	start;
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index a01f183727..92359f0184 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -201,6 +201,92 @@ SELECT PLUS_ONE(1);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.track_utility = TRUE;
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.track_utility = TRUE;
+
+-- SET
+SELECT pg_stat_statements_reset();
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = FALSE;
+
 --
 -- queries with locking clauses
 --
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index ecf6cd6bf3..f0cbbec9aa 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -487,13 +487,14 @@
 
   <para>
    Plannable queries (that is, <command>SELECT</command>, <command>INSERT</command>,
-   <command>UPDATE</command>, and <command>DELETE</command>) are combined into a single
-   <structname>pg_stat_statements</structname> entry whenever they have identical query
-   structures according to an internal hash calculation.  Typically, two
-   queries will be considered the same for this purpose if they are
-   semantically equivalent except for the values of literal constants
-   appearing in the query.  Utility commands (that is, all other commands)
-   are compared strictly on the basis of their textual query strings, however.
+   <command>UPDATE</command>, and <command>DELETE</command>) as well as
+   <command>CALL</command> and <command>SET</command> commands are combined into
+   a single <structname>pg_stat_statements</structname> entry whenever they have
+   identical query structures according to an internal hash calculation.
+   Typically, two queries will be considered the same for this purpose if they are
+   semantically equivalent except for the values of literal constants appearing
+   in the command. All other commands are compared strictly on the basis
+   of their textual query strings, however.
   </para>
 
   <note>
@@ -781,9 +782,10 @@
     <listitem>
      <para>
       <varname>pg_stat_statements.track_utility</varname> controls whether
-      utility commands are tracked by the module.  Utility commands are
+      utility commands are tracked by the module. Tracked utility commands are
       all those other than <command>SELECT</command>, <command>INSERT</command>,
-      <command>UPDATE</command> and <command>DELETE</command>.
+      <command>UPDATE</command>, <command>DELETE</command>,
+      <command>CALL</command> and <command>SET</command>.
       The default value is <literal>on</literal>.
       Only superusers can change this setting.
      </para>
diff --git a/src/backend/utils/misc/queryjumble.c b/src/backend/utils/misc/queryjumble.c
index a67487e5fe..4bf489e66f 100644
--- a/src/backend/utils/misc/queryjumble.c
+++ b/src/backend/utils/misc/queryjumble.c
@@ -104,7 +104,7 @@ JumbleQuery(Query *query, const char *querytext)
 
 	Assert(IsQueryIdEnabled());
 
-	if (query->utilityStmt)
+	if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))
 	{
 		query->queryId = compute_utility_query_id(querytext,
 												  query->stmt_location,
@@ -240,10 +240,11 @@ static void
 JumbleQueryInternal(JumbleState *jstate, Query *query)
 {
 	Assert(IsA(query, Query));
-	Assert(query->utilityStmt == NULL);
+	Assert(query->utilityStmt == NULL || JUMBLE_UTILITY(query->utilityStmt));
 
 	APP_JUMB(query->commandType);
 	/* resultRelation is usually predictable from commandType */
+	JumbleExpr(jstate, (Node *) query->utilityStmt);
 	JumbleExpr(jstate, (Node *) query->cteList);
 	JumbleRangeTable(jstate, query->rtable);
 	JumbleExpr(jstate, (Node *) query->jointree);
@@ -383,6 +384,34 @@ JumbleExpr(JumbleState *jstate, Node *node)
 				APP_JUMB(var->varlevelsup);
 			}
 			break;
+		case T_CallStmt:
+			{
+				CallStmt   *stmt = (CallStmt *) node;
+				FuncExpr   *expr = stmt->funcexpr;
+
+				APP_JUMB(expr->funcid);
+				JumbleExpr(jstate, (Node *) expr->args);
+			}
+			break;
+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				/* stmt->name is NULL for RESET ALL */
+				if (stmt->name)
+				{
+					APP_JUMB_STRING(stmt->name);
+					JumbleExpr(jstate, (Node *) stmt->args);
+				}
+			}
+			break;
+		case T_A_Const:
+			{
+				int			loc = ((const A_Const *) node)->location;
+
+				RecordConstLocation(jstate, loc);
+			}
+			break;
 		case T_Const:
 			{
 				Const	   *c = (Const *) node;
diff --git a/src/include/utils/queryjumble.h b/src/include/utils/queryjumble.h
index 3c2d9beab2..cada9a1ed3 100644
--- a/src/include/utils/queryjumble.h
+++ b/src/include/utils/queryjumble.h
@@ -18,6 +18,12 @@
 
 #define JUMBLE_SIZE				1024	/* query serialization buffer size */
 
+/*
+ * Jumble those utility statements
+ */
+#define JUMBLE_UTILITY(n)	(IsA(n, CallStmt) || \
+							IsA(n, VariableSetStmt))
+
 /*
  * Struct for tracking locations/lengths of constants during normalization
  */

#19

Julien Rouhaud

rjuju123@gmail.com

over 3 years ago

In reply to: Drouvot, Bertrand (#18)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

Hi,

On Thu, Sep 08, 2022 at 11:06:51AM +0200, Drouvot, Bertrand wrote:

Hi,

On 9/8/22 8:50 AM, Julien Rouhaud wrote:

Thanks for looking at it!
On Thu, Sep 08, 2022 at 02:23:19PM +0900, Michael Paquier wrote:

On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:

I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patch

What happens on HEAD? That would be the target branch for a new
feature.

It would be the same AFAICS. From v3:
+               case T_VariableSetStmt:
+                       {
+                               VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+                               APP_JUMB_STRING(stmt->name);
+                               JumbleExpr(jstate, (Node *) stmt->args);
+                       }
For a RESET ALL command stmt->name is NULL.
Right, please find attached v4 addressing the issue and also Sami's comments
[1].

(Sorry I've not been following this thread until now)

IME if your application relies on 2PC it's very likely that you will hit the
exact same problems described in your original email. What do you think about
normalizing those too while working on the subject?

#20

Drouvot, Bertrand

bdrouvot@amazon.com

over 3 years ago

In reply to: Julien Rouhaud (#19)

Re: [PATCH] Query Jumbling for CALL and SET utility statements

Hi,

On 9/8/22 1:29 PM, Julien Rouhaud wrote:

Hi,

On Thu, Sep 08, 2022 at 11:06:51AM +0200, Drouvot, Bertrand wrote:
Hi,

On 9/8/22 8:50 AM, Julien Rouhaud wrote:

Thanks for looking at it!
On Thu, Sep 08, 2022 at 02:23:19PM +0900, Michael Paquier wrote:

On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:

I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patch

What happens on HEAD? That would be the target branch for a new
feature.

It would be the same AFAICS. From v3:
+               case T_VariableSetStmt:
+                       {
+                               VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+                               APP_JUMB_STRING(stmt->name);
+                               JumbleExpr(jstate, (Node *) stmt->args);
+                       }
For a RESET ALL command stmt->name is NULL.
Right, please find attached v4 addressing the issue and also Sami's comments
[1].
(Sorry I've not been following this thread until now)

IME if your application relies on 2PC it's very likely that you will hit the
exact same problems described in your original email.

Agree

What do you think about
normalizing those too while working on the subject?

That sounds reasonable, I'll have a look at those too while at it.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com

#21

Drouvot, Bertrand

bdrouvot@amazon.com

over 3 years ago

In reply to: Drouvot, Bertrand (#20)

1 attachment(s)

Re: Query Jumbling for CALL and SET utility statements

Hi,

On 9/8/22 6:07 PM, Drouvot, Bertrand wrote:

On 9/8/22 1:29 PM, Julien Rouhaud wrote:

IME if your application relies on 2PC it's very likely that you will hit the
exact same problems described in your original email.

Agree

What do you think about
normalizing those too while working on the subject?

That sounds reasonable, I'll have a look at those too while at it.

Attached v5 to normalize 2PC commands too, so that we get things like:

create table test_tx (a int);
begin;
prepare transaction 'tx1';
insert into test_tx values (1);
commit prepared 'tx1';
begin;
prepare transaction 'tx2';
insert into test_tx values (2);
commit prepared 'tx2';
begin;
prepare transaction 'tx3';
insert into test_tx values (3);
rollback prepared 'tx3';
begin;
prepare transaction 'tx4';
insert into test_tx values (4);
rollback prepared 'tx4';
SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE
"C";
query                                     | calls | rows
------------------------------------------------------------------------------+-------+------
SELECT pg_stat_statements_reset() |     1 |    1
SELECT query, calls, rows FROM pg_stat_statements ORDER BY query
COLLATE "C" |     0 |    0
begin |     4 |    0
commit prepared $1 |     2 |    0
create table test_tx (a
int)                                                 |     1 |    0
insert into test_tx values
($1)                                              |     4 |    4
prepare transaction
$1                                                       |     4 |    0
rollback prepared
$1                                                         | 2 |    0
(8 rows)

For those ones I also had to do some minor changes in gram.y and to the
TransactionStmt struct to record the gid location.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com

Attachments:

v5-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v5-0001-JumbleQuery-on-Call-and-Set.patchDownload

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index ff0166fb9d..e97d2c0f02 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -272,7 +272,7 @@ FROM pg_stat_statements ORDER BY query COLLATE "C";
  wal_records > $2 as wal_records_generated,               +|       |      |                     |                       | 
  wal_records >= rows as wal_records_ge_rows               +|       |      |                     |                       | 
  FROM pg_stat_statements ORDER BY query COLLATE "C"        |       |      |                     |                       | 
- SET pg_stat_statements.track_utility = FALSE              |     1 |    0 | f                   | f                     | t
+ SET pg_stat_statements.track_utility = $1                 |     1 |    0 | f                   | f                     | t
  UPDATE pgss_test SET b = $1 WHERE a > $2                  |     1 |    3 | t                   | t                     | t
 (7 rows)
 
@@ -423,6 +423,147 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (6 rows)
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - $2 - $3)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - 1 - 1.0)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- SET
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ set enable_seqscan=$1                                                        |     2 |    0
+ set seq_page_cost=$1                                                         |     2 |    0
+(4 rows)
+
+SET pg_stat_statements.track_utility = FALSE;
 --
 -- queries with locking clauses
 --
@@ -553,11 +694,80 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (9 rows)
 
+-- 2PC
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+create table test_tx (a int);
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ begin                                                                        |     4 |    0
+ commit prepared $1                                                           |     2 |    0
+ create table test_tx (a int)                                                 |     1 |    0
+ insert into test_tx values ($1)                                              |     4 |    4
+ prepare transaction $1                                                       |     4 |    0
+ rollback prepared $1                                                         |     2 |    0
+(8 rows)
+
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ insert into test_tx values ($1)                                              |     4 |    4
+(3 rows)
+
 --
 -- Track the total number of rows retrieved or affected by the utility
 -- commands of COPY, FETCH, CREATE TABLE AS, CREATE MATERIALIZED VIEW,
 -- REFRESH MATERIALIZED VIEW and SELECT INTO
 --
+SET pg_stat_statements.track_utility = TRUE;
 SELECT pg_stat_statements_reset();
  pg_stat_statements_reset 
 --------------------------
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index ba868f0de9..62de9b9325 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -106,6 +106,11 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
 									!IsA(n, PrepareStmt) && \
 									!IsA(n, DeallocateStmt))
+/*
+ * Force track those utility statements
+ * whatever the value of pgss_track_utility is.
+ */
+#define FORCE_TRACK_UTILITY(n)	(IsA(n, CallStmt))
 
 /*
  * Extension version number, for supporting older extension versions' objects
@@ -832,7 +837,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 	 * inherit from the underlying statement's one (except DEALLOCATE which is
 	 * entirely untracked).
 	 */
-	if (query->utilityStmt)
+	if (query->utilityStmt && !jstate)
 	{
 		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
 			query->queryId = UINT64CONST(0);
@@ -1097,7 +1102,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 * that user configured another extension to handle utility statements
 	 * only.
 	 */
-	if (pgss_enabled(exec_nested_level) && pgss_track_utility)
+	if (pgss_enabled(exec_nested_level) &&
+		(pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)))
 		pstmt->queryId = UINT64CONST(0);
 
 	/*
@@ -1114,7 +1120,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 *
 	 * Likewise, we don't track execution of DEALLOCATE.
 	 */
-	if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
+	if ((pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)) &&
+		pgss_enabled(exec_nested_level) &&
 		PGSS_HANDLED_UTILITY(parsetree))
 	{
 		instr_time	start;
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index a01f183727..7b44aef340 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -201,6 +201,92 @@ SELECT PLUS_ONE(1);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.track_utility = TRUE;
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.track_utility = TRUE;
+
+-- SET
+SELECT pg_stat_statements_reset();
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = FALSE;
+
 --
 -- queries with locking clauses
 --
@@ -250,11 +336,63 @@ DROP FUNCTION PLUS_TWO(INTEGER);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- 2PC
+SELECT pg_stat_statements_reset();
+
+create table test_tx (a int);
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
 --
 -- Track the total number of rows retrieved or affected by the utility
 -- commands of COPY, FETCH, CREATE TABLE AS, CREATE MATERIALIZED VIEW,
 -- REFRESH MATERIALIZED VIEW and SELECT INTO
 --
+SET pg_stat_statements.track_utility = TRUE;
 SELECT pg_stat_statements_reset();
 
 CREATE TABLE pgss_ctas AS SELECT a, 'ctas' b FROM generate_series(1, 10) a;
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index ecf6cd6bf3..ff106b951f 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -487,13 +487,16 @@
 
   <para>
    Plannable queries (that is, <command>SELECT</command>, <command>INSERT</command>,
-   <command>UPDATE</command>, and <command>DELETE</command>) are combined into a single
-   <structname>pg_stat_statements</structname> entry whenever they have identical query
-   structures according to an internal hash calculation.  Typically, two
-   queries will be considered the same for this purpose if they are
-   semantically equivalent except for the values of literal constants
-   appearing in the query.  Utility commands (that is, all other commands)
-   are compared strictly on the basis of their textual query strings, however.
+   <command>UPDATE</command>, and <command>DELETE</command>) as well as
+   <command>CALL</command>, <command>SET</command> and two-phase commit commands
+   <command>PREPARE TRANSACTION</command>, <command>COMMIT PREPARED</command>
+   and <command>ROLLBACK PREPARED</command> are combined into a single
+   <structname>pg_stat_statements</structname> entry whenever they have
+   identical query structures according to an internal hash calculation.
+   Typically, two queries will be considered the same for this purpose if they are
+   semantically equivalent except for the values of literal constants appearing
+   in the command. All other commands are compared strictly on the basis
+   of their textual query strings, however.
   </para>
 
   <note>
@@ -781,9 +784,9 @@
     <listitem>
      <para>
       <varname>pg_stat_statements.track_utility</varname> controls whether
-      utility commands are tracked by the module.  Utility commands are
+      utility commands are tracked by the module. Tracked utility commands are
       all those other than <command>SELECT</command>, <command>INSERT</command>,
-      <command>UPDATE</command> and <command>DELETE</command>.
+      <command>UPDATE</command>, <command>DELETE</command> and <command>CALL</command>.
       The default value is <literal>on</literal>.
       Only superusers can change this setting.
      </para>
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ea33784316..1ffa943a7d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10890,6 +10890,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_PREPARE;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 			| COMMIT PREPARED Sconst
@@ -10898,6 +10899,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_COMMIT_PREPARED;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 			| ROLLBACK PREPARED Sconst
@@ -10906,6 +10908,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_ROLLBACK_PREPARED;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 		;
diff --git a/src/backend/utils/misc/queryjumble.c b/src/backend/utils/misc/queryjumble.c
index a67487e5fe..682075a798 100644
--- a/src/backend/utils/misc/queryjumble.c
+++ b/src/backend/utils/misc/queryjumble.c
@@ -104,7 +104,7 @@ JumbleQuery(Query *query, const char *querytext)
 
 	Assert(IsQueryIdEnabled());
 
-	if (query->utilityStmt)
+	if (query->utilityStmt && !IsJumbleUtilityAllowed(query->utilityStmt))
 	{
 		query->queryId = compute_utility_query_id(querytext,
 												  query->stmt_location,
@@ -240,10 +240,11 @@ static void
 JumbleQueryInternal(JumbleState *jstate, Query *query)
 {
 	Assert(IsA(query, Query));
-	Assert(query->utilityStmt == NULL);
+	Assert(query->utilityStmt == NULL || IsJumbleUtilityAllowed(query->utilityStmt));
 
 	APP_JUMB(query->commandType);
 	/* resultRelation is usually predictable from commandType */
+	JumbleExpr(jstate, (Node *) query->utilityStmt);
 	JumbleExpr(jstate, (Node *) query->cteList);
 	JumbleRangeTable(jstate, query->rtable);
 	JumbleExpr(jstate, (Node *) query->jointree);
@@ -383,6 +384,46 @@ JumbleExpr(JumbleState *jstate, Node *node)
 				APP_JUMB(var->varlevelsup);
 			}
 			break;
+		case T_CallStmt:
+			{
+				CallStmt   *stmt = (CallStmt *) node;
+				FuncExpr   *expr = stmt->funcexpr;
+
+				APP_JUMB(expr->funcid);
+				JumbleExpr(jstate, (Node *) expr->args);
+			}
+			break;
+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				/* stmt->name is NULL for RESET ALL */
+				if (stmt->name)
+				{
+					APP_JUMB_STRING(stmt->name);
+					JumbleExpr(jstate, (Node *) stmt->args);
+				}
+			}
+			break;
+		case T_A_Const:
+			{
+				int			loc = ((const A_Const *) node)->location;
+
+				RecordConstLocation(jstate, loc);
+			}
+			break;
+		case T_TransactionStmt:
+			{
+				TransactionStmt *stmt = (TransactionStmt *) node;
+
+				Assert(stmt->kind == TRANS_STMT_PREPARE ||
+					   stmt->kind == TRANS_STMT_COMMIT_PREPARED ||
+					   stmt->kind == TRANS_STMT_ROLLBACK_PREPARED);
+
+				APP_JUMB(stmt->kind);
+				RecordConstLocation(jstate, stmt->gid_location);
+			}
+			break;
 		case T_Const:
 			{
 				Const	   *c = (Const *) node;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6958306a7d..e9823c51f9 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3281,6 +3281,7 @@ typedef struct TransactionStmt
 	char	   *savepoint_name; /* for savepoint commands */
 	char	   *gid;			/* for two-phase-commit related commands */
 	bool		chain;			/* AND CHAIN option */
+	int			gid_location;	/* gid location */
 } TransactionStmt;
 
 /* ----------------------
diff --git a/src/include/utils/queryjumble.h b/src/include/utils/queryjumble.h
index 3c2d9beab2..dfbb060f55 100644
--- a/src/include/utils/queryjumble.h
+++ b/src/include/utils/queryjumble.h
@@ -85,4 +85,24 @@ IsQueryIdEnabled(void)
 	return query_id_enabled;
 }
 
+/*
+ * Jumble those utility statements
+ */
+static inline bool
+IsJumbleUtilityAllowed(Node *n)
+{
+	if (IsA(n, CallStmt) || IsA(n, VariableSetStmt))
+		return true;
+	else if (IsA(n, TransactionStmt))
+	{
+		TransactionStmt *stmt = (TransactionStmt *) n;
+
+		return (stmt->kind == TRANS_STMT_PREPARE ||
+				stmt->kind == TRANS_STMT_COMMIT_PREPARED ||
+				stmt->kind == TRANS_STMT_ROLLBACK_PREPARED);
+	}
+	else
+		return false;
+}
+
 #endif							/* QUERYJUMBLE_H */

#22

bt22kawamotok

bt22kawamotok@oss.nttdata.com

over 3 years ago

In reply to: Drouvot, Bertrand (#21)

Re: Query Jumbling for CALL and SET utility statements

Attached v5 to normalize 2PC commands too, so that we get things like:

create table test_tx (a int);
begin;
prepare transaction 'tx1';
insert into test_tx values (1);
commit prepared 'tx1';
begin;
prepare transaction 'tx2';
insert into test_tx values (2);
commit prepared 'tx2';
begin;
prepare transaction 'tx3';
insert into test_tx values (3);
rollback prepared 'tx3';
begin;
prepare transaction 'tx4';
insert into test_tx values (4);
rollback prepared 'tx4';
SELECT query, calls, rows FROM pg_stat_statements ORDER BY query
COLLATE "C";
query
| calls | rows
------------------------------------------------------------------------------+-------+------
SELECT pg_stat_statements_reset()
| 1 | 1
SELECT query, calls, rows FROM pg_stat_statements ORDER BY query
COLLATE "C" | 0 | 0
begin
| 4 | 0
commit prepared $1
| 2 | 0
create table test_tx (a int)
| 1 | 0
insert into test_tx values ($1)
| 4 | 4
prepare transaction $1
| 4 | 0
rollback prepared $1
| 2 | 0
(8 rows)

For those ones I also had to do some minor changes in gram.y and to
the TransactionStmt struct to record the gid location.

Thanks Bertrand.
I used your patch. It's looks very good.
I found that utility statement is counted separately in upper and lower
case.
For example BEGIN and begin are counted separately.
Is it difficult to fix this problem?

Regards,

Kotaro Kawamoto

#23

Julien Rouhaud

rjuju123@gmail.com

over 3 years ago

In reply to: bt22kawamotok (#22)

Re: Query Jumbling for CALL and SET utility statements

Hi,

On Tue, Sep 13, 2022 at 11:43:52AM +0900, bt22kawamotok wrote:

I found that utility statement is counted separately in upper and lower
case.
For example BEGIN and begin are counted separately.
Is it difficult to fix this problem?

This is a known behavior, utility command aren't normalized (apart from the few
that will be with this patch) and the queryid is just a hash of the provided
query string.

It seems unrelated to this patch though. While it can be a bit annoying, it's
unlikely that the application will have thousands of way to ask for a new
transaction (mixing case, adding a random number of spaces between BEGIN and
TRANSACTION and so on), so in real life it won't cause any problem. Fixing it
would require to fully jumble all utility statements, which would require a
separate discussion.

#24

Drouvot, Bertrand

bdrouvot@amazon.com

over 3 years ago

In reply to: Julien Rouhaud (#23)

Re: Query Jumbling for CALL and SET utility statements

Hi,

On 9/13/22 6:33 AM, Julien Rouhaud wrote:

Hi,

On Tue, Sep 13, 2022 at 11:43:52AM +0900, bt22kawamotok wrote:

I found that utility statement is counted separately in upper and lower
case.
For example BEGIN and begin are counted separately.
Is it difficult to fix this problem?

This is a known behavior, utility command aren't normalized (apart from the few
that will be with this patch) and the queryid is just a hash of the provided
query string.

It seems unrelated to this patch though. While it can be a bit annoying, it's
unlikely that the application will have thousands of way to ask for a new
transaction (mixing case, adding a random number of spaces between BEGIN and
TRANSACTION and so on), so in real life it won't cause any problem.

Agree that it seems unlikely to cause any problem (as compare to the
utility statements that are handled in this patch).

Fixing it
would require to fully jumble all utility statements, which would require a
separate discussion.

Agree.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284

Amazon Web Services EMEA SARL, succursale francaise, 31 Place des Corolles, Tour Carpe Diem, F-92400 Courbevoie, SIREN 831 001 334, RCS Nanterre, APE 6311Z, TVA FR30831001334

#25

Imseih (AWS), Sami

simseih@amazon.com

over 3 years ago

In reply to: Drouvot, Bertrand (#21)

Re: Query Jumbling for CALL and SET utility statements

Attached v5 to normalize 2PC commands too, so that we get things like:

A nit on the documentation for v5, otherwise lgtm.

Plannable queries (that is, SELECT, INSERT, UPDATE, and DELETE) as well as CALL, SET
and two-phase commit commands PREPARE TRANSACTION, , COMMIT PREPARED
and ROLLBACK PREPARED are combined

---- to ----

Plannable queries (that is, SELECT, INSERT, UPDATE, and DELETE) as well as CALL,
SET, PREPARE TRANSACTION, COMMIT PREPARED and ROLLBACK PREPARED are combined

---

Regards,

Sami Imseih
Amazon Web Services (AWS)

#26

Fujii Masao

masao.fujii@oss.nttdata.com

over 3 years ago

In reply to: Drouvot, Bertrand (#21)

Re: Query Jumbling for CALL and SET utility statements

On 2022/09/09 19:11, Drouvot, Bertrand wrote:

IME if your application relies on 2PC it's very likely that you will hit the
exact same problems described in your original email.

The utility commands for cursor like DECLARE CURSOR seem to have the same issue
and can cause lots of pgss entries. For example, when we use postgres_fdw and
execute "SELECT * FROM <foreign table> WHERE id = 10" five times in the same
transaction, the following commands are executed in the remote PostgreSQL server
and recorded as pgss entries there.

DECLARE c1 CURSOR FOR ...
DECLARE c2 CURSOR FOR ...
DECLARE c3 CURSOR FOR ...
DECLARE c4 CURSOR FOR ...
DECLARE c5 CURSOR FOR ...
FETCH 100 FROM c1
FETCH 100 FROM c2
FETCH 100 FROM c3
FETCH 100 FROM c4
FETCH 100 FROM c5
CLOSE c1
CLOSE c2
CLOSE c3
CLOSE c4
CLOSE c5

Furthermore, if the different query on foreign table is executed in the next
transaction, it may reuse the same cursor name previously used by another query.
That is, different queries can cause the same FETCH command like
"FETCH 100 FROM c1". This would be also an issue.

I'm not sure if the patch should also handle cursor cases. We can implement
that separately later if necessary.

I don't think that the patch should include the fix for cursor cases. It can be implemented separately later if necessary.

Attached v5 to normalize 2PC commands too, so that we get things like:

+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				/* stmt->name is NULL for RESET ALL */
+				if (stmt->name)
+				{
+					APP_JUMB_STRING(stmt->name);
+					JumbleExpr(jstate, (Node *) stmt->args);

With the patch, "SET ... TO DEFAULT" and "RESET ..." are counted as the same query.
Is this intentional? Which might be ok because their behavior is basically the same.
But I'm afaid which may cause users to be confused. For example, they may fail to
find the pgss entry for RESET command they ran and just wonder why the command was
not recorded. To avoid such confusion, how about appending stmt->kind to the jumble?
Thought?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#27

Imseih (AWS), Sami

simseih@amazon.com

over 3 years ago

In reply to: Fujii Masao (#26)

Re: Query Jumbling for CALL and SET utility statements

The utility commands for cursor like DECLARE CURSOR seem to have the same issue
and can cause lots of pgss entries. For example, when we use postgres_fdw and
execute "SELECT * FROM <foreign table> WHERE id = 10" five times in the same
transaction, the following commands are executed in the remote PostgreSQL server
and recorded as pgss entries there.

DECLARE c1 CURSOR FOR ...
DECLARE c2 CURSOR FOR ...
DECLARE c3 CURSOR FOR ...

I also made this observation recently and have a patch to suggest
to improve tis situation. I will start a separate thread for this.

Regards,

--
Sami Imseih
Amazon Web Services (AWS)

#28

Drouvot, Bertrand

bertranddrouvot.pg@gmail.com

over 3 years ago

In reply to: Fujii Masao (#26)

Re: Query Jumbling for CALL and SET utility statements

Hi,

On 9/16/22 2:53 PM, Fujii Masao wrote:

Attached v5 to normalize 2PC commands too, so that we get things like:

+        case T_VariableSetStmt:
+            {
+                VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+                /* stmt->name is NULL for RESET ALL */
+                if (stmt->name)
+                {
+                    APP_JUMB_STRING(stmt->name);
+                    JumbleExpr(jstate, (Node *) stmt->args);

With the patch, "SET ... TO DEFAULT" and "RESET ..." are counted as the
same query.
Is this intentional?

Thanks for looking at the patch!
No, it is not intentional, good catch!

Which might be ok because their behavior is
basically the same.
But I'm afaid which may cause users to be confused. For example, they
may fail to
find the pgss entry for RESET command they ran and just wonder why the
command was
not recorded. To avoid such confusion, how about appending stmt->kind to
the jumble?
Thought?

I think that's a good idea and will provide a new version taking care of
it (and also Sami's comments up-thread).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#29

Drouvot, Bertrand

bertranddrouvot.pg@gmail.com

over 3 years ago

In reply to: Drouvot, Bertrand (#28)

1 attachment(s)

Re: Query Jumbling for CALL and SET utility statements

Hi,

On 9/16/22 5:47 PM, Drouvot, Bertrand wrote:

Hi,

On 9/16/22 2:53 PM, Fujii Masao wrote:
Attached v5 to normalize 2PC commands too, so that we get things like:
+        case T_VariableSetStmt:
+            {
+                VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+                /* stmt->name is NULL for RESET ALL */
+                if (stmt->name)
+                {
+                    APP_JUMB_STRING(stmt->name);
+                    JumbleExpr(jstate, (Node *) stmt->args);
With the patch, "SET ... TO DEFAULT" and "RESET ..." are counted as
the same query.
Is this intentional?
Thanks for looking at the patch!
No, it is not intentional, good catch!

Which might be ok because their behavior is basically the same.
But I'm afaid which may cause users to be confused. For example, they
may fail to
find the pgss entry for RESET command they ran and just wonder why the
command was
not recorded. To avoid such confusion, how about appending stmt->kind
to the jumble?
Thought?

I think that's a good idea and will provide a new version taking care of
it (and also Sami's comments up-thread).

Please find attached v6 taking care of the remarks mentioned above.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v6-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v6-0001-JumbleQuery-on-Call-and-Set.patchDownload

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index ff0166fb9d..ad8cebcbad 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -272,7 +272,7 @@ FROM pg_stat_statements ORDER BY query COLLATE "C";
  wal_records > $2 as wal_records_generated,               +|       |      |                     |                       | 
  wal_records >= rows as wal_records_ge_rows               +|       |      |                     |                       | 
  FROM pg_stat_statements ORDER BY query COLLATE "C"        |       |      |                     |                       | 
- SET pg_stat_statements.track_utility = FALSE              |     1 |    0 | f                   | f                     | t
+ SET pg_stat_statements.track_utility = $1                 |     1 |    0 | f                   | f                     | t
  UPDATE pgss_test SET b = $1 WHERE a > $2                  |     1 |    3 | t                   | t                     | t
 (7 rows)
 
@@ -423,6 +423,155 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (6 rows)
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - $2 - $3)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - 1 - 1.0)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- SET
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+set enable_seqscan to default;
+set seq_page_cost to default;
+reset seq_page_cost;
+reset enable_seqscan;
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ reset enable_seqscan                                                         |     1 |    0
+ reset seq_page_cost                                                          |     1 |    0
+ set enable_seqscan to default                                                |     1 |    0
+ set enable_seqscan=$1                                                        |     2 |    0
+ set seq_page_cost to default                                                 |     1 |    0
+ set seq_page_cost=$1                                                         |     2 |    0
+(8 rows)
+
+SET pg_stat_statements.track_utility = FALSE;
 --
 -- queries with locking clauses
 --
@@ -553,11 +702,80 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (9 rows)
 
+-- 2PC
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+create table test_tx (a int);
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ begin                                                                        |     4 |    0
+ commit prepared $1                                                           |     2 |    0
+ create table test_tx (a int)                                                 |     1 |    0
+ insert into test_tx values ($1)                                              |     4 |    4
+ prepare transaction $1                                                       |     4 |    0
+ rollback prepared $1                                                         |     2 |    0
+(8 rows)
+
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ insert into test_tx values ($1)                                              |     4 |    4
+(3 rows)
+
 --
 -- Track the total number of rows retrieved or affected by the utility
 -- commands of COPY, FETCH, CREATE TABLE AS, CREATE MATERIALIZED VIEW,
 -- REFRESH MATERIALIZED VIEW and SELECT INTO
 --
+SET pg_stat_statements.track_utility = TRUE;
 SELECT pg_stat_statements_reset();
  pg_stat_statements_reset 
 --------------------------
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index ba868f0de9..62de9b9325 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -106,6 +106,11 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
 									!IsA(n, PrepareStmt) && \
 									!IsA(n, DeallocateStmt))
+/*
+ * Force track those utility statements
+ * whatever the value of pgss_track_utility is.
+ */
+#define FORCE_TRACK_UTILITY(n)	(IsA(n, CallStmt))
 
 /*
  * Extension version number, for supporting older extension versions' objects
@@ -832,7 +837,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 	 * inherit from the underlying statement's one (except DEALLOCATE which is
 	 * entirely untracked).
 	 */
-	if (query->utilityStmt)
+	if (query->utilityStmt && !jstate)
 	{
 		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
 			query->queryId = UINT64CONST(0);
@@ -1097,7 +1102,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 * that user configured another extension to handle utility statements
 	 * only.
 	 */
-	if (pgss_enabled(exec_nested_level) && pgss_track_utility)
+	if (pgss_enabled(exec_nested_level) &&
+		(pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)))
 		pstmt->queryId = UINT64CONST(0);
 
 	/*
@@ -1114,7 +1120,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 *
 	 * Likewise, we don't track execution of DEALLOCATE.
 	 */
-	if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
+	if ((pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)) &&
+		pgss_enabled(exec_nested_level) &&
 		PGSS_HANDLED_UTILITY(parsetree))
 	{
 		instr_time	start;
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index a01f183727..d4494e81b8 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -201,6 +201,96 @@ SELECT PLUS_ONE(1);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.track_utility = TRUE;
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'all';
+SET pg_stat_statements.track_utility = TRUE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- Ensure CALL is tracked even if pg_stat_statements.track_utility is FALSE
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.track_utility = TRUE;
+
+-- SET
+SELECT pg_stat_statements_reset();
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+set enable_seqscan to default;
+set seq_page_cost to default;
+reset seq_page_cost;
+reset enable_seqscan;
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = FALSE;
+
 --
 -- queries with locking clauses
 --
@@ -250,11 +340,63 @@ DROP FUNCTION PLUS_TWO(INTEGER);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- 2PC
+SELECT pg_stat_statements_reset();
+
+create table test_tx (a int);
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
 --
 -- Track the total number of rows retrieved or affected by the utility
 -- commands of COPY, FETCH, CREATE TABLE AS, CREATE MATERIALIZED VIEW,
 -- REFRESH MATERIALIZED VIEW and SELECT INTO
 --
+SET pg_stat_statements.track_utility = TRUE;
 SELECT pg_stat_statements_reset();
 
 CREATE TABLE pgss_ctas AS SELECT a, 'ctas' b FROM generate_series(1, 10) a;
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index ecf6cd6bf3..d18f3632e9 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -487,13 +487,15 @@
 
   <para>
    Plannable queries (that is, <command>SELECT</command>, <command>INSERT</command>,
-   <command>UPDATE</command>, and <command>DELETE</command>) are combined into a single
-   <structname>pg_stat_statements</structname> entry whenever they have identical query
-   structures according to an internal hash calculation.  Typically, two
-   queries will be considered the same for this purpose if they are
-   semantically equivalent except for the values of literal constants
-   appearing in the query.  Utility commands (that is, all other commands)
-   are compared strictly on the basis of their textual query strings, however.
+   <command>UPDATE</command>, and <command>DELETE</command>) as well as
+   <command>CALL</command>, <command>SET</command>, <command>PREPARE TRANSACTION</command>,
+   <command>COMMIT PREPARED</command> and <command>ROLLBACK PREPARED</command>
+   are combined into a single <structname>pg_stat_statements</structname> entry
+   whenever they have identical query structures according to an internal hash calculation.
+   Typically, two queries will be considered the same for this purpose if they are
+   semantically equivalent except for the values of literal constants appearing
+   in the command. All other commands are compared strictly on the basis
+   of their textual query strings, however.
   </para>
 
   <note>
@@ -781,9 +783,9 @@
     <listitem>
      <para>
       <varname>pg_stat_statements.track_utility</varname> controls whether
-      utility commands are tracked by the module.  Utility commands are
+      utility commands are tracked by the module. Tracked utility commands are
       all those other than <command>SELECT</command>, <command>INSERT</command>,
-      <command>UPDATE</command> and <command>DELETE</command>.
+      <command>UPDATE</command>, <command>DELETE</command> and <command>CALL</command>.
       The default value is <literal>on</literal>.
       Only superusers can change this setting.
      </para>
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 82f03fc9c9..c7f4227508 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10890,6 +10890,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_PREPARE;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 			| COMMIT PREPARED Sconst
@@ -10898,6 +10899,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_COMMIT_PREPARED;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 			| ROLLBACK PREPARED Sconst
@@ -10906,6 +10908,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_ROLLBACK_PREPARED;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 		;
diff --git a/src/backend/utils/misc/queryjumble.c b/src/backend/utils/misc/queryjumble.c
index a67487e5fe..69890a6a26 100644
--- a/src/backend/utils/misc/queryjumble.c
+++ b/src/backend/utils/misc/queryjumble.c
@@ -104,7 +104,7 @@ JumbleQuery(Query *query, const char *querytext)
 
 	Assert(IsQueryIdEnabled());
 
-	if (query->utilityStmt)
+	if (query->utilityStmt && !IsJumbleUtilityAllowed(query->utilityStmt))
 	{
 		query->queryId = compute_utility_query_id(querytext,
 												  query->stmt_location,
@@ -240,10 +240,11 @@ static void
 JumbleQueryInternal(JumbleState *jstate, Query *query)
 {
 	Assert(IsA(query, Query));
-	Assert(query->utilityStmt == NULL);
+	Assert(query->utilityStmt == NULL || IsJumbleUtilityAllowed(query->utilityStmt));
 
 	APP_JUMB(query->commandType);
 	/* resultRelation is usually predictable from commandType */
+	JumbleExpr(jstate, (Node *) query->utilityStmt);
 	JumbleExpr(jstate, (Node *) query->cteList);
 	JumbleRangeTable(jstate, query->rtable);
 	JumbleExpr(jstate, (Node *) query->jointree);
@@ -383,6 +384,47 @@ JumbleExpr(JumbleState *jstate, Node *node)
 				APP_JUMB(var->varlevelsup);
 			}
 			break;
+		case T_CallStmt:
+			{
+				CallStmt   *stmt = (CallStmt *) node;
+				FuncExpr   *expr = stmt->funcexpr;
+
+				APP_JUMB(expr->funcid);
+				JumbleExpr(jstate, (Node *) expr->args);
+			}
+			break;
+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				/* stmt->name is NULL for RESET ALL */
+				if (stmt->name)
+				{
+					APP_JUMB(stmt->kind);
+					APP_JUMB_STRING(stmt->name);
+					JumbleExpr(jstate, (Node *) stmt->args);
+				}
+			}
+			break;
+		case T_A_Const:
+			{
+				int			loc = ((const A_Const *) node)->location;
+
+				RecordConstLocation(jstate, loc);
+			}
+			break;
+		case T_TransactionStmt:
+			{
+				TransactionStmt *stmt = (TransactionStmt *) node;
+
+				Assert(stmt->kind == TRANS_STMT_PREPARE ||
+					   stmt->kind == TRANS_STMT_COMMIT_PREPARED ||
+					   stmt->kind == TRANS_STMT_ROLLBACK_PREPARED);
+
+				APP_JUMB(stmt->kind);
+				RecordConstLocation(jstate, stmt->gid_location);
+			}
+			break;
 		case T_Const:
 			{
 				Const	   *c = (Const *) node;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6958306a7d..e9823c51f9 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3281,6 +3281,7 @@ typedef struct TransactionStmt
 	char	   *savepoint_name; /* for savepoint commands */
 	char	   *gid;			/* for two-phase-commit related commands */
 	bool		chain;			/* AND CHAIN option */
+	int			gid_location;	/* gid location */
 } TransactionStmt;
 
 /* ----------------------
diff --git a/src/include/utils/queryjumble.h b/src/include/utils/queryjumble.h
index 3c2d9beab2..dfbb060f55 100644
--- a/src/include/utils/queryjumble.h
+++ b/src/include/utils/queryjumble.h
@@ -85,4 +85,24 @@ IsQueryIdEnabled(void)
 	return query_id_enabled;
 }
 
+/*
+ * Jumble those utility statements
+ */
+static inline bool
+IsJumbleUtilityAllowed(Node *n)
+{
+	if (IsA(n, CallStmt) || IsA(n, VariableSetStmt))
+		return true;
+	else if (IsA(n, TransactionStmt))
+	{
+		TransactionStmt *stmt = (TransactionStmt *) n;
+
+		return (stmt->kind == TRANS_STMT_PREPARE ||
+				stmt->kind == TRANS_STMT_COMMIT_PREPARED ||
+				stmt->kind == TRANS_STMT_ROLLBACK_PREPARED);
+	}
+	else
+		return false;
+}
+
 #endif							/* QUERYJUMBLE_H */

#30

Fujii Masao

masao.fujii@oss.nttdata.com

over 3 years ago

In reply to: Drouvot, Bertrand (#29)

Re: Query Jumbling for CALL and SET utility statements

On 2022/09/19 15:29, Drouvot, Bertrand wrote:

Please find attached v6 taking care of the remarks mentioned above.

Thanks for updating the patch!

+SET pg_stat_statements.track_utility = TRUE;
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;

Could you tell me why track_utility is enabled just before it's disabled?

Could you tell me what actually happens if we don't drop and
recreate the procedures? I'd like to know what "any caching funnies" are.

+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";

This test set for the procedures is executed with the following
four conditions, respectively. Do we really need all of these tests?

track = top, track_utility = true
track = top, track_utility = false
track = all, track_utility = true
track = all, track_utility = false

+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';

The test set of 2PC commands is also executed with track_utility = on
and off, respectively. But why do we need to run that test when
track_utility = off?

-	if (query->utilityStmt)
+	if (query->utilityStmt && !jstate)
  	{
  		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))

"pgss_track_utility" should be
"pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)" theoretically?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#31

Drouvot, Bertrand

bertranddrouvot.pg@gmail.com

over 3 years ago

In reply to: Fujii Masao (#30)

1 attachment(s)

Re: Query Jumbling for CALL and SET utility statements

Hi,

On 9/21/22 6:07 PM, Fujii Masao wrote:

On 2022/09/19 15:29, Drouvot, Bertrand wrote:

Please find attached v6 taking care of the remarks mentioned above.

Thanks for updating the patch!
+SET pg_stat_statements.track_utility = TRUE;
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+SET pg_stat_statements.track_utility = FALSE;
Could you tell me why track_utility is enabled just before it's disabled?

Thanks for looking at the new version!

No real reason, I removed those useless SET in the new V7 attached.

Could you tell me what actually happens if we don't drop and
recreate the procedures? I'd like to know what "any caching funnies" are.

Without the drop/recreate the procedure body does not appear normalized
(while the CALL itself is) when switching from track = top to track = all.

I copy-pasted this comment from the already existing "function" section
in the pg_stat_statements.sql file. This comment has been introduced for
the function section in commit 83f2061dd0. Note that the behavior would
be the same for the function case (function body does not appear
normalized without the drop/recreate).

+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query 
COLLATE "C";
This test set for the procedures is executed with the following
four conditions, respectively. Do we really need all of these tests?

track = top, track_utility = true
track = top, track_utility = false
track = all, track_utility = true
track = all, track_utility = false

Oh right, the track_utility = false cases have been added when we
decided up-thread to force track CALL.

But now that's probably not needed to test with track_utility = true. So
I'm just keeping track_utility = off with track = top or all in the new
V7 attached (like this is the case for the "function" section).

+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
The test set of 2PC commands is also executed with track_utility = on
and off, respectively. But why do we need to run that test when
track_utility = off?

That's useless, thanks for pointing out. Removed in V7 attached.

-    if (query->utilityStmt)
+    if (query->utilityStmt && !jstate)
     {
         if (pgss_track_utility &&
!PGSS_HANDLED_UTILITY(query->utilityStmt))

"pgss_track_utility" should be
"pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)" theoretically?

Good catch! That's not needed (in practice) with the current code but
that is "theoretically" needed indeed, let's add it in V7 attached
(that's safer should the code change later on).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v7-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v7-0001-JumbleQuery-on-Call-and-Set.patchDownload

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index ff0166fb9d..f5fc2f1f38 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -272,7 +272,7 @@ FROM pg_stat_statements ORDER BY query COLLATE "C";
  wal_records > $2 as wal_records_generated,               +|       |      |                     |                       | 
  wal_records >= rows as wal_records_ge_rows               +|       |      |                     |                       | 
  FROM pg_stat_statements ORDER BY query COLLATE "C"        |       |      |                     |                       | 
- SET pg_stat_statements.track_utility = FALSE              |     1 |    0 | f                   | f                     | t
+ SET pg_stat_statements.track_utility = $1                 |     1 |    0 | f                   | f                     | t
  UPDATE pgss_test SET b = $1 WHERE a > $2                  |     1 |    3 | t                   | t                     | t
 (7 rows)
 
@@ -423,6 +423,111 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (6 rows)
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+--
+-- pg_stat_statements.track = all
+--
+SET pg_stat_statements.track = 'all';
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - $2 - $3)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- SET
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+set enable_seqscan to default;
+set seq_page_cost to default;
+reset seq_page_cost;
+reset enable_seqscan;
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ reset enable_seqscan                                                         |     1 |    0
+ reset seq_page_cost                                                          |     1 |    0
+ set enable_seqscan to default                                                |     1 |    0
+ set enable_seqscan=$1                                                        |     2 |    0
+ set seq_page_cost to default                                                 |     1 |    0
+ set seq_page_cost=$1                                                         |     2 |    0
+(8 rows)
+
+SET pg_stat_statements.track_utility = FALSE;
 --
 -- queries with locking clauses
 --
@@ -553,11 +658,49 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (9 rows)
 
+-- 2PC
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+create table test_tx (a int);
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ begin                                                                        |     4 |    0
+ commit prepared $1                                                           |     2 |    0
+ create table test_tx (a int)                                                 |     1 |    0
+ insert into test_tx values ($1)                                              |     4 |    4
+ prepare transaction $1                                                       |     4 |    0
+ rollback prepared $1                                                         |     2 |    0
+(8 rows)
+
 --
 -- Track the total number of rows retrieved or affected by the utility
 -- commands of COPY, FETCH, CREATE TABLE AS, CREATE MATERIALIZED VIEW,
 -- REFRESH MATERIALIZED VIEW and SELECT INTO
 --
+SET pg_stat_statements.track_utility = TRUE;
 SELECT pg_stat_statements_reset();
  pg_stat_statements_reset 
 --------------------------
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index ba868f0de9..8929f0d310 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -106,6 +106,11 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
 									!IsA(n, PrepareStmt) && \
 									!IsA(n, DeallocateStmt))
+/*
+ * Force track those utility statements
+ * whatever the value of pgss_track_utility is.
+ */
+#define FORCE_TRACK_UTILITY(n)	(IsA(n, CallStmt))
 
 /*
  * Extension version number, for supporting older extension versions' objects
@@ -832,9 +837,10 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 	 * inherit from the underlying statement's one (except DEALLOCATE which is
 	 * entirely untracked).
 	 */
-	if (query->utilityStmt)
+	if (query->utilityStmt && !jstate)
 	{
-		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
+		if ((pgss_track_utility || FORCE_TRACK_UTILITY(query->utilityStmt)) &&
+			!PGSS_HANDLED_UTILITY(query->utilityStmt))
 			query->queryId = UINT64CONST(0);
 		return;
 	}
@@ -1097,7 +1103,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 * that user configured another extension to handle utility statements
 	 * only.
 	 */
-	if (pgss_enabled(exec_nested_level) && pgss_track_utility)
+	if (pgss_enabled(exec_nested_level) &&
+		(pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)))
 		pstmt->queryId = UINT64CONST(0);
 
 	/*
@@ -1114,7 +1121,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 *
 	 * Likewise, we don't track execution of DEALLOCATE.
 	 */
-	if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
+	if ((pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)) &&
+		pgss_enabled(exec_nested_level) &&
 		PGSS_HANDLED_UTILITY(parsetree))
 	{
 		instr_time	start;
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index a01f183727..8973138fa0 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -201,6 +201,81 @@ SELECT PLUS_ONE(1);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+--
+-- pg_stat_statements.track = all
+--
+SET pg_stat_statements.track = 'all';
+SELECT pg_stat_statements_reset();
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = TRUE;
+
+-- SET
+SELECT pg_stat_statements_reset();
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+set enable_seqscan to default;
+set seq_page_cost to default;
+reset seq_page_cost;
+reset enable_seqscan;
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = FALSE;
+
 --
 -- queries with locking clauses
 --
@@ -250,11 +325,38 @@ DROP FUNCTION PLUS_TWO(INTEGER);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- 2PC
+SELECT pg_stat_statements_reset();
+
+create table test_tx (a int);
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
 --
 -- Track the total number of rows retrieved or affected by the utility
 -- commands of COPY, FETCH, CREATE TABLE AS, CREATE MATERIALIZED VIEW,
 -- REFRESH MATERIALIZED VIEW and SELECT INTO
 --
+SET pg_stat_statements.track_utility = TRUE;
 SELECT pg_stat_statements_reset();
 
 CREATE TABLE pgss_ctas AS SELECT a, 'ctas' b FROM generate_series(1, 10) a;
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index ecf6cd6bf3..d18f3632e9 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -487,13 +487,15 @@
 
   <para>
    Plannable queries (that is, <command>SELECT</command>, <command>INSERT</command>,
-   <command>UPDATE</command>, and <command>DELETE</command>) are combined into a single
-   <structname>pg_stat_statements</structname> entry whenever they have identical query
-   structures according to an internal hash calculation.  Typically, two
-   queries will be considered the same for this purpose if they are
-   semantically equivalent except for the values of literal constants
-   appearing in the query.  Utility commands (that is, all other commands)
-   are compared strictly on the basis of their textual query strings, however.
+   <command>UPDATE</command>, and <command>DELETE</command>) as well as
+   <command>CALL</command>, <command>SET</command>, <command>PREPARE TRANSACTION</command>,
+   <command>COMMIT PREPARED</command> and <command>ROLLBACK PREPARED</command>
+   are combined into a single <structname>pg_stat_statements</structname> entry
+   whenever they have identical query structures according to an internal hash calculation.
+   Typically, two queries will be considered the same for this purpose if they are
+   semantically equivalent except for the values of literal constants appearing
+   in the command. All other commands are compared strictly on the basis
+   of their textual query strings, however.
   </para>
 
   <note>
@@ -781,9 +783,9 @@
     <listitem>
      <para>
       <varname>pg_stat_statements.track_utility</varname> controls whether
-      utility commands are tracked by the module.  Utility commands are
+      utility commands are tracked by the module. Tracked utility commands are
       all those other than <command>SELECT</command>, <command>INSERT</command>,
-      <command>UPDATE</command> and <command>DELETE</command>.
+      <command>UPDATE</command>, <command>DELETE</command> and <command>CALL</command>.
       The default value is <literal>on</literal>.
       Only superusers can change this setting.
      </para>
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0d8d292850..949194c7c6 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10890,6 +10890,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_PREPARE;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 			| COMMIT PREPARED Sconst
@@ -10898,6 +10899,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_COMMIT_PREPARED;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 			| ROLLBACK PREPARED Sconst
@@ -10906,6 +10908,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_ROLLBACK_PREPARED;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 		;
diff --git a/src/backend/utils/misc/queryjumble.c b/src/backend/utils/misc/queryjumble.c
index 6e75acda27..dfcef7a414 100644
--- a/src/backend/utils/misc/queryjumble.c
+++ b/src/backend/utils/misc/queryjumble.c
@@ -105,7 +105,7 @@ JumbleQuery(Query *query, const char *querytext)
 
 	Assert(IsQueryIdEnabled());
 
-	if (query->utilityStmt)
+	if (query->utilityStmt && !IsJumbleUtilityAllowed(query->utilityStmt))
 	{
 		query->queryId = compute_utility_query_id(querytext,
 												  query->stmt_location,
@@ -241,10 +241,11 @@ static void
 JumbleQueryInternal(JumbleState *jstate, Query *query)
 {
 	Assert(IsA(query, Query));
-	Assert(query->utilityStmt == NULL);
+	Assert(query->utilityStmt == NULL || IsJumbleUtilityAllowed(query->utilityStmt));
 
 	APP_JUMB(query->commandType);
 	/* resultRelation is usually predictable from commandType */
+	JumbleExpr(jstate, (Node *) query->utilityStmt);
 	JumbleExpr(jstate, (Node *) query->cteList);
 	JumbleRangeTable(jstate, query->rtable);
 	JumbleExpr(jstate, (Node *) query->jointree);
@@ -384,6 +385,47 @@ JumbleExpr(JumbleState *jstate, Node *node)
 				APP_JUMB(var->varlevelsup);
 			}
 			break;
+		case T_CallStmt:
+			{
+				CallStmt   *stmt = (CallStmt *) node;
+				FuncExpr   *expr = stmt->funcexpr;
+
+				APP_JUMB(expr->funcid);
+				JumbleExpr(jstate, (Node *) expr->args);
+			}
+			break;
+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				/* stmt->name is NULL for RESET ALL */
+				if (stmt->name)
+				{
+					APP_JUMB(stmt->kind);
+					APP_JUMB_STRING(stmt->name);
+					JumbleExpr(jstate, (Node *) stmt->args);
+				}
+			}
+			break;
+		case T_A_Const:
+			{
+				int			loc = ((const A_Const *) node)->location;
+
+				RecordConstLocation(jstate, loc);
+			}
+			break;
+		case T_TransactionStmt:
+			{
+				TransactionStmt *stmt = (TransactionStmt *) node;
+
+				Assert(stmt->kind == TRANS_STMT_PREPARE ||
+					   stmt->kind == TRANS_STMT_COMMIT_PREPARED ||
+					   stmt->kind == TRANS_STMT_ROLLBACK_PREPARED);
+
+				APP_JUMB(stmt->kind);
+				RecordConstLocation(jstate, stmt->gid_location);
+			}
+			break;
 		case T_Const:
 			{
 				Const	   *c = (Const *) node;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index aead2afd6e..f59457dcd6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3281,6 +3281,7 @@ typedef struct TransactionStmt
 	char	   *savepoint_name; /* for savepoint commands */
 	char	   *gid;			/* for two-phase-commit related commands */
 	bool		chain;			/* AND CHAIN option */
+	int			gid_location;	/* gid location */
 } TransactionStmt;
 
 /* ----------------------
diff --git a/src/include/utils/queryjumble.h b/src/include/utils/queryjumble.h
index 3c2d9beab2..dfbb060f55 100644
--- a/src/include/utils/queryjumble.h
+++ b/src/include/utils/queryjumble.h
@@ -85,4 +85,24 @@ IsQueryIdEnabled(void)
 	return query_id_enabled;
 }
 
+/*
+ * Jumble those utility statements
+ */
+static inline bool
+IsJumbleUtilityAllowed(Node *n)
+{
+	if (IsA(n, CallStmt) || IsA(n, VariableSetStmt))
+		return true;
+	else if (IsA(n, TransactionStmt))
+	{
+		TransactionStmt *stmt = (TransactionStmt *) n;
+
+		return (stmt->kind == TRANS_STMT_PREPARE ||
+				stmt->kind == TRANS_STMT_COMMIT_PREPARED ||
+				stmt->kind == TRANS_STMT_ROLLBACK_PREPARED);
+	}
+	else
+		return false;
+}
+
 #endif							/* QUERYJUMBLE_H */

#32

Michael Paquier

michael@paquier.xyz

over 3 years ago

In reply to: Drouvot, Bertrand (#29)

Re: Query Jumbling for CALL and SET utility statements

On Mon, Sep 19, 2022 at 08:29:22AM +0200, Drouvot, Bertrand wrote:

Please find attached v6 taking care of the remarks mentioned above.
+        case T_VariableSetStmt:
+            {
+                VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+                /* stmt->name is NULL for RESET ALL */
+                if (stmt->name)
+                {
+                    APP_JUMB(stmt->kind);
+                    APP_JUMB_STRING(stmt->name);
+                    JumbleExpr(jstate, (Node *) stmt->args);
+                }
+            }
+            break;

Hmm. If VariableSetStmt->is_local is not added to the jumble, then
aren't "SET foo = $1" and "SET LOCAL foo = $1" counted as the same
query?

I am not seeing SAVEPOINT, RELEASE, ROLLBACK .. TO SAVEPOINT
mentioned on this thread. Would these be worth considering in what
gets compiled? That would cover the remaining bits of
TransactionStmt. The ODBC driver abuses of savepoints, for example,
so this could be useful for monitoring purposes in such cases.

As of the code stands, it could be cleaner to check
IsJumbleUtilityAllowed() in compute_utility_query_id(), falling back
to a default in JumbleQuery(). Not that what your patch does is
incorrect, of course.
--
Michael

#33

Drouvot, Bertrand

bertranddrouvot.pg@gmail.com

over 3 years ago

In reply to: Drouvot, Bertrand (#31)

1 attachment(s)

Re: Query Jumbling for CALL and SET utility statements

Hi,

On 9/26/22 12:40 PM, Drouvot, Bertrand wrote:

let's add it in V7 attached
(that's safer should the code change later on).

Attached a tiny rebase needed due to 249b0409b1.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v8-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v8-0001-JumbleQuery-on-Call-and-Set.patchDownload

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 9ac5c87c3a..5d3330a87d 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -311,7 +311,7 @@ FROM pg_stat_statements ORDER BY query COLLATE "C";
  wal_records > $2 as wal_records_generated,               +|       |      |                     |                       | 
  wal_records >= rows as wal_records_ge_rows               +|       |      |                     |                       | 
  FROM pg_stat_statements ORDER BY query COLLATE "C"        |       |      |                     |                       | 
- SET pg_stat_statements.track_utility = FALSE              |     1 |    0 | f                   | f                     | t
+ SET pg_stat_statements.track_utility = $1                 |     1 |    0 | f                   | f                     | t
  UPDATE pgss_test SET b = $1 WHERE a > $2                  |     1 |    3 | t                   | t                     | t
 (7 rows)
 
@@ -462,6 +462,111 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (6 rows)
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(4 rows)
+
+--
+-- pg_stat_statements.track = all
+--
+SET pg_stat_statements.track = 'all';
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ CALL MINUS_TWO($1)                                                           |     2 |    0
+ CALL SUM_TWO($1, $2)                                                         |     2 |    0
+ SELECT (i - $2 - $3)::INTEGER                                                |     2 |    2
+ SELECT (j + j)::INTEGER                                                      |     2 |    2
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+(6 rows)
+
+SET pg_stat_statements.track_utility = TRUE;
+-- SET
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+set enable_seqscan to default;
+set seq_page_cost to default;
+reset seq_page_cost;
+reset enable_seqscan;
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ reset enable_seqscan                                                         |     1 |    0
+ reset seq_page_cost                                                          |     1 |    0
+ set enable_seqscan to default                                                |     1 |    0
+ set enable_seqscan=$1                                                        |     2 |    0
+ set seq_page_cost to default                                                 |     1 |    0
+ set seq_page_cost=$1                                                         |     2 |    0
+(8 rows)
+
+SET pg_stat_statements.track_utility = FALSE;
 --
 -- queries with locking clauses
 --
@@ -592,11 +697,49 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (9 rows)
 
+-- 2PC
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+create table test_tx (a int);
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+                                    query                                     | calls | rows 
+------------------------------------------------------------------------------+-------+------
+ SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
+ begin                                                                        |     4 |    0
+ commit prepared $1                                                           |     2 |    0
+ create table test_tx (a int)                                                 |     1 |    0
+ insert into test_tx values ($1)                                              |     4 |    4
+ prepare transaction $1                                                       |     4 |    0
+ rollback prepared $1                                                         |     2 |    0
+(8 rows)
+
 --
 -- Track the total number of rows retrieved or affected by the utility
 -- commands of COPY, FETCH, CREATE TABLE AS, CREATE MATERIALIZED VIEW,
 -- REFRESH MATERIALIZED VIEW and SELECT INTO
 --
+SET pg_stat_statements.track_utility = TRUE;
 SELECT pg_stat_statements_reset();
  pg_stat_statements_reset 
 --------------------------
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 73439c0199..4aefbe70f0 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -106,6 +106,11 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
 									!IsA(n, PrepareStmt) && \
 									!IsA(n, DeallocateStmt))
+/*
+ * Force track those utility statements
+ * whatever the value of pgss_track_utility is.
+ */
+#define FORCE_TRACK_UTILITY(n)	(IsA(n, CallStmt))
 
 /*
  * Extension version number, for supporting older extension versions' objects
@@ -832,9 +837,10 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 	 * inherit from the underlying statement's one (except DEALLOCATE which is
 	 * entirely untracked).
 	 */
-	if (query->utilityStmt)
+	if (query->utilityStmt && !jstate)
 	{
-		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
+		if ((pgss_track_utility || FORCE_TRACK_UTILITY(query->utilityStmt)) &&
+			!PGSS_HANDLED_UTILITY(query->utilityStmt))
 			query->queryId = UINT64CONST(0);
 		return;
 	}
@@ -1097,7 +1103,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 * that user configured another extension to handle utility statements
 	 * only.
 	 */
-	if (pgss_enabled(exec_nested_level) && pgss_track_utility)
+	if (pgss_enabled(exec_nested_level) &&
+		(pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)))
 		pstmt->queryId = UINT64CONST(0);
 
 	/*
@@ -1114,7 +1121,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 *
 	 * Likewise, we don't track execution of DEALLOCATE.
 	 */
-	if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
+	if ((pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)) &&
+		pgss_enabled(exec_nested_level) &&
 		PGSS_HANDLED_UTILITY(parsetree))
 	{
 		instr_time	start;
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index 8f5c866225..34b0d61815 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -223,6 +223,81 @@ SELECT PLUS_ONE(1);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+--
+-- pg_stat_statements.track = all
+--
+SET pg_stat_statements.track = 'all';
+SELECT pg_stat_statements_reset();
+
+-- PL/pgSQL procedure and pg_stat_statements.track = all
+-- we drop and recreate the procedures to avoid any caching funnies
+
+DROP PROCEDURE MINUS_TWO(INTEGER);
+DROP PROCEDURE SUM_TWO(INTEGER, INTEGER);
+
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = TRUE;
+
+-- SET
+SELECT pg_stat_statements_reset();
+set enable_seqscan=false;
+set enable_seqscan=true;
+set seq_page_cost=2.0;
+set seq_page_cost=1.0;
+set enable_seqscan to default;
+set seq_page_cost to default;
+reset seq_page_cost;
+reset enable_seqscan;
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+SET pg_stat_statements.track_utility = FALSE;
+
 --
 -- queries with locking clauses
 --
@@ -272,11 +347,38 @@ DROP FUNCTION PLUS_TWO(INTEGER);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+-- 2PC
+SELECT pg_stat_statements_reset();
+
+create table test_tx (a int);
+begin;
+prepare transaction 'tx1';
+insert into test_tx values (1);
+commit prepared 'tx1';
+
+begin;
+prepare transaction 'tx2';
+insert into test_tx values (2);
+commit prepared 'tx2';
+
+begin;
+prepare transaction 'tx3';
+insert into test_tx values (3);
+rollback prepared 'tx3';
+
+begin;
+prepare transaction 'tx4';
+insert into test_tx values (4);
+rollback prepared 'tx4';
+
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+
 --
 -- Track the total number of rows retrieved or affected by the utility
 -- commands of COPY, FETCH, CREATE TABLE AS, CREATE MATERIALIZED VIEW,
 -- REFRESH MATERIALIZED VIEW and SELECT INTO
 --
+SET pg_stat_statements.track_utility = TRUE;
 SELECT pg_stat_statements_reset();
 
 CREATE TABLE pgss_ctas AS SELECT a, 'ctas' b FROM generate_series(1, 10) a;
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index ea90365c7f..db91bb20bd 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -487,13 +487,15 @@
 
   <para>
    Plannable queries (that is, <command>SELECT</command>, <command>INSERT</command>,
-   <command>UPDATE</command>, <command>DELETE</command>, and <command>MERGE</command>) are combined into a single
-   <structname>pg_stat_statements</structname> entry whenever they have identical query
-   structures according to an internal hash calculation.  Typically, two
-   queries will be considered the same for this purpose if they are
-   semantically equivalent except for the values of literal constants
-   appearing in the query.  Utility commands (that is, all other commands)
-   are compared strictly on the basis of their textual query strings, however.
+   <command>UPDATE</command>, <command>DELETE</command>, and <command>MERGE</command>)
+   as well as <command>CALL</command>, <command>SET</command>, <command>PREPARE TRANSACTION</command>,
+   <command>COMMIT PREPARED</command> and <command>ROLLBACK PREPARED</command>
+   are combined into a single <structname>pg_stat_statements</structname> entry
+   whenever they have identical query structures according to an internal hash calculation.
+   Typically, two queries will be considered the same for this purpose if they are
+   semantically equivalent except for the values of literal constants appearing
+   in the command. All other commands are compared strictly on the basis
+   of their textual query strings, however.
   </para>
 
   <note>
@@ -781,10 +783,10 @@
     <listitem>
      <para>
       <varname>pg_stat_statements.track_utility</varname> controls whether
-      utility commands are tracked by the module.  Utility commands are
+      utility commands are tracked by the module. Tracked utility commands are
       all those other than <command>SELECT</command>, <command>INSERT</command>,
-      <command>UPDATE</command>, <command>DELETE</command>, and <command>MERGE</command>.
-      The default value is <literal>on</literal>.
+      <command>UPDATE</command>, <command>DELETE</command>, <command>MERGE</command>,
+      and <command>CALL</command>. The default value is <literal>on</literal>.
       Only superusers can change this setting.
      </para>
     </listitem>
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 94d5142a4a..7203a9411a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10890,6 +10890,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_PREPARE;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 			| COMMIT PREPARED Sconst
@@ -10898,6 +10899,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_COMMIT_PREPARED;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 			| ROLLBACK PREPARED Sconst
@@ -10906,6 +10908,7 @@ TransactionStmt:
 
 					n->kind = TRANS_STMT_ROLLBACK_PREPARED;
 					n->gid = $3;
+					n->gid_location = @3;
 					$$ = (Node *) n;
 				}
 		;
diff --git a/src/backend/utils/misc/queryjumble.c b/src/backend/utils/misc/queryjumble.c
index a8508463e7..02f5178b6b 100644
--- a/src/backend/utils/misc/queryjumble.c
+++ b/src/backend/utils/misc/queryjumble.c
@@ -105,7 +105,7 @@ JumbleQuery(Query *query, const char *querytext)
 
 	Assert(IsQueryIdEnabled());
 
-	if (query->utilityStmt)
+	if (query->utilityStmt && !IsJumbleUtilityAllowed(query->utilityStmt))
 	{
 		query->queryId = compute_utility_query_id(querytext,
 												  query->stmt_location,
@@ -241,10 +241,11 @@ static void
 JumbleQueryInternal(JumbleState *jstate, Query *query)
 {
 	Assert(IsA(query, Query));
-	Assert(query->utilityStmt == NULL);
+	Assert(query->utilityStmt == NULL || IsJumbleUtilityAllowed(query->utilityStmt));
 
 	APP_JUMB(query->commandType);
 	/* resultRelation is usually predictable from commandType */
+	JumbleExpr(jstate, (Node *) query->utilityStmt);
 	JumbleExpr(jstate, (Node *) query->cteList);
 	JumbleRangeTable(jstate, query->rtable);
 	JumbleExpr(jstate, (Node *) query->jointree);
@@ -385,6 +386,47 @@ JumbleExpr(JumbleState *jstate, Node *node)
 				APP_JUMB(var->varlevelsup);
 			}
 			break;
+		case T_CallStmt:
+			{
+				CallStmt   *stmt = (CallStmt *) node;
+				FuncExpr   *expr = stmt->funcexpr;
+
+				APP_JUMB(expr->funcid);
+				JumbleExpr(jstate, (Node *) expr->args);
+			}
+			break;
+		case T_VariableSetStmt:
+			{
+				VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+				/* stmt->name is NULL for RESET ALL */
+				if (stmt->name)
+				{
+					APP_JUMB(stmt->kind);
+					APP_JUMB_STRING(stmt->name);
+					JumbleExpr(jstate, (Node *) stmt->args);
+				}
+			}
+			break;
+		case T_A_Const:
+			{
+				int			loc = ((const A_Const *) node)->location;
+
+				RecordConstLocation(jstate, loc);
+			}
+			break;
+		case T_TransactionStmt:
+			{
+				TransactionStmt *stmt = (TransactionStmt *) node;
+
+				Assert(stmt->kind == TRANS_STMT_PREPARE ||
+					   stmt->kind == TRANS_STMT_COMMIT_PREPARED ||
+					   stmt->kind == TRANS_STMT_ROLLBACK_PREPARED);
+
+				APP_JUMB(stmt->kind);
+				RecordConstLocation(jstate, stmt->gid_location);
+			}
+			break;
 		case T_Const:
 			{
 				Const	   *c = (Const *) node;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 633e7671b3..d62fac9d86 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3282,6 +3282,7 @@ typedef struct TransactionStmt
 	char	   *savepoint_name; /* for savepoint commands */
 	char	   *gid;			/* for two-phase-commit related commands */
 	bool		chain;			/* AND CHAIN option */
+	int			gid_location;	/* gid location */
 } TransactionStmt;
 
 /* ----------------------
diff --git a/src/include/utils/queryjumble.h b/src/include/utils/queryjumble.h
index f799fb19e4..bb15c7e1d8 100644
--- a/src/include/utils/queryjumble.h
+++ b/src/include/utils/queryjumble.h
@@ -83,4 +83,24 @@ IsQueryIdEnabled(void)
 	return query_id_enabled;
 }
 
+/*
+ * Jumble those utility statements
+ */
+static inline bool
+IsJumbleUtilityAllowed(Node *n)
+{
+	if (IsA(n, CallStmt) || IsA(n, VariableSetStmt))
+		return true;
+	else if (IsA(n, TransactionStmt))
+	{
+		TransactionStmt *stmt = (TransactionStmt *) n;
+
+		return (stmt->kind == TRANS_STMT_PREPARE ||
+				stmt->kind == TRANS_STMT_COMMIT_PREPARED ||
+				stmt->kind == TRANS_STMT_ROLLBACK_PREPARED);
+	}
+
+	return false;
+}
+
 #endif							/* QUERYJUMBLE_H */

#34

Drouvot, Bertrand

bertranddrouvot.pg@gmail.com

over 3 years ago

In reply to: Michael Paquier (#32)

Re: Query Jumbling for CALL and SET utility statements

Hi,

On 10/6/22 8:39 AM, Michael Paquier wrote:

On Mon, Sep 19, 2022 at 08:29:22AM +0200, Drouvot, Bertrand wrote:

Please find attached v6 taking care of the remarks mentioned above.
+        case T_VariableSetStmt:
+            {
+                VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+                /* stmt->name is NULL for RESET ALL */
+                if (stmt->name)
+                {
+                    APP_JUMB(stmt->kind);
+                    APP_JUMB_STRING(stmt->name);
+                    JumbleExpr(jstate, (Node *) stmt->args);
+                }
+            }
+            break;

Hmm. If VariableSetStmt->is_local is not added to the jumble, then
aren't "SET foo = $1" and "SET LOCAL foo = $1" counted as the same
query?

Good catch, thanks!
While at it let's also jumble "SET SESSION foo =". For this one, we
would need to record another bool in VariableSetStmt: I'll create a
dedicated patch for that.

I am not seeing SAVEPOINT, RELEASE, ROLLBACK .. TO SAVEPOINT
mentioned on this thread. Would these be worth considering in what
gets compiled? That would cover the remaining bits of
TransactionStmt. The ODBC driver abuses of savepoints, for example,
so this could be useful for monitoring purposes in such cases.

Agree. I'll look at those too.

As of the code stands, it could be cleaner to check
IsJumbleUtilityAllowed() in compute_utility_query_id(), falling back
to a default in JumbleQuery(). Not that what your patch does is
incorrect, of course.

Will look at it.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#35

Michael Paquier

michael@paquier.xyz

over 3 years ago

In reply to: Drouvot, Bertrand (#34)

Re: Query Jumbling for CALL and SET utility statements

On Thu, Oct 06, 2022 at 10:43:57AM +0200, Drouvot, Bertrand wrote:

On 10/6/22 8:39 AM, Michael Paquier wrote:

I am not seeing SAVEPOINT, RELEASE, ROLLBACK .. TO SAVEPOINT
mentioned on this thread. Would these be worth considering in what
gets compiled? That would cover the remaining bits of
TransactionStmt. The ODBC driver abuses of savepoints, for example,
so this could be useful for monitoring purposes in such cases.

Agree. I'll look at those too.

Thanks.

While studying a bit more this thread, I've been reminded of the fact
that this would treat different flavors of BEGIN/COMMIT commands (mix
of upper/lower characters, etc.) as different entries in
pg_stat_statements, and it feels inconsistent to me that we'd begin
jumbling the 2PC and savepoint commands with their nodes but not do
that for the rest of the commands, even if, as mentioned upthread,
applications may not mix grammars. If they do, one could finish by
viewing incorrect reports, and I'd like to think that this would make
the life of a lot of people easier.

SET/RESET and CALL have a much lower presence frequency than the
transaction commands, where it is fine by me to include both of these
under the utility statement switch. For OLTP workloads (I've seen
quite a bit of 2PC used across multiple nodes for short transactions
with writes involving more than two remote nodes), with a lot of
BEGIN/COMMIT or even 2PC commands issued, the performance could be
noticeable? It may make sense to control these with a different GUC
switch, where we drop completely the string-only approach under
track_utility. In short, I don't have any objections about the
business with SET and CALL, but the transaction part worries me a
bit. As a first step, we could cut the cake in two parts, and just
focus on SET/RESET and CALL, which was the main point of discussion
of this thread to begin with.
--
Michael

#36

Tom Lane

tgl@sss.pgh.pa.us

over 3 years ago

In reply to: Michael Paquier (#35)

Re: Query Jumbling for CALL and SET utility statements

Michael Paquier <michael@paquier.xyz> writes:

While studying a bit more this thread, I've been reminded of the fact
that this would treat different flavors of BEGIN/COMMIT commands (mix
of upper/lower characters, etc.) as different entries in
pg_stat_statements, and it feels inconsistent to me that we'd begin
jumbling the 2PC and savepoint commands with their nodes but not do
that for the rest of the commands, even if, as mentioned upthread,
applications may not mix grammars.

I've been thinking since the beginning of this thread that there
was no coherent, defensible rationale being offered for jumbling
some utility statements and not others.

I wonder if the answer is to jumble them all. We avoided that
up to now because it would imply a ton of manual effort and
future code maintenance ... but now that the backend/nodes/
infrastructure is largely auto-generated, could we auto-generate
the jumbling code?

regards, tom lane

#37

Michael Paquier

michael@paquier.xyz

over 3 years ago

In reply to: Tom Lane (#36)

Re: Query Jumbling for CALL and SET utility statements

On Thu, Oct 06, 2022 at 11:51:52PM -0400, Tom Lane wrote:

I've been thinking since the beginning of this thread that there
was no coherent, defensible rationale being offered for jumbling
some utility statements and not others.

Yeah. The potential performance impact of all the TransactionStmts
worries me a bit, though.

I wonder if the answer is to jumble them all. We avoided that
up to now because it would imply a ton of manual effort and
future code maintenance ... but now that the backend/nodes/
infrastructure is largely auto-generated, could we auto-generate
the jumbling code?

Probably. One part that may be tricky though is the location of the
constants we'd like to make generic, but perhaps this could be handled
by using a dedicated variable type that just maps to int? It does not
seem like a mandatory requirement to add that everywhere as a first
step, either.
--
Michael

#38

Julien Rouhaud

rjuju123@gmail.com

over 3 years ago

In reply to: Tom Lane (#36)

Re: Query Jumbling for CALL and SET utility statements

On Thu, Oct 06, 2022 at 11:51:52PM -0400, Tom Lane wrote:

Michael Paquier <michael@paquier.xyz> writes:

While studying a bit more this thread, I've been reminded of the fact
that this would treat different flavors of BEGIN/COMMIT commands (mix
of upper/lower characters, etc.) as different entries in
pg_stat_statements, and it feels inconsistent to me that we'd begin
jumbling the 2PC and savepoint commands with their nodes but not do
that for the rest of the commands, even if, as mentioned upthread,
applications may not mix grammars.

I've been thinking since the beginning of this thread that there
was no coherent, defensible rationale being offered for jumbling
some utility statements and not others.

Only a very small subset causes trouble in real life scenario, but I agree that
cherry-picking some utility statements isn't a great approach.

I wonder if the answer is to jumble them all. We avoided that
up to now because it would imply a ton of manual effort and
future code maintenance ... but now that the backend/nodes/
infrastructure is largely auto-generated, could we auto-generate
the jumbling code?

That's a good idea. Naively, it seems doable as the infrastructure in
gen_node_support.pl already supports everything that should be needed (like
per-member annotation).

#39

Drouvot, Bertrand

bertranddrouvot.pg@gmail.com

over 3 years ago

In reply to: Michael Paquier (#37)

Re: Query Jumbling for CALL and SET utility statements

Hi,

On 10/7/22 6:13 AM, Michael Paquier wrote:

On Thu, Oct 06, 2022 at 11:51:52PM -0400, Tom Lane wrote:

I've been thinking since the beginning of this thread that there
was no coherent, defensible rationale being offered for jumbling
some utility statements and not others.

Yeah. The potential performance impact of all the TransactionStmts
worries me a bit, though.

I wonder if the answer is to jumble them all. We avoided that
up to now because it would imply a ton of manual effort and
future code maintenance ... but now that the backend/nodes/
infrastructure is largely auto-generated, could we auto-generate
the jumbling code?

I think that's a good idea.

Probably. One part that may be tricky though is the location of the
constants we'd like to make generic, but perhaps this could be handled
by using a dedicated variable type that just maps to int?

It looks to me that we'd also need to teach the perl parser which fields
per statements struct need to be jumbled (or more probably which ones to
exclude from the jumbling, for example funccall in CallStmt). Not sure
yet how to teach the perl parser, but I'll build this list first to help
decide for a right approach, unless you already have some thoughts about it?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#40

Julien Rouhaud

rjuju123@gmail.com

over 3 years ago

In reply to: Drouvot, Bertrand (#39)

Re: Query Jumbling for CALL and SET utility statements

Hi,

On Mon, Oct 10, 2022 at 03:04:57PM +0200, Drouvot, Bertrand wrote:

On 10/7/22 6:13 AM, Michael Paquier wrote:

Probably. One part that may be tricky though is the location of the
constants we'd like to make generic, but perhaps this could be handled
by using a dedicated variable type that just maps to int?

It looks to me that we'd also need to teach the perl parser which fields per
statements struct need to be jumbled (or more probably which ones to exclude
from the jumbling, for example funccall in CallStmt). Not sure yet how to
teach the perl parser, but I'll build this list first to help decide for a
right approach, unless you already have some thoughts about it?

Unless I'm missing something both can be handled using pg_node_attr()
annotations that already exists?

#41

Michael Paquier

michael@paquier.xyz

over 3 years ago

In reply to: Julien Rouhaud (#40)

Re: Query Jumbling for CALL and SET utility statements

On Mon, Oct 10, 2022 at 09:16:47PM +0800, Julien Rouhaud wrote:

Unless I'm missing something both can be handled using pg_node_attr()
annotations that already exists?

Indeed, that should work for the locators.
--
Michael

#42

Imseih (AWS), Sami

simseih@amazon.com

over 3 years ago

In reply to: Drouvot, Bertrand (#39)

Re: Query Jumbling for CALL and SET utility statements

I've been thinking since the beginning of this thread that there
was no coherent, defensible rationale being offered for jumbling
some utility statements and not others.

+1 to the idea, as there are other utility statement cases
that should be Jumbled. Here is a recent thread for jumbling
cursors [1]/messages/by-id/203CFCF7-176E-4AFC-A48E-B2CECFECD6AA@amazon.com.

The CF entry [2]https://commitfest.postgresql.org/40/3901/ has been withdrawn until consensus is reached
on this topic.

[1]: /messages/by-id/203CFCF7-176E-4AFC-A48E-B2CECFECD6AA@amazon.com
[2]: https://commitfest.postgresql.org/40/3901/

Regards,

Sami Imseih
AWS (Amazon Web Services)

#43

Michael Paquier

michael@paquier.xyz

over 3 years ago

In reply to: Imseih (AWS), Sami (#42)

Re: Query Jumbling for CALL and SET utility statements

On Tue, Oct 11, 2022 at 02:18:54PM +0000, Imseih (AWS), Sami wrote:

+1 to the idea, as there are other utility statement cases
that should be Jumbled. Here is a recent thread for jumbling
cursors [1].

Thanks for mentioning that. With an automated way to generate this
code, cursors would be handled, at the advantage of making sure that
no fields are missing in the jumbled structures (is_local was missed
for example on SET).

The CF entry [2] has been withdrawn until consensus is reached
on this topic.

It seems to me that the consensus is here, which is a good step
forward ;)
--
Michael

#44

Michael Paquier

michael@paquier.xyz

about 3 years ago

In reply to: Michael Paquier (#43)

Re: Query Jumbling for CALL and SET utility statements

On Wed, Oct 12, 2022 at 09:13:20AM +0900, Michael Paquier wrote:

Thanks for mentioning that. With an automated way to generate this
code, cursors would be handled, at the advantage of making sure that
no fields are missing in the jumbled structures (is_local was missed
for example on SET).

So, this thread has stalled for the last few weeks and it seems to me
that the conclusion is that we'd want an approach using a set of
scripts that automate the generation of the code in charge of the DDL
jumbling. And, depending on the portions of the queries that need to
be silenced, we may need to extend a few nodes with a location. We
are not there yet, so I have marked the patch as returned with
feedback.
--
Michael

#45

Michael Paquier

michael@paquier.xyz

about 3 years ago

In reply to: Michael Paquier (#41)

Re: Query Jumbling for CALL and SET utility statements

On Tue, Oct 11, 2022 at 11:54:51AM +0900, Michael Paquier wrote:

On Mon, Oct 10, 2022 at 09:16:47PM +0800, Julien Rouhaud wrote:

Unless I'm missing something both can be handled using pg_node_attr()
annotations that already exists?

Indeed, that should work for the locators.

My mistake here.

When it comes to the locations to silence in the normalization, we
already rely on the variable name "location" of type int in
gen_node_support.pl, so we can just do the same for the code
generating the jumbling.

This stuff still needs two more pg_node_attr() to be able to work: one
to ignore a full Node in the jumbling and a second that can be used on
a per-field basis. Once this is in place, automating the generation
of the code is not that complicated, most of the work is to get around
placing the pg_node_attr() so as the jumbling gets things right. The
number of fields to mark as things to ignore depends on the Node type
(heavy for Const, for example), but I'd like to think that a "ignore"
approach is better than an "include" approach so as new fields would
be considered by default in the jumble compilation.

I have not looked at all the edge cases, so perhaps more attrs would
be needed..
--
Michael