support fix query_id for temp table
session 1:
create temp table ttt ( a int );
insert into ttt values(3); -- query_id is XXX from pg_stat_activity
session 2:
create temp table ttt ( a int );
insert into ttt values(3); -- query_id is YYY from pg_stat_activity
I know temp table has different oid, so query_id is different, is there a way to use table name for temp table instead of oid?
On Thu, Feb 01, 2024 at 07:37:32AM +0000, ma lz wrote:
session 1:
create temp table ttt ( a int );
insert into ttt values(3); -- query_id is XXX from pg_stat_activitysession 2:
create temp table ttt ( a int );
insert into ttt values(3); -- query_id is YYY from pg_stat_activityI know temp table has different oid, so query_id is different, is
there a way to use table name for temp table instead of oid?
The CREATE TABLE statements have indeed the same query ID (in 16~),
and the inserts have a different one as they touch different schemas
and relations. That's quite an old problem, that depends on the
RangeVar attached to an InsertStmt. I don't quite see a way to
directly handle that except by using a custom implementation in query
jumbling for this node and its RangeVar, so there is no "easy" way to
tackle that :/
--
Michael
Re: Michael Paquier
On Thu, Feb 01, 2024 at 07:37:32AM +0000, ma lz wrote:
session 1:
create temp table ttt ( a int );
insert into ttt values(3); -- query_id is XXX from pg_stat_activitysession 2:
create temp table ttt ( a int );
insert into ttt values(3); -- query_id is YYY from pg_stat_activityI know temp table has different oid, so query_id is different, is
there a way to use table name for temp table instead of oid?The CREATE TABLE statements have indeed the same query ID (in 16~),
and the inserts have a different one as they touch different schemas
and relations. That's quite an old problem, that depends on the
RangeVar attached to an InsertStmt. I don't quite see a way to
directly handle that except by using a custom implementation in query
jumbling for this node and its RangeVar, so there is no "easy" way to
tackle that :/
A customer reported that pg_stat_statements is not useful for them
because they are seeing 160k different query ids in 6-8 hours. They
also proposed to use the temp table name for query jumbling and wrote
a patch for it, which I would also see as the obvious solution to the
problem.
Here's that patch with regression tests added. I would think changing
this would be a big usability improvement for anyone using temp tables
a lot.
There does not seem to be a performance impact - all test were run
with pg_stat_statements active:
Standard pgbench -S (-s 10):
without patch: tps = 154155.407337 (without initial connection time)
with patch: tps = 154223.966534 (without initial connection time)
pgbench -S on temp tables where each table has just one record:
without patch: tps = 184430.801954 (without initial connection time)
with patch: tps = 185692.602764 (without initial connection time)
Christoph
Attachments:
v1-0001-Jumble-temp-tables-by-name.patchtext/x-diff; charset=us-asciiDownload
From c50dbb614f5e7696cb687aa156eb4149dcdb231d Mon Sep 17 00:00:00 2001
From: Christoph Berg <myon@debian.org>
Date: Mon, 17 Mar 2025 17:17:17 +0100
Subject: [PATCH v1] Jumble temp tables by name
Query jumbling considers everything by oid, which is fine for regular
objects. But for temp tables, which have to be recreated in each session
(or even transaction), this means that the same temp table query run
from the next session will never be aggregated in pg_stat_statements.
Instead, the statistics are polluted with a large number of 1-call
entries.
Fix by using the temp table name instead. This has the risk of
aggregating structurally different temp tables together if they same the
same name, but practically, the queries will likely differ in other
details anyway. And even if not, aggregating the entries in
pg_stat_statements instead of polluting the stats seems the better
choice. (The user has still the option to simply change the name of the
temp table to have the queries separated. In the old scheme, the user
does not have any chance to change behavior.)
---
.../pg_stat_statements/expected/select.out | 31 ++++++++++++++++
contrib/pg_stat_statements/sql/select.sql | 12 +++++++
src/backend/nodes/queryjumblefuncs.c | 35 +++++++++++++++++++
src/include/nodes/parsenodes.h | 2 +-
4 files changed, 79 insertions(+), 1 deletion(-)
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 37a30af034a..8a7e237298c 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -241,6 +241,37 @@ SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
(12 rows)
DROP TABLE pgss_a, pgss_b CASCADE;
+-- temp tables
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -- one query with two calls
+ calls | query
+-------+------------------------------------------------------------------------
+ 2 | SELECT * FROM temp_t
+ 0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+ 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(3 rows)
+
--
-- access to pg_stat_statements_info view
--
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index e0be58d5e24..81b9d50ecec 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -90,6 +90,18 @@ SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
DROP TABLE pgss_a, pgss_b CASCADE;
+-- temp tables
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -- one query with two calls
+
--
-- access to pg_stat_statements_info view
--
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index b103a281936..bbddc92e28c 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -32,10 +32,12 @@
*/
#include "postgres.h"
+#include "catalog/namespace.h"
#include "common/hashfn.h"
#include "miscadmin.h"
#include "nodes/queryjumble.h"
#include "parser/scansup.h"
+#include "utils/lsyscache.h"
#define JUMBLE_SIZE 1024 /* query serialization buffer size */
@@ -58,6 +60,7 @@ static void _jumbleNode(JumbleState *jstate, Node *node);
static void _jumbleA_Const(JumbleState *jstate, Node *node);
static void _jumbleList(JumbleState *jstate, Node *node);
static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry(JumbleState *jstate, Node *node);
/*
* Given a possibly multi-statement source string, confine our attention to the
@@ -377,3 +380,35 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
JUMBLE_FIELD(is_local);
JUMBLE_LOCATION(location);
}
+
+static void
+_jumbleRangeTblEntry(JumbleState *jstate, Node *node)
+{
+ char *rel_name = NULL;
+ RangeTblEntry *expr = (RangeTblEntry *) node;
+
+ JUMBLE_FIELD(rtekind);
+
+ /*
+ * If this is a temp table, jumble the name instead of the table oid.
+ */
+ if (expr->rtekind == RTE_RELATION && isAnyTempNamespace(get_rel_namespace(expr->relid)))
+ {
+ rel_name = get_rel_name(expr->relid);
+ AppendJumble(jstate, (const unsigned char *)rel_name, strlen(rel_name));
+ }
+ else
+ JUMBLE_FIELD(relid);
+
+ JUMBLE_FIELD(inh);
+ JUMBLE_NODE(tablesample);
+ JUMBLE_NODE(subquery);
+ JUMBLE_FIELD(jointype);
+ JUMBLE_NODE(functions);
+ JUMBLE_FIELD(funcordinality);
+ JUMBLE_NODE(tablefunc);
+ JUMBLE_NODE(values_lists);
+ JUMBLE_STRING(ctename);
+ JUMBLE_FIELD(ctelevelsup);
+ JUMBLE_STRING(enrname);
+}
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf..deec95c7a26 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1039,7 +1039,7 @@ typedef enum RTEKind
typedef struct RangeTblEntry
{
- pg_node_attr(custom_read_write)
+ pg_node_attr(custom_read_write, custom_query_jumble)
NodeTag type;
--
2.47.2
On Mon, Mar 17, 2025 at 10:38:36PM +0100, Christoph Berg wrote:
Here's that patch with regression tests added. I would think changing
this would be a big usability improvement for anyone using temp tables
a lot.
Not the first time I am seeing this argument this month. It is the
second time.
+ /*
+ * If this is a temp table, jumble the name instead of the table oid.
+ */
+ if (expr->rtekind == RTE_RELATION && isAnyTempNamespace(get_rel_namespace(expr->relid)))
+ {
+ rel_name = get_rel_name(expr->relid);
+ AppendJumble(jstate, (const unsigned char *)rel_name, strlen(rel_name));
+ }
+ else
+ JUMBLE_FIELD(relid);
This is OK on its own, still feels a bit incomplete, as the relid also
includes an assumption about the namespace. I would suggested to add
a hardcoded "pg_temp" here, to keep track of this assumption, at
least.
typedef struct RangeTblEntry
{
- pg_node_attr(custom_read_write)
+ pg_node_attr(custom_read_write, custom_query_jumble)
This structure still includes some query_jumble_ignore, which are not
required once custom_query_jumble is added.
We had better document at the top of RangeTblEntry why we are using a
custom function.
--
Michael
Re: Michael Paquier
This is OK on its own, still feels a bit incomplete, as the relid also
includes an assumption about the namespace. I would suggested to add
a hardcoded "pg_temp" here, to keep track of this assumption, at
least.
I had thought about it, but figured that integers and strings are
already separate namespaces, so hashing them shouldn't have any
conflicts. But it's more clear to do that, so added in the new
version:
AppendJumble(jstate, (const unsigned char *)"pg_temp", sizeof("pg_temp"));
AppendJumble(jstate, (const unsigned char *)rel_name, strlen(rel_name));
typedef struct RangeTblEntry { - pg_node_attr(custom_read_write) + pg_node_attr(custom_read_write, custom_query_jumble)This structure still includes some query_jumble_ignore, which are not
required once custom_query_jumble is added.
I would tend to keep them for documentation purposes. (The other
custom_query_jumble functions have a much more explicit structure so
there it is clear which fields are supposed to be jumbled.)
We had better document at the top of RangeTblEntry why we are using a
custom function.
I added a short comment just above custom_query_jumble.
Christoph
Attachments:
v2-0001-Jumble-temp-tables-by-name.patchtext/x-diff; charset=us-asciiDownload
From 70a80f9abd781f8114563ae8fc69e64123f12eeb Mon Sep 17 00:00:00 2001
From: Christoph Berg <myon@debian.org>
Date: Mon, 17 Mar 2025 17:17:17 +0100
Subject: [PATCH v2] Jumble temp tables by name
Query jumbling considers everything by OID, which is fine for regular
objects. But for temp tables, which have to be recreated in each session
(or even transaction), this means that the same temp table query run
again in the next session will never be aggregated in
pg_stat_statements. Instead, the statistics are polluted with a large
number of 1-call entries.
Fix by using the temp table name instead. This has the risk of
aggregating structurally different temp tables together if they share
the same name, but practically, the queries will likely already differ
in other details anyway. And even if not, aggregating the entries in
pg_stat_statements instead of polluting the stats seems the better
choice. (The user has still the option to simply change the name of the
temp table to have the queries separated. In the old scheme, the user
does not have any chance to change behavior.)
---
.../pg_stat_statements/expected/select.out | 31 ++++++++++++++++
contrib/pg_stat_statements/sql/select.sql | 12 +++++++
src/backend/nodes/queryjumblefuncs.c | 36 +++++++++++++++++++
src/include/nodes/parsenodes.h | 3 +-
4 files changed, 81 insertions(+), 1 deletion(-)
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 37a30af034a..8a7e237298c 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -241,6 +241,37 @@ SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
(12 rows)
DROP TABLE pgss_a, pgss_b CASCADE;
+-- temp tables
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -- one query with two calls
+ calls | query
+-------+------------------------------------------------------------------------
+ 2 | SELECT * FROM temp_t
+ 0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+ 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(3 rows)
+
--
-- access to pg_stat_statements_info view
--
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index e0be58d5e24..81b9d50ecec 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -90,6 +90,18 @@ SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
DROP TABLE pgss_a, pgss_b CASCADE;
+-- temp tables
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -- one query with two calls
+
--
-- access to pg_stat_statements_info view
--
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index b103a281936..1e18f1ddf5f 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -32,10 +32,12 @@
*/
#include "postgres.h"
+#include "catalog/namespace.h"
#include "common/hashfn.h"
#include "miscadmin.h"
#include "nodes/queryjumble.h"
#include "parser/scansup.h"
+#include "utils/lsyscache.h"
#define JUMBLE_SIZE 1024 /* query serialization buffer size */
@@ -58,6 +60,7 @@ static void _jumbleNode(JumbleState *jstate, Node *node);
static void _jumbleA_Const(JumbleState *jstate, Node *node);
static void _jumbleList(JumbleState *jstate, Node *node);
static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry(JumbleState *jstate, Node *node);
/*
* Given a possibly multi-statement source string, confine our attention to the
@@ -377,3 +380,36 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
JUMBLE_FIELD(is_local);
JUMBLE_LOCATION(location);
}
+
+static void
+_jumbleRangeTblEntry(JumbleState *jstate, Node *node)
+{
+ RangeTblEntry *expr = (RangeTblEntry *) node;
+ char *rel_name;
+
+ JUMBLE_FIELD(rtekind);
+
+ /*
+ * If this is a temp table, jumble the name instead of the table OID.
+ */
+ if (expr->rtekind == RTE_RELATION && isAnyTempNamespace(get_rel_namespace(expr->relid)))
+ {
+ rel_name = get_rel_name(expr->relid);
+ AppendJumble(jstate, (const unsigned char *)"pg_temp", sizeof("pg_temp"));
+ AppendJumble(jstate, (const unsigned char *)rel_name, strlen(rel_name));
+ }
+ else
+ JUMBLE_FIELD(relid);
+
+ JUMBLE_FIELD(inh);
+ JUMBLE_NODE(tablesample);
+ JUMBLE_NODE(subquery);
+ JUMBLE_FIELD(jointype);
+ JUMBLE_NODE(functions);
+ JUMBLE_FIELD(funcordinality);
+ JUMBLE_NODE(tablefunc);
+ JUMBLE_NODE(values_lists);
+ JUMBLE_STRING(ctename);
+ JUMBLE_FIELD(ctelevelsup);
+ JUMBLE_STRING(enrname);
+}
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf..f0f3bd2b95f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1039,7 +1039,8 @@ typedef enum RTEKind
typedef struct RangeTblEntry
{
- pg_node_attr(custom_read_write)
+ /* custom_query_jumble for handling temp tables specially */
+ pg_node_attr(custom_read_write, custom_query_jumble)
NodeTag type;
--
2.47.2
On Tue, Mar 18, 2025 at 05:51:54PM +0100, Christoph Berg wrote:
I had thought about it, but figured that integers and strings are
already separate namespaces, so hashing them shouldn't have any
conflicts. But it's more clear to do that, so added in the new
version:AppendJumble(jstate, (const unsigned char *)"pg_temp", sizeof("pg_temp"));
AppendJumble(jstate, (const unsigned char *)rel_name, strlen(rel_name));
Yes, I know that it's not a big deal, but it could be confusing if
somebody makes an argument about jumbling more object names for
relations. The problem is not only limited to relations, though, as
there are other object types that can use a temp namespace like
functions, but the case of table entries should cover most of the
complaints I can imagine.
typedef struct RangeTblEntry { - pg_node_attr(custom_read_write) + pg_node_attr(custom_read_write, custom_query_jumble)This structure still includes some query_jumble_ignore, which are not
required once custom_query_jumble is added.I would tend to keep them for documentation purposes. (The other
custom_query_jumble functions have a much more explicit structure so
there it is clear which fields are supposed to be jumbled.)
Fine by me as well to keep a dependency based on the fact that the
structure is rather complicated, but I'd rather document that as well
in parsenodes.h with a slightly fatter comment. What do you think?
--
Michael
Re: Michael Paquier
Fine by me as well to keep a dependency based on the fact that the
structure is rather complicated, but I'd rather document that as well
in parsenodes.h with a slightly fatter comment. What do you think?
You are of course right, that one-line comment was just snakeoil :D.
Now there are proper ones, thanks.
Christoph
Attachments:
v3-0001-Jumble-temp-tables-by-name.patchtext/x-diff; charset=us-asciiDownload
From 725b6c65160a44678b2c43b4af0a3dc9c281830d Mon Sep 17 00:00:00 2001
From: Christoph Berg <myon@debian.org>
Date: Mon, 17 Mar 2025 17:17:17 +0100
Subject: [PATCH v3] Jumble temp tables by name
Query jumbling considers everything by OID, which is fine for regular
objects. But for temp tables, which have to be recreated in each session
(or even transaction), this means that the same temp table query run
again in the next session will never be aggregated in
pg_stat_statements. Instead, the statistics are polluted with a large
number of 1-call entries.
Fix by using the temp table name instead. This has the risk of
aggregating structurally different temp tables together if they share
the same name, but practically, the queries will likely already differ
in other details anyway. And even if not, aggregating the entries in
pg_stat_statements instead of polluting the stats seems the better
choice. (The user has still the option to simply change the name of the
temp table to have the queries separated. In the old scheme, the user
does not have any chance to change behavior.)
---
.../pg_stat_statements/expected/select.out | 31 +++++++++++++++
contrib/pg_stat_statements/sql/select.sql | 12 ++++++
src/backend/nodes/queryjumblefuncs.c | 39 +++++++++++++++++++
src/include/nodes/parsenodes.h | 8 +++-
4 files changed, 89 insertions(+), 1 deletion(-)
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 37a30af034a..8a7e237298c 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -241,6 +241,37 @@ SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
(12 rows)
DROP TABLE pgss_a, pgss_b CASCADE;
+-- temp tables
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -- one query with two calls
+ calls | query
+-------+------------------------------------------------------------------------
+ 2 | SELECT * FROM temp_t
+ 0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+ 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(3 rows)
+
--
-- access to pg_stat_statements_info view
--
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index e0be58d5e24..81b9d50ecec 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -90,6 +90,18 @@ SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
DROP TABLE pgss_a, pgss_b CASCADE;
+-- temp tables
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -- one query with two calls
+
--
-- access to pg_stat_statements_info view
--
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 189bfda610a..d5b3134cc2f 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -33,6 +33,7 @@
#include "postgres.h"
#include "access/transam.h"
+#include "catalog/namespace.h"
#include "catalog/pg_proc.h"
#include "common/hashfn.h"
#include "miscadmin.h"
@@ -40,6 +41,7 @@
#include "nodes/queryjumble.h"
#include "utils/lsyscache.h"
#include "parser/scansup.h"
+#include "utils/lsyscache.h"
#define JUMBLE_SIZE 1024 /* query serialization buffer size */
@@ -67,6 +69,7 @@ static void _jumbleElements(JumbleState *jstate, List *elements);
static void _jumbleA_Const(JumbleState *jstate, Node *node);
static void _jumbleList(JumbleState *jstate, Node *node);
static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry(JumbleState *jstate, Node *node);
/*
* Given a possibly multi-statement source string, confine our attention to the
@@ -513,3 +516,39 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
JUMBLE_FIELD(is_local);
JUMBLE_LOCATION(location);
}
+
+/*
+ * RangeTblEntry is jumbled exactly like the automatic jumbling would do, but
+ * with using the name of temp tables instead of OID. Since temp tables have to
+ * be recreated for each session (or even transaction), their OID is useless
+ * for fingerprinting.
+ */
+static void
+_jumbleRangeTblEntry(JumbleState *jstate, Node *node)
+{
+ RangeTblEntry *expr = (RangeTblEntry *) node;
+ char *rel_name;
+
+ JUMBLE_FIELD(rtekind);
+
+ if (expr->rtekind == RTE_RELATION && isAnyTempNamespace(get_rel_namespace(expr->relid)))
+ {
+ rel_name = get_rel_name(expr->relid);
+ AppendJumble(jstate, (const unsigned char *)"pg_temp", sizeof("pg_temp"));
+ AppendJumble(jstate, (const unsigned char *)rel_name, strlen(rel_name));
+ }
+ else
+ JUMBLE_FIELD(relid);
+
+ JUMBLE_FIELD(inh);
+ JUMBLE_NODE(tablesample);
+ JUMBLE_NODE(subquery);
+ JUMBLE_FIELD(jointype);
+ JUMBLE_NODE(functions);
+ JUMBLE_FIELD(funcordinality);
+ JUMBLE_NODE(tablefunc);
+ JUMBLE_NODE(values_lists);
+ JUMBLE_STRING(ctename);
+ JUMBLE_FIELD(ctelevelsup);
+ JUMBLE_STRING(enrname);
+}
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf..4fbe1520548 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1019,6 +1019,12 @@ typedef struct PartitionCmd
* rewriter to implement security-barrier views and/or row-level security.
* Note that the planner turns each boolean expression into an implicitly
* AND'ed sublist, as is its usual habit with qualification expressions.
+ *
+ * RangeTblEntry uses a custom query jumble function to hash temp tables by
+ * name instead of by OID. The query_jumble_ignore markers on struct members
+ * are still kept for documentation; if the custom_query_jumble attribute is
+ * dropped, the automatically generated _jumbleRangeTblEntry function should
+ * be identical except for the relid.
*--------------------
*/
typedef enum RTEKind
@@ -1039,7 +1045,7 @@ typedef enum RTEKind
typedef struct RangeTblEntry
{
- pg_node_attr(custom_read_write)
+ pg_node_attr(custom_read_write, custom_query_jumble)
NodeTag type;
--
2.47.2
On Wed, Mar 19, 2025 at 01:02:54PM +0100, Christoph Berg wrote:
You are of course right, that one-line comment was just snakeoil :D.
Now there are proper ones, thanks.
I have been thinking about this patch for a couple of days. What
makes me unhappy in this proposal is that RangeTblEntry is a large and
complicated Node, and we only want to force a custom computation for
the "relid" portion of the node, while being able to also take some
decisions for what the parent node has. Leaving the existing
per-field query_jumble_ignore with the custom function is prone to
errors, as well, because we need to maintain some level of consistency
between parsenodes.h and src/backend/nodes/.
Hence here is a counter-proposal, that can bring the best of both
worlds: let's add support for custom_query_jumble at field level.
I've spent some time on that, and some concatenation in a macro used
by gen_node_support.pl to force a policy for the custom function name
and its arguments is proving to be rather straight-forward. This
approach addresses the problem of this thread, while also tackling my
concerns about complex node structures.
The custom functions are named _jumble${node}_${field}, with the field
and the parent node given as arguments. I agree that giving the field
is kind of pointless if you have the parent node, but I think that
this is better so as this forces developers to think about how to use
the field value with the node.
Please see the attached. What do you think?
--
Michael
Attachments:
v3-0001-Add-support-for-custom_query_jumble-at-field-leve.patchtext/x-diff; charset=us-asciiDownload
From 11e3154cabdc24feb14e91d35c0cfee5f6c0ca2c Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 21 Mar 2025 09:35:38 +0900
Subject: [PATCH v3 1/2] Add support for custom_query_jumble at field-level
This option gives the possibility for query jumbling to force custom
jumbling policies specific to a field of a node. When dealing with
complex node structures, this is simpler than having to enforce a custom
function across the full node.
Custom functions need to be defined as _jumble${node}_${field}, are
given in input the parent node and the field value. The field value is
not really required if we have the parent Node, but it makes custom
implementations easier to follow with field-specific definitions and
values. The code generated by gen_node_support.pl uses a macro called
JUMBLE_CUSTOM(), that hides the objects specific to queryjumblefuncs.c.
---
src/include/nodes/nodes.h | 4 ++++
src/backend/nodes/gen_node_support.pl | 11 +++++++++++
src/backend/nodes/queryjumblefuncs.c | 6 ++++++
3 files changed, 21 insertions(+)
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d18044b4e650..adec68a45ab8 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -54,6 +54,7 @@ typedef enum NodeTag
* readfuncs.c.
*
* - custom_query_jumble: Has custom implementation in queryjumblefuncs.c.
+ * Available as well as a per-field attribute.
*
* - no_copy: Does not support copyObject() at all.
*
@@ -101,6 +102,9 @@ typedef enum NodeTag
* - equal_ignore_if_zero: Ignore the field for equality if it is zero.
* (Otherwise, compare normally.)
*
+ * - custom_query_jumble: Has custom implementation in queryjumblefuncs.c
+ * applied for the field of a node. Available as well as a node attribute.
+ *
* - query_jumble_ignore: Ignore the field for the query jumbling. Note
* that typmod and collation information are usually irrelevant for the
* query jumbling.
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 7e3f335ac09d..bcae70cea7d4 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -471,6 +471,7 @@ foreach my $infile (@ARGV)
&& $attr !~ /^read_as\(\w+\)$/
&& !elem $attr,
qw(copy_as_scalar
+ custom_query_jumble
equal_as_scalar
equal_ignore
equal_ignore_if_zero
@@ -1283,12 +1284,17 @@ _jumble${n}(JumbleState *jstate, Node *node)
my $t = $node_type_info{$n}->{field_types}{$f};
my @a = @{ $node_type_info{$n}->{field_attrs}{$f} };
my $query_jumble_ignore = $struct_no_query_jumble;
+ my $query_jumble_custom = 0;
my $query_jumble_location = 0;
my $query_jumble_squash = 0;
# extract per-field attributes
foreach my $a (@a)
{
+ if ($a eq 'custom_query_jumble')
+ {
+ $query_jumble_custom = 1;
+ }
if ($a eq 'query_jumble_ignore')
{
$query_jumble_ignore = 1;
@@ -1328,6 +1334,11 @@ _jumble${n}(JumbleState *jstate, Node *node)
unless $query_jumble_ignore;
}
}
+ elsif ($query_jumble_custom)
+ {
+ # Custom function that applies to one field of a node.
+ print $jff "\tJUMBLE_CUSTOM($n, $f);\n";
+ }
elsif ($t eq 'char*')
{
print $jff "\tJUMBLE_STRING($f);\n"
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 189bfda610aa..9b9cf6f34381 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -333,6 +333,12 @@ do { \
if (expr->str) \
AppendJumble(jstate, (const unsigned char *) (expr->str), strlen(expr->str) + 1); \
} while(0)
+/*
+ * Note that the argument types are enforced for the per-field custom
+ * functions.
+ */
+#define JUMBLE_CUSTOM(nodetype, item) \
+ _jumble##nodetype##_##item(jstate, expr, expr->item)
#include "queryjumblefuncs.funcs.c"
--
2.49.0
v3-0002-Add-custom-query-jumble-function-for-RangeTblEntr.patchtext/x-diff; charset=us-asciiDownload
From 1c5b1a99aee4d0872d42b6edfdaab1266d13a522 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 21 Mar 2025 09:43:48 +0900
Subject: [PATCH v3 2/2] Add custom query jumble function for
RangeTblEntry.relid
---
src/include/nodes/parsenodes.h | 9 +++--
src/backend/nodes/queryjumblefuncs.c | 27 +++++++++++++++
.../pg_stat_statements/expected/utility.out | 33 +++++++++++++++++++
contrib/pg_stat_statements/sql/utility.sql | 12 +++++++
4 files changed, 79 insertions(+), 2 deletions(-)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf2..3eb16846e0e1 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1093,8 +1093,13 @@ typedef struct RangeTblEntry
* relation. This allows plans referencing AFTER trigger transition
* tables to be invalidated if the underlying table is altered.
*/
- /* OID of the relation */
- Oid relid;
+
+ /*
+ * OID of the relation. A custom query jumble function is used here for
+ * temporary tables, where the computation uses the relation name instead
+ * of the OID.
+ */
+ Oid relid pg_node_attr(custom_query_jumble);
/* inheritance requested? */
bool inh;
/* relation kind (see pg_class.relkind) */
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 9b9cf6f34381..fbb05eab1bbe 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -33,6 +33,7 @@
#include "postgres.h"
#include "access/transam.h"
+#include "catalog/namespace.h"
#include "catalog/pg_proc.h"
#include "common/hashfn.h"
#include "miscadmin.h"
@@ -67,6 +68,9 @@ static void _jumbleElements(JumbleState *jstate, List *elements);
static void _jumbleA_Const(JumbleState *jstate, Node *node);
static void _jumbleList(JumbleState *jstate, Node *node);
static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry_relid(JumbleState *jstate,
+ RangeTblEntry *expr,
+ Oid relid);
/*
* Given a possibly multi-statement source string, confine our attention to the
@@ -519,3 +523,26 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
JUMBLE_FIELD(is_local);
JUMBLE_LOCATION(location);
}
+
+/*
+ * Custom query jumble function for RangeTblEntry.relid.
+ */
+static void
+_jumbleRangeTblEntry_relid(JumbleState *jstate,
+ RangeTblEntry *expr,
+ Oid relid)
+{
+ /*
+ * If this is a temporary table, jumble its name instead of the table OID.
+ */
+ if (expr->rtekind == RTE_RELATION &&
+ isAnyTempNamespace(get_rel_namespace(expr->relid)))
+ {
+ char *relname = get_rel_name(expr->relid);
+
+ AppendJumble(jstate, (const unsigned char *) "pg_temp", sizeof("pg_temp"));
+ AppendJumble(jstate, (const unsigned char *) relname, strlen(relname));
+ }
+ else
+ JUMBLE_FIELD(relid);
+}
diff --git a/contrib/pg_stat_statements/expected/utility.out b/contrib/pg_stat_statements/expected/utility.out
index aa4f0f7e6280..941ba0e66deb 100644
--- a/contrib/pg_stat_statements/expected/utility.out
+++ b/contrib/pg_stat_statements/expected/utility.out
@@ -9,6 +9,39 @@ SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
(1 row)
+-- Temporary tables, grouped together.
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ calls | query
+-------+----------------------------------------------------
+ 2 | BEGIN
+ 2 | COMMIT
+ 2 | CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP
+ 2 | SELECT * FROM temp_t
+ 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(5 rows)
+
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
-- Tables, indexes, triggers
CREATE TEMP TABLE tab_stats (a int, b char(20));
CREATE INDEX index_stats ON tab_stats(b, (b || 'data1'), (b || 'data2')) WHERE a > 0;
diff --git a/contrib/pg_stat_statements/sql/utility.sql b/contrib/pg_stat_statements/sql/utility.sql
index dd97203c2102..e21b656d44a8 100644
--- a/contrib/pg_stat_statements/sql/utility.sql
+++ b/contrib/pg_stat_statements/sql/utility.sql
@@ -6,6 +6,18 @@
SET pg_stat_statements.track_utility = TRUE;
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+-- Temporary tables, grouped together.
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+
-- Tables, indexes, triggers
CREATE TEMP TABLE tab_stats (a int, b char(20));
CREATE INDEX index_stats ON tab_stats(b, (b || 'data1'), (b || 'data2')) WHERE a > 0;
--
2.49.0
Re: Michael Paquier
I have been thinking about this patch for a couple of days. What
makes me unhappy in this proposal is that RangeTblEntry is a large and
complicated Node, and we only want to force a custom computation for
the "relid" portion of the node, while being able to also take some
decisions for what the parent node has. Leaving the existing
per-field query_jumble_ignore with the custom function is prone to
errors, as well, because we need to maintain some level of consistency
between parsenodes.h and src/backend/nodes/.
Ack, that was also bothering me, but I didn't think it was so easy to
do it on a per-field level. Thanks!
The custom functions are named _jumble${node}_${field}, with the field
and the parent node given as arguments. I agree that giving the field
is kind of pointless if you have the parent node, but I think that
this is better so as this forces developers to think about how to use
the field value with the node.
Makes sense.
Please see the attached. What do you think?
Just one minor thing, I don't understand what you are trying to say in
this comment:
+/* + * Note that the argument types are enforced for the per-field custom + * functions. + */ +#define JUMBLE_CUSTOM(nodetype, item) \ + _jumble##nodetype##_##item(jstate, expr, expr->item)
Thanks,
Christoph
On Fri, Mar 21, 2025 at 05:26:20PM +0100, Christoph Berg wrote:
Just one minor thing, I don't understand what you are trying to say in
this comment:+/* + * Note that the argument types are enforced for the per-field custom + * functions. + */ +#define JUMBLE_CUSTOM(nodetype, item) \ + _jumble##nodetype##_##item(jstate, expr, expr->item)
In this one, I want to mean that we require a custom per-field
function to look like that:
_jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field);
Rather than having more generic shape like that:
_jumbleNodefoo_field(JumbleState *jstate, Node *exp,
const unsigned char *item, Size size);
So a custom function is defined so as the node type and field type are
arguments. Perhaps this comment would be better if reworded like
that:
"The arguments of this function use the node type and the field type,
rather than a generic argument like AppendJumble() and the other
_jumble() functions."
If you have a better idea, please feel free..
--
Michael
Re: Michael Paquier
+ * Note that the argument types are enforced for the per-field custom + * functions. + */ +#define JUMBLE_CUSTOM(nodetype, item) \ + _jumble##nodetype##_##item(jstate, expr, expr->item)In this one, I want to mean that we require a custom per-field
function to look like that:
_jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field);Rather than having more generic shape like that:
_jumbleNodefoo_field(JumbleState *jstate, Node *exp,
const unsigned char *item, Size size);So a custom function is defined so as the node type and field type are
arguments. Perhaps this comment would be better if reworded like
that:
"The arguments of this function use the node type and the field type,
rather than a generic argument like AppendJumble() and the other
_jumble() functions."
Perhaps this:
/*
* The per-field custom jumble functions get jstate, the node, and the
* field as arguments.
*/
They are not actually different from _jumbleList and _jumbleA_Const
which also get the node (and just not the field). AppendJumble is a
different layer, the output, so it's not surprising its signature is
different.
Are we at the point where the patch is already Ready for Committer?
Thanks,
Christoph
Re: To Michael Paquier
+#define JUMBLE_CUSTOM(nodetype, item) \ + _jumble##nodetype##_##item(jstate, expr, expr->item)In this one, I want to mean that we require a custom per-field
function to look like that:
_jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field);Perhaps this:
Or actually more explicit:
/*
* Per-field custom jumble functions have this signature:
* _jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field);
*/
Christoph
On Sat, Mar 22, 2025 at 10:43:00AM +0100, Christoph Berg wrote:
Are we at the point where the patch is already Ready for Committer?
I'll think a bit more about how to document all that. Anyway, yes,
I'm OK with the per-field custom_query_jumble, so let's move on with
that, so I will do something about that.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Sat, Mar 22, 2025 at 10:43:00AM +0100, Christoph Berg wrote:
Are we at the point where the patch is already Ready for Committer?
I'll think a bit more about how to document all that. Anyway, yes,
I'm OK with the per-field custom_query_jumble, so let's move on with
that, so I will do something about that.
I'm not terribly happy with the entire proposal.
(1) I know it was asserted upthread that there was no performance
impact, but I find that hard to believe.
(2) This patch inserts catalog lookups into query ID computation,
which AFAIK there never were before. This means you can't compute a
query ID outside a transaction or in an already-failed transaction.
Surely that's going to bite us eventually.
(3) I think having query jumbling work differently for temp tables
than other tables is a fundamentally bad idea.
So my feeling is: if we think this is the behavior we want, let's do
it across the board. I suggest that we simply drop the relid from the
jumble and use the table alias (that is, eref->aliasname) instead.
ISTM this fits well with the general trend in pg_stat_statements
to merge statements together more aggressively than the original
concept envisioned.
regards, tom lane
I wrote:
So my feeling is: if we think this is the behavior we want, let's do
it across the board. I suggest that we simply drop the relid from the
jumble and use the table alias (that is, eref->aliasname) instead.
I experimented with this trivial fix (shown in-line to keep the cfbot
from thinking this is the patch-of-record):
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf..a54bbdc18b7 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1051,7 +1051,7 @@ typedef struct RangeTblEntry
/* user-written alias clause, if any */
Alias *alias pg_node_attr(query_jumble_ignore);
/* expanded reference names */
- Alias *eref pg_node_attr(query_jumble_ignore);
+ Alias *eref;
RTEKind rtekind; /* see above */
@@ -1094,7 +1094,7 @@ typedef struct RangeTblEntry
* tables to be invalidated if the underlying table is altered.
*/
/* OID of the relation */
- Oid relid;
+ Oid relid pg_node_attr(query_jumble_ignore);
/* inheritance requested? */
bool inh;
/* relation kind (see pg_class.relkind) */
This caused just one diff in existing regression test cases:
diff --git a/contrib/pg_stat_statements/expected/planning.out b/contrib/pg_stat_statements/expected/planning.out
index 3ee1928cbe9..c25b8b946fd 100644
--- a/contrib/pg_stat_statements/expected/planning.out
+++ b/contrib/pg_stat_statements/expected/planning.out
@@ -75,8 +75,9 @@ SELECT plans >= 2 AND plans <= calls AS plans_ok, calls, rows, query FROM pg_sta
WHERE query LIKE 'SELECT COUNT%' ORDER BY query COLLATE "C";
plans_ok | calls | rows | query
----------+-------+------+--------------------------------------
- t | 4 | 4 | SELECT COUNT(*) FROM stats_plan_test
-(1 row)
+ f | 1 | 1 | SELECT COUNT(*) FROM stats_plan_test
+ f | 3 | 3 | SELECT COUNT(*) FROM stats_plan_test
+(2 rows)
-- Cleanup
DROP TABLE stats_plan_test;
What's happening there is that there's an ALTER TABLE ADD COLUMN in
the test, so the executions after the first one see more entries
in eref->colnames and come up with a different jumble. I think
we probably don't want that behavior; we only want to jumble the
table name. So we'd still need the v3-0001 patch in some form to
allow annotating RangeTblEntry.eref with a custom jumble method
that'd only jumble the aliasname.
regards, tom lane
On Sat, Mar 22, 2025 at 12:24:43PM -0400, Tom Lane wrote:
I experimented with this trivial fix (shown in-line to keep the cfbot
from thinking this is the patch-of-record):What's happening there is that there's an ALTER TABLE ADD COLUMN in
the test, so the executions after the first one see more entries
in eref->colnames and come up with a different jumble. I think
we probably don't want that behavior; we only want to jumble the
table name. So we'd still need the v3-0001 patch in some form to
allow annotating RangeTblEntry.eref with a custom jumble method
that'd only jumble the aliasname.
Alias.aliasname is not qualified, so it means that we'd begin to
assign the same query ID even if using two relations from two schemas
depending on what search_path assigns, no? Say:
create schema popo1;
create schema popo2;
create table popo1.aa (a int, b int);
create table popo2.aa (a int, b int);
set search_path = 'popo1';
select count(*) from aa;
set search_path = 'popo2';
select count(*) from aa;
=# select query, calls from pg_stat_statements where
query ~ 'select count';
query | calls
-------------------------+-------
select count(*) from aa | 2
(1 row)
Perhaps that's OK because such queries use the same query string, but
just silencing the relid means that we'd lose the namespace reference
entirely, making the stats potentially fuzzier depending on the
workload. On HEAD, one can guess the query ID with an EXPLAIN and a
search_path, as well, so currently it's possible to cross-check the
contents of pgss. But we'd lose this possibility here..
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
Alias.aliasname is not qualified, so it means that we'd begin to
assign the same query ID even if using two relations from two schemas
depending on what search_path assigns, no?
Right. I'm arguing that that's good. The proposed patch already
obscures the difference between similar table names in different
(temp) schemas, and I'm suggesting that taking that a bit further
would be fine.
Note that if the tables we're considering don't have identical
rowtypes, the queries would likely jumble differently anyway
due to differences in Vars' varattno and vartype.
regards, tom lane
On Sat, Mar 22, 2025 at 09:04:19PM -0400, Tom Lane wrote:
Right. I'm arguing that that's good. The proposed patch already
obscures the difference between similar table names in different
(temp) schemas, and I'm suggesting that taking that a bit further
would be fine.Note that if the tables we're considering don't have identical
rowtypes, the queries would likely jumble differently anyway
due to differences in Vars' varattno and vartype.
Not for the types AFAIK, the varattnos count in, but perhaps for the
same argument as previously it's just kind of OK? Please see the
tests in the attached about that.
I've spent a few hours looking at the diffs of a pgss dump before and
after the fact. The reduction in the number of entries seem to come
mainly from tests where we do a successive creates and drops of the
same table name. There are quite a few of them for updatable views,
but it's not the only one. A good chunk comes from tables with
simpler and rather generic names.
So your idea to use the relation name in eref while skipping the
column list looks kind of promising. Per se the attached. Thoughts?
--
Michael
Attachments:
v4-0001-Add-support-for-custom_query_jumble-at-field-leve.patchtext/x-diff; charset=us-asciiDownload
From fd2bc0e26134b7d74fe33da99eab6623a9859f22 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Sun, 23 Mar 2025 15:33:05 +0900
Subject: [PATCH v4 1/2] Add support for custom_query_jumble at field-level
This option gives the possibility for query jumbling to force custom
jumbling policies specific to a field of a node. When dealing with
complex node structures, this is simpler than having to enforce a custom
function across the full node.
Custom functions need to be defined as _jumble${node}_${field}, are
given in input the parent node and the field value. The field value is
not really required if we have the parent Node, but it makes custom
implementations easier to follow with field-specific definitions and
values. The code generated by gen_node_support.pl uses a macro called
JUMBLE_CUSTOM(), that hides the objects specific to queryjumblefuncs.c.
---
src/include/nodes/nodes.h | 4 ++++
src/backend/nodes/gen_node_support.pl | 15 +++++++++++++--
src/backend/nodes/queryjumblefuncs.c | 6 ++++++
3 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d18044b4e650..adec68a45ab8 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -54,6 +54,7 @@ typedef enum NodeTag
* readfuncs.c.
*
* - custom_query_jumble: Has custom implementation in queryjumblefuncs.c.
+ * Available as well as a per-field attribute.
*
* - no_copy: Does not support copyObject() at all.
*
@@ -101,6 +102,9 @@ typedef enum NodeTag
* - equal_ignore_if_zero: Ignore the field for equality if it is zero.
* (Otherwise, compare normally.)
*
+ * - custom_query_jumble: Has custom implementation in queryjumblefuncs.c
+ * applied for the field of a node. Available as well as a node attribute.
+ *
* - query_jumble_ignore: Ignore the field for the query jumbling. Note
* that typmod and collation information are usually irrelevant for the
* query jumbling.
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 7e3f335ac09d..2d62ecb1d9d6 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -471,6 +471,7 @@ foreach my $infile (@ARGV)
&& $attr !~ /^read_as\(\w+\)$/
&& !elem $attr,
qw(copy_as_scalar
+ custom_query_jumble
equal_as_scalar
equal_ignore
equal_ignore_if_zero
@@ -1283,12 +1284,17 @@ _jumble${n}(JumbleState *jstate, Node *node)
my $t = $node_type_info{$n}->{field_types}{$f};
my @a = @{ $node_type_info{$n}->{field_attrs}{$f} };
my $query_jumble_ignore = $struct_no_query_jumble;
+ my $query_jumble_custom = 0;
my $query_jumble_location = 0;
my $query_jumble_squash = 0;
# extract per-field attributes
foreach my $a (@a)
{
+ if ($a eq 'custom_query_jumble')
+ {
+ $query_jumble_custom = 1;
+ }
if ($a eq 'query_jumble_ignore')
{
$query_jumble_ignore = 1;
@@ -1304,8 +1310,13 @@ _jumble${n}(JumbleState *jstate, Node *node)
}
# node type
- if (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/)
- and elem $1, @node_types)
+ if ($query_jumble_custom)
+ {
+ # Custom function that applies to one field of a node.
+ print $jff "\tJUMBLE_CUSTOM($n, $f);\n";
+ }
+ elsif (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/)
+ and elem $1, @node_types)
{
# Squash constants if requested.
if ($query_jumble_squash)
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 189bfda610aa..9b9cf6f34381 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -333,6 +333,12 @@ do { \
if (expr->str) \
AppendJumble(jstate, (const unsigned char *) (expr->str), strlen(expr->str) + 1); \
} while(0)
+/*
+ * Note that the argument types are enforced for the per-field custom
+ * functions.
+ */
+#define JUMBLE_CUSTOM(nodetype, item) \
+ _jumble##nodetype##_##item(jstate, expr, expr->item)
#include "queryjumblefuncs.funcs.c"
--
2.49.0
v4-0002-Add-custom-query-jumble-function-for-RangeTblEntr.patchtext/x-diff; charset=us-asciiDownload
From e91d0ff80c3670d2a7656e1711581c4df2a58c21 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Sun, 23 Mar 2025 15:34:29 +0900
Subject: [PATCH v4 2/2] Add custom query jumble function for
RangeTblEntry.eref
---
src/include/nodes/parsenodes.h | 10 +-
src/backend/nodes/queryjumblefuncs.c | 20 ++
.../pg_stat_statements/expected/select.out | 191 ++++++++++++++++++
contrib/pg_stat_statements/sql/select.sql | 57 ++++++
4 files changed, 275 insertions(+), 3 deletions(-)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf2..36363de3a769 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1050,8 +1050,12 @@ typedef struct RangeTblEntry
*/
/* user-written alias clause, if any */
Alias *alias pg_node_attr(query_jumble_ignore);
- /* expanded reference names */
- Alias *eref pg_node_attr(query_jumble_ignore);
+ /*
+ * Expanded reference names. This uses a custom query jumble function
+ * so as the table name is included in the computation, not its list
+ * of columns.
+ */
+ Alias *eref pg_node_attr(custom_query_jumble);
RTEKind rtekind; /* see above */
@@ -1094,7 +1098,7 @@ typedef struct RangeTblEntry
* tables to be invalidated if the underlying table is altered.
*/
/* OID of the relation */
- Oid relid;
+ Oid relid pg_node_attr(query_jumble_ignore);
/* inheritance requested? */
bool inh;
/* relation kind (see pg_class.relkind) */
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 9b9cf6f34381..896bdc663a61 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -33,6 +33,7 @@
#include "postgres.h"
#include "access/transam.h"
+#include "catalog/namespace.h"
#include "catalog/pg_proc.h"
#include "common/hashfn.h"
#include "miscadmin.h"
@@ -67,6 +68,9 @@ static void _jumbleElements(JumbleState *jstate, List *elements);
static void _jumbleA_Const(JumbleState *jstate, Node *node);
static void _jumbleList(JumbleState *jstate, Node *node);
static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry_eref(JumbleState *jstate,
+ RangeTblEntry *rte,
+ Alias *expr);
/*
* Given a possibly multi-statement source string, confine our attention to the
@@ -519,3 +523,19 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
JUMBLE_FIELD(is_local);
JUMBLE_LOCATION(location);
}
+
+/*
+ * Custom query jumble function for RangeTblEntry.eref.
+ */
+static void
+_jumbleRangeTblEntry_eref(JumbleState *jstate,
+ RangeTblEntry *rte,
+ Alias *expr)
+{
+ JUMBLE_FIELD(type);
+ /*
+ * This includes only the table name, the list of column names is
+ * ignored.
+ */
+ JUMBLE_STRING(aliasname);
+}
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 37a30af034a6..218717b9a765 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -413,3 +413,194 @@ SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
(1 row)
+-- Temporary tables, grouped together.
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ calls | query
+-------+------------------------------------------------------------------------
+ 2 | SELECT * FROM temp_t
+ 0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+ 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(3 rows)
+
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+-- search_path with various schemas and temporary tables
+CREATE SCHEMA pgss_schema_1;
+CREATE SCHEMA pgss_schema_2;
+-- Same attributes.
+CREATE TABLE pgss_schema_1.tab_search_same (a int, b int);
+CREATE TABLE pgss_schema_2.tab_search_same (a int, b int);
+CREATE TEMP TABLE tab_search_same (a int, b int);
+-- Different number of attributes, mapping types
+CREATE TABLE pgss_schema_1.tab_search_diff_1 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_1 (a int, b int);
+CREATE TEMP TABLE tab_search_diff_1 (a int, b int, c int);
+-- Same number of attributes, different types
+CREATE TABLE pgss_schema_1.tab_search_diff_2 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_2 (a text);
+CREATE TEMP TABLE tab_search_diff_2 (a bigint);
+SET search_path = 'pgss_schema_1';
+SELECT count(*) FROM tab_search_same;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a, b FROM tab_search_same;
+ a | b
+---+---
+(0 rows)
+
+SELECT count(*) FROM tab_search_diff_1;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM tab_search_diff_2;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a FROM tab_search_diff_2;
+ a
+---
+(0 rows)
+
+SET search_path = 'pgss_schema_1';
+SELECT count(*) FROM tab_search_same;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a, b FROM tab_search_same;
+ a | b
+---+---
+(0 rows)
+
+SELECT count(*) FROM tab_search_diff_1;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM tab_search_diff_2;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a FROM tab_search_diff_2;
+ a
+---
+(0 rows)
+
+SET search_path = 'pg_temp';
+SELECT count(*) FROM tab_search_same;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a, b FROM tab_search_same;
+ a | b
+---+---
+(0 rows)
+
+SELECT count(*) FROM tab_search_diff_1;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM tab_search_diff_2;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a FROM tab_search_diff_2;
+ a
+---
+(0 rows)
+
+RESET search_path;
+SELECT count(*) FROM pgss_schema_1.tab_search_same;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a, b FROM pgss_schema_1.tab_search_same;
+ a | b
+---+---
+(0 rows)
+
+SELECT count(*) FROM pgss_schema_2.tab_search_diff_1;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM pg_temp.tab_search_diff_2;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a FROM pgss_schema_2.tab_search_diff_2;
+ a
+---
+(0 rows)
+
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ calls | query
+-------+------------------------------------------------------------------------
+ 4 | SELECT a FROM tab_search_diff_2
+ 4 | SELECT a, b FROM tab_search_same
+ 0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+ 4 | SELECT count(*) FROM tab_search_diff_1
+ 4 | SELECT count(*) FROM tab_search_diff_2
+ 4 | SELECT count(*) FROM tab_search_same
+ 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(7 rows)
+
+DROP SCHEMA pgss_schema_1 CASCADE;
+NOTICE: drop cascades to 3 other objects
+DETAIL: drop cascades to table pgss_schema_1.tab_search_same
+drop cascades to table pgss_schema_1.tab_search_diff_1
+drop cascades to table pgss_schema_1.tab_search_diff_2
+DROP SCHEMA pgss_schema_2 CASCADE;
+NOTICE: drop cascades to 3 other objects
+DETAIL: drop cascades to table pgss_schema_2.tab_search_same
+drop cascades to table pgss_schema_2.tab_search_diff_1
+drop cascades to table pgss_schema_2.tab_search_diff_2
+DROP TABLE tab_search_same, tab_search_diff_1, tab_search_diff_2;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index e0be58d5e24b..64603f85a411 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -148,3 +148,60 @@ SELECT (
SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%SELECT GROUPING%';
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+
+-- Temporary tables, grouped together.
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+
+-- search_path with various schemas and temporary tables
+CREATE SCHEMA pgss_schema_1;
+CREATE SCHEMA pgss_schema_2;
+-- Same attributes.
+CREATE TABLE pgss_schema_1.tab_search_same (a int, b int);
+CREATE TABLE pgss_schema_2.tab_search_same (a int, b int);
+CREATE TEMP TABLE tab_search_same (a int, b int);
+-- Different number of attributes, mapping types
+CREATE TABLE pgss_schema_1.tab_search_diff_1 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_1 (a int, b int);
+CREATE TEMP TABLE tab_search_diff_1 (a int, b int, c int);
+-- Same number of attributes, different types
+CREATE TABLE pgss_schema_1.tab_search_diff_2 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_2 (a text);
+CREATE TEMP TABLE tab_search_diff_2 (a bigint);
+SET search_path = 'pgss_schema_1';
+SELECT count(*) FROM tab_search_same;
+SELECT a, b FROM tab_search_same;
+SELECT count(*) FROM tab_search_diff_1;
+SELECT count(*) FROM tab_search_diff_2;
+SELECT a FROM tab_search_diff_2;
+SET search_path = 'pgss_schema_1';
+SELECT count(*) FROM tab_search_same;
+SELECT a, b FROM tab_search_same;
+SELECT count(*) FROM tab_search_diff_1;
+SELECT count(*) FROM tab_search_diff_2;
+SELECT a FROM tab_search_diff_2;
+SET search_path = 'pg_temp';
+SELECT count(*) FROM tab_search_same;
+SELECT a, b FROM tab_search_same;
+SELECT count(*) FROM tab_search_diff_1;
+SELECT count(*) FROM tab_search_diff_2;
+SELECT a FROM tab_search_diff_2;
+RESET search_path;
+SELECT count(*) FROM pgss_schema_1.tab_search_same;
+SELECT a, b FROM pgss_schema_1.tab_search_same;
+SELECT count(*) FROM pgss_schema_2.tab_search_diff_1;
+SELECT count(*) FROM pg_temp.tab_search_diff_2;
+SELECT a FROM pgss_schema_2.tab_search_diff_2;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+DROP SCHEMA pgss_schema_1 CASCADE;
+DROP SCHEMA pgss_schema_2 CASCADE;
+DROP TABLE tab_search_same, tab_search_diff_1, tab_search_diff_2;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
--
2.49.0
Re: Michael Paquier
So your idea to use the relation name in eref while skipping the
column list looks kind of promising. Per se the attached. Thoughts?
Makes sense to me, thanks for digging into it.
+++ b/src/backend/nodes/queryjumblefuncs.c @@ -33,6 +33,7 @@ #include "postgres.h"#include "access/transam.h"
+#include "catalog/namespace.h"
No longer needed.
+++ b/contrib/pg_stat_statements/sql/select.sql +SET search_path = 'pgss_schema_1'; +SELECT count(*) FROM tab_search_same; +SELECT a, b FROM tab_search_same; +SELECT count(*) FROM tab_search_diff_1; +SELECT count(*) FROM tab_search_diff_2; +SELECT a FROM tab_search_diff_2; +SET search_path = 'pgss_schema_1';
Should this be pgss_schema_2 ?
Christoph
So your idea to use the relation name in eref while skipping the
column list looks kind of promising. Per se the attached. Thoughts?
I feel really uneasy about this behavior becoming the default.
I can bet there are some users which run common queries across
different schemas ( e.g. multi-tenancy ) will consider this behavior a
regression
in pg_stat_statements as now all their common queries have been merged
into a single entry.
For example, I have seen users add comments to SQLs to differentiate
similar SQLs coming from different tenants. This patch makes this no longer a
somewhat decent workaround to overcome the fact that pg_stat_statements
does not track schemas or search path.
```
select pg_stat_statements_reset();
set search_path = s1;
select /*+ user s1 */ * from foo;
set search_path = s2;
select /*+ user s2 */ * from foo;
reset search_path;
select userid, queryid, query, calls from public.pg_stat_statements;
test=# select userid, queryid, query, calls from public.pg_stat_statements;
userid | queryid | query | calls
--------+----------------------+-----------------------------------+-------
10 | 1788423388555345932 | select /*+ user s1 */ * from foo | 2
10 | -8935568138104064674 | select pg_stat_statements_reset() | 1
10 | -8663970364987885379 | set search_path = $1 | 2
10 | -6563543739552933350 | reset search_path | 1
(4 rows)
```
--
Sami Imseih
Amazon Web Services (AWS)
Sami Imseih <samimseih@gmail.com> writes:
For example, I have seen users add comments to SQLs to differentiate
similar SQLs coming from different tenants. This patch makes this no longer a
somewhat decent workaround to overcome the fact that pg_stat_statements
does not track schemas or search path.
Well, the workaround is different, but that doesn't mean there is no
workaround. You just need to alter a table alias in the query:
select * from foo s1;
select * from foo s2;
As against this, there is probably also a set of people who would
*like* identical queries on identically-rowtyped tables in different
schemas to be merged. Right now they have no way to make that happen.
So yeah, it's a nontrivial behavioral change. But I think on the
whole it's likely to be more useful. We could always revert the
change during beta if we get too much pushback.
regards, tom lane
select * from foo s1;
select * from foo s2;
ah, thanks for pointing this out. Not as good of a workaround as
a comment since one must change aliases, but at least there is
a workaround...
As against this, there is probably also a set of people who would
*like* identical queries on identically-rowtyped tables in different
schemas to be merged. Right now they have no way to make that happen.
I agree that some may want stats to merge for the same tables,
and others may not. I will suggest this with some hesitation, but why not
make this behavior configurable via a GUC?
We recently introduced query_id_squash_values for controlling
the merge of an IN list, maybe this is another queryId behavior we should
provide a configuration for?
--
Sami Imseih
Amazon Web Services (AWS)
Sami Imseih <samimseih@gmail.com> writes:
I agree that some may want stats to merge for the same tables,
and others may not. I will suggest this with some hesitation, but why not
make this behavior configurable via a GUC?
We recently introduced query_id_squash_values for controlling
the merge of an IN list, maybe this is another queryId behavior we should
provide a configuration for?
I don't like that GUC and I would not like this one either. We
learned years ago that GUCs that change query semantics are a bad
idea, but apparently now we have hackers who need to relearn that
lesson the hard way. (Admittedly, this isn't quite *query* semantics,
which perhaps lessens the blast radius. But I think we're still going
to regret query_id_squash_values.)
regards, tom lane
Sami Imseih <samimseih@gmail.com> writes:
I agree that some may want stats to merge for the same tables,
and others may not. I will suggest this with some hesitation, but why not
make this behavior configurable via a GUC?
We recently introduced query_id_squash_values for controlling
the merge of an IN list, maybe this is another queryId behavior we should
provide a configuration for?I don't like that GUC and I would not like this one either. We
learned years ago that GUCs that change query semantics are a bad
idea, but apparently now we have hackers who need to relearn that
lesson the hard way. (Admittedly, this isn't quite *query* semantics,
which perhaps lessens the blast radius. But I think we're still going
to regret query_id_squash_values.)
query_id_squash_values has a much weaker argument to exist than a
guc to control the use of alias vs OID. Why would anyone not want
to squash the IN list? maybe we should revisit this decision in that thread.
With that said, if everyone is OK with the behavior change of pg_s_s
with this patch, I will concede. FWIW, I do think this for most cases, the
proposed behavior is desirable. I just worry about the users who rely on the
OID being jumbled behavior, and have no way to revery.
--
Sami Imseih
Amazon Web Services (AWS)
On Mon, Mar 24, 2025 at 04:41:35PM +0100, Christoph Berg wrote:
Re: Michael Paquier
+++ b/src/backend/nodes/queryjumblefuncs.c @@ -33,6 +33,7 @@ #include "postgres.h"#include "access/transam.h"
+#include "catalog/namespace.h"No longer needed.
Indeed.
+++ b/contrib/pg_stat_statements/sql/select.sql +SET search_path = 'pgss_schema_1'; +SELECT count(*) FROM tab_search_same; +SELECT a, b FROM tab_search_same; +SELECT count(*) FROM tab_search_diff_1; +SELECT count(*) FROM tab_search_diff_2; +SELECT a FROM tab_search_diff_2; +SET search_path = 'pgss_schema_1';Should this be pgss_schema_2 ?
Oops. Nice copy-paste.
--
Michael
On Mon, Mar 24, 2025 at 09:38:35PM -0500, Sami Imseih wrote:
select * from foo s1;
select * from foo s2;ah, thanks for pointing this out. Not as good of a workaround as
a comment since one must change aliases, but at least there is
a workaround...
Exactly. Like Tom I'm not really worried about the proposal, but of
course I could prove to be wrong. I am ready to assume that bloat in
pgss entries caused by temp tables is a more common case.
So let's move on with this proposal, before the gong.
--
Michael
On Mon, Mar 24, 2025 at 10:30:59PM -0500, Sami Imseih wrote:
Sami Imseih <samimseih@gmail.com> writes:
I agree that some may want stats to merge for the same tables,
and others may not. I will suggest this with some hesitation, but why not
make this behavior configurable via a GUC?
We recently introduced query_id_squash_values for controlling
the merge of an IN list, maybe this is another queryId behavior we should
provide a configuration for?I don't like that GUC and I would not like this one either. We
learned years ago that GUCs that change query semantics are a bad
idea, but apparently now we have hackers who need to relearn that
lesson the hard way. (Admittedly, this isn't quite *query* semantics,
which perhaps lessens the blast radius. But I think we're still going
to regret query_id_squash_values.)query_id_squash_values has a much weaker argument to exist than a
guc to control the use of alias vs OID. Why would anyone not want
to squash the IN list? maybe we should revisit this decision in that thread.
This part of the thread is digressing, but I'd on the side of removing
entirely the GUC and make the grouping of IN values the default. We
still have time to discuss that during the beta cycle, so let's do so
on its related thread.
--
Michael
Exactly. Like Tom I'm not really worried about the proposal, but of
course I could prove to be wrong. I am ready to assume that bloat in
pgss entries caused by temp tables is a more common case.
I ran installcheck-parallel with the patch and without the patch
3x, and the reduction in pg_s_s bloat is visible. The same entries
are reused even when the regression database is recreated, as
expected.
## without patch run installcheck-parallel 3 times
postgres=# select count(distinct queryid), count(*) from pg_stat_statements;
count | count
-------+-------
41630 | 74776
(1 row)
## with patch run installcheck-parallel 3 times
postgres=# select count(distinct queryid), count(*) from pg_stat_statements;
count | count
-------+-------
26301 | 73030
(1 row)
This part of the thread is digressing, but I'd on the side of removing
entirely the GUC and make the grouping of IN values the default. We
still have time to discuss that during the beta cycle, so let's do so
on its related thread.
I will do that tomorrow in that thread. It would not make sense to introduce
a GUC for that behavior and not for this. So, I am glad we discussed this
here.
Looking at the patches, I could not find anything else besides what
was pointed out by Christoph earlier.
--
Sami Imseih
Amazon Web Services (AWS)
Sami Imseih <samimseih@gmail.com> writes:
I don't like that GUC and I would not like this one either. We
learned years ago that GUCs that change query semantics are a bad
idea, but apparently now we have hackers who need to relearn that
lesson the hard way.
query_id_squash_values has a much weaker argument to exist than a
guc to control the use of alias vs OID. Why would anyone not want
to squash the IN list? maybe we should revisit this decision in that thread.
I'm not opining one way or the other on whether squashing an IN list
is desirable. My point is that a GUC is a poor way to make --- or
really, avoid making --- such decisions. The reasons we took away
from previous experiments with semantics-determing GUCs were:
1. The scope of effect of a GUC is seldom what you want for such
things. There are going to be some queries in which you want behavior
A, and some in which you want behavior B, and in the worst case you
want different behaviors in different parts of the same query. It's
really painful to make that happen.
2. Tools that are not entitled to set the value of the GUC are forced
to be prepared to cope with any setting. That can be anywhere from
painful to impossible.
For the specific context of controlling how query jumbling happens,
there's still another problem: the actual statement-merging behavior of
pg_stat_statements will depend on the totality of settings of the GUC
across the entire system. It's not very clear to me what will happen
if different sessions use different settings, much less if people
change the setting intra-session; but I bet a lot of people will find
it surprising. 62d712ecf did no one any favors by marking that GUC
USERSET rather than something that would prevail system-wide.
All of these remarks apply with equal force to anything that changes
the behavior of query-jumbling, no matter the specific details.
regards, tom lane
Hi,
On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote:
I'm not opining one way or the other on whether squashing an IN list
is desirable. My point is that a GUC is a poor way to make --- or
really, avoid making --- such decisions. The reasons we took away
from previous experiments with semantics-determing GUCs were:1. The scope of effect of a GUC is seldom what you want for such
things. There are going to be some queries in which you want behavior
A, and some in which you want behavior B, and in the worst case you
want different behaviors in different parts of the same query. It's
really painful to make that happen.2. Tools that are not entitled to set the value of the GUC are forced
to be prepared to cope with any setting. That can be anywhere from
painful to impossible.
Didn't that ship already sailed in pg14 when we allowed generating custom
query_id?
Now:
For the specific context of controlling how query jumbling happens,
there's still another problem: the actual statement-merging behavior of
pg_stat_statements will depend on the totality of settings of the GUC
across the entire system. It's not very clear to me what will happen
if different sessions use different settings, much less if people
change the setting intra-session; but I bet a lot of people will find
it surprising. 62d712ecf did no one any favors by marking that GUC
USERSET rather than something that would prevail system-wide.
One of the requirement for the custom query_id was that you shouldn't be
allowed to change the algorithm dynamically, at least not unless a superuser
agrees to maybe break everything, which is why compute_query_id is marked as
PGC_SUSET. I think that the same reasoning should apply here and if the GUC is
kept it should be at least at that level.
Julien Rouhaud <rjuju123@gmail.com> writes:
On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote:
2. Tools that are not entitled to set the value of the GUC are forced
to be prepared to cope with any setting. That can be anywhere from
painful to impossible.
Didn't that ship already sailed in pg14 when we allowed generating custom
query_id?
Up to a point, perhaps. If I'm writing some kind of tool that digests
pg_stat_statements results, I think I'm entitled to disregard the
possibility that somebody is using a custom query_id that behaves in
ways I'm not expecting --- or at least, fixing my code for that is
their problem not mine. But it's much harder to take that attitude
for things that are built into core PG.
For the specific context of controlling how query jumbling happens,
there's still another problem: the actual statement-merging behavior of
pg_stat_statements will depend on the totality of settings of the GUC
across the entire system.
One of the requirement for the custom query_id was that you shouldn't be
allowed to change the algorithm dynamically, at least not unless a superuser
agrees to maybe break everything, which is why compute_query_id is marked as
PGC_SUSET. I think that the same reasoning should apply here and if the GUC is
kept it should be at least at that level.
Fully agreed. (Even SUSET is debatable in this situation, but I'm okay
with it on the principle that superusers are expected to know what
they're doing and accept the consequences.)
regards, tom lane
On Tue, Mar 25, 2025 at 01:17:22AM -0400, Tom Lane wrote:
Julien Rouhaud <rjuju123@gmail.com> writes:
On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote:
2. Tools that are not entitled to set the value of the GUC are forced
to be prepared to cope with any setting. That can be anywhere from
painful to impossible.Didn't that ship already sailed in pg14 when we allowed generating custom
query_id?Up to a point, perhaps. If I'm writing some kind of tool that digests
pg_stat_statements results, I think I'm entitled to disregard the
possibility that somebody is using a custom query_id that behaves in
ways I'm not expecting --- or at least, fixing my code for that is
their problem not mine. But it's much harder to take that attitude
for things that are built into core PG.
I see, that's fair.
On Mon, Mar 24, 2025 at 8:51 PM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Mar 24, 2025 at 09:38:35PM -0500, Sami Imseih wrote:
select * from foo s1;
select * from foo s2;ah, thanks for pointing this out. Not as good of a workaround as
a comment since one must change aliases, but at least there is
a workaround...Exactly. Like Tom I'm not really worried about the proposal, but of
course I could prove to be wrong. I am ready to assume that bloat in
pgss entries caused by temp tables is a more common case.
For what its worth, +1 on the current proposal in this thread (and doing it
without a GUC), i.e. merging a query that references the same table alias,
ignoring different schemas.
In the context of the earlier referenced one-schema-per-customer workloads:
In my experience these often not work well with pg_stat_statements today
because of their own bloat problem, just like with temp tables. You quickly
have way too many unique entries, and your query text file accumulates a
lot of duplicative entries (since the same query text gets repeated in the
text file, since its queryid is different), to the point that you can't
monitor your workload at all anymore.
Thanks,
Lukas
--
Lukas Fittl
On Mon, Mar 24, 2025 at 11:09:06PM -0700, Lukas Fittl wrote:
For what its worth, +1 on the current proposal in this thread (and doing it
without a GUC), i.e. merging a query that references the same table alias,
ignoring different schemas.
Thanks for the feedback. I have looked again at the first patch to
add custom_query_jumble as a node field attribute, adjusted some
comments, and applied it as 5ac462e2b7ac.
Attached is the second one, with more tests coverage with attribute
aliases (these being ignored exists in stable branches, but why not
while on it) and table aliases, and the fixes for the issues pointed
out by Christoph. I'll double-check all that again tomorrow. Please
find an updated version attached for now.
--
Michael
Attachments:
v5-0001-Add-custom-query-jumble-function-for-RangeTblEntr.patchtext/x-diff; charset=us-asciiDownload
From 68d363fbee484b40308a00a85329364ff0901e9b Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 25 Mar 2025 15:40:10 +0900
Subject: [PATCH v5] Add custom query jumble function for RangeTblEntry.eref
---
src/include/nodes/parsenodes.h | 11 +-
src/backend/nodes/queryjumblefuncs.c | 19 ++
.../pg_stat_statements/expected/select.out | 236 ++++++++++++++++++
contrib/pg_stat_statements/sql/select.sql | 69 +++++
4 files changed, 332 insertions(+), 3 deletions(-)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf2..a87f949b389e 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1050,8 +1050,13 @@ typedef struct RangeTblEntry
*/
/* user-written alias clause, if any */
Alias *alias pg_node_attr(query_jumble_ignore);
- /* expanded reference names */
- Alias *eref pg_node_attr(query_jumble_ignore);
+
+ /*
+ * Expanded reference names. This uses a custom query jumble function so
+ * as the table name is included in the computation, not its list of
+ * columns.
+ */
+ Alias *eref pg_node_attr(custom_query_jumble);
RTEKind rtekind; /* see above */
@@ -1094,7 +1099,7 @@ typedef struct RangeTblEntry
* tables to be invalidated if the underlying table is altered.
*/
/* OID of the relation */
- Oid relid;
+ Oid relid pg_node_attr(query_jumble_ignore);
/* inheritance requested? */
bool inh;
/* relation kind (see pg_class.relkind) */
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index f8b0f91704ba..62d6cfb7ac15 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -67,6 +67,9 @@ static void _jumbleElements(JumbleState *jstate, List *elements);
static void _jumbleA_Const(JumbleState *jstate, Node *node);
static void _jumbleList(JumbleState *jstate, Node *node);
static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry_eref(JumbleState *jstate,
+ RangeTblEntry *rte,
+ Alias *expr);
/*
* Given a possibly multi-statement source string, confine our attention to the
@@ -516,3 +519,19 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
JUMBLE_FIELD(is_local);
JUMBLE_LOCATION(location);
}
+
+/*
+ * Custom query jumble function for RangeTblEntry.eref.
+ */
+static void
+_jumbleRangeTblEntry_eref(JumbleState *jstate,
+ RangeTblEntry *rte,
+ Alias *expr)
+{
+ JUMBLE_FIELD(type);
+
+ /*
+ * This includes only the table name, the list of column names is ignored.
+ */
+ JUMBLE_STRING(aliasname);
+}
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 37a30af034a6..bf05d521e866 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -413,3 +413,239 @@ SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
(1 row)
+-- Temporary tables, grouped together.
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ calls | query
+-------+------------------------------------------------------------------------
+ 2 | SELECT * FROM temp_t
+ 0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+ 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(3 rows)
+
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+-- search_path with various schemas and temporary tables
+CREATE SCHEMA pgss_schema_1;
+CREATE SCHEMA pgss_schema_2;
+-- Same attributes.
+CREATE TABLE pgss_schema_1.tab_search_same (a int, b int);
+CREATE TABLE pgss_schema_2.tab_search_same (a int, b int);
+CREATE TEMP TABLE tab_search_same (a int, b int);
+-- Different number of attributes, mapping types
+CREATE TABLE pgss_schema_1.tab_search_diff_1 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_1 (a int, b int);
+CREATE TEMP TABLE tab_search_diff_1 (a int, b int, c int);
+-- Same number of attributes, different types
+CREATE TABLE pgss_schema_1.tab_search_diff_2 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_2 (a text);
+CREATE TEMP TABLE tab_search_diff_2 (a bigint);
+-- First permanent schema
+SET search_path = 'pgss_schema_1';
+SELECT count(*) FROM tab_search_same;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a, b FROM tab_search_same;
+ a | b
+---+---
+(0 rows)
+
+SELECT count(*) FROM tab_search_diff_1;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM tab_search_diff_2;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a FROM tab_search_diff_2 AS t1;
+ a
+---
+(0 rows)
+
+SELECT a FROM tab_search_diff_2;
+ a
+---
+(0 rows)
+
+SELECT a AS a1 FROM tab_search_diff_2;
+ a1
+----
+(0 rows)
+
+-- Second permanent schema
+SET search_path = 'pgss_schema_2';
+SELECT count(*) FROM tab_search_same;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a, b FROM tab_search_same;
+ a | b
+---+---
+(0 rows)
+
+SELECT count(*) FROM tab_search_diff_1;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM tab_search_diff_2;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a FROM tab_search_diff_2 AS t1;
+ a
+---
+(0 rows)
+
+SELECT a FROM tab_search_diff_2;
+ a
+---
+(0 rows)
+
+SELECT a AS a1 FROM tab_search_diff_2;
+ a1
+----
+(0 rows)
+
+-- Temporary schema
+SET search_path = 'pg_temp';
+SELECT count(*) FROM tab_search_same;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a, b FROM tab_search_same;
+ a | b
+---+---
+(0 rows)
+
+SELECT count(*) FROM tab_search_diff_1;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM tab_search_diff_2;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a FROM tab_search_diff_2 AS t1;
+ a
+---
+(0 rows)
+
+SELECT a FROM tab_search_diff_2;
+ a
+---
+(0 rows)
+
+SELECT a AS a1 FROM tab_search_diff_2;
+ a1
+----
+(0 rows)
+
+RESET search_path;
+-- Schema qualifications
+SELECT count(*) FROM pgss_schema_1.tab_search_same;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a, b FROM pgss_schema_1.tab_search_same;
+ a | b
+---+---
+(0 rows)
+
+SELECT count(*) FROM pgss_schema_2.tab_search_diff_1;
+ count
+-------
+ 0
+(1 row)
+
+SELECT count(*) FROM pg_temp.tab_search_diff_2;
+ count
+-------
+ 0
+(1 row)
+
+SELECT a FROM pgss_schema_2.tab_search_diff_2 AS t1;
+ a
+---
+(0 rows)
+
+SELECT a FROM pgss_schema_2.tab_search_diff_2;
+ a
+---
+(0 rows)
+
+SELECT a AS a1 FROM pgss_schema_2.tab_search_diff_2;
+ a1
+----
+(0 rows)
+
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ calls | query
+-------+------------------------------------------------------------------------
+ 8 | SELECT a FROM tab_search_diff_2
+ 4 | SELECT a FROM tab_search_diff_2 AS t1
+ 4 | SELECT a, b FROM tab_search_same
+ 0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+ 4 | SELECT count(*) FROM tab_search_diff_1
+ 4 | SELECT count(*) FROM tab_search_diff_2
+ 4 | SELECT count(*) FROM tab_search_same
+ 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(8 rows)
+
+DROP SCHEMA pgss_schema_1 CASCADE;
+NOTICE: drop cascades to 3 other objects
+DETAIL: drop cascades to table pgss_schema_1.tab_search_same
+drop cascades to table pgss_schema_1.tab_search_diff_1
+drop cascades to table pgss_schema_1.tab_search_diff_2
+DROP SCHEMA pgss_schema_2 CASCADE;
+NOTICE: drop cascades to 3 other objects
+DETAIL: drop cascades to table pgss_schema_2.tab_search_same
+drop cascades to table pgss_schema_2.tab_search_diff_1
+drop cascades to table pgss_schema_2.tab_search_diff_2
+DROP TABLE tab_search_same, tab_search_diff_1, tab_search_diff_2;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index e0be58d5e24b..fbed557ec369 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -148,3 +148,72 @@ SELECT (
SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%SELECT GROUPING%';
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+
+-- Temporary tables, grouped together.
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+BEGIN;
+ CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+ SELECT * FROM temp_t;
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+
+-- search_path with various schemas and temporary tables
+CREATE SCHEMA pgss_schema_1;
+CREATE SCHEMA pgss_schema_2;
+-- Same attributes.
+CREATE TABLE pgss_schema_1.tab_search_same (a int, b int);
+CREATE TABLE pgss_schema_2.tab_search_same (a int, b int);
+CREATE TEMP TABLE tab_search_same (a int, b int);
+-- Different number of attributes, mapping types
+CREATE TABLE pgss_schema_1.tab_search_diff_1 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_1 (a int, b int);
+CREATE TEMP TABLE tab_search_diff_1 (a int, b int, c int);
+-- Same number of attributes, different types
+CREATE TABLE pgss_schema_1.tab_search_diff_2 (a int);
+CREATE TABLE pgss_schema_2.tab_search_diff_2 (a text);
+CREATE TEMP TABLE tab_search_diff_2 (a bigint);
+-- First permanent schema
+SET search_path = 'pgss_schema_1';
+SELECT count(*) FROM tab_search_same;
+SELECT a, b FROM tab_search_same;
+SELECT count(*) FROM tab_search_diff_1;
+SELECT count(*) FROM tab_search_diff_2;
+SELECT a FROM tab_search_diff_2 AS t1;
+SELECT a FROM tab_search_diff_2;
+SELECT a AS a1 FROM tab_search_diff_2;
+-- Second permanent schema
+SET search_path = 'pgss_schema_2';
+SELECT count(*) FROM tab_search_same;
+SELECT a, b FROM tab_search_same;
+SELECT count(*) FROM tab_search_diff_1;
+SELECT count(*) FROM tab_search_diff_2;
+SELECT a FROM tab_search_diff_2 AS t1;
+SELECT a FROM tab_search_diff_2;
+SELECT a AS a1 FROM tab_search_diff_2;
+-- Temporary schema
+SET search_path = 'pg_temp';
+SELECT count(*) FROM tab_search_same;
+SELECT a, b FROM tab_search_same;
+SELECT count(*) FROM tab_search_diff_1;
+SELECT count(*) FROM tab_search_diff_2;
+SELECT a FROM tab_search_diff_2 AS t1;
+SELECT a FROM tab_search_diff_2;
+SELECT a AS a1 FROM tab_search_diff_2;
+RESET search_path;
+-- Schema qualifications
+SELECT count(*) FROM pgss_schema_1.tab_search_same;
+SELECT a, b FROM pgss_schema_1.tab_search_same;
+SELECT count(*) FROM pgss_schema_2.tab_search_diff_1;
+SELECT count(*) FROM pg_temp.tab_search_diff_2;
+SELECT a FROM pgss_schema_2.tab_search_diff_2 AS t1;
+SELECT a FROM pgss_schema_2.tab_search_diff_2;
+SELECT a AS a1 FROM pgss_schema_2.tab_search_diff_2;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+DROP SCHEMA pgss_schema_1 CASCADE;
+DROP SCHEMA pgss_schema_2 CASCADE;
+DROP TABLE tab_search_same, tab_search_diff_1, tab_search_diff_2;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
--
2.49.0
In my experience these often not work well with pg_stat_statements today because
of their own bloat problem, just like with temp tables. You quickly have way too many
unique entries, and your query text file accumulates a lot of duplicative entries
(since the same query text gets repeated in the text file, since its queryid is different),
to the point that you can't monitor your workload at all anymore.
That is true in terms of pg_stat_statements, and I have seen users
have to increase
pg_stat_statements.max to something much higher to avoid the bloat and constant
deallocs/GC.
But, besides pg_stat_statements, queryId is used to group queries for
database load
monitoring ( pg_stat_activity sampling). As of now, different schemas
are tracked
separately, but with this change they will be merged. This may come as
a surprise to
use-cases that rely on the existing behavior.
But I do agree that pg_s_s bloat is a big pain point, so this change
should be positive
overall. Let's see if there are enough complaints to force us to reconsider.
--
Sami Imseih
Amazon Web Services (AwS)
Re: Michael Paquier
Attached is the second one, with more tests coverage with attribute
aliases (these being ignored exists in stable branches, but why not
while on it) and table aliases, and the fixes for the issues pointed
out by Christoph. I'll double-check all that again tomorrow. Please
find an updated version attached for now.
Looks good to me.
Christoph
Attached is the second one, with more tests coverage with attribute
aliases (these being ignored exists in stable branches, but why not
while on it) and table aliases, and the fixes for the issues pointed
out by Christoph. I'll double-check all that again tomorrow. Please
find an updated version attached for now.
There are several parts of the doc that may no longer hold true.
1/
"Since the queryid hash value is computed on the post-parse-analysis
representation of the queries, the opposite is also possible: queries with
identical texts might appear as separate entries, if they have different
meanings as a result of factors such as different search_path settings."
I think this text could remain as is, because search_path still
matters for things like functions, etc.
"""
postgres=# SET SEARCH_PATH=a;
SET
postgres=# explain verbose select * from test();
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=4)
Output: 1
Query Identifier: -1813735303617154554
(3 rows)
postgres=# SET SEARCH_PATH=b;
SET
postgres=# explain verbose select * from test();
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=4)
Output: 1
Query Identifier: -3896107319863686763
(3 rows)
"""
2/
"For example, pg_stat_statements will consider two apparently-identical
queries to be distinct, if they reference a table that was dropped and
recreated between the executions of the two queries."
This is no longer true for relations, but is still true for functions. I think
we should mention the caveats in a bit more detail as this change
will have impact on the most common case. What about something
like this?
"For example, pg_stat_statements will consider two apparently-identical
queries to be distinct, if they reference a function that was dropped and
recreated between the executions of the two queries. Conversely, if a table is
dropped and recreated between the executions of queries, two
apparently-identical
queries will be considered the same. However, if the alias for a table is
different for semantically similar queries, these queries will be
considered distinct"
--
Sami Imseih
Amazon Web Services (AWS)
On Tue, Mar 25, 2025 at 11:58:21AM -0500, Sami Imseih wrote:
"Since the queryid hash value is computed on the post-parse-analysis
representation of the queries, the opposite is also possible: queries with
identical texts might appear as separate entries, if they have different
meanings as a result of factors such as different search_path settings."I think this text could remain as is, because search_path still
matters for things like functions, etc.
Yeah, I think that's OK as-is. I'm open to more improvements,
including more tests for these function patterns. It's one of these
areas where we should be able to tweak RangeTblFunction and apply a
custom function to its funcexpr, and please note that I have no idea
how complex it could become as this is a Node expression. :D
Functions in a temporary schema is not something as common as temp
tables, I guess, so these matter less, but they would still be a cause
of bloat for monitoring in very specific workloads.
2/
"For example, pg_stat_statements will consider two apparently-identical
queries to be distinct, if they reference a table that was dropped and
recreated between the executions of the two queries."This is no longer true for relations, but is still true for functions. I think
we should mention the caveats in a bit more detail as this change
will have impact on the most common case. What about something
like this?"For example, pg_stat_statements will consider two apparently-identical
queries to be distinct, if they reference a function that was dropped and
recreated between the executions of the two queries.
That's a bit larger than functions, but we could remain a bit more
evasive, with "if they reference *for example* a function that was
dropped and recreated between the executions of the two queries".
Note that for DDLs, like CREATE TABLE, we also group entries with
identical relation names, so we are kind of in line with the patch,
not with the current docs.
Conversely, if a table is dropped and recreated between the
executions of queries, two apparently-identical queries may be
considered the same. However, if the alias for a table is different
for semantically similar queries, these queries will be considered distinct"
This addition sounds like an improvement here.
As this thread has proved, we had little coverage these cases in pgss,
so I've applied the tests as an independent change. It is also useful
to track how things change in the commit history depending on how the
computation is tweaked. I've also included your doc suggestions. I
feel that we could do better here, but that's a common statement
anyway when it comes to the documentation.
--
Michael
Attachments:
v6-0001-Add-custom-query-jumble-function-for-RangeTblEntr.patchtext/x-diff; charset=us-asciiDownload
From 8ea61bb0d7d6c4dbbb40dbaedb5e751027163cfe Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Wed, 26 Mar 2025 08:07:59 +0900
Subject: [PATCH v6] Add custom query jumble function for RangeTblEntry.eref
---
src/include/nodes/parsenodes.h | 11 +++++++---
src/backend/nodes/queryjumblefuncs.c | 19 ++++++++++++++++++
doc/src/sgml/pgstatstatements.sgml | 9 +++++++--
.../pg_stat_statements/expected/select.out | 20 ++++++++-----------
4 files changed, 42 insertions(+), 17 deletions(-)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf2..a87f949b389e 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1050,8 +1050,13 @@ typedef struct RangeTblEntry
*/
/* user-written alias clause, if any */
Alias *alias pg_node_attr(query_jumble_ignore);
- /* expanded reference names */
- Alias *eref pg_node_attr(query_jumble_ignore);
+
+ /*
+ * Expanded reference names. This uses a custom query jumble function so
+ * as the table name is included in the computation, not its list of
+ * columns.
+ */
+ Alias *eref pg_node_attr(custom_query_jumble);
RTEKind rtekind; /* see above */
@@ -1094,7 +1099,7 @@ typedef struct RangeTblEntry
* tables to be invalidated if the underlying table is altered.
*/
/* OID of the relation */
- Oid relid;
+ Oid relid pg_node_attr(query_jumble_ignore);
/* inheritance requested? */
bool inh;
/* relation kind (see pg_class.relkind) */
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index f8b0f91704ba..62d6cfb7ac15 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -67,6 +67,9 @@ static void _jumbleElements(JumbleState *jstate, List *elements);
static void _jumbleA_Const(JumbleState *jstate, Node *node);
static void _jumbleList(JumbleState *jstate, Node *node);
static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry_eref(JumbleState *jstate,
+ RangeTblEntry *rte,
+ Alias *expr);
/*
* Given a possibly multi-statement source string, confine our attention to the
@@ -516,3 +519,19 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
JUMBLE_FIELD(is_local);
JUMBLE_LOCATION(location);
}
+
+/*
+ * Custom query jumble function for RangeTblEntry.eref.
+ */
+static void
+_jumbleRangeTblEntry_eref(JumbleState *jstate,
+ RangeTblEntry *rte,
+ Alias *expr)
+{
+ JUMBLE_FIELD(type);
+
+ /*
+ * This includes only the table name, the list of column names is ignored.
+ */
+ JUMBLE_STRING(aliasname);
+}
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index f4e384e95aea..679381f00607 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -675,8 +675,13 @@ calls | 2
things, the internal object identifiers appearing in this representation.
This has some counterintuitive implications. For example,
<filename>pg_stat_statements</filename> will consider two apparently-identical
- queries to be distinct, if they reference a table that was dropped
- and recreated between the executions of the two queries.
+ queries to be distinct, if they reference for example a function that was
+ dropped and recreated between the executions of the two queries.
+ Conversely, if a table is dropped and recreated between the
+ executions of queries, two apparently-identical queries may be
+ considered the same. However, if the alias for a table is different
+ for semantically similar queries, these queries will be considered
+ distinct.
The hashing process is also sensitive to differences in
machine architecture and other facets of the platform.
Furthermore, it is not safe to assume that <structfield>queryid</structfield>
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 708c6b0e9c41..1eebc2898ab9 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -433,11 +433,10 @@ COMMIT;
SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
calls | query
-------+------------------------------------------------------------------------
- 1 | SELECT * FROM temp_t
- 1 | SELECT * FROM temp_t
+ 2 | SELECT * FROM temp_t
0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
-(4 rows)
+(3 rows)
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
@@ -623,18 +622,15 @@ SELECT a AS a1 FROM pgss_schema_2.tab_search_diff_2;
SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
calls | query
-------+------------------------------------------------------------------------
- 3 | SELECT a FROM pgss_schema_2.tab_search_diff_2 AS t1
- 9 | SELECT a FROM tab_search_diff_2 AS t1
- 1 | SELECT a, b FROM pgss_schema_1.tab_search_same
- 3 | SELECT a, b FROM tab_search_same
+ 8 | SELECT a FROM tab_search_diff_2
+ 4 | SELECT a FROM tab_search_diff_2 AS t1
+ 4 | SELECT a, b FROM tab_search_same
0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
- 1 | SELECT count(*) FROM pgss_schema_1.tab_search_same
- 1 | SELECT count(*) FROM pgss_schema_2.tab_search_diff_1
- 3 | SELECT count(*) FROM tab_search_diff_1
+ 4 | SELECT count(*) FROM tab_search_diff_1
4 | SELECT count(*) FROM tab_search_diff_2
- 3 | SELECT count(*) FROM tab_search_same
+ 4 | SELECT count(*) FROM tab_search_same
1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
-(11 rows)
+(8 rows)
DROP SCHEMA pgss_schema_1 CASCADE;
NOTICE: drop cascades to 3 other objects
--
2.49.0
Michael Paquier <michael@paquier.xyz> writes:
[ v6-0001-Add-custom-query-jumble-function-for-RangeTblEntr.patch ]
Couple of minor review comments ...
* In 5ac462e2b, this bit:
# node type
- if (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/)
+ if ($query_jumble_custom)
+ {
+ # Custom function that applies to one field of a node.
+ print $jff "\tJUMBLE_CUSTOM($n, $f);\n";
+ }
+ elsif (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/)
fails to honor $query_jumble_ignore as the other if-branches do.
Perhaps it's unlikely that a node would have both query_jumble_custom
and query_jumble_ignore annotations, but still, the script would emit
completely incorrect code if it did. Also, the "# node type" comment
should probably be moved down to within the first "elsif" block.
* In the v6 patch, this doesn't read very smoothly:
+ * Expanded reference names. This uses a custom query jumble function so
+ * as the table name is included in the computation, not its list of
+ * columns.
Perhaps better
+ * Expanded reference names. This uses a custom query jumble function so
+ * that the table name is included in the computation, but not its list of
+ * columns.
* Also, here:
+ considered the same. However, if the alias for a table is different
+ for semantically similar queries, these queries will be considered
+ distinct.
I'd change "semantically similar queries" to "otherwise-similar
queries"; I think writing "semantically" will just confuse people.
Otherwise LGTM.
regards, tom lane
On Tue, Mar 25, 2025 at 07:24:20PM -0400, Tom Lane wrote:
fails to honor $query_jumble_ignore as the other if-branches do.
Perhaps it's unlikely that a node would have both query_jumble_custom
and query_jumble_ignore annotations, but still, the script would emit
completely incorrect code if it did. Also, the "# node type" comment
should probably be moved down to within the first "elsif" block.
Oops, sorry about that. Fixed both of these in 27ee6ede6bc9.
I'd change "semantically similar queries" to "otherwise-similar
queries"; I think writing "semantically" will just confuse people.
If I get the difference right, semantically would apply to concepts
related to linguistics, but that's not what we have here, so you are
using a more general term.
Thanks for the suggestions.
--
Michael
Attachments:
v7-0001-Add-custom-query-jumble-function-for-RangeTblEntr.patchtext/x-diff; charset=us-asciiDownload
From 30f8066989eac6f8c72034d3fb5150368c2821a9 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Wed, 26 Mar 2025 09:17:41 +0900
Subject: [PATCH v7] Add custom query jumble function for RangeTblEntry.eref
---
src/include/nodes/parsenodes.h | 11 +++++++---
src/backend/nodes/queryjumblefuncs.c | 19 ++++++++++++++++++
doc/src/sgml/pgstatstatements.sgml | 9 +++++++--
.../pg_stat_statements/expected/select.out | 20 ++++++++-----------
4 files changed, 42 insertions(+), 17 deletions(-)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf2..df331b1c0d99 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1050,8 +1050,13 @@ typedef struct RangeTblEntry
*/
/* user-written alias clause, if any */
Alias *alias pg_node_attr(query_jumble_ignore);
- /* expanded reference names */
- Alias *eref pg_node_attr(query_jumble_ignore);
+
+ /*
+ * Expanded reference names. This uses a custom query jumble function so
+ * that the table name is included in the computation, but not its list of
+ * columns.
+ */
+ Alias *eref pg_node_attr(custom_query_jumble);
RTEKind rtekind; /* see above */
@@ -1094,7 +1099,7 @@ typedef struct RangeTblEntry
* tables to be invalidated if the underlying table is altered.
*/
/* OID of the relation */
- Oid relid;
+ Oid relid pg_node_attr(query_jumble_ignore);
/* inheritance requested? */
bool inh;
/* relation kind (see pg_class.relkind) */
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index f8b0f91704ba..62d6cfb7ac15 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -67,6 +67,9 @@ static void _jumbleElements(JumbleState *jstate, List *elements);
static void _jumbleA_Const(JumbleState *jstate, Node *node);
static void _jumbleList(JumbleState *jstate, Node *node);
static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry_eref(JumbleState *jstate,
+ RangeTblEntry *rte,
+ Alias *expr);
/*
* Given a possibly multi-statement source string, confine our attention to the
@@ -516,3 +519,19 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
JUMBLE_FIELD(is_local);
JUMBLE_LOCATION(location);
}
+
+/*
+ * Custom query jumble function for RangeTblEntry.eref.
+ */
+static void
+_jumbleRangeTblEntry_eref(JumbleState *jstate,
+ RangeTblEntry *rte,
+ Alias *expr)
+{
+ JUMBLE_FIELD(type);
+
+ /*
+ * This includes only the table name, the list of column names is ignored.
+ */
+ JUMBLE_STRING(aliasname);
+}
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index f4e384e95aea..625b84ebfefd 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -675,8 +675,13 @@ calls | 2
things, the internal object identifiers appearing in this representation.
This has some counterintuitive implications. For example,
<filename>pg_stat_statements</filename> will consider two apparently-identical
- queries to be distinct, if they reference a table that was dropped
- and recreated between the executions of the two queries.
+ queries to be distinct, if they reference for example a function that was
+ dropped and recreated between the executions of the two queries.
+ Conversely, if a table is dropped and recreated between the
+ executions of queries, two apparently-identical queries may be
+ considered the same. However, if the alias for a table is different
+ for otherwise-similar queries, these queries will be considered
+ distinct.
The hashing process is also sensitive to differences in
machine architecture and other facets of the platform.
Furthermore, it is not safe to assume that <structfield>queryid</structfield>
diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 708c6b0e9c41..1eebc2898ab9 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -433,11 +433,10 @@ COMMIT;
SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
calls | query
-------+------------------------------------------------------------------------
- 1 | SELECT * FROM temp_t
- 1 | SELECT * FROM temp_t
+ 2 | SELECT * FROM temp_t
0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
-(4 rows)
+(3 rows)
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
@@ -623,18 +622,15 @@ SELECT a AS a1 FROM pgss_schema_2.tab_search_diff_2;
SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
calls | query
-------+------------------------------------------------------------------------
- 3 | SELECT a FROM pgss_schema_2.tab_search_diff_2 AS t1
- 9 | SELECT a FROM tab_search_diff_2 AS t1
- 1 | SELECT a, b FROM pgss_schema_1.tab_search_same
- 3 | SELECT a, b FROM tab_search_same
+ 8 | SELECT a FROM tab_search_diff_2
+ 4 | SELECT a FROM tab_search_diff_2 AS t1
+ 4 | SELECT a, b FROM tab_search_same
0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
- 1 | SELECT count(*) FROM pgss_schema_1.tab_search_same
- 1 | SELECT count(*) FROM pgss_schema_2.tab_search_diff_1
- 3 | SELECT count(*) FROM tab_search_diff_1
+ 4 | SELECT count(*) FROM tab_search_diff_1
4 | SELECT count(*) FROM tab_search_diff_2
- 3 | SELECT count(*) FROM tab_search_same
+ 4 | SELECT count(*) FROM tab_search_same
1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
-(11 rows)
+(8 rows)
DROP SCHEMA pgss_schema_1 CASCADE;
NOTICE: drop cascades to 3 other objects
--
2.49.0
If I get the difference right, semantically would apply to concepts
related to linguistics, but that's not what we have here, so you are
using a more general term.
FWIW, the pg_stat_statements docs in a few places refer to
queries that may look different but have the same meaning
as “semantically equivalent”, this is why I used the same
terminology here. But, I have no issue with the simplified
rewrite either.
The patch LGTM as well.
Regards,
Sami Imseih
Amazon Web Services (AWS)
On Tue, Mar 25, 2025 at 07:56:29PM -0500, Sami Imseih wrote:
FWIW, the pg_stat_statements docs in a few places refer to
queries that may look different but have the same meaning
as “semantically equivalent”, this is why I used the same
terminology here. But, I have no issue with the simplified
rewrite either.The patch LGTM as well.
If any adjustments are required, it would be always possible to do
these later. Anyway, workloads with a lot of temporary tables are
going to benefit from this change, so let's see how it goes.
Applied, after eyeing much more the PGSS dumps from installcheck
before and after the patch, to make sure that I'm not missing
something..
--
Michael
Hi,
I totally understand the wish to make pg_stat_statements useful for
workloads that create/drop a ton of temporary tables.
However, when pursuing this goal we impacted other types of totally valid
workloads when tables with the same name exist in different schemas.
Example:
create schema s1;
create table s1.t as select id from generate_series(1, 10) as id;
create schema s2;
create table s1.t as select id from generate_series(1, 1000000) as id;
select count(id) from s1.t;
select count(id) from s2.t;
select * from pg_stat_statements;
userid | 10
dbid | 5
toplevel | t
queryid | -8317141500049987426
query | select count(id) from s1.t
plans | 0
total_plan_time | 0
min_plan_time | 0
max_plan_time | 0
mean_plan_time | 0
stddev_plan_time | 0
calls | 2
total_exec_time | 22.577107
min_exec_time | 0.325021
max_exec_time | 22.252086000000002
mean_exec_time | 11.2885535
stddev_exec_time | 10.963532500000001
rows | 2
shared_blks_hit | 4425
That is, two different queries, accessing two absolutely different tables
(one of them has 100000 times more rows!) were merged together.
Regards,
--
Alexander Kukushkin
On Tue, Jul 15, 2025 at 04:48:05PM +0200, Alexander Kukushkin wrote:
I totally understand the wish to make pg_stat_statements useful for
workloads that create/drop a ton of temporary tables.
However, when pursuing this goal we impacted other types of totally valid
workloads when tables with the same name exist in different schemas.
Example:
create schema s1;
create table s1.t as select id from generate_series(1, 10) as id;
create schema s2;
create table s1.t as select id from generate_series(1, 1000000) as id;
I suspect that you mean s2.t and not s1.t here.
select count(id) from s1.t;
select count(id) from s2.t;That is, two different queries, accessing two absolutely different tables
(one of them has 100000 times more rows!) were merged together.
Yes, we had this argument upthread, and it is still possible to
differentiate both cases by using a different alias in the FROM
clause, as of:
select count(id) from s1.t as t1;
select count(id) from s2.t as t2;
The new behavior where we do not need to worry about temporary tables,
which is not that uncommon because some workloads like using these for
JOIN patterns as a "temporary" anchor in a session, has more benefits
IMO, particularly more if the connections have a rather higher
turnover.
--
Michael
On Wed, 16 Jul 2025 at 01:39, Michael Paquier <michael@paquier.xyz> wrote:
create schema s1;
create table s1.t as select id from generate_series(1, 10) as id;
create schema s2;
create table s1.t as select id from generate_series(1, 1000000) as id;I suspect that you mean s2.t and not s1.t here.
Yes.
Yes, we had this argument upthread, and it is still possible to
differentiate both cases by using a different alias in the FROM
clause, as of:
select count(id) from s1.t as t1;
select count(id) from s2.t as t2;The new behavior where we do not need to worry about temporary tables,
which is not that uncommon because some workloads like using these for
JOIN patterns as a "temporary" anchor in a session, has more benefits
IMO, particularly more if the connections have a rather higher
turnover.
Yes, I've seen this argument and know that aliases will make these queries
look different.
However, we regularly hear from many different customers that they *don't
control queries* sent by application or *can't modify these queries*.
Such kinds of workloads are also not that uncommon and this change makes it
impossible to monitor them.
I would somewhat understand when a table in the query is used without
specifying schema and such queries are merged together:
s1: SET search_path s1; select count(*) from t;
s2: SET search_path s2; select count(*) from t;
But, even this case doesn't feel right, because these tables are still
different and therefore queries.
Regards,
--
Alexander Kukushkin
On Tue, Jul 15, 2025 at 11:20 PM Alexander Kukushkin <cyberdemn@gmail.com>
wrote:
However, we regularly hear from many different customers that they *don't
control queries* sent by application or *can't modify these queries*.
Such kinds of workloads are also not that uncommon and this change makes
it impossible to monitor them.
For the workloads you are thinking of, are these "one customer per schema"
multi-tenant workloads, or something else?
I mentioned this earlier in the discussion (when supporting the change that
was done), but the main challenge I've seen is that for "one customer per
schema" workloads, pg_stat_statements just doesn't work today, unless you
have only a handful of customers on a server.
Once you have anything close to 100 or more customer schemas on a server,
the churn on the entries makes pg_stat_statements unusable (even with a
high max), especially with the current way the query text file works, since
you can't reliably read from pg_stat_statements anymore without incurring a
read from a 100MB+ query text file.
So I agree this change reduces the visibility into which of the schemas had
a slow query, but it at least allows reliably using pg_stat_statements to
narrow down which query / part of an application is problematic. To get
specifics on the schema, one could then use other means (e.g.
log_min_duration_statement, auto_explain, etc) to get exact details,
grepping the logfile for the query ID retrieved from pg_stat_statements.
Thanks,
Lukas
--
Lukas Fittl