BUG #18790: Pg_stat_statements doesn't track schema.

Started by PG Bug reporting formabout 1 year ago6 messagesbugs

noreply@postgresql.org

about 1 year ago

The following bug has been logged on the website:

Bug reference: 18790
Logged by: Raghvendra Mishra
Email address: raghshr1351@gmail.com
PostgreSQL version: 17.0
Operating system: Linux
Description:

Currently, pg_stat_statment doesn't track the schema to which the query
belongs. In the case of a multitenant database, it becomes hard to find a
query belonging to which customer is the culprit. It could be solely an
enhancement, so my question is, Is it useful to expose the schema name also?
If yes can I contribute to add this support?

Michael Paquier

michael@paquier.xyz

about 1 year ago

In reply to: PG Bug reporting form (#1)

Re: BUG #18790: Pg_stat_statements doesn't track schema.

On Wed, Jan 29, 2025 at 07:10:33PM +0000, PG Bug reporting form wrote:

Currently, pg_stat_statment doesn't track the schema to which the query
belongs. In the case of a multitenant database, it becomes hard to find a
query belonging to which customer is the culprit. It could be solely an
enhancement, so my question is, Is it useful to expose the schema name also?
If yes can I contribute to add this support?

Objects from multiple schemas could be used in a single query. Even
if multiple schemas are tracked, I doubt that the cost of tracking
them is going to be really useful at this level for monitoring. Or
perhaps you have some specific use case in mind?
--
Michael

Raghvendra Mishra

raghshr1351@gmail.com

about 1 year ago

In reply to: Michael Paquier (#2)

Re: BUG #18790: Pg_stat_statements doesn't track schema.

If multiple schema is used in a query then this information can be
extracted by parsing the query.
But when the schema is being accessed by setting the search path then it
becomes hard to find with which schema
query belongs to in pg_stat_statements.

Thanks for your attention,
Ragh

On Thu, 30 Jan 2025 at 12:10, Michael Paquier <michael@paquier.xyz> wrote:

Show quoted text

On Wed, Jan 29, 2025 at 07:10:33PM +0000, PG Bug reporting form wrote:

Currently, pg_stat_statment doesn't track the schema to which the query
belongs. In the case of a multitenant database, it becomes hard to find a
query belonging to which customer is the culprit. It could be solely an
enhancement, so my question is, Is it useful to expose the schema name

also?

If yes can I contribute to add this support?

Objects from multiple schemas could be used in a single query. Even
if multiple schemas are tracked, I doubt that the cost of tracking
them is going to be really useful at this level for monitoring. Or
perhaps you have some specific use case in mind?
--
Michael

Greg Sabino Mullane

greg@turnstep.com

about 1 year ago

In reply to: Raghvendra Mishra (#3)

Re: BUG #18790: Pg_stat_statements doesn't track schema.

On Thu, Jan 30, 2025 at 11:30 AM Raghvendra Mishra <raghshr1351@gmail.com>
wrote:

If multiple schema is used in a query then this information can be
extracted by parsing the query.

But when the schema is being accessed by setting the search path then it

becomes hard to find with which schema
query belongs to in pg_stat_statements.

Even if you were to store the search_path as a separate field, there is no
promise that the items in it have not changed since the query was added to
pg_stat_statements. Your best bet is to schema-qualify your relations when
writing your queries. Then your pg_stat_statement output will contain the
information you want.

Note that you can use the queryid to figure out (with a little work) which
schemas were used for particular queries:

CREATE SCHEMA a; CREATE TABLE a.foo (id int);
CREATE SCHEMA b; CREATE TABLE b.foo (id int);
SET search_path = a; select * from foo;
SET search_path = b; select * from foo;
RESET search_path;
select query, queryid from pg_stat_statements where query ~ 'select \* from
foo';
query | queryid
-------------------+----------------------
select * from foo | 255924940643424438
select * from foo | -7783557204835816030
(2 rows)

greg=# explain verbose select * from a.foo;
QUERY PLAN
---------------------------------------------------------
Seq Scan on a.foo (cost=0.00..35.50 rows=2550 width=4)
Output: id
Query Identifier: 255924940643424438
(3 rows)

greg=# explain verbose select * from b.foo;
QUERY PLAN
---------------------------------------------------------
Seq Scan on b.foo (cost=0.00..35.50 rows=2550 width=4)
Output: id
Query Identifier: -7783557204835816030

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support

Raghvendra Mishra

raghshr1351@gmail.com

about 1 year ago

In reply to: Greg Sabino Mullane (#4)

Re: BUG #18790: Pg_stat_statements doesn't track schema.

I doubt if it generates the same queryid for the substituted query
parameter we have in pg_stat_statement.

Thanks,
Raghvendra

Greg Sabino Mullane

greg@turnstep.com

about 1 year ago

In reply to: Raghvendra Mishra (#5)

Re: BUG #18790: Pg_stat_statements doesn't track schema.

Yes, it will generate the same queryid regardless of the parameters: that's
the whole point of flattening (aka normalizing) the queries.

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support