pg_stat_statements oddity with track = all

Started by Julien Rouhaudover 5 years ago32 messageshackers

rjuju123@gmail.com

over 5 years ago

Hi,

Someone raised an interested point recently on pg_stat_kcache extension for
handling nested statements, which also applies to pg_stat_statements.

The root issue is that when pg_stat_statements tracks nested statements,
there's no way to really make sense of the counters, as top level statements
will also account for underlying statements. Using a naive example:

=# CREATE FUNCTION f1() RETURNS VOID AS $$BEGIN PERFORM pg_sleep(5); END; $$ LANGUAGE plpgsql;
CREATE FUNCTION

=# SELECT pg_stat_statements_reset();
pg_stat_statements_reset
--------------------------

(1 row)

=# SELECT f1();
f1
----

(1 row)

=# select sum(total_exec_time) from pg_stat_statements;
sum
--------------
10004.403601
(1 row)

It looks like there was 10s total execution time overall all statements, which
doesn't really make sense.

It's of course possible to avoid that using track = top, but tracking all
nested statements is usually quite useful so it could be better to find a way
to better address that problem.

The only idea I have for that is to add a new field to entry key, for instance
is_toplevel. The immediate cons is obviously that it could amplify quite a lot
the number of entries tracked, so people may need to increase
pg_stat_statements.max to avoid slowdown if that makes them reach frequent
entry eviction.

Should we address the problem, and in that case do you see a better way for
that, or should we just document this behavior?

Nikolay Samokhvalov

samokhvalov@gmail.com

over 5 years ago

In reply to: Julien Rouhaud (#1)

Re: pg_stat_statements oddity with track = all

On Tue, Dec 1, 2020 at 8:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Hi,

Someone raised an interested point recently on pg_stat_kcache extension for
handling nested statements, which also applies to pg_stat_statements.

...

The only idea I have for that is to add a new field to entry key, for
instance
is_toplevel.

This particular problem often bothered me when dealing with
pg_stat_statements contents operating under "track = all" (especially when
performing the aggregated analysis, like you showed).

I think the idea of having a flag to distinguish the top-level entries is
great.

The immediate cons is obviously that it could amplify quite a lot
the number of entries tracked, so people may need to increase
pg_stat_statements.max to avoid slowdown if that makes them reach frequent
entry eviction.

If all top-level records in pg_stat_statements have "true" in the new
column (is_toplevel), how would this lead to the need to increase
pg_stat_statements.max? The number of records would remain the same, as
before extending pg_stat_statements.

Julien Rouhaud

rjuju123@gmail.com

over 5 years ago

In reply to: Nikolay Samokhvalov (#2)

Re: pg_stat_statements oddity with track = all

On Tue, Dec 01, 2020 at 10:08:06PM -0800, Nikolay Samokhvalov wrote:

On Tue, Dec 1, 2020 at 8:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Someone raised an interested point recently on pg_stat_kcache extension for
handling nested statements, which also applies to pg_stat_statements.

...

The only idea I have for that is to add a new field to entry key, for
instance
is_toplevel.

This particular problem often bothered me when dealing with
pg_stat_statements contents operating under "track = all" (especially when
performing the aggregated analysis, like you showed).

I think the idea of having a flag to distinguish the top-level entries is
great.

Ok!

The immediate cons is obviously that it could amplify quite a lot
the number of entries tracked, so people may need to increase
pg_stat_statements.max to avoid slowdown if that makes them reach frequent
entry eviction.

If all top-level records in pg_stat_statements have "true" in the new
column (is_toplevel), how would this lead to the need to increase
pg_stat_statements.max? The number of records would remain the same, as
before extending pg_stat_statements.

If the same query is getting executed both at top level and as a nested
statement, two entries will then be created. That's probably unlikely for
things like RI trigger queries, but I don't know what to expect for client
application queries.

Fujii Masao

masao.fujii@gmail.com

over 5 years ago

In reply to: Julien Rouhaud (#3)

Re: pg_stat_statements oddity with track = all

On 2020/12/02 15:32, Julien Rouhaud wrote:

On Tue, Dec 01, 2020 at 10:08:06PM -0800, Nikolay Samokhvalov wrote:

On Tue, Dec 1, 2020 at 8:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Someone raised an interested point recently on pg_stat_kcache extension for
handling nested statements, which also applies to pg_stat_statements.

...

The only idea I have for that is to add a new field to entry key, for
instance
is_toplevel.

This particular problem often bothered me when dealing with
pg_stat_statements contents operating under "track = all" (especially when
performing the aggregated analysis, like you showed).

I think the idea of having a flag to distinguish the top-level entries is
great.

Ok!

The immediate cons is obviously that it could amplify quite a lot
the number of entries tracked, so people may need to increase
pg_stat_statements.max to avoid slowdown if that makes them reach frequent
entry eviction.

If all top-level records in pg_stat_statements have "true" in the new
column (is_toplevel), how would this lead to the need to increase
pg_stat_statements.max? The number of records would remain the same, as
before extending pg_stat_statements.

If the same query is getting executed both at top level and as a nested
statement, two entries will then be created. That's probably unlikely for
things like RI trigger queries, but I don't know what to expect for client
application queries.

Just idea; instead of boolean is_toplevel flag, what about
counting the number of times when the statement is executed
in toplevel, and also in nested level?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Julien Rouhaud

rjuju123@gmail.com

over 5 years ago

In reply to: Fujii Masao (#4)

Re: pg_stat_statements oddity with track = all

On Wed, Dec 02, 2020 at 03:52:37PM +0900, Fujii Masao wrote:

On 2020/12/02 15:32, Julien Rouhaud wrote:

On Tue, Dec 01, 2020 at 10:08:06PM -0800, Nikolay Samokhvalov wrote:

On Tue, Dec 1, 2020 at 8:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Someone raised an interested point recently on pg_stat_kcache extension for
handling nested statements, which also applies to pg_stat_statements.

...

The only idea I have for that is to add a new field to entry key, for
instance
is_toplevel.

[...]

Just idea; instead of boolean is_toplevel flag, what about
counting the number of times when the statement is executed
in toplevel, and also in nested level?

Ah, indeed that would avoid extraneous entries. FTR we would also need that
for the planning part.

The cons I can see is that it'll make the counters harder to process (unless we
provide a specific view for the top-level statements only for instance), and
that it assumes that doing a simple division is representative enough for the
top level/nested repartition. This might be quite off for in some cases, e.g.
big stored procedures due to lack of autovacuum, but that can't be worse than
what we currently have.

legrand legrand

legrand_legrand@hotmail.com

over 5 years ago

In reply to: Julien Rouhaud (#5)

Re: pg_stat_statements oddity with track = all

Hi,

a crazy idea:
- add a parent_statement_id column that would be NULL for top level queries
- build statement_id for nested queries based on the merge of:
a/ current_statement_id and parent one
or
b/ current_statement_id and nested level.

this would offer the ability to track counters at any depth level ;o)
Regards
PAscal

--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Sergei Kornilov

sk@zsrv.org

over 5 years ago

In reply to: legrand legrand (#6)

Re: pg_stat_statements oddity with track = all

Hello

- add a parent_statement_id column that would be NULL for top level queries

Will generate too much entries... Every FK for each different delete/insert, for example.
But very useful for databases with a lot of stored procedures to find where this query is called. May be new mode track = tree? Use NULL to indicate a top-level query (same as with track=tree) and some constant for any nested queries when track = all.

Also, currently a top statement will account buffers usage for underlying statements?

regards, Sergei

Nikolay Samokhvalov

samokhvalov@gmail.com

over 5 years ago

In reply to: Julien Rouhaud (#3)

Re: pg_stat_statements oddity with track = all

On Tue, Dec 1, 2020 at 10:32 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Tue, Dec 01, 2020 at 10:08:06PM -0800, Nikolay Samokhvalov wrote:

If all top-level records in pg_stat_statements have "true" in the new
column (is_toplevel), how would this lead to the need to increase
pg_stat_statements.max? The number of records would remain the same, as
before extending pg_stat_statements.

If the same query is getting executed both at top level and as a nested
statement, two entries will then be created. That's probably unlikely for
things like RI trigger queries, but I don't know what to expect for client
application queries.

Right, but this is how things already work. The extra field you've proposed
won't increase the number of records so it shouldn't affect how users
choose pg_stat_statements.max.

Julien Rouhaud

rjuju123@gmail.com

over 5 years ago

In reply to: Nikolay Samokhvalov (#8)

Re: pg_stat_statements oddity with track = all

On Wed, Dec 02, 2020 at 06:23:54AM -0800, Nikolay Samokhvalov wrote:

On Tue, Dec 1, 2020 at 10:32 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Tue, Dec 01, 2020 at 10:08:06PM -0800, Nikolay Samokhvalov wrote:

If all top-level records in pg_stat_statements have "true" in the new
column (is_toplevel), how would this lead to the need to increase
pg_stat_statements.max? The number of records would remain the same, as
before extending pg_stat_statements.

If the same query is getting executed both at top level and as a nested
statement, two entries will then be created. That's probably unlikely for
things like RI trigger queries, but I don't know what to expect for client
application queries.

Right, but this is how things already work. The extra field you've proposed
won't increase the number of records so it shouldn't affect how users
choose pg_stat_statements.max.

The extra field I've proposed would increase the number of records, as it needs
to be a part of the key. The only alternative would be what Fufi-san
mentioned, i.e. to split plans and calls (for instance plans_toplevel,
plans_nested, calls_toplevel, calls_nested) and let users compute an
approximate value for toplevel counters.

#10

legrand legrand

legrand_legrand@hotmail.com

over 5 years ago

In reply to: Julien Rouhaud (#9)

Re: pg_stat_statements oddity with track = all

Hi Julien,

The extra field I've proposed would increase the number of records, as it
needs

to be a part of the key.

To get an increase in the number of records that means that the same
statement
would appear at top level AND nested level. This seems a corner case with
very low
(neglectible) occurence rate. Did I miss something ?

Regards
PAscal

--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#11

Sergei Kornilov

sk@zsrv.org

over 5 years ago

In reply to: legrand legrand (#10)

Re: pg_stat_statements oddity with track = all

Hello

To get an increase in the number of records that means that the same
statement
would appear at top level AND nested level. This seems a corner case with
very low
(neglectible) occurence rate.

+1
I think splitting fields into plans_toplevel / plans_nested will be less convenient. And more code with higher chance of copypaste errors

regards, Sergei

#12

Julien Rouhaud

rjuju123@gmail.com

over 5 years ago

In reply to: Sergei Kornilov (#7)

Re: pg_stat_statements oddity with track = all

On Wed, Dec 02, 2020 at 05:13:56PM +0300, Sergei Kornilov wrote:

Hello

- add a parent_statement_id column that would be NULL for top level queries

Will generate too much entries... Every FK for each different delete/insert, for example.
But very useful for databases with a lot of stored procedures to find where this query is called. May be new mode track = tree? Use NULL to indicate a top-level query (same as with track=tree) and some constant for any nested queries when track = all.

Maybe pg_stat_statements isn't the best tool for that use case. For the record
the profiler in plpgsql_check can now track queryid for each statements inside
a function, so you match pg_stat_statements entries. That's clearly not
perfect as dynamic queries could generate different queryid, but that's a
start.

Also, currently a top statement will account buffers usage for underlying statements?

I think so.

#13

Julien Rouhaud

rjuju123@gmail.com

over 5 years ago

In reply to: Sergei Kornilov (#11)

Re: pg_stat_statements oddity with track = all

On Thu, Dec 03, 2020 at 11:40:22AM +0300, Sergei Kornilov wrote:

Hello

To get an increase in the number of records that means that the same
statement
would appear at top level AND nested level. This seems a corner case with
very low
(neglectible) occurence rate.

+1
I think splitting fields into plans_toplevel / plans_nested will be less convenient. And more code with higher chance of copypaste errors

As I mentioned in a previous message, I really have no idea if that would be a
corner case or not. For instance with native partitioning, the odds to have
many different query executed both at top level and as a nested statement may
be quite higher.

#14

Julien Rouhaud

rjuju123@gmail.com

over 5 years ago

In reply to: Julien Rouhaud (#13)

Re: pg_stat_statements oddity with track = all

On Thu, Dec 03, 2020 at 04:53:59PM +0800, Julien Rouhaud wrote:

On Thu, Dec 03, 2020 at 11:40:22AM +0300, Sergei Kornilov wrote:

Hello

To get an increase in the number of records that means that the same
statement
would appear at top level AND nested level. This seems a corner case with
very low
(neglectible) occurence rate.

+1
I think splitting fields into plans_toplevel / plans_nested will be less convenient. And more code with higher chance of copypaste errors

As I mentioned in a previous message, I really have no idea if that would be a
corner case or not. For instance with native partitioning, the odds to have
many different query executed both at top level and as a nested statement may
be quite higher.

The consensus seems to be adding a new boolean toplevel flag in the entry key,
so PFA a patch implementing that. Note that the key now has padding, so
memset() calls are required.

#15

Sergei Kornilov

sk@zsrv.org

over 5 years ago

In reply to: Julien Rouhaud (#14)

Re: pg_stat_statements oddity with track = all

Hello

Seems we need also change PGSS_FILE_HEADER.

regards, Sergei

#16

Julien Rouhaud

rjuju123@gmail.com

over 5 years ago

In reply to: Sergei Kornilov (#15)

Re: pg_stat_statements oddity with track = all

On Fri, Dec 04, 2020 at 12:06:10PM +0300, Sergei Kornilov wrote:

Hello

Seems we need also change PGSS_FILE_HEADER.

Indeed, thanks! v2 attached.

#17

Julien Rouhaud

rjuju123@gmail.com

over 5 years ago

In reply to: Julien Rouhaud (#16)

Re: pg_stat_statements oddity with track = all

On Fri, Dec 04, 2020 at 06:09:13PM +0800, Julien Rouhaud wrote:

On Fri, Dec 04, 2020 at 12:06:10PM +0300, Sergei Kornilov wrote:

Hello

Seems we need also change PGSS_FILE_HEADER.

Indeed, thanks! v2 attached.

There was a conflict on PGSS_FILE_HEADER since some recent commit, v3 attached.

#18

Masahiro Ikeda

ikedamsh@oss.nttdata.com

over 5 years ago

In reply to: Julien Rouhaud (#17)

Re: pg_stat_statements oddity with track = all

Hi,

Thanks for making the patch to add "toplevel" column in
pg_stat_statements.
This is a review comment.

There hasn't been any discussion, at least that I've been able to find.
So, +1 to change the status to "Ready for Committer".

1. submission/feature review
=============================

This patch can be applied cleanly to the current master branch(ed4367).
I tested with `make check-world` and I checked there is no fail.

This patch has reasonable documents and tests.
A "toplevel" column of pg_stat_statements view is documented and
following tests are added.
- pg_stat_statements.track option : 'top' and 'all'
- query type: normal query and nested query(pl/pgsql)

I tested the "update" command can work.
postgres=# ALTER EXTENSION pg_stat_statements UPDATE TO '1.10';

Although the "toplevel" column of all queries which already stored is
'false',
we have to decide the default. I think 'false' is ok.

2. usability review
====================

This patch solves the problem we can't get to know
which query is top-level if pg_stat_statements.track = 'all'.

This leads that we can analyze with aggregated queries.

There is some use-case.
For example, we can know the sum of total_exec_time of queries
even if nested queries are executed.

We can know how efficiently a database can use CPU resource for queries
using the sum of the total_exec_time, and the exec_user_time and
exec_system_time in pg_stat_kcache which is the extension gathering
os resources.

Although one concern is whether only top-level is enough or not,
I couldn't come up with any use-case to use nested level, so I think
it's ok.

3. coding review
=================

Although I had two concerns, I think they are no problem.
So, this patch looks good to me.

Following were my concerns.

A. the risk of too many same queries is duplicate.

Since this patch adds a "top" member in the hash key,
it leads to store duplicated same query which "top" is false and true.

This concern is already discussed and I agreed to the consensus
it seems unlikely to have the same queries being executed both
at the top level and as nested statements.

B. add a argument of the pg_stat_statements_reset().

Now, pg_stat_statements supports a flexible reset feature.

pg_stat_statements_reset(userid Oid, dbid Oid, queryid bigint)

Although I wondered whether we need to add a top-level flag to the
arguments,
I couldn't come up with any use-case to reset only top-level queries or
not top-level queries.

4. others
==========

These comments are not related to this patch.

A. about the topic of commitfests

Since I think this feature is for monitoring,
it's better to change the topic from "System Administration"
to "Monitoring & Control".

B. check logic whether a query is top-level or not in pg_stat_kcache

This patch uses the only exec_nested_level to check whether a query is
top-level or not.
The reason is nested_level is less useful and I agree.

But, pg_stat_kcache uses plan_nested_level too.
Although the check result is the same, it's better to change it
corresponding to this patch after it's committed.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

#19

Julien Rouhaud

rjuju123@gmail.com

over 5 years ago

In reply to: Masahiro Ikeda (#18)

Re: pg_stat_statements oddity with track = all

Hi,

On Tue, Jan 19, 2021 at 4:55 PM Masahiro Ikeda <ikedamsh@oss.nttdata.com> wrote:

Thanks for making the patch to add "toplevel" column in
pg_stat_statements.
This is a review comment.

Thanks a lot for the thorough review!

I tested the "update" command can work.
postgres=# ALTER EXTENSION pg_stat_statements UPDATE TO '1.10';

Although the "toplevel" column of all queries which already stored is
'false',
we have to decide the default. I think 'false' is ok.

Mmm, I'm not sure that I understand this result. The "toplevel" value
is tracked by the C code loaded at startup, so it should have a
correct value even if you used to have the extension in a previous
version, it's just that you can't access the toplevel field before
doing the ALTER EXTENSION pg_stat_statements UPDATE. There's also a
change in the magic number, so you can't use the stored stat file from
a version before this patch.

I also fail to reproduce this behavior. I did the following:

- start postgres with pg_stat_statements v1.10 (so with this patch) in
shared_preload_libraries
- CREATE EXTENSION pg_stat_statements VERSION '1.9';
- execute a few queries
- ALTER EXTENSION pg_stat_statements UPDATE;
- SELECT * FROM pg_stat_statements reports the query with toplvel to TRUE

Can you share a way to reproduce your findings?

2. usability review
====================
[...]
Although one concern is whether only top-level is enough or not,
I couldn't come up with any use-case to use nested level, so I think
it's ok.

I agree, I don't see how tracking statistics per nesting level would
really help. The only additional use case I see would tracking
triggers/FK query execution, but the nesting level won't help with
that.

3. coding review
=================
[...]
B. add a argument of the pg_stat_statements_reset().

Now, pg_stat_statements supports a flexible reset feature.

pg_stat_statements_reset(userid Oid, dbid Oid, queryid bigint)

Although I wondered whether we need to add a top-level flag to the
arguments,
I couldn't come up with any use-case to reset only top-level queries or
not top-level queries.

Ah, I didn't think of the reset function. I also fail to see a
reasonable use case to provide a switch for the reset function.

4. others
==========

These comments are not related to this patch.

A. about the topic of commitfests

Since I think this feature is for monitoring,
it's better to change the topic from "System Administration"
to "Monitoring & Control".

I agree, thanks for the change!

B. check logic whether a query is top-level or not in pg_stat_kcache

This patch uses the only exec_nested_level to check whether a query is
top-level or not.
The reason is nested_level is less useful and I agree.

Do you mean plan_nest_level is less useful?

But, pg_stat_kcache uses plan_nested_level too.
Although the check result is the same, it's better to change it
corresponding to this patch after it's committed.

I agree, we should be consistent here. I'll take care of the needed
changes if and when this patch is commited!

#20

Masahiro Ikeda

ikedamsh@oss.nttdata.com

over 5 years ago

In reply to: Julien Rouhaud (#19)

Re: pg_stat_statements oddity with track = all

On 2021-01-20 18:14, Julien Rouhaud wrote:

On Tue, Jan 19, 2021 at 4:55 PM Masahiro Ikeda
<ikedamsh@oss.nttdata.com> wrote:

I tested the "update" command can work.
postgres=# ALTER EXTENSION pg_stat_statements UPDATE TO '1.10';

Although the "toplevel" column of all queries which already stored is
'false',
we have to decide the default. I think 'false' is ok.

Mmm, I'm not sure that I understand this result. The "toplevel" value
is tracked by the C code loaded at startup, so it should have a
correct value even if you used to have the extension in a previous
version, it's just that you can't access the toplevel field before
doing the ALTER EXTENSION pg_stat_statements UPDATE. There's also a
change in the magic number, so you can't use the stored stat file from
a version before this patch.

I also fail to reproduce this behavior. I did the following:

- start postgres with pg_stat_statements v1.10 (so with this patch) in
shared_preload_libraries
- CREATE EXTENSION pg_stat_statements VERSION '1.9';
- execute a few queries
- ALTER EXTENSION pg_stat_statements UPDATE;
- SELECT * FROM pg_stat_statements reports the query with toplvel to
TRUE

Can you share a way to reproduce your findings?

Sorry for making you confused. I can't reproduce although I tried now.
I think my procedure was something wrong. So, please ignore this
comment, sorry about that.

B. check logic whether a query is top-level or not in pg_stat_kcache

This patch uses the only exec_nested_level to check whether a query is
top-level or not.
The reason is nested_level is less useful and I agree.

Do you mean plan_nest_level is less useful?

I think so. Anyway, it's important to correspond core's implementation
because pg_stat_statements and pg_stat_kcache are used at the same time.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

#21