pg_stat_statements: Fix nested tracking for implicitly closed cursors
Hi,
It was brought to my attention that pg_stat_statements has a
behavior where an implicitly closed cursor, via COMMIT or END,
is stored as toplevel for both the utility DECLARE CURSOR
statement and the underlying query.
```
BEGIN;
DECLARE FOOCUR CURSOR FOR SELECT * from stats_track_tab;
FETCH FORWARD 1 FROM foocur;
x
---
(0 rows)
COMMIT;
SELECT toplevel, calls, query FROM pg_stat_statements
ORDER BY query COLLATE "C";
toplevel | calls | query
----------+-------+----------------------------------------------------------
t | 1 | BEGIN
t | 1 | COMMIT
t | 1 | DECLARE FOOCUR CURSOR FOR SELECT * from stats_track_tab
t | 1 | DECLARE FOOCUR CURSOR FOR SELECT * from stats_track_tab;
t | 1 | FETCH FORWARD $1 FROM foocur
t | 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
(6 rows)
```
Also, with track_planning enabled, the underlying query is
stored with toplevel = false for the plans counter and with
toplevel = true for the calls counter, resulting in an
additional entry.
```
SELECT toplevel, calls, plans, query FROM pg_stat_statements
ORDER BY query COLLATE "C";
toplevel | calls | plans | query
----------+-------+-------+--------------------------------------------------------------
t | 1 | 0 | BEGIN
t | 1 | 0 | COMMIT
t | 1 | 0 | DECLARE FOOCUR CURSOR FOR SELECT * from stats_track_tab
t | 1 | 0 | DECLARE FOOCUR CURSOR FOR SELECT * from stats_track_tab;
f | 0 | 1 | DECLARE FOOCUR CURSOR FOR SELECT * from stats_track_tab;
t | 1 | 0 | FETCH FORWARD $1 FROM foocur
t | 1 | 0 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
t | 0 | 1 | SELECT toplevel, calls, plans, query FROM pg_stat_statements+
  |   |   | ORDER BY query COLLATE "C"
```
This occurs because PortalCleanup, which triggers ExecutorEnd,
runs after the ProcessUtility hook; by that point, nesting_level
has already been reset back to 0.
I am not sure how common this pattern is, but it is probably
worth fixing. At a minimum, we need tests to show this behavior,
but we can do better by checking whether we just processed a
COMMIT statement and setting a flag to let ExecutorEnd increment
nesting_level. There should be no other way to reach ExecutorEnd
after a COMMIT besides PortalCleanup, AFAICT. I could be proven
wrong.
The attached patch fixes this as described above.
Note that, due to f85f6ab051b7, there is a separate issue that
should be improved. Tracking the underlying SQL of a utility
statement using the utility statement itself is confusing
and should be fixed. That is a separate point, but I am
mentioning it here for clarity.
--
Sami Imseih
Amazon Web Services
Attachments:
v1-0001-pg_stat_statements-Fix-nested-tracking-for-implic.patch
Hi Sami,
Please add a `PG_TRY` and `PG_FINALLY` to make sure we always reset the
nesting_level.
Also, this will break the following scenario:
```
BEGIN;
COMMIT;
SELECT 1; -- This will be stored as inner level because COMMIT sets the is_txn_end flag
```
Can we reset is_txn_end at ExecutorStart to solve the problem?
--
Martin Huang
Amazon Web Services
On Fri, Jan 9, 2026 at 1:02 PM Sami Imseih <samimseih@gmail.com> wrote:
Hi,
Thanks for the comment!
Please add a `PG_TRY` and `PG_FINALLY` to make sure we always reset the nesting_level.
Also this will break the following scenario
```
BEGIN;
COMMIT;
SELECT 1; -- This will be stored as inner level because COMMIT sets is_txn_end flag
```
Can we reset is_txn_end at ExecutorStart to solve the problem?
Correct. Thanks for spotting this oversight. I think the is_txn_end flag must
be reset in all hook functions that call pgss_store, except in pgss_ExecutorEnd,
where we just have to check is_txn_end and increment the nesting_level. Also,
I added a PG_TRY/FINALLY inside pgss_ExecutorEnd to ensure the
nesting_level is reset.
I added a test case for the query you mentioned above.
--
Sami Imseih
Amazon Web Services (AWS)
Attachments:
v2-0001-pg_stat_statements-Fix-nested-tracking-for-implic.patch
The patch looks good to me.
--
Martin Huang
Amazon Web Services
On Tue, Jan 13, 2026 at 07:01:48PM -0600, Sami Imseih wrote:
Correct. Thanks for spotting this oversight. I think the is_txn_end flag must
be reset in all hook functions that call pgss_store, except in pgss_ExecutorEnd,
where we just have to check is_txn_end and increment the nesting_level. Also,
I added a PG_TRY/FINALLY inside pgss_ExecutorEnd to ensure the
nesting_level is reset. I added a test case for the query you mentioned above.
@@ -1079,35 +1095,47 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
{
int64 queryId = queryDesc->plannedstmt->queryId;
- if (queryId != INT64CONST(0) && queryDesc->totaltime &&
- pgss_enabled(nesting_level))
+ if (is_txn_end)
+ nesting_level++;
Couldn't it be possible that we reach this stack of execution with
is_txn_end = true but we don't intend it to be so? For example,
imagine first that we reach pgss_ProcessUtility() for a COMMIT
TransactionStmt, second we error without resetting is_txn_end, third a
different portal is executed with the same backend: we could exit
the executor with is_txn_end set and nesting_level increased but we
did not intend these events to happen.
--
Michael
Couldn't it be possible that we reach this stack of execution with
is_txn_end = true but we don't intend it to be so? For example,
imagine first that we reach pgss_ProcessUtility() for a COMMIT
TransactionStmt, second we error without resetting is_txn_end, third a
different portal is executed with the same backend: we could exit
the executor with is_txn_end set and nesting_level increased but we
did not intend these events to happen.
This is the key point of this fix. So, in v2, is_txn_end is reset at the start
of pgss_planner, pgss_post_parse_analyze and pgss_ProcessUtility. But now
I realized that is_txn_end should also be reset at pgss_ExecutorStart/Run
and should always be reset at the end of pgss_ExecutorEnd.
So, in your example above, the new portal will go through one of the hooks,
which will reset is_txn_end at the start. Basically, any command has to go
through one of pgss_planner, pgss_post_parse_analyze, pgss_ProcessUtility,
pgss_ExecutorStart or pgss_ExecutorRun, so resetting this value at the start
of these hooks should ensure that is_txn_end = true does not incorrectly
persist across statements. Right?
(Oops, I just realized my example at the start of the thread does not
indicate that this issue only occurs with pg_stat_statements.track = "all".)
--
Sami Imseih
Amazon Web Services (AWS)
Attachments:
v3-0001-pg_stat_statements-Fix-nested-tracking-for-implic.patch
On Fri, Jan 16, 2026 at 06:12:16PM -0600, Sami Imseih wrote:
This is the key point of this fix. So, in v2, is_txn_end is reset at the start
of pgss_planner, pgss_post_parse_analyze and pgss_ProcessUtility. But now
I realized that is_txn_end should also be reset at pgss_ExecutorStart/Run
and should always be reset at the end of pgss_ExecutorEnd.
Hmm. While reading through the patch, I am wondering if this approach
to this kind of problem is the right thing to do long-term, relying on
a static flag to decide if the nesting level should be manipulated or
not when we end the executor. pgss is not the only module where I
have seen that we want to track a nesting level in a precise way. So
could it be more flexible to attach that to a backend structure at
this point and make manipulations of this number go through the core
backend, reducing the TRY/CATCH blocks required in extension code?
--
Michael
On Sun, Jan 18, 2026 at 7:18 PM Michael Paquier <michael@paquier.xyz> wrote:
Hmm. While reading through the patch, I am wondering if this approach
to this kind of problem is the right thing to do long-term, relying on
a static flag to decide if the nesting level should be manipulated or
not when we end the executor. pgss is not the only module where I
have seen that we want to track a nesting level in a precise way. So
could it be more flexible to attach that to a backend structure at
this point and make manipulations of this number go through the core
backend, reducing the TRY/CATCH blocks required in extension code?
+1
I think tracking nesting level in core would make a lot of sense -
I've noticed too that extensions (pg_wait_sampling, pg_qualstats,
pg_stat_plans, etc) seem to keep re-inventing this, and I assume the
definition of this would be the same across extensions.
Not sure if you already had a place in mind, but thinking through
where we could attach it, I realized two things:
1) QueryDesc seems like a reasonable struct to communicate nesting
level (since it gets passed to the relevant executor hooks already),
but that doesn't get passed to ProcessUtility today
2) ProcessUtility has an existing indicator for top-level statements
in ProcessUtilityContext (PROCESS_UTILITY_TOPLEVEL)
Were you thinking of actually using a backend struct (i.e.
PgBackendStatus) and keep changing that as we run through the
different execution levels, and exposing a helper function to retrieve
it?
Thanks,
Lukas
--
Lukas Fittl
On Tue, Jan 20, 2026 at 01:27:05PM -0800, Lukas Fittl wrote:
Not sure if you already had a place in mind, but thinking through
where we could attach it, I realized two things:
1) QueryDesc seems like a reasonable struct to communicate nesting
level (since it gets passed to the relevant executor hooks already),
but that doesn't get passed to ProcessUtility today
2) ProcessUtility has an existing indicator for top-level statements
in ProcessUtilityContext (PROCESS_UTILITY_TOPLEVEL)
Were you thinking of actually using a backend struct (i.e.
PgBackendStatus) and keep changing that as we run through the
different execution levels, and exposing a helper function to retrieve
it?
I did not have in mind that a backend structure, with state allocated
for as long as the session lives, would be a good fit. It is a bit
better than a static flag to drive the decisions, but it is still open
to being incorrect because it would require more code paths to reset it.
Something like QueryDesc would be a better fit for a state that sticks
around for as long as a top-level transaction is running. We are not
doing that stuff really well yet, though, if we want to track data
like a nesting level. So my short answer is that I do not know yet
what a good answer is, but it is definitely a problem to have
duplicated logic to calculate a nesting level across many extensions.
QueryDesc was one that popped to mind for me as well, but I got reminded
that utility.c has no idea about that, so... Now, QueryDesc knows
about a PlannedStmt, which is something that utility.c knows.
--
Michael
Not sure if you already had a place in mind, but thinking through
where we could attach it, I realized two things:
1) QueryDesc seems like a reasonable struct to communicate nesting
level (since it gets passed to the relevant executor hooks already),
but that doesn't get passed to ProcessUtility today
2) ProcessUtility has an existing indicator for top-level statements
in ProcessUtilityContext (PROCESS_UTILITY_TOPLEVEL)
Were you thinking of actually using a backend struct (i.e.
PgBackendStatus) and keep changing that as we run through the
different execution levels, and exposing a helper function to retrieve
it?
I did not have in mind that a backend structure, with state allocated
for as long as the session lives, would be a good fit. It is a bit
better than a static flag to drive the decisions, but it is still open
to being incorrect because it would require more code paths to reset it.
Something like QueryDesc would be a better fit for a state that sticks
around for as long as a top-level transaction is running. We are not
doing that stuff really well yet, though, if we want to track data
like a nesting level. So my short answer is that I do not know yet
what a good answer is, but it is definitely a problem to have
duplicated logic to calculate a nesting level across many extensions.
QueryDesc was one that popped to mind for me as well, but I got reminded
that utility.c has no idea about that, so... Now, QueryDesc knows
about a PlannedStmt, which is something that utility.c knows.
Before going down the fix in pg_stat_statements directly, it did cross my
mind to do nesting_level tracking in core, but I was not sure if there
was an appetite for this. I may have been wrong.
nesting_level is used in both auto_explain and pg_stat_statements
in contrib, and as mentioned by Lukas there are other extensions
out in the ecosystem that do the same.
I am not opposed to the idea of going down this path. I took a brief look
today and found some missing nesting_level tests [0].
My initial thought is this would need some type of backend structure.
I am not sure how attaching this to something like a QueryDesc would
be useful (couldn't a single execution have multiple QueryDescs? i.e.,
nested queries, etc.).
[0]: /messages/by-id/CAA5RZ0uK1PSrgf52bWCtDpzaqbWt04o6ZA7zBm6UQyv7vyvf9w@mail.gmail.com
--
Sami Imseih
Amazon Web Services (AWS)