[PATCH] Add last_executed timestamp to pg_stat_statements

Started by Pavlo Golub3 months ago23 messages

pavlo.golub@cybertec.at

3 months ago

Hello,

I would like to propose adding a last_executed timestamptz column to
pg_stat_statements. This column records when each tracked statement
was most recently executed.

The motivation comes from real world experience with monitoring tools
like pgwatch that poll pg_stat_statements regularly. Currently, these
tools must fetch and store statistics for all statements, even those
that haven't executed recently. This creates significant storage
overhead. For a database with around 3400 statements polled every 3
minutes, storing full query text requires roughly 2.5 MB per snapshot.
Over two weeks, this accumulates to about 17 GB. Even without query
text, storage reaches 10 GB.

With a last_executed timestamptz, monitoring tools can simply filter
statements by "last_executed > NOW() - polling_interval" to fetch only
statements that have been executed since the last poll. This
eliminates the need for complex workarounds that some tools currently
use to identify changed statements
(https://github.com/cybertec-postgresql/pgwatch/blob/759df3a149cbbe973165547186068aa7b5332f9d/internal/metrics/metrics.yaml#L2605-L2766).

Beyond monitoring efficiency, the timestamp enables other useful
queries. You can find statements that haven't executed in 30 days to
identify deprecated code paths. You can correlate statement execution
with specific time windows during incident investigation. You can also
make informed decisions about which statistics to reset.

The implementation is straightforward. The timestamp is stored in the
Counters structure and updated on every statement execution, protected
by the existing spinlock. The overhead is minimal, just a single
timestamp assignment per execution. The timestamp persists with other
statistics across server restarts. I've bumped the stats file format
version to handle the structure change cleanly.

The patch includes a new pg_stat_statements_1_14 function, the upgrade
script from 1.13 to 1.14, and regression tests. All existing tests
continue to pass.

I believe this is a simple addition that addresses a real pain point
for database monitoring and provides useful functionality for
understanding query patterns over time.

Thanks in advance!

Attached patch applies cleanly to the current master.

Sami Imseih

samimseih@gmail.com

3 months ago

In reply to: Pavlo Golub (#1)

Re: [PATCH] Add last_executed timestamp to pg_stat_statements

Hi,

Thanks for raising this. I did not look at the patch, but I have some high
level comments.

I would like to propose adding a last_executed timestamptz column to
pg_stat_statements. This column records when each tracked statement
was most recently executed.

I do think there is value in adding a last_executed timestamp. I actually think
last_executed should be the time the query started timestamp, so we should
actually create an entry at ExecutorStart, along with calls_started and
calls_completed. This is great for tracking cancelled queries.
The issue is the extra overhead of tracking the query on
EcecutorStart, but that
should be less of an issue once we move pg_stat_statements to the cumulative
statistics system, which will be possible once we get some prerequisite
work to make this happen [0]/messages/by-id/CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ@mail.gmail.com.

Another concern is the width of the current view. I think before we add
any new attribute, pg_stat_statements fields should be split.
This was discussed in [1]/messages/by-id/03f82e6f-66a3-4c4d-935c-ea4d93871dc1@gmail.com.

The motivation comes from real world experience with monitoring tools
like pgwatch that poll pg_stat_statements regularly. Currently, these
tools must fetch and store statistics for all statements, even those
that haven't executed recently. This creates significant storage
overhead. For a database with around 3400 statements polled every 3
minutes, storing full query text requires roughly 2.5 MB per snapshot.
Over two weeks, this accumulates to about 17 GB. Even without query
text, storage reaches 10 GB.

With a last_executed timestamptz, monitoring tools can simply filter
statements by "last_executed > NOW() - polling_interval" to fetch only
statements that have been executed since the last poll. This
eliminates the need for complex workarounds that some tools currently
use to identify changed statements
(https://github.com/cybertec-postgresql/pgwatch/blob/759df3a149cbbe973165547186068aa7b5332f9d/internal/metrics/metrics.yaml#L2605-L2766).

Can pg_stat_statements.stats_since help here?

for example "where stats_since > last_poll_timestamp" ?

The client does have to track the last_poll_timestamp in that
case.

[0]: /messages/by-id/CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ@mail.gmail.com
[1]: /messages/by-id/03f82e6f-66a3-4c4d-935c-ea4d93871dc1@gmail.com

--
Sami Imseih
Amazon Web Services (AWS)

Pavlo Golub

pavlo.golub@cybertec.at

3 months ago

In reply to: Sami Imseih (#2)

Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements

Hi,

Thanks for raising this. I did not look at the patch, but I have some high
level comments.

I would like to propose adding a last_executed timestamptz column to
pg_stat_statements. This column records when each tracked statement
was most recently executed.

I do think there is value in adding a last_executed timestamp.

Thanks for your support!

Can pg_stat_statements.stats_since help here?

for example "where stats_since > last_poll_timestamp" ?

Actually no, monitoring tools fetch snapshots to find the difference
between snapshots.
Data for every statement is changes after each execution.

But stats_since is inserted only once when the new statement execution
appears and is never updated during next executions.

Show quoted text

The client does have to track the last_poll_timestamp in that
case.

[0] /messages/by-id/CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ@mail.gmail.com
[1] /messages/by-id/03f82e6f-66a3-4c4d-935c-ea4d93871dc1@gmail.com

--
Sami Imseih
Amazon Web Services (AWS)

Sami Imseih

samimseih@gmail.com

3 months ago

In reply to: Pavlo Golub (#3)

Re: Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements

Can pg_stat_statements.stats_since help here?

for example "where stats_since > last_poll_timestamp" ?

Actually no, monitoring tools fetch snapshots to find the difference
between snapshots.
Data for every statement is changes after each execution.

But stats_since is inserted only once when the new statement execution
appears and is never updated during next executions.

I was thinking of using stats_since to avoid fetching query text,
since that does not change. But you are talking about avoiding all
the stats if they have not changed. I see that now.

FWIW, this was discussed back in 2017 [0]/messages/by-id/CA+TgmoZgZMeuN8t9pawSt6M=mvxKiAZ4CvPofBWwwVWeZwHe4w@mail.gmail.com, and at that time there was
some support for last_executed, but the patch did not go anywhere.

After looking at the patch, I have a few comments:

1/ There are whitespace errors when applying.

2/ Calling GetCurrentTimestamp while holding a spinlock is
not a good idea and should be avoided. This was also a point
raised in [0]/messages/by-id/CA+TgmoZgZMeuN8t9pawSt6M=mvxKiAZ4CvPofBWwwVWeZwHe4w@mail.gmail.com. Even when we move pg_stat_statements
to cumulative stats and not at the mercy of the spinlock for updating
entries, i would still hesitate to add an additional GetCurrentTimestamp()
for every call.

I wonder if we can use GetCurrentStatementStartTimestamp()
instead?

```
/*
* GetCurrentStatementStartTimestamp
*/
TimestampTz
GetCurrentStatementStartTimestamp(void)
{
return stmtStartTimestamp;
}
```

stmtStartTimestamp is the time the query started, which seems OK for
the use-case you are mentioning. But also, stmtStartTimestamp gets
set at the top-level so nested entries (toplevel = false ) will just
inherit the timestamp of the top-level entry.

IMO, this is the most important point in the patch for now.

3/ last_executed, or maybe (last_toplevel_start) if we go with #2 should not
be added under pgssEntry->Counters, but rather directory under pgssEntry.

@@ -213,6 +214,7 @@ typedef struct Counters
                                              * launched */
     int64        generic_plan_calls; /* number of calls using a generic plan */
     int64        custom_plan_calls;    /* number of calls using a
custom plan */
+    TimestampTz last_executed;    /* timestamp of last statement execution */
 } Counters;

4/ instead of a " last_executed" maybe the tests should be added to
entry_timestamp.sql?

[0]: /messages/by-id/CA+TgmoZgZMeuN8t9pawSt6M=mvxKiAZ4CvPofBWwwVWeZwHe4w@mail.gmail.com

--
Sami Imseih
Amazon Web Services (AWS)

Christoph Berg

myon@debian.org

about 1 month ago

In reply to: Sami Imseih (#4)

Re: Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements

Re: Sami Imseih

I wonder if we can use GetCurrentStatementStartTimestamp()
instead?

The main use case for this column is being able to retrieve the stats
that were updated since the last time one was looking. That only works
if it's the statement end time, or else long-running statements
spanning more than one poll interval would be missed.

Perhaps the column should rather be called "stats_last_updated" to
match "stats_since", and be moved to the very end? Similarly, nested
statements would also have to get that stamp.

Oh and, yes, I'm definitely +1 on this feature.

Christoph

Sami Imseih

samimseih@gmail.com

about 1 month ago

In reply to: Christoph Berg (#5)

Re: Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements

I wonder if we can use GetCurrentStatementStartTimestamp()
instead?

The main use case for this column is being able to retrieve the stats
that were updated since the last time one was looking. That only works
if it's the statement end time, or else long-running statements
spanning more than one poll interval would be missed.

Sure, I get it is not perfect for the polling use-case due to the scenario
you mention, but I don't think it will be acceptable to call
GetCurrentTimeStamp() at the end of every execution and especially
with a SpinLock held.

This will probably be worse for nested_tacking as well, due to
multiple GetCurrentTimeStamp() calls.

What do you think?

--
Sami Imseih
Amazon Web Services (AWS)

Bertrand Drouvot

bertranddrouvot.pg@gmail.com

about 1 month ago

In reply to: Sami Imseih (#6)

Re: Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements

Hi,

On Tue, Feb 03, 2026 at 08:37:31PM -0600, Sami Imseih wrote:

I wonder if we can use GetCurrentStatementStartTimestamp()
instead?

The main use case for this column is being able to retrieve the stats
that were updated since the last time one was looking. That only works
if it's the statement end time, or else long-running statements
spanning more than one poll interval would be missed.

Sure, I get it is not perfect for the polling use-case due to the scenario
you mention, but I don't think it will be acceptable to call
GetCurrentTimeStamp() at the end of every execution and especially
with a SpinLock held.

I think the same, that would not match (lmgr/README):

"
* Spinlocks. These are intended for *very* short-term locks. If a lock
is to be held more than a few dozen instructions, or across any sort of
kernel call
"

Out of curiosity I looked for GetCurrentTimeStamp() calls while holding a spinlock
and found one in WalReceiverMain(). But I guess it's less of an issue since
it's only called when the walreceiver starts.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Pavlo Golub

pavlo.golub@cybertec.at

about 1 month ago

In reply to: Bertrand Drouvot (#7)

Re: Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements

Hi all,

Thank you to Sami, Christoph, and Bertrand for the thorough review and valuable
feedback on v1. I've prepared a v2 patch that addresses all the concerns raised.

You're absolutely right that calling `GetCurrentTimestamp()` while holding a
spinlock is unacceptable.

I've adopted Sami's suggestion to use `GetCurrentStatementStartTimestamp()`
instead. While Christoph correctly points out that this means we're capturing
statement start time rather than end time, I believe this is an acceptable
trade-off for several reasons:
- the vast majority of statements complete quickly (well under typical polling
intervals of 1-5 minutes);
- long-running queries will appear in the next poll anyway;
- deferring timestamp updates adds significant complexity;
- and most importantly zero performance overhead vs a kernel call per execution.

For the monitoring use case, what matters is knowing has this statement been
active since my last poll? The start time serves this purpose effectively.

I've renamed the column to `stats_last_updated` as Christoph suggested. This
matches the existing "stats_since" column for consistency. Following Christoph's
suggestion, I've also moved it to the end of the view.

Per Sami's feedback, I've moved the timestamp from the `Counters`
struct directly
to `pgssEntry`. This is more semantically correct since it's metadata about the
entry rather than a statistical counter.

I've updated `PGSS_FILE_HEADER` to 0x20260205.

Among other changes:
- fixed all whitespace errors
- moved tests to `entry_timestamp.sql` as suggested
- added documentation notice about nested statement behavior
- the timestamp is now set outside the spinlock-protected section

Sami mentioned this was discussed in 2017 [0]/messages/by-id/CA+TgmoZgZMeuN8t9pawSt6M=mvxKiAZ4CvPofBWwwVWeZwHe4w@mail.gmail.com. Looking at that thread, the main
concerns were overhead and use cases. I believe we've addressed both:
- zero overhead by using existing statement start timestamp
- clear, documented use case from production monitoring tools (pgwatch, etc.)

[0]: /messages/by-id/CA+TgmoZgZMeuN8t9pawSt6M=mvxKiAZ4CvPofBWwwVWeZwHe4w@mail.gmail.com

I've attached the v2 patch. All existing tests pass.

I'd appreciate further review and feedback.

Thanks,
Pavlo Golub

Sami Imseih

samimseih@gmail.com

about 1 month ago

In reply to: Pavlo Golub (#8)

Re: Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements

Thank you to Sami, Christoph, and Bertrand for the thorough review and valuable
feedback on v1. I've prepared a v2 patch that addresses all the concerns raised.

Thanks for the patch! I have not looked at v2 in detail yet. Did take
a quick peek
at the doc. Some comments:

I've renamed the column to `stats_last_updated` as Christoph suggested. This
matches the existing "stats_since" column for consistency. Following Christoph's
suggestion, I've also moved it to the end of the view.

I still wonder if "stats_last_updated" is a good name here. What about
"last_execution_start", since that is exactly what this timestamp is.

+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_last_updated</structfield> <type>timestamp with
time zone</type>
+ </para>
+ <para>
+ Time at which the statement statistics were last updated (specifically,
+ the time when the statement most recently started execution).

Here I think we can just say:

"The start time of the most recent execution of the statement that completed. "

+ This is useful for monitoring tools to identify which statements
+ have been executed since their last poll.

I am not sure we need this part for the docs. others may disagree.

+ For nested statements (when <varname>pg_stat_statements.track</varname>
+ is set to <literal>all</literal>), this reflects the start time of the
+ parent top-level statement.
+ </para></entry>

Maybe this is better as it mentioned "toplevel"

"For nested statements (toplevel = false), this reflects the start time
of the top-level statement."

what do you think?

--
Sami Imseih
Amazon Web Services (AWS)

#10

Pavlo Golub

pavlo.golub@cybertec.at

about 1 month ago

In reply to: Sami Imseih (#9)

Re[4]: [PATCH] Add last_executed timestamp to pg_stat_statements

Hey

Thank you to Sami, Christoph, and Bertrand for the thorough review and valuable
feedback on v1. I've prepared a v2 patch that addresses all the concerns raised.

Thanks for the patch! I have not looked at v2 in detail yet. Did take
a quick peek
at the doc. Some comments:

I've renamed the column to `stats_last_updated` as Christoph suggested. This
matches the existing "stats_since" column for consistency. Following Christoph's
suggestion, I've also moved it to the end of the view.

I still wonder if "stats_last_updated" is a good name here. What about
"last_execution_start", since that is exactly what this timestamp is.

Yeah, sounds better really. Thanks

+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_last_updated</structfield> <type>timestamp with
time zone</type>
+ </para>
+ <para>
+ Time at which the statement statistics were last updated (specifically,
+ the time when the statement most recently started execution).

Here I think we can just say:

"The start time of the most recent execution of the statement that completed. "

+ This is useful for monitoring tools to identify which statements
+ have been executed since their last poll.
I am not sure we need this part for the docs. others may disagree.

Agree, too much details.

+ For nested statements (when <varname>pg_stat_statements.track</varname>
+ is set to <literal>all</literal>), this reflects the start time of the
+ parent top-level statement.
+ </para></entry>
Maybe this is better as it mentioned "toplevel"

"For nested statements (toplevel = false), this reflects the start time
of the top-level statement."

I like it. Looks easier to read.

what do you think?

Thanks, I'll prepare updated patch

Show quoted text

--
Sami Imseih
Amazon Web Services (AWS)

#11

Christoph Berg

myon@debian.org

about 1 month ago

In reply to: Pavlo Golub (#10)

Re: [PATCH] Add last_executed timestamp to pg_stat_statements

Re: Pavlo Golub

I still wonder if "stats_last_updated" is a good name here. What about
"last_execution_start", since that is exactly what this timestamp is.

Yeah, sounds better really. Thanks

I still believe that storing execution start time is the wrong thing
to do as it will miss all long-running statements. Consider this
timeline:

09:55 get all stats changed since 09:50
-> doesn't see the statement because it hasn't started yet
09:57 start long-running statement
10:00 get all stats changed since 09:55
-> doesn't see the statement because it's still running
10:02 long-running statement terminates, storing 09:57 as timestamp
10:05 get all stats changed since 10:00
-> doesn't see the statement because it's too old

Christoph

#12

Pavlo Golub

pavlo.golub@cybertec.at

about 1 month ago

In reply to: Christoph Berg (#11)

Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements

I still believe that storing execution start time is the wrong thing
to do as it will miss all long-running statements. Consider this
timeline:

09:55 get all stats changed since 09:50
-> doesn't see the statement because it hasn't started yet
09:57 start long-running statement
10:00 get all stats changed since 09:55
-> doesn't see the statement because it's still running
10:02 long-running statement terminates, storing 09:57 as timestamp
10:05 get all stats changed since 10:00
-> doesn't see the statement because it's too old

Thanks for sharing this. I spent a sleepless night and it seems I found
the solution for the issue.

If we do

WHERE last_execution_start + max_exec_time * INTERVAL '1 ms' > NOW() -
polling_interval

we will grab all long-running statements. The worst thing that might
happen, if some query
has significant run time deviation, then we could grab it several times.
But again this is far
better than re-fetching thousands of rows every time.

For short queries it will just work as expected. "Issues" might be only
with long queries with
a high deviation longer than polling interval. But fetching one or two
rows once in a while is not
a problem.

#13

Christoph Berg

myon@debian.org

about 1 month ago

In reply to: Pavlo Golub (#12)

Re: [PATCH] Add last_executed timestamp to pg_stat_statements

Re: Pavlo Golub

If we do

WHERE last_execution_start + max_exec_time * INTERVAL '1 ms' > NOW() -
polling_interval

Is this extra complexity worth one saved GetCurrentTimestamp()?

src/backend/access/transam/xact.c is calling GetCurrentTimestamp a
lot already, so I don't really buy the argument it should be avoided
at all cost in pg_stat_statements. Just storing the statement end time
would make this use case much nicer.

Christoph

#14

Pavlo Golub

pavlo.golub@cybertec.at

about 1 month ago

In reply to: Christoph Berg (#13)

Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements

Is this extra complexity worth one saved GetCurrentTimestamp()?

I'm not sure really. But I really need a new column in one way or
another, so I can agree to whatever decision we'll make.

src/backend/access/transam/xact.c is calling GetCurrentTimestamp a
lot already, so I don't really buy the argument it should be avoided
at all cost in pg_stat_statements. Just storing the statement end time
would make this use case much nicer.

Maybe just move GetCurrentTimestamp outside of spin lock is engouh
already?

Show quoted text

Christoph

#15

Pavlo Golub

pavlo.golub@cybertec.at

about 1 month ago

In reply to: Christoph Berg (#11)

Re: [PATCH] Add last_executed timestamp to pg_stat_statements

OK, here is one more try. I discovered the `total_time` argument to
the `pgss_store()` function! So we can calculate the finish time
without calling `GetCurrentTimestamp()`.

This is version 3 of the patch adding a `stats_last_updated` column
(yes, again) to pg_stat_statements. Based on feedback, this version
improves the implementation with better performance and correctness.

The main improvement uses `statement_start + execution_duration` with
`rint(total_time * 1000.0)` to convert milliseconds to microseconds
with proper rounding. The calculation performed BEFORE acquiring
spinlock and assigned within locked scope.

I'm wondering how we all missed this trick from the very beginning and
started to argue if `GetCurrentTimestamp()` is heavy or not. :)

Best regards,
Pavlo

#16

Sami Imseih

samimseih@gmail.com

about 1 month ago

In reply to: Christoph Berg (#13)

Re: [PATCH] Add last_executed timestamp to pg_stat_statements

WHERE last_execution_start + max_exec_time * INTERVAL '1 ms' > NOW() -
polling_interval

Is this extra complexity worth one saved GetCurrentTimestamp()?

src/backend/access/transam/xact.c is calling GetCurrentTimestamp a
lot already, so I don't really buy the argument it should be avoided
at all cost in pg_stat_statements. Just storing the statement end time
would make this use case much nicer.

We do call GetCurrentTimestamp() to set timestamps for xact_start, etc.

But, we also take special care to avoid extra calls.

```
/*
* set transaction_timestamp() (a/k/a now()). Normally, we want this to
* be the same as the first command's statement_timestamp(), so don't do a
* fresh GetCurrentTimestamp() call (which'd be expensive anyway). But
* for transactions started inside procedures (i.e., nonatomic SPI
* contexts), we do need to advance the timestamp. Also, in a parallel
* worker, the timestamp should already have been provided by a call to
* SetParallelStartTimestamps().
*/
if (!IsParallelWorker())
{
if (!SPI_inside_nonatomic_context())
xactStartTimestamp = stmtStartTimestamp;
else
xactStartTimestamp = GetCurrentTimestamp();
```

OK, here is one more try. I discovered the `total_time` argument to
the `pgss_store()` function! So we can calculate the finish time
without calling `GetCurrentTimestamp()`.

This is version 3 of the patch adding a `stats_last_updated` column
(yes, again) to pg_stat_statements. Based on feedback, this version
improves the implementation with better performance and correctness.

The main improvement uses `statement_start + execution_duration` with
`rint(total_time * 1000.0)` to convert milliseconds to microseconds
with proper rounding. The calculation performed BEFORE acquiring
spinlock and assigned within locked scope.

Correct, this also crossed my mind. Although I would consider doing things a bit
different. Instead of calculating the end time in a single column, I still
think it's worth having a last_executed_start and a
last_execution_time (duration),
and the user can calculate this on the SQL layer. I think it's better because
last_execution_start is already a known timestamp in
pg_stat_activity.query_start
and some tool that finds a long running query in pg_stat_activity, knowing the
query_start they could then go look it up in pg_stat_statements. What I'm
really getting at is separating these fields will open up more use cases, IMO.

Of course, we are adding 2 new columns instead of just 1, but these values
do have benefits.

--
Sami Imseih
Amazon Web Services (AWS)

#17

Christoph Berg

myon@debian.org

29 days ago

Re: [PATCH] Add last_executed timestamp to pg_stat_statements

Re: Pavlo Golub

How about
TimestampTz stmt_end = TimestampTzPlusMilliseconds(
GetCurrentStatementStartTimestamp(),
(int64) total_time
);
We have total_time as an argument already! No kernel calls, sweet and easy!

Cool idea!

Re: Sami Imseih

I think it's better because last_execution_start is already a known
timestamp in pg_stat_activity.query_start and some tool that finds a
long running query in pg_stat_activity, knowing the
query_start they could then go look it up in pg_stat_statements.

That only works if a) the query was not yet overwritten in
pg_stat_activity and b) neither in pg_stat_statements. Optimizing for
that use case seems pretty narrow.

What I'm really getting at is separating these fields will open up
more use cases, IMO.

Generally, I think pgss should have cumulative statistics, and less
about individual executions, so I'm not really sure what practical
problem "last start" and "last runtime" would solve. The
last_stats_update column we are talking about here is different in the
sense that it's not about an individual execution, but infrastructure
for retrieving the stats sensibly.

Christoph

Import Notes

Reply to msg id not found: CAA5RZ0vMW+0_6FYtFEaN-NAnk1sT2ucCUDJe0-xKwZriROV--A@mail.gmail.comemfe7f88ae-6f9a-469e-98f4-7eb9fbfddcfe@cybertec.at

#18

Sami Imseih

samimseih@gmail.com

29 days ago

In reply to: Christoph Berg (#17)

Re: [PATCH] Add last_executed timestamp to pg_stat_statements

How about
TimestampTz stmt_end = TimestampTzPlusMilliseconds(
GetCurrentStatementStartTimestamp(),
(int64) total_time
);
We have total_time as an argument already! No kernel calls, sweet and easy!

Cool idea!

This calculation could be wrong for very common cases in extended
query protocol,

Here is a script to test with:

```
select pg_stat_statements_reset();

BEGIN;
select now() as now, clock_timestamp() as clock_timestamp,
pg_sleep($1) \bind 10 \g

\! sleep 10

SELECT now() as now, clock_timestamp() as clock_timestamp, $1 \bind 1 \g
END;

select stats_last_updated, total_exec_time, substr(query, 1, 150) as
query from pg_stat_statements;

````

With v3 applied, notice the output is calculating a stats_last_updated
that is beyond the current time

```
pg_stat_statements_reset
-------------------------------
2026-02-09 16:13:35.188849+00
(1 row)

BEGIN
now | clock_timestamp | pg_sleep
------------------------------+-------------------------------+----------
2026-02-09 16:13:35.18911+00 | 2026-02-09 16:13:35.189397+00 |
(1 row)

now | clock_timestamp | ?column?
------------------------------+-------------------------------+----------
2026-02-09 16:13:35.18911+00 | 2026-02-09 16:13:55.193443+00 | 1
(1 row)

COMMIT
stats_last_updated | total_exec_time |
query
-------------------------------+-----------------+-------------------------------------------------------------------------
2026-02-09 16:13:55.19367+00 | 0.007401 | SELECT now() as
now, clock_timestamp() as clock_timestamp, $1
2026-02-09 16:13:55.193664+00 | 0.00103 | END
2026-02-09 16:13:35.189111+00 | 0.00098 | BEGIN
2026-02-09 16:13:35.188584+00 | 0.090183 | select
pg_stat_statements_reset()
2026-02-09 16:14:05.194134+00 | 10000.751122 | select now() as
now, clock_timestamp() as clock_timestamp, pg_sleep($1)
(5 rows)
```

This happens because in the case of extended query protocol,
ExecutorEnd is called
at the next query. This has been discussed in [1]/messages/by-id/3d07ee43-8855-42db-97e0-bad5db82d972@dalibo.com [2]/messages/by-id/CAA5RZ0t2+GLnE_55L2cfCay+L8yPFpdPRVQo-JswUFgXy-EK5Q@mail.gmail.com.

So, for this to work, we will likely need to store the query start
time in the queryDesc; actually
queryDesc->totaltime, and set the query start time at ExecutorStart,
during InstrAlloc.

I think it's better because last_execution_start is already a known
timestamp in pg_stat_activity.query_start and some tool that finds a
long running query in pg_stat_activity, knowing the
query_start they could then go look it up in pg_stat_statements.

That only works if a) the query was not yet overwritten in
pg_stat_activity and b) neither in pg_stat_statements. Optimizing for
that use case seems pretty narrow.

What I'm really getting at is separating these fields will open up
more use cases, IMO.

Maybe this is a bad use case. But I felt separating these 2 fields will
be more flexible.

Generally, I think pgss should have cumulative statistics, and less
about individual executions, so I'm not really sure what practical
problem "last start" and "last runtime" would solve. The
last_stats_update column we are talking about here is different in the
sense that it's not about an individual execution, but infrastructure
for retrieving the stats sensibly.

Sure, generally, pg_stat_statements is for cumulative stats, but we also
do have computed stats such as max/min/stddev, etc. But, it's not without
precedent that we track timestamps of the last time some operation occurred.
We do that in views that have a purpose of tracking cumulative data, because
these timestamps are useful. See pg_stat_all_tables.last_seq_scan or
last_autovacuum
as an example.

Maybe having the last runtime column is not that valuable if we can
correctly calculate
the last execution time. AlsoI will be a strong -1 calling this field
"stats_last_updated"
instead of "last_execution_time".

[1]: /messages/by-id/3d07ee43-8855-42db-97e0-bad5db82d972@dalibo.com
[2]: /messages/by-id/CAA5RZ0t2+GLnE_55L2cfCay+L8yPFpdPRVQo-JswUFgXy-EK5Q@mail.gmail.com

--
Sami Imseih
Amazon Web Services (AWS)

#19

Pavlo Golub

pavlo.golub@cybertec.at

22 days ago

In reply to: Sami Imseih (#18)

Re: [PATCH] Add last_executed timestamp to pg_stat_statements

Hi hackers,

This is v4 of the patch adding a stats_last_updated column to
pg_stat_statements.

The thread is somehow shifted to things that are not related to the
patch initial idea. I decided to return to the beginning.
I simplified the implementation to use GetCurrentTimestamp() directly
instead of calculating the timestamp from
GetCurrentStatementStartTimestamp() + total_time. The previous optimization was
premature, benchmark testing proves GetCurrentTimestamp() adds no
measurable overhead.

Test environment: Dockerized Linux x86_64 under Windows host (the
worst possible combination), PostgreSQL 19devel, gcc-14.2.0
Test: 1,000,000 iterations of PERFORM 1 (fastest possible statement)

Unpatched PostgreSQL:
track='none' (baseline): 562.07ms average
track='all' (tracking): 719.17ms average
Overhead: 157.10ms (27.9%)

Patched PostgreSQL (with stats_last_updated):
track='none' (baseline): 548.95ms average
track='all' (tracking): 732.50ms average
Overhead: 183.55ms (33.4%)

Direct comparison (what matters):
Unpatched track='all': 719.17ms
Patched track='all': 732.50ms
Difference: +13.33ms (+1.85%)
Per-statement: 13.33ms / 1,000,000 = 13 nanoseconds

The baseline comparison shows patched is actually faster (-13ms),
which is impossible.
This confirms the 13ms variance is a measurement noise, not real overhead.

Real-world impact for 100ms query is 0.000013% overhead

GetCurrentTimestamp() is the standard approach used throughout PostgreSQL for
monitoring features. The measured overhead of 13 nanoseconds per statement is
negligible for any realistic workload and well within measurement noise.

The implementation captures GetCurrentTimestamp() before acquiring the spinlock,
so no syscall occurs while holding the lock. This is simple, correct, and has
no measurable performance impact.

Changes from v2-v3:
- Simplified implementation to use GetCurrentTimestamp() directly
- Removed complex calculation with GetCurrentStatementStartTimestamp()
and total_time computation (premature optimization)
- Added comprehensive benchmark testing (unpatched vs patched)
- Benchmark testing shows no measurable overhead (<2% in synthetic tests)
- Measured overhead: 13ns per statement (1.85% for 1M iteration test,
negligible for real queries >0.1ms)

Changes from v1:
- Rename column from last_executed to stats_last_updated (Christoph Berg)
- Move timestamp from Counters struct to pgssEntry for better semantics
- Place column at end of view to match stats_since naming convention
- Fixed whitespace errors
- Moved tests to entry_timestamp.sql (Sami Imseih)
- Updated PGSS_FILE_HEADER to handle structure change

Patch, benchmark script and raw results are attached.

Best regards,
Pavlo Golub

#20

Sami Imseih

samimseih@gmail.com

22 days ago

In reply to: Pavlo Golub (#19)

Re: [PATCH] Add last_executed timestamp to pg_stat_statements

Hi,

I simplified the implementation to use GetCurrentTimestamp() directly
instead of calculating the timestamp from

I don't think it will be acceptable to add GetCurrentTimestamp() at the end of
every execution. We take extra measures to avoid such overhead in core, and
I don't think we should impose this on pg_stat_statements users as well.

For example, in c037471832e which tracks lastscan:

"
To make it easier to detect the last time a relation has been scanned, track
that time in each relation's pgstat entry. To minimize overhead a) the
timestamp is updated only when the backend pending stats entry is flushed to
shared stats b) the last transaction's stop timestamp is used as the
timestamp.
"

My opinion is still that using the last_start_time will be useful for monitoring
as you demonstrated [0]/messages/by-id/ema0100e49-522e-4b8f-9c25-9257af1f0793@cybertec.at, albeit with some complexity involved for the
monitoring client,
will be the best way and safest way forward.

Others may disagree.

GetCurrentStatementStartTimestamp() + total_time. The previous optimization was
premature, benchmark testing proves GetCurrentTimestamp() adds no
measurable overhead.

I am not sure if this benchmark is enough to build confidence in this approach.
Workloads with nested queries and tack. 'all' are now paying an even heavier
cost.

Also, quickly looking at v4, you probably want to call GetCurrentTimestamp()
when kind == PGSS_EXEC only.

Assert(kind == PGSS_PLAN || kind == PGSS_EXEC);
+ /*
+ * Get current timestamp before acquiring spinlock to avoid holding
+ * the lock during syscall.
+ */
+ stats_updated_at = GetCurrentTimestamp();
+
/*

[0]: /messages/by-id/ema0100e49-522e-4b8f-9c25-9257af1f0793@cybertec.at

--
Sami Imseih
Amazon Web Services (AWS)

#21