[PATCH] pg_stat_statements: add last_execution_start column

Started by Pavlo Golub3 months ago15 messageshackers
Jump to latest
#1Pavlo Golub
pavlo.golub@cybertec.at

Hi hackers,

This patch adds a `last_execution_start` column to `pg_stat_statements`,
recording the start timestamp of the most recent execution of each
tracked statement.

It supersedes the `stats_last_updated` series discussed here:

/messages/by-id/CAK7ymc+FxoVswo1ok_xDW-xPG-ZEZ8SAqCUkJ7WF04=0aQDvVQ@mail.gmail.com

The main criticism of that series was performance using
`GetCurrentTimestamp()` inside the stats accumulation. pgbench testing
confirmed the concern of roughly 5–6% TPS regression on a
short-transaction workload.

This patch takes a different approach. Instead of calling
`GetCurrentTimestamp()`, it uses `GetCurrentStatementStartTimestamp()`,
which simply is a variable reading.
There is no syscall and no additional work in the hot path.

Benchmark (16-vCPU, pgbench -c8 -j4 -T60, explicit transactions with 15
SELECT statements each):

master HEAD: ~4574 TPS (runs: 4636, 4585, 4500)
patched: ~4571 TPS (runs: 4577, 4560, 4575)
difference: ~0.1%

The column is initialized to the entry allocation time and updated on
every call to `pgss_store()`. It is reset by
`pg_stat_statements_reset()` but preserved across minmax-only resets,
consistent with `stats_since` semantics.

A monitoring query to find statements that have executed since the last
observation could look like:

SELECT query, calls, last_execution_start
FROM pg_stat_statements
WHERE last_execution_start >= $1 -- e.g. last check timestamp
ORDER BY last_execution_start DESC;

Patch attached.

Best regards,
Pavlo Golub

Attachments:

v1-0001-pg_stat_statements-Add-last_execution_start-colum.patchapplication/octet-stream; name=v1-0001-pg_stat_statements-Add-last_execution_start-colum.patchDownload+259-5
#2Sami Imseih
samimseih@gmail.com
In reply to: Pavlo Golub (#1)
Re: [PATCH] pg_stat_statements: add last_execution_start column

Hi,

It would be really great to get this field in, but I think the current
implementation still suffers from the same issue that is mentioned
here [1][/messages/by-id/CAA5RZ0sxPWP2xm8fxhscE+cUqC2VSFi9UZ9882BdGZ0MbGQNUA@mail.gmail.com%5D. We cannot rely on GetCurrentStatementStartTimestamp()
in-line because ExecutorEnd is deferred to the next execution in the
case of extended query protocol. I think we need to add it to track the
start timestamp in queryDesc. What do you think?

```
select pg_stat_statements_reset();

BEGIN;
select now() as now, clock_timestamp() as clock_timestamp,
pg_sleep($1) \bind 10 \g

\! sleep 10

SELECT now() as now, clock_timestamp() as clock_timestamp, $1 \bind 1 \g
END;

select last_execution_start, total_exec_time, substr(query, 1, 150) as
query from pg_stat_statements;
```
Notice how the last_execution_start reflects when the
"SELECT now() as now, clock_timestamp() as clock_timestamp, $1 " runs

--
Sami

[1]: [/messages/by-id/CAA5RZ0sxPWP2xm8fxhscE+cUqC2VSFi9UZ9882BdGZ0MbGQNUA@mail.gmail.com%5D

#3Pavlo Golub
pavlo.golub@cybertec.at
In reply to: Sami Imseih (#2)
Re[2]: [PATCH] pg_stat_statements: add last_execution_start column

Hey!

Hi,

It would be really great to get this field in, but I think the current
implementation still suffers from the same issue that is mentioned
here [1]. We cannot rely on GetCurrentStatementStartTimestamp()
in-line because ExecutorEnd is deferred to the next execution in the
case of extended query protocol.

Oh, I completely missed this fact from the previous thread! Thanks for
pointing this out!

I think we need to add it to track the
start timestamp in queryDesc. What do you think?

I agree this might be the proper way to go but I don't want to touch
core functionality
in this patch. I'm afraid it could lead to even more problems with
accepting.

Please, find my new patch that addresses the issue. It captures the
statement start timestamp
at ExecutorStart and then save it to per-nesting-level array to write it
later in ExecutorEnd.
I set the max nesting level to 64, just because it feels right. I have
no other motivation here.
For even deeper statements it will fall back to direct
GetCurrentStatementStartTimestamp().

I added new regression test based on your example.

Benchmark is still OK (16-vCPU, pgbench -c8 -j4 -T60, explicit
transactions with
15 SELECT statements each):

master HEAD: ~4356 TPS (runs: 4089, 4308, 4672)
patched v2: ~4383 TPS (runs: 4226, 4382, 4541)
difference: ~0.6%

How do you feel about it?

Best regards,
Pavlo Golub

Attachments:

v2-0001-pg_stat_statements-Add-last_execution_start-colum.patchapplication/octet-stream; name=v2-0001-pg_stat_statements-Add-last_execution_start-colum.patchDownload+363-12
#4Sami Imseih
samimseih@gmail.com
In reply to: Pavlo Golub (#3)
Re: Re[2]: [PATCH] pg_stat_statements: add last_execution_start column

Hi,

I think we need to add it to track the
start timestamp in queryDesc. What do you think?

I agree this might be the proper way to go but I don't want to touch
core functionality
in this patch. I'm afraid it could lead to even more problems with
accepting.

How do you feel about it?

I think adding to queryDesc->EState just like we do for other fields
that are consumed by pg_stat_statements, i.e. es_processed,
es_parallel_workers_to_launch, etc. is the proper pattern. The
difference is these other fields have use-cases outside of
pg_stat_statements, but I am not sure if that is a reason to
deviate from this existing pattern.

I also don't think the bounded array in v2 is acceptable.

GetCurrentStatementStartTimestamp() does not change for
nested levels, it's always the top-level statement start time,
so we don't need this.

+
+        /*
+         * Capture the statement start timestamp now, while it is still
+         * correct.  We cannot rely on reading it later in ExecutorEnd,
+         * because ExecutorEnd can be deferred until the next Bind message
+         * in the extended query protocol.
+         */
+        if (nesting_level < PGSS_MAX_NESTING_LEVEL)
+            pgss_exec_start[nesting_level] =
+                GetCurrentStatementStartTimestamp();

The same could be accomplished with a single static TimestampTz,
but this is still wrong, because what you will need pg_stat_statements to track
is an arbitrary number of query executions at any given time, not nesting
levels, along with their queryId and toplevel. So, this could get very complex
and does not scale well. Whereas just simply tracking the start time in
queryDesc->estate will be a much simpler solution.

Here is an example with the v2 where it breaks. The case being we
have multiple portals open at the same time being accessed as in the
case of setting a fetch size. When the portal is closed, the execution
start time for both portals is set relative to the second portal. I used
JDBC since I can't demo this using psql as \bind always runs the portal
to completion.

"
java DeferredEndTest
=== Portal A ===
Time: 2026-04-01 22:13:03.238
A row: id=1
A row: id=2
A row: id=3

--- sleeping 3 seconds ---

=== Portal B ===
Time: 2026-04-01 22:13:06.745
B row: id=10000
B row: id=9999
B row: id=9998

=== Closing Portal A (ExecutorEnd A) ===
Time: 2026-04-01 22:13:07.749

=== Closing Portal B (ExecutorEnd B) ===
Time: 2026-04-01 22:13:07.749

=== pg_stat_statements results ===
last_execution_start=2026-04-01 22:13:06.74567 exec_time=1002.44ms
query=SELECT id, pg_sleep($1), data FROM big_table ORDER BY id DESC
last_execution_start=2026-04-01 22:13:06.74567 exec_time=501.81ms
query=SELECT id, pg_sleep($1), data FROM big_table ORDER BY id
"

If we track the start time in queryDesc->EState, this will not be a
problem. For utility
statements, we can rely on a direct call to GetCurrentStatementStartTimestamp()

We also need to document that for toplevel = false statements, this
execution_start time
is that of the top level statement. I think that is acceptable.

--
Sami Imseih
Amazon Web Services (AWS)

Attachments:

DeferredEndTest.javaapplication/octet-stream; name=DeferredEndTest.javaDownload
#5Pavlo Golub
pavlo.golub@cybertec.at
In reply to: Sami Imseih (#4)
Re: Re[2]: [PATCH] pg_stat_statements: add last_execution_start column

I think adding to queryDesc->EState just like we do for other fields
that are consumed by pg_stat_statements, i.e. es_processed,
es_parallel_workers_to_launch, etc. is the proper pattern.

Yeah, makes sense. Thanks.

If we track the start time in queryDesc->EState, this will not be a
problem. For utility
statements, we can rely on a direct call to GetCurrentStatementStartTimestamp()

Thank you for the review. v3 addresses all concerns, I hope.

I switched from a bounded static array to statement start timestamp
stored in EState.
The field is set in pgss_ExecutorStart immediately after
standard_ExecutorStart creates the estate.

All pg_stat_statements regression tests pass (16/16).

Benchmark (16-vCPU, pgbench -c8 -j4 -T60):
master HEAD: ~4691 TPS (runs: 4849, 4588, 4635)
Patched v3: ~4744 TPS (runs: 4870, 4735, 4627)
Difference: ~1.1%

No measurable overhead.

The latest v3 patch attached.

We also need to document that for toplevel = false statements, this
execution_start time
is that of the top level statement. I think that is acceptable.

Docs updated

Show quoted text

--
Sami Imseih
Amazon Web Services (AWS)

Attachments:

v3-0001-pg_stat_statements-Add-last_execution_start-colum.patchapplication/octet-stream; name=v3-0001-pg_stat_statements-Add-last_execution_start-colum.patchDownload+354-12
#6Pavlo Golub
pavlo.golub@cybertec.at
In reply to: Pavlo Golub (#5)
[PATCH v4] pg_stat_statements: Add last_execution_start column

Rebased on current master (a3e6beba60e).

The only conflict was with commit 66366217822 (pg_stat_statements: Set
PlannedStmt to NULL after nested utility execution), which introduced a
local copy of pstmt->planOrigin and then nulled pstmt. The patch context
line was updated to reference saved_planOrigin instead. No functional
changes from v3.

All pg_stat_statements regression tests pass (16/16).

v4 patch attached.

Best regards,
Pavlo Golub

Attachments:

v4-0001-pg_stat_statements-Add-last_execution_start-colum.patchapplication/octet-stream; name=v4-0001-pg_stat_statements-Add-last_execution_start-colum.patchDownload+354-12
#7Anthonin Bonnefoy
anthonin.bonnefoy@datadoghq.com
In reply to: Pavlo Golub (#6)
Re: [PATCH v4] pg_stat_statements: Add last_execution_start column

Hi,

The code looks good. Some comments on the tests:

+-- last_execution_start timestamp tests
+--
+-- Reset stats first to avoid queryId collisions: simple "SELECT
const AS alias"
+-- queries all share the same normalized structure as the STMTTS queries above,
+-- so EXECSTART entries would otherwise land on the pre-existing STMTTS entry.
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;

I don't think the comment on queryId collision is necessary. Looking
at other tests, it's a common pattern to do a pgss reset before
starting a set of tests.

+-- last_execution_start should be set and >= ref_ts_upd1, because the
+-- statement started after we captured the reference timestamp.
+SELECT
+    query,
+    last_execution_start IS NOT NULL as has_ts,
+    last_execution_start >= :'ref_ts_upd1' as after_ref1,
+    stats_since <= last_execution_start as after_stats_since
+FROM pg_stat_statements
+WHERE query LIKE '%EXECSTART%'
+ORDER BY query COLLATE "C";

The test is also checking stats_since and shows that
last_execution_start < stats_since, is it necessary? The result is
also a bit confusing, but makes sense since stats_since is the time
the entry is created in pgss, which happens before ExecutorEnd.

+-- Run EXECSTART1 again and verify that last_execution_start is updated.
+SELECT now() AS ref_ts_upd2 \gset
+SELECT 1 AS "EXECSTART1";
+SELECT
+    query,
+    last_execution_start >= :'ref_ts_upd2' as updated
+FROM pg_stat_statements
+WHERE query LIKE '%EXECSTART1%';
+
+-- test filtering (monitoring use case): find statements that started
+-- executing since our last observation (ref_ts_upd2).
+SELECT count(*) as filtered_count
+FROM pg_stat_statements
+WHERE last_execution_start >= :'ref_ts_upd2'
+  AND query LIKE '%EXECSTART%';

The 'test filtering' feels redundant, we already have the list of
queries executed after ref_ts_upd2.

+SELECT now() AS ref_ts_ext \gset
+-- Use \bind \g to force the extended query protocol.
+SELECT pg_sleep(0.5) AS "DEFERRED_END" \bind \g
+-- Capture a timestamp *after* the sleep finishes but *before* the next
+-- extended-protocol statement replaces the unnamed portal.
+SELECT now() AS ref_ts_ext2 \gset
+-- The pg_sleep query's last_execution_start should be close to ref_ts_ext
+-- (before the sleep), NOT to ref_ts_ext2 (after the sleep).

I think the test doesn't work, or at least, doesn't check what you're
expecting. If I set last_execution_start to
GetCurrentStatementStartTimestamp(), it still passes.

When using the extended protocol with \bind \g, a Sync message is
immediately sent, and since you're in an implicit transaction,
finish_xact_command will drop the portal and call ExecutorEnd. Also,
doing a `SELECT now()` closes the unnamed portal, even when using the
simple protocol.

You could do something like that (and avoid a pg_sleep):

BEGIN;
SELECT 1 AS DEFERRED_END, statement_timestamp() AS query_stmt_ts \bind \gset
-- With an explicit transaction + extended protocol, the portal is left opened
-- ExecutorEnd will only be called when the next command is processed
END;

SELECT query, last_execution_start < :'query_stmt_ts' as before_next
FROM pg_stat_statements
WHERE query LIKE '%DEFERRED_END%'
ORDER BY query COLLATE "C";

With entry->last_execution_start =
GetCurrentStatementStartTimestamp(), you will have
last_execution_start > query_stmt_ts since it will be using END's
statement_start.
With entry->last_execution_start = exec_start, you use the statement
start set during the Bind message, which should be smaller than
statement_timestamp() (which is the statement start set during the
Execute message).

On that note, I wonder if relying on
GetCurrentStatementStartTimestamp() wouldn't be good enough? That
would remove the need to keep track of this state in EState.

Regards,
Anthonin Bonnefoy

#8Zsolt Parragi
zsolt.parragi@percona.com
In reply to: Pavlo Golub (#6)
Re: [PATCH v4] pg_stat_statements: Add last_execution_start column

Hello!

The write happens inside the spinlock and shared pgss->lock:

+		/* Record the start time of this execution */
+		entry->last_execution_start = exec_start;
+
 		SpinLockRelease(&entry->mutex);

And then at the read, we see:

 		/*
-		 * The spinlock is not required when reading these two as they are
+		 * The spinlock is not required when reading these three as they are
 		 * always updated when holding pgss->lock exclusively.
 		 */
 		stats_since = entry->stats_since;
 		minmax_stats_since = entry->minmax_stats_since;
+		last_execution_start = entry->last_execution_start;

But that comment doesn't seem true, it is updated with a shared lock, not exclusive. Shouldn't it be read together with the counters above this?

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Pavlo Golub (#6)
Re: [PATCH v4] pg_stat_statements: Add last_execution_start column

On 13.05.26 16:38, Pavlo Golub wrote:

Rebased on current master (a3e6beba60e).

The only conflict was with commit 66366217822 (pg_stat_statements: Set
PlannedStmt to NULL after nested utility execution), which introduced a
local copy of pstmt->planOrigin and then nulled pstmt. The patch context
line was updated to reference saved_planOrigin instead. No functional
changes from v3.

All pg_stat_statements regression tests pass (16/16).

v4 patch attached.

Here is a summary of the feedback comments we discussed at pgconf.dev:

(Some of these might have applied to previous patch versions.)

technical:

- obvious concerns: getting current time is expensive (inside spinlock!)
- documenting intended use of the field is good, I would put that back
- issues with long-running queries, start time vs. end time -- ignores
analytics use cases
- argument total_time of pgss_store() is not documented, should be fixed
first -- also why is this double??
- issues with extended query protocol can be surprising
- comment says it can be executed under spinlock but then it's not
- only top-level statements? -- not acceptable IMO

code style:

- INT64CONST is useless
- when changing "two" to "three" in a comment, just delete the number

meta/process:

- good: patch versioned, commit message up to date
- I confused me that a new thread was started in the middle of the
discussion.
- As a theoretical committer, I would like to get Sami's and Christoph's
feedback on the latest patch version.

#10Sami Imseih
samimseih@gmail.com
In reply to: Peter Eisentraut (#9)
Re: [PATCH v4] pg_stat_statements: Add last_execution_start column

Hi,

Sorry for the delay in response to this. I spent time today looking at
v4 and here are my comments.

1/
v4 captures the timestamp conditionally inside the pgss hook, meaning that
there is an ownership issue with this field. It's a core field, so only
core should set it, and extensions should only read it.

It is best that queryDesc->estate->es_exec_start is set inside ExecutorStart.

```
/* pgss_ExecutorStart (original) */
static void
pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
if (prev_ExecutorStart)
prev_ExecutorStart(queryDesc, eflags);
else
standard_ExecutorStart(queryDesc, eflags);

/*
* Capture the statement start timestamp into EState here after the estate
* has been created by standard_ExecutorStart.
*/
if (pgss_enabled(nesting_level) && queryDesc->plannedstmt->queryId
!= INT64CONST(0))
queryDesc->estate->es_exec_start = GetCurrentStatementStartTimestamp();
}
```

Also, in pgss_ProcessUtility, we should not call
GetCurrentStatementStartTimestamp() directly,
but rather should track an execution start time field in PlannedStmt.

```
@@ -1212,7 +1228,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const
char *queryString,
                                  NULL,
                                  0,
                                  0,
-                                 saved_planOrigin);
+                                 saved_planOrigin,
+                                 GetCurrentStatementStartTimestamp());
```

PlannedStmt docs already acknowledge it being used this way:

```
* For simplicity in APIs, we also wrap utility statements in PlannedStmt
* nodes; in such cases, commandType == CMD_UTILITY, the statement itself
* is in the utilityStmt field, and the rest of the struct is mostly dummy.
* (We do use canSetTag, stmt_location, stmt_len, and possibly queryId.)
```

Note that ProcessUtility receives a potentially read-only tree when called from
the plan cache (readOnlyTree == true). We need to copyObject before writing
es_exec_start, otherwise we corrupt the cached plan. standard_ProcessUtility
already does this copy internally, but hooks never see it:

```
/*
* If the given node tree is read-only, make a copy to ensure that parse
* transformations don't damage the original tree. This could be
* refactored to avoid making unnecessary copies in more cases, but it's
* not clear that it's worth a great deal of trouble over. Statements
* that are complex enough to be expensive to copy are exactly the ones
* we'd need to copy, so that only marginal savings seem possible.
*/
if (readOnlyTree)
pstmt = copyObject(pstmt);
```

So the copy needs to happen in ProcessUtility() itself, before calling into
hooks:

```
if (readOnlyTree)
pstmt = copyObject(pstmt);
pstmt->es_exec_start = GetCurrentStatementStartTimestamp();
```

Also note that currently pgss_ProcessUtility already sets
pstmt->queryId before calling
through to standard_ProcessUtility, which looks like a latent bug.

```
if (enabled)
pstmt->queryId = INT64CONST(0);
```
Adding TimestampTz to PlannedStmt also requires teaching gen_node_support.pl
about the type.

2/

Also, Why do we need to track last_execution_start time for the planner?
This field is for the last time the query execution start.

```
@@ -958,7 +965,8 @@ pgss_planner(Query *parse,
                                  NULL,
                                  0,
                                  0,
-                                 result->planOrigin);
+                                 result->planOrigin,
+                                 GetCurrentStatementStartTimestamp());
```

3/ DDL tests are needed. Right now, we are only testing DML.

4/ To some points made earlier here regarding the test [1]/messages/by-id/CAO6_XqoRLxL2+FsaE3JbwOH+oZ=CAjN8yK9-+p7QeH_LY5B+ag@mail.gmail.com

+SELECT now() AS ref_ts_ext \gset
+-- Use \bind \g to force the extended query protocol.
+SELECT pg_sleep(0.5) AS "DEFERRED_END" \bind \g
+-- Capture a timestamp *after* the sleep finishes but *before* the next
+-- extended-protocol statement replaces the unnamed portal.
+SELECT now() AS ref_ts_ext2 \gset
+-- The pg_sleep query's last_execution_start should be close to ref_ts_ext
+-- (before the sleep), NOT to ref_ts_ext2 (after the sleep).

I think the test doesn't work, or at least, doesn't check what you're
expecting. If I set last_execution_start to
GetCurrentStatementStartTimestamp(), it still passes.

correct. The deferred test needs to be done in an explicit transaction

You could do something like that (and avoid a pg_sleep):

BEGIN;
SELECT 1 AS DEFERRED_END, statement_timestamp() AS query_stmt_ts \bind \gset
-- With an explicit transaction + extended protocol, the portal is left opened
-- ExecutorEnd will only be called when the next command is processed
END;

That is a better, and much simpler test.

[1]: /messages/by-id/CAO6_XqoRLxL2+FsaE3JbwOH+oZ=CAjN8yK9-+p7QeH_LY5B+ag@mail.gmail.com

--
Sami Imseih
Amazon Web Services (AWS)

#11Pavlo Golub
pavlo.golub@cybertec.at
In reply to: Sami Imseih (#10)
[PATCH v5] pg_stat_statements: Add last_execution_start column

Thank you all for the thorough review in person and on the list. I hope
that v5 addresses all feedback.

Rebased on current master (e18b0cb7344).

Changes from v4:

The locking was inconsistent. The read is now inside the spinlock
together with the other counters, and the comment no longer hardcodes a
field count. Thanks to Zsolt Parragi for catching this.

The `es_exec_start` field is now set unconditionally in
`pgss_ExecutorStart`, not just when `pgss_enabled()`. Thanks to Sami
Imseih for pointing this out.
The planner hook no longer passes a timestamp to `pgss_store`.
`pgss_store` now skips the update when exec_start is zero.

`INT64CONST(0)` in the `pgss_ExecutorStart` was replaced with a plain 0
iconst. No idea why this was introduced in the first place.

The deferred-ExecutorEnd regression test is rewritten. The previous
version used `pg_sleep` with `\bind \g` inside an implicit transaction.
The new test uses an explicit `BEGIN/END` with `\bind \gset`. Thanks to
Anthonin Bonnefoy for the correct pattern and to Sami Imseih for
confirming it.

A DDL test is added to cover utility-statement paths. Not sure how it is
different from a DML statement though.

The redundant "test filtering" count query and the confusing
`stats_since` are removed from the basic update test. Extra comments
are also removed.

The documentation now includes a sample monitoring query, as it appeared
in
the original v1 submission.

Regarding Peter Eisentraut's concern (raised at pgconf.dev) about
top-level vs nested statement timestamps: for `toplevel = false` rows,
`last_execution_start` reflects the
start time of the enclosing top-level statement, not the nested one.
This is a consequence of how `GetCurrentStatementStartTimestamp()` works
and is documented. It means the field is not useful for timing
individual nested calls, but it is still useful for the stated purpose
of answering "when was this statement last active".

All pg_stat_statements regression tests that pass on master also pass
with this patch.

Magic number `PGSS_FILE_HEADER` is bumped to `0x20260610`.

v5 patch attached.

Best regards,
Pavlo Golub

Attachments:

v5-0001-pg_stat_statements-Add-last_execution_start-colum.patchapplication/octet-stream; name=v5-0001-pg_stat_statements-Add-last_execution_start-colum.patchDownload+363-13
#12Andres Freund
andres@anarazel.de
In reply to: Pavlo Golub (#11)
Re: [PATCH v5] pg_stat_statements: Add last_execution_start column

Hi,

On 2026-06-10 17:36:24 +0000, Pavlo Golub wrote:

@@ -1490,6 +1509,10 @@ pgss_store(const char *query, int64 queryId,
else if (planOrigin == PLAN_STMT_CACHE_CUSTOM)
entry->counters.custom_plan_calls++;

+		/* Record the start time of this execution, if provided */
+		if (exec_start != 0)
+			entry->last_execution_start = exec_start;
+
SpinLockRelease(&entry->mutex);
}

FWIW, I remain opposed to adding *any* additional thing under the spinlock. We
first need to fix the design of pgss, then we can start discussing expanding
it even further.

I've seen way too many instances being brought to their knee because of pgss
contention. We shouldn't make it even worse.

Greetings,

Andres Freund

#13Zsolt Parragi
zsolt.parragi@percona.com
In reply to: Pavlo Golub (#11)
Re: [PATCH v5] pg_stat_statements: Add last_execution_start column

I have one more question: should statements that failed have an execution start set?

CREATE TABLE t(d int);
INSERT INTO t VALUES (0);

SELECT pg_stat_statements_reset();
SELECT now() AS ref_ts \gset
SELECT 1/d FROM t;

SELECT calls, plans, query FROM pg_stat_statements
WHERE last_execution_start >= :'ref_ts'
AND query LIKE '%/d FROM t%';

#14Sami Imseih
samimseih@gmail.com
In reply to: Pavlo Golub (#11)
Re: [PATCH v5] pg_stat_statements: Add last_execution_start column

A DDL test is added to cover utility-statement paths. Not sure how it is
different from a DML statement though.

This is because DDL will go through ProcessUtility hook. I don't think
pg_stat_statements should be calling GetCurrentStatementStartTimestamp()
as you have it in v5. We should be setting this the start timestamp in
PlannedStmt during the ProcessUtility hook ( same pattern as ExecutorStart ).
see [1]/messages/by-id/CAA5RZ0uLbF_zcu64-K50fepq20s7GEYCsnizVBhm6eMmsfSa7Q@mail.gmail.com.

FWIW, I remain opposed to adding *any* additional thing under the spinlock. We
first need to fix the design of pgss, then we can start discussing expanding
it even further.

I do think having the execution start time is valuable, but I will do
agree that it should
wait until after the pg_stat_statements redesign is complete which is
being discussed
here [2]/messages/by-id/CAA5RZ0sQ+gDn-J85j1FzOdL1YjVYRegpmQpDiah1=REWZSZj+Q@mail.gmail.com.

--
Sami Imseih
Amazon Web Services (AWS)

[1]: /messages/by-id/CAA5RZ0uLbF_zcu64-K50fepq20s7GEYCsnizVBhm6eMmsfSa7Q@mail.gmail.com
[2]: /messages/by-id/CAA5RZ0sQ+gDn-J85j1FzOdL1YjVYRegpmQpDiah1=REWZSZj+Q@mail.gmail.com

#15Sami Imseih
samimseih@gmail.com
In reply to: Zsolt Parragi (#13)
Re: [PATCH v5] pg_stat_statements: Add last_execution_start column

I have one more question: should statements that failed have an
execution start set?

No, the should not have an execution start set. Any query execution
that fails will not accumulate statistics currently. This is because
the counters are updated on ExecutorEnd.

I do think there could be future work to accumulate some stats on
ExecutorStart; for example calls_started, called_completed instead
of just a calls counter. However the current design of pg_stat_statements
makes the spinlock contention worse. This could become viable once we
redesign pg_stat_statements to remove the SpinLock to update the counters.

--
Sami Imseih
Amazon Web Services (AWS)