[BUG] pg_stat_statements and extended query protocol

Started by Imseih (AWS), Samialmost 3 years ago37 messages
#1Imseih (AWS), Sami
simseih@amazon.com
2 attachment(s)

Doing some work with extended query protocol, I encountered the same
issue that was discussed in [1]/messages/by-id/c90890e7-9c89-c34f-d3c5-d5c763a34bd8@dunslane.net. It appears when a client is using
extended query protocol and sends an Execute message to a portal with
max_rows, and a portal is executed multiple times,
pg_stat_statements does not correctly track rows and calls.

Using the attached jdbc script, TEST.java, which can reproduce the issue
with setFetchSize of 100 with autocommit mode set to OFF. We can
see that although pg_class has 414 rows, the total call and
rows returned is 14. the first 4 * 100 fetches did not get accounted for,.

postgres=# select calls, rows, query from pg_stat_statements
postgres-# where queryid = '-1905758228217333571';
calls | rows | query
---------------------------------
1 | 14 | select * from pg_class
(1 row)

The execution work flow goes something like this:
ExecutorStart
ExecutorRun – which will be called multiple times to fetch from the
portal until the caller Closes the portal or the portal
runs out of rows.
ExecutorFinish
ExecutorEnd – portal is closed & pg_stat_statements stores the final rows processed

Where this breaks for pg_stat_statements is during ExecutorRun,
es_processed is reset to 0 every iteration. So by the time the portal
is closed, es_processed will only show the total from the last execute
message.

This appears to be only an issue for portals fetched
through extended query protocol and not explicit cursors
that go through simple query protocol (i.e. FETCH <cursor>)

I attached a JDBC script to repro the issue.

One potential fix I see is to introduce 2 new counters in the
ExecutionState which will track the total rows processed
and the number of calls. These counters can then be used
by pg_stat_statements. Attached is an experimental patch
which shows the correct number of rows and number of
calls.

postgres=# select calls, rows, query from pg_stat_statements
postgres-# where queryid = '-1905758228217333571';
calls | rows | query
---------------------------------
5 | 414 | select * from pg_class
(1 row)

[1]: /messages/by-id/c90890e7-9c89-c34f-d3c5-d5c763a34bd8@dunslane.net

Thanks


Sami Imseih
Amazon Web Services (AWS)

Attachments:

0001-correct-pg_stat_statements-tracking-of-portals.patchapplication/octet-stream; name=0001-correct-pg_stat_statements-tracking-of-portals.patchDownload
From 412c26f58082007e08189d0763cc8a2598cbec26 Mon Sep 17 00:00:00 2001
From: "Imseih (AWS)" <simseih@88665a22795f.ant.amazon.com>
Date: Tue, 24 Jan 2023 12:26:44 -0600
Subject: [PATCH 1/1] correct pg_stat_statements tracking of portals

---
 contrib/pg_stat_statements/pg_stat_statements.c | 13 +++++++++----
 src/backend/executor/execMain.c                 |  6 ++++++
 src/backend/executor/execUtils.c                |  2 ++
 src/include/nodes/execnodes.h                   |  4 +++-
 4 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index ad1fe44496..63bbfd06ba 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -339,7 +339,7 @@ static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 static void pgss_store(const char *query, uint64 queryId,
 					   int query_location, int query_len,
 					   pgssStoreKind kind,
-					   double total_time, uint64 rows,
+					   double total_time, uint64 rows, uint64 calls,
 					   const BufferUsage *bufusage,
 					   const WalUsage *walusage,
 					   const struct JitInstrumentation *jitusage,
@@ -855,6 +855,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 				   PGSS_INVALID,
 				   0,
 				   0,
+				   0,
 				   NULL,
 				   NULL,
 				   NULL,
@@ -940,6 +941,7 @@ pgss_planner(Query *parse,
 				   PGSS_PLAN,
 				   INSTR_TIME_GET_MILLISEC(duration),
 				   0,
+				   1,
 				   &bufusage,
 				   &walusage,
 				   NULL,
@@ -1000,6 +1002,7 @@ pgss_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, uint64 count,
 				 bool execute_once)
 {
 	exec_nested_level++;
+
 	PG_TRY();
 	{
 		if (prev_ExecutorRun)
@@ -1058,7 +1061,8 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
 				   queryDesc->plannedstmt->stmt_len,
 				   PGSS_EXEC,
 				   queryDesc->totaltime->total * 1000.0,	/* convert to msec */
-				   queryDesc->estate->es_processed,
+				   queryDesc->estate->es_total_processed,
+				   queryDesc->estate->es_calls,
 				   &queryDesc->totaltime->bufusage,
 				   &queryDesc->totaltime->walusage,
 				   queryDesc->estate->es_jit ? &queryDesc->estate->es_jit->instr : NULL,
@@ -1189,6 +1193,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 				   PGSS_EXEC,
 				   INSTR_TIME_GET_MILLISEC(duration),
 				   rows,
+				   1,
 				   &bufusage,
 				   &walusage,
 				   NULL,
@@ -1222,7 +1227,7 @@ static void
 pgss_store(const char *query, uint64 queryId,
 		   int query_location, int query_len,
 		   pgssStoreKind kind,
-		   double total_time, uint64 rows,
+		   double total_time, uint64 rows, uint64 calls,
 		   const BufferUsage *bufusage,
 		   const WalUsage *walusage,
 		   const struct JitInstrumentation *jitusage,
@@ -1348,7 +1353,7 @@ pgss_store(const char *query, uint64 queryId,
 		if (IS_STICKY(e->counters))
 			e->counters.usage = USAGE_INIT;
 
-		e->counters.calls[kind] += 1;
+		e->counters.calls[kind] += calls;
 		e->counters.total_time[kind] += total_time;
 
 		if (e->counters.calls[kind] == 1)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index a5115b9c1f..3ccf6f26e8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -341,6 +341,11 @@ standard_ExecutorRun(QueryDesc *queryDesc,
 	operation = queryDesc->operation;
 	dest = queryDesc->dest;
 
+	/*
+	 * update the counter for total_processed and calls
+	 */
+	queryDesc->estate->es_total_processed += queryDesc->estate->es_processed;
+	estate->es_calls++;
 	/*
 	 * startup tuple receiver, if we will be emitting tuples
 	 */
@@ -444,6 +449,7 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
 	MemoryContextSwitchTo(oldcontext);
 
 	estate->es_finished = true;
+	queryDesc->estate->es_total_processed += queryDesc->estate->es_processed;
 }
 
 /* ----------------------------------------------------------------
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index c33a3c0bec..98f9c556b8 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -146,6 +146,8 @@ CreateExecutorState(void)
 	estate->es_tupleTable = NIL;
 
 	estate->es_processed = 0;
+	estate->es_calls = 0;
+	estate->es_total_processed = 0;
 
 	estate->es_top_eflags = 0;
 	estate->es_instrument = 0;
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 20f4c8b35f..dd1f389a7d 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -657,7 +657,9 @@ typedef struct EState
 
 	List	   *es_tupleTable;	/* List of TupleTableSlots */
 
-	uint64		es_processed;	/* # of tuples processed */
+	uint64		es_processed;	/* # of tuples processed at the top level only */
+	uint64		es_calls;       /* # of calls */
+	uint64		es_total_processed; /* total # of tuples processed */
 
 	int			es_top_eflags;	/* eflags passed to ExecutorStart */
 	int			es_instrument;	/* OR of InstrumentOption flags */
-- 
2.37.1 (Apple Git-137.1)

Test.javaapplication/octet-stream; name=Test.javaDownload
#2Michael Paquier
michael@paquier.xyz
In reply to: Imseih (AWS), Sami (#1)
Re: [BUG] pg_stat_statements and extended query protocol

On Wed, Jan 25, 2023 at 11:22:04PM +0000, Imseih (AWS), Sami wrote:

Doing some work with extended query protocol, I encountered the same
issue that was discussed in [1]. It appears when a client is using
extended query protocol and sends an Execute message to a portal with
max_rows, and a portal is executed multiple times,
pg_stat_statements does not correctly track rows and calls.

Well, it is one of these areas where it seems to me we have never been
able to put a definition on what should be the correct behavior when
it comes to pg_stat_statements. Could it be possible to add some
regression tests using the recently-added \bind command and see how
this affects things? I would suggest splitting these into their own
SQL file, following an effort I have been doing recently for the
regression tests of pg_stat_statements. It would be good to know the
effects of this change for pg_stat_statements.track = (top|all), as
well.

@@ -657,7 +657,9 @@ typedef struct EState

List *es_tupleTable; /* List of TupleTableSlots */

-       uint64          es_processed;   /* # of tuples processed */
+       uint64          es_processed;   /* # of tuples processed at the top level only */
+       uint64          es_calls;       /* # of calls */
+       uint64          es_total_processed; /* total # of tuples processed */

So the root of the logic is here. Anything that makes the executor
structures larger freaks me out, FWIW, and that's quite an addition.
--
Michael

#3Imseih (AWS), Sami
simseih@amazon.com
In reply to: Michael Paquier (#2)
Re: [BUG] pg_stat_statements and extended query protocol

Well, it is one of these areas where it seems to me we have never been
able to put a definition on what should be the correct behavior when
it comes to pg_stat_statements.

What needs to be defined here is how pgss should account for # of rows
processed when A) a select goes through extended query (EP) protocol, and
B) it requires multiple executes to complete a portal.

The patch being suggested will treat every 'E' ( execute message ) to the same
portal as a new call ( pgss will increment the calls ) and the number of rows
processed will be accumulated for every 'E' message.

Currently, only the rows fetched in the last 'E' call to the portal is tracked by
pgss. This is incorrect.

Could it be possible to add some
regression tests using the recently-added \bind command and see how
this affects things?

\bind alone will not be enough as we also need a way to fetch from
a portal in batches. The code that needs to be exercised
as part of the test is exec_execute_message with max_rows != 0.

\bind will call exec_execute_message with max_rows = 0 to fetch
all the rows.

I would suggest splitting these into their own
SQL file, following an effort I have been doing recently for the
regression tests of pg_stat_statements. It would be good to know the
effects of this change for pg_stat_statements.track = (top|all), as
well.

Yes, I agree that proper test coverage is needed here. Will think
about how to accomplish this.

- uint64 es_processed; /* # of tuples processed */
+ uint64 es_processed; /* # of tuples processed at the top level only */
+ uint64 es_calls; /* # of calls */
+ uint64 es_total_processed; /* total # of tuples processed */

So the root of the logic is here. Anything that makes the executor
structures larger freaks me out, FWIW, and that's quite an addition.

I am not sure how to get around the changes to Estate and fixing this
Issue.

We could potentially only need the es_total_processed field and
continue to track calls in pgss.

es_total_processed in EState however is still needed.

Regards,

--

Sami Imseih
Amazon Web Services (AWS)

#4David Zhang
david.zhang@highgo.ca
In reply to: Imseih (AWS), Sami (#3)
Re: [BUG] pg_stat_statements and extended query protocol

Yes, I agree that proper test coverage is needed here. Will think
about how to accomplish this.

Tried to apply this patch to current master branch and the build was ok,
however it crashed during initdb with a message like below.

"performing post-bootstrap initialization ... Segmentation fault (core
dumped)"

If I remove this patch and recompile again, then "initdb -D $PGDATA" works.

Thanks,

David

#5Imseih (AWS), Sami
simseih@amazon.com
In reply to: David Zhang (#4)
Re: [BUG] pg_stat_statements and extended query protocol

If I remove this patch and recompile again, then "initdb -D $PGDATA" works.

It appears you must "make clean; make install" to correctly compile after
applying the patch.

Regards,

Sami Imseih
Amazon Web Services (AWS)

#6Michael Paquier
michael@paquier.xyz
In reply to: Imseih (AWS), Sami (#5)
Re: [BUG] pg_stat_statements and extended query protocol

On Sat, Mar 11, 2023 at 11:55:22PM +0000, Imseih (AWS), Sami wrote:

It appears you must "make clean; make install" to correctly compile after
applying the patch.

In a git repository, I've learnt to rely on this simple formula, even
if it means extra cycles when running ./configure:
git clean -d -x -f
--
Michael

#7Drouvot, Bertrand
bertranddrouvot.pg@gmail.com
In reply to: Michael Paquier (#2)
Re: [BUG] pg_stat_statements and extended query protocol

Hi,

On 3/2/23 8:27 AM, Michael Paquier wrote:

On Wed, Jan 25, 2023 at 11:22:04PM +0000, Imseih (AWS), Sami wrote:

Doing some work with extended query protocol, I encountered the same
issue that was discussed in [1]. It appears when a client is using
extended query protocol and sends an Execute message to a portal with
max_rows, and a portal is executed multiple times,
pg_stat_statements does not correctly track rows and calls.

Well, it is one of these areas where it seems to me we have never been
able to put a definition on what should be the correct behavior when
it comes to pg_stat_statements. Could it be possible to add some
regression tests using the recently-added \bind command and see how
this affects things? I would suggest splitting these into their own
SQL file, following an effort I have been doing recently for the
regression tests of pg_stat_statements. It would be good to know the
effects of this change for pg_stat_statements.track = (top|all), as
well.

@@ -657,7 +657,9 @@ typedef struct EState

List *es_tupleTable; /* List of TupleTableSlots */

-       uint64          es_processed;   /* # of tuples processed */
+       uint64          es_processed;   /* # of tuples processed at the top level only */
+       uint64          es_calls;       /* # of calls */
+       uint64          es_total_processed; /* total # of tuples processed */

So the root of the logic is here. Anything that makes the executor
structures larger freaks me out, FWIW, and that's quite an addition.
--
Michael

I wonder if we can't "just" make use of the "count" parameter passed to the
ExecutorRun_hook.

Something like?

- Increment a "es_total_processed" counter in pgss based on the count received in pgss_ExecutorRun()
- In pgss_ExecutorEnd(): substract the last count we received in pgss_ExecutorRun() and add queryDesc->estate->es_processed? (we'd
need to be able to distinguish when we should apply this rule or not).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#8David Zhang
david.zhang@highgo.ca
In reply to: Michael Paquier (#6)
Re: [BUG] pg_stat_statements and extended query protocol

It appears you must "make clean; make install" to correctly compile after
applying the patch.

In a git repository, I've learnt to rely on this simple formula, even
if it means extra cycles when running ./configure:
git clean -d -x -f

Thank you all for pointing out that it needs make clean first. After
make clean followed by recompile with the patch then both make check
from regression test and pg_stat_statements extension report all test
passed. So the current existing test cases can't really detect any
change from this patch, then it would be better to add some test cases
to cover this.

Best regards,

David

#9Imseih (AWS), Sami
simseih@amazon.com
In reply to: David Zhang (#8)
Re: [BUG] pg_stat_statements and extended query protocol

Sorry about the delay in response about this.

I was thinking about this and it seems to me we can avoid
adding new fields to Estate. I think a better place to track
rows and calls is in the Instrumentation struct.

--- a/src/include/executor/instrument.h
+++ b/src/include/executor/instrument.h
@@ -88,6 +88,8 @@ typedef struct Instrumentation
        double          nfiltered2;             /* # of tuples removed by "other" quals */
        BufferUsage bufusage;           /* total buffer usage */
        WalUsage        walusage;               /* total WAL usage */
+       int64       calls;
+       int64       rows_processed;
 } Instrumentation; 

If this is more palatable, I can prepare the patch.

Thanks for your feedback.

Regards.

Sami Imseih
Amazon Web Services (AWS)

#10Michael Paquier
michael@paquier.xyz
In reply to: Imseih (AWS), Sami (#9)
Re: [BUG] pg_stat_statements and extended query protocol

On Mon, Mar 20, 2023 at 09:41:12PM +0000, Imseih (AWS), Sami wrote:

I was thinking about this and it seems to me we can avoid
adding new fields to Estate. I think a better place to track
rows and calls is in the Instrumentation struct.

If this is more palatable, I can prepare the patch.

This indeed feels a bit more natural seen from here, after looking at
the code paths using an Instrumentation in the executor and explain,
for example. At least, this stresses me much less than adding 16
bytes to EState for something restricted to the extended protocol when
it comes to monitoring capabilities.
--
Michael

#11Imseih (AWS), Sami
simseih@amazon.com
In reply to: Michael Paquier (#10)
1 attachment(s)
Re: [BUG] pg_stat_statements and extended query protocol

This indeed feels a bit more natural seen from here, after looking at
the code paths using an Instrumentation in the executor and explain,
for example. At least, this stresses me much less than adding 16
bytes to EState for something restricted to the extended protocol when
it comes to monitoring capabilities.

Attached is the patch that uses Instrumentation.

I did not add any new tests, and we do not have anyway now
of setting a row count when going through the Execute message.
I think this may be need to be addressed separately since there
Seems to be. Gap in extended query protocol testing.

For this fix however, The JDBC test does show correct results.

Regards,

Sami Imseih
Amazon Web Services (AWS)

Attachments:

0001-Correct-accumulation-of-counters-for-extended-query-.patchapplication/octet-stream; name=0001-Correct-accumulation-of-counters-for-extended-query-.patchDownload
From a2516f302b7a4eca62cb6f80bedca0543114a4ba Mon Sep 17 00:00:00 2001
From: EC2 Default User <ec2-user@ip-172-31-26-221.ec2.internal>
Date: Tue, 21 Mar 2023 12:49:56 +0000
Subject: [PATCH 1/1] Correct accumulation of counters for extended query
 protocol

In pg_stat_statements, EState->es_processed cannot reliably
be used to count the rows processed for a statement that
goes through extended query protocol. This is because, such
a statement may go through ExecutorRun more than once, and
es_processed is reset after every call. This fix addresses
this issue by accuumulating the # of calls and row processed
in Instrumentation and the correct totals are available to
ExecutorEnd ( and pgss_store ) before the Instrumentation
is destroyed.
---
 .../pg_stat_statements/pg_stat_statements.c    | 18 ++++++++++++++----
 src/include/executor/instrument.h              |  2 ++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 5285c3f7fa..821cb4dad7 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -340,7 +340,7 @@ static void pgss_store(const char *query, uint64 queryId,
 					   int query_location, int query_len,
 					   pgssStoreKind kind,
 					   double total_time, uint64 rows,
-					   const BufferUsage *bufusage,
+					   int64 calls, const BufferUsage *bufusage,
 					   const WalUsage *walusage,
 					   const struct JitInstrumentation *jitusage,
 					   JumbleState *jstate);
@@ -857,6 +857,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 				   PGSS_INVALID,
 				   0,
 				   0,
+				   0,
 				   NULL,
 				   NULL,
 				   NULL,
@@ -942,6 +943,7 @@ pgss_planner(Query *parse,
 				   PGSS_PLAN,
 				   INSTR_TIME_GET_MILLISEC(duration),
 				   0,
+				   1,
 				   &bufusage,
 				   &walusage,
 				   NULL,
@@ -1014,6 +1016,12 @@ pgss_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, uint64 count,
 		exec_nested_level--;
 	}
 	PG_END_TRY();
+
+	if (queryDesc->totaltime)
+	{
+		queryDesc->totaltime->calls++;
+		queryDesc->totaltime->rows_processed += queryDesc->estate->es_processed;
+	}
 }
 
 /*
@@ -1060,7 +1068,8 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
 				   queryDesc->plannedstmt->stmt_len,
 				   PGSS_EXEC,
 				   queryDesc->totaltime->total * 1000.0,	/* convert to msec */
-				   queryDesc->estate->es_processed,
+				   queryDesc->totaltime->rows_processed,
+				   queryDesc->totaltime->calls,
 				   &queryDesc->totaltime->bufusage,
 				   &queryDesc->totaltime->walusage,
 				   queryDesc->estate->es_jit ? &queryDesc->estate->es_jit->instr : NULL,
@@ -1191,6 +1200,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 				   PGSS_EXEC,
 				   INSTR_TIME_GET_MILLISEC(duration),
 				   rows,
+				   1,
 				   &bufusage,
 				   &walusage,
 				   NULL,
@@ -1225,7 +1235,7 @@ pgss_store(const char *query, uint64 queryId,
 		   int query_location, int query_len,
 		   pgssStoreKind kind,
 		   double total_time, uint64 rows,
-		   const BufferUsage *bufusage,
+		   int64 calls, const BufferUsage *bufusage,
 		   const WalUsage *walusage,
 		   const struct JitInstrumentation *jitusage,
 		   JumbleState *jstate)
@@ -1350,7 +1360,7 @@ pgss_store(const char *query, uint64 queryId,
 		if (IS_STICKY(e->counters))
 			e->counters.usage = USAGE_INIT;
 
-		e->counters.calls[kind] += 1;
+		e->counters.calls[kind] += calls;
 		e->counters.total_time[kind] += total_time;
 
 		if (e->counters.calls[kind] == 1)
diff --git a/src/include/executor/instrument.h b/src/include/executor/instrument.h
index 87e5e2183b..5436564568 100644
--- a/src/include/executor/instrument.h
+++ b/src/include/executor/instrument.h
@@ -88,6 +88,8 @@ typedef struct Instrumentation
 	double		nfiltered2;		/* # of tuples removed by "other" quals */
 	BufferUsage bufusage;		/* total buffer usage */
 	WalUsage	walusage;		/* total WAL usage */
+	int64       calls;			/* # of total calls to ExecutorRun */
+	int64       rows_processed;	/* # of total rows processed in ExecutorRun */
 } Instrumentation;
 
 typedef struct WorkerInstrumentation
-- 
2.39.2

#12Drouvot, Bertrand
bertranddrouvot.pg@gmail.com
In reply to: Imseih (AWS), Sami (#11)
Re: [BUG] pg_stat_statements and extended query protocol

Hi,

On 3/21/23 2:16 PM, Imseih (AWS), Sami wrote:

This indeed feels a bit more natural seen from here, after looking at
the code paths using an Instrumentation in the executor and explain,
for example. At least, this stresses me much less than adding 16
bytes to EState for something restricted to the extended protocol when
it comes to monitoring capabilities.

Attached is the patch that uses Instrumentation.

Thanks, I think this new approach makes sense.

-                                          const BufferUsage *bufusage,
+                                          int64 calls, const BufferUsage *bufusage,

What about using an uint64 for calls? That seems more appropriate to me (even if
queryDesc->totaltime->calls will be passed (which is int64), but that's already
also the case for the "rows" argument and queryDesc->totaltime->rows_processed)

@@ -88,6 +88,8 @@ typedef struct Instrumentation
         double          nfiltered2;             /* # of tuples removed by "other" quals */
         BufferUsage bufusage;           /* total buffer usage */
         WalUsage        walusage;               /* total WAL usage */
+       int64       calls;                      /* # of total calls to ExecutorRun */
+       int64       rows_processed;     /* # of total rows processed in ExecutorRun */

I'm not sure it's worth mentioning that the new counters are "currently" used with the ExecutorRun.

What about just "total calls" and "total rows processed" (or "total rows", see below)?

Also, I wonder if "rows" (and not rows_processed) would not be a better naming.

Those last comments regarding the Instrumentation are done because ISTM that at the end their usage
could vary depending of the use case of the Instrumentation.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#13Imseih (AWS), Sami
simseih@amazon.com
In reply to: Drouvot, Bertrand (#12)
1 attachment(s)
Re: [BUG] pg_stat_statements and extended query protocol

What about using an uint64 for calls? That seems more appropriate to me (even if
queryDesc->totaltime->calls will be passed (which is int64), but that's already
also the case for the "rows" argument and queryDesc->totaltime->rows_processed)

That's fair

I'm not sure it's worth mentioning that the new counters are "currently" used with the ExecutorRun.

Sure, I suppose these fields could be used outside of ExecutorRun. Good point.

Also, I wonder if "rows" (and not rows_processed) would not be a better naming.

Agree.

I went with rows_processed initially, since it was accumulating es_processed,
but as the previous point, this instrumentation could be used outside of
ExecutorRun.

v3 addresses the comments.

Regards,

--
Sami Imseih
Amazon Web Services (AWS)

Attachments:

v3-0001-Correct-accumulation-of-counters-for-extended-query-.patchapplication/octet-stream; name=v3-0001-Correct-accumulation-of-counters-for-extended-query-.patchDownload
From 9427777ccc6fc84a087b9845bce58054b628458d Mon Sep 17 00:00:00 2001
From: EC2 Default User <ec2-user@ip-172-31-26-221.ec2.internal>
Date: Wed, 22 Mar 2023 20:14:51 +0000
Subject: [PATCH 1/1] Correct accumulation of counters for extended query
 protocol

In pg_stat_statements, EState->es_processed cannot reliably
be used to count the rows processed for a statement that
goes through extended query protocol. This is because, such
a statement may go through ExecutorRun more than once, and
es_processed is reset after every call. This fix addresses
this issue by accuumulating the # of calls and row processed
in Instrumentation and the correct totals are available to
ExecutorEnd ( and pgss_store ) before the Instrumentation
is destroyed.

Discussion: https://www.postgresql.org/message-id/flat/EBE6C507-9EB6-4142-9E4D-38B1673363A7%40amazon.com
---
 .../pg_stat_statements/pg_stat_statements.c    | 18 ++++++++++++++----
 src/include/executor/instrument.h              |  2 ++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 5285c3f7fa..f6d5ac9402 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -340,7 +340,7 @@ static void pgss_store(const char *query, uint64 queryId,
 					   int query_location, int query_len,
 					   pgssStoreKind kind,
 					   double total_time, uint64 rows,
-					   const BufferUsage *bufusage,
+					   uint64 calls, const BufferUsage *bufusage,
 					   const WalUsage *walusage,
 					   const struct JitInstrumentation *jitusage,
 					   JumbleState *jstate);
@@ -857,6 +857,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 				   PGSS_INVALID,
 				   0,
 				   0,
+				   0,
 				   NULL,
 				   NULL,
 				   NULL,
@@ -942,6 +943,7 @@ pgss_planner(Query *parse,
 				   PGSS_PLAN,
 				   INSTR_TIME_GET_MILLISEC(duration),
 				   0,
+				   1,
 				   &bufusage,
 				   &walusage,
 				   NULL,
@@ -1014,6 +1016,12 @@ pgss_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, uint64 count,
 		exec_nested_level--;
 	}
 	PG_END_TRY();
+
+	if (queryDesc->totaltime)
+	{
+		queryDesc->totaltime->calls++;
+		queryDesc->totaltime->rows += queryDesc->estate->es_processed;
+	}
 }
 
 /*
@@ -1060,7 +1068,8 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
 				   queryDesc->plannedstmt->stmt_len,
 				   PGSS_EXEC,
 				   queryDesc->totaltime->total * 1000.0,	/* convert to msec */
-				   queryDesc->estate->es_processed,
+				   queryDesc->totaltime->rows,
+				   queryDesc->totaltime->calls,
 				   &queryDesc->totaltime->bufusage,
 				   &queryDesc->totaltime->walusage,
 				   queryDesc->estate->es_jit ? &queryDesc->estate->es_jit->instr : NULL,
@@ -1191,6 +1200,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 				   PGSS_EXEC,
 				   INSTR_TIME_GET_MILLISEC(duration),
 				   rows,
+				   1,
 				   &bufusage,
 				   &walusage,
 				   NULL,
@@ -1225,7 +1235,7 @@ pgss_store(const char *query, uint64 queryId,
 		   int query_location, int query_len,
 		   pgssStoreKind kind,
 		   double total_time, uint64 rows,
-		   const BufferUsage *bufusage,
+		   uint64 calls, const BufferUsage *bufusage,
 		   const WalUsage *walusage,
 		   const struct JitInstrumentation *jitusage,
 		   JumbleState *jstate)
@@ -1350,7 +1360,7 @@ pgss_store(const char *query, uint64 queryId,
 		if (IS_STICKY(e->counters))
 			e->counters.usage = USAGE_INIT;
 
-		e->counters.calls[kind] += 1;
+		e->counters.calls[kind] += calls;
 		e->counters.total_time[kind] += total_time;
 
 		if (e->counters.calls[kind] == 1)
diff --git a/src/include/executor/instrument.h b/src/include/executor/instrument.h
index 87e5e2183b..33f30832d7 100644
--- a/src/include/executor/instrument.h
+++ b/src/include/executor/instrument.h
@@ -88,6 +88,8 @@ typedef struct Instrumentation
 	double		nfiltered2;		/* # of tuples removed by "other" quals */
 	BufferUsage bufusage;		/* total buffer usage */
 	WalUsage	walusage;		/* total WAL usage */
+	int64       calls;			/* total calls */
+	int64       rows;	/* total rows */
 } Instrumentation;
 
 typedef struct WorkerInstrumentation
-- 
2.39.2

#14Drouvot, Bertrand
bertranddrouvot.pg@gmail.com
In reply to: Imseih (AWS), Sami (#13)
Re: [BUG] pg_stat_statements and extended query protocol

Hi,

On 3/22/23 10:35 PM, Imseih (AWS), Sami wrote:

What about using an uint64 for calls? That seems more appropriate to me (even if
queryDesc->totaltime->calls will be passed (which is int64), but that's already
also the case for the "rows" argument and queryDesc->totaltime->rows_processed)

That's fair

I'm not sure it's worth mentioning that the new counters are "currently" used with the ExecutorRun.

Sure, I suppose these fields could be used outside of ExecutorRun. Good point.

Also, I wonder if "rows" (and not rows_processed) would not be a better naming.

Agree.

I went with rows_processed initially, since it was accumulating es_processed,
but as the previous point, this instrumentation could be used outside of
ExecutorRun.

v3 addresses the comments.

Thanks! LGTM and also do confirm that, with the patch, the JDBC test does show the correct results.

That said, not having a test (for the reasons you explained up-thread) associated with the patch worry me a bit.

But, I'm tempted to say that adding new tests could be addressed separately though (as this patch looks pretty straightforward).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#15Michael Paquier
michael@paquier.xyz
In reply to: Drouvot, Bertrand (#14)
Re: [BUG] pg_stat_statements and extended query protocol

On Thu, Mar 23, 2023 at 09:33:16AM +0100, Drouvot, Bertrand wrote:

Thanks! LGTM and also do confirm that, with the patch, the JDBC test
does show the correct results.

How does JDBC test that? Does it have a dependency on
pg_stat_statements?

That said, not having a test (for the reasons you explained
up-thread) associated with the patch worry me a bit.

Same impression here.

But, I'm tempted to say that adding new tests could be addressed
separately though (as this patch looks pretty straightforward).

Even small patches can have gotchas. I think that this should have
tests in-core rather than just depend on JDBC and hope for the best.
Even if \bind does not allow that, we could use an approach similar to
libpq_pipeline, for example, depending on pg_stat_statements for the
validation with a test module in src/test/modules/?
--
Michael

#16Imseih (AWS), Sami
simseih@amazon.com
In reply to: Michael Paquier (#15)
Re: [BUG] pg_stat_statements and extended query protocol

How does JDBC test that? Does it have a dependency on
pg_stat_statements?

No, at the start of the thread, a sample jdbc script was attached.
But I agree, we need to add test coverage. See below.

But, I'm tempted to say that adding new tests could be addressed
separately though (as this patch looks pretty straightforward).

Even small patches can have gotchas. I think that this should have
tests in-core rather than just depend on JDBC and hope for the best.
Even if \bind does not allow that, we could use an approach similar to
libpq_pipeline, for example, depending on pg_stat_statements for the
validation with a test module in src/test/modules/?

Yes, that is possible but we will need to add a libpq API
that allows the caller to pass in a "fetch size".
PQsendQueryParams does not take in a "fetch size",
so it returns all rows, through PQsendQueryParams

https://github.com/postgres/postgres/blob/master/src/interfaces/libpq/fe-exec.c#L1882

Adding such an API that takes in a "fetch size" will be beneficial
not just for this test, but I can see it enabling another psql meta command,
similar to \bind but that takes in a "fetch size".

Regards

--
Sami Imseih
Amazon Web Services (AWS)

#17Yugo NAGATA
nagata@sraoss.co.jp
In reply to: Drouvot, Bertrand (#14)
Re: [BUG] pg_stat_statements and extended query protocol

On Thu, 23 Mar 2023 09:33:16 +0100
"Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> wrote:

Hi,

On 3/22/23 10:35 PM, Imseih (AWS), Sami wrote:

What about using an uint64 for calls? That seems more appropriate to me (even if
queryDesc->totaltime->calls will be passed (which is int64), but that's already
also the case for the "rows" argument and queryDesc->totaltime->rows_processed)

That's fair

I'm not sure it's worth mentioning that the new counters are "currently" used with the ExecutorRun.

Sure, I suppose these fields could be used outside of ExecutorRun. Good point.

Also, I wonder if "rows" (and not rows_processed) would not be a better naming.

Agree.

I went with rows_processed initially, since it was accumulating es_processed,
but as the previous point, this instrumentation could be used outside of
ExecutorRun.

v3 addresses the comments.

I wonder that this patch changes the meaning of "calls" in the pg_stat_statement
view a bit; previously it was "Number of times the statement was executed" as
described in the documentation, but currently this means "Number of times the
portal was executed". I'm worried that this makes users confused. For example,
a user may think the average numbers of rows returned by a statement is given by
rows/calls, but it is not always correct because some statements could be executed
with multiple portal runs.

Although it might not big issue to users, I think it is better to add an explanation
to the doc for clarification.

Regards,
Yugo Nagata

Thanks! LGTM and also do confirm that, with the patch, the JDBC test does show the correct results.

That said, not having a test (for the reasons you explained up-thread) associated with the patch worry me a bit.

But, I'm tempted to say that adding new tests could be addressed separately though (as this patch looks pretty straightforward).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

--
Yugo NAGATA <nagata@sraoss.co.jp>

#18Imseih (AWS), Sami
simseih@amazon.com
In reply to: Yugo NAGATA (#17)
Re: [BUG] pg_stat_statements and extended query protocol

I wonder that this patch changes the meaning of "calls" in the pg_stat_statement
view a bit; previously it was "Number of times the statement was executed" as
described in the documentation, but currently this means "Number of times the
portal was executed". I'm worried that this makes users confused. For example,
a user may think the average numbers of rows returned by a statement is given by
rows/calls, but it is not always correct because some statements could be executed
with multiple portal runs.

I don't think it changes the meaning of "calls" in pg_stat_statements, since every
time the app fetches X amount of rows from a portal, it's still done in a separate
execution, and thus a separate call.

I agree, the meaning of "calls" should be clarified in docs.

Regards,

Sami Imseih
Amazon Web Services (AWS)

#19Michael Paquier
michael@paquier.xyz
In reply to: Imseih (AWS), Sami (#16)
Re: [BUG] pg_stat_statements and extended query protocol

On Thu, Mar 23, 2023 at 01:54:05PM +0000, Imseih (AWS), Sami wrote:

Yes, that is possible but we will need to add a libpq API
that allows the caller to pass in a "fetch size".
PQsendQueryParams does not take in a "fetch size",
so it returns all rows, through PQsendQueryParams

https://github.com/postgres/postgres/blob/master/src/interfaces/libpq/fe-exec.c#L1882

Adding such an API that takes in a "fetch size" will be beneficial
not just for this test, but I can see it enabling another psql meta command,
similar to \bind but that takes in a "fetch size".

So... The idea here is to set a custom fetch size so as the number of
calls can be deterministic in the tests, still more than 1 for the
tests we'd have. And your point is that libpq enforces always 0 when
sending the EXECUTE message causing it to always return all the rows
for any caller of PQsendQueryGuts().

The extended protocol allows that, so you would like a libpq API to
have more control of what we send with EXECUTE:
https://www.postgresql.org/docs/current/protocol-overview.html#PROTOCOL-QUERY-CONCEPTS

The extended query protocol would require multiple 'E' messages, but
we would not need multiple describe or bind messages, meaning that
this cannot just be an extra flavor PQsendQueryParams(). Am I gettig
that right? The correct API design seems tricky, to say the least.
Perhaps requiring this much extra work in libpq for the purpose of
having some tests in this thread is not a brilliant idea.. Or perhaps
we could just do it and have something a-la-JDBC with two routines?
That would be one libpq routine for describe/bind and one for execute
where the limit can be given by the caller in the latter case, similar
to sendDescribeStatement() and sendExecute() in
QueryExecutorImpl.java.
--
Michael

#20Imseih (AWS), Sami
simseih@amazon.com
In reply to: Michael Paquier (#19)
Re: [BUG] pg_stat_statements and extended query protocol

So... The idea here is to set a custom fetch size so as the number of
calls can be deterministic in the tests, still more than 1 for the
tests we'd have. And your point is that libpq enforces always 0 when
sending the EXECUTE message causing it to always return all the rows
for any caller of PQsendQueryGuts().

That is correct.

The extended protocol allows that, so you would like a libpq API to
have more control of what we send with EXECUTE:
https://www.postgresql.org/docs/current/protocol-overview.html#PROTOCOL-QUERY-CONCEPTS

The extended query protocol would require multiple 'E' messages, but
we would not need multiple describe or bind messages, meaning that
this cannot just be an extra flavor PQsendQueryParams(). Am I gettig
that right?

Correct, there will need to be separate APIs for Parse/Bind, Execute
and Close of a Portal.

The correct API design seems tricky, to say the least.
Perhaps requiring this much extra work in libpq for the purpose of
having some tests in this thread is not a brilliant idea.. Or perhaps
we could just do it and have something a-la-JDBC with two routines?
That would be one libpq routine for describe/bind and one for execute
where the limit can be given by the caller in the latter case, similar
to sendDescribeStatement() and sendExecute() in
QueryExecutorImpl.java.

I am not too clear on your point here. ISTM you are suggesting adding
new libpq APis similar to JDBC, which is what I am also suggesting.

Did I understand correctly?

--
Sami Imseih
Amazon Web Services (AWS)

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Imseih (AWS), Sami (#20)
Re: [BUG] pg_stat_statements and extended query protocol

"Imseih (AWS), Sami" <simseih@amazon.com> writes:

So... The idea here is to set a custom fetch size so as the number of
calls can be deterministic in the tests, still more than 1 for the
tests we'd have. And your point is that libpq enforces always 0 when
sending the EXECUTE message causing it to always return all the rows
for any caller of PQsendQueryGuts().

That is correct.

Hi, I took a quick look through this thread, and I have a couple of
thoughts:

* Yeah, it'd be nice to have an in-core test, but it's folly to insist
on one that works via libpq and psql. That requires a whole new set
of features that you're apparently designing on-the-fly with no other
use cases in mind. I don't think that will accomplish much except to
ensure that this bug fix doesn't make it into v16.

* I don't understand why it was thought good to add two new counters
to struct Instrumentation. In EXPLAIN ANALYZE cases those will be
wasted space *per plan node*, not per Query.

* It also seems quite bizarre to add counters to a core data structure
and then leave it to pg_stat_statements to maintain them. (BTW, I didn't
much care for putting that maintenance into pgss_ExecutorRun without
updating its header comment.)

I'm inclined to think that adding the counters to struct EState is
fine. That's 304 bytes already on 64-bit platforms, another 8 or 16
won't matter.

Also, I'm doubtful that counting calls this way is a great idea,
which would mean you only need one new counter field not two. The
fact that you're having trouble defining what it means certainly
suggests that the implementation is out front of the design.

In short, what I think I'd suggest is adding an es_total_processed
field to EState and having standard_ExecutorRun do "es_total_processed
+= es_processed" near the end, then just change pg_stat_statements to
use es_total_processed not es_processed.

regards, tom lane

#22Imseih (AWS), Sami
simseih@amazon.com
In reply to: Tom Lane (#21)
Re: [BUG] pg_stat_statements and extended query protocol

* Yeah, it'd be nice to have an in-core test, but it's folly to insist
on one that works via libpq and psql. That requires a whole new set
of features that you're apparently designing on-the-fly with no other
use cases in mind. I don't think that will accomplish much except to
ensure that this bug fix doesn't make it into v16.

I agree, Solving the lack of in-core testing for this area belong in a
different discussion.

* I don't understand why it was thought good to add two new counters
to struct Instrumentation. In EXPLAIN ANALYZE cases those will be
wasted space *per plan node*, not per Query.

Indeed, looking at ExplainNode, Instrumentation struct is allocated
per node and the new fields will be wasted space. Thanks for highlighting
this.

* It also seems quite bizarre to add counters to a core data structure
and then leave it to pg_stat_statements to maintain them.

That is fair point

I'm inclined to think that adding the counters to struct EState is
fine. That's 304 bytes already on 64-bit platforms, another 8 or 16
won't matter.

With the point you raise about Insrumentation per node, Estate
is the better place for the new counters.

Also, I'm doubtful that counting calls this way is a great idea,
which would mean you only need one new counter field not two. The
fact that you're having trouble defining what it means certainly
suggests that the implementation is out front of the design.

ISTM you are not in agreement that a call count should be incremented
after every executorRun, but should only be incremented after
the portal is closed, at executorEnd. Is that correct?

FWIW, The rationale for incrementing calls in executorRun is that calls refers
to the number of times a client executes a portal, whether partially or to completion.

Clients can also fetch rows from portals at various rates, so to determine the
"rows per call" accurately from pg_stat_statements, we should track a calls as
the number of times executorRun was called on a portal.

In short, what I think I'd suggest is adding an es_total_processed
field to EState and having standard_ExecutorRun do "es_total_processed
+= es_processed" near the end, then just change pg_stat_statements to
use es_total_processed not es_processed.

The original proposal in 0001-correct-pg_stat_statements-tracking-of-portals.patch,
was to add a "calls" and "es_total_processed" field to EState.

Regards,

Sami Imseih
Amazon Web Services (AWS)

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Imseih (AWS), Sami (#22)
Re: [BUG] pg_stat_statements and extended query protocol

"Imseih (AWS), Sami" <simseih@amazon.com> writes:

Also, I'm doubtful that counting calls this way is a great idea,
which would mean you only need one new counter field not two. The
fact that you're having trouble defining what it means certainly
suggests that the implementation is out front of the design.

ISTM you are not in agreement that a call count should be incremented
after every executorRun, but should only be incremented after
the portal is closed, at executorEnd. Is that correct?

Right. That makes the "call count" equal to the number of times the
query is invoked.

FWIW, The rationale for incrementing calls in executorRun is that calls refers
to the number of times a client executes a portal, whether partially or to completion.

Why should that be the definition? Partial execution of a portal
might be something that is happening at the driver level, behind the
user's back. You can't make rational calculations of, say, plan
time versus execution time if that's how "calls" is measured.

regards, tom lane

#24Imseih (AWS), Sami
simseih@amazon.com
In reply to: Tom Lane (#23)
Re: [BUG] pg_stat_statements and extended query protocol

Why should that be the definition? Partial execution of a portal
might be something that is happening at the driver level, behind the
user's back. You can't make rational calculations of, say, plan
time versus execution time if that's how "calls" is measured.

Correct, and there are also drivers that implement fetch size using
cursor statements, i.e. DECLARE CURSOR, FETCH CURSOR,
and each FETCH gets counted as 1 call.

I wonder if the right answer here is to track fetches as
a separate counter in pg_stat_statements, in which fetch
refers to the number of times a portal is executed?

Regards,

Sami Imseih
Amazon Web Services (AWS)

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Imseih (AWS), Sami (#24)
Re: [BUG] pg_stat_statements and extended query protocol

"Imseih (AWS), Sami" <simseih@amazon.com> writes:

I wonder if the right answer here is to track fetches as
a separate counter in pg_stat_statements, in which fetch
refers to the number of times a portal is executed?

Maybe, but is there any field demand for that?

IMV, the existing behavior is that we count one "call" per overall
query execution (that is, per ExecutorEnd invocation). The argument
that that's a bug and we should change it seems unsupportable to me,
and even the argument that we should also count ExecutorRun calls
seems quite lacking in evidence. We clearly do need to fix the
reported rowcount for cases where ExecutorRun is invoked more than
once per ExecutorEnd call; but I think that's sufficient.

regards, tom lane

#26Imseih (AWS), Sami
simseih@amazon.com
In reply to: Tom Lane (#25)
Re: [BUG] pg_stat_statements and extended query protocol

Maybe, but is there any field demand for that?

I don't think there is.

We clearly do need to fix the
reported rowcount for cases where ExecutorRun is invoked more than
once per ExecutorEnd call; but I think that's sufficient.

Sure, the original proposed fix, but with tracking the es_total_processed
only in Estate should be enough for now.

Regards,

Sami Imseih
Amazon Web Services (AWS)

#27Michael Paquier
michael@paquier.xyz
In reply to: Imseih (AWS), Sami (#26)
Re: [BUG] pg_stat_statements and extended query protocol

On Tue, Apr 04, 2023 at 03:29:07AM +0000, Imseih (AWS), Sami wrote:

We clearly do need to fix the
reported rowcount for cases where ExecutorRun is invoked more than
once per ExecutorEnd call; but I think that's sufficient.

Sure, the original proposed fix, but with tracking the es_total_processed
only in Estate should be enough for now.

I was looking back at this thread, and the suggestion to use one field
in EState sounds fine to me. Sami, would you like to send a new
version of the patch (simplified version based on v1)?
--
Michael

#28Imseih (AWS), Sami
simseih@amazon.com
In reply to: Michael Paquier (#27)
1 attachment(s)
Re: [BUG] pg_stat_statements and extended query protocol

I was looking back at this thread, and the suggestion to use one field
in EState sounds fine to me. Sami, would you like to send a new
version of the patch (simplified version based on v1)?

Here is v4.

The "calls" tracking is removed from Estate. Unlike v1 however,
I added a check for the operation type. Inside ExecutorRun,
es_total_processed is incremented when the operation is
a SELECT. This check is done for es_processed as well inside
executorRun -> ExecutePlan.

For non select operations, es_total_processed is set to
es_processed in executorfinish. This is because the modify
plan nodes set es_processed outside of execMain.c

Regards,

Sami Imseih
Amazon Web Services (AWS)

Attachments:

v4-0001-Fix-row-tracking-in-pg_stat_statements.patchapplication/octet-stream; name=v4-0001-Fix-row-tracking-in-pg_stat_statements.patchDownload
From 071ffc0ac50aa9d5e0d8bb7c9fbf07c69e6e628f Mon Sep 17 00:00:00 2001
From: EC2 Default User <ec2-user@ip-172-31-26-221.ec2.internal>
Date: Tue, 4 Apr 2023 21:32:43 +0000
Subject: [PATCH 1/1] Fix row tracking in pg_stat_statements.

In pg_stat_statements, EState->es_processed cannot reliably
be used to count the rows processed for a statement that
goes through extended query protocol. This is because, such
a statement may execute a portal more than once, and
es_processed is reset after every execution. This fix addresses
this issue by accuumulating the # row processed for all the
executions of a portal.

Author: Sami Imseih <simseih@amazon.com>
Discussion: https://www.postgresql.org/message-id/flat/EBE6C507-9EB6-4142-9E4D-38B1673363A7%40amazon.com
---
 contrib/pg_stat_statements/pg_stat_statements.c |  2 +-
 src/backend/executor/execMain.c                 | 17 +++++++++++++++++
 src/include/nodes/execnodes.h                   |  5 ++++-
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 5285c3f7fa..68215bb2e3 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1060,7 +1060,7 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
 				   queryDesc->plannedstmt->stmt_len,
 				   PGSS_EXEC,
 				   queryDesc->totaltime->total * 1000.0,	/* convert to msec */
-				   queryDesc->estate->es_processed,
+				   queryDesc->estate->es_total_processed,
 				   &queryDesc->totaltime->bufusage,
 				   &queryDesc->totaltime->walusage,
 				   queryDesc->estate->es_jit ? &queryDesc->estate->es_jit->instr : NULL,
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 1b007dc32c..d9164c8245 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -372,6 +372,15 @@ standard_ExecutorRun(QueryDesc *queryDesc,
 					execute_once);
 	}
 
+	/*
+	 * If this is a SELECT, add the tuples processed in this
+	 * execution of a portal to the total processed.
+	 * For other operation types, the total processed is set
+	 * during executorFinish.
+	 */
+	if (operation == CMD_SELECT)
+		estate->es_total_processed += estate->es_processed;
+
 	/*
 	 * shutdown tuple receiver, if we started it
 	 */
@@ -412,6 +421,7 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
 {
 	EState	   *estate;
 	MemoryContext oldcontext;
+	CmdType		operation = queryDesc->operation;
 
 	/* sanity checks */
 	Assert(queryDesc != NULL);
@@ -441,6 +451,13 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
 	if (queryDesc->totaltime)
 		InstrStopNode(queryDesc->totaltime, 0);
 
+	/*
+	 * For non-SELECT operations, es_total_processed will always be
+	 * equal to es_processed.
+	 */
+	if (operation != CMD_SELECT)
+		queryDesc->estate->es_total_processed = queryDesc->estate->es_processed;
+
 	MemoryContextSwitchTo(oldcontext);
 
 	estate->es_finished = true;
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index d97f5a8e7d..ee20cd5c8a 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -661,7 +661,10 @@ typedef struct EState
 
 	List	   *es_tupleTable;	/* List of TupleTableSlots */
 
-	uint64		es_processed;	/* # of tuples processed */
+	uint64		es_processed;	/* # of tuples processed for a single
+								 * execution of a portal */
+	uint64		es_total_processed; /* # of tuples processed for all
+									 * executions of a portal */
 
 	int			es_top_eflags;	/* eflags passed to ExecutorStart */
 	int			es_instrument;	/* OR of InstrumentOption flags */
-- 
2.39.2

#29Michael Paquier
michael@paquier.xyz
In reply to: Imseih (AWS), Sami (#28)
Re: [BUG] pg_stat_statements and extended query protocol

On Tue, Apr 04, 2023 at 09:48:17PM +0000, Imseih (AWS), Sami wrote:

The "calls" tracking is removed from Estate. Unlike v1 however,
I added a check for the operation type. Inside ExecutorRun,
es_total_processed is incremented when the operation is
a SELECT. This check is done for es_processed as well inside
executorRun -> ExecutePlan.

I see. This seems in line with ExecutePlan() where es_processed is
incremented only for SELECT queries.

For non select operations, es_total_processed is set to
es_processed in executorfinish. This is because the modify
plan nodes set es_processed outside of execMain.c

My java is very rusty but (after some time reminding myself how to do
a CLASSPATH) I can see the row counts being right for things like
SELECT, INSERT RETURNING, WITH queries for both of them and
autocommit, and the proposed fix influences SELECT only depending on
the fetch size. Doing nothing for calls now is fine by me, though I
agree that this could be improved at some point, as seeing only 1
rather than N for each fetch depending on the size is a bit confusing.

* There is no return value, but output tuples (if any) are sent to
* the destination receiver specified in the QueryDesc; and the number
* of tuples processed at the top level can be found in
* estate->es_processed.

Doesn't this comment at the top of ExecutorRun() need an update? It
seems to me that this comment should mention both es_total_processed
and es_processed, telling that es_processed is the number of rows
processed in a single call, while es_total_processed is the sum of all
tuples processed in the ExecutorRun() calls.

@@ -441,6 +451,13 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
if (queryDesc->totaltime)
InstrStopNode(queryDesc->totaltime, 0);

+   /*
+    * For non-SELECT operations, es_total_processed will always be
+    * equal to es_processed.
+    */
+   if (operation != CMD_SELECT)
+       queryDesc->estate->es_total_processed = queryDesc->estate->es_processed;

There is no need for this part in ExecutorFinish(), actually, as long
as we always increment es_total_processed at the end ExecutorRun() for
all the operation types? If the portal does not have a store, we
would do one ExecutorRun() call per execute fetch. If the portal has
a store, we'd do only one ExecutorRun(). Both cases call once
ExecutorFinish(), but the finish would happen during the first
execute when filling a portal's tuple store, and during the last
execute fetch if the portal has no store. This is Tom's point in [1]/messages/by-id/1311773.1680577992@sss.pgh.pa.us -- Michael,
from what I can see. That seems simpler to me, as well.

-   uint64      es_processed;   /* # of tuples processed */
+   uint64      es_processed;   /* # of tuples processed for a single
+                                * execution of a portal */
+   uint64      es_total_processed; /* # of tuples processed for all
+                                    * executions of a portal */

Hmm. This does not reflect completely the reality for non-SELECT
statements, no? For SELECT statements, that's correct, because
es_processed is reset in standard_ExecutorFinish() each time the
backend does an execute fetch thanks to PortalRunSelect(). For
non-SELECT statements, the extended query protocol uses the tuples
stored in the portal after one execution, meaning that we run through
the executor once with both es_processed and es_total_processed set to
their final numbers from the start, before any fetches. I would
suggest something like that to document both fields:
- es_processed: number of tuples processed during one ExecutorRun()
call.
- es_total_processed: total number of tuples aggregated across all
ExecutorRun() calls.

At the end, I'm OK with the proposal after a closer look, but I think
that we should do a much better job at describing es_processed and
es_total_processed in execnodes.h, particularly in the case of a
portal holding a store where es_processed may not reflect the number
of rows for a single portal execution, and it seems to me that the
proposal of incrementing es_total_processed at the end of
ExecutorRun() for all commands is simpler, based on what I have
tested.

[1]: /messages/by-id/1311773.1680577992@sss.pgh.pa.us -- Michael
--
Michael

#30Imseih (AWS), Sami
simseih@amazon.com
In reply to: Michael Paquier (#29)
1 attachment(s)
Re: [BUG] pg_stat_statements and extended query protocol

Doing nothing for calls now is fine by me, though I
agree that this could be improved at some point, as seeing only 1
rather than N for each fetch depending on the size is a bit confusing.

I think we will need to clearly define what "calls" is. Perhaps as mentioned
above, we may need separate counters for "calls" vs "fetches". This is
definitely a separate thread.

Doesn't this comment at the top of ExecutorRun() need an update? It
seems to me that this comment should mention both es_total_processed

Yes, updated in v5.

There is no need for this part in ExecutorFinish(), actually, as long
as we always increment es_total_processed at the end ExecutorRun() for
all the operation types?

Ah, correct. I changed that and tested again.

- es_processed: number of tuples processed during one ExecutorRun()
call.
- es_total_processed: total number of tuples aggregated across all
ExecutorRun() calls.

I thought hard about this point and for some reason I did not want to
mention ExecutorRun in the comment. But, I agree with what you suggest.
It's more clear as to the intention of the fields.

Attached is v5 addressing the comments.

Regards,

Sami Imseih
Amazon Web Services (AWS)

Attachments:

v5-0001-Fix-row-tracking-in-pg_stat_statements.patchapplication/octet-stream; name=v5-0001-Fix-row-tracking-in-pg_stat_statements.patchDownload
From 5ffb00497dc76b2289c1e03167ce70e2dcf7b85a Mon Sep 17 00:00:00 2001
From: EC2 Default User <ec2-user@ip-172-31-26-221.ec2.internal>
Date: Wed, 5 Apr 2023 04:00:52 +0000
Subject: [PATCH 1/1] Fix row tracking in pg_stat_statements.

In pg_stat_statements, EState->es_processed cannot reliably
be used to count the rows processed for a statement that
goes through extended query protocol. This is because, such
a statement may execute a portal more than once, and
es_processed is reset after every execution. This fix addresses
this issue by accuumulating the # row processed for all the
executions of a portal.

Author: Sami Imseih <simseih@amazon.com>
Discussion: https://www.postgresql.org/message-id/flat/EBE6C507-9EB6-4142-9E4D-38B1673363A7%40amazon.com
---
 contrib/pg_stat_statements/pg_stat_statements.c | 2 +-
 src/backend/executor/execMain.c                 | 5 ++++-
 src/include/nodes/execnodes.h                   | 5 ++++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 5285c3f7fa..68215bb2e3 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1060,7 +1060,7 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
 				   queryDesc->plannedstmt->stmt_len,
 				   PGSS_EXEC,
 				   queryDesc->totaltime->total * 1000.0,	/* convert to msec */
-				   queryDesc->estate->es_processed,
+				   queryDesc->estate->es_total_processed,
 				   &queryDesc->totaltime->bufusage,
 				   &queryDesc->totaltime->walusage,
 				   queryDesc->estate->es_jit ? &queryDesc->estate->es_jit->instr : NULL,
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 1b007dc32c..4c97ffd910 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -289,7 +289,8 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
  *		There is no return value, but output tuples (if any) are sent to
  *		the destination receiver specified in the QueryDesc; and the number
  *		of tuples processed at the top level can be found in
- *		estate->es_processed.
+ *		estate->es_processed. The total tuples processed in all
+ *		ExecutorRun calls can be found in estate->es_total_processed.
  *
  *		We provide a function hook variable that lets loadable plugins
  *		get control when ExecutorRun is called.  Such a plugin would
@@ -372,6 +373,8 @@ standard_ExecutorRun(QueryDesc *queryDesc,
 					execute_once);
 	}
 
+	estate->es_total_processed += estate->es_processed;
+
 	/*
 	 * shutdown tuple receiver, if we started it
 	 */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index d97f5a8e7d..ee456ab2cb 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -661,7 +661,10 @@ typedef struct EState
 
 	List	   *es_tupleTable;	/* List of TupleTableSlots */
 
-	uint64		es_processed;	/* # of tuples processed */
+	uint64		es_processed;	/* # of tuples processed during
+								 * one ExecutorRun() call. */
+	uint64		es_total_processed; /* total # of tuples aggregated across
+									 * all ExecutorRun() calls. */
 
 	int			es_top_eflags;	/* eflags passed to ExecutorStart */
 	int			es_instrument;	/* OR of InstrumentOption flags */
-- 
2.39.2

#31Michael Paquier
michael@paquier.xyz
In reply to: Imseih (AWS), Sami (#30)
Re: [BUG] pg_stat_statements and extended query protocol

On Wed, Apr 05, 2023 at 04:07:21AM +0000, Imseih (AWS), Sami wrote:

- es_processed: number of tuples processed during one ExecutorRun()
call.
- es_total_processed: total number of tuples aggregated across all
ExecutorRun() calls.

I thought hard about this point and for some reason I did not want to
mention ExecutorRun in the comment. But, I agree with what you suggest.
It's more clear as to the intention of the fields.

Attached is v5 addressing the comments.

Thanks, this should be enough to persist the number of tuples tracked
across multiple ExecutorRun() calls. This looks pretty good to me.

We should do something about providing more control over that to
libpq in the long run, IMO, and have more test coverage, but let's see
about that in 17~.
--
Michael

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#31)
Re: [BUG] pg_stat_statements and extended query protocol

Michael Paquier <michael@paquier.xyz> writes:

On Wed, Apr 05, 2023 at 04:07:21AM +0000, Imseih (AWS), Sami wrote:

Attached is v5 addressing the comments.

Thanks, this should be enough to persist the number of tuples tracked
across multiple ExecutorRun() calls. This looks pretty good to me.

v5 seems OK to me except I think CreateExecutorState() should explicitly
zero the new es_total_processed field, alongside zeroing es_processed.
(I realize that the makeNode would have done it already, but our
coding conventions generally run towards not relying on that. This is
mainly for greppability, so you can find where a field is initialized.)

regards, tom lane

#33Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#32)
Re: [BUG] pg_stat_statements and extended query protocol

On Wed, Apr 05, 2023 at 05:39:13PM -0400, Tom Lane wrote:

v5 seems OK to me except I think CreateExecutorState() should explicitly
zero the new es_total_processed field, alongside zeroing es_processed.
(I realize that the makeNode would have done it already, but our
coding conventions generally run towards not relying on that. This is
mainly for greppability, so you can find where a field is initialized.)

Makes sense to me. I'll look at that again today, potentially apply
the fix on HEAD.
--
Michael

#34Imseih (AWS), Sami
simseih@amazon.com
In reply to: Michael Paquier (#33)
1 attachment(s)
Re: [BUG] pg_stat_statements and extended query protocol

Makes sense to me. I'll look at that again today, potentially apply
the fix on HEAD.

Here is v6. That was my mistake not to zero out the es_total_processed.
I had it in the first version.

--
Regards,

Sami Imseih
Amazon Web Services (AWS)

Attachments:

v6-0001-Fix-row-tracking-in-pg_stat_statements.patchapplication/octet-stream; name=v6-0001-Fix-row-tracking-in-pg_stat_statements.patchDownload
From 4bdb4f2584057eb775886d442a1909168403bb3d Mon Sep 17 00:00:00 2001
From: EC2 Default User <ec2-user@ip-172-31-26-221.ec2.internal>
Date: Wed, 5 Apr 2023 22:07:40 +0000
Subject: [PATCH 1/1] Fix row tracking in pg_stat_statements.

In pg_stat_statements, EState->es_processed cannot reliably
be used to count the rows processed for a statement that
goes through extended query protocol. This is because, such
a statement may execute a portal more than once, and
es_processed is reset after every execution. This fix addresses
this issue by accuumulating the # row processed for all the
executions of a portal.

Author: Sami Imseih <simseih@amazon.com>
Discussion: https://www.postgresql.org/message-id/flat/EBE6C507-9EB6-4142-9E4D-38B1673363A7%40amazon.com
---
 contrib/pg_stat_statements/pg_stat_statements.c | 2 +-
 src/backend/executor/execMain.c                 | 5 ++++-
 src/backend/executor/execUtils.c                | 1 +
 src/include/nodes/execnodes.h                   | 5 ++++-
 4 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 5285c3f7fa..68215bb2e3 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1060,7 +1060,7 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
 				   queryDesc->plannedstmt->stmt_len,
 				   PGSS_EXEC,
 				   queryDesc->totaltime->total * 1000.0,	/* convert to msec */
-				   queryDesc->estate->es_processed,
+				   queryDesc->estate->es_total_processed,
 				   &queryDesc->totaltime->bufusage,
 				   &queryDesc->totaltime->walusage,
 				   queryDesc->estate->es_jit ? &queryDesc->estate->es_jit->instr : NULL,
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 1b007dc32c..4c97ffd910 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -289,7 +289,8 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
  *		There is no return value, but output tuples (if any) are sent to
  *		the destination receiver specified in the QueryDesc; and the number
  *		of tuples processed at the top level can be found in
- *		estate->es_processed.
+ *		estate->es_processed. The total tuples processed in all
+ *		ExecutorRun calls can be found in estate->es_total_processed.
  *
  *		We provide a function hook variable that lets loadable plugins
  *		get control when ExecutorRun is called.  Such a plugin would
@@ -372,6 +373,8 @@ standard_ExecutorRun(QueryDesc *queryDesc,
 					execute_once);
 	}
 
+	estate->es_total_processed += estate->es_processed;
+
 	/*
 	 * shutdown tuple receiver, if we started it
 	 */
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 012dbb6965..5ce8a0831b 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -147,6 +147,7 @@ CreateExecutorState(void)
 	estate->es_tupleTable = NIL;
 
 	estate->es_processed = 0;
+	estate->es_total_processed = 0;
 
 	estate->es_top_eflags = 0;
 	estate->es_instrument = 0;
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index b0def732ca..7e07f5777e 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -661,7 +661,10 @@ typedef struct EState
 
 	List	   *es_tupleTable;	/* List of TupleTableSlots */
 
-	uint64		es_processed;	/* # of tuples processed */
+	uint64		es_processed;	/* # of tuples processed during
+								 * one ExecutorRun() call. */
+	uint64		es_total_processed; /* total # of tuples aggregated across
+									 * all ExecutorRun() calls. */
 
 	int			es_top_eflags;	/* eflags passed to ExecutorStart */
 	int			es_instrument;	/* OR of InstrumentOption flags */
-- 
2.39.2

#35Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#33)
Re: [BUG] pg_stat_statements and extended query protocol

Hi,

On 2023-04-06 07:09:37 +0900, Michael Paquier wrote:

I'll look at that again today, potentially apply the fix on HEAD.

Seems like a complicated enough facility to benefit from a test or two? Peter
Eisentraut added support for the extended query protocol to psql, so it
shouldn't be too hard...

commit 5b66de3433e
Author: Peter Eisentraut <peter@eisentraut.org>
Date: 2022-11-15 13:50:27 +0100

psql: Add command to use extended query protocol

Greetings,

Andres Freund

#36Michael Paquier
michael@paquier.xyz
In reply to: Imseih (AWS), Sami (#34)
Re: [BUG] pg_stat_statements and extended query protocol

On Wed, Apr 05, 2023 at 10:16:19PM +0000, Imseih (AWS), Sami wrote:

Here is v6. That was my mistake not to zero out the es_total_processed.
I had it in the first version.

The update of es_total_processed in standard_ExecutorRun() felt a bit
lonely, so I have added an extra comment, ran an indentation, and
applied the result. Thanks Sami for the patch, and everyone else for
the feedback!
--
Michael

#37Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#35)
1 attachment(s)
Re: [BUG] pg_stat_statements and extended query protocol

On Wed, Apr 05, 2023 at 05:39:35PM -0700, Andres Freund wrote:

Seems like a complicated enough facility to benefit from a test or two? Peter
Eisentraut added support for the extended query protocol to psql, so it
shouldn't be too hard...

PQsendQueryGuts() does not split yet the bind/describe phase and the
execute phases, so we'd need a couple more libpq APIs to do that, with
more tracking of the state we're currently on when looping across
multiple execute fetches. My guess is that it is possible to follow a
model similar to JDBC here. I don't think that's necessarily
complicated, but it is not as straight-forward as it looks. \bind was
much more straight-forward than that, as it can feed on a single call
of PQsendQueryParams() after saving a set of parameters. An \exec
would not completely do that.

Attaching one of the scripts I've played with, in a very rusty java
with no classes or such, for future reference. Just update CLASSPATH
to point to a copy of the JDBC driver, run it with a java command, and
then look at rows, query in pg_stat_statements.
--
Michael

Attachments:

Test.javatext/x-java; charset=us-asciiDownload