[PATCH] Query Jumbling for CALL and SET utility statements
Hi hackers,
While query jumbling is provided for function calls that’s currently not
the case for procedures calls.
The reason behind this is that all utility statements are currently
discarded for jumbling.
We’ve recently seen performance impacts (LWLock contention) due to the
lack of jumbling on procedure calls with pg_stat_statements and
pg_stat_statements.track_utility enabled (think an application with a
high rate of procedure calls with unique parameters for each call).
Jeremy has had this conversation on twitter (see
https://twitter.com/jer_s/status/1560003560116342785) and Nikolay
reported that he also had to work on a similar performance issue with
SET being used.
That’s why we think it would make sense to allow jumbling for those 2
utility statements: CALL and SET.
Please find attached a patch proposal for doing so.
With the attached patch we would get things like:
CALL MINUS_TWO(3);
CALL MINUS_TWO(7);
CALL SUM_TWO(3, 8);
CALL SUM_TWO(7, 5);
set enable_seqscan=false;
set enable_seqscan=true;
set seq_page_cost=2.0;
set seq_page_cost=1.0;
postgres=# SELECT query, calls, rows FROM pg_stat_statements;
query | calls | rows
-----------------------------------+-------+------
set seq_page_cost=$1 | 2 | 0
CALL MINUS_TWO($1) | 2 | 0
set enable_seqscan=$1 | 2 | 0
CALL SUM_TWO($1, $2) | 2 | 0
Looking forward to your feedback,
Thanks,
Jeremy & Bertrand
--
Bertrand Drouvot
Amazon Web Services:https://aws.amazon.com
Attachments:
v1-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v1-0001-JumbleQuery-on-Call-and-Set.patchDownload+196-5
Hi
st 31. 8. 2022 v 17:34 odesílatel Drouvot, Bertrand <bdrouvot@amazon.com>
napsal:
Hi hackers,
While query jumbling is provided for function calls that’s currently not
the case for procedures calls.
The reason behind this is that all utility statements are currently
discarded for jumbling.We’ve recently seen performance impacts (LWLock contention) due to the
lack of jumbling on procedure calls with pg_stat_statements and
pg_stat_statements.track_utility enabled (think an application with a high
rate of procedure calls with unique parameters for each call).Jeremy has had this conversation on twitter (see
https://twitter.com/jer_s/status/1560003560116342785) and Nikolay
reported that he also had to work on a similar performance issue with SET
being used.That’s why we think it would make sense to allow jumbling for those 2
utility statements: CALL and SET.Please find attached a patch proposal for doing so.
With the attached patch we would get things like:
CALL MINUS_TWO(3);
CALL MINUS_TWO(7);
CALL SUM_TWO(3, 8);
CALL SUM_TWO(7, 5);
set enable_seqscan=false;
set enable_seqscan=true;
set seq_page_cost=2.0;
set seq_page_cost=1.0;postgres=# SELECT query, calls, rows FROM pg_stat_statements;
query | calls | rows
-----------------------------------+-------+------
set seq_page_cost=$1 | 2 | 0
CALL MINUS_TWO($1) | 2 | 0
set enable_seqscan=$1 | 2 | 0
CALL SUM_TWO($1, $2) | 2 | 0Looking forward to your feedback,
The idea is good, but I think you should use pg_stat_functions instead.
Maybe it is supported already (I didn't test it). I am not sure so SET
statement should be traced in pg_stat_statements - it is usually pretty
fast, and without context it says nothing. It looks like just overhead.
Regards
Pavel
Show quoted text
Thanks,
Jeremy & Bertrand
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
Hi,
On 2022-08-31 17:33:44 +0200, Drouvot, Bertrand wrote:
While query jumbling is provided for function calls that’s currently not the
case for procedures calls.
The reason behind this is that all utility statements are currently
discarded for jumbling.
[...]
That’s why we think it would make sense to allow jumbling for those 2
utility statements: CALL and SET.
Yes, I've seen this be an issue repeatedly. Although more heavily for PREPARE
/ EXECUTE than either of the two cases you handle here. IME not tracking
PREPARE / EXECUTE can distort statistics substantially - there's appears to be
a surprising number of applications / frameworks resorting to them. Basically
requiring that track utility is turned on.
I suspect we should carve out things like CALL, PREPARE, EXECUTE from
track_utility - it's more or less an architectural accident that they're
utility statements. It's a bit less clear that SET should be dealt with that
way.
@@ -383,6 +384,30 @@ JumbleExpr(JumbleState *jstate, Node *node) APP_JUMB(var->varlevelsup); } break; + case T_CallStmt: + { + CallStmt *stmt = (CallStmt *) node; + FuncExpr *expr = stmt->funcexpr; + + APP_JUMB(expr->funcid); + JumbleExpr(jstate, (Node *) expr->args); + } + break;
Why do we need to take the arguments into account?
+ case T_VariableSetStmt: + { + VariableSetStmt *stmt = (VariableSetStmt *) node; + + APP_JUMB_STRING(stmt->name); + JumbleExpr(jstate, (Node *) stmt->args); + } + break;
Same?
+ case T_A_Const: + { + int loc = ((const A_Const *) node)->location; + + RecordConstLocation(jstate, loc); + } + break;
I suspect we only need this because of the jumbling of unparsed arguments I
questioned above? If we do end up needing it, shouldn't we include the type
in the jumbling?
Greetings,
Andres Freund
st 31. 8. 2022 v 17:50 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
napsal:
Hi
st 31. 8. 2022 v 17:34 odesílatel Drouvot, Bertrand <bdrouvot@amazon.com>
napsal:Hi hackers,
While query jumbling is provided for function calls that’s currently not
the case for procedures calls.
The reason behind this is that all utility statements are currently
discarded for jumbling.We’ve recently seen performance impacts (LWLock contention) due to the
lack of jumbling on procedure calls with pg_stat_statements and
pg_stat_statements.track_utility enabled (think an application with a high
rate of procedure calls with unique parameters for each call).Jeremy has had this conversation on twitter (see
https://twitter.com/jer_s/status/1560003560116342785) and Nikolay
reported that he also had to work on a similar performance issue with SET
being used.That’s why we think it would make sense to allow jumbling for those 2
utility statements: CALL and SET.Please find attached a patch proposal for doing so.
With the attached patch we would get things like:
CALL MINUS_TWO(3);
CALL MINUS_TWO(7);
CALL SUM_TWO(3, 8);
CALL SUM_TWO(7, 5);
set enable_seqscan=false;
set enable_seqscan=true;
set seq_page_cost=2.0;
set seq_page_cost=1.0;postgres=# SELECT query, calls, rows FROM pg_stat_statements;
query | calls | rows
-----------------------------------+-------+------
set seq_page_cost=$1 | 2 | 0
CALL MINUS_TWO($1) | 2 | 0
set enable_seqscan=$1 | 2 | 0
CALL SUM_TWO($1, $2) | 2 | 0Looking forward to your feedback,
The idea is good, but I think you should use pg_stat_functions instead.
Maybe it is supported already (I didn't test it). I am not sure so SET
statement should be traced in pg_stat_statements - it is usually pretty
fast, and without context it says nothing. It looks like just overhead.
I was wrong - there is an analogy with SELECT fx, and the statistics are in
pg_stat_statements, and in pg_stat_function too.
Regards
Pavel
Show quoted text
Regards
Pavel
Thanks,
Jeremy & Bertrand
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
On 8/31/22 8:33 AM, Drouvot, Bertrand wrote:
We’ve recently seen performance impacts (LWLock contention) due to the
lack of jumbling on procedure calls with pg_stat_statements and
pg_stat_statements.track_utility enabled (think an application with a
high rate of procedure calls with unique parameters for each call).
I ran some performance tests with the patch that Bertrand wrote to get
numbers. From my perspective, this patch is scoped very minimally and is
low risk; I don’t think it should need an enormous amount of validation.
It does appear to address the issues with both SET and CALL statements
that Nikolay and I respectively encountered. Honestly, this almost seems
like it was just an minor oversight in the original patch that added
support for CALL and procedures.
I used an r5.large EC2 instance running Linux and tested Bertrand’s
patch using the PostgreSQL 14.4 code base, compiled without and with
Bertrand’s patch. The difference is a lot more extreme on big servers
with lots of cores, but the difference is obvious even on a small
instance like this one.
As a side note: while I certainly don't want to build a database
primarily based on benchmarks, it's nice when benchmarks showcase the
database's strength. Without this patch, HammerDB completely falls over
in stored procedure mode, since one of the procedure arguments is a
time-based unique value on every call. Someone else at Amazon running
HammerDB was how I originally became aware of this problem.
-Jeremy
===== Setup:
$ psql -c "create procedure test(x int) as 'begin return; end' language
plpgsql;"
CREATE PROCEDURE
$ echo -e "\set x random(1,100000000) \n call test(:x)" >test-call.pgbench
$ echo -e "\set x random(1,100000000) \n set application_name=':x'"
test-set.pgbench
===== CALL results without patch:
[postgres@ip-172-31-44-176 ~]$ pgbench -n -c 100 -j 100 -T15 -r -f
test-set.pgbench
pgbench (14.4)
transaction type: test-set.pgbench
scaling factor: 1
query mode: simple
number of clients: 100
number of threads: 100
duration: 15 s
number of transactions actually processed: 728748
latency average = 2.051 ms
initial connection time = 91.844 ms
tps = 48755.446492 (without initial connection time)
statement latencies in milliseconds:
0.000 \set x random(1,100000000)
2.046 set application_name=':x'
pg-14.4 rw postgres@postgres=# select wait_event, count(*) from
pg_stat_activity group by wait_event; \watch 1
...
Tue 30 Aug 2022 08:26:35 PM UTC (every 1s)
wait_event | count
---------------------+-------
[NULL] | 6
pg_stat_statements | 95
BgWriterMain | 1
ArchiverMain | 1
WalWriterMain | 1
AutoVacuumMain | 1
CheckpointerMain | 1
LogicalLauncherMain | 1
(8 rows)
...
===== CALL results with patch:
[postgres@ip-172-31-44-176 ~]$ pgbench -n -c 100 -j 100 -T15 -r -f
test-call.pgbench
pgbench (14.4)
transaction type: test-call.pgbench
scaling factor: 1
query mode: simple
number of clients: 100
number of threads: 100
duration: 15 s
number of transactions actually processed: 1098776
latency average = 1.361 ms
initial connection time = 89.002 ms
tps = 73491.904878 (without initial connection time)
statement latencies in milliseconds:
0.000 \set x random(1,100000000)
1.383 call test(:x)
pg-14.4 rw postgres@postgres=# select wait_event, count(*) from
pg_stat_activity group by wait_event; \watch 1
...
Tue 30 Aug 2022 08:42:51 PM UTC (every 1s)
wait_event | count
---------------------+-------
[NULL] | 99
BgWriterHibernate | 1
ArchiverMain | 1
WalWriterMain | 1
AutoVacuumMain | 1
CheckpointerMain | 1
ClientRead | 2
LogicalLauncherMain | 1
(8 rows)
...
===== SET results without patch:
[postgres@ip-172-31-44-176 ~]$ pgbench -n -c 100 -j 100 -T15 -r -f
test-set.pgbench
pgbench (14.4)
transaction type: test-set.pgbench
scaling factor: 1
query mode: simple
number of clients: 100
number of threads: 100
duration: 15 s
number of transactions actually processed: 728748
latency average = 2.051 ms
initial connection time = 91.844 ms
tps = 48755.446492 (without initial connection time)
statement latencies in milliseconds:
0.000 \set x random(1,100000000)
2.046 set application_name=':x'
pg-14.4 rw postgres@postgres=# select wait_event, count(*) from
pg_stat_activity group by wait_event; \watch 1
...
Tue 30 Aug 2022 08:26:35 PM UTC (every 1s)
wait_event | count
---------------------+-------
[NULL] | 6
pg_stat_statements | 95
BgWriterMain | 1
ArchiverMain | 1
WalWriterMain | 1
AutoVacuumMain | 1
CheckpointerMain | 1
LogicalLauncherMain | 1
(8 rows)
...
===== SET results with patch:
[postgres@ip-172-31-44-176 ~]$ pgbench -n -c 100 -j 100 -T15 -r -f
test-set.pgbench
pgbench (14.4)
transaction type: test-set.pgbench
scaling factor: 1
query mode: simple
number of clients: 100
number of threads: 100
duration: 15 s
number of transactions actually processed: 1178844
latency average = 1.268 ms
initial connection time = 89.159 ms
tps = 78850.178814 (without initial connection time)
statement latencies in milliseconds:
0.000 \set x random(1,100000000)
1.270 set application_name=':x'
pg-14.4 rw postgres@postgres=# select wait_event, count(*) from
pg_stat_activity group by wait_event; \watch 1
...
Tue 30 Aug 2022 08:44:30 PM UTC (every 1s)
wait_event | count
---------------------+-------
[NULL] | 101
BgWriterHibernate | 1
ArchiverMain | 1
WalWriterMain | 1
AutoVacuumMain | 1
CheckpointerMain | 1
LogicalLauncherMain | 1
(7 rows)
...
--
Jeremy Schneider
Database Engineer
Amazon Web Services
On 8/31/22 9:08 AM, Andres Freund wrote:
I suspect we should carve out things like CALL, PREPARE, EXECUTE from
track_utility - it's more or less an architectural accident that they're
utility statements. It's a bit less clear that SET should be dealt with that
way.
Regarding SET, the compelling use case was around "application_name"
whose purpose is to provide a label in pg_stat_activity and on log
lines, which can be used to improve observability and connect queries to
their source in application code. Nikolay's incident (on peak shopping
day for an eCommerce corp) evidently involved an application which
leveraged this, but as a result the contention on the pg_stat_statements
LWLock in exclusive mode effectively caused an outage for the retailer?
Or nearly did? My description here is based on Nikolay's public twitter
comment.
I've seen a lot of applications that make heavy use of temp tables,
where DDL would be pretty important to track as part of the regular
workload. So that probably should be added to the list alongside
prepared statements. And I'd want to spend a little more time thinking
about what other use cases might be missing. I'm hesitant about the
general idea of carving out some utility statements away from this
"track_utility" GUC.
Personally, at this point, I think pg_stat_statements is critical
infrastructure for anyone running PostgreSQL at scale. The information
it provides is indispensable. I don't think it's really defensible to
tell people that if they want to scale, then they need to fly blind on
any utility statements.
-Jeremy
--
Jeremy Schneider
Database Engineer
Amazon Web Services
Hi,
On 2022-08-31 11:00:05 -0700, Jeremy Schneider wrote:
On 8/31/22 9:08 AM, Andres Freund wrote:
I suspect we should carve out things like CALL, PREPARE, EXECUTE from
track_utility - it's more or less an architectural accident that they're
utility statements. It's a bit less clear that SET should be dealt with that
way.Regarding SET, the compelling use case was around "application_name"
whose purpose is to provide a label in pg_stat_activity and on log
lines, which can be used to improve observability and connect queries to
their source in application code.
I wasn't saying that SET shouldn't be jumbled, just that it seems more
reasonable to track it only when track_utility is enabled, rather than doing
so even when that's disabled. Which I do think makes sense for executing a
prepared statement and calling a procedure, since they're really only utility
statements by accident.
Personally, at this point, I think pg_stat_statements is critical
infrastructure for anyone running PostgreSQL at scale. The information
it provides is indispensable. I don't think it's really defensible to
tell people that if they want to scale, then they need to fly blind on
any utility statements.
I wasn't suggesting doing so at all.
Greetings,
Andres Freund
On 8/31/22 12:06 PM, Andres Freund wrote:
Regarding SET, the compelling use case was around "application_name"
whose purpose is to provide a label in pg_stat_activity and on log
lines, which can be used to improve observability and connect queries to
their source in application code.I wasn't saying that SET shouldn't be jumbled, just that it seems more
reasonable to track it only when track_utility is enabled, rather than doing
so even when that's disabled. Which I do think makes sense for executing a
prepared statement and calling a procedure, since they're really only utility
statements by accident.
Hey Andres, sorry for misunderstanding your email!
Based on this quick test I just now ran (transcript below), I think that
PREPARE/EXECUTE is already excluded from track_utility?
I get your point about CALL, maybe it does make sense to also exclude
this. It might also be worth a small update to the doc for track_utility
about how it behaves, in this regard.
https://www.postgresql.org/docs/14/pgstatstatements.html#id-1.11.7.39.9
Example updated sentence:
|pg_stat_statements.track_utility| controls whether <<most>> utility
commands are tracked by the module. Utility commands are all those other
than |SELECT|, |INSERT|, |UPDATE| and |DELETE| <<, but this parameter
does not disable tracking of PREPARE, EXECUTE or CALL>>. The default
value is |on|. Only superusers can change this setting.
=====
pg-14.4 rw root@db1=# set pg_stat_statements.track_utility=on;
SET
pg-14.4 rw root@db1=# select pg_stat_statements_reset();
pg_stat_statements_reset
--------------------------
(1 row)
pg-14.4 rw root@db1=# prepare test as select /* unique123 */ 1;
PREPARE
pg-14.4 rw root@db1=# execute test;
?column?
----------
1
(1 row)
pg-14.4 rw root@db1=# set application_name='test';
SET
pg-14.4 rw root@db1=# select substr(query,1,50) from pg_stat_statements;
substr
-------------------------------------------
prepare test as select /* unique123 */ $1
select pg_stat_statements_reset()
set application_name=$1
(3 rows)
=====
pg-14.4 rw root@db1=# set pg_stat_statements.track_utility=off;
SET
pg-14.4 rw root@db1=# select pg_stat_statements_reset();
pg_stat_statements_reset
--------------------------
(1 row)
pg-14.4 rw root@db1=# prepare test as select /* unique123 */ 1;
PREPARE
pg-14.4 rw root@db1=# execute test;
?column?
----------
1
(1 row)
pg-14.4 rw root@db1=# set application_name='test';
SET
pg-14.4 rw root@db1=# select substr(query,1,50) from pg_stat_statements;
substr
-------------------------------------------
prepare test as select /* unique123 */ $1
select pg_stat_statements_reset()
(2 rows)
--
Jeremy Schneider
Database Engineer
Amazon Web Services
Hi,
On 8/31/22 6:08 PM, Andres Freund wrote:
Hi,
On 2022-08-31 17:33:44 +0200, Drouvot, Bertrand wrote:
@@ -383,6 +384,30 @@ JumbleExpr(JumbleState *jstate, Node *node) APP_JUMB(var->varlevelsup); } break; + case T_CallStmt: + { + CallStmt *stmt = (CallStmt *) node; + FuncExpr *expr = stmt->funcexpr; + + APP_JUMB(expr->funcid); + JumbleExpr(jstate, (Node *) expr->args); + } + break;Why do we need to take the arguments into account?
Thanks for looking at it!
Agree that It's not needed to "solve" the Lock contention issue, but I
think it's needed for the "render".
Without it we would get, things like:
postgres=# call MY_PROC(10);
CALL
postgres=# call MY_PROC(100000000);
CALL
postgres=# SELECT query, calls, rows FROM pg_stat_statements;
query | calls | rows
-----------------------------------+-------+------
select pg_stat_statements_reset() | 1 | 1
call MY_PROC(10) | 2 | 0
(2 rows)
instead of
postgres=# SELECT query, calls, rows FROM pg_stat_statements;
query | calls | rows
-----------------------------------+-------+------
select pg_stat_statements_reset() | 1 | 1
call MY_PROC($1) | 2 | 0
(2 rows)
+ case T_VariableSetStmt: + { + VariableSetStmt *stmt = (VariableSetStmt *) node; + + APP_JUMB_STRING(stmt->name); + JumbleExpr(jstate, (Node *) stmt->args); + } + break;Same?
yeah, same reason. Without it we would get things like:
postgres=# set enable_seqscan=false;
SET
postgres=# set enable_seqscan=true;
SET
postgres=# SELECT query, calls, rows FROM pg_stat_statements;
query | calls | rows
-----------------------------------+-------+------
select pg_stat_statements_reset() | 1 | 1
set enable_seqscan=false | 2 | 0
(2 rows)
instead of
postgres=# SELECT query, calls, rows FROM pg_stat_statements;
query | calls | rows
-----------------------------------+-------+------
set enable_seqscan=$1 | 2 | 0
select pg_stat_statements_reset() | 1 | 1
(2 rows)
+ case T_A_Const: + { + int loc = ((const A_Const *) node)->location; + + RecordConstLocation(jstate, loc); + } + break;I suspect we only need this because of the jumbling of unparsed arguments I
questioned above?
Right but only for the T_VariableSetStmt case.
If we do end up needing it, shouldn't we include the type
in the jumbling?
I don't think so as this is only for the T_VariableSetStmt case.
And looking closer I don't see such as thing as "consttype" (that we can
find in the Const struct) in the A_Const struct.
Regards,
--
Bertrand Drouvot
Amazon Web Services:https://aws.amazon.com
Hi,
On 8/31/22 10:05 PM, Jeremy Schneider wrote:
On 8/31/22 12:06 PM, Andres Freund wrote:
Regarding SET, the compelling use case was around "application_name"
whose purpose is to provide a label in pg_stat_activity and on log
lines, which can be used to improve observability and connect queries to
their source in application code.I wasn't saying that SET shouldn't be jumbled, just that it seems more
reasonable to track it only when track_utility is enabled, rather than doing
so even when that's disabled. Which I do think makes sense for executing a
prepared statement and calling a procedure, since they're really only utility
statements by accident.I get your point about CALL, maybe it does make sense to also exclude
this.
That's a good point and i think we should track CALL whatever the value
of pgss_track_utility is.
I think so because we are tracking function calls in all the cases
(because "linked" to select aka not a utility) and i don't see any
reasons why not to do the same for procedure calls.
Please find attached v2 as an attempt to do so.
With v2 we get things like:
postgres=# set pg_stat_statements.track_utility=on;
SET
postgres=# call MY_PROC(20);
CALL
postgres=# call MY_PROC(10);
CALL
postgres=# set enable_seqscan=false;
SET
postgres=# set enable_seqscan=true;
SET
postgres=# select queryid,query,calls from pg_stat_statements;
queryid | query | calls
---------------------+-----------------------------------------+-------
4670878543381973400 | set pg_stat_statements.track_utility=$1 | 1
-640317129591544054 | set enable_seqscan=$1 | 2
492647827690744963 | select pg_stat_statements_reset() | 1
6541399678435597534 | call MY_PROC($1) | 2
and
postgres=# set pg_stat_statements.track_utility=off;
SET
postgres=# call MY_PROC(10);
CALL
postgres=# call MY_PROC(20);
CALL
postgres=# set enable_seqscan=true;
SET
postgres=# set enable_seqscan=false;
SET
postgres=# select queryid,query,calls from pg_stat_statements;
queryid | query | calls
---------------------+-----------------------------------------+-------
4670878543381973400 | set pg_stat_statements.track_utility=$1 | 1
492647827690744963 | select pg_stat_statements_reset() | 1
6541399678435597534 | call MY_PROC($1) | 2
(3 rows)
It might also be worth a small update to the doc for track_utility
about how it behaves, in this regard.https://www.postgresql.org/docs/14/pgstatstatements.html#id-1.11.7.39.9
Example updated sentence:
|pg_stat_statements.track_utility| controls whether <<most>> utility
commands are tracked by the module. Utility commands are all those
other than |SELECT|, |INSERT|, |UPDATE| and |DELETE| <<, but this
parameter does not disable tracking of PREPARE, EXECUTE or CALL>>. The
default value is |on|. Only superusers can change this setting.
Agree, wording added to v2.
Regards,
--
Bertrand Drouvot
Amazon Web Services:https://aws.amazon.com
Attachments:
v2-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v2-0001-JumbleQuery-on-Call-and-Set.patchDownload+275-9
Please find attached v2 as an attempt to do so.
+1 to the idea.
I think it will be better to evaluate jstate instead of
JUMBLE_UTILITY, such as:
if (query->utilityStmt && !jstate)
instead of
if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))
This will allow for support of potentially other utility statements
In the future without having to teach pg_stat_statements about them.
If a jstate is set for the utility statements, pgss will do the right thing.
Thanks
--
Sami Imseih
Amazon Web Services (AWS)
Hi,
On 9/1/22 5:13 PM, Imseih (AWS), Sami wrote:
Please find attached v2 as an attempt to do so.
+1 to the idea.
Thanks for looking at it!
I think it will be better to evaluate jstate instead of
JUMBLE_UTILITY, such as:
if (query->utilityStmt && !jstate)
instead of
if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))
This will allow for support of potentially other utility statements
In the future without having to teach pg_stat_statements about them.
If a jstate is set for the utility statements, pgss will do the right
thing.
Fair point, thanks!
v3 including this change is attached.
Thanks,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com
Attachments:
v3-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v3-0001-JumbleQuery-on-Call-and-Set.patchDownload+275-9
This is ready for committer but I suggest the following for the
doc changes:
1.
Plannable queries (that is, SELECT, INSERT, UPDATE, and DELETE) are
combined into a single pg_stat_statements entry whenever they have
identical query structures according to an internal hash calculation.
Typically, two queries will be considered the same for this purpose
if they are semantically equivalent except for the values of literal
constants appearing in the query. Utility commands (that is, all other commands)
are compared strictly on the basis of their textual query strings, however.
-- to --
Plannable queries (that is, SELECT, INSERT, UPDATE, and DELETE) as
well as CALL and SET commands are combined into a single
pg_stat_statements entry whenever they have identical query
structures according to an internal hash calculation.
Typically, two queries will be considered the same for this purpose
if they are semantically equivalent except for the values of literal
constants appearing in the command. All other commands are compared
strictly on the basis of their textual query strings, however.
2.
pg_stat_statements.track_utility controls whether utility
commands are tracked by the module. Utility commands
are all those other than SELECT, INSERT, UPDATE and DELETE.
The default value is on. Only superusers can change this setting.
-- to --
pg_stat_statements.track_utility controls whether utility commands
are tracked by the module. Tracked utility commands are all those
other than SELECT, INSERT, UPDATE, DELETE, CALL and SET.
The default value is on. Only superusers can change this setting.
--
Thanks,
Sami Imseih
Amazon Web Services (AWS)
From: "Drouvot, Bertrand" <bdrouvot@amazon.com>
Date: Friday, September 2, 2022 at 4:06 AM
To: "Imseih (AWS), Sami" <simseih@amazon.com>, "Schneider (AWS), Jeremy" <schnjere@amazon.com>, Andres Freund <andres@anarazel.de>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Peter Eisentraut <peter.eisentraut@enterprisedb.com>, Pavel Stehule <pavel.stehule@gmail.com>, Nikolay Samokhvalov <samokhvalov@gmail.com>
Subject: Re: [PATCH] Query Jumbling for CALL and SET utility statements
Hi,
On 9/1/22 5:13 PM, Imseih (AWS), Sami wrote:
Please find attached v2 as an attempt to do so.
+1 to the idea.
Thanks for looking at it!
I think it will be better to evaluate jstate instead of
JUMBLE_UTILITY, such as:
if (query->utilityStmt && !jstate)
instead of
if (query->utilityStmt && !JUMBLE_UTILITY(query->utilityStmt))
This will allow for support of potentially other utility statements
In the future without having to teach pg_stat_statements about them.
If a jstate is set for the utility statements, pgss will do the right thing.
Fair point, thanks!
v3 including this change is attached.
Thanks,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On 9/2/22 2:06 AM, Drouvot, Bertrand wrote:
v3 including this change is attached.
FYI: "reset all" core dumps with v3
I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patch
[postgres@ip-172-31-44-176 data]$ gdb /usr/local/pgsql-14.4/bin/postgres
core.27217
...
Core was generated by `postgres: postgres postgres [local]
RESET '.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f7776ae4821 in __strlen_sse2_pminub () from /lib64/libc.so.6
...
(gdb) bt
#0 0x00007f7776ae4821 in __strlen_sse2_pminub () from /lib64/libc.so.6
#1 0x00000000008e061c in JumbleExpr (jstate=0x1cf7f80, node=<optimized
out>) at queryjumble.c:400
#2 0x00000000008dfdd8 in JumbleQueryInternal (jstate=0x1cf7f80,
query=0x1cf7e70) at queryjumble.c:247
#3 0x00000000008e0b4b in JumbleQuery (query=query@entry=0x1cf7e70,
querytext=querytext@entry=0x1cf72f8 "reset all;") at queryjumble.c:127
#4 0x000000000056ba4b in parse_analyze (parseTree=0x1cf7ce0,
sourceText=0x1cf72f8 "reset all;", paramTypes=0x0, numParams=<optimized
out>, queryEnv=0x0) at analyze.c:130
#5 0x000000000079df63 in pg_analyze_and_rewrite
(parsetree=parsetree@entry=0x1cf7ce0,
query_string=query_string@entry=0x1cf72f8 "reset all;",
paramTypes=paramTypes@entry=0x0, numParams=numParams@entry=0,
queryEnv=queryEnv@entry=0x0) at postgres.c:657
#6 0x000000000079e472 in exec_simple_query (query_string=0x1cf72f8
"reset all;") at postgres.c:1130
#7 0x000000000079f9d3 in PostgresMain (argc=argc@entry=1,
argv=argv@entry=0x7ffd0c341f80, dbname=0x1d44948 "postgres",
username=<optimized out>) at postgres.c:4496
#8 0x000000000048c9f3 in BackendRun (port=<optimized out>,
port=<optimized out>) at postmaster.c:4530
#9 BackendStartup (port=0x1d3bdd0) at postmaster.c:4252
#10 ServerLoop () at postmaster.c:1745
#11 0x0000000000721332 in PostmasterMain (argc=argc@entry=5,
argv=argv@entry=0x1cf1e10) at postmaster.c:1417
#12 0x000000000048da6e in main (argc=5, argv=0x1cf1e10) at main.c:209
--
Jeremy Schneider
Database Engineer
Amazon Web Services
On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:
I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patch
What happens on HEAD? That would be the target branch for a new
feature.
--
Michael
Hi,
On 9/8/22 7:23 AM, Michael Paquier wrote:
On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:
I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patch
Thanks Jeremy for reporting the issue!
What happens on HEAD? That would be the target branch for a new
feature.
Just tested and i can see the same issue on HEAD.
Issue is on stmt->name being NULL here:
Breakpoint 2, JumbleExpr (jstate=0x55d60e769e30, node=0x55d60e769b60) at
queryjumble.c:364
364 if (node == NULL)
(gdb) n
368 check_stack_depth();
(gdb)
374 APP_JUMB(node->type);
(gdb)
376 switch (nodeTag(node))
(gdb)
398 VariableSetStmt *stmt =
(VariableSetStmt *) node;
(gdb) n
400 APP_JUMB_STRING(stmt->name);
(gdb) p stmt->name
$1 = 0x0
I'll have a closer look.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com
On Thu, Sep 08, 2022 at 02:23:19PM +0900, Michael Paquier wrote:
On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:
I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patchWhat happens on HEAD? That would be the target branch for a new
feature.
It would be the same AFAICS. From v3:
+ case T_VariableSetStmt:
+ {
+ VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+ APP_JUMB_STRING(stmt->name);
+ JumbleExpr(jstate, (Node *) stmt->args);
+ }
For a RESET ALL command stmt->name is NULL.
Hi,
On 9/8/22 8:50 AM, Julien Rouhaud wrote:
Thanks for looking at it!
On Thu, Sep 08, 2022 at 02:23:19PM +0900, Michael Paquier wrote:
On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:
I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patchWhat happens on HEAD? That would be the target branch for a new
feature.It would be the same AFAICS. From v3:
+ case T_VariableSetStmt: + { + VariableSetStmt *stmt = (VariableSetStmt *) node; + + APP_JUMB_STRING(stmt->name); + JumbleExpr(jstate, (Node *) stmt->args); + }For a RESET ALL command stmt->name is NULL.
Right, please find attached v4 addressing the issue and also Sami's
comments [1]/messages/by-id/82A35172-BEB3-4DFA-B11C-AE5E50A0F932@amazon.com.
[1]: /messages/by-id/82A35172-BEB3-4DFA-B11C-AE5E50A0F932@amazon.com
/messages/by-id/82A35172-BEB3-4DFA-B11C-AE5E50A0F932@amazon.com
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com
Attachments:
v4-0001-JumbleQuery-on-Call-and-Set.patchtext/plain; charset=UTF-8; name=v4-0001-JumbleQuery-on-Call-and-Set.patchDownload+286-15
Hi,
On Thu, Sep 08, 2022 at 11:06:51AM +0200, Drouvot, Bertrand wrote:
Hi,
On 9/8/22 8:50 AM, Julien Rouhaud wrote:
Thanks for looking at it!
On Thu, Sep 08, 2022 at 02:23:19PM +0900, Michael Paquier wrote:
On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:
I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patchWhat happens on HEAD? That would be the target branch for a new
feature.It would be the same AFAICS. From v3:
+ case T_VariableSetStmt: + { + VariableSetStmt *stmt = (VariableSetStmt *) node; + + APP_JUMB_STRING(stmt->name); + JumbleExpr(jstate, (Node *) stmt->args); + }For a RESET ALL command stmt->name is NULL.
Right, please find attached v4 addressing the issue and also Sami's comments
[1].
(Sorry I've not been following this thread until now)
IME if your application relies on 2PC it's very likely that you will hit the
exact same problems described in your original email. What do you think about
normalizing those too while working on the subject?
Hi,
On 9/8/22 1:29 PM, Julien Rouhaud wrote:
Hi,
On Thu, Sep 08, 2022 at 11:06:51AM +0200, Drouvot, Bertrand wrote:
Hi,
On 9/8/22 8:50 AM, Julien Rouhaud wrote:
Thanks for looking at it!
On Thu, Sep 08, 2022 at 02:23:19PM +0900, Michael Paquier wrote:
On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:
I didn't fully debug yet, but here's the backtrace on my 14.4 build with
the patchWhat happens on HEAD? That would be the target branch for a new
feature.It would be the same AFAICS. From v3:
+ case T_VariableSetStmt: + { + VariableSetStmt *stmt = (VariableSetStmt *) node; + + APP_JUMB_STRING(stmt->name); + JumbleExpr(jstate, (Node *) stmt->args); + }For a RESET ALL command stmt->name is NULL.
Right, please find attached v4 addressing the issue and also Sami's comments
[1].(Sorry I've not been following this thread until now)
IME if your application relies on 2PC it's very likely that you will hit the
exact same problems described in your original email.
Agree
What do you think about
normalizing those too while working on the subject?
That sounds reasonable, I'll have a look at those too while at it.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com