Record SET session in VariableSetStmt
Hi hackers,
"SET local" is currently recorded in VariableSetStmt (with the boolean
is_local) but "SET session" is not.
Please find attached a patch proposal to also record "SET session" so
that VariableSetStmt records all the cases.
Remark: Recording "SET session" will also help for the Jumbling work
being done in [1]/messages/by-id/66be1104-164f-dcb8-6c43-f03a68a139a7@gmail.com.
[1]: /messages/by-id/66be1104-164f-dcb8-6c43-f03a68a139a7@gmail.com
/messages/by-id/66be1104-164f-dcb8-6c43-f03a68a139a7@gmail.com
Looking forward to your feedback,
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-record-set-session-in-VariableSetStmt.patchtext/plain; charset=UTF-8; name=v1-0001-record-set-session-in-VariableSetStmt.patchDownload
commit 2401087366432b0a520ac45947c4584924099617
Author: bdrouvotAWS <bdrouvot@amazon.com>
Date: Thu Oct 6 10:20:18 2022 +0000
v1-0001-record-set-session-in-VariableSetStmt.patch
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 94d5142a4a..809e617d18 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -1563,6 +1563,7 @@ VariableSetStmt:
VariableSetStmt *n = $2;
n->is_local = false;
+ n->is_session = false;
$$ = (Node *) n;
}
| SET LOCAL set_rest
@@ -1570,6 +1571,7 @@ VariableSetStmt:
VariableSetStmt *n = $3;
n->is_local = true;
+ n->is_session = false;
$$ = (Node *) n;
}
| SET SESSION set_rest
@@ -1577,6 +1579,7 @@ VariableSetStmt:
VariableSetStmt *n = $3;
n->is_local = false;
+ n->is_session = true;
$$ = (Node *) n;
}
;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 633e7671b3..c3f33e7c4a 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2228,6 +2228,7 @@ typedef struct VariableSetStmt
char *name; /* variable to be set */
List *args; /* List of A_Const nodes */
bool is_local; /* SET LOCAL? */
+ bool is_session; /* SET SESSION? */
} VariableSetStmt;
/* ----------------------
Hi,
On Thu, Oct 06, 2022 at 12:57:17PM +0200, Drouvot, Bertrand wrote:
Hi hackers,
"SET local" is currently recorded in VariableSetStmt (with the boolean
is_local) but "SET session" is not.Please find attached a patch proposal to also record "SET session" so that
VariableSetStmt records all the cases.Remark: Recording "SET session" will also help for the Jumbling work being
done in [1].
I don't think it's necessary. SET and SET SESSION are semantically the same so
nothing should rely on how exactly someone spelled it. This is also the case
for our core jumbling code, where we guarantee (or at least try to) that two
semantically identical statements will get the same queryid, and therefore
don't distinguish eg. LIKE vs ~~.
Hi,
On 10/6/22 1:18 PM, Julien Rouhaud wrote:
Hi,
On Thu, Oct 06, 2022 at 12:57:17PM +0200, Drouvot, Bertrand wrote:
Hi hackers,
"SET local" is currently recorded in VariableSetStmt (with the boolean
is_local) but "SET session" is not.Please find attached a patch proposal to also record "SET session" so that
VariableSetStmt records all the cases.Remark: Recording "SET session" will also help for the Jumbling work being
done in [1].I don't think it's necessary. SET and SET SESSION are semantically the same
Thanks for your feedback!
so
nothing should rely on how exactly someone spelled it. This is also the case
for our core jumbling code, where we guarantee (or at least try to) that two
semantically identical statements will get the same queryid, and therefore
don't distinguish eg. LIKE vs ~~.
Agree, but on the other hand currently SET and SET SESSION are recorded
with distinct queryid:
postgres=# select calls, query, queryid from pg_stat_statements;
calls | query | queryid
-------+---------------------------------------------+----------------------
2 | select calls, query from pg_stat_statements | -6345508659980235519
1 | set session enable_seqscan=1 | -3921418831612111986
1 | create extension pg_stat_statements | -1739183385080879393
1 | set enable_seqscan=1 | 7925920505912025406
(4 rows)
and this behavior would change with the Jumbling work in progress in [1]
(mentioned up-thread) if we don't record "SET SESSION".
I think that would make sense to keep the same behavior, what do you think?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Oct 06, 2022 at 02:19:32PM +0200, Drouvot, Bertrand wrote:
On 10/6/22 1:18 PM, Julien Rouhaud wrote:
so
nothing should rely on how exactly someone spelled it. This is also the case
for our core jumbling code, where we guarantee (or at least try to) that two
semantically identical statements will get the same queryid, and therefore
don't distinguish eg. LIKE vs ~~.Agree, but on the other hand currently SET and SET SESSION are recorded with
distinct queryid:postgres=# select calls, query, queryid from pg_stat_statements;
calls | query | queryid
-------+---------------------------------------------+----------------------
2 | select calls, query from pg_stat_statements | -6345508659980235519
1 | set session enable_seqscan=1 | -3921418831612111986
1 | create extension pg_stat_statements | -1739183385080879393
1 | set enable_seqscan=1 | 7925920505912025406
(4 rows)and this behavior would change with the Jumbling work in progress in [1]
(mentioned up-thread) if we don't record "SET SESSION".I think that would make sense to keep the same behavior, what do you think?
It's because until now jumbling of utility statements was just hashing the
query text, which is quite terrible. This was also implying getting different
queryids for things like this:
=# select query, queryid from pg_stat_statements where query like '%work_mem%';;
query | queryid
-----------------------+----------------------
SeT work_mem = 123465 | -1114638544275583196
Set work_mem = 123465 | -1966597613643458788
SET work_mem = 123465 | 4161441071081149574
seT work_mem = 123465 | 8327271737593275474
(4 rows)
If we move to a real jumbling of VariableSetStmt, we should keep the rules
consistent with the rest of the jumble code and ignore an explicit "SESSION" in
the original command.
On 10/6/22 2:28 PM, Julien Rouhaud wrote:
On Thu, Oct 06, 2022 at 02:19:32PM +0200, Drouvot, Bertrand wrote:
On 10/6/22 1:18 PM, Julien Rouhaud wrote:
so
nothing should rely on how exactly someone spelled it. This is also the case
for our core jumbling code, where we guarantee (or at least try to) that two
semantically identical statements will get the same queryid, and therefore
don't distinguish eg. LIKE vs ~~.Agree, but on the other hand currently SET and SET SESSION are recorded with
distinct queryid:postgres=# select calls, query, queryid from pg_stat_statements;
calls | query | queryid
-------+---------------------------------------------+----------------------
2 | select calls, query from pg_stat_statements | -6345508659980235519
1 | set session enable_seqscan=1 | -3921418831612111986
1 | create extension pg_stat_statements | -1739183385080879393
1 | set enable_seqscan=1 | 7925920505912025406
(4 rows)and this behavior would change with the Jumbling work in progress in [1]
(mentioned up-thread) if we don't record "SET SESSION".I think that would make sense to keep the same behavior, what do you think?
It's because until now jumbling of utility statements was just hashing the
query text, which is quite terrible. This was also implying getting different
queryids for things like this:=# select query, queryid from pg_stat_statements where query like '%work_mem%';;
query | queryid
-----------------------+----------------------
SeT work_mem = 123465 | -1114638544275583196
Set work_mem = 123465 | -1966597613643458788
SET work_mem = 123465 | 4161441071081149574
seT work_mem = 123465 | 8327271737593275474
(4 rows)If we move to a real jumbling of VariableSetStmt, we should keep the rules
consistent with the rest of the jumble code and ignore an explicit "SESSION" in
the original command.
Understood, so I agree that it makes sense to keep the jumbling behavior
consistent and so keep the same queryid for statements that are
semantically identical.
Thanks for your feedback!
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Oct 06, 2022 at 08:28:27PM +0800, Julien Rouhaud wrote:
If we move to a real jumbling of VariableSetStmt, we should keep the rules
consistent with the rest of the jumble code and ignore an explicit "SESSION" in
the original command.
Hm, interesting bit, I should study more this area. So the query ID
calculation actually only cares about the contents of the Nodes
parsed, while the query string used is the one when the entry is
created for the first time. It seems like the patch to add
TransactionStmt nodes into the jumbling misses something here, as we'd
still compile different query IDs depending on the query string itself
for simple commands like BEGIN or COMMIT. I'll reply on the other
thread about all that..
--
Michael
On Fri, Oct 07, 2022 at 10:30:28AM +0900, Michael Paquier wrote:
On Thu, Oct 06, 2022 at 08:28:27PM +0800, Julien Rouhaud wrote:
If we move to a real jumbling of VariableSetStmt, we should keep the rules
consistent with the rest of the jumble code and ignore an explicit "SESSION" in
the original command.Hm, interesting bit, I should study more this area. So the query ID
calculation actually only cares about the contents of the Nodes
parsed, while the query string used is the one when the entry is
created for the first time. It seems like the patch to add
TransactionStmt nodes into the jumbling misses something here, as we'd
still compile different query IDs depending on the query string itself
for simple commands like BEGIN or COMMIT. I'll reply on the other
thread about all that..
Ah, indeed we have different TransactionStmtKind for BEGIN and START!