{PROPOSAL] add session information column to pg_stat_statements
I have renamed the thread [PROPOSAL] extend the object names to the
qualified names in pg_stat_statements
started on
/messages/by-id/9baf5c06-d6ab-c688-010c-843348e3d98c@gmail.com
and ended on
/messages/by-id/c93bb5ce-22bd-eb6b-a057-d0666585258f@gmail.com
Currently pg_stat_statements doesn't have enough information to distinct
queries executed in the different environment;
for example the queries with the same text from pg_stat_statements.query
column can be the different queries if they were executed with
the different search path parameter.
The initial proposed solution was to extend names of the objects in the
query to the qualified names, so for the text
"select * from t1"
from the pg_stat_statements.query column the new proposed column
pg_stat_statements.query_qn would have the text
"select * from s1.t1"
Based on the discussion this solution has proved to be
a) relatively difficult to implement and slow to execute
b) resolves only the missed schema name problem but other differences in
the execution environment would require some new extra columns
So the new proposed change addresses these concerns by been
a) faster (probably)
b) extensible
and, of course, it is backward compatible with the existing
pg_stat_statements view.
We can add a column pg_stat_statements.session_info jsonb.
Its content can be defined by the new configuration parameter
pg_stat_statements.session_info ('current_schemas, current_user,
session_user') // a subset of the data from the system information functions
or in the initial implementation it can be hardcoded to include at least
current_schemas and current user.
and it will have data like
{
"current_schemas" : ["pg_catalog", "s1", "s2", "public"],
"current_user" : "user1",
"session_user" : "user1"
}
It will allow the DBA/developer to understand and reproduce a
performance issue, and will allow the deeper level of granularity for
the reporting tools.
I don't know how difficult will be to implement something like that.
Thank you,
Sergei
I'm also very interested by collecting "search_path" information for
statements,
but this information may not be unique for pg_stat_statements key
(dbid,userid,queryid) ...
How would this 1-N relation be handled ?
1/ just catch initial session_info for each (dbid,userid,queryid),
2/ adding a sessid field in pgss key, and a table for distinct
sessid,session_info values,
3/ store session_info in an array,
4/ ...
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
On 12/2/2018 2:22 AM, legrand legrand wrote:
I'm also very interested by collecting "search_path" information for
statements,
but this information may not be unique for pg_stat_statements key
(dbid,userid,queryid) ...How would this 1-N relation be handled ?
1/ just catch initial session_info for each (dbid,userid,queryid),
2/ adding a sessid field in pgss key, and a table for distinct
sessid,session_info values,
3/ store session_info in an array,
4/ ...
According to documentation queryid is an "Internal hash code, computed
from the statement's parse tree"
so I expect it to be different for the same query with different
current_schemas or current_user
outside of the possible hash collisions if the changes current_schemas
or current_user in ant way affect the execution of the query.
Queryid can be different even for the same combination of
pg_stat_statements.query + pg_stat_statements.session_info (objects can
be dropped and recreated etc.)
I would say that option 1 makes the most sense.
When a user or an application changes search_path or session_user he/she
expects that it will affect the execution of at least some queries,
that's why these setting were introduced in the first place.
If you know that you execute the same queries on the different schemas
then you would like to group your statistics by query+current_schemas.
If you know that you change current_user for row level security then you
would like to group your statistics by query+current_user.
But you need to preserve this session_info for every queryid in
pg_stat_statements.
Thank you,
Sergei