Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Hey pg developers,
Do you think if we can add queryId into the pg_stat_get_activity function
and ultimatly expose it in the view? It would be easier to track "similar"
query's performance over time easier.
Thanks a lot!
Yun
Yun Li <liyunjuanyong@gmail.com> writes:
Do you think if we can add queryId into the pg_stat_get_activity function
and ultimatly expose it in the view? It would be easier to track "similar"
query's performance over time easier.
No, we're not likely to do that, because it would mean (1) baking one
single definition of "query ID" into the core system and (2) paying
the cost to calculate that ID all the time.
pg_stat_statements has a notion of query ID, but that notion might be
quite inappropriate for other usages, which is why it's an extension
and not core.
regards, tom lane
On Fri, Mar 15, 2019 at 9:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yun Li <liyunjuanyong@gmail.com> writes:
Do you think if we can add queryId into the pg_stat_get_activity function
and ultimatly expose it in the view? It would be easier to track "similar"
query's performance over time easier.No, we're not likely to do that, because it would mean (1) baking one
single definition of "query ID" into the core system and (2) paying
the cost to calculate that ID all the time.pg_stat_statements has a notion of query ID, but that notion might be
quite inappropriate for other usages, which is why it's an extension
and not core.
Having written an extension that also wanted a query ID, I disagree
with this position. There's only one query ID field available, and
you can't use two extensions that care about query ID unless they
compute it the same way, and replicating all the code that computes
the query ID into each new extension that wants one sucks. I think we
should actually bite the bullet and move all of that code into core,
and then just let extensions say whether they care about it getting
set.
Also, I think this is now the third independent request to expose
query ID in pg_stat_statements. I think we should give the people
what they want.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Mar 15, 2019 at 9:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
pg_stat_statements has a notion of query ID, but that notion might be
quite inappropriate for other usages, which is why it's an extension
and not core.
Having written an extension that also wanted a query ID, I disagree
with this position.
[ shrug... ] The fact remains that pg_stat_statements's definition is
pretty lame. There's a lot of judgment calls in which query fields
it chooses to examine or ignore, and there's been no attempt at all
to make the ID PG-version-independent, and I rather doubt that it's
platform-independent either. Nor will the IDs survive a dump/reload
even on the same server, since object OIDs will likely change.
These things are OK, or at least mostly tolerable, for pg_stat_statements'
usage ... but I don't think it's a good idea to have the core code
dictating that definition to all extensions. Right now, if you have
an extension that needs some other query-ID definition, you can do it,
you just can't run that extension alongside pg_stat_statements.
But you'll be out of luck if the core code starts filling that field.
I'd be happier about having the core code compute a query ID if we
had a definition that was not so obviously slapped together.
regards, tom lane
On Sat, Mar 16, 2019 at 5:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Mar 15, 2019 at 9:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
pg_stat_statements has a notion of query ID, but that notion might be
quite inappropriate for other usages, which is why it's an extension
and not core.Having written an extension that also wanted a query ID, I disagree
with this position.[ shrug... ] The fact remains that pg_stat_statements's definition is
pretty lame. There's a lot of judgment calls in which query fields
it chooses to examine or ignore, and there's been no attempt at all
to make the ID PG-version-independent, and I rather doubt that it's
platform-independent either. Nor will the IDs survive a dump/reload
even on the same server, since object OIDs will likely change.These things are OK, or at least mostly tolerable, for pg_stat_statements'
usage ... but I don't think it's a good idea to have the core code
dictating that definition to all extensions. Right now, if you have
an extension that needs some other query-ID definition, you can do it,
you just can't run that extension alongside pg_stat_statements.
But you'll be out of luck if the core code starts filling that field.I'd be happier about having the core code compute a query ID if we
had a definition that was not so obviously slapped together.
But the queryId itself is stored in core. Exposing it in
pg_stat_activity or log_line_prefix would still allow users to choose
the implementation of their choice, or none. That seems like a
different complaint from asking pgss integration in core to have all
its metrics available by default (or at least without a restart).
Maybe we could add a GUC for pg_stat_statements to choose whether it
should set the queryid itself and not, if anyone wants to have its
metrics but with different queryid semantics?
Hello,
This is available in https://github.com/legrandlegrand/pg_stat_sql_plans
extension with a specific function
pgssp_backend_queryid(pid) that permits to join pg_stat_activity with
pg_stat_sql_plans (that is similar to pg_stat_statements) and also permits
to collect samples of wait events per query id.
This extension computes its own queryid based on a normalized query text
(that doesn't change after table
drop/create).
Maybe that queryid calculation should stay in a dedicated extension,
permiting to users to choose their queryid definition.
Regards
PAscal
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Thanks a lot for really good points!! I did not expected I will get this
many points of view. :P
I have identical experience with Robert when other extension calculate the
id different as PGSS, PGSS will overwritten that id when it is on. But Tom
got a point that if we centralize the logic that pgss has, then other
extension will have no way to change it unless we have some new config to
toggle pointed out by Julien. Also Tom got the concern about the current
PGSS jumble query logic is not bullet proof and may take time then impact
the perf.
Let's take one step back. Since queryId is stored in core as Julien pointed
out, can we just add that global to the pg_stat_get_activity and ultimately
exposed in pg_stat_activity view? Then no matter whether PGSS is on or
off, or however the customer extensions are updating that filed, we expose
that field in that view then enable user to leverage that id to join with
pgss or their extension. Will this sounds a good idea?
Thanks again,
Yun
On Sat, Mar 16, 2019 at 11:01 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
Show quoted text
On Sat, Mar 16, 2019 at 5:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Mar 15, 2019 at 9:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
pg_stat_statements has a notion of query ID, but that notion might be
quite inappropriate for other usages, which is why it's an extension
and not core.Having written an extension that also wanted a query ID, I disagree
with this position.[ shrug... ] The fact remains that pg_stat_statements's definition is
pretty lame. There's a lot of judgment calls in which query fields
it chooses to examine or ignore, and there's been no attempt at all
to make the ID PG-version-independent, and I rather doubt that it's
platform-independent either. Nor will the IDs survive a dump/reload
even on the same server, since object OIDs will likely change.These things are OK, or at least mostly tolerable, for
pg_stat_statements'
usage ... but I don't think it's a good idea to have the core code
dictating that definition to all extensions. Right now, if you have
an extension that needs some other query-ID definition, you can do it,
you just can't run that extension alongside pg_stat_statements.
But you'll be out of luck if the core code starts filling that field.I'd be happier about having the core code compute a query ID if we
had a definition that was not so obviously slapped together.But the queryId itself is stored in core. Exposing it in
pg_stat_activity or log_line_prefix would still allow users to choose
the implementation of their choice, or none. That seems like a
different complaint from asking pgss integration in core to have all
its metrics available by default (or at least without a restart).Maybe we could add a GUC for pg_stat_statements to choose whether it
should set the queryid itself and not, if anyone wants to have its
metrics but with different queryid semantics?
Hello
On Sat, Mar 16, 2019 at 7:32 AM Robert Haas <robertmhaas@gmail.com> wrote:
Also, I think this is now the third independent request to expose
query ID in pg_stat_statements. I think we should give the people
what they want.
Count me as the 4th.
This would be a very important feature for automated query analysis.
pg_stat_statements lacks query examples, and the only way to get them is
from the logs.
Where we don't have queryid as well. So people end up either doing it
manually or writing
yet another set of nasty regular expressions.
Routing query analysis s a crucial for any large project. If there are
chances to implement
queryid for pg_stat_activity (or anything that will allow to automate query
analysis)
in Postgres 12 or later -- this would be a great news and huge support for
engineers.
Same level as recently implemented sampling for statement logging.
By the way, if queryid goes to the core someday, I'm sure it is worth to
consider using
it in logs as well.
Thanks,
Nik
On Mon, Mar 18, 2019 at 6:23 PM Yun Li <liyunjuanyong@gmail.com> wrote:
Let's take one step back. Since queryId is stored in core as Julien pointed out, can we just add that global to the pg_stat_get_activity and ultimately exposed in pg_stat_activity view? Then no matter whether PGSS is on or off, or however the customer extensions are updating that filed, we expose that field in that view then enable user to leverage that id to join with pgss or their extension. Will this sounds a good idea?
I'd greatly welcome expose queryid exposure in pg_stat_activity, and
also in log_line_prefix. I'm afraid that it's too late for pg12
inclusion, but I'll be happy to provide a patch for that for pg13.
On 3/16/19 5:32 PM, Robert Haas wrote:
There's only one query ID field available, and
you can't use two extensions that care about query ID unless they
compute it the same way, and replicating all the code that computes
the query ID into each new extension that wants one sucks. I think we
should actually bite the bullet and move all of that code into core,
and then just let extensions say whether they care about it getting
set.
+1.
But I think that enough to integrate into core the query normalization
routine and store generalized query strings (from which the queryId is
produced) in shared memory (for example, hashtable that maps queryId to
the text representation of generalized query). And activate
normalization routine and filling the table of generalized queries by
specified GUC.
This allows to unbind extensions that require queryId from using
pg_stat_statements and consider such computing of queryId as canonical.
--
Regards,
Maksim Milyutin
On Tue, Mar 19, 2019 at 2:45 PM Maksim Milyutin <milyutinma@gmail.com> wrote:
But I think that enough to integrate into core the query normalization
routine and store generalized query strings (from which the queryId is
produced) in shared memory (for example, hashtable that maps queryId to
the text representation of generalized query).
That's more or less how pg_stat_statements was previously behaving,
and it had too many problems. Current implementation, with an
external file, is a better alternative.
And activate
normalization routine and filling the table of generalized queries by
specified GUC.This allows to unbind extensions that require queryId from using
pg_stat_statements and consider such computing of queryId as canonical.
The problem I see with this approach is that if you want a different
implementation, you'll have to reimplement the in-core normalised
queries saving and retrieval, but with a different set of SQL-visible
functions. I don't think that's it's acceptable, unless we add a
specific hook for query normalisation and queryid computing. But it
isn't ideal either, as it would be a total mess if someone changes the
implementation without resetting the previously saved normalised
queries.
On Mon, Mar 18, 2019 at 7:33 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
On Mon, Mar 18, 2019 at 6:23 PM Yun Li <liyunjuanyong@gmail.com> wrote:
Let's take one step back. Since queryId is stored in core as Julien pointed out, can we just add that global to the pg_stat_get_activity and ultimately exposed in pg_stat_activity view? Then no matter whether PGSS is on or off, or however the customer extensions are updating that filed, we expose that field in that view then enable user to leverage that id to join with pgss or their extension. Will this sounds a good idea?
I'd greatly welcome expose queryid exposure in pg_stat_activity, and
also in log_line_prefix. I'm afraid that it's too late for pg12
inclusion, but I'll be happy to provide a patch for that for pg13.
Here's a prototype patch for queryid exposure in pg_stat_activity and
log_line prefix.
Attachments:
queryid_exposure-v1.difftext/x-patch; charset=US-ASCII; name=queryid_exposure-v1.diffDownload+91-7
The queryId depends on oids, so it is not stable enough for some purposes.
For example, to create a SQL identifier that survives across a server
upgrade, or that can be shipped to another database, the queryId isn't
usable.
The apg_plan_mgmt extensions keeps both its own stable SQL identifier as
well as the queryId, so it can be used to join to pg_stat_statements if
desired. If we were to standardize on one SQL identifier, it should be
stable enough to survive a major version upgrade or to be the same in
different databases.
-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
On Tue, Mar 19, 2019 at 1:24 PM Jim Finnerty <jfinnert@amazon.com> wrote:
The queryId depends on oids, so it is not stable enough for some purposes.
For example, to create a SQL identifier that survives across a server
upgrade, or that can be shipped to another database, the queryId isn't
usable.The apg_plan_mgmt extensions keeps both its own stable SQL identifier as
well as the queryId, so it can be used to join to pg_stat_statements if
desired. If we were to standardize on one SQL identifier, it should be
stable enough to survive a major version upgrade or to be the same in
different databases.
If Amazon would like to open-source its (AIUI) proprietary technology
for computing query IDs and propose it for inclusion in PostgreSQL,
cool, but I think that is a separate question from whether people
would like more convenient access to the query ID technology that we
have today. I think it's 100% clear that they would like that, even
as things stand, and therefore it does not make sense to block that
behind Amazon deciding to share what it already has or somebody else
trying to reimplement it.
If we need to have a space for both a core-standard query ID and
another query ID that is available for extension use, adding one more
field to struct Query, so we can have both coreQueryId and
extensionQueryId or whatever, would be easy to do. It appears that
there's more use case than I would have guessed for custom query IDs.
On the other hand, it also appears that a lot of people would be very,
very happy to just be able to see the query ID field that already
exists, both in pg_stat_statements in pg_stat_activity, and we
shouldn't throw up unnecessary impediments in the way of making that
happen, at least IMHO.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Great,
thank you Julien !
Would it make sense to add it in auto explain ?
I don't know for explain itself, but maybe ...
Regards
PAscal
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
On Tue, Mar 19, 2019 at 8:38 PM legrand legrand
<legrand_legrand@hotmail.com> wrote:
Would it make sense to add it in auto explain ?
I don't know for explain itself, but maybe ...
I'd think that people interested in getting the queryid in the logs
would configure the log_line_prefix to display it consistently rather
than having it in only a subset of cases, so that's probably not
really needed.
Hi Jim, Robert,
As this is a distinct subject from adding QueryId to pg_stat_activity,
would it be possible to continue the discussion "new QueryId definition"
(for postgres open source software) here:
/messages/by-id/1553029215728-0.post@n3.nabble.com
Thanks in advance.
Regards
PAscal
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Would it make sense to add it in auto explain ?
I don't know for explain itself, but maybe ...
I'd think that people interested in getting the queryid in the logs
would configure the log_line_prefix to display it consistently rather
than having it in only a subset of cases, so that's probably not
really needed.
Ok.
Shoudn't you add this to commitfest ?
On Mon, Mar 25, 2019 at 12:36 PM legrand legrand
<legrand_legrand@hotmail.com> wrote:
Would it make sense to add it in auto explain ?
I don't know for explain itself, but maybe ...I'd think that people interested in getting the queryid in the logs
would configure the log_line_prefix to display it consistently rather
than having it in only a subset of cases, so that's probably not
really needed.Ok.
Shoudn't you add this to commitfest ?
I added it last week, see https://commitfest.postgresql.org/23/2069/
Shoudn't you add this to commitfest ?
I added it last week, see https://commitfest.postgresql.org/23/2069/
Oups, sorry for the noise