Assert in pg_stat_statements
Hi,
I just found that pg_stat_statements causes assert when queryId is
set by other module, which is loaded prior to pg_stat_statements in
the shared_preload_libraries parameter.
Theoretically, queryId in the shared memory could be set by other
module by design.
So, IMHO, pg_stat_statements should not cause assert even if queryId
already has non-zero value --- my module has this problem now.
Is my understanding correct? Any comments?
Regards,
--
NAGAYASU Satoshi <snaga@uptime.jp>
Attachments:
pg_stat_statements.c.difftext/plain; charset=UTF-8; name=pg_stat_statements.c.diffDownload
commit b975d7c2fe1b36a3ded1e0960be191466704e0fc
Author: Satoshi Nagayasu <snaga@uptime.jp>
Date: Sat Aug 8 08:51:45 2015 +0000
Fix pg_stat_statements to avoid assert failure when queryId already has non-zero value.
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 59b8a2e..84c5200 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -776,8 +776,9 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
if (prev_post_parse_analyze_hook)
prev_post_parse_analyze_hook(pstate, query);
- /* Assert we didn't do this already */
- Assert(query->queryId == 0);
+ /* Assume that other module has calculated and set queryId */
+ if (query->queryId > 0)
+ return;
/* Safety check... */
if (!pgss || !pgss_hash)
On 2015/08/08 22:32, Robert Haas wrote:
On Sat, Aug 8, 2015 at 5:00 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
I just found that pg_stat_statements causes assert when queryId is
set by other module, which is loaded prior to pg_stat_statements in
the shared_preload_libraries parameter.Theoretically, queryId in the shared memory could be set by other
module by design.So, IMHO, pg_stat_statements should not cause assert even if queryId
already has non-zero value --- my module has this problem now.Is my understanding correct? Any comments?
Seems like an ERROR would be more appropriate than an Assert.
Well, it's not that simple, and I'm afraid that it may not fix
the issue.
Here, let's assume that we have two modules (foo and pg_stat_statements)
which calculate, use and store Query->queryId independently.
What we may see when two are enabled is following.
(1) Enable two modules in shared_preload_libraries.
shared_preload_libraries = 'foo,pg_stat_statements'
(2) Once a query comes in, a callback function in module "foo"
associated with post_parse_analyze_hook calculates and store
queryId in Query->queryId.
(3) After that, a callback function (pgss_post_parse_analyze) in
"pg_stat_statements" associated with post_parse_analyze_hook
detects that Query->queryId has non-zero value, and asserts it.
As a result, we can not have two or more modules that use queryId
at the same time.
But I think it should be possible because one of the advantages of
PostgreSQL is its extensibility.
So, I think the problem here is the fact that pg_stat_statements
deals with having non-zero value in queryId as "error" even if
it has exact the same value with other module.
Regards,
--
NAGAYASU Satoshi <snaga@uptime.jp>
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Import Notes
Reply to msg id not found: CA+TgmoZJKM8HZ+4Qk8qQNQDtSDXkb9tJLrx+janVd1YJpvGtw@mail.gmail.com
On Sun, Aug 9, 2015 at 1:36 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
On 2015/08/08 22:32, Robert Haas wrote:
On Sat, Aug 8, 2015 at 5:00 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
I just found that pg_stat_statements causes assert when queryId is
set by other module, which is loaded prior to pg_stat_statements in
the shared_preload_libraries parameter.Theoretically, queryId in the shared memory could be set by other
module by design.So, IMHO, pg_stat_statements should not cause assert even if queryId
already has non-zero value --- my module has this problem now.Is my understanding correct? Any comments?
Seems like an ERROR would be more appropriate than an Assert.
Well, it's not that simple, and I'm afraid that it may not fix
the issue.Here, let's assume that we have two modules (foo and pg_stat_statements)
which calculate, use and store Query->queryId independently.What we may see when two are enabled is following.
(1) Enable two modules in shared_preload_libraries.
shared_preload_libraries = 'foo,pg_stat_statements'
(2) Once a query comes in, a callback function in module "foo"
associated with post_parse_analyze_hook calculates and store
queryId in Query->queryId.(3) After that, a callback function (pgss_post_parse_analyze) in
"pg_stat_statements" associated with post_parse_analyze_hook
detects that Query->queryId has non-zero value, and asserts it.As a result, we can not have two or more modules that use queryId
at the same time.But I think it should be possible because one of the advantages of
PostgreSQL is its extensibility.So, I think the problem here is the fact that pg_stat_statements
deals with having non-zero value in queryId as "error" even if
it has exact the same value with other module.
I'm not too excited about supporting the use case where there are two
people using queryId but it just so happens that they always set
exactly the same value. That seems like a weird setup. Wouldn't that
mean both modules were applying the same jumble algorithm, and
therefore you're doing the work of computing the jumble twice per
query? It would be better to have the second module have an option
not to compute the query ID and just do whatever else it does. Then,
if you want to use that other module with pg_stat_statements, flip the
GUC.
The reason I think an Assert is wrong is that if there are other
modules using queryId, we should catch attempts to use the queryId for
more than one purpose even in non-assert enabled builds.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
I'm not too excited about supporting the use case where there are two
people using queryId but it just so happens that they always set
exactly the same value. That seems like a weird setup. Wouldn't that
mean both modules were applying the same jumble algorithm, and
therefore you're doing the work of computing the jumble twice per
query?
Not only that, but you'd need to keep the two modules in sync, which
would be one of the greatest recipes for bugs we've thought of yet.
If there's actually a use case for that sort of thing, I would vote
for moving the jumble-calculation code into core and making both of
the interested extensions do
/* Calculate query hash if it's not been done yet */
if (query->queryId == 0)
calculate_query_hash(query);
I note also that pg_stat_statements is using query->queryId == 0 as a
state flag, which means that if some other module has already calculated
the hash that would break it, even if the value were identical.
(In particular, it's using the event of calculating the queryId as an
opportunity to possibly make an entry in its own hashtable.) So you'd
need to do something to replace that logic if you'd like to use
pg_stat_statements concurrently with some other use of queryId.
The reason I think an Assert is wrong is that if there are other
modules using queryId, we should catch attempts to use the queryId for
more than one purpose even in non-assert enabled builds.
Yeah, an explicit runtime check and elog() would be safer.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sun, Aug 9, 2015 at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
If there's actually a use case for that sort of thing, I would vote
for moving the jumble-calculation code into core
I think that there'd be a good case for doing that for several other
reasons. It would be great to have a per-queryId
log_min_duration_statement, for example.
Most likely, this would work based on a system of grouping queries.
One grouping might be "SLA queries", that the user hopes to prevent
ever hobbling loading the application's home page. Outliers in this
group of queries are far more interesting to the user than any other,
or expectations are in some other way quite different from the
average. At the same time, maybe for reporting queries 60 seconds is
considered an unreasonably long time to wait.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2015-08-10 0:04 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
On Sun, Aug 9, 2015 at 1:36 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
On 2015/08/08 22:32, Robert Haas wrote:
On Sat, Aug 8, 2015 at 5:00 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
I just found that pg_stat_statements causes assert when queryId is
set by other module, which is loaded prior to pg_stat_statements in
the shared_preload_libraries parameter.Theoretically, queryId in the shared memory could be set by other
module by design.So, IMHO, pg_stat_statements should not cause assert even if queryId
already has non-zero value --- my module has this problem now.Is my understanding correct? Any comments?
Seems like an ERROR would be more appropriate than an Assert.
Well, it's not that simple, and I'm afraid that it may not fix
the issue.Here, let's assume that we have two modules (foo and pg_stat_statements)
which calculate, use and store Query->queryId independently.What we may see when two are enabled is following.
(1) Enable two modules in shared_preload_libraries.
shared_preload_libraries = 'foo,pg_stat_statements'
(2) Once a query comes in, a callback function in module "foo"
associated with post_parse_analyze_hook calculates and store
queryId in Query->queryId.(3) After that, a callback function (pgss_post_parse_analyze) in
"pg_stat_statements" associated with post_parse_analyze_hook
detects that Query->queryId has non-zero value, and asserts it.As a result, we can not have two or more modules that use queryId
at the same time.But I think it should be possible because one of the advantages of
PostgreSQL is its extensibility.So, I think the problem here is the fact that pg_stat_statements
deals with having non-zero value in queryId as "error" even if
it has exact the same value with other module.I'm not too excited about supporting the use case where there are two
people using queryId but it just so happens that they always set
exactly the same value. That seems like a weird setup.
I think so too, but unfortunately, I have to do that to work
with 9.4 and 9.5.
Wouldn't that
mean both modules were applying the same jumble algorithm, and
therefore you're doing the work of computing the jumble twice per
query?
Basically yes, because both modules should work not only together,
but also solely. So, my module need to have capability to calculate
queryId itself.
More precisely, if we have following configuration,
shared_preload_libraries = 'foo,pg_stat_statements'
my module "foo" calculates queryId itself, and pg_stat_statements
would do the same. (and gets crashed when it has --enable-cassert)
If we have following configuration,
shared_preload_libraries = 'pg_stat_statements,foo'
my module "foo" can skip queryId calculation because
pg_stat_statements has already done that.
Of course, my module "foo" should be able to work solely as following.
shared_preload_libraries = 'foo'
It would be better to have the second module have an option
not to compute the query ID and just do whatever else it does. Then,
if you want to use that other module with pg_stat_statements, flip the
GUC.
Yes, that's exactly what I'm doing in my module, and what I intended
in my first patch for pg_stat_statements on this thread. :)
Regards,
--
NAGAYASU Satoshi <snaga@uptime.jp>
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2015-08-10 2:23 GMT+09:00 Tom Lane <tgl@sss.pgh.pa.us>:
Robert Haas <robertmhaas@gmail.com> writes:
I'm not too excited about supporting the use case where there are two
people using queryId but it just so happens that they always set
exactly the same value. That seems like a weird setup. Wouldn't that
mean both modules were applying the same jumble algorithm, and
therefore you're doing the work of computing the jumble twice per
query?Not only that, but you'd need to keep the two modules in sync, which
would be one of the greatest recipes for bugs we've thought of yet.If there's actually a use case for that sort of thing, I would vote
for moving the jumble-calculation code into core and making both of
the interested extensions do/* Calculate query hash if it's not been done yet */
if (query->queryId == 0)
calculate_query_hash(query);
I vote for this too.
Jumble-calculation is the smartest way to identify query groups,
so several modules can take advantages of it if in the core.
Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers