pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

Started by Fabien COELHOalmost 12 years ago17 messages
#1Fabien COELHO
coelho@cri.ensmp.fr

Hello pgdevs,

I noticed that my pg_stat_statements is cluttered with hundreds of entries
like "DEALLOCATE dbdpg_p123456_7", occuring each only once.

It seems to me that it would be more helful if these similar entries where
aggregated together, that is if the query "normalization" could ignore the
name of the descriptor.

Any thoughts about this?

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Fabien COELHO (#1)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

On 04/01/2014 10:45 AM, Fabien COELHO wrote:

Hello pgdevs,

I noticed that my pg_stat_statements is cluttered with hundreds of
entries like "DEALLOCATE dbdpg_p123456_7", occuring each only once.

It seems to me that it would be more helful if these similar entries
where aggregated together, that is if the query "normalization" could
ignore the name of the descriptor.

Any thoughts about this?

You might find this relevant:
<http://blog.endpoint.com/2014/02/perl-dbdpg-postgresql-prepared-statement.html&gt;

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Andrew Dunstan (#2)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

I noticed that my pg_stat_statements is cluttered with hundreds of entries
like "DEALLOCATE dbdpg_p123456_7", occuring each only once.

It seems to me that it would be more helful if these similar entries where
aggregated together, that is if the query "normalization" could ignore the
name of the descriptor.

Any thoughts about this?

You might find this relevant:
<http://blog.endpoint.com/2014/02/perl-dbdpg-postgresql-prepared-statement.html&gt;

Indeed. Thanks for the pointer. I had guessed who the culprit was, and the
new behavior mentioned in the blog entry may help when the new driver
version hits my debian box.

In the mean time, ISTM that progress can be achieved on pg_stat_statements
normalization as well.

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Fabien COELHO (#1)
2 attachment(s)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

Hello devs,

I noticed that my pg_stat_statements is cluttered with hundreds of entries
like "DEALLOCATE dbdpg_p123456_7", occuring each only once.

Here is a patch and sql test file to:

* normalize DEALLOCATE utility statements in pg_stat_statements

Some drivers such as DBD:Pg generate process/counter-based identifiers for
PREPARE, which result in hundreds of DEALLOCATE being tracked, although
the prepared query may be the same. This is also consistent with the
corresponding PREPARE not being tracked (although the underlying prepared
query *is* tracked).

** Note **: another simpler option would be to skip deallocates altogether
by inserting a "&& !IsA(parsetree, DeallocateStmt)" at the beginning of
pgss_ProcessUtility(). I'm not sure what is best.

--
Fabien.

Attachments:

pgss-norm-deallocate.patchtext/x-diff; name=pgss-norm-deallocate.patchDownload
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 07f09e1..4b3348a 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -966,10 +966,19 @@ pgss_ProcessUtility(Node *parsetree, const char *queryString,
 		BufferUsage bufusage_start,
 					bufusage;
 		uint32		queryId;
+		const char *normalQueryString;
 
 		bufusage_start = pgBufferUsage;
 		INSTR_TIME_SET_CURRENT(start);
 
+		/*
+		 * normalize queryString for DEALLOCATE, as drivers such as DBD::Pg
+		 * generate counter-based names for prepared statements which can
+		 * result in many useless entries.
+		 */
+		normalQueryString = strncasecmp(queryString, "DEALLOCATE", 10)==0 ?
+			"DEALLOCATE ?;": queryString;
+
 		nested_level++;
 		PG_TRY();
 		{
@@ -1025,9 +1034,9 @@ pgss_ProcessUtility(Node *parsetree, const char *queryString,
 		INSTR_TIME_SUBTRACT(bufusage.blk_write_time, bufusage_start.blk_write_time);
 
 		/* For utility statements, we just hash the query string directly */
-		queryId = pgss_hash_string(queryString);
+		queryId = pgss_hash_string(normalQueryString);
 
-		pgss_store(queryString,
+		pgss_store(normalQueryString,
 				   queryId,
 				   INSTR_TIME_GET_MILLISEC(duration),
 				   rows,
pgss-norm-deallocate.sqlapplication/x-sql; name=pgss-norm-deallocate.sqlDownload
#5Andres Freund
andres@2ndquadrant.com
In reply to: Fabien COELHO (#1)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

Hi,

On 2014-04-01 16:45:29 +0200, Fabien COELHO wrote:

I noticed that my pg_stat_statements is cluttered with hundreds of entries
like "DEALLOCATE dbdpg_p123456_7", occuring each only once.

Why isn't the driver using the extended query protocol? Sending
PREPARE/EXECUTE/DEALLOCATE wastes roundtrips...

It seems to me that it would be more helful if these similar entries where
aggregated together, that is if the query "normalization" could ignore the
name of the descriptor.

I'm not in favor of this. If there's DEALLOCATEs that are frequent
enough to drown out other entries you should

a) Consider using the extended query protocol.
b) consider using unnamed prepared statements to reduce the number of
roundtrips
c) wonder why PREPARE/DEALLOCATE are so much more frequent than the
actualy query execution.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Andres Freund
andres@2ndquadrant.com
In reply to: Andres Freund (#5)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

On 2014-07-20 13:54:01 +0200, Andres Freund wrote:

Hi,

On 2014-04-01 16:45:29 +0200, Fabien COELHO wrote:

I noticed that my pg_stat_statements is cluttered with hundreds of entries
like "DEALLOCATE dbdpg_p123456_7", occuring each only once.

Why isn't the driver using the extended query protocol? Sending
PREPARE/EXECUTE/DEALLOCATE wastes roundtrips...

Hm. It's probably because libqp doesn't expose sending Close message for
prepared statements :(. No idea why.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Marko Tiikkaja
marko@joh.to
In reply to: Andres Freund (#6)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

On 2014-07-20 14:06, Andres Freund wrote:

On 2014-07-20 13:54:01 +0200, Andres Freund wrote:

On 2014-04-01 16:45:29 +0200, Fabien COELHO wrote:

I noticed that my pg_stat_statements is cluttered with hundreds of entries
like "DEALLOCATE dbdpg_p123456_7", occuring each only once.

Why isn't the driver using the extended query protocol? Sending
PREPARE/EXECUTE/DEALLOCATE wastes roundtrips...

Hm. It's probably because libqp doesn't expose sending Close message for
prepared statements :(. No idea why.

Yeah, I always considered that a missing feature, and even wrote a patch
to add it at some point. I wonder what happened to it.

.marko

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Andres Freund (#5)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

Hello Andres,

Why isn't the driver using the extended query protocol? Sending
PREPARE/EXECUTE/DEALLOCATE wastes roundtrips...

It seems to me that it would be more helful if these similar entries where
aggregated together, that is if the query "normalization" could ignore the
name of the descriptor.

I'm not in favor of this. If there's DEALLOCATEs that are frequent
enough to drown out other entries you should

Thanks for the advice. I'm not responsible for the application nor the
driver, and I think that pg_stat_statements should be consistent and
reasonable independently of drivers and applications.

a) Consider using the extended query protocol.
b) consider using unnamed prepared statements to reduce the number of
roundtrips
c) wonder why PREPARE/DEALLOCATE are so much more frequent than the
actualy query execution.

(1) I'm not responsible for DBD::Pg allocating "random" names to prepared
statements, even if the queries are the same, and that accumulate over
time (weeks, possibly months).

(2) pg_stat_statements is currently inconsistent anyway, as PREPARE is not
counted (but the underlying query is on each EXECUTE), although its
corresponding DEALLOCATE is counted, so I think that something is needed
for consistency.

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Andres Freund
andres@2ndquadrant.com
In reply to: Fabien COELHO (#8)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

On 2014-07-20 14:43:27 +0200, Fabien COELHO wrote:

a) Consider using the extended query protocol.
b) consider using unnamed prepared statements to reduce the number of
roundtrips
c) wonder why PREPARE/DEALLOCATE are so much more frequent than the
actualy query execution.

(1) I'm not responsible for DBD::Pg allocating "random" names to prepared
statements, even if the queries are the same, and that accumulate over time
(weeks, possibly months).

(2) pg_stat_statements is currently inconsistent anyway, as PREPARE is not
counted (but the underlying query is on each EXECUTE), although its
corresponding DEALLOCATE is counted, so I think that something is needed for
consistency.

That's because PREPARE isn't executed as it's own statement, but done on
the protocol level (which will need noticeably fewer messages). There's
no builtin logic to ignore actual PREPARE statements. So I don't think
your consistency argument counts as much here.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Andres Freund (#9)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

That's because PREPARE isn't executed as it's own statement, but done on
the protocol level (which will need noticeably fewer messages). There's
no builtin logic to ignore actual PREPARE statements.

ISTM that there is indeed a special handling in function
pgss_ProcessUtility for PREPARE and EXECUTE:

/*
* If it's an EXECUTE statement, we don't track it and don't
increment the
* nesting level. This allows the cycles to be charged to the
underlying
* PREPARE instead (by the Executor hooks), which is much more
useful.
*
* We also don't track execution of PREPARE. If we did, we would
get one
* hash table entry for the PREPARE (with hash calculated from the
query
* string), and then a different one with the same query string
(but hash
* calculated from the query tree) would be used to accumulate
costs of
* ensuing EXECUTEs. This would be confusing, and inconsistent
with other
* cases where planning time is not included at all.
*/
if (pgss_track_utility && pgss_enabled() &&
!IsA(parsetree, ExecuteStmt) &&
!IsA(parsetree, PrepareStmt))
... standard handling ...
else
... special "no" handling ...

So I don't think your consistency argument counts as much here.

I think that given the above code, my argument stands reasonably.

If you do not like my normalization hack (I do not like it much either:-),
I have suggested to add "&& !IsA(parsetree, DeallocateStmt)" to the
condition above, which would ignore DEALLOCATE as PREPARE and EXECUTE are
currently and rightfully ignored.

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Fabien COELHO (#10)
1 attachment(s)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

That's because PREPARE isn't executed as it's own statement, but done on
the protocol level (which will need noticeably fewer messages). There's
no builtin logic to ignore actual PREPARE statements.

ISTM that there is indeed a special handling in function
pgss_ProcessUtility for PREPARE and EXECUTE:

[...]

For completeness purpose, here is the one-liner patch to handle DEALLOCATE
as PREPARE & EXECUTE are handled. It is cleaner than the other one, but
then DEALLOCATE disappear from the table, as PREPARE and EXECUTE do.

--
Fabien.

Attachments:

pgss-norm-deallocate-simple.patchtext/x-diff; name=pgss-norm-deallocate-simple.patchDownload
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 07f09e1..a8000cb 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -955,10 +955,14 @@ pgss_ProcessUtility(Node *parsetree, const char *queryString,
 	 * calculated from the query tree) would be used to accumulate costs of
 	 * ensuing EXECUTEs.  This would be confusing, and inconsistent with other
 	 * cases where planning time is not included at all.
+	 *
+	 * DEALLOCATE of prepared statement is skip, especially as some drivers
+	 * use process/counter-based names which end up cluttering the stats table.
 	 */
 	if (pgss_track_utility && pgss_enabled() &&
 		!IsA(parsetree, ExecuteStmt) &&
-		!IsA(parsetree, PrepareStmt))
+		!IsA(parsetree, PrepareStmt) &&
+		!IsA(parsetree, DeallocateStmt))
 	{
 		instr_time	start;
 		instr_time	duration;
#12Andres Freund
andres@2ndquadrant.com
In reply to: Fabien COELHO (#10)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

On 2014-07-20 17:01:50 +0200, Fabien COELHO wrote:

That's because PREPARE isn't executed as it's own statement, but done on
the protocol level (which will need noticeably fewer messages). There's
no builtin logic to ignore actual PREPARE statements.

ISTM that there is indeed a special handling in function pgss_ProcessUtility
for PREPARE and EXECUTE:

/*
* If it's an EXECUTE statement, we don't track it and don't increment the
* nesting level. This allows the cycles to be charged to the underlying
* PREPARE instead (by the Executor hooks), which is much more useful.
*
* We also don't track execution of PREPARE. If we did, we would get one
* hash table entry for the PREPARE (with hash calculated from the query
* string), and then a different one with the same query string (but hash
* calculated from the query tree) would be used to accumulate costs of
* ensuing EXECUTEs. This would be confusing, and inconsistent with other
* cases where planning time is not included at all.
*/
if (pgss_track_utility && pgss_enabled() &&
!IsA(parsetree, ExecuteStmt) &&
!IsA(parsetree, PrepareStmt))
... standard handling ...
else
... special "no" handling ...

So I don't think your consistency argument counts as much here.

I think that given the above code, my argument stands reasonably.

Ick. You have a point.

If you do not like my normalization hack (I do not like it much either:-), I
have suggested to add "&& !IsA(parsetree, DeallocateStmt)" to the condition
above, which would ignore DEALLOCATE as PREPARE and EXECUTE are currently
and rightfully ignored.

Well, EXECUTE isn't actually ignored, but tracked via the execution
time. But that doesn't diminish your point with PREPARE. If we do
something we should go for the && !IsA(parsetree, DeallocateStmt), not
the normalization. The latter is pretty darn bogus.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Andres Freund (#12)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

[...]. If we do something we should go for the && !IsA(parsetree,
DeallocateStmt), not the normalization.

Ok.

The latter is pretty darn bogus.

Yep:-) I'm fine with ignoring DEALLOCATE altogether.

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#12)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-07-20 17:01:50 +0200, Fabien COELHO wrote:

If you do not like my normalization hack (I do not like it much either:-), I
have suggested to add "&& !IsA(parsetree, DeallocateStmt)" to the condition
above, which would ignore DEALLOCATE as PREPARE and EXECUTE are currently
and rightfully ignored.

Well, EXECUTE isn't actually ignored, but tracked via the execution
time. But that doesn't diminish your point with PREPARE. If we do
something we should go for the && !IsA(parsetree, DeallocateStmt), not
the normalization. The latter is pretty darn bogus.

Agreed. I think basically the reasoning here is "since we don't track
PREPARE or EXECUTE, we shouldn't track DEALLOCATE either".

However, this is certainly a behavioral change. Perhaps squeeze it
into 9.4, but not the back braches?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tom Lane (#14)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

If you do not like my normalization hack (I do not like it much either:-), I
have suggested to add "&& !IsA(parsetree, DeallocateStmt)" to the condition
above, which would ignore DEALLOCATE as PREPARE and EXECUTE are currently
and rightfully ignored.

Well, EXECUTE isn't actually ignored, but tracked via the execution
time. But that doesn't diminish your point with PREPARE. If we do
something we should go for the && !IsA(parsetree, DeallocateStmt), not
the normalization. The latter is pretty darn bogus.

Agreed. I think basically the reasoning here is "since we don't track
PREPARE or EXECUTE, we shouldn't track DEALLOCATE either".

Yes. It is not just because it is nicely symmetric, it is also annoying to
have hundreds of useless DEALLOCATE stats in the table.

However, this is certainly a behavioral change. Perhaps squeeze it
into 9.4,

That would be nice, and the one-liner looks safe enough.

but not the back braches?

Yep. I doubt that pg_stat_statements users rely on statistics about
DEALLOCATE, so back patching the would be quite safe as well, but I would
not advocate it.

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Peter Geoghegan
pg@heroku.com
In reply to: Tom Lane (#14)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

On Sun, Jul 20, 2014 at 10:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, this is certainly a behavioral change. Perhaps squeeze it
into 9.4, but not the back braches?

+1

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Peter Geoghegan (#16)
Re: pg_stat_statements cluttered with "DEALLOCATE dbdpg_p*"

On 07/20/2014 11:51 PM, Peter Geoghegan wrote:

On Sun, Jul 20, 2014 at 10:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, this is certainly a behavioral change. Perhaps squeeze it
into 9.4, but not the back braches?

+1

Ok, done. (We're a month closer to releasing 9.4 than we were when this
consensus was reached, but I think it's still OK...)

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers