pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Started by Peter Geoghegan about 14 years ago · 82 messages · pgsql-hackers

Attached is a revision of the pg_stat_statements normalisation patch,
plus its Python/psycopg2/dellstore2 test suite, which has yet to be
converted to produce pg_regress output. This revision is a response to
the last round of feedback given by reviewers. Highlights include:

* No more invasive changes to the parser. The only ways that this
patch touches the core code are the addition of a new hook for hashing
the post-parse-analysis tree (technically two hooks -
parse_analyze_hook and parse_analyze_varparams_hook), and the addition
of a query_id field to the Query and PlannedStmt structs, which the
core system simply copies around naively. This resolves the hook
synchronisation issues that had less elegant workarounds in prior
revisions.

* We now use the internal, low-level scanner API declared in
scanner.h, so that pg_stat_statements has the capability of robustly
detecting a given constant's length based only on its position in the
query string (taken from Const nodes, as before) and the string
itself.
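
For illustration, the "take the length of one token at the constant's
recorded position" idea might look like the following toy. This is a
hand-rolled stand-in for the real scanner.h API - the patch uses the
core scanner, not this sketch - and it only handles string and numeric
literals:

```c
#include <ctype.h>
#include <stddef.h>

/* Given the query text and a constant's byte offset (as recorded in a
 * Const node), return the length of exactly one literal token starting
 * there, so the normaliser knows how many bytes to replace with '?'. */
static size_t constant_token_length(const char *query, size_t offset)
{
    const char *p = query + offset;
    const char *start = p;

    if (*p == '\'')
    {
        /* string literal: scan to the closing quote, inclusive */
        p++;
        while (*p && *p != '\'')
            p++;
        if (*p == '\'')
            p++;
    }
    else
    {
        /* numeric literal: scan digits and a decimal point */
        while (isdigit((unsigned char) *p) || *p == '.')
            p++;
    }
    return (size_t) (p - start);
}
```

With the real scanner this is robust against quoting edge cases
(escapes, dollar quoting) that this toy ignores.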

* All my old regression tests pass, and I've added quite a few new
ones as problems transpired, including tests that exercise
canonicalisation of what might be considered edge-case query strings,
such as ones with many large constants. There are 100 tests just for
that, which use pseudo-random constants to exercise the
canonicalisation logic thoroughly. Once things start to shape up, I'll
modify that Python script to spit out pg_regress tests - it seems
worth delaying committing to that less flexible approach for now,
though, and clearly not all of the hundreds of tests are going to make
the cut, as at certain points I was shooting from the hip, so to
speak. I'll do something similar to sepgsql, another contrib module
that has tests.

* All the regular Postgres regression tests now pass, with the new
pg_stat_statements enabled, and with both parallel and serial
schedules. There are no unrecognised nodes, nor any other apparent
failures, with assertions enabled. All strings that are subsequently
seen in the view are correctly canonicalised, with the exception of 3 or
4 corner cases, noted below. These may well not be worth fixing, or
may be down to subtle bugs in the core system parser that we ought to
fix.

* Continual hashing is now used, so that arbitrarily long queries can
be differentiated (though of course we are still subject to the
previous limitation of a query string being capped to
pgstat_track_activity_query_size - now, that's what the
*canonicalised* query is capped at). That's another improvement on
9.1's pg_stat_statements, which didn't see any differences past
pgstat_track_activity_query_size (default: 1024) characters.
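
The chunked-hashing idea can be sketched outside Postgres. This is a
minimal illustration using plain 64-bit FNV-1a (not necessarily the
hash the patch uses), showing why carrying hash state across chunks
distinguishes queries that only differ past any fixed buffer cap:

```c
#include <stdint.h>
#include <string.h>

/* Feed one chunk of text into a running FNV-1a hash. Because the hash
 * state carries across calls, bytes beyond any single chunk boundary
 * still affect the final value. */
static uint64_t hash_chunk(uint64_t h, const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        h ^= (unsigned char) buf[i];
        h *= 1099511628211ULL;          /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Hash an arbitrarily long query string in fixed-size chunks; the
 * chunk size stands in for a buffer cap like 1024 bytes. */
static uint64_t hash_query(const char *query)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV-1a offset basis */
    size_t len = strlen(query);
    const size_t chunk = 1024;

    for (size_t off = 0; off < len; off += chunk)
        h = hash_chunk(h, query + off,
                       (len - off < chunk) ? len - off : chunk);
    return h;
}
```

Two query strings identical in their first 1024 bytes but diverging
later now hash differently, even though the stored (canonicalised)
string is still capped.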

There are a number of outstanding issues that I'm aware of:

* Under some situations, the logic through which we determine the
length of constants is a little fragile, though I believe we can find
a solution. In particular, consider this query:

select integer '1';

this normalises to:

select ?

and not, as was the case in prior revisions:

select integer ?;

This is because the post analysis tree, unlike the post rewrite tree,
appears to give the position of the constant in this case as starting
with the datatype, so I'm forced to try and work out a way to have the
length of the constant considered as more than a single token. I'll
break on reaching a SCONST token in this case, but there are other
cases that require careful workarounds. I wouldn't be surprised if
someone was able to craft a query to break this logic. Ideally, I'd be
able to assume that constants are exactly one token, allowing me to
greatly simplify the code.

* I am aware that it's suboptimal how I initialise the scanner once
for each time a constant of a given query is first seen. The function
get_constant_length is fairly messy, but the fact that we may only
need to take the length of a single token in a future revision (once
we address the previous known issue) doesn't leave me with much
motivation to clean it up just yet.

* # XXX: This test currently fails!:
# verify_normalizes_correctly("SELECT cast('1' as dnotnull);",
#     "SELECT cast(? as dnotnull);", conn,
#     "domain literal canonicalization/cast")

It appears to fail because the CoerceToDomain node gives its location
to the constant node as starting from "cast", so we end up with
"SELECT ?('1' as dnotnull);". I'm not quite sure if this points to
there being a slight tension with my use of the location field in this
way, or if this is something that could be fixed as a bug in core
(albeit a highly obscure one), though I suspect the latter.

* I'm still not using a core mechanism like query_tree_walker to walk
the tree, which would be preferable. The maintainability of the walker
logic was criticized. At about 800 lines of code in total for the
walker logic (for the functions PerformJumble, QualsNode, LeafNode,
LimitOffsetNode, JoinExprNode, JoinExprNodeChild), for structures that
in practice are seldom changed, with a good test suite, I think we
could do a lot worse. We now raise a warning rather than an error in
the event of an unrecognised node, which seems more sensible - people
really aren't going to thank you for making their entire query fail,
just because we failed to serialise some node at some point. I don't
think that we can get away with just accumulating node tags much of the
time, at least if we'd like to implement this feature as I'd
envisaged, which is that it would be robust and comprehensive.
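
The warn-don't-fail policy can be illustrated with a toy dispatcher.
The node tags and structure here are invented for illustration and are
not the patch's actual symbols:

```c
#include <stdio.h>

/* Toy node representation; the real walker dispatches on nodeTag(). */
typedef enum { T_Const, T_Var, T_FuncExpr, T_FutureNode } NodeTag;
typedef struct { NodeTag tag; } Node;

static int warnings = 0;

static void jumble_node(const Node *node)
{
    switch (node->tag)
    {
        case T_Const:
        case T_Var:
        case T_FuncExpr:
            /* ...hash the node's salient fields here... */
            break;
        default:
            /* Unknown node: degrade gracefully rather than aborting.
             * The user's query must not fail just because we couldn't
             * serialise one node. */
            fprintf(stderr, "WARNING: unrecognised node type %d\n",
                    (int) node->tag);
            warnings++;
            break;
    }
}
```

The cost of the graceful path is only that two queries differing in an
unrecognised node may collapse into one entry.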

* If we use prepared statements, it's possible that an entry, created
from within our parse analysis hook, will get evicted from the
fixed-sized shared hash table before it is once again executed within
our executor hook. Now, if this happens, we won't be able to
canonicalise the query string constants again. However, it can
probably only happen with prepared statements (I concede that eviction
might be possible between a given backend's parse analysis hook and
executor hook being called - not really sure. Might be worth holding a
shared lock between the hooks in that case, on the off chance that the
query string won't be canonicalised, but then again that's a fairly
rare failure). People aren't going to care too much about
canonicalisation of prepared statement constants, but I haven't just
removed it and hashed the query string there because it may still be
valuable to be able to differentiate arbitrarily long prepared
queries.

Maybe the answer here is to have pg_stat_statements tell the core
system "this is that querytree's original query string now". That
would have hazards of its own though, including invalidating the
positions of constants. Another option would be to add a
normalized_query char* to the Query and PlannedStmt structs, with
which the core system does much the same thing as the query_id field
in the proposed patch.

* The way that I maintain a stack of range tables, so that Vars whose
varlevelsup != 0 can rt_fetch() an RTE to hash its relid, may be less
than idiomatic. There is a function used elsewhere on the raw parse
tree to do something similar, but that tree has a parent pointer that
can be followed, which is not available to me.

* I would have liked to give pg_stat_statements a configurable
eviction criterion, so that queries with the lowest total execution
time could be evicted first, rather than those with the lowest number
of calls. I haven't done that here.
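
A configurable criterion could amount to little more than swapping the
comparator used when ranking eviction victims. A hedged sketch, with
invented field names (the real entries live in a shared hash table,
not an array):

```c
#include <stdlib.h>

/* Minimal stand-in for a statistics entry. */
typedef struct
{
    long   calls;       /* number of executions */
    double total_time;  /* cumulative execution time, ms */
} Entry;

/* Ascending by call count: current behaviour, fewest-called go first. */
static int by_calls(const void *a, const void *b)
{
    long ca = ((const Entry *) a)->calls;
    long cb = ((const Entry *) b)->calls;
    return (ca > cb) - (ca < cb);
}

/* Ascending by total time: cheapest queries become eviction victims. */
static int by_total_time(const void *a, const void *b)
{
    double ta = ((const Entry *) a)->total_time;
    double tb = ((const Entry *) b)->total_time;
    return (ta > tb) - (ta < tb);
}

/* Order candidates with the configured criterion; callers then evict
 * from the front of the array. */
static void sort_for_eviction(Entry *entries, size_t nentries,
                              int (*cmp)(const void *, const void *))
{
    qsort(entries, nentries, sizeof(Entry), cmp);
}
```

A GUC could then select which comparator the eviction pass uses.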

Thoughts?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

normalization_regression.py (text/x-python, US-ASCII)
pg_stat_statements_norm_2012_02_16.patch (text/x-patch, US-ASCII, +1599 -258)
In reply to: Peter Geoghegan (#1)
Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 16 February 2012 21:11, Peter Geoghegan <peter@2ndquadrant.com> wrote:

*       # XXX: This test currently fails!:
       # verify_normalizes_correctly("SELECT cast('1' as dnotnull);",
       #     "SELECT cast(? as dnotnull);", conn,
       #     "domain literal canonicalization/cast")

It appears to fail because the CoerceToDomain node gives its location
to the constant node as starting from "cast", so we end up with
"SELECT ?('1' as dnotnull);". I'm not quite sure if this points to
there being a slight tension with my use of the location field in this
way, or if this is something that could be fixed as a bug in core
(albeit a highly obscure one), though I suspect the latter.

So I looked at this in more detail today, and it turns out that it has
nothing to do with CoerceToDomain in particular. The same effect can
be observed by doing this:

select cast('foo' as text);

It turns out that this happens for the same reason as the location of
the Const token in the following query:

select integer '5';

being given such that the string "select ?" results.

Resolving this one issue resolves some others, as it allows me to
greatly simplify the get_constant_length() logic.

Here is the single, hacky change I've made just for now to the core
parser to quickly see if it all works as expected:

*************** transformTypeCast(ParseState *pstate, Ty
*** 2108,2113 ****
--- 2108,2116 ----
  	if (location < 0)
  		location = tc->typeName->location;
+ 	if (IsA(expr, Const))
+ 		location = ((Const*)expr)->location;
+
  	result = coerce_to_target_type(pstate, expr, inputType,
  								   targetType, targetTypmod,
  								   COERCION_EXPLICIT,

After making this change, I can get all my regression tests to pass
(once I change the normalised representation of certain queries to
look like: "select integer ?" rather than "select ?", which is better
anyway), including the CAST()/CoerceToDomain one that previously
failed. So far so good.

Clearly this change is a quick and dirty workaround, and something
better is required. The question I'd pose to the maintainer of this
code is: what business does the coerce_to_target_type call have
changing the location of the Const node resulting from coercion under
the circumstances described? I understand that the location of the
CoerceToDomain should be at "CAST", but why should the underlying
Const's position be the same? Do you agree that this is a bug, and if
so, would you please facilitate me by committing a fix?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

In reply to: Peter Geoghegan (#2)
Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 20 February 2012 23:16, Peter Geoghegan <peter@2ndquadrant.com> wrote:

Clearly this change is a quick and dirty workaround, and something
better is required. The question I'd pose to the maintainer of this
code is: what business does the coerce_to_target_type call have
changing the location of the Const node resulting from coercion under
the circumstances described? I understand that the location of the
CoerceToDomain should be at "CAST", but why should the underlying
Const's position be the same?

Another look around shows that the CoerceToDomain struct takes its
location from the new Const location in turn, so my dirty little hack
will break the location of the CoerceToDomain struct, giving an
arguably incorrect caret position in some error messages. It would
suit me if MyCoerceToDomain->arg (or the "arg" of a similar node
related to coercion, like CoerceViaIO) pointed to a Const node with,
potentially, and certainly in the case of my original CoerceToDomain
test case, a distinct location to the coercion node itself.

Can we do that?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#4 Tom Lane <tgl@sss.pgh.pa.us>
In reply to: Peter Geoghegan (#2)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

Here is the single, hacky change I've made just for now to the core
parser to quickly see if it all works as expected:

*************** transformTypeCast(ParseState *pstate, Ty
*** 2108,2113 ****
--- 2108,2116 ----
if (location < 0)
location = tc->typeName->location;
+ 	if (IsA(expr, Const))
+ 		location = ((Const*)expr)->location;
+
result = coerce_to_target_type(pstate, expr, inputType,
targetType, targetTypmod,
COERCION_EXPLICIT,

This does not look terribly sane to me. AFAICS, the main effect of this
would be that if you have an error in coercing a literal to some
specified type, the error message would point at the literal and not
at the cast operator. That is, in examples like these:

regression=# select 42::point;
ERROR: cannot cast type integer to point
LINE 1: select 42::point;
^
regression=# select cast (42 as point);
ERROR: cannot cast type integer to point
LINE 1: select cast (42 as point);
^

you're proposing to move the error pointer to the "42", and that does
not seem like an improvement, especially not if it only happens when the
cast subject is a simple constant rather than an expression.

regards, tom lane

#5 Tom Lane <tgl@sss.pgh.pa.us>
In reply to: Peter Geoghegan (#3)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

Another look around shows that the CoerceToDomain struct takes its
location from the new Const location in turn, so my dirty little hack
will break the location of the CoerceToDomain struct, giving an
arguably incorrect caret position in some error messages. It would
suit me if MyCoerceToDomain->arg (or the "arg" of a similar node
related to coercion, like CoerceViaIO) pointed to a Const node with,
potentially, and certainly in the case of my original CoerceToDomain
test case, a distinct location to the coercion node itself.

Sorry, I'm not following. What about that isn't true already?

regards, tom lane

In reply to: Tom Lane (#4)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 21 February 2012 01:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

you're proposing to move the error pointer to the "42", and that does
not seem like an improvement, especially not if it only happens when the
cast subject is a simple constant rather than an expression.

I'm not actually proposing that though. What I'm proposing, quite
simply, is that the Const location actually be correct in all
circumstances. Now, I can understand why the Coercion node for this
query would have its current location starting from the "CAST" part in
your second example or would happen to be the same as the Constant in
your first, and I'm not questioning that. I'm questioning why the
Const node's location need to *always* be the same as that of the
Coercion node when pg_stat_statements walks the tree, since I'd have
imagined that Postgres has no business blaming the error that you've
shown on the Const node.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

In reply to: Tom Lane (#4)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 21 February 2012 01:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

you're proposing to move the error pointer to the "42", and that does
not seem like an improvement, especially not if it only happens when the
cast subject is a simple constant rather than an expression.

2008's commit a2794623d292f7bbfe3134d1407281055acce584 [1] added the
following code to parse_coerce.c [2]:

/* Use the leftmost of the constant's and coercion's locations */
if (location < 0)
    newcon->location = con->location;
else if (con->location >= 0 && con->location < location)
    newcon->location = con->location;
else
    newcon->location = location;

With that commit, Tom made a special case of both Const and Param
nodes, and had them take the leftmost location of the original Const
location and the coercion location. Clearly, he judged that the
current exact set of behaviours with regard to caret position were
optimal. It is my contention that:

A. They may not actually be optimal, at least not according to my
taste. At the very least, it is a hack to misrepresent the location of
Const nodes just so the core system can blame things on Const nodes
and have the user see the coercion being at fault. I appreciate that
it wouldn't have seemed to matter at the time, but the fact remains.

B. The question of where the caret goes in relevant cases - the
location of the coercion, or the location of the constant - is
inconsequential to the vast majority of Postgres users, if not all,
even if the existing behaviour is technically superior according to
the prevailing aesthetic. On the other hand, it matters a lot to me
that I be able to trust the Const location under all circumstances -
I'd really like to not have to engineer a way around this behaviour,
because the only way to do that is with tricks with the low-level
scanner API, which would be quite brittle. The fact that "select
integer '5'" is canonicalised to "select ?" isn't very pretty. That's
not the only issue though, as even to get that more limited behaviour
lots more code is required, that is more difficult to verify as
correct. "Canonicalise one token at each Const location" is a simple
and robust approach, if only the core system could be tweaked to make
this assumption hold in all circumstances, rather than just the vast
majority.

Tom's point example does not seem to be problematic to me - the cast
*should* blame the 42 const token, as the cast doesn't work as a
result of its representation, which is in point of fact why the core
system blames the Const node and not the coercion one. For that
reason, the constant vs expression thing strikes me as a false
equivalence. All of that said, I must reiterate that the difference in
behaviour strikes me as very unimportant, or it would if it were not
so important to what I'm trying to do with pg_stat_statements.

Can this be accommodated? It might be a matter of changing the core
system to blame the coercion node rather than the Const node, if
you're determined to preserve the existing behaviour.

[1]: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a2794623d292f7bbfe3134d1407281055acce584

[2]: http://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/backend/parser/parse_coerce.c;h=cd9b7b0cfbed03ec74f2cf295e4a7113627d7f72;hp=1244498ffb291b67d35917a6fdddb54b0d8d759d;hb=a2794623d292f7bbfe3134d1407281055acce584;hpb=6734182c169a1ecb74dd8495004e896ee4519adb

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#8 Robert Haas <robertmhaas@gmail.com>
In reply to: Peter Geoghegan (#7)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Fri, Feb 24, 2012 at 9:43 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

Tom's point example does not seem to be problematic to me - the cast
*should* blame the 42 const token, as the cast doesn't work as a
result of its representation, which is in point of fact why the core
system blames the Const node and not the coercion one.

I think I agree with Tom's position upthread: blaming the coercion seems to
me to make more sense. But if that's what we're trying to do, then
why does parse_coerce() say this?

/*
* Set up to point at the constant's text if the input routine throws
* an error.
*/

/me is confused.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9 Tom Lane <tgl@sss.pgh.pa.us>
In reply to: Robert Haas (#8)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Robert Haas <robertmhaas@gmail.com> writes:

I think I agree with Tom's position upthread: blaming the coercion seems to
me to make more sense. But if that's what we're trying to do, then
why does parse_coerce() say this?

/*
* Set up to point at the constant's text if the input routine throws
* an error.
*/

/me is confused.

There are two cases that are fundamentally different in the eyes of the
system:

'literal string'::typename defines a constant of the named type.
The string is fed to the type's input routine de novo, that is, it never
really had any other type. (Under the hood, it had type UNKNOWN for a
short time, but that's an implementation detail.) In this situation it
seems appropriate to point at the text string if the input routine
doesn't like it, because it is the input string and nothing else that is
wrong.

On the other hand, when you cast something that already had a known type
to some other type, any failure seems reasonable to blame on the cast
operator.

So in these terms there's a very real difference between what
'42'::bigint means and what 42::bigint means --- the latter implies
forming an int4 constant and then converting it to int8.

I think that what Peter is on about in
http://archives.postgresql.org/pgsql-hackers/2012-02/msg01152.php
is the question of what location to use for the *result* of
'literal string'::typename, assuming that the type's input function
doesn't complain. Generally we consider that we should use the
leftmost token's location for the location of any expression composed
of more than one input token. This is of course the same place for
'literal string'::typename, but not for the alternate syntaxes
typename 'literal string' and cast('literal string' as typename).
I'm not terribly impressed by the proposal to put in an arbitrary
exception to that general rule for the convenience of this patch.

Especially not when the only reason it's needed is that Peter is
doing the fingerprinting at what is IMO the wrong place anyway.
If he were working on the raw grammar output it wouldn't matter
what parse_coerce chooses to do afterwards.

regards, tom lane

In reply to: Tom Lane (#9)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 27 February 2012 06:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think that what Peter is on about in
http://archives.postgresql.org/pgsql-hackers/2012-02/msg01152.php
is the question of what location to use for the *result* of
'literal string'::typename, assuming that the type's input function
doesn't complain.  Generally we consider that we should use the
leftmost token's location for the location of any expression composed
of more than one input token.  This is of course the same place for
'literal string'::typename, but not for the alternate syntaxes
typename 'literal string' and cast('literal string' as typename).
I'm not terribly impressed by the proposal to put in an arbitrary
exception to that general rule for the convenience of this patch.

Now, you don't have to be. It was a mistake on my part to bring the
current user-visible behaviour into this. I don't see that there is
necessarily a tension between your position that we should blame the
leftmost token's location, and my contention that the Const "location"
field shouldn't misrepresent the location of certain Consts involved
in coercion post-analysis.

Let me put that in concrete terms. In my working copy of the patch, I
have made some more changes to the core system (mostly reverting
things that turned out to be unnecessary), but I have also made the
following change:

*** a/src/backend/parser/parse_coerce.c
--- b/src/backend/parser/parse_coerce.c
*************** coerce_type(ParseState *pstate, Node *no
*** 280,293 ****
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
! 		/* Use the leftmost of the constant's and coercion's locations */
! 		if (location < 0)
! 			newcon->location = con->location;
! 		else if (con->location >= 0 && con->location < location)
! 			newcon->location = con->location;
! 		else
! 			newcon->location = location;
!
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
--- 280,286 ----
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
! 		newcon->location = con->location;
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
*********************

This does not appear to have any user-visible effect on caret position
for all variations in coercion syntax, while giving me everything that
I need. I had assumed that we were relying on things being this way,
but apparently this is not the case. The system is correctly blaming
the coercion token when it finds the coercion is at fault, and the
const token when it finds the Const node at fault, just as it did
before. So this looks like a case of removing what amounts to dead
code.

Especially not when the only reason it's needed is that Peter is
doing the fingerprinting at what is IMO the wrong place anyway.
If he were working on the raw grammar output it wouldn't matter
what parse_coerce chooses to do afterwards.

Well, I believe that your reason for preferring to do it at that stage
was that we could not capture all of the system's "normalisation
smarts" - like the fact that the omission of noise words isn't a
differentiator - because much of it, such as the recognition of the
equivalence of explicit joins and queries with join conditions in the
WHERE clause, occurs within the planner. We can't have it all, so we
might as well not have any. My solution here is to be sufficiently
vague about the behaviour of normalisation that the user has no
reasonable basis to count on that kind of more advanced reduction
occurring.

I did very seriously consider hashing the raw parse tree, but I have
several practical reasons for not doing so. Whatever way you look at
it, hashing there is going to result in more code, that is more ugly.
There is no uniform parent node that I can tag with a query_id. There
have to be more modifications to the core system so that the queryId
value is carried around in more places and persists for longer. The fact that
I'd actually be hashing different structs at different times (that
tree is accessed through a Node pointer) would necessitate lots of
redundant code that operated on each of the very similar structs in an
analogous way. The fact is that waiting until after parse analysis has
plenty of things to recommend it, and yes, the fact that we already
have working code with extensive regression tests is one of them.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#11 Daniel Farina <daniel@heroku.com>
In reply to: Peter Geoghegan (#10)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Mon, Feb 27, 2012 at 4:26 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

This does not appear to have any user-visible effect on caret position
for all variations in coercion syntax, while giving me everything that
I need. I had assumed that we were relying on things being this way,
but apparently this is not the case. The system is correctly blaming
the coercion token when it finds the coercion is at fault, and the
const token when it finds the Const node at fault, just as it did
before. So this looks like a case of removing what amounts to dead
code.

To shed some light on that hypothesis, attached is a patch whereby I
use 'semantic analysis by compiler error' to show the extent of the
reach of the changes by renaming (codebase-wide) the Const node's
location symbol. The scope whereby the error token will change
position is small and amenable to analysis. I don't see a problem,
nor wide-reaching consequences. As Peter says: probably dead code.
Note that the cancellation of the error position happens very soon,
after an invocation of stringTypeDatum (on two sides of a branch).
Pre and post-patch is puts the carat at the beginning of the constant
string, even in event there is a failure to parse it properly to the
destined type.

--
fdr

Attachments:

Straw-man-to-show-the-effects-of-the-change-to-const.patch (application/octet-stream, +21 -28)
In reply to: Daniel Farina (#11)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 29 February 2012 09:05, Daniel Farina <daniel@heroku.com> wrote:

To shed some light on that hypothesis, attached is a patch whereby I
use 'semantic analysis by compiler error' to show the extent of the
reach of the changes by renaming (codebase-wide) the Const node's
location symbol.  The scope whereby the error token will change
position is small and amenable to analysis.  I don't see a problem,
nor wide-reaching consequences.  As Peter says: probably dead code.

Thanks for confirming that.

I decided to benchmark this patch against the same server with
shared_preload_libraries commented out. I chose a quite unsympathetic
pgbench-tools benchmark - the pgbench-tools config is attached. This
is the same server/configuration that I used for my recent page
checksums benchmark. I've thrown the full report up on:

http://pgbenchstatstatements.staticloud.com/

Executive summary:

It looks like we take a 1% - 2.5% hit. On a workload like this, where
parser overhead is high, that isn't bad at all, and seems at most
marginally worse than classic pg_stat_statements with prepared
statements, according to independently produced benchmarks that I've
seen. Had I benchmarked "-M prepared", I wouldn't be surprised if
there was an improvement over classic pg_stat_statements for some
workloads, since the pgss_match_fn logic doesn't involve a strcmp now
- it just compares scalar values. There is no question of there being
a performance regression. Certainly, this patch adds a very practical
feature, vastly more practical than auto_explain currently is, for
example. I didn't choose the most unsympathetic benchmark that could
easily be conducted - a select-only workload, executing very simple
select statements as fast as possible. I avoided that only because the
tpc-b.sql workload seems to be recognised as the most useful and
objective workload for general-purpose benchmarks.

I've attached the revision of the patch that was benchmarked. There
have been a few changes, mostly bug-fixes and clean-ups, including:

* Most notably, I went ahead and made the required changes to parse
coercion's alteration of Const location, while also tweaking similar
logic for Param location analogously, though that change was purely
for consistency and not out of any practical need to do so.

* Removing the unneeded alteration gave me leeway to considerably
clean up the scanner logic, which doesn't care about which particular
type of token is scanned anymore. There is a single invocation per
query string to be canonicalised (i.e. for each first call of the
query not in the shared hash table). This seems a lot more robust and
correct (in terms of how it canonicalises queries like: select integer
'5') than the workaround that I had in the last revision, written when
it wasn't clear that I'd be able to get the core system to
consistently tell the truth about Const location.

* We no longer canonicalise query strings in the event of prepared
statements, while still walking the query tree to compute a queryId.
Of course, an additional benefit of this patch is that it allows
differentiation of queries that only differ beyond
track_activity_query_size bytes, which is a benefit that I want for
prepared statements too.

* The concept of a "sticky" entry is introduced; this prevents queries
from being evicted after parse analysis/canonicalisation but before a
reprieve-delivering query execution. There is still no absolute,
iron-clad guarantee that this can't happen, but it is probably
impossible for all practical purposes, and even when it does happen,
the only consequence is that a query string with some old,
uncanonicalised constants is seen, probably before being immediately
evicted anyway due to the extreme set of circumstances that would have
been required to produce that failure mode. If, somehow, a sticky
entry is never demoted to a regular entry in the corresponding
executor hook call, which ought to be impossible, that sticky entry
still won't survive a restart, so problems with the shared hash table
getting clogged with sticky entries should never occur. Prepared
statements will add zero call entries to the table during their
initial parse analysis, but these entries are not sticky, and have
their "usage" value initialised just as before.

* 32-bit hash values are now used. Fewer changes still to core code generally.

* Merged against master - Robert's changes would have prevented my
earlier patch from cleanly applying.

* Even more tests! Updated regression tests attached, with a total of
289 tests. Those aside, I found fuzz testing with third-party
regression tests that leverage Postgres to be useful. Daniel pointed
out to me that the SQLAlchemy regression tests broke the patch with an
assertion failure. Obviously I've fixed that, so neither the standard
Postgres tests nor the SQLAlchemy tests present the patch with any
difficulties. Both are fairly extensive.

Thoughts?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

config (application/octet-stream)
pg_stat_statements_norm_2012_02_29.patch (text/x-patch; charset=US-ASCII) +1506 -268
normalization_regression.py (text/x-python; charset=US-ASCII)
#13 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Geoghegan (#12)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

I'm curious about the LeafNode stuff. Is this something that could be
done by expression_tree_walker? I'm not completely familiar with it so
maybe there's some showstopper such as some node tags not being
supported, or maybe it just doesn't help. But I'm curious.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

In reply to: Alvaro Herrera (#13)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 1 March 2012 00:48, Alvaro Herrera <alvherre@commandprompt.com> wrote:

I'm curious about the LeafNode stuff.  Is this something that could be
done by expression_tree_walker?  I'm not completely familiar with it so
maybe there's some showstopper such as some node tags not being
supported, or maybe it just doesn't help.  But I'm curious.

Good question. The LeafNode function (which is a bit of a misnomer, as
noted in a comment) looks rather like a walker function, as appears in
the example in a comment in nodeFuncs.c:

* expression_tree_walker() is designed to support routines that traverse
* a tree in a read-only fashion (although it will also work for routines
* that modify nodes in-place but never add/delete/replace nodes).
* A walker routine should look like this:
*
* bool my_walker (Node *node, my_struct *context)
* {
* if (node == NULL)
* return false;
* // check for nodes that special work is required for, eg:
* if (IsA(node, Var))
* {
* ... do special actions for Var nodes
* }
* else if (IsA(node, ...))
* {
* ... do special actions for other node types
* }
* // for any node type not specially processed, do:
* return expression_tree_walker(node, my_walker, (void *) context);
* }

My understanding is that the expression-tree walking support is mostly
useful for the majority of walker code, which only cares about a small
subset of nodes and hopes to avoid boilerplate just to walk the other
nodes that it isn't actually interested in.

This code, unlike most clients of expression_tree_walker(), is pretty
much interested in everything, since its express purpose is to
fingerprint all possible query trees.

Another benefit of expression_tree_walker is that if you miss a newly
added node (say, a FuncExpr-like node), that node still gets walked
over automatically, so you reach the nodes that you do in fact care
about (such as those within this new node's args List). That makes
perfect sense in the majority of cases, but here you've already
missed the fields within this new node that FuncExpr itself lacks, so
you're already fingerprinting inaccurately. I suppose you could still
at least get the nodetag, and still have a warning about the
fingerprinting being inadequate, by going down the
expression_tree_walker path, but I'm inclined to wonder if you aren't
just better off directly walking the tree, if only to encourage the
idea that this code needs to be maintained over time, and to cut down
on the little extra bit of indirection that that imposes.

It's not going to be any sort of burden to maintain it - it currently
stands at a relatively meagre 800 lines of code for everything to do
with tree walking - and the code that will have to be added with new
nodes or refactored along with the existing tree structure is going to
be totally trivial.

All of that said, I wouldn't mind making LeafNode into a walker, if
that approach is judged to be better and you don't mind documenting
the order in which the tree is walked as deterministic; the order now
matters in a way apparently not really anticipated by
expression_tree_walker, though that's probably not a problem.

My real concern now is that it's March 1st, and the last commitfest
may end soon. Even though this patch has extensive regression tests,
has been floating around for months, and, I believe, will end up being
a timely and important feature, a committer has yet to step forward to
work towards this patch getting committed. Can someone volunteer,
please? My expectation is that this feature will make life a lot
easier for a lot of Postgres users.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#15 Daniel Farina
daniel@heroku.com
In reply to: Peter Geoghegan (#14)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Thu, Mar 1, 2012 at 1:27 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

My expectation is that this feature will make life a lot
easier for a lot of Postgres users.

Yes. It's hard to overstate the apparent utility of this feature in
the general category of visibility and profiling.

--
fdr

#16 Josh Berkus
josh@agliodbs.com
In reply to: Daniel Farina (#15)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 3/1/12 1:57 PM, Daniel Farina wrote:

On Thu, Mar 1, 2012 at 1:27 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

My expectation is that this feature will make life a lot
easier for a lot of Postgres users.

Yes. It's hard to overstate the apparent utility of this feature in
the general category of visibility and profiling.

More importantly, this is what pg_stat_statements *should* have been in
8.4, and wasn't.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In reply to: Josh Berkus (#16)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 1 March 2012 22:09, Josh Berkus <josh@agliodbs.com> wrote:

On 3/1/12 1:57 PM, Daniel Farina wrote:

On Thu, Mar 1, 2012 at 1:27 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

My expectation is that this feature will make life a lot
easier for a lot of Postgres users.

Yes.  It's hard to overstate the apparent utility of this feature in
the general category of visibility and profiling.

More importantly, this is what pg_stat_statements *should* have been in
8.4, and wasn't.

It would probably be prudent to concentrate on getting the core
infrastructure committed first. That way, we at least know that if
this doesn't get into 9.2, we can work on getting it into 9.3 knowing
that once committed, people won't have to wait over a year at the very
least to be able to use the feature. The footprint of such a change is
quite small:

src/backend/nodes/copyfuncs.c | 2 +
src/backend/nodes/equalfuncs.c | 4 +
src/backend/nodes/outfuncs.c | 6 +
src/backend/nodes/readfuncs.c | 5 +
src/backend/optimizer/plan/planner.c | 1 +
src/backend/parser/analyze.c | 37 +-
src/backend/parser/parse_coerce.c | 12 +-
src/backend/parser/parse_param.c | 11 +-
src/include/nodes/parsenodes.h | 3 +
src/include/nodes/plannodes.h | 2 +
src/include/parser/analyze.h | 12 +
src/include/parser/parse_node.h | 3 +-

That said, I believe that the patch is pretty close to a commitable
state as things stand, and that all that is really needed is for a
committer familiar with the problem space to conclude the work started
by Daniel and others, adding:

contrib/pg_stat_statements/pg_stat_statements.c | 1420 ++++++++++++++++++-

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#18 Josh Berkus
josh@agliodbs.com
In reply to: Peter Geoghegan (#17)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

It would probably be prudent to concentrate on getting the core
infrastructure committed first. That way, we at least know that if
this doesn't get into 9.2, we can work on getting it into 9.3 knowing
that once committed, people won't have to wait over a year at the very

I don't see why we can't commit the whole thing. This is way more ready
for prime-time than checksums.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#19 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#18)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Josh Berkus <josh@agliodbs.com> writes:

It would probably be prudent to concentrate on getting the core
infrastructure committed first. That way, we at least know that if
this doesn't get into 9.2, we can work on getting it into 9.3 knowing
that once committed, people won't have to wait over a year at the very

I don't see why we can't commit the whole thing. This is way more ready
for prime-time than checksums.

We'll get to it in due time. In case you haven't noticed, there's a lot
of stuff in this commitfest. And I don't follow the logic that says
that because Simon is trying to push through a not-ready-for-commit
patch we should drop our standards for other patches.

regards, tom lane

#20 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#19)
Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Fri, Mar 2, 2012 at 12:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Josh Berkus <josh@agliodbs.com> writes:

It would probably be prudent to concentrate on getting the core
infrastructure committed first. That way, we at least know that if
this doesn't get into 9.2, we can work on getting it into 9.3 knowing
that once committed, people won't have to wait over a year at the very

I don't see why we can't commit the whole thing.  This is way more ready
for prime-time than checksums.

We'll get to it in due time.  In case you haven't noticed, there's a lot
of stuff in this commitfest.  And I don't follow the logic that says
that because Simon is trying to push through a not-ready-for-commit
patch we should drop our standards for other patches.

I don't follow that logic either, but I also feel like this CommitFest
is dragging on and on. Unless you -- or someone -- are prepared to
devote a lot more time to this, "due time" is not going to arrive any
time in the foreseeable future. We're currently making progress at a
rate of maybe 4 patches a week, at which rate we're going to finish
this CommitFest in May. And that might be generous, because we've
been disproportionately knocking off the easy ones. Do we have any
kind of a plan for, I don't know, bringing this to closure on some
reasonable time frame?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
