pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Started by Peter Geogheganalmost 14 years ago82 messages

peter@2ndquadrant.com

almost 14 years ago

2 attachment(s)

Attached is a revision of the pg_stat_statements normalisation patch,
plus its Python/psycopg2/dellstore2 test suite, which has yet to be
converted to produce pg_regress output. This revision is a response to
the last round of feedback given by reviewers. Highlights include:

* No more invasive changes to the parser. The only way that this patch
even touches the core code itself is the addition of a new hook for
hashing the post parse analysis tree (there are actually technically
two hooks - parse_analyze_hook and parse_analyze_varparams_hook), as
well as by adding a query_id to the Query and PlannedStmt structs,
that the core system simply naively copies around. This resolves the
hook synchronisation issues that had less elegant workarounds in prior
revisions.

* We now use the internal, low-level scanner API declared in
scanner.h, so that pg_stat_statements has the capability of robustly
detecting a given constant's length based only on its position in the
query string (taken from Const nodes, as before) and the string
itself.

* All my old regression tests pass, but I've added quite a few new
ones too, as problems transpired, including tests to exercise
canonicalisation of what might be considered edge-case query strings,
such as ones with many large constants. There are 100 tests just for
that, that use psuedo-random constants to exercise the
canonicalisation logic thoroughly. Once things start to shape up, I'll
modify that python script to spit out pg_regress tests - it seems
worth delaying committing to that less flexible approach for now
though, and clearly not all of the hundreds of tests are going to make
the cut, as at certain points I was shooting from the hip, so to
speak. I'll do something similar to sepgsql, another contrib module
that has tests.

* All the regular Postgres regression tests now pass, with the new
pg_stat_statements enabled, and with both parallel and serial
schedules. There are no unrecognised nodes, nor any other apparent
failures, with assertions enabled. All strings that are subsequently
seen in the view are correctly canonicalised, with the exception 3 or
4 corner cases, noted below. These may well not be worth fixing, or
may be down to subtle bugs in the core system parser that we ought to
fix.

* Continual hashing is now used, so that arbitrarily long queries can
be differentiated (though of course we are still subject to the
previous limitation of a query string being capped to
pgstat_track_activity_query_size - now, that's what the
*canonicalised* query is capped at). That's another improvement on
9.1's pg_stat_statements, which didn't see any differences past
pgstat_track_activity_query_size (default: 1024) characters.

There are a number of outstanding issues that I'm aware of:

* Under some situations, the logic through which we determine the
length of constants is a little fragile, though I believe we can find
a solution. In particular, consider this query:

select integer '1';

this normalises to:

select ?

and not, as was the case in prior revisions:

select integer ?;

This is because the post analysis tree, unlike the post rewrite tree,
appears to give the position of the constant in this case as starting
with the datatype, so I'm forced to try and work out a way to have the
length of the constant considered as more than a single token. I'll
break on reaching a SCONST token in this case, but there are other
cases that require careful workarounds. I wouldn't be surprised if
someone was able to craft a query to break this logic. Ideally, I'd be
able to assume that constants are exactly one token, allowing me to
greatly simplify the code.

* I am aware that it's suboptimal how I initialise the scanner once
for each time a constant of a given query is first seen. The function
get_constant_length is fairly messy, but the fact that we may only
need to take the length of a single token in a future revision (once
we address the previous known issue) doesn't leave me with much
motivation to clean it up just yet.

* # XXX: This test currently fails!:
#verify_normalizes_correctly("SELECT cast('1' as dnotnull);","SELECT
cast(? as dnotnull);",conn, "domain literal canonicalization/cast")

It appears to fail because the CoerceToDomain node gives its location
to the constant node as starting from "cast", so we end up with
"SELECT ?('1' as dnotnull);". I'm not quite sure if this points to
there being a slight tension with my use of the location field in this
way, or if this is something that could be fixed as a bug in core
(albeit a highly obscure one), though I suspect the latter.

* I'm still not using a core mechanism like query_tree_walker to walk
the tree, which would be preferable. The maintainability of the walker
logic was criticized. At about 800 lines of code in total for the
walker logic (for the functions PerformJumble, QualsNode, LeafNode,
LimitOffsetNode, JoinExprNode, JoinExprNodeChild), for structures that
in practice are seldom changed, with a good test suite, I think we
could do a lot worse. We now raise a warning rather than an error in
the event of an unrecognised node, which seems more sensible - people
really aren't going to thank you for making their entire query fail,
just because we failed to serialise some node at some point. I don't
think that we can get away with just accumulating nodetags much of the
time, as least if we'd like to implement this feature as I'd
envisaged, which is that it would be robust and comprehensive.

* If we use prepared statements, it's possible that an entry, created
from within our parse analysis hook, will get evicted from the
fixed-sized shared hash table before it is once again executed within
our executor hook. Now, if this happens, we won't be able to
canonicalise the query string constants again. However, it can
probably only happen with prepared statements (I concede that eviction
might be possible between a given backends parse analysis hook and
executor hook being called - not really sure. Might be worth holding a
shared lock between the hooks in that case, on the off chance that the
query string won't be canonicalised, but then again that's a fairly
rare failure). People aren't going to care too much about
canonicalisation of prepared statement constants, but I haven't just
removed it and hashed the query string there because it may still be
valuable to be able to differentiate arbitrarily long prepared
queries.

Maybe the answer here is to have pg_stat_statements tell the core
system "this is that querytree's original query string now". That
would have hazards of its own though, including invalidating the
positions of constants. Another option would be to add a
normalized_query char* to the Query and PlannedStmt structs, with
which the core system does much the same thing as the query_id field
in the proposed patch.

* The way that I maintain a stack of range tables, so that Vars whose
vallevelsup != 0 can rt_fetch() an rte to hash its relid may be less
than idiomatic. There is a function used elsewhere on the raw parse
tree to do something similar, but that tree has a parent pointer that
can be followed which is not available to me.

* I would have liked to have been able to have pg_stat_statements have
a configurable eviction criteria, so that queries with the lowest
total time executed could be evicted first, rather than the lowest
number of calls. I haven't done that here.

Thoughts?

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

normalization_regression.pytext/x-python; charset=US-ASCII; name=normalization_regression.pyDownload

pg_stat_statements_norm_2012_02_16.patchtext/x-patch; charset=US-ASCII; name=pg_stat_statements_norm_2012_02_16.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
new file mode 100644
index 434aa71..8bc9cc5
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
***************
*** 10,15 ****
--- 10,34 ----
   * an entry, one must hold the lock shared or exclusive (so the entry doesn't
   * disappear!) and also take the entry's mutex spinlock.
   *
+  * As of Postgres 9.2, this module normalizes query strings. Normalization is a
+  * process whereby similar queries, typically differing only in their constants
+  * (though the exact rules are somewhat more subtle than that) are recognized as
+  * equivalent, and are tracked as a single entry. This is particularly useful
+  * for non-prepared queries.
+  *
+  * Normalization is implemented by selectively serializing fields of each query
+  * tree's nodes, which are judged to be essential to the nature of the query.
+  * This is referred to as a query jumble. This is distinct from a straight
+  * serialization of the query tree in that constants are canonicalized, and
+  * various extraneous information is ignored as irrelevant, such as the
+  * collation of Vars. Once this jumble is acquired, a 64-bit hash is taken,
+  * which is copied back into the query tree at the post-analysis stage.
+  * Postgres then naively copies this value around, making it later available
+  * from within the corresponding plan tree. The executor can then use this value
+  * to blame query costs on a known query_id.
+  *
+  * Within the executor hook, the module stores the cost of the queries
+  * execution, based on a query_id provided by the core system.
   *
   * Copyright (c) 2008-2012, PostgreSQL Global Development Group
   *
***************
*** 22,38 ****
--- 41,67 ----
  
  #include <unistd.h>
  
+ /*
+  * XXX: include scanner.h first, to prevent code from gram.h complaining about
+  * lack of a core-parser type definition.
+  */
+ #include "parser/scanner.h"
+ 
  #include "access/hash.h"
  #include "executor/instrument.h"
  #include "funcapi.h"
  #include "mb/pg_wchar.h"
  #include "miscadmin.h"
+ #include "parser/analyze.h"
+ #include "parser/gram.h"
+ #include "parser/parsetree.h"
  #include "pgstat.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "storage/spin.h"
  #include "tcop/utility.h"
  #include "utils/builtins.h"
+ #include "utils/memutils.h"
  
  
  PG_MODULE_MAGIC;
*************** PG_MODULE_MAGIC;
*** 41,54 ****
  #define PGSS_DUMP_FILE	"global/pg_stat_statements.stat"
  
  /* This constant defines the magic number in the stats file header */
! static const uint32 PGSS_FILE_HEADER = 0x20100108;
  
  /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
  #define USAGE_EXEC(duration)	(1.0)
  #define USAGE_INIT				(1.0)	/* including initial planning */
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */
! 
  /*
   * Hashtable key that defines the identity of a hashtable entry.  The
   * hash comparators do not assume that the query string is null-terminated;
--- 70,88 ----
  #define PGSS_DUMP_FILE	"global/pg_stat_statements.stat"
  
  /* This constant defines the magic number in the stats file header */
! static const uint32 PGSS_FILE_HEADER = 0x20120103;
  
  /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
  #define USAGE_EXEC(duration)	(1.0)
  #define USAGE_INIT				(1.0)	/* including initial planning */
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */
! #define JUMBLE_SIZE				1024    /* query serialization buffer size */
! /* Magic values for jumble */
! #define MAG_HASH_BUF				0xFA	/* buffer is a hash of query tree */
! #define MAG_STR_BUF					0xEB	/* buffer is query string itself */
! #define MAG_RETURN_LIST				0xAE	/* returning list node follows */
! #define MAG_LIMIT_OFFSET			0xBA	/* limit/offset node follows */
  /*
   * Hashtable key that defines the identity of a hashtable entry.  The
   * hash comparators do not assume that the query string is null-terminated;
*************** typedef struct pgssHashKey
*** 63,70 ****
  	Oid			userid;			/* user OID */
  	Oid			dbid;			/* database OID */
  	int			encoding;		/* query encoding */
! 	int			query_len;		/* # of valid bytes in query string */
! 	const char *query_ptr;		/* query string proper */
  } pgssHashKey;
  
  /*
--- 97,103 ----
  	Oid			userid;			/* user OID */
  	Oid			dbid;			/* database OID */
  	int			encoding;		/* query encoding */
! 	uint64		query_id;		/* query identifier */
  } pgssHashKey;
  
  /*
*************** typedef struct pgssEntry
*** 95,100 ****
--- 128,134 ----
  {
  	pgssHashKey key;			/* hash key of entry - MUST BE FIRST */
  	Counters	counters;		/* the statistics for this query */
+ 	int			query_len;		/* # of valid bytes in query string */
  	slock_t		mutex;			/* protects the counters only */
  	char		query[1];		/* VARIABLE LENGTH ARRAY - MUST BE LAST */
  	/* Note: the allocated length of query[] is actually pgss->query_size */
*************** typedef struct pgssSharedState
*** 109,115 ****
--- 143,167 ----
  	int			query_size;		/* max query length in bytes */
  } pgssSharedState;
  
+ /*
+  * Last seen constant positions for a statement
+  */
+ typedef struct pgssQueryConEntry
+ {
+ 	pgssHashKey		key;			/* hash key of entry - MUST BE FIRST */
+ 	int				n_elems;		/* length of offsets array */
+ 	Size offsets[1];		/* VARIABLE LENGTH ARRAY - MUST BE LAST */
+ 	/* Note: the allocated length of offsets is actually n_elems */
+ } pgssQueryConEntry;
  /*---- Local variables ----*/
+ /* Jumble of current query tree */
+ static unsigned char *last_jumble = NULL;
+ /* Buffer that represents position of normalized characters */
+ static Size *last_offsets = NULL;
+ /* Current Length of last_offsets buffer */
+ static Size last_offset_buf_size = 10;
+ /* Current number of actual offsets stored in last_offsets */
+ static Size last_offset_num = 0;
  
  /* Current nesting depth of ExecutorRun calls */
  static int	nested_level = 0;
*************** static ExecutorRun_hook_type prev_Execut
*** 121,131 ****
--- 173,192 ----
  static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
  static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
  static ProcessUtility_hook_type prev_ProcessUtility = NULL;
+ static parse_analyze_hook_type prev_parse_analyze_hook = NULL;
+ static parse_analyze_varparams_hook_type prev_parse_analyze_varparams_hook = NULL;
  
  /* Links to shared memory state */
  static pgssSharedState *pgss = NULL;
  static HTAB *pgss_hash = NULL;
  
+ /*
+  * Maintain a stack of the rangetable of the query tree that we're currently
+  * walking, so subqueries can reference parent rangetables. The stack is pushed
+  * and popped as each Query struct is walked into or out of.
+  */
+ static List* pgss_rangetbl_stack = NIL;
+ 
  /*---- GUC variables ----*/
  
  typedef enum
*************** static int	pgss_max;			/* max # statemen
*** 147,152 ****
--- 208,214 ----
  static int	pgss_track;			/* tracking level */
  static bool pgss_track_utility; /* whether to track utility commands */
  static bool pgss_save;			/* whether to save stats across shutdown */
+ static bool pgss_string_key;	/* whether to always only hash query str */
  
  
  #define pgss_enabled() \
*************** PG_FUNCTION_INFO_V1(pg_stat_statements);
*** 166,171 ****
--- 228,250 ----
  
  static void pgss_shmem_startup(void);
  static void pgss_shmem_shutdown(int code, Datum arg);
+ static int comp_offset(const void *a, const void *b);
+ static Query *pgss_parse_analyze(Node *parseTree, const char *sourceText,
+ 			  Oid *paramTypes, int numParams);
+ static Query *pgss_parse_analyze_varparams(Node *parseTree, const char *sourceText,
+ 						Oid **paramTypes, int *numParams);
+ static void pgss_process_post_analysis_tree(Query* post_analysis_tree,
+ 		const char* sourceText);
+ static uint32 get_constant_length(const char* query_str_const);
+ static uint64 JumbleQuery(Query *post_analysis_tree);
+ static void AppendJumb(unsigned char* item, unsigned char jumble[], Size size, Size *i);
+ static void PerformJumble(const Query *tree, Size size, Size *i);
+ static void QualsNode(const OpExpr *node, Size size, Size *i, List *rtable);
+ static void LeafNode(const Node *arg, Size size, Size *i, List *rtable);
+ static void LimitOffsetNode(const Node *node, Size size, Size *i, List *rtable);
+ static void JoinExprNode(JoinExpr *node, Size size, Size *i, List *rtable);
+ static void JoinExprNodeChild(const Node *node, Size size, Size *i, List *rtable);
+ static void RecordConstLocation(int location);
  static void pgss_ExecutorStart(QueryDesc *queryDesc, int eflags);
  static void pgss_ExecutorRun(QueryDesc *queryDesc,
  				 ScanDirection direction,
*************** static void pgss_ProcessUtility(Node *pa
*** 177,188 ****
  					DestReceiver *dest, char *completionTag);
  static uint32 pgss_hash_fn(const void *key, Size keysize);
  static int	pgss_match_fn(const void *key1, const void *key2, Size keysize);
! static void pgss_store(const char *query, double total_time, uint64 rows,
! 		   const BufferUsage *bufusage);
  static Size pgss_memsize(void);
! static pgssEntry *entry_alloc(pgssHashKey *key);
  static void entry_dealloc(void);
  static void entry_reset(void);
  
  
  /*
--- 256,272 ----
  					DestReceiver *dest, char *completionTag);
  static uint32 pgss_hash_fn(const void *key, Size keysize);
  static int	pgss_match_fn(const void *key1, const void *key2, Size keysize);
! static uint64 pgss_hash_string(const char* str);
! static void pgss_store(const char *query, uint64 query_id,
! 				double total_time, uint64 rows,
! 				const BufferUsage *bufusage, bool empty_entry, bool normalize);
  static Size pgss_memsize(void);
! static pgssEntry *entry_alloc(pgssHashKey *key, const char* query, int new_query_len);
  static void entry_dealloc(void);
  static void entry_reset(void);
+ static int n_dups(Size offs[], Size n);
+ static Size* dedup_toks(Size offs[], Size n,
+ 		Size new_size);
  
  
  /*
*************** static void entry_reset(void);
*** 191,196 ****
--- 275,281 ----
  void
  _PG_init(void)
  {
+ 	MemoryContext cur_context;
  	/*
  	 * In order to create our shared memory area, we have to be loaded via
  	 * shared_preload_libraries.  If not, fall out without hooking into any of
*************** _PG_init(void)
*** 252,257 ****
--- 337,357 ----
  							 NULL,
  							 NULL);
  
+ 	/*
+ 	 * Support legacy pg_stat_statements behavior, for compatibility with
+ 	 * versions shipped with Postgres 8.4, 9.0 and 9.1
+ 	 */
+ 	DefineCustomBoolVariable("pg_stat_statements.string_key",
+ 			   "Differentiate queries based on query string alone.",
+ 							 NULL,
+ 							 &pgss_string_key,
+ 							 false,
+ 							 PGC_POSTMASTER,
+ 							 0,
+ 							 NULL,
+ 							 NULL,
+ 							 NULL);
+ 
  	EmitWarningsOnPlaceholders("pg_stat_statements");
  
  	/*
*************** _PG_init(void)
*** 263,268 ****
--- 363,383 ----
  	RequestAddinLWLocks(1);
  
  	/*
+ 	 * Allocate a buffer to store selective serialization of the query tree
+ 	 * for the purposes of query normalization.
+ 	 *
+ 	 * State that persists for the lifetime of the backend should be allocated
+ 	 * in TopMemoryContext
+ 	 */
+ 	cur_context = MemoryContextSwitchTo(TopMemoryContext);
+ 
+ 	last_jumble = palloc(JUMBLE_SIZE);
+ 	/* Allocate space for bookkeeping information for query str normalization */
+ 	last_offsets = palloc(last_offset_buf_size * sizeof(Size));
+ 
+ 	MemoryContextSwitchTo(cur_context);
+ 
+ 	/*
  	 * Install hooks.
  	 */
  	prev_shmem_startup_hook = shmem_startup_hook;
*************** _PG_init(void)
*** 277,282 ****
--- 392,401 ----
  	ExecutorEnd_hook = pgss_ExecutorEnd;
  	prev_ProcessUtility = ProcessUtility_hook;
  	ProcessUtility_hook = pgss_ProcessUtility;
+ 	prev_parse_analyze_hook = parse_analyze_hook;
+ 	parse_analyze_hook = pgss_parse_analyze;
+ 	prev_parse_analyze_varparams_hook = parse_analyze_varparams_hook;
+ 	parse_analyze_varparams_hook = pgss_parse_analyze_varparams;
  }
  
  /*
*************** _PG_fini(void)
*** 292,297 ****
--- 411,421 ----
  	ExecutorFinish_hook = prev_ExecutorFinish;
  	ExecutorEnd_hook = prev_ExecutorEnd;
  	ProcessUtility_hook = prev_ProcessUtility;
+ 	parse_analyze_hook = prev_parse_analyze_hook;
+ 	parse_analyze_varparams_hook = prev_parse_analyze_varparams_hook;
+ 
+ 	pfree(last_jumble);
+ 	pfree(last_offsets);
  }
  
  /*
*************** pgss_shmem_startup(void)
*** 395,421 ****
  		if (!PG_VALID_BE_ENCODING(temp.key.encoding))
  			goto error;
  
  		/* Previous incarnation might have had a larger query_size */
! 		if (temp.key.query_len >= buffer_size)
  		{
! 			buffer = (char *) repalloc(buffer, temp.key.query_len + 1);
! 			buffer_size = temp.key.query_len + 1;
  		}
  
! 		if (fread(buffer, 1, temp.key.query_len, file) != temp.key.query_len)
  			goto error;
! 		buffer[temp.key.query_len] = '\0';
  
  		/* Clip to available length if needed */
! 		if (temp.key.query_len >= query_size)
! 			temp.key.query_len = pg_encoding_mbcliplen(temp.key.encoding,
  													   buffer,
! 													   temp.key.query_len,
  													   query_size - 1);
- 		temp.key.query_ptr = buffer;
  
  		/* make the hashtable entry (discards old entries if too many) */
! 		entry = entry_alloc(&temp.key);
  
  		/* copy in the actual stats */
  		entry->counters = temp.counters;
--- 519,546 ----
  		if (!PG_VALID_BE_ENCODING(temp.key.encoding))
  			goto error;
  
+ 
  		/* Previous incarnation might have had a larger query_size */
! 		if (temp.query_len >= buffer_size)
  		{
! 			buffer = (char *) repalloc(buffer, temp.query_len + 1);
! 			buffer_size = temp.query_len + 1;
  		}
  
! 		if (fread(buffer, 1, temp.query_len, file) != temp.query_len)
  			goto error;
! 		buffer[temp.query_len] = '\0';
! 
  
  		/* Clip to available length if needed */
! 		if (temp.query_len >= query_size)
! 			temp.query_len = pg_encoding_mbcliplen(temp.key.encoding,
  													   buffer,
! 													   temp.query_len,
  													   query_size - 1);
  
  		/* make the hashtable entry (discards old entries if too many) */
! 		entry = entry_alloc(&temp.key, buffer, temp.query_len);
  
  		/* copy in the actual stats */
  		entry->counters = temp.counters;
*************** pgss_shmem_shutdown(int code, Datum arg)
*** 477,483 ****
  	hash_seq_init(&hash_seq, pgss_hash);
  	while ((entry = hash_seq_search(&hash_seq)) != NULL)
  	{
! 		int			len = entry->key.query_len;
  
  		if (fwrite(entry, offsetof(pgssEntry, mutex), 1, file) != 1 ||
  			fwrite(entry->query, 1, len, file) != len)
--- 602,608 ----
  	hash_seq_init(&hash_seq, pgss_hash);
  	while ((entry = hash_seq_search(&hash_seq)) != NULL)
  	{
! 		int			len = entry->query_len;
  
  		if (fwrite(entry, offsetof(pgssEntry, mutex), 1, file) != 1 ||
  			fwrite(entry->query, 1, len, file) != len)
*************** error:
*** 503,508 ****
--- 628,1702 ----
  }
  
  /*
+  * comp_offset: Comparator for qsorting Size values.
+  */
+ static int
+ comp_offset(const void *a, const void *b)
+ {
+ 	Size l = *((Size*) a);
+ 	Size r = *((Size*) b);
+ 	if (l < r)
+ 		return -1;
+ 	else if (l > r)
+ 		return +1;
+ 	else
+ 		return 0;
+ }
+ 
+ static Query *
+ pgss_parse_analyze(Node *parseTree, const char *sourceText,
+ 			  Oid *paramTypes, int numParams)
+ {
+ 	Query *post_analysis_tree;
+ 
+ 	if (prev_parse_analyze_hook)
+ 		post_analysis_tree = (*prev_parse_analyze_hook) (parseTree, sourceText,
+ 			  paramTypes, numParams);
+ 	else
+ 		post_analysis_tree = standard_parse_analyze(parseTree, sourceText,
+ 			  paramTypes, numParams);
+ 
+ 	if (!post_analysis_tree->utilityStmt)
+ 		pgss_process_post_analysis_tree(post_analysis_tree, sourceText);
+ 
+ 	return post_analysis_tree;
+ }
+ 
+ static Query *
+ pgss_parse_analyze_varparams(Node *parseTree, const char *sourceText,
+ 						Oid **paramTypes, int *numParams)
+ {
+ 	Query *post_analysis_tree;
+ 
+ 	if (prev_parse_analyze_hook)
+ 		post_analysis_tree = (*prev_parse_analyze_varparams_hook) (parseTree,
+ 				sourceText, paramTypes, numParams);
+ 	else
+ 		post_analysis_tree = standard_parse_analyze_varparams(parseTree,
+ 				sourceText, paramTypes, numParams);
+ 
+ 	if (!post_analysis_tree->utilityStmt)
+ 		pgss_process_post_analysis_tree(post_analysis_tree, sourceText);
+ 
+ 	return post_analysis_tree;
+ }
+ 
+ /*
+  * pgss_process_post_analysis_tree: Record query_id, which is based on the query
+  * tree, within the tree itself, for later retrieval in the exeuctor hook. The
+  * core system will copy the value to the tree's corresponding plannedstmt.
+  */
+ static void
+ pgss_process_post_analysis_tree(Query* post_analysis_tree,
+ 		const char* sourceText)
+ {
+ 	BufferUsage bufusage;
+ 
+ 	post_analysis_tree->query_id = JumbleQuery(post_analysis_tree);
+ 
+ 	memset(&bufusage, 0, sizeof(bufusage));
+ 	pgss_store(sourceText, post_analysis_tree->query_id, 0, 0, &bufusage,
+ 			true, true);
+ 
+ 	/* Trim last_offsets */
+ 	if (last_offset_buf_size > 10)
+ 	{
+ 		last_offset_buf_size = 10;
+ 		last_offsets = repalloc(last_offsets,
+ 							last_offset_buf_size *
+ 							sizeof(Size));
+ 	}
+ }
+ 
+ /*
+  * Given query_str_const, which points to the first character of a constant
+  * within a null-terminated SQL query string, determine the total length of the
+  * constant.
+  *
+  * The constant may use any available constant syntax, including but not limited
+  * to float literals, bit-strings, single quoted strings and dollar-quoted
+  * strings. This is accomplished by using the public API for the core scanner,
+  * with a few workarounds for quirks of their representation, such as the fact
+  * that constants preceded by a minus symbol have a position at the minus
+  * symbol, and yet are separately tokenized. This is effectively the inverse of
+  * what the later parsing step would have done to the Const node's position -
+  * compensate for the inclusion of the minus symbol.
+  *
+  * It is the caller's job to ensure that the string points to the first
+  * character of a valid constant, and that it includes the constant in its
+  * entirety. Since in practice the string has already been validated, and the
+  * initial position that the caller provides will have originated from within
+  * the authoritative parser, this should not be a problem.
+  */
+ static uint32
+ get_constant_length(const char* query_str_const)
+ {
+ 	core_yyscan_t  init_scan;
+ 	core_yy_extra_type ext_type;
+ 	core_YYSTYPE type;
+ 	uint32 len;
+ 	YYLTYPE pos;
+ 	int token;
+ 	int orig_tok_len;
+ 
+ 	if (query_str_const[0] == '-')
+ 		/* Negative constant */
+ 		return 1 + get_constant_length(&query_str_const[1]);
+ 
+ 	init_scan = scanner_init(query_str_const,
+ 							 &ext_type,
+ 							 ScanKeywords,
+ 							 NumScanKeywords);
+ 
+ 	token = core_yylex(&type, &pos,
+ 			   init_scan);
+ 
+ 	orig_tok_len = strlen(ext_type.scanbuf);
+ 	switch(token)
+ 	{
+ 		case NULL_P:
+ 		case SCONST:
+ 		case BCONST:
+ 		case XCONST:
+ 		case TRUE_P:
+ 		case FALSE_P:
+ 		case FCONST:
+ 		case ICONST:
+ 		case TYPECAST:
+ 		default:
+ 			len = orig_tok_len;
+ 			break;
+ 		/* XXX: "select integer '1'" must normalize to "select ?"
+ 		 * This is due to the position given within Const nodes.
+ 		 *
+ 		 * This is rather fragile, as I must enumerate all such types that there
+ 		 * may be leading tokens for here:
+ 		 */
+ 		case DECIMAL_P:
+ 		case BOOLEAN_P:
+ 		case NAME_P:
+ 		case TEXT_P:
+ 		case XML_P:
+ 		case TIMESTAMP:
+ 		case TIME:
+ 		case INTERVAL:
+ 		case INTEGER:
+ 		case BIGINT:
+ 		case NUMERIC:
+ 		case IDENT:
+ 			for(;;)
+ 			{
+ 				int lat_tok = core_yylex(&type, &pos,
+ 						   init_scan);
+ 				if (token == IDENT && lat_tok != SCONST)
+ 				{
+ 					len = orig_tok_len;
+ 					break;
+ 				}
+ 				/* String to follow */
+ 				if (lat_tok == SCONST)
+ 				{
+ 					len = strlen(ext_type.scanbuf);
+ 					break;
+ 				}
+ 			}
+ 			break;
+ 	}
+ 	scanner_finish(init_scan);
+ 	Assert(len > 0);
+ 	return len;
+ }
+ 
+ /*
+  * JumbleQuery: Selectively serialize query tree, and return a hash representing
+  * that serialization - it's query_id.
+  *
+  * Note that this doesn't necessarily uniquely identify the query across
+  * different databases and encodings.
+  */
+ static uint64
+ JumbleQuery(Query *post_analysis_tree)
+ {
+ 	/* State for this run of PerformJumble */
+ 	Size i = 0;
+ 	last_offset_num = 0;
+ 	memset(last_jumble, 0, JUMBLE_SIZE);
+ 	last_jumble[++i] = MAG_HASH_BUF;
+ 	PerformJumble(post_analysis_tree, JUMBLE_SIZE, &i);
+ 	/* Reset rangetbl state */
+ 	list_free(pgss_rangetbl_stack);
+ 	pgss_rangetbl_stack = NIL;
+ 
+ 	/* Sort offsets for later query string canonicalization */
+ 	qsort(last_offsets, last_offset_num, sizeof(Size), comp_offset);
+ 	return hash_any64((const unsigned char* ) last_jumble, i);
+ }
+ 
+ /*
+  * AppendJumb: Append a value that is substantive to a given query to jumble,
+  * while incrementing the iterator, i.
+  */
+ static void
+ AppendJumb(unsigned char* item, unsigned char jumble[], Size size, Size *i)
+ {
+ 	Assert(item != NULL);
+ 	Assert(jumble != NULL);
+ 	Assert(i != NULL);
+ 
+ 	/*
+ 	 * Copy the entire item to the buffer, or as much of it as possible to fill
+ 	 * the buffer to capacity.
+ 	 */
+ 	memcpy(jumble + *i, item, Min(*i > JUMBLE_SIZE? 0:JUMBLE_SIZE - *i, size));
+ 
+ 	/*
+ 	 * Continually hash the query tree's jumble.
+ 	 *
+ 	 * Was JUMBLE_SIZE exceeded? If so, hash the jumble and append that to the
+ 	 * start of the jumble buffer, and then continue to append the fraction of
+ 	 * "item" that we might not have been able to fit at the end of the buffer
+ 	 * in the last iteration. Since the value of i has been set to 0, there is
+ 	 * no need to memset the buffer in advance of this new iteration, but
+ 	 * effectively we are completely discarding the prior iteration's jumble
+ 	 * except for this hashed value.
+ 	 */
+ 	if (*i > JUMBLE_SIZE)
+ 	{
+ 		uint64 start_hash = hash_any64((const unsigned char* ) last_jumble, JUMBLE_SIZE);
+ 		int hash_l = sizeof(start_hash);
+ 		int part_left_l = Max(0, ((int) size - ((int) *i - JUMBLE_SIZE)));
+ 
+ 		Assert(part_left_l >= 0 && part_left_l <= size);
+ 
+ 		memcpy(jumble, &start_hash, hash_l);
+ 		memcpy(jumble + hash_l, item + (size - part_left_l), part_left_l);
+ 		*i = hash_l + part_left_l;
+ 	}
+ 	else
+ 	{
+ 		*i += size;
+ 	}
+ }
+ 
+ /*
+  * Wrapper around AppendJumb to encapsulate details of serialization
+  * of individual local variable elements.
+  */
+ #define APP_JUMB(item) \
+ AppendJumb((unsigned char*)&item, last_jumble, sizeof(item), i)
+ 
+ /*
+  * Space in the jumble buffer is limited - we can compact enum representations
+  * that will obviously never really need more than a single byte to store all
+  * possible enumerations.
+  *
+  * It would be pretty questionable to attempt this with an enum that has
+  * explicit integer values corresponding to constants, such as the huge enum
+  * "Node" that we use to dynamically identify nodes, and it would be downright
+  * incorrect to do so with one with negative values explicitly assigned to
+  * constants. This is intended to be used with enums with perhaps less than a
+  * dozen possible values, that are never likely to far exceed that.
+  */
+ #define COMPACT_ENUM(val) \
+ 	(unsigned char) val;
+ /*
+  * PerformJumble: Serialize the query tree "parse" and canonicalize
+  * constants, while simply skipping over others that are not essential to the
+  * query, such that it is usefully normalized, excluding things from the tree
+  * that are not essential to the query itself.
+  *
+  * The last_jumble buffer, which this function writes to, can be hashed to
+  * uniquely identify a query that may use different constants in successive
+  * calls.
+  */
+ static void
+ PerformJumble(const Query *tree, Size size, Size *i)
+ {
+ 	ListCell *l;
+ 	/* table join tree (FROM and WHERE clauses) */
+ 	FromExpr *jt = (FromExpr *) tree->jointree;
+ 	/* # of result tuples to skip (int8 expr) */
+ 	FuncExpr *off = (FuncExpr *) tree->limitOffset;
+ 	/* # of result tuples to skip (int8 expr) */
+ 	FuncExpr *limcount = (FuncExpr *) tree->limitCount;
+ 
+ 	if (pgss_rangetbl_stack &&
+ 			!IsA(pgss_rangetbl_stack, List))
+ 		pgss_rangetbl_stack = NIL;
+ 
+ 	if (tree->rtable != NIL)
+ 	{
+ 		pgss_rangetbl_stack = lappend(pgss_rangetbl_stack, tree->rtable);
+ 	}
+ 	else
+ 	{
+ 		/* Add dummy Range table entry to maintain stack */
+ 		RangeTblEntry *rte = makeNode(RangeTblEntry);
+ 		List *dummy = lappend(NIL, rte);
+ 		pgss_rangetbl_stack = lappend(pgss_rangetbl_stack, dummy);
+ 	}
+ 
+ 
+ 	APP_JUMB(tree->resultRelation);
+ 
+ 	if (tree->intoClause)
+ 	{
+ 		unsigned char OnCommit;
+ 		unsigned char skipData;
+ 		IntoClause *ic = tree->intoClause;
+ 		RangeVar   *rel = ic->rel;
+ 
+ 		OnCommit = COMPACT_ENUM(ic->onCommit);
+ 		skipData = COMPACT_ENUM(ic->skipData);
+ 		APP_JUMB(OnCommit);
+ 		APP_JUMB(skipData);
+ 		if (rel)
+ 		{
+ 			APP_JUMB(rel->relpersistence);
+ 			/* Bypass macro abstraction to supply size directly.
+ 			 *
+ 			 * Serialize schemaname, relname themselves - this makes us
+ 			 * somewhat consistent with the behavior of utility statements like "create
+ 			 * table", which seems appropriate.
+ 			 */
+ 			if (rel->schemaname)
+ 				AppendJumb((unsigned char *)rel->schemaname, last_jumble,
+ 								strlen(rel->schemaname), i);
+ 			if (rel->relname)
+ 				AppendJumb((unsigned char *)rel->relname, last_jumble,
+ 								strlen(rel->relname), i);
+ 		}
+ 	}
+ 
+ 	/* WITH list (of CommonTableExpr's) */
+ 	foreach(l, tree->cteList)
+ 	{
+ 		CommonTableExpr	*cte = (CommonTableExpr *) lfirst(l);
+ 		Query			*cteq = (Query*) cte->ctequery;
+ 		if (cteq)
+ 			PerformJumble(cteq, size, i);
+ 	}
+ 	if (jt)
+ 	{
+ 		if (jt->quals)
+ 		{
+ 			if (IsA(jt->quals, OpExpr))
+ 			{
+ 				QualsNode((OpExpr*) jt->quals, size, i, tree->rtable);
+ 			}
+ 			else
+ 			{
+ 				LeafNode((Node*) jt->quals, size, i, tree->rtable);
+ 			}
+ 		}
+ 		/* table join tree */
+ 		foreach(l, jt->fromlist)
+ 		{
+ 			Node* fr = lfirst(l);
+ 			if (IsA(fr, JoinExpr))
+ 			{
+ 				JoinExprNode((JoinExpr*) fr, size, i, tree->rtable);
+ 			}
+ 			else if (IsA(fr, RangeTblRef))
+ 			{
+ 				unsigned char rtekind;
+ 				RangeTblRef   *rtf = (RangeTblRef *) fr;
+ 				RangeTblEntry *rte = rt_fetch(rtf->rtindex, tree->rtable);
+ 				APP_JUMB(rte->relid);
+ 				rtekind = COMPACT_ENUM(rte->rtekind);
+ 				APP_JUMB(rtekind);
+ 				/* Subselection in where clause */
+ 				if (rte->subquery)
+ 					PerformJumble(rte->subquery, size, i);
+ 
+ 				/* Function call in where clause */
+ 				if (rte->funcexpr)
+ 					LeafNode((Node*) rte->funcexpr, size, i, tree->rtable);
+ 			}
+ 			else
+ 			{
+ 				ereport(WARNING,
+ 						(errcode(ERRCODE_INTERNAL_ERROR),
+ 						 errmsg("unexpected, unrecognised fromlist node type: %d",
+ 							 (int) nodeTag(fr))));
+ 			}
+ 		}
+ 	}
+ 	/*
+ 	 * target list (of TargetEntry)
+ 	 * columns returned by query
+ 	 */
+ 	foreach(l, tree->targetList)
+ 	{
+ 		TargetEntry *tg = (TargetEntry *) lfirst(l);
+ 		Node        *e  = (Node*) tg->expr;
+ 		if (tg->ressortgroupref)
+ 			/* nonzero if referenced by a sort/group - for ORDER BY */
+ 			APP_JUMB(tg->ressortgroupref);
+ 		APP_JUMB(tg->resno); /* column number for select */
+ 		/*
+ 		 * Handle the various types of nodes in
+ 		 * the select list of this query
+ 		 */
+ 		LeafNode(e, size, i, tree->rtable);
+ 	}
+ 	/* return-values list (of TargetEntry) */
+ 	foreach(l, tree->returningList)
+ 	{
+ 		TargetEntry *rt = (TargetEntry *) lfirst(l);
+ 		Expr        *e  = (Expr*) rt->expr;
+ 		unsigned char magic = MAG_RETURN_LIST;
+ 		APP_JUMB(magic);
+ 		/*
+ 		 * Handle the various types of nodes in
+ 		 * the select list of this query
+ 		 */
+ 		LeafNode((Node*) e, size, i, tree->rtable);
+ 	}
+ 	/* a list of SortGroupClause's */
+ 	foreach(l, tree->groupClause)
+ 	{
+ 		SortGroupClause *gc = (SortGroupClause *) lfirst(l);
+ 		APP_JUMB(gc->tleSortGroupRef);
+ 		APP_JUMB(gc->nulls_first);
+ 	}
+ 
+ 	if (tree->havingQual)
+ 	{
+ 		if (IsA(tree->havingQual, OpExpr))
+ 		{
+ 			OpExpr *na = (OpExpr *) tree->havingQual;
+ 			QualsNode(na, size, i, tree->rtable);
+ 		}
+ 		else
+ 		{
+ 			Node *n = (Node*) tree->havingQual;
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 	}
+ 
+ 	foreach(l, tree->windowClause)
+ 	{
+ 		WindowClause *wc = (WindowClause *) lfirst(l);
+ 		ListCell     *il;
+ 		APP_JUMB(wc->frameOptions);
+ 		foreach(il, wc->partitionClause)	/* PARTITION BY list */
+ 		{
+ 			Node *n = (Node *) lfirst(il);
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 		foreach(il, wc->orderClause)		/* ORDER BY list */
+ 		{
+ 			Node *n = (Node *) lfirst(il);
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 	}
+ 
+ 	foreach(l, tree->distinctClause)
+ 	{
+ 		SortGroupClause *dc = (SortGroupClause *) lfirst(l);
+ 		APP_JUMB(dc->tleSortGroupRef);
+ 		APP_JUMB(dc->nulls_first);
+ 	}
+ 
+ 	/* Don't look at tree->sortClause,
+ 	 * because the value ressortgroupref is already
+ 	 * serialized when we iterate through targetList
+ 	 */
+ 
+ 	if (off)
+ 		LimitOffsetNode((Node*) off, size, i, tree->rtable);
+ 
+ 	if (limcount)
+ 		LimitOffsetNode((Node*) limcount, size, i, tree->rtable);
+ 
+ 	if (tree->setOperations)
+ 	{
+ 		/*
+ 		 * set-operation tree if this is top
+ 		 * level of a UNION/INTERSECT/EXCEPT query
+ 		 */
+ 		unsigned char op;
+ 		SetOperationStmt *topop = (SetOperationStmt *) tree->setOperations;
+ 		op = COMPACT_ENUM(topop->op);
+ 		APP_JUMB(op);
+ 		APP_JUMB(topop->all);
+ 
+ 		/* leaf selects are RTE subselections */
+ 		foreach(l, tree->rtable)
+ 		{
+ 			RangeTblEntry *r = (RangeTblEntry *) lfirst(l);
+ 			if (r->subquery)
+ 				PerformJumble(r->subquery, size, i);
+ 		}
+ 	}
+ 	pgss_rangetbl_stack = list_delete_ptr(pgss_rangetbl_stack,
+ 			list_nth(pgss_rangetbl_stack, pgss_rangetbl_stack->length - 1));
+ }
+ 
+ /*
+  * Perform selective serialization of "Quals" nodes when
+  * they're IsA(*, OpExpr)
+  */
+ static void
+ QualsNode(const OpExpr *node, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	APP_JUMB(node->xpr);
+ 	APP_JUMB(node->opno);
+ 	foreach(l, node->args)
+ 	{
+ 		Node *arg = (Node *) lfirst(l);
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * LeafNode: Selectively serialize a selection of parser/prim nodes that are
+  * frequently, though certainly not necesssarily leaf nodes, such as Vars
+  * (columns), constants and function calls
+  */
+ static void
+ LeafNode(const Node *arg, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	/* Use the node's NodeTag as a magic number */
+ 	APP_JUMB(arg->type);
+ 
+ 	if (IsA(arg, Const))
+ 	{
+ 		/*
+ 		 * If implicit casts are used, such as when inserting an integer into a
+ 		 * text column, then that will be a distinct query from directly
+ 		 * inserting a string literal, so that literal value will be a FuncExpr
+ 		 * to cast, and control won't reach here for that node. This behavior is
+ 		 * considered correct.
+ 		 */
+ 		Const *c = (Const *) arg;
+ 
+ 		/*
+ 		 * Datatype of the constant is a
+ 		 * differentiator
+ 		 */
+ 		APP_JUMB(c->consttype);
+ 		RecordConstLocation(c->location);
+ 	}
+ 	else if(IsA(arg, CoerceToDomain))
+ 	{
+ 		CoerceToDomain *cd = (CoerceToDomain*) arg;
+ 		/*
+ 		 * Datatype of the constant is a
+ 		 * differentiator
+ 		 */
+ 		APP_JUMB(cd->resulttype);
+ 		LeafNode((Node*) cd->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Var))
+ 	{
+ 		Var			  *v = (Var *) arg;
+ 		RangeTblEntry *rte;
+ 		ListCell *lc;
+ 
+ 		/*
+ 		 * We need to get the details of the rangetable, but rtable may not
+ 		 * refer to the relevant one if we're in a subselection.
+ 		 */
+ 		if (v->varlevelsup == 0)
+ 		{
+ 			rte = rt_fetch(v->varno, rtable);
+ 		}
+ 		else
+ 		{
+ 			List *rtable_upper = list_nth(pgss_rangetbl_stack,
+ 					(list_length(pgss_rangetbl_stack) - 1) - v->varlevelsup);
+ 			rte = rt_fetch(v->varno, rtable_upper);
+ 		}
+ 		APP_JUMB(rte->relid);
+ 
+ 		foreach(lc, rte->values_lists)
+ 		{
+ 			List	   *sublist = (List *) lfirst(lc);
+ 			ListCell   *lc2;
+ 
+ 			foreach(lc2, sublist)
+ 			{
+ 				Node	   *col = (Node *) lfirst(lc2);
+ 				LeafNode(col, size, i, rtable);
+ 			}
+ 		}
+ 		APP_JUMB(v->varattno);
+ 	}
+ 	else if (IsA(arg, CurrentOfExpr))
+ 	{
+ 		CurrentOfExpr *CoE = (CurrentOfExpr*) arg;
+ 		APP_JUMB(CoE->cvarno);
+ 		APP_JUMB(CoE->cursor_param);
+ 	}
+ 	else if (IsA(arg, CollateExpr))
+ 	{
+ 		CollateExpr *Ce = (CollateExpr*) arg;
+ 		APP_JUMB(Ce->collOid);
+ 	}
+ 	else if (IsA(arg, FieldSelect))
+ 	{
+ 		FieldSelect *Fs = (FieldSelect*) arg;
+ 		APP_JUMB(Fs->resulttype);
+ 		LeafNode((Node*) Fs->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, NamedArgExpr))
+ 	{
+ 		NamedArgExpr *Nae = (NamedArgExpr*) arg;
+ 		APP_JUMB(Nae->argnumber);
+ 		LeafNode((Node*) Nae->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Param))
+ 	{
+ 		Param *p = ((Param *) arg);
+ 		APP_JUMB(p->paramkind);
+ 		APP_JUMB(p->paramid);
+ 	}
+ 	else if (IsA(arg, RelabelType))
+ 	{
+ 		RelabelType *rt = (RelabelType*) arg;
+ 		APP_JUMB(rt->resulttype);
+ 		LeafNode((Node*) rt->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, WindowFunc))
+ 	{
+ 		WindowFunc *wf = (WindowFunc *) arg;
+ 		APP_JUMB(wf->winfnoid);
+ 		foreach(l, wf->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, FuncExpr))
+ 	{
+ 		FuncExpr *f = (FuncExpr *) arg;
+ 		APP_JUMB(f->funcid);
+ 		foreach(l, f->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, OpExpr) || IsA(arg, DistinctExpr))
+ 	{
+ 		QualsNode((OpExpr*) arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, CoerceViaIO))
+ 	{
+ 		CoerceViaIO *Cio = (CoerceViaIO*) arg;
+ 		APP_JUMB(Cio->coerceformat);
+ 		APP_JUMB(Cio->resulttype);
+ 		LeafNode((Node*) Cio->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Aggref))
+ 	{
+ 		Aggref *a =  (Aggref *) arg;
+ 		APP_JUMB(a->aggfnoid);
+ 		foreach(l, a->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SubLink))
+ 	{
+ 		SubLink *s = (SubLink*) arg;
+ 		APP_JUMB(s->subLinkType);
+ 		/* Serialize select-list subselect recursively */
+ 		if (s->subselect)
+ 			PerformJumble((Query*) s->subselect, size, i);
+ 
+ 		if (s->testexpr)
+ 			LeafNode((Node*) s->testexpr, size, i, rtable);
+ 		foreach(l, s->operName)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, TargetEntry))
+ 	{
+ 		TargetEntry *rt = (TargetEntry *) arg;
+ 		Node *e = (Node*) rt->expr;
+ 		APP_JUMB(rt->resorigtbl);
+ 		APP_JUMB(rt->ressortgroupref);
+ 		LeafNode(e, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, BoolExpr))
+ 	{
+ 		BoolExpr *be = (BoolExpr *) arg;
+ 		APP_JUMB(be->boolop);
+ 		foreach(l, be->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, NullTest))
+ 	{
+ 		NullTest *nt = (NullTest *) arg;
+ 		Node     *arg = (Node *) nt->arg;
+ 		unsigned char nulltesttype = COMPACT_ENUM(nt->nulltesttype);
+ 		APP_JUMB(nulltesttype);	/* IS NULL, IS NOT NULL */
+ 		APP_JUMB(nt->argisrow);	/* is input a composite type ? */
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, ArrayExpr))
+ 	{
+ 		ArrayExpr *ae = (ArrayExpr *) arg;
+ 		APP_JUMB(ae->array_typeid);		/* type of expression result */
+ 		APP_JUMB(ae->element_typeid);	/* common type of array elements */
+ 		foreach(l, ae->elements)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CaseExpr))
+ 	{
+ 		CaseExpr *ce = (CaseExpr*) arg;
+ 		Assert(ce->casetype != InvalidOid);
+ 		APP_JUMB(ce->casetype);
+ 		foreach(l, ce->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		if (ce->arg)
+ 			LeafNode((Node*) ce->arg, size, i, rtable);
+ 
+ 		if (ce->defresult)
+ 		{
+ 			/* Default result (ELSE clause).
+ 			 *
+ 			 * May be NULL, because no else clause
+ 			 * was actually specified, and thus the value is
+ 			 * equivalent to SQL ELSE NULL
+ 			 */
+ 			LeafNode((Node*) ce->defresult, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CaseTestExpr))
+ 	{
+ 		CaseTestExpr *ct = (CaseTestExpr*) arg;
+ 		APP_JUMB(ct->typeId);
+ 	}
+ 	else if (IsA(arg, CaseWhen))
+ 	{
+ 		CaseWhen *cw = (CaseWhen*) arg;
+ 		Node     *res = (Node*) cw->result;
+ 		Node     *exp = (Node*) cw->expr;
+ 		if (res)
+ 			LeafNode(res, size, i, rtable);
+ 		if (exp)
+ 			LeafNode(exp, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, MinMaxExpr))
+ 	{
+ 		MinMaxExpr *cw = (MinMaxExpr*) arg;
+ 		APP_JUMB(cw->minmaxtype);
+ 		APP_JUMB(cw->op);
+ 		foreach(l, cw->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, ScalarArrayOpExpr))
+ 	{
+ 		ScalarArrayOpExpr *sa = (ScalarArrayOpExpr*) arg;
+ 		APP_JUMB(sa->opfuncid);
+ 		APP_JUMB(sa->useOr);
+ 		foreach(l, sa->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CoalesceExpr))
+ 	{
+ 		CoalesceExpr *ca = (CoalesceExpr*) arg;
+ 		foreach(l, ca->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, ArrayCoerceExpr))
+ 	{
+ 		ArrayCoerceExpr *ac = (ArrayCoerceExpr *) arg;
+ 		LeafNode((Node*) ac->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, WindowClause))
+ 	{
+ 		WindowClause *wc = (WindowClause*) arg;
+ 		foreach(l, wc->partitionClause)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, wc->orderClause)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SortGroupClause))
+ 	{
+ 		SortGroupClause *sgc = (SortGroupClause*) arg;
+ 		APP_JUMB(sgc->tleSortGroupRef);
+ 		APP_JUMB(sgc->nulls_first);
+ 	}
+ 	else if (IsA(arg, Integer) ||
+ 		  IsA(arg, Float) ||
+ 		  IsA(arg, String) ||
+ 		  IsA(arg, BitString) ||
+ 		  IsA(arg, Null)
+ 		)
+ 	{
+ 		/* It is not necessary to serialize Value nodes - they are seen when
+ 		 * aliases are used, which are ignored.
+ 		 */
+ 		return;
+ 	}
+ 	else if (IsA(arg, BooleanTest))
+ 	{
+ 		BooleanTest *bt = (BooleanTest *) arg;
+ 		APP_JUMB(bt->booltesttype);
+ 		LeafNode((Node*) bt->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, ArrayRef))
+ 	{
+ 		ArrayRef *ar = (ArrayRef*) arg;
+ 		APP_JUMB(ar->refarraytype);
+ 		foreach(l, ar->refupperindexpr)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, ar->reflowerindexpr)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		if (ar->refexpr)
+ 			LeafNode((Node*) ar->refexpr, size, i, rtable);
+ 		if (ar->refassgnexpr)
+ 			LeafNode((Node*) ar->refassgnexpr, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, NullIfExpr))
+ 	{
+ 		/* NullIfExpr is just a typedef for OpExpr */
+ 		QualsNode((OpExpr*) arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, RowExpr))
+ 	{
+ 		RowExpr *re = (RowExpr*) arg;
+ 		APP_JUMB(re->row_format);
+ 		foreach(l, re->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 
+ 	}
+ 	else if (IsA(arg, XmlExpr))
+ 	{
+ 		XmlExpr *xml = (XmlExpr*) arg;
+ 		APP_JUMB(xml->op);
+ 		foreach(l, xml->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		/* non-XML expressions for xml_attributes */
+ 		foreach(l, xml->named_args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		/* parallel list of Value strings */
+ 		foreach(l, xml->arg_names)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, RowCompareExpr))
+ 	{
+ 		RowCompareExpr *rc = (RowCompareExpr*) arg;
+ 		foreach(l, rc->largs)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, rc->rargs)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SetToDefault))
+ 	{
+ 		SetToDefault *sd = (SetToDefault*) arg;
+ 		APP_JUMB(sd->typeId);
+ 		APP_JUMB(sd->typeMod);
+ 	}
+ 	else if (IsA(arg, ConvertRowtypeExpr))
+ 	{
+ 		ConvertRowtypeExpr* Cr = (ConvertRowtypeExpr*) arg;
+ 		APP_JUMB(Cr->convertformat);
+ 		APP_JUMB(Cr->resulttype);
+ 		LeafNode((Node*) Cr->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, FieldStore))
+ 	{
+ 		FieldStore* Fs = (FieldStore*) arg;
+ 		LeafNode((Node*) Fs->arg, size, i, rtable);
+ 		foreach(l, Fs->newvals)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else
+ 	{
+ 		ereport(WARNING,
+ 				(errcode(ERRCODE_INTERNAL_ERROR),
+ 				 errmsg("unexpected, unrecognised LeafNode node type: %d",
+ 					 (int) nodeTag(arg))));
+ 	}
+ }
+ 
+ /*
+  * Perform selective serialization of limit or offset nodes
+  */
+ static void
+ LimitOffsetNode(const Node *node, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	unsigned char magic = MAG_LIMIT_OFFSET;
+ 	APP_JUMB(magic);
+ 
+ 	if (IsA(node, FuncExpr))
+ 	{
+ 
+ 		foreach(l, ((FuncExpr*) node)->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else
+ 	{
+ 		/* Fall back on leaf node representation */
+ 		LeafNode(node, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * JoinExprNode: Perform selective serialization of JoinExpr nodes
+  */
+ static void
+ JoinExprNode(JoinExpr *node, Size size, Size *i, List *rtable)
+ {
+ 	Node	 *larg = node->larg;	/* left subtree */
+ 	Node	 *rarg = node->rarg;	/* right subtree */
+ 	ListCell *l;
+ 
+ 	Assert( IsA(node, JoinExpr));
+ 
+ 	APP_JUMB(node->jointype);
+ 	APP_JUMB(node->isNatural);
+ 
+ 	if (node->quals)
+ 	{
+ 		if ( IsA(node, OpExpr))
+ 		{
+ 			QualsNode((OpExpr*) node->quals, size, i, rtable);
+ 		}
+ 		else
+ 		{
+ 			LeafNode((Node*) node->quals, size, i, rtable);
+ 		}
+ 
+ 	}
+ 	foreach(l, node->usingClause) /* USING clause, if any (list of String) */
+ 	{
+ 		Node *arg = (Node *) lfirst(l);
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ 	if (larg)
+ 		JoinExprNodeChild(larg, size, i, rtable);
+ 	if (rarg)
+ 		JoinExprNodeChild(rarg, size, i, rtable);
+ }
+ 
+ /*
+  * JoinExprNodeChild: Serialize children of the JoinExpr node
+  */
+ static void
+ JoinExprNodeChild(const Node *node, Size size, Size *i, List *rtable)
+ {
+ 	if (IsA(node, RangeTblRef))
+ 	{
+ 		RangeTblRef   *rt = (RangeTblRef*) node;
+ 		RangeTblEntry *rte = rt_fetch(rt->rtindex, rtable);
+ 		ListCell      *l;
+ 
+ 		APP_JUMB(rte->relid);
+ 		APP_JUMB(rte->jointype);
+ 
+ 		if (rte->subquery)
+ 			PerformJumble((Query*) rte->subquery, size, i);
+ 
+ 		foreach(l, rte->joinaliasvars)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(node, JoinExpr))
+ 	{
+ 		JoinExprNode((JoinExpr*) node, size, i, rtable);
+ 	}
+ 	else
+ 	{
+ 		LeafNode(node, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * Record location of constant within query string of query tree that is
+  * currently being walked.
+  */
+ static void
+ RecordConstLocation(int location)
+ {
+ 	/* -1 indicates unknown or undefined location */
+ 	if (location > -1)
+ 	{
+ 		if (last_offset_num < pgss->query_size / 2)
+ 		{
+ 			if (last_offset_num >= last_offset_buf_size)
+ 			{
+ 				last_offset_buf_size *= 2;
+ 				last_offsets = repalloc(last_offsets,
+ 								last_offset_buf_size *
+ 								sizeof(Size));
+ 
+ 			}
+ 			last_offsets[last_offset_num] = location;
+ 			last_offset_num++;
+ 		}
+ 	}
+ }
+ 
+ /*
   * ExecutorStart hook: start up tracking if needed
   */
  static void
*************** pgss_ExecutorEnd(QueryDesc *queryDesc)
*** 585,590 ****
--- 1779,1789 ----
  {
  	if (queryDesc->totaltime && pgss_enabled())
  	{
+ 		uint64 queryId;
+ 		if (pgss_string_key)
+ 			queryId = pgss_hash_string(queryDesc->sourceText);
+ 		else
+ 			queryId = queryDesc->plannedstmt->queryId;
  		/*
  		 * Make sure stats accumulation is done.  (Note: it's okay if several
  		 * levels of hook all do this.)
*************** pgss_ExecutorEnd(QueryDesc *queryDesc)
*** 592,600 ****
  		InstrEndLoop(queryDesc->totaltime);
  
  		pgss_store(queryDesc->sourceText,
! 				   queryDesc->totaltime->total,
! 				   queryDesc->estate->es_processed,
! 				   &queryDesc->totaltime->bufusage);
  	}
  
  	if (prev_ExecutorEnd)
--- 1791,1803 ----
  		InstrEndLoop(queryDesc->totaltime);
  
  		pgss_store(queryDesc->sourceText,
! 		   queryId,
! 		   queryDesc->totaltime->total,
! 		   queryDesc->estate->es_processed,
! 		   &queryDesc->totaltime->bufusage,
! 		   false,
! 		   false);
! 
  	}
  
  	if (prev_ExecutorEnd)
*************** pgss_ProcessUtility(Node *parsetree, con
*** 616,621 ****
--- 1819,1825 ----
  		instr_time	start;
  		instr_time	duration;
  		uint64		rows = 0;
+ 		uint64		query_id;
  		BufferUsage bufusage;
  
  		bufusage = pgBufferUsage;
*************** pgss_ProcessUtility(Node *parsetree, con
*** 665,672 ****
  		bufusage.temp_blks_written =
  			pgBufferUsage.temp_blks_written - bufusage.temp_blks_written;
  
! 		pgss_store(queryString, INSTR_TIME_GET_DOUBLE(duration), rows,
! 				   &bufusage);
  	}
  	else
  	{
--- 1869,1879 ----
  		bufusage.temp_blks_written =
  			pgBufferUsage.temp_blks_written - bufusage.temp_blks_written;
  
! 		query_id = pgss_hash_string(queryString);
! 
! 		/* In the case of utility statements, hash the query string directly */
! 		pgss_store(queryString, query_id,
! 				INSTR_TIME_GET_DOUBLE(duration), rows, &bufusage, false, false);
  	}
  	else
  	{
*************** pgss_hash_fn(const void *key, Size keysi
*** 690,697 ****
  	/* we don't bother to include encoding in the hash */
  	return hash_uint32((uint32) k->userid) ^
  		hash_uint32((uint32) k->dbid) ^
! 		DatumGetUInt32(hash_any((const unsigned char *) k->query_ptr,
! 								k->query_len));
  }
  
  /*
--- 1897,1904 ----
  	/* we don't bother to include encoding in the hash */
  	return hash_uint32((uint32) k->userid) ^
  		hash_uint32((uint32) k->dbid) ^
! 		DatumGetUInt32(hash_any((const unsigned char* ) &k->query_id,
! 					sizeof(k->query_id)) );
  }
  
  /*
*************** pgss_match_fn(const void *key1, const vo
*** 706,727 ****
  	if (k1->userid == k2->userid &&
  		k1->dbid == k2->dbid &&
  		k1->encoding == k2->encoding &&
! 		k1->query_len == k2->query_len &&
! 		memcmp(k1->query_ptr, k2->query_ptr, k1->query_len) == 0)
  		return 0;
  	else
  		return 1;
  }
  
  /*
   * Store some statistics for a statement.
   */
  static void
! pgss_store(const char *query, double total_time, uint64 rows,
! 		   const BufferUsage *bufusage)
  {
  	pgssHashKey key;
  	double		usage;
  	pgssEntry  *entry;
  
  	Assert(query != NULL);
--- 1913,1958 ----
  	if (k1->userid == k2->userid &&
  		k1->dbid == k2->dbid &&
  		k1->encoding == k2->encoding &&
! 		k1->query_id == k2->query_id)
  		return 0;
  	else
  		return 1;
  }
  
  /*
+  * Given an arbitrarily long query string, produce a hash for the purposes of
+  * identifying the query, without canonicalizing constants. Used when hashing
+  * utility statements, or for legacy compatibility mode.
+  */
+ static uint64
+ pgss_hash_string(const char* str)
+ {
+ 	/* For additional protection against collisions, including magic value */
+ 	uint64 Magic = MAG_STR_BUF;
+ 	uint64 result;
+ 	Size size = sizeof(Magic) + strlen(str);
+ 	unsigned char* p = palloc(size);
+ 	memcpy(p, &Magic, sizeof(Magic));
+ 	memcpy(p + sizeof(Magic), str, strlen(str));
+ 	result = hash_any64((const unsigned char *) p, size);
+ 	pfree(p);
+ 	return result;
+ }
+ 
+ /*
   * Store some statistics for a statement.
   */
  static void
! pgss_store(const char *query, uint64 query_id,
! 				double total_time, uint64 rows,
! 				const BufferUsage *bufusage,
! 				bool empty_entry,
! 				bool normalize)
  {
  	pgssHashKey key;
  	double		usage;
+ 	int		    new_query_len = strlen(query);
+ 	char	   *norm_query = NULL;
  	pgssEntry  *entry;
  
  	Assert(query != NULL);
*************** pgss_store(const char *query, double tot
*** 734,747 ****
  	key.userid = GetUserId();
  	key.dbid = MyDatabaseId;
  	key.encoding = GetDatabaseEncoding();
! 	key.query_len = strlen(query);
! 	if (key.query_len >= pgss->query_size)
! 		key.query_len = pg_encoding_mbcliplen(key.encoding,
  											  query,
! 											  key.query_len,
  											  pgss->query_size - 1);
- 	key.query_ptr = query;
- 
  	usage = USAGE_EXEC(duration);
  
  	/* Lookup the hash table entry with shared lock. */
--- 1965,1980 ----
  	key.userid = GetUserId();
  	key.dbid = MyDatabaseId;
  	key.encoding = GetDatabaseEncoding();
! 	key.query_id = query_id;
! 
! 	if (new_query_len >= pgss->query_size)
! 		/* We don't have to worry about this later, because canonicalization
! 		 * cannot possibly result in a longer query string
! 		 */
! 		new_query_len = pg_encoding_mbcliplen(key.encoding,
  											  query,
! 											  new_query_len,
  											  pgss->query_size - 1);
  	usage = USAGE_EXEC(duration);
  
  	/* Lookup the hash table entry with shared lock. */
*************** pgss_store(const char *query, double tot
*** 750,759 ****
  	entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
  	if (!entry)
  	{
! 		/* Must acquire exclusive lock to add a new entry. */
! 		LWLockRelease(pgss->lock);
! 		LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 		entry = entry_alloc(&key);
  	}
  
  	/* Grab the spinlock while updating the counters. */
--- 1983,2103 ----
  	entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
  	if (!entry)
  	{
! 		Size *offs = NULL;
! 		int qry_n_dups;
! 		int off_n = 0;
! 
! 		offs = last_offsets;
! 		off_n = last_offset_num;
! 
! 		if ((qry_n_dups = n_dups(offs, off_n)) > 0)
! 		{
! 			/* Assuming that we cannot have walked some part of the tree twice
! 			 * seems rather fragile, and besides, there isn't much we can do to
! 			 * ensure that this does not happen with targetLists that contain
! 			 * duplicate entries, as with queries like "values(1,2), (3,4)".
! 			 */
! 			Size *offs_new;
! 			offs_new = dedup_toks(offs, off_n, off_n - qry_n_dups);
! 			offs = offs_new;
! 			off_n = off_n - qry_n_dups;
! 		}
! 
! 		/*
! 		 * It is necessary to generate a normalized version of the query
! 		 * string that will be used to represent it. It's important that
! 		 * the user be presented with a stable representation of the query.
! 		 *
! 		 * Note that the representation seen by the user will only have
! 		 * non-differentiating Const tokens swapped with '?' characters, and
! 		 * this does not for example take account of the fact that alias names
! 		 * could vary between successive calls of what is regarded as the same
! 		 * query.
! 		 */
! 		if (off_n > 0 && normalize)
! 		{
! 			int i,
! 			  off = 0,			/* Offset from start for cur tok */
! 			  tok_len = 0,		/* length (in bytes) of that tok */
! 			  quer_it = 0,		/* Original query iterator */
! 			  n_quer_it = 0,	/* Normalized query iterator */
! 			  len_to_wrt = 0,	/* Length (in bytes) to write */
! 			  last_off = 0,		/* Offset from start for last iter tok */
! 			  last_tok_len = 0,	/* length (in bytes) of that tok */
! 			  length_delta = 0; /* Finished str is n bytes shorter so far */
! 
! 			norm_query = palloc0(new_query_len + 1);
! 			for(i = 0; i < off_n; i++)
! 			{
! 				off = offs[i];
! 				tok_len = get_constant_length(&query[off]);
! 				len_to_wrt = off - last_off;
! 				len_to_wrt -= last_tok_len;
! 				length_delta += tok_len - 1;
! 				Assert(tok_len > 0);
! 				Assert(len_to_wrt >= 0);
! 				/*
! 				 * Each iteration copies everything prior to the current
! 				 * offset/token to be replaced, except bytes copied in
! 				 * previous iterations
! 				 */
! 				if (off - length_delta + tok_len > new_query_len)
! 				{
! 					/* We could just be oversized due to a large constant
! 					 * literal. Try and copy bytes prior to the literal that may
! 					 * have been missed.
! 					 */
! 					if (off - length_delta < new_query_len)
! 					{
! 						memcpy(norm_query + n_quer_it, query + quer_it,
! 								Min(len_to_wrt, new_query_len - n_quer_it));
! 						n_quer_it += len_to_wrt;
! 						quer_it += len_to_wrt + tok_len;
! 
! 						/*
! 						 * See if there is room left for a '?', and copy one
! 						 * over if there is.
! 						 */
! 						if (n_quer_it < new_query_len)
! 							norm_query[n_quer_it++] = '?';
! 					}
! 					/*
! 					 * Even though we'd have exceeded buffer size with last
! 					 * constant if constants weren't canonicalized, they are, so
! 					 * there could be additional constants that will fit in that
! 					 * buffer, once they're actually represented as '?'
! 					 */
! 					continue;
! 				}
! 				memcpy(norm_query + n_quer_it, query + quer_it, len_to_wrt);
! 
! 				n_quer_it += len_to_wrt;
! 				norm_query[n_quer_it++] = '?';
! 				quer_it += len_to_wrt + tok_len;
! 				last_off = off;
! 				last_tok_len = tok_len;
! 			}
! 			/* Copy end of query string (piece past last constant) if there's room */
! 			memcpy(norm_query + n_quer_it, query + (off + tok_len),
! 				Min( strlen(query) - (off + tok_len),
! 					new_query_len - n_quer_it ) );
! 			/*
! 			 * Must acquire exclusive lock to add a new entry.
! 			 * Leave that until as late as possible.
! 			 */
! 			LWLockRelease(pgss->lock);
! 			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 
! 			entry = entry_alloc(&key, norm_query, strlen(norm_query));
! 		}
! 		else
! 		{
! 			/* Acquire exclusive lock as required by entry_alloc() */
! 			LWLockRelease(pgss->lock);
! 			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 
! 			entry = entry_alloc(&key, query, new_query_len);
! 		}
  	}
  
  	/* Grab the spinlock while updating the counters. */
*************** pgss_store(const char *query, double tot
*** 761,767 ****
  		volatile pgssEntry *e = (volatile pgssEntry *) entry;
  
  		SpinLockAcquire(&e->mutex);
! 		e->counters.calls += 1;
  		e->counters.total_time += total_time;
  		e->counters.rows += rows;
  		e->counters.shared_blks_hit += bufusage->shared_blks_hit;
--- 2105,2113 ----
  		volatile pgssEntry *e = (volatile pgssEntry *) entry;
  
  		SpinLockAcquire(&e->mutex);
! 		if (!empty_entry)
! 			e->counters.calls += 1;
! 
  		e->counters.total_time += total_time;
  		e->counters.rows += rows;
  		e->counters.shared_blks_hit += bufusage->shared_blks_hit;
*************** pgss_store(const char *query, double tot
*** 775,782 ****
  		e->counters.usage += usage;
  		SpinLockRelease(&e->mutex);
  	}
- 
  	LWLockRelease(pgss->lock);
  }
  
  /*
--- 2121,2129 ----
  		e->counters.usage += usage;
  		SpinLockRelease(&e->mutex);
  	}
  	LWLockRelease(pgss->lock);
+ 	if (norm_query)
+ 		pfree(norm_query);
  }
  
  /*
*************** pg_stat_statements(PG_FUNCTION_ARGS)
*** 863,869 ****
  
  			qstr = (char *)
  				pg_do_encoding_conversion((unsigned char *) entry->query,
! 										  entry->key.query_len,
  										  entry->key.encoding,
  										  GetDatabaseEncoding());
  			values[i++] = CStringGetTextDatum(qstr);
--- 2210,2216 ----
  
  			qstr = (char *)
  				pg_do_encoding_conversion((unsigned char *) entry->query,
! 										  entry->query_len,
  										  entry->key.encoding,
  										  GetDatabaseEncoding());
  			values[i++] = CStringGetTextDatum(qstr);
*************** pg_stat_statements(PG_FUNCTION_ARGS)
*** 881,886 ****
--- 2228,2236 ----
  			tmp = e->counters;
  			SpinLockRelease(&e->mutex);
  		}
+ 		/* Skip record of unexecuted query */
+ 		if (tmp.calls == 0)
+ 			continue;
  
  		values[i++] = Int64GetDatumFast(tmp.calls);
  		values[i++] = Float8GetDatumFast(tmp.total_time);
*************** pgss_memsize(void)
*** 933,946 ****
   * have made the entry while we waited to get exclusive lock.
   */
  static pgssEntry *
! entry_alloc(pgssHashKey *key)
  {
  	pgssEntry  *entry;
  	bool		found;
  
- 	/* Caller must have clipped query properly */
- 	Assert(key->query_len < pgss->query_size);
- 
  	/* Make space if needed */
  	while (hash_get_num_entries(pgss_hash) >= pgss_max)
  		entry_dealloc();
--- 2283,2293 ----
   * have made the entry while we waited to get exclusive lock.
   */
  static pgssEntry *
! entry_alloc(pgssHashKey *key, const char* query, int new_query_len)
  {
  	pgssEntry  *entry;
  	bool		found;
  
  	/* Make space if needed */
  	while (hash_get_num_entries(pgss_hash) >= pgss_max)
  		entry_dealloc();
*************** entry_alloc(pgssHashKey *key)
*** 952,968 ****
  	{
  		/* New entry, initialize it */
  
! 		/* dynahash tried to copy the key for us, but must fix query_ptr */
! 		entry->key.query_ptr = entry->query;
  		/* reset the statistics */
  		memset(&entry->counters, 0, sizeof(Counters));
  		entry->counters.usage = USAGE_INIT;
  		/* re-initialize the mutex each time ... we assume no one using it */
  		SpinLockInit(&entry->mutex);
  		/* ... and don't forget the query text */
! 		memcpy(entry->query, key->query_ptr, key->query_len);
! 		entry->query[key->query_len] = '\0';
  	}
  
  	return entry;
  }
--- 2299,2318 ----
  	{
  		/* New entry, initialize it */
  
! 		entry->query_len = new_query_len;
! 		Assert(entry->query_len > 0);
  		/* reset the statistics */
  		memset(&entry->counters, 0, sizeof(Counters));
  		entry->counters.usage = USAGE_INIT;
  		/* re-initialize the mutex each time ... we assume no one using it */
  		SpinLockInit(&entry->mutex);
  		/* ... and don't forget the query text */
! 		memcpy(entry->query, query, entry->query_len);
! 		Assert(new_query_len <= pgss->query_size);
! 		entry->query[entry->query_len] = '\0';
  	}
+ 	/* Caller must have clipped query properly */
+ 	Assert(entry->query_len < pgss->query_size);
  
  	return entry;
  }
*************** entry_reset(void)
*** 1040,1042 ****
--- 2390,2437 ----
  
  	LWLockRelease(pgss->lock);
  }
+ 
+ /*
+  * Returns a value indicating the number of duplicates.
+  *
+  * Assumes that the array has already been sorted.
+  */
+ static int
+ n_dups(Size offs[], Size n)
+ {
+ 	int i, n_fdups = 0;
+ 
+ 	for(i = 1; i < n; i++)
+ 		if (offs[i - 1] == offs[i])
+ 			n_fdups++;
+ 
+ 	return n_fdups;
+ }
+ 
+ /*
+  * Function removes duplicate values, returning new array.
+  *
+  * new_size specifies the size of the new, duplicate-free array, which must be
+  * known ahead of time.
+  *
+  * Assumes that the array has already been sorted.
+  */
+ static Size*
+ dedup_toks(Size offs[], Size n, Size new_size)
+ {
+ 	int i, j = 0;
+ 	Size *new_offsets;
+ 
+ 	new_offsets = (Size*) palloc(sizeof(Size) * new_size);
+ 	new_offsets[j++] = offs[0];
+ 
+ 	for(i = 1; i < n; i++)
+ 	{
+ 		if (offs[i - 1] == offs[i])
+ 			continue;
+ 		new_offsets[j++] = offs[i];
+ 	}
+ 	Assert(j == new_size);
+ 
+ 	return new_offsets;
+ }
diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c
new file mode 100644
index 0e4cf8e..5497deb
*** a/src/backend/access/hash/hashfunc.c
--- b/src/backend/access/hash/hashfunc.c
*************** hashvarlena(PG_FUNCTION_ARGS)
*** 291,308 ****
   *		k		: the key (the unaligned variable-length array of bytes)
   *		len		: the length of the key, counting by bytes
   *
!  * Returns a uint32 value.	Every bit of the key affects every bit of
!  * the return value.  Every 1-bit and 2-bit delta achieves avalanche.
!  * About 6*len+35 instructions. The best hash table sizes are powers
!  * of 2.  There is no need to do mod a prime (mod is sooo slow!).
!  * If you need less than 32 bits, use a bitmask.
   *
!  * Note: we could easily change this function to return a 64-bit hash value
!  * by using the final values of both b and c.  b is perhaps a little less
!  * well mixed than c, however.
   */
  Datum
! hash_any(register const unsigned char *k, register int keylen)
  {
  	register uint32 a,
  				b,
--- 291,305 ----
   *		k		: the key (the unaligned variable-length array of bytes)
   *		len		: the length of the key, counting by bytes
   *
!  * Returns a uint32 or a uint64 value, depending on the width_32 argument.
   *
!  * Every bit of the key affects every bit of the return value.  Every 1-bit and
!  * 2-bit delta achieves avalanche.  About 6*len+35 instructions. The best hash
!  * table sizes are powers of 2.  There is no need to do mod a prime (mod is sooo
!  * slow!).  If you need less than 32 bits, use a bitmask.
   */
  Datum
! hash_any_var_width(register const unsigned char *k, register int keylen, bool width_32)
  {
  	register uint32 a,
  				b,
*************** hash_any(register const unsigned char *k
*** 496,502 ****
  	final(a, b, c);
  
  	/* report the result */
! 	return UInt32GetDatum(c);
  }
  
  /*
--- 493,503 ----
  	final(a, b, c);
  
  	/* report the result */
! 	if (width_32)
! 		return UInt32GetDatum(c);
! 	else
! 		return (uint64) b |
! 			(((uint64) c) << (sizeof(uint64) / 2) * BITS_PER_BYTE);
  }
  
  /*
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index cc3168d..385c13a
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 62,67 ****
--- 62,70 ----
  #define COPY_LOCATION_FIELD(fldname) \
  	(newnode->fldname = from->fldname)
  
+ /* Copy a query_id field (for Copy, this is same as scalar case) */
+ #define COPY_QUERYID_FIELD(fldname) \
+ 	(newnode->fldname = from->fldname)
  
  /* ****************************************************************
   *					 plannodes.h copy functions
*************** _copyPlannedStmt(const PlannedStmt *from
*** 92,97 ****
--- 95,101 ----
  	COPY_NODE_FIELD(relationOids);
  	COPY_NODE_FIELD(invalItems);
  	COPY_SCALAR_FIELD(nParamExec);
+ 	COPY_QUERYID_FIELD(queryId);
  
  	return newnode;
  }
*************** _copyQuery(const Query *from)
*** 2415,2420 ****
--- 2419,2425 ----
  
  	COPY_SCALAR_FIELD(commandType);
  	COPY_SCALAR_FIELD(querySource);
+ 	COPY_SCALAR_FIELD(query_id);
  	COPY_SCALAR_FIELD(canSetTag);
  	COPY_NODE_FIELD(utilityStmt);
  	COPY_SCALAR_FIELD(resultRelation);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
new file mode 100644
index 2295195..ce75da3
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 83,88 ****
--- 83,91 ----
  #define COMPARE_LOCATION_FIELD(fldname) \
  	((void) 0)
  
+ /* Compare a query_id field (this is a no-op, per note above) */
+ #define COMPARE_QUERYID_FIELD(fldname) \
+ 	((void) 0)
  
  /*
   *	Stuff from primnodes.h
*************** _equalQuery(const Query *a, const Query
*** 897,902 ****
--- 900,906 ----
  {
  	COMPARE_SCALAR_FIELD(commandType);
  	COMPARE_SCALAR_FIELD(querySource);
+ 	COMPARE_QUERYID_FIELD(query_id);
  	COMPARE_SCALAR_FIELD(canSetTag);
  	COMPARE_NODE_FIELD(utilityStmt);
  	COMPARE_SCALAR_FIELD(resultRelation);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
new file mode 100644
index 829f6d4..1b5cfaf
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 46,51 ****
--- 46,55 ----
  #define WRITE_UINT_FIELD(fldname) \
  	appendStringInfo(str, " :" CppAsString(fldname) " %u", node->fldname)
  
+ /* Write a location/query id field (anything written as ":fldname %lu") */
+ #define WRITE_ULINT_FIELD(fldname) \
+ 	appendStringInfo(str, " :" CppAsString(fldname) " %lu", node->fldname)
+ 
  /* Write an OID field (don't hard-wire assumption that OID is same as uint) */
  #define WRITE_OID_FIELD(fldname) \
  	appendStringInfo(str, " :" CppAsString(fldname) " %u", node->fldname)
***************
*** 81,86 ****
--- 85,94 ----
  #define WRITE_LOCATION_FIELD(fldname) \
  	appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname)
  
+ /* Write a query id field */
+ #define WRITE_QUERYID_FIELD(fldname) \
+ 	((void) 0)
+ 
  /* Write a Node field */
  #define WRITE_NODE_FIELD(fldname) \
  	(appendStringInfo(str, " :" CppAsString(fldname) " "), \
*************** _outPlannedStmt(StringInfo str, const Pl
*** 255,260 ****
--- 263,269 ----
  	WRITE_NODE_FIELD(relationOids);
  	WRITE_NODE_FIELD(invalItems);
  	WRITE_INT_FIELD(nParamExec);
+ 	WRITE_QUERYID_FIELD(queryId);
  }
  
  /*
*************** _outQuery(StringInfo str, const Query *n
*** 2159,2164 ****
--- 2168,2174 ----
  
  	WRITE_ENUM_FIELD(commandType, CmdType);
  	WRITE_ENUM_FIELD(querySource, QuerySource);
+ 	WRITE_QUERYID_FIELD(query_id);
  	WRITE_BOOL_FIELD(canSetTag);
  
  	/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
new file mode 100644
index b9258ad..d2e65ef
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
***************
*** 68,73 ****
--- 68,79 ----
  	token = pg_strtok(&length);		/* get field value */ \
  	local_node->fldname = atoui(token)
  
+ /* Read an unsigned integer field (anything written as ":fldname %lu") */
+ #define READ_ULINT_FIELD(fldname) \
+ 	token = pg_strtok(&length);		/* skip :fldname */ \
+ 	token = pg_strtok(&length);		/* get field value */ \
+ 	local_node->fldname = atoul(token)
+ 
  /* Read an OID field (don't hard-wire assumption that OID is same as uint) */
  #define READ_OID_FIELD(fldname) \
  	token = pg_strtok(&length);		/* skip :fldname */ \
***************
*** 110,115 ****
--- 116,124 ----
  	token = pg_strtok(&length);		/* get field value */ \
  	local_node->fldname = -1	/* set field to "unknown" */
  
+ /* NOOP */
+ #define READ_QUERYID_FIELD(fldname) \
+ 	((void) 0)
  /* Read a Node field */
  #define READ_NODE_FIELD(fldname) \
  	token = pg_strtok(&length);		/* skip :fldname */ \
***************
*** 133,138 ****
--- 142,149 ----
   */
  #define atoui(x)  ((unsigned int) strtoul((x), NULL, 10))
  
+ #define atoul(x)  ((unsigned long) strtoul((x), NULL, 10))
+ 
  #define atooid(x)  ((Oid) strtoul((x), NULL, 10))
  
  #define strtobool(x)  ((*(x) == 't') ? true : false)
*************** _readQuery(void)
*** 195,200 ****
--- 206,212 ----
  
  	READ_ENUM_FIELD(commandType, CmdType);
  	READ_ENUM_FIELD(querySource, QuerySource);
+ 	READ_QUERYID_FIELD(query_id);
  	READ_BOOL_FIELD(canSetTag);
  	READ_NODE_FIELD(utilityStmt);
  	READ_INT_FIELD(resultRelation);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 2e8ea5a..ae011ae
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** standard_planner(Query *parse, int curso
*** 240,245 ****
--- 240,246 ----
  	result->relationOids = glob->relationOids;
  	result->invalItems = glob->invalItems;
  	result->nParamExec = list_length(glob->paramlist);
+ 	result->queryId = parse->query_id;
  
  	return result;
  }
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
new file mode 100644
index be6e93e..fc43bed
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
*************** static Query *transformExplainStmt(Parse
*** 65,73 ****
  static void transformLockingClause(ParseState *pstate, Query *qry,
  					   LockingClause *lc, bool pushedDown);
  
  
  /*
!  * parse_analyze
   *		Analyze a raw parse tree and transform it to Query form.
   *
   * Optionally, information about $n parameter types can be supplied.
--- 65,89 ----
  static void transformLockingClause(ParseState *pstate, Query *qry,
  					   LockingClause *lc, bool pushedDown);
  
+ /* Hooks for plugins to get control of parse analysis */
+ parse_analyze_hook_type				parse_analyze_hook = NULL;
+ parse_analyze_varparams_hook_type	parse_analyze_varparams_hook = NULL;
+ 
+ 
+ Query *
+ parse_analyze(Node *parseTree, const char *sourceText,
+ 			  Oid *paramTypes, int numParams)
+ {
+ 	if (parse_analyze_hook)
+ 		return (*parse_analyze_hook) (parseTree, sourceText,
+ 			  paramTypes, numParams);
+ 	else
+ 		return standard_parse_analyze(parseTree, sourceText,
+ 			  paramTypes, numParams);
+ }
  
  /*
!  * standard_parse_analyze
   *		Analyze a raw parse tree and transform it to Query form.
   *
   * Optionally, information about $n parameter types can be supplied.
*************** static void transformLockingClause(Parse
*** 78,84 ****
   * a dummy CMD_UTILITY Query node.
   */
  Query *
! parse_analyze(Node *parseTree, const char *sourceText,
  			  Oid *paramTypes, int numParams)
  {
  	ParseState *pstate = make_parsestate(NULL);
--- 94,100 ----
   * a dummy CMD_UTILITY Query node.
   */
  Query *
! standard_parse_analyze(Node *parseTree, const char *sourceText,
  			  Oid *paramTypes, int numParams)
  {
  	ParseState *pstate = make_parsestate(NULL);
*************** parse_analyze(Node *parseTree, const cha
*** 98,112 ****
  	return query;
  }
  
  /*
!  * parse_analyze_varparams
   *
   * This variant is used when it's okay to deduce information about $n
   * symbol datatypes from context.  The passed-in paramTypes[] array can
   * be modified or enlarged (via repalloc).
   */
  Query *
! parse_analyze_varparams(Node *parseTree, const char *sourceText,
  						Oid **paramTypes, int *numParams)
  {
  	ParseState *pstate = make_parsestate(NULL);
--- 114,140 ----
  	return query;
  }
  
+ Query *
+ parse_analyze_varparams(Node *parseTree, const char *sourceText,
+ 						Oid **paramTypes, int *numParams)
+ {
+ 	if (parse_analyze_varparams_hook)
+ 		return (*parse_analyze_varparams_hook) (parseTree, sourceText,
+ 						paramTypes, numParams);
+ 	else
+ 		return standard_parse_analyze_varparams(parseTree, sourceText,
+ 			  paramTypes, numParams);
+ }
+ 
  /*
!  * standard_parse_analyze_varparams
   *
   * This variant is used when it's okay to deduce information about $n
   * symbol datatypes from context.  The passed-in paramTypes[] array can
   * be modified or enlarged (via repalloc).
   */
  Query *
! standard_parse_analyze_varparams(Node *parseTree, const char *sourceText,
  						Oid **paramTypes, int *numParams)
  {
  	ParseState *pstate = make_parsestate(NULL);
*************** transformSelectStmt(ParseState *pstate,
*** 877,882 ****
--- 905,911 ----
  	ListCell   *l;
  
  	qry->commandType = CMD_SELECT;
+ 	qry->query_id = 0;
  
  	/* process the WITH clause independently of all else */
  	if (stmt->withClause)
diff --git a/src/include/access/hash.h b/src/include/access/hash.h
new file mode 100644
index a3d0f98..464d772
*** a/src/include/access/hash.h
--- b/src/include/access/hash.h
*************** typedef HashMetaPageData *HashMetaPage;
*** 239,244 ****
--- 239,246 ----
   */
  #define HASHPROC		1
  
+ #define hash_any(k, keylen) (hash_any_var_width(k, keylen, true))
+ #define hash_any64(k, keylen) (hash_any_var_width(k, keylen, false))
  
  /* public routines */
  
*************** extern Datum hashint2vector(PG_FUNCTION_
*** 278,284 ****
  extern Datum hashname(PG_FUNCTION_ARGS);
  extern Datum hashtext(PG_FUNCTION_ARGS);
  extern Datum hashvarlena(PG_FUNCTION_ARGS);
! extern Datum hash_any(register const unsigned char *k, register int keylen);
  extern Datum hash_uint32(uint32 k);
  
  /* private routines */
--- 280,286 ----
  extern Datum hashname(PG_FUNCTION_ARGS);
  extern Datum hashtext(PG_FUNCTION_ARGS);
  extern Datum hashvarlena(PG_FUNCTION_ARGS);
! extern Datum hash_any_var_width(register const unsigned char *k, register int keylen, bool width_32);
  extern Datum hash_uint32(uint32 k);
  
  /* private routines */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
new file mode 100644
index 1d33ceb..d4133f2
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
*************** typedef struct Query
*** 103,108 ****
--- 103,111 ----
  
  	QuerySource querySource;	/* where did I come from? */
  
+ 	uint64		query_id;		/* query identifier that can be set by plugins.
+ 								 * Will be copied to resulting PlannedStmt. */
+ 
  	bool		canSetTag;		/* do I set the command result tag? */
  
  	Node	   *utilityStmt;	/* non-null if this is DECLARE CURSOR or a
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 7d90b91..f520085
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef struct PlannedStmt
*** 67,72 ****
--- 67,74 ----
  	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
  
  	int			nParamExec;		/* number of PARAM_EXEC Params used */
+ 
+ 	uint64		queryId;		/* query identifier carried from query tree */
  } PlannedStmt;
  
  /* macro for fetching the Plan associated with a SubPlan node */
diff --git a/src/include/parser/analyze.h b/src/include/parser/analyze.h
new file mode 100644
index b8987db..2bad10f
*** a/src/include/parser/analyze.h
--- b/src/include/parser/analyze.h
***************
*** 16,26 ****
--- 16,38 ----
  
  #include "parser/parse_node.h"
  
+ /* Hook for plugins to get control in parse_analyze() */
+ typedef Query* (*parse_analyze_hook_type) (Node *parseTree, const char *sourceText,
+ 			  Oid *paramTypes, int numParams);
+ extern PGDLLIMPORT parse_analyze_hook_type parse_analyze_hook;
+ /* Hook for plugins to get control in parse_analyze_varparams() */
+ typedef Query* (*parse_analyze_varparams_hook_type) (Node *parseTree, const char *sourceText,
+ 						Oid **paramTypes, int *numParams);
+ extern PGDLLIMPORT parse_analyze_varparams_hook_type parse_analyze_varparams_hook;
  
  extern Query *parse_analyze(Node *parseTree, const char *sourceText,
  			  Oid *paramTypes, int numParams);
+ extern Query *standard_parse_analyze(Node *parseTree, const char *sourceText,
+ 			  Oid *paramTypes, int numParams);
  extern Query *parse_analyze_varparams(Node *parseTree, const char *sourceText,
  						Oid **paramTypes, int *numParams);
+ extern Query *standard_parse_analyze_varparams(Node *parseTree, const char *sourceText,
+ 						Oid **paramTypes, int *numParams);
  
  extern Query *parse_sub_analyze(Node *parseTree, ParseState *parentParseState,
  				  CommonTableExpr *parentCTE,

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Peter Geoghegan (#1)

Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 16 February 2012 21:11, Peter Geoghegan <peter@2ndquadrant.com> wrote:

* # XXX: This test currently fails!:
#verify_normalizes_correctly("SELECT cast('1' as dnotnull);","SELECT
cast(? as dnotnull);",conn, "domain literal canonicalization/cast")

It appears to fail because the CoerceToDomain node gives its location
to the constant node as starting from "cast", so we end up with
"SELECT ?('1' as dnotnull);". I'm not quite sure if this points to
there being a slight tension with my use of the location field in this
way, or if this is something that could be fixed as a bug in core
(albeit a highly obscure one), though I suspect the latter.

So I looked at this in more detail today, and it turns out that it has
nothing to do with CoerceToDomain in particular. The same effect can
be observed by doing this:

select cast('foo' as text);

In turns out that this happens for the same reason as the location of
the Const token in the following query:

select integer 5;

being given such that the string "select ?" results.

Resolving this one issue resolves some others, as it allows me to
greatly simplify the get_constant_length() logic.

Here is the single, hacky change I've made just for now to the core
parser to quickly see if it all works as expected:

*************** transformTypeCast(ParseState *pstate, Ty
*** 2108,2113 ****
--- 2108,2116 ----
  	if (location < 0)
  		location = tc->typeName->location;

+ 	if (IsA(expr, Const))
+ 		location = ((Const*)expr)->location;
+
  	result = coerce_to_target_type(pstate, expr, inputType,
  								   targetType, targetTypmod,
  								   COERCION_EXPLICIT,

After making this change, I can get all my regression tests to pass
(once I change the normalised representation of certain queries to
look like: "select integer ?" rather than "select ?", which is better
anyway), including the CAST()/CoerceToDomain one that previously
failed. So far so good.

Clearly this change is a quick and dirty workaround, and something
better is required. The question I'd pose to the maintainer of this
code is: what business does the coerce_to_target_type call have
changing the location of the Const node resulting from coercion under
the circumstances described? I understand that the location of the
CoerceToDomain should be at "CAST", but why should the underlying
Const's position be the same? Do you agree that this is a bug, and if
so, would you please facilitate me by committing a fix?

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Peter Geoghegan (#2)

Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 20 February 2012 23:16, Peter Geoghegan <peter@2ndquadrant.com> wrote:

Clearly this change is a quick and dirty workaround, and something
better is required. The question I'd pose to the maintainer of this
code is: what business does the coerce_to_target_type call have
changing the location of the Const node resulting from coercion under
the circumstances described? I understand that the location of the
CoerceToDomain should be at "CAST", but why should the underlying
Const's position be the same?

Another look around shows that the CoerceToDomain struct takes its
location from the new Const location in turn, so my dirty little hack
will break the location of the CoerceToDomain struct, giving an
arguably incorrect caret position in some error messages. It would
suit me if MyCoerceToDomain->arg (or the "arg" of a similar node
related to coercion, like CoerceViaIO) pointed to a Const node with,
potentially, and certainly in the case of my original CoerceToDomain
test case, a distinct location to the coercion node itself.

Can we do that?

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#2)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

Here is the single, hacky change I've made just for now to the core
parser to quickly see if it all works as expected:

*************** transformTypeCast(ParseState *pstate, Ty
*** 2108,2113 ****
--- 2108,2116 ----
if (location < 0)
location = tc->typeName->location;

+ 	if (IsA(expr, Const))
+ 		location = ((Const*)expr)->location;
+
result = coerce_to_target_type(pstate, expr, inputType,
targetType, targetTypmod,
COERCION_EXPLICIT,

This does not look terribly sane to me. AFAICS, the main effect of this
would be that if you have an error in coercing a literal to some
specified type, the error message would point at the literal and not
at the cast operator. That is, in examples like these:

regression=# select 42::point;
ERROR: cannot cast type integer to point
LINE 1: select 42::point;
^
regression=# select cast (42 as point);
ERROR: cannot cast type integer to point
LINE 1: select cast (42 as point);
^

you're proposing to move the error pointer to the "42", and that does
not seem like an improvement, especially not if it only happens when the
cast subject is a simple constant rather than an expression.

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#3)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

Another look around shows that the CoerceToDomain struct takes its
location from the new Const location in turn, so my dirty little hack
will break the location of the CoerceToDomain struct, giving an
arguably incorrect caret position in some error messages. It would
suit me if MyCoerceToDomain->arg (or the "arg" of a similar node
related to coercion, like CoerceViaIO) pointed to a Const node with,
potentially, and certainly in the case of my original CoerceToDomain
test case, a distinct location to the coercion node itself.

Sorry, I'm not following. What about that isn't true already?

regards, tom lane

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#4)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 21 February 2012 01:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

you're proposing to move the error pointer to the "42", and that does
not seem like an improvement, especially not if it only happens when the
cast subject is a simple constant rather than an expression.

I'm not actually proposing that though. What I'm proposing, quite
simply, is that the Const location actually be correct in all
circumstances. Now, I can understand why the Coercion node for this
query would have its current location starting from the "CAST" part in
your second example or would happen to be the same as the Constant in
your first, and I'm not questioning that. I'm questioning why the
Const node's location need to *always* be the same as that of the
Coercion node when pg_stat_statements walks the tree, since I'd have
imagined that Postgres has no business blaming the error that you've
shown on the Const node.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#4)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 21 February 2012 01:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

you're proposing to move the error pointer to the "42", and that does
not seem like an improvement, especially not if it only happens when the
cast subject is a simple constant rather than an expression.

2008's commit a2794623d292f7bbfe3134d1407281055acce584 [1]http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a2794623d292f7bbfe3134d1407281055acce584 added the
following code to parse_coerce.c [2]http://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/backend/parser/parse_coerce.c;h=cd9b7b0cfbed03ec74f2cf295e4a7113627d7f72;hp=1244498ffb291b67d35917a6fdddb54b0d8d759d;hb=a2794623d292f7bbfe3134d1407281055acce584;hpb=6734182c169a1ecb74dd8495004e896ee4519adb:

/* Use the leftmost of the constant's and coercion's locations */
if (location < 0)
newcon->location = con->location;
else if (con->location >= 0 && con->location < location)
newcon->location = con->location;
else
newcon->location = location;

With that commit, Tom made a special case of both Const and Param
nodes, and had them take the leftmost location of the original Const
location and the coercion location. Clearly, he judged that the
current exact set of behaviours with regard to caret position were
optimal. It is my contention that:

A. They may not actually be optimal, at least not according to my
taste. At the very least, it is a hack to misrepresent the location of
Const nodes just so the core system can blame things on Const nodes
and have the user see the coercion being at fault. I appreciate that
it wouldn't have seemed to matter at the time, but the fact remains.

B. The question of where the caret goes in relevant cases - the
location of the coercion, or the location of the constant - is
inconsequential to the vast majority of Postgres users, if not all,
even if the existing behaviour is technically superior according to
the prevailing aesthetic. On the other hand, it matters a lot to me
that I be able to trust the Const location under all circumstances -
I'd really like to not have to engineer a way around this behaviour,
because the only way to do that is with tricks with the low-level
scanner API, which would be quite brittle. The fact that "select
integer '5'" is canonicalised to "select ?" isn't very pretty. That's
not the only issue though, as even to get that more limited behaviour
lots more code is required, that is more difficult to verify as
correct. "Canonicalise one token at each Const location" is a simple
and robust approach, if only the core system could be tweaked to make
this assumption hold in all circumstances, rather than just the vast
majority.

Tom's point example does not seem to be problematic to me - the cast
*should* blame the 42 const token, as the cast doesn't work as a
result of its representation, which is in point of fact why the core
system blames the Const node and not the coercion one. For that
reason, the constant vs expression thing strikes me as false
equivalency. All of that said, I must reiterate that the difference in
behaviour strike me as very unimportant, or it would if it was not so
important to what I'm trying to do with pg_stat_statements.

Can this be accommodated? It might be a matter of changing the core
system to blame the coercion node rather than the Const node, if
you're determined to preserve the existing behaviour.

[1]: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a2794623d292f7bbfe3134d1407281055acce584

[2]: http://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/backend/parser/parse_coerce.c;h=cd9b7b0cfbed03ec74f2cf295e4a7113627d7f72;hp=1244498ffb291b67d35917a6fdddb54b0d8d759d;hb=a2794623d292f7bbfe3134d1407281055acce584;hpb=6734182c169a1ecb74dd8495004e896ee4519adb

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Peter Geoghegan (#7)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Fri, Feb 24, 2012 at 9:43 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

Tom's point example does not seem to be problematic to me - the cast
*should* blame the 42 const token, as the cast doesn't work as a
result of its representation, which is in point of fact why the core
system blames the Const node and not the coercion one.

I think I agree Tom's position upthread: blaming the coercion seems to
me to make more sense. But if that's what we're trying to do, then
why does parse_coerce() say this?

/*
* Set up to point at the constant's text if the input routine throws
* an error.
*/

/me is confused.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Robert Haas (#8)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Robert Haas <robertmhaas@gmail.com> writes:

I think I agree Tom's position upthread: blaming the coercion seems to
me to make more sense. But if that's what we're trying to do, then
why does parse_coerce() say this?

/*
* Set up to point at the constant's text if the input routine throws
* an error.
*/

/me is confused.

There are two cases that are fundamentally different in the eyes of the
system:

'literal string'::typename defines a constant of the named type.
The string is fed to the type's input routine de novo, that is, it never
really had any other type. (Under the hood, it had type UNKNOWN for a
short time, but that's an implementation detail.) In this situation it
seems appropriate to point at the text string if the input routine
doesn't like it, because it is the input string and nothing else that is
wrong.

On the other hand, when you cast something that already had a known type
to some other type, any failure seems reasonable to blame on the cast
operator.

So in these terms there's a very real difference between what
'42'::bigint means and what 42::bigint means --- the latter implies
forming an int4 constant and then converting it to int8.

I think that what Peter is on about in
http://archives.postgresql.org/pgsql-hackers/2012-02/msg01152.php
is the question of what location to use for the *result* of
'literal string'::typename, assuming that the type's input function
doesn't complain. Generally we consider that we should use the
leftmost token's location for the location of any expression composed
of more than one input token. This is of course the same place for
'literal string'::typename, but not for the alternate syntaxes
typename 'literal string' and cast('literal string' as typename).
I'm not terribly impressed by the proposal to put in an arbitrary
exception to that general rule for the convenience of this patch.

Especially not when the only reason it's needed is that Peter is
doing the fingerprinting at what is IMO the wrong place anyway.
If he were working on the raw grammar output it wouldn't matter
what parse_coerce chooses to do afterwards.

regards, tom lane

#10

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#9)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 27 February 2012 06:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think that what Peter is on about in
http://archives.postgresql.org/pgsql-hackers/2012-02/msg01152.php
is the question of what location to use for the *result* of
'literal string'::typename, assuming that the type's input function
doesn't complain. Generally we consider that we should use the
leftmost token's location for the location of any expression composed
of more than one input token. This is of course the same place for
'literal string'::typename, but not for the alternate syntaxes
typename 'literal string' and cast('literal string' as typename).
I'm not terribly impressed by the proposal to put in an arbitrary
exception to that general rule for the convenience of this patch.

Now, you don't have to be. It was a mistake on my part to bring the
current user-visible behaviour into this. I don't see that there is
necessarily a tension between your position that we should blame the
leftmost token's location, and my contention that the Const "location"
field shouldn't misrepresent the location of certain Consts involved
in coercion post-analysis.

Let me put that in concrete terms. In my working copy of the patch, I
have made some more changes to the core system (mostly reverting
things that turned out to be unnecessary), but I have also made the
following change:

*** a/src/backend/parser/parse_coerce.c
--- b/src/backend/parser/parse_coerce.c
*************** coerce_type(ParseState *pstate, Node *no
*** 280,293 ****
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
! 		/* Use the leftmost of the constant's and coercion's locations */
! 		if (location < 0)
! 			newcon->location = con->location;
! 		else if (con->location >= 0 && con->location < location)
! 			newcon->location = con->location;
! 		else
! 			newcon->location = location;
!
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
--- 280,286 ----
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
! 		newcon->location = con->location;
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
*********************

This does not appear to have any user-visible effect on caret position
for all variations in coercion syntax, while giving me everything that
I need. I had assumed that we were relying on things being this way,
but apparently this is not the case. The system is correctly blaming
the coercion token when it finds the coercion is at fault, and the
const token when it finds the Const node at fault, just as it did
before. So this looks like a case of removing what amounts to dead
code.

Especially not when the only reason it's needed is that Peter is
doing the fingerprinting at what is IMO the wrong place anyway.
If he were working on the raw grammar output it wouldn't matter
what parse_coerce chooses to do afterwards.

Well, I believe that your reason for preferring to do it at that stage
was that we could not capture all of the system's "normalisation
smarts", like the fact that the omission of noise words isn't a
differentiator, so we might as well not have any. This was because
much of it - like the recognition of the equivalence of explicit joins
and queries with join conditions in the where clause - occurs within
the planner. We can't have it all, so we might as well not have any.
My solution here is that we be sufficiently vague about the behaviour
of normalisation that the user has no reasonable basis to count on
that kind of more advanced reduction occurring.

I did very seriously consider hashing the raw parse tree, but I have
several practical reasons for not doing so. Whatever way you look at
it, hashing there is going to result in more code, that is more ugly.
There is no uniform parent node that I can tag with a query_id. There
has to be more modifications to the core system so that queryId value
is carried around more places and persists for longer. The fact that
I'd actually be hashing different structs at different times (that
tree is accessed through a Node pointer) would necessitate lots of
redundant code that operated on each of the very similar structs in an
analogous way. The fact is that waiting until after parse analysis has
plenty of things to recommend it, and yes, the fact that we already
have working code with extensive regression tests is one of them.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#11

Daniel Farina

daniel@heroku.com

almost 14 years ago

In reply to: Peter Geoghegan (#10)

1 attachment(s)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Mon, Feb 27, 2012 at 4:26 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

This does not appear to have any user-visible effect on caret position
for all variations in coercion syntax, while giving me everything that
I need. I had assumed that we were relying on things being this way,
but apparently this is not the case. The system is correctly blaming
the coercion token when it finds the coercion is at fault, and the
const token when it finds the Const node at fault, just as it did
before. So this looks like a case of removing what amounts to dead
code.

To shed some light on that hypothesis, attached is a patch whereby I
use 'semantic analysis by compiler error' to show the extent of the
reach of the changes by renaming (codebase-wide) the Const node's
location symbol. The scope whereby the error token will change
position is small and amenable to analysis. I don't see a problem,
nor wide-reaching consequences. As Peter says: probably dead code.
Note that the cancellation of the error position happens very soon,
after an invocation of stringTypeDatum (on two sides of a branch).
Pre and post-patch is puts the carat at the beginning of the constant
string, even in event there is a failure to parse it properly to the
destined type.

--
fdr

Attachments:

Straw-man-to-show-the-effects-of-the-change-to-const.patchapplication/octet-stream; name=Straw-man-to-show-the-effects-of-the-change-to-const.patchDownload

From 0cbdbd17c6d33398ffe8fb1c7f2a778503764bc2 Mon Sep 17 00:00:00 2001
From: Daniel Farina <daniel@heroku.com>
Date: Wed, 29 Feb 2012 00:30:18 -0800
Subject: [PATCH] Straw man to show the effects of the change to const
 location

Signed-off-by: Daniel Farina <daniel@heroku.com>
---
 src/backend/nodes/copyfuncs.c     |    2 +-
 src/backend/nodes/makefuncs.c     |    2 +-
 src/backend/nodes/nodeFuncs.c     |    2 +-
 src/backend/nodes/outfuncs.c      |    2 +-
 src/backend/nodes/readfuncs.c     |    2 +-
 src/backend/parser/parse_coerce.c |   10 ++--------
 src/backend/parser/parse_expr.c   |    2 +-
 src/backend/parser/parse_node.c   |    4 ++--
 src/include/nodes/primnodes.h     |    2 +-
 9 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index c9133dd..078c645 100644
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 1086,1092 **** _copyConst(Const *from)
  
  	COPY_SCALAR_FIELD(constisnull);
  	COPY_SCALAR_FIELD(constbyval);
! 	COPY_LOCATION_FIELD(location);
  
  	return newnode;
  }
--- 1086,1092 ----
  
  	COPY_SCALAR_FIELD(constisnull);
  	COPY_SCALAR_FIELD(constbyval);
! 	COPY_LOCATION_FIELD(constlocation);
  
  	return newnode;
  }
*** a/src/backend/nodes/makefuncs.c
--- b/src/backend/nodes/makefuncs.c
***************
*** 287,293 **** makeConst(Oid consttype,
  	cnst->constvalue = constvalue;
  	cnst->constisnull = constisnull;
  	cnst->constbyval = constbyval;
! 	cnst->location = -1;		/* "unknown" */
  
  	return cnst;
  }
--- 287,293 ----
  	cnst->constvalue = constvalue;
  	cnst->constisnull = constisnull;
  	cnst->constbyval = constbyval;
! 	cnst->constlocation = -1;		/* "unknown" */
  
  	return cnst;
  }
*** a/src/backend/nodes/nodeFuncs.c
--- b/src/backend/nodes/nodeFuncs.c
***************
*** 1093,1099 **** exprLocation(Node *expr)
  			loc = ((Var *) expr)->location;
  			break;
  		case T_Const:
! 			loc = ((Const *) expr)->location;
  			break;
  		case T_Param:
  			loc = ((Param *) expr)->location;
--- 1093,1099 ----
  			loc = ((Var *) expr)->location;
  			break;
  		case T_Const:
! 			loc = ((Const *) expr)->constlocation;
  			break;
  		case T_Param:
  			loc = ((Param *) expr)->location;
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 921,927 **** _outConst(StringInfo str, Const *node)
  	WRITE_INT_FIELD(constlen);
  	WRITE_BOOL_FIELD(constbyval);
  	WRITE_BOOL_FIELD(constisnull);
! 	WRITE_LOCATION_FIELD(location);
  
  	appendStringInfo(str, " :constvalue ");
  	if (node->constisnull)
--- 921,927 ----
  	WRITE_INT_FIELD(constlen);
  	WRITE_BOOL_FIELD(constbyval);
  	WRITE_BOOL_FIELD(constisnull);
! 	WRITE_LOCATION_FIELD(constlocation);
  
  	appendStringInfo(str, " :constvalue ");
  	if (node->constisnull)
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
***************
*** 432,438 **** _readConst(void)
  	READ_INT_FIELD(constlen);
  	READ_BOOL_FIELD(constbyval);
  	READ_BOOL_FIELD(constisnull);
! 	READ_LOCATION_FIELD(location);
  
  	token = pg_strtok(&length); /* skip :constvalue */
  	if (local_node->constisnull)
--- 432,438 ----
  	READ_INT_FIELD(constlen);
  	READ_BOOL_FIELD(constbyval);
  	READ_BOOL_FIELD(constisnull);
! 	READ_LOCATION_FIELD(constlocation);
  
  	token = pg_strtok(&length); /* skip :constvalue */
  	if (local_node->constisnull)
*** a/src/backend/parser/parse_coerce.c
--- b/src/backend/parser/parse_coerce.c
***************
*** 280,298 **** coerce_type(ParseState *pstate, Node *node,
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
- 		/* Use the leftmost of the constant's and coercion's locations */
- 		if (location < 0)
- 			newcon->location = con->location;
- 		else if (con->location >= 0 && con->location < location)
- 			newcon->location = con->location;
- 		else
- 			newcon->location = location;
  
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
  		 */
! 		setup_parser_errposition_callback(&pcbstate, pstate, con->location);
  
  		/*
  		 * We assume here that UNKNOWN's internal representation is the same
--- 280,292 ----
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
  
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
  		 */
! 		setup_parser_errposition_callback(&pcbstate, pstate,
! 										  con->constlocation);
  
  		/*
  		 * We assume here that UNKNOWN's internal representation is the same
*** a/src/backend/parser/parse_expr.c
--- b/src/backend/parser/parse_expr.c
***************
*** 1067,1073 **** transformAExprOf(ParseState *pstate, A_Expr *a)
  	result = (Const *) makeBoolConst(matched, false);
  
  	/* Make the result have the original input's parse location */
! 	result->location = exprLocation((Node *) a);
  
  	return (Node *) result;
  }
--- 1067,1073 ----
  	result = (Const *) makeBoolConst(matched, false);
  
  	/* Make the result have the original input's parse location */
! 	result->constlocation = exprLocation((Node *) a);
  
  	return (Node *) result;
  }
*** a/src/backend/parser/parse_node.c
--- b/src/backend/parser/parse_node.c
***************
*** 532,538 **** make_const(ParseState *pstate, Value *value, int location)
  							(Datum) 0,
  							true,
  							false);
! 			con->location = location;
  			return con;
  
  		default:
--- 532,538 ----
  							(Datum) 0,
  							true,
  							false);
! 			con->constlocation = location;
  			return con;
  
  		default:
***************
*** 547,553 **** make_const(ParseState *pstate, Value *value, int location)
  					val,
  					false,
  					typebyval);
! 	con->location = location;
  
  	return con;
  }
--- 547,553 ----
  					val,
  					false,
  					typebyval);
! 	con->constlocation = location;
  
  	return con;
  }
*** a/src/include/nodes/primnodes.h
--- b/src/include/nodes/primnodes.h
***************
*** 165,171 **** typedef struct Const
  								 * If true, then all the information is stored
  								 * in the Datum. If false, then the Datum
  								 * contains a pointer to the information. */
! 	int			location;		/* token location, or -1 if unknown */
  } Const;
  
  /* ----------------
--- 165,171 ----
  								 * If true, then all the information is stored
  								 * in the Datum. If false, then the Datum
  								 * contains a pointer to the information. */
! 	int			constlocation;		/* token location, or -1 if unknown */
  } Const;
  
  /* ----------------

#12

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Daniel Farina (#11)

3 attachment(s)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 29 February 2012 09:05, Daniel Farina <daniel@heroku.com> wrote:

To shed some light on that hypothesis, attached is a patch whereby I
use 'semantic analysis by compiler error' to show the extent of the
reach of the changes by renaming (codebase-wide) the Const node's
location symbol. The scope whereby the error token will change
position is small and amenable to analysis. I don't see a problem,
nor wide-reaching consequences. As Peter says: probably dead code.

Thanks for confirming that.

I decided to benchmark this patch against the same server with
shared_preload_libraries commented out. I chose a quite unsympathetic
pgbench-tools benchmark - the pgbench-tools config is attached. This
is the same server/configuration that I used for my recent page
checksums benchmark. I've thrown the full report up on:

http://pgbenchstatstatements.staticloud.com/

Executive summary:

It looks like we take a 1% - 2.5% hit. On a workload like this, where
parser overhead is high, that isn't bad at all, and seems at most
marginally worse than classic pg_stat_statements with prepared
statements, according to independently produced benchmarks that I've
seen. Had I benchmarked "-M prepared", I wouldn't be surprised if
there was an improvement over classic pg_stat_statements for some
workloads, since the pgss_match_fn logic doesn't involve a strcmp now
- it just compares scalar values. There is no question of there being
a performance regression. Certainly, this patch adds a very practical
feature, vastly more practical than auto_explain currently is, for
example. I didn't choose the most unsympathetic of all benchmarks that
could be easily conducted, which would have been a select-only
workload, which executes very simple select statements only, as fast
as it possibly can. I only avoided that because the tpc-b.sql workload
seems to be recognised as the most useful and objective workload for
general purpose benchmarks.

I've attached the revision of the patch that was benchmarked. There
have been a few changes, mostly bug-fixes and clean-ups, including:

* Most notably, I went ahead and made the required changes to parse
coercion's alteration of Const location, while also tweaking similar
logic for Param location analogously, though that change was purely
for consistency and not out of any practical need to do so.

* Removing the unneeded alteration gave me leeway to considerably
clean up the scanner logic, which doesn't care about which particular
type of token is scanned anymore. There is a single invocation per
query string to be canonicalised (i.e. for each first call of the
query not in the shared hash table). This seems a lot more robust and
correct (in terms of how it canonicalises queries like: select integer
'5') than the workaround that I had in the last revision, written when
it wasn't clear that I'd be able to get the core system to
consistently tell the truth about Const location.

* We no longer canonicalise query strings in the event of prepared
statements, while still walking the query tree to compute a queryId.
Of course, an additional benefit of this patch is that it allows
differentiation of queries that only differ beyond
track_activity_query_size bytes, which is a benefit that I want for
prepared statements too.

* The concept of a "sticky" entry is introduced; this prevents queries
from being evicted after parse analysis/canonicalisation but before a
reprieve-delivering query execution. There is still no absolute,
iron-clad guarantee that this can't happen, but it is probably
impossible for all practical purposes, and even when it does happen,
the only consequence is that a query string with some old,
uncanonicalized constants is seen, probably before being immediately
evicted anyway due to the extreme set of circumstances that would have
been required to produce that failure mode. If, somehow, a sticky
entry is never demoted to a regular entry in the corresponding
executor hook call, which ought to be impossible, that sticky entry
still won't survive a restart, so problems with the shared hash table
getting clogged with sticky entries should never occur. Prepared
statements will add zero call entries to the table during their
initial parse analysis, but these entries are not sticky, and have
their "usage" value initialised just as before.

* 32-bit hash values are now used. Fewer changes still to core code generally.

* Merged against master - Robert's changes would have prevented my
earlier patch from cleanly applying.

* Even more tests! Updated regression tests attached, with a total of
289 tests. Those aside, I found the fuzz testing of third party
regression tests that leverage Postgres to be useful. Daniel pointed
out to me that the SQL Alchemy regression tests broke the patch due to
an assertion failure. Obviously I've fixed that, so both the standard
postgres and the SQL Alchemy tests do not present the patch with any
difficulties. They are both fairly extensive.

Thoughts?

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

configapplication/octet-stream; name=configDownload

pg_stat_statements_norm_2012_02_29.patchtext/x-patch; charset=US-ASCII; name=pg_stat_statements_norm_2012_02_29.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
new file mode 100644
index 914fbf2..b660d0d
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
***************
*** 10,15 ****
--- 10,35 ----
   * an entry, one must hold the lock shared or exclusive (so the entry doesn't
   * disappear!) and also take the entry's mutex spinlock.
   *
+  * As of Postgres 9.2, this module normalizes query strings. Normalization is a
+  * process whereby similar queries, typically differing only in their constants
+  * (though the exact rules are somewhat more subtle than that) are recognized as
+  * equivalent, and are tracked as a single entry. This is particularly useful
+  * for non-prepared queries.
+  *
+  * Normalization is implemented by selectively serializing those fields of each
+  * query tree's nodes that are judged to be essential to the nature of the
+  * query.  This is referred to as a query jumble. This is distinct from a
+  * straight serialization of the query tree in that various extraneous
+  * information is ignored as irrelevant or not essential to the query, such as
+  * the collation of Vars, and, most notably, the value of constants. Once this
+  * jumble is acquired, a 32-bit hash is taken, which is copied back into the
+  * query tree at the post-analysis stage.  Postgres then naively copies this
+  * value around, making it later available from within the corresponding plan
+  * tree. The executor can then use this value to blame query costs on a known
+  * queryId.
+  *
+  * Within the executor hook, the module stores the cost of query  execution,
+  * based on a queryId provided by the core system.
   *
   * Copyright (c) 2008-2012, PostgreSQL Global Development Group
   *
***************
*** 27,38 ****
--- 47,62 ----
  #include "funcapi.h"
  #include "mb/pg_wchar.h"
  #include "miscadmin.h"
+ #include "parser/analyze.h"
+ #include "parser/parsetree.h"
+ #include "parser/scanner.h"
  #include "pgstat.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "storage/spin.h"
  #include "tcop/utility.h"
  #include "utils/builtins.h"
+ #include "utils/memutils.h"
  
  
  PG_MODULE_MAGIC;
*************** PG_MODULE_MAGIC;
*** 41,54 ****
  #define PGSS_DUMP_FILE	"global/pg_stat_statements.stat"
  
  /* This constant defines the magic number in the stats file header */
! static const uint32 PGSS_FILE_HEADER = 0x20100108;
  
  /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
  #define USAGE_EXEC(duration)	(1.0)
  #define USAGE_INIT				(1.0)	/* including initial planning */
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */
! 
  /*
   * Hashtable key that defines the identity of a hashtable entry.  The
   * hash comparators do not assume that the query string is null-terminated;
--- 65,84 ----
  #define PGSS_DUMP_FILE	"global/pg_stat_statements.stat"
  
  /* This constant defines the magic number in the stats file header */
! static const uint32 PGSS_FILE_HEADER = 0x20120103;
  
  /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
  #define USAGE_EXEC(duration)	(1.0)
  #define USAGE_INIT				(1.0)	/* including initial planning */
+ #define USAGE_NON_EXEC_STICK	(1.0e10)/* unexecuted queries sticky */
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */
! #define JUMBLE_SIZE				1024    /* query serialization buffer size */
! /* Magic values for jumble */
! #define MAG_HASH_BUF			0xFA	/* buffer is a hash of query tree */
! #define MAG_STR_BUF				0xEB	/* buffer is query string itself */
! #define MAG_RETURN_LIST			0xAE	/* returning list node follows */
! #define MAG_LIMIT_OFFSET		0xBA	/* limit/offset node follows */
  /*
   * Hashtable key that defines the identity of a hashtable entry.  The
   * hash comparators do not assume that the query string is null-terminated;
*************** typedef struct pgssHashKey
*** 63,70 ****
  	Oid			userid;			/* user OID */
  	Oid			dbid;			/* database OID */
  	int			encoding;		/* query encoding */
! 	int			query_len;		/* # of valid bytes in query string */
! 	const char *query_ptr;		/* query string proper */
  } pgssHashKey;
  
  /*
--- 93,99 ----
  	Oid			userid;			/* user OID */
  	Oid			dbid;			/* database OID */
  	int			encoding;		/* query encoding */
! 	uint32		queryid;		/* query identifier */
  } pgssHashKey;
  
  /*
*************** typedef struct pgssEntry
*** 97,102 ****
--- 126,132 ----
  {
  	pgssHashKey key;			/* hash key of entry - MUST BE FIRST */
  	Counters	counters;		/* the statistics for this query */
+ 	int			query_len;		/* # of valid bytes in query string */
  	slock_t		mutex;			/* protects the counters only */
  	char		query[1];		/* VARIABLE LENGTH ARRAY - MUST BE LAST */
  	/* Note: the allocated length of query[] is actually pgss->query_size */
*************** typedef struct pgssSharedState
*** 111,117 ****
--- 141,171 ----
  	int			query_size;		/* max query length in bytes */
  } pgssSharedState;
  
+ typedef struct pgssLocationLen
+ {
+ 	int location;
+ 	int length;
+ } pgssLocationLen;
+ 
+ /*
+  * Last seen constant positions for a statement
+  */
+ typedef struct pgssQueryConEntry
+ {
+ 	pgssHashKey		key;			/* hash key of entry - MUST BE FIRST */
+ 	int				n_elems;		/* length of offsets array */
+ 	Size offsets[1];		/* VARIABLE LENGTH ARRAY - MUST BE LAST */
+ 	/* Note: the allocated length of offsets is actually n_elems */
+ } pgssQueryConEntry;
  /*---- Local variables ----*/
+ /* Jumble of current query tree */
+ static unsigned char *last_jumble = NULL;
+ /* Buffer that represents position of normalized characters */
+ static pgssLocationLen *last_offsets = NULL;
+ /* Current Length of last_offsets buffer */
+ static Size last_offset_buf_size = 10;
+ /* Current number of actual offsets stored in last_offsets */
+ static Size last_offset_num = 0;
  
  /* Current nesting depth of ExecutorRun calls */
  static int	nested_level = 0;
*************** static ExecutorRun_hook_type prev_Execut
*** 123,133 ****
--- 177,196 ----
  static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
  static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
  static ProcessUtility_hook_type prev_ProcessUtility = NULL;
+ static parse_analyze_hook_type prev_parse_analyze_hook = NULL;
+ static parse_analyze_varparams_hook_type prev_parse_analyze_varparams_hook = NULL;
  
  /* Links to shared memory state */
  static pgssSharedState *pgss = NULL;
  static HTAB *pgss_hash = NULL;
  
+ /*
+  * Maintain a stack of the rangetable of the query tree that we're currently
+  * walking, so subqueries can reference parent rangetables. The stack is pushed
+  * and popped as each Query struct is walked into or out of.
+  */
+ static List* pgss_rangetbl_stack = NIL;
+ 
  /*---- GUC variables ----*/
  
  typedef enum
*************** static int	pgss_max;			/* max # statemen
*** 149,154 ****
--- 212,218 ----
  static int	pgss_track;			/* tracking level */
  static bool pgss_track_utility; /* whether to track utility commands */
  static bool pgss_save;			/* whether to save stats across shutdown */
+ static bool pgss_string_key;	/* whether to always only hash query str */
  
  
  #define pgss_enabled() \
*************** PG_FUNCTION_INFO_V1(pg_stat_statements);
*** 168,173 ****
--- 232,255 ----
  
  static void pgss_shmem_startup(void);
  static void pgss_shmem_shutdown(int code, Datum arg);
+ static int comp_offset(const void *a, const void *b);
+ static Query *pgss_parse_analyze(Node *parseTree, const char *sourceText,
+ 			  Oid *paramTypes, int numParams);
+ static Query *pgss_parse_analyze_varparams(Node *parseTree, const char *sourceText,
+ 						Oid **paramTypes, int *numParams);
+ static void pgss_process_post_analysis_tree(Query* post_analysis_tree,
+ 		const char* sourceText, bool canonicalize);
+ static void fill_in_constant_lengths(const char* query,
+ 						pgssLocationLen offs[], Size n_offs);
+ static uint32 JumbleQuery(Query *post_analysis_tree);
+ static void AppendJumb(unsigned char* item, unsigned char jumble[], Size size, Size *i);
+ static void PerformJumble(const Query *tree, Size size, Size *i);
+ static void QualsNode(const OpExpr *node, Size size, Size *i, List *rtable);
+ static void LeafNode(const Node *arg, Size size, Size *i, List *rtable);
+ static void LimitOffsetNode(const Node *node, Size size, Size *i, List *rtable);
+ static void JoinExprNode(JoinExpr *node, Size size, Size *i, List *rtable);
+ static void JoinExprNodeChild(const Node *node, Size size, Size *i, List *rtable);
+ static void RecordConstLocation(int location);
  static void pgss_ExecutorStart(QueryDesc *queryDesc, int eflags);
  static void pgss_ExecutorRun(QueryDesc *queryDesc,
  				 ScanDirection direction,
*************** static void pgss_ProcessUtility(Node *pa
*** 179,188 ****
  					DestReceiver *dest, char *completionTag);
  static uint32 pgss_hash_fn(const void *key, Size keysize);
  static int	pgss_match_fn(const void *key1, const void *key2, Size keysize);
! static void pgss_store(const char *query, double total_time, uint64 rows,
! 		   const BufferUsage *bufusage);
  static Size pgss_memsize(void);
! static pgssEntry *entry_alloc(pgssHashKey *key);
  static void entry_dealloc(void);
  static void entry_reset(void);
  
--- 261,272 ----
  					DestReceiver *dest, char *completionTag);
  static uint32 pgss_hash_fn(const void *key, Size keysize);
  static int	pgss_match_fn(const void *key1, const void *key2, Size keysize);
! static uint32 pgss_hash_string(const char* str);
! static void pgss_store(const char *query, uint32 queryId,
! 				double total_time, uint64 rows,
! 				const BufferUsage *bufusage, bool empty_entry, bool canonicalize);
  static Size pgss_memsize(void);
! static pgssEntry *entry_alloc(pgssHashKey *key, const char* query, int new_query_len);
  static void entry_dealloc(void);
  static void entry_reset(void);
  
*************** static void entry_reset(void);
*** 193,198 ****
--- 277,283 ----
  void
  _PG_init(void)
  {
+ 	MemoryContext oldcontext;
  	/*
  	 * In order to create our shared memory area, we have to be loaded via
  	 * shared_preload_libraries.  If not, fall out without hooking into any of
*************** _PG_init(void)
*** 254,259 ****
--- 339,359 ----
  							 NULL,
  							 NULL);
  
+ 	/*
+ 	 * Support legacy pg_stat_statements behavior, for compatibility with
+ 	 * versions shipped with Postgres 8.4, 9.0 and 9.1
+ 	 */
+ 	DefineCustomBoolVariable("pg_stat_statements.string_key",
+ 			   "Differentiate queries based on query string alone.",
+ 							 NULL,
+ 							 &pgss_string_key,
+ 							 false,
+ 							 PGC_POSTMASTER,
+ 							 0,
+ 							 NULL,
+ 							 NULL,
+ 							 NULL);
+ 
  	EmitWarningsOnPlaceholders("pg_stat_statements");
  
  	/*
*************** _PG_init(void)
*** 265,270 ****
--- 365,382 ----
  	RequestAddinLWLocks(1);
  
  	/*
+ 	 * Allocate a buffer to store selective serialization of the query tree
+ 	 * for the purposes of query normalization.
+ 	 */
+ 	oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+ 
+ 	last_jumble = palloc(JUMBLE_SIZE);
+ 	/* Allocate space for bookkeeping information for query str normalization */
+ 	last_offsets = palloc(last_offset_buf_size * sizeof(pgssLocationLen));
+ 
+ 	MemoryContextSwitchTo(oldcontext);
+ 
+ 	/*
  	 * Install hooks.
  	 */
  	prev_shmem_startup_hook = shmem_startup_hook;
*************** _PG_init(void)
*** 279,284 ****
--- 391,400 ----
  	ExecutorEnd_hook = pgss_ExecutorEnd;
  	prev_ProcessUtility = ProcessUtility_hook;
  	ProcessUtility_hook = pgss_ProcessUtility;
+ 	prev_parse_analyze_hook = parse_analyze_hook;
+ 	parse_analyze_hook = pgss_parse_analyze;
+ 	prev_parse_analyze_varparams_hook = parse_analyze_varparams_hook;
+ 	parse_analyze_varparams_hook = pgss_parse_analyze_varparams;
  }
  
  /*
*************** _PG_fini(void)
*** 294,299 ****
--- 410,420 ----
  	ExecutorFinish_hook = prev_ExecutorFinish;
  	ExecutorEnd_hook = prev_ExecutorEnd;
  	ProcessUtility_hook = prev_ProcessUtility;
+ 	parse_analyze_hook = prev_parse_analyze_hook;
+ 	parse_analyze_varparams_hook = prev_parse_analyze_varparams_hook;
+ 
+ 	pfree(last_jumble);
+ 	pfree(last_offsets);
  }
  
  /*
*************** pgss_shmem_startup(void)
*** 397,423 ****
  		if (!PG_VALID_BE_ENCODING(temp.key.encoding))
  			goto error;
  
  		/* Previous incarnation might have had a larger query_size */
! 		if (temp.key.query_len >= buffer_size)
  		{
! 			buffer = (char *) repalloc(buffer, temp.key.query_len + 1);
! 			buffer_size = temp.key.query_len + 1;
  		}
  
! 		if (fread(buffer, 1, temp.key.query_len, file) != temp.key.query_len)
  			goto error;
! 		buffer[temp.key.query_len] = '\0';
  
  		/* Clip to available length if needed */
! 		if (temp.key.query_len >= query_size)
! 			temp.key.query_len = pg_encoding_mbcliplen(temp.key.encoding,
  													   buffer,
! 													   temp.key.query_len,
  													   query_size - 1);
- 		temp.key.query_ptr = buffer;
  
  		/* make the hashtable entry (discards old entries if too many) */
! 		entry = entry_alloc(&temp.key);
  
  		/* copy in the actual stats */
  		entry->counters = temp.counters;
--- 518,548 ----
  		if (!PG_VALID_BE_ENCODING(temp.key.encoding))
  			goto error;
  
+ 		/* Avoid loading sticky entries */
+ 		if (temp.counters.calls == 0)
+ 			continue;
+ 
  		/* Previous incarnation might have had a larger query_size */
! 		if (temp.query_len >= buffer_size)
  		{
! 			buffer = (char *) repalloc(buffer, temp.query_len + 1);
! 			buffer_size = temp.query_len + 1;
  		}
  
! 		if (fread(buffer, 1, temp.query_len, file) != temp.query_len)
  			goto error;
! 		buffer[temp.query_len] = '\0';
! 
  
  		/* Clip to available length if needed */
! 		if (temp.query_len >= query_size)
! 			temp.query_len = pg_encoding_mbcliplen(temp.key.encoding,
  													   buffer,
! 													   temp.query_len,
  													   query_size - 1);
  
  		/* make the hashtable entry (discards old entries if too many) */
! 		entry = entry_alloc(&temp.key, buffer, temp.query_len);
  
  		/* copy in the actual stats */
  		entry->counters = temp.counters;
*************** pgss_shmem_shutdown(int code, Datum arg)
*** 479,485 ****
  	hash_seq_init(&hash_seq, pgss_hash);
  	while ((entry = hash_seq_search(&hash_seq)) != NULL)
  	{
! 		int			len = entry->key.query_len;
  
  		if (fwrite(entry, offsetof(pgssEntry, mutex), 1, file) != 1 ||
  			fwrite(entry->query, 1, len, file) != len)
--- 604,610 ----
  	hash_seq_init(&hash_seq, pgss_hash);
  	while ((entry = hash_seq_search(&hash_seq)) != NULL)
  	{
! 		int			len = entry->query_len;
  
  		if (fwrite(entry, offsetof(pgssEntry, mutex), 1, file) != 1 ||
  			fwrite(entry->query, 1, len, file) != len)
*************** error:
*** 505,510 ****
--- 630,1667 ----
  }
  
  /*
+  * comp_offset: Comparator for qsorting pgssLocationLen values.
+  */
+ static int
+ comp_offset(const void *a, const void *b)
+ {
+ 	int l = ((pgssLocationLen*) a)->location;
+ 	int r = ((pgssLocationLen*) b)->location;
+ 	if (l < r)
+ 		return -1;
+ 	else if (l > r)
+ 		return +1;
+ 	else
+ 		return 0;
+ }
+ 
+ static Query *
+ pgss_parse_analyze(Node *parseTree, const char *sourceText,
+ 			  Oid *paramTypes, int numParams)
+ {
+ 	Query *post_analysis_tree;
+ 
+ 	if (prev_parse_analyze_hook)
+ 		post_analysis_tree = (*prev_parse_analyze_hook) (parseTree, sourceText,
+ 			  paramTypes, numParams);
+ 	else
+ 		post_analysis_tree = standard_parse_analyze(parseTree, sourceText,
+ 			  paramTypes, numParams);
+ 
+ 	if (!post_analysis_tree->utilityStmt)
+ 		pgss_process_post_analysis_tree(post_analysis_tree, sourceText,
+ 											numParams == 0);
+ 
+ 	return post_analysis_tree;
+ }
+ 
+ static Query *
+ pgss_parse_analyze_varparams(Node *parseTree, const char *sourceText,
+ 						Oid **paramTypes, int *numParams)
+ {
+ 	Query *post_analysis_tree;
+ 
+ 	if (prev_parse_analyze_hook)
+ 		post_analysis_tree = (*prev_parse_analyze_varparams_hook) (parseTree,
+ 				sourceText, paramTypes, numParams);
+ 	else
+ 		post_analysis_tree = standard_parse_analyze_varparams(parseTree,
+ 				sourceText, paramTypes, numParams);
+ 
+ 	if (!post_analysis_tree->utilityStmt)
+ 		pgss_process_post_analysis_tree(post_analysis_tree, sourceText,
+ 											false);
+ 
+ 	return post_analysis_tree;
+ }
+ 
+ /*
+  * pgss_process_post_analysis_tree: Record queryId, which is based on the query
+  * tree, within the tree itself, for later retrieval in the executor hook. The
+  * core system will copy the value to the tree's corresponding plannedstmt.
+  *
+  * Avoid producing a canonicalized string for parameterized queries. It is
+  * simply not desirable given that constants that we might otherwise
+  * canonicalize are going to always be consistent between calls. In addition, it
+  * would be impractical to make the hash entry sticky for an indefinitely long
+  * period (i.e. until the query is actually executed).
+  *
+  * It's still worth going to the trouble of hashing the query tree though,
+  * because that ensures that we can hash an arbitrarily long query.
+  */
+ static void
+ pgss_process_post_analysis_tree(Query* post_analysis_tree,
+ 		const char* sourceText, bool canonicalize)
+ {
+ 	BufferUsage bufusage;
+ 
+ 	post_analysis_tree->queryId = JumbleQuery(post_analysis_tree);
+ 
+ 	memset(&bufusage, 0, sizeof(bufusage));
+ 	pgss_store(sourceText, post_analysis_tree->queryId, 0, 0, &bufusage,
+ 			true, canonicalize);
+ 
+ 	/* Trim last_offsets */
+ 	if (last_offset_buf_size > 10)
+ 	{
+ 		last_offset_buf_size = 10;
+ 		last_offsets = repalloc(last_offsets,
+ 							last_offset_buf_size *
+ 							sizeof(pgssLocationLen));
+ 	}
+ }
+ 
+ /*
+  * Given a valid SQL string, and offsets whose lengths are uninitialized, fill
+  * in the corresponding lengths of those constants.
+  *
+  * The constant may use any available constant syntax, including but not limited
+  * to float literals, bit-strings, single quoted strings and dollar-quoted
+  * strings. This is accomplished by using the public API for the core scanner,
+  * with a workaround for quirks of their representation.
+  *
+  * It is the caller's job to ensure that the string is a valid SQL statement.
+  * Since in practice the string has already been validated, and the locations
+  * that the caller provides will have originated from within the authoritative
+  * parser, this should not be a problem. The caller must also ensure that
+  * constants are provided in pre-sorted order. Duplicates are expected, and have
+  * their lengths marked as '-1', so that they are later ignored.
+  *
+  * N.B. There is an assumption that a '-' character at a Const location begins a
+  * negative constant. This precludes there ever being another reason for a
+  * constant to start with a '-' for any other reason.
+  */
+ static void
+ fill_in_constant_lengths(const char* query, pgssLocationLen offs[],
+ 							Size n_offs)
+ {
+ 	core_yyscan_t  init_scan;
+ 	core_yy_extra_type ext_type;
+ 	core_YYSTYPE type;
+ 	YYLTYPE pos;
+ 	int i, last_loc = -1;
+ 
+ 	init_scan = scanner_init(query,
+ 							 &ext_type,
+ 							 ScanKeywords,
+ 							 NumScanKeywords);
+ 
+ 	for(i = 0; i < n_offs; i++)
+ 	{
+ 		int loc = offs[i].location;
+ 		Assert(loc > 0);
+ 
+ 		if (loc == last_loc)
+ 		{
+ 			/* Duplicate */
+ 			offs[i].length = -1;
+ 			continue;
+ 		}
+ 
+ 		for(;;)
+ 		{
+ 			int scanbuf_len;
+ #ifdef USE_ASSERT_CHECKING
+ 			int tok =
+ #endif
+ 						core_yylex(&type, &pos, init_scan);
+ 			scanbuf_len = strlen(ext_type.scanbuf);
+ 			Assert(tok != 0);
+ 
+ 			if (scanbuf_len > loc)
+ 			{
+ 				if (query[loc] == '-')
+ 				{
+ 					/*
+ 					 * It's a negative value - this is the one and only case
+ 					 * where we canonicalize more than a single token.
+ 					 *
+ 					 * Do not compensate for the core system's special-case
+ 					 * adjustment of location to that of the leading '-'
+ 					 * operator in the event of a negative constant. It is also
+ 					 * useful for our purposes to start from the minus symbol.
+ 					 * In this way, queries like "select * from foo where bar =
+ 					 * 1" and "select * from foo where bar = -2" will always
+ 					 * have identical canonicalized query strings.
+ 					 */
+ 					core_yylex(&type, &pos, init_scan);
+ 					scanbuf_len = strlen(ext_type.scanbuf);
+ 				}
+ 
+ 				/*
+ 				 * Scanner is now at end of const token of outer iteration -
+ 				 * work backwards to get constant length.
+ 				 */
+ 				offs[i].length = scanbuf_len - loc;
+ 				break;
+ 			}
+ 		}
+ 		last_loc = loc;
+ 	}
+ 	scanner_finish(init_scan);
+ }
+ 
+ /*
+  * JumbleQuery: Selectively serialize query tree, and return a hash representing
+  * that serialization - it's queryId.
+  *
+  * Note that this doesn't necessarily uniquely identify the query across
+  * different databases and encodings.
+  */
+ static uint32
+ JumbleQuery(Query *post_analysis_tree)
+ {
+ 	/* State for this run of PerformJumble */
+ 	Size i = 0;
+ 	last_offset_num = 0;
+ 	memset(last_jumble, 0, JUMBLE_SIZE);
+ 	last_jumble[i++] = MAG_HASH_BUF;
+ 	PerformJumble(post_analysis_tree, JUMBLE_SIZE, &i);
+ 	/* Reset rangetbl state */
+ 	list_free(pgss_rangetbl_stack);
+ 	pgss_rangetbl_stack = NIL;
+ 
+ 	/* Sort offsets as required by later query string canonicalization */
+ 	qsort(last_offsets, last_offset_num, sizeof(pgssLocationLen), comp_offset);
+ 	return hash_any((const unsigned char* ) last_jumble, i);
+ }
+ 
+ /*
+  * AppendJumb: Append a value that is substantive to a given query to jumble,
+  * while incrementing the iterator, i.
+  */
+ static void
+ AppendJumb(unsigned char* item, unsigned char jumble[], Size size, Size *i)
+ {
+ 	Assert(item != NULL);
+ 	Assert(jumble != NULL);
+ 	Assert(i != NULL);
+ 
+ 	/*
+ 	 * Copy the entire item to the buffer, or as much of it as possible to fill
+ 	 * the buffer to capacity.
+ 	 */
+ 	memcpy(jumble + *i, item, Min(*i > JUMBLE_SIZE? 0:JUMBLE_SIZE - *i, size));
+ 
+ 	/*
+ 	 * Continually hash the query tree's jumble.
+ 	 *
+ 	 * Was JUMBLE_SIZE exceeded? If so, hash the jumble and append that to the
+ 	 * start of the jumble buffer, and then continue to append the fraction of
+ 	 * "item" that we might not have been able to fit at the end of the buffer
+ 	 * in the last iteration. Since the value of i has been set to 0, there is
+ 	 * no need to memset the buffer in advance of this new iteration, but
+ 	 * effectively we are completely discarding the prior iteration's jumble
+ 	 * except for this representative hash value.
+ 	 */
+ 	if (*i > JUMBLE_SIZE)
+ 	{
+ 		uint32 start_hash = hash_any((const unsigned char* ) last_jumble, JUMBLE_SIZE);
+ 		int hash_l = sizeof(start_hash);
+ 		int part_left_l = Max(0, ((int) size - ((int) *i - JUMBLE_SIZE)));
+ 
+ 		Assert(part_left_l >= 0 && part_left_l <= size);
+ 
+ 		memcpy(jumble, &start_hash, hash_l);
+ 		memcpy(jumble + hash_l, item + (size - part_left_l), part_left_l);
+ 		*i = hash_l + part_left_l;
+ 	}
+ 	else
+ 	{
+ 		*i += size;
+ 	}
+ }
+ 
+ /*
+  * Wrapper around AppendJumb to encapsulate details of serialization
+  * of individual local variable elements.
+  */
+ #define APP_JUMB(item) \
+ AppendJumb((unsigned char*)&item, last_jumble, sizeof(item), i)
+ 
+ /*
+  * PerformJumble: Selectively serialize the query tree and canonicalize
+  * constants (i.e.  don't consider their actual value - just their type).
+  *
+  * The last_jumble buffer, which this function writes to, can be hashed to
+  * uniquely identify a query that may use different constants in successive
+  * calls.
+  */
+ static void
+ PerformJumble(const Query *tree, Size size, Size *i)
+ {
+ 	ListCell *l;
+ 	/* table join tree (FROM and WHERE clauses) */
+ 	FromExpr *jt = (FromExpr *) tree->jointree;
+ 	/* # of result tuples to skip (int8 expr) */
+ 	FuncExpr *off = (FuncExpr *) tree->limitOffset;
+ 	/* # of result tuples to skip (int8 expr) */
+ 	FuncExpr *limcount = (FuncExpr *) tree->limitCount;
+ 
+ 	if (pgss_rangetbl_stack &&
+ 			!IsA(pgss_rangetbl_stack, List))
+ 		pgss_rangetbl_stack = NIL;
+ 
+ 	if (tree->rtable != NIL)
+ 	{
+ 		pgss_rangetbl_stack = lappend(pgss_rangetbl_stack, tree->rtable);
+ 	}
+ 	else
+ 	{
+ 		/* Add dummy Range table entry to maintain stack */
+ 		RangeTblEntry *rte = makeNode(RangeTblEntry);
+ 		List *dummy = lappend(NIL, rte);
+ 		pgss_rangetbl_stack = lappend(pgss_rangetbl_stack, dummy);
+ 	}
+ 
+ 	APP_JUMB(tree->resultRelation);
+ 
+ 	if (tree->intoClause)
+ 	{
+ 		IntoClause *ic = tree->intoClause;
+ 		RangeVar   *rel = ic->rel;
+ 
+ 		APP_JUMB(ic->onCommit);
+ 		APP_JUMB(ic->skipData);
+ 		if (rel)
+ 		{
+ 			APP_JUMB(rel->relpersistence);
+ 			/* Bypass macro abstraction to supply size directly.
+ 			 *
+ 			 * Serialize schemaname, relname themselves - this makes us
+ 			 * somewhat consistent with the behavior of utility statements like "create
+ 			 * table", which seems appropriate.
+ 			 */
+ 			if (rel->schemaname)
+ 				AppendJumb((unsigned char *)rel->schemaname, last_jumble,
+ 								strlen(rel->schemaname), i);
+ 			if (rel->relname)
+ 				AppendJumb((unsigned char *)rel->relname, last_jumble,
+ 								strlen(rel->relname), i);
+ 		}
+ 	}
+ 
+ 	/* WITH list (of CommonTableExpr's) */
+ 	foreach(l, tree->cteList)
+ 	{
+ 		CommonTableExpr	*cte = (CommonTableExpr *) lfirst(l);
+ 		Query			*cteq = (Query*) cte->ctequery;
+ 		if (cteq)
+ 			PerformJumble(cteq, size, i);
+ 	}
+ 	if (jt)
+ 	{
+ 		if (jt->quals)
+ 		{
+ 			if (IsA(jt->quals, OpExpr))
+ 			{
+ 				QualsNode((OpExpr*) jt->quals, size, i, tree->rtable);
+ 			}
+ 			else
+ 			{
+ 				LeafNode((Node*) jt->quals, size, i, tree->rtable);
+ 			}
+ 		}
+ 		/* table join tree */
+ 		foreach(l, jt->fromlist)
+ 		{
+ 			Node* fr = lfirst(l);
+ 			if (IsA(fr, JoinExpr))
+ 			{
+ 				JoinExprNode((JoinExpr*) fr, size, i, tree->rtable);
+ 			}
+ 			else if (IsA(fr, RangeTblRef))
+ 			{
+ 				RangeTblRef   *rtf = (RangeTblRef *) fr;
+ 				RangeTblEntry *rte = rt_fetch(rtf->rtindex, tree->rtable);
+ 				APP_JUMB(rte->relid);
+ 				APP_JUMB(rte->rtekind);
+ 				/* Subselection in where clause */
+ 				if (rte->subquery)
+ 					PerformJumble(rte->subquery, size, i);
+ 
+ 				/* Function call in where clause */
+ 				if (rte->funcexpr)
+ 					LeafNode((Node*) rte->funcexpr, size, i, tree->rtable);
+ 			}
+ 			else
+ 			{
+ 				ereport(WARNING,
+ 						(errcode(ERRCODE_INTERNAL_ERROR),
+ 						 errmsg("unexpected, unrecognised fromlist node type: %d",
+ 							 (int) nodeTag(fr))));
+ 			}
+ 		}
+ 	}
+ 	/*
+ 	 * target list (of TargetEntry)
+ 	 * columns returned by query
+ 	 */
+ 	foreach(l, tree->targetList)
+ 	{
+ 		TargetEntry *tg = (TargetEntry *) lfirst(l);
+ 		Node        *e  = (Node*) tg->expr;
+ 		if (tg->ressortgroupref)
+ 			/* nonzero if referenced by a sort/group - for ORDER BY */
+ 			APP_JUMB(tg->ressortgroupref);
+ 		APP_JUMB(tg->resno); /* column number for select */
+ 		/*
+ 		 * Handle the various types of nodes in
+ 		 * the select list of this query
+ 		 */
+ 		LeafNode(e, size, i, tree->rtable);
+ 	}
+ 	/* return-values list (of TargetEntry) */
+ 	foreach(l, tree->returningList)
+ 	{
+ 		TargetEntry *rt = (TargetEntry *) lfirst(l);
+ 		Expr        *e  = (Expr*) rt->expr;
+ 		unsigned char magic = MAG_RETURN_LIST;
+ 		APP_JUMB(magic);
+ 		/*
+ 		 * Handle the various types of nodes in
+ 		 * the select list of this query
+ 		 */
+ 		LeafNode((Node*) e, size, i, tree->rtable);
+ 	}
+ 	/* a list of SortGroupClause's */
+ 	foreach(l, tree->groupClause)
+ 	{
+ 		SortGroupClause *gc = (SortGroupClause *) lfirst(l);
+ 		APP_JUMB(gc->tleSortGroupRef);
+ 		APP_JUMB(gc->nulls_first);
+ 	}
+ 
+ 	if (tree->havingQual)
+ 	{
+ 		if (IsA(tree->havingQual, OpExpr))
+ 		{
+ 			OpExpr *na = (OpExpr *) tree->havingQual;
+ 			QualsNode(na, size, i, tree->rtable);
+ 		}
+ 		else
+ 		{
+ 			Node *n = (Node*) tree->havingQual;
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 	}
+ 
+ 	foreach(l, tree->windowClause)
+ 	{
+ 		WindowClause *wc = (WindowClause *) lfirst(l);
+ 		ListCell     *il;
+ 		APP_JUMB(wc->frameOptions);
+ 		foreach(il, wc->partitionClause)	/* PARTITION BY list */
+ 		{
+ 			Node *n = (Node *) lfirst(il);
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 		foreach(il, wc->orderClause)		/* ORDER BY list */
+ 		{
+ 			Node *n = (Node *) lfirst(il);
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 	}
+ 
+ 	foreach(l, tree->distinctClause)
+ 	{
+ 		SortGroupClause *dc = (SortGroupClause *) lfirst(l);
+ 		APP_JUMB(dc->tleSortGroupRef);
+ 		APP_JUMB(dc->nulls_first);
+ 	}
+ 
+ 	/* Don't look at tree->sortClause,
+ 	 * because the value ressortgroupref is already
+ 	 * serialized when we iterate through targetList
+ 	 */
+ 
+ 	if (off)
+ 		LimitOffsetNode((Node*) off, size, i, tree->rtable);
+ 
+ 	if (limcount)
+ 		LimitOffsetNode((Node*) limcount, size, i, tree->rtable);
+ 
+ 	if (tree->setOperations)
+ 	{
+ 		/*
+ 		 * set-operation tree if this is top
+ 		 * level of a UNION/INTERSECT/EXCEPT query
+ 		 */
+ 		SetOperationStmt *topop = (SetOperationStmt *) tree->setOperations;
+ 		APP_JUMB(topop->op);
+ 		APP_JUMB(topop->all);
+ 
+ 		/* leaf selects are RTE subselections */
+ 		foreach(l, tree->rtable)
+ 		{
+ 			RangeTblEntry *r = (RangeTblEntry *) lfirst(l);
+ 			if (r->subquery)
+ 				PerformJumble(r->subquery, size, i);
+ 		}
+ 	}
+ 	pgss_rangetbl_stack = list_delete_ptr(pgss_rangetbl_stack,
+ 			list_nth(pgss_rangetbl_stack, pgss_rangetbl_stack->length - 1));
+ }
+ 
+ /*
+  * Perform selective serialization of "Quals" nodes when
+  * they're IsA(*, OpExpr)
+  */
+ static void
+ QualsNode(const OpExpr *node, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	APP_JUMB(node->xpr);
+ 	APP_JUMB(node->opno);
+ 	foreach(l, node->args)
+ 	{
+ 		Node *arg = (Node *) lfirst(l);
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * LeafNode: Selectively serialize a selection of parser/prim nodes that are
+  * frequently, though certainly not necesssarily leaf nodes, such as Vars
+  * (columns), constants and function calls
+  */
+ static void
+ LeafNode(const Node *arg, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	/* Use the node's NodeTag as a magic number */
+ 	APP_JUMB(arg->type);
+ 
+ 	if (IsA(arg, Const))
+ 	{
+ 		Const *c = (Const *) arg;
+ 
+ 		/*
+ 		 * Datatype of the constant is a differentiator
+ 		 */
+ 		APP_JUMB(c->consttype);
+ 		RecordConstLocation(c->location);
+ 	}
+ 	else if(IsA(arg, CoerceToDomain))
+ 	{
+ 		CoerceToDomain *cd = (CoerceToDomain*) arg;
+ 		/*
+ 		 * Datatype of the constant is a
+ 		 * differentiator
+ 		 */
+ 		APP_JUMB(cd->resulttype);
+ 		LeafNode((Node*) cd->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Var))
+ 	{
+ 		Var			  *v = (Var *) arg;
+ 		RangeTblEntry *rte;
+ 		ListCell *lc;
+ 
+ 		/*
+ 		 * We need to get the details of the rangetable, but rtable may not
+ 		 * refer to the relevant one if we're in a subselection.
+ 		 */
+ 		if (v->varlevelsup == 0)
+ 		{
+ 			rte = rt_fetch(v->varno, rtable);
+ 		}
+ 		else
+ 		{
+ 			List *rtable_upper = list_nth(pgss_rangetbl_stack,
+ 					(list_length(pgss_rangetbl_stack) - 1) - v->varlevelsup);
+ 			rte = rt_fetch(v->varno, rtable_upper);
+ 		}
+ 		APP_JUMB(rte->relid);
+ 
+ 		foreach(lc, rte->values_lists)
+ 		{
+ 			List	   *sublist = (List *) lfirst(lc);
+ 			ListCell   *lc2;
+ 
+ 			foreach(lc2, sublist)
+ 			{
+ 				Node	   *col = (Node *) lfirst(lc2);
+ 				LeafNode(col, size, i, rtable);
+ 			}
+ 		}
+ 		APP_JUMB(v->varattno);
+ 	}
+ 	else if (IsA(arg, CurrentOfExpr))
+ 	{
+ 		CurrentOfExpr *CoE = (CurrentOfExpr*) arg;
+ 		APP_JUMB(CoE->cvarno);
+ 		APP_JUMB(CoE->cursor_param);
+ 	}
+ 	else if (IsA(arg, CollateExpr))
+ 	{
+ 		CollateExpr *Ce = (CollateExpr*) arg;
+ 		APP_JUMB(Ce->collOid);
+ 	}
+ 	else if (IsA(arg, FieldSelect))
+ 	{
+ 		FieldSelect *Fs = (FieldSelect*) arg;
+ 		APP_JUMB(Fs->resulttype);
+ 		LeafNode((Node*) Fs->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, NamedArgExpr))
+ 	{
+ 		NamedArgExpr *Nae = (NamedArgExpr*) arg;
+ 		APP_JUMB(Nae->argnumber);
+ 		LeafNode((Node*) Nae->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Param))
+ 	{
+ 		Param *p = ((Param *) arg);
+ 		APP_JUMB(p->paramkind);
+ 		APP_JUMB(p->paramid);
+ 	}
+ 	else if (IsA(arg, RelabelType))
+ 	{
+ 		RelabelType *rt = (RelabelType*) arg;
+ 		APP_JUMB(rt->resulttype);
+ 		LeafNode((Node*) rt->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, WindowFunc))
+ 	{
+ 		WindowFunc *wf = (WindowFunc *) arg;
+ 		APP_JUMB(wf->winfnoid);
+ 		foreach(l, wf->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, FuncExpr))
+ 	{
+ 		FuncExpr *f = (FuncExpr *) arg;
+ 		APP_JUMB(f->funcid);
+ 		foreach(l, f->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, OpExpr) || IsA(arg, DistinctExpr))
+ 	{
+ 		QualsNode((OpExpr*) arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, CoerceViaIO))
+ 	{
+ 		CoerceViaIO *Cio = (CoerceViaIO*) arg;
+ 		APP_JUMB(Cio->coerceformat);
+ 		APP_JUMB(Cio->resulttype);
+ 		LeafNode((Node*) Cio->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Aggref))
+ 	{
+ 		Aggref *a =  (Aggref *) arg;
+ 		APP_JUMB(a->aggfnoid);
+ 		foreach(l, a->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SubLink))
+ 	{
+ 		SubLink *s = (SubLink*) arg;
+ 		APP_JUMB(s->subLinkType);
+ 		/* Serialize select-list subselect recursively */
+ 		if (s->subselect)
+ 			PerformJumble((Query*) s->subselect, size, i);
+ 
+ 		if (s->testexpr)
+ 			LeafNode((Node*) s->testexpr, size, i, rtable);
+ 		foreach(l, s->operName)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, TargetEntry))
+ 	{
+ 		TargetEntry *rt = (TargetEntry *) arg;
+ 		Node *e = (Node*) rt->expr;
+ 		APP_JUMB(rt->resorigtbl);
+ 		APP_JUMB(rt->ressortgroupref);
+ 		LeafNode(e, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, BoolExpr))
+ 	{
+ 		BoolExpr *be = (BoolExpr *) arg;
+ 		APP_JUMB(be->boolop);
+ 		foreach(l, be->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, NullTest))
+ 	{
+ 		NullTest *nt = (NullTest *) arg;
+ 		Node     *arg = (Node *) nt->arg;
+ 		APP_JUMB(nt->nulltesttype);		/* IS NULL, IS NOT NULL */
+ 		APP_JUMB(nt->argisrow);			/* is input a composite type ? */
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, ArrayExpr))
+ 	{
+ 		ArrayExpr *ae = (ArrayExpr *) arg;
+ 		APP_JUMB(ae->array_typeid);		/* type of expression result */
+ 		APP_JUMB(ae->element_typeid);	/* common type of array elements */
+ 		foreach(l, ae->elements)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CaseExpr))
+ 	{
+ 		CaseExpr *ce = (CaseExpr*) arg;
+ 		Assert(ce->casetype != InvalidOid);
+ 		APP_JUMB(ce->casetype);
+ 		foreach(l, ce->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		if (ce->arg)
+ 			LeafNode((Node*) ce->arg, size, i, rtable);
+ 
+ 		if (ce->defresult)
+ 		{
+ 			/* Default result (ELSE clause).
+ 			 *
+ 			 * May be NULL, because no else clause
+ 			 * was actually specified, and thus the value is
+ 			 * equivalent to SQL ELSE NULL
+ 			 */
+ 			LeafNode((Node*) ce->defresult, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CaseTestExpr))
+ 	{
+ 		CaseTestExpr *ct = (CaseTestExpr*) arg;
+ 		APP_JUMB(ct->typeId);
+ 	}
+ 	else if (IsA(arg, CaseWhen))
+ 	{
+ 		CaseWhen *cw = (CaseWhen*) arg;
+ 		Node     *res = (Node*) cw->result;
+ 		Node     *exp = (Node*) cw->expr;
+ 		if (res)
+ 			LeafNode(res, size, i, rtable);
+ 		if (exp)
+ 			LeafNode(exp, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, MinMaxExpr))
+ 	{
+ 		MinMaxExpr *cw = (MinMaxExpr*) arg;
+ 		APP_JUMB(cw->minmaxtype);
+ 		APP_JUMB(cw->op);
+ 		foreach(l, cw->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, ScalarArrayOpExpr))
+ 	{
+ 		ScalarArrayOpExpr *sa = (ScalarArrayOpExpr*) arg;
+ 		APP_JUMB(sa->opfuncid);
+ 		APP_JUMB(sa->useOr);
+ 		foreach(l, sa->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CoalesceExpr))
+ 	{
+ 		CoalesceExpr *ca = (CoalesceExpr*) arg;
+ 		foreach(l, ca->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, ArrayCoerceExpr))
+ 	{
+ 		ArrayCoerceExpr *ac = (ArrayCoerceExpr *) arg;
+ 		LeafNode((Node*) ac->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, WindowClause))
+ 	{
+ 		WindowClause *wc = (WindowClause*) arg;
+ 		foreach(l, wc->partitionClause)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, wc->orderClause)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SortGroupClause))
+ 	{
+ 		SortGroupClause *sgc = (SortGroupClause*) arg;
+ 		APP_JUMB(sgc->tleSortGroupRef);
+ 		APP_JUMB(sgc->nulls_first);
+ 	}
+ 	else if (IsA(arg, Integer) ||
+ 		  IsA(arg, Float) ||
+ 		  IsA(arg, String) ||
+ 		  IsA(arg, BitString) ||
+ 		  IsA(arg, Null)
+ 		)
+ 	{
+ 		/* It is not necessary to serialize Value nodes - they are seen when
+ 		 * aliases are used, which are ignored.
+ 		 */
+ 		return;
+ 	}
+ 	else if (IsA(arg, BooleanTest))
+ 	{
+ 		BooleanTest *bt = (BooleanTest *) arg;
+ 		APP_JUMB(bt->booltesttype);
+ 		LeafNode((Node*) bt->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, ArrayRef))
+ 	{
+ 		ArrayRef *ar = (ArrayRef*) arg;
+ 		APP_JUMB(ar->refarraytype);
+ 		foreach(l, ar->refupperindexpr)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, ar->reflowerindexpr)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		if (ar->refexpr)
+ 			LeafNode((Node*) ar->refexpr, size, i, rtable);
+ 		if (ar->refassgnexpr)
+ 			LeafNode((Node*) ar->refassgnexpr, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, NullIfExpr))
+ 	{
+ 		/* NullIfExpr is just a typedef for OpExpr */
+ 		QualsNode((OpExpr*) arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, RowExpr))
+ 	{
+ 		RowExpr *re = (RowExpr*) arg;
+ 		APP_JUMB(re->row_format);
+ 		foreach(l, re->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 
+ 	}
+ 	else if (IsA(arg, XmlExpr))
+ 	{
+ 		XmlExpr *xml = (XmlExpr*) arg;
+ 		APP_JUMB(xml->op);
+ 		foreach(l, xml->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		/* non-XML expressions for xml_attributes */
+ 		foreach(l, xml->named_args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		/* parallel list of Value strings */
+ 		foreach(l, xml->arg_names)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, RowCompareExpr))
+ 	{
+ 		RowCompareExpr *rc = (RowCompareExpr*) arg;
+ 		foreach(l, rc->largs)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, rc->rargs)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SetToDefault))
+ 	{
+ 		SetToDefault *sd = (SetToDefault*) arg;
+ 		APP_JUMB(sd->typeId);
+ 		APP_JUMB(sd->typeMod);
+ 	}
+ 	else if (IsA(arg, ConvertRowtypeExpr))
+ 	{
+ 		ConvertRowtypeExpr* Cr = (ConvertRowtypeExpr*) arg;
+ 		APP_JUMB(Cr->convertformat);
+ 		APP_JUMB(Cr->resulttype);
+ 		LeafNode((Node*) Cr->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, FieldStore))
+ 	{
+ 		FieldStore* Fs = (FieldStore*) arg;
+ 		LeafNode((Node*) Fs->arg, size, i, rtable);
+ 		foreach(l, Fs->newvals)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else
+ 	{
+ 		ereport(WARNING,
+ 				(errcode(ERRCODE_INTERNAL_ERROR),
+ 				 errmsg("unexpected, unrecognised LeafNode node type: %d",
+ 					 (int) nodeTag(arg))));
+ 	}
+ }
+ 
+ /*
+  * Perform selective serialization of limit or offset nodes
+  */
+ static void
+ LimitOffsetNode(const Node *node, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	unsigned char magic = MAG_LIMIT_OFFSET;
+ 	APP_JUMB(magic);
+ 
+ 	if (IsA(node, FuncExpr))
+ 	{
+ 
+ 		foreach(l, ((FuncExpr*) node)->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else
+ 	{
+ 		/* Fall back on leaf node representation */
+ 		LeafNode(node, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * JoinExprNode: Perform selective serialization of JoinExpr nodes
+  */
+ static void
+ JoinExprNode(JoinExpr *node, Size size, Size *i, List *rtable)
+ {
+ 	Node	 *larg = node->larg;	/* left subtree */
+ 	Node	 *rarg = node->rarg;	/* right subtree */
+ 	ListCell *l;
+ 
+ 	Assert( IsA(node, JoinExpr));
+ 
+ 	APP_JUMB(node->jointype);
+ 	APP_JUMB(node->isNatural);
+ 
+ 	if (node->quals)
+ 	{
+ 		if ( IsA(node, OpExpr))
+ 		{
+ 			QualsNode((OpExpr*) node->quals, size, i, rtable);
+ 		}
+ 		else
+ 		{
+ 			LeafNode((Node*) node->quals, size, i, rtable);
+ 		}
+ 	}
+ 	foreach(l, node->usingClause) /* USING clause, if any (list of String) */
+ 	{
+ 		Node *arg = (Node *) lfirst(l);
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ 	if (larg)
+ 		JoinExprNodeChild(larg, size, i, rtable);
+ 	if (rarg)
+ 		JoinExprNodeChild(rarg, size, i, rtable);
+ }
+ 
+ /*
+  * JoinExprNodeChild: Serialize children of the JoinExpr node
+  */
+ static void
+ JoinExprNodeChild(const Node *node, Size size, Size *i, List *rtable)
+ {
+ 	if (IsA(node, RangeTblRef))
+ 	{
+ 		RangeTblRef   *rt = (RangeTblRef*) node;
+ 		RangeTblEntry *rte = rt_fetch(rt->rtindex, rtable);
+ 		ListCell      *l;
+ 
+ 		APP_JUMB(rte->relid);
+ 		APP_JUMB(rte->jointype);
+ 
+ 		if (rte->subquery)
+ 			PerformJumble((Query*) rte->subquery, size, i);
+ 
+ 		foreach(l, rte->joinaliasvars)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(node, JoinExpr))
+ 	{
+ 		JoinExprNode((JoinExpr*) node, size, i, rtable);
+ 	}
+ 	else
+ 	{
+ 		LeafNode(node, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * Record location of constant within query string of query tree that is
+  * currently being walked.
+  */
+ static void
+ RecordConstLocation(int location)
+ {
+ 	/* -1 indicates unknown or undefined location */
+ 	if (location > 0)
+ 	{
+ 		if (last_offset_num >= last_offset_buf_size)
+ 		{
+ 			last_offset_buf_size *= 2;
+ 			last_offsets = repalloc(last_offsets,
+ 							last_offset_buf_size *
+ 							sizeof(pgssLocationLen));
+ 
+ 		}
+ 		last_offsets[last_offset_num++].location = location;
+ 	}
+ }
+ 
+ /*
   * ExecutorStart hook: start up tracking if needed
   */
  static void
*************** pgss_ExecutorEnd(QueryDesc *queryDesc)
*** 587,592 ****
--- 1744,1754 ----
  {
  	if (queryDesc->totaltime && pgss_enabled())
  	{
+ 		uint32 queryId;
+ 		if (pgss_string_key)
+ 			queryId = pgss_hash_string(queryDesc->sourceText);
+ 		else
+ 			queryId = queryDesc->plannedstmt->queryId;
  		/*
  		 * Make sure stats accumulation is done.  (Note: it's okay if several
  		 * levels of hook all do this.)
*************** pgss_ExecutorEnd(QueryDesc *queryDesc)
*** 594,602 ****
  		InstrEndLoop(queryDesc->totaltime);
  
  		pgss_store(queryDesc->sourceText,
! 				   queryDesc->totaltime->total,
! 				   queryDesc->estate->es_processed,
! 				   &queryDesc->totaltime->bufusage);
  	}
  
  	if (prev_ExecutorEnd)
--- 1756,1768 ----
  		InstrEndLoop(queryDesc->totaltime);
  
  		pgss_store(queryDesc->sourceText,
! 		   queryId,
! 		   queryDesc->totaltime->total,
! 		   queryDesc->estate->es_processed,
! 		   &queryDesc->totaltime->bufusage,
! 		   false,
! 		   false);
! 
  	}
  
  	if (prev_ExecutorEnd)
*************** pgss_ProcessUtility(Node *parsetree, con
*** 618,623 ****
--- 1784,1790 ----
  		instr_time	start;
  		instr_time	duration;
  		uint64		rows = 0;
+ 		uint32		queryId;
  		BufferUsage bufusage;
  
  		bufusage = pgBufferUsage;
*************** pgss_ProcessUtility(Node *parsetree, con
*** 671,678 ****
  		bufusage.temp_blks_written =
  			pgBufferUsage.temp_blks_written - bufusage.temp_blks_written;
  
! 		pgss_store(queryString, INSTR_TIME_GET_DOUBLE(duration), rows,
! 				   &bufusage);
  	}
  	else
  	{
--- 1838,1848 ----
  		bufusage.temp_blks_written =
  			pgBufferUsage.temp_blks_written - bufusage.temp_blks_written;
  
! 		queryId = pgss_hash_string(queryString);
! 
! 		/* In the case of utility statements, hash the query string directly */
! 		pgss_store(queryString, queryId,
! 				INSTR_TIME_GET_DOUBLE(duration), rows, &bufusage, false, false);
  	}
  	else
  	{
*************** pgss_hash_fn(const void *key, Size keysi
*** 696,703 ****
  	/* we don't bother to include encoding in the hash */
  	return hash_uint32((uint32) k->userid) ^
  		hash_uint32((uint32) k->dbid) ^
! 		DatumGetUInt32(hash_any((const unsigned char *) k->query_ptr,
! 								k->query_len));
  }
  
  /*
--- 1866,1873 ----
  	/* we don't bother to include encoding in the hash */
  	return hash_uint32((uint32) k->userid) ^
  		hash_uint32((uint32) k->dbid) ^
! 		DatumGetUInt32(hash_any((const unsigned char* ) &k->queryid,
! 					sizeof(k->queryid)) );
  }
  
  /*
*************** pgss_match_fn(const void *key1, const vo
*** 712,733 ****
  	if (k1->userid == k2->userid &&
  		k1->dbid == k2->dbid &&
  		k1->encoding == k2->encoding &&
! 		k1->query_len == k2->query_len &&
! 		memcmp(k1->query_ptr, k2->query_ptr, k1->query_len) == 0)
  		return 0;
  	else
  		return 1;
  }
  
  /*
   * Store some statistics for a statement.
   */
  static void
! pgss_store(const char *query, double total_time, uint64 rows,
! 		   const BufferUsage *bufusage)
  {
  	pgssHashKey key;
  	double		usage;
  	pgssEntry  *entry;
  
  	Assert(query != NULL);
--- 1882,1927 ----
  	if (k1->userid == k2->userid &&
  		k1->dbid == k2->dbid &&
  		k1->encoding == k2->encoding &&
! 		k1->queryid == k2->queryid)
  		return 0;
  	else
  		return 1;
  }
  
  /*
+  * Given an arbitrarily long query string, produce a hash for the purposes of
+  * identifying the query, without canonicalizing constants. Used when hashing
+  * utility statements, or for legacy compatibility mode.
+  */
+ static uint32
+ pgss_hash_string(const char* str)
+ {
+ 	/* For additional protection against collisions, including magic value */
+ 	char magic = MAG_STR_BUF;
+ 	uint32 result;
+ 	Size size = sizeof(magic) + strlen(str);
+ 	unsigned char* p = palloc(size);
+ 	memcpy(p, &magic, sizeof(magic));
+ 	memcpy(p + sizeof(magic), str, strlen(str));
+ 	result = hash_any((const unsigned char *) p, size);
+ 	pfree(p);
+ 	return result;
+ }
+ 
+ /*
   * Store some statistics for a statement.
   */
  static void
! pgss_store(const char *query, uint32 queryId,
! 				double total_time, uint64 rows,
! 				const BufferUsage *bufusage,
! 				bool empty_entry,
! 				bool canonicalize)
  {
  	pgssHashKey key;
  	double		usage;
+ 	int		    new_query_len = strlen(query);
+ 	char	   *norm_query = NULL;
  	pgssEntry  *entry;
  
  	Assert(query != NULL);
*************** pgss_store(const char *query, double tot
*** 740,773 ****
  	key.userid = GetUserId();
  	key.dbid = MyDatabaseId;
  	key.encoding = GetDatabaseEncoding();
! 	key.query_len = strlen(query);
! 	if (key.query_len >= pgss->query_size)
! 		key.query_len = pg_encoding_mbcliplen(key.encoding,
  											  query,
! 											  key.query_len,
  											  pgss->query_size - 1);
- 	key.query_ptr = query;
  
! 	usage = USAGE_EXEC(duration);
  
  	/* Lookup the hash table entry with shared lock. */
  	LWLockAcquire(pgss->lock, LW_SHARED);
  
- 	entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
  	if (!entry)
  	{
! 		/* Must acquire exclusive lock to add a new entry. */
! 		LWLockRelease(pgss->lock);
! 		LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 		entry = entry_alloc(&key);
  	}
  
! 	/* Grab the spinlock while updating the counters. */
  	{
  		volatile pgssEntry *e = (volatile pgssEntry *) entry;
  
  		SpinLockAcquire(&e->mutex);
! 		e->counters.calls += 1;
  		e->counters.total_time += total_time;
  		e->counters.rows += rows;
  		e->counters.shared_blks_hit += bufusage->shared_blks_hit;
--- 1934,2085 ----
  	key.userid = GetUserId();
  	key.dbid = MyDatabaseId;
  	key.encoding = GetDatabaseEncoding();
! 	key.queryid = queryId;
! 
! 	if (new_query_len >= pgss->query_size)
! 		/* We don't have to worry about this later, because canonicalization
! 		 * cannot possibly result in a longer query string
! 		 */
! 		new_query_len = pg_encoding_mbcliplen(key.encoding,
  											  query,
! 											  new_query_len,
  											  pgss->query_size - 1);
  
! 	entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
! 
! 	/*
! 	 * When just initializing an entry and putting counters at zero, make it
! 	 * artificially sticky so that it will probably still be there when
! 	 * executed. Strictly speaking, query strings are canonicalized on a
! 	 * best effort basis, though it would be difficult to demonstrate this even
! 	 * under artificial conditions.
! 	 */
! 	if (empty_entry && !entry)
! 		usage = USAGE_NON_EXEC_STICK;
! 	else
! 		usage = USAGE_EXEC(duration);
  
  	/* Lookup the hash table entry with shared lock. */
  	LWLockAcquire(pgss->lock, LW_SHARED);
  
  	if (!entry)
  	{
! 		/*
! 		 * Generate a normalized version of the query string that will be used
! 		 * to represent entry.
! 		 *
! 		 * Note that the representation seen by the user will only have
! 		 * non-differentiating Const tokens swapped with '?' characters, and
! 		 * this does not for example take account of the fact that alias names
! 		 * could vary between successive calls of what is regarded as the same
! 		 * query, or that whitespace could vary.
! 		 */
! 		if (last_offset_num > 0 && canonicalize)
! 		{
! 			int i,
! 			  off = 0,				/* Offset from start for cur tok */
! 			  tok_len = 0,			/* length (in bytes) of that tok */
! 			  quer_it = 0,			/* Original query byte iterator */
! 			  n_quer_it = 0,		/* Normalized query byte iterator */
! 			  len_to_wrt = 0,		/* Length (in bytes) to write */
! 			  last_off = 0,			/* Offset from start for last iter's tok */
! 			  last_tok_len = 0,		/* length (in bytes) of that tok */
! 			  tok_len_delta = 0;	/* Finished str is n bytes shorter so far */
! 
! 			/* Fill-in constant lengths - core system only gives us locations */
! 			fill_in_constant_lengths(query, last_offsets, last_offset_num);
! 
! 			norm_query = palloc0(new_query_len + 1);
! 
! 			for(i = 0; i < last_offset_num; i++)
! 			{
! 				if(last_offsets[i].length == -1)
! 					continue; /* don't assume that there's no duplicates */
! 
! 				off = last_offsets[i].location;
! 				tok_len = last_offsets[i].length;
! 				len_to_wrt = off - last_off;
! 				len_to_wrt -= last_tok_len;
! 				/* -1 for the '?' char: */
! 				tok_len_delta += tok_len - 1;
! 
! 				Assert(tok_len > 0);
! 				Assert(len_to_wrt >= 0);
! 				/*
! 				 * Each iteration copies everything prior to the current
! 				 * offset/token to be replaced, except bytes copied in
! 				 * previous iterations
! 				 */
! 				if (off - tok_len_delta + tok_len > new_query_len)
! 				{
! 					if (off - tok_len_delta < new_query_len)
! 					{
! 						len_to_wrt = new_query_len - n_quer_it;
! 						/* Out of space entirely - copy as much as possible */
! 						memcpy(norm_query + n_quer_it, query + quer_it,
! 								len_to_wrt);
! 						n_quer_it += len_to_wrt;
! 						quer_it += len_to_wrt + tok_len;
! 					}
! 					break;
! 				}
! 				memcpy(norm_query + n_quer_it, query + quer_it, len_to_wrt);
! 
! 				n_quer_it += len_to_wrt;
! 				if (n_quer_it < new_query_len)
! 					norm_query[n_quer_it++] = '?';
! 				quer_it += len_to_wrt + tok_len;
! 				last_off = off;
! 				last_tok_len = tok_len;
! 			}
! 			/*
! 			 * We've copied up until the last canonicalized constant
! 			 * (inclusive), or have run out of space entirely. Either fill
! 			 * norm_query to capacity, or copy over all remaining bytes from
! 			 * query, or copy nothing.
! 			 */
! 			memcpy(norm_query + n_quer_it, query + quer_it,
! 					new_query_len - n_quer_it);
! 
! 			/*
! 			 * Must acquire exclusive lock to add a new entry.
! 			 * Leave that until as late as possible.
! 			 */
! 			LWLockRelease(pgss->lock);
! 			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 
! 			entry = entry_alloc(&key, norm_query, new_query_len);
! 		}
! 		else
! 		{
! 			/* Acquire exclusive lock as required by entry_alloc() */
! 			LWLockRelease(pgss->lock);
! 			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 
! 			entry = entry_alloc(&key, query, new_query_len);
! 		}
  	}
  
! 	/*
! 	 * Grab the spinlock while updating the counters, if we're not just here to
! 	 * canonicalize.
! 	 */
  	{
  		volatile pgssEntry *e = (volatile pgssEntry *) entry;
  
  		SpinLockAcquire(&e->mutex);
! 		if (!empty_entry)
! 		{
! 			/*
! 			 * If necessary, "unstick" previously stuck query entry that just
! 			 * held a normalized query string, and then increment calls.
! 			 */
! 			if (e->counters.calls == 0)
! 				e->counters.usage = USAGE_INIT;
! 
! 			e->counters.calls += 1;
! 		}
! 
  		e->counters.total_time += total_time;
  		e->counters.rows += rows;
  		e->counters.shared_blks_hit += bufusage->shared_blks_hit;
*************** pgss_store(const char *query, double tot
*** 783,790 ****
  		e->counters.usage += usage;
  		SpinLockRelease(&e->mutex);
  	}
- 
  	LWLockRelease(pgss->lock);
  }
  
  /*
--- 2095,2103 ----
  		e->counters.usage += usage;
  		SpinLockRelease(&e->mutex);
  	}
  	LWLockRelease(pgss->lock);
+ 	if (norm_query)
+ 		pfree(norm_query);
  }
  
  /*
*************** pg_stat_statements(PG_FUNCTION_ARGS)
*** 875,881 ****
  
  			qstr = (char *)
  				pg_do_encoding_conversion((unsigned char *) entry->query,
! 										  entry->key.query_len,
  										  entry->key.encoding,
  										  GetDatabaseEncoding());
  			values[i++] = CStringGetTextDatum(qstr);
--- 2188,2194 ----
  
  			qstr = (char *)
  				pg_do_encoding_conversion((unsigned char *) entry->query,
! 										  entry->query_len,
  										  entry->key.encoding,
  										  GetDatabaseEncoding());
  			values[i++] = CStringGetTextDatum(qstr);
*************** pg_stat_statements(PG_FUNCTION_ARGS)
*** 893,898 ****
--- 2206,2214 ----
  			tmp = e->counters;
  			SpinLockRelease(&e->mutex);
  		}
+ 		/* Skip record of unexecuted query */
+ 		if (tmp.calls == 0)
+ 			continue;
  
  		values[i++] = Int64GetDatumFast(tmp.calls);
  		values[i++] = Float8GetDatumFast(tmp.total_time);
*************** pgss_memsize(void)
*** 950,963 ****
   * have made the entry while we waited to get exclusive lock.
   */
  static pgssEntry *
! entry_alloc(pgssHashKey *key)
  {
  	pgssEntry  *entry;
  	bool		found;
  
- 	/* Caller must have clipped query properly */
- 	Assert(key->query_len < pgss->query_size);
- 
  	/* Make space if needed */
  	while (hash_get_num_entries(pgss_hash) >= pgss_max)
  		entry_dealloc();
--- 2266,2276 ----
   * have made the entry while we waited to get exclusive lock.
   */
  static pgssEntry *
! entry_alloc(pgssHashKey *key, const char* query, int new_query_len)
  {
  	pgssEntry  *entry;
  	bool		found;
  
  	/* Make space if needed */
  	while (hash_get_num_entries(pgss_hash) >= pgss_max)
  		entry_dealloc();
*************** entry_alloc(pgssHashKey *key)
*** 969,985 ****
  	{
  		/* New entry, initialize it */
  
! 		/* dynahash tried to copy the key for us, but must fix query_ptr */
! 		entry->key.query_ptr = entry->query;
  		/* reset the statistics */
  		memset(&entry->counters, 0, sizeof(Counters));
  		entry->counters.usage = USAGE_INIT;
  		/* re-initialize the mutex each time ... we assume no one using it */
  		SpinLockInit(&entry->mutex);
  		/* ... and don't forget the query text */
! 		memcpy(entry->query, key->query_ptr, key->query_len);
! 		entry->query[key->query_len] = '\0';
  	}
  
  	return entry;
  }
--- 2282,2301 ----
  	{
  		/* New entry, initialize it */
  
! 		entry->query_len = new_query_len;
! 		Assert(entry->query_len > 0);
  		/* reset the statistics */
  		memset(&entry->counters, 0, sizeof(Counters));
  		entry->counters.usage = USAGE_INIT;
  		/* re-initialize the mutex each time ... we assume no one using it */
  		SpinLockInit(&entry->mutex);
  		/* ... and don't forget the query text */
! 		memcpy(entry->query, query, entry->query_len);
! 		Assert(new_query_len <= pgss->query_size);
! 		entry->query[entry->query_len] = '\0';
  	}
+ 	/* Caller must have clipped query properly */
+ 	Assert(entry->query_len < pgss->query_size);
  
  	return entry;
  }
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index cc3168d..84483ce
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copyPlannedStmt(const PlannedStmt *from
*** 92,97 ****
--- 92,98 ----
  	COPY_NODE_FIELD(relationOids);
  	COPY_NODE_FIELD(invalItems);
  	COPY_SCALAR_FIELD(nParamExec);
+ 	COPY_SCALAR_FIELD(queryId);
  
  	return newnode;
  }
*************** _copyQuery(const Query *from)
*** 2415,2420 ****
--- 2416,2422 ----
  
  	COPY_SCALAR_FIELD(commandType);
  	COPY_SCALAR_FIELD(querySource);
+ 	COPY_SCALAR_FIELD(queryId);
  	COPY_SCALAR_FIELD(canSetTag);
  	COPY_NODE_FIELD(utilityStmt);
  	COPY_SCALAR_FIELD(resultRelation);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
new file mode 100644
index 2295195..ce75da3
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 83,88 ****
--- 83,91 ----
  #define COMPARE_LOCATION_FIELD(fldname) \
  	((void) 0)
  
+ /* Compare a query_id field (this is a no-op, per note above) */
+ #define COMPARE_QUERYID_FIELD(fldname) \
+ 	((void) 0)
  
  /*
   *	Stuff from primnodes.h
*************** _equalQuery(const Query *a, const Query
*** 897,902 ****
--- 900,906 ----
  {
  	COMPARE_SCALAR_FIELD(commandType);
  	COMPARE_SCALAR_FIELD(querySource);
+ 	COMPARE_QUERYID_FIELD(query_id);
  	COMPARE_SCALAR_FIELD(canSetTag);
  	COMPARE_NODE_FIELD(utilityStmt);
  	COMPARE_SCALAR_FIELD(resultRelation);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
new file mode 100644
index 829f6d4..9646125
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 81,86 ****
--- 81,90 ----
  #define WRITE_LOCATION_FIELD(fldname) \
  	appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname)
  
+ /* Write a query id field */
+ #define WRITE_QUERYID_FIELD(fldname) \
+ 	((void) 0)
+ 
  /* Write a Node field */
  #define WRITE_NODE_FIELD(fldname) \
  	(appendStringInfo(str, " :" CppAsString(fldname) " "), \
*************** _outPlannedStmt(StringInfo str, const Pl
*** 255,260 ****
--- 259,265 ----
  	WRITE_NODE_FIELD(relationOids);
  	WRITE_NODE_FIELD(invalItems);
  	WRITE_INT_FIELD(nParamExec);
+ 	WRITE_QUERYID_FIELD(queryId);
  }
  
  /*
*************** _outQuery(StringInfo str, const Query *n
*** 2159,2164 ****
--- 2164,2170 ----
  
  	WRITE_ENUM_FIELD(commandType, CmdType);
  	WRITE_ENUM_FIELD(querySource, QuerySource);
+ 	WRITE_QUERYID_FIELD(query_id);
  	WRITE_BOOL_FIELD(canSetTag);
  
  	/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
new file mode 100644
index b9258ad..5ea0d52
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
***************
*** 110,115 ****
--- 110,119 ----
  	token = pg_strtok(&length);		/* get field value */ \
  	local_node->fldname = -1	/* set field to "unknown" */
  
+ /* Read a QueryId field - NO-OP */
+ #define READ_QUERYID_FIELD(fldname) \
+ 	((void) 0)
+ 
  /* Read a Node field */
  #define READ_NODE_FIELD(fldname) \
  	token = pg_strtok(&length);		/* skip :fldname */ \
*************** _readQuery(void)
*** 195,200 ****
--- 199,205 ----
  
  	READ_ENUM_FIELD(commandType, CmdType);
  	READ_ENUM_FIELD(querySource, QuerySource);
+ 	READ_QUERYID_FIELD(query_id);
  	READ_BOOL_FIELD(canSetTag);
  	READ_NODE_FIELD(utilityStmt);
  	READ_INT_FIELD(resultRelation);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 8bbe977..1b4030f
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** standard_planner(Query *parse, int curso
*** 240,245 ****
--- 240,246 ----
  	result->relationOids = glob->relationOids;
  	result->invalItems = glob->invalItems;
  	result->nParamExec = list_length(glob->paramlist);
+ 	result->queryId = parse->queryId;
  
  	return result;
  }
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
new file mode 100644
index b187b03..92a7dec
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
*************** static Query *transformExplainStmt(Parse
*** 65,73 ****
  static void transformLockingClause(ParseState *pstate, Query *qry,
  					   LockingClause *lc, bool pushedDown);
  
  
  /*
!  * parse_analyze
   *		Analyze a raw parse tree and transform it to Query form.
   *
   * Optionally, information about $n parameter types can be supplied.
--- 65,89 ----
  static void transformLockingClause(ParseState *pstate, Query *qry,
  					   LockingClause *lc, bool pushedDown);
  
+ /* Hooks for plugins to get control of parse analysis */
+ parse_analyze_hook_type				parse_analyze_hook = NULL;
+ parse_analyze_varparams_hook_type	parse_analyze_varparams_hook = NULL;
+ 
+ 
+ Query *
+ parse_analyze(Node *parseTree, const char *sourceText,
+ 			  Oid *paramTypes, int numParams)
+ {
+ 	if (parse_analyze_hook)
+ 		return (*parse_analyze_hook) (parseTree, sourceText,
+ 			  paramTypes, numParams);
+ 	else
+ 		return standard_parse_analyze(parseTree, sourceText,
+ 			  paramTypes, numParams);
+ }
  
  /*
!  * standard_parse_analyze
   *		Analyze a raw parse tree and transform it to Query form.
   *
   * Optionally, information about $n parameter types can be supplied.
*************** static void transformLockingClause(Parse
*** 78,84 ****
   * a dummy CMD_UTILITY Query node.
   */
  Query *
! parse_analyze(Node *parseTree, const char *sourceText,
  			  Oid *paramTypes, int numParams)
  {
  	ParseState *pstate = make_parsestate(NULL);
--- 94,100 ----
   * a dummy CMD_UTILITY Query node.
   */
  Query *
! standard_parse_analyze(Node *parseTree, const char *sourceText,
  			  Oid *paramTypes, int numParams)
  {
  	ParseState *pstate = make_parsestate(NULL);
*************** parse_analyze(Node *parseTree, const cha
*** 98,112 ****
  	return query;
  }
  
  /*
!  * parse_analyze_varparams
   *
   * This variant is used when it's okay to deduce information about $n
   * symbol datatypes from context.  The passed-in paramTypes[] array can
   * be modified or enlarged (via repalloc).
   */
  Query *
! parse_analyze_varparams(Node *parseTree, const char *sourceText,
  						Oid **paramTypes, int *numParams)
  {
  	ParseState *pstate = make_parsestate(NULL);
--- 114,140 ----
  	return query;
  }
  
+ Query *
+ parse_analyze_varparams(Node *parseTree, const char *sourceText,
+ 						Oid **paramTypes, int *numParams)
+ {
+ 	if (parse_analyze_varparams_hook)
+ 		return (*parse_analyze_varparams_hook) (parseTree, sourceText,
+ 						paramTypes, numParams);
+ 	else
+ 		return standard_parse_analyze_varparams(parseTree, sourceText,
+ 			  paramTypes, numParams);
+ }
+ 
  /*
!  * standard_parse_analyze_varparams
   *
   * This variant is used when it's okay to deduce information about $n
   * symbol datatypes from context.  The passed-in paramTypes[] array can
   * be modified or enlarged (via repalloc).
   */
  Query *
! standard_parse_analyze_varparams(Node *parseTree, const char *sourceText,
  						Oid **paramTypes, int *numParams)
  {
  	ParseState *pstate = make_parsestate(NULL);
*************** transformSelectStmt(ParseState *pstate,
*** 877,882 ****
--- 905,911 ----
  	ListCell   *l;
  
  	qry->commandType = CMD_SELECT;
+ 	qry->queryId = 0;
  
  	/* process the WITH clause independently of all else */
  	if (stmt->withClause)
diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
new file mode 100644
index 6661a3d..841d2b2
*** a/src/backend/parser/parse_coerce.c
--- b/src/backend/parser/parse_coerce.c
*************** coerce_type(ParseState *pstate, Node *no
*** 280,293 ****
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
! 		/* Use the leftmost of the constant's and coercion's locations */
! 		if (location < 0)
! 			newcon->location = con->location;
! 		else if (con->location >= 0 && con->location < location)
! 			newcon->location = con->location;
! 		else
! 			newcon->location = location;
! 
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
--- 280,286 ----
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
! 		newcon->location = con->location;
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
*************** coerce_type(ParseState *pstate, Node *no
*** 333,340 ****
  		result = (*pstate->p_coerce_param_hook) (pstate,
  												 (Param *) node,
  												 targetTypeId,
! 												 targetTypeMod,
! 												 location);
  		if (result)
  			return result;
  	}
--- 326,332 ----
  		result = (*pstate->p_coerce_param_hook) (pstate,
  												 (Param *) node,
  												 targetTypeId,
! 												 targetTypeMod);
  		if (result)
  			return result;
  	}
diff --git a/src/backend/parser/parse_param.c b/src/backend/parser/parse_param.c
new file mode 100644
index cfe7262..75214ed
*** a/src/backend/parser/parse_param.c
--- b/src/backend/parser/parse_param.c
*************** typedef struct VarParamState
*** 54,61 ****
  static Node *fixed_paramref_hook(ParseState *pstate, ParamRef *pref);
  static Node *variable_paramref_hook(ParseState *pstate, ParamRef *pref);
  static Node *variable_coerce_param_hook(ParseState *pstate, Param *param,
! 						   Oid targetTypeId, int32 targetTypeMod,
! 						   int location);
  static bool check_parameter_resolution_walker(Node *node, ParseState *pstate);
  
  
--- 54,60 ----
  static Node *fixed_paramref_hook(ParseState *pstate, ParamRef *pref);
  static Node *variable_paramref_hook(ParseState *pstate, ParamRef *pref);
  static Node *variable_coerce_param_hook(ParseState *pstate, Param *param,
! 						   Oid targetTypeId, int32 targetTypeMod);
  static bool check_parameter_resolution_walker(Node *node, ParseState *pstate);
  
  
*************** variable_paramref_hook(ParseState *pstat
*** 178,185 ****
   */
  static Node *
  variable_coerce_param_hook(ParseState *pstate, Param *param,
! 						   Oid targetTypeId, int32 targetTypeMod,
! 						   int location)
  {
  	if (param->paramkind == PARAM_EXTERN && param->paramtype == UNKNOWNOID)
  	{
--- 177,183 ----
   */
  static Node *
  variable_coerce_param_hook(ParseState *pstate, Param *param,
! 						   Oid targetTypeId, int32 targetTypeMod)
  {
  	if (param->paramkind == PARAM_EXTERN && param->paramtype == UNKNOWNOID)
  	{
*************** variable_coerce_param_hook(ParseState *p
*** 238,248 ****
  		 */
  		param->paramcollid = get_typcollation(param->paramtype);
  
- 		/* Use the leftmost of the param's and coercion's locations */
- 		if (location >= 0 &&
- 			(param->location < 0 || location < param->location))
- 			param->location = location;
- 
  		return (Node *) param;
  	}
  
--- 236,241 ----
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
new file mode 100644
index 1d33ceb..9fb3c0f
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
*************** typedef struct Query
*** 103,108 ****
--- 103,111 ----
  
  	QuerySource querySource;	/* where did I come from? */
  
+ 	uint32		queryId;		/* query identifier that can be set by plugins.
+ 								 * Will be copied to resulting PlannedStmt. */
+ 
  	bool		canSetTag;		/* do I set the command result tag? */
  
  	Node	   *utilityStmt;	/* non-null if this is DECLARE CURSOR or a
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 7d90b91..3cec1be
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef struct PlannedStmt
*** 67,72 ****
--- 67,74 ----
  	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
  
  	int			nParamExec;		/* number of PARAM_EXEC Params used */
+ 
+ 	uint32		queryId;		/* query identifier carried from query tree */
  } PlannedStmt;
  
  /* macro for fetching the Plan associated with a SubPlan node */
diff --git a/src/include/parser/analyze.h b/src/include/parser/analyze.h
new file mode 100644
index b8987db..2bad10f
*** a/src/include/parser/analyze.h
--- b/src/include/parser/analyze.h
***************
*** 16,26 ****
--- 16,38 ----
  
  #include "parser/parse_node.h"
  
+ /* Hook for plugins to get control in parse_analyze() */
+ typedef Query* (*parse_analyze_hook_type) (Node *parseTree, const char *sourceText,
+ 			  Oid *paramTypes, int numParams);
+ extern PGDLLIMPORT parse_analyze_hook_type parse_analyze_hook;
+ /* Hook for plugins to get control in parse_analyze_varparams() */
+ typedef Query* (*parse_analyze_varparams_hook_type) (Node *parseTree, const char *sourceText,
+ 						Oid **paramTypes, int *numParams);
+ extern PGDLLIMPORT parse_analyze_varparams_hook_type parse_analyze_varparams_hook;
  
  extern Query *parse_analyze(Node *parseTree, const char *sourceText,
  			  Oid *paramTypes, int numParams);
+ extern Query *standard_parse_analyze(Node *parseTree, const char *sourceText,
+ 			  Oid *paramTypes, int numParams);
  extern Query *parse_analyze_varparams(Node *parseTree, const char *sourceText,
  						Oid **paramTypes, int *numParams);
+ extern Query *standard_parse_analyze_varparams(Node *parseTree, const char *sourceText,
+ 						Oid **paramTypes, int *numParams);
  
  extern Query *parse_sub_analyze(Node *parseTree, ParseState *parentParseState,
  				  CommonTableExpr *parentCTE,
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
new file mode 100644
index 670e084..a484ae8
*** a/src/include/parser/parse_node.h
--- b/src/include/parser/parse_node.h
*************** typedef Node *(*PreParseColumnRefHook) (
*** 27,34 ****
  typedef Node *(*PostParseColumnRefHook) (ParseState *pstate, ColumnRef *cref, Node *var);
  typedef Node *(*ParseParamRefHook) (ParseState *pstate, ParamRef *pref);
  typedef Node *(*CoerceParamHook) (ParseState *pstate, Param *param,
! 									   Oid targetTypeId, int32 targetTypeMod,
! 											  int location);
  
  
  /*
--- 27,33 ----
  typedef Node *(*PostParseColumnRefHook) (ParseState *pstate, ColumnRef *cref, Node *var);
  typedef Node *(*ParseParamRefHook) (ParseState *pstate, ParamRef *pref);
  typedef Node *(*CoerceParamHook) (ParseState *pstate, Param *param,
! 									   Oid targetTypeId, int32 targetTypeMod);
  
  
  /*

normalization_regression.pytext/x-python; charset=US-ASCII; name=normalization_regression.pyDownload

#13

Alvaro Herrera

alvherre@commandprompt.com

almost 14 years ago

In reply to: Peter Geoghegan (#12)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

I'm curious about the LeafNode stuff. Is this something that could be
done by expression_tree_walker? I'm not completely familiar with it so
maybe there's some showstopper such as some node tags not being
supported, or maybe it just doesn't help. But I'm curious.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#14

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Alvaro Herrera (#13)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 1 March 2012 00:48, Alvaro Herrera <alvherre@commandprompt.com> wrote:

I'm curious about the LeafNode stuff. Is this something that could be
done by expression_tree_walker? I'm not completely familiar with it so
maybe there's some showstopper such as some node tags not being
supported, or maybe it just doesn't help. But I'm curious.

Good question. The LeafNode function (which is a bit of a misnomer, as
noted in a comment) looks rather like a walker function, as appears in
the example in a comment in nodeFuncs.c:

* expression_tree_walker() is designed to support routines that traverse
* a tree in a read-only fashion (although it will also work for routines
* that modify nodes in-place but never add/delete/replace nodes).
* A walker routine should look like this:
*
* bool my_walker (Node *node, my_struct *context)
* {
* if (node == NULL)
* return false;
* // check for nodes that special work is required for, eg:
* if (IsA(node, Var))
* {
* ... do special actions for Var nodes
* }
* else if (IsA(node, ...))
* {
* ... do special actions for other node types
* }
* // for any node type not specially processed, do:
* return expression_tree_walker(node, my_walker, (void *) context);
* }

My understanding is that the expression-tree walking support is mostly
useful for the majority of walker code, which only cares about a small
subset of nodes, and hopes to avoid including boilerplate code just to
walk those other nodes that it's actually disinterested in.

This code, unlike most clients of expression_tree_walker(), is pretty
much interested in everything, since its express purpose is to
fingerprint all possible query trees.

Another benefit of expression_tree_walker is that if you miss a
certain node being added, (say a FuncExpr-like node), you get to
automatically have that node walked over to walk to the nodes that you
do in fact care about (such as those within this new nodes args List).
That makes perfect sense in the majority of cases, but here you've
already missed the fields within this new node that FuncExpr itself
lacks, so you're already finger-printing inaccurately. I suppose you
could still at least get the nodetag and still have a warning about
the fingerprinting being inadequate by going down the
expression_tree_walker path, but I'm inclined to wonder if it you
aren't just better of directly walking the tree, if only to encourage
the idea that this code needs to be maintained over time, and to cut
down on the little extra bit of indirection that that imposes.

It's not going to be any sort of burden to maintain it - it currently
stands at a relatively meagre 800 lines of code for everything to do
with tree walking - and the code that will have to be added with new
nodes or refactored along with the existing tree structure is going to
be totally trivial.

All of that said, I wouldn't mind making LeafNode into a walker, if
that approach is judged to be better, and you don't mind documenting
the order in which the tree is walked as deterministic, because the
order now matters in a way apparently not really anticipated by
expression_tree_walker, though that's probably not a problem.

My real concern now is that it's March 1st, and the last commitfest
may end soon. Even though this patch has extensive regression tests,
has been floating around for months, and, I believe, will end up being
a timely and important feature, a committer has yet to step forward to
work towards this patch getting committed. Can someone volunteer,
please? My expectation is that this feature will make life a lot
easier for a lot of Postgres users.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#15

Daniel Farina

daniel@heroku.com

almost 14 years ago

In reply to: Peter Geoghegan (#14)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Thu, Mar 1, 2012 at 1:27 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

My expectation is that this feature will make life a lot
easier for a lot of Postgres users.

Yes. It's hard to overstate the apparent utility of this feature in
the general category of visibility and profiling.

--
fdr

#16

Josh Berkus

josh@agliodbs.com

almost 14 years ago

In reply to: Daniel Farina (#15)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 3/1/12 1:57 PM, Daniel Farina wrote:

On Thu, Mar 1, 2012 at 1:27 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

My expectation is that this feature will make life a lot
easier for a lot of Postgres users.

Yes. It's hard to overstate the apparent utility of this feature in
the general category of visibility and profiling.

More importantly, this is what pg_stat_statements *should* have been in
8.4, and wasn't.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#17

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Josh Berkus (#16)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 1 March 2012 22:09, Josh Berkus <josh@agliodbs.com> wrote:

On 3/1/12 1:57 PM, Daniel Farina wrote:

On Thu, Mar 1, 2012 at 1:27 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

My expectation is that this feature will make life a lot
easier for a lot of Postgres users.

Yes. It's hard to overstate the apparent utility of this feature in
the general category of visibility and profiling.

More importantly, this is what pg_stat_statements *should* have been in
8.4, and wasn't.

It would probably be prudent to concentrate on getting the core
infrastructure committed first. That way, we at least know that if
this doesn't get into 9.2, we can work on getting it into 9.3 knowing
that once committed, people won't have to wait over a year at the very
least to be able to use the feature. The footprint of such a change is
quite small:

That said, I believe that the patch is pretty close to a commitable
state as things stand, and that all that is really needed is for a
committer familiar with the problem space to conclude the work started
by Daniel and others, adding:

contrib/pg_stat_statements/pg_stat_statements.c | 1420 ++++++++++++++++++-

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#18

Josh Berkus

josh@agliodbs.com

almost 14 years ago

In reply to: Peter Geoghegan (#17)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

It would probably be prudent to concentrate on getting the core
infrastructure committed first. That way, we at least know that if
this doesn't get into 9.2, we can work on getting it into 9.3 knowing
that once committed, people won't have to wait over a year at the very

I don't see why we can't commit the whole thing. This is way more ready
for prime-time than checksums.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#19

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Josh Berkus (#18)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Josh Berkus <josh@agliodbs.com> writes:

It would probably be prudent to concentrate on getting the core
infrastructure committed first. That way, we at least know that if
this doesn't get into 9.2, we can work on getting it into 9.3 knowing
that once committed, people won't have to wait over a year at the very

I don't see why we can't commit the whole thing. This is way more ready
for prime-time than checksums.

We'll get to it in due time. In case you haven't noticed, there's a lot
of stuff in this commitfest. And I don't follow the logic that says
that because Simon is trying to push through a not-ready-for-commit
patch we should drop our standards for other patches.

regards, tom lane

#20

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Tom Lane (#19)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Fri, Mar 2, 2012 at 12:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Josh Berkus <josh@agliodbs.com> writes:

It would probably be prudent to concentrate on getting the core
infrastructure committed first. That way, we at least know that if
this doesn't get into 9.2, we can work on getting it into 9.3 knowing
that once committed, people won't have to wait over a year at the very

I don't see why we can't commit the whole thing. This is way more ready
for prime-time than checksums.

We'll get to it in due time. In case you haven't noticed, there's a lot
of stuff in this commitfest. And I don't follow the logic that says
that because Simon is trying to push through a not-ready-for-commit
patch we should drop our standards for other patches.

I don't follow that logic either, but I also feel like this CommitFest
is dragging on and on. Unless you -- or someone -- are prepared to
devote a lot more time to this, "due time" is not going to arrive any
time in the forseeable future. We're currently making progress at a
rate of maybe 4 patches a week, at which rate we're going to finish
this CommitFest in May. And that might be generous, because we've
been disproportionately knocking off the easy ones. Do we have any
kind of a plan for, I don't know, bringing this to closure on some
reasonable time frame?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#21

Simon Riggs

simon@2ndQuadrant.com

almost 14 years ago

In reply to: Tom Lane (#19)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Fri, Mar 2, 2012 at 5:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Josh Berkus <josh@agliodbs.com> writes:

It would probably be prudent to concentrate on getting the core
infrastructure committed first. That way, we at least know that if
this doesn't get into 9.2, we can work on getting it into 9.3 knowing
that once committed, people won't have to wait over a year at the very

I don't see why we can't commit the whole thing. This is way more ready
for prime-time than checksums.

We'll get to it in due time. In case you haven't noticed, there's a lot
of stuff in this commitfest. And I don't follow the logic that says
that because Simon is trying to push through a not-ready-for-commit
patch we should drop our standards for other patches.

Hmm, not deaf you know. I would never push through a patch that isn't
ready for commit. If I back something it is because it is ready for
use in production by PostgreSQL users, in my opinion. I get burned
just as much, if not more, than others if that's a bad decision, so
its not given lightly.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#22

Josh Berkus

josh@agliodbs.com

almost 14 years ago

In reply to: Tom Lane (#19)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

We'll get to it in due time. In case you haven't noticed, there's a lot
of stuff in this commitfest. And I don't follow the logic that says
that because Simon is trying to push through a not-ready-for-commit
patch we should drop our standards for other patches.

What I'm pointing out is that Peter shouldn't even be talking about
cutting functionality from an apparently-ready-for-committer patch in
order to yield way to a patch about which people are still arguing
specification.

This is exactly why I'm not keen on checksums for 9.2. We've reached
the point where the attention on the checksum patch is pushing aside
other patches which are more ready and have had more work.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#23

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Robert Haas (#20)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 2, 2012 at 12:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

We'll get to it in due time. In case you haven't noticed, there's a lot
of stuff in this commitfest. And I don't follow the logic that says
that because Simon is trying to push through a not-ready-for-commit
patch we should drop our standards for other patches.

I don't follow that logic either, but I also feel like this CommitFest
is dragging on and on. Unless you -- or someone -- are prepared to
devote a lot more time to this, "due time" is not going to arrive any
time in the forseeable future. We're currently making progress at a
rate of maybe 4 patches a week, at which rate we're going to finish
this CommitFest in May. And that might be generous, because we've
been disproportionately knocking off the easy ones. Do we have any
kind of a plan for, I don't know, bringing this to closure on some
reasonable time frame?

Well, personally I was paying approximately zero attention to the
commitfest for most of February, because I was occupied with trying to
get back-branch releases out, as well as some non-Postgres matters.
CF items are now back to the head of my to-do queue; you may have
noticed that I'm busy with Korotkov's array stats patch. I do intend to
take this one up in due course (although considering it's not marked
Ready For Committer yet, I don't see that it deserves time ahead of
those that are).

As for when we'll be done with the CF, I dunno, but since it's the last
one for this release cycle I didn't think that we'd be arbitrarily
closing it on any particular schedule. It'll be done when it's done.

regards, tom lane

#24

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Josh Berkus (#22)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Josh Berkus <josh@agliodbs.com> writes:

This is exactly why I'm not keen on checksums for 9.2. We've reached
the point where the attention on the checksum patch is pushing aside
other patches which are more ready and have had more work.

IMO the reason why it's sucking so much attention is precisely that it's
not close to being ready to commit. But this is well off topic for the
thread we're on. If you want to propose booting checksums from
consideration for 9.2, let's have that discussion on the checksum
thread.

regards, tom lane

#25

Simon Riggs

simon@2ndQuadrant.com

almost 14 years ago

In reply to: Tom Lane (#24)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Fri, Mar 2, 2012 at 8:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Josh Berkus <josh@agliodbs.com> writes:

This is exactly why I'm not keen on checksums for 9.2. We've reached
the point where the attention on the checksum patch is pushing aside
other patches which are more ready and have had more work.

IMO the reason why it's sucking so much attention is precisely that it's
not close to being ready to commit. But this is well off topic for the
thread we're on. If you want to propose booting checksums from
consideration for 9.2, let's have that discussion on the checksum
thread.

Checksums patch isn't sucking much attention at all but admittedly
there are some people opposed to the patch that want to draw out the
conversation until the patch is rejected, but that's not the same
thing. The main elements of the patch have been working for around 7
weeks by now.

I'm not sure how this topic is even raised here, since the patches are
wholly and completely separate, apart from the minor and irrelevant
point that the patch authors both work for 2ndQuadrant. If that
matters at all, I'll be asking how and why.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#26

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Simon Riggs (#25)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Fri, Mar 2, 2012 at 4:56 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Checksums patch isn't sucking much attention at all but admittedly
there are some people opposed to the patch that want to draw out the
conversation until the patch is rejected,

Wow. Sounds like a really shitty thing for those people to do -
torpedoing a perfectly good patch for no reason.

I have an alternative theory, though: they have sincere objections and
don't accept your reasons for discounting those objections.

I'm not sure how this topic is even raised here, since the patches are
wholly and completely separate, apart from the minor and irrelevant
point that the patch authors both work for 2ndQuadrant. If that
matters at all, I'll be asking how and why.

It came up because Josh pointed out that this patch is, in his
opinion, in better shape than the checksum patch. I don't believe
anyone's employment situation comes into it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#27

Simon Riggs

simon@2ndQuadrant.com

almost 14 years ago

In reply to: Robert Haas (#26)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Sat, Mar 3, 2012 at 12:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 2, 2012 at 4:56 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Checksums patch isn't sucking much attention at all but admittedly
there are some people opposed to the patch that want to draw out the
conversation until the patch is rejected,

Wow. Sounds like a really shitty thing for those people to do -
torpedoing a perfectly good patch for no reason.

You've explained to me how you think I do that elsewhere and how that
annoyed you, so I think that topic deserves discussion at the
developers meeting to help us understand one another rather than
perpetuate this.

I have an alternative theory, though: they have sincere objections and
don't accept your reasons for discounting those objections.

That's exactly the problem though and the discussion on it is relevant here.

Nobody thinks objections on this patch, checksums or others are made
insincerely. It's what happens next that matters. The question should
be about acceptance criteria. What do we need to do to get something
useful committed? Without a clear set of criteria for resolution we
cannot move forward swiftly enough to do useful things. My thoughts
are always about salvaging what we can, trying to find a way through
the maze of objections and constraints not just black/white decisions
based upon the existence of an objection, as if that single point
trumps any other consideration and blocks all possibilities.

So there is a clear difference between an objection to any progress on
a topic ("I sincerely object to the checksum patch"), and a technical
objection to taking a particular course of action ("We shouldn't use
bits x1..x3 because...."). The first is not viable, however sincerely
it is made, because it leaves the author with no way of resolving
things and it also presumes that the patch only exists in one version
and that the author is somehow refusing to make agreed changes.
Discussion started *here* because it was said "Person X is trying to
force patch Y thru", which is true - but that doesn't necessarily mean
the version of the patch that current objections apply to, only that
the author has an equally sincere wish to do something useful.

The way forwards here and elsewhere is to list out the things we can't
do and list out the things that must change - a clear list of
acceptance criteria. If we do that as early as possible we give the
author a good shot at being able to make those changes in time to
commit something useful. Again, only *something* useful: the full
original vision is not always possible.

In summary: "What can be done in this release, given the constraints discussed?"

So for Peter's patch - what do we need to do to allow some/all of this
to be committed?

And for the checksum patch please go back to the checksum thread and
list out all the things you consider unresolved. In some cases,
resolutions have been suggested but not yet implemented so it would
help if those are either discounted now before they are written, or
accepted in principle to allow work to proceed.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#28

Andrew Dunstan

andrew@dunslane.net

almost 14 years ago

In reply to: Simon Riggs (#27)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 03/05/2012 05:12 AM, Simon Riggs wrote:

On Sat, Mar 3, 2012 at 12:01 AM, Robert Haas<robertmhaas@gmail.com> wrote:

On Fri, Mar 2, 2012 at 4:56 PM, Simon Riggs<simon@2ndquadrant.com> wrote:

Checksums patch isn't sucking much attention at all but admittedly
there are some people opposed to the patch that want to draw out the
conversation until the patch is rejected,

Wow. Sounds like a really shitty thing for those people to do -
torpedoing a perfectly good patch for no reason.

You've explained to me how you think I do that elsewhere and how that
annoyed you, so I think that topic deserves discussion at the
developers meeting to help us understand one another rather than
perpetuate this.

No matter how much we occasionally annoy each other, I think we all need
to accept that we're all dealing in good faith. Suggestions to the
contrary are ugly, have no foundation in fact that I'm aware of, and
reflect badly on our community.

Postgres has a well deserved reputation for not having the sort of
public bickering that has caused people to avoid certain other projects.
Please keep it that way.

cheers

andrew

#29

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#23)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 2 March 2012 20:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I do intend to take this one up in due course

I probably should have exposed the query_id directly in the
pg_stat_statements view, perhaps as "query_hash". The idea of that
would be to advertise the potential non-uniqueness of the value - a
collision is *extremely* unlikely (as I've previously calculated), but
we cannot preclude the possibility, and as such it isn't *really*
usable as a primary key. BTW, even if there is a collision, we at
least know that there can't be a situation where one user's query
entry gets spurious statistics from the execution of some other
user's, or one database gets statistics from another, since their
corresponding oid values separately form part of the dynahash key,
alongside query_id.

The other reason why I'd like to do this is that I'd like to build on
this work for 9.3, and add a new column - plan_hash. When a new mode,
pg_stat_statements.plan_hash (or somesuch) is disabled (as it is by
default), this is always null, and we get the same 9.2 behaviour. When
it is enabled, however, all existing entries are invalidated, for a
clean slate. We then start hashing both the query tree *and* the query
plan. It's a whole lot less useful if we only hash the latter. Now,
entries within the view use the plan_hash as their key (or maybe a
composite of query_hash and plan_hash). This often results in entries
with duplicate query_hash values, as the planner generates different
plans for equivalent queries, but that doesn't matter; you can easily
write an aggregate query with a "GROUP BY query_hash" clause if that's
what you happen to want to see.

When this optional mode is enabled, at that point we'd probably also
separately instrument planning time, as recently proposed by Fujii.

Does that seem like an interesting idea?

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#30

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#29)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

I probably should have exposed the query_id directly in the
pg_stat_statements view, perhaps as "query_hash".

FWIW, I think that's a pretty bad idea; the hash seems to me to be
strictly an internal matter. Given the sponginess of its definition
I don't really want it exposed to users.

regards, tom lane

#31

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#30)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Is there anything that I could be doing to help bring this patch
closer to a committable state? I'm thinking of the tests in particular
- do you suppose it's acceptable to commit them more or less as-is?

The standard for testing contrib modules seems to be a bit different,
as there is a number of other cases where an impedance mistmatch with
pg_regress necessitates doing things differently. So, the sepgsql
tests, which I understand are mainly to test the environment that the
module is being built for rather than the code itself, are written as
a shellscript than uses various selinux tools. There is also a Perl
script that uses DBD::Pg to benchmark intarray, for example.

Now that we have a defacto standard python driver, something that we
didn't have a couple of years ago, it probably isn't terribly
unreasonable to keep the tests in Python. They'll still probably need
some level of clean-up, to cut back on some of the tests that are
redundant. Some of the tests are merely fuzz tests, which are perhaps
a bit questionable.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#32

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#31)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

Is there anything that I could be doing to help bring this patch
closer to a committable state?

Sorry, I've not actually looked at that patch yet. I felt I should
push on Andres' CTAS patch first, since that's blocking progress on
the command triggers patch.

I'm thinking of the tests in particular
- do you suppose it's acceptable to commit them more or less as-is?

If they rely on having python, that's a 100% guaranteed rejection
in my opinion. It's difficult enough to sell people on incremental
additions of perl dependencies to the build/test process. Bringing
in an entire new scripting language seems like a nonstarter.

I suppose we could commit such a thing as an appendage that doesn't
get run in standard builds, but then I see little point in it at all.
Tests that don't get run regularly are next door to useless.

Is there a really strong reason why adequate regression testing isn't
possible in a plain-vanilla pg_regress script? A quick look at the
script says that it's just doing some SQL commands and then checking the
results of queries on the pg_stat_statements views. Admittedly the
output would be bulkier in pg_regress, which would mean that we'd not
likely want several hundred test cases. But IMO the objective of a
regression test is not to memorialize every single case the code author
thought about during development. ISTM it would not take very many
cases to have reasonable code coverage.

regards, tom lane

#33

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#32)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 18 March 2012 16:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Is there a really strong reason why adequate regression testing isn't
possible in a plain-vanilla pg_regress script? A quick look at the
script says that it's just doing some SQL commands and then checking the
results of queries on the pg_stat_statements views. Admittedly the
output would be bulkier in pg_regress, which would mean that we'd not
likely want several hundred test cases. But IMO the objective of a
regression test is not to memorialize every single case the code author
thought about during development. ISTM it would not take very many
cases to have reasonable code coverage.

Hmm. It's difficult to have much confidence that a greatly reduced
number of test cases ought to provide sufficient coverage. I don't
disagree with your contention, I just don't know how to judge this
matter. Given that there isn't really a maintenance burden with
regression tests, I imagine that that makes it compelling to be much
more inclusive.

The fact that we rely on there being no concurrent queries might have
to be worked around for parallel scheduled regression tests, such as
by doing everything using a separate database, with that database oid
always in the predicate of the query checking the pg_stat_statements
view.

I probably would have written the tests in Perl in the first place,
but I don't know Perl. These tests existed in some form from day 1, as
I followed a test-driven development methodology, and needed to use a
language that I could be productive in immediately. There is probably
no reason why they cannot be re-written in Perl, but spit out
pg_regress tests, compacting the otherwise-verbose pg_regress input.
Should I cut my teeth on Perl by writing the tests to do so? How might
this be integrated with the standard regression tests, if that's
something that is important?

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#34

Andrew Dunstan

andrew@dunslane.net

almost 14 years ago

In reply to: Peter Geoghegan (#33)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 03/18/2012 06:12 PM, Peter Geoghegan wrote:

On 18 March 2012 16:13, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Is there a really strong reason why adequate regression testing isn't
possible in a plain-vanilla pg_regress script? A quick look at the
script says that it's just doing some SQL commands and then checking the
results of queries on the pg_stat_statements views. Admittedly the
output would be bulkier in pg_regress, which would mean that we'd not
likely want several hundred test cases. But IMO the objective of a
regression test is not to memorialize every single case the code author
thought about during development. ISTM it would not take very many
cases to have reasonable code coverage.

Hmm. It's difficult to have much confidence that a greatly reduced
number of test cases ought to provide sufficient coverage. I don't
disagree with your contention, I just don't know how to judge this
matter. Given that there isn't really a maintenance burden with
regression tests, I imagine that that makes it compelling to be much
more inclusive.

The fact that we rely on there being no concurrent queries might have
to be worked around for parallel scheduled regression tests, such as
by doing everything using a separate database, with that database oid
always in the predicate of the query checking the pg_stat_statements
view.

I probably would have written the tests in Perl in the first place,
but I don't know Perl. These tests existed in some form from day 1, as
I followed a test-driven development methodology, and needed to use a
language that I could be productive in immediately. There is probably
no reason why they cannot be re-written in Perl, but spit out
pg_regress tests, compacting the otherwise-verbose pg_regress input.
Should I cut my teeth on Perl by writing the tests to do so? How might
this be integrated with the standard regression tests, if that's
something that is important?

A pg_regress script doesn't require any perl. It's pure SQL.

It is perfectly possible to make a single test its own group in a
parallel schedule, and this is done now for a number of cases. See
src/test/regress/parallel_schedule. Regression tests run in their own
database set up for the purpose. You should be able to restrict the
regression queries to only the current database.

If you want to generate the tests using some tool, then use whatever
works for you, be it Python or Perl or Valgol, but ideally what is
committed (and this what should be in your patch) will be the SQL output
of that, not the generator plus input. Tests built that way get
automatically run by the buildfarm. Tests that don't use the standard
testing framework don't. You need a *really* good reason, therefore, not
to do it that way.

cheers

andrew

#35

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Andrew Dunstan (#34)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 18 March 2012 22:46, Andrew Dunstan <andrew@dunslane.net> wrote:

If you want to generate the tests using some tool, then use whatever works
for you, be it Python or Perl or Valgol, but ideally what is committed (and
this what should be in your patch) will be the SQL output of that, not the
generator plus input.

The reason that I'd prefer to use Perl or even Python to generate
pg_regress input, and then have that infrastructure committed is
because it's a lot more natural and succint to deal with the problem
that way. I would have imagined that a patch that repeats the same
boilerplate again and again, to test almost every minor facet of
normalisation would be frowned upon. However, if you prefer that, it
can easily be accommodated.

The best approach might be to commit the output of the Python script
as well as the python script itself, with some clean-up work. That
way, no one is actually required to run the Python script themselves
as part of a standard build, and so they have no basis to complain
about additional dependencies. We can run the regression tests from
the buildfarm without any additional infrastructure to invoke the
python script to generate the pg_regress tests each time. When time
comes to change the representation of the query tree, which is not
going to be that frequent an event, but will occur every once in a
while, the author of the relevant patch should think to add some tests
to my existing set, and verify that they pass. That's going to be made
a lot easier by having them edit a file that expresses the problem in
terms whether two queries should be equivalent or distinct, or what a
given query's final canonicalised representation should look like, all
with minimal boilerplate. I'm only concerned with making the patch as
easy as possible to maintain.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#36

Andrew Dunstan

andrew@dunslane.net

almost 14 years ago

In reply to: Peter Geoghegan (#35)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 03/18/2012 07:46 PM, Peter Geoghegan wrote:

On 18 March 2012 22:46, Andrew Dunstan<andrew@dunslane.net> wrote:

If you want to generate the tests using some tool, then use whatever works
for you, be it Python or Perl or Valgol, but ideally what is committed (and
this what should be in your patch) will be the SQL output of that, not the
generator plus input.

The reason that I'd prefer to use Perl or even Python to generate
pg_regress input, and then have that infrastructure committed is
because it's a lot more natural and succint to deal with the problem
that way. I would have imagined that a patch that repeats the same
boilerplate again and again, to test almost every minor facet of
normalisation would be frowned upon. However, if you prefer that, it
can easily be accommodated.

If your tests are that voluminous then maybe they are not what we're
looking for anyway. As Tom noted:

IMO the objective of a regression test is not to memorialize every single case the code author thought about during development. ISTM it would not take very many cases to have reasonable code coverage.

Why exactly does this feature need particularly to have script-driven
regression test generation when others don't?

If this is a general pattern that people want to follow, then maybe we
need to plan and support it rather than just add a random test
generation script here and there.

cheers

andrew

#37

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Andrew Dunstan (#36)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 19 March 2012 00:10, Andrew Dunstan <andrew@dunslane.net> wrote:

If your tests are that voluminous then maybe they are not what we're looking
for anyway. As Tom noted:

IMO the objective of a regression test is not to memorialize every single
case the code author thought about during development. ISTM it would not
take very many cases to have reasonable code coverage.

Fair enough.

Why exactly does this feature need particularly to have script-driven
regression test generation when others don't?

It's not that it needs it, so much as that it is possible to provide
coverage for much of the code with black-box testing. In the case of
most of the hundreds of tests, I can point to a particular piece of
code that is being tested, that was written *after* the test was.
Doing this with pg_regress the old-fashioned way is going to be
incredibly verbose. I'm all for doing script-generation of pg_regress
tests in a well-principled way, and I'm happy to take direction from
others as to what that should look like.

I know that for the most part the tests provide coverage for discrete
units of functionality, and so add value. If they add value, why not
include them? Tests are supposed to be comprehensive. If that
inconveniences you, by slowing down the buildfarm for questionable
benefits, maybe it would be okay to have some tests not run
automatically, even if that did make them "next door to useless" in
Tom's estimation. There could be a more limited set of conventional
pg_regress tests that are run automatically, plus more comprehensive
tests that are run less frequently, typically only as it becomes
necessary to alter pg_stat_statements to take account of those
infrequent changes (typically additions) to the query tree.

We have tests that ensure that header files don't contain C++
keywords, and nominally promise to not do so, and they are not run
automatically. I don't see the sense in requiring that tests should be
easy to run, while also aspiring to have tests that are as useful and
comprehensive as possible. It seems like the code should dictate the
testing infrastructure, and not the other way around.

Part of the reason why I'm resistant to reducing the number of tests
is that it seems to me that excluding some tests but not others would
be quite arbitrary. It is not the case that some tests are clearly
more useful than others (except for the fuzz testing stuff, which
probably isn't all that useful).

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#38

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#37)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

On 19 March 2012 00:10, Andrew Dunstan <andrew@dunslane.net> wrote:

Why exactly does this feature need particularly to have script-driven
regression test generation when others don't?

It's not that it needs it, so much as that it is possible to provide
coverage for much of the code with black-box testing. In the case of
most of the hundreds of tests, I can point to a particular piece of
code that is being tested, that was written *after* the test was.

Well, basically what you're saying is that you did test-driven
development, which is fine. However, that does not mean that those
same tests are ideal for ongoing regression testing. What we want from
a regression test these days is primarily (a) portability testing, ie
does the feature work on platforms other than yours?, and (b) early
warning if someone breaks it down the road. In most cases, fairly
coarse testing is enough to catch drive-by breakage; and when it's not
enough, like as not the breakage is due to something you never thought
about originally and thus never tested for, so you'd not have caught it
anyway.

I am *not* a fan of regression tests that try to microscopically test
every feature in the system. Sure you should do that when initially
developing a feature, but it serves little purpose to do it over again
every time any other developer runs the regression tests for the
foreseeable future. That road leads to a regression suite that's so
voluminous that it takes too long to run and developers start to avoid
running it, which is counterproductive. For an example in our own
problem space look at mysql, whose regression tests take well over an
hour to run on a fast box. So they must be damn near bug-free right?
Uh, not so much, and I think the fact that developers can't easily run
their test suite is not unrelated to that.

So what I'd rather see is a small set of tests that are designed to do a
smoke-test of functionality and then exercise any interfaces to the rest
of the system that seem likely to break. Over time we might augment
that, when we find particular soft spots as a result of previously
undetected bugs. But sheer volume of tests is not a positive IMO.

As for the scripted vs raw-SQL-in-pg_regress question, I'm making the
same point as Andrew: only the pg_regress method is likely to get run
nearly everywhere, which means that the scripted approach is a FAIL
so far as the portability-testing aspect is concerned.

Lastly, even given that we were willing to accept a scripted set of
tests, I'd want to see it in perl not python. Perl is the project
standard; I see no reason to expect developers to learn two different
scripting languages to work on PG. (There might be a case for
accepting python-scripted infrastructure for pl/python, say, but not
for components that are 100% unrelated to python.)

regards, tom lane

#39

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#38)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 19 March 2012 01:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I am *not* a fan of regression tests that try to microscopically test
every feature in the system.

I see your point of view. I suppose I can privately hold onto the test
suite, since it might prove useful again.

I will work on a pg_regress based approach with a reasonably-sized
random subset of about 20 of my existing tests, to provide some basic
smoke testing.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#40

Greg Stark

stark@mit.edu

almost 14 years ago

In reply to: Tom Lane (#38)

Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Mon, Mar 19, 2012 at 1:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

For an example in our own
problem space look at mysql, whose regression tests take well over an
hour to run on a fast box. So they must be damn near bug-free right?
Uh, not so much, and I think the fact that developers can't easily run
their test suite is not unrelated to that.

The other problem with this approach is that it's hard to keep a huge
test suite 100% clean. Changes inevitably introduce behaviour changes
that cause some of the tests to fail. If the test suite is huge then
it's a lot of work to be continually fixing these tests and you're
always behind. If it's always the case that some tests in this huge
suite are failing then it's extra work whenever you make a change to
dig through the results and determine whether any of the failures are
caused by your changes and represent a real problem. Even if you do
the work it's easy to get it wrong and miss a real failure.

My suggestion would be to go ahead and check in the python or perl
script but not make that the pg_regress tests that are run by mak
check. Cherry pick just a good set of tests that test most of the
tricky bits and check that in to run on make test. I tihnk there's
even precedent for that in one of the other modules that has a make
longcheck or make slowcheck or something like that.

--
greg

#41

Peter Eisentraut

peter_e@gmx.net

almost 14 years ago

In reply to: Peter Geoghegan (#39)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On mån, 2012-03-19 at 02:35 +0000, Peter Geoghegan wrote:

I see your point of view. I suppose I can privately hold onto the test
suite, since it might prove useful again.

I would still like to have those tests checked in, but not run by
default, in case someone wants to hack on this particular feature again.

#42

Peter Eisentraut

peter_e@gmx.net

almost 14 years ago

In reply to: Greg Stark (#40)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On mån, 2012-03-19 at 08:59 +0000, Greg Stark wrote:

The other problem with this approach is that it's hard to keep a huge
test suite 100% clean. Changes inevitably introduce behaviour changes
that cause some of the tests to fail.

I think we are used to that because of the way pg_regress works. When
you have a better test infrastructure that tests actual functionality
rather than output formatting, this shouldn't be the case (nearly as
much).

If someone wanted to bite the bullet and do the work, I think we could
move to a Perl/TAP-based test suite (not pgTAP, but Perl and some fairly
standard Test::* modules) and reduce that useless reformatting work and
test more interesting things. Just a thought ...

#43

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Peter Eisentraut (#42)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 19 March 2012 19:55, Peter Eisentraut <peter_e@gmx.net> wrote:

If someone wanted to bite the bullet and do the work, I think we could
move to a Perl/TAP-based test suite (not pgTAP, but Perl and some fairly
standard Test::* modules) and reduce that useless reformatting work and
test more interesting things. Just a thought ...

I think that that is a good idea. However, I am not a Perl hacker,
though this is the second time that that has left me at a disadvantage
when working on Postgres, so I think it's probably time to learn a
certain amount.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#44

Daniel Farina

daniel@heroku.com

almost 14 years ago

In reply to: Peter Geoghegan (#39)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Sun, Mar 18, 2012 at 7:35 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:

On 19 March 2012 01:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I am *not* a fan of regression tests that try to microscopically test
every feature in the system.

I see your point of view. I suppose I can privately hold onto the test
suite, since it might prove useful again.

I will work on a pg_regress based approach with a reasonably-sized
random subset of about 20 of my existing tests, to provide some basic
smoke testing.

This may sound rather tortured, but in the main regression suite there
is a .c file that links some stuff into the backend that is then
accessed via CREATE FUNCTION to do some special fiddly bits. Could a
creative hook be used here to avoid the repetition you are avoiding
via Python? (e.g. constant resetting of pg_stat_statements or
whatnot). It might sound too much like changing the system under
test, but I think it would still retain most of the value.

I also do like the pg_regress workflow in general, although clearly it
cannot do absolutely everything. Running and interpreting the results
of your tests was not hard, but it was definitely *different* which
could be a headache if one-off testing frameworks proliferate.

--
fdr

#45

Noah Misch

noah@leadboat.com

almost 14 years ago

In reply to: Peter Eisentraut (#41)

Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Mon, Mar 19, 2012 at 09:49:32PM +0200, Peter Eisentraut wrote:

On m??n, 2012-03-19 at 02:35 +0000, Peter Geoghegan wrote:

I see your point of view. I suppose I can privately hold onto the test
suite, since it might prove useful again.

I would still like to have those tests checked in, but not run by
default, in case someone wants to hack on this particular feature again.

Agreed. Also, patch review becomes materially smoother when the author
includes comprehensive tests. When a usage I wish to verify already appears
in the submitted tests, that saves time. I respect the desire to keep regular
"make check" lean, but not if it means comprehensive tests get written to be
buried in the mailing list archives or never submitted at all.

#46

Bruce Momjian

bruce@momjian.us

almost 14 years ago

In reply to: Peter Geoghegan (#43)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Mon, Mar 19, 2012 at 08:48:07PM +0000, Peter Geoghegan wrote:

On 19 March 2012 19:55, Peter Eisentraut <peter_e@gmx.net> wrote:

If someone wanted to bite the bullet and do the work, I think we could
move to a Perl/TAP-based test suite (not pgTAP, but Perl and some fairly
standard Test::* modules) and reduce that useless reformatting work and
test more interesting things. ï¿½Just a thought ...

I think that that is a good idea. However, I am not a Perl hacker,
though this is the second time that that has left me at a disadvantage
when working on Postgres, so I think it's probably time to learn a
certain amount.

My blog entry on this topic might be helpful:

http://momjian.us/main/blogs/pgblog/2008.html#October_4_2008_2

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#47

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#12)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

[ pg_stat_statements_norm_2012_02_29.patch ]

I started to look at this patch (just the core-code modifications so
far). There are some things that seem not terribly well thought out:

* It doesn't look to me like it will behave very sanely with rules.
The patch doesn't store queryId in a stored rule tree, so a Query
retrieved from a stored rule will have a zero queryId, and that's
what will get pushed through to the resulting plan tree as well.
So basically all DO ALSO or DO INSTEAD operations are going to get
lumped together by pg_stat_statements, and separated from the queries
that triggered them, which seems pretty darn unhelpful.

I don't know that storing queryId would be better, since after a restart
that'd mean there are query IDs running around in the system that the
current instance of pg_stat_statements has never heard of. Permanently
stored query IDs would also be a headache if you needed to change the
fingerprint algorithm, or if there were more than one add-on trying to
use the query ID support.

I'm inclined to think that the most useful behavior is to teach the
rewriter to copy queryId from the original query into all the Queries
generated by rewrite. Then, all rules fired by a source query would
be lumped into that query for tracking purposes. This might not be
the ideal behavior either, but I don't see a better solution.

* The patch injects the query ID calculation code by redefining
parse_analyze and parse_analyze_varparams as hookable functions and
then getting into those hooks. I don't find this terribly sane either.
pg_stat_statements has no interest in the distinction between those two
methods of getting into parse analysis. Perhaps more to the point,
those are not the only two ways of getting into parse analysis: some
places call transformTopLevelStmt directly, for instance
pg_analyze_and_rewrite_params. While it might be that the code paths
that do that are not of interest for fingerprinting queries, it's far
from obvious that these two are the correct and only places to do such
fingerprinting.

I think that if we are going to take the attitude that we only care
about fingerprinting queries that come in from the client, then we
ought to call the fingerprinting code in the client-message-processing
routines in postgres.c. But in that case we need to be a little clearer
about what we are doing with unfingerprinted queries. Alternatively,
we might take the position that we want to fingerprint every Query
struct, but in that case the existing hooks are clearly insufficient.
This seems to boil down to what you want to have happen with queries
created/executed inside functions, which is something I don't recall
being discussed.

Either way, I think we'd be a lot better advised to define a single
hook "post_parse_analysis_hook" and make the core code responsible
for calling it at the appropriate places, rather than supposing that
the contrib module knows exactly which core functions ought to be
the places to do it.

Thoughts?

regards, tom lane

#48

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#47)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 22 March 2012 17:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm inclined to think that the most useful behavior is to teach the
rewriter to copy queryId from the original query into all the Queries
generated by rewrite. Then, all rules fired by a source query would
be lumped into that query for tracking purposes. This might not be
the ideal behavior either, but I don't see a better solution.

+1. This behaviour seems fairly sane. The lumping together of DO ALSO
and DO INSTEAD operations was a simple oversight.

This seems to boil down to what you want to have happen with queries
created/executed inside functions, which is something I don't recall
being discussed.

Uh, well, pg_stat_statements is clearly supposed to monitor execution
of queries from within functions - there is a GUC,
"pg_stat_statements.track", which can be set to 'all' to track nested
queries. That being the case, we should clearly be fingerprinting
those query trees too.

The fact that we'll fingerprint these queries even though we usually
don't care about them doesn't seem like a problem, since in practice
the vast majority will be prepared.

Either way, I think we'd be a lot better advised to define a single
hook "post_parse_analysis_hook" and make the core code responsible
for calling it at the appropriate places, rather than supposing that
the contrib module knows exactly which core functions ought to be
the places to do it.

I agree.

Since you haven't mentioned the removal of parse-analysis Const
location alterations, I take it that you do not object to that, which
is something I'm glad of.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#49

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#48)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

Since you haven't mentioned the removal of parse-analysis Const
location alterations, I take it that you do not object to that, which
is something I'm glad of.

I remain un-thrilled about it, but apparently nobody else cares, so
I'll yield the point. (I do however object to your removal of the
cast location value from the param_coerce_hook signature. The fact
that one current user of the hook won't need it anymore doesn't mean
no others would. Consider throwing a "can't coerce" error from within
the hook function, for instance.)

Will you adjust the patch for the other issues?

regards, tom lane

#50

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#49)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 22 March 2012 19:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Will you adjust the patch for the other issues?

Sure. I'll take a look at it now.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#51

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Peter Geoghegan (#48)

2 attachment(s)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

I've attached a patch with the required modifications. I also attach
revised tests, since naturally I have continued with test-driven
development.

On 22 March 2012 18:49, Peter Geoghegan <peter@2ndquadrant.com> wrote:

On 22 March 2012 17:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm inclined to think that the most useful behavior is to teach the
rewriter to copy queryId from the original query into all the Queries
generated by rewrite. Then, all rules fired by a source query would
be lumped into that query for tracking purposes. This might not be
the ideal behavior either, but I don't see a better solution.

+1. This behaviour seems fairly sane. The lumping together of DO ALSO
and DO INSTEAD operations was a simple oversight.

Implemented. We simply do this now:

*************** RewriteQuery(Query *parsetree, List *rew
*** 2141,2146 ****
--- 2142,2154 ----
  errmsg("WITH cannot be used in a query that is rewritten by rules
into multiple queries")));
  }

+ /* Mark rewritten queries with their originating queryId */
+ foreach(lc1, rewritten)
+ {
+ Query   *q = (Query *) lfirst(lc1);
+ q->queryId = orig_query_id;
+ }
+
  return rewritten;
 }

Either way, I think we'd be a lot better advised to define a single
hook "post_parse_analysis_hook" and make the core code responsible
for calling it at the appropriate places, rather than supposing that
the contrib module knows exactly which core functions ought to be
the places to do it.

I have done this too. The hook is called in the following places, and
some tests won't pass if any one of them is commented out:

parse_analyze
parse_analyze_varparams
pg_analyze_and_rewrite_params

I have notably *not* added anything to the following transformstmt
call site functions for various obvious reasons:

inline_function
parse_sub_analyze
transformInsertStmt
transformDeclareCursorStmt
transformExplainStmt
transformRuleStmt

I assert against pg_stat_statements fingerprinting a query twice, and
have reasonable test coverage for nested queries (both due to rules
and function execution) now. I also tweaked pg_stat_statements itself
in one or two places.

I restored the location field to the ParamCoerceHook signature, but
the removal of code to modify the param location remains (again, not
because I need it, but because I happen to think that it ought to be
consistent with Const).

Thoughts?

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

normalization_regression.pytext/x-python; charset=US-ASCII; name=normalization_regression.pyDownload

pg_stat_statements_norm_2012_03_25.patchtext/x-patch; charset=US-ASCII; name=pg_stat_statements_norm_2012_03_25.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
new file mode 100644
index 914fbf2..209d62e
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
***************
*** 10,15 ****
--- 10,38 ----
   * an entry, one must hold the lock shared or exclusive (so the entry doesn't
   * disappear!) and also take the entry's mutex spinlock.
   *
+  * As of Postgres 9.2, this module normalizes query entries. Normalization is a
+  * process whereby similar queries, typically differing only in their constants
+  * (though the exact rules are somewhat more subtle than that) are recognized as
+  * equivalent, and are tracked as a single entry. This is particularly useful
+  * for non-prepared queries.
+  *
+  * Normalization is implemented by fingerprinting queries, selectively
+  * serializing those fields of each query tree's nodes that are judged to be
+  * essential to the query.  This is referred to as a query jumble. This is
+  * distinct from a regular serialization, in that various extraneous information
+  * is ignored as irrelevant or not essential to the query, such as the collation
+  * of Vars, and, most notably, the value of constants - it isn't actually
+  * possible or desirable to deserialize.
+  *
+  * Once this jumble is acquired, a 32-bit hash is taken, which is copied back
+  * into the query tree at the post-analysis stage.  Postgres then naively copies
+  * this value around, making it later available from within the corresponding
+  * plan tree. The executor can then use this value to blame query costs on a
+  * known queryId.
+  *
+  * Within the executor hook, the module stores the cost of query execution,
+  * based on a queryId provided by the core system and some other values, within
+  * the shared hashtable.
   *
   * Copyright (c) 2008-2012, PostgreSQL Global Development Group
   *
***************
*** 27,38 ****
--- 50,65 ----
  #include "funcapi.h"
  #include "mb/pg_wchar.h"
  #include "miscadmin.h"
+ #include "parser/analyze.h"
+ #include "parser/parsetree.h"
+ #include "parser/scanner.h"
  #include "pgstat.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "storage/spin.h"
  #include "tcop/utility.h"
  #include "utils/builtins.h"
+ #include "utils/memutils.h"
  
  
  PG_MODULE_MAGIC;
*************** PG_MODULE_MAGIC;
*** 41,54 ****
  #define PGSS_DUMP_FILE	"global/pg_stat_statements.stat"
  
  /* This constant defines the magic number in the stats file header */
! static const uint32 PGSS_FILE_HEADER = 0x20100108;
  
  /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
  #define USAGE_EXEC(duration)	(1.0)
  #define USAGE_INIT				(1.0)	/* including initial planning */
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */
! 
  /*
   * Hashtable key that defines the identity of a hashtable entry.  The
   * hash comparators do not assume that the query string is null-terminated;
--- 68,87 ----
  #define PGSS_DUMP_FILE	"global/pg_stat_statements.stat"
  
  /* This constant defines the magic number in the stats file header */
! static const uint32 PGSS_FILE_HEADER = 0x20120103;
  
  /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
  #define USAGE_EXEC(duration)	(1.0)
  #define USAGE_INIT				(1.0)	/* including initial planning */
+ #define USAGE_NON_EXEC_STICK	(1.0e10)/* unexecuted queries sticky */
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */
! #define JUMBLE_SIZE				1024    /* query serialization buffer size */
! /* Magic values for jumble */
! #define MAG_HASH_BUF			0xFA	/* buffer is a hash of query tree */
! #define MAG_STR_BUF				0xEB	/* buffer is query string itself */
! #define MAG_RETURN_LIST			0xAE	/* returning list node follows */
! #define MAG_LIMIT_OFFSET		0xBA	/* limit/offset node follows */
  /*
   * Hashtable key that defines the identity of a hashtable entry.  The
   * hash comparators do not assume that the query string is null-terminated;
*************** typedef struct pgssHashKey
*** 63,70 ****
  	Oid			userid;			/* user OID */
  	Oid			dbid;			/* database OID */
  	int			encoding;		/* query encoding */
! 	int			query_len;		/* # of valid bytes in query string */
! 	const char *query_ptr;		/* query string proper */
  } pgssHashKey;
  
  /*
--- 96,102 ----
  	Oid			userid;			/* user OID */
  	Oid			dbid;			/* database OID */
  	int			encoding;		/* query encoding */
! 	uint32		queryid;		/* query identifier */
  } pgssHashKey;
  
  /*
*************** typedef struct pgssEntry
*** 97,102 ****
--- 129,135 ----
  {
  	pgssHashKey key;			/* hash key of entry - MUST BE FIRST */
  	Counters	counters;		/* the statistics for this query */
+ 	int			query_len;		/* # of valid bytes in query string */
  	slock_t		mutex;			/* protects the counters only */
  	char		query[1];		/* VARIABLE LENGTH ARRAY - MUST BE LAST */
  	/* Note: the allocated length of query[] is actually pgss->query_size */
*************** typedef struct pgssSharedState
*** 111,117 ****
--- 144,164 ----
  	int			query_size;		/* max query length in bytes */
  } pgssSharedState;
  
+ typedef struct pgssLocationLen
+ {
+ 	int location;
+ 	int length;
+ } pgssLocationLen;
+ 
  /*---- Local variables ----*/
+ /* Jumble of current query tree */
+ static unsigned char *last_jumble = NULL;
+ /* Buffer that represents position of normalized characters */
+ static pgssLocationLen *last_offsets = NULL;
+ /* Current Length of last_offsets buffer */
+ static Size last_offset_buf_size = 10;
+ /* Current number of actual offsets stored in last_offsets */
+ static Size last_offset_num = 0;
  
  /* Current nesting depth of ExecutorRun calls */
  static int	nested_level = 0;
*************** static ExecutorRun_hook_type prev_Execut
*** 123,133 ****
--- 170,188 ----
  static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
  static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
  static ProcessUtility_hook_type prev_ProcessUtility = NULL;
+ static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
  
  /* Links to shared memory state */
  static pgssSharedState *pgss = NULL;
  static HTAB *pgss_hash = NULL;
  
+ /*
+  * Maintain a stack of the rangetable of the query tree that we're currently
+  * walking, so subqueries can reference parent rangetables. The stack is pushed
+  * and popped as each Query struct is walked into or out of.
+  */
+ static List* pgss_rangetbl_stack = NIL;
+ 
  /*---- GUC variables ----*/
  
  typedef enum
*************** static int	pgss_max;			/* max # statemen
*** 149,154 ****
--- 204,210 ----
  static int	pgss_track;			/* tracking level */
  static bool pgss_track_utility; /* whether to track utility commands */
  static bool pgss_save;			/* whether to save stats across shutdown */
+ static bool pgss_string_key;	/* whether to always only hash query str */
  
  
  #define pgss_enabled() \
*************** PG_FUNCTION_INFO_V1(pg_stat_statements);
*** 168,173 ****
--- 224,245 ----
  
  static void pgss_shmem_startup(void);
  static void pgss_shmem_shutdown(int code, Datum arg);
+ static int comp_offset(const void *a, const void *b);
+ static void pgss_parse_analyze(Query* post_analysis_tree,
+ 		const char *sourceText,	bool canonicalize);
+ static void pgss_process_post_analysis_tree(Query* post_analysis_tree,
+ 		const char* sourceText, bool canonicalize);
+ static void fill_in_constant_lengths(const char* query,
+ 						pgssLocationLen offs[], Size n_offs);
+ static uint32 JumbleQuery(Query *post_analysis_tree);
+ static void AppendJumb(unsigned char* item, unsigned char jumble[], Size size, Size *i);
+ static void PerformJumble(const Query *tree, Size size, Size *i);
+ static void QualsNode(const OpExpr *node, Size size, Size *i, List *rtable);
+ static void LeafNode(const Node *arg, Size size, Size *i, List *rtable);
+ static void LimitOffsetNode(const Node *node, Size size, Size *i, List *rtable);
+ static void JoinExprNode(JoinExpr *node, Size size, Size *i, List *rtable);
+ static void JoinExprNodeChild(const Node *node, Size size, Size *i, List *rtable);
+ static void RecordConstLocation(int location);
  static void pgss_ExecutorStart(QueryDesc *queryDesc, int eflags);
  static void pgss_ExecutorRun(QueryDesc *queryDesc,
  				 ScanDirection direction,
*************** static void pgss_ProcessUtility(Node *pa
*** 179,188 ****
  					DestReceiver *dest, char *completionTag);
  static uint32 pgss_hash_fn(const void *key, Size keysize);
  static int	pgss_match_fn(const void *key1, const void *key2, Size keysize);
! static void pgss_store(const char *query, double total_time, uint64 rows,
! 		   const BufferUsage *bufusage);
  static Size pgss_memsize(void);
! static pgssEntry *entry_alloc(pgssHashKey *key);
  static void entry_dealloc(void);
  static void entry_reset(void);
  
--- 251,262 ----
  					DestReceiver *dest, char *completionTag);
  static uint32 pgss_hash_fn(const void *key, Size keysize);
  static int	pgss_match_fn(const void *key1, const void *key2, Size keysize);
! static uint32 pgss_hash_string(const char* str);
! static void pgss_store(const char *query, uint32 queryId,
! 				double total_time, uint64 rows,
! 				const BufferUsage *bufusage, bool empty_entry, bool canonicalize);
  static Size pgss_memsize(void);
! static pgssEntry *entry_alloc(pgssHashKey *key, const char* query, int new_query_len);
  static void entry_dealloc(void);
  static void entry_reset(void);
  
*************** static void entry_reset(void);
*** 193,198 ****
--- 267,273 ----
  void
  _PG_init(void)
  {
+ 	MemoryContext oldcontext;
  	/*
  	 * In order to create our shared memory area, we have to be loaded via
  	 * shared_preload_libraries.  If not, fall out without hooking into any of
*************** _PG_init(void)
*** 254,259 ****
--- 329,349 ----
  							 NULL,
  							 NULL);
  
+ 	/*
+ 	 * Support legacy pg_stat_statements behavior, for compatibility with
+ 	 * versions shipped with Postgres 8.4, 9.0 and 9.1
+ 	 */
+ 	DefineCustomBoolVariable("pg_stat_statements.string_key",
+ 			   "Differentiate queries based on query string alone.",
+ 							 NULL,
+ 							 &pgss_string_key,
+ 							 false,
+ 							 PGC_POSTMASTER,
+ 							 0,
+ 							 NULL,
+ 							 NULL,
+ 							 NULL);
+ 
  	EmitWarningsOnPlaceholders("pg_stat_statements");
  
  	/*
*************** _PG_init(void)
*** 265,270 ****
--- 355,372 ----
  	RequestAddinLWLocks(1);
  
  	/*
+ 	 * Allocate a buffer to store selective serialization of the query tree
+ 	 * for the purposes of query normalization.
+ 	 */
+ 	oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+ 
+ 	last_jumble = palloc(JUMBLE_SIZE);
+ 	/* Allocate space for bookkeeping information for query str normalization */
+ 	last_offsets = palloc(last_offset_buf_size * sizeof(pgssLocationLen));
+ 
+ 	MemoryContextSwitchTo(oldcontext);
+ 
+ 	/*
  	 * Install hooks.
  	 */
  	prev_shmem_startup_hook = shmem_startup_hook;
*************** _PG_init(void)
*** 279,284 ****
--- 381,388 ----
  	ExecutorEnd_hook = pgss_ExecutorEnd;
  	prev_ProcessUtility = ProcessUtility_hook;
  	ProcessUtility_hook = pgss_ProcessUtility;
+ 	prev_post_parse_analyze_hook = post_parse_analyze_hook;
+ 	post_parse_analyze_hook = pgss_parse_analyze;
  }
  
  /*
*************** _PG_fini(void)
*** 294,299 ****
--- 398,407 ----
  	ExecutorFinish_hook = prev_ExecutorFinish;
  	ExecutorEnd_hook = prev_ExecutorEnd;
  	ProcessUtility_hook = prev_ProcessUtility;
+ 	post_parse_analyze_hook = prev_post_parse_analyze_hook;
+ 
+ 	pfree(last_jumble);
+ 	pfree(last_offsets);
  }
  
  /*
*************** pgss_shmem_startup(void)
*** 397,423 ****
  		if (!PG_VALID_BE_ENCODING(temp.key.encoding))
  			goto error;
  
  		/* Previous incarnation might have had a larger query_size */
! 		if (temp.key.query_len >= buffer_size)
  		{
! 			buffer = (char *) repalloc(buffer, temp.key.query_len + 1);
! 			buffer_size = temp.key.query_len + 1;
  		}
  
! 		if (fread(buffer, 1, temp.key.query_len, file) != temp.key.query_len)
  			goto error;
! 		buffer[temp.key.query_len] = '\0';
  
  		/* Clip to available length if needed */
! 		if (temp.key.query_len >= query_size)
! 			temp.key.query_len = pg_encoding_mbcliplen(temp.key.encoding,
  													   buffer,
! 													   temp.key.query_len,
  													   query_size - 1);
- 		temp.key.query_ptr = buffer;
  
  		/* make the hashtable entry (discards old entries if too many) */
! 		entry = entry_alloc(&temp.key);
  
  		/* copy in the actual stats */
  		entry->counters = temp.counters;
--- 505,535 ----
  		if (!PG_VALID_BE_ENCODING(temp.key.encoding))
  			goto error;
  
+ 		/* Avoid loading sticky entries */
+ 		if (temp.counters.calls == 0)
+ 			continue;
+ 
  		/* Previous incarnation might have had a larger query_size */
! 		if (temp.query_len >= buffer_size)
  		{
! 			buffer = (char *) repalloc(buffer, temp.query_len + 1);
! 			buffer_size = temp.query_len + 1;
  		}
  
! 		if (fread(buffer, 1, temp.query_len, file) != temp.query_len)
  			goto error;
! 		buffer[temp.query_len] = '\0';
! 
  
  		/* Clip to available length if needed */
! 		if (temp.query_len >= query_size)
! 			temp.query_len = pg_encoding_mbcliplen(temp.key.encoding,
  													   buffer,
! 													   temp.query_len,
  													   query_size - 1);
  
  		/* make the hashtable entry (discards old entries if too many) */
! 		entry = entry_alloc(&temp.key, buffer, temp.query_len);
  
  		/* copy in the actual stats */
  		entry->counters = temp.counters;
*************** pgss_shmem_shutdown(int code, Datum arg)
*** 479,485 ****
  	hash_seq_init(&hash_seq, pgss_hash);
  	while ((entry = hash_seq_search(&hash_seq)) != NULL)
  	{
! 		int			len = entry->key.query_len;
  
  		if (fwrite(entry, offsetof(pgssEntry, mutex), 1, file) != 1 ||
  			fwrite(entry->query, 1, len, file) != len)
--- 591,597 ----
  	hash_seq_init(&hash_seq, pgss_hash);
  	while ((entry = hash_seq_search(&hash_seq)) != NULL)
  	{
! 		int			len = entry->query_len;
  
  		if (fwrite(entry, offsetof(pgssEntry, mutex), 1, file) != 1 ||
  			fwrite(entry->query, 1, len, file) != len)
*************** error:
*** 505,510 ****
--- 617,1630 ----
  }
  
  /*
+  * comp_offset: Comparator for qsorting pgssLocationLen values.
+  */
+ static int
+ comp_offset(const void *a, const void *b)
+ {
+ 	int l = ((pgssLocationLen*) a)->location;
+ 	int r = ((pgssLocationLen*) b)->location;
+ 	if (l < r)
+ 		return -1;
+ 	else if (l > r)
+ 		return +1;
+ 	else
+ 		return 0;
+ }
+ 
+ static void
+ pgss_parse_analyze(Query* post_analysis_tree, const char *sourceText,
+ 					bool canonicalize)
+ {
+ 	/*
+ 	 * It is possible that a query could organically have a queryId of 0, but
+ 	 * that is exceptionally unlikely, and besides, this assertion naturally
+ 	 * evaluates to a no-op on "release" builds
+ 	 */
+ 	Assert(post_analysis_tree->queryId == 0);
+ 	if (!post_analysis_tree->utilityStmt)
+ 		pgss_process_post_analysis_tree(post_analysis_tree, sourceText,
+ 											canonicalize);
+ }
+ 
+ /*
+  * pgss_process_post_analysis_tree: Record queryId, which is based on the query
+  * tree, within the tree itself, for later retrieval in the executor hook. The
+  * core system will copy the value to the tree's corresponding plannedstmt.
+  *
+  * Avoid producing a canonicalized string for parameterized queries. It is
+  * simply not desirable given that constants that we might otherwise
+  * canonicalize are going to always be consistent between calls. In addition, it
+  * would be impractical to make the hash entry sticky for an indefinitely long
+  * period (i.e. until the query is actually executed).
+  */
+ static void
+ pgss_process_post_analysis_tree(Query* post_analysis_tree,
+ 		const char* sourceText, bool canonicalize)
+ {
+ 	BufferUsage bufusage;
+ 
+ 	post_analysis_tree->queryId = JumbleQuery(post_analysis_tree);
+ 
+ 	memset(&bufusage, 0, sizeof(bufusage));
+ 	pgss_store(sourceText, post_analysis_tree->queryId, 0, 0, &bufusage,
+ 			true, canonicalize);
+ 
+ 	/* Trim last_offsets */
+ 	if (last_offset_buf_size > 10)
+ 	{
+ 		last_offset_buf_size = 10;
+ 		last_offsets = repalloc(last_offsets,
+ 							last_offset_buf_size *
+ 							sizeof(pgssLocationLen));
+ 	}
+ }
+ 
+ /*
+  * Given a valid SQL string, and offsets whose lengths are uninitialized, fill
+  * in the corresponding lengths of those constants.
+  *
+  * The constant may use any available constant syntax, including but not limited
+  * to float literals, bit-strings, single quoted strings and dollar-quoted
+  * strings. This is accomplished by using the public API for the core scanner,
+  * with a workaround for quirks of their representation. It is expected that the
+  * constants will be sorted by their original location when canonicalizing the
+  * query string, so do that here.
+  *
+  * It is the caller's job to ensure that the string is a valid SQL statement.
+  * Since in practice the string has already been validated, and the locations
+  * that the caller provides will have originated from within the authoritative
+  * parser, this should not be a problem. Duplicates are expected, and will have
+  * their lengths marked as '-1', so that they are later ignored.
+  *
+  * N.B. There is an assumption that a '-' character at a Const location begins a
+  * negative constant. This precludes there ever being another reason for a
+  * constant to start with a '-' for any other reason.
+  */
+ static void
+ fill_in_constant_lengths(const char* query, pgssLocationLen offs[],
+ 							Size n_offs)
+ {
+ 	core_yyscan_t  init_scan;
+ 	core_yy_extra_type ext_type;
+ 	core_YYSTYPE type;
+ 	YYLTYPE pos;
+ 	int i, last_loc = -1;
+ 
+ 	/* Sort offsets */
+ 	qsort(offs, n_offs, sizeof(pgssLocationLen), comp_offset);
+ 
+ 
+ 	init_scan = scanner_init(query,
+ 							 &ext_type,
+ 							 ScanKeywords,
+ 							 NumScanKeywords);
+ 
+ 	for(i = 0; i < n_offs; i++)
+ 	{
+ 		int loc = offs[i].location;
+ 		Assert(loc > 0);
+ 
+ 		if (loc == last_loc)
+ 		{
+ 			/* Duplicate */
+ 			offs[i].length = -1;
+ 			continue;
+ 		}
+ 
+ 		for(;;)
+ 		{
+ 			int scanbuf_len;
+ #ifdef USE_ASSERT_CHECKING
+ 			int tok =
+ #endif
+ 						core_yylex(&type, &pos, init_scan);
+ 			scanbuf_len = strlen(ext_type.scanbuf);
+ 			Assert(tok != 0);
+ 
+ 			if (scanbuf_len > loc)
+ 			{
+ 				if (query[loc] == '-')
+ 				{
+ 					/*
+ 					 * It's a negative value - this is the one and only case
+ 					 * where we canonicalize more than a single token.
+ 					 *
+ 					 * Do not compensate for the core system's special-case
+ 					 * adjustment of location to that of the leading '-'
+ 					 * operator in the event of a negative constant. It is also
+ 					 * useful for our purposes to start from the minus symbol.
+ 					 * In this way, queries like "select * from foo where bar =
+ 					 * 1" and "select * from foo where bar = -2" will always
+ 					 * have identical canonicalized query strings.
+ 					 */
+ 					core_yylex(&type, &pos, init_scan);
+ 					scanbuf_len = strlen(ext_type.scanbuf);
+ 				}
+ 
+ 				/*
+ 				 * Scanner is now at end of const token of outer iteration -
+ 				 * work backwards to get constant length.
+ 				 */
+ 				offs[i].length = scanbuf_len - loc;
+ 				break;
+ 			}
+ 		}
+ 		last_loc = loc;
+ 	}
+ 	scanner_finish(init_scan);
+ }
+ 
+ /*
+  * JumbleQuery: Selectively serialize query tree, and return a hash representing
+  * that serialization - its queryId.
+  *
+  * Note that this doesn't necessarily uniquely identify the query across
+  * different databases and encodings.
+  */
+ static uint32
+ JumbleQuery(Query *post_analysis_tree)
+ {
+ 	/* State for this run of PerformJumble */
+ 	Size i = 0;
+ 	last_offset_num = 0;
+ 	Assert(post_analysis_tree->queryId == 0);
+ 	memset(last_jumble, 0, JUMBLE_SIZE);
+ 	last_jumble[i++] = MAG_HASH_BUF;
+ 	PerformJumble(post_analysis_tree, JUMBLE_SIZE, &i);
+ 	/* Reset rangetbl state */
+ 	list_free(pgss_rangetbl_stack);
+ 	pgss_rangetbl_stack = NIL;
+ 
+ 	return hash_any((const unsigned char* ) last_jumble, i);
+ }
+ 
+ /*
+  * AppendJumb: Append a value that is substantive to a given query to jumble,
+  * while incrementing the iterator, i.
+  */
+ static void
+ AppendJumb(unsigned char* item, unsigned char jumble[], Size size, Size *i)
+ {
+ 	Assert(item != NULL);
+ 	Assert(jumble != NULL);
+ 	Assert(i != NULL);
+ 
+ 	/*
+ 	 * Copy the entire item to the buffer, or as much of it as possible to fill
+ 	 * the buffer to capacity.
+ 	 */
+ 	memcpy(jumble + *i, item, Min(*i > JUMBLE_SIZE? 0:JUMBLE_SIZE - *i, size));
+ 
+ 	/*
+ 	 * Continually hash the query tree's jumble.
+ 	 *
+ 	 * Was JUMBLE_SIZE exceeded? If so, hash the jumble and append that to the
+ 	 * start of the jumble buffer, and then continue to append the fraction of
+ 	 * "item" that we might not have been able to fit at the end of the buffer
+ 	 * in the last iteration. Since the value of i has been set to 0, there is
+ 	 * no need to memset the buffer in advance of this new iteration, but
+ 	 * effectively we are completely discarding the prior iteration's jumble
+ 	 * except for this representative hash value.
+ 	 */
+ 	if (*i > JUMBLE_SIZE)
+ 	{
+ 		uint32 start_hash = hash_any((const unsigned char* ) last_jumble, JUMBLE_SIZE);
+ 		int hash_l = sizeof(start_hash);
+ 		int part_left_l = Max(0, ((int) size - ((int) *i - JUMBLE_SIZE)));
+ 
+ 		Assert(part_left_l >= 0 && part_left_l <= size);
+ 
+ 		memcpy(jumble, &start_hash, hash_l);
+ 		memcpy(jumble + hash_l, item + (size - part_left_l), part_left_l);
+ 		*i = hash_l + part_left_l;
+ 	}
+ 	else
+ 	{
+ 		*i += size;
+ 	}
+ }
+ 
+ /*
+  * Wrapper around AppendJumb to encapsulate details of serialization
+  * of individual local variable elements.
+  */
+ #define APP_JUMB(item) \
+ AppendJumb((unsigned char*)&item, last_jumble, sizeof(item), i)
+ 
+ /*
+  * PerformJumble: Selectively serialize the query tree and canonicalize
+  * constants (i.e.  don't consider their actual value - just their type).
+  *
+  * The last_jumble buffer, which this function writes to, can be hashed to
+  * uniquely identify a query that may use different constants in successive
+  * calls.
+  */
+ static void
+ PerformJumble(const Query *tree, Size size, Size *i)
+ {
+ 	ListCell *l;
+ 	/* table join tree (FROM and WHERE clauses) */
+ 	FromExpr *jt = (FromExpr *) tree->jointree;
+ 	/* # of result tuples to skip (int8 expr) */
+ 	FuncExpr *off = (FuncExpr *) tree->limitOffset;
+ 	/* # of result tuples to skip (int8 expr) */
+ 	FuncExpr *limcount = (FuncExpr *) tree->limitCount;
+ 
+ 	if (pgss_rangetbl_stack &&
+ 			!IsA(pgss_rangetbl_stack, List))
+ 		pgss_rangetbl_stack = NIL;
+ 
+ 	if (tree->rtable != NIL)
+ 	{
+ 		pgss_rangetbl_stack = lappend(pgss_rangetbl_stack, tree->rtable);
+ 	}
+ 	else
+ 	{
+ 		/* Add dummy Range table entry to maintain stack */
+ 		RangeTblEntry *rte = makeNode(RangeTblEntry);
+ 		List *dummy = lappend(NIL, rte);
+ 		pgss_rangetbl_stack = lappend(pgss_rangetbl_stack, dummy);
+ 	}
+ 
+ 	APP_JUMB(tree->resultRelation);
+ 
+ 	if (tree->intoClause)
+ 	{
+ 		IntoClause *ic = tree->intoClause;
+ 		RangeVar   *rel = ic->rel;
+ 
+ 		APP_JUMB(ic->onCommit);
+ 		APP_JUMB(ic->skipData);
+ 		if (rel)
+ 		{
+ 			APP_JUMB(rel->relpersistence);
+ 			/* Bypass macro abstraction to supply size directly.
+ 			 *
+ 			 * Serialize schemaname, relname themselves - this makes us
+ 			 * somewhat consistent with the behavior of utility statements like "create
+ 			 * table", which seems appropriate.
+ 			 */
+ 			if (rel->schemaname)
+ 				AppendJumb((unsigned char *)rel->schemaname, last_jumble,
+ 								strlen(rel->schemaname), i);
+ 			if (rel->relname)
+ 				AppendJumb((unsigned char *)rel->relname, last_jumble,
+ 								strlen(rel->relname), i);
+ 		}
+ 	}
+ 
+ 	/* WITH list (of CommonTableExpr's) */
+ 	foreach(l, tree->cteList)
+ 	{
+ 		CommonTableExpr	*cte = (CommonTableExpr *) lfirst(l);
+ 		Query			*cteq = (Query*) cte->ctequery;
+ 		if (cteq)
+ 			PerformJumble(cteq, size, i);
+ 	}
+ 	if (jt)
+ 	{
+ 		if (jt->quals)
+ 		{
+ 			if (IsA(jt->quals, OpExpr))
+ 			{
+ 				QualsNode((OpExpr*) jt->quals, size, i, tree->rtable);
+ 			}
+ 			else
+ 			{
+ 				LeafNode((Node*) jt->quals, size, i, tree->rtable);
+ 			}
+ 		}
+ 		/* table join tree */
+ 		foreach(l, jt->fromlist)
+ 		{
+ 			Node* fr = lfirst(l);
+ 			if (IsA(fr, JoinExpr))
+ 			{
+ 				JoinExprNode((JoinExpr*) fr, size, i, tree->rtable);
+ 			}
+ 			else if (IsA(fr, RangeTblRef))
+ 			{
+ 				RangeTblRef   *rtf = (RangeTblRef *) fr;
+ 				RangeTblEntry *rte = rt_fetch(rtf->rtindex, tree->rtable);
+ 				APP_JUMB(rte->relid);
+ 				APP_JUMB(rte->rtekind);
+ 				/* Subselection in where clause */
+ 				if (rte->subquery)
+ 					PerformJumble(rte->subquery, size, i);
+ 
+ 				/* Function call in where clause */
+ 				if (rte->funcexpr)
+ 					LeafNode((Node*) rte->funcexpr, size, i, tree->rtable);
+ 			}
+ 			else
+ 			{
+ 				ereport(WARNING,
+ 						(errcode(ERRCODE_INTERNAL_ERROR),
+ 						 errmsg("unexpected, unrecognised fromlist node type: %d",
+ 							 (int) nodeTag(fr))));
+ 			}
+ 		}
+ 	}
+ 	/*
+ 	 * target list (of TargetEntry)
+ 	 * columns returned by query
+ 	 */
+ 	foreach(l, tree->targetList)
+ 	{
+ 		TargetEntry *tg = (TargetEntry *) lfirst(l);
+ 		Node        *e  = (Node*) tg->expr;
+ 		if (tg->ressortgroupref)
+ 			/* nonzero if referenced by a sort/group - for ORDER BY */
+ 			APP_JUMB(tg->ressortgroupref);
+ 		APP_JUMB(tg->resno); /* column number for select */
+ 		/*
+ 		 * Handle the various types of nodes in
+ 		 * the select list of this query
+ 		 */
+ 		LeafNode(e, size, i, tree->rtable);
+ 	}
+ 	/* return-values list (of TargetEntry) */
+ 	foreach(l, tree->returningList)
+ 	{
+ 		TargetEntry *rt = (TargetEntry *) lfirst(l);
+ 		Expr        *e  = (Expr*) rt->expr;
+ 		unsigned char magic = MAG_RETURN_LIST;
+ 		APP_JUMB(magic);
+ 		/*
+ 		 * Handle the various types of nodes in
+ 		 * the select list of this query
+ 		 */
+ 		LeafNode((Node*) e, size, i, tree->rtable);
+ 	}
+ 	/* a list of SortGroupClause's */
+ 	foreach(l, tree->groupClause)
+ 	{
+ 		SortGroupClause *gc = (SortGroupClause *) lfirst(l);
+ 		APP_JUMB(gc->tleSortGroupRef);
+ 		APP_JUMB(gc->nulls_first);
+ 	}
+ 
+ 	if (tree->havingQual)
+ 	{
+ 		if (IsA(tree->havingQual, OpExpr))
+ 		{
+ 			OpExpr *na = (OpExpr *) tree->havingQual;
+ 			QualsNode(na, size, i, tree->rtable);
+ 		}
+ 		else
+ 		{
+ 			Node *n = (Node*) tree->havingQual;
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 	}
+ 
+ 	foreach(l, tree->windowClause)
+ 	{
+ 		WindowClause *wc = (WindowClause *) lfirst(l);
+ 		ListCell     *il;
+ 		APP_JUMB(wc->frameOptions);
+ 		foreach(il, wc->partitionClause)	/* PARTITION BY list */
+ 		{
+ 			Node *n = (Node *) lfirst(il);
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 		foreach(il, wc->orderClause)		/* ORDER BY list */
+ 		{
+ 			Node *n = (Node *) lfirst(il);
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 	}
+ 
+ 	foreach(l, tree->distinctClause)
+ 	{
+ 		SortGroupClause *dc = (SortGroupClause *) lfirst(l);
+ 		APP_JUMB(dc->tleSortGroupRef);
+ 		APP_JUMB(dc->nulls_first);
+ 	}
+ 
+ 	/* Don't look at tree->sortClause,
+ 	 * because the value ressortgroupref is already
+ 	 * serialized when we iterate through targetList
+ 	 */
+ 
+ 	if (off)
+ 		LimitOffsetNode((Node*) off, size, i, tree->rtable);
+ 
+ 	if (limcount)
+ 		LimitOffsetNode((Node*) limcount, size, i, tree->rtable);
+ 
+ 	if (tree->setOperations)
+ 	{
+ 		/*
+ 		 * set-operation tree if this is top
+ 		 * level of a UNION/INTERSECT/EXCEPT query
+ 		 */
+ 		SetOperationStmt *topop = (SetOperationStmt *) tree->setOperations;
+ 		APP_JUMB(topop->op);
+ 		APP_JUMB(topop->all);
+ 
+ 		/* leaf selects are RTE subselections */
+ 		foreach(l, tree->rtable)
+ 		{
+ 			RangeTblEntry *r = (RangeTblEntry *) lfirst(l);
+ 			if (r->subquery)
+ 				PerformJumble(r->subquery, size, i);
+ 		}
+ 	}
+ 	pgss_rangetbl_stack = list_delete_ptr(pgss_rangetbl_stack,
+ 			list_nth(pgss_rangetbl_stack, pgss_rangetbl_stack->length - 1));
+ }
+ 
+ /*
+  * Perform selective serialization of "Quals" nodes when
+  * they're IsA(*, OpExpr)
+  */
+ static void
+ QualsNode(const OpExpr *node, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	APP_JUMB(node->xpr);
+ 	APP_JUMB(node->opno);
+ 	foreach(l, node->args)
+ 	{
+ 		Node *arg = (Node *) lfirst(l);
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * LeafNode: Selectively serialize a selection of parser/prim nodes that are
+  * frequently, though certainly not necesssarily leaf nodes, such as Vars
+  * (columns), constants and function calls
+  */
+ static void
+ LeafNode(const Node *arg, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	/* Use the node's NodeTag as a magic number */
+ 	APP_JUMB(arg->type);
+ 
+ 	if (IsA(arg, Const))
+ 	{
+ 		Const *c = (Const *) arg;
+ 
+ 		/*
+ 		 * Datatype of the constant is a differentiator
+ 		 */
+ 		APP_JUMB(c->consttype);
+ 		RecordConstLocation(c->location);
+ 	}
+ 	else if(IsA(arg, CoerceToDomain))
+ 	{
+ 		CoerceToDomain *cd = (CoerceToDomain*) arg;
+ 		/*
+ 		 * Datatype of the constant is a
+ 		 * differentiator
+ 		 */
+ 		APP_JUMB(cd->resulttype);
+ 		LeafNode((Node*) cd->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Var))
+ 	{
+ 		Var			  *v = (Var *) arg;
+ 		RangeTblEntry *rte;
+ 		ListCell *lc;
+ 
+ 		/*
+ 		 * We need to get the details of the rangetable, but rtable may not
+ 		 * refer to the relevant one if we're in a subselection.
+ 		 */
+ 		if (v->varlevelsup == 0)
+ 		{
+ 			rte = rt_fetch(v->varno, rtable);
+ 		}
+ 		else
+ 		{
+ 			List *rtable_upper = list_nth(pgss_rangetbl_stack,
+ 					(list_length(pgss_rangetbl_stack) - 1) - v->varlevelsup);
+ 			rte = rt_fetch(v->varno, rtable_upper);
+ 		}
+ 		APP_JUMB(rte->relid);
+ 
+ 		foreach(lc, rte->values_lists)
+ 		{
+ 			List	   *sublist = (List *) lfirst(lc);
+ 			ListCell   *lc2;
+ 
+ 			foreach(lc2, sublist)
+ 			{
+ 				Node	   *col = (Node *) lfirst(lc2);
+ 				LeafNode(col, size, i, rtable);
+ 			}
+ 		}
+ 		APP_JUMB(v->varattno);
+ 	}
+ 	else if (IsA(arg, CurrentOfExpr))
+ 	{
+ 		CurrentOfExpr *CoE = (CurrentOfExpr*) arg;
+ 		APP_JUMB(CoE->cvarno);
+ 		APP_JUMB(CoE->cursor_param);
+ 	}
+ 	else if (IsA(arg, CollateExpr))
+ 	{
+ 		CollateExpr *Ce = (CollateExpr*) arg;
+ 		APP_JUMB(Ce->collOid);
+ 	}
+ 	else if (IsA(arg, FieldSelect))
+ 	{
+ 		FieldSelect *Fs = (FieldSelect*) arg;
+ 		APP_JUMB(Fs->resulttype);
+ 		LeafNode((Node*) Fs->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, NamedArgExpr))
+ 	{
+ 		NamedArgExpr *Nae = (NamedArgExpr*) arg;
+ 		APP_JUMB(Nae->argnumber);
+ 		LeafNode((Node*) Nae->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Param))
+ 	{
+ 		Param *p = ((Param *) arg);
+ 		APP_JUMB(p->paramkind);
+ 		APP_JUMB(p->paramid);
+ 	}
+ 	else if (IsA(arg, RelabelType))
+ 	{
+ 		RelabelType *rt = (RelabelType*) arg;
+ 		APP_JUMB(rt->resulttype);
+ 		LeafNode((Node*) rt->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, WindowFunc))
+ 	{
+ 		WindowFunc *wf = (WindowFunc *) arg;
+ 		APP_JUMB(wf->winfnoid);
+ 		foreach(l, wf->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, FuncExpr))
+ 	{
+ 		FuncExpr *f = (FuncExpr *) arg;
+ 		APP_JUMB(f->funcid);
+ 		foreach(l, f->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, OpExpr) || IsA(arg, DistinctExpr))
+ 	{
+ 		QualsNode((OpExpr*) arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, CoerceViaIO))
+ 	{
+ 		CoerceViaIO *Cio = (CoerceViaIO*) arg;
+ 		APP_JUMB(Cio->coerceformat);
+ 		APP_JUMB(Cio->resulttype);
+ 		LeafNode((Node*) Cio->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Aggref))
+ 	{
+ 		Aggref *a =  (Aggref *) arg;
+ 		APP_JUMB(a->aggfnoid);
+ 		foreach(l, a->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SubLink))
+ 	{
+ 		SubLink *s = (SubLink*) arg;
+ 		APP_JUMB(s->subLinkType);
+ 		/* Serialize select-list subselect recursively */
+ 		if (s->subselect)
+ 			PerformJumble((Query*) s->subselect, size, i);
+ 
+ 		if (s->testexpr)
+ 			LeafNode((Node*) s->testexpr, size, i, rtable);
+ 		foreach(l, s->operName)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, TargetEntry))
+ 	{
+ 		TargetEntry *rt = (TargetEntry *) arg;
+ 		Node *e = (Node*) rt->expr;
+ 		APP_JUMB(rt->resorigtbl);
+ 		APP_JUMB(rt->ressortgroupref);
+ 		LeafNode(e, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, BoolExpr))
+ 	{
+ 		BoolExpr *be = (BoolExpr *) arg;
+ 		APP_JUMB(be->boolop);
+ 		foreach(l, be->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, NullTest))
+ 	{
+ 		NullTest *nt = (NullTest *) arg;
+ 		Node     *arg = (Node *) nt->arg;
+ 		APP_JUMB(nt->nulltesttype);		/* IS NULL, IS NOT NULL */
+ 		APP_JUMB(nt->argisrow);			/* is input a composite type ? */
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, ArrayExpr))
+ 	{
+ 		ArrayExpr *ae = (ArrayExpr *) arg;
+ 		APP_JUMB(ae->array_typeid);		/* type of expression result */
+ 		APP_JUMB(ae->element_typeid);	/* common type of array elements */
+ 		foreach(l, ae->elements)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CaseExpr))
+ 	{
+ 		CaseExpr *ce = (CaseExpr*) arg;
+ 		Assert(ce->casetype != InvalidOid);
+ 		APP_JUMB(ce->casetype);
+ 		foreach(l, ce->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		if (ce->arg)
+ 			LeafNode((Node*) ce->arg, size, i, rtable);
+ 
+ 		if (ce->defresult)
+ 		{
+ 			/* Default result (ELSE clause).
+ 			 *
+ 			 * May be NULL, because no else clause
+ 			 * was actually specified, and thus the value is
+ 			 * equivalent to SQL ELSE NULL
+ 			 */
+ 			LeafNode((Node*) ce->defresult, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CaseTestExpr))
+ 	{
+ 		CaseTestExpr *ct = (CaseTestExpr*) arg;
+ 		APP_JUMB(ct->typeId);
+ 	}
+ 	else if (IsA(arg, CaseWhen))
+ 	{
+ 		CaseWhen *cw = (CaseWhen*) arg;
+ 		Node     *res = (Node*) cw->result;
+ 		Node     *exp = (Node*) cw->expr;
+ 		if (res)
+ 			LeafNode(res, size, i, rtable);
+ 		if (exp)
+ 			LeafNode(exp, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, MinMaxExpr))
+ 	{
+ 		MinMaxExpr *cw = (MinMaxExpr*) arg;
+ 		APP_JUMB(cw->minmaxtype);
+ 		APP_JUMB(cw->op);
+ 		foreach(l, cw->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, ScalarArrayOpExpr))
+ 	{
+ 		ScalarArrayOpExpr *sa = (ScalarArrayOpExpr*) arg;
+ 		APP_JUMB(sa->opfuncid);
+ 		APP_JUMB(sa->useOr);
+ 		foreach(l, sa->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CoalesceExpr))
+ 	{
+ 		CoalesceExpr *ca = (CoalesceExpr*) arg;
+ 		foreach(l, ca->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, ArrayCoerceExpr))
+ 	{
+ 		ArrayCoerceExpr *ac = (ArrayCoerceExpr *) arg;
+ 		LeafNode((Node*) ac->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, WindowClause))
+ 	{
+ 		WindowClause *wc = (WindowClause*) arg;
+ 		foreach(l, wc->partitionClause)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, wc->orderClause)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SortGroupClause))
+ 	{
+ 		SortGroupClause *sgc = (SortGroupClause*) arg;
+ 		APP_JUMB(sgc->tleSortGroupRef);
+ 		APP_JUMB(sgc->nulls_first);
+ 	}
+ 	else if (IsA(arg, Integer) ||
+ 		  IsA(arg, Float) ||
+ 		  IsA(arg, String) ||
+ 		  IsA(arg, BitString) ||
+ 		  IsA(arg, Null)
+ 		)
+ 	{
+ 		/* It is not necessary to serialize Value nodes - they are seen when
+ 		 * aliases are used, which are ignored.
+ 		 */
+ 		return;
+ 	}
+ 	else if (IsA(arg, BooleanTest))
+ 	{
+ 		BooleanTest *bt = (BooleanTest *) arg;
+ 		APP_JUMB(bt->booltesttype);
+ 		LeafNode((Node*) bt->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, ArrayRef))
+ 	{
+ 		ArrayRef *ar = (ArrayRef*) arg;
+ 		APP_JUMB(ar->refarraytype);
+ 		foreach(l, ar->refupperindexpr)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, ar->reflowerindexpr)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		if (ar->refexpr)
+ 			LeafNode((Node*) ar->refexpr, size, i, rtable);
+ 		if (ar->refassgnexpr)
+ 			LeafNode((Node*) ar->refassgnexpr, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, NullIfExpr))
+ 	{
+ 		/* NullIfExpr is just a typedef for OpExpr */
+ 		QualsNode((OpExpr*) arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, RowExpr))
+ 	{
+ 		RowExpr *re = (RowExpr*) arg;
+ 		APP_JUMB(re->row_format);
+ 		foreach(l, re->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 
+ 	}
+ 	else if (IsA(arg, XmlExpr))
+ 	{
+ 		XmlExpr *xml = (XmlExpr*) arg;
+ 		APP_JUMB(xml->op);
+ 		foreach(l, xml->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		/* non-XML expressions for xml_attributes */
+ 		foreach(l, xml->named_args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		/* parallel list of Value strings */
+ 		foreach(l, xml->arg_names)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, RowCompareExpr))
+ 	{
+ 		RowCompareExpr *rc = (RowCompareExpr*) arg;
+ 		foreach(l, rc->largs)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, rc->rargs)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SetToDefault))
+ 	{
+ 		SetToDefault *sd = (SetToDefault*) arg;
+ 		APP_JUMB(sd->typeId);
+ 		APP_JUMB(sd->typeMod);
+ 	}
+ 	else if (IsA(arg, ConvertRowtypeExpr))
+ 	{
+ 		ConvertRowtypeExpr* Cr = (ConvertRowtypeExpr*) arg;
+ 		APP_JUMB(Cr->convertformat);
+ 		APP_JUMB(Cr->resulttype);
+ 		LeafNode((Node*) Cr->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, FieldStore))
+ 	{
+ 		FieldStore* Fs = (FieldStore*) arg;
+ 		LeafNode((Node*) Fs->arg, size, i, rtable);
+ 		foreach(l, Fs->newvals)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else
+ 	{
+ 		ereport(WARNING,
+ 				(errcode(ERRCODE_INTERNAL_ERROR),
+ 				 errmsg("unexpected, unrecognised LeafNode node type: %d",
+ 					 (int) nodeTag(arg))));
+ 	}
+ }
+ 
+ /*
+  * Perform selective serialization of limit or offset nodes
+  */
+ static void
+ LimitOffsetNode(const Node *node, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	unsigned char magic = MAG_LIMIT_OFFSET;
+ 	APP_JUMB(magic);
+ 
+ 	if (IsA(node, FuncExpr))
+ 	{
+ 
+ 		foreach(l, ((FuncExpr*) node)->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else
+ 	{
+ 		/* Fall back on leaf node representation */
+ 		LeafNode(node, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * JoinExprNode: Perform selective serialization of JoinExpr nodes
+  */
+ static void
+ JoinExprNode(JoinExpr *node, Size size, Size *i, List *rtable)
+ {
+ 	Node	 *larg = node->larg;	/* left subtree */
+ 	Node	 *rarg = node->rarg;	/* right subtree */
+ 	ListCell *l;
+ 
+ 	Assert( IsA(node, JoinExpr));
+ 
+ 	APP_JUMB(node->jointype);
+ 	APP_JUMB(node->isNatural);
+ 
+ 	if (node->quals)
+ 	{
+ 		if ( IsA(node, OpExpr))
+ 		{
+ 			QualsNode((OpExpr*) node->quals, size, i, rtable);
+ 		}
+ 		else
+ 		{
+ 			LeafNode((Node*) node->quals, size, i, rtable);
+ 		}
+ 	}
+ 	foreach(l, node->usingClause) /* USING clause, if any (list of String) */
+ 	{
+ 		Node *arg = (Node *) lfirst(l);
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ 	if (larg)
+ 		JoinExprNodeChild(larg, size, i, rtable);
+ 	if (rarg)
+ 		JoinExprNodeChild(rarg, size, i, rtable);
+ }
+ 
+ /*
+  * JoinExprNodeChild: Serialize children of the JoinExpr node
+  */
+ static void
+ JoinExprNodeChild(const Node *node, Size size, Size *i, List *rtable)
+ {
+ 	if (IsA(node, RangeTblRef))
+ 	{
+ 		RangeTblRef   *rt = (RangeTblRef*) node;
+ 		RangeTblEntry *rte = rt_fetch(rt->rtindex, rtable);
+ 		ListCell      *l;
+ 
+ 		APP_JUMB(rte->relid);
+ 		APP_JUMB(rte->jointype);
+ 
+ 		if (rte->subquery)
+ 			PerformJumble((Query*) rte->subquery, size, i);
+ 
+ 		foreach(l, rte->joinaliasvars)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(node, JoinExpr))
+ 	{
+ 		JoinExprNode((JoinExpr*) node, size, i, rtable);
+ 	}
+ 	else
+ 	{
+ 		LeafNode(node, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * Record location of constant within query string of query tree that is
+  * currently being walked.
+  */
+ static void
+ RecordConstLocation(int location)
+ {
+ 	/* -1 indicates unknown or undefined location */
+ 	if (location > 0)
+ 	{
+ 		if (last_offset_num >= last_offset_buf_size)
+ 		{
+ 			last_offset_buf_size *= 2;
+ 			last_offsets = repalloc(last_offsets,
+ 							last_offset_buf_size *
+ 							sizeof(pgssLocationLen));
+ 
+ 		}
+ 		last_offsets[last_offset_num++].location = location;
+ 	}
+ }
+ 
+ /*
   * ExecutorStart hook: start up tracking if needed
   */
  static void
*************** pgss_ExecutorEnd(QueryDesc *queryDesc)
*** 587,592 ****
--- 1707,1717 ----
  {
  	if (queryDesc->totaltime && pgss_enabled())
  	{
+ 		uint32 queryId;
+ 		if (pgss_string_key)
+ 			queryId = pgss_hash_string(queryDesc->sourceText);
+ 		else
+ 			queryId = queryDesc->plannedstmt->queryId;
  		/*
  		 * Make sure stats accumulation is done.  (Note: it's okay if several
  		 * levels of hook all do this.)
*************** pgss_ExecutorEnd(QueryDesc *queryDesc)
*** 594,602 ****
  		InstrEndLoop(queryDesc->totaltime);
  
  		pgss_store(queryDesc->sourceText,
! 				   queryDesc->totaltime->total,
! 				   queryDesc->estate->es_processed,
! 				   &queryDesc->totaltime->bufusage);
  	}
  
  	if (prev_ExecutorEnd)
--- 1719,1731 ----
  		InstrEndLoop(queryDesc->totaltime);
  
  		pgss_store(queryDesc->sourceText,
! 		   queryId,
! 		   queryDesc->totaltime->total,
! 		   queryDesc->estate->es_processed,
! 		   &queryDesc->totaltime->bufusage,
! 		   false,
! 		   false);
! 
  	}
  
  	if (prev_ExecutorEnd)
*************** pgss_ProcessUtility(Node *parsetree, con
*** 618,623 ****
--- 1747,1753 ----
  		instr_time	start;
  		instr_time	duration;
  		uint64		rows = 0;
+ 		uint32		queryId;
  		BufferUsage bufusage;
  
  		bufusage = pgBufferUsage;
*************** pgss_ProcessUtility(Node *parsetree, con
*** 671,678 ****
  		bufusage.temp_blks_written =
  			pgBufferUsage.temp_blks_written - bufusage.temp_blks_written;
  
! 		pgss_store(queryString, INSTR_TIME_GET_DOUBLE(duration), rows,
! 				   &bufusage);
  	}
  	else
  	{
--- 1801,1811 ----
  		bufusage.temp_blks_written =
  			pgBufferUsage.temp_blks_written - bufusage.temp_blks_written;
  
! 		queryId = pgss_hash_string(queryString);
! 
! 		/* In the case of utility statements, hash the query string directly */
! 		pgss_store(queryString, queryId,
! 				INSTR_TIME_GET_DOUBLE(duration), rows, &bufusage, false, false);
  	}
  	else
  	{
*************** pgss_hash_fn(const void *key, Size keysi
*** 696,703 ****
  	/* we don't bother to include encoding in the hash */
  	return hash_uint32((uint32) k->userid) ^
  		hash_uint32((uint32) k->dbid) ^
! 		DatumGetUInt32(hash_any((const unsigned char *) k->query_ptr,
! 								k->query_len));
  }
  
  /*
--- 1829,1835 ----
  	/* we don't bother to include encoding in the hash */
  	return hash_uint32((uint32) k->userid) ^
  		hash_uint32((uint32) k->dbid) ^
! 		hash_uint32((uint32) k->queryid);
  }
  
  /*
*************** pgss_match_fn(const void *key1, const vo
*** 712,733 ****
  	if (k1->userid == k2->userid &&
  		k1->dbid == k2->dbid &&
  		k1->encoding == k2->encoding &&
! 		k1->query_len == k2->query_len &&
! 		memcmp(k1->query_ptr, k2->query_ptr, k1->query_len) == 0)
  		return 0;
  	else
  		return 1;
  }
  
  /*
   * Store some statistics for a statement.
   */
  static void
! pgss_store(const char *query, double total_time, uint64 rows,
! 		   const BufferUsage *bufusage)
  {
  	pgssHashKey key;
  	double		usage;
  	pgssEntry  *entry;
  
  	Assert(query != NULL);
--- 1844,1889 ----
  	if (k1->userid == k2->userid &&
  		k1->dbid == k2->dbid &&
  		k1->encoding == k2->encoding &&
! 		k1->queryid == k2->queryid)
  		return 0;
  	else
  		return 1;
  }
  
  /*
+  * Given an arbitrarily long query string, produce a hash for the purposes of
+  * identifying the query, without canonicalizing constants. Used when hashing
+  * utility statements, or for legacy compatibility mode.
+  */
+ static uint32
+ pgss_hash_string(const char* str)
+ {
+ 	/* For additional protection against collisions, including magic value */
+ 	char magic = MAG_STR_BUF;
+ 	uint32 result;
+ 	Size size = sizeof(magic) + strlen(str);
+ 	unsigned char* p = palloc(size);
+ 	memcpy(p, &magic, sizeof(magic));
+ 	memcpy(p + sizeof(magic), str, strlen(str));
+ 	result = hash_any((const unsigned char *) p, size);
+ 	pfree(p);
+ 	return result;
+ }
+ 
+ /*
   * Store some statistics for a statement.
   */
  static void
! pgss_store(const char *query, uint32 queryId,
! 				double total_time, uint64 rows,
! 				const BufferUsage *bufusage,
! 				bool empty_entry,
! 				bool canonicalize)
  {
  	pgssHashKey key;
  	double		usage;
+ 	int		    new_query_len = strlen(query);
+ 	char	   *norm_query = NULL;
  	pgssEntry  *entry;
  
  	Assert(query != NULL);
*************** pgss_store(const char *query, double tot
*** 740,773 ****
  	key.userid = GetUserId();
  	key.dbid = MyDatabaseId;
  	key.encoding = GetDatabaseEncoding();
! 	key.query_len = strlen(query);
! 	if (key.query_len >= pgss->query_size)
! 		key.query_len = pg_encoding_mbcliplen(key.encoding,
  											  query,
! 											  key.query_len,
  											  pgss->query_size - 1);
- 	key.query_ptr = query;
  
! 	usage = USAGE_EXEC(duration);
  
  	/* Lookup the hash table entry with shared lock. */
  	LWLockAcquire(pgss->lock, LW_SHARED);
  
- 	entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
  	if (!entry)
  	{
! 		/* Must acquire exclusive lock to add a new entry. */
! 		LWLockRelease(pgss->lock);
! 		LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 		entry = entry_alloc(&key);
  	}
  
! 	/* Grab the spinlock while updating the counters. */
  	{
  		volatile pgssEntry *e = (volatile pgssEntry *) entry;
  
  		SpinLockAcquire(&e->mutex);
! 		e->counters.calls += 1;
  		e->counters.total_time += total_time;
  		e->counters.rows += rows;
  		e->counters.shared_blks_hit += bufusage->shared_blks_hit;
--- 1896,2045 ----
  	key.userid = GetUserId();
  	key.dbid = MyDatabaseId;
  	key.encoding = GetDatabaseEncoding();
! 	key.queryid = queryId;
! 
! 	if (new_query_len >= pgss->query_size)
! 		/* We don't have to worry about this later, because canonicalization
! 		 * cannot possibly result in a longer query string
! 		 */
! 		new_query_len = pg_encoding_mbcliplen(key.encoding,
  											  query,
! 											  new_query_len,
  											  pgss->query_size - 1);
  
! 	entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
! 
! 	/*
! 	 * When just initializing an entry and putting counters at zero, make it
! 	 * artificially sticky so that it will probably still be there when
! 	 * executed. Strictly speaking, query strings are canonicalized on a
! 	 * best effort basis, though it would be difficult to demonstrate this even
! 	 * under artificial conditions.
! 	 */
! 	if (empty_entry && !entry)
! 		usage = USAGE_NON_EXEC_STICK;
! 	else
! 		usage = USAGE_EXEC(duration);
  
  	/* Lookup the hash table entry with shared lock. */
  	LWLockAcquire(pgss->lock, LW_SHARED);
  
  	if (!entry)
  	{
! 		/*
! 		 * Generate a normalized version of the query string that will be used
! 		 * to represent the entry.
! 		 *
! 		 * Note that the representation seen by the user will only have
! 		 * non-differentiating Const tokens swapped with '?' characters, and
! 		 * this does not for example take account of the fact that alias names
! 		 * could vary between successive calls of what is regarded as the same
! 		 * query, or that whitespace could vary.
! 		 */
! 		if (last_offset_num > 0 && canonicalize)
! 		{
! 			int i,
! 			  off = 0,				/* Offset from start for cur tok */
! 			  tok_len = 0,			/* Length (in bytes) of that tok */
! 			  quer_it = 0,			/* Original query byte iterator */
! 			  n_quer_it = 0,		/* Normalized query byte iterator */
! 			  len_to_wrt = 0,		/* Length (in bytes) to write */
! 			  last_off = 0,			/* Offset from start for last iter's tok */
! 			  last_tok_len = 0,		/* Length (in bytes) of that tok */
! 			  tok_len_delta = 0;	/* Finished str is n bytes shorter so far */
! 
! 			/* Fill-in constant lengths - core system only gives us locations */
! 			fill_in_constant_lengths(query, last_offsets, last_offset_num);
! 
! 			norm_query = palloc0(new_query_len + 1);
! 
! 			for(i = 0; i < last_offset_num; i++)
! 			{
! 				if(last_offsets[i].length == -1)
! 					continue; /* don't assume that there's no duplicates */
! 
! 				off = last_offsets[i].location;
! 				tok_len = last_offsets[i].length;
! 				len_to_wrt = off - last_off;
! 				len_to_wrt -= last_tok_len;
! 				/* -1 for the '?' char: */
! 				tok_len_delta += tok_len - 1;
! 
! 				Assert(tok_len > 0);
! 				Assert(len_to_wrt >= 0);
! 				/*
! 				 * Each iteration copies everything prior to the current
! 				 * offset/token to be replaced, except bytes copied in
! 				 * previous iterations
! 				 */
! 				if (off - tok_len_delta + tok_len > new_query_len)
! 				{
! 					if (off - tok_len_delta < new_query_len)
! 					{
! 						len_to_wrt = new_query_len - n_quer_it;
! 						/* Out of space entirely - copy as much as possible */
! 						memcpy(norm_query + n_quer_it, query + quer_it,
! 								len_to_wrt);
! 						n_quer_it += len_to_wrt;
! 						quer_it += len_to_wrt + tok_len;
! 					}
! 					break;
! 				}
! 				memcpy(norm_query + n_quer_it, query + quer_it, len_to_wrt);
! 
! 				n_quer_it += len_to_wrt;
! 				if (n_quer_it < new_query_len)
! 					norm_query[n_quer_it++] = '?';
! 				quer_it += len_to_wrt + tok_len;
! 				last_off = off;
! 				last_tok_len = tok_len;
! 			}
! 			/*
! 			 * We've copied up until the last canonicalized constant. Copy over
! 			 * the remaining bytes of the original query string.
! 			 */
! 			memcpy(norm_query + n_quer_it, query + quer_it,
! 					new_query_len - n_quer_it);
! 
! 			/*
! 			 * Must acquire exclusive lock to add a new entry.
! 			 * Leave that until as late as possible.
! 			 */
! 			LWLockRelease(pgss->lock);
! 			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 
! 			entry = entry_alloc(&key, norm_query, new_query_len);
! 		}
! 		else
! 		{
! 			/* Acquire exclusive lock as required by entry_alloc() */
! 			LWLockRelease(pgss->lock);
! 			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 
! 			entry = entry_alloc(&key, query, new_query_len);
! 		}
  	}
  
! 	/*
! 	 * Grab the spinlock while updating the counters, if we're not just here to
! 	 * canonicalize.
! 	 */
  	{
  		volatile pgssEntry *e = (volatile pgssEntry *) entry;
  
  		SpinLockAcquire(&e->mutex);
! 		if (!empty_entry)
! 		{
! 			/*
! 			 * If necessary, "unstick" previously stuck query entry that just
! 			 * held a normalized query string, and then increment calls.
! 			 */
! 			if (e->counters.calls == 0)
! 				e->counters.usage = USAGE_INIT;
! 
! 			e->counters.calls += 1;
! 		}
! 
  		e->counters.total_time += total_time;
  		e->counters.rows += rows;
  		e->counters.shared_blks_hit += bufusage->shared_blks_hit;
*************** pgss_store(const char *query, double tot
*** 785,790 ****
--- 2057,2064 ----
  	}
  
  	LWLockRelease(pgss->lock);
+ 	if (norm_query)
+ 		pfree(norm_query);
  }
  
  /*
*************** pg_stat_statements(PG_FUNCTION_ARGS)
*** 875,881 ****
  
  			qstr = (char *)
  				pg_do_encoding_conversion((unsigned char *) entry->query,
! 										  entry->key.query_len,
  										  entry->key.encoding,
  										  GetDatabaseEncoding());
  			values[i++] = CStringGetTextDatum(qstr);
--- 2149,2155 ----
  
  			qstr = (char *)
  				pg_do_encoding_conversion((unsigned char *) entry->query,
! 										  entry->query_len,
  										  entry->key.encoding,
  										  GetDatabaseEncoding());
  			values[i++] = CStringGetTextDatum(qstr);
*************** pg_stat_statements(PG_FUNCTION_ARGS)
*** 893,898 ****
--- 2167,2175 ----
  			tmp = e->counters;
  			SpinLockRelease(&e->mutex);
  		}
+ 		/* Skip record of unexecuted query */
+ 		if (tmp.calls == 0)
+ 			continue;
  
  		values[i++] = Int64GetDatumFast(tmp.calls);
  		values[i++] = Float8GetDatumFast(tmp.total_time);
*************** pgss_memsize(void)
*** 950,963 ****
   * have made the entry while we waited to get exclusive lock.
   */
  static pgssEntry *
! entry_alloc(pgssHashKey *key)
  {
  	pgssEntry  *entry;
  	bool		found;
  
- 	/* Caller must have clipped query properly */
- 	Assert(key->query_len < pgss->query_size);
- 
  	/* Make space if needed */
  	while (hash_get_num_entries(pgss_hash) >= pgss_max)
  		entry_dealloc();
--- 2227,2237 ----
   * have made the entry while we waited to get exclusive lock.
   */
  static pgssEntry *
! entry_alloc(pgssHashKey *key, const char* query, int new_query_len)
  {
  	pgssEntry  *entry;
  	bool		found;
  
  	/* Make space if needed */
  	while (hash_get_num_entries(pgss_hash) >= pgss_max)
  		entry_dealloc();
*************** entry_alloc(pgssHashKey *key)
*** 969,985 ****
  	{
  		/* New entry, initialize it */
  
! 		/* dynahash tried to copy the key for us, but must fix query_ptr */
! 		entry->key.query_ptr = entry->query;
  		/* reset the statistics */
  		memset(&entry->counters, 0, sizeof(Counters));
  		entry->counters.usage = USAGE_INIT;
  		/* re-initialize the mutex each time ... we assume no one using it */
  		SpinLockInit(&entry->mutex);
  		/* ... and don't forget the query text */
! 		memcpy(entry->query, key->query_ptr, key->query_len);
! 		entry->query[key->query_len] = '\0';
  	}
  
  	return entry;
  }
--- 2243,2262 ----
  	{
  		/* New entry, initialize it */
  
! 		entry->query_len = new_query_len;
! 		Assert(entry->query_len > 0);
  		/* reset the statistics */
  		memset(&entry->counters, 0, sizeof(Counters));
  		entry->counters.usage = USAGE_INIT;
  		/* re-initialize the mutex each time ... we assume no one using it */
  		SpinLockInit(&entry->mutex);
  		/* ... and don't forget the query text */
! 		memcpy(entry->query, query, entry->query_len);
! 		Assert(new_query_len <= pgss->query_size);
! 		entry->query[entry->query_len] = '\0';
  	}
+ 	/* Caller must have clipped query properly */
+ 	Assert(entry->query_len < pgss->query_size);
  
  	return entry;
  }
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index cc3168d..84483ce
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copyPlannedStmt(const PlannedStmt *from
*** 92,97 ****
--- 92,98 ----
  	COPY_NODE_FIELD(relationOids);
  	COPY_NODE_FIELD(invalItems);
  	COPY_SCALAR_FIELD(nParamExec);
+ 	COPY_SCALAR_FIELD(queryId);
  
  	return newnode;
  }
*************** _copyQuery(const Query *from)
*** 2415,2420 ****
--- 2416,2422 ----
  
  	COPY_SCALAR_FIELD(commandType);
  	COPY_SCALAR_FIELD(querySource);
+ 	COPY_SCALAR_FIELD(queryId);
  	COPY_SCALAR_FIELD(canSetTag);
  	COPY_NODE_FIELD(utilityStmt);
  	COPY_SCALAR_FIELD(resultRelation);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
new file mode 100644
index 2295195..ce75da3
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 83,88 ****
--- 83,91 ----
  #define COMPARE_LOCATION_FIELD(fldname) \
  	((void) 0)
  
+ /* Compare a query_id field (this is a no-op, per note above) */
+ #define COMPARE_QUERYID_FIELD(fldname) \
+ 	((void) 0)
  
  /*
   *	Stuff from primnodes.h
*************** _equalQuery(const Query *a, const Query
*** 897,902 ****
--- 900,906 ----
  {
  	COMPARE_SCALAR_FIELD(commandType);
  	COMPARE_SCALAR_FIELD(querySource);
+ 	COMPARE_QUERYID_FIELD(query_id);
  	COMPARE_SCALAR_FIELD(canSetTag);
  	COMPARE_NODE_FIELD(utilityStmt);
  	COMPARE_SCALAR_FIELD(resultRelation);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
new file mode 100644
index 829f6d4..9646125
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 81,86 ****
--- 81,90 ----
  #define WRITE_LOCATION_FIELD(fldname) \
  	appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname)
  
+ /* Write a query id field */
+ #define WRITE_QUERYID_FIELD(fldname) \
+ 	((void) 0)
+ 
  /* Write a Node field */
  #define WRITE_NODE_FIELD(fldname) \
  	(appendStringInfo(str, " :" CppAsString(fldname) " "), \
*************** _outPlannedStmt(StringInfo str, const Pl
*** 255,260 ****
--- 259,265 ----
  	WRITE_NODE_FIELD(relationOids);
  	WRITE_NODE_FIELD(invalItems);
  	WRITE_INT_FIELD(nParamExec);
+ 	WRITE_QUERYID_FIELD(queryId);
  }
  
  /*
*************** _outQuery(StringInfo str, const Query *n
*** 2159,2164 ****
--- 2164,2170 ----
  
  	WRITE_ENUM_FIELD(commandType, CmdType);
  	WRITE_ENUM_FIELD(querySource, QuerySource);
+ 	WRITE_QUERYID_FIELD(query_id);
  	WRITE_BOOL_FIELD(canSetTag);
  
  	/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
new file mode 100644
index b9258ad..5ea0d52
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
***************
*** 110,115 ****
--- 110,119 ----
  	token = pg_strtok(&length);		/* get field value */ \
  	local_node->fldname = -1	/* set field to "unknown" */
  
+ /* Read a QueryId field - NO-OP */
+ #define READ_QUERYID_FIELD(fldname) \
+ 	((void) 0)
+ 
  /* Read a Node field */
  #define READ_NODE_FIELD(fldname) \
  	token = pg_strtok(&length);		/* skip :fldname */ \
*************** _readQuery(void)
*** 195,200 ****
--- 199,205 ----
  
  	READ_ENUM_FIELD(commandType, CmdType);
  	READ_ENUM_FIELD(querySource, QuerySource);
+ 	READ_QUERYID_FIELD(query_id);
  	READ_BOOL_FIELD(canSetTag);
  	READ_NODE_FIELD(utilityStmt);
  	READ_INT_FIELD(resultRelation);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 8bbe977..1b4030f
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** standard_planner(Query *parse, int curso
*** 240,245 ****
--- 240,246 ----
  	result->relationOids = glob->relationOids;
  	result->invalItems = glob->invalItems;
  	result->nParamExec = list_length(glob->paramlist);
+ 	result->queryId = parse->queryId;
  
  	return result;
  }
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
new file mode 100644
index b187b03..26d0d96
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
*************** static Query *transformExplainStmt(Parse
*** 65,70 ****
--- 65,72 ----
  static void transformLockingClause(ParseState *pstate, Query *qry,
  					   LockingClause *lc, bool pushedDown);
  
+ /* Hooks for plugins to get control of parse analysis */
+ post_parse_analyze_hook_type post_parse_analyze_hook = NULL;
  
  /*
   * parse_analyze
*************** parse_analyze(Node *parseTree, const cha
*** 95,100 ****
--- 97,105 ----
  
  	free_parsestate(pstate);
  
+ 	if (post_parse_analyze_hook)
+ 		(*post_parse_analyze_hook)(query, sourceText, numParams == 0);
+ 
  	return query;
  }
  
*************** parse_analyze_varparams(Node *parseTree,
*** 125,130 ****
--- 130,138 ----
  
  	free_parsestate(pstate);
  
+ 	if (post_parse_analyze_hook)
+ 		(*post_parse_analyze_hook)(query, sourceText, false);
+ 
  	return query;
  }
  
diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
new file mode 100644
index 6661a3d..1e04c0e
*** a/src/backend/parser/parse_coerce.c
--- b/src/backend/parser/parse_coerce.c
*************** coerce_type(ParseState *pstate, Node *no
*** 280,293 ****
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
! 		/* Use the leftmost of the constant's and coercion's locations */
! 		if (location < 0)
! 			newcon->location = con->location;
! 		else if (con->location >= 0 && con->location < location)
! 			newcon->location = con->location;
! 		else
! 			newcon->location = location;
! 
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
--- 280,286 ----
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
! 		newcon->location = con->location;
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
diff --git a/src/backend/parser/parse_param.c b/src/backend/parser/parse_param.c
new file mode 100644
index cfe7262..482861f
*** a/src/backend/parser/parse_param.c
--- b/src/backend/parser/parse_param.c
*************** variable_coerce_param_hook(ParseState *p
*** 238,248 ****
  		 */
  		param->paramcollid = get_typcollation(param->paramtype);
  
- 		/* Use the leftmost of the param's and coercion's locations */
- 		if (location >= 0 &&
- 			(param->location < 0 || location < param->location))
- 			param->location = location;
- 
  		return (Node *) param;
  	}
  
--- 238,243 ----
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
new file mode 100644
index 04f9622..42a1b30
*** a/src/backend/rewrite/rewriteHandler.c
--- b/src/backend/rewrite/rewriteHandler.c
*************** static List *
*** 1839,1844 ****
--- 1839,1845 ----
  RewriteQuery(Query *parsetree, List *rewrite_events)
  {
  	CmdType		event = parsetree->commandType;
+ 	uint32		orig_query_id = parsetree->queryId;
  	bool		instead = false;
  	bool		returning = false;
  	Query	   *qual_product = NULL;
*************** RewriteQuery(Query *parsetree, List *rew
*** 2141,2146 ****
--- 2142,2154 ----
  					 errmsg("WITH cannot be used in a query that is rewritten by rules into multiple queries")));
  	}
  
+ 	/* Mark rewritten queries with their originating queryId */
+ 	foreach(lc1, rewritten)
+ 	{
+ 		Query	   *q = (Query *) lfirst(lc1);
+ 		q->queryId = orig_query_id;
+ 	}
+ 
  	return rewritten;
  }
  
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
new file mode 100644
index 49a3969..87b2b0d
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*************** pg_analyze_and_rewrite_params(Node *pars
*** 631,636 ****
--- 631,640 ----
  	if (log_parser_stats)
  		ShowUsage("PARSE ANALYSIS STATISTICS");
  
+ 	/* Since we're not calling parse_analyze(), do this here */
+ 	if (post_parse_analyze_hook)
+ 		(*post_parse_analyze_hook)(query, query_string, false);
+ 
  	/*
  	 * (2) Rewrite the queries, as necessary
  	 */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
new file mode 100644
index 1d33ceb..9fb3c0f
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
*************** typedef struct Query
*** 103,108 ****
--- 103,111 ----
  
  	QuerySource querySource;	/* where did I come from? */
  
+ 	uint32		queryId;		/* query identifier that can be set by plugins.
+ 								 * Will be copied to resulting PlannedStmt. */
+ 
  	bool		canSetTag;		/* do I set the command result tag? */
  
  	Node	   *utilityStmt;	/* non-null if this is DECLARE CURSOR or a
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 7d90b91..3cec1be
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef struct PlannedStmt
*** 67,72 ****
--- 67,74 ----
  	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
  
  	int			nParamExec;		/* number of PARAM_EXEC Params used */
+ 
+ 	uint32		queryId;		/* query identifier carried from query tree */
  } PlannedStmt;
  
  /* macro for fetching the Plan associated with a SubPlan node */
diff --git a/src/include/parser/analyze.h b/src/include/parser/analyze.h
new file mode 100644
index b8987db..59635b0
*** a/src/include/parser/analyze.h
--- b/src/include/parser/analyze.h
***************
*** 16,21 ****
--- 16,25 ----
  
  #include "parser/parse_node.h"
  
+ /* Hook for plugins to get control in parse_analyze() */
+ typedef void (*post_parse_analyze_hook_type) (Query* post_analysis_tree,
+ 		const char *sourceText,	bool canonicalize);
+ extern PGDLLIMPORT post_parse_analyze_hook_type post_parse_analyze_hook;
  
  extern Query *parse_analyze(Node *parseTree, const char *sourceText,
  			  Oid *paramTypes, int numParams);

#52

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#51)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

On 22 March 2012 17:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Either way, I think we'd be a lot better advised to define a single
hook "post_parse_analysis_hook" and make the core code responsible
for calling it at the appropriate places, rather than supposing that
the contrib module knows exactly which core functions ought to be
the places to do it.

I have done this too.

The "canonicalize" argument to the proposed hook seems like a bit of a
crock. You've got the core code magically setting that in a way that
responds to extremely pg_stat_statements-specific concerns, and I am not
very sure it's right even for those concerns.

I am thinking that perhaps a reasonable signature for the hook function
would be

void post_parse_analyze (ParseState *pstate, Query *query);

with the expectation that it could dig whatever it wants to know out
of the ParseState (in particular the sourceText is available there,
and in general this should provide everything that's known at parse
time).

Now, if what it wants to know about is the parameterization status
of the query, things aren't ideal because most of the info is hidden
in parse-callback fields that aren't of globally exposed types. However
we could at least duplicate the behavior you have here, because you're
only passing canonicalize = true in cases where no parse callback will
be registered at all, so pg_stat_statements could equivalently test for
pstate->p_paramref_hook == NULL.

Thoughts, other ideas?

regards, tom lane

#53

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#52)

1 attachment(s)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 27 March 2012 18:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I am thinking that perhaps a reasonable signature for the hook function
would be

void post_parse_analyze (ParseState *pstate, Query *query);

with the expectation that it could dig whatever it wants to know out
of the ParseState (in particular the sourceText is available there,
and in general this should provide everything that's known at parse
time).

It seems reasonable to suggest that this will provide everything known
at parse time.

Now, if what it wants to know about is the parameterization status
of the query, things aren't ideal because most of the info is hidden
in parse-callback fields that aren't of globally exposed types. However
we could at least duplicate the behavior you have here, because you're
only passing canonicalize = true in cases where no parse callback will
be registered at all, so pg_stat_statements could equivalently test for
pstate->p_paramref_hook == NULL.

It has been suggested to me before that comparisons with function
pointers - using them as a flag, in effect - is generally iffy, but
that particular usage seems reasonable to me.

Attached is a revision with the suggested changes.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

pg_stat_statements_norm_2012_03_27.patchtext/x-patch; charset=US-ASCII; name=pg_stat_statements_norm_2012_03_27.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
new file mode 100644
index 914fbf2..b25cb71
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
***************
*** 10,15 ****
--- 10,38 ----
   * an entry, one must hold the lock shared or exclusive (so the entry doesn't
   * disappear!) and also take the entry's mutex spinlock.
   *
+  * As of Postgres 9.2, this module normalizes query entries. Normalization is a
+  * process whereby similar queries, typically differing only in their constants
+  * (though the exact rules are somewhat more subtle than that) are recognized as
+  * equivalent, and are tracked as a single entry. This is particularly useful
+  * for non-prepared queries.
+  *
+  * Normalization is implemented by fingerprinting queries, selectively
+  * serializing those fields of each query tree's nodes that are judged to be
+  * essential to the query.  This is referred to as a query jumble. This is
+  * distinct from a regular serialization, in that various extraneous information
+  * is ignored as irrelevant or not essential to the query, such as the collation
+  * of Vars, and, most notably, the value of constants - it isn't actually
+  * possible or desirable to deserialize.
+  *
+  * Once this jumble is acquired, a 32-bit hash is taken, which is copied back
+  * into the query tree at the post-analysis stage.  Postgres then naively copies
+  * this value around, making it later available from within the corresponding
+  * plan tree. The executor can then use this value to blame query costs on a
+  * known queryId.
+  *
+  * Within the executor hook, the module stores the cost of query execution,
+  * based on a queryId provided by the core system and some other values, within
+  * the shared hashtable.
   *
   * Copyright (c) 2008-2012, PostgreSQL Global Development Group
   *
***************
*** 27,38 ****
--- 50,65 ----
  #include "funcapi.h"
  #include "mb/pg_wchar.h"
  #include "miscadmin.h"
+ #include "parser/analyze.h"
+ #include "parser/parsetree.h"
+ #include "parser/scanner.h"
  #include "pgstat.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "storage/spin.h"
  #include "tcop/utility.h"
  #include "utils/builtins.h"
+ #include "utils/memutils.h"
  
  
  PG_MODULE_MAGIC;
*************** PG_MODULE_MAGIC;
*** 41,54 ****
  #define PGSS_DUMP_FILE	"global/pg_stat_statements.stat"
  
  /* This constant defines the magic number in the stats file header */
! static const uint32 PGSS_FILE_HEADER = 0x20100108;
  
  /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
  #define USAGE_EXEC(duration)	(1.0)
  #define USAGE_INIT				(1.0)	/* including initial planning */
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */
! 
  /*
   * Hashtable key that defines the identity of a hashtable entry.  The
   * hash comparators do not assume that the query string is null-terminated;
--- 68,87 ----
  #define PGSS_DUMP_FILE	"global/pg_stat_statements.stat"
  
  /* This constant defines the magic number in the stats file header */
! static const uint32 PGSS_FILE_HEADER = 0x20120103;
  
  /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
  #define USAGE_EXEC(duration)	(1.0)
  #define USAGE_INIT				(1.0)	/* including initial planning */
+ #define USAGE_NON_EXEC_STICK	(1.0e10)/* unexecuted queries sticky */
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */
! #define JUMBLE_SIZE				1024    /* query serialization buffer size */
! /* Magic values for jumble */
! #define MAG_HASH_BUF			0xFA	/* buffer is a hash of query tree */
! #define MAG_STR_BUF				0xEB	/* buffer is query string itself */
! #define MAG_RETURN_LIST			0xAE	/* returning list node follows */
! #define MAG_LIMIT_OFFSET		0xBA	/* limit/offset node follows */
  /*
   * Hashtable key that defines the identity of a hashtable entry.  The
   * hash comparators do not assume that the query string is null-terminated;
*************** typedef struct pgssHashKey
*** 63,70 ****
  	Oid			userid;			/* user OID */
  	Oid			dbid;			/* database OID */
  	int			encoding;		/* query encoding */
! 	int			query_len;		/* # of valid bytes in query string */
! 	const char *query_ptr;		/* query string proper */
  } pgssHashKey;
  
  /*
--- 96,102 ----
  	Oid			userid;			/* user OID */
  	Oid			dbid;			/* database OID */
  	int			encoding;		/* query encoding */
! 	uint32		queryid;		/* query identifier */
  } pgssHashKey;
  
  /*
*************** typedef struct pgssEntry
*** 97,102 ****
--- 129,135 ----
  {
  	pgssHashKey key;			/* hash key of entry - MUST BE FIRST */
  	Counters	counters;		/* the statistics for this query */
+ 	int			query_len;		/* # of valid bytes in query string */
  	slock_t		mutex;			/* protects the counters only */
  	char		query[1];		/* VARIABLE LENGTH ARRAY - MUST BE LAST */
  	/* Note: the allocated length of query[] is actually pgss->query_size */
*************** typedef struct pgssSharedState
*** 111,117 ****
--- 144,164 ----
  	int			query_size;		/* max query length in bytes */
  } pgssSharedState;
  
+ typedef struct pgssLocationLen
+ {
+ 	int location;
+ 	int length;
+ } pgssLocationLen;
+ 
  /*---- Local variables ----*/
+ /* Jumble of current query tree */
+ static unsigned char *last_jumble = NULL;
+ /* Buffer that represents position of normalized characters */
+ static pgssLocationLen *last_offsets = NULL;
+ /* Current Length of last_offsets buffer */
+ static Size last_offset_buf_size = 10;
+ /* Current number of actual offsets stored in last_offsets */
+ static Size last_offset_num = 0;
  
  /* Current nesting depth of ExecutorRun calls */
  static int	nested_level = 0;
*************** static ExecutorRun_hook_type prev_Execut
*** 123,133 ****
--- 170,188 ----
  static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
  static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
  static ProcessUtility_hook_type prev_ProcessUtility = NULL;
+ static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
  
  /* Links to shared memory state */
  static pgssSharedState *pgss = NULL;
  static HTAB *pgss_hash = NULL;
  
+ /*
+  * Maintain a stack of the rangetable of the query tree that we're currently
+  * walking, so subqueries can reference parent rangetables. The stack is pushed
+  * and popped as each Query struct is walked into or out of.
+  */
+ static List* pgss_rangetbl_stack = NIL;
+ 
  /*---- GUC variables ----*/
  
  typedef enum
*************** static int	pgss_max;			/* max # statemen
*** 149,154 ****
--- 204,210 ----
  static int	pgss_track;			/* tracking level */
  static bool pgss_track_utility; /* whether to track utility commands */
  static bool pgss_save;			/* whether to save stats across shutdown */
+ static bool pgss_string_key;	/* whether to always only hash query str */
  
  
  #define pgss_enabled() \
*************** PG_FUNCTION_INFO_V1(pg_stat_statements);
*** 168,173 ****
--- 224,244 ----
  
  static void pgss_shmem_startup(void);
  static void pgss_shmem_shutdown(int code, Datum arg);
+ static int comp_offset(const void *a, const void *b);
+ static void pgss_parse_analyze(ParseState *pstate, Query *post_analysis_tree);
+ static void pgss_process_post_analysis_tree(Query* post_analysis_tree,
+ 		const char* sourceText, bool canonicalize);
+ static void fill_in_constant_lengths(const char* query,
+ 						pgssLocationLen offs[], Size n_offs);
+ static uint32 JumbleQuery(Query *post_analysis_tree);
+ static void AppendJumb(unsigned char* item, unsigned char jumble[], Size size, Size *i);
+ static void PerformJumble(const Query *tree, Size size, Size *i);
+ static void QualsNode(const OpExpr *node, Size size, Size *i, List *rtable);
+ static void LeafNode(const Node *arg, Size size, Size *i, List *rtable);
+ static void LimitOffsetNode(const Node *node, Size size, Size *i, List *rtable);
+ static void JoinExprNode(JoinExpr *node, Size size, Size *i, List *rtable);
+ static void JoinExprNodeChild(const Node *node, Size size, Size *i, List *rtable);
+ static void RecordConstLocation(int location);
  static void pgss_ExecutorStart(QueryDesc *queryDesc, int eflags);
  static void pgss_ExecutorRun(QueryDesc *queryDesc,
  				 ScanDirection direction,
*************** static void pgss_ProcessUtility(Node *pa
*** 179,188 ****
  					DestReceiver *dest, char *completionTag);
  static uint32 pgss_hash_fn(const void *key, Size keysize);
  static int	pgss_match_fn(const void *key1, const void *key2, Size keysize);
! static void pgss_store(const char *query, double total_time, uint64 rows,
! 		   const BufferUsage *bufusage);
  static Size pgss_memsize(void);
! static pgssEntry *entry_alloc(pgssHashKey *key);
  static void entry_dealloc(void);
  static void entry_reset(void);
  
--- 250,261 ----
  					DestReceiver *dest, char *completionTag);
  static uint32 pgss_hash_fn(const void *key, Size keysize);
  static int	pgss_match_fn(const void *key1, const void *key2, Size keysize);
! static uint32 pgss_hash_string(const char* str);
! static void pgss_store(const char *query, uint32 queryId,
! 				double total_time, uint64 rows,
! 				const BufferUsage *bufusage, bool empty_entry, bool canonicalize);
  static Size pgss_memsize(void);
! static pgssEntry *entry_alloc(pgssHashKey *key, const char* query, int new_query_len);
  static void entry_dealloc(void);
  static void entry_reset(void);
  
*************** static void entry_reset(void);
*** 193,198 ****
--- 266,272 ----
  void
  _PG_init(void)
  {
+ 	MemoryContext oldcontext;
  	/*
  	 * In order to create our shared memory area, we have to be loaded via
  	 * shared_preload_libraries.  If not, fall out without hooking into any of
*************** _PG_init(void)
*** 254,259 ****
--- 328,348 ----
  							 NULL,
  							 NULL);
  
+ 	/*
+ 	 * Support legacy pg_stat_statements behavior, for compatibility with
+ 	 * versions shipped with Postgres 8.4, 9.0 and 9.1
+ 	 */
+ 	DefineCustomBoolVariable("pg_stat_statements.string_key",
+ 			   "Differentiate queries based on query string alone.",
+ 							 NULL,
+ 							 &pgss_string_key,
+ 							 false,
+ 							 PGC_POSTMASTER,
+ 							 0,
+ 							 NULL,
+ 							 NULL,
+ 							 NULL);
+ 
  	EmitWarningsOnPlaceholders("pg_stat_statements");
  
  	/*
*************** _PG_init(void)
*** 265,270 ****
--- 354,371 ----
  	RequestAddinLWLocks(1);
  
  	/*
+ 	 * Allocate a buffer to store selective serialization of the query tree
+ 	 * for the purposes of query normalization.
+ 	 */
+ 	oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+ 
+ 	last_jumble = palloc(JUMBLE_SIZE);
+ 	/* Allocate space for bookkeeping information for query str normalization */
+ 	last_offsets = palloc(last_offset_buf_size * sizeof(pgssLocationLen));
+ 
+ 	MemoryContextSwitchTo(oldcontext);
+ 
+ 	/*
  	 * Install hooks.
  	 */
  	prev_shmem_startup_hook = shmem_startup_hook;
*************** _PG_init(void)
*** 279,284 ****
--- 380,387 ----
  	ExecutorEnd_hook = pgss_ExecutorEnd;
  	prev_ProcessUtility = ProcessUtility_hook;
  	ProcessUtility_hook = pgss_ProcessUtility;
+ 	prev_post_parse_analyze_hook = post_parse_analyze_hook;
+ 	post_parse_analyze_hook = pgss_parse_analyze;
  }
  
  /*
*************** _PG_fini(void)
*** 294,299 ****
--- 397,406 ----
  	ExecutorFinish_hook = prev_ExecutorFinish;
  	ExecutorEnd_hook = prev_ExecutorEnd;
  	ProcessUtility_hook = prev_ProcessUtility;
+ 	post_parse_analyze_hook = prev_post_parse_analyze_hook;
+ 
+ 	pfree(last_jumble);
+ 	pfree(last_offsets);
  }
  
  /*
*************** pgss_shmem_startup(void)
*** 397,423 ****
  		if (!PG_VALID_BE_ENCODING(temp.key.encoding))
  			goto error;
  
  		/* Previous incarnation might have had a larger query_size */
! 		if (temp.key.query_len >= buffer_size)
  		{
! 			buffer = (char *) repalloc(buffer, temp.key.query_len + 1);
! 			buffer_size = temp.key.query_len + 1;
  		}
  
! 		if (fread(buffer, 1, temp.key.query_len, file) != temp.key.query_len)
  			goto error;
! 		buffer[temp.key.query_len] = '\0';
  
  		/* Clip to available length if needed */
! 		if (temp.key.query_len >= query_size)
! 			temp.key.query_len = pg_encoding_mbcliplen(temp.key.encoding,
  													   buffer,
! 													   temp.key.query_len,
  													   query_size - 1);
- 		temp.key.query_ptr = buffer;
  
  		/* make the hashtable entry (discards old entries if too many) */
! 		entry = entry_alloc(&temp.key);
  
  		/* copy in the actual stats */
  		entry->counters = temp.counters;
--- 504,534 ----
  		if (!PG_VALID_BE_ENCODING(temp.key.encoding))
  			goto error;
  
+ 		/* Avoid loading sticky entries */
+ 		if (temp.counters.calls == 0)
+ 			continue;
+ 
  		/* Previous incarnation might have had a larger query_size */
! 		if (temp.query_len >= buffer_size)
  		{
! 			buffer = (char *) repalloc(buffer, temp.query_len + 1);
! 			buffer_size = temp.query_len + 1;
  		}
  
! 		if (fread(buffer, 1, temp.query_len, file) != temp.query_len)
  			goto error;
! 		buffer[temp.query_len] = '\0';
! 
  
  		/* Clip to available length if needed */
! 		if (temp.query_len >= query_size)
! 			temp.query_len = pg_encoding_mbcliplen(temp.key.encoding,
  													   buffer,
! 													   temp.query_len,
  													   query_size - 1);
  
  		/* make the hashtable entry (discards old entries if too many) */
! 		entry = entry_alloc(&temp.key, buffer, temp.query_len);
  
  		/* copy in the actual stats */
  		entry->counters = temp.counters;
*************** pgss_shmem_shutdown(int code, Datum arg)
*** 479,485 ****
  	hash_seq_init(&hash_seq, pgss_hash);
  	while ((entry = hash_seq_search(&hash_seq)) != NULL)
  	{
! 		int			len = entry->key.query_len;
  
  		if (fwrite(entry, offsetof(pgssEntry, mutex), 1, file) != 1 ||
  			fwrite(entry->query, 1, len, file) != len)
--- 590,596 ----
  	hash_seq_init(&hash_seq, pgss_hash);
  	while ((entry = hash_seq_search(&hash_seq)) != NULL)
  	{
! 		int			len = entry->query_len;
  
  		if (fwrite(entry, offsetof(pgssEntry, mutex), 1, file) != 1 ||
  			fwrite(entry->query, 1, len, file) != len)
*************** error:
*** 505,510 ****
--- 616,1628 ----
  }
  
  /*
+  * comp_offset: Comparator for qsorting pgssLocationLen values.
+  */
+ static int
+ comp_offset(const void *a, const void *b)
+ {
+ 	int l = ((pgssLocationLen*) a)->location;
+ 	int r = ((pgssLocationLen*) b)->location;
+ 	if (l < r)
+ 		return -1;
+ 	else if (l > r)
+ 		return +1;
+ 	else
+ 		return 0;
+ }
+ 
+ static void
+ pgss_parse_analyze(ParseState *pstate, Query *post_analysis_tree)
+ {
+ 	/*
+ 	 * It is possible that a query could organically have a queryId of 0, but
+ 	 * that is exceptionally unlikely, and besides, this assertion naturally
+ 	 * evaluates to a no-op on "release" builds
+ 	 */
+ 	Assert(post_analysis_tree->queryId == 0);
+ 	if (!post_analysis_tree->utilityStmt)
+ 		pgss_process_post_analysis_tree(post_analysis_tree, pstate->p_sourcetext,
+ 											pstate->p_paramref_hook == NULL);
+ }
+ 
+ /*
+  * pgss_process_post_analysis_tree: Record queryId, which is based on the query
+  * tree, within the tree itself, for later retrieval in the executor hook. The
+  * core system will copy the value to the tree's corresponding plannedstmt.
+  *
+  * Avoid producing a canonicalized string for parameterized queries. It is
+  * simply not desirable given that constants that we might otherwise
+  * canonicalize are going to always be consistent between calls. In addition, it
+  * would be impractical to make the hash entry sticky for an indefinitely long
+  * period (i.e. until the query is actually executed).
+  */
+ static void
+ pgss_process_post_analysis_tree(Query* post_analysis_tree,
+ 		const char* sourceText, bool canonicalize)
+ {
+ 	BufferUsage bufusage;
+ 
+ 	post_analysis_tree->queryId = JumbleQuery(post_analysis_tree);
+ 
+ 	memset(&bufusage, 0, sizeof(bufusage));
+ 	pgss_store(sourceText, post_analysis_tree->queryId, 0, 0, &bufusage,
+ 			true, canonicalize);
+ 
+ 	/* Trim last_offsets */
+ 	if (last_offset_buf_size > 10)
+ 	{
+ 		last_offset_buf_size = 10;
+ 		last_offsets = repalloc(last_offsets,
+ 							last_offset_buf_size *
+ 							sizeof(pgssLocationLen));
+ 	}
+ }
+ 
+ /*
+  * Given a valid SQL string, and offsets whose lengths are uninitialized, fill
+  * in the corresponding lengths of those constants.
+  *
+  * The constant may use any available constant syntax, including but not limited
+  * to float literals, bit-strings, single quoted strings and dollar-quoted
+  * strings. This is accomplished by using the public API for the core scanner,
+  * with a workaround for quirks of their representation. It is expected that the
+  * constants will be sorted by their original location when canonicalizing the
+  * query string, so do that here.
+  *
+  * It is the caller's job to ensure that the string is a valid SQL statement.
+  * Since in practice the string has already been validated, and the locations
+  * that the caller provides will have originated from within the authoritative
+  * parser, this should not be a problem. Duplicates are expected, and will have
+  * their lengths marked as '-1', so that they are later ignored.
+  *
+  * N.B. There is an assumption that a '-' character at a Const location begins a
+  * negative constant. This precludes there ever being another reason for a
+  * constant to start with a '-' for any other reason.
+  */
+ static void
+ fill_in_constant_lengths(const char* query, pgssLocationLen offs[],
+ 							Size n_offs)
+ {
+ 	core_yyscan_t  init_scan;
+ 	core_yy_extra_type ext_type;
+ 	core_YYSTYPE type;
+ 	YYLTYPE pos;
+ 	int i, last_loc = -1;
+ 
+ 	/* Sort offsets */
+ 	qsort(offs, n_offs, sizeof(pgssLocationLen), comp_offset);
+ 
+ 
+ 	init_scan = scanner_init(query,
+ 							 &ext_type,
+ 							 ScanKeywords,
+ 							 NumScanKeywords);
+ 
+ 	for(i = 0; i < n_offs; i++)
+ 	{
+ 		int loc = offs[i].location;
+ 		Assert(loc > 0);
+ 
+ 		if (loc == last_loc)
+ 		{
+ 			/* Duplicate */
+ 			offs[i].length = -1;
+ 			continue;
+ 		}
+ 
+ 		for(;;)
+ 		{
+ 			int scanbuf_len;
+ #ifdef USE_ASSERT_CHECKING
+ 			int tok =
+ #endif
+ 						core_yylex(&type, &pos, init_scan);
+ 			scanbuf_len = strlen(ext_type.scanbuf);
+ 			Assert(tok != 0);
+ 
+ 			if (scanbuf_len > loc)
+ 			{
+ 				if (query[loc] == '-')
+ 				{
+ 					/*
+ 					 * It's a negative value - this is the one and only case
+ 					 * where we canonicalize more than a single token.
+ 					 *
+ 					 * Do not compensate for the core system's special-case
+ 					 * adjustment of location to that of the leading '-'
+ 					 * operator in the event of a negative constant. It is also
+ 					 * useful for our purposes to start from the minus symbol.
+ 					 * In this way, queries like "select * from foo where bar =
+ 					 * 1" and "select * from foo where bar = -2" will always
+ 					 * have identical canonicalized query strings.
+ 					 */
+ 					core_yylex(&type, &pos, init_scan);
+ 					scanbuf_len = strlen(ext_type.scanbuf);
+ 				}
+ 
+ 				/*
+ 				 * Scanner is now at end of const token of outer iteration -
+ 				 * work backwards to get constant length.
+ 				 */
+ 				offs[i].length = scanbuf_len - loc;
+ 				break;
+ 			}
+ 		}
+ 		last_loc = loc;
+ 	}
+ 	scanner_finish(init_scan);
+ }
+ 
+ /*
+  * JumbleQuery: Selectively serialize query tree, and return a hash representing
+  * that serialization - its queryId.
+  *
+  * Note that this doesn't necessarily uniquely identify the query across
+  * different databases and encodings.
+  */
+ static uint32
+ JumbleQuery(Query *post_analysis_tree)
+ {
+ 	/* State for this run of PerformJumble */
+ 	Size i = 0;
+ 	last_offset_num = 0;
+ 	Assert(post_analysis_tree->queryId == 0);
+ 	memset(last_jumble, 0, JUMBLE_SIZE);
+ 	last_jumble[i++] = MAG_HASH_BUF;
+ 	PerformJumble(post_analysis_tree, JUMBLE_SIZE, &i);
+ 	/* Reset rangetbl state */
+ 	list_free(pgss_rangetbl_stack);
+ 	pgss_rangetbl_stack = NIL;
+ 
+ 	return hash_any((const unsigned char* ) last_jumble, i);
+ }
+ 
+ /*
+  * AppendJumb: Append a value that is substantive to a given query to jumble,
+  * while incrementing the iterator, i.
+  */
+ static void
+ AppendJumb(unsigned char* item, unsigned char jumble[], Size size, Size *i)
+ {
+ 	Assert(item != NULL);
+ 	Assert(jumble != NULL);
+ 	Assert(i != NULL);
+ 
+ 	/*
+ 	 * Copy the entire item to the buffer, or as much of it as possible to fill
+ 	 * the buffer to capacity.
+ 	 */
+ 	memcpy(jumble + *i, item, Min(*i > JUMBLE_SIZE? 0:JUMBLE_SIZE - *i, size));
+ 
+ 	/*
+ 	 * Continually hash the query tree's jumble.
+ 	 *
+ 	 * Was JUMBLE_SIZE exceeded? If so, hash the jumble and append that to the
+ 	 * start of the jumble buffer, and then continue to append the fraction of
+ 	 * "item" that we might not have been able to fit at the end of the buffer
+ 	 * in the last iteration. Since the value of i has been set to 0, there is
+ 	 * no need to memset the buffer in advance of this new iteration, but
+ 	 * effectively we are completely discarding the prior iteration's jumble
+ 	 * except for this representative hash value.
+ 	 */
+ 	if (*i > JUMBLE_SIZE)
+ 	{
+ 		uint32 start_hash = hash_any((const unsigned char* ) last_jumble, JUMBLE_SIZE);
+ 		int hash_l = sizeof(start_hash);
+ 		int part_left_l = Max(0, ((int) size - ((int) *i - JUMBLE_SIZE)));
+ 
+ 		Assert(part_left_l >= 0 && part_left_l <= size);
+ 
+ 		memcpy(jumble, &start_hash, hash_l);
+ 		memcpy(jumble + hash_l, item + (size - part_left_l), part_left_l);
+ 		*i = hash_l + part_left_l;
+ 	}
+ 	else
+ 	{
+ 		*i += size;
+ 	}
+ }
+ 
+ /*
+  * Wrapper around AppendJumb to encapsulate details of serialization
+  * of individual local variable elements.
+  */
+ #define APP_JUMB(item) \
+ AppendJumb((unsigned char*)&item, last_jumble, sizeof(item), i)
+ 
+ /*
+  * PerformJumble: Selectively serialize the query tree and canonicalize
+  * constants (i.e.  don't consider their actual value - just their type).
+  *
+  * The last_jumble buffer, which this function writes to, can be hashed to
+  * uniquely identify a query that may use different constants in successive
+  * calls.
+  */
+ static void
+ PerformJumble(const Query *tree, Size size, Size *i)
+ {
+ 	ListCell *l;
+ 	/* table join tree (FROM and WHERE clauses) */
+ 	FromExpr *jt = (FromExpr *) tree->jointree;
+ 	/* # of result tuples to skip (int8 expr) */
+ 	FuncExpr *off = (FuncExpr *) tree->limitOffset;
+ 	/* # of result tuples to skip (int8 expr) */
+ 	FuncExpr *limcount = (FuncExpr *) tree->limitCount;
+ 
+ 	if (pgss_rangetbl_stack &&
+ 			!IsA(pgss_rangetbl_stack, List))
+ 		pgss_rangetbl_stack = NIL;
+ 
+ 	if (tree->rtable != NIL)
+ 	{
+ 		pgss_rangetbl_stack = lappend(pgss_rangetbl_stack, tree->rtable);
+ 	}
+ 	else
+ 	{
+ 		/* Add dummy Range table entry to maintain stack */
+ 		RangeTblEntry *rte = makeNode(RangeTblEntry);
+ 		List *dummy = lappend(NIL, rte);
+ 		pgss_rangetbl_stack = lappend(pgss_rangetbl_stack, dummy);
+ 	}
+ 
+ 	APP_JUMB(tree->resultRelation);
+ 
+ 	if (tree->intoClause)
+ 	{
+ 		IntoClause *ic = tree->intoClause;
+ 		RangeVar   *rel = ic->rel;
+ 
+ 		APP_JUMB(ic->onCommit);
+ 		APP_JUMB(ic->skipData);
+ 		if (rel)
+ 		{
+ 			APP_JUMB(rel->relpersistence);
+ 			/* Bypass macro abstraction to supply size directly.
+ 			 *
+ 			 * Serialize schemaname, relname themselves - this makes us
+ 			 * somewhat consistent with the behavior of utility statements like "create
+ 			 * table", which seems appropriate.
+ 			 */
+ 			if (rel->schemaname)
+ 				AppendJumb((unsigned char *)rel->schemaname, last_jumble,
+ 								strlen(rel->schemaname), i);
+ 			if (rel->relname)
+ 				AppendJumb((unsigned char *)rel->relname, last_jumble,
+ 								strlen(rel->relname), i);
+ 		}
+ 	}
+ 
+ 	/* WITH list (of CommonTableExpr's) */
+ 	foreach(l, tree->cteList)
+ 	{
+ 		CommonTableExpr	*cte = (CommonTableExpr *) lfirst(l);
+ 		Query			*cteq = (Query*) cte->ctequery;
+ 		if (cteq)
+ 			PerformJumble(cteq, size, i);
+ 	}
+ 	if (jt)
+ 	{
+ 		if (jt->quals)
+ 		{
+ 			if (IsA(jt->quals, OpExpr))
+ 			{
+ 				QualsNode((OpExpr*) jt->quals, size, i, tree->rtable);
+ 			}
+ 			else
+ 			{
+ 				LeafNode((Node*) jt->quals, size, i, tree->rtable);
+ 			}
+ 		}
+ 		/* table join tree */
+ 		foreach(l, jt->fromlist)
+ 		{
+ 			Node* fr = lfirst(l);
+ 			if (IsA(fr, JoinExpr))
+ 			{
+ 				JoinExprNode((JoinExpr*) fr, size, i, tree->rtable);
+ 			}
+ 			else if (IsA(fr, RangeTblRef))
+ 			{
+ 				RangeTblRef   *rtf = (RangeTblRef *) fr;
+ 				RangeTblEntry *rte = rt_fetch(rtf->rtindex, tree->rtable);
+ 				APP_JUMB(rte->relid);
+ 				APP_JUMB(rte->rtekind);
+ 				/* Subselection in where clause */
+ 				if (rte->subquery)
+ 					PerformJumble(rte->subquery, size, i);
+ 
+ 				/* Function call in where clause */
+ 				if (rte->funcexpr)
+ 					LeafNode((Node*) rte->funcexpr, size, i, tree->rtable);
+ 			}
+ 			else
+ 			{
+ 				ereport(WARNING,
+ 						(errcode(ERRCODE_INTERNAL_ERROR),
+ 						 errmsg("unexpected, unrecognised fromlist node type: %d",
+ 							 (int) nodeTag(fr))));
+ 			}
+ 		}
+ 	}
+ 	/*
+ 	 * target list (of TargetEntry)
+ 	 * columns returned by query
+ 	 */
+ 	foreach(l, tree->targetList)
+ 	{
+ 		TargetEntry *tg = (TargetEntry *) lfirst(l);
+ 		Node        *e  = (Node*) tg->expr;
+ 		if (tg->ressortgroupref)
+ 			/* nonzero if referenced by a sort/group - for ORDER BY */
+ 			APP_JUMB(tg->ressortgroupref);
+ 		APP_JUMB(tg->resno); /* column number for select */
+ 		/*
+ 		 * Handle the various types of nodes in
+ 		 * the select list of this query
+ 		 */
+ 		LeafNode(e, size, i, tree->rtable);
+ 	}
+ 	/* return-values list (of TargetEntry) */
+ 	foreach(l, tree->returningList)
+ 	{
+ 		TargetEntry *rt = (TargetEntry *) lfirst(l);
+ 		Expr        *e  = (Expr*) rt->expr;
+ 		unsigned char magic = MAG_RETURN_LIST;
+ 		APP_JUMB(magic);
+ 		/*
+ 		 * Handle the various types of nodes in
+ 		 * the select list of this query
+ 		 */
+ 		LeafNode((Node*) e, size, i, tree->rtable);
+ 	}
+ 	/* a list of SortGroupClause's */
+ 	foreach(l, tree->groupClause)
+ 	{
+ 		SortGroupClause *gc = (SortGroupClause *) lfirst(l);
+ 		APP_JUMB(gc->tleSortGroupRef);
+ 		APP_JUMB(gc->nulls_first);
+ 	}
+ 
+ 	if (tree->havingQual)
+ 	{
+ 		if (IsA(tree->havingQual, OpExpr))
+ 		{
+ 			OpExpr *na = (OpExpr *) tree->havingQual;
+ 			QualsNode(na, size, i, tree->rtable);
+ 		}
+ 		else
+ 		{
+ 			Node *n = (Node*) tree->havingQual;
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 	}
+ 
+ 	foreach(l, tree->windowClause)
+ 	{
+ 		WindowClause *wc = (WindowClause *) lfirst(l);
+ 		ListCell     *il;
+ 		APP_JUMB(wc->frameOptions);
+ 		foreach(il, wc->partitionClause)	/* PARTITION BY list */
+ 		{
+ 			Node *n = (Node *) lfirst(il);
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 		foreach(il, wc->orderClause)		/* ORDER BY list */
+ 		{
+ 			Node *n = (Node *) lfirst(il);
+ 			LeafNode(n, size, i, tree->rtable);
+ 		}
+ 	}
+ 
+ 	foreach(l, tree->distinctClause)
+ 	{
+ 		SortGroupClause *dc = (SortGroupClause *) lfirst(l);
+ 		APP_JUMB(dc->tleSortGroupRef);
+ 		APP_JUMB(dc->nulls_first);
+ 	}
+ 
+ 	/* Don't look at tree->sortClause,
+ 	 * because the value ressortgroupref is already
+ 	 * serialized when we iterate through targetList
+ 	 */
+ 
+ 	if (off)
+ 		LimitOffsetNode((Node*) off, size, i, tree->rtable);
+ 
+ 	if (limcount)
+ 		LimitOffsetNode((Node*) limcount, size, i, tree->rtable);
+ 
+ 	if (tree->setOperations)
+ 	{
+ 		/*
+ 		 * set-operation tree if this is top
+ 		 * level of a UNION/INTERSECT/EXCEPT query
+ 		 */
+ 		SetOperationStmt *topop = (SetOperationStmt *) tree->setOperations;
+ 		APP_JUMB(topop->op);
+ 		APP_JUMB(topop->all);
+ 
+ 		/* leaf selects are RTE subselections */
+ 		foreach(l, tree->rtable)
+ 		{
+ 			RangeTblEntry *r = (RangeTblEntry *) lfirst(l);
+ 			if (r->subquery)
+ 				PerformJumble(r->subquery, size, i);
+ 		}
+ 	}
+ 	pgss_rangetbl_stack = list_delete_ptr(pgss_rangetbl_stack,
+ 			list_nth(pgss_rangetbl_stack, pgss_rangetbl_stack->length - 1));
+ }
+ 
+ /*
+  * Perform selective serialization of "Quals" nodes when
+  * they're IsA(*, OpExpr)
+  */
+ static void
+ QualsNode(const OpExpr *node, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	APP_JUMB(node->xpr);
+ 	APP_JUMB(node->opno);
+ 	foreach(l, node->args)
+ 	{
+ 		Node *arg = (Node *) lfirst(l);
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * LeafNode: Selectively serialize a selection of parser/prim nodes that are
+  * frequently, though certainly not necesssarily leaf nodes, such as Vars
+  * (columns), constants and function calls
+  */
+ static void
+ LeafNode(const Node *arg, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	/* Use the node's NodeTag as a magic number */
+ 	APP_JUMB(arg->type);
+ 
+ 	if (IsA(arg, Const))
+ 	{
+ 		Const *c = (Const *) arg;
+ 
+ 		/*
+ 		 * Datatype of the constant is a differentiator
+ 		 */
+ 		APP_JUMB(c->consttype);
+ 		RecordConstLocation(c->location);
+ 	}
+ 	else if(IsA(arg, CoerceToDomain))
+ 	{
+ 		CoerceToDomain *cd = (CoerceToDomain*) arg;
+ 		/*
+ 		 * Datatype of the constant is a
+ 		 * differentiator
+ 		 */
+ 		APP_JUMB(cd->resulttype);
+ 		LeafNode((Node*) cd->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Var))
+ 	{
+ 		Var			  *v = (Var *) arg;
+ 		RangeTblEntry *rte;
+ 		ListCell *lc;
+ 
+ 		/*
+ 		 * We need to get the details of the rangetable, but rtable may not
+ 		 * refer to the relevant one if we're in a subselection.
+ 		 */
+ 		if (v->varlevelsup == 0)
+ 		{
+ 			rte = rt_fetch(v->varno, rtable);
+ 		}
+ 		else
+ 		{
+ 			List *rtable_upper = list_nth(pgss_rangetbl_stack,
+ 					(list_length(pgss_rangetbl_stack) - 1) - v->varlevelsup);
+ 			rte = rt_fetch(v->varno, rtable_upper);
+ 		}
+ 		APP_JUMB(rte->relid);
+ 
+ 		foreach(lc, rte->values_lists)
+ 		{
+ 			List	   *sublist = (List *) lfirst(lc);
+ 			ListCell   *lc2;
+ 
+ 			foreach(lc2, sublist)
+ 			{
+ 				Node	   *col = (Node *) lfirst(lc2);
+ 				LeafNode(col, size, i, rtable);
+ 			}
+ 		}
+ 		APP_JUMB(v->varattno);
+ 	}
+ 	else if (IsA(arg, CurrentOfExpr))
+ 	{
+ 		CurrentOfExpr *CoE = (CurrentOfExpr*) arg;
+ 		APP_JUMB(CoE->cvarno);
+ 		APP_JUMB(CoE->cursor_param);
+ 	}
+ 	else if (IsA(arg, CollateExpr))
+ 	{
+ 		CollateExpr *Ce = (CollateExpr*) arg;
+ 		APP_JUMB(Ce->collOid);
+ 	}
+ 	else if (IsA(arg, FieldSelect))
+ 	{
+ 		FieldSelect *Fs = (FieldSelect*) arg;
+ 		APP_JUMB(Fs->resulttype);
+ 		LeafNode((Node*) Fs->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, NamedArgExpr))
+ 	{
+ 		NamedArgExpr *Nae = (NamedArgExpr*) arg;
+ 		APP_JUMB(Nae->argnumber);
+ 		LeafNode((Node*) Nae->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Param))
+ 	{
+ 		Param *p = ((Param *) arg);
+ 		APP_JUMB(p->paramkind);
+ 		APP_JUMB(p->paramid);
+ 	}
+ 	else if (IsA(arg, RelabelType))
+ 	{
+ 		RelabelType *rt = (RelabelType*) arg;
+ 		APP_JUMB(rt->resulttype);
+ 		LeafNode((Node*) rt->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, WindowFunc))
+ 	{
+ 		WindowFunc *wf = (WindowFunc *) arg;
+ 		APP_JUMB(wf->winfnoid);
+ 		foreach(l, wf->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, FuncExpr))
+ 	{
+ 		FuncExpr *f = (FuncExpr *) arg;
+ 		APP_JUMB(f->funcid);
+ 		foreach(l, f->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, OpExpr) || IsA(arg, DistinctExpr))
+ 	{
+ 		QualsNode((OpExpr*) arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, CoerceViaIO))
+ 	{
+ 		CoerceViaIO *Cio = (CoerceViaIO*) arg;
+ 		APP_JUMB(Cio->coerceformat);
+ 		APP_JUMB(Cio->resulttype);
+ 		LeafNode((Node*) Cio->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, Aggref))
+ 	{
+ 		Aggref *a =  (Aggref *) arg;
+ 		APP_JUMB(a->aggfnoid);
+ 		foreach(l, a->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SubLink))
+ 	{
+ 		SubLink *s = (SubLink*) arg;
+ 		APP_JUMB(s->subLinkType);
+ 		/* Serialize select-list subselect recursively */
+ 		if (s->subselect)
+ 			PerformJumble((Query*) s->subselect, size, i);
+ 
+ 		if (s->testexpr)
+ 			LeafNode((Node*) s->testexpr, size, i, rtable);
+ 		foreach(l, s->operName)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, TargetEntry))
+ 	{
+ 		TargetEntry *rt = (TargetEntry *) arg;
+ 		Node *e = (Node*) rt->expr;
+ 		APP_JUMB(rt->resorigtbl);
+ 		APP_JUMB(rt->ressortgroupref);
+ 		LeafNode(e, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, BoolExpr))
+ 	{
+ 		BoolExpr *be = (BoolExpr *) arg;
+ 		APP_JUMB(be->boolop);
+ 		foreach(l, be->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, NullTest))
+ 	{
+ 		NullTest *nt = (NullTest *) arg;
+ 		Node     *arg = (Node *) nt->arg;
+ 		APP_JUMB(nt->nulltesttype);		/* IS NULL, IS NOT NULL */
+ 		APP_JUMB(nt->argisrow);			/* is input a composite type ? */
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, ArrayExpr))
+ 	{
+ 		ArrayExpr *ae = (ArrayExpr *) arg;
+ 		APP_JUMB(ae->array_typeid);		/* type of expression result */
+ 		APP_JUMB(ae->element_typeid);	/* common type of array elements */
+ 		foreach(l, ae->elements)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CaseExpr))
+ 	{
+ 		CaseExpr *ce = (CaseExpr*) arg;
+ 		Assert(ce->casetype != InvalidOid);
+ 		APP_JUMB(ce->casetype);
+ 		foreach(l, ce->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		if (ce->arg)
+ 			LeafNode((Node*) ce->arg, size, i, rtable);
+ 
+ 		if (ce->defresult)
+ 		{
+ 			/* Default result (ELSE clause).
+ 			 *
+ 			 * May be NULL, because no else clause
+ 			 * was actually specified, and thus the value is
+ 			 * equivalent to SQL ELSE NULL
+ 			 */
+ 			LeafNode((Node*) ce->defresult, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CaseTestExpr))
+ 	{
+ 		CaseTestExpr *ct = (CaseTestExpr*) arg;
+ 		APP_JUMB(ct->typeId);
+ 	}
+ 	else if (IsA(arg, CaseWhen))
+ 	{
+ 		CaseWhen *cw = (CaseWhen*) arg;
+ 		Node     *res = (Node*) cw->result;
+ 		Node     *exp = (Node*) cw->expr;
+ 		if (res)
+ 			LeafNode(res, size, i, rtable);
+ 		if (exp)
+ 			LeafNode(exp, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, MinMaxExpr))
+ 	{
+ 		MinMaxExpr *cw = (MinMaxExpr*) arg;
+ 		APP_JUMB(cw->minmaxtype);
+ 		APP_JUMB(cw->op);
+ 		foreach(l, cw->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, ScalarArrayOpExpr))
+ 	{
+ 		ScalarArrayOpExpr *sa = (ScalarArrayOpExpr*) arg;
+ 		APP_JUMB(sa->opfuncid);
+ 		APP_JUMB(sa->useOr);
+ 		foreach(l, sa->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, CoalesceExpr))
+ 	{
+ 		CoalesceExpr *ca = (CoalesceExpr*) arg;
+ 		foreach(l, ca->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, ArrayCoerceExpr))
+ 	{
+ 		ArrayCoerceExpr *ac = (ArrayCoerceExpr *) arg;
+ 		LeafNode((Node*) ac->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, WindowClause))
+ 	{
+ 		WindowClause *wc = (WindowClause*) arg;
+ 		foreach(l, wc->partitionClause)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, wc->orderClause)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SortGroupClause))
+ 	{
+ 		SortGroupClause *sgc = (SortGroupClause*) arg;
+ 		APP_JUMB(sgc->tleSortGroupRef);
+ 		APP_JUMB(sgc->nulls_first);
+ 	}
+ 	else if (IsA(arg, Integer) ||
+ 		  IsA(arg, Float) ||
+ 		  IsA(arg, String) ||
+ 		  IsA(arg, BitString) ||
+ 		  IsA(arg, Null)
+ 		)
+ 	{
+ 		/* It is not necessary to serialize Value nodes - they are seen when
+ 		 * aliases are used, which are ignored.
+ 		 */
+ 		return;
+ 	}
+ 	else if (IsA(arg, BooleanTest))
+ 	{
+ 		BooleanTest *bt = (BooleanTest *) arg;
+ 		APP_JUMB(bt->booltesttype);
+ 		LeafNode((Node*) bt->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, ArrayRef))
+ 	{
+ 		ArrayRef *ar = (ArrayRef*) arg;
+ 		APP_JUMB(ar->refarraytype);
+ 		foreach(l, ar->refupperindexpr)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, ar->reflowerindexpr)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		if (ar->refexpr)
+ 			LeafNode((Node*) ar->refexpr, size, i, rtable);
+ 		if (ar->refassgnexpr)
+ 			LeafNode((Node*) ar->refassgnexpr, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, NullIfExpr))
+ 	{
+ 		/* NullIfExpr is just a typedef for OpExpr */
+ 		QualsNode((OpExpr*) arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, RowExpr))
+ 	{
+ 		RowExpr *re = (RowExpr*) arg;
+ 		APP_JUMB(re->row_format);
+ 		foreach(l, re->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 
+ 	}
+ 	else if (IsA(arg, XmlExpr))
+ 	{
+ 		XmlExpr *xml = (XmlExpr*) arg;
+ 		APP_JUMB(xml->op);
+ 		foreach(l, xml->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		/* non-XML expressions for xml_attributes */
+ 		foreach(l, xml->named_args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		/* parallel list of Value strings */
+ 		foreach(l, xml->arg_names)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, RowCompareExpr))
+ 	{
+ 		RowCompareExpr *rc = (RowCompareExpr*) arg;
+ 		foreach(l, rc->largs)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 		foreach(l, rc->rargs)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(arg, SetToDefault))
+ 	{
+ 		SetToDefault *sd = (SetToDefault*) arg;
+ 		APP_JUMB(sd->typeId);
+ 		APP_JUMB(sd->typeMod);
+ 	}
+ 	else if (IsA(arg, ConvertRowtypeExpr))
+ 	{
+ 		ConvertRowtypeExpr* Cr = (ConvertRowtypeExpr*) arg;
+ 		APP_JUMB(Cr->convertformat);
+ 		APP_JUMB(Cr->resulttype);
+ 		LeafNode((Node*) Cr->arg, size, i, rtable);
+ 	}
+ 	else if (IsA(arg, FieldStore))
+ 	{
+ 		FieldStore* Fs = (FieldStore*) arg;
+ 		LeafNode((Node*) Fs->arg, size, i, rtable);
+ 		foreach(l, Fs->newvals)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else
+ 	{
+ 		ereport(WARNING,
+ 				(errcode(ERRCODE_INTERNAL_ERROR),
+ 				 errmsg("unexpected, unrecognised LeafNode node type: %d",
+ 					 (int) nodeTag(arg))));
+ 	}
+ }
+ 
+ /*
+  * Perform selective serialization of limit or offset nodes
+  */
+ static void
+ LimitOffsetNode(const Node *node, Size size, Size *i, List *rtable)
+ {
+ 	ListCell *l;
+ 	unsigned char magic = MAG_LIMIT_OFFSET;
+ 	APP_JUMB(magic);
+ 
+ 	if (IsA(node, FuncExpr))
+ 	{
+ 
+ 		foreach(l, ((FuncExpr*) node)->args)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else
+ 	{
+ 		/* Fall back on leaf node representation */
+ 		LeafNode(node, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * JoinExprNode: Perform selective serialization of JoinExpr nodes
+  */
+ static void
+ JoinExprNode(JoinExpr *node, Size size, Size *i, List *rtable)
+ {
+ 	Node	 *larg = node->larg;	/* left subtree */
+ 	Node	 *rarg = node->rarg;	/* right subtree */
+ 	ListCell *l;
+ 
+ 	Assert( IsA(node, JoinExpr));
+ 
+ 	APP_JUMB(node->jointype);
+ 	APP_JUMB(node->isNatural);
+ 
+ 	if (node->quals)
+ 	{
+ 		if ( IsA(node, OpExpr))
+ 		{
+ 			QualsNode((OpExpr*) node->quals, size, i, rtable);
+ 		}
+ 		else
+ 		{
+ 			LeafNode((Node*) node->quals, size, i, rtable);
+ 		}
+ 	}
+ 	foreach(l, node->usingClause) /* USING clause, if any (list of String) */
+ 	{
+ 		Node *arg = (Node *) lfirst(l);
+ 		LeafNode(arg, size, i, rtable);
+ 	}
+ 	if (larg)
+ 		JoinExprNodeChild(larg, size, i, rtable);
+ 	if (rarg)
+ 		JoinExprNodeChild(rarg, size, i, rtable);
+ }
+ 
+ /*
+  * JoinExprNodeChild: Serialize children of the JoinExpr node
+  */
+ static void
+ JoinExprNodeChild(const Node *node, Size size, Size *i, List *rtable)
+ {
+ 	if (IsA(node, RangeTblRef))
+ 	{
+ 		RangeTblRef   *rt = (RangeTblRef*) node;
+ 		RangeTblEntry *rte = rt_fetch(rt->rtindex, rtable);
+ 		ListCell      *l;
+ 
+ 		APP_JUMB(rte->relid);
+ 		APP_JUMB(rte->jointype);
+ 
+ 		if (rte->subquery)
+ 			PerformJumble((Query*) rte->subquery, size, i);
+ 
+ 		foreach(l, rte->joinaliasvars)
+ 		{
+ 			Node *arg = (Node *) lfirst(l);
+ 			LeafNode(arg, size, i, rtable);
+ 		}
+ 	}
+ 	else if (IsA(node, JoinExpr))
+ 	{
+ 		JoinExprNode((JoinExpr*) node, size, i, rtable);
+ 	}
+ 	else
+ 	{
+ 		LeafNode(node, size, i, rtable);
+ 	}
+ }
+ 
+ /*
+  * Record location of constant within query string of query tree that is
+  * currently being walked.
+  */
+ static void
+ RecordConstLocation(int location)
+ {
+ 	/* -1 indicates unknown or undefined location */
+ 	if (location > 0)
+ 	{
+ 		if (last_offset_num >= last_offset_buf_size)
+ 		{
+ 			last_offset_buf_size *= 2;
+ 			last_offsets = repalloc(last_offsets,
+ 							last_offset_buf_size *
+ 							sizeof(pgssLocationLen));
+ 
+ 		}
+ 		last_offsets[last_offset_num++].location = location;
+ 	}
+ }
+ 
+ /*
   * ExecutorStart hook: start up tracking if needed
   */
  static void
*************** pgss_ExecutorEnd(QueryDesc *queryDesc)
*** 587,592 ****
--- 1705,1715 ----
  {
  	if (queryDesc->totaltime && pgss_enabled())
  	{
+ 		uint32 queryId;
+ 		if (pgss_string_key)
+ 			queryId = pgss_hash_string(queryDesc->sourceText);
+ 		else
+ 			queryId = queryDesc->plannedstmt->queryId;
  		/*
  		 * Make sure stats accumulation is done.  (Note: it's okay if several
  		 * levels of hook all do this.)
*************** pgss_ExecutorEnd(QueryDesc *queryDesc)
*** 594,602 ****
  		InstrEndLoop(queryDesc->totaltime);
  
  		pgss_store(queryDesc->sourceText,
! 				   queryDesc->totaltime->total,
! 				   queryDesc->estate->es_processed,
! 				   &queryDesc->totaltime->bufusage);
  	}
  
  	if (prev_ExecutorEnd)
--- 1717,1729 ----
  		InstrEndLoop(queryDesc->totaltime);
  
  		pgss_store(queryDesc->sourceText,
! 		   queryId,
! 		   queryDesc->totaltime->total,
! 		   queryDesc->estate->es_processed,
! 		   &queryDesc->totaltime->bufusage,
! 		   false,
! 		   false);
! 
  	}
  
  	if (prev_ExecutorEnd)
*************** pgss_ProcessUtility(Node *parsetree, con
*** 618,623 ****
--- 1745,1751 ----
  		instr_time	start;
  		instr_time	duration;
  		uint64		rows = 0;
+ 		uint32		queryId;
  		BufferUsage bufusage;
  
  		bufusage = pgBufferUsage;
*************** pgss_ProcessUtility(Node *parsetree, con
*** 671,678 ****
  		bufusage.temp_blks_written =
  			pgBufferUsage.temp_blks_written - bufusage.temp_blks_written;
  
! 		pgss_store(queryString, INSTR_TIME_GET_DOUBLE(duration), rows,
! 				   &bufusage);
  	}
  	else
  	{
--- 1799,1809 ----
  		bufusage.temp_blks_written =
  			pgBufferUsage.temp_blks_written - bufusage.temp_blks_written;
  
! 		queryId = pgss_hash_string(queryString);
! 
! 		/* In the case of utility statements, hash the query string directly */
! 		pgss_store(queryString, queryId,
! 				INSTR_TIME_GET_DOUBLE(duration), rows, &bufusage, false, false);
  	}
  	else
  	{
*************** pgss_hash_fn(const void *key, Size keysi
*** 696,703 ****
  	/* we don't bother to include encoding in the hash */
  	return hash_uint32((uint32) k->userid) ^
  		hash_uint32((uint32) k->dbid) ^
! 		DatumGetUInt32(hash_any((const unsigned char *) k->query_ptr,
! 								k->query_len));
  }
  
  /*
--- 1827,1833 ----
  	/* we don't bother to include encoding in the hash */
  	return hash_uint32((uint32) k->userid) ^
  		hash_uint32((uint32) k->dbid) ^
! 		hash_uint32((uint32) k->queryid);
  }
  
  /*
*************** pgss_match_fn(const void *key1, const vo
*** 712,733 ****
  	if (k1->userid == k2->userid &&
  		k1->dbid == k2->dbid &&
  		k1->encoding == k2->encoding &&
! 		k1->query_len == k2->query_len &&
! 		memcmp(k1->query_ptr, k2->query_ptr, k1->query_len) == 0)
  		return 0;
  	else
  		return 1;
  }
  
  /*
   * Store some statistics for a statement.
   */
  static void
! pgss_store(const char *query, double total_time, uint64 rows,
! 		   const BufferUsage *bufusage)
  {
  	pgssHashKey key;
  	double		usage;
  	pgssEntry  *entry;
  
  	Assert(query != NULL);
--- 1842,1887 ----
  	if (k1->userid == k2->userid &&
  		k1->dbid == k2->dbid &&
  		k1->encoding == k2->encoding &&
! 		k1->queryid == k2->queryid)
  		return 0;
  	else
  		return 1;
  }
  
  /*
+  * Given an arbitrarily long query string, produce a hash for the purposes of
+  * identifying the query, without canonicalizing constants. Used when hashing
+  * utility statements, or for legacy compatibility mode.
+  */
+ static uint32
+ pgss_hash_string(const char* str)
+ {
+ 	/* For additional protection against collisions, including magic value */
+ 	char magic = MAG_STR_BUF;
+ 	uint32 result;
+ 	Size size = sizeof(magic) + strlen(str);
+ 	unsigned char* p = palloc(size);
+ 	memcpy(p, &magic, sizeof(magic));
+ 	memcpy(p + sizeof(magic), str, strlen(str));
+ 	result = hash_any((const unsigned char *) p, size);
+ 	pfree(p);
+ 	return result;
+ }
+ 
+ /*
   * Store some statistics for a statement.
   */
  static void
! pgss_store(const char *query, uint32 queryId,
! 				double total_time, uint64 rows,
! 				const BufferUsage *bufusage,
! 				bool empty_entry,
! 				bool canonicalize)
  {
  	pgssHashKey key;
  	double		usage;
+ 	int		    new_query_len = strlen(query);
+ 	char	   *norm_query = NULL;
  	pgssEntry  *entry;
  
  	Assert(query != NULL);
*************** pgss_store(const char *query, double tot
*** 740,773 ****
  	key.userid = GetUserId();
  	key.dbid = MyDatabaseId;
  	key.encoding = GetDatabaseEncoding();
! 	key.query_len = strlen(query);
! 	if (key.query_len >= pgss->query_size)
! 		key.query_len = pg_encoding_mbcliplen(key.encoding,
  											  query,
! 											  key.query_len,
  											  pgss->query_size - 1);
- 	key.query_ptr = query;
  
! 	usage = USAGE_EXEC(duration);
  
  	/* Lookup the hash table entry with shared lock. */
  	LWLockAcquire(pgss->lock, LW_SHARED);
  
- 	entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
  	if (!entry)
  	{
! 		/* Must acquire exclusive lock to add a new entry. */
! 		LWLockRelease(pgss->lock);
! 		LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 		entry = entry_alloc(&key);
  	}
  
! 	/* Grab the spinlock while updating the counters. */
  	{
  		volatile pgssEntry *e = (volatile pgssEntry *) entry;
  
  		SpinLockAcquire(&e->mutex);
! 		e->counters.calls += 1;
  		e->counters.total_time += total_time;
  		e->counters.rows += rows;
  		e->counters.shared_blks_hit += bufusage->shared_blks_hit;
--- 1894,2043 ----
  	key.userid = GetUserId();
  	key.dbid = MyDatabaseId;
  	key.encoding = GetDatabaseEncoding();
! 	key.queryid = queryId;
! 
! 	if (new_query_len >= pgss->query_size)
! 		/* We don't have to worry about this later, because canonicalization
! 		 * cannot possibly result in a longer query string
! 		 */
! 		new_query_len = pg_encoding_mbcliplen(key.encoding,
  											  query,
! 											  new_query_len,
  											  pgss->query_size - 1);
  
! 	entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
! 
! 	/*
! 	 * When just initializing an entry and putting counters at zero, make it
! 	 * artificially sticky so that it will probably still be there when
! 	 * executed. Strictly speaking, query strings are canonicalized on a
! 	 * best effort basis, though it would be difficult to demonstrate this even
! 	 * under artificial conditions.
! 	 */
! 	if (empty_entry && !entry)
! 		usage = USAGE_NON_EXEC_STICK;
! 	else
! 		usage = USAGE_EXEC(duration);
  
  	/* Lookup the hash table entry with shared lock. */
  	LWLockAcquire(pgss->lock, LW_SHARED);
  
  	if (!entry)
  	{
! 		/*
! 		 * Generate a normalized version of the query string that will be used
! 		 * to represent the entry.
! 		 *
! 		 * Note that the representation seen by the user will only have
! 		 * non-differentiating Const tokens swapped with '?' characters, and
! 		 * this does not for example take account of the fact that alias names
! 		 * could vary between successive calls of what is regarded as the same
! 		 * query, or that whitespace could vary.
! 		 */
! 		if (last_offset_num > 0 && canonicalize)
! 		{
! 			int i,
! 			  off = 0,				/* Offset from start for cur tok */
! 			  tok_len = 0,			/* Length (in bytes) of that tok */
! 			  quer_it = 0,			/* Original query byte iterator */
! 			  n_quer_it = 0,		/* Normalized query byte iterator */
! 			  len_to_wrt = 0,		/* Length (in bytes) to write */
! 			  last_off = 0,			/* Offset from start for last iter's tok */
! 			  last_tok_len = 0,		/* Length (in bytes) of that tok */
! 			  tok_len_delta = 0;	/* Finished str is n bytes shorter so far */
! 
! 			/* Fill-in constant lengths - core system only gives us locations */
! 			fill_in_constant_lengths(query, last_offsets, last_offset_num);
! 
! 			norm_query = palloc0(new_query_len + 1);
! 
! 			for(i = 0; i < last_offset_num; i++)
! 			{
! 				if(last_offsets[i].length == -1)
! 					continue; /* don't assume that there's no duplicates */
! 
! 				off = last_offsets[i].location;
! 				tok_len = last_offsets[i].length;
! 				len_to_wrt = off - last_off;
! 				len_to_wrt -= last_tok_len;
! 				/* -1 for the '?' char: */
! 				tok_len_delta += tok_len - 1;
! 
! 				Assert(tok_len > 0);
! 				Assert(len_to_wrt >= 0);
! 				/*
! 				 * Each iteration copies everything prior to the current
! 				 * offset/token to be replaced, except bytes copied in
! 				 * previous iterations
! 				 */
! 				if (off - tok_len_delta + tok_len > new_query_len)
! 				{
! 					if (off - tok_len_delta < new_query_len)
! 					{
! 						len_to_wrt = new_query_len - n_quer_it;
! 						/* Out of space entirely - copy as much as possible */
! 						memcpy(norm_query + n_quer_it, query + quer_it,
! 								len_to_wrt);
! 						n_quer_it += len_to_wrt;
! 						quer_it += len_to_wrt + tok_len;
! 					}
! 					break;
! 				}
! 				memcpy(norm_query + n_quer_it, query + quer_it, len_to_wrt);
! 
! 				n_quer_it += len_to_wrt;
! 				if (n_quer_it < new_query_len)
! 					norm_query[n_quer_it++] = '?';
! 				quer_it += len_to_wrt + tok_len;
! 				last_off = off;
! 				last_tok_len = tok_len;
! 			}
! 			/*
! 			 * We've copied up until the last canonicalized constant. Copy over
! 			 * the remaining bytes of the original query string.
! 			 */
! 			memcpy(norm_query + n_quer_it, query + quer_it,
! 					new_query_len - n_quer_it);
! 
! 			/*
! 			 * Must acquire exclusive lock to add a new entry.
! 			 * Leave that until as late as possible.
! 			 */
! 			LWLockRelease(pgss->lock);
! 			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 
! 			entry = entry_alloc(&key, norm_query, new_query_len);
! 		}
! 		else
! 		{
! 			/* Acquire exclusive lock as required by entry_alloc() */
! 			LWLockRelease(pgss->lock);
! 			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
! 
! 			entry = entry_alloc(&key, query, new_query_len);
! 		}
  	}
  
! 	/*
! 	 * Grab the spinlock while updating the counters, if we're not just here to
! 	 * canonicalize.
! 	 */
  	{
  		volatile pgssEntry *e = (volatile pgssEntry *) entry;
  
  		SpinLockAcquire(&e->mutex);
! 		if (!empty_entry)
! 		{
! 			/*
! 			 * If necessary, "unstick" previously stuck query entry that just
! 			 * held a normalized query string, and then increment calls.
! 			 */
! 			if (e->counters.calls == 0)
! 				e->counters.usage = USAGE_INIT;
! 
! 			e->counters.calls += 1;
! 		}
! 
  		e->counters.total_time += total_time;
  		e->counters.rows += rows;
  		e->counters.shared_blks_hit += bufusage->shared_blks_hit;
*************** pgss_store(const char *query, double tot
*** 785,790 ****
--- 2055,2062 ----
  	}
  
  	LWLockRelease(pgss->lock);
+ 	if (norm_query)
+ 		pfree(norm_query);
  }
  
  /*
*************** pg_stat_statements(PG_FUNCTION_ARGS)
*** 875,881 ****
  
  			qstr = (char *)
  				pg_do_encoding_conversion((unsigned char *) entry->query,
! 										  entry->key.query_len,
  										  entry->key.encoding,
  										  GetDatabaseEncoding());
  			values[i++] = CStringGetTextDatum(qstr);
--- 2147,2153 ----
  
  			qstr = (char *)
  				pg_do_encoding_conversion((unsigned char *) entry->query,
! 										  entry->query_len,
  										  entry->key.encoding,
  										  GetDatabaseEncoding());
  			values[i++] = CStringGetTextDatum(qstr);
*************** pg_stat_statements(PG_FUNCTION_ARGS)
*** 893,898 ****
--- 2165,2173 ----
  			tmp = e->counters;
  			SpinLockRelease(&e->mutex);
  		}
+ 		/* Skip record of unexecuted query */
+ 		if (tmp.calls == 0)
+ 			continue;
  
  		values[i++] = Int64GetDatumFast(tmp.calls);
  		values[i++] = Float8GetDatumFast(tmp.total_time);
*************** pgss_memsize(void)
*** 950,963 ****
   * have made the entry while we waited to get exclusive lock.
   */
  static pgssEntry *
! entry_alloc(pgssHashKey *key)
  {
  	pgssEntry  *entry;
  	bool		found;
  
- 	/* Caller must have clipped query properly */
- 	Assert(key->query_len < pgss->query_size);
- 
  	/* Make space if needed */
  	while (hash_get_num_entries(pgss_hash) >= pgss_max)
  		entry_dealloc();
--- 2225,2235 ----
   * have made the entry while we waited to get exclusive lock.
   */
  static pgssEntry *
! entry_alloc(pgssHashKey *key, const char* query, int new_query_len)
  {
  	pgssEntry  *entry;
  	bool		found;
  
  	/* Make space if needed */
  	while (hash_get_num_entries(pgss_hash) >= pgss_max)
  		entry_dealloc();
*************** entry_alloc(pgssHashKey *key)
*** 969,985 ****
  	{
  		/* New entry, initialize it */
  
! 		/* dynahash tried to copy the key for us, but must fix query_ptr */
! 		entry->key.query_ptr = entry->query;
  		/* reset the statistics */
  		memset(&entry->counters, 0, sizeof(Counters));
  		entry->counters.usage = USAGE_INIT;
  		/* re-initialize the mutex each time ... we assume no one using it */
  		SpinLockInit(&entry->mutex);
  		/* ... and don't forget the query text */
! 		memcpy(entry->query, key->query_ptr, key->query_len);
! 		entry->query[key->query_len] = '\0';
  	}
  
  	return entry;
  }
--- 2241,2260 ----
  	{
  		/* New entry, initialize it */
  
! 		entry->query_len = new_query_len;
! 		Assert(entry->query_len > 0);
  		/* reset the statistics */
  		memset(&entry->counters, 0, sizeof(Counters));
  		entry->counters.usage = USAGE_INIT;
  		/* re-initialize the mutex each time ... we assume no one using it */
  		SpinLockInit(&entry->mutex);
  		/* ... and don't forget the query text */
! 		memcpy(entry->query, query, entry->query_len);
! 		Assert(new_query_len <= pgss->query_size);
! 		entry->query[entry->query_len] = '\0';
  	}
+ 	/* Caller must have clipped query properly */
+ 	Assert(entry->query_len < pgss->query_size);
  
  	return entry;
  }
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index cc3168d..84483ce
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copyPlannedStmt(const PlannedStmt *from
*** 92,97 ****
--- 92,98 ----
  	COPY_NODE_FIELD(relationOids);
  	COPY_NODE_FIELD(invalItems);
  	COPY_SCALAR_FIELD(nParamExec);
+ 	COPY_SCALAR_FIELD(queryId);
  
  	return newnode;
  }
*************** _copyQuery(const Query *from)
*** 2415,2420 ****
--- 2416,2422 ----
  
  	COPY_SCALAR_FIELD(commandType);
  	COPY_SCALAR_FIELD(querySource);
+ 	COPY_SCALAR_FIELD(queryId);
  	COPY_SCALAR_FIELD(canSetTag);
  	COPY_NODE_FIELD(utilityStmt);
  	COPY_SCALAR_FIELD(resultRelation);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
new file mode 100644
index 2295195..ce75da3
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 83,88 ****
--- 83,91 ----
  #define COMPARE_LOCATION_FIELD(fldname) \
  	((void) 0)
  
+ /* Compare a query_id field (this is a no-op, per note above) */
+ #define COMPARE_QUERYID_FIELD(fldname) \
+ 	((void) 0)
  
  /*
   *	Stuff from primnodes.h
*************** _equalQuery(const Query *a, const Query
*** 897,902 ****
--- 900,906 ----
  {
  	COMPARE_SCALAR_FIELD(commandType);
  	COMPARE_SCALAR_FIELD(querySource);
+ 	COMPARE_QUERYID_FIELD(query_id);
  	COMPARE_SCALAR_FIELD(canSetTag);
  	COMPARE_NODE_FIELD(utilityStmt);
  	COMPARE_SCALAR_FIELD(resultRelation);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
new file mode 100644
index 829f6d4..9646125
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 81,86 ****
--- 81,90 ----
  #define WRITE_LOCATION_FIELD(fldname) \
  	appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname)
  
+ /* Write a query id field */
+ #define WRITE_QUERYID_FIELD(fldname) \
+ 	((void) 0)
+ 
  /* Write a Node field */
  #define WRITE_NODE_FIELD(fldname) \
  	(appendStringInfo(str, " :" CppAsString(fldname) " "), \
*************** _outPlannedStmt(StringInfo str, const Pl
*** 255,260 ****
--- 259,265 ----
  	WRITE_NODE_FIELD(relationOids);
  	WRITE_NODE_FIELD(invalItems);
  	WRITE_INT_FIELD(nParamExec);
+ 	WRITE_QUERYID_FIELD(queryId);
  }
  
  /*
*************** _outQuery(StringInfo str, const Query *n
*** 2159,2164 ****
--- 2164,2170 ----
  
  	WRITE_ENUM_FIELD(commandType, CmdType);
  	WRITE_ENUM_FIELD(querySource, QuerySource);
+ 	WRITE_QUERYID_FIELD(query_id);
  	WRITE_BOOL_FIELD(canSetTag);
  
  	/*
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
new file mode 100644
index b9258ad..5ea0d52
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
***************
*** 110,115 ****
--- 110,119 ----
  	token = pg_strtok(&length);		/* get field value */ \
  	local_node->fldname = -1	/* set field to "unknown" */
  
+ /* Read a QueryId field - NO-OP */
+ #define READ_QUERYID_FIELD(fldname) \
+ 	((void) 0)
+ 
  /* Read a Node field */
  #define READ_NODE_FIELD(fldname) \
  	token = pg_strtok(&length);		/* skip :fldname */ \
*************** _readQuery(void)
*** 195,200 ****
--- 199,205 ----
  
  	READ_ENUM_FIELD(commandType, CmdType);
  	READ_ENUM_FIELD(querySource, QuerySource);
+ 	READ_QUERYID_FIELD(query_id);
  	READ_BOOL_FIELD(canSetTag);
  	READ_NODE_FIELD(utilityStmt);
  	READ_INT_FIELD(resultRelation);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 8bbe977..1b4030f
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** standard_planner(Query *parse, int curso
*** 240,245 ****
--- 240,246 ----
  	result->relationOids = glob->relationOids;
  	result->invalItems = glob->invalItems;
  	result->nParamExec = list_length(glob->paramlist);
+ 	result->queryId = parse->queryId;
  
  	return result;
  }
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
new file mode 100644
index b187b03..d6b0b4b
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
*************** static Query *transformExplainStmt(Parse
*** 65,70 ****
--- 65,72 ----
  static void transformLockingClause(ParseState *pstate, Query *qry,
  					   LockingClause *lc, bool pushedDown);
  
+ /* Hooks for plugins to get control of parse analysis */
+ post_parse_analyze_hook_type post_parse_analyze_hook = NULL;
  
  /*
   * parse_analyze
*************** parse_analyze(Node *parseTree, const cha
*** 93,98 ****
--- 95,103 ----
  
  	query = transformStmt(pstate, parseTree);
  
+ 	if (post_parse_analyze_hook)
+ 		(*post_parse_analyze_hook)(pstate, query);
+ 
  	free_parsestate(pstate);
  
  	return query;
*************** parse_analyze_varparams(Node *parseTree,
*** 123,128 ****
--- 128,136 ----
  	/* make sure all is well with parameter types */
  	check_variable_parameters(pstate, query);
  
+ 	if (post_parse_analyze_hook)
+ 		(*post_parse_analyze_hook)(pstate, query);
+ 
  	free_parsestate(pstate);
  
  	return query;
diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
new file mode 100644
index 6661a3d..1e04c0e
*** a/src/backend/parser/parse_coerce.c
--- b/src/backend/parser/parse_coerce.c
*************** coerce_type(ParseState *pstate, Node *no
*** 280,293 ****
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
! 		/* Use the leftmost of the constant's and coercion's locations */
! 		if (location < 0)
! 			newcon->location = con->location;
! 		else if (con->location >= 0 && con->location < location)
! 			newcon->location = con->location;
! 		else
! 			newcon->location = location;
! 
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
--- 280,286 ----
  		newcon->constlen = typeLen(targetType);
  		newcon->constbyval = typeByVal(targetType);
  		newcon->constisnull = con->constisnull;
! 		newcon->location = con->location;
  		/*
  		 * Set up to point at the constant's text if the input routine throws
  		 * an error.
diff --git a/src/backend/parser/parse_param.c b/src/backend/parser/parse_param.c
new file mode 100644
index cfe7262..482861f
*** a/src/backend/parser/parse_param.c
--- b/src/backend/parser/parse_param.c
*************** variable_coerce_param_hook(ParseState *p
*** 238,248 ****
  		 */
  		param->paramcollid = get_typcollation(param->paramtype);
  
- 		/* Use the leftmost of the param's and coercion's locations */
- 		if (location >= 0 &&
- 			(param->location < 0 || location < param->location))
- 			param->location = location;
- 
  		return (Node *) param;
  	}
  
--- 238,243 ----
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
new file mode 100644
index 04f9622..42a1b30
*** a/src/backend/rewrite/rewriteHandler.c
--- b/src/backend/rewrite/rewriteHandler.c
*************** static List *
*** 1839,1844 ****
--- 1839,1845 ----
  RewriteQuery(Query *parsetree, List *rewrite_events)
  {
  	CmdType		event = parsetree->commandType;
+ 	uint32		orig_query_id = parsetree->queryId;
  	bool		instead = false;
  	bool		returning = false;
  	Query	   *qual_product = NULL;
*************** RewriteQuery(Query *parsetree, List *rew
*** 2141,2146 ****
--- 2142,2154 ----
  					 errmsg("WITH cannot be used in a query that is rewritten by rules into multiple queries")));
  	}
  
+ 	/* Mark rewritten queries with their originating queryId */
+ 	foreach(lc1, rewritten)
+ 	{
+ 		Query	   *q = (Query *) lfirst(lc1);
+ 		q->queryId = orig_query_id;
+ 	}
+ 
  	return rewritten;
  }
  
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
new file mode 100644
index 49a3969..1ee06ea
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*************** pg_analyze_and_rewrite_params(Node *pars
*** 626,631 ****
--- 626,635 ----
  
  	query = transformStmt(pstate, parsetree);
  
+ 	/* Since we're not calling parse_analyze(), do this here */
+ 	if (post_parse_analyze_hook)
+ 		(*post_parse_analyze_hook)(pstate, query);
+ 
  	free_parsestate(pstate);
  
  	if (log_parser_stats)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
new file mode 100644
index 1d33ceb..9fb3c0f
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
*************** typedef struct Query
*** 103,108 ****
--- 103,111 ----
  
  	QuerySource querySource;	/* where did I come from? */
  
+ 	uint32		queryId;		/* query identifier that can be set by plugins.
+ 								 * Will be copied to resulting PlannedStmt. */
+ 
  	bool		canSetTag;		/* do I set the command result tag? */
  
  	Node	   *utilityStmt;	/* non-null if this is DECLARE CURSOR or a
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 7d90b91..3cec1be
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef struct PlannedStmt
*** 67,72 ****
--- 67,74 ----
  	List	   *invalItems;		/* other dependencies, as PlanInvalItems */
  
  	int			nParamExec;		/* number of PARAM_EXEC Params used */
+ 
+ 	uint32		queryId;		/* query identifier carried from query tree */
  } PlannedStmt;
  
  /* macro for fetching the Plan associated with a SubPlan node */
diff --git a/src/include/parser/analyze.h b/src/include/parser/analyze.h
new file mode 100644
index b8987db..4ff5762
*** a/src/include/parser/analyze.h
--- b/src/include/parser/analyze.h
***************
*** 16,21 ****
--- 16,25 ----
  
  #include "parser/parse_node.h"
  
+ /* Hook for plugins to get control in parse_analyze() */
+ typedef void (*post_parse_analyze_hook_type) (ParseState *pstate, Query*
+ 		post_analysis_tree);
+ extern PGDLLIMPORT post_parse_analyze_hook_type post_parse_analyze_hook;
  
  extern Query *parse_analyze(Node *parseTree, const char *sourceText,
  			  Oid *paramTypes, int numParams);

#54

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#51)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

I've attached a patch with the required modifications.

I've committed the core-backend parts of this, just to get them out of
the way. Have yet to look at the pg_stat_statements code itself.

I restored the location field to the ParamCoerceHook signature, but
the removal of code to modify the param location remains (again, not
because I need it, but because I happen to think that it ought to be
consistent with Const).

I ended up choosing not to apply that bit. I remain of the opinion that
this behavior is fundamentally inconsistent with the general rules for
assigning parse locations to analyzed constructs, and I see no reason to
propagate that inconsistency further than we absolutely have to.

regards, tom lane

#55

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#54)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 27 March 2012 20:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I've committed the core-backend parts of this, just to get them out of
the way. Have yet to look at the pg_stat_statements code itself.

Thanks. I'm glad that we have that out of the way.

I ended up choosing not to apply that bit. I remain of the opinion that
this behavior is fundamentally inconsistent with the general rules for
assigning parse locations to analyzed constructs, and I see no reason to
propagate that inconsistency further than we absolutely have to.

Fair enough.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#56

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#53)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

[ just for the archives' sake ]

Peter Geoghegan <peter@2ndquadrant.com> writes:

On 27 March 2012 18:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Now, if what it wants to know about is the parameterization status
of the query, things aren't ideal because most of the info is hidden
in parse-callback fields that aren't of globally exposed types. However
we could at least duplicate the behavior you have here, because you're
only passing canonicalize = true in cases where no parse callback will
be registered at all, so pg_stat_statements could equivalently test for
pstate->p_paramref_hook == NULL.

It has been suggested to me before that comparisons with function
pointers - using them as a flag, in effect - is generally iffy, but
that particular usage seems reasonable to me.

Well, testing function pointers for null is certainly OK --- note that
all our hook function call sites do that. It's true that testing for
equality to a particular function's name can fail on some platforms
because of jump table hacks. Thus for example, if you had a need to
know that parse_variable_parameters parameter management was in use,
it wouldn't do to test whether p_coerce_param_hook ==
variable_coerce_param_hook. (Not that you could anyway, what with that
being a static function, but exposing it as global wouldn't offer a safe
solution.)

If we had a need to make this information available, I think what we'd
want to do is insist that p_ref_hook_state entries be subclasses of
Node, so that plugins could apply IsA tests on the node tag to figure
out what style of parameter management was in use. This would also mean
exposing the struct definitions globally, which you'd need anyway else
the plugins couldn't safely access the struct contents.

I don't particularly want to go there without very compelling reasons,
but that would be the direction to head in if we had to.

regards, tom lane

#57

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#56)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

[ also for the archives' sake ]

On 27 March 2012 22:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Well, testing function pointers for null is certainly OK --- note that
all our hook function call sites do that. It's true that testing for
equality to a particular function's name can fail on some platforms
because of jump table hacks.

I was actually talking about stylistic iffiness. This seems contrary
to the standard, which states:

(ISO C 99 section 6.5.9.6)
"Two pointers compare equal if and only if both are null pointers,
both are pointers to the same object (...) or function,
both are pointers to one past the last element of the same array object,
or one is a pointer to one past the end of one array object and the
other is a pointer to the start of a different array object that happens
to immediately follow the first array object in the address space."

However, the fly in the ointment is IA-64 (Itanic), which apparently
at least at one stage had broken function pointer comparisons, at
least when code was built using some version(s) of GCC.

I found it a bit difficult to square your contention that performing
function pointer comparisons against function addresses was what
sounded like undefined behaviour, and yet neither GCC nor Clang
complained. However, in light of what I've learned about IA-64, I can
certainly see why we as a project would avoid the practice.

Source: http://gcc.gnu.org/ml/gcc/2003-06/msg01283.html

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#58

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#54)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 27 March 2012 20:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Have yet to look at the pg_stat_statements code itself.

I merged upstream changes with the intention of providing a new patch
for you to review. I found a problem that I'd guess was introduced by
commit 9dbf2b7d75de5af38d087cbe2b1147dd0fd10f0a, "Restructure SELECT
INTO's parsetree representation into CreateTableAsStmt". This has
nothing to do with my patch in particular.

I noticed that my tests broke, on queries like "select * into
orders_recent FROM orders WHERE orderdate >= '2002-01-01';". Since
commands like that are now utility statements, I thought it best to
just hash the query string directly, along with all other utility
statements - richer functionality would be unlikely to be missed all
that much, and that's a fairly clean demarcation that I don't want to
make blurry.

In the existing pg_stat_statements code in HEAD, there are 2
pgss_store call sites - one in pgss_ProcessUtility, and the other in
pgss_ExecutorFinish. There is an implicit assumption in the extant
code (and my patch too) that there will be exactly one pgss_store call
per query execution. However, that assumption appears to now fall
down, as illustrated by the GDB session below. What's more, our new
hook is called twice, which is arguably redundant.

(gdb) break pgss_parse_analyze
Breakpoint 1 at 0x7fbd17b96790: file pg_stat_statements.c, line 640.
(gdb) break pgss_ProcessUtility
Breakpoint 2 at 0x7fbd17b962b4: file pg_stat_statements.c, line 1710.
(gdb) break pgss_ExecutorEnd
Breakpoint 3 at 0x7fbd17b9618c: file pg_stat_statements.c, line 1674.
(gdb) c
Continuing.

< I execute the command "select * into orders_recent FROM orders WHERE
orderdate >= '2002-01-01';" >

Breakpoint 1, pgss_parse_analyze (pstate=0x2473bc0,
post_analysis_tree=0x2474010) at pg_stat_statements.c:640
640 if (post_analysis_tree->commandType != CMD_UTILITY)
(gdb) c
Continuing.

Breakpoint 2, pgss_ProcessUtility (parsetree=0x2473cd8,
queryString=0x2472a88 "select * into orders_recent FROM orders WHERE
orderdate >= '2002-01-01';", params=0x0,
isTopLevel=1 '\001', dest=0x2474278, completionTag=0x7fff74e481e0
"") at pg_stat_statements.c:1710
1710 if (pgss_track_utility && pgss_enabled())
(gdb) c
Continuing.

Breakpoint 3, pgss_ExecutorEnd (queryDesc=0x24c9660) at
pg_stat_statements.c:1674
1674 if (queryDesc->totaltime && pgss_enabled())
(gdb) c
Continuing.

What do you think we should do about this?

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#59

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#58)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

I merged upstream changes with the intention of providing a new patch
for you to review. I found a problem that I'd guess was introduced by
commit 9dbf2b7d75de5af38d087cbe2b1147dd0fd10f0a, "Restructure SELECT
INTO's parsetree representation into CreateTableAsStmt". This has
nothing to do with my patch in particular.

Yeah, I already deleted the intoClause chunk from the patch. I think
treating SELECT INTO as a utility statement is probably fine, at least
for now.

In the existing pg_stat_statements code in HEAD, there are 2
pgss_store call sites - one in pgss_ProcessUtility, and the other in
pgss_ExecutorFinish. There is an implicit assumption in the extant
code (and my patch too) that there will be exactly one pgss_store call
per query execution. However, that assumption appears to now fall
down, as illustrated by the GDB session below. What's more, our new
hook is called twice, which is arguably redundant.

That's been an issue right along for cases such as EXPLAIN and EXECUTE,
I believe. Perhaps the right thing is to consider such executor calls
as nested statements --- that is, the ProcessUtility hook ought to
bump the nesting depth too.

regards, tom lane

#60

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#59)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 28 March 2012 15:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:

That's been an issue right along for cases such as EXPLAIN and EXECUTE,
I believe.

Possible, since I didn't have test coverage for either of those 2 commands.

Perhaps the right thing is to consider such executor calls

as nested statements --- that is, the ProcessUtility hook ought to
bump the nesting depth too.

That makes a lot of sense, but it might spoil things for the
pg_stat_statements.track = 'top' + pg_stat_statements.track_utility =
'on' case. At the very least, it's a POLA violation, to the extent
that if you were going to do this, you might mandate that nested
statements be tracked along with utility statements (probably while
defaulting to having both off, which would be a change).

Since you've already removed the intoClause chunk, I'm not sure how
far underway the review effort is - would you like me to produce a new
revision, or is that unnecessary?

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#61

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#58)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

A couple other issues about this patch ...

Is there any actual benefit in providing the
"pg_stat_statements.string_key" GUC? It looks to me more like something
that was thrown in because it was easy than because anybody would want
it. I'd just as soon leave it out and avoid the incremental API
complexity increase. (While on that subject, I see no documentation
updates in the patch...)

Also, I'm not terribly happy with the "sticky entries" hack. The way
you had it set up with a 1e10 bias for a sticky entry was completely
unreasonable IMO, because if the later pgss_store call never happens
(which is quite possible if the statement contains an error detected
during planning or execution), that entry is basically never going to
age out, and will just uselessly consume a precious table slot for a
long time. In the extreme, somebody could effectively disable query
tracking by filling the hashtable with variants of "SELECT 1/0".
The best quick answer I can think of is to reduce USAGE_NON_EXEC_STICK
to maybe 10 or so, but I wonder whether there's some less klugy way to
get the result in the first place. I thought about keeping the
canonicalized query string around locally in the backend rather than
having the early pgss_store call at all, but am not sure it's worth
the complexity of an additional local hashtable or some such to hold
such pending entries.

regards, tom lane

#62

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#60)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

Since you've already removed the intoClause chunk, I'm not sure how
far underway the review effort is - would you like me to produce a new
revision, or is that unnecessary?

I've whacked it around to the point that that wouldn't be too helpful
as far as the code goes. (Just for transparency I'll attach what I've
currently got, which mostly consists of getting rid of the static state
and cleaning up the scanner interface a bit. I've not yet touched the
jumble-producing code, but I think it needs work too.) However, if
you've got or can produce the appropriate documentation updates, that
would save me some time.

regards, tom lane

#63

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Peter Geoghegan (#1)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 28 March 2012 15:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Is there any actual benefit in providing the
"pg_stat_statements.string_key" GUC? It looks to me more like something
that was thrown in because it was easy than because anybody would want
it. I'd just as soon leave it out and avoid the incremental API
complexity increase. (While on that subject, I see no documentation
updates in the patch...)

Personally, I don't care for it, and I'm sure most users wouldn't
either, but I thought that someone somewhere might be relying on the
existing behaviour.

I will produce a doc-patch. It would have been premature to do so
until quite recently.

Also, I'm not terribly happy with the "sticky entries" hack. The way
you had it set up with a 1e10 bias for a sticky entry was completely
unreasonable IMO, because if the later pgss_store call never happens
(which is quite possible if the statement contains an error detected
during planning or execution), that entry is basically never going to
age out, and will just uselessly consume a precious table slot for a
long time. In the extreme, somebody could effectively disable query
tracking by filling the hashtable with variants of "SELECT 1/0".
The best quick answer I can think of is to reduce USAGE_NON_EXEC_STICK
to maybe 10 or so, but I wonder whether there's some less klugy way to
get the result in the first place. I thought about keeping the
canonicalized query string around locally in the backend rather than
having the early pgss_store call at all, but am not sure it's worth
the complexity of an additional local hashtable or some such to hold
such pending entries.

I was troubled by that too, and had considered various ways of at
least polishing the kludge. Maybe a better approach would be to start
with a usage of 1e10 (or something rather high, anyway), but apply a
much more aggressive multiplier than USAGE_DECREASE_FACTOR for sticky
entries only? That way, in earlier calls of entry_dealloc() the sticky
entries, easily identifiable as having 0 calls, are almost impossible
to evict, but after a relatively small number of calls they soon
become more readily evictable.

This seems better than simply having some much lower usage that is
only a few times the value of USAGE_INIT.

Let's suppose we set sticky entries to have a usage value of 10. If
all other queries have more than 10 calls, which is not unlikely
(under the current usage format, 1.0 usage = 1 call, at least until
entry_dealloc() dampens usage) then when we entry_dealloc(), the
sticky entry might as well have a usage of 1, and has no way of
increasing its usage short of becoming a "real" entry.

On the other hand, with the multiplier trick, how close the sticky
entry is to eviction is, importantly, far more strongly influenced by
the number of entry_dealloc() calls, which in turn is influenced by
churn in the system, rather than being largely influenced by how the
magic sticky usage value happens to compare to those usage values of
some random set of "real" entries on some random database. If entries
really are precious, then the sticky entry is freed much sooner. If
not, then why not allow the sticky entry to stick around pending its
execution/ promotion to a "real" entry?

It would probably be pretty inexpensive to maintain what is currently
the largest usage value in the hash table at entry_dealloc() time -
that would likely be far more suitable than 1e10, and might even work
well. We could perhaps cut that in half every entry_dealloc().

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Import Notes

Reply to msg id not found: 24433.1332948865@sss.pgh.pa.us

#64

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#63)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

On 28 March 2012 15:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Is there any actual benefit in providing the
"pg_stat_statements.string_key" GUC? It looks to me more like something
that was thrown in because it was easy than because anybody would want
it. I'd just as soon leave it out and avoid the incremental API
complexity increase. (While on that subject, I see no documentation
updates in the patch...)

Personally, I don't care for it, and I'm sure most users wouldn't
either, but I thought that someone somewhere might be relying on the
existing behaviour.

Hearing no squawks, I will remove it from the committed patch; one
less thing to document. Easy enough to put back later, if someone
makes a case for it.

Also, I'm not terribly happy with the "sticky entries" hack.

I was troubled by that too, and had considered various ways of at
least polishing the kludge. Maybe a better approach would be to start
with a usage of 1e10 (or something rather high, anyway), but apply a
much more aggressive multiplier than USAGE_DECREASE_FACTOR for sticky
entries only? That way, in earlier calls of entry_dealloc() the sticky
entries, easily identifiable as having 0 calls, are almost impossible
to evict, but after a relatively small number of calls they soon
become more readily evictable.

I did some simple experiments with the regression tests. Now, those
tests are by far a worst case for this sort of thing, since (a) they
probably generate many more unique queries than a typical production
application, and (b) they almost certainly provoke many more errors
and hence more dead sticky entries than a typical production app.
Nonetheless, the results look pretty bad. Using various values of
USAGE_NON_EXEC_STICK, the numbers of useful and dead entries in the hash
table after completing one round of regression tests was:

STICK live entries dead sticky entries

10.0 780 190
5.0 858 112
4.0 874 96
3.0 911 62
2.0 918 43

I did not bother measuring 1e10 ;-). It's clear that sticky entries
are forcing useful entries out of the table in this scenario.
I think wasting more than about 10% of the table in this way is not
acceptable.

I'm planning to commit the patch with a USAGE_NON_EXEC_STICK value
of 3.0, which is the largest value that stays below 10% wastage.
We can twiddle that logic later, so if you want to experiment with an
alternate decay rule, feel free.

regards, tom lane

#65

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#64)

1 attachment(s)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 29 March 2012 00:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm planning to commit the patch with a USAGE_NON_EXEC_STICK value
of 3.0, which is the largest value that stays below 10% wastage.
We can twiddle that logic later, so if you want to experiment with an
alternate decay rule, feel free.

I think I may well end up doing so when I get a chance. This seems
like the kind of problem that will be solved only when we get some
practical experience (i.e. use the tool on something closer to a
production system than the regression tests).

doc-patch is attached. I'm not sure if I got the balance right - it
may be on the verbose side.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

pg_stat_statements_norm_docs.patchapplication/octet-stream; name=pg_stat_statements_norm_docs.patchDownload

diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
new file mode 100644
index ca7bd44..2355f1a
*** a/doc/src/sgml/pgstatstatements.sgml
--- b/doc/src/sgml/pgstatstatements.sgml
***************
*** 24,33 ****
  
    <para>
     The statistics gathered by the module are made available via a system view
!    named <structname>pg_stat_statements</>.  This view contains one row for
!    each distinct query text, database ID, and user ID (up to the maximum
!    number of distinct statements that the module can track).  The columns
!    of the view are shown in <xref linkend="pgstatstatements-columns">.
    </para>
  
    <table id="pgstatstatements-columns">
--- 24,78 ----
  
    <para>
     The statistics gathered by the module are made available via a system view
!    named <structname>pg_stat_statements</>.  The module normalizes queries,
!    matching them based on an internal hash value.  
!    <structname>pg_stat_statements</> contains one row for each distinct
!    combination of internal query hash, database ID, and user ID (up to the
!    maximum number of distinct statements that the module can track).  The
!    columns of the view are shown in 
!    <xref linkend="pgstatstatements-columns">.
!   </para>
! 
!   <para>
!    Utility commands are all those other than <command>SELECT</>,
!    <command>INSERT</>, <command>UPDATE</> and <command>DELETE</>.  In the case
!    of utility statements, the query hash value identifier is derived from the
!    query string.  In the case of all other statements, the statement's query
!    tree is hashed to produce its query hash identifier.  The query tree is an
!    internal structure that results from the transformation process, later in the
!    parser stage, immediately prior to the rewrite stage.  Queries that are
!    deemed to be equivalent, due to not actually differing in ways that the
!    implementation deems essential to the query, have execution statistics
!    aggregated into a single entry. This is particularly useful for queries with
!    inline parameters.
!   </para>
! 
!   <para>
!    Matching queries based on their query tree hash value can produce results
!    that are not consistent with an implementation that differentiates based on
!    each query's SQL string, in non-obvious ways. For example, the current value
!    of <varname>search_path</> might affect query differentiation if a change
!    caused parse analysis to resolve different relations at different times, such
!    as relations that have identical definitions but are located in different
!    schemas. The two resulting distinct entries would have identical query
!    strings. 
!   </para>
! 
!   <para>
!    The execution costs of <command>DO INSTEAD</> and <command>DO ALSO</> rules
!    are attributed to their originating query; a separate entry will not be
!    created for the rule action. <command>DO NOTHING</> rules will naturally not
!    add an entry at all, since a query is not actually executed. 
!   </para>
! 
!   <para>
!    <filename>pg_stat_statements</filename> considers some utility commands to
!    consist of two distinct commands: The utility command proper, and a nested
!    <acronym>DML</> command.  Such utility commands include <command>DECLARE
!    CURSOR</>, <command>SELECT INTO</> and <command>EXPLAIN ANALYZE</>.
!    Therefore, to see the true execution costs of these statements, it is
!    necessary to set both pg_stat_statements.track_utility to 'on' and
!    pg_stat_statements.track to 'all'.
    </para>
  
    <table id="pgstatstatements-columns">
***************
*** 61,67 ****
        <entry><structfield>query</structfield></entry>
        <entry><type>text</type></entry>
        <entry></entry>
!       <entry>Text of the statement (up to <xref linkend="guc-track-activity-query-size"> bytes)</entry>
       </row>
  
       <row>
--- 106,112 ----
        <entry><structfield>query</structfield></entry>
        <entry><type>text</type></entry>
        <entry></entry>
!       <entry>Representative, canonicalized text of the statement (up to <xref linkend="guc-track-activity-query-size"> bytes)</entry>
       </row>
  
       <row>
***************
*** 193,206 ****
     queries executed by other users.  They can see the statistics, however,
     if the view has been installed in their database.
    </para>
  
!   <para>
!    Note that statements are considered the same if they have the same text,
!    regardless of the values of any out-of-line parameters used in the
!    statement.  Using out-of-line parameters will help to group statements
!    together and may make the statistics more useful.
!   </para>
!  </sect2>
  
   <sect2>
    <title>Functions</title>
--- 238,256 ----
     queries executed by other users.  They can see the statistics, however,
     if the view has been installed in their database.
    </para>
+   <warning>
+    <para>
+     A notable artefact of the module's query hash matching based implementation
+     is that there may be undetectable collisions. While such collisions are very
+     unlikely, the possibility cannot be absolutely precluded, and as such the
+     count values of two distinct queries may be incorrectly aggregated together
+     as a single entry within <structname>pg_stat_statements</> in isolated
+     cases. However, it is guaranteed that such collisions cannot occur between
+     entries for different databases or different users.
+    </para>
+   </warning>
  
!   </sect2>
  
   <sect2>
    <title>Functions</title>
***************
*** 254,264 ****
       <para>
        <varname>pg_stat_statements.track</varname> controls which statements
        are counted by the module.
!       Specify <literal>top</> to track top-level statements (those issued
!       directly by clients), <literal>all</> to also track nested statements
!       (such as statements invoked within functions), or <literal>none</> to
!       disable.
!       The default value is <literal>top</>.
        Only superusers can change this setting.
       </para>
      </listitem>
--- 304,314 ----
       <para>
        <varname>pg_stat_statements.track</varname> controls which statements
        are counted by the module.
! 	  Specify <literal>top</> to track top-level statements (those issued
! 	  directly by clients), <literal>all</> to also track nested statements
! 	  (such as statements invoked within functions), or <literal>none</> to
! 	  disable.
! 	  The default value is <literal>top</>.
        Only superusers can change this setting.
       </para>
      </listitem>
***************
*** 271,282 ****
  
      <listitem>
       <para>
!       <varname>pg_stat_statements.track_utility</varname> controls whether
!       utility commands are tracked by the module.  Utility commands are
!       all those other than <command>SELECT</>, <command>INSERT</>,
!       <command>UPDATE</> and <command>DELETE</>.
!       The default value is <literal>on</>.
!       Only superusers can change this setting.
       </para>
      </listitem>
     </varlistentry>
--- 321,329 ----
  
      <listitem>
       <para>
! 	  <varname>pg_stat_statements.track_utility</varname> controls whether
! 	  utility commands are tracked by the module.  The default value is
! 	  <literal>on</>.  Only superusers can change this setting.
       </para>
      </listitem>
     </varlistentry>
*************** pg_stat_statements.track = all
*** 329,348 ****
  bench=# SELECT pg_stat_statements_reset();
  
  $ pgbench -i bench
! $ pgbench -c10 -t300 -M prepared bench
  
  bench=# \x
  bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit /
                 nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
            FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;
  -[ RECORD 1 ]---------------------------------------------------------------------
! query       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
  calls       | 3000
  total_time  | 9.60900100000002
  rows        | 2836
  hit_percent | 99.9778970000200936
  -[ RECORD 2 ]---------------------------------------------------------------------
! query       | UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2;
  calls       | 3000
  total_time  | 8.015156
  rows        | 2990
--- 376,395 ----
  bench=# SELECT pg_stat_statements_reset();
  
  $ pgbench -i bench
! $ pgbench -c10 -t300
  
  bench=# \x
  bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit /
                 nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
            FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;
  -[ RECORD 1 ]---------------------------------------------------------------------
! query       | UPDATE pgbench_branches SET bbalance = bbalance + ? WHERE bid = ?;
  calls       | 3000
  total_time  | 9.60900100000002
  rows        | 2836
  hit_percent | 99.9778970000200936
  -[ RECORD 2 ]---------------------------------------------------------------------
! query       | UPDATE pgbench_tellers SET tbalance = tbalance + ? WHERE tid = ?;
  calls       | 3000
  total_time  | 8.015156
  rows        | 2990
*************** total_time  | 0.310624
*** 354,360 ****
  rows        | 100000
  hit_percent | 0.30395136778115501520
  -[ RECORD 4 ]---------------------------------------------------------------------
! query       | UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2;
  calls       | 3000
  total_time  | 0.271741999999997
  rows        | 3000
--- 401,407 ----
  rows        | 100000
  hit_percent | 0.30395136778115501520
  -[ RECORD 4 ]---------------------------------------------------------------------
! query       | UPDATE pgbench_accounts SET abalance = abalance + ? WHERE aid = ?;
  calls       | 3000
  total_time  | 0.271741999999997
  rows        | 3000

#66

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#65)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

doc-patch is attached. I'm not sure if I got the balance right - it
may be on the verbose side.

Thanks. I've committed the patch along with the docs, after rather
heavy editorialization.

There remain some loose ends that should be worked on but didn't seem
like commit-blockers:

1. What to do with EXPLAIN, SELECT INTO, etc. We had talked about
tweaking the behavior of statement nesting and some other possibilities.
I think clearly this could use improvement but I'm not sure just how.
(Note: I left out the part of your docs patch that attempted to explain
the current behavior, since I think we should fix it not document it.)

2. Whether and how to adjust the aging-out of sticky entries. This
seems like a research project, but the code impact should be quite
localized.

BTW, I eventually concluded that the parameterization testing we were
worried about before was a red herring. As committed, the patch tries
to store a normalized string if it found any deletable constants, full
stop. This seems to me to be correct behavior because the presence of
constants is exactly what makes the string normalizable, and such
constants *will* be ignored in the hash calculation no matter whether
there are other parameters or not.

regards, tom lane

#67

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Tom Lane (#66)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

BTW, I forgot to mention that I did experiment with your python-based
test script for pg_stat_statements, but decided not to commit it.
There are just too many external dependencies for my taste:

1. python
2. psycopg2
3. dellstore2 test database

That coupled with the apparent impossibility of running the script
without manual preconfiguration makes it look not terribly useful.

Also, as of the committed patch there are several individual tests that
fail or need adjustment:

The SELECT INTO tests all fail, but we know the reason why (the testbed
isn't expecting them to result in creating separate entries for the
utility statement and the underlying plannable SELECT).

verify_statement_equivalency("select a.orderid from orders a join orders b on a.orderid = b.orderid",
"select b.orderid from orders a join orders b on a.orderid = b.orderid", conn)

These are not equivalent statements, or at least would not be if the
join condition were anything else than what it is, so the fact that the
original coding failed to distinguish the targetlist entries is a bug.

The test
# temporary column name within recursive CTEs doesn't differentiate
fails, not because of the change of column name, but because of the
change of CTE name. This is a consequence of my having used the CTE
name here:

case RTE_CTE:

/*
* Depending on the CTE name here isn't ideal, but it's the
* only info we have to identify the referenced WITH item.
*/
APP_JUMB_STRING(rte->ctename);
APP_JUMB(rte->ctelevelsup);
break;

We could avoid the name dependency by omitting ctename from the jumble
but I think that cure is worse than the disease.

Anyway, not too important, but I just thought I'd document this in case
you were wondering about the discrepancies.

regards, tom lane

#68

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Tom Lane (#67)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Wed, Mar 28, 2012 at 10:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The SELECT INTO tests all fail, but we know the reason why (the testbed
isn't expecting them to result in creating separate entries for the
utility statement and the underlying plannable SELECT).

This might be a dumb idea, but for a quick hack, could we just rig
SELECT INTO, CREATE TABLE AS, and EXPLAIN not to create entries for
themselves at all, without suppressing creation of an entry for the
underlying query? The output might be slightly misleading but it
wouldn't be broken, and I'm disinclined to want to spend a lot of time
doing fine-tuning right now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#69

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#66)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 29 March 2012 02:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thanks. I've committed the patch along with the docs, after rather
heavy editorialization.

Thank you.

1. What to do with EXPLAIN, SELECT INTO, etc. We had talked about
tweaking the behavior of statement nesting and some other possibilities.
I think clearly this could use improvement but I'm not sure just how.
(Note: I left out the part of your docs patch that attempted to explain
the current behavior, since I think we should fix it not document it.)

Yeah, this is kind of unsatisfactory. Nobody would expect the module
to behave this way. On the other hand, users probably aren't hugely
interested in this information.

I'm still kind of attached to the idea of exposing the hash value in
the view. It could be handy in replication situations, to be able to
aggregate statistics across the cluster (assuming you're using
streaming replication and not a trigger based system). You need a
relatively stable identifier to be able to do that. You've already
sort-of promised to not break the format in point releases, because it
is serialised to disk, and may have to persist for months or years.
Also, it will drive home the reality of what's going on in situations
like this (from the docs):

"In some cases, queries with visibly different texts might get merged
into a single pg_stat_statements entry. Normally this will happen only
for semantically equivalent queries, but there is a small chance of
hash collisions causing unrelated queries to be merged into one entry.
(This cannot happen for queries belonging to different users or
databases, however.)

Since the hash value is computed on the post-parse-analysis
representation of the queries, the opposite is also possible: queries
with identical texts might appear as separate entries, if they have
different meanings as a result of factors such as different
search_path settings."

2. Whether and how to adjust the aging-out of sticky entries. This
seems like a research project, but the code impact should be quite
localized.

As I said, I'll try and give it some thought, and do some experiments.

BTW, I eventually concluded that the parameterization testing we were
worried about before was a red herring. As committed, the patch tries
to store a normalized string if it found any deletable constants, full
stop. This seems to me to be correct behavior because the presence of
constants is exactly what makes the string normalizable, and such
constants *will* be ignored in the hash calculation no matter whether
there are other parameters or not.

Yeah, that aspect of not canonicalising parametrised entries had
bothered me. I guess we're better off gracefully handling the problem
across the board, rather than attempting to band-aid the problem up by
just not having speculative hashtable entries in cases where they
arguably are not so useful. Avoiding canonicalising those constants
was somewhat misleading.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#70

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Robert Haas (#68)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Mar 28, 2012 at 10:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The SELECT INTO tests all fail, but we know the reason why (the testbed
isn't expecting them to result in creating separate entries for the
utility statement and the underlying plannable SELECT).

This might be a dumb idea, but for a quick hack, could we just rig
SELECT INTO, CREATE TABLE AS, and EXPLAIN not to create entries for
themselves at all, without suppressing creation of an entry for the
underlying query?

It would make more sense to me to go the other way, that is suppress
creation of a separate entry for the contained optimizable statement.
The stats will still be correctly accumulated into the surrounding
statement (or at least, if they are not, that's a separate pre-existing
bug). If we do it in the direction you suggest, we'll fail to capture
costs incurred outside execution of the contained statement.

Right now, we already have logic in there to track nesting of statements
in a primitive way, that is just count the nesting depth. My first idea
about fixing this was to tweak that logic so that it stacks a flag
saying "we're in a utility statement that contains an optimizable
statement", and then the first layer of Executor hooks that sees that
flag set would know to not do anything. However this isn't quite good
enough because that first layer might not be for the "same" statement.
As an example, in an EXPLAIN ANALYZE the planner might pre-execute
immutable or stable SQL functions before we reach the executor. We
would prefer that any statements embedded in such a function still be
seen as independent nested statements.

However, I think there is a solution for that, though it may sound a bit
ugly. Rather than just stacking a flag, let's stack the query source
text pointer for the utility statement. Then in the executor hooks,
if that pointer is *pointer* equal (not strcmp equal) to the optimizable
statement's source-text pointer, we know we are executing the "same"
statement as the surrounding utility command, and we do nothing.

This looks like it would work for the SELECT INTO and EXPLAIN cases,
and for DECLARE CURSOR whenever that gets changed to a less bizarre
structure. It would not work for EXECUTE, because in that case we
pass the query string saved from PREPARE to the executor. However,
we could possibly do a special-case hack for EXECUTE; maybe ask
prepare.c for the statement's string and stack that instead of the
outer EXECUTE query string.

Thoughts?

regards, tom lane

#71

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Tom Lane (#70)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Thu, Mar 29, 2012 at 11:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It would make more sense to me to go the other way, that is suppress
creation of a separate entry for the contained optimizable statement.
The stats will still be correctly accumulated into the surrounding
statement (or at least, if they are not, that's a separate pre-existing
bug). If we do it in the direction you suggest, we'll fail to capture
costs incurred outside execution of the contained statement.

All things being equal, I completely agree. However, ISTM that the
difficulty of implementation might be higher for your proposal, for
the reasons you go on to state. If getting it right means that other
significant features are not going to get committed at all for 9.2, I
think we could leave this as a TODO.

Right now, we already have logic in there to track nesting of statements
in a primitive way, that is just count the nesting depth. My first idea
about fixing this was to tweak that logic so that it stacks a flag
saying "we're in a utility statement that contains an optimizable
statement", and then the first layer of Executor hooks that sees that
flag set would know to not do anything. However this isn't quite good
enough because that first layer might not be for the "same" statement.
As an example, in an EXPLAIN ANALYZE the planner might pre-execute
immutable or stable SQL functions before we reach the executor. We
would prefer that any statements embedded in such a function still be
seen as independent nested statements.

However, I think there is a solution for that, though it may sound a bit
ugly. Rather than just stacking a flag, let's stack the query source
text pointer for the utility statement. Then in the executor hooks,
if that pointer is *pointer* equal (not strcmp equal) to the optimizable
statement's source-text pointer, we know we are executing the "same"
statement as the surrounding utility command, and we do nothing.

Without wishing to tick you off, that sounds both ugly and fragile.
Can't we find a way to set the stacked flag (on the top stack frame)
after planning and before execution?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#72

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Robert Haas (#71)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Mar 29, 2012 at 11:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, I think there is a solution for that, though it may sound a bit
ugly. Rather than just stacking a flag, let's stack the query source
text pointer for the utility statement. Then in the executor hooks,
if that pointer is *pointer* equal (not strcmp equal) to the optimizable
statement's source-text pointer, we know we are executing the "same"
statement as the surrounding utility command, and we do nothing.

Without wishing to tick you off, that sounds both ugly and fragile.

What do you object to --- the pointer-equality part? We could do strcmp
comparison instead, on the assumption that a utility command could not
look the same as an optimizable statement except in the case we care
about. I think that's probably unnecessary though.

Can't we find a way to set the stacked flag (on the top stack frame)
after planning and before execution?

That would require a way for pg_stat_statements to get control at rather
random places in several different types of utility statements. And
if we did add hook functions in those places, you'd still need to have
sufficient stacked context for those hooks to know what to do, which
leads you right back to this I think.

regards, tom lane

#73

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Tom Lane (#72)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Thu, Mar 29, 2012 at 11:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Mar 29, 2012 at 11:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, I think there is a solution for that, though it may sound a bit
ugly. Rather than just stacking a flag, let's stack the query source
text pointer for the utility statement. Then in the executor hooks,
if that pointer is *pointer* equal (not strcmp equal) to the optimizable
statement's source-text pointer, we know we are executing the "same"
statement as the surrounding utility command, and we do nothing.

Without wishing to tick you off, that sounds both ugly and fragile.

What do you object to --- the pointer-equality part? We could do strcmp
comparison instead, on the assumption that a utility command could not
look the same as an optimizable statement except in the case we care
about. I think that's probably unnecessary though.

The pointer equality part seems like the worst ugliness, yes.

Can't we find a way to set the stacked flag (on the top stack frame)
after planning and before execution?

That would require a way for pg_stat_statements to get control at rather
random places in several different types of utility statements. And
if we did add hook functions in those places, you'd still need to have
sufficient stacked context for those hooks to know what to do, which
leads you right back to this I think.

What I'm imagining is that instead of just having a global for
nested_level, you'd have a global variable pointing to a linked list.
The length of the list would be equal to what we currently call
nested_level + 1. Something like this:

struct pgss_nesting_info
{
struct pgss_nesting_info *next;
int flag; /* bad name */
};
static pgss_nesting_info *pgss_stack_top;

So any test for nesting_depth == 0 would instead test
pgss_stack_top->next != NULL.

Then, when you get control at the relevant spots, you set
pgss_stack_top->flag = 1 and that's it. Now, maybe it's too ugly to
think about passing control at those spots; I'm surprised there's not
a central point they all go through...

Another thought is: if we simply treated these as nested queries for
all purposes, would that really be so bad?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#74

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Robert Haas (#73)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Robert Haas <robertmhaas@gmail.com> writes:

What I'm imagining is that instead of just having a global for
nested_level, you'd have a global variable pointing to a linked list.

This is more or less what I have in mind, too, except I do not believe
that a mere boolean flag is sufficient to tell the difference between
an executor call that you want to suppress logging for and one that
you do not. You need some more positive way of identifying the target
statement than that, and what I propose that be is the statement's query
string.

regards, tom lane

#75

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Robert Haas (#73)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

[ forgot to respond to this bit ]

Robert Haas <robertmhaas@gmail.com> writes:

Another thought is: if we simply treated these as nested queries for
all purposes, would that really be so bad?

That was actually what I suggested first, and now that I look at the
code, that's exactly what's happening right now. However, Peter pointed
out --- I think rightly --- that this fails to satisfy the principle
of least astonishment, because the user doesn't think of these
statements as being utility statements with other statements wrapped
inside. Users are going to expect these cases to be treated as single
statements.

The issue existed before this patch, BTW, but was partially masked by
the fact that we grouped pg_stat_statements view entries strictly by
query text, so that both the utility statement and the contained
optimizable statement got matched to the same table entry. The
execution costs were getting double-counted, but apparently nobody
noticed that. As things now stand, the utility statement and contained
statement show up as distinct table entries (if you have
nested-statement tracking enabled) because of the different hashing
methods used. And it's those multiple table entries that seem
confusing, especially since they are counting mostly the same costs.

[ time passes ... ]

Hm ... I just had a different idea. I need to go look at the code
again, but I believe that in the problematic cases, the post-analyze
hook does not compute a queryId for the optimizable statement. This
means that it will arrive at the executor with queryId zero. What if
we simply made the executor hooks do nothing when queryId is zero?
(Note that this also means that in the problematic cases, the behavior
is already pretty wrong because executor costs for *all* statements of
this sort are getting merged into one hashtable entry for hash zero.)

regards, tom lane

#76

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Tom Lane (#75)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

I wrote:

Hm ... I just had a different idea. I need to go look at the code
again, but I believe that in the problematic cases, the post-analyze
hook does not compute a queryId for the optimizable statement. This
means that it will arrive at the executor with queryId zero. What if
we simply made the executor hooks do nothing when queryId is zero?
(Note that this also means that in the problematic cases, the behavior
is already pretty wrong because executor costs for *all* statements of
this sort are getting merged into one hashtable entry for hash zero.)

The attached proposed patch does it that way. It makes the EXPLAIN,
SELECT INTO, and DECLARE CURSOR cases behave as expected for utility
statements. PREPARE/EXECUTE work a bit funny though: if you have
track = all then you get EXECUTE cycles reported against both the
EXECUTE statement and the underlying PREPARE. This is because when
PREPARE calls parse_analyze_varparams the post-analyze hook doesn't know
that this isn't a top-level statement, so it marks the query with a
queryId. I don't see any way around that part without something like
what I suggested before. However, this behavior seems to me to be
considerably less of a POLA violation than the cases involving two
identical-looking entries for self-contained statements, and it might
even be thought to be a feature not a bug (since the PREPARE entry will
accumulate totals for all uses of the prepared statement). So I'm
satisfied with it for now.

regards, tom lane

#77

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Tom Lane (#76)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

I wrote:

... PREPARE/EXECUTE work a bit funny though: if you have
track = all then you get EXECUTE cycles reported against both the
EXECUTE statement and the underlying PREPARE. This is because when
PREPARE calls parse_analyze_varparams the post-analyze hook doesn't know
that this isn't a top-level statement, so it marks the query with a
queryId. I don't see any way around that part without something like
what I suggested before. However, this behavior seems to me to be
considerably less of a POLA violation than the cases involving two
identical-looking entries for self-contained statements, and it might
even be thought to be a feature not a bug (since the PREPARE entry will
accumulate totals for all uses of the prepared statement). So I'm
satisfied with it for now.

Actually, there's an easy hack for that too: we can teach the
ProcessUtility hook to do nothing (and in particular not increment the
nesting level) when the statement is an ExecuteStmt. This will result
in the executor time being blamed on the original PREPARE, whether or
not you have enabled tracking of nested statements. That seems like a
substantial win to me, because right now you get a distinct EXECUTE
entry for each textually-different set of parameter values, which seems
pretty useless. This change would make use of PREPARE/EXECUTE behave
very nearly the same in pg_stat_statement as use of protocol-level
prepared statements. About the only downside I can see is that the
cycles expended on evaluating the EXECUTE's parameters will not be
charged to any pg_stat_statement entry. Since those can be expressions,
in principle this might be a non-negligible amount of execution time,
but in practice it hardly seems likely that anyone would care about it.

Barring objections I'll go fix this, and then this patch can be
considered closed except for possible future tweaking of the
sticky-entry decay rule.

regards, tom lane

#78

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Tom Lane (#77)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On Thu, Mar 29, 2012 at 4:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

... PREPARE/EXECUTE work a bit funny though: if you have
track = all then you get EXECUTE cycles reported against both the
EXECUTE statement and the underlying PREPARE. This is because when
PREPARE calls parse_analyze_varparams the post-analyze hook doesn't know
that this isn't a top-level statement, so it marks the query with a
queryId. I don't see any way around that part without something like
what I suggested before. However, this behavior seems to me to be
considerably less of a POLA violation than the cases involving two
identical-looking entries for self-contained statements, and it might
even be thought to be a feature not a bug (since the PREPARE entry will
accumulate totals for all uses of the prepared statement). So I'm
satisfied with it for now.

Actually, there's an easy hack for that too: we can teach the
ProcessUtility hook to do nothing (and in particular not increment the
nesting level) when the statement is an ExecuteStmt. This will result
in the executor time being blamed on the original PREPARE, whether or
not you have enabled tracking of nested statements. That seems like a
substantial win to me, because right now you get a distinct EXECUTE
entry for each textually-different set of parameter values, which seems
pretty useless. This change would make use of PREPARE/EXECUTE behave
very nearly the same in pg_stat_statement as use of protocol-level
prepared statements. About the only downside I can see is that the
cycles expended on evaluating the EXECUTE's parameters will not be
charged to any pg_stat_statement entry. Since those can be expressions,
in principle this might be a non-negligible amount of execution time,
but in practice it hardly seems likely that anyone would care about it.

Barring objections I'll go fix this, and then this patch can be
considered closed except for possible future tweaking of the
sticky-entry decay rule.

After reading your last commit message, I was wondering if something
like this might be possible, so +1 from me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#79

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#77)

2 attachment(s)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 29 March 2012 21:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Barring objections I'll go fix this, and then this patch can be
considered closed except for possible future tweaking of the
sticky-entry decay rule.

Attached patch fixes a bug, and tweaks sticky-entry decay.

The extant code bumps usage (though not call counts) in two hooks
(pgss_post_parse_analyze() and pgss_ExecutorEnd()) , so prepared
queries will always have about half the usage of an equivalent simple
query, which is clearly not desirable. With the proposed patch,
"usage" should be similar to "calls" until the first call of
entry_dealloc(), rather than usually having a value that's about twice
as high. With the patch, a run of pgbench with and without "-M
prepared" results in a usage of calls + 1 for each query from both
runs.

The approach I've taken with decay is to maintain a server-wide median
usage value (well, a convenient approximation), which is assigned to
sticky entries. This makes it hard to evict the entries in the first
couple of calls to entry_dealloc(). On the other hand, if there really
is contention for entries, it will soon become really easy to evict
sticky entries, because we use a much more aggressive multiplier of
0.5 for their decay.

I rather conservatively initially assume that the median usage is 10,
which is a very low value considering the use of the multiplier trick.
In any case, in the real world it won't take too long to call
entry_dealloc() to set the median value, if in fact it actually
matters.

You described entries as precious. This isn't quite the full picture;
while pg_stat_statements will malthusianistically burn through pretty
much as many entries as you care give to it, or so you might think, I
believe that in the real world, the rate at which the module burns
through them would frequently look logarithmic. In other words, after
an entry_dealloc() call the hashtable is 95% full, but it might take
rather a long time to reach 100% again - the first 5% is consumed
dramatically faster than the last. The user might not actually care if
you need to cache a sticky value for a few hours in one of their
slots, as you run an epic reporting query, even though the hashtable
is over 95% full.

The idea is to avoid evicting a sticky entry just because there
happened to be an infrequent entry_dealloc() at the wrong time, and
the least marginal of the most marginal 5% of non-sticky entries (that
is, the 5% up for eviction) happened to have a call count/usage of
higher than the magic value of 3, which I find quite plausible.

If I apply your test for dead sticky entries after the regression
tests (serial schedule) were run, my approach compares very favourably
(granted, presumably usage values were double-counted for your test,
making our results less than completely comparable).

For the purposes of this experiment, I've just commented out "if
(calls == 0) continue;" within the pg_stat_statements() function,
obviously:

postgres=# select calls = 0, count(*) from pg_stat_statements() group
by calls = 0;
-[ RECORD 1 ]-
?column? | f
count | 959
-[ RECORD 2 ]-
?column? | t
count | 3 <--- this includes the above query itself

postgres=# select calls = 0, count(*) from pg_stat_statements() group
by calls = 0;
-[ RECORD 1 ]-
?column? | f
count | 960 <----now it's counted here...
-[ RECORD 2 ]-
?column? | t
count | 2 <---- ...not here

I've also attached some elogs, in their original chronological order,
that trace the median usage when recorded at entry_dealloc() for the
regression tests. As you'd expect given that this is the regression
tests, the median is very low, consistently between 1.9 and 2.5. An
additional factor that makes this work well is that the standard
deviation is low, and as such it is much easier to evict sticky
entries, which is what you want here.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

elog_med_vals.txttext/plain; charset=US-ASCII; name=elog_med_vals.txtDownload

pg_stat_statements_decay_2012_04_06.patchapplication/octet-stream; name=pg_stat_statements_decay_2012_04_06.patchDownload

diff contrib/pg_stat_statements/pg_stat_statements.c
index 178fdb9..823a62d
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
*************** static const uint32 PGSS_FILE_HEADER = 0
*** 72,79 ****
  /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
  #define USAGE_EXEC(duration)	(1.0)
  #define USAGE_INIT				(1.0)	/* including initial planning */
! #define USAGE_NON_EXEC_STICK	(3.0)	/* to make new entries sticky */
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */

  #define JUMBLE_SIZE				1024	/* query serialization buffer size */
--- 72,80 ----
  /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
  #define USAGE_EXEC(duration)	(1.0)
  #define USAGE_INIT				(1.0)	/* including initial planning */
! #define ASSUMED_MED_INIT		(10.0)	/* initial assumed median usage */
  #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
+ #define STICKY_DECREASE_FACTOR	(0.50)	/* separate sticky decrease factor */
  #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */

  #define JUMBLE_SIZE				1024	/* query serialization buffer size */
*************** typedef struct pgssSharedState
*** 139,144 ****
--- 140,146 ----
  {
  	LWLockId	lock;			/* protects hashtable search/modification */
  	int			query_size;		/* max query length in bytes */
+ 	double		cur_med_usage;	/* current median of usage for hashtable */
  } pgssSharedState;

  /*
*************** pgss_shmem_startup(void)
*** 413,418 ****
--- 415,421 ----
  		/* First time through ... */
  		pgss->lock = LWLockAssign();
  		pgss->query_size = pgstat_track_activity_query_size;
+ 		pgss->cur_med_usage = ASSUMED_MED_INIT;
  	}

  	/* Be sure everyone agrees on the hash table entry size */
*************** pgss_match_fn(const void *key1, const vo
*** 908,914 ****
  /*
   * Given an arbitrarily long query string, produce a hash for the purposes of
   * identifying the query, without normalizing constants.  Used when hashing
!  * utility statements, or for legacy compatibility mode.
   */
  static uint32
  pgss_hash_string(const char *str)
--- 911,917 ----
  /*
   * Given an arbitrarily long query string, produce a hash for the purposes of
   * identifying the query, without normalizing constants.  Used when hashing
!  * utility statements.
   */
  static uint32
  pgss_hash_string(const char *str)
*************** pgss_store(const char *query, uint32 que
*** 959,965 ****
  	 * under artificial conditions.
  	 */
  	if (jstate && !entry)
! 		usage = USAGE_NON_EXEC_STICK;
  	else
  		usage = USAGE_EXEC(duration);

--- 962,970 ----
  	 * under artificial conditions.
  	 */
  	if (jstate && !entry)
! 		usage = pgss->cur_med_usage;
! 	else if (jstate && entry)
! 		usage = 0;
  	else
  		usage = USAGE_EXEC(duration);

*************** entry_dealloc(void)
*** 1297,1309 ****
  	while ((entry = hash_seq_search(&hash_seq)) != NULL)
  	{
  		entries[i++] = entry;
! 		entry->counters.usage *= USAGE_DECREASE_FACTOR;
  	}

  	qsort(entries, i, sizeof(pgssEntry *), entry_cmp);
  	nvictims = Max(10, i * USAGE_DEALLOC_PERCENT / 100);
  	nvictims = Min(nvictims, i);

  	for (i = 0; i < nvictims; i++)
  	{
  		hash_search(pgss_hash, &entries[i]->key, HASH_REMOVE, NULL);
--- 1302,1321 ----
  	while ((entry = hash_seq_search(&hash_seq)) != NULL)
  	{
  		entries[i++] = entry;
!
! 		if (entry->counters.calls == 0)
! 			entry->counters.usage *= STICKY_DECREASE_FACTOR;
! 		else
! 			entry->counters.usage *= USAGE_DECREASE_FACTOR;
  	}

  	qsort(entries, i, sizeof(pgssEntry *), entry_cmp);
  	nvictims = Max(10, i * USAGE_DEALLOC_PERCENT / 100);
  	nvictims = Min(nvictims, i);

+ 	/* Record the median usage */
+ 	pgss->cur_med_usage = entries[i / 2]->counters.usage;
+
  	for (i = 0; i < nvictims; i++)
  	{
  		hash_search(pgss_hash, &entries[i]->key, HASH_REMOVE, NULL);

#80

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#79)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

On 29 March 2012 21:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Barring objections I'll go fix this, and then this patch can be
considered closed except for possible future tweaking of the
sticky-entry decay rule.

Attached patch fixes a bug, and tweaks sticky-entry decay.

Applied with some cosmetic adjustments.

regards, tom lane

#81

Peter Geoghegan

peter@2ndquadrant.com

almost 14 years ago

In reply to: Tom Lane (#80)

1 attachment(s)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

On 8 April 2012 20:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Applied with some cosmetic adjustments.

Thanks.

Having taken another look at the code, I wonder if we wouldn't have
been better off just fastpathing out of pgss_store in the first call
(in a pair of calls made by a backend as part an execution of some
non-prepared query) iff there is already an entry in the hashtable -
after all, we're now going to the trouble of acquiring the spinlock
just to increment the usage for the entry by 0 (likewise, every other
field), which is obviously superfluous. I apologise for not having
spotted this before submitting my last patch.

I have attached a patch with the modifications described.

This is more than a micro-optimisation, since it will cut the number
of spinlock acquisitions approximately in half for non-prepared
queries.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

pg_stat_statements_optimization_2012_04_08.patchapplication/octet-stream; name=pg_stat_statements_optimization_2012_04_08.patchDownload

diff contrib/pg_stat_statements/pg_stat_statements.c
index 8f5c9b0..d8b829d
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
*************** pgss_store(const char *query, uint32 que
*** 964,975 ****
  		 * strings are normalized on a best effort basis, though it would be
  		 * difficult to demonstrate this even under artificial conditions.)
  		 * But if we found the entry already present, don't let this call
! 		 * increment its usage.
  		 */
  		if (!entry)
  			usage = pgss->cur_median_usage;
  		else
! 			usage = 0;
  	}
  	else
  	{
--- 964,978 ----
  		 * strings are normalized on a best effort basis, though it would be
  		 * difficult to demonstrate this even under artificial conditions.)
  		 * But if we found the entry already present, don't let this call
! 		 * increment its usage - just fastpath out of here without even
! 		 * acquiring a spinlock for the entry, since the only reason we'll need
! 		 * to do so in a call from pgss_post_parse_analyze() is to initially set
! 		 * usage to the median to make the entry sticky.
  		 */
  		if (!entry)
  			usage = pgss->cur_median_usage;
  		else
! 			goto done;
  	}
  	else
  	{
*************** pgss_store(const char *query, uint32 que
*** 1060,1065 ****
--- 1063,1069 ----
  		SpinLockRelease(&e->mutex);
  	}

+ done:
  	LWLockRelease(pgss->lock);

  	/* We postpone this pfree until we're out of the lock */

#82

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Peter Geoghegan (#81)

Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan <peter@2ndquadrant.com> writes:

Having taken another look at the code, I wonder if we wouldn't have
been better off just fastpathing out of pgss_store in the first call
(in a pair of calls made by a backend as part an execution of some
non-prepared query) iff there is already an entry in the hashtable -
after all, we're now going to the trouble of acquiring the spinlock
just to increment the usage for the entry by 0 (likewise, every other
field), which is obviously superfluous. I apologise for not having
spotted this before submitting my last patch.

On reflection, we can actually make the code a good bit simpler if
we push the responsibility for initializing the usage count correctly
into entry_alloc(), instead of having to fix it up later. Then we
can just skip the entire adjust-the-stats step in pgss_store when
building a sticky entry. See my commit just now.

regards, tom lane