[PATCH] Use $ parameters as replacement characters for pg_stat_statements

Started by Lukas Fittlalmost 9 years ago17 messages
#1Lukas Fittl
lukas@fittl.com
3 attachment(s)

Hi,

Currently pg_stat_statements replaces constant values with ? characters.
I've seen this be a problem on multiple occasions, in particular since it
conflicts with the use of ? as an operator.

I'd like to propose changing the replacement character from ? to instead be
a parameter (like $1).

My main motiviation is to aid external tools that parse pg_stat_statements
output (like [0]https://github.com/lfittl/pg_query#parsing-a-normalized-query), and currently have to resort to complex parser patches
to distinguish operators from replacement characters.

First of all, attached 0001_pgss_additional_regression_tests.v1.patch which
increases regression test coverage for a few scenarios relevant to this
discussion.

Then, there is two variants I've prepared, only one of the two to be
committed:

A) 0002_pgss_mask_with_incrementing_params.v1.patch: Replace constants with
$1, $2, $3 (etc) in the normalized query, whilst respecting any existing
params in the counting

B) 0003_pgss_mask_with_zero_param.v1.patch: Replace constants with $0 in
the normalized query

Ideally I'd see A turn into something we can commit, but it involves a bit
more computation to get the parameter numbers right. The benefit is that we
could use PREPARE/EXECUTE with pg_stat_statements output.

B is intentionally simple, and would address the primary issue at hand
(distinguishing from the ? operator).

I'm also adding this patch to the commitfest starting tomorrow.

Best,
Lukas

[0]: https://github.com/lfittl/pg_query#parsing-a-normalized-query

--
Lukas Fittl

Attachments:

0001_pgss_additional_regression_tests.v1.patchapplication/octet-stream; name=0001_pgss_additional_regression_tests.v1.patchDownload
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index fd53f15d8b..12d0dcf61f 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -68,6 +68,13 @@ SELECT 1 AS i UNION SELECT 2 ORDER BY i;
  2
 (2 rows)
 
+-- ? operator
+select '{"a":1, "b":2}'::jsonb ? 'b';
+ ?column? 
+----------
+ t
+(1 row)
+
 -- cte
 WITH t(f) AS (
   VALUES (1.0), (2.0)
@@ -79,24 +86,35 @@ WITH t(f) AS (
  2.0
 (2 rows)
 
+-- prepared statement
+PREPARE pgss_test (int) AS SELECT $1 LIMIT 1;
+EXECUTE pgss_test(1);
+ ?column? 
+----------
+        1
+(1 row)
+
+DEALLOCATE pgss_test;
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-                  query                  | calls | rows 
------------------------------------------+-------+------
- SELECT ?                               +|     4 |    4
-                                        +|       | 
-   AS "text"                             |       | 
- SELECT ? + ?                            |     2 |    2
- SELECT ? + ? + ? AS "add"               |     3 |    3
- SELECT ? AS "float"                     |     1 |    1
- SELECT ? AS "int"                       |     2 |    2
- SELECT ? AS i UNION SELECT ? ORDER BY i |     1 |    2
- SELECT ? || ?                           |     1 |    1
- SELECT pg_stat_statements_reset()       |     1 |    1
- WITH t(f) AS (                         +|     1 |    2
-   VALUES (?), (?)                      +|       | 
- )                                      +|       | 
-   SELECT f FROM t ORDER BY f            |       | 
-(9 rows)
+                    query                     | calls | rows 
+----------------------------------------------+-------+------
+ PREPARE pgss_test (int) AS SELECT $1 LIMIT ? |     1 |    1
+ SELECT ?                                    +|     4 |    4
+                                             +|       | 
+   AS "text"                                  |       | 
+ SELECT ? + ?                                 |     2 |    2
+ SELECT ? + ? + ? AS "add"                    |     3 |    3
+ SELECT ? AS "float"                          |     1 |    1
+ SELECT ? AS "int"                            |     2 |    2
+ SELECT ? AS i UNION SELECT ? ORDER BY i      |     1 |    2
+ SELECT ? || ?                                |     1 |    1
+ SELECT pg_stat_statements_reset()            |     1 |    1
+ WITH t(f) AS (                              +|     1 |    2
+   VALUES (?), (?)                           +|       | 
+ )                                           +|       | 
+   SELECT f FROM t ORDER BY f                 |       | 
+ select ?::jsonb ? ?                          |     1 |    1
+(11 rows)
 
 --
 -- CRUD: INSERT SELECT UPDATE DELETE on test table
@@ -125,6 +143,8 @@ BEGIN \;
 UPDATE test SET b = '555' WHERE a = 5 \;
 UPDATE test SET b = '666' WHERE a = 6 \;
 COMMIT ;
+-- many INSERT values
+INSERT INTO test (a, b) VALUES (1, 'a'), (2, 'b'), (3, 'c');
 -- SELECT with constants
 SELECT * FROM test WHERE a > 5 ORDER BY a ;
  a |          b           
@@ -147,8 +167,11 @@ SELECT *
 SELECT * FROM test ORDER BY a;
  a |          b           
 ---+----------------------
+ 1 | a                   
  1 | 111                 
+ 2 | b                   
  2 | 222                 
+ 3 | c                   
  3 | 333                 
  4 | 444                 
  5 | 555                 
@@ -156,19 +179,35 @@ SELECT * FROM test ORDER BY a;
  7 | aaa                 
  8 | bbb                 
  9 | bbb                 
-(9 rows)
+(12 rows)
+
+-- SELECT with IN clause
+SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5);
+ a |          b           
+---+----------------------
+ 1 | 111                 
+ 2 | 222                 
+ 3 | 333                 
+ 4 | 444                 
+ 5 | 555                 
+ 1 | a                   
+ 2 | b                   
+ 3 | c                   
+(8 rows)
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-                       query                       | calls | rows 
----------------------------------------------------+-------+------
- DELETE FROM test WHERE a > ?                      |     1 |    1
- INSERT INTO test VALUES(generate_series(?, ?), ?) |     1 |   10
- SELECT * FROM test ORDER BY a                     |     1 |    9
- SELECT * FROM test WHERE a > ? ORDER BY a         |     2 |    4
- SELECT pg_stat_statements_reset()                 |     1 |    1
- UPDATE test SET b = ? WHERE a = ?                 |     6 |    6
- UPDATE test SET b = ? WHERE a > ?                 |     1 |    3
-(7 rows)
+                         query                         | calls | rows 
+-------------------------------------------------------+-------+------
+ DELETE FROM test WHERE a > ?                          |     1 |    1
+ INSERT INTO test (a, b) VALUES (?, ?), (?, ?), (?, ?) |     1 |    3
+ INSERT INTO test VALUES(generate_series(?, ?), ?)     |     1 |   10
+ SELECT * FROM test ORDER BY a                         |     1 |   12
+ SELECT * FROM test WHERE a > ? ORDER BY a             |     2 |    4
+ SELECT * FROM test WHERE a IN (?, ?, ?, ?, ?)         |     1 |    8
+ SELECT pg_stat_statements_reset()                     |     1 |    1
+ UPDATE test SET b = ? WHERE a = ?                     |     6 |    6
+ UPDATE test SET b = ? WHERE a > ?                     |     1 |    3
+(9 rows)
 
 --
 -- pg_stat_statements.track = none
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index 385c8a8d99..e7d2a10b10 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -37,12 +37,20 @@ SELECT :add + 1 + 1 AS "add" \gset
 -- set operator
 SELECT 1 AS i UNION SELECT 2 ORDER BY i;
 
+-- ? operator
+select '{"a":1, "b":2}'::jsonb ? 'b';
+
 -- cte
 WITH t(f) AS (
   VALUES (1.0), (2.0)
 )
   SELECT f FROM t ORDER BY f;
 
+-- prepared statement
+PREPARE pgss_test (int) AS SELECT $1 LIMIT 1;
+EXECUTE pgss_test(1);
+DEALLOCATE pgss_test;
+
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
 --
@@ -74,6 +82,9 @@ UPDATE test SET b = '555' WHERE a = 5 \;
 UPDATE test SET b = '666' WHERE a = 6 \;
 COMMIT ;
 
+-- many INSERT values
+INSERT INTO test (a, b) VALUES (1, 'a'), (2, 'b'), (3, 'c');
+
 -- SELECT with constants
 SELECT * FROM test WHERE a > 5 ORDER BY a ;
 
@@ -85,6 +96,9 @@ SELECT *
 -- SELECT without constants
 SELECT * FROM test ORDER BY a;
 
+-- SELECT with IN clause
+SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5);
+
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
 --
0002_pgss_mask_with_incrementing_params.v1.patchapplication/octet-stream; name=0002_pgss_mask_with_incrementing_params.v1.patchDownload
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 12d0dcf61f..a7d04602cd 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -96,24 +96,24 @@ EXECUTE pgss_test(1);
 
 DEALLOCATE pgss_test;
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-                    query                     | calls | rows 
-----------------------------------------------+-------+------
- PREPARE pgss_test (int) AS SELECT $1 LIMIT ? |     1 |    1
- SELECT ?                                    +|     4 |    4
-                                             +|       | 
-   AS "text"                                  |       | 
- SELECT ? + ?                                 |     2 |    2
- SELECT ? + ? + ? AS "add"                    |     3 |    3
- SELECT ? AS "float"                          |     1 |    1
- SELECT ? AS "int"                            |     2 |    2
- SELECT ? AS i UNION SELECT ? ORDER BY i      |     1 |    2
- SELECT ? || ?                                |     1 |    1
- SELECT pg_stat_statements_reset()            |     1 |    1
- WITH t(f) AS (                              +|     1 |    2
-   VALUES (?), (?)                           +|       | 
- )                                           +|       | 
-   SELECT f FROM t ORDER BY f                 |       | 
- select ?::jsonb ? ?                          |     1 |    1
+                     query                     | calls | rows 
+-----------------------------------------------+-------+------
+ PREPARE pgss_test (int) AS SELECT $1 LIMIT $2 |     1 |    1
+ SELECT $1                                    +|     4 |    4
+                                              +|       | 
+   AS "text"                                   |       | 
+ SELECT $1 + $2                                |     2 |    2
+ SELECT $1 + $2 + $3 AS "add"                  |     3 |    3
+ SELECT $1 AS "float"                          |     1 |    1
+ SELECT $1 AS "int"                            |     2 |    2
+ SELECT $1 AS i UNION SELECT $2 ORDER BY i     |     1 |    2
+ SELECT $1 || $2                               |     1 |    1
+ SELECT pg_stat_statements_reset()             |     1 |    1
+ WITH t(f) AS (                               +|     1 |    2
+   VALUES ($1), ($2)                          +|       | 
+ )                                            +|       | 
+   SELECT f FROM t ORDER BY f                  |       | 
+ select $1::jsonb ? $2                         |     1 |    1
 (11 rows)
 
 --
@@ -196,17 +196,17 @@ SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5);
 (8 rows)
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-                         query                         | calls | rows 
--------------------------------------------------------+-------+------
- DELETE FROM test WHERE a > ?                          |     1 |    1
- INSERT INTO test (a, b) VALUES (?, ?), (?, ?), (?, ?) |     1 |    3
- INSERT INTO test VALUES(generate_series(?, ?), ?)     |     1 |   10
- SELECT * FROM test ORDER BY a                         |     1 |   12
- SELECT * FROM test WHERE a > ? ORDER BY a             |     2 |    4
- SELECT * FROM test WHERE a IN (?, ?, ?, ?, ?)         |     1 |    8
- SELECT pg_stat_statements_reset()                     |     1 |    1
- UPDATE test SET b = ? WHERE a = ?                     |     6 |    6
- UPDATE test SET b = ? WHERE a > ?                     |     1 |    3
+                            query                            | calls | rows 
+-------------------------------------------------------------+-------+------
+ DELETE FROM test WHERE a > $1                               |     1 |    1
+ INSERT INTO test (a, b) VALUES ($1, $2), ($3, $4), ($5, $6) |     1 |    3
+ INSERT INTO test VALUES(generate_series($1, $2), $3)        |     1 |   10
+ SELECT * FROM test ORDER BY a                               |     1 |   12
+ SELECT * FROM test WHERE a > $1 ORDER BY a                  |     2 |    4
+ SELECT * FROM test WHERE a IN ($1, $2, $3, $4, $5)          |     1 |    8
+ SELECT pg_stat_statements_reset()                           |     1 |    1
+ UPDATE test SET b = $1 WHERE a = $2                         |     6 |    6
+ UPDATE test SET b = $1 WHERE a > $2                         |     1 |    3
 (9 rows)
 
 --
@@ -290,9 +290,9 @@ SELECT PLUS_ONE(10);
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
                query               | calls | rows 
 -----------------------------------+-------+------
- SELECT ?::TEXT                    |     1 |    1
- SELECT PLUS_ONE(?)                |     2 |    2
- SELECT PLUS_TWO(?)                |     2 |    2
+ SELECT $1::TEXT                   |     1 |    1
+ SELECT PLUS_ONE($1)               |     2 |    2
+ SELECT PLUS_TWO($1)               |     2 |    2
  SELECT pg_stat_statements_reset() |     1 |    1
 (4 rows)
 
@@ -347,10 +347,10 @@ SELECT PLUS_ONE(1);
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
                query               | calls | rows 
 -----------------------------------+-------+------
- SELECT (i + ? + ?)::INTEGER       |     2 |    2
- SELECT (i + ?)::INTEGER LIMIT ?   |     2 |    2
- SELECT PLUS_ONE(?)                |     2 |    2
- SELECT PLUS_TWO(?)                |     2 |    2
+ SELECT (i + $2 + $3)::INTEGER     |     2 |    2
+ SELECT (i + $2)::INTEGER LIMIT $3 |     2 |    2
+ SELECT PLUS_ONE($1)               |     2 |    2
+ SELECT PLUS_TWO($1)               |     2 |    2
  SELECT pg_stat_statements_reset() |     1 |    1
 (5 rows)
 
@@ -391,7 +391,7 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  DROP FUNCTION PLUS_TWO(INTEGER)           |     1 |    0
  DROP TABLE IF EXISTS test                 |     3 |    0
  DROP TABLE test                           |     1 |    0
- SELECT ?                                  |     1 |    1
+ SELECT $1                                 |     1 |    1
  SELECT pg_stat_statements_reset()         |     1 |    1
 (8 rows)
 
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 62dec8768a..4198025a66 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -27,8 +27,8 @@
  * to blame query costs on the proper queryId.
  *
  * To facilitate presenting entries to users, we create "representative" query
- * strings in which constants are replaced with '?' characters, to make it
- * clearer what a normalized entry can represent.  To save on shared memory,
+ * strings in which constants are replaced with parameters (such as $1), to make
+ * it clearer what a normalized entry can represent.  To save on shared memory,
  * and to avoid having to truncate oversized query strings, we store these
  * strings in a temporary external query-texts file.  Offsets into this
  * file are kept in shared memory.
@@ -219,6 +219,9 @@ typedef struct pgssJumbleState
 
 	/* Current number of valid entries in clocations array */
 	int			clocations_count;
+
+	/* highest param id we've seen, in order to start normalization correctly */
+	int			highest_extern_param_id;
 } pgssJumbleState;
 
 /*---- Local variables ----*/
@@ -803,6 +806,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
 	jstate.clocations = (pgssLocationLen *)
 		palloc(jstate.clocations_buf_size * sizeof(pgssLocationLen));
 	jstate.clocations_count = 0;
+	jstate.highest_extern_param_id = 0;
 
 	/* Compute query ID and mark the Query node with it */
 	JumbleQuery(&jstate, query);
@@ -2478,6 +2482,11 @@ JumbleExpr(pgssJumbleState *jstate, Node *node)
 				APP_JUMB(p->paramkind);
 				APP_JUMB(p->paramid);
 				APP_JUMB(p->paramtype);
+
+				if (p->paramkind == PARAM_EXTERN &&
+					p->paramid > jstate->highest_extern_param_id) {
+					jstate->highest_extern_param_id = p->paramid;
+				}
 			}
 			break;
 		case T_Aggref:
@@ -2940,7 +2949,8 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 				quer_loc = 0,	/* Source query byte location */
 				n_quer_loc = 0, /* Normalized query byte location */
 				last_off = 0,	/* Offset from start for previous tok */
-				last_tok_len = 0;		/* Length (in bytes) of that tok */
+				last_tok_len = 0,		/* Length (in bytes) of that tok */
+				norm_query_buflen = 0; /* Size of the normalized query buffer */
 
 	/*
 	 * Get constants' lengths (core system only gives us locations).  Note
@@ -2948,8 +2958,15 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 	 */
 	fill_in_constant_lengths(jstate, query, query_loc);
 
+	/* Accomodate the additional query replacement characters */
+	norm_query_buflen = query_len;
+	for (i = 0; i < jstate->clocations_count; i++)
+	{
+		norm_query_buflen += floor(log10(abs(i + 1 + jstate->highest_extern_param_id))) + 1;
+	}
+
 	/* Allocate result buffer */
-	norm_query = palloc(query_len + 1);
+	norm_query = palloc(norm_query_buflen + 1);
 
 	for (i = 0; i < jstate->clocations_count; i++)
 	{
@@ -2973,8 +2990,9 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 		memcpy(norm_query + n_quer_loc, query + quer_loc, len_to_wrt);
 		n_quer_loc += len_to_wrt;
 
-		/* And insert a '?' in place of the constant token */
-		norm_query[n_quer_loc++] = '?';
+		/* And insert a param in place of the constant token */
+		n_quer_loc += sprintf(norm_query + n_quer_loc, "$%d", i + 1 +
+							  jstate->highest_extern_param_id);
 
 		quer_loc = off + tok_len;
 		last_off = off;
@@ -2991,7 +3009,7 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 	memcpy(norm_query + n_quer_loc, query + quer_loc, len_to_wrt);
 	n_quer_loc += len_to_wrt;
 
-	Assert(n_quer_loc <= query_len);
+	Assert(n_quer_loc <= norm_query_buflen);
 	norm_query[n_quer_loc] = '\0';
 
 	*query_len_p = n_quer_loc;
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index d4f09fc2a3..d6620b70ed 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -245,9 +245,9 @@
 
   <para>
    When a constant's value has been ignored for purposes of matching the
-   query to other queries, the constant is replaced by <literal>?</literal>
-   in the <structname>pg_stat_statements</> display.  The rest of the query
-   text is that of the first query that had the particular
+   query to other queries, the constant is replaced by bind parameters such as
+   <literal>$1</literal> in the <structname>pg_stat_statements</> display.
+   The rest of the query text is that of the first query that had the particular
    <structfield>queryid</> hash value associated with the
    <structname>pg_stat_statements</> entry.
   </para>
@@ -481,13 +481,13 @@ bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit /
                nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
           FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;
 -[ RECORD 1 ]---------------------------------------------------------------------
-query       | UPDATE pgbench_branches SET bbalance = bbalance + ? WHERE bid = ?;
+query       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
 calls       | 3000
 total_time  | 9609.00100000002
 rows        | 2836
 hit_percent | 99.9778970000200936
 -[ RECORD 2 ]---------------------------------------------------------------------
-query       | UPDATE pgbench_tellers SET tbalance = tbalance + ? WHERE tid = ?;
+query       | UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2;
 calls       | 3000
 total_time  | 8015.156
 rows        | 2990
@@ -499,7 +499,7 @@ total_time  | 310.624
 rows        | 100000
 hit_percent | 0.30395136778115501520
 -[ RECORD 4 ]---------------------------------------------------------------------
-query       | UPDATE pgbench_accounts SET abalance = abalance + ? WHERE aid = ?;
+query       | UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2;
 calls       | 3000
 total_time  | 271.741999999997
 rows        | 3000
0003_pgss_mask_with_zero_param.v1.patchapplication/octet-stream; name=0003_pgss_mask_with_zero_param.v1.patchDownload
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 12d0dcf61f..5187767c42 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -96,24 +96,24 @@ EXECUTE pgss_test(1);
 
 DEALLOCATE pgss_test;
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-                    query                     | calls | rows 
-----------------------------------------------+-------+------
- PREPARE pgss_test (int) AS SELECT $1 LIMIT ? |     1 |    1
- SELECT ?                                    +|     4 |    4
-                                             +|       | 
-   AS "text"                                  |       | 
- SELECT ? + ?                                 |     2 |    2
- SELECT ? + ? + ? AS "add"                    |     3 |    3
- SELECT ? AS "float"                          |     1 |    1
- SELECT ? AS "int"                            |     2 |    2
- SELECT ? AS i UNION SELECT ? ORDER BY i      |     1 |    2
- SELECT ? || ?                                |     1 |    1
- SELECT pg_stat_statements_reset()            |     1 |    1
- WITH t(f) AS (                              +|     1 |    2
-   VALUES (?), (?)                           +|       | 
- )                                           +|       | 
-   SELECT f FROM t ORDER BY f                 |       | 
- select ?::jsonb ? ?                          |     1 |    1
+                     query                     | calls | rows 
+-----------------------------------------------+-------+------
+ PREPARE pgss_test (int) AS SELECT $1 LIMIT $0 |     1 |    1
+ SELECT $0                                    +|     4 |    4
+                                              +|       | 
+   AS "text"                                   |       | 
+ SELECT $0 + $0                                |     2 |    2
+ SELECT $0 + $0 + $0 AS "add"                  |     3 |    3
+ SELECT $0 AS "float"                          |     1 |    1
+ SELECT $0 AS "int"                            |     2 |    2
+ SELECT $0 AS i UNION SELECT $0 ORDER BY i     |     1 |    2
+ SELECT $0 || $0                               |     1 |    1
+ SELECT pg_stat_statements_reset()             |     1 |    1
+ WITH t(f) AS (                               +|     1 |    2
+   VALUES ($0), ($0)                          +|       | 
+ )                                            +|       | 
+   SELECT f FROM t ORDER BY f                  |       | 
+ select $0::jsonb ? $0                         |     1 |    1
 (11 rows)
 
 --
@@ -196,17 +196,17 @@ SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5);
 (8 rows)
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-                         query                         | calls | rows 
--------------------------------------------------------+-------+------
- DELETE FROM test WHERE a > ?                          |     1 |    1
- INSERT INTO test (a, b) VALUES (?, ?), (?, ?), (?, ?) |     1 |    3
- INSERT INTO test VALUES(generate_series(?, ?), ?)     |     1 |   10
- SELECT * FROM test ORDER BY a                         |     1 |   12
- SELECT * FROM test WHERE a > ? ORDER BY a             |     2 |    4
- SELECT * FROM test WHERE a IN (?, ?, ?, ?, ?)         |     1 |    8
- SELECT pg_stat_statements_reset()                     |     1 |    1
- UPDATE test SET b = ? WHERE a = ?                     |     6 |    6
- UPDATE test SET b = ? WHERE a > ?                     |     1 |    3
+                            query                            | calls | rows 
+-------------------------------------------------------------+-------+------
+ DELETE FROM test WHERE a > $0                               |     1 |    1
+ INSERT INTO test (a, b) VALUES ($0, $0), ($0, $0), ($0, $0) |     1 |    3
+ INSERT INTO test VALUES(generate_series($0, $0), $0)        |     1 |   10
+ SELECT * FROM test ORDER BY a                               |     1 |   12
+ SELECT * FROM test WHERE a > $0 ORDER BY a                  |     2 |    4
+ SELECT * FROM test WHERE a IN ($0, $0, $0, $0, $0)          |     1 |    8
+ SELECT pg_stat_statements_reset()                           |     1 |    1
+ UPDATE test SET b = $0 WHERE a = $0                         |     6 |    6
+ UPDATE test SET b = $0 WHERE a > $0                         |     1 |    3
 (9 rows)
 
 --
@@ -290,9 +290,9 @@ SELECT PLUS_ONE(10);
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
                query               | calls | rows 
 -----------------------------------+-------+------
- SELECT ?::TEXT                    |     1 |    1
- SELECT PLUS_ONE(?)                |     2 |    2
- SELECT PLUS_TWO(?)                |     2 |    2
+ SELECT $0::TEXT                   |     1 |    1
+ SELECT PLUS_ONE($0)               |     2 |    2
+ SELECT PLUS_TWO($0)               |     2 |    2
  SELECT pg_stat_statements_reset() |     1 |    1
 (4 rows)
 
@@ -347,10 +347,10 @@ SELECT PLUS_ONE(1);
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
                query               | calls | rows 
 -----------------------------------+-------+------
- SELECT (i + ? + ?)::INTEGER       |     2 |    2
- SELECT (i + ?)::INTEGER LIMIT ?   |     2 |    2
- SELECT PLUS_ONE(?)                |     2 |    2
- SELECT PLUS_TWO(?)                |     2 |    2
+ SELECT (i + $0 + $0)::INTEGER     |     2 |    2
+ SELECT (i + $0)::INTEGER LIMIT $0 |     2 |    2
+ SELECT PLUS_ONE($0)               |     2 |    2
+ SELECT PLUS_TWO($0)               |     2 |    2
  SELECT pg_stat_statements_reset() |     1 |    1
 (5 rows)
 
@@ -391,7 +391,7 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  DROP FUNCTION PLUS_TWO(INTEGER)           |     1 |    0
  DROP TABLE IF EXISTS test                 |     3 |    0
  DROP TABLE test                           |     1 |    0
- SELECT ?                                  |     1 |    1
+ SELECT $0                                 |     1 |    1
  SELECT pg_stat_statements_reset()         |     1 |    1
 (8 rows)
 
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 62dec8768a..6d08403c42 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -27,7 +27,7 @@
  * to blame query costs on the proper queryId.
  *
  * To facilitate presenting entries to users, we create "representative" query
- * strings in which constants are replaced with '?' characters, to make it
+ * strings in which constants are replaced with '$0' characters, to make it
  * clearer what a normalized entry can represent.  To save on shared memory,
  * and to avoid having to truncate oversized query strings, we store these
  * strings in a temporary external query-texts file.  Offsets into this
@@ -2940,7 +2940,8 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 				quer_loc = 0,	/* Source query byte location */
 				n_quer_loc = 0, /* Normalized query byte location */
 				last_off = 0,	/* Offset from start for previous tok */
-				last_tok_len = 0;		/* Length (in bytes) of that tok */
+				last_tok_len = 0,		/* Length (in bytes) of that tok */
+				norm_query_buflen = 0; /* Size of the normalized query buffer */
 
 	/*
 	 * Get constants' lengths (core system only gives us locations).  Note
@@ -2948,8 +2949,11 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 	 */
 	fill_in_constant_lengths(jstate, query, query_loc);
 
+	/* Accomodate the additional query replacement characters */
+	norm_query_buflen = query_len + jstate->clocations_count;
+
 	/* Allocate result buffer */
-	norm_query = palloc(query_len + 1);
+	norm_query = palloc(norm_query_buflen + 1);
 
 	for (i = 0; i < jstate->clocations_count; i++)
 	{
@@ -2973,8 +2977,9 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 		memcpy(norm_query + n_quer_loc, query + quer_loc, len_to_wrt);
 		n_quer_loc += len_to_wrt;
 
-		/* And insert a '?' in place of the constant token */
-		norm_query[n_quer_loc++] = '?';
+		/* And insert a '$0' in place of the constant token */
+		norm_query[n_quer_loc++] = '$';
+		norm_query[n_quer_loc++] = '0';
 
 		quer_loc = off + tok_len;
 		last_off = off;
@@ -2991,7 +2996,7 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 	memcpy(norm_query + n_quer_loc, query + quer_loc, len_to_wrt);
 	n_quer_loc += len_to_wrt;
 
-	Assert(n_quer_loc <= query_len);
+	Assert(n_quer_loc <= norm_query_buflen);
 	norm_query[n_quer_loc] = '\0';
 
 	*query_len_p = n_quer_loc;
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index d4f09fc2a3..17fbd3bccc 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -245,7 +245,7 @@
 
   <para>
    When a constant's value has been ignored for purposes of matching the
-   query to other queries, the constant is replaced by <literal>?</literal>
+   query to other queries, the constant is replaced by <literal>$0</literal>
    in the <structname>pg_stat_statements</> display.  The rest of the query
    text is that of the first query that had the particular
    <structfield>queryid</> hash value associated with the
@@ -481,13 +481,13 @@ bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit /
                nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
           FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;
 -[ RECORD 1 ]---------------------------------------------------------------------
-query       | UPDATE pgbench_branches SET bbalance = bbalance + ? WHERE bid = ?;
+query       | UPDATE pgbench_branches SET bbalance = bbalance + $0 WHERE bid = $0;
 calls       | 3000
 total_time  | 9609.00100000002
 rows        | 2836
 hit_percent | 99.9778970000200936
 -[ RECORD 2 ]---------------------------------------------------------------------
-query       | UPDATE pgbench_tellers SET tbalance = tbalance + ? WHERE tid = ?;
+query       | UPDATE pgbench_tellers SET tbalance = tbalance + $0 WHERE tid = $0;
 calls       | 3000
 total_time  | 8015.156
 rows        | 2990
@@ -499,7 +499,7 @@ total_time  | 310.624
 rows        | 100000
 hit_percent | 0.30395136778115501520
 -[ RECORD 4 ]---------------------------------------------------------------------
-query       | UPDATE pgbench_accounts SET abalance = abalance + ? WHERE aid = ?;
+query       | UPDATE pgbench_accounts SET abalance = abalance + $0 WHERE aid = $0;
 calls       | 3000
 total_time  | 271.741999999997
 rows        | 3000
#2Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Lukas Fittl (#1)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On 2/28/17 20:01, Lukas Fittl wrote:

Currently pg_stat_statements replaces constant values with ? characters.
I've seen this be a problem on multiple occasions, in particular since
it conflicts with the use of ? as an operator.

I'd like to propose changing the replacement character from ? to instead
be a parameter (like $1).

Hmm, I think this could confuse people into thinking that the queries
displayed were in fact prepared queries.

Maybe we could gather some more ideas.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Lukas Fittl
lukas@fittl.com
In reply to: Peter Eisentraut (#2)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On Wed, Mar 1, 2017 at 6:51 AM, Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:

Hmm, I think this could confuse people into thinking that the queries
displayed were in fact prepared queries.

Maybe we could gather some more ideas.

I think thats a reasonable concern - the main benefit of $1 is that its
already designated as something that can replace a constant, and still be
read by the Postgres parser.

Is there any other character that has the same properties?

I'll also note that Greg Stark mentioned in [0]/messages/by-id/CAM-w4HNOeNW6pY_1%25 3DLp1aJGMmZ_R6S8JHjqvJMv8-%3DOf3q1q0w%40mail.gmail.com that "There's another
feature pg_stat_statements *really* needs. A way to convert a jumbled
statement into one that can be prepared easily. The use of ? instead of :1
:2 etc makes this a mechanical but annoying process."

Using $1, $2, etc. for jumbling statements would give us that for free, no
additional effort needed.

[0]: /messages/by-id/CAM-w4HNOeNW6pY_1%25 3DLp1aJGMmZ_R6S8JHjqvJMv8-%3DOf3q1q0w%40mail.gmail.com
3DLp1aJGMmZ_R6S8JHjqvJMv8-%3DOf3q1q0w%40mail.gmail.com

Best,
Lukas

--
Lukas Fittl

#4Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#2)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On Wed, Mar 1, 2017 at 8:21 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2/28/17 20:01, Lukas Fittl wrote:

Currently pg_stat_statements replaces constant values with ? characters.
I've seen this be a problem on multiple occasions, in particular since
it conflicts with the use of ? as an operator.

I'd like to propose changing the replacement character from ? to instead
be a parameter (like $1).

Hmm, I think this could confuse people into thinking that the queries
displayed were in fact prepared queries.

Maybe we could gather some more ideas.

Perhaps there could be a choice of behaviors. Even if we all agreed
that parameter notation was better in theory, there's something to be
said for maintaining backward compatibility, or having an option to do
so.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#4)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Mar 1, 2017 at 8:21 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2/28/17 20:01, Lukas Fittl wrote:

I'd like to propose changing the replacement character from ? to instead
be a parameter (like $1).

Hmm, I think this could confuse people into thinking that the queries
displayed were in fact prepared queries.

Perhaps there could be a choice of behaviors. Even if we all agreed
that parameter notation was better in theory, there's something to be
said for maintaining backward compatibility, or having an option to do
so.

Meh ... we've generally regretted it when we "solved" a backwards
compatibility problem by introducing a GUC that changes query semantics.
I'm inclined to think we should either do it or not.

My own vote would probably be for "not", because I haven't seen a case
made why it's important to be able to automatically distinguish a
constant-substitution marker from a "?" operator.

On the other hand, it seems like arguing for backwards compatibility here
is a bit contradictory, because that would only matter if you think there
*are* people trying to automatically parse the output of
pg_stat_statements in that much detail. And if there are, they would
likely appreciate it becoming less ambiguous.

But speaking of ambiguity: isn't it possible for $n symbols to appear in
pg_stat_statements already? I think it is, both from extended-protocol
client queries and from SPI commands, which would mean that the proposal
as it stands is not fixing the ambiguity problem at all. So yes, we need
another idea.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

Hi,

On 2017-03-04 11:02:14 -0500, Tom Lane wrote:

But speaking of ambiguity: isn't it possible for $n symbols to appear in
pg_stat_statements already?

Indeed.

I think it is, both from extended-protocol
client queries and from SPI commands, which would mean that the proposal
as it stands is not fixing the ambiguity problem at all. So yes, we need
another idea.

I think using $ to signal a parameter is still a good idea, as people
kinda have to know that one anyway. Maybe one of "$:n" "$-n"?

- Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Tom Lane (#5)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On Sat, Mar 4, 2017 at 8:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Perhaps there could be a choice of behaviors. Even if we all agreed
that parameter notation was better in theory, there's something to be
said for maintaining backward compatibility, or having an option to do
so.

Meh ... we've generally regretted it when we "solved" a backwards
compatibility problem by introducing a GUC that changes query semantics.
I'm inclined to think we should either do it or not.

In my opinion, we expose query id (and dbid, and userid) as the
canonical identifier for each pg_stat_statements entry, and have done
so for some time. That's the stable API -- not query text. I'm aware
of cases where query text was used as an identifier, but that ended up
being hashed anyway.

Query text is just for human consumption. I'd be in favor of a change
that makes it easier to copy and paste a query, to run EXPLAIN and so
on. Lukas probably realizes that there are no guarantees that the
query text that appears in pg_stat_statements will even appear as
normalized in all cases. The "sticky entry" stuff is intended to
maximize the chances of that happening, but it's still generally quite
possible (e.g. pg_stat_statements never swaps constants in a query
like "SELECT 5, pg_stat_statements_reset()"). This means that we
cannot really say that this buys us a machine-readable query text
format, at least not without adding some fairly messy caveats.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#7)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On Sat, Mar 4, 2017 at 1:52 PM, Peter Geoghegan <pg@bowt.ie> wrote:

On Sat, Mar 4, 2017 at 8:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Perhaps there could be a choice of behaviors. Even if we all agreed
that parameter notation was better in theory, there's something to be
said for maintaining backward compatibility, or having an option to do
so.

Meh ... we've generally regretted it when we "solved" a backwards
compatibility problem by introducing a GUC that changes query semantics.
I'm inclined to think we should either do it or not.

In my opinion, we expose query id (and dbid, and userid) as the
canonical identifier for each pg_stat_statements entry, and have done
so for some time. That's the stable API -- not query text. I'm aware
of cases where query text was used as an identifier, but that ended up
being hashed anyway.

Query text is just for human consumption.

Lukas evidently thinks otherwise, based on the original post.

I'd be in favor of a change
that makes it easier to copy and paste a query, to run EXPLAIN and so
on. Lukas probably realizes that there are no guarantees that the
query text that appears in pg_stat_statements will even appear as
normalized in all cases. The "sticky entry" stuff is intended to
maximize the chances of that happening, but it's still generally quite
possible (e.g. pg_stat_statements never swaps constants in a query
like "SELECT 5, pg_stat_statements_reset()"). This means that we
cannot really say that this buys us a machine-readable query text
format, at least not without adding some fairly messy caveats.

Well, Lukas's original suggestion of using $n for a placeholder would
do that, unless there's already a $n with the same numerical value,
but Andres's proposal to use $-n or $:n would not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Lukas Fittl
lukas@fittl.com
In reply to: Robert Haas (#8)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On Mon, Mar 6, 2017 at 9:36 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Mar 4, 2017 at 1:52 PM, Peter Geoghegan <pg@bowt.ie> wrote:

In my opinion, we expose query id (and dbid, and userid) as the
canonical identifier for each pg_stat_statements entry, and have done
so for some time. That's the stable API -- not query text. I'm aware
of cases where query text was used as an identifier, but that ended up
being hashed anyway.

Query text is just for human consumption.

Lukas evidently thinks otherwise, based on the original post.

I actually agree with Peter that the queryid+userid+dbid is the canonical
identifier,
not the query text.

There is however value in parsing the query, e.g. to find out which
statement
type something is, or to determine which table names a query references
(assuming one knows the search_path) programatically.

It is for that latter reason I'm interested in parsing the query, and
avoiding the
ambiguity that ? carries, since its also an operator.

Based on some hackery, I've previously built a little example script that
filters pg_stat_statements output: https://github.com/lfittl/pg_qtop#usage

This script currently breaks in complex cases of ? operators, since the
pg_stat_statements query text is ambiguous.

I'd be in favor of a change
that makes it easier to copy and paste a query, to run EXPLAIN and so
on. Lukas probably realizes that there are no guarantees that the
query text that appears in pg_stat_statements will even appear as
normalized in all cases. The "sticky entry" stuff is intended to
maximize the chances of that happening, but it's still generally quite
possible (e.g. pg_stat_statements never swaps constants in a query
like "SELECT 5, pg_stat_statements_reset()"). This means that we
cannot really say that this buys us a machine-readable query text
format, at least not without adding some fairly messy caveats.

Well, Lukas's original suggestion of using $n for a placeholder would
do that, unless there's already a $n with the same numerical value,
but Andres's proposal to use $-n or $:n would not.

Yes, and I do think that making it easier to run EXPLAIN would be the
primary user-visible benefit in core.

I'd be happy to add a docs section showing how to use this, if there is
some consensus that its worth pursuing this direction.

Thanks for all the comments, appreciate the discussion.

Best,
Lukas

--
Lukas Fittl

#10Bruce Momjian
bruce@momjian.us
In reply to: Peter Geoghegan (#7)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On Sat, Mar 4, 2017 at 10:52:44AM -0800, Peter Geoghegan wrote:

On Sat, Mar 4, 2017 at 8:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Meh ... we've generally regretted it when we "solved" a backwards
compatibility problem by introducing a GUC that changes query semantics.
I'm inclined to think we should either do it or not.

In my opinion, we expose query id (and dbid, and userid) as the
canonical identifier for each pg_stat_statements entry, and have done
so for some time. That's the stable API -- not query text. I'm aware
of cases where query text was used as an identifier, but that ended up
being hashed anyway.

Speaking of hash values for queries, someone once asked me if we could
display a hash value for queries displayed in pg_stat_activity and
pg_stat_statements so they could take a running query and look in
pg_stat_statements to see how long is usually ran. It seemed like a
useful idea to me.

I don't think they can hash the query manually because of the constants
involved.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Bruce Momjian (#10)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On Thu, Mar 9, 2017 at 8:17 PM, Bruce Momjian <bruce@momjian.us> wrote:

In my opinion, we expose query id (and dbid, and userid) as the
canonical identifier for each pg_stat_statements entry, and have done
so for some time. That's the stable API -- not query text. I'm aware
of cases where query text was used as an identifier, but that ended up
being hashed anyway.

Speaking of hash values for queries, someone once asked me if we could
display a hash value for queries displayed in pg_stat_activity and
pg_stat_statements so they could take a running query and look in
pg_stat_statements to see how long is usually ran. It seemed like a
useful idea to me.

I agree.

I don't think they can hash the query manually because of the constants
involved.

It would be a matter of having postgres expose Query.queryId (or the
similar field in PlannedStmt, I suppose). Morally, that field belongs
to pg_stat_statements, since it was written to make the query
normalization stuff work, and because everything would break if
another extension attempted to use it as pg_stat_statements does.
Whether or not that makes it okay to expose the hash value in
pg_stat_activity like that is above my pay grade, as Tom would say.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#8)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Mar 4, 2017 at 1:52 PM, Peter Geoghegan <pg@bowt.ie> wrote:

I'd be in favor of a change
that makes it easier to copy and paste a query, to run EXPLAIN and so
on. Lukas probably realizes that there are no guarantees that the
query text that appears in pg_stat_statements will even appear as
normalized in all cases. The "sticky entry" stuff is intended to
maximize the chances of that happening, but it's still generally quite
possible (e.g. pg_stat_statements never swaps constants in a query
like "SELECT 5, pg_stat_statements_reset()"). This means that we
cannot really say that this buys us a machine-readable query text
format, at least not without adding some fairly messy caveats.

Well, Lukas's original suggestion of using $n for a placeholder would
do that, unless there's already a $n with the same numerical value,
but Andres's proposal to use $-n or $:n would not.

I don't much like the $-n or $:n proposals, as those are infringing on
syntax space we might wish we had back someday. $:n also could cause
confusion with psql variable substitution.

I wonder if it would improve matters to use $n, but starting with the
first number after the actual external Param symbols in the query.
(This presumes that we can identify the last in-use external Param number
reasonably efficiently. struct Query doesn't really expose that ---
although it might not be unreasonable to add a field to do so.
In practice pg_stat_statements could perhaps look aside to find
that out, say from EState.es_param_list_info->numParams.)

In a situation where you wanted to try to actually execute the query,
this might be fairly convenient since you could do PREPARE with the
original external Params followed by the ones pg_stat_statements had
abstracted from constants. Of course, guessing the types of those
might be nontrivial. I wonder whether we should change the printout
to look like '$n::type' not just '$n'.

A variant that might make it easier to read would be to start the
numbering of the abstracted params from $100, or some other number
much larger than the last external Param, thus creating a pretty
clear distinction between the two. But that would break the
hypothetical use-case of building a prepared statement directly
from the query text, or at least make it mighty inconvenient.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#12)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On Mon, Mar 13, 2017 at 6:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Mar 4, 2017 at 1:52 PM, Peter Geoghegan <pg@bowt.ie> wrote:

I'd be in favor of a change
that makes it easier to copy and paste a query, to run EXPLAIN and so
on. Lukas probably realizes that there are no guarantees that the
query text that appears in pg_stat_statements will even appear as
normalized in all cases. The "sticky entry" stuff is intended to
maximize the chances of that happening, but it's still generally quite
possible (e.g. pg_stat_statements never swaps constants in a query
like "SELECT 5, pg_stat_statements_reset()"). This means that we
cannot really say that this buys us a machine-readable query text
format, at least not without adding some fairly messy caveats.

Well, Lukas's original suggestion of using $n for a placeholder would
do that, unless there's already a $n with the same numerical value,
but Andres's proposal to use $-n or $:n would not.

I don't much like the $-n or $:n proposals, as those are infringing on
syntax space we might wish we had back someday. $:n also could cause
confusion with psql variable substitution.

I wonder if it would improve matters to use $n, but starting with the
first number after the actual external Param symbols in the query.

That sounds pretty sensible, although I guess it's got the weakness
that you might get confused about which kind of $n you are looking at.
However, I'd be inclined to deem that a fairly minor weakness and go
with it: just because somebody could hypothetically get confused
doesn't mean that real people will.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#13)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Mar 13, 2017 at 6:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder if it would improve matters to use $n, but starting with the
first number after the actual external Param symbols in the query.

That sounds pretty sensible, although I guess it's got the weakness
that you might get confused about which kind of $n you are looking at.
However, I'd be inclined to deem that a fairly minor weakness and go
with it: just because somebody could hypothetically get confused
doesn't mean that real people will.

So it turns out this discussion just reinvented the alternative that
Lukas had in his 0002 proposal. Are there any remaining objections
to proceeding with that approach?

In a quick read of the patch, I note that it falsifies and fails to
update the header comment for generate_normalized_query:

* *query_len_p contains the input string length, and is updated with
* the result string length (which cannot be longer) on exit.

It doesn't look like the calling code depends (any more?) on the
assumption that the string doesn't get longer, but it would be good
to double-check that.

This bit seems much more expensive and complicated than necessary:

+	/* Accomodate the additional query replacement characters */
+	norm_query_buflen = query_len;
+	for (i = 0; i < jstate->clocations_count; i++)
+	{
+		norm_query_buflen += floor(log10(abs(i + 1 + jstate->highest_extern_param_id))) + 1;
+	}

I'd just add, say, "10 * clocations_count" to the allocation length.
We can afford to waste a few bytes on this transient copy of the query
text.

The documentation is overly vague about what the parameter symbols are,
in particular failing to explain that their numbers start from after
the last $n occurring in the original query text.

It seems like a good idea to have a regression test case demonstrating
that, too.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Lukas Fittl
lukas@fittl.com
In reply to: Tom Lane (#14)
1 attachment(s)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On Sat, Mar 18, 2017 at 12:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

So it turns out this discussion just reinvented the alternative that
Lukas had in his 0002 proposal. Are there any remaining objections
to proceeding with that approach?

Thanks for reviewing - updated patch attached, comments below.

In a quick read of the patch, I note that it falsifies and fails to
update the header comment for generate_normalized_query:

* *query_len_p contains the input string length, and is updated with
* the result string length (which cannot be longer) on exit.

It doesn't look like the calling code depends (any more?) on the
assumption that the string doesn't get longer, but it would be good
to double-check that.

Fixed.

I'd just add, say, "10 * clocations_count" to the allocation length.
We can afford to waste a few bytes on this transient copy of the query
text.

I've changed this, although I've kept this somewhat dynamic, to avoid
having an accidental overrun in edge cases:

1 + floor(log10(jstate->clocations_count +
jstate->highest_extern_param_id));

The documentation is overly vague about what the parameter symbols are,
in particular failing to explain that their numbers start from after
the last $n occurring in the original query text.

Updated the docs to clarify this.

It seems like a good idea to have a regression test case demonstrating
that, too.

I already had a separate patch that adds more regression tests (0000),
including one for prepared statements in the initial email.

Merged together now into one patch to keep it simple, and added another
constant value to the prepared statement test case. I've also included
a commit message in the patch file, if helpful.

Thanks,
Lukas

--
Lukas Fittl

Attachments:

0002_pgss_mask_with_incrementing_params.v2.patchapplication/octet-stream; name=0002_pgss_mask_with_incrementing_params.v2.patchDownload
commit e005a19cc2b86d74cd3502df42c2ca55160cc5f5
Author: Lukas Fittl <lukas@fittl.com>
Date:   Tue Feb 28 02:34:22 2017 -0800

    Use bind parameters $1,$2,.. instead of ? for pg_stat_statements masking.
    
    Whilst adding a few characters, this makes the statements that are output
    parseable by the standard Postgres parser. In particular, this allows one
    to pass the statements to PREPARE and then use EXECUTE with the constant
    values.
    
    Compared to ? it also solves the edge case where it can't be determined
    if a statement contains the ? replacement character, or the ? operator.

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index fd53f15d8b..05c799a116 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -68,6 +68,13 @@ SELECT 1 AS i UNION SELECT 2 ORDER BY i;
  2
 (2 rows)
 
+-- ? operator
+select '{"a":1, "b":2}'::jsonb ? 'b';
+ ?column? 
+----------
+ t
+(1 row)
+
 -- cte
 WITH t(f) AS (
   VALUES (1.0), (2.0)
@@ -79,24 +86,35 @@ WITH t(f) AS (
  2.0
 (2 rows)
 
+-- prepared statement
+PREPARE pgss_test (int) AS SELECT $1, 'test' LIMIT 1;
+EXECUTE pgss_test(1);
+ ?column? | ?column? 
+----------+----------
+        1 | test
+(1 row)
+
+DEALLOCATE pgss_test;
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-                  query                  | calls | rows 
------------------------------------------+-------+------
- SELECT ?                               +|     4 |    4
-                                        +|       | 
-   AS "text"                             |       | 
- SELECT ? + ?                            |     2 |    2
- SELECT ? + ? + ? AS "add"               |     3 |    3
- SELECT ? AS "float"                     |     1 |    1
- SELECT ? AS "int"                       |     2 |    2
- SELECT ? AS i UNION SELECT ? ORDER BY i |     1 |    2
- SELECT ? || ?                           |     1 |    1
- SELECT pg_stat_statements_reset()       |     1 |    1
- WITH t(f) AS (                         +|     1 |    2
-   VALUES (?), (?)                      +|       | 
- )                                      +|       | 
-   SELECT f FROM t ORDER BY f            |       | 
-(9 rows)
+                       query                       | calls | rows 
+---------------------------------------------------+-------+------
+ PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3 |     1 |    1
+ SELECT $1                                        +|     4 |    4
+                                                  +|       | 
+   AS "text"                                       |       | 
+ SELECT $1 + $2                                    |     2 |    2
+ SELECT $1 + $2 + $3 AS "add"                      |     3 |    3
+ SELECT $1 AS "float"                              |     1 |    1
+ SELECT $1 AS "int"                                |     2 |    2
+ SELECT $1 AS i UNION SELECT $2 ORDER BY i         |     1 |    2
+ SELECT $1 || $2                                   |     1 |    1
+ SELECT pg_stat_statements_reset()                 |     1 |    1
+ WITH t(f) AS (                                   +|     1 |    2
+   VALUES ($1), ($2)                              +|       | 
+ )                                                +|       | 
+   SELECT f FROM t ORDER BY f                      |       | 
+ select $1::jsonb ? $2                             |     1 |    1
+(11 rows)
 
 --
 -- CRUD: INSERT SELECT UPDATE DELETE on test table
@@ -125,6 +143,8 @@ BEGIN \;
 UPDATE test SET b = '555' WHERE a = 5 \;
 UPDATE test SET b = '666' WHERE a = 6 \;
 COMMIT ;
+-- many INSERT values
+INSERT INTO test (a, b) VALUES (1, 'a'), (2, 'b'), (3, 'c');
 -- SELECT with constants
 SELECT * FROM test WHERE a > 5 ORDER BY a ;
  a |          b           
@@ -147,8 +167,11 @@ SELECT *
 SELECT * FROM test ORDER BY a;
  a |          b           
 ---+----------------------
+ 1 | a                   
  1 | 111                 
+ 2 | b                   
  2 | 222                 
+ 3 | c                   
  3 | 333                 
  4 | 444                 
  5 | 555                 
@@ -156,19 +179,35 @@ SELECT * FROM test ORDER BY a;
  7 | aaa                 
  8 | bbb                 
  9 | bbb                 
-(9 rows)
+(12 rows)
+
+-- SELECT with IN clause
+SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5);
+ a |          b           
+---+----------------------
+ 1 | 111                 
+ 2 | 222                 
+ 3 | 333                 
+ 4 | 444                 
+ 5 | 555                 
+ 1 | a                   
+ 2 | b                   
+ 3 | c                   
+(8 rows)
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-                       query                       | calls | rows 
----------------------------------------------------+-------+------
- DELETE FROM test WHERE a > ?                      |     1 |    1
- INSERT INTO test VALUES(generate_series(?, ?), ?) |     1 |   10
- SELECT * FROM test ORDER BY a                     |     1 |    9
- SELECT * FROM test WHERE a > ? ORDER BY a         |     2 |    4
- SELECT pg_stat_statements_reset()                 |     1 |    1
- UPDATE test SET b = ? WHERE a = ?                 |     6 |    6
- UPDATE test SET b = ? WHERE a > ?                 |     1 |    3
-(7 rows)
+                            query                            | calls | rows 
+-------------------------------------------------------------+-------+------
+ DELETE FROM test WHERE a > $1                               |     1 |    1
+ INSERT INTO test (a, b) VALUES ($1, $2), ($3, $4), ($5, $6) |     1 |    3
+ INSERT INTO test VALUES(generate_series($1, $2), $3)        |     1 |   10
+ SELECT * FROM test ORDER BY a                               |     1 |   12
+ SELECT * FROM test WHERE a > $1 ORDER BY a                  |     2 |    4
+ SELECT * FROM test WHERE a IN ($1, $2, $3, $4, $5)          |     1 |    8
+ SELECT pg_stat_statements_reset()                           |     1 |    1
+ UPDATE test SET b = $1 WHERE a = $2                         |     6 |    6
+ UPDATE test SET b = $1 WHERE a > $2                         |     1 |    3
+(9 rows)
 
 --
 -- pg_stat_statements.track = none
@@ -251,9 +290,9 @@ SELECT PLUS_ONE(10);
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
                query               | calls | rows 
 -----------------------------------+-------+------
- SELECT ?::TEXT                    |     1 |    1
- SELECT PLUS_ONE(?)                |     2 |    2
- SELECT PLUS_TWO(?)                |     2 |    2
+ SELECT $1::TEXT                   |     1 |    1
+ SELECT PLUS_ONE($1)               |     2 |    2
+ SELECT PLUS_TWO($1)               |     2 |    2
  SELECT pg_stat_statements_reset() |     1 |    1
 (4 rows)
 
@@ -308,10 +347,10 @@ SELECT PLUS_ONE(1);
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
                query               | calls | rows 
 -----------------------------------+-------+------
- SELECT (i + ? + ?)::INTEGER       |     2 |    2
- SELECT (i + ?)::INTEGER LIMIT ?   |     2 |    2
- SELECT PLUS_ONE(?)                |     2 |    2
- SELECT PLUS_TWO(?)                |     2 |    2
+ SELECT (i + $2 + $3)::INTEGER     |     2 |    2
+ SELECT (i + $2)::INTEGER LIMIT $3 |     2 |    2
+ SELECT PLUS_ONE($1)               |     2 |    2
+ SELECT PLUS_TWO($1)               |     2 |    2
  SELECT pg_stat_statements_reset() |     1 |    1
 (5 rows)
 
@@ -352,7 +391,7 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  DROP FUNCTION PLUS_TWO(INTEGER)           |     1 |    0
  DROP TABLE IF EXISTS test                 |     3 |    0
  DROP TABLE test                           |     1 |    0
- SELECT ?                                  |     1 |    1
+ SELECT $1                                 |     1 |    1
  SELECT pg_stat_statements_reset()         |     1 |    1
 (8 rows)
 
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 62dec8768a..ac295b61c9 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -27,8 +27,8 @@
  * to blame query costs on the proper queryId.
  *
  * To facilitate presenting entries to users, we create "representative" query
- * strings in which constants are replaced with '?' characters, to make it
- * clearer what a normalized entry can represent.  To save on shared memory,
+ * strings in which constants are replaced with parameters (such as $1), to make
+ * it clearer what a normalized entry can represent.  To save on shared memory,
  * and to avoid having to truncate oversized query strings, we store these
  * strings in a temporary external query-texts file.  Offsets into this
  * file are kept in shared memory.
@@ -219,6 +219,9 @@ typedef struct pgssJumbleState
 
 	/* Current number of valid entries in clocations array */
 	int			clocations_count;
+
+	/* highest param id we've seen, in order to start normalization correctly */
+	int			highest_extern_param_id;
 } pgssJumbleState;
 
 /*---- Local variables ----*/
@@ -803,6 +806,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
 	jstate.clocations = (pgssLocationLen *)
 		palloc(jstate.clocations_buf_size * sizeof(pgssLocationLen));
 	jstate.clocations_count = 0;
+	jstate.highest_extern_param_id = 0;
 
 	/* Compute query ID and mark the Query node with it */
 	JumbleQuery(&jstate, query);
@@ -2478,6 +2482,11 @@ JumbleExpr(pgssJumbleState *jstate, Node *node)
 				APP_JUMB(p->paramkind);
 				APP_JUMB(p->paramid);
 				APP_JUMB(p->paramtype);
+
+				if (p->paramkind == PARAM_EXTERN &&
+					p->paramid > jstate->highest_extern_param_id) {
+					jstate->highest_extern_param_id = p->paramid;
+				}
 			}
 			break;
 		case T_Aggref:
@@ -2925,7 +2934,8 @@ RecordConstLocation(pgssJumbleState *jstate, int location)
  * of interest, so it's worth doing.)
  *
  * *query_len_p contains the input string length, and is updated with
- * the result string length (which cannot be longer) on exit.
+ * the result string length on exit. The resulting string might be longer
+ * depending on the number of $n replacement characters.
  *
  * Returns a palloc'd string.
  */
@@ -2940,7 +2950,9 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 				quer_loc = 0,	/* Source query byte location */
 				n_quer_loc = 0, /* Normalized query byte location */
 				last_off = 0,	/* Offset from start for previous tok */
-				last_tok_len = 0;		/* Length (in bytes) of that tok */
+				last_tok_len = 0,		/* Length (in bytes) of that tok */
+				norm_query_buflen = 0, /* Size of the normalized query buffer */
+				highest_param_len = 0; /* Length of highest replacement char */
 
 	/*
 	 * Get constants' lengths (core system only gives us locations).  Note
@@ -2948,8 +2960,14 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 	 */
 	fill_in_constant_lengths(jstate, query, query_loc);
 
+	/* Accomodate the additional query replacement characters */
+	highest_param_len = 1 + floor(log10(jstate->clocations_count +
+										jstate->highest_extern_param_id));
+	norm_query_buflen = query_len;
+	norm_query_buflen += jstate->clocations_count * highest_param_len;
+
 	/* Allocate result buffer */
-	norm_query = palloc(query_len + 1);
+	norm_query = palloc(norm_query_buflen + 1);
 
 	for (i = 0; i < jstate->clocations_count; i++)
 	{
@@ -2973,8 +2991,9 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 		memcpy(norm_query + n_quer_loc, query + quer_loc, len_to_wrt);
 		n_quer_loc += len_to_wrt;
 
-		/* And insert a '?' in place of the constant token */
-		norm_query[n_quer_loc++] = '?';
+		/* And insert a param in place of the constant token */
+		n_quer_loc += sprintf(norm_query + n_quer_loc, "$%d", i + 1 +
+							  jstate->highest_extern_param_id);
 
 		quer_loc = off + tok_len;
 		last_off = off;
@@ -2991,7 +3010,7 @@ generate_normalized_query(pgssJumbleState *jstate, const char *query,
 	memcpy(norm_query + n_quer_loc, query + quer_loc, len_to_wrt);
 	n_quer_loc += len_to_wrt;
 
-	Assert(n_quer_loc <= query_len);
+	Assert(n_quer_loc <= norm_query_buflen);
 	norm_query[n_quer_loc] = '\0';
 
 	*query_len_p = n_quer_loc;
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index 385c8a8d99..4cfe745ad6 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -37,12 +37,20 @@ SELECT :add + 1 + 1 AS "add" \gset
 -- set operator
 SELECT 1 AS i UNION SELECT 2 ORDER BY i;
 
+-- ? operator
+select '{"a":1, "b":2}'::jsonb ? 'b';
+
 -- cte
 WITH t(f) AS (
   VALUES (1.0), (2.0)
 )
   SELECT f FROM t ORDER BY f;
 
+-- prepared statement
+PREPARE pgss_test (int) AS SELECT $1, 'test' LIMIT 1;
+EXECUTE pgss_test(1);
+DEALLOCATE pgss_test;
+
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
 --
@@ -74,6 +82,9 @@ UPDATE test SET b = '555' WHERE a = 5 \;
 UPDATE test SET b = '666' WHERE a = 6 \;
 COMMIT ;
 
+-- many INSERT values
+INSERT INTO test (a, b) VALUES (1, 'a'), (2, 'b'), (3, 'c');
+
 -- SELECT with constants
 SELECT * FROM test WHERE a > 5 ORDER BY a ;
 
@@ -85,6 +96,9 @@ SELECT *
 -- SELECT without constants
 SELECT * FROM test ORDER BY a;
 
+-- SELECT with IN clause
+SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5);
+
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
 --
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index d4f09fc2a3..d674339cbf 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -245,9 +245,11 @@
 
   <para>
    When a constant's value has been ignored for purposes of matching the
-   query to other queries, the constant is replaced by <literal>?</literal>
-   in the <structname>pg_stat_statements</> display.  The rest of the query
-   text is that of the first query that had the particular
+   query to other queries, the constant is replaced by parameter symbols such as
+   <literal>$1</literal> in the <structname>pg_stat_statements</> display. The
+   parameter symbols start incrementing from the next one after the highest $n
+   parameter in the original query text, or $1 if none is present. The rest of
+   the query text is that of the first query that had the particular
    <structfield>queryid</> hash value associated with the
    <structname>pg_stat_statements</> entry.
   </para>
@@ -481,13 +483,13 @@ bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit /
                nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
           FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;
 -[ RECORD 1 ]---------------------------------------------------------------------
-query       | UPDATE pgbench_branches SET bbalance = bbalance + ? WHERE bid = ?;
+query       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
 calls       | 3000
 total_time  | 9609.00100000002
 rows        | 2836
 hit_percent | 99.9778970000200936
 -[ RECORD 2 ]---------------------------------------------------------------------
-query       | UPDATE pgbench_tellers SET tbalance = tbalance + ? WHERE tid = ?;
+query       | UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2;
 calls       | 3000
 total_time  | 8015.156
 rows        | 2990
@@ -499,7 +501,7 @@ total_time  | 310.624
 rows        | 100000
 hit_percent | 0.30395136778115501520
 -[ RECORD 4 ]---------------------------------------------------------------------
-query       | UPDATE pgbench_accounts SET abalance = abalance + ? WHERE aid = ?;
+query       | UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2;
 calls       | 3000
 total_time  | 271.741999999997
 rows        | 3000
#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Lukas Fittl (#15)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

Lukas Fittl <lukas@fittl.com> writes:

On Sat, Mar 18, 2017 at 12:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

So it turns out this discussion just reinvented the alternative that
Lukas had in his 0002 proposal. Are there any remaining objections
to proceeding with that approach?

Thanks for reviewing - updated patch attached, comments below.

Pushed with minor adjustments.

The main non-cosmetic thing I did was to replace the floor(log10())
business with plain constant "10" as I suggested before. That's
what we do in other places --- see int4out for an example --- and
frankly I did not feel that a small space savings in a transient
string buffer was worth the intellectual effort to verify whether
that calculation was correct or not, never mind whatever runtime
cycles it would take. I don't believe the argument that it's safer
your way: if you had an off-by-one thinko in the calculation, or even
just roundoff error in the log10() call, it could result in an actual
reachable buffer overrun, because there's no safety margin.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Lukas Fittl
lukas@fittl.com
In reply to: Tom Lane (#16)
Re: [PATCH] Use $ parameters as replacement characters for pg_stat_statements

On Mon, Mar 27, 2017 at 8:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pushed with minor adjustments.

Excellent - thanks for your review and all the discussion here!

The main non-cosmetic thing I did was to replace the floor(log10())
business with plain constant "10" as I suggested before. That's
what we do in other places --- see int4out for an example --- and
frankly I did not feel that a small space savings in a transient
string buffer was worth the intellectual effort to verify whether
that calculation was correct or not, never mind whatever runtime
cycles it would take. I don't believe the argument that it's safer
your way: if you had an off-by-one thinko in the calculation, or even
just roundoff error in the log10() call, it could result in an actual
reachable buffer overrun, because there's no safety margin.

Makes sense, guess I was overthinking this one.

Best,
Lukas

--
Lukas Fittl

Skype: lfittl
Phone: +1 415 321 0630 <(415)%20321-0630>