[PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

Started by Imseih (AWS), Sami · over 3 years ago · 11 messages
#1 Imseih (AWS), Sami
simseih@amazon.com
1 attachment(s)

Adding a plan_id to pg_stat_activity allows users
to determine whether the plan for a particular
statement has changed, and whether the new plan
performs better or worse than the previous one.

There are several ways the plan_id in pg_stat_activity
can be used:

1. In extensions that expose the plan text.
This will allow users to map a plan_id
from pg_stat_activity to the plan text
(see the sketch after this list).

2. In EXPLAIN output, including auto_explain.

3. In statement logging.
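
As a sketch of the first use case: assuming a hypothetical
extension view pg_plan_texts(plan_id, plan_text) that captures
plan text (it is not part of this patch), the mapping could
look like this:

-- pg_plan_texts is an assumed, hypothetical extension view
select a.pid,
       a.query_id,
       a.plan_id,
       p.plan_text
from pg_stat_activity a
join pg_plan_texts p on p.plan_id = a.plan_id
where a.state = 'active';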

Computing the plan_id can be done using the same
routines used for query jumbling, except that plan
nodes are jumbled instead. This approach was inspired
by work done in the pg_stat_plans extension:
https://github.com/2ndQuadrant/pg_stat_plans/
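
For intuition: because the plan_id is a jumble of the plan
nodes, forcing a different plan shape for the same statement
should keep the query identifier stable while changing the
plan identifier. A rough session sketch (assumes the POC patch
with compute_query_id enabled; identifier values omitted):

set compute_query_id = on;
explain verbose select * from pgbench_accounts where aid = 1;
-- reports some Query Identifier and Plan Identifier
set enable_indexscan = off;
set enable_bitmapscan = off;
explain verbose select * from pgbench_accounts where aid = 1;
-- same Query Identifier, but a different Plan Identifier,
-- since a seq scan jumbles differently than an index scan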

Attached is a POC patch that computes the plan_id
and presents the top-level plan_id in pg_stat_activity.

The patch still has work left:
- Perhaps moving the plan jumbler outside of queryjumble.c?
- In the POC, the compute_query_id GUC determines whether a
plan_id is computed. Should this be a separate GUC?

-- Below is the output of sampling pg_stat_activity
-- with a pgbench workload running. The patch
-- introduces the plan_id column.

select count(*),
query,
query_id,
plan_id
from pg_stat_activity
where state='active'
and plan_id is not null and query_id is not null
group by query, query_id, plan_id
order by 1 desc limit 1;

-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------
count | 1
query | INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (7, 8, 242150, -1471, CURRENT_TIMESTAMP);
query_id | 4535829358544711074
plan_id | -4913142083940981109

-- Also, a new view called pg_stat_statements_plan is added,
-- which includes all the same columns as pg_stat_statements,
-- but with statistics shown per plan.

postgres=# select substr(query, 1, 10) as query, queryid, planid, calls from pg_stat_statements_plan where queryid = 4535829358544711074;
-[ RECORD 1 ]-----------------
query | INSERT INT
queryid | 4535829358544711074
planid | -4913142083940981109
calls | 4274428
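
-- This per-plan breakdown is what enables plan-change detection.
-- As a sketch, the following query against the patched view lists
-- statements that have executed with more than one plan, so a
-- regression can be correlated with a plan switch:

select queryid,
       count(distinct planid) as plan_count,
       min(mean_exec_time) as best_mean_ms,
       max(mean_exec_time) as worst_mean_ms
from pg_stat_statements_plan
group by queryid
having count(distinct planid) > 1
order by worst_mean_ms - best_mean_ms desc;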

-- The existing pg_stat_statements view shows
-- stats aggregated at the queryid level.
-- This is the current behavior.

postgres=# select substr(query, 1, 10) as query, queryid, calls from pg_stat_statements where queryid = 4535829358544711074;
-[ RECORD 1 ]----------------
query | INSERT INT
queryid | 4535829358544711074
calls | 4377142

-- The "%Q" log_line_prefix escape will also include the planid as part of the output;
-- the format will be "query_id/plan_id".
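
-- The auto_explain sample below was presumably generated with
-- settings along these lines (a sketch; the exact prefix used
-- is an assumption):
--   In postgresql.conf: log_line_prefix = '%m [%p] [%Q] '
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;
SET auto_explain.log_verbose = on;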

-- An example of using auto_explain with "%Q" in log_line_prefix.
2022-06-14 17:08:10.485 CDT [76955] [4912312221998332774/-2294484545013135901] LOG: duration: 0.144 ms plan:
Query Text: UPDATE pgbench_tellers SET tbalance = tbalance + -1952 WHERE tid = 32;
Update on public.pgbench_tellers (cost=0.27..8.29 rows=0 width=0)
-> Index Scan using pgbench_tellers_pkey on public.pgbench_tellers (cost=0.27..8.29 rows=1 width=10)
Output: (tbalance + '-1952'::integer), ctid
Index Cond: (pgbench_tellers.tid = 32)

-- The output of EXPLAIN VERBOSE also shows a plan identifier.

postgres=# explain verbose select 1;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=4)
Output: 1
Query Identifier: -2698492627503961632
Plan Identifier: -7861780579971713347
(4 rows)

Thanks,

Sami Imseih
Amazon Web Services

Attachments:

POC-Compute-Plan-ID-in-Core.patch (application/octet-stream)
From 7cd8ce4552b8e73baa8b607e7bbd4f4f32b384fb Mon Sep 17 00:00:00 2001
From: "Imseih (AWS)" <simseih@88665a22795f.ant.amazon.com>
Date: Wed, 15 Jun 2022 12:27:09 -0500
Subject: [PATCH 1/1] Compute Plan Id in Core

---
 contrib/pg_stat_statements/Makefile           |   1 +
 .../pg_stat_statements--1.10--1.11.sql        | 142 +++++
 .../pg_stat_statements/pg_stat_statements.c   | 290 +++++++---
 .../pg_stat_statements.control                |   2 +-
 doc/src/sgml/config.sgml                      |   2 +-
 doc/src/sgml/monitoring.sgml                  |  16 +
 doc/src/sgml/pgstatstatements.sgml            | 454 +++++++++++++++
 src/backend/catalog/system_views.sql          |   1 +
 src/backend/commands/explain.c                |   2 +
 src/backend/executor/execMain.c               |   8 +-
 src/backend/executor/execParallel.c           |   1 +
 src/backend/nodes/copyfuncs.c                 |   1 +
 src/backend/nodes/outfuncs.c                  |   1 +
 src/backend/nodes/readfuncs.c                 |   1 +
 src/backend/optimizer/plan/planner.c          |   7 +
 src/backend/tcop/postgres.c                   |   1 +
 src/backend/utils/activity/backend_status.c   |  65 +++
 src/backend/utils/adt/pgstatfuncs.c           |   8 +-
 src/backend/utils/error/csvlog.c              |   3 +
 src/backend/utils/error/elog.c                |  11 +
 src/backend/utils/error/jsonlog.c             |   3 +
 src/backend/utils/misc/postgresql.conf.sample |   2 +-
 src/backend/utils/misc/queryjumble.c          | 533 +++++++++++++++++-
 src/include/catalog/pg_proc.dat               |   6 +-
 src/include/nodes/plannodes.h                 |   1 +
 src/include/utils/backend_status.h            |   7 +-
 src/include/utils/queryjumble.h               |   2 +
 src/test/regress/expected/explain.out         |   3 +-
 src/test/regress/expected/rules.out           |   9 +-
 29 files changed, 1493 insertions(+), 90 deletions(-)
 create mode 100644 contrib/pg_stat_statements/pg_stat_statements--1.10--1.11.sql

diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index edc40c8..8d78797 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -7,6 +7,7 @@ OBJS = \
 
 EXTENSION = pg_stat_statements
 DATA = pg_stat_statements--1.4.sql \
+	pg_stat_statements--1.10--1.11.sql \
 	pg_stat_statements--1.9--1.10.sql pg_stat_statements--1.8--1.9.sql \
 	pg_stat_statements--1.7--1.8.sql pg_stat_statements--1.6--1.7.sql \
 	pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.10--1.11.sql b/contrib/pg_stat_statements/pg_stat_statements--1.10--1.11.sql
new file mode 100644
index 0000000..bd51394
--- /dev/null
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.10--1.11.sql
@@ -0,0 +1,142 @@
+/* contrib/pg_stat_statements/pg_stat_statements--1.10--1.11.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_stat_statements UPDATE TO '1.11'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements;
+ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements(boolean);
+
+/* Then we can drop them */
+DROP VIEW pg_stat_statements;
+DROP FUNCTION pg_stat_statements(boolean);
+
+/* Now redefine */
+CREATE FUNCTION pg_stat_statements(IN showtext boolean,
+    OUT userid oid,
+    OUT dbid oid,
+    OUT toplevel bool,
+    OUT queryid bigint,
+    OUT planid bigint,
+    OUT query text,
+    OUT plans int8,
+    OUT total_plan_time float8,
+    OUT min_plan_time float8,
+    OUT max_plan_time float8,
+    OUT mean_plan_time float8,
+    OUT stddev_plan_time float8,
+    OUT calls int8,
+    OUT total_exec_time float8,
+    OUT min_exec_time float8,
+    OUT max_exec_time float8,
+    OUT mean_exec_time float8,
+    OUT stddev_exec_time float8,
+    OUT rows int8,
+    OUT shared_blks_hit int8,
+    OUT shared_blks_read int8,
+    OUT shared_blks_dirtied int8,
+    OUT shared_blks_written int8,
+    OUT local_blks_hit int8,
+    OUT local_blks_read int8,
+    OUT local_blks_dirtied int8,
+    OUT local_blks_written int8,
+    OUT temp_blks_read int8,
+    OUT temp_blks_written int8,
+    OUT blk_read_time float8,
+    OUT blk_write_time float8,
+    OUT temp_blk_read_time float8,
+    OUT temp_blk_write_time float8,
+    OUT wal_records int8,
+    OUT wal_fpi int8,
+    OUT wal_bytes numeric,
+    OUT jit_functions int8,
+    OUT jit_generation_time float8,
+    OUT jit_inlining_count int8,
+    OUT jit_inlining_time float8,
+    OUT jit_optimization_count int8,
+    OUT jit_optimization_time float8,
+    OUT jit_emission_count int8,
+    OUT jit_emission_time float8
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME', 'pg_stat_statements_1_11'
+LANGUAGE C STRICT VOLATILE PARALLEL SAFE;
+
+CREATE VIEW pg_stat_statements AS
+select
+    userid,
+    dbid,
+    toplevel,
+    queryid,
+    query,
+    SUM(plans) as plans,
+    SUM(total_plan_time) as total_plan_time,
+    MIN(min_plan_time) as min_plan_time,
+    MAX(max_plan_time) as max_plan_time,
+    SUM(mean_plan_time) as mean_plan_time,
+    SUM(stddev_plan_time) as stddev_plan_time,
+    SUM(calls) as calls,
+    SUM(total_exec_time) as total_exec_time,
+    MIN(min_exec_time) as min_exec_time,
+    MAX(max_exec_time) as max_exec_time,
+    SUM(mean_exec_time) as mean_exec_time,
+    SUM(stddev_exec_time) as stddev_exec_time,
+    SUM(rows) as rows,
+    SUM(shared_blks_hit) as shared_blks_hit,
+    SUM(shared_blks_read) as shared_blks_read,
+    SUM(shared_blks_dirtied) as shared_blks_dirtied,
+    SUM(shared_blks_written) as shared_blks_written,
+    SUM(local_blks_hit) as local_blks_hit,
+    SUM(local_blks_read) as local_blks_read,
+    SUM(local_blks_dirtied) as local_blks_dirtied,
+    SUM(local_blks_written) as local_blks_written,
+    SUM(temp_blks_read) as temp_blks_read,
+    SUM(temp_blks_written) as temp_blks_written,
+    SUM(blk_read_time) as blk_read_time,
+    SUM(blk_write_time) as blk_write_time,
+    SUM(temp_blk_read_time) as temp_blk_read_time,
+    SUM(temp_blk_write_time) as temp_blk_write_time,
+    SUM(wal_records) as wal_records,
+    SUM(wal_fpi) as wal_fpi,
+    SUM(wal_bytes) as wal_bytes,
+    SUM(jit_functions) as jit_functions,
+    SUM(jit_generation_time) as jit_generation_time,
+    SUM(jit_inlining_count) as jit_inlining_count,
+    SUM(jit_inlining_time) as jit_inlining_time,
+    SUM(jit_optimization_count) as jit_optimization_count,
+    SUM(jit_optimization_time) as jit_optimization_time,
+    SUM(jit_emission_count) as jit_emission_count,
+    SUM(jit_emission_time) as jit_emission_time
+from pg_stat_statements(true)
+group by
+    userid,
+    dbid,
+    toplevel,
+    queryid,
+    query;
+
+GRANT SELECT ON pg_stat_statements TO PUBLIC;
+
+CREATE VIEW pg_stat_statements_plan AS
+	select * from pg_stat_statements(true);
+
+GRANT SELECT ON pg_stat_statements_plan TO PUBLIC;
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements_reset(oid, oid, bigint);
+
+/* Then we can drop them */
+DROP FUNCTION pg_stat_statements_reset(oid, oid, bigint);
+
+/* Now redefine */
+CREATE FUNCTION pg_stat_statements_reset(IN userid Oid DEFAULT 0,
+    IN dbid Oid DEFAULT 0,
+    IN queryid bigint DEFAULT 0,
+    IN planid bigint DEFAULT 0
+)
+RETURNS void
+AS 'MODULE_PATHNAME', 'pg_stat_statements_reset_1_11'
+LANGUAGE C STRICT PARALLEL SAFE;
+
+-- Don't want this to be available to non-superusers.
+REVOKE ALL ON FUNCTION pg_stat_statements_reset(Oid, Oid, bigint, bigint) FROM PUBLIC;
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 768cedd..c982d41 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -118,7 +118,8 @@ typedef enum pgssVersion
 	PGSS_V1_3,
 	PGSS_V1_8,
 	PGSS_V1_9,
-	PGSS_V1_10
+	PGSS_V1_10,
+	PGSS_V1_11
 } pgssVersion;
 
 typedef enum pgssStoreKind
@@ -150,9 +151,20 @@ typedef struct pgssHashKey
 	Oid			userid;			/* user OID */
 	Oid			dbid;			/* database OID */
 	uint64		queryid;		/* query identifier */
+	uint64		planid;			/* plan identifier */
 	bool		toplevel;		/* query executed at top level */
 } pgssHashKey;
 
+/*
+ * Hashtable key that defines the identity of a query
+ * in the external query file.
+ */
+typedef struct qtextHashKey
+{
+	uint64		queryid;		/* query identifier */
+} qtextHashKey;
+
+
 /*
  * The actual stats counters kept within pgssEntry.
  */
@@ -228,6 +240,23 @@ typedef struct pgssEntry
 	slock_t		mutex;			/* protects the counters only */
 } pgssEntry;
 
+/*
+ * Offsets per query text
+ *
+ * This is a hash that allows for mapping the offsets
+ * of a query in the external text file using a queryid.
+ * This is required so when we attempt to allocate an
+ * entry in pgss_hash with a planid, we can do a lookup
+ * of the query text offsets. See comments in pgss_store
+ * for more details.
+ */
+typedef struct qtextEntry
+{
+	qtextHashKey key;           /* hash key of entry - MUST BE FIRST */
+	Size        query_offset;   /* query text offset in external file */
+	int         query_len;      /* # of valid bytes in query string, or -1 */
+} qtextEntry;
+
 /*
  * Global shared state
  */
@@ -265,6 +294,7 @@ static ProcessUtility_hook_type prev_ProcessUtility = NULL;
 /* Links to shared memory state */
 static pgssSharedState *pgss = NULL;
 static HTAB *pgss_hash = NULL;
+static HTAB *qtext_hash = NULL;
 
 /*---- GUC variables ----*/
 
@@ -309,11 +339,13 @@ void		_PG_init(void);
 
 PG_FUNCTION_INFO_V1(pg_stat_statements_reset);
 PG_FUNCTION_INFO_V1(pg_stat_statements_reset_1_7);
+PG_FUNCTION_INFO_V1(pg_stat_statements_reset_1_11);
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_2);
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_3);
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_8);
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_9);
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_10);
+PG_FUNCTION_INFO_V1(pg_stat_statements_1_11);
 PG_FUNCTION_INFO_V1(pg_stat_statements);
 PG_FUNCTION_INFO_V1(pg_stat_statements_info);
 
@@ -337,7 +369,7 @@ static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 								ProcessUtilityContext context, ParamListInfo params,
 								QueryEnvironment *queryEnv,
 								DestReceiver *dest, QueryCompletion *qc);
-static void pgss_store(const char *query, uint64 queryId,
+static void pgss_store(const char *query, uint64 queryId, uint64 planId,
 					   int query_location, int query_len,
 					   pgssStoreKind kind,
 					   double total_time, uint64 rows,
@@ -359,7 +391,7 @@ static char *qtext_fetch(Size query_offset, int query_len,
 						 char *buffer, Size buffer_size);
 static bool need_gc_qtexts(void);
 static void gc_qtexts(void);
-static void entry_reset(Oid userid, Oid dbid, uint64 queryid);
+static void entry_reset(Oid userid, Oid dbid, uint64 queryid, uint64 planid);
 static char *generate_normalized_query(JumbleState *jstate, const char *query,
 									   int query_loc, int *query_len_p);
 static void fill_in_constant_lengths(JumbleState *jstate, const char *query,
@@ -516,6 +548,7 @@ pgss_shmem_startup(void)
 	/* reset in case this is a restart within the postmaster */
 	pgss = NULL;
 	pgss_hash = NULL;
+	qtext_hash = NULL;
 
 	/*
 	 * Create or attach to the shared memory state, including hash table
@@ -547,6 +580,17 @@ pgss_shmem_startup(void)
 							  &info,
 							  HASH_ELEM | HASH_BLOBS);
 
+	/*
+	 * It may be ok to set the size of the qtext_hash to
+	 * something much smaller than pgss_max.
+	 */
+	info.keysize = sizeof(qtextHashKey);
+	info.entrysize = sizeof(qtextEntry);
+	qtext_hash = ShmemInitHash("pg_stat_statements query text hash",
+							  pgss_max, pgss_max,
+							  &info,
+							  HASH_ELEM | HASH_BLOBS);
+
 	LWLockRelease(AddinShmemInitLock);
 
 	/*
@@ -735,7 +779,7 @@ pgss_shmem_shutdown(int code, Datum arg)
 		return;
 
 	/* Safety check ... shouldn't get here unless shmem is set up. */
-	if (!pgss || !pgss_hash)
+	if (!pgss || !pgss_hash || !qtext_hash)
 		return;
 
 	/* Don't dump if told not to. */
@@ -827,7 +871,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 		prev_post_parse_analyze_hook(pstate, query, jstate);
 
 	/* Safety check... */
-	if (!pgss || !pgss_hash || !pgss_enabled(exec_nested_level))
+	if (!pgss || !pgss_hash || !pgss_enabled(exec_nested_level) || !qtext_hash)
 		return;
 
 	/*
@@ -852,6 +896,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 	if (jstate && jstate->clocations_count > 0)
 		pgss_store(pstate->p_sourcetext,
 				   query->queryId,
+				   0,
 				   query->stmt_location,
 				   query->stmt_len,
 				   PGSS_INVALID,
@@ -937,6 +982,7 @@ pgss_planner(Query *parse,
 
 		pgss_store(query_string,
 				   parse->queryId,
+				   result->planId,
 				   parse->stmt_location,
 				   parse->stmt_len,
 				   PGSS_PLAN,
@@ -1044,6 +1090,7 @@ static void
 pgss_ExecutorEnd(QueryDesc *queryDesc)
 {
 	uint64		queryId = queryDesc->plannedstmt->queryId;
+	uint64		planId = queryDesc->plannedstmt->planId;
 
 	if (queryId != UINT64CONST(0) && queryDesc->totaltime &&
 		pgss_enabled(exec_nested_level))
@@ -1056,6 +1103,7 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
 
 		pgss_store(queryDesc->sourceText,
 				   queryId,
+				   planId,
 				   queryDesc->plannedstmt->stmt_location,
 				   queryDesc->plannedstmt->stmt_len,
 				   PGSS_EXEC,
@@ -1174,6 +1222,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 
 		pgss_store(queryString,
 				   saved_queryId,
+				   0,
 				   pstmt->stmt_location,
 				   pstmt->stmt_len,
 				   PGSS_EXEC,
@@ -1202,14 +1251,34 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
  *
  * If jstate is not NULL then we're trying to create an entry for which
  * we have no statistics as yet; we just want to record the normalized
- * query string.  total_time, rows, bufusage and walusage are ignored in this
- * case.
+ * query string in the external text file and qtext_hash.
  *
  * If kind is PGSS_PLAN or PGSS_EXEC, its value is used as the array position
  * for the arrays in the Counters field.
+ *
+ * How do qtext_hash and pgss_hash work together?
+ * When we need to allocate a new entry in pgss_hash, we need
+ * to have the offsets of the query text.
+ *
+ * If a query text was previously added to pgss_hash using a
+ * different planid, we can consult qtext_hash for the offsets.
+ *
+ * If the offsets are found in qtext_hash, use them in the new
+ * pgss_hash entry, else add the text to the external file and
+ * qtext_hash. After that the new pgss_hash entry can be allocated.
+ *
+ * If qtext_hash runs out of memory, we remove all the entries from
+ * the hash and retry. There is an edge case where, after a qtext_hash
+ * cleanup, a new pgss_hash entry with a different plan_id for an
+ * existing query text will end up duplicating the SQL text in
+ * the external file. This should not be a common case, and the file
+ * will be cleaned up by gc_qtexts soon after.
+ *
+ * The above interaction occurs while holding an exclusive lock on
+ * pgss->lock.
  */
 static void
-pgss_store(const char *query, uint64 queryId,
+pgss_store(const char *query, uint64 queryId, uint64 planId,
 		   int query_location, int query_len,
 		   pgssStoreKind kind,
 		   double total_time, uint64 rows,
@@ -1219,14 +1288,16 @@ pgss_store(const char *query, uint64 queryId,
 		   JumbleState *jstate)
 {
 	pgssHashKey key;
+	qtextHashKey qkey;
 	pgssEntry  *entry;
+	qtextEntry *qentry;
 	char	   *norm_query = NULL;
 	int			encoding = GetDatabaseEncoding();
 
 	Assert(query != NULL);
 
 	/* Safety check... */
-	if (!pgss || !pgss_hash)
+	if (!pgss || !pgss_hash || !qtext_hash)
 		return;
 
 	/*
@@ -1251,10 +1322,12 @@ pgss_store(const char *query, uint64 queryId,
 	key.userid = GetUserId();
 	key.dbid = MyDatabaseId;
 	key.queryid = queryId;
+	key.planid = planId;
+	qkey.queryid = queryId;
 	key.toplevel = (exec_nested_level == 0);
 
 	/* Lookup the hash table entry with shared lock. */
-	LWLockAcquire(pgss->lock, LW_SHARED);
+	LWLockAcquire(pgss->lock, LW_SHARED);
 
 	entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
 
@@ -1264,57 +1337,87 @@ pgss_store(const char *query, uint64 queryId,
 		Size		query_offset;
 		int			gc_count;
 		bool		stored;
-		bool		do_gc;
+		bool		do_gc = false;
 
-		/*
-		 * Create a new, normalized query string if caller asked.  We don't
-		 * need to hold the lock while doing this work.  (Note: in any case,
-		 * it's possible that someone else creates a duplicate hashtable entry
-		 * in the interval where we don't hold the lock below.  That case is
-		 * handled by entry_alloc.)
-		 */
-		if (jstate)
+		qentry = (qtextEntry *) hash_search(qtext_hash, &qkey, HASH_FIND, NULL);
+		if (!qentry)
 		{
+			/*
+			 * Create a new, normalized query string if caller asked.  We don't
+			 * need to hold the lock while doing this work.  (Note: in any case,
+			 * it's possible that someone else creates a duplicate hashtable entry
+			 * in the interval where we don't hold the lock below.  That case is
+			 * handled by entry_alloc.)
+			 */
+			if (jstate)
+			{
+				LWLockRelease(pgss->lock);
+				norm_query = generate_normalized_query(jstate, query,
+													   query_location,
+													   &query_len);
+				LWLockAcquire(pgss->lock, LW_SHARED);
+			}
+
+			/* Append new query text to file with only shared lock held */
+			stored = qtext_store(norm_query ? norm_query : query, query_len,
+								 &query_offset, &gc_count);
+
+			/*
+			 * Determine whether we need to garbage collect external query texts
+			 * while the shared lock is still held.  This micro-optimization
+			 * avoids taking the time to decide this while holding exclusive lock.
+			 */
+			do_gc = need_gc_qtexts();
+
+			/* Need exclusive lock to make a new hashtable entry - promote */
 			LWLockRelease(pgss->lock);
-			norm_query = generate_normalized_query(jstate, query,
-												   query_location,
-												   &query_len);
-			LWLockAcquire(pgss->lock, LW_SHARED);
-		}
+			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
 
-		/* Append new query text to file with only shared lock held */
-		stored = qtext_store(norm_query ? norm_query : query, query_len,
-							 &query_offset, &gc_count);
+			/*
+			 * A garbage collection may have occurred while we weren't holding the
+			 * lock.  In the unlikely event that this happens, the query text we
+			 * stored above will have been garbage collected, so write it again.
+			 * This should be infrequent enough that doing it while holding
+			 * exclusive lock isn't a performance problem.
+			 */
+			if (!stored || pgss->gc_count != gc_count)
+				stored = qtext_store(norm_query ? norm_query : query, query_len,
+									 &query_offset, NULL);
 
-		/*
-		 * Determine whether we need to garbage collect external query texts
-		 * while the shared lock is still held.  This micro-optimization
-		 * avoids taking the time to decide this while holding exclusive lock.
-		 */
-		do_gc = need_gc_qtexts();
+			/* If we failed to write to the text file, give up */
+			if (!stored)
+				goto done;
 
-		/* Need exclusive lock to make a new hashtable entry - promote */
-		LWLockRelease(pgss->lock);
-		LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+			qentry = hash_search(qtext_hash, &qkey, HASH_ENTER_NULL, NULL);
 
-		/*
-		 * A garbage collection may have occurred while we weren't holding the
-		 * lock.  In the unlikely event that this happens, the query text we
-		 * stored above will have been garbage collected, so write it again.
-		 * This should be infrequent enough that doing it while holding
-		 * exclusive lock isn't a performance problem.
-		 */
-		if (!stored || pgss->gc_count != gc_count)
-			stored = qtext_store(norm_query ? norm_query : query, query_len,
-								 &query_offset, NULL);
+			if (!qentry)
+			{
+				qtextEntry *qqentry;
+				HASH_SEQ_STATUS hashq_seq;
 
-		/* If we failed to write to the text file, give up */
-		if (!stored)
-			goto done;
+				/* Remove all entries. */
+				hash_seq_init(&hashq_seq, qtext_hash);
+				while ((qqentry = hash_seq_search(&hashq_seq)) != NULL)
+					hash_search(qtext_hash, &qqentry->key, HASH_REMOVE, NULL);
 
-		/* OK to create a new hashtable entry */
-		entry = entry_alloc(&key, query_offset, query_len, encoding,
-							jstate != NULL);
+				qentry = hash_search(qtext_hash, &qkey, HASH_ENTER, NULL);
+				if (!qentry)
+					goto done;
+			}
+
+
+			qentry->query_offset = query_offset;
+			qentry->query_len = query_len;
+		} else
+		{
+			query_offset = qentry->query_offset;
+			query_len = qentry->query_len;
+		}
+
+		if (kind != PGSS_INVALID)
+			/* OK to create a new hashtable entry */
+			entry = entry_alloc(&key, query_offset, query_len, encoding,
+								jstate != NULL);
 
 		/* If needed, perform garbage collection while exclusive lock held */
 		if (do_gc)
@@ -1418,17 +1521,27 @@ done:
  * Reset statement statistics corresponding to userid, dbid, and queryid.
  */
 Datum
-pg_stat_statements_reset_1_7(PG_FUNCTION_ARGS)
-{
-	Oid			userid;
-	Oid			dbid;
-	uint64		queryid;
+pg_stat_statements_reset_1_11(PG_FUNCTION_ARGS)
+{
+	Oid			userid;
+	Oid			dbid;
+	uint64		queryid;
+	uint64		planid;
 
 	userid = PG_GETARG_OID(0);
 	dbid = PG_GETARG_OID(1);
 	queryid = (uint64) PG_GETARG_INT64(2);
+	planid = (uint64) PG_GETARG_INT64(3);
+
+	entry_reset(userid, dbid, queryid, planid);
+
+	PG_RETURN_VOID();
+}
 
-	entry_reset(userid, dbid, queryid);
+Datum
+pg_stat_statements_reset_1_7(PG_FUNCTION_ARGS)
+{
+	entry_reset(PG_GETARG_OID(0), PG_GETARG_OID(1), (uint64) PG_GETARG_INT64(2), 0);
 
 	PG_RETURN_VOID();
 }
@@ -1439,7 +1552,7 @@ pg_stat_statements_reset_1_7(PG_FUNCTION_ARGS)
 Datum
 pg_stat_statements_reset(PG_FUNCTION_ARGS)
 {
-	entry_reset(0, 0, 0);
+	entry_reset(0, 0, 0, 0);
 
 	PG_RETURN_VOID();
 }
@@ -1452,7 +1565,8 @@ pg_stat_statements_reset(PG_FUNCTION_ARGS)
 #define PG_STAT_STATEMENTS_COLS_V1_8	32
 #define PG_STAT_STATEMENTS_COLS_V1_9	33
 #define PG_STAT_STATEMENTS_COLS_V1_10	43
-#define PG_STAT_STATEMENTS_COLS			43	/* maximum of above */
+#define PG_STAT_STATEMENTS_COLS_V1_11	44
+#define PG_STAT_STATEMENTS_COLS			44	/* maximum of above */
 
 /*
  * Retrieve statement statistics.
@@ -1464,6 +1578,16 @@ pg_stat_statements_reset(PG_FUNCTION_ARGS)
  * expected API version is identified by embedding it in the C name of the
  * function.  Unfortunately we weren't bright enough to do that for 1.1.
  */
+Datum
+pg_stat_statements_1_11(PG_FUNCTION_ARGS)
+{
+	bool        showtext = PG_GETARG_BOOL(0);
+
+	pg_stat_statements_internal(fcinfo, PGSS_V1_11, showtext);
+
+	return (Datum) 0;
+}
+
 Datum
 pg_stat_statements_1_10(PG_FUNCTION_ARGS)
 {
@@ -1550,7 +1674,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
 	is_allowed_role = has_privs_of_role(userid, ROLE_PG_READ_ALL_STATS);
 
 	/* hash table must exist already */
-	if (!pgss || !pgss_hash)
+	if (!pgss || !pgss_hash || !qtext_hash)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("pg_stat_statements must be loaded via shared_preload_libraries")));
@@ -1594,6 +1718,10 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
 			if (api_version != PGSS_V1_10)
 				elog(ERROR, "incorrect number of output arguments");
 			break;
+		case PG_STAT_STATEMENTS_COLS_V1_11:
+			if (api_version != PGSS_V1_11)
+				elog(ERROR, "incorrect number of output arguments");
+			break;
 		default:
 			elog(ERROR, "incorrect number of output arguments");
 	}
@@ -1672,6 +1800,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
 		Counters	tmp;
 		double		stddev;
 		int64		queryid = entry->key.queryid;
+		int64       planid = entry->key.planid;
 
 		memset(values, 0, sizeof(values));
 		memset(nulls, 0, sizeof(nulls));
@@ -1686,6 +1815,9 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
 			if (api_version >= PGSS_V1_2)
 				values[i++] = Int64GetDatumFast(queryid);
 
+			if (api_version >= PGSS_V1_11)
+				values[i++] = Int64GetDatumFast(planid);
+
 			if (showtext)
 			{
 				char	   *qstr = qtext_fetch(entry->query_offset,
@@ -1724,6 +1856,10 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
 			if (api_version >= PGSS_V1_2)
 				nulls[i++] = true;
 
+			/* Don't show planid */
+			if (api_version >= PGSS_V1_11)
+				nulls[i++] = true;
+
 			/*
 			 * Don't show query text, but hint as to the reason for not doing
 			 * so if it was requested
@@ -1835,6 +1971,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
 					 api_version == PGSS_V1_8 ? PG_STAT_STATEMENTS_COLS_V1_8 :
 					 api_version == PGSS_V1_9 ? PG_STAT_STATEMENTS_COLS_V1_9 :
 					 api_version == PGSS_V1_10 ? PG_STAT_STATEMENTS_COLS_V1_10 :
+					 api_version == PGSS_V1_11 ? PG_STAT_STATEMENTS_COLS_V1_11 :
 					 -1 /* fail if you forget to update this assert */ ));
 
 		tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
@@ -1860,7 +1997,7 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
 	Datum		values[PG_STAT_STATEMENTS_INFO_COLS];
 	bool		nulls[PG_STAT_STATEMENTS_INFO_COLS];
 
-	if (!pgss || !pgss_hash)
+	if (!pgss || !pgss_hash || !qtext_hash)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("pg_stat_statements must be loaded via shared_preload_libraries")));
@@ -2497,7 +2634,7 @@ gc_fail:
  * Release entries corresponding to parameters passed.
  */
 static void
-entry_reset(Oid userid, Oid dbid, uint64 queryid)
+entry_reset(Oid userid, Oid dbid, uint64 queryid, uint64 planid)
 {
 	HASH_SEQ_STATUS hash_seq;
 	pgssEntry  *entry;
@@ -2505,8 +2642,9 @@ entry_reset(Oid userid, Oid dbid, uint64 queryid)
 	long		num_entries;
 	long		num_remove = 0;
 	pgssHashKey key;
+	qtextHashKey qkey;
 
-	if (!pgss || !pgss_hash)
+	if (!pgss || !pgss_hash || !qtext_hash)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("pg_stat_statements must be loaded via shared_preload_libraries")));
@@ -2514,16 +2652,20 @@ entry_reset(Oid userid, Oid dbid, uint64 queryid)
 	LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
 	num_entries = hash_get_num_entries(pgss_hash);
 
-	if (userid != 0 && dbid != 0 && queryid != UINT64CONST(0))
+	if (userid != 0 && dbid != 0 && queryid != UINT64CONST(0) && planid != UINT64CONST(0))
 	{
 		/* If all the parameters are available, use the fast path. */
 		memset(&key, 0, sizeof(pgssHashKey));
+		memset(&qkey, 0, sizeof(qtextHashKey));
 		key.userid = userid;
 		key.dbid = dbid;
 		key.queryid = queryid;
+		key.planid = planid;
+		qkey.queryid = queryid;
 
 		/* Remove the key if it exists, starting with the top-level entry  */
 		key.toplevel = false;
+
 		entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_REMOVE, NULL);
 		if (entry)				/* found */
 			num_remove++;
@@ -2535,8 +2677,10 @@ entry_reset(Oid userid, Oid dbid, uint64 queryid)
 		entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_REMOVE, NULL);
 		if (entry)				/* found */
 			num_remove++;
+
+		hash_search(qtext_hash, &qkey, HASH_REMOVE, NULL);
 	}
-	else if (userid != 0 || dbid != 0 || queryid != UINT64CONST(0))
+	else if (userid != 0 || dbid != 0 || queryid != UINT64CONST(0) || planid != UINT64CONST(0))
 	{
 		/* Remove entries corresponding to valid parameters. */
 		hash_seq_init(&hash_seq, pgss_hash);
@@ -2544,10 +2688,16 @@ entry_reset(Oid userid, Oid dbid, uint64 queryid)
 		{
 			if ((!userid || entry->key.userid == userid) &&
 				(!dbid || entry->key.dbid == dbid) &&
-				(!queryid || entry->key.queryid == queryid))
+				(!queryid || entry->key.queryid == queryid) &&
+				(!planid || entry->key.planid == planid))
 			{
 				hash_search(pgss_hash, &entry->key, HASH_REMOVE, NULL);
 				num_remove++;
+				if (queryid)
+				{
+					qkey.queryid = queryid;
+					hash_search(qtext_hash, &qkey, HASH_REMOVE, NULL);
+				}
 			}
 		}
 	}
@@ -2560,6 +2710,12 @@ entry_reset(Oid userid, Oid dbid, uint64 queryid)
 			hash_search(pgss_hash, &entry->key, HASH_REMOVE, NULL);
 			num_remove++;
 		}
+
+		hash_seq_init(&hash_seq, qtext_hash);
+		while ((entry = hash_seq_search(&hash_seq)) != NULL)
+		{
+			hash_search(qtext_hash, &entry->key, HASH_REMOVE, NULL);
+		}
 	}
 
 	/* All entries are removed? */
diff --git a/contrib/pg_stat_statements/pg_stat_statements.control b/contrib/pg_stat_statements/pg_stat_statements.control
index 0747e48..8a76106 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.control
+++ b/contrib/pg_stat_statements/pg_stat_statements.control
@@ -1,5 +1,5 @@
 # pg_stat_statements extension
 comment = 'track planning and execution statistics of all SQL statements executed'
-default_version = '1.10'
+default_version = '1.11'
 module_pathname = '$libdir/pg_stat_statements'
 relocatable = true
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5b7ce65..b1fdeea 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8088,7 +8088,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </term>
       <listitem>
        <para>
-        Enables in-core computation of a query identifier.
+        Enables in-core computation of query and plan identifiers.
         Query identifiers can be displayed in the <link
         linkend="monitoring-pg-stat-activity-view"><structname>pg_stat_activity</structname></link>
         view, using <command>EXPLAIN</command>, or emitted in the log if
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4549c25..651dadc 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -947,6 +947,22 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
      </para></entry>
     </row>
 
+    <row>
+     <entry role="catalog_table_entry"><para role="column_definition">
+      <structfield>plan_id</structfield> <type>bigint</type>
+     </para>
+     <para>
+      Identifier of this backend's most recent query plan. If
+      <structfield>state</structfield> is <literal>active</literal>, this
+      field shows the identifier of the currently executing query plan. In
+      all other states, it shows the identifier of the last query plan that
+      was executed.  Plan identifiers are not computed by default, so this
+      field will be null unless the <xref linkend="guc-compute-query-id"/>
+      parameter is enabled or a third-party module that computes query
+      identifiers is configured.
+     </para></entry>
+    </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>query</structfield> <type>text</type>
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index ecf6cd6..0ca0e4b 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -602,6 +602,460 @@
   </para>
  </sect2>
 
+ <sect2>
+  <title>The <structname>pg_stat_statements_plan</structname> View</title>
+
+  <para>
+   This view is similar to <structname>pg_stat_statements</structname>,
+   with an additional plan identifier for each distinct entry.
+  </para>
+
+  <table id="pgstatstatementsplan-columns">
+   <title><structname>pg_stat_statements_plan</structname> Columns</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       Column Type
+      </para>
+      <para>
+       Description
+      </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>userid</structfield> <type>oid</type>
+       (references <link linkend="catalog-pg-authid"><structname>pg_authid</structname></link>.<structfield>oid</structfield>)
+      </para>
+      <para>
+       OID of user who executed the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>dbid</structfield> <type>oid</type>
+       (references <link linkend="catalog-pg-database"><structname>pg_database</structname></link>.<structfield>oid</structfield>)
+      </para>
+      <para>
+       OID of database in which the statement was executed
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>toplevel</structfield> <type>bool</type>
+      </para>
+      <para>
+       True if the query was executed as a top-level statement
+       (always true if <varname>pg_stat_statements.track</varname> is set to
+       <literal>top</literal>)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>queryid</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Hash code to identify identical normalized queries.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>planid</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Hash code to identify the query plan.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>query</structfield> <type>text</type>
+      </para>
+      <para>
+       Text of a representative statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>plans</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times the statement was planned
+       (if <varname>pg_stat_statements.track_planning</varname> is enabled,
+       otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_plan_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Total time spent planning the statement, in milliseconds
+       (if <varname>pg_stat_statements.track_planning</varname> is enabled,
+       otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>min_plan_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Minimum time spent planning the statement, in milliseconds
+       (if <varname>pg_stat_statements.track_planning</varname> is enabled,
+       otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>max_plan_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Maximum time spent planning the statement, in milliseconds
+       (if <varname>pg_stat_statements.track_planning</varname> is enabled,
+       otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>mean_plan_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Mean time spent planning the statement, in milliseconds
+       (if <varname>pg_stat_statements.track_planning</varname> is enabled,
+       otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stddev_plan_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Population standard deviation of time spent planning the statement,
+       in milliseconds
+       (if <varname>pg_stat_statements.track_planning</varname> is enabled,
+       otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>calls</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times the statement was executed
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_exec_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Total time spent executing the statement, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>min_exec_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Minimum time spent executing the statement, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>max_exec_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Maximum time spent executing the statement, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>mean_exec_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Mean time spent executing the statement, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stddev_exec_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Population standard deviation of time spent executing the statement, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>rows</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of rows retrieved or affected by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>shared_blks_hit</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of shared block cache hits by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>shared_blks_read</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of shared blocks read by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>shared_blks_dirtied</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of shared blocks dirtied by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>shared_blks_written</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of shared blocks written by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>local_blks_hit</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of local block cache hits by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>local_blks_read</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of local blocks read by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>local_blks_dirtied</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of local blocks dirtied by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>local_blks_written</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of local blocks written by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>temp_blks_read</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of temp blocks read by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>temp_blks_written</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of temp blocks written by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>blk_read_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Total time the statement spent reading data file blocks, in milliseconds
+       (if <xref linkend="guc-track-io-timing"/> is enabled, otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>blk_write_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Total time the statement spent writing data file blocks, in milliseconds
+       (if <xref linkend="guc-track-io-timing"/> is enabled, otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>temp_blk_read_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Total time the statement spent reading temporary file blocks, in
+       milliseconds (if <xref linkend="guc-track-io-timing"/> is enabled,
+       otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>temp_blk_write_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Total time the statement spent writing temporary file blocks, in
+       milliseconds (if <xref linkend="guc-track-io-timing"/> is enabled,
+       otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_records</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of WAL records generated by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_fpi</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of WAL full page images generated by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_bytes</structfield> <type>numeric</type>
+      </para>
+      <para>
+       Total amount of WAL generated by the statement in bytes
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>jit_functions</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of functions JIT-compiled by the statement
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>jit_generation_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Total time spent by the statement on generating JIT code, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>jit_inlining_count</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times functions have been inlined
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>jit_inlining_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Total time spent by the statement on inlining functions, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>jit_optimization_count</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times the statement has been optimized
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>jit_optimization_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Total time spent by the statement on optimizing, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>jit_emission_count</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times code has been emitted
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>jit_emission_time</structfield> <type>double precision</type>
+      </para>
+      <para>
+       Total time spent by the statement on emitting code, in milliseconds
+      </para></entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+  <para>
+   For security reasons, only superusers and roles with privileges of the
+   <literal>pg_read_all_stats</literal> role are allowed to see the SQL text and
+   <structfield>queryid</structfield> of queries executed by other users.
+   Other users can see the statistics, however, if the view has been installed
+   in their database.
+  </para>
+ </sect2>
+
  <sect2>
   <title>The <structname>pg_stat_statements_info</structname> View</title>
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fedaed5..86ff49e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -863,6 +863,7 @@ CREATE VIEW pg_stat_activity AS
             S.backend_xid,
             s.backend_xmin,
             S.query_id,
+            S.plan_id,
             S.query,
             S.backend_type
     FROM pg_stat_get_activity(NULL) AS S
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 5d1f708..6d575c6 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -618,6 +618,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 		 */
 		ExplainPropertyInteger("Query Identifier", NULL, (int64)
 							   plannedstmt->queryId, es);
+		ExplainPropertyInteger("Plan Identifier", NULL, (int64)
+							   plannedstmt->planId, es);
 	}
 
 	/* Show buffer usage in planning */
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ef2fd46..4d132dd 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -132,11 +132,13 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
 	/*
 	 * In some cases (e.g. an EXECUTE statement) a query execution will skip
-	 * parse analysis, which means that the query_id won't be reported.  Note
-	 * that it's harmless to report the query_id multiple time, as the call
-	 * will be ignored if the top level query_id has already been reported.
+	 * parse analysis and planning, which means that the query_id and plan_id
+	 * won't be reported.  Note that it's harmless to report the query_id and
+	 * plan_id multiple times, as the call will be ignored if the top level
+	 * query_id and plan_id have already been reported.
 	 */
 	pgstat_report_query_id(queryDesc->plannedstmt->queryId, false);
+	pgstat_report_plan_id(queryDesc->plannedstmt->planId, queryDesc->plannedstmt->queryId, false);
 
 	if (ExecutorStart_hook)
 		(*ExecutorStart_hook) (queryDesc, eflags);
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index f1fd7f7..e23d5c0 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -176,6 +176,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
 	pstmt = makeNode(PlannedStmt);
 	pstmt->commandType = CMD_SELECT;
 	pstmt->queryId = pgstat_get_my_query_id();
+	pstmt->planId = pgstat_get_my_plan_id();
 	pstmt->hasReturning = false;
 	pstmt->hasModifyingCTE = false;
 	pstmt->canSetTag = true;
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 51d630f..88a89a4 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -108,6 +108,7 @@ _copyPlannedStmt(const PlannedStmt *from)
 	COPY_NODE_FIELD(utilityStmt);
 	COPY_LOCATION_FIELD(stmt_location);
 	COPY_SCALAR_FIELD(stmt_len);
+	COPY_SCALAR_FIELD(planId);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index ce12915..6cdae95 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -313,6 +313,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
 
 	WRITE_ENUM_FIELD(commandType, CmdType);
 	WRITE_UINT64_FIELD(queryId);
+	WRITE_UINT64_FIELD(planId);
 	WRITE_BOOL_FIELD(hasReturning);
 	WRITE_BOOL_FIELD(hasModifyingCTE);
 	WRITE_BOOL_FIELD(canSetTag);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 6a05b69..0c24e6f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1809,6 +1809,7 @@ _readPlannedStmt(void)
 
 	READ_ENUM_FIELD(commandType, CmdType);
 	READ_UINT64_FIELD(queryId);
+	READ_UINT64_FIELD(planId);
 	READ_BOOL_FIELD(hasReturning);
 	READ_BOOL_FIELD(hasModifyingCTE);
 	READ_BOOL_FIELD(canSetTag);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index a0f2390..599fc4c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -60,6 +60,7 @@
 #include "partitioning/partdesc.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/dsm_impl.h"
+#include "utils/backend_status.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/selfuncs.h"
@@ -560,6 +561,12 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	if (glob->partition_directory != NULL)
 		DestroyPartitionDirectory(glob->partition_directory);
 
+	/* If enabled, compute the plan id */
+	if (IsQueryIdEnabled())
+		JumblePlan(result);
+
+	pgstat_report_plan_id(result->planId, result->queryId, false);
+
 	return result;
 }
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8b6b5bb..fcf2838 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1085,6 +1085,7 @@ exec_simple_query(const char *query_string)
 		DestReceiver *receiver;
 		int16		format;
 
+		pgstat_report_plan_id(0, 0, true);
 		pgstat_report_query_id(0, true);
 
 		/*
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index c7ed1e6..43d3bc1 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -399,6 +399,7 @@ pgstat_bestart(void)
 	lbeentry.st_progress_command = PROGRESS_COMMAND_INVALID;
 	lbeentry.st_progress_command_target = InvalidOid;
 	lbeentry.st_query_id = UINT64CONST(0);
+	lbeentry.st_plan_id = UINT64CONST(0);
 
 	/*
 	 * we don't zero st_progress_param here to save cycles; nobody should
@@ -549,6 +550,7 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 			/* st_xact_start_timestamp and wait_event_info are also disabled */
 			beentry->st_xact_start_timestamp = 0;
 			beentry->st_query_id = UINT64CONST(0);
+			beentry->st_plan_id = UINT64CONST(0);
 			proc->wait_event_info = 0;
 			PGSTAT_END_WRITE_ACTIVITY(beentry);
 		}
@@ -609,7 +611,10 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 	 * identifier.
 	 */
 	if (state == STATE_RUNNING)
+	{
+		beentry->st_plan_id = UINT64CONST(0);
 		beentry->st_query_id = UINT64CONST(0);
+	}
 
 	if (cmd_str != NULL)
 	{
@@ -625,6 +630,10 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
  * pgstat_report_query_id() -
  *
  * Called to update top-level query identifier.
+ *
+ * Whether the query_id is updated is also
+ * used to signal whether the plan_id should be
+ * reported in pgstat_report_plan_id().
  * --------
  */
 void
@@ -660,6 +669,48 @@ pgstat_report_query_id(uint64 query_id, bool force)
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
 }
 
+/* --------
+ * pgstat_report_plan_id() -
+ *
+ * Called to update the top-level plan identifier.
+ * --------
+ */
+void
+pgstat_report_plan_id(uint64 plan_id, uint64 query_id, bool force)
+{
+	volatile PgBackendStatus *beentry = MyBEEntry;
+
+	/*
+	 * if track_activities is disabled, st_query_id should already have been
+	 * if track_activities is disabled, st_plan_id should already have been
+	 */
+	if (!beentry || !pgstat_track_activities)
+		return;
+
+	/*
+	 * We only report the top-level plan identifier.  The stored plan_id is
+	 * reset when a backend calls pgstat_report_activity(STATE_RUNNING), or
+	 * by an explicit call to this function using the force flag.  If the
+	 * saved plan identifier is not zero, or the passed query identifier is
+	 * zero, this is not a top-level command, so ignore the value provided
+	 * unless it's an explicit call to reset the identifier.
+	 *
+	 * NOTE: a query_id of 0 by the time we set the plan_id indicates that
+	 * we have already set the top-level query_id in the backend.
+	 */
+	if ((beentry->st_plan_id != 0 || query_id == 0) && !force)
+		return;
+
+	/*
+	 * Update my status entry, following the protocol of bumping
+	 * st_changecount before and after.  We use a volatile pointer here to
+	 * ensure the compiler doesn't try to get cute.
+	 */
+	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
+	beentry->st_plan_id = plan_id;
+	PGSTAT_END_WRITE_ACTIVITY(beentry);
+}
+
 
 /* ----------
  * pgstat_report_appname() -
@@ -1045,6 +1096,20 @@ pgstat_get_my_query_id(void)
 	return MyBEEntry->st_query_id;
 }
 
+/* ----------
+ * pgstat_get_my_plan_id() -
+ *
+ * Return current backend's plan identifier.
+ */
+uint64
+pgstat_get_my_plan_id(void)
+{
+	if (!MyBEEntry)
+		return 0;
+
+	return MyBEEntry->st_plan_id;
+}
+
 
 /* ----------
  * pgstat_fetch_stat_beentry() -
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 893690d..6fa5974 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -539,7 +539,7 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 Datum
 pg_stat_get_activity(PG_FUNCTION_ARGS)
 {
-#define PG_STAT_GET_ACTIVITY_COLS	30
+#define PG_STAT_GET_ACTIVITY_COLS	31
 	int			num_backends = pgstat_fetch_stat_numbackends();
 	int			curr_backend;
 	int			pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
@@ -862,6 +862,11 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 				nulls[29] = true;
 			else
 				values[29] = UInt64GetDatum(beentry->st_query_id);
+
+			if (beentry->st_plan_id == 0)
+				nulls[30] = true;
+			else
+				values[30] = UInt64GetDatum(beentry->st_plan_id);
 		}
 		else
 		{
@@ -890,6 +895,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 			nulls[27] = true;
 			nulls[28] = true;
 			nulls[29] = true;
+			nulls[30] = true;
 		}
 
 		tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
diff --git a/src/backend/utils/error/csvlog.c b/src/backend/utils/error/csvlog.c
index 5c49bc4..f730bc5 100644
--- a/src/backend/utils/error/csvlog.c
+++ b/src/backend/utils/error/csvlog.c
@@ -252,6 +252,10 @@ write_csvlog(ErrorData *edata)
 	/* query id */
 	appendStringInfo(&buf, "%lld", (long long) pgstat_get_my_query_id());
+	appendStringInfoChar(&buf, ',');
 
+	/* plan id */
+	appendStringInfo(&buf, "%lld", (long long) pgstat_get_my_plan_id());
+
 	appendStringInfoChar(&buf, '\n');
 
 	/* If in the syslogger process, try to write messages direct to file */
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 55ee542..819affb 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -2777,11 +2777,22 @@ log_line_prefix(StringInfo buf, ErrorData *edata)
 				break;
 			case 'Q':
 				if (padding != 0)
+				{
 					appendStringInfo(buf, "%*lld", padding,
 									 (long long) pgstat_get_my_query_id());
+					appendStringInfoChar(buf, '/');
+
+					appendStringInfo(buf, "%*lld", padding,
+									 (long long) pgstat_get_my_plan_id());
+				}
 				else
+				{
 					appendStringInfo(buf, "%lld",
 									 (long long) pgstat_get_my_query_id());
+					appendStringInfoChar(buf, '/');
+					appendStringInfo(buf, "%lld",
+									 (long long) pgstat_get_my_plan_id());
+				}
 				break;
 			default:
 				/* format error - ignore it */
diff --git a/src/backend/utils/error/jsonlog.c b/src/backend/utils/error/jsonlog.c
index 27ad768..8514393 100644
--- a/src/backend/utils/error/jsonlog.c
+++ b/src/backend/utils/error/jsonlog.c
@@ -288,6 +288,9 @@ write_jsonlog(ErrorData *edata)
 	/* query id */
 	appendJSONKeyValueFmt(&buf, "query_id", false, "%lld",
 						  (long long) pgstat_get_my_query_id());
+	/* plan id */
+	appendJSONKeyValueFmt(&buf, "plan_id", false, "%lld",
+						 (long long) pgstat_get_my_plan_id());
 
 	/* Finish string */
 	appendStringInfoChar(&buf, '}');
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 48ad80c..88a6916 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -569,7 +569,7 @@
 					#   %t = timestamp without milliseconds
 					#   %m = timestamp with milliseconds
 					#   %n = timestamp with milliseconds (as a Unix epoch)
-					#   %Q = query ID (0 if none or not computed)
+					#   %Q = query ID and plan ID separated by "/" (0 if none or not computed)
 					#   %i = command tag
 					#   %e = SQL state
 					#   %c = session ID
diff --git a/src/backend/utils/misc/queryjumble.c b/src/backend/utils/misc/queryjumble.c
index eeaa0b3..b0d0fe9 100644
--- a/src/backend/utils/misc/queryjumble.c
+++ b/src/backend/utils/misc/queryjumble.c
@@ -15,11 +15,20 @@
  * information is ignored as irrelevant or not essential to the query, such
  * as the collations of Vars and, most notably, the values of constants.
  *
- * This jumble is acquired at the end of parse analysis of each query, and
- * a 64-bit hash of it is stored into the query's Query.queryId field.
- * The server then copies this value around, making it available in plan
- * tree(s) generated from the query.  The executor can then use this value
- * to blame query costs on the proper queryId.
+ * Additionally, the plan tree of a planned statement is fingerprinted
+ * to differentiate plans for the same query.
+ *
+ * For utility statements, a jumble of the query is acquired at the end
+ * of parse analysis.  For planned statements, a jumble of the query is
+ * acquired before planning and a jumble of the plan is acquired after
+ * planning.
+ *
+ * The query jumble is stored as a 64-bit hash in the query's Query.queryId
+ * field, and the plan jumble is similarly hashed and stored in the
+ * PlannedStmt.planId field.
+ *
+ * The executor can then use the queryId and planId values to blame query
+ * costs on the proper query and plan.
  *
  * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -34,6 +43,7 @@
 
 #include "common/hashfn.h"
 #include "miscadmin.h"
+#include "nodes/plannodes.h"
 #include "parser/scansup.h"
 #include "utils/queryjumble.h"
 
@@ -54,6 +64,10 @@ static void JumbleRowMarks(JumbleState *jstate, List *rowMarks);
 static void JumbleExpr(JumbleState *jstate, Node *node);
 static void RecordConstLocation(JumbleState *jstate, int location);
 
+static void JumblePlanInternal(JumbleState *jstate, PlannedStmt *plan);
+static void JumblePlanHeader(JumbleState *jstate, Plan *plan);
+static void JumbleScanHeader(JumbleState *jstate, Scan *scan);
+
 /*
  * Given a possibly multi-statement source string, confine our attention to the
  * relevant part of the string.
@@ -285,7 +299,8 @@ JumbleRangeTable(JumbleState *jstate, List *rtable)
 				APP_JUMB(rte->inh);
 				break;
 			case RTE_SUBQUERY:
-				JumbleQueryInternal(jstate, rte->subquery);
+				if (rte->subquery)
+					JumbleQueryInternal(jstate, rte->subquery);
 				break;
 			case RTE_JOIN:
 				APP_JUMB(rte->jointype);
@@ -342,8 +357,11 @@ JumbleRowMarks(JumbleState *jstate, List *rowMarks)
 }
 
 /*
- * Jumble an expression tree
+ * Jumble an expression or plan tree
  *
+ * This function handles both query and plan nodes.
+ *
+ * Notes for query nodes:
  * In general this function should handle all the same node types that
  * expression_tree_walker() does, and therefore it's coded to be as parallel
  * to that function as possible.  However, since we are only invoked on
@@ -354,6 +372,12 @@ JumbleRowMarks(JumbleState *jstate, List *rowMarks)
  * point of that function is to support tree walkers that don't care about
  * most tree node types, but here we care about all types.  We should complain
  * about any unrecognized node type.
+ *
+ * Notes for plan nodes:
+ * The plan nodes defined in plannodes.h are handled here.  We also
+ * include some nodes that are not strictly required for jumbling,
+ * to avoid falling through to the "unrecognized node" warning
+ * below.
  */
 static void
 JumbleExpr(JumbleState *jstate, Node *node)
@@ -374,6 +398,7 @@ JumbleExpr(JumbleState *jstate, Node *node)
 
 	switch (nodeTag(node))
 	{
+		/* Query nodes */
 		case T_Var:
 			{
 				Var		   *var = (Var *) node;
@@ -387,6 +412,9 @@ JumbleExpr(JumbleState *jstate, Node *node)
 			{
 				Const	   *c = (Const *) node;
 
+				if (jstate->clocations == NULL)
+					break;
+
 				/* We jumble only the constant's type, not its value */
 				APP_JUMB(c->consttype);
 				/* Also, record its parse location for query normalization */
@@ -895,6 +923,421 @@ JumbleExpr(JumbleState *jstate, Node *node)
 				JumbleExpr(jstate, (Node *) tsc->repeatable);
 			}
 			break;
+		/* Plan nodes */
+		case T_SubPlan:
+			{
+				SubPlan *sp = (SubPlan *) node;
+				JumbleExpr(jstate, sp->testexpr);
+				foreach(temp, sp->setParam)
+				{
+					Node *param = (Node *) lfirst(temp);
+					JumbleExpr(jstate, param);
+				}
+				foreach(temp, sp->args)
+				{
+					Node *arg = (Node *) lfirst(temp);
+					JumbleExpr(jstate, arg);
+				}
+			}
+			break;
+		case T_AlternativeSubPlan:
+			{
+				AlternativeSubPlan *alt = (AlternativeSubPlan *) node;
+				foreach(temp, alt->subplans)
+				{
+					Node *subplan = (Node*) lfirst(temp);
+					/* Guard against infinite recursion */
+					if (subplan != (Node*) alt)
+						JumbleExpr(jstate, subplan);
+				}
+			}
+			break;
+		case T_Result:
+			{
+				Result *res = (Result*) node;
+
+				JumblePlanHeader(jstate, &res->plan);
+				JumbleExpr(jstate, res->resconstantqual);
+			}
+			break;
+		case T_ProjectSet:
+			{
+				ProjectSet *ps = (ProjectSet *) node;
+
+				JumblePlanHeader(jstate, &ps->plan);
+			}
+			break;
+		case T_ModifyTable:
+			{
+				ModifyTable *mt = (ModifyTable *) node;
+
+				JumblePlanHeader(jstate, &mt->plan);
+				JumbleExpr(jstate, (Node *) mt->resultRelations);
+				JumbleExpr(jstate, (Node *) mt->returningLists);
+				JumbleExpr(jstate, (Node *) mt->rowMarks);
+			}
+			break;
+		case T_Append:
+			{
+				Append *a = (Append *) node;
+
+				JumblePlanHeader(jstate, &a->plan);
+				JumbleExpr(jstate, (Node *) a->appendplans);
+			}
+			break;
+		case T_MergeAppend:
+			{
+				MergeAppend *ma = (MergeAppend *) node;
+
+				JumblePlanHeader(jstate, &ma->plan);
+				JumbleExpr(jstate, (Node *) ma->mergeplans);
+			}
+			break;
+		case T_RecursiveUnion:
+			{
+				RecursiveUnion *ru = (RecursiveUnion *) node;
+
+				JumblePlanHeader(jstate, &ru->plan);
+			}
+			break;
+		case T_BitmapAnd:
+			{
+				BitmapAnd *ba = (BitmapAnd *) node;
+
+				JumblePlanHeader(jstate, &ba->plan);
+			}
+			break;
+		case T_BitmapOr:
+			{
+				BitmapOr *bo = (BitmapOr *) node;
+
+				JumblePlanHeader(jstate, &bo->plan);
+			}
+			break;
+		case T_NestLoop:
+			{
+				NestLoop *nl = (NestLoop *) node;
+
+				APP_JUMB(nl->join.jointype);
+				JumblePlanHeader(jstate, &nl->join.plan);
+				JumbleExpr(jstate, (Node *) nl->nestParams);
+			}
+			break;
+		case T_MergeJoin:
+			{
+				MergeJoin *mj = (MergeJoin *) node;
+
+				APP_JUMB(mj->join.jointype);
+				JumblePlanHeader(jstate, &mj->join.plan);
+				JumbleExpr(jstate, (Node *) mj->mergeclauses);
+			}
+			break;
+		case T_HashJoin:
+			{
+				HashJoin *hj = (HashJoin *) node;
+
+				APP_JUMB(hj->join.jointype);
+				JumblePlanHeader(jstate, &hj->join.plan);
+				JumbleExpr(jstate, (Node *) hj->hashclauses);
+			}
+			break;
+		case T_SeqScan:
+			{
+				SeqScan *sqs = (SeqScan *) node;
+
+				JumbleScanHeader(jstate, &sqs->scan);
+				JumblePlanHeader(jstate, &sqs->scan.plan);
+				APP_JUMB(sqs->scan.scanrelid);
+			}
+			break;
+		case T_SampleScan:
+			{
+				SampleScan *ss = (SampleScan *) node;
+
+				JumbleScanHeader(jstate, &ss->scan);
+			}
+			break;
+		case T_Gather:
+			{
+				Gather *g = (Gather *) node;
+
+				JumblePlanHeader(jstate, &g->plan);
+			}
+			break;
+		case T_GatherMerge:
+			{
+				GatherMerge *gm = (GatherMerge *) node;
+
+				JumblePlanHeader(jstate, &gm->plan);
+			}
+			break;
+		case T_IndexScan:
+			{
+				IndexScan *is = (IndexScan *) node;
+
+				JumbleScanHeader(jstate, &is->scan);
+				JumbleExpr(jstate, (Node *) is->indexqualorig);
+
+				JumbleExpr(jstate, (Node *) is->indexorderby);
+				JumbleExpr(jstate, (Node *) is->indexorderbyorig);
+
+				JumbleExpr(jstate, (Node *) is->indexqual);
+				APP_JUMB(is->indexid);
+				APP_JUMB(is->indexorderdir);
+			}
+			break;
+		case T_IndexOnlyScan:
+			{
+				IndexOnlyScan *ios = (IndexOnlyScan*) node;
+
+				JumbleExpr(jstate, (Node *) ios->indexorderby);
+				JumbleExpr(jstate, (Node *) ios->indexqual);
+				JumbleExpr(jstate, (Node *) ios->indextlist);
+
+				APP_JUMB(ios->indexid);
+				APP_JUMB(ios->indexorderdir);
+
+				JumbleScanHeader(jstate, &ios->scan);
+			}
+			break;
+		case T_BitmapIndexScan:
+			{
+				BitmapIndexScan *bis = (BitmapIndexScan *) node;
+
+				JumbleScanHeader(jstate, &bis->scan);
+				APP_JUMB(bis->indexid);
+				JumbleExpr(jstate, (Node *) bis->indexqual);
+				JumbleExpr(jstate, (Node *) bis->indexqualorig);
+			}
+			break;
+		case T_BitmapHeapScan:
+			{
+				BitmapHeapScan *bhs = (BitmapHeapScan *) node;
+
+				JumbleScanHeader(jstate, &bhs->scan);
+			}
+			break;
+		case T_TidScan:
+			{
+				TidScan *tsc = (TidScan *) node;
+
+				JumbleScanHeader(jstate, &tsc->scan);
+				JumbleExpr(jstate, (Node *) tsc->tidquals);
+			}
+			break;
+		case T_TidRangeScan:
+			{
+				TidRangeScan *trs = (TidRangeScan *) node;
+
+				JumbleScanHeader(jstate, &trs->scan);
+			}
+			break;
+		case T_SubqueryScan:
+			{
+				SubqueryScan *sqs = (SubqueryScan *) node;
+
+				JumbleScanHeader(jstate, &sqs->scan);
+				JumbleExpr(jstate, (Node *) sqs->subplan);
+			}
+			break;
+		case T_FunctionScan:
+			{
+				FunctionScan *fs = (FunctionScan *) node;
+
+				JumbleScanHeader(jstate, &fs->scan);
+			}
+			break;
+		case T_TableFuncScan:
+			{
+				TableFuncScan *tfs = (TableFuncScan *) node;
+
+				JumbleScanHeader(jstate, &tfs->scan);
+			}
+			break;
+		case T_ValuesScan:
+			{
+				ValuesScan *vs = (ValuesScan *) node;
+
+				JumbleScanHeader(jstate, &vs->scan);
+				JumbleExpr(jstate, (Node *) vs->values_lists);
+			}
+			break;
+		case T_CteScan:
+			{
+				CteScan *cs = (CteScan *) node;
+
+				JumbleScanHeader(jstate, &cs->scan);
+			}
+			break;
+		case T_NamedTuplestoreScan:
+			{
+				NamedTuplestoreScan *nts = (NamedTuplestoreScan *) node;
+
+				JumbleScanHeader(jstate, &nts->scan);
+			}
+			break;
+		case T_WorkTableScan:
+			{
+				WorkTableScan *ws = (WorkTableScan *) node;
+
+				JumbleScanHeader(jstate, &ws->scan);
+			}
+			break;
+		case T_ForeignScan:
+			{
+				ForeignScan *fs = (ForeignScan *) node;
+
+				JumbleScanHeader(jstate, &fs->scan);
+			}
+			break;
+		case T_CustomScan:
+			{
+				CustomScan *cs = (CustomScan *) node;
+
+				JumbleScanHeader(jstate, &cs->scan);
+			}
+			break;
+		case T_Material:
+			{
+				Material *m = (Material *) node;
+
+				JumblePlanHeader(jstate, &m->plan);
+			}
+			break;
+		case T_Memoize:
+			{
+				Memoize *m = (Memoize *) node;
+
+				JumblePlanHeader(jstate, &m->plan);
+			}
+			break;
+		case T_Sort:
+			{
+				Sort *so = (Sort *) node;
+				int i;
+
+				for (i = 0; i < so->numCols; i++)
+				{
+					AttrNumber	at = so->sortColIdx[i];
+					Oid			op = so->sortOperators[i];
+					bool		nf = so->nullsFirst[i];
+
+					APP_JUMB(at);
+					APP_JUMB(op);
+					APP_JUMB(nf);
+				}
+				JumblePlanHeader(jstate, &so->plan);
+			}
+			break;
+		case T_IncrementalSort:
+			{
+				IncrementalSort *is = (IncrementalSort *) node;
+				int i;
+
+				for (i = 0; i < is->sort.numCols; i++)
+				{
+					AttrNumber	at = is->sort.sortColIdx[i];
+					Oid			op = is->sort.sortOperators[i];
+					bool		nf = is->sort.nullsFirst[i];
+
+					APP_JUMB(at);
+					APP_JUMB(op);
+					APP_JUMB(nf);
+				}
+				JumblePlanHeader(jstate, &is->sort.plan);
+			}
+			break;
+		case T_Group:
+			{
+				Group *gr = (Group *) node;
+
+				JumblePlanHeader(jstate, &gr->plan);
+			}
+			break;
+		case T_Agg:
+			{
+				Agg *ag = (Agg *) node;
+
+				JumblePlanHeader(jstate, &ag->plan);
+			}
+			break;
+		case T_WindowAgg:
+			{
+				WindowAgg *wa = (WindowAgg *) node;
+
+				JumblePlanHeader(jstate, &wa->plan);
+			}
+			break;
+		case T_Unique:
+			{
+				Unique *un = (Unique *) node;
+
+				JumblePlanHeader(jstate, &un->plan);
+			}
+			break;
+		case T_SetOp:
+			{
+				SetOp *so = (SetOp *) node;
+
+				JumblePlanHeader(jstate, &so->plan);
+			}
+			break;
+		case T_LockRows:
+			{
+				LockRows *lr = (LockRows *) node;
+
+				JumblePlanHeader(jstate, &lr->plan);
+			}
+			break;
+		case T_Limit:
+			{
+				Limit *lim = (Limit *) node;
+
+				JumblePlanHeader(jstate, &lim->plan);
+			}
+			break;
+		case T_Hash:
+			{
+				Hash *hash = (Hash *) node;
+
+				JumblePlanHeader(jstate, &hash->plan);
+			}
+			break;
+		case T_NestLoopParam:
+			{
+				NestLoopParam *nlp = (NestLoopParam *) node;
+
+				JumbleExpr(jstate, (Node *) nlp->paramval);
+			}
+			break;
+		case T_PlanRowMark:
+			{
+				PlanRowMark *prm = (PlanRowMark*) node;
+				APP_JUMB(prm->markType);
+			}
+			break;
+		case T_PlanInvalItem:
+			{
+				PlanInvalItem *pii = (PlanInvalItem*) node;
+
+				APP_JUMB(pii->cacheId);
+				APP_JUMB(pii->hashValue);
+			}
+			break;
+		case T_GroupingSetData:
+			break;
+		case T_PlaceHolderVar:
+			break;
+		case T_RollupData:
+			break;
+		case T_SpecialJoinInfo:
+			break;
+		case T_RestrictInfo:
+			break;
+		case T_CustomPath:
+			break;
+		case T_IndexPath:
+			break;
 		default:
 			/* Only a warning, since we can stumble along anyway */
 			elog(WARNING, "unrecognized node type: %d",
@@ -928,3 +1371,79 @@ RecordConstLocation(JumbleState *jstate, int location)
 		jstate->clocations_count++;
 	}
 }
+
+JumbleState *
+JumblePlan(PlannedStmt *plan)
+{
+	JumbleState *jstate = NULL;
+	uint64 planid;
+
+	Assert(IsPlanIdEnabled());
+
+	jstate = (JumbleState *) palloc(sizeof(JumbleState));
+
+	/*
+	 * Set up workspace for plan jumbling.
+	 *
+	 * We don't need to set up constant
+	 * locations for plans.
+	 */
+	jstate->jumble = (unsigned char *) palloc(JUMBLE_SIZE);
+	jstate->jumble_len = 0;
+	jstate->clocations_buf_size = 0;
+	jstate->clocations = NULL;
+	jstate->clocations_count = 0;
+	jstate->highest_extern_param_id = 0;
+
+	/* Compute the plan ID and mark the PlannedStmt node with it */
+	JumblePlanInternal(jstate, plan);
+	planid = DatumGetUInt64(hash_any_extended(jstate->jumble,
+											  jstate->jumble_len,
+											  0));
+
+	plan->planId = planid;
+
+	return jstate;
+}
+
+static void
+JumblePlanInternal(JumbleState *jstate, PlannedStmt *plan)
+{
+	Assert(IsA(plan, PlannedStmt));
+	Assert(plan->utilityStmt == NULL);
+
+	APP_JUMB(plan->commandType);
+	/* resultRelation is usually predictable from commandType */
+	JumbleExpr(jstate, (Node *) plan->planTree);
+	JumbleRangeTable(jstate, plan->rtable);
+	JumbleExpr(jstate, (Node *) plan->resultRelations);
+	JumbleExpr(jstate, (Node *) plan->utilityStmt);
+	JumbleExpr(jstate, (Node *) plan->subplans);
+	JumbleExpr(jstate, (Node *) plan->rewindPlanIDs);
+	JumbleExpr(jstate, (Node *) plan->rowMarks);
+}
+
+/*
+ * JumblePlanHeader: Jumble a Plan header.
+ *
+ * Ignore estimated execution costs, etc. We are only interested in fields that
+ * are essential to the plan.
+ */
+static void
+JumblePlanHeader(JumbleState *jstate, Plan *plan)
+{
+	JumbleExpr(jstate, (Node *) plan->qual);
+	JumbleExpr(jstate, (Node *) plan->targetlist);
+	JumbleExpr(jstate, (Node *) plan->lefttree);
+	JumbleExpr(jstate, (Node *) plan->righttree);
+}
+
+/*
+ * JumbleScanHeader: Jumble the contents of a Scan header.
+ */
+static void
+JumbleScanHeader(JumbleState *jstate, Scan *scan)
+{
+	JumblePlanHeader(jstate, &scan->plan);
+	APP_JUMB(scan->scanrelid);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87aa571..6a4913f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5340,9 +5340,9 @@
   proname => 'pg_stat_get_activity', prorows => '100', proisstrict => 'f',
   proretset => 't', provolatile => 's', proparallel => 'r',
   prorettype => 'record', proargtypes => 'int4',
-  proallargtypes => '{int4,oid,int4,oid,text,text,text,text,text,timestamptz,timestamptz,timestamptz,timestamptz,inet,text,int4,xid,xid,text,bool,text,text,int4,text,numeric,text,bool,text,bool,int4,int8}',
-  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{pid,datid,pid,usesysid,application_name,state,query,wait_event_type,wait_event,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port,backend_xid,backend_xmin,backend_type,ssl,sslversion,sslcipher,sslbits,ssl_client_dn,ssl_client_serial,ssl_issuer_dn,gss_auth,gss_princ,gss_enc,leader_pid,query_id}',
+  proallargtypes => '{int4,oid,int4,oid,text,text,text,text,text,timestamptz,timestamptz,timestamptz,timestamptz,inet,text,int4,xid,xid,text,bool,text,text,int4,text,numeric,text,bool,text,bool,int4,int8,int8}',
+  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+  proargnames => '{pid,datid,pid,usesysid,application_name,state,query,wait_event_type,wait_event,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port,backend_xid,backend_xmin,backend_type,ssl,sslversion,sslcipher,sslbits,ssl_client_dn,ssl_client_serial,ssl_issuer_dn,gss_auth,gss_princ,gss_enc,leader_pid,query_id,plan_id}',
   prosrc => 'pg_stat_get_activity' },
 { oid => '3318',
   descr => 'statistics: information about progress of backends running maintenance command',
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 0ea9a22..6dd94ed 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -89,6 +89,7 @@ typedef struct PlannedStmt
 	/* statement location in source string (copied from Query) */
 	int			stmt_location;	/* start location, or -1 if unknown */
 	int			stmt_len;		/* length in bytes; 0 means "rest of string" */
+	uint64		planId;			/* plan identifier (can be set by plugins) */
 } PlannedStmt;
 
 /* macro for fetching the Plan associated with a SubPlan node */
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index 7403bca..2a178a6 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -166,8 +166,11 @@ typedef struct PgBackendStatus
 	Oid			st_progress_command_target;
 	int64		st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];
 
-	/* query identifier, optionally computed using post_parse_analyze_hook */
+	/* query identifier (optionally computed by post_parse_analyze_hook)
+	 * and plan identifier (optionally computed at plan time)
+	 */
 	uint64		st_query_id;
+	uint64		st_plan_id;
 } PgBackendStatus;
 
 
@@ -298,6 +301,7 @@ extern void pgstat_clear_backend_activity_snapshot(void);
 /* Activity reporting functions */
 extern void pgstat_report_activity(BackendState state, const char *cmd_str);
 extern void pgstat_report_query_id(uint64 query_id, bool force);
+extern void pgstat_report_plan_id(uint64 plan_id, uint64 query_id, bool force);
 extern void pgstat_report_tempfile(size_t filesize);
 extern void pgstat_report_appname(const char *appname);
 extern void pgstat_report_xact_timestamp(TimestampTz tstamp);
@@ -305,6 +309,7 @@ extern const char *pgstat_get_backend_current_activity(int pid, bool checkUser);
 extern const char *pgstat_get_crashed_backend_activity(int pid, char *buffer,
 													   int buflen);
 extern uint64 pgstat_get_my_query_id(void);
+extern uint64 pgstat_get_my_plan_id(void);
 
 
 /* ----------
diff --git a/src/include/utils/queryjumble.h b/src/include/utils/queryjumble.h
index 3c2d9be..dffd794 100644
--- a/src/include/utils/queryjumble.h
+++ b/src/include/utils/queryjumble.h
@@ -15,6 +15,7 @@
 #define QUERYJUBLE_H
 
 #include "nodes/parsenodes.h"
+#include "nodes/plannodes.h"
 
 #define JUMBLE_SIZE				1024	/* query serialization buffer size */
 
@@ -67,6 +68,7 @@ extern PGDLLIMPORT int compute_query_id;
 
 extern const char *CleanQuerytext(const char *query, int *location, int *len);
 extern JumbleState *JumbleQuery(Query *query, const char *querytext);
+extern JumbleState *JumblePlan(PlannedStmt *plan);
 extern void EnableQueryId(void);
 
 extern PGDLLIMPORT bool query_id_enabled;
diff --git a/src/test/regress/expected/explain.out b/src/test/regress/expected/explain.out
index 48620ed..df7c43b 100644
--- a/src/test/regress/expected/explain.out
+++ b/src/test/regress/expected/explain.out
@@ -515,5 +515,6 @@ select explain_filter('explain (verbose) select * from int8_tbl i8');
  Seq Scan on public.int8_tbl i8  (cost=N.N..N.N rows=N width=N)
    Output: q1, q2
  Query Identifier: N
-(3 rows)
+ Plan Identifier: N
+(4 rows)
 
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index fc3cde3..85025ed 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1756,9 +1756,10 @@ pg_stat_activity| SELECT s.datid,
     s.backend_xid,
     s.backend_xmin,
     s.query_id,
+    s.plan_id,
     s.query,
     s.backend_type
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, plan_id)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)))
      LEFT JOIN pg_authid u ON ((s.usesysid = u.oid)));
 pg_stat_all_indexes| SELECT c.oid AS relid,
@@ -1870,7 +1871,7 @@ pg_stat_gssapi| SELECT s.pid,
     s.gss_auth AS gss_authenticated,
     s.gss_princ AS principal,
     s.gss_enc AS encrypted
-   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, plan_id)
   WHERE (s.client_port IS NOT NULL);
 pg_stat_progress_analyze| SELECT s.pid,
     s.datid,
@@ -2051,7 +2052,7 @@ pg_stat_replication| SELECT s.pid,
     w.sync_priority,
     w.sync_state,
     w.reply_time
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, plan_id)
      JOIN pg_stat_get_wal_senders() w(pid, state, sent_lsn, write_lsn, flush_lsn, replay_lsn, write_lag, flush_lag, replay_lag, sync_priority, sync_state, reply_time) ON ((s.pid = w.pid)))
      LEFT JOIN pg_authid u ON ((s.usesysid = u.oid)));
 pg_stat_replication_slots| SELECT s.slot_name,
@@ -2085,7 +2086,7 @@ pg_stat_ssl| SELECT s.pid,
     s.ssl_client_dn AS client_dn,
     s.ssl_client_serial AS client_serial,
     s.ssl_issuer_dn AS issuer_dn
-   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, plan_id)
   WHERE (s.client_port IS NOT NULL);
 pg_stat_subscription| SELECT su.oid AS subid,
     su.subname,
-- 
2.32.1 (Apple Git-133)

#2Julien Rouhaud
rjuju123@gmail.com
In reply to: Imseih (AWS), Sami (#1)
Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

Hi,

On Wed, Jun 15, 2022 at 06:45:38PM +0000, Imseih (AWS), Sami wrote:

Adding a plan_id to pg_stat_activity allows users
to determine if a plan for a particular statement
has changed and if the new plan is performing better
or worse for a particular statement.
[...]
Attached is a POC patch that computes the plan_id
and presents the top-level plan_id in pg_stat_activity.

AFAICS you're proposing to add an identifier for a specific plan, but no way to
know what that plan was? How are users supposed to use the information if they
know something changed but don't know what changed exactly?

- In the POC, the compute_query_id GUC determines if a
plan_id is to be computed. Should this be a separate GUC?

Probably, as computing it will likely be quite expensive. Some benchmark on
various workloads would be needed here.

I only had a quick look at the patch, but I see that you have some code to
avoid storing the query text multiple times with different planid. How does it
work exactly, and does it ensure that the query text is only removed once the
last entry that uses it is removed? It seems that you identify a specific
query text by queryid, but that seems wrong as collision can (easily?) happen
in different databases. The real identifier of a query text should be (dbid,
queryid).
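
For the sake of illustration, the key of such a mapping would need to look
something like this (a sketch only; the POC currently keys on queryid alone):

typedef struct qtextHashKey
{
	Oid			dbid;		/* database OID, avoids cross-database
							 * queryid collisions */
	uint64		queryid;	/* query identifier */
} qtextHashKey;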

Note that this problem already exists, as the query texts are now stored per
(userid, dbid, queryid, istoplevel). Maybe this part could be split in a
different commit as it could already be useful without a planid.

#3Andrey Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Imseih (AWS), Sami (#1)
Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

On 15/6/2022 21:45, Imseih (AWS), Sami wrote:

Adding a plan_id to pg_stat_activity allows users
to determine if a plan for a particular statement
has changed and if the new plan is performing better
or worse for a particular statement.
There are several ways the plan_id in pg_stat_activity

In general, your idea is quite useful.
But, as discussed earlier [1], extensions could implement many useful
things if they could add some custom data into a plan.
Maybe implement your feature with some private list of nodes in the plan
structure instead of a single-purpose plan_id field?
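
Something like this, perhaps (a rough sketch, not a concrete proposal):

typedef struct PlannedStmt
{
	/* ... existing fields ... */

	/* hypothetical: private nodes attached by extensions */
	List	   *ext_nodes;
} PlannedStmt;

An extension could then attach a node carrying its plan identifier (or any
other custom data) without a dedicated core field for each use case.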

[1]: /messages/by-id/e0de3423-4bba-1e69-c55a-f76bf18dbd74@postgrespro.ru

--
regards,
Andrey Lepikhov
Postgres Professional

#4Imseih (AWS), Sami
simseih@amazon.com
In reply to: Julien Rouhaud (#2)
Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

AFAICS you're proposing to add an identifier for a specific plan, but no way to
know what that plan was? How are users supposed to use the information if they
know something changed but don't know what changed exactly?

I see this as a start to do more useful things with plans.

The patch right out of the gate exposes the plan_id in EXPLAIN output
and auto_explain.
This will also be useful for extensions that will provide the plan text.
It is also conceivable that pg_stat_statements could provide an option
to store the plan text?
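
For example, an extension could capture the plan text from a planner hook,
keyed by the computed planId. A rough sketch, where store_plan_text() is a
hypothetical helper owned by the extension:

static PlannedStmt *
myext_planner(Query *parse, const char *query_string,
			  int cursorOptions, ParamListInfo boundParams)
{
	PlannedStmt *result = standard_planner(parse, query_string,
										   cursorOptions, boundParams);

	/* persist the plan text under its planId for later lookup */
	store_plan_text(result->planId, nodeToString(result->planTree));

	return result;
}

with myext_planner installed as planner_hook in _PG_init().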

- In the POC, the compute_query_id GUC determines if a
plan_id is to be computed. Should this be a separate GUC?

Probably, as computing it will likely be quite expensive. Some benchmark on
various workloads would be needed here.

Yes, more benchmarks will be needed here with more complex plans. I have
only benchmarked with pgbench at this point.
However, separating this into its own GUC is what I am leaning towards as well,
and I will update the patch.

I only had a quick look at the patch, but I see that you have some code to
avoid storing the query text multiple times with different planid. How does it
work exactly, and does it ensure that the query text is only removed once the
last entry that uses it is removed? It seems that you identify a specific
query text by queryid, but that seems wrong as collision can (easily?) happen
in different databases. The real identifier of a query text should be (dbid,
queryid).

The idea is to look up the offsets and length of the text in the external
file by queryid only. Therefore we can store the same query text only once
for multiple pgss_hash entries.

When a pgss_hash entry is not found, the new qtext_hash is consulted to
see if it has information about the offsets/length for the queryid. If found in
qtext_hash, the new pgss_hash entry is created with the offsets found.

If not found in qtext_hash, the query text is normalized (if the caller asked)
and stored in the external file. Then, a new entry is created in qtext_hash
and an entry in pgss_hash.
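
In rough C-like pseudocode, the flow in pgss_store() looks like this
(simplified; locking and gc handling are omitted):

	entry = hash_search(pgss_hash, &key, HASH_FIND, NULL);
	if (!entry)
	{
		qentry = hash_search(qtext_hash, &key.queryid, HASH_FIND, NULL);
		if (!qentry)
		{
			/* first sighting of this queryid: write the text ... */
			qtext_store(norm_query ? norm_query : query, query_len,
						&query_offset, &gc_count);
			/* ... and remember where it went */
			qentry = hash_search(qtext_hash, &key.queryid, HASH_ENTER, NULL);
			qentry->query_offset = query_offset;
			qentry->query_len = query_len;
		}
		else
		{
			/* reuse the offsets already stored for this queryid */
			query_offset = qentry->query_offset;
			query_len = qentry->query_len;
		}
		entry = entry_alloc(&key, query_offset, query_len, encoding, sticky);
	}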

Of course we also need to handle the gc_qtexts() cleanups, entry_reset(),
startup and shutdown scenarios. The patch does this, but I will go back and do more
testing.

Note that this problem already exists, as the query texts are now stored per
(userid, dbid, queryid, istoplevel). Maybe this part could be split in a
different commit as it could already be useful without a planid.

Good point. I will separate this patch.

Regards,

Sami Imseih
Amazon Web Services

#5Imseih (AWS), Sami
simseih@amazon.com
In reply to: Imseih (AWS), Sami (#4)
1 attachment(s)
Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

Good point. I will separate this patch.

I separated the pg_stat_statements patch. The patch
introduces a secondary hash that tracks locations of
a query (by queryid) in the external file. The hash
remains in lockstep with the pgss_hash using a
synchronization routine. For the default
pg_stat_statements.max = 5000, this hash requires 2MB
of additional shared memory.

My testing does not show any regression for workloads
in which statements are not issued by multiple users/databases.

However, it shows good improvement, 10-15%, when there
are similar statements that are issued by multiple
users/databases/tracking levels.

Besides this improvement, this will open up the opportunity
to also track plan_id's as discussed earlier in the thread.

Thanks for the feedback.

Regards,

Sami Imseih
Amazon Web Services

Attachments:

0001-Improve-pg_stat_statements-performance-for-similar-q.patch (application/octet-stream)
From bf15959daa2b3aade923d7bdbea4c1fca8f94bc1 Mon Sep 17 00:00:00 2001
From: "Imseih (AWS)" <simseih@88665a22795f.ant.amazon.com>
Date: Tue, 21 Jun 2022 14:43:23 -0500
Subject: [PATCH 1/1] Improve pg_stat_statements performance for similar
 queries

Introduce a secondary hash in pg_stat_statements that is
responsible for tracking query locations in the external
file by queryid only. This allows similar queries
issued by different users, in different databases, or
at different tracking levels to reference the same location,
and cuts down on the need to write similar queries
multiple times to disk.

The new hash qtext_hash has a synchronization routine which
ensures it stays synchronized with pgss_hash while the
proper lock is held.

Discussion: https://www.postgresql.org/message-id/604E3199-2DD2-47DD-AC47-774A6F97DCA9%40amazon.com
---
 .../pg_stat_statements/pg_stat_statements.c   | 240 ++++++++++++++----
 1 file changed, 189 insertions(+), 51 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 768cedd..1dae264 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -153,6 +153,18 @@ typedef struct pgssHashKey
 	bool		toplevel;		/* query executed at top level */
 } pgssHashKey;
 
+/*
+ * This secondary hash table maintains the location data of a queryid
+ * in the external file.  By maintaining this separate mapping, we
+ * minimize the overhead of normalizing/storing multiple copies of
+ * the same query text when the query is issued in different databases,
+ * by different users, or at different tracking levels.
+ */
+typedef struct qtextHashKey
+{
+	uint64		queryid;		/* query identifier */
+} qtextHashKey;
+
 /*
  * The actual stats counters kept within pgssEntry.
  */
@@ -228,6 +240,21 @@ typedef struct pgssEntry
 	slock_t		mutex;			/* protects the counters only */
 } pgssEntry;
 
+/*
+ * Location of query in external file
+ *
+ * Note: qtext_hash remains in sync with pgss_hash; both hashes track
+ * the same queryids. The routine responsible for keeping them in sync
+ * is qtext_hash_sync(), which is called only when an exclusive lock
+ * is held on pgss->lock.
+ */
+typedef struct qtextEntry
+{
+	qtextHashKey key;           /* hash key of entry - MUST BE FIRST */
+	Size        query_offset;   /* query text offset in external file */
+	int         query_len;      /* # of valid bytes in query string, or -1 */
+} qtextEntry;
+
 /*
  * Global shared state
  */
@@ -265,6 +292,7 @@ static ProcessUtility_hook_type prev_ProcessUtility = NULL;
 /* Links to shared memory state */
 static pgssSharedState *pgss = NULL;
 static HTAB *pgss_hash = NULL;
+static HTAB *qtext_hash = NULL;
 
 /*---- GUC variables ----*/
 
@@ -352,6 +380,7 @@ static Size pgss_memsize(void);
 static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
 							  int encoding, bool sticky);
 static void entry_dealloc(void);
+static void qtext_hash_sync(void);
 static bool qtext_store(const char *query, int query_len,
 						Size *query_offset, int *gc_count);
 static char *qtext_load_file(Size *buffer_size);
@@ -516,6 +545,7 @@ pgss_shmem_startup(void)
 	/* reset in case this is a restart within the postmaster */
 	pgss = NULL;
 	pgss_hash = NULL;
+	qtext_hash = NULL;
 
 	/*
 	 * Create or attach to the shared memory state, including hash table
@@ -547,6 +577,13 @@ pgss_shmem_startup(void)
 							  &info,
 							  HASH_ELEM | HASH_BLOBS);
 
+	info.keysize = sizeof(qtextHashKey);
+	info.entrysize = sizeof(qtextEntry);
+	qtext_hash = ShmemInitHash("pg_stat_statements query text hash",
+							   pgss_max, pgss_max,
+							   &info,
+							   HASH_ELEM | HASH_BLOBS);
+
 	LWLockRelease(AddinShmemInitLock);
 
 	/*
@@ -656,6 +693,13 @@ pgss_shmem_startup(void)
 		entry->counters = temp.counters;
 	}
 
+	/*
+	 * Now that we have the pgss_hash entries,
+	 * let's sync up the qtext_hash with
+	 * all of our queryids.
+	 */
+	qtext_hash_sync();
+
 	/* Read global statistics for pg_stat_statements */
 	if (fread(&pgss->stats, sizeof(pgssGlobalStats), 1, file) != 1)
 		goto read_error;
@@ -735,7 +779,7 @@ pgss_shmem_shutdown(int code, Datum arg)
 		return;
 
 	/* Safety check ... shouldn't get here unless shmem is set up. */
-	if (!pgss || !pgss_hash)
+	if (!pgss || !pgss_hash || !qtext_hash)
 		return;
 
 	/* Don't dump if told not to. */
@@ -827,7 +871,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 		prev_post_parse_analyze_hook(pstate, query, jstate);
 
 	/* Safety check... */
-	if (!pgss || !pgss_hash || !pgss_enabled(exec_nested_level))
+	if (!pgss || !pgss_hash || !qtext_hash || !pgss_enabled(exec_nested_level))
 		return;
 
 	/*
@@ -1226,7 +1270,7 @@ pgss_store(const char *query, uint64 queryId,
 	Assert(query != NULL);
 
 	/* Safety check... */
-	if (!pgss || !pgss_hash)
+	if (!pgss || !pgss_hash || !qtext_hash)
 		return;
 
 	/*
@@ -1236,13 +1280,6 @@ pgss_store(const char *query, uint64 queryId,
 	if (queryId == UINT64CONST(0))
 		return;
 
-	/*
-	 * Confine our attention to the relevant part of the string, if the query
-	 * is a portion of a multi-statement source string, and update query
-	 * location and length if needed.
-	 */
-	query = CleanQuerytext(query, &query_location, &query_len);
-
 	/* Set up key for hashtable search */
 
 	/* memset() is required when pgssHashKey is without padding only */
@@ -1261,56 +1298,106 @@ pgss_store(const char *query, uint64 queryId,
 	/* Create new entry, if not present */
 	if (!entry)
 	{
-		Size		query_offset;
+		Size		query_offset = 0;
 		int			gc_count;
 		bool		stored;
-		bool		do_gc;
+		bool		do_gc = false;
+		qtextEntry		*qentry;
 
 		/*
-		 * Create a new, normalized query string if caller asked.  We don't
-		 * need to hold the lock while doing this work.  (Note: in any case,
-		 * it's possible that someone else creates a duplicate hashtable entry
-		 * in the interval where we don't hold the lock below.  That case is
-		 * handled by entry_alloc.)
+		 * For our queryid, if we can find the offset and length in the
+		 * qtext_hash, we can allocate a new entry in pgss_hash using this
+		 * information.
 		 */
-		if (jstate)
+		qentry = hash_search(qtext_hash, &key.queryid, HASH_FIND, NULL);
+
+		if (!qentry)
 		{
+			/*
+			 * Confine our attention to the relevant part of the string, if the query
+			 * is a portion of a multi-statement source string, and update query
+			 * location and length if needed.
+			 *
+			 * Note: This must only be done if we plan on adding the query
+			 * text to the external query file.
+			 */
+			query = CleanQuerytext(query, &query_location, &query_len);
+
+			/*
+			 * Create a new, normalized query string if caller asked.  We don't
+			 * need to hold the lock while doing this work.  (Note: in any case,
+			 * it's possible that someone else creates a duplicate hashtable entry
+			 * in the interval where we don't hold the lock below.  That case is
+			 * handled by entry_alloc.)
+			 */
+			if (jstate)
+			{
+				LWLockRelease(pgss->lock);
+				norm_query = generate_normalized_query(jstate, query,
+													   query_location,
+													   &query_len);
+				LWLockAcquire(pgss->lock, LW_SHARED);
+			}
+
+			/* Append new query text to file with only shared lock held */
+			stored = qtext_store(norm_query ? norm_query : query, query_len,
+								 &query_offset, &gc_count);
+
+			/*
+			 * Determine whether we need to garbage collect external query texts
+			 * while the shared lock is still held.  This micro-optimization
+			 * avoids taking the time to decide this while holding exclusive lock.
+			 */
+			do_gc = need_gc_qtexts();
+
+			/* Need exclusive lock to make a new hashtable entry - promote */
 			LWLockRelease(pgss->lock);
-			norm_query = generate_normalized_query(jstate, query,
-												   query_location,
-												   &query_len);
-			LWLockAcquire(pgss->lock, LW_SHARED);
-		}
+			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
 
-		/* Append new query text to file with only shared lock held */
-		stored = qtext_store(norm_query ? norm_query : query, query_len,
-							 &query_offset, &gc_count);
+			{
+				/*
+				 * If we reach max entries in qtext_hash, we
+				 * can call entry_dealloc which will deallocate
+				 * pgss_hash entries and resync qtext_hash.
+				 * if we are maxed out on entries in qtext_hash,
+				 * If we are maxed out on entries in qtext_hash,
+				 * We are not likely to invoke entry_dealloc()
+				 * during entry_alloc() if we deallocate here.
+				 */
+				while (hash_get_num_entries(qtext_hash) >= pgss_max)
+					entry_dealloc();
 
-		/*
-		 * Determine whether we need to garbage collect external query texts
-		 * while the shared lock is still held.  This micro-optimization
-		 * avoids taking the time to decide this while holding exclusive lock.
-		 */
-		do_gc = need_gc_qtexts();
+				qentry = (qtextEntry *) hash_search(qtext_hash, &key.queryid, HASH_ENTER, NULL);
 
-		/* Need exclusive lock to make a new hashtable entry - promote */
-		LWLockRelease(pgss->lock);
-		LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+				if (qentry)
+				{
+					qentry->query_offset = query_offset;
+					qentry->query_len = query_len;
+				}
+			}
 
-		/*
-		 * A garbage collection may have occurred while we weren't holding the
-		 * lock.  In the unlikely event that this happens, the query text we
-		 * stored above will have been garbage collected, so write it again.
-		 * This should be infrequent enough that doing it while holding
-		 * exclusive lock isn't a performance problem.
-		 */
-		if (!stored || pgss->gc_count != gc_count)
-			stored = qtext_store(norm_query ? norm_query : query, query_len,
-								 &query_offset, NULL);
+			/*
+			 * A garbage collection may have occurred while we weren't holding the
+			 * lock.  In the unlikely event that this happens, the query text we
+			 * stored above will have been garbage collected, so write it again.
+			 * This should be infrequent enough that doing it while holding
+			 * exclusive lock isn't a performance problem.
+			 */
+			if (!stored || pgss->gc_count != gc_count)
+				stored = qtext_store(norm_query ? norm_query : query, query_len,
+									 &query_offset, NULL);
+
+			/* If we failed to write to the text file, give up */
+			if (!stored)
+				goto done;
+		} else
+		{
+			LWLockRelease(pgss->lock);
+			LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
 
-		/* If we failed to write to the text file, give up */
-		if (!stored)
-			goto done;
+			query_offset = qentry->query_offset;
+			query_len = qentry->query_len;
+		}
 
 		/* OK to create a new hashtable entry */
 		entry = entry_alloc(&key, query_offset, query_len, encoding,
@@ -1550,7 +1637,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
 	is_allowed_role = has_privs_of_role(userid, ROLE_PG_READ_ALL_STATS);
 
 	/* hash table must exist already */
-	if (!pgss || !pgss_hash)
+	if (!pgss || !pgss_hash || !qtext_hash)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("pg_stat_statements must be loaded via shared_preload_libraries")));
@@ -1860,7 +1947,7 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
 	Datum		values[PG_STAT_STATEMENTS_INFO_COLS];
 	bool		nulls[PG_STAT_STATEMENTS_INFO_COLS];
 
-	if (!pgss || !pgss_hash)
+	if (!pgss || !pgss_hash || !qtext_hash)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("pg_stat_statements must be loaded via shared_preload_libraries")));
@@ -1897,6 +1984,7 @@ pgss_memsize(void)
 
 	size = MAXALIGN(sizeof(pgssSharedState));
 	size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
+	size = add_size(size, hash_estimate_size(pgss_max, sizeof(qtextEntry)));
 
 	return size;
 }
@@ -2041,6 +2129,13 @@ entry_dealloc(void)
 		hash_search(pgss_hash, &entries[i]->key, HASH_REMOVE, NULL);
 	}
 
+	/*
+	 * Now that we removed entries from pgss_hash,
+	 * let's resync qtext_hash with the remaining
+	 * queries.
+	 */
+	qtext_hash_sync();
+
 	pfree(entries);
 
 	/* Increment the number of times entries are deallocated */
@@ -2053,6 +2148,35 @@ entry_dealloc(void)
 	}
 }
 
+/*
+ * synchronize qtext_hash with pgss_hash
+ */
+static void
+qtext_hash_sync(void)
+{
+	HASH_SEQ_STATUS qhash_seq;
+	HASH_SEQ_STATUS hash_seq;
+	qtextEntry  *qentry;
+	pgssEntry *entry;
+	qtextHashKey qkey;
+
+	hash_seq_init(&qhash_seq, qtext_hash);
+	while ((qentry = hash_seq_search(&qhash_seq)) != NULL)
+	{
+		hash_search(qtext_hash, &qentry->key, HASH_REMOVE, NULL);
+	}
+
+	hash_seq_init(&hash_seq, pgss_hash);
+	while ((entry = hash_seq_search(&hash_seq)) != NULL)
+	{
+		qkey.queryid = entry->key.queryid;
+
+		qentry = hash_search(qtext_hash, &qkey, HASH_ENTER, NULL);
+
+		qentry->query_offset = entry->query_offset;
+		qentry->query_len = entry->query_len;
+	}
+}
 /*
  * Given a query string (not necessarily null-terminated), allocate a new
  * entry in the external query text file and store the string there.
@@ -2440,6 +2564,14 @@ gc_qtexts(void)
 	 */
 	record_gc_qtexts();
 
+	/*
+	 * Resync qtext_hash with pgss_hash
+	 *
+	 * We only do this after we remove all entries from pgss_hash since
+	 * we have a clean external query file.
+	 */
+	qtext_hash_sync();
+
 	return;
 
 gc_fail:
@@ -2506,7 +2638,7 @@ entry_reset(Oid userid, Oid dbid, uint64 queryid)
 	long		num_remove = 0;
 	pgssHashKey key;
 
-	if (!pgss || !pgss_hash)
+	if (!pgss || !pgss_hash || !qtext_hash)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("pg_stat_statements must be loaded via shared_preload_libraries")));
@@ -2562,6 +2694,12 @@ entry_reset(Oid userid, Oid dbid, uint64 queryid)
 		}
 	}
 
+	/*
+	 * We have a clean external file, let's resync
+	 * qtext_hash with pgss_hash.
+	 */
+	qtext_hash_sync();
+
 	/* All entries are removed? */
 	if (num_entries != num_remove)
 		goto release_lock;
-- 
2.32.1 (Apple Git-133)

#6Julien Rouhaud
rjuju123@gmail.com
In reply to: Imseih (AWS), Sami (#5)
Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

Hi,

On Tue, Jun 21, 2022 at 08:04:01PM +0000, Imseih (AWS), Sami wrote:

I separated the pg_stat_statements patch. The patch
introduces a secondary hash that tracks locations of
a query (by queryid) in the external file.

I still think that's wrong.

The hash
remains in lockstep with the pgss_hash using a
synchronization routine.

Can you describe how it's kept in sync, and how it makes sure that the property
is maintained over restart / gc? I don't see any change in the code for the
2nd part so I don't see how it could work (and after a quick test it indeed
doesn't).

I also don't see any change in the heuristics for need_gc_qtexts(); isn't that
going to lead to too frequent gc?
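
For reference, the deciding part of need_gc_qtexts() currently looks roughly
like this (paraphrased; reading of the shared extent pointer omitted):

	/* Don't proceed if file does not exceed 512 bytes per possible entry */
	if (extent < 512 * pgss_max)
		return false;

	/* Don't proceed if file is less than about 50% bloat, judged against
	 * the mean query length tracked in shared state */
	if (extent < pgss->mean_query_len * pgss_max * 2)
		return false;

	return true;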

My testing does not show any regression for workloads
in which statements are not issued by multiple users/databases.

However, it shows good improvement, 10-15%, when there
are similar statements that are issued by multiple
users/databases/tracking levels.

"no regression" and "10-15% improvement" on what?

Can you share more details on the benchmarks you did? Did you also run
benchmark on workloads that induce entry eviction, with and without need for
gc? Eviction in pgss is already very expensive, and making things worse just
to save a bit of disk space doesn't seem like a good compromise.

#7Imseih (AWS), Sami
simseih@amazon.com
In reply to: Julien Rouhaud (#6)
2 attachment(s)
Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

Can you describe how it's kept in sync, and how it makes sure that the property
is maintained over restart / gc? I don't see any change in the code for the
2nd part so I don't see how it could work (and after a quick test it indeed
doesn't).

There is a routine called qtext_hash_sync which removes all entries from
the qtext_hash and reloads it with all the queryids from pgss_hash.

This routine is called during:

1. gc_qtexts()
2. entry_reset()
3. entry_dealloc(), although this can be moved to the end of entry_alloc() instead.
4. pgss_shmem_startup()

At all the points where it's called, an exclusive lock is held. This allows syncing
all the queryids present in pgss_hash with qtext_hash.

2nd part so I don't see how it could work (and after a quick test it indeed
doesn't).

Can you tell me what test you used to determine it is not in sync?

Can you share more details on the benchmarks you did? Did you also run
benchmark on workloads that induce entry eviction, with and without need for
gc? Eviction in pgss is already very expensive, and making things worse just
to save a bit of disk space doesn't seem like a good compromise.

Sorry, this was poorly explained by me. I went back and did some benchmarks.
Attached are the script and results. But here is a summary:
On an EC2 r5.2xlarge, the benchmark I performed is:
1. create 10k tables
2. create 5 users
3. run a pgbench script that, per transaction, performs a select on
a randomly chosen table for each of the 5 users.
4. 2 variants of the test were executed: one with the default pg_stat_statements.max = 5000
and one with a larger pg_stat_statements.max = 10000.

So 10-15% is not accurate. I originally tested on a less powerful machine. For this
benchmark I see a 6% increase in TPS (732k vs 683k) when a larger
pg_stat_statements.max is used and there are fewer gc/deallocations.
Both tests show a drop in gc/deallocations and a net increase
in TPS. Less GC makes sense since the external file has fewer duplicate SQL texts.

##################################
## pg_stat_statements.max = 15000
##################################

## with patch

transaction type: /tmp/wl.sql
scaling factor: 1
query mode: simple
number of clients: 20
number of threads: 1
maximum number of tries: 1
duration: 360 s
number of transactions actually processed: 732604
number of failed transactions: 0 (0.000%)
latency average = 9.828 ms
initial connection time = 33.349 ms
tps = 2035.051541 (without initial connection time)

42 gc_qtext calls
3473 deallocations

## no patch

transaction type: /tmp/wl.sql
scaling factor: 1
query mode: simple
number of clients: 20
number of threads: 1
maximum number of tries: 1
duration: 360 s
number of transactions actually processed: 683434
number of failed transactions: 0 (0.000%)
latency average = 10.535 ms
initial connection time = 32.788 ms
tps = 1898.452025 (without initial connection time)

154 garbage collections
3239 deallocations

##################################
## pg_stat_statements.max = 5000
##################################

## with patch

transaction type: /tmp/wl.sql
scaling factor: 1
query mode: simple
number of clients: 20
number of threads: 1
maximum number of tries: 1
duration: 360 s
number of transactions actually processed: 673135
number of failed transactions: 0 (0.000%)
latency average = 10.696 ms
initial connection time = 32.908 ms
tps = 1869.829843 (without initial connection time)

400 garbage collections
12501 deallocations

## no patch

transaction type: /tmp/wl.sql
scaling factor: 1
query mode: simple
number of clients: 20
number of threads: 1
maximum number of tries: 1
duration: 360 s
number of transactions actually processed: 656160
number of failed transactions: 0 (0.000%)
latency average = 10.973 ms
initial connection time = 33.275 ms
tps = 1822.678069 (without initial connection time)

580 garbage collections
12180 deallocations

Thanks

Sami
Amazon Web Services

Attachments:

benchmark.sh (application/octet-stream)
results.txt (text/plain)
#8Julien Rouhaud
rjuju123@gmail.com
In reply to: Imseih (AWS), Sami (#7)
Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

On Wed, Jun 22, 2022 at 11:05:54PM +0000, Imseih (AWS), Sami wrote:

Can you describe how it's kept in sync, and how it makes sure that the property
is maintained over restart / gc? I don't see any change in the code for the
2nd part so I don't see how it could work (and after a quick test it indeed
doesn't).

There is a routine called qtext_hash_sync which removes all entries from
the qtext_hash and reloads it with all the queryids from pgss_hash.
[...]
At all the points where it's called, an exclusive lock is held. This allows syncing
all the queryids present in pgss_hash with qtext_hash.

So your approach is to let the current gc / file loading behavior happen as
before and construct your mapping hash using the resulting query text / offset
info. That can't work.

2nd part so I don't see how it could work (and after a quick test it indeed
doesn't).

Can you tell me what test you used to determine it is not in sync?

What test did you use to determine it is in sync? Have you checked how the gc/
file loading actually work?

In my case I just checked the size of the query text file after running a
script that makes sure that the same few queries are run by multiple
different roles, then:

Size of $PGDATA/pg_stat_tmp/pgss_query_texts.stat: 559B
pg_ctl restart
Size of $PGDATA/pg_stat_tmp/pgss_query_texts.stat: 2383B
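
For what it's worth, the same check can also be done from SQL, assuming
sufficient privileges (path as above, relative to the data directory):

-- size in bytes of the pg_stat_statements external query text file
SELECT size FROM pg_stat_file('pg_stat_tmp/pgss_query_texts.stat');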

Can you share more details on the benchmarks you did? Did you also run
benchmarks on workloads that induce entry eviction, with and without the need
for gc? Eviction in pgss is already very expensive, and making things worse
just to save a bit of disk space doesn't seem like a good compromise.

Sorry, this was poorly explained by me. I went back and did some benchmarks. Attached are
the script and results. But here is a summary, on an EC2 r5.2xlarge. The benchmark I performed was:
1. create 10k tables
2. create 5 users
3. run a pgbench script that, in each transaction, performs a select on
a randomly chosen table, for each of the 5 users.
4. 2 variants of the test were executed: one with the default pg_stat_statements.max = 5000
and one with a larger pg_stat_statements.max = 10000.

But you wrote:

##################################
## pg_stat_statements.max = 15000
##################################

So which one is it?

So 10-15% is not accurate. I originally tested on a less powerful machine. For this
benchmark I see a ~7% increase in throughput (732k vs 683k transactions) when a larger
pg_stat_statements.max is used and there are fewer gc runs/deallocations.
Both tests show a drop in gc runs/deallocations and a net increase
in TPS. Less GC makes sense since the external file contains fewer duplicate SQL texts.

On the other hand you're rebuilding the new query_offset hashtable every time
there's an entry eviction, which seems quite expensive.

Also, as I mentioned, you didn't change the heuristic in
need_gc_qtexts(). So if the query texts are indeed deduplicated, doesn't that
mean that gc will artificially be called less often? The intended target of
"50% bloat" will become "50% bloat assuming no deduplication is done", and the
average query text file size will stay the same whether the query texts are
deduplicated or not.
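
For context, the upstream heuristic looks roughly like this (paraphrased
from pg_stat_statements.c, with the volatile-pointer spinlock dance
simplified):

static bool
need_gc_qtexts(void)
{
	Size		extent;

	/* Read the current end of the query text file from shared state */
	SpinLockAcquire(&pgss->mutex);
	extent = pgss->extent;
	SpinLockRelease(&pgss->mutex);

	/* Don't bother unless the file exceeds 512 kB */
	if (extent < 512 * 1024)
		return false;

	/*
	 * Don't proceed unless the file shows about 50% bloat, estimated
	 * against mean query length times the number of entries.  With
	 * deduplicated texts the file grows more slowly, so this threshold
	 * is crossed less often even though the heuristic is unchanged.
	 */
	if (extent < pgss->mean_query_len * pgss_max * 2)
		return false;

	return true;
}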

I'm wondering whether the improvements you see are due to the patch itself or
simply due to artificially calling gc less often. What are the results if,
instead of using vanilla pg_stat_statements, you patch it to perform roughly
the same number of gc runs as your version does?

Also, your benchmark workload is very friendly to your feature; what are the
results with other workloads? For instance, results with query texts that
aren't artificially long would be interesting, after fixing the problems
mentioned previously.

Also, you said that if you run that benchmark with a single user you don't see
any regression. I don't see how rebuilding an extra hashtable in
entry_dealloc(), i.e. while holding an exclusive lwlock, without saving any
other work elsewhere, could be free.
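
Concretely, the pattern being questioned is along these lines (a sketch
only; names per the discussion above):

static void
entry_dealloc(void)
{
	/*
	 * ... existing logic: sort entries by usage and evict the
	 * least-used ones, under an exclusive pgss->lock ...
	 */

	/*
	 * The described patch then rebuilds the whole queryid -> offset
	 * mapping before releasing the lock: O(pgss_max) extra work on
	 * every eviction cycle, on top of the already-expensive eviction.
	 */
	qtext_hash_sync();
}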

Looking at the script, you have:
echo "log_min_messages=debug1" >> $PGDATA/postgresql.conf; \

Is that really necessary? Couldn't you upgrade the gc message to a higher
level for your benchmark needs, or maybe expose some new counter in
pg_stat_statements_info? Did you run the benchmark using a debug build or a
normal build?
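
As a hypothetical sketch of the latter: pg_stat_statements already bumps a
dealloc counter in shared state for pg_stat_statements_info, and a gc counter
could be maintained the same way (gc_count is an invented field here):

	/* in gc_qtexts(), after a successful garbage collection pass */
	SpinLockAcquire(&pgss->mutex);
	pgss->stats.gc_count++;
	SpinLockRelease(&pgss->mutex);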

#9Antonin Houska
ah@cybertec.at
In reply to: Julien Rouhaud (#8)
Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

Shouldn't the patch status be set to "Waiting on Author"?

(I was curious if this is a patch that I can review.)


--
Antonin Houska
Web: https://www.cybertec-postgresql.com

#10Julien Rouhaud
rjuju123@gmail.com
In reply to: Antonin Houska (#9)
Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

Hi,

On Thu, Jul 14, 2022 at 08:51:24AM +0200, Antonin Houska wrote:

Shouldn't the patch status be set to "Waiting on Author"?

(I was curious if this is a patch that I can review.)

Ah indeed, I just updated the CF entry!

#11Jacob Champion
jchampion@timescale.com
In reply to: Julien Rouhaud (#10)
Re: [PROPOSAL] Detecting plan changes with plan_id in pg_stat_activity

This entry has been waiting on author input for a while (our current
threshold is roughly two weeks), so I've marked it Returned with
Feedback.

Once you think the patchset is ready for review again, you (or any
interested party) can resurrect the patch entry by visiting

https://commitfest.postgresql.org/38/3700/

and changing the status to "Needs Review", and then changing the
status again to "Move to next CF". (Don't forget the second step;
hopefully we will have streamlined this in the near future!)

Thanks,
--Jacob