extended stats on partitioned tables

Started by Justin Pryzbyover 4 years ago37 messages
#1Justin Pryzby
pryzby@telsasoft.com

extended stats objects are allowed on partitioned tables since v10.
/messages/by-id/CAKJS1f-BmGo410bh5RSPZUvOO0LhmHL2NYmdrC_Jm8pk_FfyCA@mail.gmail.com
8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b

But since 859b3003de they're not populated - pg_statistic_ext(_data) is empty.
This was the consequence of a commit to avoid an error I reported with stats on
inheritence parents (not partitioned tables).

preceding 859b3003de, stats on the parent table *did* improve the estimate,
so this part of the commit message seems to have been wrong?
|commit 859b3003de87645b62ee07ef245d6c1f1cd0cedb
| Don't build extended statistics on inheritance trees
...
| Moreover, the current selectivity estimation code only works with individual
| relations, so building statistics on inheritance trees would be pointless
| anyway.

|CREATE TABLE p (i int, a int, b int) PARTITION BY RANGE (i);
|CREATE TABLE pd PARTITION OF p FOR VALUES FROM (1)TO(100);
|TRUNCATE p; INSERT INTO p SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
|CREATE STATISTICS pp ON (a),(b) FROM p;
|VACUUM ANALYZE p;
|SELECT * FROM pg_statistic_ext WHERE stxrelid ='p'::regclass;

|postgres=# begin; DROP STATISTICS pp; explain analyze SELECT a,b FROM p GROUP BY 1,2; abort;
| HashAggregate (cost=20.98..21.98 rows=100 width=8) (actual time=1.088..1.093 rows=10 loops=1)

|postgres=# explain analyze SELECT a,b FROM p GROUP BY 1,2;
| HashAggregate (cost=20.98..21.09 rows=10 width=8) (actual time=1.082..1.086 rows=10 loops=1)

So I think this is a regression, and extended stats should be populated for
partitioned tables - I had actually done that for some parent tables and hadn't
noticed that the stats objects no longer do anything.

That begs the question if the current behavior for inheritence parents is
correct..

CREATE TABLE p (i int, a int, b int);
CREATE TABLE pd () INHERITS (p);
INSERT INTO pd SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
CREATE STATISTICS pp ON (a),(b) FROM p;
VACUUM ANALYZE p;
explain analyze SELECT a,b FROM p GROUP BY 1,2;

| HashAggregate (cost=25.99..26.99 rows=100 width=8) (actual time=3.268..3.284 rows=10 loops=1)

Since child tables can be queried directly, it's a legitimate question whether
we should collect stats for the table heirarchy or (since the catalog only
supports one) only the table itself. I'd think that stats for the table
hierarchy would be more commonly useful (but we shouldn't change the behavior
in existing releases again). Anyway it seems unfortunate that
statistic_ext_data still has no stxinherited.

Note that for partitioned tables if I enable enable_partitionwise_aggregate,
then stats objects on the child tables can be helpful (but that's also
confusing to the question at hand).

#2Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Justin Pryzby (#1)
1 attachment(s)
Re: extended stats on partitioned tables

On 9/23/21 11:26 PM, Justin Pryzby wrote:

extended stats objects are allowed on partitioned tables since v10.
/messages/by-id/CAKJS1f-BmGo410bh5RSPZUvOO0LhmHL2NYmdrC_Jm8pk_FfyCA@mail.gmail.com
8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b

But since 859b3003de they're not populated - pg_statistic_ext(_data) is empty.
This was the consequence of a commit to avoid an error I reported with stats on
inheritence parents (not partitioned tables).

preceding 859b3003de, stats on the parent table *did* improve the estimate,
so this part of the commit message seems to have been wrong?
|commit 859b3003de87645b62ee07ef245d6c1f1cd0cedb
| Don't build extended statistics on inheritance trees
...
| Moreover, the current selectivity estimation code only works with individual
| relations, so building statistics on inheritance trees would be pointless
| anyway.

|CREATE TABLE p (i int, a int, b int) PARTITION BY RANGE (i);
|CREATE TABLE pd PARTITION OF p FOR VALUES FROM (1)TO(100);
|TRUNCATE p; INSERT INTO p SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
|CREATE STATISTICS pp ON (a),(b) FROM p;
|VACUUM ANALYZE p;
|SELECT * FROM pg_statistic_ext WHERE stxrelid ='p'::regclass;

|postgres=# begin; DROP STATISTICS pp; explain analyze SELECT a,b FROM p GROUP BY 1,2; abort;
| HashAggregate (cost=20.98..21.98 rows=100 width=8) (actual time=1.088..1.093 rows=10 loops=1)

|postgres=# explain analyze SELECT a,b FROM p GROUP BY 1,2;
| HashAggregate (cost=20.98..21.09 rows=10 width=8) (actual time=1.082..1.086 rows=10 loops=1)

So I think this is a regression, and extended stats should be populated for
partitioned tables - I had actually done that for some parent tables and hadn't
noticed that the stats objects no longer do anything.

That begs the question if the current behavior for inheritence parents is
correct..

CREATE TABLE p (i int, a int, b int);
CREATE TABLE pd () INHERITS (p);
INSERT INTO pd SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
CREATE STATISTICS pp ON (a),(b) FROM p;
VACUUM ANALYZE p;
explain analyze SELECT a,b FROM p GROUP BY 1,2;

| HashAggregate (cost=25.99..26.99 rows=100 width=8) (actual time=3.268..3.284 rows=10 loops=1)

Agreed, that seems like a regression, but I don't see how to fix that
without having the extra flag in the catalog. Otherwise we can store
just one version for each statistics object :-(

Since child tables can be queried directly, it's a legitimate question whether
we should collect stats for the table heirarchy or (since the catalog only
supports one) only the table itself. I'd think that stats for the table
hierarchy would be more commonly useful (but we shouldn't change the behavior
in existing releases again). Anyway it seems unfortunate that
statistic_ext_data still has no stxinherited.

Yeah, we probably need the flag - I planned to get it into 14, but then
I got distracted by something else :-/

Attached is a PoC that I quickly bashed together today. It's pretty raw,
but it passed "make check" and I think it does most of the things right.
Can you try if this fixes the estimates with partitioned tables?

Extended statistics use two catalogs, pg_statistic_ext for definition,
while pg_statistic_ext_data stores the built statistics objects - the
flag needs to be in the "data" catalog, and managing the records is a
bit challenging - the current PoC code mostly works, but I had to relax
some error checks and I'm sure there are cases when we fail to remove a
row, or something like that.

Note that for partitioned tables if I enable enable_partitionwise_aggregate,
then stats objects on the child tables can be helpful (but that's also
confusing to the question at hand).

Yeah. I think it'd be helpful to assemble a script with various test
cases demonstrating how we estimate various cases.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

statistics-inheritance-fix-v1.patchtext/x-patch; charset=UTF-8; name=statistics-inheritance-fix-v1.patchDownload
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2f0def9b19..e256a5533c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7441,6 +7441,19 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
    created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>.
   </para>
 
+  <para>
+   Normally there is one entry, with <structfield>stxdinherit</structfield> =
+   <literal>false</literal>, for each statistics object that has been analyzed.
+   If the table has inheritance children, a second entry with
+   <structfield>stxdinherit</structfield> = <literal>true</literal> is also created.
+   This row represents the statistics object over the inheritance tree, i.e.,
+   statistics for the data you'd see with
+   <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>,
+   whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row
+   represents the results of
+   <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>.
+  </para>
+
   <para>
    Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>,
    <structname>pg_statistic_ext_data</structname> should not be
@@ -7480,6 +7493,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stxdinherit</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the stats include inheritance child columns, not just the
+       values in the specified relation
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..07ab18dc52 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS
            ) AS attnames,
            pg_get_statisticsobjdef_expressions(s.oid) as exprs,
            s.stxkind AS kinds,
+           sd.stxdinherit AS inherited,
            sd.stxdndistinct AS n_distinct,
            sd.stxddependencies AS dependencies,
            m.most_common_vals,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8bfb2ad958..7f4b0f5320 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -611,15 +611,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
-		/*
-		 * Build extended statistics (if there are any).
-		 *
-		 * For now we only build extended statistics on individual relations,
-		 * not for relations representing inheritance trees.
-		 */
-		if (!inh)
-			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
-									   attr_cnt, vacattrstats);
+		/* Build extended statistics (if there are any). */
+		BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows,
+								   attr_cnt, vacattrstats);
 	}
 
 	pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index afe6744e23..d65a297e2e 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -524,6 +524,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
 
+	/* create only the "stxdinherit=false", because that always exists */
+	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);
+
 	/* no statistics built yet */
 	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
 	datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
@@ -726,6 +729,7 @@ RemoveStatisticsById(Oid statsOid)
 	HeapTuple	tup;
 	Form_pg_statistic_ext statext;
 	Oid			relid;
+	int			inh;
 
 	/*
 	 * First delete the pg_statistic_ext_data tuple holding the actual
@@ -733,14 +737,20 @@ RemoveStatisticsById(Oid statsOid)
 	 */
 	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
-	tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
+	/* hack to delete both stxdinherit = true/false */
+	for (inh = 0; inh <= 1; inh++)
+	{
+		tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+							  BoolGetDatum(inh));
 
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+		if (!HeapTupleIsValid(tup)) /* should not happen */
+			// elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+			continue;
 
-	CatalogTupleDelete(relation, &tup->t_self);
+		CatalogTupleDelete(relation, &tup->t_self);
 
-	ReleaseSysCache(tup);
+		ReleaseSysCache(tup);
+	}
 
 	table_close(relation, RowExclusiveLock);
 
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index c5194fdbbf..154d48a330 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
+#include "catalog/pg_statistic_ext_data.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -1311,127 +1312,144 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 	{
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
+		Form_pg_statistic_ext_data dataForm;
 		HeapTuple	htup;
 		HeapTuple	dtup;
 		Bitmapset  *keys = NULL;
 		List	   *exprs = NIL;
 		int			i;
+		int			inh;
 
 		htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
 		if (!HeapTupleIsValid(htup))
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-		if (!HeapTupleIsValid(dtup))
-			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
-		/*
-		 * First, build the array of columns covered.  This is ultimately
-		 * wasted if no stats within the object have actually been built, but
-		 * it doesn't seem worth troubling over that case.
-		 */
-		for (i = 0; i < staForm->stxkeys.dim1; i++)
-			keys = bms_add_member(keys, staForm->stxkeys.values[i]);
-
 		/*
-		 * Preprocess expressions (if any). We read the expressions, run them
-		 * through eval_const_expressions, and fix the varnos.
+		 * Hack to load stats with stxdinherit true/false - there should be
+		 * a better way to do this, I guess.
 		 */
+		for (inh = 0; inh <= 1; inh++)
 		{
-			bool		isnull;
-			Datum		datum;
+			dtup = SearchSysCache2(STATEXTDATASTXOID,
+								   ObjectIdGetDatum(statOid), BoolGetDatum((bool) inh));
+			if (!HeapTupleIsValid(dtup))
+				continue;
 
-			/* decode expression (if any) */
-			datum = SysCacheGetAttr(STATEXTOID, htup,
-									Anum_pg_statistic_ext_stxexprs, &isnull);
+			dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
 
-			if (!isnull)
+			/*
+			 * First, build the array of columns covered.  This is ultimately
+			 * wasted if no stats within the object have actually been built, but
+			 * it doesn't seem worth troubling over that case.
+			 */
+			for (i = 0; i < staForm->stxkeys.dim1; i++)
+				keys = bms_add_member(keys, staForm->stxkeys.values[i]);
+
+			/*
+			 * Preprocess expressions (if any). We read the expressions, run them
+			 * through eval_const_expressions, and fix the varnos.
+			 */
 			{
-				char	   *exprsString;
+				bool		isnull;
+				Datum		datum;
 
-				exprsString = TextDatumGetCString(datum);
-				exprs = (List *) stringToNode(exprsString);
-				pfree(exprsString);
+				/* decode expression (if any) */
+				datum = SysCacheGetAttr(STATEXTOID, htup,
+										Anum_pg_statistic_ext_stxexprs, &isnull);
 
-				/*
-				 * Run the expressions through eval_const_expressions. This is
-				 * not just an optimization, but is necessary, because the
-				 * planner will be comparing them to similarly-processed qual
-				 * clauses, and may fail to detect valid matches without this.
-				 * We must not use canonicalize_qual, however, since these
-				 * aren't qual expressions.
-				 */
-				exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+				if (!isnull)
+				{
+					char	   *exprsString;
 
-				/* May as well fix opfuncids too */
-				fix_opfuncids((Node *) exprs);
+					exprsString = TextDatumGetCString(datum);
+					exprs = (List *) stringToNode(exprsString);
+					pfree(exprsString);
 
-				/*
-				 * Modify the copies we obtain from the relcache to have the
-				 * correct varno for the parent relation, so that they match
-				 * up correctly against qual clauses.
-				 */
-				if (varno != 1)
-					ChangeVarNodes((Node *) exprs, 1, varno, 0);
+					/*
+					 * Run the expressions through eval_const_expressions. This is
+					 * not just an optimization, but is necessary, because the
+					 * planner will be comparing them to similarly-processed qual
+					 * clauses, and may fail to detect valid matches without this.
+					 * We must not use canonicalize_qual, however, since these
+					 * aren't qual expressions.
+					 */
+					exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+
+					/* May as well fix opfuncids too */
+					fix_opfuncids((Node *) exprs);
+
+					/*
+					 * Modify the copies we obtain from the relcache to have the
+					 * correct varno for the parent relation, so that they match
+					 * up correctly against qual clauses.
+					 */
+					if (varno != 1)
+						ChangeVarNodes((Node *) exprs, 1, varno, 0);
+				}
 			}
-		}
 
-		/* add one StatisticExtInfo for each kind built */
-		if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			/* add one StatisticExtInfo for each kind built */
+			if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_NDISTINCT;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_NDISTINCT;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_DEPENDENCIES;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_DEPENDENCIES;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_MCV;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_MCV;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_EXPRESSIONS;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_EXPRESSIONS;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
+
+				stainfos = lappend(stainfos, info);
+			}
 
-			stainfos = lappend(stainfos, info);
+			ReleaseSysCache(dtup);
 		}
 
 		ReleaseSysCache(htup);
-		ReleaseSysCache(dtup);
 		bms_free(keys);
 	}
 
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..02cf0efc66 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -618,14 +618,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums)
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
 MVDependencies *
-statext_dependencies_load(Oid mvoid)
+statext_dependencies_load(Oid mvoid, bool inh)
 {
 	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid),
+						   BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
@@ -1410,6 +1412,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* unique expressions */
 	Node	  **unique_exprs;
@@ -1598,6 +1601,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
 			continue;
 
+		/* skip statistics with mismatching stxdinherit value */
+		if (stat->inherit != rte->inh)
+			continue;
+
 		/*
 		 * Count matching attributes - we have to undo the attnum offsets. The
 		 * input attribute numbers are not offset (expressions are not
@@ -1644,7 +1651,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (nmatched + nexprs < 2)
 			continue;
 
-		deps = statext_dependencies_load(stat->statOid);
+		deps = statext_dependencies_load(stat->statOid, rte->inh);
 
 		/*
 		 * The expressions may be represented by different attnums in the
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 5fa36e0036..3909a1c716 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -77,7 +77,7 @@ typedef struct StatExtEntry
 static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
 static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 											int nvacatts, VacAttrStats **vacatts);
-static void statext_store(Oid statOid,
+static void statext_store(Oid statOid, bool inh,
 						  MVNDistinct *ndistinct, MVDependencies *dependencies,
 						  MCVList *mcv, Datum exprs, VacAttrStats **stats);
 static int	statext_compute_stattarget(int stattarget,
@@ -110,7 +110,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat,
  * requested stats, and serializes them back into the catalog.
  */
 void
-BuildRelationExtStatistics(Relation onerel, double totalrows,
+BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 						   int numrows, HeapTuple *rows,
 						   int natts, VacAttrStats **vacattrstats)
 {
@@ -230,7 +230,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		}
 
 		/* store the statistics in the catalog */
-		statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats);
+		statext_store(stat->statOid, inh,
+					  ndistinct, dependencies, mcv, exprstats, stats);
 
 		/* for reporting progress */
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -781,7 +782,7 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
  *	tuple.
  */
 static void
-statext_store(Oid statOid,
+statext_store(Oid statOid, bool inh,
 			  MVNDistinct *ndistinct, MVDependencies *dependencies,
 			  MCVList *mcv, Datum exprs, VacAttrStats **stats)
 {
@@ -790,14 +791,19 @@ statext_store(Oid statOid,
 				oldtup;
 	Datum		values[Natts_pg_statistic_ext_data];
 	bool		nulls[Natts_pg_statistic_ext_data];
-	bool		replaces[Natts_pg_statistic_ext_data];
 
 	pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
 	memset(nulls, true, sizeof(nulls));
-	memset(replaces, false, sizeof(replaces));
 	memset(values, 0, sizeof(values));
 
+	/* basic info */
+	values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid);
+	nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false;
+
+	values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh);
+	nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false;
+
 	/*
 	 * Construct a new pg_statistic_ext_data tuple, replacing the calculated
 	 * stats.
@@ -830,25 +836,27 @@ statext_store(Oid statOid,
 		values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs;
 	}
 
-	/* always replace the value (either by bytea or NULL) */
-	replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* there should already be a pg_statistic_ext_data tuple */
-	oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-	if (!HeapTupleIsValid(oldtup))
+	/*
+	 * Delete the old tuple if it exists, and insert a new one. It's easier
+	 * than trying to update or insert, based on various conditions.
+	 *
+	 * There should always be a pg_statistic_ext_data tuple for inh=false,
+	 * but there may be none for inh=true yet.
+	 */
+	oldtup = SearchSysCache2(STATEXTDATASTXOID,
+							 ObjectIdGetDatum(statOid),
+							 BoolGetDatum(inh));
+	if (HeapTupleIsValid(oldtup))
+	{
+		CatalogTupleDelete(pg_stextdata, &(oldtup->t_self));
+		ReleaseSysCache(oldtup);
+	}
+	else if (!inh)
 		elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 
-	/* replace it */
-	stup = heap_modify_tuple(oldtup,
-							 RelationGetDescr(pg_stextdata),
-							 values,
-							 nulls,
-							 replaces);
-	ReleaseSysCache(oldtup);
-	CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup);
+	/* form a new tuple */
+	stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls);
+	CatalogTupleInsert(pg_stextdata, stup);
 
 	heap_freetuple(stup);
 
@@ -1234,7 +1242,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs,
  * further tiebreakers are needed.
  */
 StatisticExtInfo *
-choose_best_statistics(List *stats, char requiredkind,
+choose_best_statistics(List *stats, char requiredkind, bool inh,
 					   Bitmapset **clause_attnums, List **clause_exprs,
 					   int nclauses)
 {
@@ -1256,6 +1264,10 @@ choose_best_statistics(List *stats, char requiredkind,
 		if (info->kind != requiredkind)
 			continue;
 
+		/* skip statistics with mismatching inheritance flag */
+		if (info->inherit != inh)
+			continue;
+
 		/*
 		 * Collect attributes and expressions in remaining (unestimated)
 		 * clauses fully covered by this statistic object.
@@ -1694,6 +1706,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
@@ -1746,7 +1759,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		Bitmapset  *simple_clauses;
 
 		/* find the best suited statistics object for these attnums */
-		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
+		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
 									  list_length(clauses));
 
@@ -1835,7 +1848,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			MCVList    *mcv_list;
 
 			/* Load the MCV list stored in the statistics object */
-			mcv_list = statext_mcv_load(stat->statOid);
+			mcv_list = statext_mcv_load(stat->statOid, rte->inh);
 
 			/*
 			 * Compute the selectivity of the ORed list of clauses covered by
@@ -2406,7 +2419,7 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 35b39ece07..173f746e41 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups,
  *		Load the MCV list for the indicated pg_statistic_ext tuple.
  */
 MCVList *
-statext_mcv_load(Oid mvoid)
+statext_mcv_load(Oid mvoid, bool inh)
 {
 	MCVList    *result;
 	bool		isnull;
 	Datum		mcvlist;
-	HeapTuple	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	HeapTuple	htup = SearchSysCache2(STATEXTDATASTXOID,
+									   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
@@ -2040,11 +2041,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
 	MCVList    *mcv;
 	Selectivity s = 0.0;
 
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
 	/* match/mismatch bitmap for each MCV item */
 	bool	   *matches = NULL;
 
 	/* load the MCV list stored in the statistics object */
-	mcv = statext_mcv_load(stat->statOid);
+	mcv = statext_mcv_load(stat->statOid, rte->inh);
 
 	/* build a match bitmap for the clauses */
 	matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs,
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 4481312d61..ab1f10d6c0 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
  *		Load the ndistinct value for the indicated pg_statistic_ext tuple
  */
 MVNDistinct *
-statext_ndistinct_load(Oid mvoid)
+statext_ndistinct_load(Oid mvoid, bool inh)
 {
 	MVNDistinct *result;
 	bool		isnull;
 	Datum		ndist;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 0c8c05f6c2..2fcb0ca229 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3914,6 +3914,8 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
 
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
 		return false;
@@ -4003,7 +4005,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 
 	Assert(nmatches_vars + nmatches_exprs > 1);
 
-	stats = statext_ndistinct_load(statOid);
+	stats = statext_ndistinct_load(statOid, rte->inh);
 
 	/*
 	 * If we have a match, search it for the specific item that matches (there
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index d6cb78dea8..eabd74952f 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -740,11 +740,11 @@ static const struct cachedesc cacheinfo[] = {
 		32
 	},
 	{StatisticExtDataRelationId,	/* STATEXTDATASTXOID */
-		StatisticExtDataStxoidIndexId,
-		1,
+		StatisticExtDataStxoidInhIndexId,
+		2,
 		{
 			Anum_pg_statistic_ext_data_stxoid,
-			0,
+			Anum_pg_statistic_ext_data_stxdinherit,
 			0,
 			0
 		},
diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h
index 7b73b790d2..8ffd8b68cd 100644
--- a/src/include/catalog/pg_statistic_ext_data.h
+++ b/src/include/catalog/pg_statistic_ext_data.h
@@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId)
 {
 	Oid			stxoid BKI_LOOKUP(pg_statistic_ext);	/* statistics object
 														 * this data is for */
+	bool		stxdinherit;	/* true if inheritance children are included */
 
 #ifdef CATALOG_VARLEN			/* variable-length fields start here */
 
@@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data;
 
 DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431);
 
-DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops));
+DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops, stxdinherit bool_ops));
+
 
 #endif							/* PG_STATISTIC_EXT_DATA_H */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 2a53a6e344..884bda7232 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -934,6 +934,7 @@ typedef struct StatisticExtInfo
 	NodeTag		type;
 
 	Oid			statOid;		/* OID of the statistics row */
+	bool		inherit;		/* includes child relations */
 	RelOptInfo *rel;			/* back-link to statistic's table */
 	char		kind;			/* statistics kind of this entry */
 	Bitmapset  *keys;			/* attnums of the columns covered */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 326cf26fea..02ee41b9f3 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -94,11 +94,11 @@ typedef struct MCVList
 	MCVItem		items[FLEXIBLE_ARRAY_MEMBER];	/* array of MCV items */
 } MCVList;
 
-extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
-extern MVDependencies *statext_dependencies_load(Oid mvoid);
-extern MCVList *statext_mcv_load(Oid mvoid);
+extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh);
+extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh);
+extern MCVList *statext_mcv_load(Oid mvoid, bool inh);
 
-extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
+extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 									   int numrows, HeapTuple *rows,
 									   int natts, VacAttrStats **vacattrstats);
 extern int	ComputeExtStatisticsRows(Relation onerel,
@@ -121,6 +121,7 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
 												  bool is_or);
 extern bool has_stats_of_kind(List *stats, char requiredkind);
 extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
+												bool inh,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..8ab5187ccb 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2425,6 +2425,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname,
              JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames,
     pg_get_statisticsobjdef_expressions(s.oid) AS exprs,
     s.stxkind AS kinds,
+    sd.stxdinherit AS inherited,
     sd.stxdndistinct AS n_distinct,
     sd.stxddependencies AS dependencies,
     m.most_common_vals,
#3Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#2)
Re: extended stats on partitioned tables

On Sat, Sep 25, 2021 at 09:27:10PM +0200, Tomas Vondra wrote:

On 9/23/21 11:26 PM, Justin Pryzby wrote:

extended stats objects are allowed on partitioned tables since v10.
/messages/by-id/CAKJS1f-BmGo410bh5RSPZUvOO0LhmHL2NYmdrC_Jm8pk_FfyCA@mail.gmail.com
8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b

But since 859b3003de they're not populated - pg_statistic_ext(_data) is empty.
This was the consequence of a commit to avoid an error I reported with stats on
inheritence parents (not partitioned tables).

preceding 859b3003de, stats on the parent table *did* improve the estimate,
so this part of the commit message seems to have been wrong?
|commit 859b3003de87645b62ee07ef245d6c1f1cd0cedb
| Don't build extended statistics on inheritance trees
...
| Moreover, the current selectivity estimation code only works with individual
| relations, so building statistics on inheritance trees would be pointless
| anyway.

|CREATE TABLE p (i int, a int, b int) PARTITION BY RANGE (i);
|CREATE TABLE pd PARTITION OF p FOR VALUES FROM (1)TO(100);
|TRUNCATE p; INSERT INTO p SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
|CREATE STATISTICS pp ON (a),(b) FROM p;
|VACUUM ANALYZE p;
|SELECT * FROM pg_statistic_ext WHERE stxrelid ='p'::regclass;

|postgres=# begin; DROP STATISTICS pp; explain analyze SELECT a,b FROM p GROUP BY 1,2; abort;
| HashAggregate (cost=20.98..21.98 rows=100 width=8) (actual time=1.088..1.093 rows=10 loops=1)

|postgres=# explain analyze SELECT a,b FROM p GROUP BY 1,2;
| HashAggregate (cost=20.98..21.09 rows=10 width=8) (actual time=1.082..1.086 rows=10 loops=1)

So I think this is a regression, and extended stats should be populated for
partitioned tables - I had actually done that for some parent tables and hadn't
noticed that the stats objects no longer do anything.

...

Agreed, that seems like a regression, but I don't see how to fix that
without having the extra flag in the catalog. Otherwise we can store just
one version for each statistics object :-(

Do you think it's possible to backpatch a fix to handle partitioned tables
specifically ?

The "tuple already updated" error which I reported and which was fixed by
859b3003 involved inheritence children. Since partitioned tables have no data
themselves, the !inh check could be relaxed. It's not totally clear to me if
the correct statistics would be used in that case. I suppose the wrong
(inherited) stats would be wrongly applied affect queries FROM ONLY a
partitioned table, which seems pointless to write and also hard for the
estimates to be far off :)

Attached is a PoC that I quickly bashed together today. It's pretty raw, but
it passed "make check" and I think it does most of the things right. Can you
try if this fixes the estimates with partitioned tables?

I think pg_stats_ext_exprs also needs to expose the inherited flag.

Thanks,
--
Justin

#4Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Justin Pryzby (#3)
1 attachment(s)
Re: extended stats on partitioned tables

On 9/25/21 9:53 PM, Justin Pryzby wrote:

On Sat, Sep 25, 2021 at 09:27:10PM +0200, Tomas Vondra wrote:

On 9/23/21 11:26 PM, Justin Pryzby wrote:

extended stats objects are allowed on partitioned tables since v10.
/messages/by-id/CAKJS1f-BmGo410bh5RSPZUvOO0LhmHL2NYmdrC_Jm8pk_FfyCA@mail.gmail.com
8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b

But since 859b3003de they're not populated - pg_statistic_ext(_data) is empty.
This was the consequence of a commit to avoid an error I reported with stats on
inheritence parents (not partitioned tables).

preceding 859b3003de, stats on the parent table *did* improve the estimate,
so this part of the commit message seems to have been wrong?
|commit 859b3003de87645b62ee07ef245d6c1f1cd0cedb
| Don't build extended statistics on inheritance trees
...
| Moreover, the current selectivity estimation code only works with individual
| relations, so building statistics on inheritance trees would be pointless
| anyway.

|CREATE TABLE p (i int, a int, b int) PARTITION BY RANGE (i);
|CREATE TABLE pd PARTITION OF p FOR VALUES FROM (1)TO(100);
|TRUNCATE p; INSERT INTO p SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
|CREATE STATISTICS pp ON (a),(b) FROM p;
|VACUUM ANALYZE p;
|SELECT * FROM pg_statistic_ext WHERE stxrelid ='p'::regclass;

|postgres=# begin; DROP STATISTICS pp; explain analyze SELECT a,b FROM p GROUP BY 1,2; abort;
| HashAggregate (cost=20.98..21.98 rows=100 width=8) (actual time=1.088..1.093 rows=10 loops=1)

|postgres=# explain analyze SELECT a,b FROM p GROUP BY 1,2;
| HashAggregate (cost=20.98..21.09 rows=10 width=8) (actual time=1.082..1.086 rows=10 loops=1)

So I think this is a regression, and extended stats should be populated for
partitioned tables - I had actually done that for some parent tables and hadn't
noticed that the stats objects no longer do anything.

...

Agreed, that seems like a regression, but I don't see how to fix that
without having the extra flag in the catalog. Otherwise we can store just
one version for each statistics object :-(

Do you think it's possible to backpatch a fix to handle partitioned tables
specifically ?

The "tuple already updated" error which I reported and which was fixed by
859b3003 involved inheritence children. Since partitioned tables have no data
themselves, the !inh check could be relaxed. It's not totally clear to me if
the correct statistics would be used in that case. I suppose the wrong
(inherited) stats would be wrongly applied affect queries FROM ONLY a
partitioned table, which seems pointless to write and also hard for the
estimates to be far off :)

Hmmm, maybe. To prevent the "tuple concurrently updated" we must ensure
we never build stats with and without inheritance at the same time (for
the same rel). The 859b3003de ensures that by only building extended
stats in the (!inh) case, but we might tweak that based on relkind. See
the attached patch. But I wonder if there are cases that might be hurt
by this - that'd be a regression too, of course.

Attached is a PoC that I quickly bashed together today. It's pretty raw, but
it passed "make check" and I think it does most of the things right. Can you
try if this fixes the estimates with partitioned tables?

I think pg_stats_ext_exprs also needs to expose the inherited flag.

Yeah, I only did the bare minimum to get the PoC working. I'm sure there
are various other loose ends.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

inheritance-fix-simple.patchtext/x-patch; charset=UTF-8; name=inheritance-fix-simple.patchDownload
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8bfb2ad958..299f4893b8 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -548,6 +548,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
+		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -611,13 +612,15 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
+		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
+
 		/*
 		 * Build extended statistics (if there are any).
 		 *
 		 * For now we only build extended statistics on individual relations,
 		 * not for relations representing inheritance trees.
 		 */
-		if (!inh)
+		if (build_ext_stats)
 			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
 									   attr_cnt, vacattrstats);
 	}
#5Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#4)
Re: extended stats on partitioned tables

On Sat, Sep 25, 2021 at 11:01:21PM +0200, Tomas Vondra wrote:

Do you think it's possible to backpatch a fix to handle partitioned tables
specifically ?

The "tuple already updated" error which I reported and which was fixed by
859b3003 involved inheritence children. Since partitioned tables have no data
themselves, the !inh check could be relaxed. It's not totally clear to me if
the correct statistics would be used in that case. I suppose the wrong
(inherited) stats would be wrongly applied affect queries FROM ONLY a
partitioned table, which seems pointless to write and also hard for the
estimates to be far off :)

Hmmm, maybe. To prevent the "tuple concurrently updated" we must ensure we
never build stats with and without inheritance at the same time (for the
same rel). The 859b3003de ensures that by only building extended stats in
the (!inh) case, but we might tweak that based on relkind. See the attached
patch. But I wonder if there are cases that might be hurt by this - that'd
be a regression too, of course.

I think we should leave the inheritance case alone, since it hasn't changed in
2 years, and building stats on the table ONLY is a legitimate interpretation,
and it's as good as we can do without the catalog change.

But the partitioned case used to work, and there's no utility in selecting FROM
ONLY a partitioned table, so we might as well build the stats including its
partitions.

I don't think anything would get worse for the partitioned case.
Obviously building inherited ext stats could change plans - that's the point.
It's weird that the stats objects which existed for 18 months before being
"built" after the patch was applied, but no so weird that the release notes
wouldn't be ample documentation.

If building statistics caused the plan to change undesirably, the solution
would be to drop the stats object, of course.

+ build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);

It's weird to build inherited extended stats for partitioned tables but not for
inheritence parents. We could be clever and say "build inherited ext stats for
inheritence parents only if we didn't insert any stats for the table itself
(because it's empty)". But I think that's fragile: a single tuple in the
parent table could cause stats to be built there instead of on its heirarchy,
and the extended stats would be used for *both* FROM and FROM ONLY, which is an
awful combination.

Since do_analyze_rel is only called once for partitioned tables, I think you
could write that as:

/* Do not build inherited stats (since the catalog cannot support it) except
* for partitioned tables, for which numrows==0 and have no non-inherited stats */
build_ext_stats = !inh || onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE;

--
Justin

#6Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#2)
Re: extended stats on partitioned tables

(Resending with -hackers)

It seems like your patch should also check "inh" in examine_variable and
statext_expressions_load.

Which leads to another issue in stable branches:

ANALYZE builds only non-inherited stats, but they're incorrectly used for
inherited queries - the rowcount estimate is worse on inheritence parents with
extended stats than without.

CREATE TABLE p(i int, j int);
CREATE TABLE p1() INHERITS(p);
INSERT INTO p SELECT a, a/10 FROM generate_series(1,9)a;
INSERT INTO p1 SELECT a, a FROM generate_series(1,999)a;
CREATE STATISTICS ps ON i,j FROM p;
VACUUM ANALYZE p,p1;

postgres=# explain analyze SELECT * FROM p GROUP BY 1,2;
HashAggregate (cost=26.16..26.25 rows=9 width=8) (actual time=2.571..3.282 rows=1008 loops=1)

postgres=# begin; DROP STATISTICS ps; explain analyze SELECT * FROM p GROUP BY 1,2; rollback;
HashAggregate (cost=26.16..36.16 rows=1000 width=8) (actual time=2.167..2.872 rows=1008 loops=1)

I guess examine_variable() should have corresponding logic to the hardcoded
!inh in analyze.c.

--
Justin

#7Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Justin Pryzby (#5)
Re: extended stats on partitioned tables

On 9/25/21 11:46 PM, Justin Pryzby wrote:

On Sat, Sep 25, 2021 at 11:01:21PM +0200, Tomas Vondra wrote:

Do you think it's possible to backpatch a fix to handle partitioned tables
specifically ?

The "tuple already updated" error which I reported and which was fixed by
859b3003 involved inheritence children. Since partitioned tables have no data
themselves, the !inh check could be relaxed. It's not totally clear to me if
the correct statistics would be used in that case. I suppose the wrong
(inherited) stats would be wrongly applied affect queries FROM ONLY a
partitioned table, which seems pointless to write and also hard for the
estimates to be far off :)

Hmmm, maybe. To prevent the "tuple concurrently updated" we must ensure we
never build stats with and without inheritance at the same time (for the
same rel). The 859b3003de ensures that by only building extended stats in
the (!inh) case, but we might tweak that based on relkind. See the attached
patch. But I wonder if there are cases that might be hurt by this - that'd
be a regression too, of course.

I think we should leave the inheritance case alone, since it hasn't changed in
2 years, and building stats on the table ONLY is a legitimate interpretation,
and it's as good as we can do without the catalog change.

But the partitioned case used to work, and there's no utility in selecting FROM
ONLY a partitioned table, so we might as well build the stats including its
partitions.

I don't think anything would get worse for the partitioned case.
Obviously building inherited ext stats could change plans - that's the point.
It's weird that the stats objects which existed for 18 months before being
"built" after the patch was applied, but no so weird that the release notes
wouldn't be ample documentation.

Agreed.

If building statistics caused the plan to change undesirably, the solution
would be to drop the stats object, of course.

+ build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);

It's weird to build inherited extended stats for partitioned tables but not for
inheritence parents. We could be clever and say "build inherited ext stats for
inheritence parents only if we didn't insert any stats for the table itself
(because it's empty)". But I think that's fragile: a single tuple in the
parent table could cause stats to be built there instead of on its heirarchy,
and the extended stats would be used for *both* FROM and FROM ONLY, which is an
awful combination.

I don't think there's a good way to check if there are any rows in the
parent relation. And even then, a single row might cause huge changes to
query plans (essentially switching to very different stats).

Since do_analyze_rel is only called once for partitioned tables, I think you
could write that as:

/* Do not build inherited stats (since the catalog cannot support it) except
* for partitioned tables, for which numrows==0 and have no non-inherited stats */
build_ext_stats = !inh || onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE;

Good point.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#6)
5 attachment(s)
Re: extended stats on partitioned tables

On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:

It seems like your patch should also check "inh" in examine_variable and
statext_expressions_load.

I tried adding that - I mostly kept my patches separate.
Hopefully this is more helpful than a complication.
I added at: https://commitfest.postgresql.org/35/3332/

+       /* create only the "stxdinherit=false", because that always exists */
+       datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);

That'd be confusing for partitioned tables, no?
They'd always have an row with no data.
I guess it could be stxdinherit = BoolGetDatum(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE).
(not ObjectIdGetDatum).
Then, that affects the loops which delete the tuples - neither inh nor !inh is
guaranteed, unless you check relkind there, too.

BTW, you'd need to add an "inherited" column to \dX if you added the "built"
data back.

Also, I think in backbranches we should document what's being stored in
pg_statistic_ext, since it's pretty unintuitive:
- noninherted stats (FROM ONLY) for inheritence parents;
- inherted stats (FROM *) for partitioned tables;

I think the !inh decision in 859b3003de was basically backwards.
I think it'd be rare for someone to put extended stats on a parent for
improving plans involving FROM ONLY.

But it's not worth trying to fix now, since it would change plans in
irreversible ways. Also, if the stx data were already populated, users would
have to run a manual analyze after upgrading to populate the catalog with the
data the planner would expect in the new version, or else it would end up being
the opposite of the issue I mentioned: non-inherited stats (from before the
upgrade) would be applied by the planner (after the upgrade) to inherited
queries.

Attachments:

0004-f-check-inh.patchtext/x-diff; charset=us-asciiDownload
From 6b3c80fc37d0c034cc2072921da756728e3e3c3e Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 18:20:03 -0500
Subject: [PATCH 4/5] f! check inh

statext_expressions_load
examine_variable
estimate_multivariate_ndistinct

TODO: pg_stats_ext_exprs needs to expose inh flag
---
 src/backend/statistics/dependencies.c   |  5 -----
 src/backend/statistics/extended_stats.c |  9 ++-------
 src/backend/utils/adt/selfuncs.c        | 27 ++++++++-----------------
 src/include/statistics/statistics.h     |  2 +-
 src/test/regress/expected/stats_ext.out | 11 ++++++++--
 src/test/regress/sql/stats_ext.sql      |  4 +++-
 6 files changed, 23 insertions(+), 35 deletions(-)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 835f4bdf7a..02cf0efc66 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1596,11 +1596,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		int			nexprs;
 		int			k;
 		MVDependencies *deps;
-		RangeTblEntry	*rte = root->simple_rte_array[rel->relid];
-
-		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-		if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-			break;
 
 		/* skip statistics that are not of the correct type */
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 5a43d7e39a..231b153c15 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1706,7 +1706,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
-	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
@@ -1759,10 +1758,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		Bitmapset  *simple_clauses;
 		RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
-		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-		if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-			break;
-
 		/* find the best suited statistics object for these attnums */
 		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
@@ -2414,7 +2409,7 @@ serialize_expr_stats(AnlExprData *exprdata, int nexprs)
  * identified by the supplied index.
  */
 HeapTuple
-statext_expressions_load(Oid stxoid, int idx)
+statext_expressions_load(Oid stxoid, bool inh, int idx)
 {
 	bool		isnull;
 	Datum		value;
@@ -2424,7 +2419,7 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false));
+	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), inh);
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index ede393115d..8a6bc29636 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3915,10 +3915,6 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	StatisticExtInfo *matched_info = NULL;
 	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
 
-	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return false;
-
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
 		return false;
@@ -5237,13 +5233,6 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
-			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-			{
-				RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
-				if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-					break;
-			}
-
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
@@ -5262,22 +5251,22 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				/* found a match, see if we can extract pg_statistic row */
 				if (equal(node, expr))
 				{
-					HeapTuple	t = statext_expressions_load(info->statOid, pos);
-
-					/* Get statistic object's table for permission check */
-					RangeTblEntry *rte;
+					RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
 					Oid			userid;
+					bool		inh;
 
-					vardata->statsTuple = t;
+					Assert(rte->rtekind == RTE_RELATION);
 
 					/*
 					 * XXX Not sure if we should cache the tuple somewhere.
 					 * Now we just create a new copy every time.
 					 */
-					vardata->freefunc = ReleaseDummy;
+					inh = root->append_rel_array == NULL ? false :
+						root->append_rel_array[onerel->relid]->parent_relid != 0;
+					vardata->statsTuple =
+						statext_expressions_load(info->statOid, inh, pos);
 
-					rte = planner_rt_fetch(onerel->relid, root);
-					Assert(rte->rtekind == RTE_RELATION);
+					vardata->freefunc = ReleaseDummy;
 
 					/*
 					 * Use checkAsUser if it's set, in case we're accessing
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 02ee41b9f3..3868e43f8a 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -125,6 +125,6 @@ extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
-extern HeapTuple statext_expressions_load(Oid stxoid, int idx);
+extern HeapTuple statext_expressions_load(Oid stxoid, bool inh, int idx);
 
 #endif							/* STATISTICS_H */
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 67234b9fc2..35edc6a361 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,7 +176,6 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
--- Ensure non-inherited stats are not applied to inherited query
 CREATE TABLE stxdinh(i int, j int);
 CREATE TABLE stxdinh1() INHERITS(stxdinh);
 INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
@@ -191,11 +190,19 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
 -- Since the stats object does not include inherited stats, it should not affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
  estimated | actual 
 -----------+--------
-      1000 |   1008
+      1008 |   1008
+(1 row)
+
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+         9 |      9
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 2371043ca1..8490da9558 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,7 +112,6 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
--- Ensure non-inherited stats are not applied to inherited query
 CREATE TABLE stxdinh(i int, j int);
 CREATE TABLE stxdinh1() INHERITS(stxdinh);
 INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
@@ -122,8 +121,11 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
 -- Since the stats object does not include inherited stats, it should not affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
 -- Ensure inherited stats ARE applied to inherited query in partitioned table
-- 
2.17.0

0005-Refactor-parent-ACL-check.patchtext/x-diff; charset=us-asciiDownload
From 33916b89dcb92acdc66b8c4febfe87d43247e5dd Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 18:58:33 -0500
Subject: [PATCH 5/5] Refactor parent ACL check

selfuncs.c is 8k lines long, and this makes it 30 LOC shorter.
---
 src/backend/utils/adt/selfuncs.c | 140 ++++++++++++-------------------
 1 file changed, 52 insertions(+), 88 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 8a6bc29636..949726a861 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -187,6 +187,8 @@ static char *convert_string_datum(Datum value, Oid typid, Oid collid,
 								  bool *failure);
 static double convert_timevalue_to_scalar(Datum value, Oid typid,
 										  bool *failure);
+static void recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata,
+								Oid relid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 									VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
@@ -5152,51 +5154,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 									(pg_class_aclcheck(rte->relid, userid,
 													   ACL_SELECT) == ACLCHECK_OK);
 
-								/*
-								 * If the user doesn't have permissions to
-								 * access an inheritance child relation, check
-								 * the permissions of the table actually
-								 * mentioned in the query, since most likely
-								 * the user does have that permission.  Note
-								 * that whole-table select privilege on the
-								 * parent doesn't quite guarantee that the
-								 * user could read all columns of the child.
-								 * But in practice it's unlikely that any
-								 * interesting security violation could result
-								 * from allowing access to the expression
-								 * index's stats, so we allow it anyway.  See
-								 * similar code in examine_simple_variable()
-								 * for additional comments.
-								 */
-								if (!vardata->acl_ok &&
-									root->append_rel_array != NULL)
-								{
-									AppendRelInfo *appinfo;
-									Index		varno = index->rel->relid;
-
-									appinfo = root->append_rel_array[varno];
-									while (appinfo &&
-										   planner_rt_fetch(appinfo->parent_relid,
-															root)->rtekind == RTE_RELATION)
-									{
-										varno = appinfo->parent_relid;
-										appinfo = root->append_rel_array[varno];
-									}
-									if (varno != index->rel->relid)
-									{
-										/* Repeat access check on this rel */
-										rte = planner_rt_fetch(varno, root);
-										Assert(rte->rtekind == RTE_RELATION);
-
-										userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-										vardata->acl_ok =
-											rte->securityQuals == NIL &&
-											(pg_class_aclcheck(rte->relid,
-															   userid,
-															   ACL_SELECT) == ACLCHECK_OK);
-									}
-								}
+								recheck_parent_acl(root, vardata, index->rel->relid);
 							}
 							else
 							{
@@ -5287,49 +5245,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 						(pg_class_aclcheck(rte->relid, userid,
 										   ACL_SELECT) == ACLCHECK_OK);
 
-					/*
-					 * If the user doesn't have permissions to access an
-					 * inheritance child relation, check the permissions of
-					 * the table actually mentioned in the query, since most
-					 * likely the user does have that permission.  Note that
-					 * whole-table select privilege on the parent doesn't
-					 * quite guarantee that the user could read all columns of
-					 * the child. But in practice it's unlikely that any
-					 * interesting security violation could result from
-					 * allowing access to the expression stats, so we allow it
-					 * anyway.  See similar code in examine_simple_variable()
-					 * for additional comments.
-					 */
-					if (!vardata->acl_ok &&
-						root->append_rel_array != NULL)
-					{
-						AppendRelInfo *appinfo;
-						Index		varno = onerel->relid;
-
-						appinfo = root->append_rel_array[varno];
-						while (appinfo &&
-							   planner_rt_fetch(appinfo->parent_relid,
-												root)->rtekind == RTE_RELATION)
-						{
-							varno = appinfo->parent_relid;
-							appinfo = root->append_rel_array[varno];
-						}
-						if (varno != onerel->relid)
-						{
-							/* Repeat access check on this rel */
-							rte = planner_rt_fetch(varno, root);
-							Assert(rte->rtekind == RTE_RELATION);
-
-							userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-							vardata->acl_ok =
-								rte->securityQuals == NIL &&
-								(pg_class_aclcheck(rte->relid,
-												   userid,
-												   ACL_SELECT) == ACLCHECK_OK);
-						}
-					}
-
+					recheck_parent_acl(root, vardata, onerel->relid);
 					break;
 				}
 
@@ -5339,6 +5255,54 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 	}
 }
 
+/*
+ * If the user doesn't have permissions to access an inheritance child
+ * relation, check the permissions of the table actually mentioned in the
+ * query, since most likely the user does have that permission.  Note that
+ * whole-table select privilege on the parent doesn't quite guarantee that the
+ * user could read all columns of the child.  But in practice it's unlikely
+ * that any interesting security violation could result from allowing access to
+ * the expression stats, so we allow it anyway.  See similar code in
+ * examine_simple_variable() for additional comments.
+ */
+static void
+recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, Oid relid)
+{
+	RangeTblEntry	*rte;
+	Oid		userid;
+
+	if (!vardata->acl_ok &&
+		root->append_rel_array != NULL)
+	{
+		AppendRelInfo *appinfo;
+		Index		varno = relid;
+
+		appinfo = root->append_rel_array[varno];
+		while (appinfo &&
+			   planner_rt_fetch(appinfo->parent_relid,
+								root)->rtekind == RTE_RELATION)
+		{
+			varno = appinfo->parent_relid;
+			appinfo = root->append_rel_array[varno];
+		}
+
+		if (varno != relid)
+		{
+			/* Repeat access check on this rel */
+			rte = planner_rt_fetch(varno, root);
+			Assert(rte->rtekind == RTE_RELATION);
+
+			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+
+			vardata->acl_ok =
+				rte->securityQuals == NIL &&
+				(pg_class_aclcheck(rte->relid,
+								   userid,
+								   ACL_SELECT) == ACLCHECK_OK);
+		}
+	}
+}
+
 /*
  * examine_simple_variable
  *		Handle a simple Var for examine_variable
-- 
2.17.0

0001-Do-not-use-extended-statistics-on-inheritence-trees.patchtext/x-diff; charset=us-asciiDownload
From b8f9e453b6cd05718a7a149f7e472d6c2c28c8a6 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 19:42:41 -0500
Subject: [PATCH 1/5] Do not use extended statistics on inheritence trees..

Since 859b3003de, inherited ext stats are not built.
However, the non-inherited stats stats were incorrectly used during planning of
queries with inheritence heirarchies.

Since the ext stats do not include child tables, they can lead to worse
estimates.

choose_best_statistics is handled a bit differently (in the calling function),
because it isn't passed rel nor rel->inh, and it's an exported function, so
avoid changing its signature in back branches.

https://www.postgresql.org/message-id/flat/20210925223152.GA7877@telsasoft.com

Backpatch to v10
---
 src/backend/statistics/dependencies.c   |  5 +++++
 src/backend/statistics/extended_stats.c |  5 +++++
 src/backend/utils/adt/selfuncs.c        |  9 +++++++++
 src/test/regress/expected/stats_ext.out | 23 +++++++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 14 ++++++++++++++
 5 files changed, 56 insertions(+)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..b2e33329c7 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1593,6 +1593,11 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		int			nexprs;
 		int			k;
 		MVDependencies *deps;
+		RangeTblEntry	*rte = root->simple_rte_array[rel->relid];
+
+		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+		if (rte->inh)
+			break;
 
 		/* skip statistics that are not of the correct type */
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 0a7d12d467..eea38a5bc7 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1744,6 +1744,11 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		StatisticExtInfo *stat;
 		List	   *stat_clauses;
 		Bitmapset  *simple_clauses;
+		RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+		if (rte->inh)
+			break;
 
 		/* find the best suited statistics object for these attnums */
 		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index abcb628a39..7533574fdc 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,6 +3913,11 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
+	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return false;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
@@ -5232,6 +5237,10 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
+			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+			if (planner_rt_fetch(onerel->relid, root)->inh)
+				break;
+
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index c60ba45aba..5c15e44bd6 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,6 +176,29 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
+-- Ensure non-inherited stats are not applied to inherited query
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+DROP TABLE stxdinh, stxdinh1;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 6fb37962a7..610f7ed17f 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,6 +112,20 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
+-- Ensure non-inherited stats are not applied to inherited query
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+DROP TABLE stxdinh, stxdinh1;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.17.0

0002-Build-inherited-extended-stats-on-partitioned-tables.patchtext/x-diff; charset=us-asciiDownload
From 1c35e88903222e8f1624babd3900a499fdfee2f2 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Sat, 25 Sep 2021 23:01:21 +0200
Subject: [PATCH 2/5] Build inherited extended stats on partitioned tables

Since 859b3003de, ext stats on partitioned tables are not built, which is a
regression.  For back branches, pg_statistic_ext cannot support both inherited
(FROM) and non-inherited (FROM ONLY) stats on inheritence heirarchies.
But there's no issue building inherited stats for partitioned tables, which
are empty, so cannot have non-inherited stats.

See also: 8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b

https://www.postgresql.org/message-id/20210923212624.GI831%40telsasoft.com

Backpatch to v10
---
 src/backend/commands/analyze.c          |  5 ++++-
 src/backend/statistics/dependencies.c   |  2 +-
 src/backend/statistics/extended_stats.c |  2 +-
 src/backend/utils/adt/selfuncs.c        |  9 ++++++---
 src/test/regress/expected/stats_ext.out | 19 +++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 10 ++++++++++
 6 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8bfb2ad958..299f4893b8 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -548,6 +548,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
+		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -611,13 +612,15 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
+		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
+
 		/*
 		 * Build extended statistics (if there are any).
 		 *
 		 * For now we only build extended statistics on individual relations,
 		 * not for relations representing inheritance trees.
 		 */
-		if (!inh)
+		if (build_ext_stats)
 			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
 									   attr_cnt, vacattrstats);
 	}
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index b2e33329c7..0659307b02 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1596,7 +1596,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		RangeTblEntry	*rte = root->simple_rte_array[rel->relid];
 
 		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-		if (rte->inh)
+		if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 			break;
 
 		/* skip statistics that are not of the correct type */
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index eea38a5bc7..b40ad9da2b 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1747,7 +1747,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-		if (rte->inh)
+		if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 			break;
 
 		/* find the best suited statistics object for these attnums */
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 7533574fdc..d782605953 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3916,7 +3916,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
 
 	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return false;
 
 	/* bail out immediately if the table has no extended statistics */
@@ -5238,8 +5238,11 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				break;
 
 			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-			if (planner_rt_fetch(onerel->relid, root)->inh)
-				break;
+			{
+				RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
+				if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
+					break;
+			}
 
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 5c15e44bd6..67234b9fc2 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -199,6 +199,25 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+ ?column? 
+----------
+        1
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+        10 |     10
+(1 row)
+
+DROP TABLE stxdinp;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 610f7ed17f..2371043ca1 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -126,6 +126,16 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+DROP TABLE stxdinp;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.17.0

0003-Add-stxdinherit-build-inherited-extended-stats-on-in.patchtext/x-diff; charset=us-asciiDownload
From fbb3a640c6416a19d0851c7a18ec0e9190e1c1ed Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Sat, 25 Sep 2021 21:27:10 +0200
Subject: [PATCH 3/5] Add stxdinherit; build inherited extended stats on
 inheritence parents

pg_statistic has an inherited flag which is part of the unique index, but
pg_statistic has never had that.  In back branches, pg_statistic stores the
cannot store both inherited and non-inherited stats.  So it stores
non-inherited stats (FROM ONLY) for inheritence parents and inherited stats for
partitioned tables.

This patch allows storing both inherited and non-inherited stats for non-empty
inheritence parents, and avoids the above, confusing definition.
---
 doc/src/sgml/catalogs.sgml                  |  23 +++
 src/backend/catalog/system_views.sql        |   1 +
 src/backend/commands/analyze.c              |  15 +-
 src/backend/commands/statscmds.c            |  20 ++-
 src/backend/optimizer/util/plancat.c        | 186 +++++++++++---------
 src/backend/statistics/dependencies.c       |  13 +-
 src/backend/statistics/extended_stats.c     |  67 ++++---
 src/backend/statistics/mcv.c                |   9 +-
 src/backend/statistics/mvdistinct.c         |   5 +-
 src/backend/utils/adt/selfuncs.c            |   2 +-
 src/backend/utils/cache/syscache.c          |   6 +-
 src/include/catalog/pg_statistic_ext_data.h |   4 +-
 src/include/nodes/pathnodes.h               |   1 +
 src/include/statistics/statistics.h         |   9 +-
 src/test/regress/expected/rules.out         |   1 +
 15 files changed, 217 insertions(+), 145 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2f0def9b19..e256a5533c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7441,6 +7441,19 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
    created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>.
   </para>
 
+  <para>
+   Normally there is one entry, with <structfield>stxdinherit</structfield> =
+   <literal>false</literal>, for each statistics object that has been analyzed.
+   If the table has inheritance children, a second entry with
+   <structfield>stxdinherit</structfield> = <literal>true</literal> is also created.
+   This row represents the statistics object over the inheritance tree, i.e.,
+   statistics for the data you'd see with
+   <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>,
+   whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row
+   represents the results of
+   <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>.
+  </para>
+
   <para>
    Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>,
    <structname>pg_statistic_ext_data</structname> should not be
@@ -7480,6 +7493,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stxdinherit</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the stats include inheritance child columns, not just the
+       values in the specified relation
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..07ab18dc52 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS
            ) AS attnames,
            pg_get_statisticsobjdef_expressions(s.oid) as exprs,
            s.stxkind AS kinds,
+           sd.stxdinherit AS inherited,
            sd.stxdndistinct AS n_distinct,
            sd.stxddependencies AS dependencies,
            m.most_common_vals,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 299f4893b8..7f4b0f5320 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -548,7 +548,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
-		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -612,17 +611,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
-		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
-
-		/*
-		 * Build extended statistics (if there are any).
-		 *
-		 * For now we only build extended statistics on individual relations,
-		 * not for relations representing inheritance trees.
-		 */
-		if (build_ext_stats)
-			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
-									   attr_cnt, vacattrstats);
+		/* Build extended statistics (if there are any). */
+		BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows,
+								   attr_cnt, vacattrstats);
 	}
 
 	pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index e9e2382ceb..3a9f416f39 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -524,6 +524,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
 
+	/* create only the "stxdinherit=false", because that always exists */
+	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);
+
 	/* no statistics built yet */
 	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
 	datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
@@ -726,6 +729,7 @@ RemoveStatisticsById(Oid statsOid)
 	HeapTuple	tup;
 	Form_pg_statistic_ext statext;
 	Oid			relid;
+	int			inh;
 
 	/*
 	 * First delete the pg_statistic_ext_data tuple holding the actual
@@ -733,14 +737,20 @@ RemoveStatisticsById(Oid statsOid)
 	 */
 	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
-	tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
+	/* hack to delete both stxdinherit = true/false */
+	for (inh = 0; inh <= 1; inh++)
+	{
+		tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+							  BoolGetDatum(inh));
 
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+		if (!HeapTupleIsValid(tup)) /* should not happen */
+			// elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+			continue;
 
-	CatalogTupleDelete(relation, &tup->t_self);
+		CatalogTupleDelete(relation, &tup->t_self);
 
-	ReleaseSysCache(tup);
+		ReleaseSysCache(tup);
+	}
 
 	table_close(relation, RowExclusiveLock);
 
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index c5194fdbbf..154d48a330 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
+#include "catalog/pg_statistic_ext_data.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -1311,127 +1312,144 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 	{
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
+		Form_pg_statistic_ext_data dataForm;
 		HeapTuple	htup;
 		HeapTuple	dtup;
 		Bitmapset  *keys = NULL;
 		List	   *exprs = NIL;
 		int			i;
+		int			inh;
 
 		htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
 		if (!HeapTupleIsValid(htup))
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-		if (!HeapTupleIsValid(dtup))
-			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
-		/*
-		 * First, build the array of columns covered.  This is ultimately
-		 * wasted if no stats within the object have actually been built, but
-		 * it doesn't seem worth troubling over that case.
-		 */
-		for (i = 0; i < staForm->stxkeys.dim1; i++)
-			keys = bms_add_member(keys, staForm->stxkeys.values[i]);
-
 		/*
-		 * Preprocess expressions (if any). We read the expressions, run them
-		 * through eval_const_expressions, and fix the varnos.
+		 * Hack to load stats with stxdinherit true/false - there should be
+		 * a better way to do this, I guess.
 		 */
+		for (inh = 0; inh <= 1; inh++)
 		{
-			bool		isnull;
-			Datum		datum;
+			dtup = SearchSysCache2(STATEXTDATASTXOID,
+								   ObjectIdGetDatum(statOid), BoolGetDatum((bool) inh));
+			if (!HeapTupleIsValid(dtup))
+				continue;
 
-			/* decode expression (if any) */
-			datum = SysCacheGetAttr(STATEXTOID, htup,
-									Anum_pg_statistic_ext_stxexprs, &isnull);
+			dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
 
-			if (!isnull)
+			/*
+			 * First, build the array of columns covered.  This is ultimately
+			 * wasted if no stats within the object have actually been built, but
+			 * it doesn't seem worth troubling over that case.
+			 */
+			for (i = 0; i < staForm->stxkeys.dim1; i++)
+				keys = bms_add_member(keys, staForm->stxkeys.values[i]);
+
+			/*
+			 * Preprocess expressions (if any). We read the expressions, run them
+			 * through eval_const_expressions, and fix the varnos.
+			 */
 			{
-				char	   *exprsString;
+				bool		isnull;
+				Datum		datum;
 
-				exprsString = TextDatumGetCString(datum);
-				exprs = (List *) stringToNode(exprsString);
-				pfree(exprsString);
+				/* decode expression (if any) */
+				datum = SysCacheGetAttr(STATEXTOID, htup,
+										Anum_pg_statistic_ext_stxexprs, &isnull);
 
-				/*
-				 * Run the expressions through eval_const_expressions. This is
-				 * not just an optimization, but is necessary, because the
-				 * planner will be comparing them to similarly-processed qual
-				 * clauses, and may fail to detect valid matches without this.
-				 * We must not use canonicalize_qual, however, since these
-				 * aren't qual expressions.
-				 */
-				exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+				if (!isnull)
+				{
+					char	   *exprsString;
 
-				/* May as well fix opfuncids too */
-				fix_opfuncids((Node *) exprs);
+					exprsString = TextDatumGetCString(datum);
+					exprs = (List *) stringToNode(exprsString);
+					pfree(exprsString);
 
-				/*
-				 * Modify the copies we obtain from the relcache to have the
-				 * correct varno for the parent relation, so that they match
-				 * up correctly against qual clauses.
-				 */
-				if (varno != 1)
-					ChangeVarNodes((Node *) exprs, 1, varno, 0);
+					/*
+					 * Run the expressions through eval_const_expressions. This is
+					 * not just an optimization, but is necessary, because the
+					 * planner will be comparing them to similarly-processed qual
+					 * clauses, and may fail to detect valid matches without this.
+					 * We must not use canonicalize_qual, however, since these
+					 * aren't qual expressions.
+					 */
+					exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+
+					/* May as well fix opfuncids too */
+					fix_opfuncids((Node *) exprs);
+
+					/*
+					 * Modify the copies we obtain from the relcache to have the
+					 * correct varno for the parent relation, so that they match
+					 * up correctly against qual clauses.
+					 */
+					if (varno != 1)
+						ChangeVarNodes((Node *) exprs, 1, varno, 0);
+				}
 			}
-		}
 
-		/* add one StatisticExtInfo for each kind built */
-		if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			/* add one StatisticExtInfo for each kind built */
+			if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_NDISTINCT;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_NDISTINCT;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_DEPENDENCIES;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_DEPENDENCIES;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_MCV;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_MCV;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_EXPRESSIONS;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_EXPRESSIONS;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
+
+				stainfos = lappend(stainfos, info);
+			}
 
-			stainfos = lappend(stainfos, info);
+			ReleaseSysCache(dtup);
 		}
 
 		ReleaseSysCache(htup);
-		ReleaseSysCache(dtup);
 		bms_free(keys);
 	}
 
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 0659307b02..835f4bdf7a 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -618,14 +618,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums)
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
 MVDependencies *
-statext_dependencies_load(Oid mvoid)
+statext_dependencies_load(Oid mvoid, bool inh)
 {
 	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid),
+						   BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
@@ -1410,6 +1412,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* unique expressions */
 	Node	  **unique_exprs;
@@ -1603,6 +1606,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
 			continue;
 
+		/* skip statistics with mismatching stxdinherit value */
+		if (stat->inherit != rte->inh)
+			continue;
+
 		/*
 		 * Count matching attributes - we have to undo the attnum offsets. The
 		 * input attribute numbers are not offset (expressions are not
@@ -1649,7 +1656,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (nmatched + nexprs < 2)
 			continue;
 
-		deps = statext_dependencies_load(stat->statOid);
+		deps = statext_dependencies_load(stat->statOid, rte->inh);
 
 		/*
 		 * The expressions may be represented by different attnums in the
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index b40ad9da2b..5a43d7e39a 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -77,7 +77,7 @@ typedef struct StatExtEntry
 static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
 static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 											int nvacatts, VacAttrStats **vacatts);
-static void statext_store(Oid statOid,
+static void statext_store(Oid statOid, bool inh,
 						  MVNDistinct *ndistinct, MVDependencies *dependencies,
 						  MCVList *mcv, Datum exprs, VacAttrStats **stats);
 static int	statext_compute_stattarget(int stattarget,
@@ -110,7 +110,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat,
  * requested stats, and serializes them back into the catalog.
  */
 void
-BuildRelationExtStatistics(Relation onerel, double totalrows,
+BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 						   int numrows, HeapTuple *rows,
 						   int natts, VacAttrStats **vacattrstats)
 {
@@ -230,7 +230,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		}
 
 		/* store the statistics in the catalog */
-		statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats);
+		statext_store(stat->statOid, inh,
+					  ndistinct, dependencies, mcv, exprstats, stats);
 
 		/* for reporting progress */
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -781,7 +782,7 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
  *	tuple.
  */
 static void
-statext_store(Oid statOid,
+statext_store(Oid statOid, bool inh,
 			  MVNDistinct *ndistinct, MVDependencies *dependencies,
 			  MCVList *mcv, Datum exprs, VacAttrStats **stats)
 {
@@ -790,14 +791,19 @@ statext_store(Oid statOid,
 				oldtup;
 	Datum		values[Natts_pg_statistic_ext_data];
 	bool		nulls[Natts_pg_statistic_ext_data];
-	bool		replaces[Natts_pg_statistic_ext_data];
 
 	pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
 	memset(nulls, true, sizeof(nulls));
-	memset(replaces, false, sizeof(replaces));
 	memset(values, 0, sizeof(values));
 
+	/* basic info */
+	values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid);
+	nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false;
+
+	values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh);
+	nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false;
+
 	/*
 	 * Construct a new pg_statistic_ext_data tuple, replacing the calculated
 	 * stats.
@@ -830,25 +836,27 @@ statext_store(Oid statOid,
 		values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs;
 	}
 
-	/* always replace the value (either by bytea or NULL) */
-	replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* there should already be a pg_statistic_ext_data tuple */
-	oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-	if (!HeapTupleIsValid(oldtup))
+	/*
+	 * Delete the old tuple if it exists, and insert a new one. It's easier
+	 * than trying to update or insert, based on various conditions.
+	 *
+	 * There should always be a pg_statistic_ext_data tuple for inh=false,
+	 * but there may be none for inh=true yet.
+	 */
+	oldtup = SearchSysCache2(STATEXTDATASTXOID,
+							 ObjectIdGetDatum(statOid),
+							 BoolGetDatum(inh));
+	if (HeapTupleIsValid(oldtup))
+	{
+		CatalogTupleDelete(pg_stextdata, &(oldtup->t_self));
+		ReleaseSysCache(oldtup);
+	}
+	else if (!inh)
 		elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 
-	/* replace it */
-	stup = heap_modify_tuple(oldtup,
-							 RelationGetDescr(pg_stextdata),
-							 values,
-							 nulls,
-							 replaces);
-	ReleaseSysCache(oldtup);
-	CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup);
+	/* form a new tuple */
+	stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls);
+	CatalogTupleInsert(pg_stextdata, stup);
 
 	heap_freetuple(stup);
 
@@ -1234,7 +1242,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs,
  * further tiebreakers are needed.
  */
 StatisticExtInfo *
-choose_best_statistics(List *stats, char requiredkind,
+choose_best_statistics(List *stats, char requiredkind, bool inh,
 					   Bitmapset **clause_attnums, List **clause_exprs,
 					   int nclauses)
 {
@@ -1256,6 +1264,10 @@ choose_best_statistics(List *stats, char requiredkind,
 		if (info->kind != requiredkind)
 			continue;
 
+		/* skip statistics with mismatching inheritance flag */
+		if (info->inherit != inh)
+			continue;
+
 		/*
 		 * Collect attributes and expressions in remaining (unestimated)
 		 * clauses fully covered by this statistic object.
@@ -1694,6 +1706,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
@@ -1751,7 +1764,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			break;
 
 		/* find the best suited statistics object for these attnums */
-		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
+		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
 									  list_length(clauses));
 
@@ -1840,7 +1853,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			MCVList    *mcv_list;
 
 			/* Load the MCV list stored in the statistics object */
-			mcv_list = statext_mcv_load(stat->statOid);
+			mcv_list = statext_mcv_load(stat->statOid, rte->inh);
 
 			/*
 			 * Compute the selectivity of the ORed list of clauses covered by
@@ -2411,7 +2424,7 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 35b39ece07..173f746e41 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups,
  *		Load the MCV list for the indicated pg_statistic_ext tuple.
  */
 MCVList *
-statext_mcv_load(Oid mvoid)
+statext_mcv_load(Oid mvoid, bool inh)
 {
 	MCVList    *result;
 	bool		isnull;
 	Datum		mcvlist;
-	HeapTuple	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	HeapTuple	htup = SearchSysCache2(STATEXTDATASTXOID,
+									   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
@@ -2040,11 +2041,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
 	MCVList    *mcv;
 	Selectivity s = 0.0;
 
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
 	/* match/mismatch bitmap for each MCV item */
 	bool	   *matches = NULL;
 
 	/* load the MCV list stored in the statistics object */
-	mcv = statext_mcv_load(stat->statOid);
+	mcv = statext_mcv_load(stat->statOid, rte->inh);
 
 	/* build a match bitmap for the clauses */
 	matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs,
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 4481312d61..ab1f10d6c0 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
  *		Load the ndistinct value for the indicated pg_statistic_ext tuple
  */
 MVNDistinct *
-statext_ndistinct_load(Oid mvoid)
+statext_ndistinct_load(Oid mvoid, bool inh)
 {
 	MVNDistinct *result;
 	bool		isnull;
 	Datum		ndist;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index d782605953..ede393115d 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -4008,7 +4008,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 
 	Assert(nmatches_vars + nmatches_exprs > 1);
 
-	stats = statext_ndistinct_load(statOid);
+	stats = statext_ndistinct_load(statOid, rte->inh);
 
 	/*
 	 * If we have a match, search it for the specific item that matches (there
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index d6cb78dea8..eabd74952f 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -740,11 +740,11 @@ static const struct cachedesc cacheinfo[] = {
 		32
 	},
 	{StatisticExtDataRelationId,	/* STATEXTDATASTXOID */
-		StatisticExtDataStxoidIndexId,
-		1,
+		StatisticExtDataStxoidInhIndexId,
+		2,
 		{
 			Anum_pg_statistic_ext_data_stxoid,
-			0,
+			Anum_pg_statistic_ext_data_stxdinherit,
 			0,
 			0
 		},
diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h
index 7b73b790d2..8ffd8b68cd 100644
--- a/src/include/catalog/pg_statistic_ext_data.h
+++ b/src/include/catalog/pg_statistic_ext_data.h
@@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId)
 {
 	Oid			stxoid BKI_LOOKUP(pg_statistic_ext);	/* statistics object
 														 * this data is for */
+	bool		stxdinherit;	/* true if inheritance children are included */
 
 #ifdef CATALOG_VARLEN			/* variable-length fields start here */
 
@@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data;
 
 DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431);
 
-DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops));
+DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops, stxdinherit bool_ops));
+
 
 #endif							/* PG_STATISTIC_EXT_DATA_H */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 2a53a6e344..884bda7232 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -934,6 +934,7 @@ typedef struct StatisticExtInfo
 	NodeTag		type;
 
 	Oid			statOid;		/* OID of the statistics row */
+	bool		inherit;		/* includes child relations */
 	RelOptInfo *rel;			/* back-link to statistic's table */
 	char		kind;			/* statistics kind of this entry */
 	Bitmapset  *keys;			/* attnums of the columns covered */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 326cf26fea..02ee41b9f3 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -94,11 +94,11 @@ typedef struct MCVList
 	MCVItem		items[FLEXIBLE_ARRAY_MEMBER];	/* array of MCV items */
 } MCVList;
 
-extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
-extern MVDependencies *statext_dependencies_load(Oid mvoid);
-extern MCVList *statext_mcv_load(Oid mvoid);
+extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh);
+extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh);
+extern MCVList *statext_mcv_load(Oid mvoid, bool inh);
 
-extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
+extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 									   int numrows, HeapTuple *rows,
 									   int natts, VacAttrStats **vacattrstats);
 extern int	ComputeExtStatisticsRows(Relation onerel,
@@ -121,6 +121,7 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
 												  bool is_or);
 extern bool has_stats_of_kind(List *stats, char requiredkind);
 extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
+												bool inh,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..8ab5187ccb 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2425,6 +2425,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname,
              JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames,
     pg_get_statisticsobjdef_expressions(s.oid) AS exprs,
     s.stxkind AS kinds,
+    sd.stxdinherit AS inherited,
     sd.stxdndistinct AS n_distinct,
     sd.stxddependencies AS dependencies,
     m.most_common_vals,
-- 
2.17.0

#9Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Justin Pryzby (#8)
Re: extended stats on partitioned tables

On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:

On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:

It seems like your patch should also check "inh" in examine_variable and
statext_expressions_load.

I tried adding that - I mostly kept my patches separate.
Hopefully this is more helpful than a complication.
I added at: https://commitfest.postgresql.org/35/3332/

Actually, this is confusing. Which patch is the one we should be
reviewing?

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

#10Justin Pryzby
pryzby@telsasoft.com
In reply to: Jaime Casanova (#9)
5 attachment(s)
Re: extended stats on partitioned tables

On Thu, Oct 07, 2021 at 03:26:46PM -0500, Jaime Casanova wrote:

On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:

On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:

It seems like your patch should also check "inh" in examine_variable and
statext_expressions_load.

I tried adding that - I mostly kept my patches separate.
Hopefully this is more helpful than a complication.
I added at: https://commitfest.postgresql.org/35/3332/

Actually, this is confusing. Which patch is the one we should be
reviewing?

It is confusing, but not as much as I first thought. Please check the commit
messages.

The first two patches are meant to be applied to master *and* backpatched. The
first one intends to fixes the bug that non-inherited stats are being used for
queries of inheritance trees. The 2nd one fixes the regression that stats are
not collected for inheritence trees of partitioned tables (which is the only
type of stats they could ever possibly have).

And the 3rd+4th patches (Tomas' plus my changes) allow collecting both
inherited and non-inherited stats, only in master, since it requires a catalog
change. It's a bit confusing that patch #4 removes most what I added in
patches 1 and 2. But that's exactly what's needed to collect and apply both
inherited and non-inherited stats: the first two patches avoid applying stats
collected with the wrong inheritence. That's also what's needed for the
patchset to follow the normal "apply to master and backpatch" process, rather
than 2 patches which are backpatched but not applied to master, and one which
is applied to master and not backpatched..

@Tomas: I just found commit 427c6b5b9, which is a remarkably similar issue
affecting column stats 15 years ago.

Rebased since there were conflicts with my typos fixes.

--
Justin

Attachments:

v2-0001-Do-not-use-extended-statistics-on-inheritence-tre.patchtext/x-diff; charset=us-asciiDownload
From f6fb0e139e5b19ea288040328590a91a8526e2a6 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 19:42:41 -0500
Subject: [PATCH v2 1/5] Do not use extended statistics on inheritence trees..

Since 859b3003de, inherited ext stats are not built.
However, the non-inherited stats stats were incorrectly used during planning of
queries with inheritence heirarchies.

Since the ext stats do not include child tables, they can lead to worse
estimates.  This is remarkably similar to 427c6b5b9, which affected column
statistics 15 years ago.

choose_best_statistics is handled a bit differently (in the calling function),
because it isn't passed rel nor rel->inh, and it's an exported function, so
avoid changing its signature in back branches.

https://www.postgresql.org/message-id/flat/20210925223152.GA7877@telsasoft.com

Backpatch to v10
---
 src/backend/statistics/dependencies.c   |  5 +++++
 src/backend/statistics/extended_stats.c |  5 +++++
 src/backend/utils/adt/selfuncs.c        |  9 +++++++++
 src/test/regress/expected/stats_ext.out | 23 +++++++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 14 ++++++++++++++
 5 files changed, 56 insertions(+)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..b2e33329c7 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1593,6 +1593,11 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		int			nexprs;
 		int			k;
 		MVDependencies *deps;
+		RangeTblEntry	*rte = root->simple_rte_array[rel->relid];
+
+		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+		if (rte->inh)
+			break;
 
 		/* skip statistics that are not of the correct type */
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 69ca52094f..6c69e5bc96 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1744,6 +1744,11 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		StatisticExtInfo *stat;
 		List	   *stat_clauses;
 		Bitmapset  *simple_clauses;
+		RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+		if (rte->inh)
+			break;
 
 		/* find the best suited statistics object for these attnums */
 		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 10895fb287..a0932e39e1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,6 +3913,11 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
+	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return false;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
@@ -5232,6 +5237,10 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
+			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+			if (planner_rt_fetch(onerel->relid, root)->inh)
+				break;
+
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index c60ba45aba..5c15e44bd6 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,6 +176,29 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
+-- Ensure non-inherited stats are not applied to inherited query
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+DROP TABLE stxdinh, stxdinh1;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 6fb37962a7..610f7ed17f 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,6 +112,20 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
+-- Ensure non-inherited stats are not applied to inherited query
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+DROP TABLE stxdinh, stxdinh1;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.17.0

v2-0002-Build-inherited-extended-stats-on-partitioned-tab.patchtext/x-diff; charset=us-asciiDownload
From b5bcff5331d052ea753d25ec7f443dc1d807fb13 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Sat, 25 Sep 2021 23:01:21 +0200
Subject: [PATCH v2 2/5] Build inherited extended stats on partitioned tables

Since 859b3003de, ext stats on partitioned tables are not built, which is a
regression.  For back branches, pg_statistic_ext cannot support both inherited
(FROM) and non-inherited (FROM ONLY) stats on inheritence heirarchies.
But there's no issue building inherited stats for partitioned tables, which
are empty, so cannot have non-inherited stats.

See also: 8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b

https://www.postgresql.org/message-id/20210923212624.GI831%40telsasoft.com

Backpatch to v10
---
 src/backend/commands/analyze.c          |  5 ++++-
 src/backend/statistics/dependencies.c   |  2 +-
 src/backend/statistics/extended_stats.c |  2 +-
 src/backend/utils/adt/selfuncs.c        |  9 ++++++---
 src/test/regress/expected/stats_ext.out | 19 +++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 10 ++++++++++
 6 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8bfb2ad958..299f4893b8 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -548,6 +548,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
+		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -611,13 +612,15 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
+		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
+
 		/*
 		 * Build extended statistics (if there are any).
 		 *
 		 * For now we only build extended statistics on individual relations,
 		 * not for relations representing inheritance trees.
 		 */
-		if (!inh)
+		if (build_ext_stats)
 			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
 									   attr_cnt, vacattrstats);
 	}
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index b2e33329c7..0659307b02 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1596,7 +1596,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		RangeTblEntry	*rte = root->simple_rte_array[rel->relid];
 
 		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-		if (rte->inh)
+		if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 			break;
 
 		/* skip statistics that are not of the correct type */
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 6c69e5bc96..9e518830ae 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1747,7 +1747,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-		if (rte->inh)
+		if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 			break;
 
 		/* find the best suited statistics object for these attnums */
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index a0932e39e1..b15f14e1a0 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3916,7 +3916,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
 
 	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return false;
 
 	/* bail out immediately if the table has no extended statistics */
@@ -5238,8 +5238,11 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				break;
 
 			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-			if (planner_rt_fetch(onerel->relid, root)->inh)
-				break;
+			{
+				RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
+				if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
+					break;
+			}
 
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 5c15e44bd6..67234b9fc2 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -199,6 +199,25 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+ ?column? 
+----------
+        1
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+        10 |     10
+(1 row)
+
+DROP TABLE stxdinp;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 610f7ed17f..2371043ca1 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -126,6 +126,16 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+DROP TABLE stxdinp;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.17.0

v2-0003-Add-stxdinherit-build-inherited-extended-stats-on.patchtext/x-diff; charset=us-asciiDownload
From 8672bbd68cfb76447aafde5133bf2b897625be75 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Sat, 25 Sep 2021 21:27:10 +0200
Subject: [PATCH v2 3/5] Add stxdinherit; build inherited extended stats on
 inheritence parents

pg_statistic has an inherited flag which is part of the unique index, but
pg_statistic has never had that.  In back branches, pg_statistic stores the
cannot store both inherited and non-inherited stats.  So it stores
non-inherited stats (FROM ONLY) for inheritence parents and inherited stats for
partitioned tables.

This patch allows storing both inherited and non-inherited stats for non-empty
inheritence parents, and avoids the above, confusing definition.
---
 doc/src/sgml/catalogs.sgml                  |  23 +++
 src/backend/catalog/system_views.sql        |   1 +
 src/backend/commands/analyze.c              |  15 +-
 src/backend/commands/statscmds.c            |  20 ++-
 src/backend/optimizer/util/plancat.c        | 186 +++++++++++---------
 src/backend/statistics/dependencies.c       |  13 +-
 src/backend/statistics/extended_stats.c     |  67 ++++---
 src/backend/statistics/mcv.c                |   9 +-
 src/backend/statistics/mvdistinct.c         |   5 +-
 src/backend/utils/adt/selfuncs.c            |   2 +-
 src/backend/utils/cache/syscache.c          |   6 +-
 src/include/catalog/pg_statistic_ext_data.h |   4 +-
 src/include/nodes/pathnodes.h               |   1 +
 src/include/statistics/statistics.h         |   9 +-
 src/test/regress/expected/rules.out         |   1 +
 15 files changed, 217 insertions(+), 145 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index fd6910ddbe..dc79b0737f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7443,6 +7443,19 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
    created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>.
   </para>
 
+  <para>
+   Normally there is one entry, with <structfield>stxdinherit</structfield> =
+   <literal>false</literal>, for each statistics object that has been analyzed.
+   If the table has inheritance children, a second entry with
+   <structfield>stxdinherit</structfield> = <literal>true</literal> is also created.
+   This row represents the statistics object over the inheritance tree, i.e.,
+   statistics for the data you'd see with
+   <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>,
+   whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row
+   represents the results of
+   <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>.
+  </para>
+
   <para>
    Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>,
    <structname>pg_statistic_ext_data</structname> should not be
@@ -7482,6 +7495,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stxdinherit</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the stats include inheritance child columns, not just the
+       values in the specified relation
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..07ab18dc52 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS
            ) AS attnames,
            pg_get_statisticsobjdef_expressions(s.oid) as exprs,
            s.stxkind AS kinds,
+           sd.stxdinherit AS inherited,
            sd.stxdndistinct AS n_distinct,
            sd.stxddependencies AS dependencies,
            m.most_common_vals,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 299f4893b8..7f4b0f5320 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -548,7 +548,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
-		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -612,17 +611,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
-		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
-
-		/*
-		 * Build extended statistics (if there are any).
-		 *
-		 * For now we only build extended statistics on individual relations,
-		 * not for relations representing inheritance trees.
-		 */
-		if (build_ext_stats)
-			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
-									   attr_cnt, vacattrstats);
+		/* Build extended statistics (if there are any). */
+		BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows,
+								   attr_cnt, vacattrstats);
 	}
 
 	pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 8f1550ec80..4395d878c7 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -524,6 +524,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
 
+	/* create only the "stxdinherit=false", because that always exists */
+	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);
+
 	/* no statistics built yet */
 	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
 	datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
@@ -726,6 +729,7 @@ RemoveStatisticsById(Oid statsOid)
 	HeapTuple	tup;
 	Form_pg_statistic_ext statext;
 	Oid			relid;
+	int			inh;
 
 	/*
 	 * First delete the pg_statistic_ext_data tuple holding the actual
@@ -733,14 +737,20 @@ RemoveStatisticsById(Oid statsOid)
 	 */
 	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
-	tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
+	/* hack to delete both stxdinherit = true/false */
+	for (inh = 0; inh <= 1; inh++)
+	{
+		tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+							  BoolGetDatum(inh));
 
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+		if (!HeapTupleIsValid(tup)) /* should not happen */
+			// elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+			continue;
 
-	CatalogTupleDelete(relation, &tup->t_self);
+		CatalogTupleDelete(relation, &tup->t_self);
 
-	ReleaseSysCache(tup);
+		ReleaseSysCache(tup);
+	}
 
 	table_close(relation, RowExclusiveLock);
 
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index c5194fdbbf..154d48a330 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
+#include "catalog/pg_statistic_ext_data.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -1311,127 +1312,144 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 	{
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
+		Form_pg_statistic_ext_data dataForm;
 		HeapTuple	htup;
 		HeapTuple	dtup;
 		Bitmapset  *keys = NULL;
 		List	   *exprs = NIL;
 		int			i;
+		int			inh;
 
 		htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
 		if (!HeapTupleIsValid(htup))
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-		if (!HeapTupleIsValid(dtup))
-			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
-		/*
-		 * First, build the array of columns covered.  This is ultimately
-		 * wasted if no stats within the object have actually been built, but
-		 * it doesn't seem worth troubling over that case.
-		 */
-		for (i = 0; i < staForm->stxkeys.dim1; i++)
-			keys = bms_add_member(keys, staForm->stxkeys.values[i]);
-
 		/*
-		 * Preprocess expressions (if any). We read the expressions, run them
-		 * through eval_const_expressions, and fix the varnos.
+		 * Hack to load stats with stxdinherit true/false - there should be
+		 * a better way to do this, I guess.
 		 */
+		for (inh = 0; inh <= 1; inh++)
 		{
-			bool		isnull;
-			Datum		datum;
+			dtup = SearchSysCache2(STATEXTDATASTXOID,
+								   ObjectIdGetDatum(statOid), BoolGetDatum((bool) inh));
+			if (!HeapTupleIsValid(dtup))
+				continue;
 
-			/* decode expression (if any) */
-			datum = SysCacheGetAttr(STATEXTOID, htup,
-									Anum_pg_statistic_ext_stxexprs, &isnull);
+			dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
 
-			if (!isnull)
+			/*
+			 * First, build the array of columns covered.  This is ultimately
+			 * wasted if no stats within the object have actually been built, but
+			 * it doesn't seem worth troubling over that case.
+			 */
+			for (i = 0; i < staForm->stxkeys.dim1; i++)
+				keys = bms_add_member(keys, staForm->stxkeys.values[i]);
+
+			/*
+			 * Preprocess expressions (if any). We read the expressions, run them
+			 * through eval_const_expressions, and fix the varnos.
+			 */
 			{
-				char	   *exprsString;
+				bool		isnull;
+				Datum		datum;
 
-				exprsString = TextDatumGetCString(datum);
-				exprs = (List *) stringToNode(exprsString);
-				pfree(exprsString);
+				/* decode expression (if any) */
+				datum = SysCacheGetAttr(STATEXTOID, htup,
+										Anum_pg_statistic_ext_stxexprs, &isnull);
 
-				/*
-				 * Run the expressions through eval_const_expressions. This is
-				 * not just an optimization, but is necessary, because the
-				 * planner will be comparing them to similarly-processed qual
-				 * clauses, and may fail to detect valid matches without this.
-				 * We must not use canonicalize_qual, however, since these
-				 * aren't qual expressions.
-				 */
-				exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+				if (!isnull)
+				{
+					char	   *exprsString;
 
-				/* May as well fix opfuncids too */
-				fix_opfuncids((Node *) exprs);
+					exprsString = TextDatumGetCString(datum);
+					exprs = (List *) stringToNode(exprsString);
+					pfree(exprsString);
 
-				/*
-				 * Modify the copies we obtain from the relcache to have the
-				 * correct varno for the parent relation, so that they match
-				 * up correctly against qual clauses.
-				 */
-				if (varno != 1)
-					ChangeVarNodes((Node *) exprs, 1, varno, 0);
+					/*
+					 * Run the expressions through eval_const_expressions. This is
+					 * not just an optimization, but is necessary, because the
+					 * planner will be comparing them to similarly-processed qual
+					 * clauses, and may fail to detect valid matches without this.
+					 * We must not use canonicalize_qual, however, since these
+					 * aren't qual expressions.
+					 */
+					exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+
+					/* May as well fix opfuncids too */
+					fix_opfuncids((Node *) exprs);
+
+					/*
+					 * Modify the copies we obtain from the relcache to have the
+					 * correct varno for the parent relation, so that they match
+					 * up correctly against qual clauses.
+					 */
+					if (varno != 1)
+						ChangeVarNodes((Node *) exprs, 1, varno, 0);
+				}
 			}
-		}
 
-		/* add one StatisticExtInfo for each kind built */
-		if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			/* add one StatisticExtInfo for each kind built */
+			if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_NDISTINCT;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_NDISTINCT;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_DEPENDENCIES;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_DEPENDENCIES;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_MCV;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_MCV;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_EXPRESSIONS;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_EXPRESSIONS;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
+
+				stainfos = lappend(stainfos, info);
+			}
 
-			stainfos = lappend(stainfos, info);
+			ReleaseSysCache(dtup);
 		}
 
 		ReleaseSysCache(htup);
-		ReleaseSysCache(dtup);
 		bms_free(keys);
 	}
 
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 0659307b02..835f4bdf7a 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -618,14 +618,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums)
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
 MVDependencies *
-statext_dependencies_load(Oid mvoid)
+statext_dependencies_load(Oid mvoid, bool inh)
 {
 	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid),
+						   BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
@@ -1410,6 +1412,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* unique expressions */
 	Node	  **unique_exprs;
@@ -1603,6 +1606,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
 			continue;
 
+		/* skip statistics with mismatching stxdinherit value */
+		if (stat->inherit != rte->inh)
+			continue;
+
 		/*
 		 * Count matching attributes - we have to undo the attnum offsets. The
 		 * input attribute numbers are not offset (expressions are not
@@ -1649,7 +1656,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (nmatched + nexprs < 2)
 			continue;
 
-		deps = statext_dependencies_load(stat->statOid);
+		deps = statext_dependencies_load(stat->statOid, rte->inh);
 
 		/*
 		 * The expressions may be represented by different attnums in the
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 9e518830ae..8cfcd17ad6 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -77,7 +77,7 @@ typedef struct StatExtEntry
 static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
 static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 											int nvacatts, VacAttrStats **vacatts);
-static void statext_store(Oid statOid,
+static void statext_store(Oid statOid, bool inh,
 						  MVNDistinct *ndistinct, MVDependencies *dependencies,
 						  MCVList *mcv, Datum exprs, VacAttrStats **stats);
 static int	statext_compute_stattarget(int stattarget,
@@ -110,7 +110,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat,
  * requested stats, and serializes them back into the catalog.
  */
 void
-BuildRelationExtStatistics(Relation onerel, double totalrows,
+BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 						   int numrows, HeapTuple *rows,
 						   int natts, VacAttrStats **vacattrstats)
 {
@@ -230,7 +230,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		}
 
 		/* store the statistics in the catalog */
-		statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats);
+		statext_store(stat->statOid, inh,
+					  ndistinct, dependencies, mcv, exprstats, stats);
 
 		/* for reporting progress */
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -781,7 +782,7 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
  *	tuple.
  */
 static void
-statext_store(Oid statOid,
+statext_store(Oid statOid, bool inh,
 			  MVNDistinct *ndistinct, MVDependencies *dependencies,
 			  MCVList *mcv, Datum exprs, VacAttrStats **stats)
 {
@@ -790,14 +791,19 @@ statext_store(Oid statOid,
 				oldtup;
 	Datum		values[Natts_pg_statistic_ext_data];
 	bool		nulls[Natts_pg_statistic_ext_data];
-	bool		replaces[Natts_pg_statistic_ext_data];
 
 	pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
 	memset(nulls, true, sizeof(nulls));
-	memset(replaces, false, sizeof(replaces));
 	memset(values, 0, sizeof(values));
 
+	/* basic info */
+	values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid);
+	nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false;
+
+	values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh);
+	nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false;
+
 	/*
 	 * Construct a new pg_statistic_ext_data tuple, replacing the calculated
 	 * stats.
@@ -830,25 +836,27 @@ statext_store(Oid statOid,
 		values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs;
 	}
 
-	/* always replace the value (either by bytea or NULL) */
-	replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* there should already be a pg_statistic_ext_data tuple */
-	oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-	if (!HeapTupleIsValid(oldtup))
+	/*
+	 * Delete the old tuple if it exists, and insert a new one. It's easier
+	 * than trying to update or insert, based on various conditions.
+	 *
+	 * There should always be a pg_statistic_ext_data tuple for inh=false,
+	 * but there may be none for inh=true yet.
+	 */
+	oldtup = SearchSysCache2(STATEXTDATASTXOID,
+							 ObjectIdGetDatum(statOid),
+							 BoolGetDatum(inh));
+	if (HeapTupleIsValid(oldtup))
+	{
+		CatalogTupleDelete(pg_stextdata, &(oldtup->t_self));
+		ReleaseSysCache(oldtup);
+	}
+	else if (!inh)
 		elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 
-	/* replace it */
-	stup = heap_modify_tuple(oldtup,
-							 RelationGetDescr(pg_stextdata),
-							 values,
-							 nulls,
-							 replaces);
-	ReleaseSysCache(oldtup);
-	CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup);
+	/* form a new tuple */
+	stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls);
+	CatalogTupleInsert(pg_stextdata, stup);
 
 	heap_freetuple(stup);
 
@@ -1234,7 +1242,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs,
  * further tiebreakers are needed.
  */
 StatisticExtInfo *
-choose_best_statistics(List *stats, char requiredkind,
+choose_best_statistics(List *stats, char requiredkind, bool inh,
 					   Bitmapset **clause_attnums, List **clause_exprs,
 					   int nclauses)
 {
@@ -1256,6 +1264,10 @@ choose_best_statistics(List *stats, char requiredkind,
 		if (info->kind != requiredkind)
 			continue;
 
+		/* skip statistics with mismatching inheritance flag */
+		if (info->inherit != inh)
+			continue;
+
 		/*
 		 * Collect attributes and expressions in remaining (unestimated)
 		 * clauses fully covered by this statistic object.
@@ -1694,6 +1706,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
@@ -1751,7 +1764,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			break;
 
 		/* find the best suited statistics object for these attnums */
-		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
+		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
 									  list_length(clauses));
 
@@ -1840,7 +1853,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			MCVList    *mcv_list;
 
 			/* Load the MCV list stored in the statistics object */
-			mcv_list = statext_mcv_load(stat->statOid);
+			mcv_list = statext_mcv_load(stat->statOid, rte->inh);
 
 			/*
 			 * Compute the selectivity of the ORed list of clauses covered by
@@ -2411,7 +2424,7 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 35b39ece07..173f746e41 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups,
  *		Load the MCV list for the indicated pg_statistic_ext tuple.
  */
 MCVList *
-statext_mcv_load(Oid mvoid)
+statext_mcv_load(Oid mvoid, bool inh)
 {
 	MCVList    *result;
 	bool		isnull;
 	Datum		mcvlist;
-	HeapTuple	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	HeapTuple	htup = SearchSysCache2(STATEXTDATASTXOID,
+									   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
@@ -2040,11 +2041,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
 	MCVList    *mcv;
 	Selectivity s = 0.0;
 
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
 	/* match/mismatch bitmap for each MCV item */
 	bool	   *matches = NULL;
 
 	/* load the MCV list stored in the statistics object */
-	mcv = statext_mcv_load(stat->statOid);
+	mcv = statext_mcv_load(stat->statOid, rte->inh);
 
 	/* build a match bitmap for the clauses */
 	matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs,
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 4481312d61..ab1f10d6c0 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
  *		Load the ndistinct value for the indicated pg_statistic_ext tuple
  */
 MVNDistinct *
-statext_ndistinct_load(Oid mvoid)
+statext_ndistinct_load(Oid mvoid, bool inh)
 {
 	MVNDistinct *result;
 	bool		isnull;
 	Datum		ndist;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index b15f14e1a0..aab3bd7696 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -4008,7 +4008,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 
 	Assert(nmatches_vars + nmatches_exprs > 1);
 
-	stats = statext_ndistinct_load(statOid);
+	stats = statext_ndistinct_load(statOid, rte->inh);
 
 	/*
 	 * If we have a match, search it for the specific item that matches (there
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index d6cb78dea8..eabd74952f 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -740,11 +740,11 @@ static const struct cachedesc cacheinfo[] = {
 		32
 	},
 	{StatisticExtDataRelationId,	/* STATEXTDATASTXOID */
-		StatisticExtDataStxoidIndexId,
-		1,
+		StatisticExtDataStxoidInhIndexId,
+		2,
 		{
 			Anum_pg_statistic_ext_data_stxoid,
-			0,
+			Anum_pg_statistic_ext_data_stxdinherit,
 			0,
 			0
 		},
diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h
index 7b73b790d2..8ffd8b68cd 100644
--- a/src/include/catalog/pg_statistic_ext_data.h
+++ b/src/include/catalog/pg_statistic_ext_data.h
@@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId)
 {
 	Oid			stxoid BKI_LOOKUP(pg_statistic_ext);	/* statistics object
 														 * this data is for */
+	bool		stxdinherit;	/* true if inheritance children are included */
 
 #ifdef CATALOG_VARLEN			/* variable-length fields start here */
 
@@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data;
 
 DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431);
 
-DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops));
+DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops, stxdinherit bool_ops));
+
 
 #endif							/* PG_STATISTIC_EXT_DATA_H */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 2a53a6e344..884bda7232 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -934,6 +934,7 @@ typedef struct StatisticExtInfo
 	NodeTag		type;
 
 	Oid			statOid;		/* OID of the statistics row */
+	bool		inherit;		/* includes child relations */
 	RelOptInfo *rel;			/* back-link to statistic's table */
 	char		kind;			/* statistics kind of this entry */
 	Bitmapset  *keys;			/* attnums of the columns covered */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 326cf26fea..02ee41b9f3 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -94,11 +94,11 @@ typedef struct MCVList
 	MCVItem		items[FLEXIBLE_ARRAY_MEMBER];	/* array of MCV items */
 } MCVList;
 
-extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
-extern MVDependencies *statext_dependencies_load(Oid mvoid);
-extern MCVList *statext_mcv_load(Oid mvoid);
+extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh);
+extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh);
+extern MCVList *statext_mcv_load(Oid mvoid, bool inh);
 
-extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
+extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 									   int numrows, HeapTuple *rows,
 									   int natts, VacAttrStats **vacattrstats);
 extern int	ComputeExtStatisticsRows(Relation onerel,
@@ -121,6 +121,7 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
 												  bool is_or);
 extern bool has_stats_of_kind(List *stats, char requiredkind);
 extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
+												bool inh,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..8ab5187ccb 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2425,6 +2425,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname,
              JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames,
     pg_get_statisticsobjdef_expressions(s.oid) AS exprs,
     s.stxkind AS kinds,
+    sd.stxdinherit AS inherited,
     sd.stxdndistinct AS n_distinct,
     sd.stxddependencies AS dependencies,
     m.most_common_vals,
-- 
2.17.0

v2-0004-f-check-inh.patchtext/x-diff; charset=us-asciiDownload
From b9b66a66b70566c6c783d62bda412bb76dcb8563 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 18:20:03 -0500
Subject: [PATCH v2 4/5] f! check inh

statext_expressions_load
examine_variable
estimate_multivariate_ndistinct

TODO: pg_stats_ext_exprs needs to expose inh flag
---
 src/backend/statistics/dependencies.c   |  5 -----
 src/backend/statistics/extended_stats.c |  9 ++-------
 src/backend/utils/adt/selfuncs.c        | 27 ++++++++-----------------
 src/include/statistics/statistics.h     |  2 +-
 src/test/regress/expected/stats_ext.out | 11 ++++++++--
 src/test/regress/sql/stats_ext.sql      |  4 +++-
 6 files changed, 23 insertions(+), 35 deletions(-)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 835f4bdf7a..02cf0efc66 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1596,11 +1596,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		int			nexprs;
 		int			k;
 		MVDependencies *deps;
-		RangeTblEntry	*rte = root->simple_rte_array[rel->relid];
-
-		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-		if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-			break;
 
 		/* skip statistics that are not of the correct type */
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 8cfcd17ad6..d2ce4e13b1 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1706,7 +1706,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
-	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
@@ -1759,10 +1758,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		Bitmapset  *simple_clauses;
 		RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
-		/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-		if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-			break;
-
 		/* find the best suited statistics object for these attnums */
 		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
@@ -2414,7 +2409,7 @@ serialize_expr_stats(AnlExprData *exprdata, int nexprs)
  * identified by the supplied index.
  */
 HeapTuple
-statext_expressions_load(Oid stxoid, int idx)
+statext_expressions_load(Oid stxoid, bool inh, int idx)
 {
 	bool		isnull;
 	Datum		value;
@@ -2424,7 +2419,7 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false));
+	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), inh);
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index aab3bd7696..19b4aaf7eb 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3915,10 +3915,6 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	StatisticExtInfo *matched_info = NULL;
 	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
 
-	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return false;
-
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
 		return false;
@@ -5237,13 +5233,6 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
-			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-			{
-				RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
-				if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-					break;
-			}
-
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
@@ -5262,22 +5251,22 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				/* found a match, see if we can extract pg_statistic row */
 				if (equal(node, expr))
 				{
-					HeapTuple	t = statext_expressions_load(info->statOid, pos);
-
-					/* Get statistics object's table for permission check */
-					RangeTblEntry *rte;
+					RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
 					Oid			userid;
+					bool		inh;
 
-					vardata->statsTuple = t;
+					Assert(rte->rtekind == RTE_RELATION);
 
 					/*
 					 * XXX Not sure if we should cache the tuple somewhere.
 					 * Now we just create a new copy every time.
 					 */
-					vardata->freefunc = ReleaseDummy;
+					inh = root->append_rel_array == NULL ? false :
+						root->append_rel_array[onerel->relid]->parent_relid != 0;
+					vardata->statsTuple =
+						statext_expressions_load(info->statOid, inh, pos);
 
-					rte = planner_rt_fetch(onerel->relid, root);
-					Assert(rte->rtekind == RTE_RELATION);
+					vardata->freefunc = ReleaseDummy;
 
 					/*
 					 * Use checkAsUser if it's set, in case we're accessing
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 02ee41b9f3..3868e43f8a 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -125,6 +125,6 @@ extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
-extern HeapTuple statext_expressions_load(Oid stxoid, int idx);
+extern HeapTuple statext_expressions_load(Oid stxoid, bool inh, int idx);
 
 #endif							/* STATISTICS_H */
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 67234b9fc2..35edc6a361 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,7 +176,6 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
--- Ensure non-inherited stats are not applied to inherited query
 CREATE TABLE stxdinh(i int, j int);
 CREATE TABLE stxdinh1() INHERITS(stxdinh);
 INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
@@ -191,11 +190,19 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
 -- Since the stats object does not include inherited stats, it should not affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
  estimated | actual 
 -----------+--------
-      1000 |   1008
+      1008 |   1008
+(1 row)
+
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+         9 |      9
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 2371043ca1..8490da9558 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,7 +112,6 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
--- Ensure non-inherited stats are not applied to inherited query
 CREATE TABLE stxdinh(i int, j int);
 CREATE TABLE stxdinh1() INHERITS(stxdinh);
 INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
@@ -122,8 +121,11 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
 -- Since the stats object does not include inherited stats, it should not affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
 -- Ensure inherited stats ARE applied to inherited query in partitioned table
-- 
2.17.0

v2-0005-Refactor-parent-ACL-check.patchtext/x-diff; charset=us-asciiDownload
From daf816806bdd3d927e9ed002a75d48277cad883b Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 18:58:33 -0500
Subject: [PATCH v2 5/5] Refactor parent ACL check

selfuncs.c is 8k lines long, and this makes it 30 LOC shorter.
---
 src/backend/utils/adt/selfuncs.c | 140 ++++++++++++-------------------
 1 file changed, 52 insertions(+), 88 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 19b4aaf7eb..54324a71c0 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -187,6 +187,8 @@ static char *convert_string_datum(Datum value, Oid typid, Oid collid,
 								  bool *failure);
 static double convert_timevalue_to_scalar(Datum value, Oid typid,
 										  bool *failure);
+static void recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata,
+								Oid relid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 									VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
@@ -5152,51 +5154,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 									(pg_class_aclcheck(rte->relid, userid,
 													   ACL_SELECT) == ACLCHECK_OK);
 
-								/*
-								 * If the user doesn't have permissions to
-								 * access an inheritance child relation, check
-								 * the permissions of the table actually
-								 * mentioned in the query, since most likely
-								 * the user does have that permission.  Note
-								 * that whole-table select privilege on the
-								 * parent doesn't quite guarantee that the
-								 * user could read all columns of the child.
-								 * But in practice it's unlikely that any
-								 * interesting security violation could result
-								 * from allowing access to the expression
-								 * index's stats, so we allow it anyway.  See
-								 * similar code in examine_simple_variable()
-								 * for additional comments.
-								 */
-								if (!vardata->acl_ok &&
-									root->append_rel_array != NULL)
-								{
-									AppendRelInfo *appinfo;
-									Index		varno = index->rel->relid;
-
-									appinfo = root->append_rel_array[varno];
-									while (appinfo &&
-										   planner_rt_fetch(appinfo->parent_relid,
-															root)->rtekind == RTE_RELATION)
-									{
-										varno = appinfo->parent_relid;
-										appinfo = root->append_rel_array[varno];
-									}
-									if (varno != index->rel->relid)
-									{
-										/* Repeat access check on this rel */
-										rte = planner_rt_fetch(varno, root);
-										Assert(rte->rtekind == RTE_RELATION);
-
-										userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-										vardata->acl_ok =
-											rte->securityQuals == NIL &&
-											(pg_class_aclcheck(rte->relid,
-															   userid,
-															   ACL_SELECT) == ACLCHECK_OK);
-									}
-								}
+								recheck_parent_acl(root, vardata, index->rel->relid);
 							}
 							else
 							{
@@ -5287,49 +5245,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 						(pg_class_aclcheck(rte->relid, userid,
 										   ACL_SELECT) == ACLCHECK_OK);
 
-					/*
-					 * If the user doesn't have permissions to access an
-					 * inheritance child relation, check the permissions of
-					 * the table actually mentioned in the query, since most
-					 * likely the user does have that permission.  Note that
-					 * whole-table select privilege on the parent doesn't
-					 * quite guarantee that the user could read all columns of
-					 * the child. But in practice it's unlikely that any
-					 * interesting security violation could result from
-					 * allowing access to the expression stats, so we allow it
-					 * anyway.  See similar code in examine_simple_variable()
-					 * for additional comments.
-					 */
-					if (!vardata->acl_ok &&
-						root->append_rel_array != NULL)
-					{
-						AppendRelInfo *appinfo;
-						Index		varno = onerel->relid;
-
-						appinfo = root->append_rel_array[varno];
-						while (appinfo &&
-							   planner_rt_fetch(appinfo->parent_relid,
-												root)->rtekind == RTE_RELATION)
-						{
-							varno = appinfo->parent_relid;
-							appinfo = root->append_rel_array[varno];
-						}
-						if (varno != onerel->relid)
-						{
-							/* Repeat access check on this rel */
-							rte = planner_rt_fetch(varno, root);
-							Assert(rte->rtekind == RTE_RELATION);
-
-							userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-							vardata->acl_ok =
-								rte->securityQuals == NIL &&
-								(pg_class_aclcheck(rte->relid,
-												   userid,
-												   ACL_SELECT) == ACLCHECK_OK);
-						}
-					}
-
+					recheck_parent_acl(root, vardata, onerel->relid);
 					break;
 				}
 
@@ -5339,6 +5255,54 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 	}
 }
 
+/*
+ * If the user doesn't have permissions to access an inheritance child
+ * relation, check the permissions of the table actually mentioned in the
+ * query, since most likely the user does have that permission.  Note that
+ * whole-table select privilege on the parent doesn't quite guarantee that the
+ * user could read all columns of the child.  But in practice it's unlikely
+ * that any interesting security violation could result from allowing access to
+ * the expression stats, so we allow it anyway.  See similar code in
+ * examine_simple_variable() for additional comments.
+ */
+static void
+recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, Oid relid)
+{
+	RangeTblEntry	*rte;
+	Oid		userid;
+
+	if (!vardata->acl_ok &&
+		root->append_rel_array != NULL)
+	{
+		AppendRelInfo *appinfo;
+		Index		varno = relid;
+
+		appinfo = root->append_rel_array[varno];
+		while (appinfo &&
+			   planner_rt_fetch(appinfo->parent_relid,
+								root)->rtekind == RTE_RELATION)
+		{
+			varno = appinfo->parent_relid;
+			appinfo = root->append_rel_array[varno];
+		}
+
+		if (varno != relid)
+		{
+			/* Repeat access check on this rel */
+			rte = planner_rt_fetch(varno, root);
+			Assert(rte->rtekind == RTE_RELATION);
+
+			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+
+			vardata->acl_ok =
+				rte->securityQuals == NIL &&
+				(pg_class_aclcheck(rte->relid,
+								   userid,
+								   ACL_SELECT) == ACLCHECK_OK);
+		}
+	}
+}
+
 /*
  * examine_simple_variable
  *		Handle a simple Var for examine_variable
-- 
2.17.0

#11Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Justin Pryzby (#10)
Re: extended stats on partitioned tables

On 10/8/21 12:45 AM, Justin Pryzby wrote:

On Thu, Oct 07, 2021 at 03:26:46PM -0500, Jaime Casanova wrote:

On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:

On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:

It seems like your patch should also check "inh" in examine_variable and
statext_expressions_load.

I tried adding that - I mostly kept my patches separate.
Hopefully this is more helpful than a complication.
I added at: https://commitfest.postgresql.org/35/3332/

Actually, this is confusing. Which patch is the one we should be
reviewing?

It is confusing, but not as much as I first thought. Please check the commit
messages.

The first two patches are meant to be applied to master *and* backpatched. The
first one intends to fixes the bug that non-inherited stats are being used for
queries of inheritance trees. The 2nd one fixes the regression that stats are
not collected for inheritence trees of partitioned tables (which is the only
type of stats they could ever possibly have).

I think 0001 and 0002 seem mostly fine, but it seems a bit strange to do
the (!rte->inh) check in the rel->statlist loops. AFAICS both places
could do that right at the beginning, because it does not depend on the
statistics object at all, just the RelOptInfo.

And the 3rd+4th patches (Tomas' plus my changes) allow collecting both
inherited and non-inherited stats, only in master, since it requires a catalog
change. It's a bit confusing that patch #4 removes most what I added in
patches 1 and 2. But that's exactly what's needed to collect and apply both
inherited and non-inherited stats: the first two patches avoid applying stats
collected with the wrong inheritence. That's also what's needed for the
patchset to follow the normal "apply to master and backpatch" process, rather
than 2 patches which are backpatched but not applied to master, and one which
is applied to master and not backpatched..

Yeah. Af first I was a bit confused because after applying 0003 there
are both the fixes and the "correct" way, but then I realized 0004
removes the unnecessary bits.

The one thing 0003 still needs is to rework the places that need to
touch both inh and !inh stats. The patch simply does

for (inh = 0; inh <= 1; inh++) { ... }

but that feels a bit too hackish. But if we don't know which of the two
stats exist, I'm not sure what to do about it. And I'm not sure we do
the right thing after removing children, for example (that should drop
the inheritance stats, I guess).

The 1:2 mapping between pg_statistic_ext and pg_statistic_ext_data is a
bit strange, but I can't think of a better way.

@Tomas: I just found commit 427c6b5b9, which is a remarkably similar issue
affecting column stats 15 years ago.

What can I say? The history repeats itself ...

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#11)
Re: extended stats on partitioned tables

On Wed, Nov 03, 2021 at 11:48:44PM +0100, Tomas Vondra wrote:

On 10/8/21 12:45 AM, Justin Pryzby wrote:

On Thu, Oct 07, 2021 at 03:26:46PM -0500, Jaime Casanova wrote:

On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:

On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:

It seems like your patch should also check "inh" in examine_variable and
statext_expressions_load.

I tried adding that - I mostly kept my patches separate.
Hopefully this is more helpful than a complication.
I added at: https://commitfest.postgresql.org/35/3332/

Actually, this is confusing. Which patch is the one we should be
reviewing?

It is confusing, but not as much as I first thought. Please check the commit
messages.

The first two patches are meant to be applied to master *and* backpatched. The
first one intends to fixes the bug that non-inherited stats are being used for
queries of inheritance trees. The 2nd one fixes the regression that stats are
not collected for inheritence trees of partitioned tables (which is the only
type of stats they could ever possibly have).

I think 0001 and 0002 seem mostly fine, but it seems a bit strange to do
the (!rte->inh) check in the rel->statlist loops. AFAICS both places
could do that right at the beginning, because it does not depend on the
statistics object at all, just the RelOptInfo.

I probably did this to make the code change small, to avoid indentin the whole
block.

And the 3rd+4th patches (Tomas' plus my changes) allow collecting both
inherited and non-inherited stats, only in master, since it requires a catalog
change. It's a bit confusing that patch #4 removes most what I added in
patches 1 and 2. But that's exactly what's needed to collect and apply both
inherited and non-inherited stats: the first two patches avoid applying stats
collected with the wrong inheritence. That's also what's needed for the
patchset to follow the normal "apply to master and backpatch" process, rather
than 2 patches which are backpatched but not applied to master, and one which
is applied to master and not backpatched..

Yeah. Af first I was a bit confused because after applying 0003 there
are both the fixes and the "correct" way, but then I realized 0004
removes the unnecessary bits.

This was to leave your 0003 (mostly) unchanged, so you can see and/or apply my
changes. They should be squished together.

The one thing 0003 still needs is to rework the places that need to
touch both inh and !inh stats. The patch simply does

for (inh = 0; inh <= 1; inh++) { ... }

but that feels a bit too hackish. But if we don't know which of the two
stats exist, I'm not sure what to do about it.

There's also this:

On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:

+       /* create only the "stxdinherit=false", because that always exists */
+       datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);

That'd be confusing for partitioned tables, no?
They'd always have an row with no data.
I guess it could be stxdinherit = BoolGetDatum(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE).
(not ObjectIdGetDatum).
Then, that affects the loops which delete the tuples - neither inh nor !inh is
guaranteed, unless you check relkind there, too.

Maybe the for inh<=1 loop should instead be two calls to new functions factored
out of get_relation_statistics() and RemoveStatisticsById(), which take "bool
inh".

And I'm not sure we do the right thing after removing children, for example
(that should drop the inheritance stats, I guess).

Do you mean for inheritance only ? Or partitions too ?
I think for partitions, the stats should stay.
And for inheritence, they can stay, for consistency with partitions, and since
it does no harm.

#13Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Justin Pryzby (#12)
Re: extended stats on partitioned tables

On 11/4/21 12:19 AM, Justin Pryzby wrote:

On Wed, Nov 03, 2021 at 11:48:44PM +0100, Tomas Vondra wrote:

On 10/8/21 12:45 AM, Justin Pryzby wrote:

On Thu, Oct 07, 2021 at 03:26:46PM -0500, Jaime Casanova wrote:

On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:

On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:

It seems like your patch should also check "inh" in examine_variable and
statext_expressions_load.

I tried adding that - I mostly kept my patches separate.
Hopefully this is more helpful than a complication.
I added at: https://commitfest.postgresql.org/35/3332/

Actually, this is confusing. Which patch is the one we should be
reviewing?

It is confusing, but not as much as I first thought. Please check the commit
messages.

The first two patches are meant to be applied to master *and* backpatched. The
first one intends to fixes the bug that non-inherited stats are being used for
queries of inheritance trees. The 2nd one fixes the regression that stats are
not collected for inheritence trees of partitioned tables (which is the only
type of stats they could ever possibly have).

I think 0001 and 0002 seem mostly fine, but it seems a bit strange to do
the (!rte->inh) check in the rel->statlist loops. AFAICS both places
could do that right at the beginning, because it does not depend on the
statistics object at all, just the RelOptInfo.

I probably did this to make the code change small, to avoid indentin the whole
block.

But indenting the block is not necessary. It's possible to do something
like this:

if (!rel->inh)
return 1.0;

or whatever is the "default" result for that function.

And the 3rd+4th patches (Tomas' plus my changes) allow collecting both
inherited and non-inherited stats, only in master, since it requires a catalog
change. It's a bit confusing that patch #4 removes most what I added in
patches 1 and 2. But that's exactly what's needed to collect and apply both
inherited and non-inherited stats: the first two patches avoid applying stats
collected with the wrong inheritence. That's also what's needed for the
patchset to follow the normal "apply to master and backpatch" process, rather
than 2 patches which are backpatched but not applied to master, and one which
is applied to master and not backpatched..

Yeah. Af first I was a bit confused because after applying 0003 there
are both the fixes and the "correct" way, but then I realized 0004
removes the unnecessary bits.

This was to leave your 0003 (mostly) unchanged, so you can see and/or apply my
changes. They should be squished together.

Yep.

The one thing 0003 still needs is to rework the places that need to
touch both inh and !inh stats. The patch simply does

for (inh = 0; inh <= 1; inh++) { ... }

but that feels a bit too hackish. But if we don't know which of the two
stats exist, I'm not sure what to do about it.

There's also this:

On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:

+       /* create only the "stxdinherit=false", because that always exists */
+       datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);

That'd be confusing for partitioned tables, no?
They'd always have an row with no data.
I guess it could be stxdinherit = BoolGetDatum(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE).
(not ObjectIdGetDatum).
Then, that affects the loops which delete the tuples - neither inh nor !inh is
guaranteed, unless you check relkind there, too.

Maybe the for inh<=1 loop should instead be two calls to new functions factored
out of get_relation_statistics() and RemoveStatisticsById(), which take "bool
inh".

Well, yeah. That's part of the strange 1:2 mapping between the stats
definition and data. Although, even with regular stats we have such
mapping, except the "definition" is the pg_attribute row.

And I'm not sure we do the right thing after removing children, for example
(that should drop the inheritance stats, I guess).

Do you mean for inheritance only ? Or partitions too ?
I think for partitions, the stats should stay.
And for inheritence, they can stay, for consistency with partitions, and since
it does no harm.

I think the behavior should be the same as for data in pg_statistic,
i.e. if we keep/remove those, we should do the same thing for extended
statistics.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#13)
7 attachment(s)
Re: extended stats on partitioned tables

On Thu, Nov 04, 2021 at 12:44:45AM +0100, Tomas Vondra wrote:

I probably did this to make the code change small, to avoid indentin the whole
block.

But indenting the block is not necessary. It's possible to do something
like this:

if (!rel->inh)
return 1.0;

or whatever is the "default" result for that function.

You're right. I did like that, Except in examine_variable, which already does
it with "break".

Maybe the for inh<=1 loop should instead be two calls to new functions factored
out of get_relation_statistics() and RemoveStatisticsById(), which take "bool
inh".

I did like that in a separate patch for now.
And I avoided making a !inh tuple for partitioned tables, since they're never
populated.

And I'm not sure we do the right thing after removing children, for example
(that should drop the inheritance stats, I guess).

Do you mean for inheritance only ? Or partitions too ?
I think for partitions, the stats should stay.
And for inheritence, they can stay, for consistency with partitions, and since
it does no harm.

I think the behavior should be the same as for data in pg_statistic,
i.e. if we keep/remove those, we should do the same thing for extended
statistics.

This works for column stats the way I proposed for extended stats: child stats
are never removed, neither when the only child is dropped, nor when re-running
ANALYZE (actually, that part is odd).

I can stop sending patches if it makes it hard to reconcile, but I wanted to
put it "on paper" to see/show what the patch series would look like, for v15
and back branches.

--
Justin

Attachments:

v3-0001-Do-not-use-extended-statistics-on-inheritence-tre.patchtext/x-diff; charset=us-asciiDownload
From 907046c93d8f20e8edcc4fb0334feed86d194b81 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 19:42:41 -0500
Subject: [PATCH v3 1/6] Do not use extended statistics on inheritence trees..

Since 859b3003de, inherited ext stats are not built.
However, the non-inherited stats stats were incorrectly used during planning of
queries with inheritence heirarchies.

Since the ext stats do not include child tables, they can lead to worse
estimates.  This is remarkably similar to 427c6b5b9, which affected column
statistics 15 years ago.

choose_best_statistics is handled a bit differently (in the calling function),
because it isn't passed rel nor rel->inh, and it's an exported function, so
avoid changing its signature in back branches.

https://www.postgresql.org/message-id/flat/20210925223152.GA7877@telsasoft.com

Backpatch to v10
---
 src/backend/statistics/dependencies.c   |  5 +++++
 src/backend/statistics/extended_stats.c |  5 +++++
 src/backend/utils/adt/selfuncs.c        |  9 +++++++++
 src/test/regress/expected/stats_ext.out | 23 +++++++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 14 ++++++++++++++
 5 files changed, 56 insertions(+)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..015b9e28f6 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1410,11 +1410,16 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* unique expressions */
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return 1.0;
+
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 69ca52094f..b9e755f44e 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1694,6 +1694,11 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return sel;
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 10895fb287..a0932e39e1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,6 +3913,11 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
+	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return false;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
@@ -5232,6 +5237,10 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
+			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+			if (planner_rt_fetch(onerel->relid, root)->inh)
+				break;
+
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index c60ba45aba..5c15e44bd6 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,6 +176,29 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
+-- Ensure non-inherited stats are not applied to inherited query
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+DROP TABLE stxdinh, stxdinh1;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 6fb37962a7..610f7ed17f 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,6 +112,20 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
+-- Ensure non-inherited stats are not applied to inherited query
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+DROP TABLE stxdinh, stxdinh1;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.17.0

v3-0002-Build-inherited-extended-stats-on-partitioned-tab.patchtext/x-diff; charset=us-asciiDownload
From 435b154ca177053f6ddc54f7362f5c6aaa486215 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Sat, 25 Sep 2021 23:01:21 +0200
Subject: [PATCH v3 2/6] Build inherited extended stats on partitioned tables

Since 859b3003de, ext stats on partitioned tables are not built, which is a
regression.  For back branches, pg_statistic_ext cannot support both inherited
(FROM) and non-inherited (FROM ONLY) stats on inheritence heirarchies.
But there's no issue building inherited stats for partitioned tables, which
are empty, so cannot have non-inherited stats.

See also: 8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b

https://www.postgresql.org/message-id/20210923212624.GI831%40telsasoft.com

Backpatch to v10
---
 src/backend/commands/analyze.c          |  5 ++++-
 src/backend/statistics/dependencies.c   |  2 +-
 src/backend/statistics/extended_stats.c |  2 +-
 src/backend/utils/adt/selfuncs.c        |  9 ++++++---
 src/test/regress/expected/stats_ext.out | 19 +++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 10 ++++++++++
 6 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 4928702aec..acc994e1e8 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -548,6 +548,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
+		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -611,13 +612,15 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
+		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
+
 		/*
 		 * Build extended statistics (if there are any).
 		 *
 		 * For now we only build extended statistics on individual relations,
 		 * not for relations representing inheritance trees.
 		 */
-		if (!inh)
+		if (build_ext_stats)
 			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
 									   attr_cnt, vacattrstats);
 	}
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 015b9e28f6..36121d5a27 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1417,7 +1417,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			unique_exprs_cnt;
 
 	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return 1.0;
 
 	/* check if there's any stats that might be useful for us. */
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index b9e755f44e..5fbdbf0164 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1697,7 +1697,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return sel;
 
 	/* check if there's any stats that might be useful for us. */
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index a0932e39e1..b15f14e1a0 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3916,7 +3916,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
 
 	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return false;
 
 	/* bail out immediately if the table has no extended statistics */
@@ -5238,8 +5238,11 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				break;
 
 			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-			if (planner_rt_fetch(onerel->relid, root)->inh)
-				break;
+			{
+				RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
+				if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
+					break;
+			}
 
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 5c15e44bd6..67234b9fc2 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -199,6 +199,25 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+ ?column? 
+----------
+        1
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+        10 |     10
+(1 row)
+
+DROP TABLE stxdinp;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 610f7ed17f..2371043ca1 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -126,6 +126,16 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+DROP TABLE stxdinp;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.17.0

v3-0003-Add-stxdinherit-build-inherited-extended-stats-on.patchtext/x-diff; charset=us-asciiDownload
From c2f84b0616376c579fef419d54aa6e6cb3b34be7 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Sat, 25 Sep 2021 21:27:10 +0200
Subject: [PATCH v3 3/6] Add stxdinherit; build inherited extended stats on
 inheritence parents

pg_statistic has an inherited flag which is part of the unique index, but
pg_statistic_ext has never had that.  In back branches, pg_statistic cannot
store both inherited and non-inherited stats.  So it stores non-inherited stats
(FROM ONLY) for inheritence parents and inherited stats for partitioned tables.

This patch for v15 allows storing both inherited and non-inherited stats for
non-empty inheritence parents, and avoids the above, confusing definition.
---
 doc/src/sgml/catalogs.sgml                  |  23 +++
 src/backend/catalog/system_views.sql        |   1 +
 src/backend/commands/analyze.c              |  15 +-
 src/backend/commands/statscmds.c            |  20 ++-
 src/backend/optimizer/util/plancat.c        | 186 +++++++++++---------
 src/backend/statistics/dependencies.c       |  12 +-
 src/backend/statistics/extended_stats.c     |  71 ++++----
 src/backend/statistics/mcv.c                |   9 +-
 src/backend/statistics/mvdistinct.c         |   5 +-
 src/backend/utils/adt/selfuncs.c            |   2 +-
 src/backend/utils/cache/syscache.c          |   6 +-
 src/include/catalog/pg_statistic_ext_data.h |   4 +-
 src/include/nodes/pathnodes.h               |   1 +
 src/include/statistics/statistics.h         |   9 +-
 src/test/regress/expected/rules.out         |   1 +
 15 files changed, 215 insertions(+), 150 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be73f..720e83e523 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7509,6 +7509,19 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
    created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>.
   </para>
 
+  <para>
+   Normally there is one entry, with <structfield>stxdinherit</structfield> =
+   <literal>false</literal>, for each statistics object that has been analyzed.
+   If the table has inheritance children, a second entry with
+   <structfield>stxdinherit</structfield> = <literal>true</literal> is also created.
+   This row represents the statistics object over the inheritance tree, i.e.,
+   statistics for the data you'd see with
+   <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>,
+   whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row
+   represents the results of
+   <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>.
+  </para>
+
   <para>
    Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>,
    <structname>pg_statistic_ext_data</structname> should not be
@@ -7548,6 +7561,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stxdinherit</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the stats include inheritance child columns, not just the
+       values in the specified relation
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb560955cd..36aa3dd3ab 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS
            ) AS attnames,
            pg_get_statisticsobjdef_expressions(s.oid) as exprs,
            s.stxkind AS kinds,
+           sd.stxdinherit AS inherited,
            sd.stxdndistinct AS n_distinct,
            sd.stxddependencies AS dependencies,
            m.most_common_vals,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index acc994e1e8..88c6976ed8 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -548,7 +548,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
-		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -612,17 +611,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
-		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
-
-		/*
-		 * Build extended statistics (if there are any).
-		 *
-		 * For now we only build extended statistics on individual relations,
-		 * not for relations representing inheritance trees.
-		 */
-		if (build_ext_stats)
-			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
-									   attr_cnt, vacattrstats);
+		/* Build extended statistics (if there are any). */
+		BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows,
+								   attr_cnt, vacattrstats);
 	}
 
 	pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 8f1550ec80..4395d878c7 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -524,6 +524,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
 
+	/* create only the "stxdinherit=false", because that always exists */
+	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);
+
 	/* no statistics built yet */
 	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
 	datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
@@ -726,6 +729,7 @@ RemoveStatisticsById(Oid statsOid)
 	HeapTuple	tup;
 	Form_pg_statistic_ext statext;
 	Oid			relid;
+	int			inh;
 
 	/*
 	 * First delete the pg_statistic_ext_data tuple holding the actual
@@ -733,14 +737,20 @@ RemoveStatisticsById(Oid statsOid)
 	 */
 	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
-	tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
+	/* hack to delete both stxdinherit = true/false */
+	for (inh = 0; inh <= 1; inh++)
+	{
+		tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+							  BoolGetDatum(inh));
 
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+		if (!HeapTupleIsValid(tup)) /* should not happen */
+			// elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+			continue;
 
-	CatalogTupleDelete(relation, &tup->t_self);
+		CatalogTupleDelete(relation, &tup->t_self);
 
-	ReleaseSysCache(tup);
+		ReleaseSysCache(tup);
+	}
 
 	table_close(relation, RowExclusiveLock);
 
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index c5194fdbbf..154d48a330 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
+#include "catalog/pg_statistic_ext_data.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -1311,127 +1312,144 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 	{
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
+		Form_pg_statistic_ext_data dataForm;
 		HeapTuple	htup;
 		HeapTuple	dtup;
 		Bitmapset  *keys = NULL;
 		List	   *exprs = NIL;
 		int			i;
+		int			inh;
 
 		htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
 		if (!HeapTupleIsValid(htup))
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-		if (!HeapTupleIsValid(dtup))
-			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
-		/*
-		 * First, build the array of columns covered.  This is ultimately
-		 * wasted if no stats within the object have actually been built, but
-		 * it doesn't seem worth troubling over that case.
-		 */
-		for (i = 0; i < staForm->stxkeys.dim1; i++)
-			keys = bms_add_member(keys, staForm->stxkeys.values[i]);
-
 		/*
-		 * Preprocess expressions (if any). We read the expressions, run them
-		 * through eval_const_expressions, and fix the varnos.
+		 * Hack to load stats with stxdinherit true/false - there should be
+		 * a better way to do this, I guess.
 		 */
+		for (inh = 0; inh <= 1; inh++)
 		{
-			bool		isnull;
-			Datum		datum;
+			dtup = SearchSysCache2(STATEXTDATASTXOID,
+								   ObjectIdGetDatum(statOid), BoolGetDatum((bool) inh));
+			if (!HeapTupleIsValid(dtup))
+				continue;
 
-			/* decode expression (if any) */
-			datum = SysCacheGetAttr(STATEXTOID, htup,
-									Anum_pg_statistic_ext_stxexprs, &isnull);
+			dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
 
-			if (!isnull)
+			/*
+			 * First, build the array of columns covered.  This is ultimately
+			 * wasted if no stats within the object have actually been built, but
+			 * it doesn't seem worth troubling over that case.
+			 */
+			for (i = 0; i < staForm->stxkeys.dim1; i++)
+				keys = bms_add_member(keys, staForm->stxkeys.values[i]);
+
+			/*
+			 * Preprocess expressions (if any). We read the expressions, run them
+			 * through eval_const_expressions, and fix the varnos.
+			 */
 			{
-				char	   *exprsString;
+				bool		isnull;
+				Datum		datum;
 
-				exprsString = TextDatumGetCString(datum);
-				exprs = (List *) stringToNode(exprsString);
-				pfree(exprsString);
+				/* decode expression (if any) */
+				datum = SysCacheGetAttr(STATEXTOID, htup,
+										Anum_pg_statistic_ext_stxexprs, &isnull);
 
-				/*
-				 * Run the expressions through eval_const_expressions. This is
-				 * not just an optimization, but is necessary, because the
-				 * planner will be comparing them to similarly-processed qual
-				 * clauses, and may fail to detect valid matches without this.
-				 * We must not use canonicalize_qual, however, since these
-				 * aren't qual expressions.
-				 */
-				exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+				if (!isnull)
+				{
+					char	   *exprsString;
 
-				/* May as well fix opfuncids too */
-				fix_opfuncids((Node *) exprs);
+					exprsString = TextDatumGetCString(datum);
+					exprs = (List *) stringToNode(exprsString);
+					pfree(exprsString);
 
-				/*
-				 * Modify the copies we obtain from the relcache to have the
-				 * correct varno for the parent relation, so that they match
-				 * up correctly against qual clauses.
-				 */
-				if (varno != 1)
-					ChangeVarNodes((Node *) exprs, 1, varno, 0);
+					/*
+					 * Run the expressions through eval_const_expressions. This is
+					 * not just an optimization, but is necessary, because the
+					 * planner will be comparing them to similarly-processed qual
+					 * clauses, and may fail to detect valid matches without this.
+					 * We must not use canonicalize_qual, however, since these
+					 * aren't qual expressions.
+					 */
+					exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+
+					/* May as well fix opfuncids too */
+					fix_opfuncids((Node *) exprs);
+
+					/*
+					 * Modify the copies we obtain from the relcache to have the
+					 * correct varno for the parent relation, so that they match
+					 * up correctly against qual clauses.
+					 */
+					if (varno != 1)
+						ChangeVarNodes((Node *) exprs, 1, varno, 0);
+				}
 			}
-		}
 
-		/* add one StatisticExtInfo for each kind built */
-		if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			/* add one StatisticExtInfo for each kind built */
+			if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_NDISTINCT;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_NDISTINCT;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_DEPENDENCIES;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_DEPENDENCIES;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_MCV;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_MCV;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_EXPRESSIONS;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_EXPRESSIONS;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
+
+				stainfos = lappend(stainfos, info);
+			}
 
-			stainfos = lappend(stainfos, info);
+			ReleaseSysCache(dtup);
 		}
 
 		ReleaseSysCache(htup);
-		ReleaseSysCache(dtup);
 		bms_free(keys);
 	}
 
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 36121d5a27..c62fc398ce 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -618,14 +618,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums)
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
 MVDependencies *
-statext_dependencies_load(Oid mvoid)
+statext_dependencies_load(Oid mvoid, bool inh)
 {
 	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid),
+						   BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
@@ -1603,6 +1605,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
 			continue;
 
+		/* skip statistics with mismatching stxdinherit value */
+		if (stat->inherit != rte->inh)
+			continue;
+
 		/*
 		 * Count matching attributes - we have to undo the attnum offsets. The
 		 * input attribute numbers are not offset (expressions are not
@@ -1649,7 +1655,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (nmatched + nexprs < 2)
 			continue;
 
-		deps = statext_dependencies_load(stat->statOid);
+		deps = statext_dependencies_load(stat->statOid, rte->inh);
 
 		/*
 		 * The expressions may be represented by different attnums in the
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 5fbdbf0164..ff2ef67dd2 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -77,7 +77,7 @@ typedef struct StatExtEntry
 static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
 static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 											int nvacatts, VacAttrStats **vacatts);
-static void statext_store(Oid statOid,
+static void statext_store(Oid statOid, bool inh,
 						  MVNDistinct *ndistinct, MVDependencies *dependencies,
 						  MCVList *mcv, Datum exprs, VacAttrStats **stats);
 static int	statext_compute_stattarget(int stattarget,
@@ -110,7 +110,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat,
  * requested stats, and serializes them back into the catalog.
  */
 void
-BuildRelationExtStatistics(Relation onerel, double totalrows,
+BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 						   int numrows, HeapTuple *rows,
 						   int natts, VacAttrStats **vacattrstats)
 {
@@ -230,7 +230,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		}
 
 		/* store the statistics in the catalog */
-		statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats);
+		statext_store(stat->statOid, inh,
+					  ndistinct, dependencies, mcv, exprstats, stats);
 
 		/* for reporting progress */
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -781,7 +782,7 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
  *	tuple.
  */
 static void
-statext_store(Oid statOid,
+statext_store(Oid statOid, bool inh,
 			  MVNDistinct *ndistinct, MVDependencies *dependencies,
 			  MCVList *mcv, Datum exprs, VacAttrStats **stats)
 {
@@ -790,14 +791,19 @@ statext_store(Oid statOid,
 				oldtup;
 	Datum		values[Natts_pg_statistic_ext_data];
 	bool		nulls[Natts_pg_statistic_ext_data];
-	bool		replaces[Natts_pg_statistic_ext_data];
 
 	pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
 	memset(nulls, true, sizeof(nulls));
-	memset(replaces, false, sizeof(replaces));
 	memset(values, 0, sizeof(values));
 
+	/* basic info */
+	values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid);
+	nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false;
+
+	values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh);
+	nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false;
+
 	/*
 	 * Construct a new pg_statistic_ext_data tuple, replacing the calculated
 	 * stats.
@@ -830,25 +836,27 @@ statext_store(Oid statOid,
 		values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs;
 	}
 
-	/* always replace the value (either by bytea or NULL) */
-	replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* there should already be a pg_statistic_ext_data tuple */
-	oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-	if (!HeapTupleIsValid(oldtup))
+	/*
+	 * Delete the old tuple if it exists, and insert a new one. It's easier
+	 * than trying to update or insert, based on various conditions.
+	 *
+	 * There should always be a pg_statistic_ext_data tuple for inh=false,
+	 * but there may be none for inh=true yet.
+	 */
+	oldtup = SearchSysCache2(STATEXTDATASTXOID,
+							 ObjectIdGetDatum(statOid),
+							 BoolGetDatum(inh));
+	if (HeapTupleIsValid(oldtup))
+	{
+		CatalogTupleDelete(pg_stextdata, &(oldtup->t_self));
+		ReleaseSysCache(oldtup);
+	}
+	else if (!inh)
 		elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 
-	/* replace it */
-	stup = heap_modify_tuple(oldtup,
-							 RelationGetDescr(pg_stextdata),
-							 values,
-							 nulls,
-							 replaces);
-	ReleaseSysCache(oldtup);
-	CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup);
+	/* form a new tuple */
+	stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls);
+	CatalogTupleInsert(pg_stextdata, stup);
 
 	heap_freetuple(stup);
 
@@ -1234,7 +1242,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs,
  * further tiebreakers are needed.
  */
 StatisticExtInfo *
-choose_best_statistics(List *stats, char requiredkind,
+choose_best_statistics(List *stats, char requiredkind, bool inh,
 					   Bitmapset **clause_attnums, List **clause_exprs,
 					   int nclauses)
 {
@@ -1256,6 +1264,10 @@ choose_best_statistics(List *stats, char requiredkind,
 		if (info->kind != requiredkind)
 			continue;
 
+		/* skip statistics with mismatching inheritance flag */
+		if (info->inherit != inh)
+			continue;
+
 		/*
 		 * Collect attributes and expressions in remaining (unestimated)
 		 * clauses fully covered by this statistic object.
@@ -1694,11 +1706,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
-	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
-
-	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return sel;
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
@@ -1751,7 +1758,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		Bitmapset  *simple_clauses;
 
 		/* find the best suited statistics object for these attnums */
-		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
+		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
 									  list_length(clauses));
 
@@ -1840,7 +1847,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			MCVList    *mcv_list;
 
 			/* Load the MCV list stored in the statistics object */
-			mcv_list = statext_mcv_load(stat->statOid);
+			mcv_list = statext_mcv_load(stat->statOid, rte->inh);
 
 			/*
 			 * Compute the selectivity of the ORed list of clauses covered by
@@ -2411,7 +2418,7 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index b350fc5f7b..75d7d35caa 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups,
  *		Load the MCV list for the indicated pg_statistic_ext tuple.
  */
 MCVList *
-statext_mcv_load(Oid mvoid)
+statext_mcv_load(Oid mvoid, bool inh)
 {
 	MCVList    *result;
 	bool		isnull;
 	Datum		mcvlist;
-	HeapTuple	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	HeapTuple	htup = SearchSysCache2(STATEXTDATASTXOID,
+									   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
@@ -2039,11 +2040,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
 	MCVList    *mcv;
 	Selectivity s = 0.0;
 
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
 	/* match/mismatch bitmap for each MCV item */
 	bool	   *matches = NULL;
 
 	/* load the MCV list stored in the statistics object */
-	mcv = statext_mcv_load(stat->statOid);
+	mcv = statext_mcv_load(stat->statOid, rte->inh);
 
 	/* build a match bitmap for the clauses */
 	matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs,
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 4481312d61..ab1f10d6c0 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
  *		Load the ndistinct value for the indicated pg_statistic_ext tuple
  */
 MVNDistinct *
-statext_ndistinct_load(Oid mvoid)
+statext_ndistinct_load(Oid mvoid, bool inh)
 {
 	MVNDistinct *result;
 	bool		isnull;
 	Datum		ndist;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index b15f14e1a0..aab3bd7696 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -4008,7 +4008,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 
 	Assert(nmatches_vars + nmatches_exprs > 1);
 
-	stats = statext_ndistinct_load(statOid);
+	stats = statext_ndistinct_load(statOid, rte->inh);
 
 	/*
 	 * If we have a match, search it for the specific item that matches (there
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 56870b46e4..6194751e99 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -763,11 +763,11 @@ static const struct cachedesc cacheinfo[] = {
 		32
 	},
 	{StatisticExtDataRelationId,	/* STATEXTDATASTXOID */
-		StatisticExtDataStxoidIndexId,
-		1,
+		StatisticExtDataStxoidInhIndexId,
+		2,
 		{
 			Anum_pg_statistic_ext_data_stxoid,
-			0,
+			Anum_pg_statistic_ext_data_stxdinherit,
 			0,
 			0
 		},
diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h
index 7b73b790d2..8ffd8b68cd 100644
--- a/src/include/catalog/pg_statistic_ext_data.h
+++ b/src/include/catalog/pg_statistic_ext_data.h
@@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId)
 {
 	Oid			stxoid BKI_LOOKUP(pg_statistic_ext);	/* statistics object
 														 * this data is for */
+	bool		stxdinherit;	/* true if inheritance children are included */
 
 #ifdef CATALOG_VARLEN			/* variable-length fields start here */
 
@@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data;
 
 DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431);
 
-DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops));
+DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops, stxdinherit bool_ops));
+
 
 #endif							/* PG_STATISTIC_EXT_DATA_H */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 2a53a6e344..884bda7232 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -934,6 +934,7 @@ typedef struct StatisticExtInfo
 	NodeTag		type;
 
 	Oid			statOid;		/* OID of the statistics row */
+	bool		inherit;		/* includes child relations */
 	RelOptInfo *rel;			/* back-link to statistic's table */
 	char		kind;			/* statistics kind of this entry */
 	Bitmapset  *keys;			/* attnums of the columns covered */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 326cf26fea..02ee41b9f3 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -94,11 +94,11 @@ typedef struct MCVList
 	MCVItem		items[FLEXIBLE_ARRAY_MEMBER];	/* array of MCV items */
 } MCVList;
 
-extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
-extern MVDependencies *statext_dependencies_load(Oid mvoid);
-extern MCVList *statext_mcv_load(Oid mvoid);
+extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh);
+extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh);
+extern MCVList *statext_mcv_load(Oid mvoid, bool inh);
 
-extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
+extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 									   int numrows, HeapTuple *rows,
 									   int natts, VacAttrStats **vacattrstats);
 extern int	ComputeExtStatisticsRows(Relation onerel,
@@ -121,6 +121,7 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
 												  bool is_or);
 extern bool has_stats_of_kind(List *stats, char requiredkind);
 extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
+												bool inh,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..8ab5187ccb 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2425,6 +2425,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname,
              JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames,
     pg_get_statisticsobjdef_expressions(s.oid) AS exprs,
     s.stxkind AS kinds,
+    sd.stxdinherit AS inherited,
     sd.stxdndistinct AS n_distinct,
     sd.stxddependencies AS dependencies,
     m.most_common_vals,
-- 
2.17.0

v3-0004-f-check-inh.patchtext/x-diff; charset=us-asciiDownload
From 9125ab1f8ab98ab86577e0dac044749ac0f765fb Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 18:20:03 -0500
Subject: [PATCH v3 4/6] f! check inh

statext_expressions_load
examine_variable
estimate_multivariate_ndistinct

TODO: pg_stats_ext_exprs needs to expose inh flag
---
 src/backend/statistics/dependencies.c   |  4 ----
 src/backend/statistics/extended_stats.c |  5 +++--
 src/backend/utils/adt/selfuncs.c        | 27 ++++++++-----------------
 src/include/statistics/statistics.h     |  2 +-
 src/test/regress/expected/stats_ext.out | 11 ++++++++--
 src/test/regress/sql/stats_ext.sql      |  4 +++-
 6 files changed, 24 insertions(+), 29 deletions(-)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index c62fc398ce..02cf0efc66 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1418,10 +1418,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
-	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return 1.0;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index ff2ef67dd2..b1461f3066 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1706,6 +1706,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
@@ -2408,7 +2409,7 @@ serialize_expr_stats(AnlExprData *exprdata, int nexprs)
  * identified by the supplied index.
  */
 HeapTuple
-statext_expressions_load(Oid stxoid, int idx)
+statext_expressions_load(Oid stxoid, bool inh, int idx)
 {
 	bool		isnull;
 	Datum		value;
@@ -2418,7 +2419,7 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false));
+	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), inh);
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index aab3bd7696..19b4aaf7eb 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3915,10 +3915,6 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	StatisticExtInfo *matched_info = NULL;
 	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
 
-	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return false;
-
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
 		return false;
@@ -5237,13 +5233,6 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
-			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-			{
-				RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
-				if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-					break;
-			}
-
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
@@ -5262,22 +5251,22 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				/* found a match, see if we can extract pg_statistic row */
 				if (equal(node, expr))
 				{
-					HeapTuple	t = statext_expressions_load(info->statOid, pos);
-
-					/* Get statistics object's table for permission check */
-					RangeTblEntry *rte;
+					RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
 					Oid			userid;
+					bool		inh;
 
-					vardata->statsTuple = t;
+					Assert(rte->rtekind == RTE_RELATION);
 
 					/*
 					 * XXX Not sure if we should cache the tuple somewhere.
 					 * Now we just create a new copy every time.
 					 */
-					vardata->freefunc = ReleaseDummy;
+					inh = root->append_rel_array == NULL ? false :
+						root->append_rel_array[onerel->relid]->parent_relid != 0;
+					vardata->statsTuple =
+						statext_expressions_load(info->statOid, inh, pos);
 
-					rte = planner_rt_fetch(onerel->relid, root);
-					Assert(rte->rtekind == RTE_RELATION);
+					vardata->freefunc = ReleaseDummy;
 
 					/*
 					 * Use checkAsUser if it's set, in case we're accessing
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 02ee41b9f3..3868e43f8a 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -125,6 +125,6 @@ extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
-extern HeapTuple statext_expressions_load(Oid stxoid, int idx);
+extern HeapTuple statext_expressions_load(Oid stxoid, bool inh, int idx);
 
 #endif							/* STATISTICS_H */
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 67234b9fc2..35edc6a361 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,7 +176,6 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
--- Ensure non-inherited stats are not applied to inherited query
 CREATE TABLE stxdinh(i int, j int);
 CREATE TABLE stxdinh1() INHERITS(stxdinh);
 INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
@@ -191,11 +190,19 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
 -- Since the stats object does not include inherited stats, it should not affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
  estimated | actual 
 -----------+--------
-      1000 |   1008
+      1008 |   1008
+(1 row)
+
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+         9 |      9
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 2371043ca1..8490da9558 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,7 +112,6 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
--- Ensure non-inherited stats are not applied to inherited query
 CREATE TABLE stxdinh(i int, j int);
 CREATE TABLE stxdinh1() INHERITS(stxdinh);
 INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
@@ -122,8 +121,11 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
 -- Since the stats object does not include inherited stats, it should not affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
 -- Ensure inherited stats ARE applied to inherited query in partitioned table
-- 
2.17.0

v3-0005-Maybe-better-than-looping-twice.-For-partitioned-.patchtext/x-diff; charset=us-asciiDownload
From 0b84c954863ad04ff848b3d0fdbc25bb7e4b2f85 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 3 Nov 2021 20:08:42 -0500
Subject: [PATCH v3 5/6] Maybe better than looping twice.  For partitioned
 tables, create only "inherited" stats_ext.

---
 src/backend/commands/statscmds.c     |  63 +++++---
 src/backend/optimizer/util/plancat.c | 227 ++++++++++++++-------------
 2 files changed, 151 insertions(+), 139 deletions(-)

diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 4395d878c7..89e9ad5843 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -45,6 +45,7 @@
 static char *ChooseExtendedStatisticName(const char *name1, const char *name2,
 										 const char *label, Oid namespaceid);
 static char *ChooseExtendedStatisticNameAddition(List *exprs);
+static void RemoveStatisticsByIdWorker(Relation pg_statistic_ext, Oid statsOid, bool inh);
 
 
 /* qsort comparator for the attnums in CreateStatistics */
@@ -524,8 +525,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
 
-	/* create only the "stxdinherit=false", because that always exists */
-	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);
+	/* create "stxdinherit=false", because that always exists, except for
+	 * partitioned tables, for which that's meaningless */
+	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
 
 	/* no statistics built yet */
 	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
@@ -729,30 +731,7 @@ RemoveStatisticsById(Oid statsOid)
 	HeapTuple	tup;
 	Form_pg_statistic_ext statext;
 	Oid			relid;
-	int			inh;
-
-	/*
-	 * First delete the pg_statistic_ext_data tuple holding the actual
-	 * statistical data.
-	 */
-	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
-
-	/* hack to delete both stxdinherit = true/false */
-	for (inh = 0; inh <= 1; inh++)
-	{
-		tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
-							  BoolGetDatum(inh));
-
-		if (!HeapTupleIsValid(tup)) /* should not happen */
-			// elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
-			continue;
-
-		CatalogTupleDelete(relation, &tup->t_self);
-
-		ReleaseSysCache(tup);
-	}
-
-	table_close(relation, RowExclusiveLock);
+	char relkind;
 
 	/*
 	 * Delete the pg_statistic_ext tuple.  Also send out a cache inval on the
@@ -767,6 +746,7 @@ RemoveStatisticsById(Oid statsOid)
 
 	statext = (Form_pg_statistic_ext) GETSTRUCT(tup);
 	relid = statext->stxrelid;
+	relkind = get_rel_relkind(relid);
 
 	CacheInvalidateRelcacheByRelid(relid);
 
@@ -775,6 +755,37 @@ RemoveStatisticsById(Oid statsOid)
 	ReleaseSysCache(tup);
 
 	table_close(relation, RowExclusiveLock);
+
+	/*
+	 * Delete the pg_statistic_ext_data tuples holding the actual
+	 * statistical data.
+	 */
+	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
+
+	if (relkind != RELKIND_PARTITIONED_TABLE)
+		RemoveStatisticsByIdWorker(relation, statsOid, false);
+	RemoveStatisticsByIdWorker(relation, statsOid, true);
+
+	table_close(relation, RowExclusiveLock);
+}
+
+void
+RemoveStatisticsByIdWorker(Relation pg_statistic_ext, Oid statsOid, bool inh)
+{
+	HeapTuple	tup;
+	tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+						  BoolGetDatum(inh));
+
+	/* should not happen,  .. */
+	if (!HeapTupleIsValid(tup))
+	{
+		if (!inh)
+			elog(ERROR, "cache lookup failed for statistics data %u inh %d", statsOid, inh);
+		return;
+	}
+
+	CatalogTupleDelete(pg_statistic_ext, &tup->t_self);
+	ReleaseSysCache(tup);
 }
 
 /*
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 154d48a330..b1d8847529 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -81,6 +81,9 @@ static void set_baserel_partition_key_exprs(Relation relation,
 static void set_baserel_partition_constraint(Relation relation,
 											 RelOptInfo *rel);
 
+static void get_relation_statistics_worker(RelOptInfo *rel, List **stainfos,
+	Oid statOid, HeapTuple htup, Form_pg_statistic_ext staForm, Index varno,
+	bool inh);
 
 /*
  * get_relation_info -
@@ -1301,7 +1304,6 @@ get_relation_constraints(PlannerInfo *root,
 static List *
 get_relation_statistics(RelOptInfo *rel, Relation relation)
 {
-	Index		varno = rel->relid;
 	List	   *statoidlist;
 	List	   *stainfos = NIL;
 	ListCell   *l;
@@ -1312,150 +1314,149 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 	{
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
-		Form_pg_statistic_ext_data dataForm;
 		HeapTuple	htup;
-		HeapTuple	dtup;
-		Bitmapset  *keys = NULL;
-		List	   *exprs = NIL;
-		int			i;
-		int			inh;
 
 		htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
 		if (!HeapTupleIsValid(htup))
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		/*
-		 * Hack to load stats with stxdinherit true/false - there should be
-		 * a better way to do this, I guess.
-		 */
-		for (inh = 0; inh <= 1; inh++)
-		{
-			dtup = SearchSysCache2(STATEXTDATASTXOID,
-								   ObjectIdGetDatum(statOid), BoolGetDatum((bool) inh));
-			if (!HeapTupleIsValid(dtup))
-				continue;
+		get_relation_statistics_worker(rel, &stainfos, statOid, htup, staForm, rel->relid, false);
+		get_relation_statistics_worker(rel, &stainfos, statOid, htup, staForm, rel->relid, true);
 
-			dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
+		ReleaseSysCache(htup);
+	}
 
-			/*
-			 * First, build the array of columns covered.  This is ultimately
-			 * wasted if no stats within the object have actually been built, but
-			 * it doesn't seem worth troubling over that case.
-			 */
-			for (i = 0; i < staForm->stxkeys.dim1; i++)
-				keys = bms_add_member(keys, staForm->stxkeys.values[i]);
+	list_free(statoidlist);
 
-			/*
-			 * Preprocess expressions (if any). We read the expressions, run them
-			 * through eval_const_expressions, and fix the varnos.
-			 */
-			{
-				bool		isnull;
-				Datum		datum;
+	return stainfos;
+}
 
-				/* decode expression (if any) */
-				datum = SysCacheGetAttr(STATEXTOID, htup,
-										Anum_pg_statistic_ext_stxexprs, &isnull);
+static void
+get_relation_statistics_worker(RelOptInfo *rel, List **stainfos, Oid statOid, HeapTuple htup, Form_pg_statistic_ext staForm, Index varno, bool inh)
+{
+	Form_pg_statistic_ext_data dataForm;
+	HeapTuple	dtup;
+	Bitmapset  *keys = NULL;
+	List	   *exprs = NIL;
+
+	dtup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(statOid), BoolGetDatum(inh));
+	if (!HeapTupleIsValid(dtup))
+		return;
 
-				if (!isnull)
-				{
-					char	   *exprsString;
+	dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
 
-					exprsString = TextDatumGetCString(datum);
-					exprs = (List *) stringToNode(exprsString);
-					pfree(exprsString);
+	/*
+	 * First, build the array of columns covered.  This is ultimately
+	 * wasted if no stats within the object have actually been built, but
+	 * it doesn't seem worth troubling over that case.
+	 */
+	for (int i = 0; i < staForm->stxkeys.dim1; i++)
+		keys = bms_add_member(keys, staForm->stxkeys.values[i]);
 
-					/*
-					 * Run the expressions through eval_const_expressions. This is
-					 * not just an optimization, but is necessary, because the
-					 * planner will be comparing them to similarly-processed qual
-					 * clauses, and may fail to detect valid matches without this.
-					 * We must not use canonicalize_qual, however, since these
-					 * aren't qual expressions.
-					 */
-					exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+	/*
+	 * Preprocess expressions (if any). We read the expressions, run them
+	 * through eval_const_expressions, and fix the varnos.
+	 */
+	{
+		bool		isnull;
+		Datum		datum;
 
-					/* May as well fix opfuncids too */
-					fix_opfuncids((Node *) exprs);
+		/* decode expression (if any) */
+		datum = SysCacheGetAttr(STATEXTOID, htup,
+								Anum_pg_statistic_ext_stxexprs, &isnull);
 
-					/*
-					 * Modify the copies we obtain from the relcache to have the
-					 * correct varno for the parent relation, so that they match
-					 * up correctly against qual clauses.
-					 */
-					if (varno != 1)
-						ChangeVarNodes((Node *) exprs, 1, varno, 0);
-				}
-			}
-
-			/* add one StatisticExtInfo for each kind built */
-			if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		if (!isnull)
+		{
+			char	   *exprsString;
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_NDISTINCT;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+			exprsString = TextDatumGetCString(datum);
+			exprs = (List *) stringToNode(exprsString);
+			pfree(exprsString);
 
-				stainfos = lappend(stainfos, info);
-			}
+			/*
+			 * Run the expressions through eval_const_expressions. This is
+			 * not just an optimization, but is necessary, because the
+			 * planner will be comparing them to similarly-processed qual
+			 * clauses, and may fail to detect valid matches without this.
+			 * We must not use canonicalize_qual, however, since these
+			 * aren't qual expressions.
+			 */
+			exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
 
-			if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			/* May as well fix opfuncids too */
+			fix_opfuncids((Node *) exprs);
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_DEPENDENCIES;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+			/*
+			 * Modify the copies we obtain from the relcache to have the
+			 * correct varno for the parent relation, so that they match
+			 * up correctly against qual clauses.
+			 */
+			if (varno != 1)
+				ChangeVarNodes((Node *) exprs, 1, varno, 0);
+		}
+	}
 
-				stainfos = lappend(stainfos, info);
-			}
+	/* add one StatisticExtInfo for each kind built */
+	if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_NDISTINCT;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_MCV;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+		*stainfos = lappend(*stainfos, info);
+	}
 
-				stainfos = lappend(stainfos, info);
-			}
+	if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_DEPENDENCIES;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_EXPRESSIONS;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+		*stainfos = lappend(*stainfos, info);
+	}
 
-				stainfos = lappend(stainfos, info);
-			}
+	if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			ReleaseSysCache(dtup);
-		}
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_MCV;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
 
-		ReleaseSysCache(htup);
-		bms_free(keys);
+		*stainfos = lappend(*stainfos, info);
 	}
 
-	list_free(statoidlist);
+	if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-	return stainfos;
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_EXPRESSIONS;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	ReleaseSysCache(dtup);
+	bms_free(keys);
 }
 
 /*
-- 
2.17.0

v3-0005-Maybe-better-way-than-looping-twice.-Change-parti.patchtext/x-diff; charset=us-asciiDownload
From 130a39ff1888f24dd38dc15f227f7160474369f2 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 3 Nov 2021 20:08:42 -0500
Subject: [PATCH v3 5/6] Maybe better way than looping twice.  Change
 partitioned tables to use only "inherited" stats_ext.

---
 src/backend/commands/statscmds.c     |  63 +++++---
 src/backend/optimizer/util/plancat.c | 227 ++++++++++++++-------------
 2 files changed, 151 insertions(+), 139 deletions(-)

diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 4395d878c7..89e9ad5843 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -45,6 +45,7 @@
 static char *ChooseExtendedStatisticName(const char *name1, const char *name2,
 										 const char *label, Oid namespaceid);
 static char *ChooseExtendedStatisticNameAddition(List *exprs);
+static void RemoveStatisticsByIdWorker(Relation pg_statistic_ext, Oid statsOid, bool inh);
 
 
 /* qsort comparator for the attnums in CreateStatistics */
@@ -524,8 +525,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
 
-	/* create only the "stxdinherit=false", because that always exists */
-	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);
+	/* create "stxdinherit=false", because that always exists, except for
+	 * partitioned tables, for which that's meaningless */
+	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
 
 	/* no statistics built yet */
 	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
@@ -729,30 +731,7 @@ RemoveStatisticsById(Oid statsOid)
 	HeapTuple	tup;
 	Form_pg_statistic_ext statext;
 	Oid			relid;
-	int			inh;
-
-	/*
-	 * First delete the pg_statistic_ext_data tuple holding the actual
-	 * statistical data.
-	 */
-	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
-
-	/* hack to delete both stxdinherit = true/false */
-	for (inh = 0; inh <= 1; inh++)
-	{
-		tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
-							  BoolGetDatum(inh));
-
-		if (!HeapTupleIsValid(tup)) /* should not happen */
-			// elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
-			continue;
-
-		CatalogTupleDelete(relation, &tup->t_self);
-
-		ReleaseSysCache(tup);
-	}
-
-	table_close(relation, RowExclusiveLock);
+	char relkind;
 
 	/*
 	 * Delete the pg_statistic_ext tuple.  Also send out a cache inval on the
@@ -767,6 +746,7 @@ RemoveStatisticsById(Oid statsOid)
 
 	statext = (Form_pg_statistic_ext) GETSTRUCT(tup);
 	relid = statext->stxrelid;
+	relkind = get_rel_relkind(relid);
 
 	CacheInvalidateRelcacheByRelid(relid);
 
@@ -775,6 +755,37 @@ RemoveStatisticsById(Oid statsOid)
 	ReleaseSysCache(tup);
 
 	table_close(relation, RowExclusiveLock);
+
+	/*
+	 * Delete the pg_statistic_ext_data tuples holding the actual
+	 * statistical data.
+	 */
+	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
+
+	if (relkind != RELKIND_PARTITIONED_TABLE)
+		RemoveStatisticsByIdWorker(relation, statsOid, false);
+	RemoveStatisticsByIdWorker(relation, statsOid, true);
+
+	table_close(relation, RowExclusiveLock);
+}
+
+void
+RemoveStatisticsByIdWorker(Relation pg_statistic_ext, Oid statsOid, bool inh)
+{
+	HeapTuple	tup;
+	tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+						  BoolGetDatum(inh));
+
+	/* should not happen,  .. */
+	if (!HeapTupleIsValid(tup))
+	{
+		if (!inh)
+			elog(ERROR, "cache lookup failed for statistics data %u inh %d", statsOid, inh);
+		return;
+	}
+
+	CatalogTupleDelete(pg_statistic_ext, &tup->t_self);
+	ReleaseSysCache(tup);
 }
 
 /*
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 154d48a330..b1d8847529 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -81,6 +81,9 @@ static void set_baserel_partition_key_exprs(Relation relation,
 static void set_baserel_partition_constraint(Relation relation,
 											 RelOptInfo *rel);
 
+static void get_relation_statistics_worker(RelOptInfo *rel, List **stainfos,
+	Oid statOid, HeapTuple htup, Form_pg_statistic_ext staForm, Index varno,
+	bool inh);
 
 /*
  * get_relation_info -
@@ -1301,7 +1304,6 @@ get_relation_constraints(PlannerInfo *root,
 static List *
 get_relation_statistics(RelOptInfo *rel, Relation relation)
 {
-	Index		varno = rel->relid;
 	List	   *statoidlist;
 	List	   *stainfos = NIL;
 	ListCell   *l;
@@ -1312,150 +1314,149 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 	{
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
-		Form_pg_statistic_ext_data dataForm;
 		HeapTuple	htup;
-		HeapTuple	dtup;
-		Bitmapset  *keys = NULL;
-		List	   *exprs = NIL;
-		int			i;
-		int			inh;
 
 		htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
 		if (!HeapTupleIsValid(htup))
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		/*
-		 * Hack to load stats with stxdinherit true/false - there should be
-		 * a better way to do this, I guess.
-		 */
-		for (inh = 0; inh <= 1; inh++)
-		{
-			dtup = SearchSysCache2(STATEXTDATASTXOID,
-								   ObjectIdGetDatum(statOid), BoolGetDatum((bool) inh));
-			if (!HeapTupleIsValid(dtup))
-				continue;
+		get_relation_statistics_worker(rel, &stainfos, statOid, htup, staForm, rel->relid, false);
+		get_relation_statistics_worker(rel, &stainfos, statOid, htup, staForm, rel->relid, true);
 
-			dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
+		ReleaseSysCache(htup);
+	}
 
-			/*
-			 * First, build the array of columns covered.  This is ultimately
-			 * wasted if no stats within the object have actually been built, but
-			 * it doesn't seem worth troubling over that case.
-			 */
-			for (i = 0; i < staForm->stxkeys.dim1; i++)
-				keys = bms_add_member(keys, staForm->stxkeys.values[i]);
+	list_free(statoidlist);
 
-			/*
-			 * Preprocess expressions (if any). We read the expressions, run them
-			 * through eval_const_expressions, and fix the varnos.
-			 */
-			{
-				bool		isnull;
-				Datum		datum;
+	return stainfos;
+}
 
-				/* decode expression (if any) */
-				datum = SysCacheGetAttr(STATEXTOID, htup,
-										Anum_pg_statistic_ext_stxexprs, &isnull);
+static void
+get_relation_statistics_worker(RelOptInfo *rel, List **stainfos, Oid statOid, HeapTuple htup, Form_pg_statistic_ext staForm, Index varno, bool inh)
+{
+	Form_pg_statistic_ext_data dataForm;
+	HeapTuple	dtup;
+	Bitmapset  *keys = NULL;
+	List	   *exprs = NIL;
+
+	dtup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(statOid), BoolGetDatum(inh));
+	if (!HeapTupleIsValid(dtup))
+		return;
 
-				if (!isnull)
-				{
-					char	   *exprsString;
+	dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
 
-					exprsString = TextDatumGetCString(datum);
-					exprs = (List *) stringToNode(exprsString);
-					pfree(exprsString);
+	/*
+	 * First, build the array of columns covered.  This is ultimately
+	 * wasted if no stats within the object have actually been built, but
+	 * it doesn't seem worth troubling over that case.
+	 */
+	for (int i = 0; i < staForm->stxkeys.dim1; i++)
+		keys = bms_add_member(keys, staForm->stxkeys.values[i]);
 
-					/*
-					 * Run the expressions through eval_const_expressions. This is
-					 * not just an optimization, but is necessary, because the
-					 * planner will be comparing them to similarly-processed qual
-					 * clauses, and may fail to detect valid matches without this.
-					 * We must not use canonicalize_qual, however, since these
-					 * aren't qual expressions.
-					 */
-					exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+	/*
+	 * Preprocess expressions (if any). We read the expressions, run them
+	 * through eval_const_expressions, and fix the varnos.
+	 */
+	{
+		bool		isnull;
+		Datum		datum;
 
-					/* May as well fix opfuncids too */
-					fix_opfuncids((Node *) exprs);
+		/* decode expression (if any) */
+		datum = SysCacheGetAttr(STATEXTOID, htup,
+								Anum_pg_statistic_ext_stxexprs, &isnull);
 
-					/*
-					 * Modify the copies we obtain from the relcache to have the
-					 * correct varno for the parent relation, so that they match
-					 * up correctly against qual clauses.
-					 */
-					if (varno != 1)
-						ChangeVarNodes((Node *) exprs, 1, varno, 0);
-				}
-			}
-
-			/* add one StatisticExtInfo for each kind built */
-			if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		if (!isnull)
+		{
+			char	   *exprsString;
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_NDISTINCT;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+			exprsString = TextDatumGetCString(datum);
+			exprs = (List *) stringToNode(exprsString);
+			pfree(exprsString);
 
-				stainfos = lappend(stainfos, info);
-			}
+			/*
+			 * Run the expressions through eval_const_expressions. This is
+			 * not just an optimization, but is necessary, because the
+			 * planner will be comparing them to similarly-processed qual
+			 * clauses, and may fail to detect valid matches without this.
+			 * We must not use canonicalize_qual, however, since these
+			 * aren't qual expressions.
+			 */
+			exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
 
-			if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			/* May as well fix opfuncids too */
+			fix_opfuncids((Node *) exprs);
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_DEPENDENCIES;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+			/*
+			 * Modify the copies we obtain from the relcache to have the
+			 * correct varno for the parent relation, so that they match
+			 * up correctly against qual clauses.
+			 */
+			if (varno != 1)
+				ChangeVarNodes((Node *) exprs, 1, varno, 0);
+		}
+	}
 
-				stainfos = lappend(stainfos, info);
-			}
+	/* add one StatisticExtInfo for each kind built */
+	if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_NDISTINCT;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_MCV;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+		*stainfos = lappend(*stainfos, info);
+	}
 
-				stainfos = lappend(stainfos, info);
-			}
+	if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_DEPENDENCIES;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_EXPRESSIONS;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+		*stainfos = lappend(*stainfos, info);
+	}
 
-				stainfos = lappend(stainfos, info);
-			}
+	if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			ReleaseSysCache(dtup);
-		}
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_MCV;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
 
-		ReleaseSysCache(htup);
-		bms_free(keys);
+		*stainfos = lappend(*stainfos, info);
 	}
 
-	list_free(statoidlist);
+	if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-	return stainfos;
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_EXPRESSIONS;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	ReleaseSysCache(dtup);
+	bms_free(keys);
 }
 
 /*
-- 
2.17.0

v3-0006-Refactor-parent-ACL-check.patchtext/x-diff; charset=us-asciiDownload
From 54d692df2cdcd97776cbd2629be7fec47cc41753 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 18:58:33 -0500
Subject: [PATCH v3 6/6] Refactor parent ACL check

selfuncs.c is 8k lines long, and this makes it 30 LOC shorter.
---
 src/backend/utils/adt/selfuncs.c | 140 ++++++++++++-------------------
 1 file changed, 52 insertions(+), 88 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 19b4aaf7eb..54324a71c0 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -187,6 +187,8 @@ static char *convert_string_datum(Datum value, Oid typid, Oid collid,
 								  bool *failure);
 static double convert_timevalue_to_scalar(Datum value, Oid typid,
 										  bool *failure);
+static void recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata,
+								Oid relid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 									VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
@@ -5152,51 +5154,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 									(pg_class_aclcheck(rte->relid, userid,
 													   ACL_SELECT) == ACLCHECK_OK);
 
-								/*
-								 * If the user doesn't have permissions to
-								 * access an inheritance child relation, check
-								 * the permissions of the table actually
-								 * mentioned in the query, since most likely
-								 * the user does have that permission.  Note
-								 * that whole-table select privilege on the
-								 * parent doesn't quite guarantee that the
-								 * user could read all columns of the child.
-								 * But in practice it's unlikely that any
-								 * interesting security violation could result
-								 * from allowing access to the expression
-								 * index's stats, so we allow it anyway.  See
-								 * similar code in examine_simple_variable()
-								 * for additional comments.
-								 */
-								if (!vardata->acl_ok &&
-									root->append_rel_array != NULL)
-								{
-									AppendRelInfo *appinfo;
-									Index		varno = index->rel->relid;
-
-									appinfo = root->append_rel_array[varno];
-									while (appinfo &&
-										   planner_rt_fetch(appinfo->parent_relid,
-															root)->rtekind == RTE_RELATION)
-									{
-										varno = appinfo->parent_relid;
-										appinfo = root->append_rel_array[varno];
-									}
-									if (varno != index->rel->relid)
-									{
-										/* Repeat access check on this rel */
-										rte = planner_rt_fetch(varno, root);
-										Assert(rte->rtekind == RTE_RELATION);
-
-										userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-										vardata->acl_ok =
-											rte->securityQuals == NIL &&
-											(pg_class_aclcheck(rte->relid,
-															   userid,
-															   ACL_SELECT) == ACLCHECK_OK);
-									}
-								}
+								recheck_parent_acl(root, vardata, index->rel->relid);
 							}
 							else
 							{
@@ -5287,49 +5245,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 						(pg_class_aclcheck(rte->relid, userid,
 										   ACL_SELECT) == ACLCHECK_OK);
 
-					/*
-					 * If the user doesn't have permissions to access an
-					 * inheritance child relation, check the permissions of
-					 * the table actually mentioned in the query, since most
-					 * likely the user does have that permission.  Note that
-					 * whole-table select privilege on the parent doesn't
-					 * quite guarantee that the user could read all columns of
-					 * the child. But in practice it's unlikely that any
-					 * interesting security violation could result from
-					 * allowing access to the expression stats, so we allow it
-					 * anyway.  See similar code in examine_simple_variable()
-					 * for additional comments.
-					 */
-					if (!vardata->acl_ok &&
-						root->append_rel_array != NULL)
-					{
-						AppendRelInfo *appinfo;
-						Index		varno = onerel->relid;
-
-						appinfo = root->append_rel_array[varno];
-						while (appinfo &&
-							   planner_rt_fetch(appinfo->parent_relid,
-												root)->rtekind == RTE_RELATION)
-						{
-							varno = appinfo->parent_relid;
-							appinfo = root->append_rel_array[varno];
-						}
-						if (varno != onerel->relid)
-						{
-							/* Repeat access check on this rel */
-							rte = planner_rt_fetch(varno, root);
-							Assert(rte->rtekind == RTE_RELATION);
-
-							userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-							vardata->acl_ok =
-								rte->securityQuals == NIL &&
-								(pg_class_aclcheck(rte->relid,
-												   userid,
-												   ACL_SELECT) == ACLCHECK_OK);
-						}
-					}
-
+					recheck_parent_acl(root, vardata, onerel->relid);
 					break;
 				}
 
@@ -5339,6 +5255,54 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 	}
 }
 
+/*
+ * If the user doesn't have permissions to access an inheritance child
+ * relation, check the permissions of the table actually mentioned in the
+ * query, since most likely the user does have that permission.  Note that
+ * whole-table select privilege on the parent doesn't quite guarantee that the
+ * user could read all columns of the child.  But in practice it's unlikely
+ * that any interesting security violation could result from allowing access to
+ * the expression stats, so we allow it anyway.  See similar code in
+ * examine_simple_variable() for additional comments.
+ */
+static void
+recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, Oid relid)
+{
+	RangeTblEntry	*rte;
+	Oid		userid;
+
+	if (!vardata->acl_ok &&
+		root->append_rel_array != NULL)
+	{
+		AppendRelInfo *appinfo;
+		Index		varno = relid;
+
+		appinfo = root->append_rel_array[varno];
+		while (appinfo &&
+			   planner_rt_fetch(appinfo->parent_relid,
+								root)->rtekind == RTE_RELATION)
+		{
+			varno = appinfo->parent_relid;
+			appinfo = root->append_rel_array[varno];
+		}
+
+		if (varno != relid)
+		{
+			/* Repeat access check on this rel */
+			rte = planner_rt_fetch(varno, root);
+			Assert(rte->rtekind == RTE_RELATION);
+
+			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+
+			vardata->acl_ok =
+				rte->securityQuals == NIL &&
+				(pg_class_aclcheck(rte->relid,
+								   userid,
+								   ACL_SELECT) == ACLCHECK_OK);
+		}
+	}
+}
+
 /*
  * examine_simple_variable
  *		Handle a simple Var for examine_variable
-- 
2.17.0

#15Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#13)
6 attachment(s)
Re: extended stats on partitioned tables

On Thu, Nov 04, 2021 at 12:44:45AM +0100, Tomas Vondra wrote:

And I'm not sure we do the right thing after removing children, for example
(that should drop the inheritance stats, I guess).

Do you mean for inheritance only ? Or partitions too ?
I think for partitions, the stats should stay.
And for inheritence, they can stay, for consistency with partitions, and since
it does no harm.

I think the behavior should be the same as for data in pg_statistic,
i.e. if we keep/remove those, we should do the same thing for extended
statistics.

That works for column stats the way I proposed for extended stats: child stats
are never removed, neither when the only child is dropped, nor when re-running
analyze (that part is actually a bit odd).

Rebased, fixing an intermediate compile error, and typos in the commit message.

--
Justin

Attachments:

0001-Do-not-use-extended-statistics-on-inheritance-trees.patchtext/x-diff; charset=us-asciiDownload
From 7d752af4bbbded86f3c9d4b3571bdb4b8bdf5ef2 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 19:42:41 -0500
Subject: [PATCH 01/15] Do not use extended statistics on inheritance trees..

Since 859b3003de, inherited ext stats are not built.
However, the non-inherited stats stats were incorrectly used during planning of
queries with inheritance heirarchies.

Since the ext stats do not include child tables, they can lead to worse
estimates.  This is remarkably similar to 427c6b5b9, which affected column
statistics 15 years ago.

choose_best_statistics is handled a bit differently (in the calling function),
because it isn't passed rel nor rel->inh, and it's an exported function, so
avoid changing its signature in back branches.

https://www.postgresql.org/message-id/flat/20210925223152.GA7877@telsasoft.com

Backpatch to v10
---
 src/backend/statistics/dependencies.c   |  5 +++++
 src/backend/statistics/extended_stats.c |  5 +++++
 src/backend/utils/adt/selfuncs.c        |  9 +++++++++
 src/test/regress/expected/stats_ext.out | 24 ++++++++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 15 +++++++++++++++
 5 files changed, 58 insertions(+)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..015b9e28f6 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1410,11 +1410,16 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* unique expressions */
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return 1.0;
+
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 69ca52094f..b9e755f44e 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1694,6 +1694,11 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return sel;
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 10895fb287..a0932e39e1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,6 +3913,11 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
+	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return false;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
@@ -5232,6 +5237,10 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
+			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+			if (planner_rt_fetch(onerel->relid, root)->inh)
+				break;
+
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index c60ba45aba..5410f58f7f 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,6 +176,30 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
+-- Tests for stats with inheritance
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+DROP TABLE stxdinh, stxdinh1;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 6fb37962a7..a196d5e2f8 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,6 +112,21 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
+-- Tests for stats with inheritance
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+DROP TABLE stxdinh, stxdinh1;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.17.0

0002-Build-inherited-extended-stats-on-partitioned-tables.patchtext/x-diff; charset=us-asciiDownload
From 3462a9cf1f3e67c4c42b235136066e03890570ed Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Sat, 25 Sep 2021 23:01:21 +0200
Subject: [PATCH 02/15] Build inherited extended stats on partitioned tables

Since 859b3003de, ext stats on partitioned tables are not built, which is a
regression.  For back branches, pg_statistic_ext cannot support both inherited
(FROM) and non-inherited (FROM ONLY) stats on inheritance heirarchies.
But there's no issue building inherited stats for partitioned tables, which
are empty, so cannot have non-inherited stats.

See also: 8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b

https://www.postgresql.org/message-id/20210923212624.GI831%40telsasoft.com

Backpatch to v10
---
 src/backend/commands/analyze.c          |  5 ++++-
 src/backend/statistics/dependencies.c   |  2 +-
 src/backend/statistics/extended_stats.c |  2 +-
 src/backend/utils/adt/selfuncs.c        |  9 ++++++---
 src/test/regress/expected/stats_ext.out | 19 +++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 10 ++++++++++
 6 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index cd77907fc7..d35325f560 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -549,6 +549,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
+		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -612,13 +613,15 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
+		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
+
 		/*
 		 * Build extended statistics (if there are any).
 		 *
 		 * For now we only build extended statistics on individual relations,
 		 * not for relations representing inheritance trees.
 		 */
-		if (!inh)
+		if (build_ext_stats)
 			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
 									   attr_cnt, vacattrstats);
 	}
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 015b9e28f6..36121d5a27 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1417,7 +1417,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			unique_exprs_cnt;
 
 	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return 1.0;
 
 	/* check if there's any stats that might be useful for us. */
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index b9e755f44e..5fbdbf0164 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1697,7 +1697,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return sel;
 
 	/* check if there's any stats that might be useful for us. */
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index a0932e39e1..b15f14e1a0 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3916,7 +3916,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
 
 	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return false;
 
 	/* bail out immediately if the table has no extended statistics */
@@ -5238,8 +5238,11 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				break;
 
 			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-			if (planner_rt_fetch(onerel->relid, root)->inh)
-				break;
+			{
+				RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
+				if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
+					break;
+			}
 
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 5410f58f7f..a11fff655a 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -200,6 +200,25 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+ ?column? 
+----------
+        1
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+        10 |     10
+(1 row)
+
+DROP TABLE stxdinp;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index a196d5e2f8..01383e8f5f 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -127,6 +127,16 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+DROP TABLE stxdinp;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.17.0

0003-Add-stxdinherit-build-inherited-extended-stats-on-in.patchtext/x-diff; charset=us-asciiDownload
From 8d1a4983cf3c1901a37421cc07dba4c5bab0487b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Sat, 25 Sep 2021 21:27:10 +0200
Subject: [PATCH 03/15] Add stxdinherit; build inherited extended stats on
 inheritance parents

pg_statistic has an inherited flag which is part of the unique index, but
pg_statistic_ext has never had that.  In back branches, pg_statistic cannot
store both inherited and non-inherited stats.  So it stores non-inherited stats
(FROM ONLY) for inheritance parents and inherited stats for partitioned tables.

This patch for v15 allows storing both inherited and non-inherited stats for
non-empty inheritance parents, and avoids the above, confusing definition.
---
 doc/src/sgml/catalogs.sgml                  |  23 +++
 src/backend/catalog/system_views.sql        |   1 +
 src/backend/commands/analyze.c              |  15 +-
 src/backend/commands/statscmds.c            |  20 ++-
 src/backend/optimizer/util/plancat.c        | 186 +++++++++++---------
 src/backend/statistics/dependencies.c       |  12 +-
 src/backend/statistics/extended_stats.c     |  70 ++++----
 src/backend/statistics/mcv.c                |   9 +-
 src/backend/statistics/mvdistinct.c         |   5 +-
 src/backend/utils/adt/selfuncs.c            |   2 +-
 src/backend/utils/cache/syscache.c          |   6 +-
 src/include/catalog/pg_statistic_ext_data.h |   4 +-
 src/include/nodes/pathnodes.h               |   1 +
 src/include/statistics/statistics.h         |   9 +-
 src/test/regress/expected/rules.out         |   1 +
 15 files changed, 215 insertions(+), 149 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1d11be73f..720e83e523 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7509,6 +7509,19 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
    created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>.
   </para>
 
+  <para>
+   Normally there is one entry, with <structfield>stxdinherit</structfield> =
+   <literal>false</literal>, for each statistics object that has been analyzed.
+   If the table has inheritance children, a second entry with
+   <structfield>stxdinherit</structfield> = <literal>true</literal> is also created.
+   This row represents the statistics object over the inheritance tree, i.e.,
+   statistics for the data you'd see with
+   <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>,
+   whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row
+   represents the results of
+   <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>.
+  </para>
+
   <para>
    Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>,
    <structname>pg_statistic_ext_data</structname> should not be
@@ -7548,6 +7561,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stxdinherit</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the stats include inheritance child columns, not just the
+       values in the specified relation
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 61b515cdb8..637d557ff2 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS
            ) AS attnames,
            pg_get_statisticsobjdef_expressions(s.oid) as exprs,
            s.stxkind AS kinds,
+           sd.stxdinherit AS inherited,
            sd.stxdndistinct AS n_distinct,
            sd.stxddependencies AS dependencies,
            m.most_common_vals,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index d35325f560..3436f09d63 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -549,7 +549,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
-		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -613,17 +612,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
-		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
-
-		/*
-		 * Build extended statistics (if there are any).
-		 *
-		 * For now we only build extended statistics on individual relations,
-		 * not for relations representing inheritance trees.
-		 */
-		if (build_ext_stats)
-			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
-									   attr_cnt, vacattrstats);
+		/* Build extended statistics (if there are any). */
+		BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows,
+								   attr_cnt, vacattrstats);
 	}
 
 	pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 8f1550ec80..4395d878c7 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -524,6 +524,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
 
+	/* create only the "stxdinherit=false", because that always exists */
+	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);
+
 	/* no statistics built yet */
 	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
 	datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
@@ -726,6 +729,7 @@ RemoveStatisticsById(Oid statsOid)
 	HeapTuple	tup;
 	Form_pg_statistic_ext statext;
 	Oid			relid;
+	int			inh;
 
 	/*
 	 * First delete the pg_statistic_ext_data tuple holding the actual
@@ -733,14 +737,20 @@ RemoveStatisticsById(Oid statsOid)
 	 */
 	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
-	tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
+	/* hack to delete both stxdinherit = true/false */
+	for (inh = 0; inh <= 1; inh++)
+	{
+		tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+							  BoolGetDatum(inh));
 
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+		if (!HeapTupleIsValid(tup)) /* should not happen */
+			// elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+			continue;
 
-	CatalogTupleDelete(relation, &tup->t_self);
+		CatalogTupleDelete(relation, &tup->t_self);
 
-	ReleaseSysCache(tup);
+		ReleaseSysCache(tup);
+	}
 
 	table_close(relation, RowExclusiveLock);
 
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index c5194fdbbf..154d48a330 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
+#include "catalog/pg_statistic_ext_data.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -1311,127 +1312,144 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 	{
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
+		Form_pg_statistic_ext_data dataForm;
 		HeapTuple	htup;
 		HeapTuple	dtup;
 		Bitmapset  *keys = NULL;
 		List	   *exprs = NIL;
 		int			i;
+		int			inh;
 
 		htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
 		if (!HeapTupleIsValid(htup))
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-		if (!HeapTupleIsValid(dtup))
-			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
-		/*
-		 * First, build the array of columns covered.  This is ultimately
-		 * wasted if no stats within the object have actually been built, but
-		 * it doesn't seem worth troubling over that case.
-		 */
-		for (i = 0; i < staForm->stxkeys.dim1; i++)
-			keys = bms_add_member(keys, staForm->stxkeys.values[i]);
-
 		/*
-		 * Preprocess expressions (if any). We read the expressions, run them
-		 * through eval_const_expressions, and fix the varnos.
+		 * Hack to load stats with stxdinherit true/false - there should be
+		 * a better way to do this, I guess.
 		 */
+		for (inh = 0; inh <= 1; inh++)
 		{
-			bool		isnull;
-			Datum		datum;
+			dtup = SearchSysCache2(STATEXTDATASTXOID,
+								   ObjectIdGetDatum(statOid), BoolGetDatum((bool) inh));
+			if (!HeapTupleIsValid(dtup))
+				continue;
 
-			/* decode expression (if any) */
-			datum = SysCacheGetAttr(STATEXTOID, htup,
-									Anum_pg_statistic_ext_stxexprs, &isnull);
+			dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
 
-			if (!isnull)
+			/*
+			 * First, build the array of columns covered.  This is ultimately
+			 * wasted if no stats within the object have actually been built, but
+			 * it doesn't seem worth troubling over that case.
+			 */
+			for (i = 0; i < staForm->stxkeys.dim1; i++)
+				keys = bms_add_member(keys, staForm->stxkeys.values[i]);
+
+			/*
+			 * Preprocess expressions (if any). We read the expressions, run them
+			 * through eval_const_expressions, and fix the varnos.
+			 */
 			{
-				char	   *exprsString;
+				bool		isnull;
+				Datum		datum;
 
-				exprsString = TextDatumGetCString(datum);
-				exprs = (List *) stringToNode(exprsString);
-				pfree(exprsString);
+				/* decode expression (if any) */
+				datum = SysCacheGetAttr(STATEXTOID, htup,
+										Anum_pg_statistic_ext_stxexprs, &isnull);
 
-				/*
-				 * Run the expressions through eval_const_expressions. This is
-				 * not just an optimization, but is necessary, because the
-				 * planner will be comparing them to similarly-processed qual
-				 * clauses, and may fail to detect valid matches without this.
-				 * We must not use canonicalize_qual, however, since these
-				 * aren't qual expressions.
-				 */
-				exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+				if (!isnull)
+				{
+					char	   *exprsString;
 
-				/* May as well fix opfuncids too */
-				fix_opfuncids((Node *) exprs);
+					exprsString = TextDatumGetCString(datum);
+					exprs = (List *) stringToNode(exprsString);
+					pfree(exprsString);
 
-				/*
-				 * Modify the copies we obtain from the relcache to have the
-				 * correct varno for the parent relation, so that they match
-				 * up correctly against qual clauses.
-				 */
-				if (varno != 1)
-					ChangeVarNodes((Node *) exprs, 1, varno, 0);
+					/*
+					 * Run the expressions through eval_const_expressions. This is
+					 * not just an optimization, but is necessary, because the
+					 * planner will be comparing them to similarly-processed qual
+					 * clauses, and may fail to detect valid matches without this.
+					 * We must not use canonicalize_qual, however, since these
+					 * aren't qual expressions.
+					 */
+					exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+
+					/* May as well fix opfuncids too */
+					fix_opfuncids((Node *) exprs);
+
+					/*
+					 * Modify the copies we obtain from the relcache to have the
+					 * correct varno for the parent relation, so that they match
+					 * up correctly against qual clauses.
+					 */
+					if (varno != 1)
+						ChangeVarNodes((Node *) exprs, 1, varno, 0);
+				}
 			}
-		}
 
-		/* add one StatisticExtInfo for each kind built */
-		if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			/* add one StatisticExtInfo for each kind built */
+			if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_NDISTINCT;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_NDISTINCT;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_DEPENDENCIES;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_DEPENDENCIES;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_MCV;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_MCV;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
 
-			stainfos = lappend(stainfos, info);
-		}
+				stainfos = lappend(stainfos, info);
+			}
 
-		if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+			{
+				StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_EXPRESSIONS;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+				info->statOid = statOid;
+				info->inherit = dataForm->stxdinherit;
+				info->rel = rel;
+				info->kind = STATS_EXT_EXPRESSIONS;
+				info->keys = bms_copy(keys);
+				info->exprs = exprs;
+
+				stainfos = lappend(stainfos, info);
+			}
 
-			stainfos = lappend(stainfos, info);
+			ReleaseSysCache(dtup);
 		}
 
 		ReleaseSysCache(htup);
-		ReleaseSysCache(dtup);
 		bms_free(keys);
 	}
 
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 36121d5a27..c62fc398ce 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -618,14 +618,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums)
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
 MVDependencies *
-statext_dependencies_load(Oid mvoid)
+statext_dependencies_load(Oid mvoid, bool inh)
 {
 	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid),
+						   BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
@@ -1603,6 +1605,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
 			continue;
 
+		/* skip statistics with mismatching stxdinherit value */
+		if (stat->inherit != rte->inh)
+			continue;
+
 		/*
 		 * Count matching attributes - we have to undo the attnum offsets. The
 		 * input attribute numbers are not offset (expressions are not
@@ -1649,7 +1655,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (nmatched + nexprs < 2)
 			continue;
 
-		deps = statext_dependencies_load(stat->statOid);
+		deps = statext_dependencies_load(stat->statOid, rte->inh);
 
 		/*
 		 * The expressions may be represented by different attnums in the
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 5fbdbf0164..c53716f034 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -77,7 +77,7 @@ typedef struct StatExtEntry
 static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
 static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 											int nvacatts, VacAttrStats **vacatts);
-static void statext_store(Oid statOid,
+static void statext_store(Oid statOid, bool inh,
 						  MVNDistinct *ndistinct, MVDependencies *dependencies,
 						  MCVList *mcv, Datum exprs, VacAttrStats **stats);
 static int	statext_compute_stattarget(int stattarget,
@@ -110,7 +110,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat,
  * requested stats, and serializes them back into the catalog.
  */
 void
-BuildRelationExtStatistics(Relation onerel, double totalrows,
+BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 						   int numrows, HeapTuple *rows,
 						   int natts, VacAttrStats **vacattrstats)
 {
@@ -230,7 +230,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		}
 
 		/* store the statistics in the catalog */
-		statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats);
+		statext_store(stat->statOid, inh,
+					  ndistinct, dependencies, mcv, exprstats, stats);
 
 		/* for reporting progress */
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -781,7 +782,7 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
  *	tuple.
  */
 static void
-statext_store(Oid statOid,
+statext_store(Oid statOid, bool inh,
 			  MVNDistinct *ndistinct, MVDependencies *dependencies,
 			  MCVList *mcv, Datum exprs, VacAttrStats **stats)
 {
@@ -790,14 +791,19 @@ statext_store(Oid statOid,
 				oldtup;
 	Datum		values[Natts_pg_statistic_ext_data];
 	bool		nulls[Natts_pg_statistic_ext_data];
-	bool		replaces[Natts_pg_statistic_ext_data];
 
 	pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
 	memset(nulls, true, sizeof(nulls));
-	memset(replaces, false, sizeof(replaces));
 	memset(values, 0, sizeof(values));
 
+	/* basic info */
+	values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid);
+	nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false;
+
+	values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh);
+	nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false;
+
 	/*
 	 * Construct a new pg_statistic_ext_data tuple, replacing the calculated
 	 * stats.
@@ -830,25 +836,27 @@ statext_store(Oid statOid,
 		values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs;
 	}
 
-	/* always replace the value (either by bytea or NULL) */
-	replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* there should already be a pg_statistic_ext_data tuple */
-	oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-	if (!HeapTupleIsValid(oldtup))
+	/*
+	 * Delete the old tuple if it exists, and insert a new one. It's easier
+	 * than trying to update or insert, based on various conditions.
+	 *
+	 * There should always be a pg_statistic_ext_data tuple for inh=false,
+	 * but there may be none for inh=true yet.
+	 */
+	oldtup = SearchSysCache2(STATEXTDATASTXOID,
+							 ObjectIdGetDatum(statOid),
+							 BoolGetDatum(inh));
+	if (HeapTupleIsValid(oldtup))
+	{
+		CatalogTupleDelete(pg_stextdata, &(oldtup->t_self));
+		ReleaseSysCache(oldtup);
+	}
+	else if (!inh)
 		elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 
-	/* replace it */
-	stup = heap_modify_tuple(oldtup,
-							 RelationGetDescr(pg_stextdata),
-							 values,
-							 nulls,
-							 replaces);
-	ReleaseSysCache(oldtup);
-	CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup);
+	/* form a new tuple */
+	stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls);
+	CatalogTupleInsert(pg_stextdata, stup);
 
 	heap_freetuple(stup);
 
@@ -1234,7 +1242,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs,
  * further tiebreakers are needed.
  */
 StatisticExtInfo *
-choose_best_statistics(List *stats, char requiredkind,
+choose_best_statistics(List *stats, char requiredkind, bool inh,
 					   Bitmapset **clause_attnums, List **clause_exprs,
 					   int nclauses)
 {
@@ -1256,6 +1264,10 @@ choose_best_statistics(List *stats, char requiredkind,
 		if (info->kind != requiredkind)
 			continue;
 
+		/* skip statistics with mismatching inheritance flag */
+		if (info->inherit != inh)
+			continue;
+
 		/*
 		 * Collect attributes and expressions in remaining (unestimated)
 		 * clauses fully covered by this statistic object.
@@ -1696,10 +1708,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
 	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
-	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return sel;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
 		return sel;
@@ -1751,7 +1759,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		Bitmapset  *simple_clauses;
 
 		/* find the best suited statistics object for these attnums */
-		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
+		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
 									  list_length(clauses));
 
@@ -1840,7 +1848,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			MCVList    *mcv_list;
 
 			/* Load the MCV list stored in the statistics object */
-			mcv_list = statext_mcv_load(stat->statOid);
+			mcv_list = statext_mcv_load(stat->statOid, rte->inh);
 
 			/*
 			 * Compute the selectivity of the ORed list of clauses covered by
@@ -2411,7 +2419,7 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index b350fc5f7b..75d7d35caa 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups,
  *		Load the MCV list for the indicated pg_statistic_ext tuple.
  */
 MCVList *
-statext_mcv_load(Oid mvoid)
+statext_mcv_load(Oid mvoid, bool inh)
 {
 	MCVList    *result;
 	bool		isnull;
 	Datum		mcvlist;
-	HeapTuple	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	HeapTuple	htup = SearchSysCache2(STATEXTDATASTXOID,
+									   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
@@ -2039,11 +2040,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
 	MCVList    *mcv;
 	Selectivity s = 0.0;
 
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
 	/* match/mismatch bitmap for each MCV item */
 	bool	   *matches = NULL;
 
 	/* load the MCV list stored in the statistics object */
-	mcv = statext_mcv_load(stat->statOid);
+	mcv = statext_mcv_load(stat->statOid, rte->inh);
 
 	/* build a match bitmap for the clauses */
 	matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs,
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 4481312d61..ab1f10d6c0 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
  *		Load the ndistinct value for the indicated pg_statistic_ext tuple
  */
 MVNDistinct *
-statext_ndistinct_load(Oid mvoid)
+statext_ndistinct_load(Oid mvoid, bool inh)
 {
 	MVNDistinct *result;
 	bool		isnull;
 	Datum		ndist;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index b15f14e1a0..aab3bd7696 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -4008,7 +4008,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 
 	Assert(nmatches_vars + nmatches_exprs > 1);
 
-	stats = statext_ndistinct_load(statOid);
+	stats = statext_ndistinct_load(statOid, rte->inh);
 
 	/*
 	 * If we have a match, search it for the specific item that matches (there
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 56870b46e4..6194751e99 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -763,11 +763,11 @@ static const struct cachedesc cacheinfo[] = {
 		32
 	},
 	{StatisticExtDataRelationId,	/* STATEXTDATASTXOID */
-		StatisticExtDataStxoidIndexId,
-		1,
+		StatisticExtDataStxoidInhIndexId,
+		2,
 		{
 			Anum_pg_statistic_ext_data_stxoid,
-			0,
+			Anum_pg_statistic_ext_data_stxdinherit,
 			0,
 			0
 		},
diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h
index 7b73b790d2..8ffd8b68cd 100644
--- a/src/include/catalog/pg_statistic_ext_data.h
+++ b/src/include/catalog/pg_statistic_ext_data.h
@@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId)
 {
 	Oid			stxoid BKI_LOOKUP(pg_statistic_ext);	/* statistics object
 														 * this data is for */
+	bool		stxdinherit;	/* true if inheritance children are included */
 
 #ifdef CATALOG_VARLEN			/* variable-length fields start here */
 
@@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data;
 
 DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431);
 
-DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops));
+DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops, stxdinherit bool_ops));
+
 
 #endif							/* PG_STATISTIC_EXT_DATA_H */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 324d92880b..02aff9847e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -934,6 +934,7 @@ typedef struct StatisticExtInfo
 	NodeTag		type;
 
 	Oid			statOid;		/* OID of the statistics row */
+	bool		inherit;		/* includes child relations */
 	RelOptInfo *rel;			/* back-link to statistic's table */
 	char		kind;			/* statistics kind of this entry */
 	Bitmapset  *keys;			/* attnums of the columns covered */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 326cf26fea..02ee41b9f3 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -94,11 +94,11 @@ typedef struct MCVList
 	MCVItem		items[FLEXIBLE_ARRAY_MEMBER];	/* array of MCV items */
 } MCVList;
 
-extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
-extern MVDependencies *statext_dependencies_load(Oid mvoid);
-extern MCVList *statext_mcv_load(Oid mvoid);
+extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh);
+extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh);
+extern MCVList *statext_mcv_load(Oid mvoid, bool inh);
 
-extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
+extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 									   int numrows, HeapTuple *rows,
 									   int natts, VacAttrStats **vacattrstats);
 extern int	ComputeExtStatisticsRows(Relation onerel,
@@ -121,6 +121,7 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
 												  bool is_or);
 extern bool has_stats_of_kind(List *stats, char requiredkind);
 extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
+												bool inh,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b58b062b10..fc50f776d5 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2443,6 +2443,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname,
              JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames,
     pg_get_statisticsobjdef_expressions(s.oid) AS exprs,
     s.stxkind AS kinds,
+    sd.stxdinherit AS inherited,
     sd.stxdndistinct AS n_distinct,
     sd.stxddependencies AS dependencies,
     m.most_common_vals,
-- 
2.17.0

0004-f-check-inh.patchtext/x-diff; charset=us-asciiDownload
From 315dc6175d3a7a3c500c17da10461a0760ab4197 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 18:20:03 -0500
Subject: [PATCH 04/15] f! check inh

statext_expressions_load
examine_variable
estimate_multivariate_ndistinct

TODO: pg_stats_ext_exprs needs to expose inh flag
---
 src/backend/statistics/dependencies.c   |  4 ----
 src/backend/statistics/extended_stats.c |  4 ++--
 src/backend/utils/adt/selfuncs.c        | 27 ++++++++-----------------
 src/include/statistics/statistics.h     |  2 +-
 src/test/regress/expected/stats_ext.out |  9 ++++++++-
 src/test/regress/sql/stats_ext.sql      |  2 ++
 6 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index c62fc398ce..02cf0efc66 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1418,10 +1418,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
-	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return 1.0;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index c53716f034..b1461f3066 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -2409,7 +2409,7 @@ serialize_expr_stats(AnlExprData *exprdata, int nexprs)
  * identified by the supplied index.
  */
 HeapTuple
-statext_expressions_load(Oid stxoid, int idx)
+statext_expressions_load(Oid stxoid, bool inh, int idx)
 {
 	bool		isnull;
 	Datum		value;
@@ -2419,7 +2419,7 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false));
+	htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), inh);
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index aab3bd7696..19b4aaf7eb 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3915,10 +3915,6 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	StatisticExtInfo *matched_info = NULL;
 	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
 
-	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return false;
-
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
 		return false;
@@ -5237,13 +5233,6 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
-			/* If it's an inheritence tree, skip statistics (which do not include child stats) */
-			{
-				RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
-				if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-					break;
-			}
-
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
@@ -5262,22 +5251,22 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				/* found a match, see if we can extract pg_statistic row */
 				if (equal(node, expr))
 				{
-					HeapTuple	t = statext_expressions_load(info->statOid, pos);
-
-					/* Get statistics object's table for permission check */
-					RangeTblEntry *rte;
+					RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
 					Oid			userid;
+					bool		inh;
 
-					vardata->statsTuple = t;
+					Assert(rte->rtekind == RTE_RELATION);
 
 					/*
 					 * XXX Not sure if we should cache the tuple somewhere.
 					 * Now we just create a new copy every time.
 					 */
-					vardata->freefunc = ReleaseDummy;
+					inh = root->append_rel_array == NULL ? false :
+						root->append_rel_array[onerel->relid]->parent_relid != 0;
+					vardata->statsTuple =
+						statext_expressions_load(info->statOid, inh, pos);
 
-					rte = planner_rt_fetch(onerel->relid, root);
-					Assert(rte->rtekind == RTE_RELATION);
+					vardata->freefunc = ReleaseDummy;
 
 					/*
 					 * Use checkAsUser if it's set, in case we're accessing
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 02ee41b9f3..3868e43f8a 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -125,6 +125,6 @@ extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
-extern HeapTuple statext_expressions_load(Oid stxoid, int idx);
+extern HeapTuple statext_expressions_load(Oid stxoid, bool inh, int idx);
 
 #endif							/* STATISTICS_H */
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index a11fff655a..59cf39c104 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -196,7 +196,14 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
  estimated | actual 
 -----------+--------
-      1000 |   1008
+      1008 |   1008
+(1 row)
+
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+         9 |      9
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 01383e8f5f..e4eedd4bc0 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -125,6 +125,8 @@ CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
 -- Since the stats object does not include inherited stats, it should not affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
 -- Ensure inherited stats ARE applied to inherited query in partitioned table
-- 
2.17.0

0005-Maybe-better-than-looping-twice.-For-partitioned-tab.patchtext/x-diff; charset=us-asciiDownload
From 7ac655e4932c7645c7975fbc97518b6b74edb495 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 3 Nov 2021 20:08:42 -0500
Subject: [PATCH 05/15] Maybe better than looping twice.  For partitioned
 tables, create only "inherited" stats_ext.

---
 src/backend/commands/statscmds.c     |  63 +++++---
 src/backend/optimizer/util/plancat.c | 227 ++++++++++++++-------------
 2 files changed, 151 insertions(+), 139 deletions(-)

diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 4395d878c7..89e9ad5843 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -45,6 +45,7 @@
 static char *ChooseExtendedStatisticName(const char *name1, const char *name2,
 										 const char *label, Oid namespaceid);
 static char *ChooseExtendedStatisticNameAddition(List *exprs);
+static void RemoveStatisticsByIdWorker(Relation pg_statistic_ext, Oid statsOid, bool inh);
 
 
 /* qsort comparator for the attnums in CreateStatistics */
@@ -524,8 +525,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
 
-	/* create only the "stxdinherit=false", because that always exists */
-	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);
+	/* create "stxdinherit=false", because that always exists, except for
+	 * partitioned tables, for which that's meaningless */
+	datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
 
 	/* no statistics built yet */
 	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
@@ -729,30 +731,7 @@ RemoveStatisticsById(Oid statsOid)
 	HeapTuple	tup;
 	Form_pg_statistic_ext statext;
 	Oid			relid;
-	int			inh;
-
-	/*
-	 * First delete the pg_statistic_ext_data tuple holding the actual
-	 * statistical data.
-	 */
-	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
-
-	/* hack to delete both stxdinherit = true/false */
-	for (inh = 0; inh <= 1; inh++)
-	{
-		tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
-							  BoolGetDatum(inh));
-
-		if (!HeapTupleIsValid(tup)) /* should not happen */
-			// elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
-			continue;
-
-		CatalogTupleDelete(relation, &tup->t_self);
-
-		ReleaseSysCache(tup);
-	}
-
-	table_close(relation, RowExclusiveLock);
+	char relkind;
 
 	/*
 	 * Delete the pg_statistic_ext tuple.  Also send out a cache inval on the
@@ -767,6 +746,7 @@ RemoveStatisticsById(Oid statsOid)
 
 	statext = (Form_pg_statistic_ext) GETSTRUCT(tup);
 	relid = statext->stxrelid;
+	relkind = get_rel_relkind(relid);
 
 	CacheInvalidateRelcacheByRelid(relid);
 
@@ -775,6 +755,37 @@ RemoveStatisticsById(Oid statsOid)
 	ReleaseSysCache(tup);
 
 	table_close(relation, RowExclusiveLock);
+
+	/*
+	 * Delete the pg_statistic_ext_data tuples holding the actual
+	 * statistical data.
+	 */
+	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
+
+	if (relkind != RELKIND_PARTITIONED_TABLE)
+		RemoveStatisticsByIdWorker(relation, statsOid, false);
+	RemoveStatisticsByIdWorker(relation, statsOid, true);
+
+	table_close(relation, RowExclusiveLock);
+}
+
+void
+RemoveStatisticsByIdWorker(Relation pg_statistic_ext, Oid statsOid, bool inh)
+{
+	HeapTuple	tup;
+	tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+						  BoolGetDatum(inh));
+
+	/* should not happen,  .. */
+	if (!HeapTupleIsValid(tup))
+	{
+		if (!inh)
+			elog(ERROR, "cache lookup failed for statistics data %u inh %d", statsOid, inh);
+		return;
+	}
+
+	CatalogTupleDelete(pg_statistic_ext, &tup->t_self);
+	ReleaseSysCache(tup);
 }
 
 /*
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 154d48a330..b1d8847529 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -81,6 +81,9 @@ static void set_baserel_partition_key_exprs(Relation relation,
 static void set_baserel_partition_constraint(Relation relation,
 											 RelOptInfo *rel);
 
+static void get_relation_statistics_worker(RelOptInfo *rel, List **stainfos,
+	Oid statOid, HeapTuple htup, Form_pg_statistic_ext staForm, Index varno,
+	bool inh);
 
 /*
  * get_relation_info -
@@ -1301,7 +1304,6 @@ get_relation_constraints(PlannerInfo *root,
 static List *
 get_relation_statistics(RelOptInfo *rel, Relation relation)
 {
-	Index		varno = rel->relid;
 	List	   *statoidlist;
 	List	   *stainfos = NIL;
 	ListCell   *l;
@@ -1312,150 +1314,149 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 	{
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
-		Form_pg_statistic_ext_data dataForm;
 		HeapTuple	htup;
-		HeapTuple	dtup;
-		Bitmapset  *keys = NULL;
-		List	   *exprs = NIL;
-		int			i;
-		int			inh;
 
 		htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
 		if (!HeapTupleIsValid(htup))
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		/*
-		 * Hack to load stats with stxdinherit true/false - there should be
-		 * a better way to do this, I guess.
-		 */
-		for (inh = 0; inh <= 1; inh++)
-		{
-			dtup = SearchSysCache2(STATEXTDATASTXOID,
-								   ObjectIdGetDatum(statOid), BoolGetDatum((bool) inh));
-			if (!HeapTupleIsValid(dtup))
-				continue;
+		get_relation_statistics_worker(rel, &stainfos, statOid, htup, staForm, rel->relid, false);
+		get_relation_statistics_worker(rel, &stainfos, statOid, htup, staForm, rel->relid, true);
 
-			dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
+		ReleaseSysCache(htup);
+	}
 
-			/*
-			 * First, build the array of columns covered.  This is ultimately
-			 * wasted if no stats within the object have actually been built, but
-			 * it doesn't seem worth troubling over that case.
-			 */
-			for (i = 0; i < staForm->stxkeys.dim1; i++)
-				keys = bms_add_member(keys, staForm->stxkeys.values[i]);
+	list_free(statoidlist);
 
-			/*
-			 * Preprocess expressions (if any). We read the expressions, run them
-			 * through eval_const_expressions, and fix the varnos.
-			 */
-			{
-				bool		isnull;
-				Datum		datum;
+	return stainfos;
+}
 
-				/* decode expression (if any) */
-				datum = SysCacheGetAttr(STATEXTOID, htup,
-										Anum_pg_statistic_ext_stxexprs, &isnull);
+static void
+get_relation_statistics_worker(RelOptInfo *rel, List **stainfos, Oid statOid, HeapTuple htup, Form_pg_statistic_ext staForm, Index varno, bool inh)
+{
+	Form_pg_statistic_ext_data dataForm;
+	HeapTuple	dtup;
+	Bitmapset  *keys = NULL;
+	List	   *exprs = NIL;
+
+	dtup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(statOid), BoolGetDatum(inh));
+	if (!HeapTupleIsValid(dtup))
+		return;
 
-				if (!isnull)
-				{
-					char	   *exprsString;
+	dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
 
-					exprsString = TextDatumGetCString(datum);
-					exprs = (List *) stringToNode(exprsString);
-					pfree(exprsString);
+	/*
+	 * First, build the array of columns covered.  This is ultimately
+	 * wasted if no stats within the object have actually been built, but
+	 * it doesn't seem worth troubling over that case.
+	 */
+	for (int i = 0; i < staForm->stxkeys.dim1; i++)
+		keys = bms_add_member(keys, staForm->stxkeys.values[i]);
 
-					/*
-					 * Run the expressions through eval_const_expressions. This is
-					 * not just an optimization, but is necessary, because the
-					 * planner will be comparing them to similarly-processed qual
-					 * clauses, and may fail to detect valid matches without this.
-					 * We must not use canonicalize_qual, however, since these
-					 * aren't qual expressions.
-					 */
-					exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
+	/*
+	 * Preprocess expressions (if any). We read the expressions, run them
+	 * through eval_const_expressions, and fix the varnos.
+	 */
+	{
+		bool		isnull;
+		Datum		datum;
 
-					/* May as well fix opfuncids too */
-					fix_opfuncids((Node *) exprs);
+		/* decode expression (if any) */
+		datum = SysCacheGetAttr(STATEXTOID, htup,
+								Anum_pg_statistic_ext_stxexprs, &isnull);
 
-					/*
-					 * Modify the copies we obtain from the relcache to have the
-					 * correct varno for the parent relation, so that they match
-					 * up correctly against qual clauses.
-					 */
-					if (varno != 1)
-						ChangeVarNodes((Node *) exprs, 1, varno, 0);
-				}
-			}
-
-			/* add one StatisticExtInfo for each kind built */
-			if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		if (!isnull)
+		{
+			char	   *exprsString;
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_NDISTINCT;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+			exprsString = TextDatumGetCString(datum);
+			exprs = (List *) stringToNode(exprsString);
+			pfree(exprsString);
 
-				stainfos = lappend(stainfos, info);
-			}
+			/*
+			 * Run the expressions through eval_const_expressions. This is
+			 * not just an optimization, but is necessary, because the
+			 * planner will be comparing them to similarly-processed qual
+			 * clauses, and may fail to detect valid matches without this.
+			 * We must not use canonicalize_qual, however, since these
+			 * aren't qual expressions.
+			 */
+			exprs = (List *) eval_const_expressions(NULL, (Node *) exprs);
 
-			if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+			/* May as well fix opfuncids too */
+			fix_opfuncids((Node *) exprs);
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_DEPENDENCIES;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+			/*
+			 * Modify the copies we obtain from the relcache to have the
+			 * correct varno for the parent relation, so that they match
+			 * up correctly against qual clauses.
+			 */
+			if (varno != 1)
+				ChangeVarNodes((Node *) exprs, 1, varno, 0);
+		}
+	}
 
-				stainfos = lappend(stainfos, info);
-			}
+	/* add one StatisticExtInfo for each kind built */
+	if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_NDISTINCT;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_MCV;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+		*stainfos = lappend(*stainfos, info);
+	}
 
-				stainfos = lappend(stainfos, info);
-			}
+	if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-			{
-				StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_DEPENDENCIES;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
 
-				info->statOid = statOid;
-				info->inherit = dataForm->stxdinherit;
-				info->rel = rel;
-				info->kind = STATS_EXT_EXPRESSIONS;
-				info->keys = bms_copy(keys);
-				info->exprs = exprs;
+		*stainfos = lappend(*stainfos, info);
+	}
 
-				stainfos = lappend(stainfos, info);
-			}
+	if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-			ReleaseSysCache(dtup);
-		}
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_MCV;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
 
-		ReleaseSysCache(htup);
-		bms_free(keys);
+		*stainfos = lappend(*stainfos, info);
 	}
 
-	list_free(statoidlist);
+	if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
 
-	return stainfos;
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_EXPRESSIONS;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	ReleaseSysCache(dtup);
+	bms_free(keys);
 }
 
 /*
-- 
2.17.0

0006-Refactor-parent-ACL-check.patchtext/x-diff; charset=us-asciiDownload
From 3945ce1c7e5fe31baa37ca23e6011f0cee22e16a Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 18:58:33 -0500
Subject: [PATCH 06/15] Refactor parent ACL check

selfuncs.c is 8k lines long, and this makes it 30 LOC shorter.
---
 src/backend/utils/adt/selfuncs.c | 140 ++++++++++++-------------------
 1 file changed, 52 insertions(+), 88 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 19b4aaf7eb..54324a71c0 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -187,6 +187,8 @@ static char *convert_string_datum(Datum value, Oid typid, Oid collid,
 								  bool *failure);
 static double convert_timevalue_to_scalar(Datum value, Oid typid,
 										  bool *failure);
+static void recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata,
+								Oid relid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 									VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
@@ -5152,51 +5154,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 									(pg_class_aclcheck(rte->relid, userid,
 													   ACL_SELECT) == ACLCHECK_OK);
 
-								/*
-								 * If the user doesn't have permissions to
-								 * access an inheritance child relation, check
-								 * the permissions of the table actually
-								 * mentioned in the query, since most likely
-								 * the user does have that permission.  Note
-								 * that whole-table select privilege on the
-								 * parent doesn't quite guarantee that the
-								 * user could read all columns of the child.
-								 * But in practice it's unlikely that any
-								 * interesting security violation could result
-								 * from allowing access to the expression
-								 * index's stats, so we allow it anyway.  See
-								 * similar code in examine_simple_variable()
-								 * for additional comments.
-								 */
-								if (!vardata->acl_ok &&
-									root->append_rel_array != NULL)
-								{
-									AppendRelInfo *appinfo;
-									Index		varno = index->rel->relid;
-
-									appinfo = root->append_rel_array[varno];
-									while (appinfo &&
-										   planner_rt_fetch(appinfo->parent_relid,
-															root)->rtekind == RTE_RELATION)
-									{
-										varno = appinfo->parent_relid;
-										appinfo = root->append_rel_array[varno];
-									}
-									if (varno != index->rel->relid)
-									{
-										/* Repeat access check on this rel */
-										rte = planner_rt_fetch(varno, root);
-										Assert(rte->rtekind == RTE_RELATION);
-
-										userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-										vardata->acl_ok =
-											rte->securityQuals == NIL &&
-											(pg_class_aclcheck(rte->relid,
-															   userid,
-															   ACL_SELECT) == ACLCHECK_OK);
-									}
-								}
+								recheck_parent_acl(root, vardata, index->rel->relid);
 							}
 							else
 							{
@@ -5287,49 +5245,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 						(pg_class_aclcheck(rte->relid, userid,
 										   ACL_SELECT) == ACLCHECK_OK);
 
-					/*
-					 * If the user doesn't have permissions to access an
-					 * inheritance child relation, check the permissions of
-					 * the table actually mentioned in the query, since most
-					 * likely the user does have that permission.  Note that
-					 * whole-table select privilege on the parent doesn't
-					 * quite guarantee that the user could read all columns of
-					 * the child. But in practice it's unlikely that any
-					 * interesting security violation could result from
-					 * allowing access to the expression stats, so we allow it
-					 * anyway.  See similar code in examine_simple_variable()
-					 * for additional comments.
-					 */
-					if (!vardata->acl_ok &&
-						root->append_rel_array != NULL)
-					{
-						AppendRelInfo *appinfo;
-						Index		varno = onerel->relid;
-
-						appinfo = root->append_rel_array[varno];
-						while (appinfo &&
-							   planner_rt_fetch(appinfo->parent_relid,
-												root)->rtekind == RTE_RELATION)
-						{
-							varno = appinfo->parent_relid;
-							appinfo = root->append_rel_array[varno];
-						}
-						if (varno != onerel->relid)
-						{
-							/* Repeat access check on this rel */
-							rte = planner_rt_fetch(varno, root);
-							Assert(rte->rtekind == RTE_RELATION);
-
-							userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-							vardata->acl_ok =
-								rte->securityQuals == NIL &&
-								(pg_class_aclcheck(rte->relid,
-												   userid,
-												   ACL_SELECT) == ACLCHECK_OK);
-						}
-					}
-
+					recheck_parent_acl(root, vardata, onerel->relid);
 					break;
 				}
 
@@ -5339,6 +5255,54 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 	}
 }
 
+/*
+ * If the user doesn't have permissions to access an inheritance child
+ * relation, check the permissions of the table actually mentioned in the
+ * query, since most likely the user does have that permission.  Note that
+ * whole-table select privilege on the parent doesn't quite guarantee that the
+ * user could read all columns of the child.  But in practice it's unlikely
+ * that any interesting security violation could result from allowing access to
+ * the expression stats, so we allow it anyway.  See similar code in
+ * examine_simple_variable() for additional comments.
+ */
+static void
+recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, Oid relid)
+{
+	RangeTblEntry	*rte;
+	Oid		userid;
+
+	if (!vardata->acl_ok &&
+		root->append_rel_array != NULL)
+	{
+		AppendRelInfo *appinfo;
+		Index		varno = relid;
+
+		appinfo = root->append_rel_array[varno];
+		while (appinfo &&
+			   planner_rt_fetch(appinfo->parent_relid,
+								root)->rtekind == RTE_RELATION)
+		{
+			varno = appinfo->parent_relid;
+			appinfo = root->append_rel_array[varno];
+		}
+
+		if (varno != relid)
+		{
+			/* Repeat access check on this rel */
+			rte = planner_rt_fetch(varno, root);
+			Assert(rte->rtekind == RTE_RELATION);
+
+			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+
+			vardata->acl_ok =
+				rte->securityQuals == NIL &&
+				(pg_class_aclcheck(rte->relid,
+								   userid,
+								   ACL_SELECT) == ACLCHECK_OK);
+		}
+	}
+}
+
 /*
  * examine_simple_variable
  *		Handle a simple Var for examine_variable
-- 
2.17.0

#16Zhihong Yu
zyu@yugabyte.com
In reply to: Justin Pryzby (#15)
Re: extended stats on partitioned tables

On Thu, Dec 2, 2021 at 9:24 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Thu, Nov 04, 2021 at 12:44:45AM +0100, Tomas Vondra wrote:

And I'm not sure we do the right thing after removing children, for

example

(that should drop the inheritance stats, I guess).

Do you mean for inheritance only ? Or partitions too ?
I think for partitions, the stats should stay.
And for inheritence, they can stay, for consistency with partitions,

and since

it does no harm.

I think the behavior should be the same as for data in pg_statistic,
i.e. if we keep/remove those, we should do the same thing for extended
statistics.

That works for column stats the way I proposed for extended stats: child
stats
are never removed, neither when the only child is dropped, nor when
re-running
analyze (that part is actually a bit odd).

Rebased, fixing an intermediate compile error, and typos in the commit
message.

--
Justin

Hi,

+       if (!HeapTupleIsValid(tup)) /* should not happen */
+           // elog(ERROR, "cache lookup failed for statistics data %u",
statsOid);

You may want to remove commented out code.

+           for (i = 0; i < staForm->stxkeys.dim1; i++)
+               keys = bms_add_member(keys, staForm->stxkeys.values[i]);

Since the above code is in a loop now, should keys be cleared across the
outer loop iterations ?

Cheers

#17Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Zhihong Yu (#16)
4 attachment(s)
Re: extended stats on partitioned tables

Hi,

Attached is a rebased and cleaned-up version of these patches, with more
comments, refactorings etc. Justin and Zhihong, can you take a look?

0001 - Ignore extended statistics for inheritance trees

0002 - Build inherited extended stats on partitioned tables

Those are mostly just Justin's patches, with more detailed comments and
updated commit message. I've considered moving the rel->inh check to
statext_clauselist_selectivity, and then removing the check from
dependencies and MCV. But I decided no to do that, because someone might
be calling those functions directly (even if that's very unlikely).

The one thing bugging me a bit is that the regression test checks only a
GROUP BY query. It'd be nice to add queries testing MCV/dependencies
too, but that seems tricky because most queries will use per-partitions
stats.

0003 - Add stxdinherit flag to pg_statistic_ext_data

This is the patch for master, allowing to build stats for both inherits
flag values. It adds the flag to pg_stats_ext_exprs view to, reworked
how we deal with iterating both flags etc. I've adopted most of the
Justin's fixup patches, except that in plancat.c I've refactored how we
load the stats to process keys/expressions just once.

It has the same issue with regression test using just a GROUP BY query,
but if we add a test to 0001/0002, that'll fix this too.

0004 - Refactor parent ACL check

Not sure about this - I doubt saving 30 rows in an 8kB file is really
worth it. Maybe it is, but then maybe we should try cleaning up the
other ACL checks in this file too? Seems mostly orthogonal to this
thread, though.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

0004-Refactor-parent-ACL-check-20211212.patchtext/x-patch; charset=UTF-8; name=0004-Refactor-parent-ACL-check-20211212.patchDownload
From 569099c5d14535a95cc21798cb041cae7eee0e36 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sun, 12 Dec 2021 02:28:41 +0100
Subject: [PATCH 4/4] Refactor parent ACL check

selfuncs.c is 8k lines long, and this makes it 30 LOC shorter.
---
 src/backend/utils/adt/selfuncs.c | 140 ++++++++++++-------------------
 1 file changed, 52 insertions(+), 88 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 886d85b7f3..9f656316bf 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -187,6 +187,8 @@ static char *convert_string_datum(Datum value, Oid typid, Oid collid,
 								  bool *failure);
 static double convert_timevalue_to_scalar(Datum value, Oid typid,
 										  bool *failure);
+static void recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata,
+								Oid relid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 									VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
@@ -5153,51 +5155,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 									(pg_class_aclcheck(rte->relid, userid,
 													   ACL_SELECT) == ACLCHECK_OK);
 
-								/*
-								 * If the user doesn't have permissions to
-								 * access an inheritance child relation, check
-								 * the permissions of the table actually
-								 * mentioned in the query, since most likely
-								 * the user does have that permission.  Note
-								 * that whole-table select privilege on the
-								 * parent doesn't quite guarantee that the
-								 * user could read all columns of the child.
-								 * But in practice it's unlikely that any
-								 * interesting security violation could result
-								 * from allowing access to the expression
-								 * index's stats, so we allow it anyway.  See
-								 * similar code in examine_simple_variable()
-								 * for additional comments.
-								 */
-								if (!vardata->acl_ok &&
-									root->append_rel_array != NULL)
-								{
-									AppendRelInfo *appinfo;
-									Index		varno = index->rel->relid;
-
-									appinfo = root->append_rel_array[varno];
-									while (appinfo &&
-										   planner_rt_fetch(appinfo->parent_relid,
-															root)->rtekind == RTE_RELATION)
-									{
-										varno = appinfo->parent_relid;
-										appinfo = root->append_rel_array[varno];
-									}
-									if (varno != index->rel->relid)
-									{
-										/* Repeat access check on this rel */
-										rte = planner_rt_fetch(varno, root);
-										Assert(rte->rtekind == RTE_RELATION);
-
-										userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-										vardata->acl_ok =
-											rte->securityQuals == NIL &&
-											(pg_class_aclcheck(rte->relid,
-															   userid,
-															   ACL_SELECT) == ACLCHECK_OK);
-									}
-								}
+								recheck_parent_acl(root, vardata, index->rel->relid);
 							}
 							else
 							{
@@ -5302,49 +5260,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 						(pg_class_aclcheck(rte->relid, userid,
 										   ACL_SELECT) == ACLCHECK_OK);
 
-					/*
-					 * If the user doesn't have permissions to access an
-					 * inheritance child relation, check the permissions of
-					 * the table actually mentioned in the query, since most
-					 * likely the user does have that permission.  Note that
-					 * whole-table select privilege on the parent doesn't
-					 * quite guarantee that the user could read all columns of
-					 * the child. But in practice it's unlikely that any
-					 * interesting security violation could result from
-					 * allowing access to the expression stats, so we allow it
-					 * anyway.  See similar code in examine_simple_variable()
-					 * for additional comments.
-					 */
-					if (!vardata->acl_ok &&
-						root->append_rel_array != NULL)
-					{
-						AppendRelInfo *appinfo;
-						Index		varno = onerel->relid;
-
-						appinfo = root->append_rel_array[varno];
-						while (appinfo &&
-							   planner_rt_fetch(appinfo->parent_relid,
-												root)->rtekind == RTE_RELATION)
-						{
-							varno = appinfo->parent_relid;
-							appinfo = root->append_rel_array[varno];
-						}
-						if (varno != onerel->relid)
-						{
-							/* Repeat access check on this rel */
-							rte = planner_rt_fetch(varno, root);
-							Assert(rte->rtekind == RTE_RELATION);
-
-							userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-							vardata->acl_ok =
-								rte->securityQuals == NIL &&
-								(pg_class_aclcheck(rte->relid,
-												   userid,
-												   ACL_SELECT) == ACLCHECK_OK);
-						}
-					}
-
+					recheck_parent_acl(root, vardata, onerel->relid);
 					break;
 				}
 
@@ -5354,6 +5270,54 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 	}
 }
 
+/*
+ * If the user doesn't have permissions to access an inheritance child
+ * relation, check the permissions of the table actually mentioned in the
+ * query, since most likely the user does have that permission.  Note that
+ * whole-table select privilege on the parent doesn't quite guarantee that the
+ * user could read all columns of the child.  But in practice it's unlikely
+ * that any interesting security violation could result from allowing access to
+ * the expression stats, so we allow it anyway.  See similar code in
+ * examine_simple_variable() for additional comments.
+ */
+static void
+recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, Oid relid)
+{
+	RangeTblEntry	*rte;
+	Oid		userid;
+
+	if (!vardata->acl_ok &&
+		root->append_rel_array != NULL)
+	{
+		AppendRelInfo *appinfo;
+		Index		varno = relid;
+
+		appinfo = root->append_rel_array[varno];
+		while (appinfo &&
+			   planner_rt_fetch(appinfo->parent_relid,
+								root)->rtekind == RTE_RELATION)
+		{
+			varno = appinfo->parent_relid;
+			appinfo = root->append_rel_array[varno];
+		}
+
+		if (varno != relid)
+		{
+			/* Repeat access check on this rel */
+			rte = planner_rt_fetch(varno, root);
+			Assert(rte->rtekind == RTE_RELATION);
+
+			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+
+			vardata->acl_ok =
+				rte->securityQuals == NIL &&
+				(pg_class_aclcheck(rte->relid,
+								   userid,
+								   ACL_SELECT) == ACLCHECK_OK);
+		}
+	}
+}
+
 /*
  * examine_simple_variable
  *		Handle a simple Var for examine_variable
-- 
2.31.1

0003-Add-stxdinherit-flag-to-pg_statistic_ext_da-20211212.patchtext/x-patch; charset=UTF-8; name=0003-Add-stxdinherit-flag-to-pg_statistic_ext_da-20211212.patchDownload
From 4c74acba061488f5c0954c4240ca2636591b9c61 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sun, 12 Dec 2021 00:07:27 +0100
Subject: [PATCH 3/4] Add stxdinherit flag to pg_statistic_ext_data

The support for extended statistics on inheritance trees was somewhat
problematic, because the catalog pg_statistic_ext_data did not have a
flag identifying whether the data was built with/without data for the
child relations.

For partitioned tables we've been able to work around this because the
non-leaf relations can't contain data, so there's really just one set of
statistics to store. But for regular inheritance trees we had to pick,
and there no clear winner.

This introduces the pg_statistic_ext_data.stxdinherit flag, analogous to
pg_statistic.stainherit, and modifies analyze to build statistics for
both cases.

This relaxes the relationship between the two catalogs - until now we've
created the pg_statistic_ext_data one when creating the statistics, and
then only updated it later. There was always 1:1 relationship between
rows in the two catalogs. With this change we no longer insert any data
rows while creating statistics, because we don't know which flag value
to create, and it seem wasteful to initialize both for every relation.
The there may be 0, 1 or 2 data rows for each statistics definition.

Patch by me, with extensive improvements and fixes by Justin Pryzby.

Author: Tomas Vondra, Justin Pryzby
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
---
 doc/src/sgml/catalogs.sgml                  |  23 +++
 src/backend/catalog/system_views.sql        |   2 +
 src/backend/commands/analyze.c              |  28 +---
 src/backend/commands/statscmds.c            |  72 +++++-----
 src/backend/optimizer/util/plancat.c        | 147 ++++++++++++--------
 src/backend/statistics/dependencies.c       |  22 ++-
 src/backend/statistics/extended_stats.c     |  75 +++++-----
 src/backend/statistics/mcv.c                |   9 +-
 src/backend/statistics/mvdistinct.c         |   5 +-
 src/backend/utils/adt/selfuncs.c            |  36 +++--
 src/backend/utils/cache/syscache.c          |   6 +-
 src/include/catalog/pg_statistic_ext_data.h |   4 +-
 src/include/commands/defrem.h               |   1 +
 src/include/nodes/pathnodes.h               |   1 +
 src/include/statistics/statistics.h         |  11 +-
 src/test/regress/expected/rules.out         |   2 +
 src/test/regress/expected/stats_ext.out     |  18 ++-
 src/test/regress/sql/stats_ext.sql          |   4 +-
 18 files changed, 251 insertions(+), 215 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 025db98763..9ab2cb52a7 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7521,6 +7521,19 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
    created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>.
   </para>
 
+  <para>
+   Normally there is one entry, with <structfield>stxdinherit</structfield> =
+   <literal>false</literal>, for each statistics object that has been analyzed.
+   If the table has inheritance children, a second entry with
+   <structfield>stxdinherit</structfield> = <literal>true</literal> is also created.
+   This row represents the statistics object over the inheritance tree, i.e.,
+   statistics for the data you'd see with
+   <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>,
+   whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row
+   represents the results of
+   <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>.
+  </para>
+
   <para>
    Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>,
    <structname>pg_statistic_ext_data</structname> should not be
@@ -7560,6 +7573,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stxdinherit</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the stats include inheritance child columns, not just the
+       values in the specified relation
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 61b515cdb8..319653789b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS
            ) AS attnames,
            pg_get_statisticsobjdef_expressions(s.oid) as exprs,
            s.stxkind AS kinds,
+           sd.stxdinherit AS inherited,
            sd.stxdndistinct AS n_distinct,
            sd.stxddependencies AS dependencies,
            m.most_common_vals,
@@ -298,6 +299,7 @@ CREATE VIEW pg_stats_ext_exprs WITH (security_barrier) AS
            s.stxname AS statistics_name,
            pg_get_userbyid(s.stxowner) AS statistics_owner,
            stat.expr,
+           sd.stxdinherit AS inherited,
            (stat.a).stanullfrac AS null_frac,
            (stat.a).stawidth AS avg_width,
            (stat.a).stadistinct AS n_distinct,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 58a82e4929..3436f09d63 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -549,7 +549,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
-		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -613,30 +612,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
-		/*
-		 * Should we build extended statistics for this relation?
-		 *
-		 * The extended statistics catalog does not include an inheritance
-		 * flag, so we can't store statistics built both with and without
-		 * data from child relations. We can store just one set of statistics
-		 * per relation. For plain relations that's fine, but for inheritance
-		 * trees we have to pick whether to store statistics for just the
-		 * one relation or the whole tree. For plain inheritance we store
-		 * the (!inh) version, mostly for backwards compatibility reasons.
-		 * For partitioned tables that's pointless (the non-leaf tables are
-		 * always empty), so we store stats representing the whole tree.
-		 */
-		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
-
-		/*
-		 * Build extended statistics (if there are any).
-		 *
-		 * For now we only build extended statistics on individual relations,
-		 * not for relations representing inheritance trees.
-		 */
-		if (build_ext_stats)
-			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
-									   attr_cnt, vacattrstats);
+		/* Build extended statistics (if there are any). */
+		BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows,
+								   attr_cnt, vacattrstats);
 	}
 
 	pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 8f1550ec80..1fa39e6f21 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -75,13 +75,10 @@ CreateStatistics(CreateStatsStmt *stmt)
 	HeapTuple	htup;
 	Datum		values[Natts_pg_statistic_ext];
 	bool		nulls[Natts_pg_statistic_ext];
-	Datum		datavalues[Natts_pg_statistic_ext_data];
-	bool		datanulls[Natts_pg_statistic_ext_data];
 	int2vector *stxkeys;
 	List	   *stxexprs = NIL;
 	Datum		exprsDatum;
 	Relation	statrel;
-	Relation	datarel;
 	Relation	rel = NULL;
 	Oid			relid;
 	ObjectAddress parentobject,
@@ -514,28 +511,10 @@ CreateStatistics(CreateStatsStmt *stmt)
 	relation_close(statrel, RowExclusiveLock);
 
 	/*
-	 * Also build the pg_statistic_ext_data tuple, to hold the actual
-	 * statistics data.
+	 * We used to create the pg_statistic_ext_data tuple too, but it's not clear
+	 * what value should the stxdinherit flag have (it depends on whether the rel
+	 * is partitioned, contains data, etc.)
 	 */
-	datarel = table_open(StatisticExtDataRelationId, RowExclusiveLock);
-
-	memset(datavalues, 0, sizeof(datavalues));
-	memset(datanulls, false, sizeof(datanulls));
-
-	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
-
-	/* no statistics built yet */
-	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* insert it into pg_statistic_ext_data */
-	htup = heap_form_tuple(datarel->rd_att, datavalues, datanulls);
-	CatalogTupleInsert(datarel, htup);
-	heap_freetuple(htup);
-
-	relation_close(datarel, RowExclusiveLock);
 
 	InvokeObjectPostCreateHook(StatisticExtRelationId, statoid, 0);
 
@@ -717,32 +696,49 @@ AlterStatistics(AlterStatsStmt *stmt)
 }
 
 /*
- * Guts of statistics object deletion.
+ * Delete entry in pg_statistic_ext_data catalog. We don't know if the row
+ * exists, so don't error out.
  */
 void
-RemoveStatisticsById(Oid statsOid)
+RemoveStatisticsDataById(Oid statsOid, bool inh)
 {
 	Relation	relation;
 	HeapTuple	tup;
-	Form_pg_statistic_ext statext;
-	Oid			relid;
 
-	/*
-	 * First delete the pg_statistic_ext_data tuple holding the actual
-	 * statistical data.
-	 */
 	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
-	tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
-
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+	tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+						  BoolGetDatum(inh));
 
-	CatalogTupleDelete(relation, &tup->t_self);
+	/* We don't know if the data row for inh value exists. */
+	if (HeapTupleIsValid(tup))
+	{
+		CatalogTupleDelete(relation, &tup->t_self);
 
-	ReleaseSysCache(tup);
+		ReleaseSysCache(tup);
+	}
 
 	table_close(relation, RowExclusiveLock);
+}
+
+/*
+ * Guts of statistics object deletion.
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+	Relation	relation;
+	HeapTuple	tup;
+	Form_pg_statistic_ext statext;
+	Oid			relid;
+
+	/*
+	 * First delete the pg_statistic_ext_data tuples holding the actual
+	 * statistical data. There might be data with/without inheritance, so
+	 * attempt deleting both.
+	 */
+	RemoveStatisticsDataById(statsOid, true);
+	RemoveStatisticsDataById(statsOid, false);
 
 	/*
 	 * Delete the pg_statistic_ext tuple.  Also send out a cache inval on the
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 564a38a13e..8ba3a3c59d 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
+#include "catalog/pg_statistic_ext_data.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -1276,6 +1277,87 @@ get_relation_constraints(PlannerInfo *root,
 	return result;
 }
 
+/*
+ * Try loading data for the statistics object.
+ *
+ * We don't know if the data (specified by statOid and inh value) exist.
+ * The result is stored in stainfos list.
+ */
+static void
+get_relation_statistics_worker(List **stainfos, RelOptInfo *rel,
+							   Oid statOid, bool inh,
+							   Bitmapset *keys, List *exprs)
+{
+	Form_pg_statistic_ext_data dataForm;
+	HeapTuple	dtup;
+
+	dtup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(statOid), BoolGetDatum(inh));
+	if (!HeapTupleIsValid(dtup))
+		return;
+
+	dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
+
+	/* add one StatisticExtInfo for each kind built */
+	if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_NDISTINCT;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_DEPENDENCIES;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_MCV;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_EXPRESSIONS;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	ReleaseSysCache(dtup);
+}
+
 /*
  * get_relation_statistics
  *		Retrieve extended statistics defined on the table.
@@ -1299,7 +1381,6 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
 		HeapTuple	htup;
-		HeapTuple	dtup;
 		Bitmapset  *keys = NULL;
 		List	   *exprs = NIL;
 		int			i;
@@ -1309,10 +1390,6 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-		if (!HeapTupleIsValid(dtup))
-			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
 		/*
 		 * First, build the array of columns covered.  This is ultimately
 		 * wasted if no stats within the object have actually been built, but
@@ -1324,6 +1401,11 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 		/*
 		 * Preprocess expressions (if any). We read the expressions, run them
 		 * through eval_const_expressions, and fix the varnos.
+		 *
+		 * XXX We don't know yet if there are any data for this stats object,
+		 * with either stxdinherit value. But it's reasonable to assume there
+		 * is at least one of those, possibly both. So it's better to process
+		 * keys and expressions here.
 		 */
 		{
 			bool		isnull;
@@ -1364,61 +1446,14 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 			}
 		}
 
-		/* add one StatisticExtInfo for each kind built */
-		if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_NDISTINCT;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_DEPENDENCIES;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_MCV;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		/* extract statistics for possible values of stxdinherit flag */
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_EXPRESSIONS;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+		get_relation_statistics_worker(&stainfos, rel, statOid, true, keys, exprs);
 
-			stainfos = lappend(stainfos, info);
-		}
+		get_relation_statistics_worker(&stainfos, rel, statOid, false, keys, exprs);
 
 		ReleaseSysCache(htup);
-		ReleaseSysCache(dtup);
+
 		bms_free(keys);
 	}
 
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index a22ca73807..672091b517 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -619,14 +619,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums)
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
 MVDependencies *
-statext_dependencies_load(Oid mvoid)
+statext_dependencies_load(Oid mvoid, bool inh)
 {
 	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid),
+						   BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
@@ -1417,16 +1419,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
-	/*
-	 * When dealing with regular inheritence trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data from partitions are
-	 * in the stats (and there's no data in the non-leaf relations), so
-	 * in this case we do consider extended stats.
-	 */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return 1.0;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
@@ -1610,6 +1602,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
 			continue;
 
+		/* skip statistics with mismatching stxdinherit value */
+		if (stat->inherit != rte->inh)
+			continue;
+
 		/*
 		 * Count matching attributes - we have to undo the attnum offsets. The
 		 * input attribute numbers are not offset (expressions are not
@@ -1656,7 +1652,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (nmatched + nexprs < 2)
 			continue;
 
-		deps = statext_dependencies_load(stat->statOid);
+		deps = statext_dependencies_load(stat->statOid, rte->inh);
 
 		/*
 		 * The expressions may be represented by different attnums in the
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 68be000f3d..3423603dd7 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -25,6 +25,7 @@
 #include "catalog/pg_statistic_ext.h"
 #include "catalog/pg_statistic_ext_data.h"
 #include "executor/executor.h"
+#include "commands/defrem.h"
 #include "commands/progress.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
@@ -78,7 +79,7 @@ typedef struct StatExtEntry
 static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
 static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 											int nvacatts, VacAttrStats **vacatts);
-static void statext_store(Oid statOid,
+static void statext_store(Oid statOid, bool inh,
 						  MVNDistinct *ndistinct, MVDependencies *dependencies,
 						  MCVList *mcv, Datum exprs, VacAttrStats **stats);
 static int	statext_compute_stattarget(int stattarget,
@@ -111,7 +112,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat,
  * requested stats, and serializes them back into the catalog.
  */
 void
-BuildRelationExtStatistics(Relation onerel, double totalrows,
+BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 						   int numrows, HeapTuple *rows,
 						   int natts, VacAttrStats **vacattrstats)
 {
@@ -231,7 +232,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		}
 
 		/* store the statistics in the catalog */
-		statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats);
+		statext_store(stat->statOid, inh,
+					  ndistinct, dependencies, mcv, exprstats, stats);
 
 		/* for reporting progress */
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -782,23 +784,27 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
  *	tuple.
  */
 static void
-statext_store(Oid statOid,
+statext_store(Oid statOid, bool inh,
 			  MVNDistinct *ndistinct, MVDependencies *dependencies,
 			  MCVList *mcv, Datum exprs, VacAttrStats **stats)
 {
 	Relation	pg_stextdata;
-	HeapTuple	stup,
-				oldtup;
+	HeapTuple	stup;
 	Datum		values[Natts_pg_statistic_ext_data];
 	bool		nulls[Natts_pg_statistic_ext_data];
-	bool		replaces[Natts_pg_statistic_ext_data];
 
 	pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
 	memset(nulls, true, sizeof(nulls));
-	memset(replaces, false, sizeof(replaces));
 	memset(values, 0, sizeof(values));
 
+	/* basic info */
+	values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid);
+	nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false;
+
+	values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh);
+	nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false;
+
 	/*
 	 * Construct a new pg_statistic_ext_data tuple, replacing the calculated
 	 * stats.
@@ -831,25 +837,15 @@ statext_store(Oid statOid,
 		values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs;
 	}
 
-	/* always replace the value (either by bytea or NULL) */
-	replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* there should already be a pg_statistic_ext_data tuple */
-	oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-	if (!HeapTupleIsValid(oldtup))
-		elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
-	/* replace it */
-	stup = heap_modify_tuple(oldtup,
-							 RelationGetDescr(pg_stextdata),
-							 values,
-							 nulls,
-							 replaces);
-	ReleaseSysCache(oldtup);
-	CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup);
+	/*
+	 * Delete the old tuple if it exists, and insert a new one. It's easier
+	 * than trying to update or insert, based on various conditions.
+	 */
+	RemoveStatisticsDataById(statOid, inh);
+
+	/* form and insert a new tuple */
+	stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls);
+	CatalogTupleInsert(pg_stextdata, stup);
 
 	heap_freetuple(stup);
 
@@ -1235,7 +1231,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs,
  * further tiebreakers are needed.
  */
 StatisticExtInfo *
-choose_best_statistics(List *stats, char requiredkind,
+choose_best_statistics(List *stats, char requiredkind, bool inh,
 					   Bitmapset **clause_attnums, List **clause_exprs,
 					   int nclauses)
 {
@@ -1257,6 +1253,10 @@ choose_best_statistics(List *stats, char requiredkind,
 		if (info->kind != requiredkind)
 			continue;
 
+		/* skip statistics with mismatching inheritance flag */
+		if (info->inherit != inh)
+			continue;
+
 		/*
 		 * Collect attributes and expressions in remaining (unestimated)
 		 * clauses fully covered by this statistic object.
@@ -1697,16 +1697,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
 	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
 
-	/*
-	 * When dealing with regular inheritence trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data from partitions are
-	 * in the stats (and there's no data in the non-leaf relations), so
-	 * in this case we do consider extended stats.
-	 */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return sel;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
 		return sel;
@@ -1758,7 +1748,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		Bitmapset  *simple_clauses;
 
 		/* find the best suited statistics object for these attnums */
-		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
+		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
 									  list_length(clauses));
 
@@ -1847,7 +1837,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			MCVList    *mcv_list;
 
 			/* Load the MCV list stored in the statistics object */
-			mcv_list = statext_mcv_load(stat->statOid);
+			mcv_list = statext_mcv_load(stat->statOid, rte->inh);
 
 			/*
 			 * Compute the selectivity of the ORed list of clauses covered by
@@ -2408,7 +2398,7 @@ serialize_expr_stats(AnlExprData *exprdata, int nexprs)
  * identified by the supplied index.
  */
 HeapTuple
-statext_expressions_load(Oid stxoid, int idx)
+statext_expressions_load(Oid stxoid, bool inh, int idx)
 {
 	bool		isnull;
 	Datum		value;
@@ -2418,7 +2408,8 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(stxoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index b350fc5f7b..75d7d35caa 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups,
  *		Load the MCV list for the indicated pg_statistic_ext tuple.
  */
 MCVList *
-statext_mcv_load(Oid mvoid)
+statext_mcv_load(Oid mvoid, bool inh)
 {
 	MCVList    *result;
 	bool		isnull;
 	Datum		mcvlist;
-	HeapTuple	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	HeapTuple	htup = SearchSysCache2(STATEXTDATASTXOID,
+									   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
@@ -2039,11 +2040,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
 	MCVList    *mcv;
 	Selectivity s = 0.0;
 
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
 	/* match/mismatch bitmap for each MCV item */
 	bool	   *matches = NULL;
 
 	/* load the MCV list stored in the statistics object */
-	mcv = statext_mcv_load(stat->statOid);
+	mcv = statext_mcv_load(stat->statOid, rte->inh);
 
 	/* build a match bitmap for the clauses */
 	matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs,
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 4481312d61..ab1f10d6c0 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
  *		Load the ndistinct value for the indicated pg_statistic_ext tuple
  */
 MVNDistinct *
-statext_ndistinct_load(Oid mvoid)
+statext_ndistinct_load(Oid mvoid, bool inh)
 {
 	MVNDistinct *result;
 	bool		isnull;
 	Datum		ndist;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index b25337370b..886d85b7f3 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3919,17 +3919,6 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	if (!rel->statlist)
 		return false;
 
-	/*
-	 * When dealing with regular inheritence trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data from partitions are
-	 * in the stats (and there's no data in the non-leaf relations), so
-	 * in this case we do consider extended stats.
-	 */
-	rte = planner_rt_fetch(rel->relid, root);
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return false;
-
 	/* look for the ndistinct statistics object matching the most vars */
 	nmatches_vars = 0;			/* we require at least two matches */
 	nmatches_exprs = 0;
@@ -4015,7 +4004,8 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 
 	Assert(nmatches_vars + nmatches_exprs > 1);
 
-	stats = statext_ndistinct_load(statOid);
+	rte = planner_rt_fetch(rel->relid, root);
+	stats = statext_ndistinct_load(statOid, rte->inh);
 
 	/*
 	 * If we have a match, search it for the specific item that matches (there
@@ -5270,22 +5260,28 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				/* found a match, see if we can extract pg_statistic row */
 				if (equal(node, expr))
 				{
-					HeapTuple	t = statext_expressions_load(info->statOid, pos);
-
-					/* Get statistics object's table for permission check */
-					RangeTblEntry *rte;
+					RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
 					Oid			userid;
+					bool		inh;
 
-					vardata->statsTuple = t;
+					Assert(rte->rtekind == RTE_RELATION);
+
+					/*
+					 * Determine if we'redealing with inheritance tree.
+					 *
+					 * XXX Can't we simply look at rte->inh?
+					 */
+					inh = root->append_rel_array == NULL ? false :
+						root->append_rel_array[onerel->relid]->parent_relid != 0;
 
 					/*
 					 * XXX Not sure if we should cache the tuple somewhere.
 					 * Now we just create a new copy every time.
 					 */
-					vardata->freefunc = ReleaseDummy;
+					vardata->statsTuple =
+						statext_expressions_load(info->statOid, inh, pos);
 
-					rte = planner_rt_fetch(onerel->relid, root);
-					Assert(rte->rtekind == RTE_RELATION);
+					vardata->freefunc = ReleaseDummy;
 
 					/*
 					 * Use checkAsUser if it's set, in case we're accessing
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 56870b46e4..6194751e99 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -763,11 +763,11 @@ static const struct cachedesc cacheinfo[] = {
 		32
 	},
 	{StatisticExtDataRelationId,	/* STATEXTDATASTXOID */
-		StatisticExtDataStxoidIndexId,
-		1,
+		StatisticExtDataStxoidInhIndexId,
+		2,
 		{
 			Anum_pg_statistic_ext_data_stxoid,
-			0,
+			Anum_pg_statistic_ext_data_stxdinherit,
 			0,
 			0
 		},
diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h
index 7b73b790d2..8ffd8b68cd 100644
--- a/src/include/catalog/pg_statistic_ext_data.h
+++ b/src/include/catalog/pg_statistic_ext_data.h
@@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId)
 {
 	Oid			stxoid BKI_LOOKUP(pg_statistic_ext);	/* statistics object
 														 * this data is for */
+	bool		stxdinherit;	/* true if inheritance children are included */
 
 #ifdef CATALOG_VARLEN			/* variable-length fields start here */
 
@@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data;
 
 DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431);
 
-DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops));
+DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops, stxdinherit bool_ops));
+
 
 #endif							/* PG_STATISTIC_EXT_DATA_H */
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index f84d09959c..ab3c59fcd5 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -83,6 +83,7 @@ extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
 extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
 extern ObjectAddress AlterStatistics(AlterStatsStmt *stmt);
 extern void RemoveStatisticsById(Oid statsOid);
+extern void RemoveStatisticsDataById(Oid statsOid, bool inh);
 extern Oid	StatisticsGetRelation(Oid statId, bool missing_ok);
 
 /* commands/aggregatecmds.c */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 324d92880b..02aff9847e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -934,6 +934,7 @@ typedef struct StatisticExtInfo
 	NodeTag		type;
 
 	Oid			statOid;		/* OID of the statistics row */
+	bool		inherit;		/* includes child relations */
 	RelOptInfo *rel;			/* back-link to statistic's table */
 	char		kind;			/* statistics kind of this entry */
 	Bitmapset  *keys;			/* attnums of the columns covered */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 326cf26fea..3868e43f8a 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -94,11 +94,11 @@ typedef struct MCVList
 	MCVItem		items[FLEXIBLE_ARRAY_MEMBER];	/* array of MCV items */
 } MCVList;
 
-extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
-extern MVDependencies *statext_dependencies_load(Oid mvoid);
-extern MCVList *statext_mcv_load(Oid mvoid);
+extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh);
+extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh);
+extern MCVList *statext_mcv_load(Oid mvoid, bool inh);
 
-extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
+extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 									   int numrows, HeapTuple *rows,
 									   int natts, VacAttrStats **vacattrstats);
 extern int	ComputeExtStatisticsRows(Relation onerel,
@@ -121,9 +121,10 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
 												  bool is_or);
 extern bool has_stats_of_kind(List *stats, char requiredkind);
 extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
+												bool inh,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
-extern HeapTuple statext_expressions_load(Oid stxoid, int idx);
+extern HeapTuple statext_expressions_load(Oid stxoid, bool inh, int idx);
 
 #endif							/* STATISTICS_H */
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b58b062b10..d652f7b5fb 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2443,6 +2443,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname,
              JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames,
     pg_get_statisticsobjdef_expressions(s.oid) AS exprs,
     s.stxkind AS kinds,
+    sd.stxdinherit AS inherited,
     sd.stxdndistinct AS n_distinct,
     sd.stxddependencies AS dependencies,
     m.most_common_vals,
@@ -2469,6 +2470,7 @@ pg_stats_ext_exprs| SELECT cn.nspname AS schemaname,
     s.stxname AS statistics_name,
     pg_get_userbyid(s.stxowner) AS statistics_owner,
     stat.expr,
+    sd.stxdinherit AS inherited,
     (stat.a).stanullfrac AS null_frac,
     (stat.a).stawidth AS avg_width,
     (stat.a).stadistinct AS n_distinct,
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index a11fff655a..427884cda8 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -144,10 +144,9 @@ SELECT stxname, stxdndistinct, stxddependencies, stxdmcv
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxname = 'ab1_a_b_stats'
    AND d.stxoid = s.oid;
-    stxname    | stxdndistinct | stxddependencies | stxdmcv 
----------------+---------------+------------------+---------
- ab1_a_b_stats |               |                  | 
-(1 row)
+ stxname | stxdndistinct | stxddependencies | stxdmcv 
+---------+---------------+------------------+---------
+(0 rows)
 
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS -1;
 \d+ ab1
@@ -192,11 +191,18 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
--- Since the stats object does not include inherited stats, it should not affect the estimates
+-- See if the extended stats affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
  estimated | actual 
 -----------+--------
-      1000 |   1008
+      1008 |   1008
+(1 row)
+
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+         9 |      9
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 01383e8f5f..8deb010b3c 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -123,8 +123,10 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
--- Since the stats object does not include inherited stats, it should not affect the estimates
+-- See if the extended stats affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
 -- Ensure inherited stats ARE applied to inherited query in partitioned table
-- 
2.31.1

0002-Build-inherited-extended-stats-on-partition-20211212.patchtext/x-patch; charset=UTF-8; name=0002-Build-inherited-extended-stats-on-partition-20211212.patchDownload
From f2918393ff311ec1646b0082f1b0a0f35a88e77c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sat, 11 Dec 2021 22:38:08 +0100
Subject: [PATCH 2/4] Build inherited extended stats on partitioned tables

Commit 859b3003de disabled building of extended stats for inheritance
trees, to prevent updating the same catalog row twice. That resolved the
issue, but it also means there are no extended stats for partitioned
tables, because the (!inh) case has empty sample. This is a regression,
affecting queries that don't estimate the leaf relations directly.

But because partitioned tables are empty, we can invert the condition
and build statistics only for the case with inheritance, without losing
anything. And we can consider them when calculating estimates.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
---
 src/backend/commands/analyze.c          | 18 +++++++++++++++++-
 src/backend/statistics/dependencies.c   |  9 ++++++---
 src/backend/statistics/extended_stats.c |  9 ++++++---
 src/backend/utils/adt/selfuncs.c        | 20 ++++++++++++--------
 src/test/regress/expected/stats_ext.out | 19 +++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 10 ++++++++++
 6 files changed, 70 insertions(+), 15 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index cd77907fc7..58a82e4929 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -549,6 +549,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
+		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -612,13 +613,28 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
+		/*
+		 * Should we build extended statistics for this relation?
+		 *
+		 * The extended statistics catalog does not include an inheritance
+		 * flag, so we can't store statistics built both with and without
+		 * data from child relations. We can store just one set of statistics
+		 * per relation. For plain relations that's fine, but for inheritance
+		 * trees we have to pick whether to store statistics for just the
+		 * one relation or the whole tree. For plain inheritance we store
+		 * the (!inh) version, mostly for backwards compatibility reasons.
+		 * For partitioned tables that's pointless (the non-leaf tables are
+		 * always empty), so we store stats representing the whole tree.
+		 */
+		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
+
 		/*
 		 * Build extended statistics (if there are any).
 		 *
 		 * For now we only build extended statistics on individual relations,
 		 * not for relations representing inheritance trees.
 		 */
-		if (!inh)
+		if (build_ext_stats)
 			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
 									   attr_cnt, vacattrstats);
 	}
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 88479eee8b..a22ca73807 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1418,10 +1418,13 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			unique_exprs_cnt;
 
 	/*
-	 * When dealing with inheritence trees, ignore extended stats (which were
-	 * built without data from child rels, and thus do not represent them).
+	 * When dealing with regular inheritence trees, ignore extended stats
+	 * (which were built without data from child rels, and thus do not
+	 * represent them). For partitioned tables data from partitions are
+	 * in the stats (and there's no data in the non-leaf relations), so
+	 * in this case we do consider extended stats.
 	 */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return 1.0;
 
 	/* check if there's any stats that might be useful for us. */
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 607b02ab8e..68be000f3d 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1698,10 +1698,13 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
 
 	/*
-	 * When dealing with inheritence trees, ignore extended stats (which were
-	 * built without data from child rels, and thus do not represent them).
+	 * When dealing with regular inheritence trees, ignore extended stats
+	 * (which were built without data from child rels, and thus do not
+	 * represent them). For partitioned tables data from partitions are
+	 * in the stats (and there's no data in the non-leaf relations), so
+	 * in this case we do consider extended stats.
 	 */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return sel;
 
 	/* check if there's any stats that might be useful for us. */
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 0b12f5288e..b25337370b 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,19 +3913,23 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
-	RangeTblEntry		*rte = planner_rt_fetch(rel->relid, root);
-
-	/*
-	 * When dealing with inheritence trees, ignore extended stats (which were
-	 * built without data from child rels, and thus do not represent them).
-	 */
-	if (rte->inh)
-		return false;
+	RangeTblEntry		*rte;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
 		return false;
 
+	/*
+	 * When dealing with regular inheritence trees, ignore extended stats
+	 * (which were built without data from child rels, and thus do not
+	 * represent them). For partitioned tables data from partitions are
+	 * in the stats (and there's no data in the non-leaf relations), so
+	 * in this case we do consider extended stats.
+	 */
+	rte = planner_rt_fetch(rel->relid, root);
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
+		return false;
+
 	/* look for the ndistinct statistics object matching the most vars */
 	nmatches_vars = 0;			/* we require at least two matches */
 	nmatches_exprs = 0;
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 5410f58f7f..a11fff655a 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -200,6 +200,25 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+ ?column? 
+----------
+        1
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+        10 |     10
+(1 row)
+
+DROP TABLE stxdinp;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index a196d5e2f8..01383e8f5f 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -127,6 +127,16 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+DROP TABLE stxdinp;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.31.1

0001-Ignore-extended-statistics-for-inheritance--20211212.patchtext/x-patch; charset=UTF-8; name=0001-Ignore-extended-statistics-for-inheritance--20211212.patchDownload
From 322b52b92649c2625fbbba369b767029ceff8051 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 19:42:41 -0500
Subject: [PATCH 1/4] Ignore extended statistics for inheritance trees

Since commit 859b3003de we only build extended statistics for relations
without the child relations, so that we update the catalog row only
once. However, the non-inherited statistics were still used for planning
of queries on inheritance trees, which may produce bogus estimates.

This is roughly the same issue 427c6b5b9 addressed ~15 years ago, and we
fix it the same way - by ignoring extended statistics when calculating
estimates for the inheritance tree as a whole. We still consider
extended statistics when calculating estimates for individual child
relations, of course.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
---
 src/backend/statistics/dependencies.c   |  9 +++++++++
 src/backend/statistics/extended_stats.c |  9 +++++++++
 src/backend/utils/adt/selfuncs.c        | 16 ++++++++++++++++
 src/test/regress/expected/stats_ext.out | 24 ++++++++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 15 +++++++++++++++
 5 files changed, 73 insertions(+)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..88479eee8b 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -24,6 +24,7 @@
 #include "nodes/pathnodes.h"
 #include "optimizer/clauses.h"
 #include "optimizer/optimizer.h"
+#include "parser/parsetree.h"
 #include "statistics/extended_stats_internal.h"
 #include "statistics/statistics.h"
 #include "utils/bytea.h"
@@ -1410,11 +1411,19 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
 
 	/* unique expressions */
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
+	/*
+	 * When dealing with inheritence trees, ignore extended stats (which were
+	 * built without data from child rels, and thus do not represent them).
+	 */
+	if (rte->inh)
+		return 1.0;
+
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 69ca52094f..607b02ab8e 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -30,6 +30,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/optimizer.h"
+#include "parser/parsetree.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "statistics/extended_stats_internal.h"
@@ -1694,6 +1695,14 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
+
+	/*
+	 * When dealing with inheritence trees, ignore extended stats (which were
+	 * built without data from child rels, and thus do not represent them).
+	 */
+	if (rte->inh)
+		return sel;
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 10895fb287..0b12f5288e 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,6 +3913,14 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
+	RangeTblEntry		*rte = planner_rt_fetch(rel->relid, root);
+
+	/*
+	 * When dealing with inheritence trees, ignore extended stats (which were
+	 * built without data from child rels, and thus do not represent them).
+	 */
+	if (rte->inh)
+		return false;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
@@ -5232,6 +5240,14 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
+			/*
+			 * When dealing with inheritence trees, ignore extended stats (which
+			 * were built without data from child rels, and so do not represent
+			 * them).
+			 */
+			if (planner_rt_fetch(onerel->relid, root)->inh)
+				break;
+
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index c60ba45aba..5410f58f7f 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,6 +176,30 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
+-- Tests for stats with inheritance
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+DROP TABLE stxdinh, stxdinh1;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 6fb37962a7..a196d5e2f8 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,6 +112,21 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
+-- Tests for stats with inheritance
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+DROP TABLE stxdinh, stxdinh1;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.31.1

#18Zhihong Yu
zyu@yugabyte.com
In reply to: Tomas Vondra (#17)
Re: extended stats on partitioned tables

On Sat, Dec 11, 2021 at 8:17 PM Tomas Vondra <tomas.vondra@enterprisedb.com>
wrote:

Hi,

Attached is a rebased and cleaned-up version of these patches, with more
comments, refactorings etc. Justin and Zhihong, can you take a look?

0001 - Ignore extended statistics for inheritance trees

0002 - Build inherited extended stats on partitioned tables

Those are mostly just Justin's patches, with more detailed comments and
updated commit message. I've considered moving the rel->inh check to
statext_clauselist_selectivity, and then removing the check from
dependencies and MCV. But I decided no to do that, because someone might
be calling those functions directly (even if that's very unlikely).

The one thing bugging me a bit is that the regression test checks only a
GROUP BY query. It'd be nice to add queries testing MCV/dependencies
too, but that seems tricky because most queries will use per-partitions
stats.

0003 - Add stxdinherit flag to pg_statistic_ext_data

This is the patch for master, allowing to build stats for both inherits
flag values. It adds the flag to pg_stats_ext_exprs view to, reworked
how we deal with iterating both flags etc. I've adopted most of the
Justin's fixup patches, except that in plancat.c I've refactored how we
load the stats to process keys/expressions just once.

It has the same issue with regression test using just a GROUP BY query,
but if we add a test to 0001/0002, that'll fix this too.

0004 - Refactor parent ACL check

Not sure about this - I doubt saving 30 rows in an 8kB file is really
worth it. Maybe it is, but then maybe we should try cleaning up the
other ACL checks in this file too? Seems mostly orthogonal to this
thread, though.

Hi,

For patch 3, in commit message:

and there no clear winner. -> and there is no clear winner.

and it seem wasteful -> and it seems wasteful

The there may be -> There may be

+       /* skip statistics with mismatching stxdinherit value */
+       if (stat->inherit != rte->inh)

Should a log be added for the above case ?

+ * Determine if we'redealing with inheritance tree.

There should be a space between re and dealing.

Cheers

regards

Show quoted text

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#19Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Zhihong Yu (#18)
Re: extended stats on partitioned tables

On 12/12/21 05:38, Zhihong Yu wrote:

On Sat, Dec 11, 2021 at 8:17 PM Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
wrote:

Hi,

Attached is a rebased and cleaned-up version of these patches, with more
comments, refactorings etc. Justin and Zhihong, can you take a look?

0001 - Ignore extended statistics for inheritance trees

0002 - Build inherited extended stats on partitioned tables

Those are mostly just Justin's patches, with more detailed comments and
updated commit message. I've considered moving the rel->inh check to
statext_clauselist_selectivity, and then removing the check from
dependencies and MCV. But I decided no to do that, because someone might
be calling those functions directly (even if that's very unlikely).

The one thing bugging me a bit is that the regression test checks only a
GROUP BY query. It'd be nice to add queries testing MCV/dependencies
too, but that seems tricky because most queries will use per-partitions
stats.

0003 - Add stxdinherit flag to pg_statistic_ext_data

This is the patch for master, allowing to build stats for both inherits
flag values. It adds the flag to pg_stats_ext_exprs view to, reworked
how we deal with iterating both flags etc. I've adopted most of the
Justin's fixup patches, except that in plancat.c I've refactored how we
load the stats to process keys/expressions just once.

It has the same issue with regression test using just a GROUP BY query,
but if we add a test to 0001/0002, that'll fix this too.

0004 - Refactor parent ACL check

Not sure about this - I doubt saving 30 rows in an 8kB file is really
worth it. Maybe it is, but then maybe we should try cleaning up the
other ACL checks in this file too? Seems mostly orthogonal to this
thread, though.

Hi,
For patch 3, in commit message:

and there no clear winner. -> and there is no clear winner. 

and it seem wasteful -> and it seems wasteful

The there may be -> There may be

Thanks, will fix.

+       /* skip statistics with mismatching stxdinherit value */
+       if (stat->inherit != rte->inh)

Should a log be added for the above case ?

Why should we log this? It's an entirely expected case - there's a
mismatch between inheritance for the relation and statistics, simply
skipping it is the right thing to do.

+                    * Determine if we'redealing with inheritance tree.

There should be a space between re and dealing.

Thanks, will fix.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20Zhihong Yu
zyu@yugabyte.com
In reply to: Tomas Vondra (#19)
Re: extended stats on partitioned tables

On Sat, Dec 11, 2021 at 9:14 PM Tomas Vondra <tomas.vondra@enterprisedb.com>
wrote:

On 12/12/21 05:38, Zhihong Yu wrote:

On Sat, Dec 11, 2021 at 8:17 PM Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
wrote:

Hi,

Attached is a rebased and cleaned-up version of these patches, with

more

comments, refactorings etc. Justin and Zhihong, can you take a look?

0001 - Ignore extended statistics for inheritance trees

0002 - Build inherited extended stats on partitioned tables

Those are mostly just Justin's patches, with more detailed comments

and

updated commit message. I've considered moving the rel->inh check to
statext_clauselist_selectivity, and then removing the check from
dependencies and MCV. But I decided no to do that, because someone

might

be calling those functions directly (even if that's very unlikely).

The one thing bugging me a bit is that the regression test checks

only a

GROUP BY query. It'd be nice to add queries testing MCV/dependencies
too, but that seems tricky because most queries will use

per-partitions

stats.

0003 - Add stxdinherit flag to pg_statistic_ext_data

This is the patch for master, allowing to build stats for both

inherits

flag values. It adds the flag to pg_stats_ext_exprs view to, reworked
how we deal with iterating both flags etc. I've adopted most of the
Justin's fixup patches, except that in plancat.c I've refactored how

we

load the stats to process keys/expressions just once.

It has the same issue with regression test using just a GROUP BY

query,

but if we add a test to 0001/0002, that'll fix this too.

0004 - Refactor parent ACL check

Not sure about this - I doubt saving 30 rows in an 8kB file is really
worth it. Maybe it is, but then maybe we should try cleaning up the
other ACL checks in this file too? Seems mostly orthogonal to this
thread, though.

Hi,
For patch 3, in commit message:

and there no clear winner. -> and there is no clear winner.

and it seem wasteful -> and it seems wasteful

The there may be -> There may be

Thanks, will fix.

+       /* skip statistics with mismatching stxdinherit value */
+       if (stat->inherit != rte->inh)

Should a log be added for the above case ?

Why should we log this? It's an entirely expected case - there's a
mismatch between inheritance for the relation and statistics, simply
skipping it is the right thing to do.

Hi,
I agree that skipping should be fine (to avoid too much logging).

Thanks

Show quoted text

+ * Determine if we'redealing with inheritance tree.

There should be a space between re and dealing.

Thanks, will fix.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#21Zhihong Yu
zyu@yugabyte.com
In reply to: Tomas Vondra (#17)
Re: extended stats on partitioned tables

On Sat, Dec 11, 2021 at 8:17 PM Tomas Vondra <tomas.vondra@enterprisedb.com>
wrote:

Hi,

Attached is a rebased and cleaned-up version of these patches, with more
comments, refactorings etc. Justin and Zhihong, can you take a look?

0001 - Ignore extended statistics for inheritance trees

0002 - Build inherited extended stats on partitioned tables

Those are mostly just Justin's patches, with more detailed comments and
updated commit message. I've considered moving the rel->inh check to
statext_clauselist_selectivity, and then removing the check from
dependencies and MCV. But I decided no to do that, because someone might
be calling those functions directly (even if that's very unlikely).

The one thing bugging me a bit is that the regression test checks only a
GROUP BY query. It'd be nice to add queries testing MCV/dependencies
too, but that seems tricky because most queries will use per-partitions
stats.

0003 - Add stxdinherit flag to pg_statistic_ext_data

This is the patch for master, allowing to build stats for both inherits
flag values. It adds the flag to pg_stats_ext_exprs view to, reworked
how we deal with iterating both flags etc. I've adopted most of the
Justin's fixup patches, except that in plancat.c I've refactored how we
load the stats to process keys/expressions just once.

It has the same issue with regression test using just a GROUP BY query,
but if we add a test to 0001/0002, that'll fix this too.

0004 - Refactor parent ACL check

Not sure about this - I doubt saving 30 rows in an 8kB file is really
worth it. Maybe it is, but then maybe we should try cleaning up the
other ACL checks in this file too? Seems mostly orthogonal to this
thread, though.

Hi,

For patch 1, minor comment:

+ if (planner_rt_fetch(onerel->relid, root)->inh)

Since the rte (RangeTblEntry*) doesn't seem to be used beyond checking inh,
I think it would be better if the above style of checking is used
throughout the patch (without introducing rte variable).

Cheers

Show quoted text

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#22Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Zhihong Yu (#20)
Re: extended stats on partitioned tables

On 12/12/21 14:47, Zhihong Yu wrote:

On Sat, Dec 11, 2021 at 9:14 PM Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
wrote:

...

+       /* skip statistics with mismatching stxdinherit value */
+       if (stat->inherit != rte->inh)

Should a log be added for the above case ?

Why should we log this? It's an entirely expected case - there's a
mismatch between inheritance for the relation and statistics, simply
skipping it is the right thing to do.

Hi,
I agree that skipping should be fine (to avoid too much logging).

I'm not sure it's related to the amount of logging, really. It'd be just
noise without any practical use, even for debugging purposes. If you
have an inheritance tree, it'll automatically have one set of statistics
for inh=true and one for inh=false. And this condition will always skip
one of those, depending on what query is being estimated.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#23Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Zhihong Yu (#21)
Re: extended stats on partitioned tables

On 12/12/21 16:37, Zhihong Yu wrote:

Hi,
For patch 1, minor comment:

+           if (planner_rt_fetch(onerel->relid, root)->inh)

Since the rte (RangeTblEntry*) doesn't seem to be used beyond checking
inh, I think it would be better if the above style of checking is used
throughout the patch (without introducing rte variable).

It's mostly a matter of personal taste, but I always found this style of
condition (i.e. dereferencing a pointer returned by a function) much
less readable. It's hard to parse what exactly is happening, what struct
type are we dealing with, etc. YMMV but the separate variable makes it
much clearer for me. And I'd expect the compilers to produce pretty much
the same code too for those cases.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#24Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#17)
Re: extended stats on partitioned tables
+          * XXX Can't we simply look at rte->inh?
+          */
+         inh = root->append_rel_array == NULL ? false :
+                 root->append_rel_array[onerel->relid]->parent_relid != 0;

I think so. That's what I came up with while trying to figured this out, and
it's no great surprise that it needed to be cleaned up - thanks.

In your 0003 patch, the "if inh: break" isn't removed from examine_variable(),
but the corresponding thing is removed everywhere else.

In 0003, mcv_clauselist_selectivity still uses simple_rte_array rather than
rt_fetch.

The regression tests changed as a result of not populating stx_data; I think
it's may be better to update like this:

SELECT stxname, stxdndistinct, stxddependencies, stxdmcv, stxoid IS NULL
FROM pg_statistic_ext s LEFT JOIN pg_statistic_ext_data d
ON d.stxoid = s.oid
WHERE s.stxname = 'ab1_a_b_stats';

There's this part about documentation for the changes in backbranches:

On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:

Also, I think in backbranches we should document what's being stored in
pg_statistic_ext, since it's pretty unintuitive:
- noninherted stats (FROM ONLY) for inheritence parents;
- inherted stats (FROM *) for partitioned tables;

spellcheck: inheritence should be inheritance.

All for now. I'm going to update the regression tests for dependencies and the
other code paths.

--
Justin

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#23)
Re: extended stats on partitioned tables

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 12/12/21 16:37, Zhihong Yu wrote:

Since the rte (RangeTblEntry*) doesn't seem to be used beyond checking
inh, I think it would be better if the above style of checking is used
throughout the patch (without introducing rte variable).

It's mostly a matter of personal taste, but I always found this style of
condition (i.e. dereferencing a pointer returned by a function) much
less readable. It's hard to parse what exactly is happening, what struct
type are we dealing with, etc. YMMV but the separate variable makes it
much clearer for me. And I'd expect the compilers to produce pretty much
the same code too for those cases.

FWIW, I agree. Also, it's possible that future patches would create a
need to touch the RTE again nearby, in which case having the variable
makes it easier to write non-crummy code for that.

regards, tom lane

#26Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Justin Pryzby (#24)
4 attachment(s)
Re: extended stats on partitioned tables

On 12/12/21 18:52, Justin Pryzby wrote:

+          * XXX Can't we simply look at rte->inh?
+          */
+         inh = root->append_rel_array == NULL ? false :
+                 root->append_rel_array[onerel->relid]->parent_relid != 0;

I think so. That's what I came up with while trying to figured this out, and
it's no great surprise that it needed to be cleaned up - thanks.

OK, fixed.

In your 0003 patch, the "if inh: break" isn't removed from examine_variable(),
but the corresponding thing is removed everywhere else.

Ah, you're right. And it wasn't updated in the 0002 patch either - it
should do the relkind check too, to allow partitioned tables. Fixed.

In 0003, mcv_clauselist_selectivity still uses simple_rte_array rather than
rt_fetch.

That's mostly a conscious choice, so that I don't have to include
parsetree.h. But maybe that'd be better ...

The regression tests changed as a result of not populating stx_data; I think
it's may be better to update like this:

SELECT stxname, stxdndistinct, stxddependencies, stxdmcv, stxoid IS NULL
FROM pg_statistic_ext s LEFT JOIN pg_statistic_ext_data d
ON d.stxoid = s.oid
WHERE s.stxname = 'ab1_a_b_stats';

Not sure I understand. Why would this be better than inner join?

There's this part about documentation for the changes in backbranches:

On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:

Also, I think in backbranches we should document what's being stored in
pg_statistic_ext, since it's pretty unintuitive:
- noninherted stats (FROM ONLY) for inheritence parents;
- inherted stats (FROM *) for partitioned tables;

spellcheck: inheritence should be inheritance.

Thanks, fixed. Can you read through the commit messages and check the
attribution is correct for all the patches?

All for now. I'm going to update the regression tests for dependencies and the
other code paths.

Thanks!

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

0004-Refactor-parent-ACL-check-20211212b.patchtext/x-patch; charset=UTF-8; name=0004-Refactor-parent-ACL-check-20211212b.patchDownload
From 77bd34660d15d6ba03538c6fe45a196a162c196c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sun, 12 Dec 2021 02:28:41 +0100
Subject: [PATCH 4/4] Refactor parent ACL check

selfuncs.c is 8k lines long, and this makes it 30 LOC shorter.
---
 src/backend/utils/adt/selfuncs.c | 140 ++++++++++++-------------------
 1 file changed, 52 insertions(+), 88 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index bf4525392d..e4be9fc95d 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -187,6 +187,8 @@ static char *convert_string_datum(Datum value, Oid typid, Oid collid,
 								  bool *failure);
 static double convert_timevalue_to_scalar(Datum value, Oid typid,
 										  bool *failure);
+static void recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata,
+								Oid relid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 									VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
@@ -5153,51 +5155,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 									(pg_class_aclcheck(rte->relid, userid,
 													   ACL_SELECT) == ACLCHECK_OK);
 
-								/*
-								 * If the user doesn't have permissions to
-								 * access an inheritance child relation, check
-								 * the permissions of the table actually
-								 * mentioned in the query, since most likely
-								 * the user does have that permission.  Note
-								 * that whole-table select privilege on the
-								 * parent doesn't quite guarantee that the
-								 * user could read all columns of the child.
-								 * But in practice it's unlikely that any
-								 * interesting security violation could result
-								 * from allowing access to the expression
-								 * index's stats, so we allow it anyway.  See
-								 * similar code in examine_simple_variable()
-								 * for additional comments.
-								 */
-								if (!vardata->acl_ok &&
-									root->append_rel_array != NULL)
-								{
-									AppendRelInfo *appinfo;
-									Index		varno = index->rel->relid;
-
-									appinfo = root->append_rel_array[varno];
-									while (appinfo &&
-										   planner_rt_fetch(appinfo->parent_relid,
-															root)->rtekind == RTE_RELATION)
-									{
-										varno = appinfo->parent_relid;
-										appinfo = root->append_rel_array[varno];
-									}
-									if (varno != index->rel->relid)
-									{
-										/* Repeat access check on this rel */
-										rte = planner_rt_fetch(varno, root);
-										Assert(rte->rtekind == RTE_RELATION);
-
-										userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-										vardata->acl_ok =
-											rte->securityQuals == NIL &&
-											(pg_class_aclcheck(rte->relid,
-															   userid,
-															   ACL_SELECT) == ACLCHECK_OK);
-									}
-								}
+								recheck_parent_acl(root, vardata, index->rel->relid);
 							}
 							else
 							{
@@ -5296,49 +5254,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 						(pg_class_aclcheck(rte->relid, userid,
 										   ACL_SELECT) == ACLCHECK_OK);
 
-					/*
-					 * If the user doesn't have permissions to access an
-					 * inheritance child relation, check the permissions of
-					 * the table actually mentioned in the query, since most
-					 * likely the user does have that permission.  Note that
-					 * whole-table select privilege on the parent doesn't
-					 * quite guarantee that the user could read all columns of
-					 * the child. But in practice it's unlikely that any
-					 * interesting security violation could result from
-					 * allowing access to the expression stats, so we allow it
-					 * anyway.  See similar code in examine_simple_variable()
-					 * for additional comments.
-					 */
-					if (!vardata->acl_ok &&
-						root->append_rel_array != NULL)
-					{
-						AppendRelInfo *appinfo;
-						Index		varno = onerel->relid;
-
-						appinfo = root->append_rel_array[varno];
-						while (appinfo &&
-							   planner_rt_fetch(appinfo->parent_relid,
-												root)->rtekind == RTE_RELATION)
-						{
-							varno = appinfo->parent_relid;
-							appinfo = root->append_rel_array[varno];
-						}
-						if (varno != onerel->relid)
-						{
-							/* Repeat access check on this rel */
-							rte = planner_rt_fetch(varno, root);
-							Assert(rte->rtekind == RTE_RELATION);
-
-							userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-							vardata->acl_ok =
-								rte->securityQuals == NIL &&
-								(pg_class_aclcheck(rte->relid,
-												   userid,
-												   ACL_SELECT) == ACLCHECK_OK);
-						}
-					}
-
+					recheck_parent_acl(root, vardata, onerel->relid);
 					break;
 				}
 
@@ -5348,6 +5264,54 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 	}
 }
 
+/*
+ * If the user doesn't have permissions to access an inheritance child
+ * relation, check the permissions of the table actually mentioned in the
+ * query, since most likely the user does have that permission.  Note that
+ * whole-table select privilege on the parent doesn't quite guarantee that the
+ * user could read all columns of the child.  But in practice it's unlikely
+ * that any interesting security violation could result from allowing access to
+ * the expression stats, so we allow it anyway.  See similar code in
+ * examine_simple_variable() for additional comments.
+ */
+static void
+recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, Oid relid)
+{
+	RangeTblEntry	*rte;
+	Oid		userid;
+
+	if (!vardata->acl_ok &&
+		root->append_rel_array != NULL)
+	{
+		AppendRelInfo *appinfo;
+		Index		varno = relid;
+
+		appinfo = root->append_rel_array[varno];
+		while (appinfo &&
+			   planner_rt_fetch(appinfo->parent_relid,
+								root)->rtekind == RTE_RELATION)
+		{
+			varno = appinfo->parent_relid;
+			appinfo = root->append_rel_array[varno];
+		}
+
+		if (varno != relid)
+		{
+			/* Repeat access check on this rel */
+			rte = planner_rt_fetch(varno, root);
+			Assert(rte->rtekind == RTE_RELATION);
+
+			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+
+			vardata->acl_ok =
+				rte->securityQuals == NIL &&
+				(pg_class_aclcheck(rte->relid,
+								   userid,
+								   ACL_SELECT) == ACLCHECK_OK);
+		}
+	}
+}
+
 /*
  * examine_simple_variable
  *		Handle a simple Var for examine_variable
-- 
2.31.1

0003-Add-stxdinherit-flag-to-pg_statistic_ext_d-20211212b.patchtext/x-patch; charset=UTF-8; name=0003-Add-stxdinherit-flag-to-pg_statistic_ext_d-20211212b.patchDownload
From b5c8a5446525c015957b81b0deadd11e9861cef8 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sun, 12 Dec 2021 00:07:27 +0100
Subject: [PATCH 3/4] Add stxdinherit flag to pg_statistic_ext_data

The support for extended statistics on inheritance trees was somewhat
problematic, because the catalog pg_statistic_ext_data did not have a
flag identifying whether the data was built with/without data for the
child relations.

For partitioned tables we've been able to work around this because the
non-leaf relations can't contain data, so there's really just one set of
statistics to store. But for regular inheritance trees we had to pick,
and there is no clearly better option.

This introduces the pg_statistic_ext_data.stxdinherit flag, analogous to
pg_statistic.stainherit, and modifies analyze to build statistics for
both cases.

This relaxes the relationship between the two catalogs - until now we've
created the pg_statistic_ext_data one when creating the statistics, and
then only updated it later. There was always 1:1 relationship between
rows in the two catalogs. With this change we no longer insert any data
rows while creating statistics, because we don't know which flag value
to create, and it seems wasteful to initialize both for every relation.
The there may be 0, 1 or 2 data rows for each statistics definition.

Patch by me, with extensive improvements and fixes by Justin Pryzby.

Author: Tomas Vondra, Justin Pryzby
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
---
 doc/src/sgml/catalogs.sgml                  |  23 +++
 src/backend/catalog/system_views.sql        |   2 +
 src/backend/commands/analyze.c              |  28 +---
 src/backend/commands/statscmds.c            |  72 +++++-----
 src/backend/optimizer/util/plancat.c        | 147 ++++++++++++--------
 src/backend/statistics/dependencies.c       |  22 ++-
 src/backend/statistics/extended_stats.c     |  75 +++++-----
 src/backend/statistics/mcv.c                |   9 +-
 src/backend/statistics/mvdistinct.c         |   5 +-
 src/backend/utils/adt/selfuncs.c            |  27 +---
 src/backend/utils/cache/syscache.c          |   6 +-
 src/include/catalog/pg_statistic_ext_data.h |   4 +-
 src/include/commands/defrem.h               |   1 +
 src/include/nodes/pathnodes.h               |   1 +
 src/include/statistics/statistics.h         |  11 +-
 src/test/regress/expected/rules.out         |   2 +
 src/test/regress/expected/stats_ext.out     |  18 ++-
 src/test/regress/sql/stats_ext.sql          |   4 +-
 18 files changed, 242 insertions(+), 215 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 025db98763..9ab2cb52a7 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7521,6 +7521,19 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
    created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>.
   </para>
 
+  <para>
+   Normally there is one entry, with <structfield>stxdinherit</structfield> =
+   <literal>false</literal>, for each statistics object that has been analyzed.
+   If the table has inheritance children, a second entry with
+   <structfield>stxdinherit</structfield> = <literal>true</literal> is also created.
+   This row represents the statistics object over the inheritance tree, i.e.,
+   statistics for the data you'd see with
+   <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>,
+   whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row
+   represents the results of
+   <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>.
+  </para>
+
   <para>
    Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>,
    <structname>pg_statistic_ext_data</structname> should not be
@@ -7560,6 +7573,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stxdinherit</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the stats include inheritance child columns, not just the
+       values in the specified relation
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 61b515cdb8..319653789b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS
            ) AS attnames,
            pg_get_statisticsobjdef_expressions(s.oid) as exprs,
            s.stxkind AS kinds,
+           sd.stxdinherit AS inherited,
            sd.stxdndistinct AS n_distinct,
            sd.stxddependencies AS dependencies,
            m.most_common_vals,
@@ -298,6 +299,7 @@ CREATE VIEW pg_stats_ext_exprs WITH (security_barrier) AS
            s.stxname AS statistics_name,
            pg_get_userbyid(s.stxowner) AS statistics_owner,
            stat.expr,
+           sd.stxdinherit AS inherited,
            (stat.a).stanullfrac AS null_frac,
            (stat.a).stawidth AS avg_width,
            (stat.a).stadistinct AS n_distinct,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 58a82e4929..3436f09d63 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -549,7 +549,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
-		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -613,30 +612,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
-		/*
-		 * Should we build extended statistics for this relation?
-		 *
-		 * The extended statistics catalog does not include an inheritance
-		 * flag, so we can't store statistics built both with and without
-		 * data from child relations. We can store just one set of statistics
-		 * per relation. For plain relations that's fine, but for inheritance
-		 * trees we have to pick whether to store statistics for just the
-		 * one relation or the whole tree. For plain inheritance we store
-		 * the (!inh) version, mostly for backwards compatibility reasons.
-		 * For partitioned tables that's pointless (the non-leaf tables are
-		 * always empty), so we store stats representing the whole tree.
-		 */
-		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
-
-		/*
-		 * Build extended statistics (if there are any).
-		 *
-		 * For now we only build extended statistics on individual relations,
-		 * not for relations representing inheritance trees.
-		 */
-		if (build_ext_stats)
-			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
-									   attr_cnt, vacattrstats);
+		/* Build extended statistics (if there are any). */
+		BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows,
+								   attr_cnt, vacattrstats);
 	}
 
 	pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 8f1550ec80..1fa39e6f21 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -75,13 +75,10 @@ CreateStatistics(CreateStatsStmt *stmt)
 	HeapTuple	htup;
 	Datum		values[Natts_pg_statistic_ext];
 	bool		nulls[Natts_pg_statistic_ext];
-	Datum		datavalues[Natts_pg_statistic_ext_data];
-	bool		datanulls[Natts_pg_statistic_ext_data];
 	int2vector *stxkeys;
 	List	   *stxexprs = NIL;
 	Datum		exprsDatum;
 	Relation	statrel;
-	Relation	datarel;
 	Relation	rel = NULL;
 	Oid			relid;
 	ObjectAddress parentobject,
@@ -514,28 +511,10 @@ CreateStatistics(CreateStatsStmt *stmt)
 	relation_close(statrel, RowExclusiveLock);
 
 	/*
-	 * Also build the pg_statistic_ext_data tuple, to hold the actual
-	 * statistics data.
+	 * We used to create the pg_statistic_ext_data tuple too, but it's not clear
+	 * what value should the stxdinherit flag have (it depends on whether the rel
+	 * is partitioned, contains data, etc.)
 	 */
-	datarel = table_open(StatisticExtDataRelationId, RowExclusiveLock);
-
-	memset(datavalues, 0, sizeof(datavalues));
-	memset(datanulls, false, sizeof(datanulls));
-
-	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
-
-	/* no statistics built yet */
-	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* insert it into pg_statistic_ext_data */
-	htup = heap_form_tuple(datarel->rd_att, datavalues, datanulls);
-	CatalogTupleInsert(datarel, htup);
-	heap_freetuple(htup);
-
-	relation_close(datarel, RowExclusiveLock);
 
 	InvokeObjectPostCreateHook(StatisticExtRelationId, statoid, 0);
 
@@ -717,32 +696,49 @@ AlterStatistics(AlterStatsStmt *stmt)
 }
 
 /*
- * Guts of statistics object deletion.
+ * Delete entry in pg_statistic_ext_data catalog. We don't know if the row
+ * exists, so don't error out.
  */
 void
-RemoveStatisticsById(Oid statsOid)
+RemoveStatisticsDataById(Oid statsOid, bool inh)
 {
 	Relation	relation;
 	HeapTuple	tup;
-	Form_pg_statistic_ext statext;
-	Oid			relid;
 
-	/*
-	 * First delete the pg_statistic_ext_data tuple holding the actual
-	 * statistical data.
-	 */
 	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
-	tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
-
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+	tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+						  BoolGetDatum(inh));
 
-	CatalogTupleDelete(relation, &tup->t_self);
+	/* We don't know if the data row for inh value exists. */
+	if (HeapTupleIsValid(tup))
+	{
+		CatalogTupleDelete(relation, &tup->t_self);
 
-	ReleaseSysCache(tup);
+		ReleaseSysCache(tup);
+	}
 
 	table_close(relation, RowExclusiveLock);
+}
+
+/*
+ * Guts of statistics object deletion.
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+	Relation	relation;
+	HeapTuple	tup;
+	Form_pg_statistic_ext statext;
+	Oid			relid;
+
+	/*
+	 * First delete the pg_statistic_ext_data tuples holding the actual
+	 * statistical data. There might be data with/without inheritance, so
+	 * attempt deleting both.
+	 */
+	RemoveStatisticsDataById(statsOid, true);
+	RemoveStatisticsDataById(statsOid, false);
 
 	/*
 	 * Delete the pg_statistic_ext tuple.  Also send out a cache inval on the
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 564a38a13e..8ba3a3c59d 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
+#include "catalog/pg_statistic_ext_data.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -1276,6 +1277,87 @@ get_relation_constraints(PlannerInfo *root,
 	return result;
 }
 
+/*
+ * Try loading data for the statistics object.
+ *
+ * We don't know if the data (specified by statOid and inh value) exist.
+ * The result is stored in stainfos list.
+ */
+static void
+get_relation_statistics_worker(List **stainfos, RelOptInfo *rel,
+							   Oid statOid, bool inh,
+							   Bitmapset *keys, List *exprs)
+{
+	Form_pg_statistic_ext_data dataForm;
+	HeapTuple	dtup;
+
+	dtup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(statOid), BoolGetDatum(inh));
+	if (!HeapTupleIsValid(dtup))
+		return;
+
+	dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
+
+	/* add one StatisticExtInfo for each kind built */
+	if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_NDISTINCT;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_DEPENDENCIES;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_MCV;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_EXPRESSIONS;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	ReleaseSysCache(dtup);
+}
+
 /*
  * get_relation_statistics
  *		Retrieve extended statistics defined on the table.
@@ -1299,7 +1381,6 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
 		HeapTuple	htup;
-		HeapTuple	dtup;
 		Bitmapset  *keys = NULL;
 		List	   *exprs = NIL;
 		int			i;
@@ -1309,10 +1390,6 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-		if (!HeapTupleIsValid(dtup))
-			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
 		/*
 		 * First, build the array of columns covered.  This is ultimately
 		 * wasted if no stats within the object have actually been built, but
@@ -1324,6 +1401,11 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 		/*
 		 * Preprocess expressions (if any). We read the expressions, run them
 		 * through eval_const_expressions, and fix the varnos.
+		 *
+		 * XXX We don't know yet if there are any data for this stats object,
+		 * with either stxdinherit value. But it's reasonable to assume there
+		 * is at least one of those, possibly both. So it's better to process
+		 * keys and expressions here.
 		 */
 		{
 			bool		isnull;
@@ -1364,61 +1446,14 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 			}
 		}
 
-		/* add one StatisticExtInfo for each kind built */
-		if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_NDISTINCT;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_DEPENDENCIES;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_MCV;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		/* extract statistics for possible values of stxdinherit flag */
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_EXPRESSIONS;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+		get_relation_statistics_worker(&stainfos, rel, statOid, true, keys, exprs);
 
-			stainfos = lappend(stainfos, info);
-		}
+		get_relation_statistics_worker(&stainfos, rel, statOid, false, keys, exprs);
 
 		ReleaseSysCache(htup);
-		ReleaseSysCache(dtup);
+
 		bms_free(keys);
 	}
 
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 5f4df6e04c..672091b517 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -619,14 +619,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums)
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
 MVDependencies *
-statext_dependencies_load(Oid mvoid)
+statext_dependencies_load(Oid mvoid, bool inh)
 {
 	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid),
+						   BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
@@ -1417,16 +1419,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
-	/*
-	 * When dealing with regular inheritance trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data from partitions are
-	 * in the stats (and there's no data in the non-leaf relations), so
-	 * in this case we do consider extended stats.
-	 */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return 1.0;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
@@ -1610,6 +1602,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
 			continue;
 
+		/* skip statistics with mismatching stxdinherit value */
+		if (stat->inherit != rte->inh)
+			continue;
+
 		/*
 		 * Count matching attributes - we have to undo the attnum offsets. The
 		 * input attribute numbers are not offset (expressions are not
@@ -1656,7 +1652,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (nmatched + nexprs < 2)
 			continue;
 
-		deps = statext_dependencies_load(stat->statOid);
+		deps = statext_dependencies_load(stat->statOid, rte->inh);
 
 		/*
 		 * The expressions may be represented by different attnums in the
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 3cda20d3b8..3423603dd7 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -25,6 +25,7 @@
 #include "catalog/pg_statistic_ext.h"
 #include "catalog/pg_statistic_ext_data.h"
 #include "executor/executor.h"
+#include "commands/defrem.h"
 #include "commands/progress.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
@@ -78,7 +79,7 @@ typedef struct StatExtEntry
 static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
 static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 											int nvacatts, VacAttrStats **vacatts);
-static void statext_store(Oid statOid,
+static void statext_store(Oid statOid, bool inh,
 						  MVNDistinct *ndistinct, MVDependencies *dependencies,
 						  MCVList *mcv, Datum exprs, VacAttrStats **stats);
 static int	statext_compute_stattarget(int stattarget,
@@ -111,7 +112,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat,
  * requested stats, and serializes them back into the catalog.
  */
 void
-BuildRelationExtStatistics(Relation onerel, double totalrows,
+BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 						   int numrows, HeapTuple *rows,
 						   int natts, VacAttrStats **vacattrstats)
 {
@@ -231,7 +232,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		}
 
 		/* store the statistics in the catalog */
-		statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats);
+		statext_store(stat->statOid, inh,
+					  ndistinct, dependencies, mcv, exprstats, stats);
 
 		/* for reporting progress */
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -782,23 +784,27 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
  *	tuple.
  */
 static void
-statext_store(Oid statOid,
+statext_store(Oid statOid, bool inh,
 			  MVNDistinct *ndistinct, MVDependencies *dependencies,
 			  MCVList *mcv, Datum exprs, VacAttrStats **stats)
 {
 	Relation	pg_stextdata;
-	HeapTuple	stup,
-				oldtup;
+	HeapTuple	stup;
 	Datum		values[Natts_pg_statistic_ext_data];
 	bool		nulls[Natts_pg_statistic_ext_data];
-	bool		replaces[Natts_pg_statistic_ext_data];
 
 	pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
 	memset(nulls, true, sizeof(nulls));
-	memset(replaces, false, sizeof(replaces));
 	memset(values, 0, sizeof(values));
 
+	/* basic info */
+	values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid);
+	nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false;
+
+	values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh);
+	nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false;
+
 	/*
 	 * Construct a new pg_statistic_ext_data tuple, replacing the calculated
 	 * stats.
@@ -831,25 +837,15 @@ statext_store(Oid statOid,
 		values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs;
 	}
 
-	/* always replace the value (either by bytea or NULL) */
-	replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* there should already be a pg_statistic_ext_data tuple */
-	oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-	if (!HeapTupleIsValid(oldtup))
-		elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
-	/* replace it */
-	stup = heap_modify_tuple(oldtup,
-							 RelationGetDescr(pg_stextdata),
-							 values,
-							 nulls,
-							 replaces);
-	ReleaseSysCache(oldtup);
-	CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup);
+	/*
+	 * Delete the old tuple if it exists, and insert a new one. It's easier
+	 * than trying to update or insert, based on various conditions.
+	 */
+	RemoveStatisticsDataById(statOid, inh);
+
+	/* form and insert a new tuple */
+	stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls);
+	CatalogTupleInsert(pg_stextdata, stup);
 
 	heap_freetuple(stup);
 
@@ -1235,7 +1231,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs,
  * further tiebreakers are needed.
  */
 StatisticExtInfo *
-choose_best_statistics(List *stats, char requiredkind,
+choose_best_statistics(List *stats, char requiredkind, bool inh,
 					   Bitmapset **clause_attnums, List **clause_exprs,
 					   int nclauses)
 {
@@ -1257,6 +1253,10 @@ choose_best_statistics(List *stats, char requiredkind,
 		if (info->kind != requiredkind)
 			continue;
 
+		/* skip statistics with mismatching inheritance flag */
+		if (info->inherit != inh)
+			continue;
+
 		/*
 		 * Collect attributes and expressions in remaining (unestimated)
 		 * clauses fully covered by this statistic object.
@@ -1697,16 +1697,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
 	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
 
-	/*
-	 * When dealing with regular inheritance trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data from partitions are
-	 * in the stats (and there's no data in the non-leaf relations), so
-	 * in this case we do consider extended stats.
-	 */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return sel;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
 		return sel;
@@ -1758,7 +1748,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		Bitmapset  *simple_clauses;
 
 		/* find the best suited statistics object for these attnums */
-		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
+		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
 									  list_length(clauses));
 
@@ -1847,7 +1837,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			MCVList    *mcv_list;
 
 			/* Load the MCV list stored in the statistics object */
-			mcv_list = statext_mcv_load(stat->statOid);
+			mcv_list = statext_mcv_load(stat->statOid, rte->inh);
 
 			/*
 			 * Compute the selectivity of the ORed list of clauses covered by
@@ -2408,7 +2398,7 @@ serialize_expr_stats(AnlExprData *exprdata, int nexprs)
  * identified by the supplied index.
  */
 HeapTuple
-statext_expressions_load(Oid stxoid, int idx)
+statext_expressions_load(Oid stxoid, bool inh, int idx)
 {
 	bool		isnull;
 	Datum		value;
@@ -2418,7 +2408,8 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(stxoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index b350fc5f7b..75d7d35caa 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups,
  *		Load the MCV list for the indicated pg_statistic_ext tuple.
  */
 MCVList *
-statext_mcv_load(Oid mvoid)
+statext_mcv_load(Oid mvoid, bool inh)
 {
 	MCVList    *result;
 	bool		isnull;
 	Datum		mcvlist;
-	HeapTuple	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	HeapTuple	htup = SearchSysCache2(STATEXTDATASTXOID,
+									   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
@@ -2039,11 +2040,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
 	MCVList    *mcv;
 	Selectivity s = 0.0;
 
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
 	/* match/mismatch bitmap for each MCV item */
 	bool	   *matches = NULL;
 
 	/* load the MCV list stored in the statistics object */
-	mcv = statext_mcv_load(stat->statOid);
+	mcv = statext_mcv_load(stat->statOid, rte->inh);
 
 	/* build a match bitmap for the clauses */
 	matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs,
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 4481312d61..ab1f10d6c0 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
  *		Load the ndistinct value for the indicated pg_statistic_ext tuple
  */
 MVNDistinct *
-statext_ndistinct_load(Oid mvoid)
+statext_ndistinct_load(Oid mvoid, bool inh)
 {
 	MVNDistinct *result;
 	bool		isnull;
 	Datum		ndist;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index da232ba2bd..bf4525392d 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3919,17 +3919,6 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	if (!rel->statlist)
 		return false;
 
-	/*
-	 * When dealing with regular inheritance trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data from partitions are
-	 * in the stats (and there's no data in the non-leaf relations), so
-	 * in this case we do consider extended stats.
-	 */
-	rte = planner_rt_fetch(rel->relid, root);
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return false;
-
 	/* look for the ndistinct statistics object matching the most vars */
 	nmatches_vars = 0;			/* we require at least two matches */
 	nmatches_exprs = 0;
@@ -4015,7 +4004,8 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 
 	Assert(nmatches_vars + nmatches_exprs > 1);
 
-	stats = statext_ndistinct_load(statOid);
+	rte = planner_rt_fetch(rel->relid, root);
+	stats = statext_ndistinct_load(statOid, rte->inh);
 
 	/*
 	 * If we have a match, search it for the specific item that matches (there
@@ -5273,22 +5263,19 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				/* found a match, see if we can extract pg_statistic row */
 				if (equal(node, expr))
 				{
-					HeapTuple	t = statext_expressions_load(info->statOid, pos);
-
-					/* Get statistics object's table for permission check */
-					RangeTblEntry *rte;
+					RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root);
 					Oid			userid;
 
-					vardata->statsTuple = t;
+					Assert(rte->rtekind == RTE_RELATION);
 
 					/*
 					 * XXX Not sure if we should cache the tuple somewhere.
 					 * Now we just create a new copy every time.
 					 */
-					vardata->freefunc = ReleaseDummy;
+					vardata->statsTuple =
+						statext_expressions_load(info->statOid, rte->inh, pos);
 
-					rte = planner_rt_fetch(onerel->relid, root);
-					Assert(rte->rtekind == RTE_RELATION);
+					vardata->freefunc = ReleaseDummy;
 
 					/*
 					 * Use checkAsUser if it's set, in case we're accessing
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 56870b46e4..6194751e99 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -763,11 +763,11 @@ static const struct cachedesc cacheinfo[] = {
 		32
 	},
 	{StatisticExtDataRelationId,	/* STATEXTDATASTXOID */
-		StatisticExtDataStxoidIndexId,
-		1,
+		StatisticExtDataStxoidInhIndexId,
+		2,
 		{
 			Anum_pg_statistic_ext_data_stxoid,
-			0,
+			Anum_pg_statistic_ext_data_stxdinherit,
 			0,
 			0
 		},
diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h
index 7b73b790d2..8ffd8b68cd 100644
--- a/src/include/catalog/pg_statistic_ext_data.h
+++ b/src/include/catalog/pg_statistic_ext_data.h
@@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId)
 {
 	Oid			stxoid BKI_LOOKUP(pg_statistic_ext);	/* statistics object
 														 * this data is for */
+	bool		stxdinherit;	/* true if inheritance children are included */
 
 #ifdef CATALOG_VARLEN			/* variable-length fields start here */
 
@@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data;
 
 DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431);
 
-DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops));
+DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops, stxdinherit bool_ops));
+
 
 #endif							/* PG_STATISTIC_EXT_DATA_H */
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index f84d09959c..ab3c59fcd5 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -83,6 +83,7 @@ extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
 extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
 extern ObjectAddress AlterStatistics(AlterStatsStmt *stmt);
 extern void RemoveStatisticsById(Oid statsOid);
+extern void RemoveStatisticsDataById(Oid statsOid, bool inh);
 extern Oid	StatisticsGetRelation(Oid statId, bool missing_ok);
 
 /* commands/aggregatecmds.c */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 324d92880b..02aff9847e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -934,6 +934,7 @@ typedef struct StatisticExtInfo
 	NodeTag		type;
 
 	Oid			statOid;		/* OID of the statistics row */
+	bool		inherit;		/* includes child relations */
 	RelOptInfo *rel;			/* back-link to statistic's table */
 	char		kind;			/* statistics kind of this entry */
 	Bitmapset  *keys;			/* attnums of the columns covered */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 326cf26fea..3868e43f8a 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -94,11 +94,11 @@ typedef struct MCVList
 	MCVItem		items[FLEXIBLE_ARRAY_MEMBER];	/* array of MCV items */
 } MCVList;
 
-extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
-extern MVDependencies *statext_dependencies_load(Oid mvoid);
-extern MCVList *statext_mcv_load(Oid mvoid);
+extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh);
+extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh);
+extern MCVList *statext_mcv_load(Oid mvoid, bool inh);
 
-extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
+extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 									   int numrows, HeapTuple *rows,
 									   int natts, VacAttrStats **vacattrstats);
 extern int	ComputeExtStatisticsRows(Relation onerel,
@@ -121,9 +121,10 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
 												  bool is_or);
 extern bool has_stats_of_kind(List *stats, char requiredkind);
 extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
+												bool inh,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
-extern HeapTuple statext_expressions_load(Oid stxoid, int idx);
+extern HeapTuple statext_expressions_load(Oid stxoid, bool inh, int idx);
 
 #endif							/* STATISTICS_H */
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b58b062b10..d652f7b5fb 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2443,6 +2443,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname,
              JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames,
     pg_get_statisticsobjdef_expressions(s.oid) AS exprs,
     s.stxkind AS kinds,
+    sd.stxdinherit AS inherited,
     sd.stxdndistinct AS n_distinct,
     sd.stxddependencies AS dependencies,
     m.most_common_vals,
@@ -2469,6 +2470,7 @@ pg_stats_ext_exprs| SELECT cn.nspname AS schemaname,
     s.stxname AS statistics_name,
     pg_get_userbyid(s.stxowner) AS statistics_owner,
     stat.expr,
+    sd.stxdinherit AS inherited,
     (stat.a).stanullfrac AS null_frac,
     (stat.a).stawidth AS avg_width,
     (stat.a).stadistinct AS n_distinct,
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index a11fff655a..427884cda8 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -144,10 +144,9 @@ SELECT stxname, stxdndistinct, stxddependencies, stxdmcv
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxname = 'ab1_a_b_stats'
    AND d.stxoid = s.oid;
-    stxname    | stxdndistinct | stxddependencies | stxdmcv 
----------------+---------------+------------------+---------
- ab1_a_b_stats |               |                  | 
-(1 row)
+ stxname | stxdndistinct | stxddependencies | stxdmcv 
+---------+---------------+------------------+---------
+(0 rows)
 
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS -1;
 \d+ ab1
@@ -192,11 +191,18 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
--- Since the stats object does not include inherited stats, it should not affect the estimates
+-- See if the extended stats affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
  estimated | actual 
 -----------+--------
-      1000 |   1008
+      1008 |   1008
+(1 row)
+
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+         9 |      9
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 01383e8f5f..8deb010b3c 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -123,8 +123,10 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
--- Since the stats object does not include inherited stats, it should not affect the estimates
+-- See if the extended stats affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
 -- Ensure inherited stats ARE applied to inherited query in partitioned table
-- 
2.31.1

0002-Build-inherited-extended-stats-on-partitio-20211212b.patchtext/x-patch; charset=UTF-8; name=0002-Build-inherited-extended-stats-on-partitio-20211212b.patchDownload
From 3f3793672978c31efee0588a41d5d5b1a4175539 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sat, 11 Dec 2021 22:38:08 +0100
Subject: [PATCH 2/4] Build inherited extended stats on partitioned tables

Commit 859b3003de disabled building of extended stats for inheritance
trees, to prevent updating the same catalog row twice. That resolved the
issue, but it also means there are no extended stats for partitioned
tables, because the (!inh) case has empty sample. This is a regression,
affecting queries that don't estimate the leaf relations directly.

But because partitioned tables are empty, we can invert the condition
and build statistics only for the case with inheritance, without losing
anything. And we can consider them when calculating estimates.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
---
 src/backend/commands/analyze.c          | 18 ++++++++++++++-
 src/backend/statistics/dependencies.c   |  9 +++++---
 src/backend/statistics/extended_stats.c |  9 +++++---
 src/backend/utils/adt/selfuncs.c        | 30 +++++++++++++++----------
 src/test/regress/expected/stats_ext.out | 19 ++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 10 +++++++++
 6 files changed, 76 insertions(+), 19 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index cd77907fc7..58a82e4929 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -549,6 +549,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
+		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -612,13 +613,28 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
+		/*
+		 * Should we build extended statistics for this relation?
+		 *
+		 * The extended statistics catalog does not include an inheritance
+		 * flag, so we can't store statistics built both with and without
+		 * data from child relations. We can store just one set of statistics
+		 * per relation. For plain relations that's fine, but for inheritance
+		 * trees we have to pick whether to store statistics for just the
+		 * one relation or the whole tree. For plain inheritance we store
+		 * the (!inh) version, mostly for backwards compatibility reasons.
+		 * For partitioned tables that's pointless (the non-leaf tables are
+		 * always empty), so we store stats representing the whole tree.
+		 */
+		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
+
 		/*
 		 * Build extended statistics (if there are any).
 		 *
 		 * For now we only build extended statistics on individual relations,
 		 * not for relations representing inheritance trees.
 		 */
-		if (!inh)
+		if (build_ext_stats)
 			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
 									   attr_cnt, vacattrstats);
 	}
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index a41bace75a..5f4df6e04c 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1418,10 +1418,13 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			unique_exprs_cnt;
 
 	/*
-	 * When dealing with inheritance trees, ignore extended stats (which were
-	 * built without data from child rels, and thus do not represent them).
+	 * When dealing with regular inheritance trees, ignore extended stats
+	 * (which were built without data from child rels, and thus do not
+	 * represent them). For partitioned tables data from partitions are
+	 * in the stats (and there's no data in the non-leaf relations), so
+	 * in this case we do consider extended stats.
 	 */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return 1.0;
 
 	/* check if there's any stats that might be useful for us. */
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 5a3ba2f1c5..3cda20d3b8 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1698,10 +1698,13 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
 
 	/*
-	 * When dealing with inheritance trees, ignore extended stats (which were
-	 * built without data from child rels, and thus do not represent them).
+	 * When dealing with regular inheritance trees, ignore extended stats
+	 * (which were built without data from child rels, and thus do not
+	 * represent them). For partitioned tables data from partitions are
+	 * in the stats (and there's no data in the non-leaf relations), so
+	 * in this case we do consider extended stats.
 	 */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return sel;
 
 	/* check if there's any stats that might be useful for us. */
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 1010d5caa8..da232ba2bd 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,19 +3913,23 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
-	RangeTblEntry		*rte = planner_rt_fetch(rel->relid, root);
-
-	/*
-	 * When dealing with inheritance trees, ignore extended stats (which were
-	 * built without data from child rels, and thus do not represent them).
-	 */
-	if (rte->inh)
-		return false;
+	RangeTblEntry		*rte;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
 		return false;
 
+	/*
+	 * When dealing with regular inheritance trees, ignore extended stats
+	 * (which were built without data from child rels, and thus do not
+	 * represent them). For partitioned tables data from partitions are
+	 * in the stats (and there's no data in the non-leaf relations), so
+	 * in this case we do consider extended stats.
+	 */
+	rte = planner_rt_fetch(rel->relid, root);
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
+		return false;
+
 	/* look for the ndistinct statistics object matching the most vars */
 	nmatches_vars = 0;			/* we require at least two matches */
 	nmatches_exprs = 0;
@@ -5242,11 +5246,13 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				break;
 
 			/*
-			 * When dealing with inheritance trees, ignore extended stats (which
-			 * were built without data from child rels, and so do not represent
-			 * them).
+			 * When dealing with regular inheritance trees, ignore extended stats
+			 * (which were built without data from child rels, and thus do not
+			 * represent them). For partitioned tables data from partitions are
+			 * in the stats (and there's no data in the non-leaf relations), so
+			 * in this case we do consider extended stats.
 			 */
-			if (rte->inh)
+			if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 				break;
 
 			/* skip stats without per-expression stats */
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 5410f58f7f..a11fff655a 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -200,6 +200,25 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+ ?column? 
+----------
+        1
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+        10 |     10
+(1 row)
+
+DROP TABLE stxdinp;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index a196d5e2f8..01383e8f5f 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -127,6 +127,16 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+DROP TABLE stxdinp;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.31.1

0001-Ignore-extended-statistics-for-inheritance-20211212b.patchtext/x-patch; charset=UTF-8; name=0001-Ignore-extended-statistics-for-inheritance-20211212b.patchDownload
From 0601bac64c0d7a1f1b676f0433a05f2f99d7bfb1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 19:42:41 -0500
Subject: [PATCH 1/4] Ignore extended statistics for inheritance trees

Since commit 859b3003de we only build extended statistics for relations
without the child relations, so that we update the catalog row only
once. However, the non-inherited statistics were still used for planning
of queries on inheritance trees, which may produce bogus estimates.

This is roughly the same issue 427c6b5b9 addressed ~15 years ago, and we
fix it the same way - by ignoring extended statistics when calculating
estimates for the inheritance tree as a whole. We still consider
extended statistics when calculating estimates for individual child
relations, of course.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
---
 src/backend/statistics/dependencies.c   |  9 +++++++++
 src/backend/statistics/extended_stats.c |  9 +++++++++
 src/backend/utils/adt/selfuncs.c        | 17 +++++++++++++++++
 src/test/regress/expected/stats_ext.out | 24 ++++++++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 15 +++++++++++++++
 5 files changed, 74 insertions(+)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..a41bace75a 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -24,6 +24,7 @@
 #include "nodes/pathnodes.h"
 #include "optimizer/clauses.h"
 #include "optimizer/optimizer.h"
+#include "parser/parsetree.h"
 #include "statistics/extended_stats_internal.h"
 #include "statistics/statistics.h"
 #include "utils/bytea.h"
@@ -1410,11 +1411,19 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
 
 	/* unique expressions */
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
+	/*
+	 * When dealing with inheritance trees, ignore extended stats (which were
+	 * built without data from child rels, and thus do not represent them).
+	 */
+	if (rte->inh)
+		return 1.0;
+
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 69ca52094f..5a3ba2f1c5 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -30,6 +30,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/optimizer.h"
+#include "parser/parsetree.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "statistics/extended_stats_internal.h"
@@ -1694,6 +1695,14 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
+
+	/*
+	 * When dealing with inheritance trees, ignore extended stats (which were
+	 * built without data from child rels, and thus do not represent them).
+	 */
+	if (rte->inh)
+		return sel;
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 10895fb287..1010d5caa8 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,6 +3913,14 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
+	RangeTblEntry		*rte = planner_rt_fetch(rel->relid, root);
+
+	/*
+	 * When dealing with inheritance trees, ignore extended stats (which were
+	 * built without data from child rels, and thus do not represent them).
+	 */
+	if (rte->inh)
+		return false;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
@@ -5222,6 +5230,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 		foreach(slist, onerel->statlist)
 		{
 			StatisticExtInfo *info = (StatisticExtInfo *) lfirst(slist);
+			RangeTblEntry	 *rte = planner_rt_fetch(onerel->relid, root);
 			ListCell   *expr_item;
 			int			pos;
 
@@ -5232,6 +5241,14 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
+			/*
+			 * When dealing with inheritance trees, ignore extended stats (which
+			 * were built without data from child rels, and so do not represent
+			 * them).
+			 */
+			if (rte->inh)
+				break;
+
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index c60ba45aba..5410f58f7f 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,6 +176,30 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
+-- Tests for stats with inheritance
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+DROP TABLE stxdinh, stxdinh1;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 6fb37962a7..a196d5e2f8 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,6 +112,21 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
+-- Tests for stats with inheritance
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+DROP TABLE stxdinh, stxdinh1;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.31.1

#27Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#17)
Re: extended stats on partitioned tables

On Sun, Dec 12, 2021 at 05:17:10AM +0100, Tomas Vondra wrote:

The one thing bugging me a bit is that the regression test checks only a
GROUP BY query. It'd be nice to add queries testing MCV/dependencies
too, but that seems tricky because most queries will use per-partitions
stats.

You mean because the quals are pushed down to the scan node.

Does that indicate a deficiency ?

If extended stats are collected for a parent table, selectivity estimates based
from the parent would be better; but instead we use uncorrected column
estimates from the child tables.

From what I see, we could come up with a way to avoid the pushdown, involving
volatile functions/foreign tables/RLS/window functions/SRF/wholerow vars/etc.

But would it be better if extended stats objects on partitioned tables were to
collect stats for both parent AND CHILD ? I'm not sure. Maybe that's the
wrong solution, but maybe we should still document that extended stats on
(empty) parent tables are often themselves not used/useful for selectivity
estimates, and the user should instead (or in addition) create stats on child
tables.

Or, maybe if there's no extended stats on the child tables, stats on the parent
table should be consulted ?

--
Justin

#28Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#26)
Re: extended stats on partitioned tables

On Sun, Dec 12, 2021 at 10:29:39PM +0100, Tomas Vondra wrote:

On 12/12/21 18:52, Justin Pryzby wrote:
That's mostly a conscious choice, so that I don't have to include
parsetree.h. But maybe that'd be better ...

The regression tests changed as a result of not populating stx_data; I think
it's may be better to update like this:

SELECT stxname, stxdndistinct, stxddependencies, stxdmcv, stxoid IS NULL
FROM pg_statistic_ext s LEFT JOIN pg_statistic_ext_data d
ON d.stxoid = s.oid
WHERE s.stxname = 'ab1_a_b_stats';

Not sure I understand. Why would this be better than inner join?

It shows that there's an entry in pg_statistic_ext and not one in ext_data,
rather than that it's not in at least one of the catalogs. Which is nice to
show since as you say it's no longer 1:1.

Thanks, fixed. Can you read through the commit messages and check the
attribution is correct for all the patches?

Seems fine.

--
Justin

#29Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Justin Pryzby (#27)
Re: extended stats on partitioned tables

On 12/12/21 22:32, Justin Pryzby wrote:

On Sun, Dec 12, 2021 at 05:17:10AM +0100, Tomas Vondra wrote:

The one thing bugging me a bit is that the regression test checks only a
GROUP BY query. It'd be nice to add queries testing MCV/dependencies
too, but that seems tricky because most queries will use per-partitions
stats.

You mean because the quals are pushed down to the scan node.

Does that indicate a deficiency ?

If extended stats are collected for a parent table, selectivity estimates based
from the parent would be better; but instead we use uncorrected column
estimates from the child tables.

From what I see, we could come up with a way to avoid the pushdown, involving
volatile functions/foreign tables/RLS/window functions/SRF/wholerow vars/etc.

But would it be better if extended stats objects on partitioned

tables were to

collect stats for both parent AND CHILD ? I'm not sure. Maybe that's the
wrong solution, but maybe we should still document that extended stats on
(empty) parent tables are often themselves not used/useful for selectivity
estimates, and the user should instead (or in addition) create stats on child
tables.

Or, maybe if there's no extended stats on the child tables, stats on the parent
table should be consulted ?

Maybe, but that seems like a mostly separate improvement. At this point
I'm interested only in testing the behavior implemented in the current
patches.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#30Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#26)
Re: extended stats on partitioned tables

On Sun, Dec 12, 2021 at 10:29:39PM +0100, Tomas Vondra wrote:

In your 0003 patch, the "if inh: break" isn't removed from examine_variable(),
but the corresponding thing is removed everywhere else.

Ah, you're right. And it wasn't updated in the 0002 patch either - it
should do the relkind check too, to allow partitioned tables. Fixed.

I think you fixed it in 0002 (thanks) but still wasn't removed from 0003?

In these comments:
+        * When dealing with regular inheritance trees, ignore extended stats
+        * (which were built without data from child rels, and thus do not
+        * represent them). For partitioned tables data from partitions are
+        * in the stats (and there's no data in the non-leaf relations), so
+        * in this case we do consider extended stats.

I suggest to add a comment after "For partitioned tables".

--
Justin

#31Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#29)
Re: extended stats on partitioned tables

On Sun, Dec 12, 2021 at 11:23:19PM +0100, Tomas Vondra wrote:

On 12/12/21 22:32, Justin Pryzby wrote:

On Sun, Dec 12, 2021 at 05:17:10AM +0100, Tomas Vondra wrote:

The one thing bugging me a bit is that the regression test checks only a
GROUP BY query. It'd be nice to add queries testing MCV/dependencies
too, but that seems tricky because most queries will use per-partitions
stats.

You mean because the quals are pushed down to the scan node.

Does that indicate a deficiency ?

If extended stats are collected for a parent table, selectivity estimates based
from the parent would be better; but instead we use uncorrected column
estimates from the child tables.

From what I see, we could come up with a way to avoid the pushdown, involving
volatile functions/foreign tables/RLS/window functions/SRF/wholerow vars/etc.
But would it be better if extended stats objects on partitioned tables were to
collect stats for both parent AND CHILD ? I'm not sure. Maybe that's the
wrong solution, but maybe we should still document that extended stats on
(empty) parent tables are often themselves not used/useful for selectivity
estimates, and the user should instead (or in addition) create stats on child
tables.

Or, maybe if there's no extended stats on the child tables, stats on the parent
table should be consulted ?

Maybe, but that seems like a mostly separate improvement. At this point I'm
interested only in testing the behavior implemented in the current patches.

I don't want to change the scope of the patch, or this thread, but my point is
that the behaviour already changed once (the original regression) and now we're
planning to change it again to fix that, so we ought to decide on the expected
behavior before writing tests to verify it.

I think it may be impossible to use the "dependencies" statistic with inherited
stats. Normally the quals would be pushed down to the child tables. But, if
they weren't pushed down, they'd be attached to something other than a scan
node on the parent table, so the stats on that table wouldn't apply (right?).

Maybe the useless stats types should have been prohibited on partitioned tables
since v10. It's too late to change that, but perhaps now they shouldn't even
be collected during analyze. The dependencies and MCV paths are never called
with rte->inh==true, so maybe we should Assert(!inh), or add a comment to that
effect. Or the regression tests should "memorialize" the behavior. I'm still
thinking about it.

--
Justin

#32Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Justin Pryzby (#30)
4 attachment(s)
Re: extended stats on partitioned tables

On 12/13/21 14:48, Justin Pryzby wrote:

On Sun, Dec 12, 2021 at 10:29:39PM +0100, Tomas Vondra wrote:

In your 0003 patch, the "if inh: break" isn't removed from examine_variable(),
but the corresponding thing is removed everywhere else.

Ah, you're right. And it wasn't updated in the 0002 patch either - it
should do the relkind check too, to allow partitioned tables. Fixed.

I think you fixed it in 0002 (thanks) but still wasn't removed from 0003?

D'oh! Those repeated rebase conflicts got me quite confused.

In these comments:
+        * When dealing with regular inheritance trees, ignore extended stats
+        * (which were built without data from child rels, and thus do not
+        * represent them). For partitioned tables data from partitions are
+        * in the stats (and there's no data in the non-leaf relations), so
+        * in this case we do consider extended stats.

I suggest to add a comment after "For partitioned tables".

I've reworded the comment a bit, hopefully it's a bit clearer now.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

0001-Ignore-extended-statistics-for-inheritance--20211213.patchtext/x-patch; charset=UTF-8; name=0001-Ignore-extended-statistics-for-inheritance--20211213.patchDownload
From 0601bac64c0d7a1f1b676f0433a05f2f99d7bfb1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 25 Sep 2021 19:42:41 -0500
Subject: [PATCH 1/4] Ignore extended statistics for inheritance trees

Since commit 859b3003de we only build extended statistics for relations
without the child relations, so that we update the catalog row only
once. However, the non-inherited statistics were still used for planning
of queries on inheritance trees, which may produce bogus estimates.

This is roughly the same issue 427c6b5b9 addressed ~15 years ago, and we
fix it the same way - by ignoring extended statistics when calculating
estimates for the inheritance tree as a whole. We still consider
extended statistics when calculating estimates for individual child
relations, of course.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
---
 src/backend/statistics/dependencies.c   |  9 +++++++++
 src/backend/statistics/extended_stats.c |  9 +++++++++
 src/backend/utils/adt/selfuncs.c        | 17 +++++++++++++++++
 src/test/regress/expected/stats_ext.out | 24 ++++++++++++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 15 +++++++++++++++
 5 files changed, 74 insertions(+)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..a41bace75a 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -24,6 +24,7 @@
 #include "nodes/pathnodes.h"
 #include "optimizer/clauses.h"
 #include "optimizer/optimizer.h"
+#include "parser/parsetree.h"
 #include "statistics/extended_stats_internal.h"
 #include "statistics/statistics.h"
 #include "utils/bytea.h"
@@ -1410,11 +1411,19 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
 
 	/* unique expressions */
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
+	/*
+	 * When dealing with inheritance trees, ignore extended stats (which were
+	 * built without data from child rels, and thus do not represent them).
+	 */
+	if (rte->inh)
+		return 1.0;
+
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 69ca52094f..5a3ba2f1c5 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -30,6 +30,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/optimizer.h"
+#include "parser/parsetree.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "statistics/extended_stats_internal.h"
@@ -1694,6 +1695,14 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
+
+	/*
+	 * When dealing with inheritance trees, ignore extended stats (which were
+	 * built without data from child rels, and thus do not represent them).
+	 */
+	if (rte->inh)
+		return sel;
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 10895fb287..1010d5caa8 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,6 +3913,14 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
+	RangeTblEntry		*rte = planner_rt_fetch(rel->relid, root);
+
+	/*
+	 * When dealing with inheritance trees, ignore extended stats (which were
+	 * built without data from child rels, and thus do not represent them).
+	 */
+	if (rte->inh)
+		return false;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
@@ -5222,6 +5230,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 		foreach(slist, onerel->statlist)
 		{
 			StatisticExtInfo *info = (StatisticExtInfo *) lfirst(slist);
+			RangeTblEntry	 *rte = planner_rt_fetch(onerel->relid, root);
 			ListCell   *expr_item;
 			int			pos;
 
@@ -5232,6 +5241,14 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
+			/*
+			 * When dealing with inheritance trees, ignore extended stats (which
+			 * were built without data from child rels, and so do not represent
+			 * them).
+			 */
+			if (rte->inh)
+				break;
+
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index c60ba45aba..5410f58f7f 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,6 +176,30 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c
+-- Tests for stats with inheritance
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+      1000 |   1008
+(1 row)
+
+DROP TABLE stxdinh, stxdinh1;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 6fb37962a7..a196d5e2f8 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -112,6 +112,21 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 
+-- Tests for stats with inheritance
+CREATE TABLE stxdinh(i int, j int);
+CREATE TABLE stxdinh1() INHERITS(stxdinh);
+INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a;
+INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Ensure non-inherited stats are not applied to inherited query
+-- Without stats object, it looks like this
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
+VACUUM ANALYZE stxdinh, stxdinh1;
+-- Since the stats object does not include inherited stats, it should not affect the estimates
+SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+DROP TABLE stxdinh, stxdinh1;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.31.1

0002-Build-inherited-extended-stats-on-partition-20211213.patchtext/x-patch; charset=UTF-8; name=0002-Build-inherited-extended-stats-on-partition-20211213.patchDownload
From 54e04ae2ea9fcb8d83f551c911771fa5018eff14 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sat, 11 Dec 2021 22:38:08 +0100
Subject: [PATCH 2/4] Build inherited extended stats on partitioned tables

Commit 859b3003de disabled building of extended stats for inheritance
trees, to prevent updating the same catalog row twice. That resolved the
issue, but it also means there are no extended stats for partitioned
tables, because the (!inh) case has empty sample. This is a regression,
affecting queries that don't estimate the leaf relations directly.

But because partitioned tables are empty, we can invert the condition
and build statistics only for the case with inheritance, without losing
anything. And we can consider them when calculating estimates.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
---
 src/backend/commands/analyze.c          | 18 +++++++++++++-
 src/backend/statistics/dependencies.c   |  9 ++++---
 src/backend/statistics/extended_stats.c |  9 ++++---
 src/backend/utils/adt/selfuncs.c        | 31 +++++++++++++++----------
 src/test/regress/expected/stats_ext.out | 19 +++++++++++++++
 src/test/regress/sql/stats_ext.sql      | 10 ++++++++
 6 files changed, 77 insertions(+), 19 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index cd77907fc7..58a82e4929 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -549,6 +549,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
+		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -612,13 +613,28 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
+		/*
+		 * Should we build extended statistics for this relation?
+		 *
+		 * The extended statistics catalog does not include an inheritance
+		 * flag, so we can't store statistics built both with and without
+		 * data from child relations. We can store just one set of statistics
+		 * per relation. For plain relations that's fine, but for inheritance
+		 * trees we have to pick whether to store statistics for just the
+		 * one relation or the whole tree. For plain inheritance we store
+		 * the (!inh) version, mostly for backwards compatibility reasons.
+		 * For partitioned tables that's pointless (the non-leaf tables are
+		 * always empty), so we store stats representing the whole tree.
+		 */
+		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
+
 		/*
 		 * Build extended statistics (if there are any).
 		 *
 		 * For now we only build extended statistics on individual relations,
 		 * not for relations representing inheritance trees.
 		 */
-		if (!inh)
+		if (build_ext_stats)
 			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
 									   attr_cnt, vacattrstats);
 	}
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index a41bace75a..a9187ddeb1 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1418,10 +1418,13 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			unique_exprs_cnt;
 
 	/*
-	 * When dealing with inheritance trees, ignore extended stats (which were
-	 * built without data from child rels, and thus do not represent them).
+	 * When dealing with regular inheritance trees, ignore extended stats
+	 * (which were built without data from child rels, and thus do not
+	 * represent them). For partitioned tables data there's no data in the
+	 * non-leaf relations, so we build stats only for the inheritance tree.
+	 * So for partitioned tables we do consider extended stats.
 	 */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return 1.0;
 
 	/* check if there's any stats that might be useful for us. */
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 5a3ba2f1c5..364d6dfb2e 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1698,10 +1698,13 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
 
 	/*
-	 * When dealing with inheritance trees, ignore extended stats (which were
-	 * built without data from child rels, and thus do not represent them).
+	 * When dealing with regular inheritance trees, ignore extended stats
+	 * (which were built without data from child rels, and thus do not
+	 * represent them). For partitioned tables data there's no data in the
+	 * non-leaf relations, so we build stats only for the inheritance tree.
+	 * So for partitioned tables we do consider extended stats.
 	 */
-	if (rte->inh)
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 		return sel;
 
 	/* check if there's any stats that might be useful for us. */
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 1010d5caa8..abe47dab86 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,19 +3913,23 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
-	RangeTblEntry		*rte = planner_rt_fetch(rel->relid, root);
-
-	/*
-	 * When dealing with inheritance trees, ignore extended stats (which were
-	 * built without data from child rels, and thus do not represent them).
-	 */
-	if (rte->inh)
-		return false;
+	RangeTblEntry		*rte;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
 		return false;
 
+	/*
+	 * When dealing with regular inheritance trees, ignore extended stats
+	 * (which were built without data from child rels, and thus do not
+	 * represent them). For partitioned tables data there's no data in the
+	 * non-leaf relations, so we build stats only for the inheritance tree.
+	 * So for partitioned tables we do consider extended stats.
+	 */
+	rte = planner_rt_fetch(rel->relid, root);
+	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
+		return false;
+
 	/* look for the ndistinct statistics object matching the most vars */
 	nmatches_vars = 0;			/* we require at least two matches */
 	nmatches_exprs = 0;
@@ -5242,11 +5246,14 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				break;
 
 			/*
-			 * When dealing with inheritance trees, ignore extended stats (which
-			 * were built without data from child rels, and so do not represent
-			 * them).
+			 * When dealing with regular inheritance trees, ignore extended
+			 * stats (which were built without data from child rels, and thus
+			 * do not represent them). For partitioned tables data there's no
+			 * data in the non-leaf relations, so we build stats only for the
+			 * inheritance tree. So for partitioned tables we do consider
+			 * extended stats.
 			 */
-			if (rte->inh)
+			if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
 				break;
 
 			/* skip stats without per-expression stats */
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 5410f58f7f..a11fff655a 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -200,6 +200,25 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+ ?column? 
+----------
+        1
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+        10 |     10
+(1 row)
+
+DROP TABLE stxdinp;
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 -- expression stats may be built on a single expression column
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index a196d5e2f8..01383e8f5f 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -127,6 +127,16 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
+-- Ensure inherited stats ARE applied to inherited query in partitioned table
+CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
+CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100);
+INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a;
+CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp;
+VACUUM ANALYZE stxdinp; -- partitions are processed recursively
+SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass;
+SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2');
+DROP TABLE stxdinp;
+
 -- basic test for statistics on expressions
 CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
 
-- 
2.31.1

0003-Add-stxdinherit-flag-to-pg_statistic_ext_da-20211213.patchtext/x-patch; charset=UTF-8; name=0003-Add-stxdinherit-flag-to-pg_statistic_ext_da-20211213.patchDownload
From 25aabb15fb78706c35ffc0df0156012c3f722bfb Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sun, 12 Dec 2021 00:07:27 +0100
Subject: [PATCH 3/4] Add stxdinherit flag to pg_statistic_ext_data

The support for extended statistics on inheritance trees was somewhat
problematic, because the catalog pg_statistic_ext_data did not have a
flag identifying whether the data was built with/without data for the
child relations.

For partitioned tables we've been able to work around this because the
non-leaf relations can't contain data, so there's really just one set of
statistics to store. But for regular inheritance trees we had to pick,
and there is no clearly better option.

This introduces the pg_statistic_ext_data.stxdinherit flag, analogous to
pg_statistic.stainherit, and modifies analyze to build statistics for
both cases.

This relaxes the relationship between the two catalogs - until now we've
created the pg_statistic_ext_data one when creating the statistics, and
then only updated it later. There was always 1:1 relationship between
rows in the two catalogs. With this change we no longer insert any data
rows while creating statistics, because we don't know which flag value
to create, and it seems wasteful to initialize both for every relation.
The there may be 0, 1 or 2 data rows for each statistics definition.

Patch by me, with extensive improvements and fixes by Justin Pryzby.

Author: Tomas Vondra, Justin Pryzby
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
---
 doc/src/sgml/catalogs.sgml                  |  23 +++
 src/backend/catalog/system_views.sql        |   2 +
 src/backend/commands/analyze.c              |  28 +---
 src/backend/commands/statscmds.c            |  72 +++++-----
 src/backend/optimizer/util/plancat.c        | 147 ++++++++++++--------
 src/backend/statistics/dependencies.c       |  22 ++-
 src/backend/statistics/extended_stats.c     |  75 +++++-----
 src/backend/statistics/mcv.c                |   9 +-
 src/backend/statistics/mvdistinct.c         |   5 +-
 src/backend/utils/adt/selfuncs.c            |  37 +----
 src/backend/utils/cache/syscache.c          |   6 +-
 src/include/catalog/pg_statistic_ext_data.h |   4 +-
 src/include/commands/defrem.h               |   1 +
 src/include/nodes/pathnodes.h               |   1 +
 src/include/statistics/statistics.h         |  11 +-
 src/test/regress/expected/rules.out         |   2 +
 src/test/regress/expected/stats_ext.out     |  18 ++-
 src/test/regress/sql/stats_ext.sql          |   4 +-
 18 files changed, 241 insertions(+), 226 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 025db98763..9ab2cb52a7 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7521,6 +7521,19 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
    created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>.
   </para>
 
+  <para>
+   Normally there is one entry, with <structfield>stxdinherit</structfield> =
+   <literal>false</literal>, for each statistics object that has been analyzed.
+   If the table has inheritance children, a second entry with
+   <structfield>stxdinherit</structfield> = <literal>true</literal> is also created.
+   This row represents the statistics object over the inheritance tree, i.e.,
+   statistics for the data you'd see with
+   <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>,
+   whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row
+   represents the results of
+   <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>.
+  </para>
+
   <para>
    Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>,
    <structname>pg_statistic_ext_data</structname> should not be
@@ -7560,6 +7573,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stxdinherit</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the stats include inheritance child columns, not just the
+       values in the specified relation
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 61b515cdb8..319653789b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS
            ) AS attnames,
            pg_get_statisticsobjdef_expressions(s.oid) as exprs,
            s.stxkind AS kinds,
+           sd.stxdinherit AS inherited,
            sd.stxdndistinct AS n_distinct,
            sd.stxddependencies AS dependencies,
            m.most_common_vals,
@@ -298,6 +299,7 @@ CREATE VIEW pg_stats_ext_exprs WITH (security_barrier) AS
            s.stxname AS statistics_name,
            pg_get_userbyid(s.stxowner) AS statistics_owner,
            stat.expr,
+           sd.stxdinherit AS inherited,
            (stat.a).stanullfrac AS null_frac,
            (stat.a).stawidth AS avg_width,
            (stat.a).stadistinct AS n_distinct,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 58a82e4929..3436f09d63 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -549,7 +549,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
-		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -613,30 +612,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
-		/*
-		 * Should we build extended statistics for this relation?
-		 *
-		 * The extended statistics catalog does not include an inheritance
-		 * flag, so we can't store statistics built both with and without
-		 * data from child relations. We can store just one set of statistics
-		 * per relation. For plain relations that's fine, but for inheritance
-		 * trees we have to pick whether to store statistics for just the
-		 * one relation or the whole tree. For plain inheritance we store
-		 * the (!inh) version, mostly for backwards compatibility reasons.
-		 * For partitioned tables that's pointless (the non-leaf tables are
-		 * always empty), so we store stats representing the whole tree.
-		 */
-		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
-
-		/*
-		 * Build extended statistics (if there are any).
-		 *
-		 * For now we only build extended statistics on individual relations,
-		 * not for relations representing inheritance trees.
-		 */
-		if (build_ext_stats)
-			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
-									   attr_cnt, vacattrstats);
+		/* Build extended statistics (if there are any). */
+		BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows,
+								   attr_cnt, vacattrstats);
 	}
 
 	pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 8f1550ec80..1fa39e6f21 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -75,13 +75,10 @@ CreateStatistics(CreateStatsStmt *stmt)
 	HeapTuple	htup;
 	Datum		values[Natts_pg_statistic_ext];
 	bool		nulls[Natts_pg_statistic_ext];
-	Datum		datavalues[Natts_pg_statistic_ext_data];
-	bool		datanulls[Natts_pg_statistic_ext_data];
 	int2vector *stxkeys;
 	List	   *stxexprs = NIL;
 	Datum		exprsDatum;
 	Relation	statrel;
-	Relation	datarel;
 	Relation	rel = NULL;
 	Oid			relid;
 	ObjectAddress parentobject,
@@ -514,28 +511,10 @@ CreateStatistics(CreateStatsStmt *stmt)
 	relation_close(statrel, RowExclusiveLock);
 
 	/*
-	 * Also build the pg_statistic_ext_data tuple, to hold the actual
-	 * statistics data.
+	 * We used to create the pg_statistic_ext_data tuple too, but it's not clear
+	 * what value should the stxdinherit flag have (it depends on whether the rel
+	 * is partitioned, contains data, etc.)
 	 */
-	datarel = table_open(StatisticExtDataRelationId, RowExclusiveLock);
-
-	memset(datavalues, 0, sizeof(datavalues));
-	memset(datanulls, false, sizeof(datanulls));
-
-	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
-
-	/* no statistics built yet */
-	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* insert it into pg_statistic_ext_data */
-	htup = heap_form_tuple(datarel->rd_att, datavalues, datanulls);
-	CatalogTupleInsert(datarel, htup);
-	heap_freetuple(htup);
-
-	relation_close(datarel, RowExclusiveLock);
 
 	InvokeObjectPostCreateHook(StatisticExtRelationId, statoid, 0);
 
@@ -717,32 +696,49 @@ AlterStatistics(AlterStatsStmt *stmt)
 }
 
 /*
- * Guts of statistics object deletion.
+ * Delete entry in pg_statistic_ext_data catalog. We don't know if the row
+ * exists, so don't error out.
  */
 void
-RemoveStatisticsById(Oid statsOid)
+RemoveStatisticsDataById(Oid statsOid, bool inh)
 {
 	Relation	relation;
 	HeapTuple	tup;
-	Form_pg_statistic_ext statext;
-	Oid			relid;
 
-	/*
-	 * First delete the pg_statistic_ext_data tuple holding the actual
-	 * statistical data.
-	 */
 	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
-	tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
-
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+	tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+						  BoolGetDatum(inh));
 
-	CatalogTupleDelete(relation, &tup->t_self);
+	/* We don't know if the data row for inh value exists. */
+	if (HeapTupleIsValid(tup))
+	{
+		CatalogTupleDelete(relation, &tup->t_self);
 
-	ReleaseSysCache(tup);
+		ReleaseSysCache(tup);
+	}
 
 	table_close(relation, RowExclusiveLock);
+}
+
+/*
+ * Guts of statistics object deletion.
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+	Relation	relation;
+	HeapTuple	tup;
+	Form_pg_statistic_ext statext;
+	Oid			relid;
+
+	/*
+	 * First delete the pg_statistic_ext_data tuples holding the actual
+	 * statistical data. There might be data with/without inheritance, so
+	 * attempt deleting both.
+	 */
+	RemoveStatisticsDataById(statsOid, true);
+	RemoveStatisticsDataById(statsOid, false);
 
 	/*
 	 * Delete the pg_statistic_ext tuple.  Also send out a cache inval on the
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 564a38a13e..8ba3a3c59d 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
+#include "catalog/pg_statistic_ext_data.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -1276,6 +1277,87 @@ get_relation_constraints(PlannerInfo *root,
 	return result;
 }
 
+/*
+ * Try loading data for the statistics object.
+ *
+ * We don't know if the data (specified by statOid and inh value) exist.
+ * The result is stored in stainfos list.
+ */
+static void
+get_relation_statistics_worker(List **stainfos, RelOptInfo *rel,
+							   Oid statOid, bool inh,
+							   Bitmapset *keys, List *exprs)
+{
+	Form_pg_statistic_ext_data dataForm;
+	HeapTuple	dtup;
+
+	dtup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(statOid), BoolGetDatum(inh));
+	if (!HeapTupleIsValid(dtup))
+		return;
+
+	dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
+
+	/* add one StatisticExtInfo for each kind built */
+	if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_NDISTINCT;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_DEPENDENCIES;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_MCV;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_EXPRESSIONS;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	ReleaseSysCache(dtup);
+}
+
 /*
  * get_relation_statistics
  *		Retrieve extended statistics defined on the table.
@@ -1299,7 +1381,6 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
 		HeapTuple	htup;
-		HeapTuple	dtup;
 		Bitmapset  *keys = NULL;
 		List	   *exprs = NIL;
 		int			i;
@@ -1309,10 +1390,6 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-		if (!HeapTupleIsValid(dtup))
-			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
 		/*
 		 * First, build the array of columns covered.  This is ultimately
 		 * wasted if no stats within the object have actually been built, but
@@ -1324,6 +1401,11 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 		/*
 		 * Preprocess expressions (if any). We read the expressions, run them
 		 * through eval_const_expressions, and fix the varnos.
+		 *
+		 * XXX We don't know yet if there are any data for this stats object,
+		 * with either stxdinherit value. But it's reasonable to assume there
+		 * is at least one of those, possibly both. So it's better to process
+		 * keys and expressions here.
 		 */
 		{
 			bool		isnull;
@@ -1364,61 +1446,14 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 			}
 		}
 
-		/* add one StatisticExtInfo for each kind built */
-		if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_NDISTINCT;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_DEPENDENCIES;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_MCV;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		/* extract statistics for possible values of stxdinherit flag */
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_EXPRESSIONS;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+		get_relation_statistics_worker(&stainfos, rel, statOid, true, keys, exprs);
 
-			stainfos = lappend(stainfos, info);
-		}
+		get_relation_statistics_worker(&stainfos, rel, statOid, false, keys, exprs);
 
 		ReleaseSysCache(htup);
-		ReleaseSysCache(dtup);
+
 		bms_free(keys);
 	}
 
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index a9187ddeb1..672091b517 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -619,14 +619,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums)
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
 MVDependencies *
-statext_dependencies_load(Oid mvoid)
+statext_dependencies_load(Oid mvoid, bool inh)
 {
 	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid),
+						   BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
@@ -1417,16 +1419,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
-	/*
-	 * When dealing with regular inheritance trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data there's no data in the
-	 * non-leaf relations, so we build stats only for the inheritance tree.
-	 * So for partitioned tables we do consider extended stats.
-	 */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return 1.0;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
@@ -1610,6 +1602,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
 			continue;
 
+		/* skip statistics with mismatching stxdinherit value */
+		if (stat->inherit != rte->inh)
+			continue;
+
 		/*
 		 * Count matching attributes - we have to undo the attnum offsets. The
 		 * input attribute numbers are not offset (expressions are not
@@ -1656,7 +1652,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (nmatched + nexprs < 2)
 			continue;
 
-		deps = statext_dependencies_load(stat->statOid);
+		deps = statext_dependencies_load(stat->statOid, rte->inh);
 
 		/*
 		 * The expressions may be represented by different attnums in the
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 364d6dfb2e..3423603dd7 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -25,6 +25,7 @@
 #include "catalog/pg_statistic_ext.h"
 #include "catalog/pg_statistic_ext_data.h"
 #include "executor/executor.h"
+#include "commands/defrem.h"
 #include "commands/progress.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
@@ -78,7 +79,7 @@ typedef struct StatExtEntry
 static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
 static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 											int nvacatts, VacAttrStats **vacatts);
-static void statext_store(Oid statOid,
+static void statext_store(Oid statOid, bool inh,
 						  MVNDistinct *ndistinct, MVDependencies *dependencies,
 						  MCVList *mcv, Datum exprs, VacAttrStats **stats);
 static int	statext_compute_stattarget(int stattarget,
@@ -111,7 +112,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat,
  * requested stats, and serializes them back into the catalog.
  */
 void
-BuildRelationExtStatistics(Relation onerel, double totalrows,
+BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 						   int numrows, HeapTuple *rows,
 						   int natts, VacAttrStats **vacattrstats)
 {
@@ -231,7 +232,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		}
 
 		/* store the statistics in the catalog */
-		statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats);
+		statext_store(stat->statOid, inh,
+					  ndistinct, dependencies, mcv, exprstats, stats);
 
 		/* for reporting progress */
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -782,23 +784,27 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
  *	tuple.
  */
 static void
-statext_store(Oid statOid,
+statext_store(Oid statOid, bool inh,
 			  MVNDistinct *ndistinct, MVDependencies *dependencies,
 			  MCVList *mcv, Datum exprs, VacAttrStats **stats)
 {
 	Relation	pg_stextdata;
-	HeapTuple	stup,
-				oldtup;
+	HeapTuple	stup;
 	Datum		values[Natts_pg_statistic_ext_data];
 	bool		nulls[Natts_pg_statistic_ext_data];
-	bool		replaces[Natts_pg_statistic_ext_data];
 
 	pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
 	memset(nulls, true, sizeof(nulls));
-	memset(replaces, false, sizeof(replaces));
 	memset(values, 0, sizeof(values));
 
+	/* basic info */
+	values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid);
+	nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false;
+
+	values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh);
+	nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false;
+
 	/*
 	 * Construct a new pg_statistic_ext_data tuple, replacing the calculated
 	 * stats.
@@ -831,25 +837,15 @@ statext_store(Oid statOid,
 		values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs;
 	}
 
-	/* always replace the value (either by bytea or NULL) */
-	replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* there should already be a pg_statistic_ext_data tuple */
-	oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-	if (!HeapTupleIsValid(oldtup))
-		elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
-	/* replace it */
-	stup = heap_modify_tuple(oldtup,
-							 RelationGetDescr(pg_stextdata),
-							 values,
-							 nulls,
-							 replaces);
-	ReleaseSysCache(oldtup);
-	CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup);
+	/*
+	 * Delete the old tuple if it exists, and insert a new one. It's easier
+	 * than trying to update or insert, based on various conditions.
+	 */
+	RemoveStatisticsDataById(statOid, inh);
+
+	/* form and insert a new tuple */
+	stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls);
+	CatalogTupleInsert(pg_stextdata, stup);
 
 	heap_freetuple(stup);
 
@@ -1235,7 +1231,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs,
  * further tiebreakers are needed.
  */
 StatisticExtInfo *
-choose_best_statistics(List *stats, char requiredkind,
+choose_best_statistics(List *stats, char requiredkind, bool inh,
 					   Bitmapset **clause_attnums, List **clause_exprs,
 					   int nclauses)
 {
@@ -1257,6 +1253,10 @@ choose_best_statistics(List *stats, char requiredkind,
 		if (info->kind != requiredkind)
 			continue;
 
+		/* skip statistics with mismatching inheritance flag */
+		if (info->inherit != inh)
+			continue;
+
 		/*
 		 * Collect attributes and expressions in remaining (unestimated)
 		 * clauses fully covered by this statistic object.
@@ -1697,16 +1697,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
 	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
 
-	/*
-	 * When dealing with regular inheritance trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data there's no data in the
-	 * non-leaf relations, so we build stats only for the inheritance tree.
-	 * So for partitioned tables we do consider extended stats.
-	 */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return sel;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
 		return sel;
@@ -1758,7 +1748,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		Bitmapset  *simple_clauses;
 
 		/* find the best suited statistics object for these attnums */
-		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
+		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
 									  list_length(clauses));
 
@@ -1847,7 +1837,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			MCVList    *mcv_list;
 
 			/* Load the MCV list stored in the statistics object */
-			mcv_list = statext_mcv_load(stat->statOid);
+			mcv_list = statext_mcv_load(stat->statOid, rte->inh);
 
 			/*
 			 * Compute the selectivity of the ORed list of clauses covered by
@@ -2408,7 +2398,7 @@ serialize_expr_stats(AnlExprData *exprdata, int nexprs)
  * identified by the supplied index.
  */
 HeapTuple
-statext_expressions_load(Oid stxoid, int idx)
+statext_expressions_load(Oid stxoid, bool inh, int idx)
 {
 	bool		isnull;
 	Datum		value;
@@ -2418,7 +2408,8 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(stxoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index b350fc5f7b..75d7d35caa 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups,
  *		Load the MCV list for the indicated pg_statistic_ext tuple.
  */
 MCVList *
-statext_mcv_load(Oid mvoid)
+statext_mcv_load(Oid mvoid, bool inh)
 {
 	MCVList    *result;
 	bool		isnull;
 	Datum		mcvlist;
-	HeapTuple	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	HeapTuple	htup = SearchSysCache2(STATEXTDATASTXOID,
+									   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
@@ -2039,11 +2040,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
 	MCVList    *mcv;
 	Selectivity s = 0.0;
 
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
 	/* match/mismatch bitmap for each MCV item */
 	bool	   *matches = NULL;
 
 	/* load the MCV list stored in the statistics object */
-	mcv = statext_mcv_load(stat->statOid);
+	mcv = statext_mcv_load(stat->statOid, rte->inh);
 
 	/* build a match bitmap for the clauses */
 	matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs,
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 4481312d61..ab1f10d6c0 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
  *		Load the ndistinct value for the indicated pg_statistic_ext tuple
  */
 MVNDistinct *
-statext_ndistinct_load(Oid mvoid)
+statext_ndistinct_load(Oid mvoid, bool inh)
 {
 	MVNDistinct *result;
 	bool		isnull;
 	Datum		ndist;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index abe47dab86..939d561fa1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3919,17 +3919,6 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	if (!rel->statlist)
 		return false;
 
-	/*
-	 * When dealing with regular inheritance trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data there's no data in the
-	 * non-leaf relations, so we build stats only for the inheritance tree.
-	 * So for partitioned tables we do consider extended stats.
-	 */
-	rte = planner_rt_fetch(rel->relid, root);
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return false;
-
 	/* look for the ndistinct statistics object matching the most vars */
 	nmatches_vars = 0;			/* we require at least two matches */
 	nmatches_exprs = 0;
@@ -4015,7 +4004,8 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 
 	Assert(nmatches_vars + nmatches_exprs > 1);
 
-	stats = statext_ndistinct_load(statOid);
+	rte = planner_rt_fetch(rel->relid, root);
+	stats = statext_ndistinct_load(statOid, rte->inh);
 
 	/*
 	 * If we have a match, search it for the specific item that matches (there
@@ -5245,17 +5235,6 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
-			/*
-			 * When dealing with regular inheritance trees, ignore extended
-			 * stats (which were built without data from child rels, and thus
-			 * do not represent them). For partitioned tables data there's no
-			 * data in the non-leaf relations, so we build stats only for the
-			 * inheritance tree. So for partitioned tables we do consider
-			 * extended stats.
-			 */
-			if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-				break;
-
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
@@ -5274,22 +5253,18 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				/* found a match, see if we can extract pg_statistic row */
 				if (equal(node, expr))
 				{
-					HeapTuple	t = statext_expressions_load(info->statOid, pos);
-
-					/* Get statistics object's table for permission check */
-					RangeTblEntry *rte;
 					Oid			userid;
 
-					vardata->statsTuple = t;
+					Assert(rte->rtekind == RTE_RELATION);
 
 					/*
 					 * XXX Not sure if we should cache the tuple somewhere.
 					 * Now we just create a new copy every time.
 					 */
-					vardata->freefunc = ReleaseDummy;
+					vardata->statsTuple =
+						statext_expressions_load(info->statOid, rte->inh, pos);
 
-					rte = planner_rt_fetch(onerel->relid, root);
-					Assert(rte->rtekind == RTE_RELATION);
+					vardata->freefunc = ReleaseDummy;
 
 					/*
 					 * Use checkAsUser if it's set, in case we're accessing
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 56870b46e4..6194751e99 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -763,11 +763,11 @@ static const struct cachedesc cacheinfo[] = {
 		32
 	},
 	{StatisticExtDataRelationId,	/* STATEXTDATASTXOID */
-		StatisticExtDataStxoidIndexId,
-		1,
+		StatisticExtDataStxoidInhIndexId,
+		2,
 		{
 			Anum_pg_statistic_ext_data_stxoid,
-			0,
+			Anum_pg_statistic_ext_data_stxdinherit,
 			0,
 			0
 		},
diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h
index 7b73b790d2..8ffd8b68cd 100644
--- a/src/include/catalog/pg_statistic_ext_data.h
+++ b/src/include/catalog/pg_statistic_ext_data.h
@@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId)
 {
 	Oid			stxoid BKI_LOOKUP(pg_statistic_ext);	/* statistics object
 														 * this data is for */
+	bool		stxdinherit;	/* true if inheritance children are included */
 
 #ifdef CATALOG_VARLEN			/* variable-length fields start here */
 
@@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data;
 
 DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431);
 
-DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops));
+DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops, stxdinherit bool_ops));
+
 
 #endif							/* PG_STATISTIC_EXT_DATA_H */
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index f84d09959c..ab3c59fcd5 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -83,6 +83,7 @@ extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
 extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
 extern ObjectAddress AlterStatistics(AlterStatsStmt *stmt);
 extern void RemoveStatisticsById(Oid statsOid);
+extern void RemoveStatisticsDataById(Oid statsOid, bool inh);
 extern Oid	StatisticsGetRelation(Oid statId, bool missing_ok);
 
 /* commands/aggregatecmds.c */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 324d92880b..02aff9847e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -934,6 +934,7 @@ typedef struct StatisticExtInfo
 	NodeTag		type;
 
 	Oid			statOid;		/* OID of the statistics row */
+	bool		inherit;		/* includes child relations */
 	RelOptInfo *rel;			/* back-link to statistic's table */
 	char		kind;			/* statistics kind of this entry */
 	Bitmapset  *keys;			/* attnums of the columns covered */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 326cf26fea..3868e43f8a 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -94,11 +94,11 @@ typedef struct MCVList
 	MCVItem		items[FLEXIBLE_ARRAY_MEMBER];	/* array of MCV items */
 } MCVList;
 
-extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
-extern MVDependencies *statext_dependencies_load(Oid mvoid);
-extern MCVList *statext_mcv_load(Oid mvoid);
+extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh);
+extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh);
+extern MCVList *statext_mcv_load(Oid mvoid, bool inh);
 
-extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
+extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 									   int numrows, HeapTuple *rows,
 									   int natts, VacAttrStats **vacattrstats);
 extern int	ComputeExtStatisticsRows(Relation onerel,
@@ -121,9 +121,10 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
 												  bool is_or);
 extern bool has_stats_of_kind(List *stats, char requiredkind);
 extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
+												bool inh,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
-extern HeapTuple statext_expressions_load(Oid stxoid, int idx);
+extern HeapTuple statext_expressions_load(Oid stxoid, bool inh, int idx);
 
 #endif							/* STATISTICS_H */
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b58b062b10..d652f7b5fb 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2443,6 +2443,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname,
              JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames,
     pg_get_statisticsobjdef_expressions(s.oid) AS exprs,
     s.stxkind AS kinds,
+    sd.stxdinherit AS inherited,
     sd.stxdndistinct AS n_distinct,
     sd.stxddependencies AS dependencies,
     m.most_common_vals,
@@ -2469,6 +2470,7 @@ pg_stats_ext_exprs| SELECT cn.nspname AS schemaname,
     s.stxname AS statistics_name,
     pg_get_userbyid(s.stxowner) AS statistics_owner,
     stat.expr,
+    sd.stxdinherit AS inherited,
     (stat.a).stanullfrac AS null_frac,
     (stat.a).stawidth AS avg_width,
     (stat.a).stadistinct AS n_distinct,
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index a11fff655a..427884cda8 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -144,10 +144,9 @@ SELECT stxname, stxdndistinct, stxddependencies, stxdmcv
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxname = 'ab1_a_b_stats'
    AND d.stxoid = s.oid;
-    stxname    | stxdndistinct | stxddependencies | stxdmcv 
----------------+---------------+------------------+---------
- ab1_a_b_stats |               |                  | 
-(1 row)
+ stxname | stxdndistinct | stxddependencies | stxdmcv 
+---------+---------------+------------------+---------
+(0 rows)
 
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS -1;
 \d+ ab1
@@ -192,11 +191,18 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
--- Since the stats object does not include inherited stats, it should not affect the estimates
+-- See if the extended stats affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
  estimated | actual 
 -----------+--------
-      1000 |   1008
+      1008 |   1008
+(1 row)
+
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
+ estimated | actual 
+-----------+--------
+         9 |      9
 (1 row)
 
 DROP TABLE stxdinh, stxdinh1;
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 01383e8f5f..8deb010b3c 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -123,8 +123,10 @@ VACUUM ANALYZE stxdinh, stxdinh1;
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
 CREATE STATISTICS stxdinh ON i,j FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1;
--- Since the stats object does not include inherited stats, it should not affect the estimates
+-- See if the extended stats affect the estimates
 SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2');
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2');
 DROP TABLE stxdinh, stxdinh1;
 
 -- Ensure inherited stats ARE applied to inherited query in partitioned table
-- 
2.31.1

0004-Refactor-parent-ACL-check-20211213.patchtext/x-patch; charset=UTF-8; name=0004-Refactor-parent-ACL-check-20211213.patchDownload
From 91ba11c701f68e215d425efb9f255fc324575930 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sun, 12 Dec 2021 02:28:41 +0100
Subject: [PATCH 4/4] Refactor parent ACL check

selfuncs.c is 8k lines long, and this makes it 30 LOC shorter.
---
 src/backend/utils/adt/selfuncs.c | 140 ++++++++++++-------------------
 1 file changed, 52 insertions(+), 88 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 939d561fa1..2f1c40a138 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -187,6 +187,8 @@ static char *convert_string_datum(Datum value, Oid typid, Oid collid,
 								  bool *failure);
 static double convert_timevalue_to_scalar(Datum value, Oid typid,
 										  bool *failure);
+static void recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata,
+								Oid relid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 									VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
@@ -5153,51 +5155,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 									(pg_class_aclcheck(rte->relid, userid,
 													   ACL_SELECT) == ACLCHECK_OK);
 
-								/*
-								 * If the user doesn't have permissions to
-								 * access an inheritance child relation, check
-								 * the permissions of the table actually
-								 * mentioned in the query, since most likely
-								 * the user does have that permission.  Note
-								 * that whole-table select privilege on the
-								 * parent doesn't quite guarantee that the
-								 * user could read all columns of the child.
-								 * But in practice it's unlikely that any
-								 * interesting security violation could result
-								 * from allowing access to the expression
-								 * index's stats, so we allow it anyway.  See
-								 * similar code in examine_simple_variable()
-								 * for additional comments.
-								 */
-								if (!vardata->acl_ok &&
-									root->append_rel_array != NULL)
-								{
-									AppendRelInfo *appinfo;
-									Index		varno = index->rel->relid;
-
-									appinfo = root->append_rel_array[varno];
-									while (appinfo &&
-										   planner_rt_fetch(appinfo->parent_relid,
-															root)->rtekind == RTE_RELATION)
-									{
-										varno = appinfo->parent_relid;
-										appinfo = root->append_rel_array[varno];
-									}
-									if (varno != index->rel->relid)
-									{
-										/* Repeat access check on this rel */
-										rte = planner_rt_fetch(varno, root);
-										Assert(rte->rtekind == RTE_RELATION);
-
-										userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-										vardata->acl_ok =
-											rte->securityQuals == NIL &&
-											(pg_class_aclcheck(rte->relid,
-															   userid,
-															   ACL_SELECT) == ACLCHECK_OK);
-									}
-								}
+								recheck_parent_acl(root, vardata, index->rel->relid);
 							}
 							else
 							{
@@ -5285,49 +5243,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 						(pg_class_aclcheck(rte->relid, userid,
 										   ACL_SELECT) == ACLCHECK_OK);
 
-					/*
-					 * If the user doesn't have permissions to access an
-					 * inheritance child relation, check the permissions of
-					 * the table actually mentioned in the query, since most
-					 * likely the user does have that permission.  Note that
-					 * whole-table select privilege on the parent doesn't
-					 * quite guarantee that the user could read all columns of
-					 * the child. But in practice it's unlikely that any
-					 * interesting security violation could result from
-					 * allowing access to the expression stats, so we allow it
-					 * anyway.  See similar code in examine_simple_variable()
-					 * for additional comments.
-					 */
-					if (!vardata->acl_ok &&
-						root->append_rel_array != NULL)
-					{
-						AppendRelInfo *appinfo;
-						Index		varno = onerel->relid;
-
-						appinfo = root->append_rel_array[varno];
-						while (appinfo &&
-							   planner_rt_fetch(appinfo->parent_relid,
-												root)->rtekind == RTE_RELATION)
-						{
-							varno = appinfo->parent_relid;
-							appinfo = root->append_rel_array[varno];
-						}
-						if (varno != onerel->relid)
-						{
-							/* Repeat access check on this rel */
-							rte = planner_rt_fetch(varno, root);
-							Assert(rte->rtekind == RTE_RELATION);
-
-							userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-							vardata->acl_ok =
-								rte->securityQuals == NIL &&
-								(pg_class_aclcheck(rte->relid,
-												   userid,
-												   ACL_SELECT) == ACLCHECK_OK);
-						}
-					}
-
+					recheck_parent_acl(root, vardata, onerel->relid);
 					break;
 				}
 
@@ -5337,6 +5253,54 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 	}
 }
 
+/*
+ * If the user doesn't have permissions to access an inheritance child
+ * relation, check the permissions of the table actually mentioned in the
+ * query, since most likely the user does have that permission.  Note that
+ * whole-table select privilege on the parent doesn't quite guarantee that the
+ * user could read all columns of the child.  But in practice it's unlikely
+ * that any interesting security violation could result from allowing access to
+ * the expression stats, so we allow it anyway.  See similar code in
+ * examine_simple_variable() for additional comments.
+ */
+static void
+recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, Oid relid)
+{
+	RangeTblEntry	*rte;
+	Oid		userid;
+
+	if (!vardata->acl_ok &&
+		root->append_rel_array != NULL)
+	{
+		AppendRelInfo *appinfo;
+		Index		varno = relid;
+
+		appinfo = root->append_rel_array[varno];
+		while (appinfo &&
+			   planner_rt_fetch(appinfo->parent_relid,
+								root)->rtekind == RTE_RELATION)
+		{
+			varno = appinfo->parent_relid;
+			appinfo = root->append_rel_array[varno];
+		}
+
+		if (varno != relid)
+		{
+			/* Repeat access check on this rel */
+			rte = planner_rt_fetch(varno, root);
+			Assert(rte->rtekind == RTE_RELATION);
+
+			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+
+			vardata->acl_ok =
+				rte->securityQuals == NIL &&
+				(pg_class_aclcheck(rte->relid,
+								   userid,
+								   ACL_SELECT) == ACLCHECK_OK);
+		}
+	}
+}
+
 /*
  * examine_simple_variable
  *		Handle a simple Var for examine_variable
-- 
2.31.1

#33Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Justin Pryzby (#31)
Re: extended stats on partitioned tables

On 12/13/21 14:53, Justin Pryzby wrote:

On Sun, Dec 12, 2021 at 11:23:19PM +0100, Tomas Vondra wrote:

On 12/12/21 22:32, Justin Pryzby wrote:

On Sun, Dec 12, 2021 at 05:17:10AM +0100, Tomas Vondra wrote:

The one thing bugging me a bit is that the regression test checks only a
GROUP BY query. It'd be nice to add queries testing MCV/dependencies
too, but that seems tricky because most queries will use per-partitions
stats.

You mean because the quals are pushed down to the scan node.

Does that indicate a deficiency ?

If extended stats are collected for a parent table, selectivity estimates based
from the parent would be better; but instead we use uncorrected column
estimates from the child tables.

From what I see, we could come up with a way to avoid the pushdown, involving
volatile functions/foreign tables/RLS/window functions/SRF/wholerow vars/etc.
But would it be better if extended stats objects on partitioned tables were to
collect stats for both parent AND CHILD ? I'm not sure. Maybe that's the
wrong solution, but maybe we should still document that extended stats on
(empty) parent tables are often themselves not used/useful for selectivity
estimates, and the user should instead (or in addition) create stats on child
tables.

Or, maybe if there's no extended stats on the child tables, stats on the parent
table should be consulted ?

Maybe, but that seems like a mostly separate improvement. At this point I'm
interested only in testing the behavior implemented in the current patches.

I don't want to change the scope of the patch, or this thread, but my point is
that the behaviour already changed once (the original regression) and now we're
planning to change it again to fix that, so we ought to decide on the expected
behavior before writing tests to verify it.

OK. Makes sense.

I think it may be impossible to use the "dependencies" statistic with inherited
stats. Normally the quals would be pushed down to the child tables. But, if
they weren't pushed down, they'd be attached to something other than a scan
node on the parent table, so the stats on that table wouldn't apply (right?).

Yeah, that's probably right. But I'm not 100% sure the whole inheritance
tree can't be treated as a single relation by some queries ...

Maybe the useless stats types should have been prohibited on partitioned tables
since v10. It's too late to change that, but perhaps now they shouldn't even
be collected during analyze. The dependencies and MCV paths are never called
with rte->inh==true, so maybe we should Assert(!inh), or add a comment to that
effect. Or the regression tests should "memorialize" the behavior. I'm still
thinking about it.

Yeah, we can't really prohibit them in backbranches - that'd mean some
CREATE STATISTICS commands suddenly start failing, which would be quite
annoying. Not building them for partitioned tables seems like a better
option - BuildRelationExtStatistics can check relkind and pick what to
ignore. But I'd choose to do that in a separate patch, probably - after
all, it shouldn't really change the behavior of any tests/queries, no?

This reminds me we need to consider if these patches could cause any
issues. The way I see it is this:

1) If the table is a separate relation (not part of an inheritance
tree), this should make no difference. -> OK

2) If the table is using "old" inheritance, this reverts back to
pre-regression behavior. So people will keep using the old statistics
until the ANALYZE, and we need to tell them to ANALYZE or something.

3) If the table is using partitioning, it's guaranteed to be empty and
there are no stats at all. Again, we should tell people to run ANALYZE.

Of course, we can't be sure query plans will change in the right
direction :-(

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#34Justin Pryzby
pryzby@telsasoft.com
In reply to: Tomas Vondra (#33)
Re: extended stats on partitioned tables

On Mon, Dec 13, 2021 at 09:40:09PM +0100, Tomas Vondra wrote:

1) If the table is a separate relation (not part of an inheritance
tree), this should make no difference. -> OK

2) If the table is using "old" inheritance, this reverts back to
pre-regression behavior. So people will keep using the old statistics
until the ANALYZE, and we need to tell them to ANALYZE or something.

3) If the table is using partitioning, it's guaranteed to be empty and
there are no stats at all. Again, we should tell people to run ANALYZE.

I think these can be mentioned in the commit message, which can end up in the
minor release notes as a recommendation to rerun ANALYZE.

Thanks for pushing 0001.

--
Justin

#35Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Justin Pryzby (#34)
2 attachment(s)
Re: extended stats on partitioned tables

On 1/15/22 06:11, Justin Pryzby wrote:

On Mon, Dec 13, 2021 at 09:40:09PM +0100, Tomas Vondra wrote:

1) If the table is a separate relation (not part of an inheritance
tree), this should make no difference. -> OK

2) If the table is using "old" inheritance, this reverts back to
pre-regression behavior. So people will keep using the old statistics
until the ANALYZE, and we need to tell them to ANALYZE or something.

3) If the table is using partitioning, it's guaranteed to be empty and
there are no stats at all. Again, we should tell people to run ANALYZE.

I think these can be mentioned in the commit message, which can end up in the
minor release notes as a recommendation to rerun ANALYZE.

Good point. I pushed the 0002 part and added a short paragraph
suggesting ANALYZE might be necessary. I did not go into details about
the individual cases, because that'd be too much for a commit message.

Thanks for pushing 0001.

Thanks for posting the patches!

I've pushed the second part - attached are the two remaining parts. I'll
wait a bit before pushing the rebased 0001 (which goes into master
branch only). Not sure about 0002 - I'm not convinced the refactored ACL
checks are an improvement, but I'll think about it.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

0001-Add-stxdinherit-flag-to-pg_statistic_ext_da-20220115.patchtext/x-patch; charset=UTF-8; name=0001-Add-stxdinherit-flag-to-pg_statistic_ext_da-20220115.patchDownload
From e065e54d1961a4e295ebe77febbe6a3307d142b5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sun, 12 Dec 2021 00:07:27 +0100
Subject: [PATCH 1/2] Add stxdinherit flag to pg_statistic_ext_data

The support for extended statistics on inheritance trees was somewhat
problematic, because the catalog pg_statistic_ext_data did not have a
flag identifying whether the data was built with/without data for the
child relations.

For partitioned tables we've been able to work around this because the
non-leaf relations can't contain data, so there's really just one set of
statistics to store. But for regular inheritance trees we had to pick,
and there is no clearly better option - we need both.

This introduces the pg_statistic_ext_data.stxdinherit flag, analogous to
pg_statistic.stainherit, and modifies analyze to build statistics for
both cases.

This relaxes the relationship between the two catalogs - until now we've
created the pg_statistic_ext_data one when creating the statistics, and
then only updated it later. There was always 1:1 relationship between
rows in the two catalogs. With this change we no longer insert any data
rows while creating statistics, because we don't know which flag value
to create, and it seems wasteful to initialize both for every relation.
The there may be 0, 1 or 2 data rows for each statistics definition.

Patch by me, with extensive improvements and fixes by Justin Pryzby.

Author: Tomas Vondra, Justin Pryzby
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
---
 doc/src/sgml/catalogs.sgml                  |  23 +++
 src/backend/catalog/system_views.sql        |   2 +
 src/backend/commands/analyze.c              |  28 +---
 src/backend/commands/statscmds.c            |  72 +++++-----
 src/backend/optimizer/util/plancat.c        | 147 ++++++++++++--------
 src/backend/statistics/dependencies.c       |  22 ++-
 src/backend/statistics/extended_stats.c     |  75 +++++-----
 src/backend/statistics/mcv.c                |   9 +-
 src/backend/statistics/mvdistinct.c         |   5 +-
 src/backend/utils/adt/selfuncs.c            |  37 +----
 src/backend/utils/cache/syscache.c          |   6 +-
 src/include/catalog/pg_statistic_ext_data.h |   4 +-
 src/include/commands/defrem.h               |   1 +
 src/include/nodes/pathnodes.h               |   1 +
 src/include/statistics/statistics.h         |  11 +-
 src/test/regress/expected/rules.out         |   2 +
 src/test/regress/expected/stats_ext.out     |  31 +++--
 src/test/regress/sql/stats_ext.sql          |  13 +-
 18 files changed, 254 insertions(+), 235 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 03e2537b07d..2aeb2ef346e 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7521,6 +7521,19 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
    created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>.
   </para>
 
+  <para>
+   Normally there is one entry, with <structfield>stxdinherit</structfield> =
+   <literal>false</literal>, for each statistics object that has been analyzed.
+   If the table has inheritance children, a second entry with
+   <structfield>stxdinherit</structfield> = <literal>true</literal> is also created.
+   This row represents the statistics object over the inheritance tree, i.e.,
+   statistics for the data you'd see with
+   <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>,
+   whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row
+   represents the results of
+   <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>.
+  </para>
+
   <para>
    Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>,
    <structname>pg_statistic_ext_data</structname> should not be
@@ -7560,6 +7573,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stxdinherit</structfield> <type>bool</type>
+      </para>
+      <para>
+       If true, the stats include inheritance child columns, not just the
+       values in the specified relation
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 701ff38f761..3cb69b1f87b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS
            ) AS attnames,
            pg_get_statisticsobjdef_expressions(s.oid) as exprs,
            s.stxkind AS kinds,
+           sd.stxdinherit AS inherited,
            sd.stxdndistinct AS n_distinct,
            sd.stxddependencies AS dependencies,
            m.most_common_vals,
@@ -298,6 +299,7 @@ CREATE VIEW pg_stats_ext_exprs WITH (security_barrier) AS
            s.stxname AS statistics_name,
            pg_get_userbyid(s.stxowner) AS statistics_owner,
            stat.expr,
+           sd.stxdinherit AS inherited,
            (stat.a).stanullfrac AS null_frac,
            (stat.a).stawidth AS avg_width,
            (stat.a).stadistinct AS n_distinct,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index d447bc9b436..a0da998c2ea 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -549,7 +549,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 	{
 		MemoryContext col_context,
 					old_context;
-		bool		build_ext_stats;
 
 		pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
 									 PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -613,30 +612,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 							thisdata->attr_cnt, thisdata->vacattrstats);
 		}
 
-		/*
-		 * Should we build extended statistics for this relation?
-		 *
-		 * The extended statistics catalog does not include an inheritance
-		 * flag, so we can't store statistics built both with and without
-		 * data from child relations. We can store just one set of statistics
-		 * per relation. For plain relations that's fine, but for inheritance
-		 * trees we have to pick whether to store statistics for just the
-		 * one relation or the whole tree. For plain inheritance we store
-		 * the (!inh) version, mostly for backwards compatibility reasons.
-		 * For partitioned tables that's pointless (the non-leaf tables are
-		 * always empty), so we store stats representing the whole tree.
-		 */
-		build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
-
-		/*
-		 * Build extended statistics (if there are any).
-		 *
-		 * For now we only build extended statistics on individual relations,
-		 * not for relations representing inheritance trees.
-		 */
-		if (build_ext_stats)
-			BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
-									   attr_cnt, vacattrstats);
+		/* Build extended statistics (if there are any). */
+		BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows,
+								   attr_cnt, vacattrstats);
 	}
 
 	pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index f9d3f9c7b88..f7419b8f562 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -75,13 +75,10 @@ CreateStatistics(CreateStatsStmt *stmt)
 	HeapTuple	htup;
 	Datum		values[Natts_pg_statistic_ext];
 	bool		nulls[Natts_pg_statistic_ext];
-	Datum		datavalues[Natts_pg_statistic_ext_data];
-	bool		datanulls[Natts_pg_statistic_ext_data];
 	int2vector *stxkeys;
 	List	   *stxexprs = NIL;
 	Datum		exprsDatum;
 	Relation	statrel;
-	Relation	datarel;
 	Relation	rel = NULL;
 	Oid			relid;
 	ObjectAddress parentobject,
@@ -514,28 +511,10 @@ CreateStatistics(CreateStatsStmt *stmt)
 	relation_close(statrel, RowExclusiveLock);
 
 	/*
-	 * Also build the pg_statistic_ext_data tuple, to hold the actual
-	 * statistics data.
+	 * We used to create the pg_statistic_ext_data tuple too, but it's not clear
+	 * what value should the stxdinherit flag have (it depends on whether the rel
+	 * is partitioned, contains data, etc.)
 	 */
-	datarel = table_open(StatisticExtDataRelationId, RowExclusiveLock);
-
-	memset(datavalues, 0, sizeof(datavalues));
-	memset(datanulls, false, sizeof(datanulls));
-
-	datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid);
-
-	/* no statistics built yet */
-	datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	datanulls[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* insert it into pg_statistic_ext_data */
-	htup = heap_form_tuple(datarel->rd_att, datavalues, datanulls);
-	CatalogTupleInsert(datarel, htup);
-	heap_freetuple(htup);
-
-	relation_close(datarel, RowExclusiveLock);
 
 	InvokeObjectPostCreateHook(StatisticExtRelationId, statoid, 0);
 
@@ -717,32 +696,49 @@ AlterStatistics(AlterStatsStmt *stmt)
 }
 
 /*
- * Guts of statistics object deletion.
+ * Delete entry in pg_statistic_ext_data catalog. We don't know if the row
+ * exists, so don't error out.
  */
 void
-RemoveStatisticsById(Oid statsOid)
+RemoveStatisticsDataById(Oid statsOid, bool inh)
 {
 	Relation	relation;
 	HeapTuple	tup;
-	Form_pg_statistic_ext statext;
-	Oid			relid;
 
-	/*
-	 * First delete the pg_statistic_ext_data tuple holding the actual
-	 * statistical data.
-	 */
 	relation = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
-	tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
-
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for statistics data %u", statsOid);
+	tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid),
+						  BoolGetDatum(inh));
 
-	CatalogTupleDelete(relation, &tup->t_self);
+	/* We don't know if the data row for inh value exists. */
+	if (HeapTupleIsValid(tup))
+	{
+		CatalogTupleDelete(relation, &tup->t_self);
 
-	ReleaseSysCache(tup);
+		ReleaseSysCache(tup);
+	}
 
 	table_close(relation, RowExclusiveLock);
+}
+
+/*
+ * Guts of statistics object deletion.
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+	Relation	relation;
+	HeapTuple	tup;
+	Form_pg_statistic_ext statext;
+	Oid			relid;
+
+	/*
+	 * First delete the pg_statistic_ext_data tuples holding the actual
+	 * statistical data. There might be data with/without inheritance, so
+	 * attempt deleting both.
+	 */
+	RemoveStatisticsDataById(statsOid, true);
+	RemoveStatisticsDataById(statsOid, false);
 
 	/*
 	 * Delete the pg_statistic_ext tuple.  Also send out a cache inval on the
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 535fa041ada..bd8c5e20f4f 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
+#include "catalog/pg_statistic_ext_data.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -1276,6 +1277,87 @@ get_relation_constraints(PlannerInfo *root,
 	return result;
 }
 
+/*
+ * Try loading data for the statistics object.
+ *
+ * We don't know if the data (specified by statOid and inh value) exist.
+ * The result is stored in stainfos list.
+ */
+static void
+get_relation_statistics_worker(List **stainfos, RelOptInfo *rel,
+							   Oid statOid, bool inh,
+							   Bitmapset *keys, List *exprs)
+{
+	Form_pg_statistic_ext_data dataForm;
+	HeapTuple	dtup;
+
+	dtup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(statOid), BoolGetDatum(inh));
+	if (!HeapTupleIsValid(dtup))
+		return;
+
+	dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup);
+
+	/* add one StatisticExtInfo for each kind built */
+	if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_NDISTINCT;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_DEPENDENCIES;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_MCV))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_MCV;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
+	{
+		StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+		info->statOid = statOid;
+		info->inherit = dataForm->stxdinherit;
+		info->rel = rel;
+		info->kind = STATS_EXT_EXPRESSIONS;
+		info->keys = bms_copy(keys);
+		info->exprs = exprs;
+
+		*stainfos = lappend(*stainfos, info);
+	}
+
+	ReleaseSysCache(dtup);
+}
+
 /*
  * get_relation_statistics
  *		Retrieve extended statistics defined on the table.
@@ -1299,7 +1381,6 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 		Oid			statOid = lfirst_oid(l);
 		Form_pg_statistic_ext staForm;
 		HeapTuple	htup;
-		HeapTuple	dtup;
 		Bitmapset  *keys = NULL;
 		List	   *exprs = NIL;
 		int			i;
@@ -1309,10 +1390,6 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
 		staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
 
-		dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-		if (!HeapTupleIsValid(dtup))
-			elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
 		/*
 		 * First, build the array of columns covered.  This is ultimately
 		 * wasted if no stats within the object have actually been built, but
@@ -1324,6 +1401,11 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 		/*
 		 * Preprocess expressions (if any). We read the expressions, run them
 		 * through eval_const_expressions, and fix the varnos.
+		 *
+		 * XXX We don't know yet if there are any data for this stats object,
+		 * with either stxdinherit value. But it's reasonable to assume there
+		 * is at least one of those, possibly both. So it's better to process
+		 * keys and expressions here.
 		 */
 		{
 			bool		isnull;
@@ -1364,61 +1446,14 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
 			}
 		}
 
-		/* add one StatisticExtInfo for each kind built */
-		if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_NDISTINCT;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_DEPENDENCIES;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_MCV))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_MCV;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
-
-			stainfos = lappend(stainfos, info);
-		}
-
-		if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS))
-		{
-			StatisticExtInfo *info = makeNode(StatisticExtInfo);
+		/* extract statistics for possible values of stxdinherit flag */
 
-			info->statOid = statOid;
-			info->rel = rel;
-			info->kind = STATS_EXT_EXPRESSIONS;
-			info->keys = bms_copy(keys);
-			info->exprs = exprs;
+		get_relation_statistics_worker(&stainfos, rel, statOid, true, keys, exprs);
 
-			stainfos = lappend(stainfos, info);
-		}
+		get_relation_statistics_worker(&stainfos, rel, statOid, false, keys, exprs);
 
 		ReleaseSysCache(htup);
-		ReleaseSysCache(dtup);
+
 		bms_free(keys);
 	}
 
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index bbc29b67117..34326d55619 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -619,14 +619,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums)
  *		Load the functional dependencies for the indicated pg_statistic_ext tuple
  */
 MVDependencies *
-statext_dependencies_load(Oid mvoid)
+statext_dependencies_load(Oid mvoid, bool inh)
 {
 	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid),
+						   BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
@@ -1417,16 +1419,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
-	/*
-	 * When dealing with regular inheritance trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data there's no data in the
-	 * non-leaf relations, so we build stats only for the inheritance tree.
-	 * So for partitioned tables we do consider extended stats.
-	 */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return 1.0;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
@@ -1610,6 +1602,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (stat->kind != STATS_EXT_DEPENDENCIES)
 			continue;
 
+		/* skip statistics with mismatching stxdinherit value */
+		if (stat->inherit != rte->inh)
+			continue;
+
 		/*
 		 * Count matching attributes - we have to undo the attnum offsets. The
 		 * input attribute numbers are not offset (expressions are not
@@ -1656,7 +1652,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 		if (nmatched + nexprs < 2)
 			continue;
 
-		deps = statext_dependencies_load(stat->statOid);
+		deps = statext_dependencies_load(stat->statOid, rte->inh);
 
 		/*
 		 * The expressions may be represented by different attnums in the
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 5762621673b..87fe82ed114 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -25,6 +25,7 @@
 #include "catalog/pg_statistic_ext.h"
 #include "catalog/pg_statistic_ext_data.h"
 #include "executor/executor.h"
+#include "commands/defrem.h"
 #include "commands/progress.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
@@ -78,7 +79,7 @@ typedef struct StatExtEntry
 static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
 static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 											int nvacatts, VacAttrStats **vacatts);
-static void statext_store(Oid statOid,
+static void statext_store(Oid statOid, bool inh,
 						  MVNDistinct *ndistinct, MVDependencies *dependencies,
 						  MCVList *mcv, Datum exprs, VacAttrStats **stats);
 static int	statext_compute_stattarget(int stattarget,
@@ -111,7 +112,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat,
  * requested stats, and serializes them back into the catalog.
  */
 void
-BuildRelationExtStatistics(Relation onerel, double totalrows,
+BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 						   int numrows, HeapTuple *rows,
 						   int natts, VacAttrStats **vacattrstats)
 {
@@ -231,7 +232,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		}
 
 		/* store the statistics in the catalog */
-		statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats);
+		statext_store(stat->statOid, inh,
+					  ndistinct, dependencies, mcv, exprstats, stats);
 
 		/* for reporting progress */
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
@@ -782,23 +784,27 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
  *	tuple.
  */
 static void
-statext_store(Oid statOid,
+statext_store(Oid statOid, bool inh,
 			  MVNDistinct *ndistinct, MVDependencies *dependencies,
 			  MCVList *mcv, Datum exprs, VacAttrStats **stats)
 {
 	Relation	pg_stextdata;
-	HeapTuple	stup,
-				oldtup;
+	HeapTuple	stup;
 	Datum		values[Natts_pg_statistic_ext_data];
 	bool		nulls[Natts_pg_statistic_ext_data];
-	bool		replaces[Natts_pg_statistic_ext_data];
 
 	pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock);
 
 	memset(nulls, true, sizeof(nulls));
-	memset(replaces, false, sizeof(replaces));
 	memset(values, 0, sizeof(values));
 
+	/* basic info */
+	values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid);
+	nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false;
+
+	values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh);
+	nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false;
+
 	/*
 	 * Construct a new pg_statistic_ext_data tuple, replacing the calculated
 	 * stats.
@@ -831,25 +837,15 @@ statext_store(Oid statOid,
 		values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs;
 	}
 
-	/* always replace the value (either by bytea or NULL) */
-	replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true;
-	replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true;
-
-	/* there should already be a pg_statistic_ext_data tuple */
-	oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid));
-	if (!HeapTupleIsValid(oldtup))
-		elog(ERROR, "cache lookup failed for statistics object %u", statOid);
-
-	/* replace it */
-	stup = heap_modify_tuple(oldtup,
-							 RelationGetDescr(pg_stextdata),
-							 values,
-							 nulls,
-							 replaces);
-	ReleaseSysCache(oldtup);
-	CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup);
+	/*
+	 * Delete the old tuple if it exists, and insert a new one. It's easier
+	 * than trying to update or insert, based on various conditions.
+	 */
+	RemoveStatisticsDataById(statOid, inh);
+
+	/* form and insert a new tuple */
+	stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls);
+	CatalogTupleInsert(pg_stextdata, stup);
 
 	heap_freetuple(stup);
 
@@ -1235,7 +1231,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs,
  * further tiebreakers are needed.
  */
 StatisticExtInfo *
-choose_best_statistics(List *stats, char requiredkind,
+choose_best_statistics(List *stats, char requiredkind, bool inh,
 					   Bitmapset **clause_attnums, List **clause_exprs,
 					   int nclauses)
 {
@@ -1257,6 +1253,10 @@ choose_best_statistics(List *stats, char requiredkind,
 		if (info->kind != requiredkind)
 			continue;
 
+		/* skip statistics with mismatching inheritance flag */
+		if (info->inherit != inh)
+			continue;
+
 		/*
 		 * Collect attributes and expressions in remaining (unestimated)
 		 * clauses fully covered by this statistic object.
@@ -1697,16 +1697,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
 	RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
 
-	/*
-	 * When dealing with regular inheritance trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data there's no data in the
-	 * non-leaf relations, so we build stats only for the inheritance tree.
-	 * So for partitioned tables we do consider extended stats.
-	 */
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return sel;
-
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
 		return sel;
@@ -1758,7 +1748,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 		Bitmapset  *simple_clauses;
 
 		/* find the best suited statistics object for these attnums */
-		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV,
+		stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh,
 									  list_attnums, list_exprs,
 									  list_length(clauses));
 
@@ -1847,7 +1837,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 			MCVList    *mcv_list;
 
 			/* Load the MCV list stored in the statistics object */
-			mcv_list = statext_mcv_load(stat->statOid);
+			mcv_list = statext_mcv_load(stat->statOid, rte->inh);
 
 			/*
 			 * Compute the selectivity of the ORed list of clauses covered by
@@ -2408,7 +2398,7 @@ serialize_expr_stats(AnlExprData *exprdata, int nexprs)
  * identified by the supplied index.
  */
 HeapTuple
-statext_expressions_load(Oid stxoid, int idx)
+statext_expressions_load(Oid stxoid, bool inh, int idx)
 {
 	bool		isnull;
 	Datum		value;
@@ -2418,7 +2408,8 @@ statext_expressions_load(Oid stxoid, int idx)
 	HeapTupleData tmptup;
 	HeapTuple	tup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(stxoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", stxoid);
 
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 65fa87b1c79..2f6dfc02d80 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups,
  *		Load the MCV list for the indicated pg_statistic_ext tuple.
  */
 MCVList *
-statext_mcv_load(Oid mvoid)
+statext_mcv_load(Oid mvoid, bool inh)
 {
 	MCVList    *result;
 	bool		isnull;
 	Datum		mcvlist;
-	HeapTuple	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	HeapTuple	htup = SearchSysCache2(STATEXTDATASTXOID,
+									   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
@@ -2039,11 +2040,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
 	MCVList    *mcv;
 	Selectivity s = 0.0;
 
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
 	/* match/mismatch bitmap for each MCV item */
 	bool	   *matches = NULL;
 
 	/* load the MCV list stored in the statistics object */
-	mcv = statext_mcv_load(stat->statOid);
+	mcv = statext_mcv_load(stat->statOid, rte->inh);
 
 	/* build a match bitmap for the clauses */
 	matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs,
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 55b831d4f50..6ade5eff78c 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
  *		Load the ndistinct value for the indicated pg_statistic_ext tuple
  */
 MVNDistinct *
-statext_ndistinct_load(Oid mvoid)
+statext_ndistinct_load(Oid mvoid, bool inh)
 {
 	MVNDistinct *result;
 	bool		isnull;
 	Datum		ndist;
 	HeapTuple	htup;
 
-	htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid));
+	htup = SearchSysCache2(STATEXTDATASTXOID,
+						   ObjectIdGetDatum(mvoid), BoolGetDatum(inh));
 	if (!HeapTupleIsValid(htup))
 		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 5d56748f0a7..b00ee6a3bef 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3919,17 +3919,6 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	if (!rel->statlist)
 		return false;
 
-	/*
-	 * When dealing with regular inheritance trees, ignore extended stats
-	 * (which were built without data from child rels, and thus do not
-	 * represent them). For partitioned tables data there's no data in the
-	 * non-leaf relations, so we build stats only for the inheritance tree.
-	 * So for partitioned tables we do consider extended stats.
-	 */
-	rte = planner_rt_fetch(rel->relid, root);
-	if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-		return false;
-
 	/* look for the ndistinct statistics object matching the most vars */
 	nmatches_vars = 0;			/* we require at least two matches */
 	nmatches_exprs = 0;
@@ -4015,7 +4004,8 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 
 	Assert(nmatches_vars + nmatches_exprs > 1);
 
-	stats = statext_ndistinct_load(statOid);
+	rte = planner_rt_fetch(rel->relid, root);
+	stats = statext_ndistinct_load(statOid, rte->inh);
 
 	/*
 	 * If we have a match, search it for the specific item that matches (there
@@ -5245,17 +5235,6 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 				break;
 
-			/*
-			 * When dealing with regular inheritance trees, ignore extended
-			 * stats (which were built without data from child rels, and thus
-			 * do not represent them). For partitioned tables data there's no
-			 * data in the non-leaf relations, so we build stats only for the
-			 * inheritance tree. So for partitioned tables we do consider
-			 * extended stats.
-			 */
-			if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
-				break;
-
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 				continue;
@@ -5274,22 +5253,18 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 				/* found a match, see if we can extract pg_statistic row */
 				if (equal(node, expr))
 				{
-					HeapTuple	t = statext_expressions_load(info->statOid, pos);
-
-					/* Get statistics object's table for permission check */
-					RangeTblEntry *rte;
 					Oid			userid;
 
-					vardata->statsTuple = t;
+					Assert(rte->rtekind == RTE_RELATION);
 
 					/*
 					 * XXX Not sure if we should cache the tuple somewhere.
 					 * Now we just create a new copy every time.
 					 */
-					vardata->freefunc = ReleaseDummy;
+					vardata->statsTuple =
+						statext_expressions_load(info->statOid, rte->inh, pos);
 
-					rte = planner_rt_fetch(onerel->relid, root);
-					Assert(rte->rtekind == RTE_RELATION);
+					vardata->freefunc = ReleaseDummy;
 
 					/*
 					 * Use checkAsUser if it's set, in case we're accessing
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index beeebfe59f1..f4e7819f1e2 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -763,11 +763,11 @@ static const struct cachedesc cacheinfo[] = {
 		32
 	},
 	{StatisticExtDataRelationId,	/* STATEXTDATASTXOID */
-		StatisticExtDataStxoidIndexId,
-		1,
+		StatisticExtDataStxoidInhIndexId,
+		2,
 		{
 			Anum_pg_statistic_ext_data_stxoid,
-			0,
+			Anum_pg_statistic_ext_data_stxdinherit,
 			0,
 			0
 		},
diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h
index 7c94d7b7cd7..b01e6205974 100644
--- a/src/include/catalog/pg_statistic_ext_data.h
+++ b/src/include/catalog/pg_statistic_ext_data.h
@@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId)
 {
 	Oid			stxoid BKI_LOOKUP(pg_statistic_ext);	/* statistics object
 														 * this data is for */
+	bool		stxdinherit;	/* true if inheritance children are included */
 
 #ifdef CATALOG_VARLEN			/* variable-length fields start here */
 
@@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data;
 
 DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431);
 
-DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops));
+DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops, stxdinherit bool_ops));
+
 
 #endif							/* PG_STATISTIC_EXT_DATA_H */
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 2d3cdddaf48..56d2bb66161 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -83,6 +83,7 @@ extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
 extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
 extern ObjectAddress AlterStatistics(AlterStatsStmt *stmt);
 extern void RemoveStatisticsById(Oid statsOid);
+extern void RemoveStatisticsDataById(Oid statsOid, bool inh);
 extern Oid	StatisticsGetRelation(Oid statId, bool missing_ok);
 
 /* commands/aggregatecmds.c */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1f33fe13c11..1f3845b3fec 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -934,6 +934,7 @@ typedef struct StatisticExtInfo
 	NodeTag		type;
 
 	Oid			statOid;		/* OID of the statistics row */
+	bool		inherit;		/* includes child relations */
 	RelOptInfo *rel;			/* back-link to statistic's table */
 	char		kind;			/* statistics kind of this entry */
 	Bitmapset  *keys;			/* attnums of the columns covered */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 4a3676278de..bb7ef1240e0 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -94,11 +94,11 @@ typedef struct MCVList
 	MCVItem		items[FLEXIBLE_ARRAY_MEMBER];	/* array of MCV items */
 } MCVList;
 
-extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
-extern MVDependencies *statext_dependencies_load(Oid mvoid);
-extern MCVList *statext_mcv_load(Oid mvoid);
+extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh);
+extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh);
+extern MCVList *statext_mcv_load(Oid mvoid, bool inh);
 
-extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
+extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows,
 									   int numrows, HeapTuple *rows,
 									   int natts, VacAttrStats **vacattrstats);
 extern int	ComputeExtStatisticsRows(Relation onerel,
@@ -121,9 +121,10 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
 												  bool is_or);
 extern bool has_stats_of_kind(List *stats, char requiredkind);
 extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
+												bool inh,
 												Bitmapset **clause_attnums,
 												List **clause_exprs,
 												int nclauses);
-extern HeapTuple statext_expressions_load(Oid stxoid, int idx);
+extern HeapTuple statext_expressions_load(Oid stxoid, bool inh, int idx);
 
 #endif							/* STATISTICS_H */
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b58b062b10d..d652f7b5fb4 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2443,6 +2443,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname,
              JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames,
     pg_get_statisticsobjdef_expressions(s.oid) AS exprs,
     s.stxkind AS kinds,
+    sd.stxdinherit AS inherited,
     sd.stxdndistinct AS n_distinct,
     sd.stxddependencies AS dependencies,
     m.most_common_vals,
@@ -2469,6 +2470,7 @@ pg_stats_ext_exprs| SELECT cn.nspname AS schemaname,
     s.stxname AS statistics_name,
     pg_get_userbyid(s.stxowner) AS statistics_owner,
     stat.expr,
+    sd.stxdinherit AS inherited,
     (stat.a).stanullfrac AS null_frac,
     (stat.a).stawidth AS avg_width,
     (stat.a).stadistinct AS n_distinct,
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index fd0cc54ea11..042316aeed8 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -140,13 +140,12 @@ Statistics objects:
     "public.ab1_a_b_stats" ON a, b FROM ab1; STATISTICS 0
 
 ANALYZE ab1;
-SELECT stxname, stxdndistinct, stxddependencies, stxdmcv
-  FROM pg_statistic_ext s, pg_statistic_ext_data d
- WHERE s.stxname = 'ab1_a_b_stats'
-   AND d.stxoid = s.oid;
-    stxname    | stxdndistinct | stxddependencies | stxdmcv 
----------------+---------------+------------------+---------
- ab1_a_b_stats |               |                  | 
+SELECT stxname, stxdndistinct, stxddependencies, stxdmcv, stxdinherit
+  FROM pg_statistic_ext s LEFT JOIN pg_statistic_ext_data d ON (d.stxoid = s.oid)
+ WHERE s.stxname = 'ab1_a_b_stats';
+    stxname    | stxdndistinct | stxddependencies | stxdmcv | stxdinherit 
+---------------+---------------+------------------+---------+-------------
+ ab1_a_b_stats |               |                  |         | 
 (1 row)
 
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS -1;
@@ -200,12 +199,11 @@ SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinh* WHERE a = 0 AND b
 
 CREATE STATISTICS stxdinh ON a, b FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1, stxdinh2;
--- Since the stats object does not include inherited stats, it should not
--- affect the estimates
+-- See if the extended stats affect the estimates
 SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinh* GROUP BY 1, 2');
  estimated | actual 
 -----------+--------
-       400 |    150
+       150 |    150
 (1 row)
 
 -- Dependencies are applied at individual relations (within append), so
@@ -216,6 +214,19 @@ SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinh* WHERE a = 0 AND b
         22 |     40
 (1 row)
 
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT a, b FROM ONLY stxdinh GROUP BY 1, 2');
+ estimated | actual 
+-----------+--------
+       100 |    100
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT a, b FROM ONLY stxdinh WHERE a = 0 AND b = 0');
+ estimated | actual 
+-----------+--------
+        20 |     20
+(1 row)
+
 DROP TABLE stxdinh, stxdinh1, stxdinh2;
 -- Ensure inherited stats ARE applied to inherited query in partitioned table
 CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index d2c59b0a8ad..6b954c9e500 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -91,10 +91,9 @@ ALTER TABLE ab1 ALTER a SET STATISTICS -1;
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS 0;
 \d ab1
 ANALYZE ab1;
-SELECT stxname, stxdndistinct, stxddependencies, stxdmcv
-  FROM pg_statistic_ext s, pg_statistic_ext_data d
- WHERE s.stxname = 'ab1_a_b_stats'
-   AND d.stxoid = s.oid;
+SELECT stxname, stxdndistinct, stxddependencies, stxdmcv, stxdinherit
+  FROM pg_statistic_ext s LEFT JOIN pg_statistic_ext_data d ON (d.stxoid = s.oid)
+ WHERE s.stxname = 'ab1_a_b_stats';
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS -1;
 \d+ ab1
 -- partial analyze doesn't build stats either
@@ -126,12 +125,14 @@ SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinh* GROUP BY 1, 2');
 SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinh* WHERE a = 0 AND b = 0');
 CREATE STATISTICS stxdinh ON a, b FROM stxdinh;
 VACUUM ANALYZE stxdinh, stxdinh1, stxdinh2;
--- Since the stats object does not include inherited stats, it should not
--- affect the estimates
+-- See if the extended stats affect the estimates
 SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinh* GROUP BY 1, 2');
 -- Dependencies are applied at individual relations (within append), so
 -- this estimate changes a bit because we improve estimates for the parent
 SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinh* WHERE a = 0 AND b = 0');
+-- Ensure correct (non-inherited) stats are applied to inherited query
+SELECT * FROM check_estimated_rows('SELECT a, b FROM ONLY stxdinh GROUP BY 1, 2');
+SELECT * FROM check_estimated_rows('SELECT a, b FROM ONLY stxdinh WHERE a = 0 AND b = 0');
 DROP TABLE stxdinh, stxdinh1, stxdinh2;
 
 -- Ensure inherited stats ARE applied to inherited query in partitioned table
-- 
2.31.1

0002-Refactor-parent-ACL-check-20220115.patchtext/x-patch; charset=UTF-8; name=0002-Refactor-parent-ACL-check-20220115.patchDownload
From 95fc05ff7822e55c570c8fd6282dd3d6459bb594 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sun, 12 Dec 2021 02:28:41 +0100
Subject: [PATCH 2/2] Refactor parent ACL check

selfuncs.c is 8k lines long, and this makes it 30 LOC shorter.
---
 src/backend/utils/adt/selfuncs.c | 140 ++++++++++++-------------------
 1 file changed, 52 insertions(+), 88 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index b00ee6a3bef..b1ceb01775f 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -187,6 +187,8 @@ static char *convert_string_datum(Datum value, Oid typid, Oid collid,
 								  bool *failure);
 static double convert_timevalue_to_scalar(Datum value, Oid typid,
 										  bool *failure);
+static void recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata,
+								Oid relid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 									VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
@@ -5153,51 +5155,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 									(pg_class_aclcheck(rte->relid, userid,
 													   ACL_SELECT) == ACLCHECK_OK);
 
-								/*
-								 * If the user doesn't have permissions to
-								 * access an inheritance child relation, check
-								 * the permissions of the table actually
-								 * mentioned in the query, since most likely
-								 * the user does have that permission.  Note
-								 * that whole-table select privilege on the
-								 * parent doesn't quite guarantee that the
-								 * user could read all columns of the child.
-								 * But in practice it's unlikely that any
-								 * interesting security violation could result
-								 * from allowing access to the expression
-								 * index's stats, so we allow it anyway.  See
-								 * similar code in examine_simple_variable()
-								 * for additional comments.
-								 */
-								if (!vardata->acl_ok &&
-									root->append_rel_array != NULL)
-								{
-									AppendRelInfo *appinfo;
-									Index		varno = index->rel->relid;
-
-									appinfo = root->append_rel_array[varno];
-									while (appinfo &&
-										   planner_rt_fetch(appinfo->parent_relid,
-															root)->rtekind == RTE_RELATION)
-									{
-										varno = appinfo->parent_relid;
-										appinfo = root->append_rel_array[varno];
-									}
-									if (varno != index->rel->relid)
-									{
-										/* Repeat access check on this rel */
-										rte = planner_rt_fetch(varno, root);
-										Assert(rte->rtekind == RTE_RELATION);
-
-										userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-										vardata->acl_ok =
-											rte->securityQuals == NIL &&
-											(pg_class_aclcheck(rte->relid,
-															   userid,
-															   ACL_SELECT) == ACLCHECK_OK);
-									}
-								}
+								recheck_parent_acl(root, vardata, index->rel->relid);
 							}
 							else
 							{
@@ -5285,49 +5243,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 						(pg_class_aclcheck(rte->relid, userid,
 										   ACL_SELECT) == ACLCHECK_OK);
 
-					/*
-					 * If the user doesn't have permissions to access an
-					 * inheritance child relation, check the permissions of
-					 * the table actually mentioned in the query, since most
-					 * likely the user does have that permission.  Note that
-					 * whole-table select privilege on the parent doesn't
-					 * quite guarantee that the user could read all columns of
-					 * the child. But in practice it's unlikely that any
-					 * interesting security violation could result from
-					 * allowing access to the expression stats, so we allow it
-					 * anyway.  See similar code in examine_simple_variable()
-					 * for additional comments.
-					 */
-					if (!vardata->acl_ok &&
-						root->append_rel_array != NULL)
-					{
-						AppendRelInfo *appinfo;
-						Index		varno = onerel->relid;
-
-						appinfo = root->append_rel_array[varno];
-						while (appinfo &&
-							   planner_rt_fetch(appinfo->parent_relid,
-												root)->rtekind == RTE_RELATION)
-						{
-							varno = appinfo->parent_relid;
-							appinfo = root->append_rel_array[varno];
-						}
-						if (varno != onerel->relid)
-						{
-							/* Repeat access check on this rel */
-							rte = planner_rt_fetch(varno, root);
-							Assert(rte->rtekind == RTE_RELATION);
-
-							userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
-
-							vardata->acl_ok =
-								rte->securityQuals == NIL &&
-								(pg_class_aclcheck(rte->relid,
-												   userid,
-												   ACL_SELECT) == ACLCHECK_OK);
-						}
-					}
-
+					recheck_parent_acl(root, vardata, onerel->relid);
 					break;
 				}
 
@@ -5337,6 +5253,54 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 	}
 }
 
+/*
+ * If the user doesn't have permissions to access an inheritance child
+ * relation, check the permissions of the table actually mentioned in the
+ * query, since most likely the user does have that permission.  Note that
+ * whole-table select privilege on the parent doesn't quite guarantee that the
+ * user could read all columns of the child.  But in practice it's unlikely
+ * that any interesting security violation could result from allowing access to
+ * the expression stats, so we allow it anyway.  See similar code in
+ * examine_simple_variable() for additional comments.
+ */
+static void
+recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, Oid relid)
+{
+	RangeTblEntry	*rte;
+	Oid		userid;
+
+	if (!vardata->acl_ok &&
+		root->append_rel_array != NULL)
+	{
+		AppendRelInfo *appinfo;
+		Index		varno = relid;
+
+		appinfo = root->append_rel_array[varno];
+		while (appinfo &&
+			   planner_rt_fetch(appinfo->parent_relid,
+								root)->rtekind == RTE_RELATION)
+		{
+			varno = appinfo->parent_relid;
+			appinfo = root->append_rel_array[varno];
+		}
+
+		if (varno != relid)
+		{
+			/* Repeat access check on this rel */
+			rte = planner_rt_fetch(varno, root);
+			Assert(rte->rtekind == RTE_RELATION);
+
+			userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+
+			vardata->acl_ok =
+				rte->securityQuals == NIL &&
+				(pg_class_aclcheck(rte->relid,
+								   userid,
+								   ACL_SELECT) == ACLCHECK_OK);
+		}
+	}
+}
+
 /*
  * examine_simple_variable
  *		Handle a simple Var for examine_variable
-- 
2.31.1

#36Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Tomas Vondra (#35)
Re: extended stats on partitioned tables

On 1/15/22 19:35, Tomas Vondra wrote:

On 1/15/22 06:11, Justin Pryzby wrote:

On Mon, Dec 13, 2021 at 09:40:09PM +0100, Tomas Vondra wrote:

1) If the table is a separate relation (not part of an inheritance
tree), this should make no difference. -> OK

2) If the table is using "old" inheritance, this reverts back to
pre-regression behavior. So people will keep using the old statistics
until the ANALYZE, and we need to tell them to ANALYZE or something.

3) If the table is using partitioning, it's guaranteed to be empty and
there are no stats at all. Again, we should tell people to run ANALYZE.

I think these can be mentioned in the commit message, which can end up
in the
minor release notes as a recommendation to rerun ANALYZE.

Good point. I pushed the 0002 part and added a short paragraph
suggesting ANALYZE might be necessary. I did not go into details about
the individual cases, because that'd be too much for a commit message.

Thanks for pushing 0001.

Thanks for posting the patches!

I've pushed the second part - attached are the two remaining parts. I'll
wait a bit before pushing the rebased 0001 (which goes into master
branch only). Not sure about 0002 - I'm not convinced the refactored ACL
checks are an improvement, but I'll think about it.

BTW when backpatching the first part, I had to decide what to do about
tests. The 10 & 11 branches did not have the check_estimated_rows()
function. I considered removing the tests, reworking the tests not to
need the function, or adding the function. Ultimately I added the
function, which seemed like the best option.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#37Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Tomas Vondra (#36)
Re: extended stats on partitioned tables

I've pushed the part adding stxdinherit flag to the catalog, so that on
master we build statistics both with and without data from the child
relations.

I'm not going to push the ACL refactoring. We have similar code on
various other places (not addressed by the proposed patch), and it'd
make backpatching harder. So I'm not sure it'd be an improvement.

In any case, I'm going to mark this as committed. Justin, if you think
we should reconsider the ACL refactoring, please submit it as a separate
patch. It seems mostly unrelated to the issue this thread was about.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company